[2023-10-10 04:36:45,671][52050] Saving configuration to ./train_atari/atari_choppercommand_APPO/config.json...
[2023-10-10 04:36:45,988][52050] Rollout worker 0 uses device cpu
[2023-10-10 04:36:45,988][52050] Rollout worker 1 uses device cpu
[2023-10-10 04:36:45,989][52050] Rollout worker 2 uses device cpu
[2023-10-10 04:36:45,989][52050] Rollout worker 3 uses device cpu
[2023-10-10 04:36:45,990][52050] Rollout worker 4 uses device cpu
[2023-10-10 04:36:45,990][52050] Rollout worker 5 uses device cpu
[2023-10-10 04:36:45,991][52050] Rollout worker 6 uses device cpu
[2023-10-10 04:36:45,991][52050] Rollout worker 7 uses device cpu
[2023-10-10 04:36:45,992][52050] Rollout worker 8 uses device cpu
[2023-10-10 04:36:45,992][52050] Rollout worker 9 uses device cpu
[2023-10-10 04:36:45,993][52050] Rollout worker 10 uses device cpu
[2023-10-10 04:36:45,993][52050] Rollout worker 11 uses device cpu
[2023-10-10 04:36:45,993][52050] Rollout worker 12 uses device cpu
[2023-10-10 04:36:45,994][52050] Rollout worker 13 uses device cpu
[2023-10-10 04:36:45,994][52050] Rollout worker 14 uses device cpu
[2023-10-10 04:36:45,994][52050] Rollout worker 15 uses device cpu
[2023-10-10 04:36:46,276][52050] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-10 04:36:46,277][52050] InferenceWorker_p0-w0: min num requests: 2
[2023-10-10 04:36:46,279][52050] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-10-10 04:36:46,280][52050] InferenceWorker_p1-w0: min num requests: 2
[2023-10-10 04:36:46,327][52050] Starting all processes...
[2023-10-10 04:36:46,327][52050] Starting process learner_proc0 [2023-10-10 04:36:48,010][52050] Starting process learner_proc1 [2023-10-10 04:36:48,015][52846] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-10 04:36:48,015][52846] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-10-10 04:36:48,034][52846] Num visible devices: 1 [2023-10-10 04:36:48,051][52846] Setting fixed seed 1234 [2023-10-10 04:36:48,053][52846] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-10 04:36:48,053][52846] Initializing actor-critic model on device cuda:0 [2023-10-10 04:36:48,053][52846] RunningMeanStd input shape: (4, 84, 84) [2023-10-10 04:36:48,054][52846] RunningMeanStd input shape: (1,) [2023-10-10 04:36:48,065][52846] ConvEncoder: input_channels=4 [2023-10-10 04:36:48,244][52846] Conv encoder output size: 512 [2023-10-10 04:36:48,246][52846] Created Actor Critic model with architecture: [2023-10-10 04:36:48,246][52846] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): Identity() ) 
(critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=18, bias=True) ) ) [2023-10-10 04:36:48,821][52846] Using optimizer [2023-10-10 04:36:48,822][52846] No checkpoints found [2023-10-10 04:36:48,822][52846] Did not load from checkpoint, starting from scratch! [2023-10-10 04:36:48,822][52846] Initialized policy 0 weights for model version 0 [2023-10-10 04:36:48,823][52846] LearnerWorker_p0 finished initialization! [2023-10-10 04:36:48,824][52846] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-10 04:36:49,765][52050] Starting all processes... [2023-10-10 04:36:49,768][53061] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-10-10 04:36:49,768][53061] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 [2023-10-10 04:36:49,774][52050] Starting process inference_proc0-0 [2023-10-10 04:36:49,775][52050] Starting process inference_proc1-0 [2023-10-10 04:36:49,775][52050] Starting process rollout_proc0 [2023-10-10 04:36:49,786][53061] Num visible devices: 1 [2023-10-10 04:36:49,775][52050] Starting process rollout_proc1 [2023-10-10 04:36:49,775][52050] Starting process rollout_proc2 [2023-10-10 04:36:49,776][52050] Starting process rollout_proc3 [2023-10-10 04:36:49,804][53061] Setting fixed seed 1234 [2023-10-10 04:36:49,805][53061] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-10-10 04:36:49,805][53061] Initializing actor-critic model on device cuda:0 [2023-10-10 04:36:49,806][53061] RunningMeanStd input shape: (4, 84, 84) [2023-10-10 04:36:49,806][53061] RunningMeanStd input shape: (1,) [2023-10-10 04:36:49,781][52050] Starting process rollout_proc4 [2023-10-10 04:36:49,782][52050] Starting process rollout_proc5 [2023-10-10 04:36:49,787][52050] Starting process rollout_proc6 [2023-10-10 04:36:49,787][52050] Starting process rollout_proc7 [2023-10-10 
04:36:49,818][53061] ConvEncoder: input_channels=4 [2023-10-10 04:36:49,789][52050] Starting process rollout_proc8 [2023-10-10 04:36:49,789][52050] Starting process rollout_proc9 [2023-10-10 04:36:49,789][52050] Starting process rollout_proc10 [2023-10-10 04:36:49,792][52050] Starting process rollout_proc11 [2023-10-10 04:36:49,795][52050] Starting process rollout_proc12 [2023-10-10 04:36:49,795][52050] Starting process rollout_proc13 [2023-10-10 04:36:50,309][53061] Conv encoder output size: 512 [2023-10-10 04:36:50,312][53061] Created Actor Critic model with architecture: [2023-10-10 04:36:50,313][53061] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=18, bias=True) ) ) [2023-10-10 04:36:50,965][53061] Using optimizer [2023-10-10 04:36:50,965][53061] No checkpoints found [2023-10-10 04:36:50,966][53061] Did not load from checkpoint, starting from scratch! 
[2023-10-10 04:36:50,966][53061] Initialized policy 1 weights for model version 0 [2023-10-10 04:36:50,967][53061] LearnerWorker_p1 finished initialization! [2023-10-10 04:36:50,968][53061] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-10-10 04:36:51,958][52050] Starting process rollout_proc14 [2023-10-10 04:36:51,964][53293] Worker 7 uses CPU cores [14, 15] [2023-10-10 04:36:52,041][52050] Starting process rollout_proc15 [2023-10-10 04:36:52,050][53289] Worker 3 uses CPU cores [6, 7] [2023-10-10 04:36:52,072][53291] Worker 5 uses CPU cores [10, 11] [2023-10-10 04:36:52,274][53268] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-10-10 04:36:52,275][53268] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 [2023-10-10 04:36:52,296][53296] Worker 8 uses CPU cores [16, 17] [2023-10-10 04:36:52,296][53268] Num visible devices: 1 [2023-10-10 04:36:52,417][53292] Worker 6 uses CPU cores [12, 13] [2023-10-10 04:36:52,432][53287] Worker 2 uses CPU cores [4, 5] [2023-10-10 04:36:52,434][53285] Worker 0 uses CPU cores [0, 1] [2023-10-10 04:36:52,450][53299] Worker 12 uses CPU cores [24, 25] [2023-10-10 04:36:52,536][53294] Worker 9 uses CPU cores [18, 19] [2023-10-10 04:36:52,556][53252] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-10 04:36:52,556][53252] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-10-10 04:36:52,561][53288] Worker 1 uses CPU cores [2, 3] [2023-10-10 04:36:52,565][53295] Worker 10 uses CPU cores [20, 21] [2023-10-10 04:36:52,575][53252] Num visible devices: 1 [2023-10-10 04:36:52,590][53290] Worker 4 uses CPU cores [8, 9] [2023-10-10 04:36:52,623][53300] Worker 13 uses CPU cores [26, 27] [2023-10-10 04:36:52,696][53298] Worker 11 uses CPU cores [22, 23] [2023-10-10 04:36:52,967][53268] RunningMeanStd input shape: (4, 84, 84) [2023-10-10 04:36:52,968][53268] RunningMeanStd input shape: (1,) [2023-10-10 
04:36:52,979][53268] ConvEncoder: input_channels=4 [2023-10-10 04:36:53,081][53268] Conv encoder output size: 512 [2023-10-10 04:36:53,192][53252] RunningMeanStd input shape: (4, 84, 84) [2023-10-10 04:36:53,193][53252] RunningMeanStd input shape: (1,) [2023-10-10 04:36:53,204][53252] ConvEncoder: input_channels=4 [2023-10-10 04:36:53,304][53252] Conv encoder output size: 512 [2023-10-10 04:36:53,937][53954] Worker 14 uses CPU cores [28, 29] [2023-10-10 04:36:54,050][52050] Inference worker 1-0 is ready! [2023-10-10 04:36:54,051][52050] Inference worker 0-0 is ready! [2023-10-10 04:36:54,052][54020] Worker 15 uses CPU cores [30, 31] [2023-10-10 04:36:54,052][52050] All inference workers are ready! Signal rollout workers to start! [2023-10-10 04:36:54,053][53296] EnvRunner 8-0 uses policy 0 [2023-10-10 04:36:54,054][53292] EnvRunner 6-0 uses policy 0 [2023-10-10 04:36:54,053][53298] EnvRunner 11-0 uses policy 1 [2023-10-10 04:36:54,053][53288] EnvRunner 1-0 uses policy 1 [2023-10-10 04:36:54,053][53285] EnvRunner 0-0 uses policy 0 [2023-10-10 04:36:54,053][53299] EnvRunner 12-0 uses policy 0 [2023-10-10 04:36:54,053][53287] EnvRunner 2-0 uses policy 0 [2023-10-10 04:36:54,054][53294] EnvRunner 9-0 uses policy 1 [2023-10-10 04:36:54,054][53291] EnvRunner 5-0 uses policy 1 [2023-10-10 04:36:54,054][53289] EnvRunner 3-0 uses policy 1 [2023-10-10 04:36:54,054][53293] EnvRunner 7-0 uses policy 1 [2023-10-10 04:36:54,054][53300] EnvRunner 13-0 uses policy 1 [2023-10-10 04:36:54,054][53295] EnvRunner 10-0 uses policy 0 [2023-10-10 04:36:54,054][52050] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-10 04:36:54,054][53290] EnvRunner 4-0 uses policy 0 [2023-10-10 04:36:54,077][53954] EnvRunner 14-0 uses policy 0 [2023-10-10 04:36:54,269][54020] EnvRunner 15-0 uses policy 1 [2023-10-10 04:36:56,264][52050] Heartbeat connected on Batcher_0 [2023-10-10 04:36:56,266][52050] Heartbeat connected on LearnerWorker_p0 [2023-10-10 04:36:56,270][52050] Heartbeat connected on Batcher_1 [2023-10-10 04:36:56,272][52050] Heartbeat connected on LearnerWorker_p1 [2023-10-10 04:36:56,283][52050] Heartbeat connected on InferenceWorker_p0-w0 [2023-10-10 04:36:56,283][52050] Heartbeat connected on InferenceWorker_p1-w0 [2023-10-10 04:36:56,283][52050] Heartbeat connected on RolloutWorker_w0 [2023-10-10 04:36:56,289][52050] Heartbeat connected on RolloutWorker_w2 [2023-10-10 04:36:56,290][52050] Heartbeat connected on RolloutWorker_w1 [2023-10-10 04:36:56,291][52050] Heartbeat connected on RolloutWorker_w3 [2023-10-10 04:36:56,297][52050] Heartbeat connected on RolloutWorker_w5 [2023-10-10 04:36:56,298][52050] Heartbeat connected on RolloutWorker_w4 [2023-10-10 04:36:56,303][52050] Heartbeat connected on RolloutWorker_w7 [2023-10-10 04:36:56,304][52050] Heartbeat connected on RolloutWorker_w6 [2023-10-10 04:36:56,305][52050] Heartbeat connected on RolloutWorker_w8 [2023-10-10 04:36:56,308][52050] Heartbeat connected on RolloutWorker_w9 [2023-10-10 04:36:56,313][52050] Heartbeat connected on RolloutWorker_w10 [2023-10-10 04:36:56,313][52050] Heartbeat connected on RolloutWorker_w11 [2023-10-10 04:36:56,320][52050] Heartbeat connected on RolloutWorker_w13 [2023-10-10 04:36:56,323][52050] Heartbeat connected on RolloutWorker_w14 [2023-10-10 04:36:56,324][52050] Heartbeat connected on RolloutWorker_w12 [2023-10-10 04:36:56,326][52050] Heartbeat connected on RolloutWorker_w15 [2023-10-10 04:36:56,783][52050] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 644.1, 1: 413.3. Samples: 2886. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-10 04:36:56,784][52050] Avg episode reward: [(0, '2.000'), (1, '2.000')] [2023-10-10 04:37:01,783][52050] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 969.0, 1: 894.5. Samples: 14404. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-10 04:37:01,784][52050] Avg episode reward: [(0, '1.706'), (1, '1.485')] [2023-10-10 04:37:04,204][53252] Updated weights for policy 0, policy_version 10 (0.0009) [2023-10-10 04:37:04,229][53268] Updated weights for policy 1, policy_version 10 (0.0008) [2023-10-10 04:37:04,570][53252] Updated weights for policy 0, policy_version 20 (0.0008) [2023-10-10 04:37:04,599][53268] Updated weights for policy 1, policy_version 20 (0.0008) [2023-10-10 04:37:04,942][53252] Updated weights for policy 0, policy_version 30 (0.0009) [2023-10-10 04:37:04,959][53268] Updated weights for policy 1, policy_version 30 (0.0008) [2023-10-10 04:37:06,783][52050] Fps is (10 sec: 6553.4, 60 sec: 5148.3, 300 sec: 5148.3). Total num frames: 65536. Throughput: 0: 1210.2, 1: 1192.3. Samples: 30584. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 04:37:06,785][52050] Avg episode reward: [(0, '1.780'), (1, '1.694')] [2023-10-10 04:37:07,572][53252] Updated weights for policy 0, policy_version 40 (0.0007) [2023-10-10 04:37:07,685][53268] Updated weights for policy 1, policy_version 40 (0.0008) [2023-10-10 04:37:07,937][53252] Updated weights for policy 0, policy_version 50 (0.0008) [2023-10-10 04:37:08,047][53268] Updated weights for policy 1, policy_version 50 (0.0007) [2023-10-10 04:37:08,309][53252] Updated weights for policy 0, policy_version 60 (0.0007) [2023-10-10 04:37:08,410][53268] Updated weights for policy 1, policy_version 60 (0.0008) [2023-10-10 04:37:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 7392.9, 300 sec: 7392.9). Total num frames: 131072. Throughput: 0: 1448.3, 1: 1428.7. Samples: 51008. 
Policy #0 lag: (min: 33.0, avg: 33.0, max: 33.0) [2023-10-10 04:37:11,784][52050] Avg episode reward: [(0, '1.873'), (1, '1.900')] [2023-10-10 04:37:11,860][53252] Updated weights for policy 0, policy_version 70 (0.0009) [2023-10-10 04:37:12,111][53268] Updated weights for policy 1, policy_version 70 (0.0008) [2023-10-10 04:37:12,224][53252] Updated weights for policy 0, policy_version 80 (0.0008) [2023-10-10 04:37:12,470][53268] Updated weights for policy 1, policy_version 80 (0.0007) [2023-10-10 04:37:12,589][53252] Updated weights for policy 0, policy_version 90 (0.0008) [2023-10-10 04:37:12,840][53268] Updated weights for policy 1, policy_version 90 (0.0008) [2023-10-10 04:37:16,306][53268] Updated weights for policy 1, policy_version 100 (0.0008) [2023-10-10 04:37:16,457][53252] Updated weights for policy 0, policy_version 100 (0.0008) [2023-10-10 04:37:16,676][53268] Updated weights for policy 1, policy_version 110 (0.0008) [2023-10-10 04:37:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 8649.9, 300 sec: 8649.9). Total num frames: 196608. Throughput: 0: 1330.7, 1: 1315.1. Samples: 60138. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 04:37:16,784][52050] Avg episode reward: [(0, '1.978'), (1, '2.022')] [2023-10-10 04:37:16,832][53252] Updated weights for policy 0, policy_version 110 (0.0007) [2023-10-10 04:37:17,043][53268] Updated weights for policy 1, policy_version 120 (0.0008) [2023-10-10 04:37:17,205][53252] Updated weights for policy 0, policy_version 120 (0.0008) [2023-10-10 04:37:21,178][53252] Updated weights for policy 0, policy_version 130 (0.0008) [2023-10-10 04:37:21,178][53268] Updated weights for policy 1, policy_version 130 (0.0008) [2023-10-10 04:37:21,546][53268] Updated weights for policy 1, policy_version 140 (0.0010) [2023-10-10 04:37:21,554][53252] Updated weights for policy 0, policy_version 140 (0.0009) [2023-10-10 04:37:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 9453.7, 300 sec: 9453.7). Total num frames: 262144. 
Throughput: 0: 1461.6, 1: 1448.6. Samples: 80700. Policy #0 lag: (min: 22.0, avg: 28.3, max: 54.0) [2023-10-10 04:37:21,784][52050] Avg episode reward: [(0, '2.100'), (1, '2.050')] [2023-10-10 04:37:21,917][53252] Updated weights for policy 0, policy_version 150 (0.0008) [2023-10-10 04:37:21,918][53268] Updated weights for policy 1, policy_version 150 (0.0008) [2023-10-10 04:37:22,276][53061] Saving new best policy, reward=2.050! [2023-10-10 04:37:22,277][53268] Updated weights for policy 1, policy_version 160 (0.0009) [2023-10-10 04:37:22,291][52846] Saving new best policy, reward=2.100! [2023-10-10 04:37:22,293][53252] Updated weights for policy 0, policy_version 160 (0.0009) [2023-10-10 04:37:26,280][53268] Updated weights for policy 1, policy_version 170 (0.0008) [2023-10-10 04:37:26,315][53252] Updated weights for policy 0, policy_version 170 (0.0007) [2023-10-10 04:37:26,634][53268] Updated weights for policy 1, policy_version 180 (0.0007) [2023-10-10 04:37:26,679][53252] Updated weights for policy 0, policy_version 180 (0.0010) [2023-10-10 04:37:26,784][52050] Fps is (10 sec: 13106.9, 60 sec: 10011.7, 300 sec: 10011.7). Total num frames: 327680. Throughput: 0: 1543.2, 1: 1533.7. Samples: 100704. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:37:26,785][52050] Avg episode reward: [(0, '2.200'), (1, '2.170')] [2023-10-10 04:37:27,005][53268] Updated weights for policy 1, policy_version 190 (0.0008) [2023-10-10 04:37:27,050][53252] Updated weights for policy 0, policy_version 190 (0.0007) [2023-10-10 04:37:27,078][53061] Saving new best policy, reward=2.170! [2023-10-10 04:37:27,126][52846] Saving new best policy, reward=2.200! 
[2023-10-10 04:37:31,094][53268] Updated weights for policy 1, policy_version 200 (0.0009) [2023-10-10 04:37:31,142][53252] Updated weights for policy 0, policy_version 200 (0.0009) [2023-10-10 04:37:31,457][53268] Updated weights for policy 1, policy_version 210 (0.0008) [2023-10-10 04:37:31,511][53252] Updated weights for policy 0, policy_version 210 (0.0007) [2023-10-10 04:37:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 10422.0, 300 sec: 10422.0). Total num frames: 393216. Throughput: 0: 1470.7, 1: 1460.8. Samples: 110606. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-10 04:37:31,784][52050] Avg episode reward: [(0, '2.160'), (1, '2.220')] [2023-10-10 04:37:31,822][53268] Updated weights for policy 1, policy_version 220 (0.0010) [2023-10-10 04:37:31,880][53252] Updated weights for policy 0, policy_version 220 (0.0008) [2023-10-10 04:37:31,961][53061] Saving new best policy, reward=2.220! [2023-10-10 04:37:35,847][53268] Updated weights for policy 1, policy_version 230 (0.0008) [2023-10-10 04:37:35,927][53252] Updated weights for policy 0, policy_version 230 (0.0007) [2023-10-10 04:37:36,219][53268] Updated weights for policy 1, policy_version 240 (0.0007) [2023-10-10 04:37:36,303][53252] Updated weights for policy 0, policy_version 240 (0.0008) [2023-10-10 04:37:36,584][53268] Updated weights for policy 1, policy_version 250 (0.0007) [2023-10-10 04:37:36,673][53252] Updated weights for policy 0, policy_version 250 (0.0009) [2023-10-10 04:37:36,783][52050] Fps is (10 sec: 13107.8, 60 sec: 10736.2, 300 sec: 10736.2). Total num frames: 458752. Throughput: 0: 1543.3, 1: 1535.6. Samples: 131560. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 04:37:36,784][52050] Avg episode reward: [(0, '2.050'), (1, '2.020')] [2023-10-10 04:37:40,541][53268] Updated weights for policy 1, policy_version 260 (0.0010) [2023-10-10 04:37:40,790][53252] Updated weights for policy 0, policy_version 260 (0.0010) [2023-10-10 04:37:40,897][53268] Updated weights for policy 1, policy_version 270 (0.0007) [2023-10-10 04:37:41,160][53252] Updated weights for policy 0, policy_version 270 (0.0009) [2023-10-10 04:37:41,262][53268] Updated weights for policy 1, policy_version 280 (0.0007) [2023-10-10 04:37:41,525][53252] Updated weights for policy 0, policy_version 280 (0.0009) [2023-10-10 04:37:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 11671.1, 300 sec: 11671.1). Total num frames: 557056. Throughput: 0: 1640.3, 1: 1653.0. Samples: 151084. Policy #0 lag: (min: 4.0, avg: 5.2, max: 21.0) [2023-10-10 04:37:41,784][52050] Avg episode reward: [(0, '1.930'), (1, '2.160')] [2023-10-10 04:37:45,193][53268] Updated weights for policy 1, policy_version 290 (0.0009) [2023-10-10 04:37:45,599][53268] Updated weights for policy 1, policy_version 300 (0.0010) [2023-10-10 04:37:45,657][53252] Updated weights for policy 0, policy_version 290 (0.0009) [2023-10-10 04:37:45,957][53268] Updated weights for policy 1, policy_version 310 (0.0009) [2023-10-10 04:37:46,029][53252] Updated weights for policy 0, policy_version 300 (0.0007) [2023-10-10 04:37:46,326][53268] Updated weights for policy 1, policy_version 320 (0.0009) [2023-10-10 04:37:46,391][53252] Updated weights for policy 0, policy_version 310 (0.0008) [2023-10-10 04:37:46,764][53252] Updated weights for policy 0, policy_version 320 (0.0008) [2023-10-10 04:37:46,783][52050] Fps is (10 sec: 19660.6, 60 sec: 12428.7, 300 sec: 12428.7). Total num frames: 655360. Throughput: 0: 1631.7, 1: 1648.4. Samples: 162008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:37:46,784][52050] Avg episode reward: [(0, '1.920'), (1, '2.220')] [2023-10-10 04:37:50,400][53268] Updated weights for policy 1, policy_version 330 (0.0009) [2023-10-10 04:37:50,756][53268] Updated weights for policy 1, policy_version 340 (0.0009) [2023-10-10 04:37:50,820][53252] Updated weights for policy 0, policy_version 330 (0.0009) [2023-10-10 04:37:51,122][53268] Updated weights for policy 1, policy_version 350 (0.0007) [2023-10-10 04:37:51,183][53252] Updated weights for policy 0, policy_version 340 (0.0008) [2023-10-10 04:37:51,553][53252] Updated weights for policy 0, policy_version 350 (0.0010) [2023-10-10 04:37:51,783][52050] Fps is (10 sec: 16384.0, 60 sec: 12487.5, 300 sec: 12487.5). Total num frames: 720896. Throughput: 0: 1684.9, 1: 1692.1. Samples: 182550. Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) [2023-10-10 04:37:51,784][52050] Avg episode reward: [(0, '2.110'), (1, '2.340')] [2023-10-10 04:37:51,784][53061] Saving new best policy, reward=2.340! [2023-10-10 04:37:55,228][53268] Updated weights for policy 1, policy_version 360 (0.0009) [2023-10-10 04:37:55,454][53252] Updated weights for policy 0, policy_version 360 (0.0007) [2023-10-10 04:37:55,599][53268] Updated weights for policy 1, policy_version 370 (0.0010) [2023-10-10 04:37:55,822][53252] Updated weights for policy 0, policy_version 370 (0.0008) [2023-10-10 04:37:55,957][53268] Updated weights for policy 1, policy_version 380 (0.0007) [2023-10-10 04:37:56,193][53252] Updated weights for policy 0, policy_version 380 (0.0008) [2023-10-10 04:37:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12536.9). Total num frames: 786432. Throughput: 0: 1662.2, 1: 1670.8. Samples: 200994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:37:56,784][52050] Avg episode reward: [(0, '2.240'), (1, '2.290')] [2023-10-10 04:37:56,788][52846] Saving new best policy, reward=2.240! 
[2023-10-10 04:37:59,877][53268] Updated weights for policy 1, policy_version 390 (0.0007) [2023-10-10 04:38:00,049][53252] Updated weights for policy 0, policy_version 390 (0.0008) [2023-10-10 04:38:00,246][53268] Updated weights for policy 1, policy_version 400 (0.0008) [2023-10-10 04:38:00,419][53252] Updated weights for policy 0, policy_version 400 (0.0009) [2023-10-10 04:38:00,613][53268] Updated weights for policy 1, policy_version 410 (0.0008) [2023-10-10 04:38:00,785][53252] Updated weights for policy 0, policy_version 410 (0.0007) [2023-10-10 04:38:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 12579.0). Total num frames: 851968. Throughput: 0: 1689.9, 1: 1701.0. Samples: 212730. Policy #0 lag: (min: 22.0, avg: 29.8, max: 54.0) [2023-10-10 04:38:01,784][52050] Avg episode reward: [(0, '2.230'), (1, '2.350')] [2023-10-10 04:38:01,785][53061] Saving new best policy, reward=2.350! [2023-10-10 04:38:04,550][53268] Updated weights for policy 1, policy_version 420 (0.0009) [2023-10-10 04:38:04,787][53252] Updated weights for policy 0, policy_version 420 (0.0009) [2023-10-10 04:38:04,909][53268] Updated weights for policy 1, policy_version 430 (0.0009) [2023-10-10 04:38:05,154][53252] Updated weights for policy 0, policy_version 430 (0.0008) [2023-10-10 04:38:05,281][53268] Updated weights for policy 1, policy_version 440 (0.0009) [2023-10-10 04:38:05,530][53252] Updated weights for policy 0, policy_version 440 (0.0008) [2023-10-10 04:38:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.6, 300 sec: 12615.3). Total num frames: 917504. Throughput: 0: 1675.6, 1: 1687.0. Samples: 232018. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-10 04:38:06,784][52050] Avg episode reward: [(0, '2.360'), (1, '2.250')] [2023-10-10 04:38:06,784][52846] Saving new best policy, reward=2.360! 
[2023-10-10 04:38:09,533][53268] Updated weights for policy 1, policy_version 450 (0.0009) [2023-10-10 04:38:09,600][53252] Updated weights for policy 0, policy_version 450 (0.0010) [2023-10-10 04:38:09,897][53268] Updated weights for policy 1, policy_version 460 (0.0007) [2023-10-10 04:38:09,979][53252] Updated weights for policy 0, policy_version 460 (0.0007) [2023-10-10 04:38:10,263][53268] Updated weights for policy 1, policy_version 470 (0.0008) [2023-10-10 04:38:10,342][53252] Updated weights for policy 0, policy_version 470 (0.0009) [2023-10-10 04:38:10,640][53268] Updated weights for policy 1, policy_version 480 (0.0009) [2023-10-10 04:38:10,718][53252] Updated weights for policy 0, policy_version 480 (0.0008) [2023-10-10 04:38:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 12647.0). Total num frames: 983040. Throughput: 0: 1671.0, 1: 1680.4. Samples: 251516. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-10-10 04:38:11,784][52050] Avg episode reward: [(0, '2.460'), (1, '2.210')] [2023-10-10 04:38:11,790][52846] Saving new best policy, reward=2.460! [2023-10-10 04:38:14,665][53268] Updated weights for policy 1, policy_version 490 (0.0007) [2023-10-10 04:38:14,835][53252] Updated weights for policy 0, policy_version 490 (0.0007) [2023-10-10 04:38:15,028][53268] Updated weights for policy 1, policy_version 500 (0.0009) [2023-10-10 04:38:15,206][53252] Updated weights for policy 0, policy_version 500 (0.0008) [2023-10-10 04:38:15,398][53268] Updated weights for policy 1, policy_version 510 (0.0008) [2023-10-10 04:38:15,580][53252] Updated weights for policy 0, policy_version 510 (0.0010) [2023-10-10 04:38:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 12674.8). Total num frames: 1048576. Throughput: 0: 1689.2, 1: 1705.7. Samples: 263376. 
Policy #0 lag: (min: 26.0, avg: 29.3, max: 58.0) [2023-10-10 04:38:16,785][52050] Avg episode reward: [(0, '2.380'), (1, '2.400')] [2023-10-10 04:38:16,786][53061] Saving new best policy, reward=2.400! [2023-10-10 04:38:19,526][53268] Updated weights for policy 1, policy_version 520 (0.0010) [2023-10-10 04:38:19,783][53252] Updated weights for policy 0, policy_version 520 (0.0010) [2023-10-10 04:38:19,900][53268] Updated weights for policy 1, policy_version 530 (0.0009) [2023-10-10 04:38:20,157][53252] Updated weights for policy 0, policy_version 530 (0.0010) [2023-10-10 04:38:20,272][53268] Updated weights for policy 1, policy_version 540 (0.0009) [2023-10-10 04:38:20,528][53252] Updated weights for policy 0, policy_version 540 (0.0008) [2023-10-10 04:38:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 12699.4). Total num frames: 1114112. Throughput: 0: 1667.6, 1: 1679.2. Samples: 282170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:38:21,784][52050] Avg episode reward: [(0, '2.450'), (1, '2.490')] [2023-10-10 04:38:21,786][53061] Saving new best policy, reward=2.490! [2023-10-10 04:38:24,366][53268] Updated weights for policy 1, policy_version 550 (0.0008) [2023-10-10 04:38:24,553][53252] Updated weights for policy 0, policy_version 550 (0.0008) [2023-10-10 04:38:24,730][53268] Updated weights for policy 1, policy_version 560 (0.0008) [2023-10-10 04:38:24,934][53252] Updated weights for policy 0, policy_version 560 (0.0010) [2023-10-10 04:38:25,096][53268] Updated weights for policy 1, policy_version 570 (0.0007) [2023-10-10 04:38:25,310][53252] Updated weights for policy 0, policy_version 570 (0.0007) [2023-10-10 04:38:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.6, 300 sec: 12721.4). Total num frames: 1179648. Throughput: 0: 1674.7, 1: 1681.8. Samples: 302126. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 04:38:26,784][52050] Avg episode reward: [(0, '2.640'), (1, '2.530')] [2023-10-10 04:38:26,792][52846] Saving new best policy, reward=2.640! [2023-10-10 04:38:26,793][53061] Saving new best policy, reward=2.530! [2023-10-10 04:38:29,274][53268] Updated weights for policy 1, policy_version 580 (0.0008) [2023-10-10 04:38:29,408][53252] Updated weights for policy 0, policy_version 580 (0.0009) [2023-10-10 04:38:29,645][53268] Updated weights for policy 1, policy_version 590 (0.0007) [2023-10-10 04:38:29,780][53252] Updated weights for policy 0, policy_version 590 (0.0007) [2023-10-10 04:38:30,012][53268] Updated weights for policy 1, policy_version 600 (0.0010) [2023-10-10 04:38:30,159][53252] Updated weights for policy 0, policy_version 600 (0.0008) [2023-10-10 04:38:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 12741.1). Total num frames: 1245184. Throughput: 0: 1682.8, 1: 1685.5. Samples: 313584. Policy #0 lag: (min: 17.0, avg: 20.5, max: 49.0) [2023-10-10 04:38:31,784][52050] Avg episode reward: [(0, '2.690'), (1, '2.700')] [2023-10-10 04:38:31,786][52846] Saving new best policy, reward=2.690! [2023-10-10 04:38:31,786][53061] Saving new best policy, reward=2.700! 
[2023-10-10 04:38:34,034][53268] Updated weights for policy 1, policy_version 610 (0.0008) [2023-10-10 04:38:34,150][53252] Updated weights for policy 0, policy_version 610 (0.0007) [2023-10-10 04:38:34,429][53268] Updated weights for policy 1, policy_version 620 (0.0009) [2023-10-10 04:38:34,555][53252] Updated weights for policy 0, policy_version 620 (0.0007) [2023-10-10 04:38:34,790][53268] Updated weights for policy 1, policy_version 630 (0.0007) [2023-10-10 04:38:34,930][53252] Updated weights for policy 0, policy_version 630 (0.0007) [2023-10-10 04:38:35,156][53268] Updated weights for policy 1, policy_version 640 (0.0008) [2023-10-10 04:38:35,290][53252] Updated weights for policy 0, policy_version 640 (0.0008) [2023-10-10 04:38:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 12759.0). Total num frames: 1310720. Throughput: 0: 1659.2, 1: 1660.1. Samples: 331918. Policy #0 lag: (min: 28.0, avg: 35.7, max: 60.0) [2023-10-10 04:38:36,784][52050] Avg episode reward: [(0, '2.710'), (1, '2.590')] [2023-10-10 04:38:36,785][52846] Saving new best policy, reward=2.710! [2023-10-10 04:38:39,197][53268] Updated weights for policy 1, policy_version 650 (0.0009) [2023-10-10 04:38:39,357][53252] Updated weights for policy 0, policy_version 650 (0.0008) [2023-10-10 04:38:39,557][53268] Updated weights for policy 1, policy_version 660 (0.0007) [2023-10-10 04:38:39,722][53252] Updated weights for policy 0, policy_version 660 (0.0009) [2023-10-10 04:38:39,919][53268] Updated weights for policy 1, policy_version 670 (0.0007) [2023-10-10 04:38:40,104][53252] Updated weights for policy 0, policy_version 670 (0.0010) [2023-10-10 04:38:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 12775.1). Total num frames: 1376256. Throughput: 0: 1687.5, 1: 1682.8. Samples: 352660. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:38:41,784][52050] Avg episode reward: [(0, '2.640'), (1, '2.530')] [2023-10-10 04:38:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000000672_688128.pth... [2023-10-10 04:38:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000000672_688128.pth... [2023-10-10 04:38:44,052][53268] Updated weights for policy 1, policy_version 680 (0.0008) [2023-10-10 04:38:44,084][53252] Updated weights for policy 0, policy_version 680 (0.0010) [2023-10-10 04:38:44,420][53268] Updated weights for policy 1, policy_version 690 (0.0008) [2023-10-10 04:38:44,458][53252] Updated weights for policy 0, policy_version 690 (0.0010) [2023-10-10 04:38:44,789][53268] Updated weights for policy 1, policy_version 700 (0.0007) [2023-10-10 04:38:44,838][53252] Updated weights for policy 0, policy_version 700 (0.0008) [2023-10-10 04:38:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12789.8). Total num frames: 1441792. Throughput: 0: 1672.9, 1: 1673.1. Samples: 363302. Policy #0 lag: (min: 15.0, avg: 18.1, max: 47.0) [2023-10-10 04:38:46,784][52050] Avg episode reward: [(0, '2.390'), (1, '2.480')] [2023-10-10 04:38:48,945][53268] Updated weights for policy 1, policy_version 710 (0.0008) [2023-10-10 04:38:49,133][53252] Updated weights for policy 0, policy_version 710 (0.0008) [2023-10-10 04:38:49,303][53268] Updated weights for policy 1, policy_version 720 (0.0009) [2023-10-10 04:38:49,496][53252] Updated weights for policy 0, policy_version 720 (0.0008) [2023-10-10 04:38:49,669][53268] Updated weights for policy 1, policy_version 730 (0.0009) [2023-10-10 04:38:49,866][53252] Updated weights for policy 0, policy_version 730 (0.0008) [2023-10-10 04:38:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12803.3). Total num frames: 1507328. Throughput: 0: 1670.3, 1: 1668.1. Samples: 382244. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-10 04:38:51,784][52050] Avg episode reward: [(0, '2.390'), (1, '2.420')] [2023-10-10 04:38:53,546][53268] Updated weights for policy 1, policy_version 740 (0.0009) [2023-10-10 04:38:53,670][53252] Updated weights for policy 0, policy_version 740 (0.0009) [2023-10-10 04:38:53,911][53268] Updated weights for policy 1, policy_version 750 (0.0008) [2023-10-10 04:38:54,032][53252] Updated weights for policy 0, policy_version 750 (0.0008) [2023-10-10 04:38:54,292][53268] Updated weights for policy 1, policy_version 760 (0.0009) [2023-10-10 04:38:54,407][53252] Updated weights for policy 0, policy_version 760 (0.0008) [2023-10-10 04:38:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12815.7). Total num frames: 1572864. Throughput: 0: 1687.3, 1: 1683.7. Samples: 403214. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-10 04:38:56,784][52050] Avg episode reward: [(0, '2.510'), (1, '2.580')] [2023-10-10 04:38:58,466][53268] Updated weights for policy 1, policy_version 770 (0.0007) [2023-10-10 04:38:58,515][53252] Updated weights for policy 0, policy_version 770 (0.0007) [2023-10-10 04:38:58,822][53268] Updated weights for policy 1, policy_version 780 (0.0008) [2023-10-10 04:38:58,881][53252] Updated weights for policy 0, policy_version 780 (0.0007) [2023-10-10 04:38:59,190][53268] Updated weights for policy 1, policy_version 790 (0.0010) [2023-10-10 04:38:59,256][53252] Updated weights for policy 0, policy_version 790 (0.0007) [2023-10-10 04:38:59,567][53268] Updated weights for policy 1, policy_version 800 (0.0009) [2023-10-10 04:38:59,631][53252] Updated weights for policy 0, policy_version 800 (0.0008) [2023-10-10 04:39:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12827.1). Total num frames: 1638400. Throughput: 0: 1665.0, 1: 1660.7. Samples: 413032. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 04:39:01,784][52050] Avg episode reward: [(0, '2.490'), (1, '2.690')] [2023-10-10 04:39:03,494][53268] Updated weights for policy 1, policy_version 810 (0.0011) [2023-10-10 04:39:03,854][53268] Updated weights for policy 1, policy_version 820 (0.0009) [2023-10-10 04:39:03,884][53252] Updated weights for policy 0, policy_version 810 (0.0009) [2023-10-10 04:39:04,223][53268] Updated weights for policy 1, policy_version 830 (0.0007) [2023-10-10 04:39:04,260][53252] Updated weights for policy 0, policy_version 820 (0.0007) [2023-10-10 04:39:04,633][53252] Updated weights for policy 0, policy_version 830 (0.0008) [2023-10-10 04:39:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12837.7). Total num frames: 1703936. Throughput: 0: 1674.7, 1: 1675.9. Samples: 432944. Policy #0 lag: (min: 12.0, avg: 21.1, max: 44.0) [2023-10-10 04:39:06,784][52050] Avg episode reward: [(0, '2.390'), (1, '2.710')] [2023-10-10 04:39:06,786][53061] Saving new best policy, reward=2.710! [2023-10-10 04:39:08,310][53268] Updated weights for policy 1, policy_version 840 (0.0010) [2023-10-10 04:39:08,682][53268] Updated weights for policy 1, policy_version 850 (0.0009) [2023-10-10 04:39:08,724][53252] Updated weights for policy 0, policy_version 840 (0.0009) [2023-10-10 04:39:09,042][53268] Updated weights for policy 1, policy_version 860 (0.0010) [2023-10-10 04:39:09,089][53252] Updated weights for policy 0, policy_version 850 (0.0007) [2023-10-10 04:39:09,461][53252] Updated weights for policy 0, policy_version 860 (0.0007) [2023-10-10 04:39:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 12847.4). Total num frames: 1769472. Throughput: 0: 1683.6, 1: 1681.0. Samples: 453536. 
Policy #0 lag: (min: 8.0, avg: 23.5, max: 40.0) [2023-10-10 04:39:11,784][52050] Avg episode reward: [(0, '2.480'), (1, '2.650')] [2023-10-10 04:39:13,203][53268] Updated weights for policy 1, policy_version 870 (0.0008) [2023-10-10 04:39:13,578][53268] Updated weights for policy 1, policy_version 880 (0.0008) [2023-10-10 04:39:13,631][53252] Updated weights for policy 0, policy_version 870 (0.0008) [2023-10-10 04:39:13,944][53268] Updated weights for policy 1, policy_version 890 (0.0008) [2023-10-10 04:39:14,002][53252] Updated weights for policy 0, policy_version 880 (0.0009) [2023-10-10 04:39:14,372][53252] Updated weights for policy 0, policy_version 890 (0.0009) [2023-10-10 04:39:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12856.6). Total num frames: 1835008. Throughput: 0: 1663.4, 1: 1655.9. Samples: 462950. Policy #0 lag: (min: 16.0, avg: 33.3, max: 48.0) [2023-10-10 04:39:16,784][52050] Avg episode reward: [(0, '2.420'), (1, '2.610')] [2023-10-10 04:39:17,998][53268] Updated weights for policy 1, policy_version 900 (0.0009) [2023-10-10 04:39:18,362][53268] Updated weights for policy 1, policy_version 910 (0.0008) [2023-10-10 04:39:18,479][53252] Updated weights for policy 0, policy_version 900 (0.0008) [2023-10-10 04:39:18,731][53268] Updated weights for policy 1, policy_version 920 (0.0009) [2023-10-10 04:39:18,841][53252] Updated weights for policy 0, policy_version 910 (0.0007) [2023-10-10 04:39:19,222][53252] Updated weights for policy 0, policy_version 920 (0.0007) [2023-10-10 04:39:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12865.0). Total num frames: 1900544. Throughput: 0: 1678.4, 1: 1684.5. Samples: 483250. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 04:39:21,784][52050] Avg episode reward: [(0, '2.860'), (1, '2.640')] [2023-10-10 04:39:21,786][52846] Saving new best policy, reward=2.860! 
[2023-10-10 04:39:22,787][53268] Updated weights for policy 1, policy_version 930 (0.0009) [2023-10-10 04:39:23,179][53268] Updated weights for policy 1, policy_version 940 (0.0008) [2023-10-10 04:39:23,238][53252] Updated weights for policy 0, policy_version 930 (0.0007) [2023-10-10 04:39:23,557][53268] Updated weights for policy 1, policy_version 950 (0.0008) [2023-10-10 04:39:23,628][53252] Updated weights for policy 0, policy_version 940 (0.0007) [2023-10-10 04:39:23,920][53268] Updated weights for policy 1, policy_version 960 (0.0009) [2023-10-10 04:39:23,993][53252] Updated weights for policy 0, policy_version 950 (0.0007) [2023-10-10 04:39:24,368][53252] Updated weights for policy 0, policy_version 960 (0.0007) [2023-10-10 04:39:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12873.0). Total num frames: 1966080. Throughput: 0: 1674.8, 1: 1687.3. Samples: 503952. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-10 04:39:26,784][52050] Avg episode reward: [(0, '2.710'), (1, '2.630')] [2023-10-10 04:39:28,106][53268] Updated weights for policy 1, policy_version 970 (0.0009) [2023-10-10 04:39:28,407][53252] Updated weights for policy 0, policy_version 970 (0.0008) [2023-10-10 04:39:28,476][53268] Updated weights for policy 1, policy_version 980 (0.0009) [2023-10-10 04:39:28,768][53252] Updated weights for policy 0, policy_version 980 (0.0007) [2023-10-10 04:39:28,843][53268] Updated weights for policy 1, policy_version 990 (0.0011) [2023-10-10 04:39:29,142][53252] Updated weights for policy 0, policy_version 990 (0.0009) [2023-10-10 04:39:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 12880.4). Total num frames: 2031616. Throughput: 0: 1663.0, 1: 1666.4. Samples: 513126. 
Policy #0 lag: (min: 21.0, avg: 24.6, max: 53.0) [2023-10-10 04:39:31,784][52050] Avg episode reward: [(0, '2.670'), (1, '2.640')] [2023-10-10 04:39:32,706][53268] Updated weights for policy 1, policy_version 1000 (0.0009) [2023-10-10 04:39:33,080][53268] Updated weights for policy 1, policy_version 1010 (0.0008) [2023-10-10 04:39:33,193][53252] Updated weights for policy 0, policy_version 1000 (0.0008) [2023-10-10 04:39:33,445][53268] Updated weights for policy 1, policy_version 1020 (0.0007) [2023-10-10 04:39:33,573][53252] Updated weights for policy 0, policy_version 1010 (0.0009) [2023-10-10 04:39:33,951][53252] Updated weights for policy 0, policy_version 1020 (0.0008) [2023-10-10 04:39:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12887.4). Total num frames: 2097152. Throughput: 0: 1681.5, 1: 1692.9. Samples: 534092. Policy #0 lag: (min: 13.0, avg: 23.8, max: 45.0) [2023-10-10 04:39:36,784][52050] Avg episode reward: [(0, '2.540'), (1, '2.450')] [2023-10-10 04:39:37,618][53268] Updated weights for policy 1, policy_version 1030 (0.0010) [2023-10-10 04:39:37,989][53268] Updated weights for policy 1, policy_version 1040 (0.0008) [2023-10-10 04:39:38,096][53252] Updated weights for policy 0, policy_version 1030 (0.0008) [2023-10-10 04:39:38,361][53268] Updated weights for policy 1, policy_version 1050 (0.0009) [2023-10-10 04:39:38,459][53252] Updated weights for policy 0, policy_version 1040 (0.0007) [2023-10-10 04:39:38,830][53252] Updated weights for policy 0, policy_version 1050 (0.0008) [2023-10-10 04:39:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12893.9). Total num frames: 2162688. Throughput: 0: 1674.1, 1: 1687.9. Samples: 554502. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 04:39:41,784][52050] Avg episode reward: [(0, '2.400'), (1, '2.340')] [2023-10-10 04:39:42,450][53268] Updated weights for policy 1, policy_version 1060 (0.0010) [2023-10-10 04:39:42,815][53268] Updated weights for policy 1, policy_version 1070 (0.0008) [2023-10-10 04:39:42,844][53252] Updated weights for policy 0, policy_version 1060 (0.0008) [2023-10-10 04:39:43,185][53268] Updated weights for policy 1, policy_version 1080 (0.0010) [2023-10-10 04:39:43,225][53252] Updated weights for policy 0, policy_version 1070 (0.0008) [2023-10-10 04:39:43,604][53252] Updated weights for policy 0, policy_version 1080 (0.0009) [2023-10-10 04:39:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12900.1). Total num frames: 2228224. Throughput: 0: 1667.2, 1: 1677.0. Samples: 563518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:39:46,784][52050] Avg episode reward: [(0, '2.590'), (1, '2.360')] [2023-10-10 04:39:47,290][53268] Updated weights for policy 1, policy_version 1090 (0.0008) [2023-10-10 04:39:47,608][53252] Updated weights for policy 0, policy_version 1090 (0.0009) [2023-10-10 04:39:47,660][53268] Updated weights for policy 1, policy_version 1100 (0.0008) [2023-10-10 04:39:47,978][53252] Updated weights for policy 0, policy_version 1100 (0.0009) [2023-10-10 04:39:48,025][53268] Updated weights for policy 1, policy_version 1110 (0.0009) [2023-10-10 04:39:48,344][53252] Updated weights for policy 0, policy_version 1110 (0.0008) [2023-10-10 04:39:48,397][53268] Updated weights for policy 1, policy_version 1120 (0.0010) [2023-10-10 04:39:48,719][53252] Updated weights for policy 0, policy_version 1120 (0.0010) [2023-10-10 04:39:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12905.9). Total num frames: 2293760. Throughput: 0: 1675.0, 1: 1682.7. Samples: 584038. 
Policy #0 lag: (min: 6.0, avg: 10.2, max: 38.0) [2023-10-10 04:39:51,784][52050] Avg episode reward: [(0, '2.590'), (1, '2.390')] [2023-10-10 04:39:52,484][53268] Updated weights for policy 1, policy_version 1130 (0.0009) [2023-10-10 04:39:52,665][53252] Updated weights for policy 0, policy_version 1130 (0.0009) [2023-10-10 04:39:52,859][53268] Updated weights for policy 1, policy_version 1140 (0.0009) [2023-10-10 04:39:53,032][53252] Updated weights for policy 0, policy_version 1140 (0.0010) [2023-10-10 04:39:53,223][53268] Updated weights for policy 1, policy_version 1150 (0.0007) [2023-10-10 04:39:53,397][53252] Updated weights for policy 0, policy_version 1150 (0.0007) [2023-10-10 04:39:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12911.4). Total num frames: 2359296. Throughput: 0: 1676.2, 1: 1688.1. Samples: 604932. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:39:56,784][52050] Avg episode reward: [(0, '2.540'), (1, '2.410')] [2023-10-10 04:39:57,356][53268] Updated weights for policy 1, policy_version 1160 (0.0008) [2023-10-10 04:39:57,495][53252] Updated weights for policy 0, policy_version 1160 (0.0007) [2023-10-10 04:39:57,716][53268] Updated weights for policy 1, policy_version 1170 (0.0009) [2023-10-10 04:39:57,862][53252] Updated weights for policy 0, policy_version 1170 (0.0009) [2023-10-10 04:39:58,085][53268] Updated weights for policy 1, policy_version 1180 (0.0008) [2023-10-10 04:39:58,230][53252] Updated weights for policy 0, policy_version 1180 (0.0011) [2023-10-10 04:40:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12916.6). Total num frames: 2424832. Throughput: 0: 1672.6, 1: 1685.1. Samples: 614046. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-10 04:40:01,784][52050] Avg episode reward: [(0, '2.550'), (1, '2.360')] [2023-10-10 04:40:02,104][53268] Updated weights for policy 1, policy_version 1190 (0.0010) [2023-10-10 04:40:02,485][53268] Updated weights for policy 1, policy_version 1200 (0.0009) [2023-10-10 04:40:02,513][53252] Updated weights for policy 0, policy_version 1190 (0.0009) [2023-10-10 04:40:02,844][53268] Updated weights for policy 1, policy_version 1210 (0.0007) [2023-10-10 04:40:02,891][53252] Updated weights for policy 0, policy_version 1200 (0.0008) [2023-10-10 04:40:03,260][53252] Updated weights for policy 0, policy_version 1210 (0.0010) [2023-10-10 04:40:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 12921.6). Total num frames: 2490368. Throughput: 0: 1676.5, 1: 1684.5. Samples: 634494. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-10 04:40:06,784][52050] Avg episode reward: [(0, '2.750'), (1, '2.170')] [2023-10-10 04:40:06,898][53268] Updated weights for policy 1, policy_version 1220 (0.0009) [2023-10-10 04:40:07,260][53268] Updated weights for policy 1, policy_version 1230 (0.0008) [2023-10-10 04:40:07,548][53252] Updated weights for policy 0, policy_version 1220 (0.0008) [2023-10-10 04:40:07,626][53268] Updated weights for policy 1, policy_version 1240 (0.0007) [2023-10-10 04:40:07,919][53252] Updated weights for policy 0, policy_version 1230 (0.0008) [2023-10-10 04:40:08,286][53252] Updated weights for policy 0, policy_version 1240 (0.0007) [2023-10-10 04:40:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12926.3). Total num frames: 2555904. Throughput: 0: 1680.4, 1: 1681.3. Samples: 655230. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:40:11,784][52050] Avg episode reward: [(0, '2.750'), (1, '2.250')] [2023-10-10 04:40:11,833][53268] Updated weights for policy 1, policy_version 1250 (0.0009) [2023-10-10 04:40:12,199][53268] Updated weights for policy 1, policy_version 1260 (0.0009) [2023-10-10 04:40:12,383][53252] Updated weights for policy 0, policy_version 1250 (0.0007) [2023-10-10 04:40:12,571][53268] Updated weights for policy 1, policy_version 1270 (0.0009) [2023-10-10 04:40:12,784][53252] Updated weights for policy 0, policy_version 1260 (0.0009) [2023-10-10 04:40:12,941][53268] Updated weights for policy 1, policy_version 1280 (0.0009) [2023-10-10 04:40:13,159][53252] Updated weights for policy 0, policy_version 1270 (0.0007) [2023-10-10 04:40:13,526][53252] Updated weights for policy 0, policy_version 1280 (0.0008) [2023-10-10 04:40:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12930.7). Total num frames: 2621440. Throughput: 0: 1674.2, 1: 1679.5. Samples: 664042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:40:16,784][52050] Avg episode reward: [(0, '2.430'), (1, '2.400')] [2023-10-10 04:40:16,922][53268] Updated weights for policy 1, policy_version 1290 (0.0010) [2023-10-10 04:40:17,283][53268] Updated weights for policy 1, policy_version 1300 (0.0007) [2023-10-10 04:40:17,633][53252] Updated weights for policy 0, policy_version 1290 (0.0008) [2023-10-10 04:40:17,653][53268] Updated weights for policy 1, policy_version 1310 (0.0007) [2023-10-10 04:40:18,000][53252] Updated weights for policy 0, policy_version 1300 (0.0009) [2023-10-10 04:40:18,379][53252] Updated weights for policy 0, policy_version 1310 (0.0008) [2023-10-10 04:40:21,737][53268] Updated weights for policy 1, policy_version 1320 (0.0009) [2023-10-10 04:40:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12935.0). Total num frames: 2686976. Throughput: 0: 1665.5, 1: 1677.5. Samples: 684526. 
Policy #0 lag: (min: 26.0, avg: 36.1, max: 58.0) [2023-10-10 04:40:21,784][52050] Avg episode reward: [(0, '2.390'), (1, '2.630')] [2023-10-10 04:40:22,099][53268] Updated weights for policy 1, policy_version 1330 (0.0007) [2023-10-10 04:40:22,473][53268] Updated weights for policy 1, policy_version 1340 (0.0007) [2023-10-10 04:40:22,512][53252] Updated weights for policy 0, policy_version 1320 (0.0007) [2023-10-10 04:40:22,877][53252] Updated weights for policy 0, policy_version 1330 (0.0008) [2023-10-10 04:40:23,250][53252] Updated weights for policy 0, policy_version 1340 (0.0007) [2023-10-10 04:40:26,504][53268] Updated weights for policy 1, policy_version 1350 (0.0009) [2023-10-10 04:40:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12939.0). Total num frames: 2752512. Throughput: 0: 1665.9, 1: 1682.9. Samples: 705200. Policy #0 lag: (min: 28.0, avg: 30.4, max: 60.0) [2023-10-10 04:40:26,785][52050] Avg episode reward: [(0, '2.660'), (1, '2.620')] [2023-10-10 04:40:26,875][53268] Updated weights for policy 1, policy_version 1360 (0.0008) [2023-10-10 04:40:27,251][53268] Updated weights for policy 1, policy_version 1370 (0.0007) [2023-10-10 04:40:27,275][53252] Updated weights for policy 0, policy_version 1350 (0.0007) [2023-10-10 04:40:27,650][53252] Updated weights for policy 0, policy_version 1360 (0.0007) [2023-10-10 04:40:28,020][53252] Updated weights for policy 0, policy_version 1370 (0.0007) [2023-10-10 04:40:31,343][53268] Updated weights for policy 1, policy_version 1380 (0.0007) [2023-10-10 04:40:31,705][53268] Updated weights for policy 1, policy_version 1390 (0.0008) [2023-10-10 04:40:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12942.9). Total num frames: 2818048. Throughput: 0: 1668.4, 1: 1684.2. Samples: 714384. 
Policy #0 lag: (min: 10.0, avg: 17.8, max: 42.0) [2023-10-10 04:40:31,784][52050] Avg episode reward: [(0, '2.590'), (1, '2.540')] [2023-10-10 04:40:32,051][53252] Updated weights for policy 0, policy_version 1380 (0.0007) [2023-10-10 04:40:32,083][53268] Updated weights for policy 1, policy_version 1400 (0.0008) [2023-10-10 04:40:32,408][53252] Updated weights for policy 0, policy_version 1390 (0.0007) [2023-10-10 04:40:32,784][53252] Updated weights for policy 0, policy_version 1400 (0.0008) [2023-10-10 04:40:36,165][53268] Updated weights for policy 1, policy_version 1410 (0.0009) [2023-10-10 04:40:36,533][53268] Updated weights for policy 1, policy_version 1420 (0.0008) [2023-10-10 04:40:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 12946.6). Total num frames: 2883584. Throughput: 0: 1669.4, 1: 1683.9. Samples: 734936. Policy #0 lag: (min: 10.0, avg: 18.1, max: 42.0) [2023-10-10 04:40:36,784][52050] Avg episode reward: [(0, '2.570'), (1, '2.550')] [2023-10-10 04:40:36,832][53252] Updated weights for policy 0, policy_version 1410 (0.0009) [2023-10-10 04:40:36,909][53268] Updated weights for policy 1, policy_version 1430 (0.0008) [2023-10-10 04:40:37,193][53252] Updated weights for policy 0, policy_version 1420 (0.0010) [2023-10-10 04:40:37,275][53268] Updated weights for policy 1, policy_version 1440 (0.0008) [2023-10-10 04:40:37,583][53252] Updated weights for policy 0, policy_version 1430 (0.0008) [2023-10-10 04:40:37,950][53252] Updated weights for policy 0, policy_version 1440 (0.0010) [2023-10-10 04:40:41,297][53268] Updated weights for policy 1, policy_version 1450 (0.0008) [2023-10-10 04:40:41,664][53268] Updated weights for policy 1, policy_version 1460 (0.0008) [2023-10-10 04:40:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 12950.1). Total num frames: 2949120. Throughput: 0: 1667.3, 1: 1671.9. Samples: 755198. 
Policy #0 lag: (min: 15.0, avg: 20.7, max: 47.0) [2023-10-10 04:40:41,784][52050] Avg episode reward: [(0, '2.610'), (1, '2.760')] [2023-10-10 04:40:42,029][53268] Updated weights for policy 1, policy_version 1470 (0.0007) [2023-10-10 04:40:42,100][53252] Updated weights for policy 0, policy_version 1450 (0.0008) [2023-10-10 04:40:42,103][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000001472_1507328.pth... [2023-10-10 04:40:42,135][53061] Saving new best policy, reward=2.760! [2023-10-10 04:40:42,464][53252] Updated weights for policy 0, policy_version 1460 (0.0007) [2023-10-10 04:40:42,835][53252] Updated weights for policy 0, policy_version 1470 (0.0009) [2023-10-10 04:40:42,914][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000001472_1507328.pth... [2023-10-10 04:40:46,130][53268] Updated weights for policy 1, policy_version 1480 (0.0010) [2023-10-10 04:40:46,496][53268] Updated weights for policy 1, policy_version 1490 (0.0010) [2023-10-10 04:40:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12953.5). Total num frames: 3014656. Throughput: 0: 1664.3, 1: 1680.1. Samples: 764542. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:40:46,784][52050] Avg episode reward: [(0, '2.430'), (1, '2.940')] [2023-10-10 04:40:46,862][53252] Updated weights for policy 0, policy_version 1480 (0.0009) [2023-10-10 04:40:46,868][53268] Updated weights for policy 1, policy_version 1500 (0.0008) [2023-10-10 04:40:47,010][53061] Saving new best policy, reward=2.940! 
[2023-10-10 04:40:47,226][53252] Updated weights for policy 0, policy_version 1490 (0.0008) [2023-10-10 04:40:47,602][53252] Updated weights for policy 0, policy_version 1500 (0.0009) [2023-10-10 04:40:50,960][53268] Updated weights for policy 1, policy_version 1510 (0.0008) [2023-10-10 04:40:51,322][53268] Updated weights for policy 1, policy_version 1520 (0.0008) [2023-10-10 04:40:51,689][53268] Updated weights for policy 1, policy_version 1530 (0.0008) [2023-10-10 04:40:51,723][53252] Updated weights for policy 0, policy_version 1510 (0.0010) [2023-10-10 04:40:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 12956.7). Total num frames: 3080192. Throughput: 0: 1669.2, 1: 1681.5. Samples: 785278. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:40:51,784][52050] Avg episode reward: [(0, '2.480'), (1, '2.970')] [2023-10-10 04:40:51,902][53061] Saving new best policy, reward=2.970! [2023-10-10 04:40:52,092][53252] Updated weights for policy 0, policy_version 1520 (0.0008) [2023-10-10 04:40:52,476][53252] Updated weights for policy 0, policy_version 1530 (0.0010) [2023-10-10 04:40:55,610][53268] Updated weights for policy 1, policy_version 1540 (0.0008) [2023-10-10 04:40:55,995][53268] Updated weights for policy 1, policy_version 1550 (0.0008) [2023-10-10 04:40:56,361][53268] Updated weights for policy 1, policy_version 1560 (0.0008) [2023-10-10 04:40:56,501][53252] Updated weights for policy 0, policy_version 1540 (0.0008) [2023-10-10 04:40:56,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13094.8). Total num frames: 3178496. Throughput: 0: 1664.4, 1: 1670.4. Samples: 805294. 
Policy #0 lag: (min: 5.0, avg: 13.0, max: 37.0) [2023-10-10 04:40:56,784][52050] Avg episode reward: [(0, '2.960'), (1, '2.970')] [2023-10-10 04:40:56,884][53252] Updated weights for policy 0, policy_version 1550 (0.0007) [2023-10-10 04:40:57,261][53252] Updated weights for policy 0, policy_version 1560 (0.0008) [2023-10-10 04:40:57,562][52846] Saving new best policy, reward=2.960! [2023-10-10 04:41:00,481][53268] Updated weights for policy 1, policy_version 1570 (0.0008) [2023-10-10 04:41:00,908][53268] Updated weights for policy 1, policy_version 1580 (0.0008) [2023-10-10 04:41:01,271][53268] Updated weights for policy 1, policy_version 1590 (0.0007) [2023-10-10 04:41:01,404][53252] Updated weights for policy 0, policy_version 1570 (0.0007) [2023-10-10 04:41:01,638][53268] Updated weights for policy 1, policy_version 1600 (0.0007) [2023-10-10 04:41:01,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13095.1). Total num frames: 3244032. Throughput: 0: 1668.5, 1: 1687.6. Samples: 815066. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-10 04:41:01,784][52050] Avg episode reward: [(0, '2.850'), (1, '2.770')] [2023-10-10 04:41:01,821][53252] Updated weights for policy 0, policy_version 1580 (0.0010) [2023-10-10 04:41:02,193][53252] Updated weights for policy 0, policy_version 1590 (0.0008) [2023-10-10 04:41:02,570][53252] Updated weights for policy 0, policy_version 1600 (0.0010) [2023-10-10 04:41:05,841][53268] Updated weights for policy 1, policy_version 1610 (0.0008) [2023-10-10 04:41:06,209][53268] Updated weights for policy 1, policy_version 1620 (0.0009) [2023-10-10 04:41:06,554][53252] Updated weights for policy 0, policy_version 1610 (0.0008) [2023-10-10 04:41:06,580][53268] Updated weights for policy 1, policy_version 1630 (0.0009) [2023-10-10 04:41:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13095.3). Total num frames: 3309568. Throughput: 0: 1677.9, 1: 1677.1. Samples: 835504. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 04:41:06,784][52050] Avg episode reward: [(0, '3.100'), (1, '2.770')] [2023-10-10 04:41:06,935][53252] Updated weights for policy 0, policy_version 1620 (0.0007) [2023-10-10 04:41:07,313][53252] Updated weights for policy 0, policy_version 1630 (0.0008) [2023-10-10 04:41:07,377][52846] Saving new best policy, reward=3.100! [2023-10-10 04:41:10,760][53268] Updated weights for policy 1, policy_version 1640 (0.0008) [2023-10-10 04:41:11,136][53268] Updated weights for policy 1, policy_version 1650 (0.0008) [2023-10-10 04:41:11,459][53252] Updated weights for policy 0, policy_version 1640 (0.0007) [2023-10-10 04:41:11,499][53268] Updated weights for policy 1, policy_version 1660 (0.0007) [2023-10-10 04:41:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13095.5). Total num frames: 3375104. Throughput: 0: 1672.2, 1: 1654.5. Samples: 854904. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 04:41:11,784][52050] Avg episode reward: [(0, '2.870'), (1, '2.690')] [2023-10-10 04:41:11,828][53252] Updated weights for policy 0, policy_version 1650 (0.0009) [2023-10-10 04:41:12,192][53252] Updated weights for policy 0, policy_version 1660 (0.0009) [2023-10-10 04:41:15,702][53268] Updated weights for policy 1, policy_version 1670 (0.0010) [2023-10-10 04:41:16,067][53268] Updated weights for policy 1, policy_version 1680 (0.0008) [2023-10-10 04:41:16,225][53252] Updated weights for policy 0, policy_version 1670 (0.0010) [2023-10-10 04:41:16,427][53268] Updated weights for policy 1, policy_version 1690 (0.0009) [2023-10-10 04:41:16,601][53252] Updated weights for policy 0, policy_version 1680 (0.0008) [2023-10-10 04:41:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13095.8). Total num frames: 3440640. Throughput: 0: 1676.5, 1: 1668.6. Samples: 864916. 
Policy #0 lag: (min: 4.0, avg: 8.0, max: 36.0) [2023-10-10 04:41:16,784][52050] Avg episode reward: [(0, '2.950'), (1, '2.930')] [2023-10-10 04:41:16,971][53252] Updated weights for policy 0, policy_version 1690 (0.0007) [2023-10-10 04:41:20,555][53268] Updated weights for policy 1, policy_version 1700 (0.0007) [2023-10-10 04:41:20,924][53268] Updated weights for policy 1, policy_version 1710 (0.0008) [2023-10-10 04:41:21,153][53252] Updated weights for policy 0, policy_version 1700 (0.0007) [2023-10-10 04:41:21,294][53268] Updated weights for policy 1, policy_version 1720 (0.0007) [2023-10-10 04:41:21,522][53252] Updated weights for policy 0, policy_version 1710 (0.0009) [2023-10-10 04:41:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13096.0). Total num frames: 3506176. Throughput: 0: 1676.5, 1: 1667.8. Samples: 885432. Policy #0 lag: (min: 31.0, avg: 43.2, max: 63.0) [2023-10-10 04:41:21,784][52050] Avg episode reward: [(0, '3.000'), (1, '3.040')] [2023-10-10 04:41:21,785][53061] Saving new best policy, reward=3.040! [2023-10-10 04:41:21,891][53252] Updated weights for policy 0, policy_version 1720 (0.0009) [2023-10-10 04:41:25,295][53268] Updated weights for policy 1, policy_version 1730 (0.0007) [2023-10-10 04:41:25,660][53268] Updated weights for policy 1, policy_version 1740 (0.0008) [2023-10-10 04:41:26,033][53268] Updated weights for policy 1, policy_version 1750 (0.0009) [2023-10-10 04:41:26,035][53252] Updated weights for policy 0, policy_version 1730 (0.0009) [2023-10-10 04:41:26,392][53268] Updated weights for policy 1, policy_version 1760 (0.0008) [2023-10-10 04:41:26,403][53252] Updated weights for policy 0, policy_version 1740 (0.0007) [2023-10-10 04:41:26,770][53252] Updated weights for policy 0, policy_version 1750 (0.0007) [2023-10-10 04:41:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13096.2). Total num frames: 3571712. Throughput: 0: 1666.0, 1: 1652.3. Samples: 904524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:41:26,784][52050] Avg episode reward: [(0, '2.980'), (1, '2.870')] [2023-10-10 04:41:27,147][53252] Updated weights for policy 0, policy_version 1760 (0.0008) [2023-10-10 04:41:30,394][53268] Updated weights for policy 1, policy_version 1770 (0.0010) [2023-10-10 04:41:30,768][53268] Updated weights for policy 1, policy_version 1780 (0.0010) [2023-10-10 04:41:31,133][53268] Updated weights for policy 1, policy_version 1790 (0.0008) [2023-10-10 04:41:31,149][53252] Updated weights for policy 0, policy_version 1770 (0.0007) [2023-10-10 04:41:31,520][53252] Updated weights for policy 0, policy_version 1780 (0.0009) [2023-10-10 04:41:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13096.4). Total num frames: 3637248. Throughput: 0: 1677.6, 1: 1667.1. Samples: 915054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:41:31,784][52050] Avg episode reward: [(0, '2.760'), (1, '2.780')] [2023-10-10 04:41:31,889][53252] Updated weights for policy 0, policy_version 1790 (0.0008) [2023-10-10 04:41:35,265][53268] Updated weights for policy 1, policy_version 1800 (0.0009) [2023-10-10 04:41:35,631][53268] Updated weights for policy 1, policy_version 1810 (0.0008) [2023-10-10 04:41:36,008][53268] Updated weights for policy 1, policy_version 1820 (0.0007) [2023-10-10 04:41:36,056][53252] Updated weights for policy 0, policy_version 1800 (0.0010) [2023-10-10 04:41:36,416][53252] Updated weights for policy 0, policy_version 1810 (0.0010) [2023-10-10 04:41:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13096.6). Total num frames: 3702784. Throughput: 0: 1676.1, 1: 1660.3. Samples: 935420. 
Policy #0 lag: (min: 8.0, avg: 37.1, max: 40.0) [2023-10-10 04:41:36,784][52050] Avg episode reward: [(0, '2.890'), (1, '2.800')] [2023-10-10 04:41:36,789][53252] Updated weights for policy 0, policy_version 1820 (0.0008) [2023-10-10 04:41:40,108][53268] Updated weights for policy 1, policy_version 1830 (0.0010) [2023-10-10 04:41:40,474][53268] Updated weights for policy 1, policy_version 1840 (0.0011) [2023-10-10 04:41:40,832][53252] Updated weights for policy 0, policy_version 1830 (0.0010) [2023-10-10 04:41:40,841][53268] Updated weights for policy 1, policy_version 1850 (0.0009) [2023-10-10 04:41:41,196][53252] Updated weights for policy 0, policy_version 1840 (0.0011) [2023-10-10 04:41:41,572][53252] Updated weights for policy 0, policy_version 1850 (0.0008) [2023-10-10 04:41:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13096.7). Total num frames: 3768320. Throughput: 0: 1656.5, 1: 1648.7. Samples: 954028. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) [2023-10-10 04:41:41,784][52050] Avg episode reward: [(0, '2.770'), (1, '2.700')] [2023-10-10 04:41:44,881][53268] Updated weights for policy 1, policy_version 1860 (0.0009) [2023-10-10 04:41:45,257][53268] Updated weights for policy 1, policy_version 1870 (0.0008) [2023-10-10 04:41:45,616][53268] Updated weights for policy 1, policy_version 1880 (0.0009) [2023-10-10 04:41:45,731][53252] Updated weights for policy 0, policy_version 1860 (0.0007) [2023-10-10 04:41:46,097][53252] Updated weights for policy 0, policy_version 1870 (0.0008) [2023-10-10 04:41:46,474][53252] Updated weights for policy 0, policy_version 1880 (0.0008) [2023-10-10 04:41:46,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13208.9). Total num frames: 3866624. Throughput: 0: 1676.4, 1: 1661.5. Samples: 965272. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) [2023-10-10 04:41:46,784][52050] Avg episode reward: [(0, '2.730'), (1, '2.640')] [2023-10-10 04:41:49,886][53268] Updated weights for policy 1, policy_version 1890 (0.0007) [2023-10-10 04:41:50,298][53268] Updated weights for policy 1, policy_version 1900 (0.0009) [2023-10-10 04:41:50,503][53252] Updated weights for policy 0, policy_version 1890 (0.0009) [2023-10-10 04:41:50,672][53268] Updated weights for policy 1, policy_version 1910 (0.0008) [2023-10-10 04:41:50,924][53252] Updated weights for policy 0, policy_version 1900 (0.0008) [2023-10-10 04:41:51,038][53268] Updated weights for policy 1, policy_version 1920 (0.0008) [2023-10-10 04:41:51,290][53252] Updated weights for policy 0, policy_version 1910 (0.0007) [2023-10-10 04:41:51,661][53252] Updated weights for policy 0, policy_version 1920 (0.0007) [2023-10-10 04:41:51,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 3932160. Throughput: 0: 1675.1, 1: 1655.2. Samples: 985368. Policy #0 lag: (min: 31.0, avg: 41.6, max: 63.0) [2023-10-10 04:41:51,784][52050] Avg episode reward: [(0, '2.790'), (1, '2.730')] [2023-10-10 04:41:55,051][53268] Updated weights for policy 1, policy_version 1930 (0.0008) [2023-10-10 04:41:55,419][53268] Updated weights for policy 1, policy_version 1940 (0.0008) [2023-10-10 04:41:55,788][53268] Updated weights for policy 1, policy_version 1950 (0.0008) [2023-10-10 04:41:55,805][53252] Updated weights for policy 0, policy_version 1930 (0.0009) [2023-10-10 04:41:56,173][53252] Updated weights for policy 0, policy_version 1940 (0.0008) [2023-10-10 04:41:56,553][53252] Updated weights for policy 0, policy_version 1950 (0.0008) [2023-10-10 04:41:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 3997696. Throughput: 0: 1660.2, 1: 1654.7. Samples: 1004076. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:41:56,785][52050] Avg episode reward: [(0, '3.100'), (1, '3.120')] [2023-10-10 04:41:56,795][53061] Saving new best policy, reward=3.120! [2023-10-10 04:41:59,843][53268] Updated weights for policy 1, policy_version 1960 (0.0007) [2023-10-10 04:42:00,206][53268] Updated weights for policy 1, policy_version 1970 (0.0009) [2023-10-10 04:42:00,576][53252] Updated weights for policy 0, policy_version 1960 (0.0009) [2023-10-10 04:42:00,579][53268] Updated weights for policy 1, policy_version 1980 (0.0010) [2023-10-10 04:42:00,950][53252] Updated weights for policy 0, policy_version 1970 (0.0008) [2023-10-10 04:42:01,326][53252] Updated weights for policy 0, policy_version 1980 (0.0009) [2023-10-10 04:42:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4063232. Throughput: 0: 1674.9, 1: 1667.0. Samples: 1015302. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:42:01,784][52050] Avg episode reward: [(0, '2.810'), (1, '2.990')] [2023-10-10 04:42:04,859][53268] Updated weights for policy 1, policy_version 1990 (0.0010) [2023-10-10 04:42:05,230][53268] Updated weights for policy 1, policy_version 2000 (0.0009) [2023-10-10 04:42:05,470][53252] Updated weights for policy 0, policy_version 1990 (0.0009) [2023-10-10 04:42:05,602][53268] Updated weights for policy 1, policy_version 2010 (0.0009) [2023-10-10 04:42:05,840][53252] Updated weights for policy 0, policy_version 2000 (0.0008) [2023-10-10 04:42:06,223][53252] Updated weights for policy 0, policy_version 2010 (0.0009) [2023-10-10 04:42:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4128768. Throughput: 0: 1674.0, 1: 1655.7. Samples: 1035272. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 33.0) [2023-10-10 04:42:06,785][52050] Avg episode reward: [(0, '2.480'), (1, '2.800')] [2023-10-10 04:42:09,554][53268] Updated weights for policy 1, policy_version 2020 (0.0009) [2023-10-10 04:42:09,920][53268] Updated weights for policy 1, policy_version 2030 (0.0007) [2023-10-10 04:42:10,228][53252] Updated weights for policy 0, policy_version 2020 (0.0008) [2023-10-10 04:42:10,303][53268] Updated weights for policy 1, policy_version 2040 (0.0008) [2023-10-10 04:42:10,604][53252] Updated weights for policy 0, policy_version 2030 (0.0008) [2023-10-10 04:42:10,982][53252] Updated weights for policy 0, policy_version 2040 (0.0009) [2023-10-10 04:42:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4194304. Throughput: 0: 1657.3, 1: 1666.6. Samples: 1054098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:42:11,785][52050] Avg episode reward: [(0, '3.020'), (1, '2.740')] [2023-10-10 04:42:14,466][53268] Updated weights for policy 1, policy_version 2050 (0.0009) [2023-10-10 04:42:14,821][53268] Updated weights for policy 1, policy_version 2060 (0.0008) [2023-10-10 04:42:15,144][53252] Updated weights for policy 0, policy_version 2050 (0.0009) [2023-10-10 04:42:15,192][53268] Updated weights for policy 1, policy_version 2070 (0.0007) [2023-10-10 04:42:15,513][53252] Updated weights for policy 0, policy_version 2060 (0.0009) [2023-10-10 04:42:15,560][53268] Updated weights for policy 1, policy_version 2080 (0.0009) [2023-10-10 04:42:15,883][53252] Updated weights for policy 0, policy_version 2070 (0.0009) [2023-10-10 04:42:16,251][53252] Updated weights for policy 0, policy_version 2080 (0.0009) [2023-10-10 04:42:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4259840. Throughput: 0: 1675.6, 1: 1672.5. Samples: 1065718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:42:16,784][52050] Avg episode reward: [(0, '3.110'), (1, '2.980')] [2023-10-10 04:42:16,786][52846] Saving new best policy, reward=3.110! [2023-10-10 04:42:19,623][53268] Updated weights for policy 1, policy_version 2090 (0.0007) [2023-10-10 04:42:19,992][53268] Updated weights for policy 1, policy_version 2100 (0.0007) [2023-10-10 04:42:20,329][53252] Updated weights for policy 0, policy_version 2090 (0.0008) [2023-10-10 04:42:20,360][53268] Updated weights for policy 1, policy_version 2110 (0.0007) [2023-10-10 04:42:20,701][53252] Updated weights for policy 0, policy_version 2100 (0.0010) [2023-10-10 04:42:21,083][53252] Updated weights for policy 0, policy_version 2110 (0.0010) [2023-10-10 04:42:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4325376. Throughput: 0: 1666.8, 1: 1659.4. Samples: 1085100. Policy #0 lag: (min: 26.0, avg: 28.2, max: 58.0) [2023-10-10 04:42:21,785][52050] Avg episode reward: [(0, '2.930'), (1, '3.100')] [2023-10-10 04:42:24,339][53268] Updated weights for policy 1, policy_version 2120 (0.0009) [2023-10-10 04:42:24,716][53268] Updated weights for policy 1, policy_version 2130 (0.0009) [2023-10-10 04:42:24,999][53252] Updated weights for policy 0, policy_version 2120 (0.0007) [2023-10-10 04:42:25,083][53268] Updated weights for policy 1, policy_version 2140 (0.0009) [2023-10-10 04:42:25,373][53252] Updated weights for policy 0, policy_version 2130 (0.0009) [2023-10-10 04:42:25,747][53252] Updated weights for policy 0, policy_version 2140 (0.0008) [2023-10-10 04:42:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4390912. Throughput: 0: 1668.5, 1: 1680.2. Samples: 1104722. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 04:42:26,784][52050] Avg episode reward: [(0, '3.080'), (1, '3.380')] [2023-10-10 04:42:26,797][53061] Saving new best policy, reward=3.380! 
[2023-10-10 04:42:29,189][53268] Updated weights for policy 1, policy_version 2150 (0.0010) [2023-10-10 04:42:29,552][53268] Updated weights for policy 1, policy_version 2160 (0.0010) [2023-10-10 04:42:29,746][53252] Updated weights for policy 0, policy_version 2150 (0.0008) [2023-10-10 04:42:29,921][53268] Updated weights for policy 1, policy_version 2170 (0.0009) [2023-10-10 04:42:30,113][53252] Updated weights for policy 0, policy_version 2160 (0.0008) [2023-10-10 04:42:30,479][53252] Updated weights for policy 0, policy_version 2170 (0.0009) [2023-10-10 04:42:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 4456448. Throughput: 0: 1676.1, 1: 1670.0. Samples: 1115850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:42:31,784][52050] Avg episode reward: [(0, '3.350'), (1, '3.430')] [2023-10-10 04:42:31,785][52846] Saving new best policy, reward=3.350! [2023-10-10 04:42:31,785][53061] Saving new best policy, reward=3.430! [2023-10-10 04:42:34,078][53268] Updated weights for policy 1, policy_version 2180 (0.0008) [2023-10-10 04:42:34,445][53268] Updated weights for policy 1, policy_version 2190 (0.0008) [2023-10-10 04:42:34,636][53252] Updated weights for policy 0, policy_version 2180 (0.0008) [2023-10-10 04:42:34,812][53268] Updated weights for policy 1, policy_version 2200 (0.0008) [2023-10-10 04:42:35,004][53252] Updated weights for policy 0, policy_version 2190 (0.0008) [2023-10-10 04:42:35,383][53252] Updated weights for policy 0, policy_version 2200 (0.0009) [2023-10-10 04:42:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 4521984. Throughput: 0: 1655.4, 1: 1659.2. Samples: 1134526. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:42:36,784][52050] Avg episode reward: [(0, '3.450'), (1, '2.930')] [2023-10-10 04:42:36,784][52846] Saving new best policy, reward=3.450! 
[2023-10-10 04:42:39,022][53268] Updated weights for policy 1, policy_version 2210 (0.0009) [2023-10-10 04:42:39,415][53268] Updated weights for policy 1, policy_version 2220 (0.0007) [2023-10-10 04:42:39,581][53252] Updated weights for policy 0, policy_version 2210 (0.0009) [2023-10-10 04:42:39,791][53268] Updated weights for policy 1, policy_version 2230 (0.0008) [2023-10-10 04:42:39,986][53252] Updated weights for policy 0, policy_version 2220 (0.0008) [2023-10-10 04:42:40,150][53268] Updated weights for policy 1, policy_version 2240 (0.0008) [2023-10-10 04:42:40,356][53252] Updated weights for policy 0, policy_version 2230 (0.0009) [2023-10-10 04:42:40,724][53252] Updated weights for policy 0, policy_version 2240 (0.0010) [2023-10-10 04:42:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 4587520. Throughput: 0: 1665.5, 1: 1673.7. Samples: 1154340. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) [2023-10-10 04:42:41,784][52050] Avg episode reward: [(0, '2.960'), (1, '3.000')] [2023-10-10 04:42:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth... [2023-10-10 04:42:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000002240_2293760.pth... 
[2023-10-10 04:42:41,828][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000000672_688128.pth [2023-10-10 04:42:41,835][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000000672_688128.pth [2023-10-10 04:42:44,419][53268] Updated weights for policy 1, policy_version 2250 (0.0008) [2023-10-10 04:42:44,768][53252] Updated weights for policy 0, policy_version 2250 (0.0007) [2023-10-10 04:42:44,795][53268] Updated weights for policy 1, policy_version 2260 (0.0008) [2023-10-10 04:42:45,136][53252] Updated weights for policy 0, policy_version 2260 (0.0007) [2023-10-10 04:42:45,171][53268] Updated weights for policy 1, policy_version 2270 (0.0009) [2023-10-10 04:42:45,499][53252] Updated weights for policy 0, policy_version 2270 (0.0007) [2023-10-10 04:42:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 4653056. Throughput: 0: 1672.7, 1: 1668.3. Samples: 1165646. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:42:46,785][52050] Avg episode reward: [(0, '2.980'), (1, '3.290')] [2023-10-10 04:42:49,331][53268] Updated weights for policy 1, policy_version 2280 (0.0009) [2023-10-10 04:42:49,645][53252] Updated weights for policy 0, policy_version 2280 (0.0009) [2023-10-10 04:42:49,708][53268] Updated weights for policy 1, policy_version 2290 (0.0008) [2023-10-10 04:42:50,014][53252] Updated weights for policy 0, policy_version 2290 (0.0009) [2023-10-10 04:42:50,076][53268] Updated weights for policy 1, policy_version 2300 (0.0007) [2023-10-10 04:42:50,384][53252] Updated weights for policy 0, policy_version 2300 (0.0008) [2023-10-10 04:42:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4718592. Throughput: 0: 1647.9, 1: 1659.6. Samples: 1184108. 
Policy #0 lag: (min: 9.0, avg: 14.5, max: 41.0) [2023-10-10 04:42:51,784][52050] Avg episode reward: [(0, '2.910'), (1, '3.640')] [2023-10-10 04:42:51,785][53061] Saving new best policy, reward=3.640! [2023-10-10 04:42:54,168][53268] Updated weights for policy 1, policy_version 2310 (0.0010) [2023-10-10 04:42:54,536][53268] Updated weights for policy 1, policy_version 2320 (0.0010) [2023-10-10 04:42:54,642][53252] Updated weights for policy 0, policy_version 2310 (0.0008) [2023-10-10 04:42:54,901][53268] Updated weights for policy 1, policy_version 2330 (0.0007) [2023-10-10 04:42:55,014][53252] Updated weights for policy 0, policy_version 2320 (0.0008) [2023-10-10 04:42:55,385][53252] Updated weights for policy 0, policy_version 2330 (0.0010) [2023-10-10 04:42:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 4784128. Throughput: 0: 1669.6, 1: 1665.6. Samples: 1204182. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-10 04:42:56,784][52050] Avg episode reward: [(0, '3.030'), (1, '3.300')] [2023-10-10 04:42:59,037][53268] Updated weights for policy 1, policy_version 2340 (0.0009) [2023-10-10 04:42:59,403][53268] Updated weights for policy 1, policy_version 2350 (0.0010) [2023-10-10 04:42:59,455][53252] Updated weights for policy 0, policy_version 2340 (0.0009) [2023-10-10 04:42:59,768][53268] Updated weights for policy 1, policy_version 2360 (0.0009) [2023-10-10 04:42:59,817][53252] Updated weights for policy 0, policy_version 2350 (0.0008) [2023-10-10 04:43:00,183][53252] Updated weights for policy 0, policy_version 2360 (0.0008) [2023-10-10 04:43:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 4849664. Throughput: 0: 1670.1, 1: 1657.9. Samples: 1215478. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 04:43:01,785][52050] Avg episode reward: [(0, '3.410'), (1, '2.980')] [2023-10-10 04:43:03,955][53268] Updated weights for policy 1, policy_version 2370 (0.0008) [2023-10-10 04:43:04,328][53268] Updated weights for policy 1, policy_version 2380 (0.0008) [2023-10-10 04:43:04,360][53252] Updated weights for policy 0, policy_version 2370 (0.0011) [2023-10-10 04:43:04,696][53268] Updated weights for policy 1, policy_version 2390 (0.0007) [2023-10-10 04:43:04,737][53252] Updated weights for policy 0, policy_version 2380 (0.0007) [2023-10-10 04:43:05,061][53268] Updated weights for policy 1, policy_version 2400 (0.0008) [2023-10-10 04:43:05,105][53252] Updated weights for policy 0, policy_version 2390 (0.0007) [2023-10-10 04:43:05,489][53252] Updated weights for policy 0, policy_version 2400 (0.0008) [2023-10-10 04:43:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 4915200. Throughput: 0: 1656.8, 1: 1647.6. Samples: 1233796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:43:06,784][52050] Avg episode reward: [(0, '3.640'), (1, '3.280')] [2023-10-10 04:43:06,785][52846] Saving new best policy, reward=3.640! [2023-10-10 04:43:09,087][53268] Updated weights for policy 1, policy_version 2410 (0.0011) [2023-10-10 04:43:09,461][53268] Updated weights for policy 1, policy_version 2420 (0.0009) [2023-10-10 04:43:09,555][53252] Updated weights for policy 0, policy_version 2410 (0.0008) [2023-10-10 04:43:09,825][53268] Updated weights for policy 1, policy_version 2430 (0.0008) [2023-10-10 04:43:09,925][53252] Updated weights for policy 0, policy_version 2420 (0.0008) [2023-10-10 04:43:10,302][53252] Updated weights for policy 0, policy_version 2430 (0.0007) [2023-10-10 04:43:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4980736. Throughput: 0: 1671.7, 1: 1652.9. Samples: 1254330. 
Policy #0 lag: (min: 4.0, avg: 5.2, max: 28.0) [2023-10-10 04:43:11,785][52050] Avg episode reward: [(0, '3.460'), (1, '3.800')] [2023-10-10 04:43:11,797][53061] Saving new best policy, reward=3.800! [2023-10-10 04:43:14,011][53268] Updated weights for policy 1, policy_version 2440 (0.0008) [2023-10-10 04:43:14,086][53252] Updated weights for policy 0, policy_version 2440 (0.0009) [2023-10-10 04:43:14,389][53268] Updated weights for policy 1, policy_version 2450 (0.0009) [2023-10-10 04:43:14,460][53252] Updated weights for policy 0, policy_version 2450 (0.0009) [2023-10-10 04:43:14,753][53268] Updated weights for policy 1, policy_version 2460 (0.0008) [2023-10-10 04:43:14,819][53252] Updated weights for policy 0, policy_version 2460 (0.0008) [2023-10-10 04:43:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 5046272. Throughput: 0: 1657.0, 1: 1652.8. Samples: 1264794. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 04:43:16,784][52050] Avg episode reward: [(0, '3.590'), (1, '3.470')] [2023-10-10 04:43:18,870][53268] Updated weights for policy 1, policy_version 2470 (0.0009) [2023-10-10 04:43:18,949][53252] Updated weights for policy 0, policy_version 2470 (0.0007) [2023-10-10 04:43:19,238][53268] Updated weights for policy 1, policy_version 2480 (0.0011) [2023-10-10 04:43:19,318][53252] Updated weights for policy 0, policy_version 2480 (0.0008) [2023-10-10 04:43:19,618][53268] Updated weights for policy 1, policy_version 2490 (0.0009) [2023-10-10 04:43:19,694][53252] Updated weights for policy 0, policy_version 2490 (0.0009) [2023-10-10 04:43:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5111808. Throughput: 0: 1660.4, 1: 1655.4. Samples: 1283740. 
Policy #0 lag: (min: 21.0, avg: 26.6, max: 53.0) [2023-10-10 04:43:21,784][52050] Avg episode reward: [(0, '3.540'), (1, '3.320')] [2023-10-10 04:43:23,760][53252] Updated weights for policy 0, policy_version 2500 (0.0007) [2023-10-10 04:43:23,863][53268] Updated weights for policy 1, policy_version 2500 (0.0009) [2023-10-10 04:43:24,131][53252] Updated weights for policy 0, policy_version 2510 (0.0007) [2023-10-10 04:43:24,255][53268] Updated weights for policy 1, policy_version 2510 (0.0007) [2023-10-10 04:43:24,508][53252] Updated weights for policy 0, policy_version 2520 (0.0008) [2023-10-10 04:43:24,630][53268] Updated weights for policy 1, policy_version 2520 (0.0007) [2023-10-10 04:43:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5177344. Throughput: 0: 1672.1, 1: 1655.4. Samples: 1304078. Policy #0 lag: (min: 18.0, avg: 30.6, max: 50.0) [2023-10-10 04:43:26,784][52050] Avg episode reward: [(0, '3.520'), (1, '3.400')] [2023-10-10 04:43:28,588][53268] Updated weights for policy 1, policy_version 2530 (0.0008) [2023-10-10 04:43:28,738][53252] Updated weights for policy 0, policy_version 2530 (0.0008) [2023-10-10 04:43:28,955][53268] Updated weights for policy 1, policy_version 2540 (0.0007) [2023-10-10 04:43:29,149][53252] Updated weights for policy 0, policy_version 2540 (0.0009) [2023-10-10 04:43:29,312][53268] Updated weights for policy 1, policy_version 2550 (0.0007) [2023-10-10 04:43:29,527][53252] Updated weights for policy 0, policy_version 2550 (0.0008) [2023-10-10 04:43:29,684][53268] Updated weights for policy 1, policy_version 2560 (0.0008) [2023-10-10 04:43:29,892][53252] Updated weights for policy 0, policy_version 2560 (0.0009) [2023-10-10 04:43:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 5242880. Throughput: 0: 1652.8, 1: 1648.4. Samples: 1314200. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:43:31,784][52050] Avg episode reward: [(0, '3.480'), (1, '3.210')] [2023-10-10 04:43:33,648][53268] Updated weights for policy 1, policy_version 2570 (0.0008) [2023-10-10 04:43:33,826][53252] Updated weights for policy 0, policy_version 2570 (0.0007) [2023-10-10 04:43:34,003][53268] Updated weights for policy 1, policy_version 2580 (0.0010) [2023-10-10 04:43:34,193][53252] Updated weights for policy 0, policy_version 2580 (0.0008) [2023-10-10 04:43:34,380][53268] Updated weights for policy 1, policy_version 2590 (0.0009) [2023-10-10 04:43:34,562][53252] Updated weights for policy 0, policy_version 2590 (0.0007) [2023-10-10 04:43:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 5308416. Throughput: 0: 1670.2, 1: 1657.2. Samples: 1333844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:43:36,784][52050] Avg episode reward: [(0, '3.880'), (1, '3.070')] [2023-10-10 04:43:36,786][52846] Saving new best policy, reward=3.880! [2023-10-10 04:43:38,511][53268] Updated weights for policy 1, policy_version 2600 (0.0010) [2023-10-10 04:43:38,618][53252] Updated weights for policy 0, policy_version 2600 (0.0009) [2023-10-10 04:43:38,881][53268] Updated weights for policy 1, policy_version 2610 (0.0009) [2023-10-10 04:43:38,992][53252] Updated weights for policy 0, policy_version 2610 (0.0009) [2023-10-10 04:43:39,242][53268] Updated weights for policy 1, policy_version 2620 (0.0009) [2023-10-10 04:43:39,355][53252] Updated weights for policy 0, policy_version 2620 (0.0009) [2023-10-10 04:43:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5373952. Throughput: 0: 1673.7, 1: 1660.4. Samples: 1354216. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:43:41,785][52050] Avg episode reward: [(0, '3.790'), (1, '3.630')] [2023-10-10 04:43:43,371][53268] Updated weights for policy 1, policy_version 2630 (0.0008) [2023-10-10 04:43:43,478][53252] Updated weights for policy 0, policy_version 2630 (0.0008) [2023-10-10 04:43:43,740][53268] Updated weights for policy 1, policy_version 2640 (0.0008) [2023-10-10 04:43:43,856][53252] Updated weights for policy 0, policy_version 2640 (0.0007) [2023-10-10 04:43:44,112][53268] Updated weights for policy 1, policy_version 2650 (0.0010) [2023-10-10 04:43:44,233][53252] Updated weights for policy 0, policy_version 2650 (0.0007) [2023-10-10 04:43:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 5439488. Throughput: 0: 1647.2, 1: 1646.7. Samples: 1363700. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-10 04:43:46,784][52050] Avg episode reward: [(0, '3.770'), (1, '3.960')] [2023-10-10 04:43:46,786][53061] Saving new best policy, reward=3.960! [2023-10-10 04:43:48,276][53268] Updated weights for policy 1, policy_version 2660 (0.0009) [2023-10-10 04:43:48,409][53252] Updated weights for policy 0, policy_version 2660 (0.0008) [2023-10-10 04:43:48,642][53268] Updated weights for policy 1, policy_version 2670 (0.0008) [2023-10-10 04:43:48,789][53252] Updated weights for policy 0, policy_version 2670 (0.0009) [2023-10-10 04:43:49,010][53268] Updated weights for policy 1, policy_version 2680 (0.0009) [2023-10-10 04:43:49,156][53252] Updated weights for policy 0, policy_version 2680 (0.0007) [2023-10-10 04:43:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5505024. Throughput: 0: 1665.5, 1: 1667.5. Samples: 1383786. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:43:51,784][52050] Avg episode reward: [(0, '3.370'), (1, '3.960')] [2023-10-10 04:43:53,213][53268] Updated weights for policy 1, policy_version 2690 (0.0007) [2023-10-10 04:43:53,250][53252] Updated weights for policy 0, policy_version 2690 (0.0009) [2023-10-10 04:43:53,572][53268] Updated weights for policy 1, policy_version 2700 (0.0008) [2023-10-10 04:43:53,623][53252] Updated weights for policy 0, policy_version 2700 (0.0009) [2023-10-10 04:43:53,942][53268] Updated weights for policy 1, policy_version 2710 (0.0007) [2023-10-10 04:43:53,995][53252] Updated weights for policy 0, policy_version 2710 (0.0009) [2023-10-10 04:43:54,311][53268] Updated weights for policy 1, policy_version 2720 (0.0008) [2023-10-10 04:43:54,359][53252] Updated weights for policy 0, policy_version 2720 (0.0010) [2023-10-10 04:43:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 5570560. Throughput: 0: 1669.7, 1: 1665.0. Samples: 1404394. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 04:43:56,784][52050] Avg episode reward: [(0, '3.260'), (1, '4.090')] [2023-10-10 04:43:56,793][53061] Saving new best policy, reward=4.090! [2023-10-10 04:43:58,348][53252] Updated weights for policy 0, policy_version 2730 (0.0009) [2023-10-10 04:43:58,358][53268] Updated weights for policy 1, policy_version 2730 (0.0009) [2023-10-10 04:43:58,719][53252] Updated weights for policy 0, policy_version 2740 (0.0007) [2023-10-10 04:43:58,730][53268] Updated weights for policy 1, policy_version 2740 (0.0009) [2023-10-10 04:43:59,086][53252] Updated weights for policy 0, policy_version 2750 (0.0007) [2023-10-10 04:43:59,094][53268] Updated weights for policy 1, policy_version 2750 (0.0007) [2023-10-10 04:44:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5636096. Throughput: 0: 1654.9, 1: 1648.2. Samples: 1413434. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 04:44:01,784][52050] Avg episode reward: [(0, '3.760'), (1, '3.620')] [2023-10-10 04:44:03,137][53268] Updated weights for policy 1, policy_version 2760 (0.0009) [2023-10-10 04:44:03,148][53252] Updated weights for policy 0, policy_version 2760 (0.0008) [2023-10-10 04:44:03,500][53268] Updated weights for policy 1, policy_version 2770 (0.0007) [2023-10-10 04:44:03,513][53252] Updated weights for policy 0, policy_version 2770 (0.0009) [2023-10-10 04:44:03,879][53268] Updated weights for policy 1, policy_version 2780 (0.0009) [2023-10-10 04:44:03,889][53252] Updated weights for policy 0, policy_version 2780 (0.0007) [2023-10-10 04:44:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5701632. Throughput: 0: 1671.6, 1: 1668.8. Samples: 1434056. Policy #0 lag: (min: 25.0, avg: 34.3, max: 57.0) [2023-10-10 04:44:06,784][52050] Avg episode reward: [(0, '3.950'), (1, '3.610')] [2023-10-10 04:44:06,784][52846] Saving new best policy, reward=3.950! [2023-10-10 04:44:07,996][53268] Updated weights for policy 1, policy_version 2790 (0.0008) [2023-10-10 04:44:08,113][53252] Updated weights for policy 0, policy_version 2790 (0.0009) [2023-10-10 04:44:08,365][53268] Updated weights for policy 1, policy_version 2800 (0.0007) [2023-10-10 04:44:08,469][53252] Updated weights for policy 0, policy_version 2800 (0.0009) [2023-10-10 04:44:08,738][53268] Updated weights for policy 1, policy_version 2810 (0.0008) [2023-10-10 04:44:08,847][53252] Updated weights for policy 0, policy_version 2810 (0.0008) [2023-10-10 04:44:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 5767168. Throughput: 0: 1671.4, 1: 1675.8. Samples: 1454704. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-10 04:44:11,784][52050] Avg episode reward: [(0, '3.880'), (1, '3.180')] [2023-10-10 04:44:12,970][53268] Updated weights for policy 1, policy_version 2820 (0.0010) [2023-10-10 04:44:13,034][53252] Updated weights for policy 0, policy_version 2820 (0.0009) [2023-10-10 04:44:13,362][53268] Updated weights for policy 1, policy_version 2830 (0.0009) [2023-10-10 04:44:13,403][53252] Updated weights for policy 0, policy_version 2830 (0.0007) [2023-10-10 04:44:13,743][53268] Updated weights for policy 1, policy_version 2840 (0.0010) [2023-10-10 04:44:13,781][53252] Updated weights for policy 0, policy_version 2840 (0.0008) [2023-10-10 04:44:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5832704. Throughput: 0: 1659.3, 1: 1655.4. Samples: 1463362. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 04:44:16,784][52050] Avg episode reward: [(0, '3.280'), (1, '3.150')] [2023-10-10 04:44:17,724][53268] Updated weights for policy 1, policy_version 2850 (0.0008) [2023-10-10 04:44:17,810][53252] Updated weights for policy 0, policy_version 2850 (0.0007) [2023-10-10 04:44:18,092][53268] Updated weights for policy 1, policy_version 2860 (0.0008) [2023-10-10 04:44:18,173][53252] Updated weights for policy 0, policy_version 2860 (0.0008) [2023-10-10 04:44:18,458][53268] Updated weights for policy 1, policy_version 2870 (0.0009) [2023-10-10 04:44:18,552][53252] Updated weights for policy 0, policy_version 2870 (0.0009) [2023-10-10 04:44:18,830][53268] Updated weights for policy 1, policy_version 2880 (0.0008) [2023-10-10 04:44:18,912][53252] Updated weights for policy 0, policy_version 2880 (0.0007) [2023-10-10 04:44:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5898240. Throughput: 0: 1669.0, 1: 1672.6. Samples: 1484218. 
Policy #0 lag: (min: 21.0, avg: 45.8, max: 48.0) [2023-10-10 04:44:21,784][52050] Avg episode reward: [(0, '3.480'), (1, '3.150')] [2023-10-10 04:44:22,817][53268] Updated weights for policy 1, policy_version 2890 (0.0009) [2023-10-10 04:44:23,038][53252] Updated weights for policy 0, policy_version 2890 (0.0007) [2023-10-10 04:44:23,180][53268] Updated weights for policy 1, policy_version 2900 (0.0008) [2023-10-10 04:44:23,410][53252] Updated weights for policy 0, policy_version 2900 (0.0007) [2023-10-10 04:44:23,552][53268] Updated weights for policy 1, policy_version 2910 (0.0009) [2023-10-10 04:44:23,784][53252] Updated weights for policy 0, policy_version 2910 (0.0008) [2023-10-10 04:44:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 5963776. Throughput: 0: 1670.0, 1: 1672.3. Samples: 1504616. Policy #0 lag: (min: 15.0, avg: 22.7, max: 47.0) [2023-10-10 04:44:26,784][52050] Avg episode reward: [(0, '3.710'), (1, '3.220')] [2023-10-10 04:44:27,753][53268] Updated weights for policy 1, policy_version 2920 (0.0010) [2023-10-10 04:44:27,997][53252] Updated weights for policy 0, policy_version 2920 (0.0007) [2023-10-10 04:44:28,121][53268] Updated weights for policy 1, policy_version 2930 (0.0008) [2023-10-10 04:44:28,364][53252] Updated weights for policy 0, policy_version 2930 (0.0007) [2023-10-10 04:44:28,481][53268] Updated weights for policy 1, policy_version 2940 (0.0008) [2023-10-10 04:44:28,737][53252] Updated weights for policy 0, policy_version 2940 (0.0009) [2023-10-10 04:44:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6029312. Throughput: 0: 1664.3, 1: 1660.9. Samples: 1513334. 
Policy #0 lag: (min: 3.0, avg: 9.8, max: 35.0) [2023-10-10 04:44:31,784][52050] Avg episode reward: [(0, '3.650'), (1, '3.390')] [2023-10-10 04:44:32,401][53268] Updated weights for policy 1, policy_version 2950 (0.0008) [2023-10-10 04:44:32,766][53268] Updated weights for policy 1, policy_version 2960 (0.0007) [2023-10-10 04:44:32,769][53252] Updated weights for policy 0, policy_version 2950 (0.0008) [2023-10-10 04:44:33,143][53268] Updated weights for policy 1, policy_version 2970 (0.0008) [2023-10-10 04:44:33,144][53252] Updated weights for policy 0, policy_version 2960 (0.0008) [2023-10-10 04:44:33,519][53252] Updated weights for policy 0, policy_version 2970 (0.0007) [2023-10-10 04:44:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6094848. Throughput: 0: 1672.3, 1: 1664.7. Samples: 1533950. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-10 04:44:36,784][52050] Avg episode reward: [(0, '3.290'), (1, '3.620')] [2023-10-10 04:44:37,291][53268] Updated weights for policy 1, policy_version 2980 (0.0010) [2023-10-10 04:44:37,499][53252] Updated weights for policy 0, policy_version 2980 (0.0007) [2023-10-10 04:44:37,656][53268] Updated weights for policy 1, policy_version 2990 (0.0009) [2023-10-10 04:44:37,877][53252] Updated weights for policy 0, policy_version 2990 (0.0009) [2023-10-10 04:44:38,022][53268] Updated weights for policy 1, policy_version 3000 (0.0008) [2023-10-10 04:44:38,239][53252] Updated weights for policy 0, policy_version 3000 (0.0010) [2023-10-10 04:44:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6160384. Throughput: 0: 1675.2, 1: 1661.4. Samples: 1554538. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-10 04:44:41,784][52050] Avg episode reward: [(0, '2.980'), (1, '3.560')] [2023-10-10 04:44:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000003008_3080192.pth... 
[2023-10-10 04:44:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000003008_3080192.pth... [2023-10-10 04:44:41,825][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000001472_1507328.pth [2023-10-10 04:44:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000001472_1507328.pth [2023-10-10 04:44:42,292][53252] Updated weights for policy 0, policy_version 3010 (0.0007) [2023-10-10 04:44:42,378][53268] Updated weights for policy 1, policy_version 3010 (0.0008) [2023-10-10 04:44:42,667][53252] Updated weights for policy 0, policy_version 3020 (0.0007) [2023-10-10 04:44:42,749][53268] Updated weights for policy 1, policy_version 3020 (0.0008) [2023-10-10 04:44:43,053][53252] Updated weights for policy 0, policy_version 3030 (0.0008) [2023-10-10 04:44:43,117][53268] Updated weights for policy 1, policy_version 3030 (0.0009) [2023-10-10 04:44:43,419][53252] Updated weights for policy 0, policy_version 3040 (0.0008) [2023-10-10 04:44:43,482][53268] Updated weights for policy 1, policy_version 3040 (0.0009) [2023-10-10 04:44:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6225920. Throughput: 0: 1675.8, 1: 1661.4. Samples: 1563608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:44:46,784][52050] Avg episode reward: [(0, '3.030'), (1, '4.160')] [2023-10-10 04:44:46,785][53061] Saving new best policy, reward=4.160! 
[2023-10-10 04:44:47,472][53252] Updated weights for policy 0, policy_version 3050 (0.0009) [2023-10-10 04:44:47,701][53268] Updated weights for policy 1, policy_version 3050 (0.0008) [2023-10-10 04:44:47,836][53252] Updated weights for policy 0, policy_version 3060 (0.0008) [2023-10-10 04:44:48,068][53268] Updated weights for policy 1, policy_version 3060 (0.0009) [2023-10-10 04:44:48,211][53252] Updated weights for policy 0, policy_version 3070 (0.0008) [2023-10-10 04:44:48,443][53268] Updated weights for policy 1, policy_version 3070 (0.0008) [2023-10-10 04:44:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 6291456. Throughput: 0: 1675.2, 1: 1657.6. Samples: 1584032. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:44:51,784][52050] Avg episode reward: [(0, '3.450'), (1, '4.570')] [2023-10-10 04:44:51,784][53061] Saving new best policy, reward=4.570! [2023-10-10 04:44:52,310][53252] Updated weights for policy 0, policy_version 3080 (0.0008) [2023-10-10 04:44:52,681][53252] Updated weights for policy 0, policy_version 3090 (0.0007) [2023-10-10 04:44:52,731][53268] Updated weights for policy 1, policy_version 3080 (0.0010) [2023-10-10 04:44:53,062][53252] Updated weights for policy 0, policy_version 3100 (0.0008) [2023-10-10 04:44:53,100][53268] Updated weights for policy 1, policy_version 3090 (0.0008) [2023-10-10 04:44:53,467][53268] Updated weights for policy 1, policy_version 3100 (0.0008) [2023-10-10 04:44:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 6356992. Throughput: 0: 1674.2, 1: 1652.9. Samples: 1604424. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 04:44:56,784][52050] Avg episode reward: [(0, '3.650'), (1, '4.910')] [2023-10-10 04:44:56,794][53061] Saving new best policy, reward=4.910! 
[2023-10-10 04:44:57,208][53252] Updated weights for policy 0, policy_version 3110 (0.0010) [2023-10-10 04:44:57,579][53252] Updated weights for policy 0, policy_version 3120 (0.0008) [2023-10-10 04:44:57,683][53268] Updated weights for policy 1, policy_version 3110 (0.0008) [2023-10-10 04:44:57,944][53252] Updated weights for policy 0, policy_version 3130 (0.0009) [2023-10-10 04:44:58,069][53268] Updated weights for policy 1, policy_version 3120 (0.0008) [2023-10-10 04:44:58,445][53268] Updated weights for policy 1, policy_version 3130 (0.0010) [2023-10-10 04:45:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6422528. Throughput: 0: 1680.1, 1: 1651.9. Samples: 1613300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:01,784][52050] Avg episode reward: [(0, '4.080'), (1, '4.790')] [2023-10-10 04:45:02,022][53252] Updated weights for policy 0, policy_version 3140 (0.0010) [2023-10-10 04:45:02,341][53268] Updated weights for policy 1, policy_version 3140 (0.0008) [2023-10-10 04:45:02,393][53252] Updated weights for policy 0, policy_version 3150 (0.0008) [2023-10-10 04:45:02,716][53268] Updated weights for policy 1, policy_version 3150 (0.0010) [2023-10-10 04:45:02,761][53252] Updated weights for policy 0, policy_version 3160 (0.0008) [2023-10-10 04:45:03,048][52846] Saving new best policy, reward=4.080! [2023-10-10 04:45:03,081][53268] Updated weights for policy 1, policy_version 3160 (0.0007) [2023-10-10 04:45:06,753][53252] Updated weights for policy 0, policy_version 3170 (0.0008) [2023-10-10 04:45:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 6488064. Throughput: 0: 1679.6, 1: 1648.6. Samples: 1633984. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:06,784][52050] Avg episode reward: [(0, '4.160'), (1, '4.430')] [2023-10-10 04:45:07,137][53252] Updated weights for policy 0, policy_version 3180 (0.0010) [2023-10-10 04:45:07,271][53268] Updated weights for policy 1, policy_version 3170 (0.0009) [2023-10-10 04:45:07,509][53252] Updated weights for policy 0, policy_version 3190 (0.0008) [2023-10-10 04:45:07,634][53268] Updated weights for policy 1, policy_version 3180 (0.0007) [2023-10-10 04:45:07,874][53252] Updated weights for policy 0, policy_version 3200 (0.0008) [2023-10-10 04:45:07,874][52846] Saving new best policy, reward=4.160! [2023-10-10 04:45:07,997][53268] Updated weights for policy 1, policy_version 3190 (0.0008) [2023-10-10 04:45:08,371][53268] Updated weights for policy 1, policy_version 3200 (0.0008) [2023-10-10 04:45:11,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 6553600. Throughput: 0: 1684.2, 1: 1650.7. Samples: 1654684. Policy #0 lag: (min: 29.0, avg: 36.8, max: 61.0) [2023-10-10 04:45:11,785][52050] Avg episode reward: [(0, '4.110'), (1, '4.140')] [2023-10-10 04:45:11,915][53252] Updated weights for policy 0, policy_version 3210 (0.0009) [2023-10-10 04:45:12,285][53252] Updated weights for policy 0, policy_version 3220 (0.0007) [2023-10-10 04:45:12,376][53268] Updated weights for policy 1, policy_version 3210 (0.0009) [2023-10-10 04:45:12,662][53252] Updated weights for policy 0, policy_version 3230 (0.0007) [2023-10-10 04:45:12,745][53268] Updated weights for policy 1, policy_version 3220 (0.0009) [2023-10-10 04:45:13,104][53268] Updated weights for policy 1, policy_version 3230 (0.0007) [2023-10-10 04:45:16,756][53252] Updated weights for policy 0, policy_version 3240 (0.0008) [2023-10-10 04:45:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6619136. Throughput: 0: 1686.1, 1: 1652.7. Samples: 1663582. 
Policy #0 lag: (min: 29.0, avg: 36.8, max: 61.0) [2023-10-10 04:45:16,784][52050] Avg episode reward: [(0, '3.710'), (1, '3.860')] [2023-10-10 04:45:17,126][53252] Updated weights for policy 0, policy_version 3250 (0.0008) [2023-10-10 04:45:17,332][53268] Updated weights for policy 1, policy_version 3240 (0.0008) [2023-10-10 04:45:17,502][53252] Updated weights for policy 0, policy_version 3260 (0.0008) [2023-10-10 04:45:17,693][53268] Updated weights for policy 1, policy_version 3250 (0.0009) [2023-10-10 04:45:18,071][53268] Updated weights for policy 1, policy_version 3260 (0.0010) [2023-10-10 04:45:21,654][53252] Updated weights for policy 0, policy_version 3270 (0.0009) [2023-10-10 04:45:21,784][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 6684672. Throughput: 0: 1680.8, 1: 1653.9. Samples: 1684016. Policy #0 lag: (min: 19.0, avg: 27.5, max: 51.0) [2023-10-10 04:45:21,784][52050] Avg episode reward: [(0, '3.880'), (1, '4.090')] [2023-10-10 04:45:22,027][53252] Updated weights for policy 0, policy_version 3280 (0.0007) [2023-10-10 04:45:22,310][53268] Updated weights for policy 1, policy_version 3270 (0.0008) [2023-10-10 04:45:22,398][53252] Updated weights for policy 0, policy_version 3290 (0.0008) [2023-10-10 04:45:22,689][53268] Updated weights for policy 1, policy_version 3280 (0.0008) [2023-10-10 04:45:23,051][53268] Updated weights for policy 1, policy_version 3290 (0.0010) [2023-10-10 04:45:26,556][53252] Updated weights for policy 0, policy_version 3300 (0.0007) [2023-10-10 04:45:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 6750208. Throughput: 0: 1676.5, 1: 1658.0. Samples: 1704590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:26,784][52050] Avg episode reward: [(0, '3.730'), (1, '4.670')] [2023-10-10 04:45:26,939][53252] Updated weights for policy 0, policy_version 3310 (0.0009) [2023-10-10 04:45:27,102][53268] Updated weights for policy 1, policy_version 3300 (0.0008) [2023-10-10 04:45:27,305][53252] Updated weights for policy 0, policy_version 3320 (0.0007) [2023-10-10 04:45:27,456][53268] Updated weights for policy 1, policy_version 3310 (0.0008) [2023-10-10 04:45:27,825][53268] Updated weights for policy 1, policy_version 3320 (0.0010) [2023-10-10 04:45:31,344][53252] Updated weights for policy 0, policy_version 3330 (0.0007) [2023-10-10 04:45:31,712][53252] Updated weights for policy 0, policy_version 3340 (0.0008) [2023-10-10 04:45:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 6815744. Throughput: 0: 1679.6, 1: 1657.2. Samples: 1713766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:31,785][52050] Avg episode reward: [(0, '3.160'), (1, '5.030')] [2023-10-10 04:45:31,920][53268] Updated weights for policy 1, policy_version 3330 (0.0009) [2023-10-10 04:45:32,088][53252] Updated weights for policy 0, policy_version 3350 (0.0007) [2023-10-10 04:45:32,284][53268] Updated weights for policy 1, policy_version 3340 (0.0009) [2023-10-10 04:45:32,450][53252] Updated weights for policy 0, policy_version 3360 (0.0008) [2023-10-10 04:45:32,658][53268] Updated weights for policy 1, policy_version 3350 (0.0007) [2023-10-10 04:45:33,023][53268] Updated weights for policy 1, policy_version 3360 (0.0007) [2023-10-10 04:45:33,024][53061] Saving new best policy, reward=5.030! [2023-10-10 04:45:36,488][53252] Updated weights for policy 0, policy_version 3370 (0.0008) [2023-10-10 04:45:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6881280. Throughput: 0: 1678.9, 1: 1665.4. Samples: 1734526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:36,784][52050] Avg episode reward: [(0, '3.630'), (1, '5.620')] [2023-10-10 04:45:36,856][53252] Updated weights for policy 0, policy_version 3380 (0.0008) [2023-10-10 04:45:37,046][53268] Updated weights for policy 1, policy_version 3370 (0.0007) [2023-10-10 04:45:37,231][53252] Updated weights for policy 0, policy_version 3390 (0.0009) [2023-10-10 04:45:37,410][53268] Updated weights for policy 1, policy_version 3380 (0.0008) [2023-10-10 04:45:37,777][53268] Updated weights for policy 1, policy_version 3390 (0.0009) [2023-10-10 04:45:37,849][53061] Saving new best policy, reward=5.620! [2023-10-10 04:45:41,291][53252] Updated weights for policy 0, policy_version 3400 (0.0007) [2023-10-10 04:45:41,665][53252] Updated weights for policy 0, policy_version 3410 (0.0007) [2023-10-10 04:45:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6946816. Throughput: 0: 1674.0, 1: 1659.7. Samples: 1754436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:41,784][52050] Avg episode reward: [(0, '3.510'), (1, '5.060')] [2023-10-10 04:45:42,038][53252] Updated weights for policy 0, policy_version 3420 (0.0009) [2023-10-10 04:45:42,071][53268] Updated weights for policy 1, policy_version 3400 (0.0008) [2023-10-10 04:45:42,449][53268] Updated weights for policy 1, policy_version 3410 (0.0007) [2023-10-10 04:45:42,816][53268] Updated weights for policy 1, policy_version 3420 (0.0007) [2023-10-10 04:45:45,921][53252] Updated weights for policy 0, policy_version 3430 (0.0009) [2023-10-10 04:45:46,291][53252] Updated weights for policy 0, policy_version 3440 (0.0008) [2023-10-10 04:45:46,662][53252] Updated weights for policy 0, policy_version 3450 (0.0007) [2023-10-10 04:45:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 7012352. Throughput: 0: 1682.7, 1: 1666.1. Samples: 1763996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-10 04:45:46,784][52050] Avg episode reward: [(0, '3.470'), (1, '4.990')] [2023-10-10 04:45:46,947][53268] Updated weights for policy 1, policy_version 3430 (0.0008) [2023-10-10 04:45:47,317][53268] Updated weights for policy 1, policy_version 3440 (0.0010) [2023-10-10 04:45:47,679][53268] Updated weights for policy 1, policy_version 3450 (0.0011) [2023-10-10 04:45:50,783][53252] Updated weights for policy 0, policy_version 3460 (0.0009) [2023-10-10 04:45:51,152][53252] Updated weights for policy 0, policy_version 3470 (0.0009) [2023-10-10 04:45:51,529][53252] Updated weights for policy 0, policy_version 3480 (0.0009) [2023-10-10 04:45:51,660][53268] Updated weights for policy 1, policy_version 3460 (0.0008) [2023-10-10 04:45:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 7077888. Throughput: 0: 1684.8, 1: 1664.1. Samples: 1784684. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-10 04:45:51,784][52050] Avg episode reward: [(0, '3.790'), (1, '4.860')] [2023-10-10 04:45:52,036][53268] Updated weights for policy 1, policy_version 3470 (0.0008) [2023-10-10 04:45:52,394][53268] Updated weights for policy 1, policy_version 3480 (0.0008) [2023-10-10 04:45:55,641][53252] Updated weights for policy 0, policy_version 3490 (0.0009) [2023-10-10 04:45:56,011][53252] Updated weights for policy 0, policy_version 3500 (0.0008) [2023-10-10 04:45:56,366][53252] Updated weights for policy 0, policy_version 3510 (0.0010) [2023-10-10 04:45:56,510][53268] Updated weights for policy 1, policy_version 3490 (0.0007) [2023-10-10 04:45:56,740][53252] Updated weights for policy 0, policy_version 3520 (0.0009) [2023-10-10 04:45:56,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 7176192. Throughput: 0: 1661.9, 1: 1669.3. Samples: 1804588. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:45:56,784][52050] Avg episode reward: [(0, '3.790'), (1, '5.540')] [2023-10-10 04:45:56,873][53268] Updated weights for policy 1, policy_version 3500 (0.0007) [2023-10-10 04:45:57,248][53268] Updated weights for policy 1, policy_version 3510 (0.0009) [2023-10-10 04:45:57,618][53268] Updated weights for policy 1, policy_version 3520 (0.0009) [2023-10-10 04:46:01,066][53252] Updated weights for policy 0, policy_version 3530 (0.0009) [2023-10-10 04:46:01,432][53252] Updated weights for policy 0, policy_version 3540 (0.0008) [2023-10-10 04:46:01,712][53268] Updated weights for policy 1, policy_version 3530 (0.0008) [2023-10-10 04:46:01,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13218.3). Total num frames: 7208960. Throughput: 0: 1679.7, 1: 1672.2. Samples: 1814416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:46:01,785][52050] Avg episode reward: [(0, '3.990'), (1, '5.910')] [2023-10-10 04:46:01,806][53252] Updated weights for policy 0, policy_version 3550 (0.0008) [2023-10-10 04:46:02,076][53268] Updated weights for policy 1, policy_version 3540 (0.0009) [2023-10-10 04:46:02,453][53268] Updated weights for policy 1, policy_version 3550 (0.0007) [2023-10-10 04:46:02,518][53061] Saving new best policy, reward=5.910! [2023-10-10 04:46:05,676][53252] Updated weights for policy 0, policy_version 3560 (0.0009) [2023-10-10 04:46:06,052][53252] Updated weights for policy 0, policy_version 3570 (0.0009) [2023-10-10 04:46:06,419][53252] Updated weights for policy 0, policy_version 3580 (0.0008) [2023-10-10 04:46:06,536][53268] Updated weights for policy 1, policy_version 3560 (0.0007) [2023-10-10 04:46:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7307264. Throughput: 0: 1678.8, 1: 1676.2. Samples: 1834990. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 04:46:06,784][52050] Avg episode reward: [(0, '4.000'), (1, '5.870')] [2023-10-10 04:46:06,912][53268] Updated weights for policy 1, policy_version 3570 (0.0007) [2023-10-10 04:46:07,278][53268] Updated weights for policy 1, policy_version 3580 (0.0008) [2023-10-10 04:46:10,553][53252] Updated weights for policy 0, policy_version 3590 (0.0008) [2023-10-10 04:46:10,923][53252] Updated weights for policy 0, policy_version 3600 (0.0009) [2023-10-10 04:46:11,295][53252] Updated weights for policy 0, policy_version 3610 (0.0007) [2023-10-10 04:46:11,464][53268] Updated weights for policy 1, policy_version 3590 (0.0009) [2023-10-10 04:46:11,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13329.3). Total num frames: 7372800. Throughput: 0: 1656.7, 1: 1674.3. Samples: 1854486. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 04:46:11,784][52050] Avg episode reward: [(0, '4.160'), (1, '4.870')] [2023-10-10 04:46:11,834][53268] Updated weights for policy 1, policy_version 3600 (0.0008) [2023-10-10 04:46:12,209][53268] Updated weights for policy 1, policy_version 3610 (0.0008) [2023-10-10 04:46:15,453][53252] Updated weights for policy 0, policy_version 3620 (0.0008) [2023-10-10 04:46:15,823][53252] Updated weights for policy 0, policy_version 3630 (0.0010) [2023-10-10 04:46:16,190][53252] Updated weights for policy 0, policy_version 3640 (0.0009) [2023-10-10 04:46:16,195][53268] Updated weights for policy 1, policy_version 3620 (0.0008) [2023-10-10 04:46:16,561][53268] Updated weights for policy 1, policy_version 3630 (0.0010) [2023-10-10 04:46:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7438336. Throughput: 0: 1678.1, 1: 1676.1. Samples: 1864706. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:46:16,784][52050] Avg episode reward: [(0, '3.990'), (1, '4.530')] [2023-10-10 04:46:16,927][53268] Updated weights for policy 1, policy_version 3640 (0.0008) [2023-10-10 04:46:20,268][53252] Updated weights for policy 0, policy_version 3650 (0.0007) [2023-10-10 04:46:20,639][53252] Updated weights for policy 0, policy_version 3660 (0.0008) [2023-10-10 04:46:21,007][53268] Updated weights for policy 1, policy_version 3650 (0.0010) [2023-10-10 04:46:21,011][53252] Updated weights for policy 0, policy_version 3670 (0.0007) [2023-10-10 04:46:21,367][53252] Updated weights for policy 0, policy_version 3680 (0.0010) [2023-10-10 04:46:21,373][53268] Updated weights for policy 1, policy_version 3660 (0.0010) [2023-10-10 04:46:21,731][53268] Updated weights for policy 1, policy_version 3670 (0.0011) [2023-10-10 04:46:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 7503872. Throughput: 0: 1674.9, 1: 1670.3. Samples: 1885058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:46:21,784][52050] Avg episode reward: [(0, '4.070'), (1, '4.330')] [2023-10-10 04:46:22,101][53268] Updated weights for policy 1, policy_version 3680 (0.0011) [2023-10-10 04:46:25,506][53252] Updated weights for policy 0, policy_version 3690 (0.0010) [2023-10-10 04:46:25,888][53252] Updated weights for policy 0, policy_version 3700 (0.0010) [2023-10-10 04:46:26,176][53268] Updated weights for policy 1, policy_version 3690 (0.0007) [2023-10-10 04:46:26,260][53252] Updated weights for policy 0, policy_version 3710 (0.0009) [2023-10-10 04:46:26,541][53268] Updated weights for policy 1, policy_version 3700 (0.0008) [2023-10-10 04:46:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7569408. Throughput: 0: 1656.6, 1: 1671.4. Samples: 1904196. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 04:46:26,784][52050] Avg episode reward: [(0, '4.380'), (1, '4.380')] [2023-10-10 04:46:26,792][52846] Saving new best policy, reward=4.380! [2023-10-10 04:46:26,920][53268] Updated weights for policy 1, policy_version 3710 (0.0008) [2023-10-10 04:46:30,200][53252] Updated weights for policy 0, policy_version 3720 (0.0007) [2023-10-10 04:46:30,567][53252] Updated weights for policy 0, policy_version 3730 (0.0009) [2023-10-10 04:46:30,935][53252] Updated weights for policy 0, policy_version 3740 (0.0009) [2023-10-10 04:46:30,987][53268] Updated weights for policy 1, policy_version 3720 (0.0007) [2023-10-10 04:46:31,359][53268] Updated weights for policy 1, policy_version 3730 (0.0010) [2023-10-10 04:46:31,727][53268] Updated weights for policy 1, policy_version 3740 (0.0009) [2023-10-10 04:46:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 7634944. Throughput: 0: 1674.4, 1: 1676.2. Samples: 1914776. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 04:46:31,784][52050] Avg episode reward: [(0, '4.150'), (1, '4.570')] [2023-10-10 04:46:35,128][53252] Updated weights for policy 0, policy_version 3750 (0.0008) [2023-10-10 04:46:35,503][53252] Updated weights for policy 0, policy_version 3760 (0.0009) [2023-10-10 04:46:35,870][53252] Updated weights for policy 0, policy_version 3770 (0.0008) [2023-10-10 04:46:35,959][53268] Updated weights for policy 1, policy_version 3750 (0.0009) [2023-10-10 04:46:36,358][53268] Updated weights for policy 1, policy_version 3760 (0.0008) [2023-10-10 04:46:36,721][53268] Updated weights for policy 1, policy_version 3770 (0.0009) [2023-10-10 04:46:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7700480. Throughput: 0: 1659.8, 1: 1678.2. Samples: 1934894. 
Policy #0 lag: (min: 16.0, avg: 39.4, max: 48.0) [2023-10-10 04:46:36,784][52050] Avg episode reward: [(0, '4.230'), (1, '4.880')] [2023-10-10 04:46:39,909][53252] Updated weights for policy 0, policy_version 3780 (0.0008) [2023-10-10 04:46:40,283][53252] Updated weights for policy 0, policy_version 3790 (0.0010) [2023-10-10 04:46:40,659][53252] Updated weights for policy 0, policy_version 3800 (0.0009) [2023-10-10 04:46:40,897][53268] Updated weights for policy 1, policy_version 3780 (0.0008) [2023-10-10 04:46:41,258][53268] Updated weights for policy 1, policy_version 3790 (0.0009) [2023-10-10 04:46:41,625][53268] Updated weights for policy 1, policy_version 3800 (0.0007) [2023-10-10 04:46:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 7766016. Throughput: 0: 1665.1, 1: 1661.4. Samples: 1954280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:46:41,784][52050] Avg episode reward: [(0, '4.260'), (1, '5.030')] [2023-10-10 04:46:41,790][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000003808_3899392.pth... [2023-10-10 04:46:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000002240_2293760.pth [2023-10-10 04:46:41,925][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000003808_3899392.pth... 
[2023-10-10 04:46:41,963][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth [2023-10-10 04:46:44,650][53252] Updated weights for policy 0, policy_version 3810 (0.0007) [2023-10-10 04:46:45,022][53252] Updated weights for policy 0, policy_version 3820 (0.0009) [2023-10-10 04:46:45,389][53252] Updated weights for policy 0, policy_version 3830 (0.0009) [2023-10-10 04:46:45,690][53268] Updated weights for policy 1, policy_version 3810 (0.0009) [2023-10-10 04:46:45,769][53252] Updated weights for policy 0, policy_version 3840 (0.0009) [2023-10-10 04:46:46,064][53268] Updated weights for policy 1, policy_version 3820 (0.0007) [2023-10-10 04:46:46,432][53268] Updated weights for policy 1, policy_version 3830 (0.0008) [2023-10-10 04:46:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13218.3). Total num frames: 7831552. Throughput: 0: 1679.4, 1: 1671.7. Samples: 1965214. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:46:46,784][52050] Avg episode reward: [(0, '4.770'), (1, '4.480')] [2023-10-10 04:46:46,784][52846] Saving new best policy, reward=4.770! [2023-10-10 04:46:46,800][53268] Updated weights for policy 1, policy_version 3840 (0.0008) [2023-10-10 04:46:50,013][53252] Updated weights for policy 0, policy_version 3850 (0.0010) [2023-10-10 04:46:50,393][53252] Updated weights for policy 0, policy_version 3860 (0.0011) [2023-10-10 04:46:50,762][53252] Updated weights for policy 0, policy_version 3870 (0.0009) [2023-10-10 04:46:50,859][53268] Updated weights for policy 1, policy_version 3850 (0.0010) [2023-10-10 04:46:51,217][53268] Updated weights for policy 1, policy_version 3860 (0.0009) [2023-10-10 04:46:51,577][53268] Updated weights for policy 1, policy_version 3870 (0.0010) [2023-10-10 04:46:51,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 7929856. Throughput: 0: 1661.2, 1: 1669.7. Samples: 1984878. 
Policy #0 lag: (min: 17.0, avg: 30.9, max: 49.0) [2023-10-10 04:46:51,784][52050] Avg episode reward: [(0, '4.420'), (1, '4.680')] [2023-10-10 04:46:54,870][53252] Updated weights for policy 0, policy_version 3880 (0.0007) [2023-10-10 04:46:55,248][53252] Updated weights for policy 0, policy_version 3890 (0.0007) [2023-10-10 04:46:55,593][53268] Updated weights for policy 1, policy_version 3880 (0.0010) [2023-10-10 04:46:55,617][53252] Updated weights for policy 0, policy_version 3900 (0.0008) [2023-10-10 04:46:55,967][53268] Updated weights for policy 1, policy_version 3890 (0.0009) [2023-10-10 04:46:56,334][53268] Updated weights for policy 1, policy_version 3900 (0.0007) [2023-10-10 04:46:56,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7995392. Throughput: 0: 1672.4, 1: 1657.0. Samples: 2004306. Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 04:46:56,784][52050] Avg episode reward: [(0, '4.920'), (1, '4.670')] [2023-10-10 04:46:56,792][52846] Saving new best policy, reward=4.920! [2023-10-10 04:46:59,700][53252] Updated weights for policy 0, policy_version 3910 (0.0009) [2023-10-10 04:47:00,065][53252] Updated weights for policy 0, policy_version 3920 (0.0010) [2023-10-10 04:47:00,270][53268] Updated weights for policy 1, policy_version 3910 (0.0008) [2023-10-10 04:47:00,434][53252] Updated weights for policy 0, policy_version 3930 (0.0010) [2023-10-10 04:47:00,637][53268] Updated weights for policy 1, policy_version 3920 (0.0009) [2023-10-10 04:47:01,004][53268] Updated weights for policy 1, policy_version 3930 (0.0008) [2023-10-10 04:47:01,784][52050] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 8060928. Throughput: 0: 1679.1, 1: 1678.1. Samples: 2015784. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 04:47:01,785][52050] Avg episode reward: [(0, '4.910'), (1, '5.300')] [2023-10-10 04:47:04,554][53252] Updated weights for policy 0, policy_version 3940 (0.0009) [2023-10-10 04:47:04,929][53252] Updated weights for policy 0, policy_version 3950 (0.0009) [2023-10-10 04:47:05,294][53268] Updated weights for policy 1, policy_version 3940 (0.0009) [2023-10-10 04:47:05,297][53252] Updated weights for policy 0, policy_version 3960 (0.0008) [2023-10-10 04:47:05,665][53268] Updated weights for policy 1, policy_version 3950 (0.0008) [2023-10-10 04:47:06,038][53268] Updated weights for policy 1, policy_version 3960 (0.0008) [2023-10-10 04:47:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8126464. Throughput: 0: 1663.8, 1: 1673.6. Samples: 2035242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:47:06,784][52050] Avg episode reward: [(0, '4.760'), (1, '5.190')] [2023-10-10 04:47:09,311][53252] Updated weights for policy 0, policy_version 3970 (0.0008) [2023-10-10 04:47:09,683][53252] Updated weights for policy 0, policy_version 3980 (0.0010) [2023-10-10 04:47:09,873][53268] Updated weights for policy 1, policy_version 3970 (0.0008) [2023-10-10 04:47:10,058][53252] Updated weights for policy 0, policy_version 3990 (0.0009) [2023-10-10 04:47:10,241][53268] Updated weights for policy 1, policy_version 3980 (0.0009) [2023-10-10 04:47:10,420][53252] Updated weights for policy 0, policy_version 4000 (0.0008) [2023-10-10 04:47:10,613][53268] Updated weights for policy 1, policy_version 3990 (0.0009) [2023-10-10 04:47:10,988][53268] Updated weights for policy 1, policy_version 4000 (0.0010) [2023-10-10 04:47:11,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 8192000. Throughput: 0: 1678.6, 1: 1654.4. Samples: 2054184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:47:11,784][52050] Avg episode reward: [(0, '4.130'), (1, '4.930')] [2023-10-10 04:47:14,634][53252] Updated weights for policy 0, policy_version 4010 (0.0008) [2023-10-10 04:47:14,999][53252] Updated weights for policy 0, policy_version 4020 (0.0008) [2023-10-10 04:47:15,058][53268] Updated weights for policy 1, policy_version 4010 (0.0008) [2023-10-10 04:47:15,369][53252] Updated weights for policy 0, policy_version 4030 (0.0008) [2023-10-10 04:47:15,425][53268] Updated weights for policy 1, policy_version 4020 (0.0008) [2023-10-10 04:47:15,784][53268] Updated weights for policy 1, policy_version 4030 (0.0009) [2023-10-10 04:47:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8257536. Throughput: 0: 1674.4, 1: 1678.6. Samples: 2065662. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-10 04:47:16,784][52050] Avg episode reward: [(0, '4.540'), (1, '4.920')] [2023-10-10 04:47:19,257][53252] Updated weights for policy 0, policy_version 4040 (0.0010) [2023-10-10 04:47:19,627][53252] Updated weights for policy 0, policy_version 4050 (0.0010) [2023-10-10 04:47:19,999][53252] Updated weights for policy 0, policy_version 4060 (0.0009) [2023-10-10 04:47:20,114][53268] Updated weights for policy 1, policy_version 4040 (0.0010) [2023-10-10 04:47:20,493][53268] Updated weights for policy 1, policy_version 4050 (0.0008) [2023-10-10 04:47:20,863][53268] Updated weights for policy 1, policy_version 4060 (0.0007) [2023-10-10 04:47:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8323072. Throughput: 0: 1659.2, 1: 1665.4. Samples: 2084498. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-10 04:47:21,784][52050] Avg episode reward: [(0, '4.970'), (1, '4.880')] [2023-10-10 04:47:21,786][52846] Saving new best policy, reward=4.970! 
[2023-10-10 04:47:23,990][53252] Updated weights for policy 0, policy_version 4070 (0.0009) [2023-10-10 04:47:24,352][53252] Updated weights for policy 0, policy_version 4080 (0.0009) [2023-10-10 04:47:24,731][53252] Updated weights for policy 0, policy_version 4090 (0.0008) [2023-10-10 04:47:24,966][53268] Updated weights for policy 1, policy_version 4070 (0.0007) [2023-10-10 04:47:25,336][53268] Updated weights for policy 1, policy_version 4080 (0.0010) [2023-10-10 04:47:25,708][53268] Updated weights for policy 1, policy_version 4090 (0.0008) [2023-10-10 04:47:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 8388608. Throughput: 0: 1676.5, 1: 1657.6. Samples: 2104316. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-10-10 04:47:26,784][52050] Avg episode reward: [(0, '4.760'), (1, '5.750')] [2023-10-10 04:47:28,859][53252] Updated weights for policy 0, policy_version 4100 (0.0008) [2023-10-10 04:47:29,240][53252] Updated weights for policy 0, policy_version 4110 (0.0008) [2023-10-10 04:47:29,606][53252] Updated weights for policy 0, policy_version 4120 (0.0008) [2023-10-10 04:47:29,735][53268] Updated weights for policy 1, policy_version 4100 (0.0009) [2023-10-10 04:47:30,110][53268] Updated weights for policy 1, policy_version 4110 (0.0007) [2023-10-10 04:47:30,471][53268] Updated weights for policy 1, policy_version 4120 (0.0009) [2023-10-10 04:47:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8454144. Throughput: 0: 1655.5, 1: 1681.3. Samples: 2115370. Policy #0 lag: (min: 7.0, avg: 9.9, max: 39.0) [2023-10-10 04:47:31,784][52050] Avg episode reward: [(0, '5.100'), (1, '6.190')] [2023-10-10 04:47:31,784][52846] Saving new best policy, reward=5.100! [2023-10-10 04:47:31,785][53061] Saving new best policy, reward=6.190! 
[2023-10-10 04:47:33,663][53252] Updated weights for policy 0, policy_version 4130 (0.0007) [2023-10-10 04:47:34,041][53252] Updated weights for policy 0, policy_version 4140 (0.0009) [2023-10-10 04:47:34,403][53252] Updated weights for policy 0, policy_version 4150 (0.0010) [2023-10-10 04:47:34,490][53268] Updated weights for policy 1, policy_version 4130 (0.0010) [2023-10-10 04:47:34,777][53252] Updated weights for policy 0, policy_version 4160 (0.0010) [2023-10-10 04:47:34,849][53268] Updated weights for policy 1, policy_version 4140 (0.0007) [2023-10-10 04:47:35,224][53268] Updated weights for policy 1, policy_version 4150 (0.0009) [2023-10-10 04:47:35,591][53268] Updated weights for policy 1, policy_version 4160 (0.0011) [2023-10-10 04:47:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8519680. Throughput: 0: 1663.7, 1: 1665.8. Samples: 2134708. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) [2023-10-10 04:47:36,784][52050] Avg episode reward: [(0, '5.630'), (1, '6.250')] [2023-10-10 04:47:36,786][52846] Saving new best policy, reward=5.630! [2023-10-10 04:47:36,786][53061] Saving new best policy, reward=6.250! [2023-10-10 04:47:39,078][53252] Updated weights for policy 0, policy_version 4170 (0.0008) [2023-10-10 04:47:39,449][53252] Updated weights for policy 0, policy_version 4180 (0.0009) [2023-10-10 04:47:39,495][53268] Updated weights for policy 1, policy_version 4170 (0.0007) [2023-10-10 04:47:39,812][53252] Updated weights for policy 0, policy_version 4190 (0.0008) [2023-10-10 04:47:39,861][53268] Updated weights for policy 1, policy_version 4180 (0.0009) [2023-10-10 04:47:40,224][53268] Updated weights for policy 1, policy_version 4190 (0.0010) [2023-10-10 04:47:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8585216. Throughput: 0: 1666.7, 1: 1671.2. Samples: 2154512. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) [2023-10-10 04:47:41,785][52050] Avg episode reward: [(0, '5.160'), (1, '6.310')] [2023-10-10 04:47:41,797][53061] Saving new best policy, reward=6.310! [2023-10-10 04:47:43,924][53252] Updated weights for policy 0, policy_version 4200 (0.0010) [2023-10-10 04:47:44,292][53252] Updated weights for policy 0, policy_version 4210 (0.0009) [2023-10-10 04:47:44,318][53268] Updated weights for policy 1, policy_version 4200 (0.0008) [2023-10-10 04:47:44,667][53252] Updated weights for policy 0, policy_version 4220 (0.0010) [2023-10-10 04:47:44,699][53268] Updated weights for policy 1, policy_version 4210 (0.0007) [2023-10-10 04:47:45,063][53268] Updated weights for policy 1, policy_version 4220 (0.0011) [2023-10-10 04:47:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 8650752. Throughput: 0: 1645.7, 1: 1672.6. Samples: 2165108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:47:46,784][52050] Avg episode reward: [(0, '4.660'), (1, '6.420')] [2023-10-10 04:47:46,785][53061] Saving new best policy, reward=6.420! [2023-10-10 04:47:48,911][53252] Updated weights for policy 0, policy_version 4230 (0.0009) [2023-10-10 04:47:49,121][53268] Updated weights for policy 1, policy_version 4230 (0.0010) [2023-10-10 04:47:49,289][53252] Updated weights for policy 0, policy_version 4240 (0.0008) [2023-10-10 04:47:49,487][53268] Updated weights for policy 1, policy_version 4240 (0.0008) [2023-10-10 04:47:49,661][53252] Updated weights for policy 0, policy_version 4250 (0.0009) [2023-10-10 04:47:49,854][53268] Updated weights for policy 1, policy_version 4250 (0.0007) [2023-10-10 04:47:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8716288. Throughput: 0: 1652.6, 1: 1652.1. Samples: 2183954. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:47:51,784][52050] Avg episode reward: [(0, '4.770'), (1, '6.690')] [2023-10-10 04:47:51,786][53061] Saving new best policy, reward=6.690! [2023-10-10 04:47:53,795][53252] Updated weights for policy 0, policy_version 4260 (0.0009) [2023-10-10 04:47:53,902][53268] Updated weights for policy 1, policy_version 4260 (0.0008) [2023-10-10 04:47:54,164][53252] Updated weights for policy 0, policy_version 4270 (0.0007) [2023-10-10 04:47:54,268][53268] Updated weights for policy 1, policy_version 4270 (0.0008) [2023-10-10 04:47:54,532][53252] Updated weights for policy 0, policy_version 4280 (0.0007) [2023-10-10 04:47:54,641][53268] Updated weights for policy 1, policy_version 4280 (0.0008) [2023-10-10 04:47:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8781824. Throughput: 0: 1662.7, 1: 1680.0. Samples: 2204606. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 04:47:56,784][52050] Avg episode reward: [(0, '5.460'), (1, '6.990')] [2023-10-10 04:47:56,790][53061] Saving new best policy, reward=6.990! [2023-10-10 04:47:58,610][53252] Updated weights for policy 0, policy_version 4290 (0.0009) [2023-10-10 04:47:58,782][53268] Updated weights for policy 1, policy_version 4290 (0.0008) [2023-10-10 04:47:58,977][53252] Updated weights for policy 0, policy_version 4300 (0.0008) [2023-10-10 04:47:59,152][53268] Updated weights for policy 1, policy_version 4300 (0.0009) [2023-10-10 04:47:59,363][53252] Updated weights for policy 0, policy_version 4310 (0.0008) [2023-10-10 04:47:59,511][53268] Updated weights for policy 1, policy_version 4310 (0.0008) [2023-10-10 04:47:59,730][53252] Updated weights for policy 0, policy_version 4320 (0.0007) [2023-10-10 04:47:59,874][53268] Updated weights for policy 1, policy_version 4320 (0.0008) [2023-10-10 04:48:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 8847360. 
Throughput: 0: 1646.7, 1: 1668.1. Samples: 2214830. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 04:48:01,784][52050] Avg episode reward: [(0, '4.680'), (1, '6.440')] [2023-10-10 04:48:03,820][53252] Updated weights for policy 0, policy_version 4330 (0.0008) [2023-10-10 04:48:03,959][53268] Updated weights for policy 1, policy_version 4330 (0.0007) [2023-10-10 04:48:04,200][53252] Updated weights for policy 0, policy_version 4340 (0.0008) [2023-10-10 04:48:04,326][53268] Updated weights for policy 1, policy_version 4340 (0.0008) [2023-10-10 04:48:04,563][53252] Updated weights for policy 0, policy_version 4350 (0.0009) [2023-10-10 04:48:04,691][53268] Updated weights for policy 1, policy_version 4350 (0.0009) [2023-10-10 04:48:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 8912896. Throughput: 0: 1666.1, 1: 1663.3. Samples: 2234322. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:06,784][52050] Avg episode reward: [(0, '5.390'), (1, '5.820')] [2023-10-10 04:48:08,830][53252] Updated weights for policy 0, policy_version 4360 (0.0008) [2023-10-10 04:48:08,959][53268] Updated weights for policy 1, policy_version 4360 (0.0010) [2023-10-10 04:48:09,200][53252] Updated weights for policy 0, policy_version 4370 (0.0008) [2023-10-10 04:48:09,348][53268] Updated weights for policy 1, policy_version 4370 (0.0008) [2023-10-10 04:48:09,566][53252] Updated weights for policy 0, policy_version 4380 (0.0007) [2023-10-10 04:48:09,727][53268] Updated weights for policy 1, policy_version 4380 (0.0009) [2023-10-10 04:48:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 8978432. Throughput: 0: 1661.7, 1: 1678.7. Samples: 2254634. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:11,784][52050] Avg episode reward: [(0, '5.250'), (1, '5.280')] [2023-10-10 04:48:13,402][53252] Updated weights for policy 0, policy_version 4390 (0.0007) [2023-10-10 04:48:13,698][53268] Updated weights for policy 1, policy_version 4390 (0.0009) [2023-10-10 04:48:13,775][53252] Updated weights for policy 0, policy_version 4400 (0.0007) [2023-10-10 04:48:14,066][53268] Updated weights for policy 1, policy_version 4400 (0.0009) [2023-10-10 04:48:14,151][53252] Updated weights for policy 0, policy_version 4410 (0.0008) [2023-10-10 04:48:14,430][53268] Updated weights for policy 1, policy_version 4410 (0.0009) [2023-10-10 04:48:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9043968. Throughput: 0: 1653.8, 1: 1659.2. Samples: 2264458. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:16,784][52050] Avg episode reward: [(0, '4.980'), (1, '5.730')] [2023-10-10 04:48:18,231][53252] Updated weights for policy 0, policy_version 4420 (0.0008) [2023-10-10 04:48:18,614][53252] Updated weights for policy 0, policy_version 4430 (0.0009) [2023-10-10 04:48:18,710][53268] Updated weights for policy 1, policy_version 4420 (0.0009) [2023-10-10 04:48:18,989][53252] Updated weights for policy 0, policy_version 4440 (0.0010) [2023-10-10 04:48:19,084][53268] Updated weights for policy 1, policy_version 4430 (0.0008) [2023-10-10 04:48:19,457][53268] Updated weights for policy 1, policy_version 4440 (0.0007) [2023-10-10 04:48:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9109504. Throughput: 0: 1665.0, 1: 1660.6. Samples: 2284362. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:21,784][52050] Avg episode reward: [(0, '5.120'), (1, '5.120')] [2023-10-10 04:48:23,117][53252] Updated weights for policy 0, policy_version 4450 (0.0010) [2023-10-10 04:48:23,494][53252] Updated weights for policy 0, policy_version 4460 (0.0010) [2023-10-10 04:48:23,543][53268] Updated weights for policy 1, policy_version 4450 (0.0008) [2023-10-10 04:48:23,863][53252] Updated weights for policy 0, policy_version 4470 (0.0008) [2023-10-10 04:48:23,908][53268] Updated weights for policy 1, policy_version 4460 (0.0010) [2023-10-10 04:48:24,232][53252] Updated weights for policy 0, policy_version 4480 (0.0007) [2023-10-10 04:48:24,271][53268] Updated weights for policy 1, policy_version 4470 (0.0008) [2023-10-10 04:48:24,642][53268] Updated weights for policy 1, policy_version 4480 (0.0009) [2023-10-10 04:48:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 9175040. Throughput: 0: 1679.1, 1: 1667.5. Samples: 2305106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:26,784][52050] Avg episode reward: [(0, '5.390'), (1, '5.450')] [2023-10-10 04:48:28,296][53252] Updated weights for policy 0, policy_version 4490 (0.0009) [2023-10-10 04:48:28,634][53268] Updated weights for policy 1, policy_version 4490 (0.0007) [2023-10-10 04:48:28,662][53252] Updated weights for policy 0, policy_version 4500 (0.0008) [2023-10-10 04:48:29,004][53268] Updated weights for policy 1, policy_version 4500 (0.0009) [2023-10-10 04:48:29,024][53252] Updated weights for policy 0, policy_version 4510 (0.0008) [2023-10-10 04:48:29,373][53268] Updated weights for policy 1, policy_version 4510 (0.0010) [2023-10-10 04:48:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9240576. Throughput: 0: 1667.9, 1: 1652.2. Samples: 2314512. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:31,784][52050] Avg episode reward: [(0, '5.450'), (1, '5.030')] [2023-10-10 04:48:33,156][53252] Updated weights for policy 0, policy_version 4520 (0.0010) [2023-10-10 04:48:33,527][53252] Updated weights for policy 0, policy_version 4530 (0.0007) [2023-10-10 04:48:33,747][53268] Updated weights for policy 1, policy_version 4520 (0.0009) [2023-10-10 04:48:33,894][53252] Updated weights for policy 0, policy_version 4540 (0.0007) [2023-10-10 04:48:34,122][53268] Updated weights for policy 1, policy_version 4530 (0.0009) [2023-10-10 04:48:34,483][53268] Updated weights for policy 1, policy_version 4540 (0.0010) [2023-10-10 04:48:36,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9306112. Throughput: 0: 1681.8, 1: 1667.2. Samples: 2334658. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:36,784][52050] Avg episode reward: [(0, '5.400'), (1, '4.920')] [2023-10-10 04:48:37,776][53252] Updated weights for policy 0, policy_version 4550 (0.0009) [2023-10-10 04:48:38,149][53252] Updated weights for policy 0, policy_version 4560 (0.0008) [2023-10-10 04:48:38,422][53268] Updated weights for policy 1, policy_version 4550 (0.0009) [2023-10-10 04:48:38,511][53252] Updated weights for policy 0, policy_version 4570 (0.0008) [2023-10-10 04:48:38,783][53268] Updated weights for policy 1, policy_version 4560 (0.0008) [2023-10-10 04:48:39,146][53268] Updated weights for policy 1, policy_version 4570 (0.0010) [2023-10-10 04:48:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9371648. Throughput: 0: 1681.6, 1: 1666.4. Samples: 2355266. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:41,784][52050] Avg episode reward: [(0, '4.980'), (1, '5.680')] [2023-10-10 04:48:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000004576_4685824.pth... 
[2023-10-10 04:48:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000004576_4685824.pth... [2023-10-10 04:48:41,828][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000003008_3080192.pth [2023-10-10 04:48:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000003008_3080192.pth [2023-10-10 04:48:42,674][53252] Updated weights for policy 0, policy_version 4580 (0.0008) [2023-10-10 04:48:43,048][53252] Updated weights for policy 0, policy_version 4590 (0.0008) [2023-10-10 04:48:43,305][53268] Updated weights for policy 1, policy_version 4580 (0.0010) [2023-10-10 04:48:43,417][53252] Updated weights for policy 0, policy_version 4600 (0.0008) [2023-10-10 04:48:43,675][53268] Updated weights for policy 1, policy_version 4590 (0.0008) [2023-10-10 04:48:44,048][53268] Updated weights for policy 1, policy_version 4600 (0.0009) [2023-10-10 04:48:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9437184. Throughput: 0: 1672.0, 1: 1652.7. Samples: 2364442. 
Policy #0 lag: (min: 10.0, avg: 11.7, max: 39.0) [2023-10-10 04:48:46,784][52050] Avg episode reward: [(0, '5.290'), (1, '6.010')] [2023-10-10 04:48:47,567][53252] Updated weights for policy 0, policy_version 4610 (0.0008) [2023-10-10 04:48:47,930][53252] Updated weights for policy 0, policy_version 4620 (0.0009) [2023-10-10 04:48:48,289][53268] Updated weights for policy 1, policy_version 4610 (0.0010) [2023-10-10 04:48:48,312][53252] Updated weights for policy 0, policy_version 4630 (0.0007) [2023-10-10 04:48:48,653][53268] Updated weights for policy 1, policy_version 4620 (0.0008) [2023-10-10 04:48:48,681][53252] Updated weights for policy 0, policy_version 4640 (0.0008) [2023-10-10 04:48:49,020][53268] Updated weights for policy 1, policy_version 4630 (0.0009) [2023-10-10 04:48:49,386][53268] Updated weights for policy 1, policy_version 4640 (0.0007) [2023-10-10 04:48:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9502720. Throughput: 0: 1677.5, 1: 1662.8. Samples: 2384636. Policy #0 lag: (min: 10.0, avg: 11.7, max: 39.0) [2023-10-10 04:48:51,784][52050] Avg episode reward: [(0, '5.170'), (1, '6.320')] [2023-10-10 04:48:52,807][53252] Updated weights for policy 0, policy_version 4650 (0.0007) [2023-10-10 04:48:53,172][53252] Updated weights for policy 0, policy_version 4660 (0.0010) [2023-10-10 04:48:53,542][53252] Updated weights for policy 0, policy_version 4670 (0.0008) [2023-10-10 04:48:53,570][53268] Updated weights for policy 1, policy_version 4650 (0.0009) [2023-10-10 04:48:53,951][53268] Updated weights for policy 1, policy_version 4660 (0.0009) [2023-10-10 04:48:54,309][53268] Updated weights for policy 1, policy_version 4670 (0.0009) [2023-10-10 04:48:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9568256. Throughput: 0: 1682.8, 1: 1666.0. Samples: 2405334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:48:56,784][52050] Avg episode reward: [(0, '5.300'), (1, '5.910')] [2023-10-10 04:48:57,361][53252] Updated weights for policy 0, policy_version 4680 (0.0008) [2023-10-10 04:48:57,725][53252] Updated weights for policy 0, policy_version 4690 (0.0008) [2023-10-10 04:48:58,102][53252] Updated weights for policy 0, policy_version 4700 (0.0010) [2023-10-10 04:48:58,326][53268] Updated weights for policy 1, policy_version 4680 (0.0009) [2023-10-10 04:48:58,698][53268] Updated weights for policy 1, policy_version 4690 (0.0007) [2023-10-10 04:48:59,055][53268] Updated weights for policy 1, policy_version 4700 (0.0008) [2023-10-10 04:49:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 9633792. Throughput: 0: 1681.2, 1: 1657.9. Samples: 2414718. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:49:01,785][52050] Avg episode reward: [(0, '5.420'), (1, '6.060')] [2023-10-10 04:49:02,171][53252] Updated weights for policy 0, policy_version 4710 (0.0008) [2023-10-10 04:49:02,549][53252] Updated weights for policy 0, policy_version 4720 (0.0008) [2023-10-10 04:49:02,926][53252] Updated weights for policy 0, policy_version 4730 (0.0008) [2023-10-10 04:49:03,103][53268] Updated weights for policy 1, policy_version 4710 (0.0010) [2023-10-10 04:49:03,463][53268] Updated weights for policy 1, policy_version 4720 (0.0009) [2023-10-10 04:49:03,834][53268] Updated weights for policy 1, policy_version 4730 (0.0008) [2023-10-10 04:49:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9699328. Throughput: 0: 1684.6, 1: 1668.7. Samples: 2435262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:49:06,784][52050] Avg episode reward: [(0, '5.850'), (1, '5.510')] [2023-10-10 04:49:06,880][53252] Updated weights for policy 0, policy_version 4740 (0.0008) [2023-10-10 04:49:07,249][53252] Updated weights for policy 0, policy_version 4750 (0.0007) [2023-10-10 04:49:07,622][53252] Updated weights for policy 0, policy_version 4760 (0.0007) [2023-10-10 04:49:07,918][52846] Saving new best policy, reward=5.850! [2023-10-10 04:49:08,053][53268] Updated weights for policy 1, policy_version 4740 (0.0008) [2023-10-10 04:49:08,424][53268] Updated weights for policy 1, policy_version 4750 (0.0008) [2023-10-10 04:49:08,780][53268] Updated weights for policy 1, policy_version 4760 (0.0007) [2023-10-10 04:49:11,628][53252] Updated weights for policy 0, policy_version 4770 (0.0007) [2023-10-10 04:49:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9764864. Throughput: 0: 1679.5, 1: 1667.3. Samples: 2455714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:49:11,784][52050] Avg episode reward: [(0, '5.140'), (1, '5.590')] [2023-10-10 04:49:11,993][53252] Updated weights for policy 0, policy_version 4780 (0.0008) [2023-10-10 04:49:12,375][53252] Updated weights for policy 0, policy_version 4790 (0.0008) [2023-10-10 04:49:12,742][53252] Updated weights for policy 0, policy_version 4800 (0.0009) [2023-10-10 04:49:12,885][53268] Updated weights for policy 1, policy_version 4770 (0.0008) [2023-10-10 04:49:13,246][53268] Updated weights for policy 1, policy_version 4780 (0.0008) [2023-10-10 04:49:13,629][53268] Updated weights for policy 1, policy_version 4790 (0.0011) [2023-10-10 04:49:13,997][53268] Updated weights for policy 1, policy_version 4800 (0.0009) [2023-10-10 04:49:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9830400. Throughput: 0: 1682.8, 1: 1658.6. Samples: 2464878. 
Policy #0 lag: (min: 7.0, avg: 9.8, max: 39.0) [2023-10-10 04:49:16,784][52050] Avg episode reward: [(0, '4.320'), (1, '6.070')] [2023-10-10 04:49:17,000][53252] Updated weights for policy 0, policy_version 4810 (0.0007) [2023-10-10 04:49:17,374][53252] Updated weights for policy 0, policy_version 4820 (0.0007) [2023-10-10 04:49:17,758][53252] Updated weights for policy 0, policy_version 4830 (0.0009) [2023-10-10 04:49:18,127][53268] Updated weights for policy 1, policy_version 4810 (0.0009) [2023-10-10 04:49:18,490][53268] Updated weights for policy 1, policy_version 4820 (0.0007) [2023-10-10 04:49:18,873][53268] Updated weights for policy 1, policy_version 4830 (0.0009) [2023-10-10 04:49:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9895936. Throughput: 0: 1681.0, 1: 1671.5. Samples: 2485520. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-10 04:49:21,784][52050] Avg episode reward: [(0, '4.670'), (1, '6.950')] [2023-10-10 04:49:21,821][53252] Updated weights for policy 0, policy_version 4840 (0.0009) [2023-10-10 04:49:22,191][53252] Updated weights for policy 0, policy_version 4850 (0.0008) [2023-10-10 04:49:22,572][53252] Updated weights for policy 0, policy_version 4860 (0.0009) [2023-10-10 04:49:22,837][53268] Updated weights for policy 1, policy_version 4840 (0.0008) [2023-10-10 04:49:23,199][53268] Updated weights for policy 1, policy_version 4850 (0.0008) [2023-10-10 04:49:23,567][53268] Updated weights for policy 1, policy_version 4860 (0.0009) [2023-10-10 04:49:26,669][53252] Updated weights for policy 0, policy_version 4870 (0.0009) [2023-10-10 04:49:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9961472. Throughput: 0: 1684.3, 1: 1676.6. Samples: 2506506. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-10 04:49:26,784][52050] Avg episode reward: [(0, '5.200'), (1, '7.250')] [2023-10-10 04:49:26,791][53061] Saving new best policy, reward=7.250! [2023-10-10 04:49:27,045][53252] Updated weights for policy 0, policy_version 4880 (0.0009) [2023-10-10 04:49:27,390][53268] Updated weights for policy 1, policy_version 4870 (0.0009) [2023-10-10 04:49:27,416][53252] Updated weights for policy 0, policy_version 4890 (0.0007) [2023-10-10 04:49:27,753][53268] Updated weights for policy 1, policy_version 4880 (0.0009) [2023-10-10 04:49:28,119][53268] Updated weights for policy 1, policy_version 4890 (0.0010) [2023-10-10 04:49:31,411][53252] Updated weights for policy 0, policy_version 4900 (0.0007) [2023-10-10 04:49:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10027008. Throughput: 0: 1685.7, 1: 1672.5. Samples: 2515562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:49:31,784][52050] Avg episode reward: [(0, '5.240'), (1, '7.170')] [2023-10-10 04:49:31,789][53252] Updated weights for policy 0, policy_version 4910 (0.0007) [2023-10-10 04:49:32,161][53252] Updated weights for policy 0, policy_version 4920 (0.0008) [2023-10-10 04:49:32,351][53268] Updated weights for policy 1, policy_version 4900 (0.0009) [2023-10-10 04:49:32,728][53268] Updated weights for policy 1, policy_version 4910 (0.0009) [2023-10-10 04:49:33,091][53268] Updated weights for policy 1, policy_version 4920 (0.0011) [2023-10-10 04:49:36,243][53252] Updated weights for policy 0, policy_version 4930 (0.0007) [2023-10-10 04:49:36,618][53252] Updated weights for policy 0, policy_version 4940 (0.0009) [2023-10-10 04:49:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 10092544. Throughput: 0: 1689.1, 1: 1676.9. Samples: 2536106. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:49:36,784][52050] Avg episode reward: [(0, '5.450'), (1, '6.870')] [2023-10-10 04:49:36,990][53252] Updated weights for policy 0, policy_version 4950 (0.0007) [2023-10-10 04:49:37,231][53268] Updated weights for policy 1, policy_version 4930 (0.0008) [2023-10-10 04:49:37,361][53252] Updated weights for policy 0, policy_version 4960 (0.0008) [2023-10-10 04:49:37,602][53268] Updated weights for policy 1, policy_version 4940 (0.0008) [2023-10-10 04:49:37,967][53268] Updated weights for policy 1, policy_version 4950 (0.0011) [2023-10-10 04:49:38,338][53268] Updated weights for policy 1, policy_version 4960 (0.0009) [2023-10-10 04:49:41,430][53252] Updated weights for policy 0, policy_version 4970 (0.0007) [2023-10-10 04:49:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10158080. Throughput: 0: 1678.0, 1: 1677.5. Samples: 2556332. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 04:49:41,784][52050] Avg episode reward: [(0, '5.690'), (1, '7.070')] [2023-10-10 04:49:41,803][53252] Updated weights for policy 0, policy_version 4980 (0.0008) [2023-10-10 04:49:42,162][53252] Updated weights for policy 0, policy_version 4990 (0.0009) [2023-10-10 04:49:42,671][53268] Updated weights for policy 1, policy_version 4970 (0.0008) [2023-10-10 04:49:43,039][53268] Updated weights for policy 1, policy_version 4980 (0.0007) [2023-10-10 04:49:43,412][53268] Updated weights for policy 1, policy_version 4990 (0.0007) [2023-10-10 04:49:46,375][53252] Updated weights for policy 0, policy_version 5000 (0.0009) [2023-10-10 04:49:46,754][53252] Updated weights for policy 0, policy_version 5010 (0.0010) [2023-10-10 04:49:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 10223616. Throughput: 0: 1682.0, 1: 1673.1. Samples: 2565698. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 04:49:46,784][52050] Avg episode reward: [(0, '5.480'), (1, '7.210')] [2023-10-10 04:49:47,136][53252] Updated weights for policy 0, policy_version 5020 (0.0008) [2023-10-10 04:49:47,426][53268] Updated weights for policy 1, policy_version 5000 (0.0008) [2023-10-10 04:49:47,788][53268] Updated weights for policy 1, policy_version 5010 (0.0008) [2023-10-10 04:49:48,155][53268] Updated weights for policy 1, policy_version 5020 (0.0008) [2023-10-10 04:49:51,082][53252] Updated weights for policy 0, policy_version 5030 (0.0008) [2023-10-10 04:49:51,458][53252] Updated weights for policy 0, policy_version 5040 (0.0008) [2023-10-10 04:49:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10289152. Throughput: 0: 1681.1, 1: 1678.9. Samples: 2586460. Policy #0 lag: (min: 22.0, avg: 23.4, max: 48.0) [2023-10-10 04:49:51,784][52050] Avg episode reward: [(0, '5.280'), (1, '7.560')] [2023-10-10 04:49:51,785][53061] Saving new best policy, reward=7.560! [2023-10-10 04:49:51,828][53252] Updated weights for policy 0, policy_version 5050 (0.0009) [2023-10-10 04:49:52,131][53268] Updated weights for policy 1, policy_version 5030 (0.0008) [2023-10-10 04:49:52,505][53268] Updated weights for policy 1, policy_version 5040 (0.0007) [2023-10-10 04:49:52,866][53268] Updated weights for policy 1, policy_version 5050 (0.0007) [2023-10-10 04:49:56,089][53252] Updated weights for policy 0, policy_version 5060 (0.0007) [2023-10-10 04:49:56,463][53252] Updated weights for policy 0, policy_version 5070 (0.0008) [2023-10-10 04:49:56,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 10354688. Throughput: 0: 1666.6, 1: 1680.4. Samples: 2606328. 
Policy #0 lag: (min: 22.0, avg: 23.4, max: 48.0) [2023-10-10 04:49:56,785][52050] Avg episode reward: [(0, '4.940'), (1, '8.020')] [2023-10-10 04:49:56,841][53252] Updated weights for policy 0, policy_version 5080 (0.0009) [2023-10-10 04:49:57,021][53268] Updated weights for policy 1, policy_version 5060 (0.0009) [2023-10-10 04:49:57,389][53268] Updated weights for policy 1, policy_version 5070 (0.0008) [2023-10-10 04:49:57,763][53268] Updated weights for policy 1, policy_version 5080 (0.0008) [2023-10-10 04:49:58,055][53061] Saving new best policy, reward=8.020! [2023-10-10 04:50:00,924][53252] Updated weights for policy 0, policy_version 5090 (0.0008) [2023-10-10 04:50:01,304][53252] Updated weights for policy 0, policy_version 5100 (0.0007) [2023-10-10 04:50:01,679][53252] Updated weights for policy 0, policy_version 5110 (0.0007) [2023-10-10 04:50:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 10420224. Throughput: 0: 1677.7, 1: 1680.1. Samples: 2615982. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:50:01,784][52050] Avg episode reward: [(0, '4.780'), (1, '8.540')] [2023-10-10 04:50:01,792][53268] Updated weights for policy 1, policy_version 5090 (0.0009) [2023-10-10 04:50:02,047][53252] Updated weights for policy 0, policy_version 5120 (0.0007) [2023-10-10 04:50:02,167][53268] Updated weights for policy 1, policy_version 5100 (0.0007) [2023-10-10 04:50:02,528][53268] Updated weights for policy 1, policy_version 5110 (0.0007) [2023-10-10 04:50:02,892][53061] Saving new best policy, reward=8.540! [2023-10-10 04:50:02,892][53268] Updated weights for policy 1, policy_version 5120 (0.0007) [2023-10-10 04:50:06,163][53252] Updated weights for policy 0, policy_version 5130 (0.0010) [2023-10-10 04:50:06,535][53252] Updated weights for policy 0, policy_version 5140 (0.0007) [2023-10-10 04:50:06,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10485760. 
Throughput: 0: 1675.4, 1: 1681.5. Samples: 2636580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:50:06,784][52050] Avg episode reward: [(0, '5.580'), (1, '8.170')] [2023-10-10 04:50:06,912][53252] Updated weights for policy 0, policy_version 5150 (0.0007) [2023-10-10 04:50:06,918][53268] Updated weights for policy 1, policy_version 5130 (0.0008) [2023-10-10 04:50:07,291][53268] Updated weights for policy 1, policy_version 5140 (0.0009) [2023-10-10 04:50:07,664][53268] Updated weights for policy 1, policy_version 5150 (0.0009) [2023-10-10 04:50:11,063][53252] Updated weights for policy 0, policy_version 5160 (0.0010) [2023-10-10 04:50:11,436][53252] Updated weights for policy 0, policy_version 5170 (0.0008) [2023-10-10 04:50:11,733][53268] Updated weights for policy 1, policy_version 5160 (0.0008) [2023-10-10 04:50:11,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 10551296. Throughput: 0: 1652.4, 1: 1676.6. Samples: 2656312. Policy #0 lag: (min: 11.0, avg: 30.9, max: 32.0) [2023-10-10 04:50:11,785][52050] Avg episode reward: [(0, '5.730'), (1, '8.130')] [2023-10-10 04:50:11,802][53252] Updated weights for policy 0, policy_version 5180 (0.0008) [2023-10-10 04:50:12,100][53268] Updated weights for policy 1, policy_version 5170 (0.0010) [2023-10-10 04:50:12,474][53268] Updated weights for policy 1, policy_version 5180 (0.0011) [2023-10-10 04:50:15,917][53252] Updated weights for policy 0, policy_version 5190 (0.0009) [2023-10-10 04:50:16,293][53252] Updated weights for policy 0, policy_version 5200 (0.0007) [2023-10-10 04:50:16,539][53268] Updated weights for policy 1, policy_version 5190 (0.0008) [2023-10-10 04:50:16,661][53252] Updated weights for policy 0, policy_version 5210 (0.0008) [2023-10-10 04:50:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10616832. Throughput: 0: 1666.9, 1: 1675.2. Samples: 2665958. 
Policy #0 lag: (min: 26.0, avg: 31.2, max: 58.0) [2023-10-10 04:50:16,784][52050] Avg episode reward: [(0, '6.160'), (1, '7.480')] [2023-10-10 04:50:16,880][52846] Saving new best policy, reward=6.160! [2023-10-10 04:50:16,911][53268] Updated weights for policy 1, policy_version 5200 (0.0008) [2023-10-10 04:50:17,286][53268] Updated weights for policy 1, policy_version 5210 (0.0009) [2023-10-10 04:50:20,762][53252] Updated weights for policy 0, policy_version 5220 (0.0008) [2023-10-10 04:50:21,133][53252] Updated weights for policy 0, policy_version 5230 (0.0009) [2023-10-10 04:50:21,385][53268] Updated weights for policy 1, policy_version 5220 (0.0007) [2023-10-10 04:50:21,507][53252] Updated weights for policy 0, policy_version 5240 (0.0008) [2023-10-10 04:50:21,744][53268] Updated weights for policy 1, policy_version 5230 (0.0008) [2023-10-10 04:50:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10682368. Throughput: 0: 1664.1, 1: 1678.0. Samples: 2686500. Policy #0 lag: (min: 26.0, avg: 31.2, max: 58.0) [2023-10-10 04:50:21,784][52050] Avg episode reward: [(0, '5.030'), (1, '7.340')] [2023-10-10 04:50:22,113][53268] Updated weights for policy 1, policy_version 5240 (0.0010) [2023-10-10 04:50:25,605][53252] Updated weights for policy 0, policy_version 5250 (0.0007) [2023-10-10 04:50:25,975][53252] Updated weights for policy 0, policy_version 5260 (0.0007) [2023-10-10 04:50:26,212][53268] Updated weights for policy 1, policy_version 5250 (0.0009) [2023-10-10 04:50:26,350][53252] Updated weights for policy 0, policy_version 5270 (0.0008) [2023-10-10 04:50:26,578][53268] Updated weights for policy 1, policy_version 5260 (0.0009) [2023-10-10 04:50:26,715][53252] Updated weights for policy 0, policy_version 5280 (0.0008) [2023-10-10 04:50:26,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10780672. Throughput: 0: 1651.3, 1: 1680.5. Samples: 2706264. 
Policy #0 lag: (min: 1.0, avg: 17.9, max: 33.0) [2023-10-10 04:50:26,784][52050] Avg episode reward: [(0, '5.840'), (1, '7.170')] [2023-10-10 04:50:26,952][53268] Updated weights for policy 1, policy_version 5270 (0.0008) [2023-10-10 04:50:27,312][53268] Updated weights for policy 1, policy_version 5280 (0.0007) [2023-10-10 04:50:30,740][53252] Updated weights for policy 0, policy_version 5290 (0.0007) [2023-10-10 04:50:31,107][53252] Updated weights for policy 0, policy_version 5300 (0.0007) [2023-10-10 04:50:31,338][53268] Updated weights for policy 1, policy_version 5290 (0.0007) [2023-10-10 04:50:31,477][53252] Updated weights for policy 0, policy_version 5310 (0.0008) [2023-10-10 04:50:31,715][53268] Updated weights for policy 1, policy_version 5300 (0.0009) [2023-10-10 04:50:31,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10846208. Throughput: 0: 1668.0, 1: 1680.3. Samples: 2716370. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-10 04:50:31,784][52050] Avg episode reward: [(0, '5.760'), (1, '7.550')] [2023-10-10 04:50:32,086][53268] Updated weights for policy 1, policy_version 5310 (0.0010) [2023-10-10 04:50:35,729][53252] Updated weights for policy 0, policy_version 5320 (0.0007) [2023-10-10 04:50:36,100][53252] Updated weights for policy 0, policy_version 5330 (0.0007) [2023-10-10 04:50:36,129][53268] Updated weights for policy 1, policy_version 5320 (0.0008) [2023-10-10 04:50:36,471][53252] Updated weights for policy 0, policy_version 5340 (0.0009) [2023-10-10 04:50:36,493][53268] Updated weights for policy 1, policy_version 5330 (0.0009) [2023-10-10 04:50:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10911744. Throughput: 0: 1663.5, 1: 1675.4. Samples: 2736710. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-10 04:50:36,784][52050] Avg episode reward: [(0, '5.920'), (1, '7.610')] [2023-10-10 04:50:36,859][53268] Updated weights for policy 1, policy_version 5340 (0.0010) [2023-10-10 04:50:40,411][53252] Updated weights for policy 0, policy_version 5350 (0.0008) [2023-10-10 04:50:40,771][53252] Updated weights for policy 0, policy_version 5360 (0.0009) [2023-10-10 04:50:41,029][53268] Updated weights for policy 1, policy_version 5350 (0.0009) [2023-10-10 04:50:41,139][53252] Updated weights for policy 0, policy_version 5370 (0.0009) [2023-10-10 04:50:41,385][53268] Updated weights for policy 1, policy_version 5360 (0.0010) [2023-10-10 04:50:41,752][53268] Updated weights for policy 1, policy_version 5370 (0.0009) [2023-10-10 04:50:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10977280. Throughput: 0: 1652.1, 1: 1670.0. Samples: 2755820. Policy #0 lag: (min: 4.0, avg: 4.0, max: 7.0) [2023-10-10 04:50:41,784][52050] Avg episode reward: [(0, '6.400'), (1, '7.460')] [2023-10-10 04:50:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000005376_5505024.pth... [2023-10-10 04:50:41,820][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000003808_3899392.pth [2023-10-10 04:50:41,823][52846] Saving new best policy, reward=6.400! [2023-10-10 04:50:41,972][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000005376_5505024.pth... 
[2023-10-10 04:50:42,008][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000003808_3899392.pth [2023-10-10 04:50:45,080][53252] Updated weights for policy 0, policy_version 5380 (0.0009) [2023-10-10 04:50:45,447][53252] Updated weights for policy 0, policy_version 5390 (0.0010) [2023-10-10 04:50:45,817][53252] Updated weights for policy 0, policy_version 5400 (0.0008) [2023-10-10 04:50:46,011][53268] Updated weights for policy 1, policy_version 5380 (0.0010) [2023-10-10 04:50:46,397][53268] Updated weights for policy 1, policy_version 5390 (0.0008) [2023-10-10 04:50:46,764][53268] Updated weights for policy 1, policy_version 5400 (0.0009) [2023-10-10 04:50:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11042816. Throughput: 0: 1672.8, 1: 1675.1. Samples: 2766636. Policy #0 lag: (min: 4.0, avg: 4.0, max: 7.0) [2023-10-10 04:50:46,784][52050] Avg episode reward: [(0, '6.150'), (1, '7.660')] [2023-10-10 04:50:49,888][53252] Updated weights for policy 0, policy_version 5410 (0.0008) [2023-10-10 04:50:50,254][53252] Updated weights for policy 0, policy_version 5420 (0.0009) [2023-10-10 04:50:50,634][53252] Updated weights for policy 0, policy_version 5430 (0.0010) [2023-10-10 04:50:50,868][53268] Updated weights for policy 1, policy_version 5410 (0.0008) [2023-10-10 04:50:51,011][53252] Updated weights for policy 0, policy_version 5440 (0.0010) [2023-10-10 04:50:51,233][53268] Updated weights for policy 1, policy_version 5420 (0.0008) [2023-10-10 04:50:51,604][53268] Updated weights for policy 1, policy_version 5430 (0.0008) [2023-10-10 04:50:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11108352. Throughput: 0: 1663.8, 1: 1670.5. Samples: 2786622. 
Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-10 04:50:51,784][52050] Avg episode reward: [(0, '6.080'), (1, '7.950')] [2023-10-10 04:50:51,973][53268] Updated weights for policy 1, policy_version 5440 (0.0009) [2023-10-10 04:50:55,124][53252] Updated weights for policy 0, policy_version 5450 (0.0009) [2023-10-10 04:50:55,508][53252] Updated weights for policy 0, policy_version 5460 (0.0009) [2023-10-10 04:50:55,869][53252] Updated weights for policy 0, policy_version 5470 (0.0007) [2023-10-10 04:50:55,977][53268] Updated weights for policy 1, policy_version 5450 (0.0009) [2023-10-10 04:50:56,354][53268] Updated weights for policy 1, policy_version 5460 (0.0008) [2023-10-10 04:50:56,721][53268] Updated weights for policy 1, policy_version 5470 (0.0007) [2023-10-10 04:50:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11173888. Throughput: 0: 1669.4, 1: 1660.1. Samples: 2806142. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-10 04:50:56,784][52050] Avg episode reward: [(0, '5.450'), (1, '8.560')] [2023-10-10 04:50:56,796][53061] Saving new best policy, reward=8.560! [2023-10-10 04:50:59,883][53252] Updated weights for policy 0, policy_version 5480 (0.0007) [2023-10-10 04:51:00,259][53252] Updated weights for policy 0, policy_version 5490 (0.0010) [2023-10-10 04:51:00,635][53252] Updated weights for policy 0, policy_version 5500 (0.0009) [2023-10-10 04:51:00,871][53268] Updated weights for policy 1, policy_version 5480 (0.0008) [2023-10-10 04:51:01,234][53268] Updated weights for policy 1, policy_version 5490 (0.0009) [2023-10-10 04:51:01,609][53268] Updated weights for policy 1, policy_version 5500 (0.0007) [2023-10-10 04:51:01,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 11272192. Throughput: 0: 1684.9, 1: 1675.7. Samples: 2817186. 
Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-10 04:51:01,785][52050] Avg episode reward: [(0, '5.900'), (1, '7.630')] [2023-10-10 04:51:04,685][53252] Updated weights for policy 0, policy_version 5510 (0.0008) [2023-10-10 04:51:05,061][53252] Updated weights for policy 0, policy_version 5520 (0.0008) [2023-10-10 04:51:05,439][53252] Updated weights for policy 0, policy_version 5530 (0.0009) [2023-10-10 04:51:05,695][53268] Updated weights for policy 1, policy_version 5510 (0.0009) [2023-10-10 04:51:06,062][53268] Updated weights for policy 1, policy_version 5520 (0.0007) [2023-10-10 04:51:06,439][53268] Updated weights for policy 1, policy_version 5530 (0.0007) [2023-10-10 04:51:06,783][52050] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 11337728. Throughput: 0: 1663.8, 1: 1675.8. Samples: 2836780. Policy #0 lag: (min: 17.0, avg: 25.6, max: 49.0) [2023-10-10 04:51:06,784][52050] Avg episode reward: [(0, '6.100'), (1, '7.560')] [2023-10-10 04:51:09,474][53252] Updated weights for policy 0, policy_version 5540 (0.0007) [2023-10-10 04:51:09,855][53252] Updated weights for policy 0, policy_version 5550 (0.0008) [2023-10-10 04:51:10,227][53252] Updated weights for policy 0, policy_version 5560 (0.0008) [2023-10-10 04:51:10,465][53268] Updated weights for policy 1, policy_version 5540 (0.0009) [2023-10-10 04:51:10,829][53268] Updated weights for policy 1, policy_version 5550 (0.0011) [2023-10-10 04:51:11,196][53268] Updated weights for policy 1, policy_version 5560 (0.0010) [2023-10-10 04:51:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 11403264. Throughput: 0: 1679.9, 1: 1659.5. Samples: 2856540. 
Policy #0 lag: (min: 17.0, avg: 33.1, max: 49.0) [2023-10-10 04:51:11,785][52050] Avg episode reward: [(0, '6.220'), (1, '7.080')] [2023-10-10 04:51:14,335][53252] Updated weights for policy 0, policy_version 5570 (0.0009) [2023-10-10 04:51:14,705][53252] Updated weights for policy 0, policy_version 5580 (0.0008) [2023-10-10 04:51:15,077][53252] Updated weights for policy 0, policy_version 5590 (0.0008) [2023-10-10 04:51:15,244][53268] Updated weights for policy 1, policy_version 5570 (0.0008) [2023-10-10 04:51:15,451][53252] Updated weights for policy 0, policy_version 5600 (0.0008) [2023-10-10 04:51:15,615][53268] Updated weights for policy 1, policy_version 5580 (0.0008) [2023-10-10 04:51:15,979][53268] Updated weights for policy 1, policy_version 5590 (0.0007) [2023-10-10 04:51:16,352][53268] Updated weights for policy 1, policy_version 5600 (0.0009) [2023-10-10 04:51:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 11468800. Throughput: 0: 1682.1, 1: 1677.6. Samples: 2867554. Policy #0 lag: (min: 17.0, avg: 33.1, max: 49.0) [2023-10-10 04:51:16,784][52050] Avg episode reward: [(0, '5.680'), (1, '7.510')] [2023-10-10 04:51:19,590][53252] Updated weights for policy 0, policy_version 5610 (0.0008) [2023-10-10 04:51:19,947][53252] Updated weights for policy 0, policy_version 5620 (0.0010) [2023-10-10 04:51:20,317][53252] Updated weights for policy 0, policy_version 5630 (0.0007) [2023-10-10 04:51:20,477][53268] Updated weights for policy 1, policy_version 5610 (0.0008) [2023-10-10 04:51:20,848][53268] Updated weights for policy 1, policy_version 5620 (0.0010) [2023-10-10 04:51:21,211][53268] Updated weights for policy 1, policy_version 5630 (0.0007) [2023-10-10 04:51:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 11534336. Throughput: 0: 1660.7, 1: 1676.1. Samples: 2886866. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:21,784][52050] Avg episode reward: [(0, '6.050'), (1, '7.790')] [2023-10-10 04:51:24,315][53252] Updated weights for policy 0, policy_version 5640 (0.0007) [2023-10-10 04:51:24,689][53252] Updated weights for policy 0, policy_version 5650 (0.0009) [2023-10-10 04:51:25,067][53252] Updated weights for policy 0, policy_version 5660 (0.0007) [2023-10-10 04:51:25,331][53268] Updated weights for policy 1, policy_version 5640 (0.0008) [2023-10-10 04:51:25,703][53268] Updated weights for policy 1, policy_version 5650 (0.0007) [2023-10-10 04:51:26,062][53268] Updated weights for policy 1, policy_version 5660 (0.0008) [2023-10-10 04:51:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11599872. Throughput: 0: 1689.1, 1: 1658.0. Samples: 2906442. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:26,784][52050] Avg episode reward: [(0, '5.930'), (1, '7.840')] [2023-10-10 04:51:29,118][53252] Updated weights for policy 0, policy_version 5670 (0.0008) [2023-10-10 04:51:29,494][53252] Updated weights for policy 0, policy_version 5680 (0.0010) [2023-10-10 04:51:29,861][53252] Updated weights for policy 0, policy_version 5690 (0.0008) [2023-10-10 04:51:30,111][53268] Updated weights for policy 1, policy_version 5670 (0.0009) [2023-10-10 04:51:30,480][53268] Updated weights for policy 1, policy_version 5680 (0.0008) [2023-10-10 04:51:30,856][53268] Updated weights for policy 1, policy_version 5690 (0.0009) [2023-10-10 04:51:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11665408. Throughput: 0: 1672.2, 1: 1677.9. Samples: 2917392. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:31,784][52050] Avg episode reward: [(0, '6.410'), (1, '8.240')] [2023-10-10 04:51:31,786][52846] Saving new best policy, reward=6.410! 
[2023-10-10 04:51:33,936][53252] Updated weights for policy 0, policy_version 5700 (0.0007) [2023-10-10 04:51:34,303][53252] Updated weights for policy 0, policy_version 5710 (0.0009) [2023-10-10 04:51:34,675][53252] Updated weights for policy 0, policy_version 5720 (0.0007) [2023-10-10 04:51:35,027][53268] Updated weights for policy 1, policy_version 5700 (0.0010) [2023-10-10 04:51:35,394][53268] Updated weights for policy 1, policy_version 5710 (0.0010) [2023-10-10 04:51:35,756][53268] Updated weights for policy 1, policy_version 5720 (0.0009) [2023-10-10 04:51:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11730944. Throughput: 0: 1666.3, 1: 1671.7. Samples: 2936830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:36,784][52050] Avg episode reward: [(0, '6.500'), (1, '7.960')] [2023-10-10 04:51:36,786][52846] Saving new best policy, reward=6.500! [2023-10-10 04:51:38,738][53252] Updated weights for policy 0, policy_version 5730 (0.0007) [2023-10-10 04:51:39,115][53252] Updated weights for policy 0, policy_version 5740 (0.0007) [2023-10-10 04:51:39,479][53252] Updated weights for policy 0, policy_version 5750 (0.0008) [2023-10-10 04:51:39,848][53252] Updated weights for policy 0, policy_version 5760 (0.0008) [2023-10-10 04:51:39,948][53268] Updated weights for policy 1, policy_version 5730 (0.0010) [2023-10-10 04:51:40,323][53268] Updated weights for policy 1, policy_version 5740 (0.0009) [2023-10-10 04:51:40,680][53268] Updated weights for policy 1, policy_version 5750 (0.0008) [2023-10-10 04:51:41,058][53268] Updated weights for policy 1, policy_version 5760 (0.0008) [2023-10-10 04:51:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11796480. Throughput: 0: 1680.9, 1: 1659.2. Samples: 2956448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:41,785][52050] Avg episode reward: [(0, '5.930'), (1, '7.230')] [2023-10-10 04:51:44,098][53252] Updated weights for policy 0, policy_version 5770 (0.0008) [2023-10-10 04:51:44,477][53252] Updated weights for policy 0, policy_version 5780 (0.0010) [2023-10-10 04:51:44,845][53252] Updated weights for policy 0, policy_version 5790 (0.0007) [2023-10-10 04:51:45,067][53268] Updated weights for policy 1, policy_version 5770 (0.0009) [2023-10-10 04:51:45,446][53268] Updated weights for policy 1, policy_version 5780 (0.0008) [2023-10-10 04:51:45,809][53268] Updated weights for policy 1, policy_version 5790 (0.0010) [2023-10-10 04:51:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11862016. Throughput: 0: 1660.4, 1: 1674.0. Samples: 2967232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:46,784][52050] Avg episode reward: [(0, '5.830'), (1, '7.020')] [2023-10-10 04:51:48,956][53252] Updated weights for policy 0, policy_version 5800 (0.0008) [2023-10-10 04:51:49,335][53252] Updated weights for policy 0, policy_version 5810 (0.0007) [2023-10-10 04:51:49,707][53252] Updated weights for policy 0, policy_version 5820 (0.0007) [2023-10-10 04:51:50,138][53268] Updated weights for policy 1, policy_version 5800 (0.0009) [2023-10-10 04:51:50,505][53268] Updated weights for policy 1, policy_version 5810 (0.0008) [2023-10-10 04:51:50,872][53268] Updated weights for policy 1, policy_version 5820 (0.0007) [2023-10-10 04:51:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11927552. Throughput: 0: 1668.8, 1: 1669.9. Samples: 2987022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:51,784][52050] Avg episode reward: [(0, '5.950'), (1, '6.320')] [2023-10-10 04:51:53,725][53252] Updated weights for policy 0, policy_version 5830 (0.0008) [2023-10-10 04:51:54,096][53252] Updated weights for policy 0, policy_version 5840 (0.0007) [2023-10-10 04:51:54,482][53252] Updated weights for policy 0, policy_version 5850 (0.0009) [2023-10-10 04:51:54,873][53268] Updated weights for policy 1, policy_version 5830 (0.0010) [2023-10-10 04:51:55,249][53268] Updated weights for policy 1, policy_version 5840 (0.0008) [2023-10-10 04:51:55,614][53268] Updated weights for policy 1, policy_version 5850 (0.0009) [2023-10-10 04:51:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11993088. Throughput: 0: 1675.3, 1: 1664.8. Samples: 3006846. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:51:56,784][52050] Avg episode reward: [(0, '6.410'), (1, '6.980')] [2023-10-10 04:51:58,617][53252] Updated weights for policy 0, policy_version 5860 (0.0009) [2023-10-10 04:51:58,985][53252] Updated weights for policy 0, policy_version 5870 (0.0007) [2023-10-10 04:51:59,357][53252] Updated weights for policy 0, policy_version 5880 (0.0008) [2023-10-10 04:51:59,669][53268] Updated weights for policy 1, policy_version 5860 (0.0007) [2023-10-10 04:52:00,044][53268] Updated weights for policy 1, policy_version 5870 (0.0010) [2023-10-10 04:52:00,413][53268] Updated weights for policy 1, policy_version 5880 (0.0009) [2023-10-10 04:52:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12058624. Throughput: 0: 1661.9, 1: 1673.6. Samples: 3017650. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:52:01,784][52050] Avg episode reward: [(0, '5.910'), (1, '7.760')] [2023-10-10 04:52:03,289][53252] Updated weights for policy 0, policy_version 5890 (0.0008) [2023-10-10 04:52:03,661][53252] Updated weights for policy 0, policy_version 5900 (0.0007) [2023-10-10 04:52:04,029][53252] Updated weights for policy 0, policy_version 5910 (0.0007) [2023-10-10 04:52:04,398][53252] Updated weights for policy 0, policy_version 5920 (0.0007) [2023-10-10 04:52:04,519][53268] Updated weights for policy 1, policy_version 5890 (0.0009) [2023-10-10 04:52:04,885][53268] Updated weights for policy 1, policy_version 5900 (0.0010) [2023-10-10 04:52:05,262][53268] Updated weights for policy 1, policy_version 5910 (0.0009) [2023-10-10 04:52:05,631][53268] Updated weights for policy 1, policy_version 5920 (0.0010) [2023-10-10 04:52:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12124160. Throughput: 0: 1683.2, 1: 1658.3. Samples: 3037232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:52:06,784][52050] Avg episode reward: [(0, '6.130'), (1, '8.620')] [2023-10-10 04:52:06,786][53061] Saving new best policy, reward=8.620! [2023-10-10 04:52:08,455][53252] Updated weights for policy 0, policy_version 5930 (0.0009) [2023-10-10 04:52:08,838][53252] Updated weights for policy 0, policy_version 5940 (0.0007) [2023-10-10 04:52:09,205][53252] Updated weights for policy 0, policy_version 5950 (0.0010) [2023-10-10 04:52:09,523][53268] Updated weights for policy 1, policy_version 5930 (0.0010) [2023-10-10 04:52:09,897][53268] Updated weights for policy 1, policy_version 5940 (0.0009) [2023-10-10 04:52:10,265][53268] Updated weights for policy 1, policy_version 5950 (0.0009) [2023-10-10 04:52:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 12189696. Throughput: 0: 1678.6, 1: 1675.2. Samples: 3057366. 
Policy #0 lag: (min: 17.0, avg: 36.3, max: 49.0) [2023-10-10 04:52:11,785][52050] Avg episode reward: [(0, '6.550'), (1, '8.630')] [2023-10-10 04:52:11,796][52846] Saving new best policy, reward=6.550! [2023-10-10 04:52:11,796][53061] Saving new best policy, reward=8.630! [2023-10-10 04:52:13,253][53252] Updated weights for policy 0, policy_version 5960 (0.0011) [2023-10-10 04:52:13,624][53252] Updated weights for policy 0, policy_version 5970 (0.0007) [2023-10-10 04:52:13,993][53252] Updated weights for policy 0, policy_version 5980 (0.0007) [2023-10-10 04:52:14,255][53268] Updated weights for policy 1, policy_version 5960 (0.0009) [2023-10-10 04:52:14,626][53268] Updated weights for policy 1, policy_version 5970 (0.0008) [2023-10-10 04:52:14,985][53268] Updated weights for policy 1, policy_version 5980 (0.0012) [2023-10-10 04:52:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12255232. Throughput: 0: 1662.6, 1: 1675.7. Samples: 3067618. Policy #0 lag: (min: 17.0, avg: 36.3, max: 49.0) [2023-10-10 04:52:16,784][52050] Avg episode reward: [(0, '7.130'), (1, '7.420')] [2023-10-10 04:52:16,785][52846] Saving new best policy, reward=7.130! [2023-10-10 04:52:18,163][53252] Updated weights for policy 0, policy_version 5990 (0.0008) [2023-10-10 04:52:18,535][53252] Updated weights for policy 0, policy_version 6000 (0.0007) [2023-10-10 04:52:18,906][53252] Updated weights for policy 0, policy_version 6010 (0.0008) [2023-10-10 04:52:19,143][53268] Updated weights for policy 1, policy_version 5990 (0.0008) [2023-10-10 04:52:19,507][53268] Updated weights for policy 1, policy_version 6000 (0.0010) [2023-10-10 04:52:19,882][53268] Updated weights for policy 1, policy_version 6010 (0.0009) [2023-10-10 04:52:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12320768. Throughput: 0: 1686.1, 1: 1656.4. Samples: 3087238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:52:21,784][52050] Avg episode reward: [(0, '6.520'), (1, '7.740')] [2023-10-10 04:52:22,743][53252] Updated weights for policy 0, policy_version 6020 (0.0008) [2023-10-10 04:52:23,125][53252] Updated weights for policy 0, policy_version 6030 (0.0008) [2023-10-10 04:52:23,501][53252] Updated weights for policy 0, policy_version 6040 (0.0009) [2023-10-10 04:52:23,939][53268] Updated weights for policy 1, policy_version 6020 (0.0011) [2023-10-10 04:52:24,308][53268] Updated weights for policy 1, policy_version 6030 (0.0010) [2023-10-10 04:52:24,679][53268] Updated weights for policy 1, policy_version 6040 (0.0008) [2023-10-10 04:52:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 12386304. Throughput: 0: 1683.3, 1: 1676.9. Samples: 3107656. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:52:26,784][52050] Avg episode reward: [(0, '6.800'), (1, '7.160')] [2023-10-10 04:52:27,710][53252] Updated weights for policy 0, policy_version 6050 (0.0008) [2023-10-10 04:52:28,087][53252] Updated weights for policy 0, policy_version 6060 (0.0009) [2023-10-10 04:52:28,468][53252] Updated weights for policy 0, policy_version 6070 (0.0008) [2023-10-10 04:52:28,712][53268] Updated weights for policy 1, policy_version 6050 (0.0008) [2023-10-10 04:52:28,839][53252] Updated weights for policy 0, policy_version 6080 (0.0009) [2023-10-10 04:52:29,077][53268] Updated weights for policy 1, policy_version 6060 (0.0009) [2023-10-10 04:52:29,448][53268] Updated weights for policy 1, policy_version 6070 (0.0008) [2023-10-10 04:52:29,818][53268] Updated weights for policy 1, policy_version 6080 (0.0009) [2023-10-10 04:52:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12451840. Throughput: 0: 1673.1, 1: 1669.1. Samples: 3117632. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:52:31,784][52050] Avg episode reward: [(0, '6.350'), (1, '6.950')] [2023-10-10 04:52:32,889][53252] Updated weights for policy 0, policy_version 6090 (0.0007) [2023-10-10 04:52:33,259][53252] Updated weights for policy 0, policy_version 6100 (0.0007) [2023-10-10 04:52:33,631][53252] Updated weights for policy 0, policy_version 6110 (0.0007) [2023-10-10 04:52:33,834][53268] Updated weights for policy 1, policy_version 6090 (0.0008) [2023-10-10 04:52:34,209][53268] Updated weights for policy 1, policy_version 6100 (0.0007) [2023-10-10 04:52:34,581][53268] Updated weights for policy 1, policy_version 6110 (0.0007) [2023-10-10 04:52:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12517376. Throughput: 0: 1684.8, 1: 1661.5. Samples: 3137604. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:52:36,784][52050] Avg episode reward: [(0, '5.880'), (1, '7.630')] [2023-10-10 04:52:37,736][53252] Updated weights for policy 0, policy_version 6120 (0.0008) [2023-10-10 04:52:38,120][53252] Updated weights for policy 0, policy_version 6130 (0.0009) [2023-10-10 04:52:38,484][53252] Updated weights for policy 0, policy_version 6140 (0.0007) [2023-10-10 04:52:38,721][53268] Updated weights for policy 1, policy_version 6120 (0.0007) [2023-10-10 04:52:39,085][53268] Updated weights for policy 1, policy_version 6130 (0.0008) [2023-10-10 04:52:39,454][53268] Updated weights for policy 1, policy_version 6140 (0.0009) [2023-10-10 04:52:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 12582912. Throughput: 0: 1681.4, 1: 1685.3. Samples: 3158344. Policy #0 lag: (min: 10.0, avg: 23.9, max: 42.0) [2023-10-10 04:52:41,784][52050] Avg episode reward: [(0, '5.770'), (1, '8.310')] [2023-10-10 04:52:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000006144_6291456.pth... 
[2023-10-10 04:52:41,791][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000006144_6291456.pth... [2023-10-10 04:52:41,822][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000004576_4685824.pth [2023-10-10 04:52:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000004576_4685824.pth [2023-10-10 04:52:42,487][53252] Updated weights for policy 0, policy_version 6150 (0.0007) [2023-10-10 04:52:42,854][53252] Updated weights for policy 0, policy_version 6160 (0.0008) [2023-10-10 04:52:43,233][53252] Updated weights for policy 0, policy_version 6170 (0.0007) [2023-10-10 04:52:43,462][53268] Updated weights for policy 1, policy_version 6150 (0.0008) [2023-10-10 04:52:43,829][53268] Updated weights for policy 1, policy_version 6160 (0.0008) [2023-10-10 04:52:44,197][53268] Updated weights for policy 1, policy_version 6170 (0.0007) [2023-10-10 04:52:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 12648448. Throughput: 0: 1671.7, 1: 1662.2. Samples: 3167676. Policy #0 lag: (min: 10.0, avg: 23.9, max: 42.0) [2023-10-10 04:52:46,784][52050] Avg episode reward: [(0, '5.530'), (1, '9.030')] [2023-10-10 04:52:46,786][53061] Saving new best policy, reward=9.030! [2023-10-10 04:52:47,268][53252] Updated weights for policy 0, policy_version 6180 (0.0007) [2023-10-10 04:52:47,642][53252] Updated weights for policy 0, policy_version 6190 (0.0009) [2023-10-10 04:52:48,010][53252] Updated weights for policy 0, policy_version 6200 (0.0009) [2023-10-10 04:52:48,357][53268] Updated weights for policy 1, policy_version 6180 (0.0009) [2023-10-10 04:52:48,727][53268] Updated weights for policy 1, policy_version 6190 (0.0009) [2023-10-10 04:52:49,091][53268] Updated weights for policy 1, policy_version 6200 (0.0008) [2023-10-10 04:52:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12713984. 
Throughput: 0: 1677.1, 1: 1672.8. Samples: 3187974. Policy #0 lag: (min: 30.0, avg: 34.1, max: 62.0) [2023-10-10 04:52:51,784][52050] Avg episode reward: [(0, '6.200'), (1, '9.090')] [2023-10-10 04:52:51,785][53061] Saving new best policy, reward=9.090! [2023-10-10 04:52:52,114][53252] Updated weights for policy 0, policy_version 6210 (0.0011) [2023-10-10 04:52:52,500][53252] Updated weights for policy 0, policy_version 6220 (0.0008) [2023-10-10 04:52:52,866][53252] Updated weights for policy 0, policy_version 6230 (0.0007) [2023-10-10 04:52:53,238][53252] Updated weights for policy 0, policy_version 6240 (0.0007) [2023-10-10 04:52:53,342][53268] Updated weights for policy 1, policy_version 6210 (0.0009) [2023-10-10 04:52:53,718][53268] Updated weights for policy 1, policy_version 6220 (0.0007) [2023-10-10 04:52:54,085][53268] Updated weights for policy 1, policy_version 6230 (0.0009) [2023-10-10 04:52:54,451][53268] Updated weights for policy 1, policy_version 6240 (0.0008) [2023-10-10 04:52:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 12779520. Throughput: 0: 1680.2, 1: 1677.4. Samples: 3208458. Policy #0 lag: (min: 30.0, avg: 34.1, max: 62.0) [2023-10-10 04:52:56,784][52050] Avg episode reward: [(0, '5.380'), (1, '8.380')] [2023-10-10 04:52:57,248][53252] Updated weights for policy 0, policy_version 6250 (0.0008) [2023-10-10 04:52:57,624][53252] Updated weights for policy 0, policy_version 6260 (0.0008) [2023-10-10 04:52:57,987][53252] Updated weights for policy 0, policy_version 6270 (0.0008) [2023-10-10 04:52:58,564][53268] Updated weights for policy 1, policy_version 6250 (0.0010) [2023-10-10 04:52:58,940][53268] Updated weights for policy 1, policy_version 6260 (0.0009) [2023-10-10 04:52:59,309][53268] Updated weights for policy 1, policy_version 6270 (0.0007) [2023-10-10 04:53:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12845056. 
Throughput: 0: 1680.6, 1: 1654.3. Samples: 3217688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:53:01,784][52050] Avg episode reward: [(0, '5.550'), (1, '7.830')] [2023-10-10 04:53:01,929][53252] Updated weights for policy 0, policy_version 6280 (0.0010) [2023-10-10 04:53:02,307][53252] Updated weights for policy 0, policy_version 6290 (0.0009) [2023-10-10 04:53:02,683][53252] Updated weights for policy 0, policy_version 6300 (0.0008) [2023-10-10 04:53:03,326][53268] Updated weights for policy 1, policy_version 6280 (0.0011) [2023-10-10 04:53:03,695][53268] Updated weights for policy 1, policy_version 6290 (0.0009) [2023-10-10 04:53:04,072][53268] Updated weights for policy 1, policy_version 6300 (0.0009) [2023-10-10 04:53:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12910592. Throughput: 0: 1681.7, 1: 1678.0. Samples: 3238428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:53:06,784][52050] Avg episode reward: [(0, '5.740'), (1, '7.430')] [2023-10-10 04:53:06,829][53252] Updated weights for policy 0, policy_version 6310 (0.0009) [2023-10-10 04:53:07,200][53252] Updated weights for policy 0, policy_version 6320 (0.0007) [2023-10-10 04:53:07,573][53252] Updated weights for policy 0, policy_version 6330 (0.0009) [2023-10-10 04:53:07,975][53268] Updated weights for policy 1, policy_version 6310 (0.0008) [2023-10-10 04:53:08,344][53268] Updated weights for policy 1, policy_version 6320 (0.0010) [2023-10-10 04:53:08,716][53268] Updated weights for policy 1, policy_version 6330 (0.0010) [2023-10-10 04:53:11,753][53252] Updated weights for policy 0, policy_version 6340 (0.0008) [2023-10-10 04:53:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12976128. Throughput: 0: 1683.0, 1: 1686.3. Samples: 3259276. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:53:11,784][52050] Avg episode reward: [(0, '6.440'), (1, '7.690')] [2023-10-10 04:53:12,115][53252] Updated weights for policy 0, policy_version 6350 (0.0008) [2023-10-10 04:53:12,490][53252] Updated weights for policy 0, policy_version 6360 (0.0008) [2023-10-10 04:53:12,839][53268] Updated weights for policy 1, policy_version 6340 (0.0009) [2023-10-10 04:53:13,208][53268] Updated weights for policy 1, policy_version 6350 (0.0008) [2023-10-10 04:53:13,572][53268] Updated weights for policy 1, policy_version 6360 (0.0008) [2023-10-10 04:53:16,534][53252] Updated weights for policy 0, policy_version 6370 (0.0008) [2023-10-10 04:53:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13041664. Throughput: 0: 1683.8, 1: 1664.4. Samples: 3268300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:53:16,784][52050] Avg episode reward: [(0, '6.760'), (1, '8.130')] [2023-10-10 04:53:16,914][53252] Updated weights for policy 0, policy_version 6380 (0.0008) [2023-10-10 04:53:17,292][53252] Updated weights for policy 0, policy_version 6390 (0.0008) [2023-10-10 04:53:17,585][53268] Updated weights for policy 1, policy_version 6370 (0.0007) [2023-10-10 04:53:17,657][53252] Updated weights for policy 0, policy_version 6400 (0.0008) [2023-10-10 04:53:17,966][53268] Updated weights for policy 1, policy_version 6380 (0.0009) [2023-10-10 04:53:18,332][53268] Updated weights for policy 1, policy_version 6390 (0.0009) [2023-10-10 04:53:18,700][53268] Updated weights for policy 1, policy_version 6400 (0.0009) [2023-10-10 04:53:21,725][53252] Updated weights for policy 0, policy_version 6410 (0.0007) [2023-10-10 04:53:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13107200. Throughput: 0: 1683.4, 1: 1675.8. Samples: 3288766. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-10 04:53:21,784][52050] Avg episode reward: [(0, '7.150'), (1, '8.110')] [2023-10-10 04:53:22,107][53252] Updated weights for policy 0, policy_version 6420 (0.0008) [2023-10-10 04:53:22,476][53252] Updated weights for policy 0, policy_version 6430 (0.0008) [2023-10-10 04:53:22,547][52846] Saving new best policy, reward=7.150! [2023-10-10 04:53:22,809][53268] Updated weights for policy 1, policy_version 6410 (0.0009) [2023-10-10 04:53:23,174][53268] Updated weights for policy 1, policy_version 6420 (0.0007) [2023-10-10 04:53:23,545][53268] Updated weights for policy 1, policy_version 6430 (0.0007) [2023-10-10 04:53:26,366][53252] Updated weights for policy 0, policy_version 6440 (0.0007) [2023-10-10 04:53:26,741][53252] Updated weights for policy 0, policy_version 6450 (0.0008) [2023-10-10 04:53:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 13172736. Throughput: 0: 1679.8, 1: 1669.9. Samples: 3309078. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-10 04:53:26,785][52050] Avg episode reward: [(0, '7.500'), (1, '8.400')] [2023-10-10 04:53:27,104][53252] Updated weights for policy 0, policy_version 6460 (0.0008) [2023-10-10 04:53:27,250][52846] Saving new best policy, reward=7.500! [2023-10-10 04:53:27,547][53268] Updated weights for policy 1, policy_version 6440 (0.0008) [2023-10-10 04:53:27,923][53268] Updated weights for policy 1, policy_version 6450 (0.0010) [2023-10-10 04:53:28,292][53268] Updated weights for policy 1, policy_version 6460 (0.0009) [2023-10-10 04:53:31,311][53252] Updated weights for policy 0, policy_version 6470 (0.0009) [2023-10-10 04:53:31,687][53252] Updated weights for policy 0, policy_version 6480 (0.0009) [2023-10-10 04:53:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13238272. Throughput: 0: 1686.8, 1: 1666.6. Samples: 3318580. 
Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 04:53:31,784][52050] Avg episode reward: [(0, '7.350'), (1, '9.030')] [2023-10-10 04:53:32,059][53252] Updated weights for policy 0, policy_version 6490 (0.0009) [2023-10-10 04:53:32,459][53268] Updated weights for policy 1, policy_version 6470 (0.0007) [2023-10-10 04:53:32,828][53268] Updated weights for policy 1, policy_version 6480 (0.0008) [2023-10-10 04:53:33,188][53268] Updated weights for policy 1, policy_version 6490 (0.0007) [2023-10-10 04:53:36,214][53252] Updated weights for policy 0, policy_version 6500 (0.0007) [2023-10-10 04:53:36,584][53252] Updated weights for policy 0, policy_version 6510 (0.0008) [2023-10-10 04:53:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13303808. Throughput: 0: 1684.4, 1: 1671.3. Samples: 3338984. Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 04:53:36,784][52050] Avg episode reward: [(0, '6.410'), (1, '9.160')] [2023-10-10 04:53:36,784][53061] Saving new best policy, reward=9.160! [2023-10-10 04:53:36,957][53252] Updated weights for policy 0, policy_version 6520 (0.0008) [2023-10-10 04:53:37,411][53268] Updated weights for policy 1, policy_version 6500 (0.0011) [2023-10-10 04:53:37,787][53268] Updated weights for policy 1, policy_version 6510 (0.0008) [2023-10-10 04:53:38,150][53268] Updated weights for policy 1, policy_version 6520 (0.0008) [2023-10-10 04:53:41,062][53252] Updated weights for policy 0, policy_version 6530 (0.0011) [2023-10-10 04:53:41,438][53252] Updated weights for policy 0, policy_version 6540 (0.0007) [2023-10-10 04:53:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13369344. Throughput: 0: 1674.6, 1: 1677.6. Samples: 3359306. 
Policy #0 lag: (min: 0.0, avg: 24.2, max: 32.0) [2023-10-10 04:53:41,784][52050] Avg episode reward: [(0, '6.500'), (1, '8.980')] [2023-10-10 04:53:41,811][53252] Updated weights for policy 0, policy_version 6550 (0.0008) [2023-10-10 04:53:42,192][53252] Updated weights for policy 0, policy_version 6560 (0.0009) [2023-10-10 04:53:42,318][53268] Updated weights for policy 1, policy_version 6530 (0.0009) [2023-10-10 04:53:42,694][53268] Updated weights for policy 1, policy_version 6540 (0.0008) [2023-10-10 04:53:43,056][53268] Updated weights for policy 1, policy_version 6550 (0.0009) [2023-10-10 04:53:43,429][53268] Updated weights for policy 1, policy_version 6560 (0.0007) [2023-10-10 04:53:46,202][53252] Updated weights for policy 0, policy_version 6570 (0.0007) [2023-10-10 04:53:46,571][53252] Updated weights for policy 0, policy_version 6580 (0.0007) [2023-10-10 04:53:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13434880. Throughput: 0: 1680.9, 1: 1675.0. Samples: 3368704. Policy #0 lag: (min: 0.0, avg: 24.2, max: 32.0) [2023-10-10 04:53:46,784][52050] Avg episode reward: [(0, '6.520'), (1, '8.470')] [2023-10-10 04:53:46,949][53252] Updated weights for policy 0, policy_version 6590 (0.0010) [2023-10-10 04:53:47,586][53268] Updated weights for policy 1, policy_version 6570 (0.0009) [2023-10-10 04:53:47,952][53268] Updated weights for policy 1, policy_version 6580 (0.0009) [2023-10-10 04:53:48,317][53268] Updated weights for policy 1, policy_version 6590 (0.0008) [2023-10-10 04:53:51,066][53252] Updated weights for policy 0, policy_version 6600 (0.0011) [2023-10-10 04:53:51,453][53252] Updated weights for policy 0, policy_version 6610 (0.0009) [2023-10-10 04:53:51,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 13500416. Throughput: 0: 1674.1, 1: 1677.4. Samples: 3389246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:53:51,784][52050] Avg episode reward: [(0, '6.580'), (1, '7.360')] [2023-10-10 04:53:51,822][53252] Updated weights for policy 0, policy_version 6620 (0.0008) [2023-10-10 04:53:52,398][53268] Updated weights for policy 1, policy_version 6600 (0.0008) [2023-10-10 04:53:52,767][53268] Updated weights for policy 1, policy_version 6610 (0.0008) [2023-10-10 04:53:53,134][53268] Updated weights for policy 1, policy_version 6620 (0.0008) [2023-10-10 04:53:55,878][53252] Updated weights for policy 0, policy_version 6630 (0.0009) [2023-10-10 04:53:56,254][53252] Updated weights for policy 0, policy_version 6640 (0.0008) [2023-10-10 04:53:56,620][53252] Updated weights for policy 0, policy_version 6650 (0.0009) [2023-10-10 04:53:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13565952. Throughput: 0: 1661.2, 1: 1671.2. Samples: 3409234. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:53:56,784][52050] Avg episode reward: [(0, '6.020'), (1, '7.560')] [2023-10-10 04:53:57,221][53268] Updated weights for policy 1, policy_version 6630 (0.0008) [2023-10-10 04:53:57,588][53268] Updated weights for policy 1, policy_version 6640 (0.0009) [2023-10-10 04:53:57,952][53268] Updated weights for policy 1, policy_version 6650 (0.0008) [2023-10-10 04:54:00,802][53252] Updated weights for policy 0, policy_version 6660 (0.0007) [2023-10-10 04:54:01,178][53252] Updated weights for policy 0, policy_version 6670 (0.0007) [2023-10-10 04:54:01,545][53252] Updated weights for policy 0, policy_version 6680 (0.0007) [2023-10-10 04:54:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 13631488. Throughput: 0: 1676.7, 1: 1673.0. Samples: 3419036. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:54:01,784][52050] Avg episode reward: [(0, '5.790'), (1, '8.290')] [2023-10-10 04:54:01,890][53268] Updated weights for policy 1, policy_version 6660 (0.0009) [2023-10-10 04:54:02,248][53268] Updated weights for policy 1, policy_version 6670 (0.0009) [2023-10-10 04:54:02,615][53268] Updated weights for policy 1, policy_version 6680 (0.0008) [2023-10-10 04:54:05,535][53252] Updated weights for policy 0, policy_version 6690 (0.0009) [2023-10-10 04:54:05,907][53252] Updated weights for policy 0, policy_version 6700 (0.0007) [2023-10-10 04:54:06,277][53252] Updated weights for policy 0, policy_version 6710 (0.0008) [2023-10-10 04:54:06,648][53252] Updated weights for policy 0, policy_version 6720 (0.0009) [2023-10-10 04:54:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13729792. Throughput: 0: 1683.7, 1: 1671.7. Samples: 3439762. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:54:06,784][52050] Avg episode reward: [(0, '5.830'), (1, '9.500')] [2023-10-10 04:54:06,829][53268] Updated weights for policy 1, policy_version 6690 (0.0009) [2023-10-10 04:54:07,194][53268] Updated weights for policy 1, policy_version 6700 (0.0008) [2023-10-10 04:54:07,565][53268] Updated weights for policy 1, policy_version 6710 (0.0009) [2023-10-10 04:54:07,931][53061] Saving new best policy, reward=9.500! [2023-10-10 04:54:07,933][53268] Updated weights for policy 1, policy_version 6720 (0.0008) [2023-10-10 04:54:10,796][53252] Updated weights for policy 0, policy_version 6730 (0.0008) [2023-10-10 04:54:11,179][53252] Updated weights for policy 0, policy_version 6740 (0.0010) [2023-10-10 04:54:11,549][53252] Updated weights for policy 0, policy_version 6750 (0.0011) [2023-10-10 04:54:11,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13795328. Throughput: 0: 1666.1, 1: 1673.1. Samples: 3459342. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:54:11,784][52050] Avg episode reward: [(0, '6.100'), (1, '8.970')] [2023-10-10 04:54:12,090][53268] Updated weights for policy 1, policy_version 6730 (0.0011) [2023-10-10 04:54:12,456][53268] Updated weights for policy 1, policy_version 6740 (0.0009) [2023-10-10 04:54:12,820][53268] Updated weights for policy 1, policy_version 6750 (0.0008) [2023-10-10 04:54:15,655][53252] Updated weights for policy 0, policy_version 6760 (0.0010) [2023-10-10 04:54:16,037][53252] Updated weights for policy 0, policy_version 6770 (0.0010) [2023-10-10 04:54:16,404][53252] Updated weights for policy 0, policy_version 6780 (0.0010) [2023-10-10 04:54:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13860864. Throughput: 0: 1677.1, 1: 1669.4. Samples: 3469170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:54:16,784][52050] Avg episode reward: [(0, '6.390'), (1, '8.450')] [2023-10-10 04:54:16,892][53268] Updated weights for policy 1, policy_version 6760 (0.0009) [2023-10-10 04:54:17,257][53268] Updated weights for policy 1, policy_version 6770 (0.0009) [2023-10-10 04:54:17,624][53268] Updated weights for policy 1, policy_version 6780 (0.0009) [2023-10-10 04:54:20,510][53252] Updated weights for policy 0, policy_version 6790 (0.0009) [2023-10-10 04:54:20,885][53252] Updated weights for policy 0, policy_version 6800 (0.0008) [2023-10-10 04:54:21,245][53252] Updated weights for policy 0, policy_version 6810 (0.0008) [2023-10-10 04:54:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13926400. Throughput: 0: 1673.9, 1: 1673.0. Samples: 3489594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:54:21,785][52050] Avg episode reward: [(0, '6.380'), (1, '8.230')] [2023-10-10 04:54:21,807][53268] Updated weights for policy 1, policy_version 6790 (0.0008) [2023-10-10 04:54:22,192][53268] Updated weights for policy 1, policy_version 6800 (0.0008) [2023-10-10 04:54:22,563][53268] Updated weights for policy 1, policy_version 6810 (0.0009) [2023-10-10 04:54:25,244][53252] Updated weights for policy 0, policy_version 6820 (0.0008) [2023-10-10 04:54:25,609][53252] Updated weights for policy 0, policy_version 6830 (0.0007) [2023-10-10 04:54:25,981][53252] Updated weights for policy 0, policy_version 6840 (0.0007) [2023-10-10 04:54:26,444][53268] Updated weights for policy 1, policy_version 6820 (0.0008) [2023-10-10 04:54:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 13991936. Throughput: 0: 1658.3, 1: 1673.7. Samples: 3509246. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 04:54:26,784][52050] Avg episode reward: [(0, '6.150'), (1, '9.060')] [2023-10-10 04:54:26,821][53268] Updated weights for policy 1, policy_version 6830 (0.0009) [2023-10-10 04:54:27,187][53268] Updated weights for policy 1, policy_version 6840 (0.0007) [2023-10-10 04:54:29,883][53252] Updated weights for policy 0, policy_version 6850 (0.0007) [2023-10-10 04:54:30,260][53252] Updated weights for policy 0, policy_version 6860 (0.0008) [2023-10-10 04:54:30,639][53252] Updated weights for policy 0, policy_version 6870 (0.0011) [2023-10-10 04:54:31,006][53252] Updated weights for policy 0, policy_version 6880 (0.0008) [2023-10-10 04:54:31,322][53268] Updated weights for policy 1, policy_version 6850 (0.0009) [2023-10-10 04:54:31,691][53268] Updated weights for policy 1, policy_version 6860 (0.0008) [2023-10-10 04:54:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14057472. Throughput: 0: 1683.6, 1: 1672.0. Samples: 3519702. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 04:54:31,784][52050] Avg episode reward: [(0, '6.310'), (1, '9.530')] [2023-10-10 04:54:32,061][53268] Updated weights for policy 1, policy_version 6870 (0.0012) [2023-10-10 04:54:32,427][53061] Saving new best policy, reward=9.530! [2023-10-10 04:54:32,432][53268] Updated weights for policy 1, policy_version 6880 (0.0009) [2023-10-10 04:54:35,034][53252] Updated weights for policy 0, policy_version 6890 (0.0007) [2023-10-10 04:54:35,405][53252] Updated weights for policy 0, policy_version 6900 (0.0008) [2023-10-10 04:54:35,781][53252] Updated weights for policy 0, policy_version 6910 (0.0009) [2023-10-10 04:54:36,482][53268] Updated weights for policy 1, policy_version 6890 (0.0010) [2023-10-10 04:54:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14123008. Throughput: 0: 1668.1, 1: 1678.8. Samples: 3539856. Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) [2023-10-10 04:54:36,785][52050] Avg episode reward: [(0, '5.780'), (1, '10.050')] [2023-10-10 04:54:36,861][53268] Updated weights for policy 1, policy_version 6900 (0.0011) [2023-10-10 04:54:37,227][53268] Updated weights for policy 1, policy_version 6910 (0.0009) [2023-10-10 04:54:37,301][53061] Saving new best policy, reward=10.050! [2023-10-10 04:54:39,887][53252] Updated weights for policy 0, policy_version 6920 (0.0009) [2023-10-10 04:54:40,263][53252] Updated weights for policy 0, policy_version 6930 (0.0009) [2023-10-10 04:54:40,648][53252] Updated weights for policy 0, policy_version 6940 (0.0008) [2023-10-10 04:54:41,378][53268] Updated weights for policy 1, policy_version 6920 (0.0011) [2023-10-10 04:54:41,740][53268] Updated weights for policy 1, policy_version 6930 (0.0010) [2023-10-10 04:54:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14188544. Throughput: 0: 1666.4, 1: 1671.4. Samples: 3559434. 
Policy #0 lag: (min: 24.0, avg: 49.3, max: 56.0) [2023-10-10 04:54:41,784][52050] Avg episode reward: [(0, '5.480'), (1, '9.850')] [2023-10-10 04:54:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000006944_7110656.pth... [2023-10-10 04:54:41,827][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000005376_5505024.pth [2023-10-10 04:54:42,107][53268] Updated weights for policy 1, policy_version 6940 (0.0008) [2023-10-10 04:54:42,251][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000006944_7110656.pth... [2023-10-10 04:54:42,289][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000005376_5505024.pth [2023-10-10 04:54:44,815][53252] Updated weights for policy 0, policy_version 6950 (0.0007) [2023-10-10 04:54:45,180][53252] Updated weights for policy 0, policy_version 6960 (0.0008) [2023-10-10 04:54:45,562][53252] Updated weights for policy 0, policy_version 6970 (0.0007) [2023-10-10 04:54:46,208][53268] Updated weights for policy 1, policy_version 6950 (0.0008) [2023-10-10 04:54:46,575][53268] Updated weights for policy 1, policy_version 6960 (0.0009) [2023-10-10 04:54:46,784][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14254080. Throughput: 0: 1677.3, 1: 1669.9. Samples: 3569660. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) [2023-10-10 04:54:46,785][52050] Avg episode reward: [(0, '5.760'), (1, '9.720')] [2023-10-10 04:54:46,949][53268] Updated weights for policy 1, policy_version 6970 (0.0007) [2023-10-10 04:54:49,631][53252] Updated weights for policy 0, policy_version 6980 (0.0007) [2023-10-10 04:54:50,005][53252] Updated weights for policy 0, policy_version 6990 (0.0008) [2023-10-10 04:54:50,377][53252] Updated weights for policy 0, policy_version 7000 (0.0007) [2023-10-10 04:54:51,100][53268] Updated weights for policy 1, policy_version 6980 (0.0008) [2023-10-10 04:54:51,467][53268] Updated weights for policy 1, policy_version 6990 (0.0010) [2023-10-10 04:54:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 14319616. Throughput: 0: 1652.3, 1: 1672.3. Samples: 3589366. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) [2023-10-10 04:54:51,784][52050] Avg episode reward: [(0, '5.340'), (1, '9.030')] [2023-10-10 04:54:51,844][53268] Updated weights for policy 1, policy_version 7000 (0.0009) [2023-10-10 04:54:54,489][53252] Updated weights for policy 0, policy_version 7010 (0.0008) [2023-10-10 04:54:54,859][53252] Updated weights for policy 0, policy_version 7020 (0.0008) [2023-10-10 04:54:55,235][53252] Updated weights for policy 0, policy_version 7030 (0.0008) [2023-10-10 04:54:55,612][53252] Updated weights for policy 0, policy_version 7040 (0.0009) [2023-10-10 04:54:55,987][53268] Updated weights for policy 1, policy_version 7010 (0.0011) [2023-10-10 04:54:56,365][53268] Updated weights for policy 1, policy_version 7020 (0.0010) [2023-10-10 04:54:56,738][53268] Updated weights for policy 1, policy_version 7030 (0.0010) [2023-10-10 04:54:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14385152. Throughput: 0: 1665.1, 1: 1666.3. Samples: 3609258. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:54:56,785][52050] Avg episode reward: [(0, '6.530'), (1, '9.270')] [2023-10-10 04:54:57,108][53268] Updated weights for policy 1, policy_version 7040 (0.0010) [2023-10-10 04:54:59,648][53252] Updated weights for policy 0, policy_version 7050 (0.0011) [2023-10-10 04:55:00,011][53252] Updated weights for policy 0, policy_version 7060 (0.0011) [2023-10-10 04:55:00,388][53252] Updated weights for policy 0, policy_version 7070 (0.0009) [2023-10-10 04:55:01,285][53268] Updated weights for policy 1, policy_version 7050 (0.0008) [2023-10-10 04:55:01,649][53268] Updated weights for policy 1, policy_version 7060 (0.0008) [2023-10-10 04:55:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 14450688. Throughput: 0: 1673.6, 1: 1671.7. Samples: 3619710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:55:01,784][52050] Avg episode reward: [(0, '6.130'), (1, '8.330')] [2023-10-10 04:55:02,023][53268] Updated weights for policy 1, policy_version 7070 (0.0007) [2023-10-10 04:55:04,545][53252] Updated weights for policy 0, policy_version 7080 (0.0010) [2023-10-10 04:55:04,927][53252] Updated weights for policy 0, policy_version 7090 (0.0009) [2023-10-10 04:55:05,302][53252] Updated weights for policy 0, policy_version 7100 (0.0008) [2023-10-10 04:55:06,091][53268] Updated weights for policy 1, policy_version 7080 (0.0008) [2023-10-10 04:55:06,450][53268] Updated weights for policy 1, policy_version 7090 (0.0010) [2023-10-10 04:55:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14516224. Throughput: 0: 1651.2, 1: 1672.2. Samples: 3639148. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 04:55:06,784][52050] Avg episode reward: [(0, '5.780'), (1, '8.350')] [2023-10-10 04:55:06,819][53268] Updated weights for policy 1, policy_version 7100 (0.0008) [2023-10-10 04:55:09,537][53252] Updated weights for policy 0, policy_version 7110 (0.0007) [2023-10-10 04:55:09,903][53252] Updated weights for policy 0, policy_version 7120 (0.0007) [2023-10-10 04:55:10,276][53252] Updated weights for policy 0, policy_version 7130 (0.0008) [2023-10-10 04:55:10,678][53268] Updated weights for policy 1, policy_version 7110 (0.0009) [2023-10-10 04:55:11,045][53268] Updated weights for policy 1, policy_version 7120 (0.0009) [2023-10-10 04:55:11,431][53268] Updated weights for policy 1, policy_version 7130 (0.0008) [2023-10-10 04:55:11,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14614528. Throughput: 0: 1670.3, 1: 1662.5. Samples: 3659222. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 04:55:11,784][52050] Avg episode reward: [(0, '5.690'), (1, '8.560')] [2023-10-10 04:55:14,380][53252] Updated weights for policy 0, policy_version 7140 (0.0008) [2023-10-10 04:55:14,749][53252] Updated weights for policy 0, policy_version 7150 (0.0007) [2023-10-10 04:55:15,131][53252] Updated weights for policy 0, policy_version 7160 (0.0009) [2023-10-10 04:55:15,482][53268] Updated weights for policy 1, policy_version 7140 (0.0009) [2023-10-10 04:55:15,850][53268] Updated weights for policy 1, policy_version 7150 (0.0008) [2023-10-10 04:55:16,216][53268] Updated weights for policy 1, policy_version 7160 (0.0011) [2023-10-10 04:55:16,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 14680064. Throughput: 0: 1662.4, 1: 1677.9. Samples: 3670018. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:55:16,785][52050] Avg episode reward: [(0, '6.290'), (1, '8.670')] [2023-10-10 04:55:18,978][53252] Updated weights for policy 0, policy_version 7170 (0.0008) [2023-10-10 04:55:19,344][53252] Updated weights for policy 0, policy_version 7180 (0.0007) [2023-10-10 04:55:19,711][53252] Updated weights for policy 0, policy_version 7190 (0.0007) [2023-10-10 04:55:20,082][53252] Updated weights for policy 0, policy_version 7200 (0.0007) [2023-10-10 04:55:20,388][53268] Updated weights for policy 1, policy_version 7170 (0.0011) [2023-10-10 04:55:20,760][53268] Updated weights for policy 1, policy_version 7180 (0.0011) [2023-10-10 04:55:21,123][53268] Updated weights for policy 1, policy_version 7190 (0.0010) [2023-10-10 04:55:21,496][53268] Updated weights for policy 1, policy_version 7200 (0.0011) [2023-10-10 04:55:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 14745600. Throughput: 0: 1654.4, 1: 1671.7. Samples: 3689528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:55:21,784][52050] Avg episode reward: [(0, '6.160'), (1, '9.570')] [2023-10-10 04:55:24,320][53252] Updated weights for policy 0, policy_version 7210 (0.0009) [2023-10-10 04:55:24,698][53252] Updated weights for policy 0, policy_version 7220 (0.0010) [2023-10-10 04:55:25,072][53252] Updated weights for policy 0, policy_version 7230 (0.0007) [2023-10-10 04:55:25,820][53268] Updated weights for policy 1, policy_version 7210 (0.0010) [2023-10-10 04:55:26,199][53268] Updated weights for policy 1, policy_version 7220 (0.0009) [2023-10-10 04:55:26,567][53268] Updated weights for policy 1, policy_version 7230 (0.0010) [2023-10-10 04:55:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14811136. Throughput: 0: 1667.4, 1: 1659.6. Samples: 3709148. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 34.0) [2023-10-10 04:55:26,784][52050] Avg episode reward: [(0, '6.120'), (1, '9.880')] [2023-10-10 04:55:29,104][53252] Updated weights for policy 0, policy_version 7240 (0.0010) [2023-10-10 04:55:29,479][53252] Updated weights for policy 0, policy_version 7250 (0.0009) [2023-10-10 04:55:29,854][53252] Updated weights for policy 0, policy_version 7260 (0.0008) [2023-10-10 04:55:30,685][53268] Updated weights for policy 1, policy_version 7240 (0.0009) [2023-10-10 04:55:31,057][53268] Updated weights for policy 1, policy_version 7250 (0.0011) [2023-10-10 04:55:31,425][53268] Updated weights for policy 1, policy_version 7260 (0.0010) [2023-10-10 04:55:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14876672. Throughput: 0: 1657.7, 1: 1673.4. Samples: 3719558. Policy #0 lag: (min: 29.0, avg: 29.0, max: 34.0) [2023-10-10 04:55:31,784][52050] Avg episode reward: [(0, '6.490'), (1, '9.680')] [2023-10-10 04:55:33,789][53252] Updated weights for policy 0, policy_version 7270 (0.0010) [2023-10-10 04:55:34,167][53252] Updated weights for policy 0, policy_version 7280 (0.0008) [2023-10-10 04:55:34,544][53252] Updated weights for policy 0, policy_version 7290 (0.0008) [2023-10-10 04:55:35,469][53268] Updated weights for policy 1, policy_version 7270 (0.0010) [2023-10-10 04:55:35,834][53268] Updated weights for policy 1, policy_version 7280 (0.0009) [2023-10-10 04:55:36,211][53268] Updated weights for policy 1, policy_version 7290 (0.0009) [2023-10-10 04:55:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 14942208. Throughput: 0: 1666.8, 1: 1677.4. Samples: 3739852. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) [2023-10-10 04:55:36,784][52050] Avg episode reward: [(0, '6.350'), (1, '9.600')] [2023-10-10 04:55:38,529][53252] Updated weights for policy 0, policy_version 7300 (0.0007) [2023-10-10 04:55:38,908][53252] Updated weights for policy 0, policy_version 7310 (0.0008) [2023-10-10 04:55:39,284][53252] Updated weights for policy 0, policy_version 7320 (0.0008) [2023-10-10 04:55:40,122][53268] Updated weights for policy 1, policy_version 7300 (0.0009) [2023-10-10 04:55:40,477][53268] Updated weights for policy 1, policy_version 7310 (0.0010) [2023-10-10 04:55:40,847][53268] Updated weights for policy 1, policy_version 7320 (0.0009) [2023-10-10 04:55:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15007744. Throughput: 0: 1677.6, 1: 1659.5. Samples: 3759426. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) [2023-10-10 04:55:41,784][52050] Avg episode reward: [(0, '6.760'), (1, '8.700')] [2023-10-10 04:55:43,401][53252] Updated weights for policy 0, policy_version 7330 (0.0009) [2023-10-10 04:55:43,769][53252] Updated weights for policy 0, policy_version 7340 (0.0009) [2023-10-10 04:55:44,153][53252] Updated weights for policy 0, policy_version 7350 (0.0008) [2023-10-10 04:55:44,531][53252] Updated weights for policy 0, policy_version 7360 (0.0008) [2023-10-10 04:55:44,874][53268] Updated weights for policy 1, policy_version 7330 (0.0008) [2023-10-10 04:55:45,245][53268] Updated weights for policy 1, policy_version 7340 (0.0007) [2023-10-10 04:55:45,623][53268] Updated weights for policy 1, policy_version 7350 (0.0009) [2023-10-10 04:55:45,984][53268] Updated weights for policy 1, policy_version 7360 (0.0010) [2023-10-10 04:55:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 15073280. Throughput: 0: 1655.7, 1: 1684.1. Samples: 3770002. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 04:55:46,784][52050] Avg episode reward: [(0, '6.790'), (1, '9.420')] [2023-10-10 04:55:48,659][53252] Updated weights for policy 0, policy_version 7370 (0.0011) [2023-10-10 04:55:49,023][53252] Updated weights for policy 0, policy_version 7380 (0.0011) [2023-10-10 04:55:49,402][53252] Updated weights for policy 0, policy_version 7390 (0.0007) [2023-10-10 04:55:50,220][53268] Updated weights for policy 1, policy_version 7370 (0.0010) [2023-10-10 04:55:50,590][53268] Updated weights for policy 1, policy_version 7380 (0.0009) [2023-10-10 04:55:50,959][53268] Updated weights for policy 1, policy_version 7390 (0.0009) [2023-10-10 04:55:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15138816. Throughput: 0: 1676.8, 1: 1675.5. Samples: 3790000. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 04:55:51,784][52050] Avg episode reward: [(0, '7.310'), (1, '9.690')] [2023-10-10 04:55:53,425][53252] Updated weights for policy 0, policy_version 7400 (0.0009) [2023-10-10 04:55:53,793][53252] Updated weights for policy 0, policy_version 7410 (0.0007) [2023-10-10 04:55:54,160][53252] Updated weights for policy 0, policy_version 7420 (0.0009) [2023-10-10 04:55:54,970][53268] Updated weights for policy 1, policy_version 7400 (0.0007) [2023-10-10 04:55:55,342][53268] Updated weights for policy 1, policy_version 7410 (0.0008) [2023-10-10 04:55:55,706][53268] Updated weights for policy 1, policy_version 7420 (0.0008) [2023-10-10 04:55:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 15204352. Throughput: 0: 1684.6, 1: 1662.4. Samples: 3809836. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 04:55:56,784][52050] Avg episode reward: [(0, '6.630'), (1, '10.670')] [2023-10-10 04:55:56,798][53061] Saving new best policy, reward=10.670! 
[2023-10-10 04:55:58,108][53252] Updated weights for policy 0, policy_version 7430 (0.0009) [2023-10-10 04:55:58,480][53252] Updated weights for policy 0, policy_version 7440 (0.0009) [2023-10-10 04:55:58,860][53252] Updated weights for policy 0, policy_version 7450 (0.0008) [2023-10-10 04:55:59,854][53268] Updated weights for policy 1, policy_version 7430 (0.0009) [2023-10-10 04:56:00,226][53268] Updated weights for policy 1, policy_version 7440 (0.0007) [2023-10-10 04:56:00,595][53268] Updated weights for policy 1, policy_version 7450 (0.0007) [2023-10-10 04:56:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 15269888. Throughput: 0: 1660.8, 1: 1676.4. Samples: 3820188. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 04:56:01,784][52050] Avg episode reward: [(0, '5.510'), (1, '9.740')] [2023-10-10 04:56:02,943][53252] Updated weights for policy 0, policy_version 7460 (0.0008) [2023-10-10 04:56:03,325][53252] Updated weights for policy 0, policy_version 7470 (0.0009) [2023-10-10 04:56:03,692][53252] Updated weights for policy 0, policy_version 7480 (0.0009) [2023-10-10 04:56:04,490][53268] Updated weights for policy 1, policy_version 7460 (0.0010) [2023-10-10 04:56:04,862][53268] Updated weights for policy 1, policy_version 7470 (0.0009) [2023-10-10 04:56:05,228][53268] Updated weights for policy 1, policy_version 7480 (0.0008) [2023-10-10 04:56:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 15335424. Throughput: 0: 1689.3, 1: 1656.8. Samples: 3840102. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:56:06,784][52050] Avg episode reward: [(0, '5.910'), (1, '9.960')] [2023-10-10 04:56:07,945][53252] Updated weights for policy 0, policy_version 7490 (0.0008) [2023-10-10 04:56:08,316][53252] Updated weights for policy 0, policy_version 7500 (0.0008) [2023-10-10 04:56:08,688][53252] Updated weights for policy 0, policy_version 7510 (0.0007) [2023-10-10 04:56:09,059][53252] Updated weights for policy 0, policy_version 7520 (0.0008) [2023-10-10 04:56:09,302][53268] Updated weights for policy 1, policy_version 7490 (0.0008) [2023-10-10 04:56:09,677][53268] Updated weights for policy 1, policy_version 7500 (0.0009) [2023-10-10 04:56:10,042][53268] Updated weights for policy 1, policy_version 7510 (0.0007) [2023-10-10 04:56:10,412][53268] Updated weights for policy 1, policy_version 7520 (0.0008) [2023-10-10 04:56:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 15400960. Throughput: 0: 1693.2, 1: 1667.8. Samples: 3860394. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:56:11,784][52050] Avg episode reward: [(0, '6.060'), (1, '10.240')] [2023-10-10 04:56:13,082][53252] Updated weights for policy 0, policy_version 7530 (0.0007) [2023-10-10 04:56:13,457][53252] Updated weights for policy 0, policy_version 7540 (0.0009) [2023-10-10 04:56:13,826][53252] Updated weights for policy 0, policy_version 7550 (0.0010) [2023-10-10 04:56:14,601][53268] Updated weights for policy 1, policy_version 7530 (0.0009) [2023-10-10 04:56:14,984][53268] Updated weights for policy 1, policy_version 7540 (0.0009) [2023-10-10 04:56:15,355][53268] Updated weights for policy 1, policy_version 7550 (0.0008) [2023-10-10 04:56:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15466496. Throughput: 0: 1674.1, 1: 1684.0. Samples: 3870676. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-10 04:56:16,784][52050] Avg episode reward: [(0, '6.290'), (1, '10.590')] [2023-10-10 04:56:17,744][53252] Updated weights for policy 0, policy_version 7560 (0.0009) [2023-10-10 04:56:18,114][53252] Updated weights for policy 0, policy_version 7570 (0.0009) [2023-10-10 04:56:18,488][53252] Updated weights for policy 0, policy_version 7580 (0.0009) [2023-10-10 04:56:19,558][53268] Updated weights for policy 1, policy_version 7560 (0.0008) [2023-10-10 04:56:19,923][53268] Updated weights for policy 1, policy_version 7570 (0.0009) [2023-10-10 04:56:20,281][53268] Updated weights for policy 1, policy_version 7580 (0.0007) [2023-10-10 04:56:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15532032. Throughput: 0: 1684.4, 1: 1657.2. Samples: 3890226. Policy #0 lag: (min: 31.0, avg: 31.4, max: 43.0) [2023-10-10 04:56:21,784][52050] Avg episode reward: [(0, '6.260'), (1, '11.310')] [2023-10-10 04:56:21,785][53061] Saving new best policy, reward=11.310! [2023-10-10 04:56:22,640][53252] Updated weights for policy 0, policy_version 7590 (0.0008) [2023-10-10 04:56:23,006][53252] Updated weights for policy 0, policy_version 7600 (0.0007) [2023-10-10 04:56:23,384][53252] Updated weights for policy 0, policy_version 7610 (0.0009) [2023-10-10 04:56:24,187][53268] Updated weights for policy 1, policy_version 7590 (0.0008) [2023-10-10 04:56:24,560][53268] Updated weights for policy 1, policy_version 7600 (0.0010) [2023-10-10 04:56:24,922][53268] Updated weights for policy 1, policy_version 7610 (0.0007) [2023-10-10 04:56:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15597568. Throughput: 0: 1684.6, 1: 1677.2. Samples: 3910708. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 04:56:26,784][52050] Avg episode reward: [(0, '6.240'), (1, '11.270')] [2023-10-10 04:56:27,528][53252] Updated weights for policy 0, policy_version 7620 (0.0010) [2023-10-10 04:56:27,901][53252] Updated weights for policy 0, policy_version 7630 (0.0007) [2023-10-10 04:56:28,264][53252] Updated weights for policy 0, policy_version 7640 (0.0009) [2023-10-10 04:56:29,011][53268] Updated weights for policy 1, policy_version 7620 (0.0007) [2023-10-10 04:56:29,389][53268] Updated weights for policy 1, policy_version 7630 (0.0009) [2023-10-10 04:56:29,756][53268] Updated weights for policy 1, policy_version 7640 (0.0007) [2023-10-10 04:56:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15663104. Throughput: 0: 1677.1, 1: 1672.5. Samples: 3920734. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 04:56:31,784][52050] Avg episode reward: [(0, '6.420'), (1, '10.360')] [2023-10-10 04:56:32,399][53252] Updated weights for policy 0, policy_version 7650 (0.0009) [2023-10-10 04:56:32,789][53252] Updated weights for policy 0, policy_version 7660 (0.0011) [2023-10-10 04:56:33,166][53252] Updated weights for policy 0, policy_version 7670 (0.0008) [2023-10-10 04:56:33,547][53252] Updated weights for policy 0, policy_version 7680 (0.0009) [2023-10-10 04:56:33,910][53268] Updated weights for policy 1, policy_version 7650 (0.0009) [2023-10-10 04:56:34,274][53268] Updated weights for policy 1, policy_version 7660 (0.0008) [2023-10-10 04:56:34,645][53268] Updated weights for policy 1, policy_version 7670 (0.0008) [2023-10-10 04:56:35,019][53268] Updated weights for policy 1, policy_version 7680 (0.0008) [2023-10-10 04:56:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15728640. Throughput: 0: 1682.9, 1: 1658.6. Samples: 3940368. 
Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) [2023-10-10 04:56:36,784][52050] Avg episode reward: [(0, '6.190'), (1, '9.210')] [2023-10-10 04:56:37,589][53252] Updated weights for policy 0, policy_version 7690 (0.0007) [2023-10-10 04:56:37,962][53252] Updated weights for policy 0, policy_version 7700 (0.0007) [2023-10-10 04:56:38,329][53252] Updated weights for policy 0, policy_version 7710 (0.0008) [2023-10-10 04:56:39,176][53268] Updated weights for policy 1, policy_version 7690 (0.0011) [2023-10-10 04:56:39,554][53268] Updated weights for policy 1, policy_version 7700 (0.0009) [2023-10-10 04:56:39,924][53268] Updated weights for policy 1, policy_version 7710 (0.0010) [2023-10-10 04:56:41,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 15794176. Throughput: 0: 1679.6, 1: 1673.7. Samples: 3960736. Policy #0 lag: (min: 25.0, avg: 31.2, max: 57.0) [2023-10-10 04:56:41,785][52050] Avg episode reward: [(0, '5.300'), (1, '9.510')] [2023-10-10 04:56:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000007712_7897088.pth... [2023-10-10 04:56:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000007712_7897088.pth... 
[2023-10-10 04:56:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000006144_6291456.pth [2023-10-10 04:56:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000006144_6291456.pth [2023-10-10 04:56:41,831][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000007712_7897088.pth [2023-10-10 04:56:41,836][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000007712_7897088.pth [2023-10-10 04:56:42,450][53252] Updated weights for policy 0, policy_version 7720 (0.0008) [2023-10-10 04:56:42,841][53252] Updated weights for policy 0, policy_version 7730 (0.0009) [2023-10-10 04:56:43,204][53252] Updated weights for policy 0, policy_version 7740 (0.0007) [2023-10-10 04:56:44,161][53268] Updated weights for policy 1, policy_version 7720 (0.0011) [2023-10-10 04:56:44,526][53268] Updated weights for policy 1, policy_version 7730 (0.0010) [2023-10-10 04:56:44,898][53268] Updated weights for policy 1, policy_version 7740 (0.0011) [2023-10-10 04:56:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 15859712. Throughput: 0: 1679.2, 1: 1664.5. Samples: 3970656. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:56:46,784][52050] Avg episode reward: [(0, '5.900'), (1, '8.770')] [2023-10-10 04:56:47,413][53252] Updated weights for policy 0, policy_version 7750 (0.0007) [2023-10-10 04:56:47,796][53252] Updated weights for policy 0, policy_version 7760 (0.0008) [2023-10-10 04:56:48,166][53252] Updated weights for policy 0, policy_version 7770 (0.0007) [2023-10-10 04:56:48,933][53268] Updated weights for policy 1, policy_version 7750 (0.0011) [2023-10-10 04:56:49,303][53268] Updated weights for policy 1, policy_version 7760 (0.0011) [2023-10-10 04:56:49,674][53268] Updated weights for policy 1, policy_version 7770 (0.0009) [2023-10-10 04:56:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15925248. Throughput: 0: 1677.6, 1: 1657.3. Samples: 3990174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:56:51,784][52050] Avg episode reward: [(0, '6.280'), (1, '10.120')] [2023-10-10 04:56:52,269][53252] Updated weights for policy 0, policy_version 7780 (0.0008) [2023-10-10 04:56:52,639][53252] Updated weights for policy 0, policy_version 7790 (0.0007) [2023-10-10 04:56:53,022][53252] Updated weights for policy 0, policy_version 7800 (0.0007) [2023-10-10 04:56:53,726][53268] Updated weights for policy 1, policy_version 7780 (0.0009) [2023-10-10 04:56:54,097][53268] Updated weights for policy 1, policy_version 7790 (0.0010) [2023-10-10 04:56:54,458][53268] Updated weights for policy 1, policy_version 7800 (0.0010) [2023-10-10 04:56:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 15990784. Throughput: 0: 1679.2, 1: 1669.2. Samples: 4011070. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 04:56:56,784][52050] Avg episode reward: [(0, '6.770'), (1, '10.430')] [2023-10-10 04:56:56,796][53252] Updated weights for policy 0, policy_version 7810 (0.0007) [2023-10-10 04:56:57,170][53252] Updated weights for policy 0, policy_version 7820 (0.0010) [2023-10-10 04:56:57,546][53252] Updated weights for policy 0, policy_version 7830 (0.0010) [2023-10-10 04:56:57,926][53252] Updated weights for policy 0, policy_version 7840 (0.0009) [2023-10-10 04:56:58,500][53268] Updated weights for policy 1, policy_version 7810 (0.0008) [2023-10-10 04:56:58,873][53268] Updated weights for policy 1, policy_version 7820 (0.0008) [2023-10-10 04:56:59,242][53268] Updated weights for policy 1, policy_version 7830 (0.0007) [2023-10-10 04:56:59,615][53268] Updated weights for policy 1, policy_version 7840 (0.0007) [2023-10-10 04:57:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16056320. Throughput: 0: 1682.5, 1: 1654.3. Samples: 4020830. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 04:57:01,784][52050] Avg episode reward: [(0, '6.740'), (1, '10.280')] [2023-10-10 04:57:01,855][53252] Updated weights for policy 0, policy_version 7850 (0.0010) [2023-10-10 04:57:02,233][53252] Updated weights for policy 0, policy_version 7860 (0.0010) [2023-10-10 04:57:02,598][53252] Updated weights for policy 0, policy_version 7870 (0.0010) [2023-10-10 04:57:03,811][53268] Updated weights for policy 1, policy_version 7850 (0.0009) [2023-10-10 04:57:04,174][53268] Updated weights for policy 1, policy_version 7860 (0.0009) [2023-10-10 04:57:04,542][53268] Updated weights for policy 1, policy_version 7870 (0.0010) [2023-10-10 04:57:06,637][53252] Updated weights for policy 0, policy_version 7880 (0.0008) [2023-10-10 04:57:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16121856. Throughput: 0: 1682.2, 1: 1661.9. Samples: 4040710. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:57:06,784][52050] Avg episode reward: [(0, '6.600'), (1, '10.730')] [2023-10-10 04:57:07,021][53252] Updated weights for policy 0, policy_version 7890 (0.0007) [2023-10-10 04:57:07,395][53252] Updated weights for policy 0, policy_version 7900 (0.0010) [2023-10-10 04:57:08,765][53268] Updated weights for policy 1, policy_version 7880 (0.0008) [2023-10-10 04:57:09,130][53268] Updated weights for policy 1, policy_version 7890 (0.0010) [2023-10-10 04:57:09,508][53268] Updated weights for policy 1, policy_version 7900 (0.0010) [2023-10-10 04:57:11,503][53252] Updated weights for policy 0, policy_version 7910 (0.0009) [2023-10-10 04:57:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 16187392. Throughput: 0: 1677.7, 1: 1663.1. Samples: 4061046. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:57:11,785][52050] Avg episode reward: [(0, '6.650'), (1, '10.110')] [2023-10-10 04:57:11,882][53252] Updated weights for policy 0, policy_version 7920 (0.0011) [2023-10-10 04:57:12,247][53252] Updated weights for policy 0, policy_version 7930 (0.0007) [2023-10-10 04:57:13,488][53268] Updated weights for policy 1, policy_version 7910 (0.0010) [2023-10-10 04:57:13,857][53268] Updated weights for policy 1, policy_version 7920 (0.0011) [2023-10-10 04:57:14,226][53268] Updated weights for policy 1, policy_version 7930 (0.0011) [2023-10-10 04:57:16,387][53252] Updated weights for policy 0, policy_version 7940 (0.0009) [2023-10-10 04:57:16,768][53252] Updated weights for policy 0, policy_version 7950 (0.0008) [2023-10-10 04:57:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 16252928. Throughput: 0: 1683.1, 1: 1645.7. Samples: 4070532. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 04:57:16,784][52050] Avg episode reward: [(0, '6.590'), (1, '9.920')] [2023-10-10 04:57:17,135][53252] Updated weights for policy 0, policy_version 7960 (0.0007) [2023-10-10 04:57:18,337][53268] Updated weights for policy 1, policy_version 7940 (0.0009) [2023-10-10 04:57:18,718][53268] Updated weights for policy 1, policy_version 7950 (0.0009) [2023-10-10 04:57:19,083][53268] Updated weights for policy 1, policy_version 7960 (0.0008) [2023-10-10 04:57:21,107][53252] Updated weights for policy 0, policy_version 7970 (0.0008) [2023-10-10 04:57:21,465][53252] Updated weights for policy 0, policy_version 7980 (0.0007) [2023-10-10 04:57:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 16318464. Throughput: 0: 1680.0, 1: 1659.7. Samples: 4090656. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 04:57:21,784][52050] Avg episode reward: [(0, '6.280'), (1, '10.310')] [2023-10-10 04:57:21,841][53252] Updated weights for policy 0, policy_version 7990 (0.0010) [2023-10-10 04:57:22,208][53252] Updated weights for policy 0, policy_version 8000 (0.0008) [2023-10-10 04:57:22,997][53268] Updated weights for policy 1, policy_version 7970 (0.0008) [2023-10-10 04:57:23,361][53268] Updated weights for policy 1, policy_version 7980 (0.0007) [2023-10-10 04:57:23,723][53268] Updated weights for policy 1, policy_version 7990 (0.0009) [2023-10-10 04:57:24,099][53268] Updated weights for policy 1, policy_version 8000 (0.0007) [2023-10-10 04:57:26,286][53252] Updated weights for policy 0, policy_version 8010 (0.0009) [2023-10-10 04:57:26,658][53252] Updated weights for policy 0, policy_version 8020 (0.0008) [2023-10-10 04:57:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16384000. Throughput: 0: 1671.5, 1: 1670.8. Samples: 4111136. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 33.0) [2023-10-10 04:57:26,784][52050] Avg episode reward: [(0, '6.630'), (1, '10.970')] [2023-10-10 04:57:27,034][53252] Updated weights for policy 0, policy_version 8030 (0.0009) [2023-10-10 04:57:27,990][53268] Updated weights for policy 1, policy_version 8010 (0.0010) [2023-10-10 04:57:28,351][53268] Updated weights for policy 1, policy_version 8020 (0.0011) [2023-10-10 04:57:28,725][53268] Updated weights for policy 1, policy_version 8030 (0.0008) [2023-10-10 04:57:30,977][53252] Updated weights for policy 0, policy_version 8040 (0.0008) [2023-10-10 04:57:31,350][53252] Updated weights for policy 0, policy_version 8050 (0.0007) [2023-10-10 04:57:31,725][53252] Updated weights for policy 0, policy_version 8060 (0.0008) [2023-10-10 04:57:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 16449536. Throughput: 0: 1686.8, 1: 1651.4. Samples: 4120876. Policy #0 lag: (min: 1.0, avg: 13.0, max: 33.0) [2023-10-10 04:57:31,784][52050] Avg episode reward: [(0, '8.190'), (1, '9.890')] [2023-10-10 04:57:31,872][52846] Saving new best policy, reward=8.190! [2023-10-10 04:57:32,847][53268] Updated weights for policy 1, policy_version 8040 (0.0009) [2023-10-10 04:57:33,218][53268] Updated weights for policy 1, policy_version 8050 (0.0011) [2023-10-10 04:57:33,585][53268] Updated weights for policy 1, policy_version 8060 (0.0012) [2023-10-10 04:57:35,689][53252] Updated weights for policy 0, policy_version 8070 (0.0010) [2023-10-10 04:57:36,066][53252] Updated weights for policy 0, policy_version 8080 (0.0008) [2023-10-10 04:57:36,430][53252] Updated weights for policy 0, policy_version 8090 (0.0009) [2023-10-10 04:57:36,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16547840. Throughput: 0: 1691.3, 1: 1681.4. Samples: 4141944. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 04:57:36,784][52050] Avg episode reward: [(0, '8.600'), (1, '9.990')] [2023-10-10 04:57:36,786][52846] Saving new best policy, reward=8.600! [2023-10-10 04:57:37,787][53268] Updated weights for policy 1, policy_version 8070 (0.0009) [2023-10-10 04:57:38,150][53268] Updated weights for policy 1, policy_version 8080 (0.0009) [2023-10-10 04:57:38,515][53268] Updated weights for policy 1, policy_version 8090 (0.0008) [2023-10-10 04:57:40,443][53252] Updated weights for policy 0, policy_version 8100 (0.0007) [2023-10-10 04:57:40,814][53252] Updated weights for policy 0, policy_version 8110 (0.0010) [2023-10-10 04:57:41,194][53252] Updated weights for policy 0, policy_version 8120 (0.0009) [2023-10-10 04:57:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16613376. Throughput: 0: 1668.4, 1: 1682.4. Samples: 4161856. Policy #0 lag: (min: 2.0, avg: 14.2, max: 34.0) [2023-10-10 04:57:41,784][52050] Avg episode reward: [(0, '8.470'), (1, '10.010')] [2023-10-10 04:57:42,741][53268] Updated weights for policy 1, policy_version 8100 (0.0007) [2023-10-10 04:57:43,110][53268] Updated weights for policy 1, policy_version 8110 (0.0007) [2023-10-10 04:57:43,476][53268] Updated weights for policy 1, policy_version 8120 (0.0010) [2023-10-10 04:57:45,428][53252] Updated weights for policy 0, policy_version 8130 (0.0008) [2023-10-10 04:57:45,799][53252] Updated weights for policy 0, policy_version 8140 (0.0010) [2023-10-10 04:57:46,178][53252] Updated weights for policy 0, policy_version 8150 (0.0007) [2023-10-10 04:57:46,546][53252] Updated weights for policy 0, policy_version 8160 (0.0009) [2023-10-10 04:57:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16678912. Throughput: 0: 1691.5, 1: 1666.9. Samples: 4171958. 
Policy #0 lag: (min: 2.0, avg: 14.2, max: 34.0) [2023-10-10 04:57:46,784][52050] Avg episode reward: [(0, '7.910'), (1, '9.830')] [2023-10-10 04:57:47,599][53268] Updated weights for policy 1, policy_version 8130 (0.0010) [2023-10-10 04:57:47,966][53268] Updated weights for policy 1, policy_version 8140 (0.0010) [2023-10-10 04:57:48,339][53268] Updated weights for policy 1, policy_version 8150 (0.0011) [2023-10-10 04:57:48,704][53268] Updated weights for policy 1, policy_version 8160 (0.0008) [2023-10-10 04:57:50,788][53252] Updated weights for policy 0, policy_version 8170 (0.0007) [2023-10-10 04:57:51,156][53252] Updated weights for policy 0, policy_version 8180 (0.0008) [2023-10-10 04:57:51,523][53252] Updated weights for policy 0, policy_version 8190 (0.0010) [2023-10-10 04:57:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16744448. Throughput: 0: 1693.6, 1: 1679.9. Samples: 4192518. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 04:57:51,784][52050] Avg episode reward: [(0, '7.910'), (1, '11.720')] [2023-10-10 04:57:51,785][53061] Saving new best policy, reward=11.720! [2023-10-10 04:57:52,762][53268] Updated weights for policy 1, policy_version 8170 (0.0007) [2023-10-10 04:57:53,131][53268] Updated weights for policy 1, policy_version 8180 (0.0007) [2023-10-10 04:57:53,507][53268] Updated weights for policy 1, policy_version 8190 (0.0008) [2023-10-10 04:57:55,451][53252] Updated weights for policy 0, policy_version 8200 (0.0009) [2023-10-10 04:57:55,832][53252] Updated weights for policy 0, policy_version 8210 (0.0008) [2023-10-10 04:57:56,191][53252] Updated weights for policy 0, policy_version 8220 (0.0010) [2023-10-10 04:57:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16809984. Throughput: 0: 1672.9, 1: 1683.0. Samples: 4212060. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 04:57:56,784][52050] Avg episode reward: [(0, '7.800'), (1, '10.450')] [2023-10-10 04:57:57,600][53268] Updated weights for policy 1, policy_version 8200 (0.0008) [2023-10-10 04:57:57,971][53268] Updated weights for policy 1, policy_version 8210 (0.0008) [2023-10-10 04:57:58,342][53268] Updated weights for policy 1, policy_version 8220 (0.0007) [2023-10-10 04:57:59,857][53252] Updated weights for policy 0, policy_version 8230 (0.0008) [2023-10-10 04:58:00,231][53252] Updated weights for policy 0, policy_version 8240 (0.0007) [2023-10-10 04:58:00,605][53252] Updated weights for policy 0, policy_version 8250 (0.0007) [2023-10-10 04:58:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16875520. Throughput: 0: 1704.1, 1: 1680.8. Samples: 4222852. Policy #0 lag: (min: 30.0, avg: 37.7, max: 62.0) [2023-10-10 04:58:01,784][52050] Avg episode reward: [(0, '7.670'), (1, '9.670')] [2023-10-10 04:58:02,389][53268] Updated weights for policy 1, policy_version 8230 (0.0007) [2023-10-10 04:58:02,763][53268] Updated weights for policy 1, policy_version 8240 (0.0007) [2023-10-10 04:58:03,136][53268] Updated weights for policy 1, policy_version 8250 (0.0007) [2023-10-10 04:58:04,578][53252] Updated weights for policy 0, policy_version 8260 (0.0008) [2023-10-10 04:58:04,961][53252] Updated weights for policy 0, policy_version 8270 (0.0008) [2023-10-10 04:58:05,331][53252] Updated weights for policy 0, policy_version 8280 (0.0007) [2023-10-10 04:58:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16941056. Throughput: 0: 1692.1, 1: 1690.6. Samples: 4242880. 
Policy #0 lag: (min: 30.0, avg: 37.7, max: 62.0) [2023-10-10 04:58:06,784][52050] Avg episode reward: [(0, '8.050'), (1, '10.710')] [2023-10-10 04:58:07,178][53268] Updated weights for policy 1, policy_version 8260 (0.0008) [2023-10-10 04:58:07,555][53268] Updated weights for policy 1, policy_version 8270 (0.0008) [2023-10-10 04:58:07,914][53268] Updated weights for policy 1, policy_version 8280 (0.0011) [2023-10-10 04:58:09,344][53252] Updated weights for policy 0, policy_version 8290 (0.0009) [2023-10-10 04:58:09,718][53252] Updated weights for policy 0, policy_version 8300 (0.0010) [2023-10-10 04:58:10,082][53252] Updated weights for policy 0, policy_version 8310 (0.0009) [2023-10-10 04:58:10,459][53252] Updated weights for policy 0, policy_version 8320 (0.0010) [2023-10-10 04:58:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17006592. Throughput: 0: 1698.0, 1: 1685.1. Samples: 4263374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:58:11,784][52050] Avg episode reward: [(0, '7.600'), (1, '10.550')] [2023-10-10 04:58:11,898][53268] Updated weights for policy 1, policy_version 8290 (0.0009) [2023-10-10 04:58:12,266][53268] Updated weights for policy 1, policy_version 8300 (0.0011) [2023-10-10 04:58:12,630][53268] Updated weights for policy 1, policy_version 8310 (0.0010) [2023-10-10 04:58:13,000][53268] Updated weights for policy 1, policy_version 8320 (0.0008) [2023-10-10 04:58:14,766][53252] Updated weights for policy 0, policy_version 8330 (0.0010) [2023-10-10 04:58:15,144][53252] Updated weights for policy 0, policy_version 8340 (0.0009) [2023-10-10 04:58:15,517][53252] Updated weights for policy 0, policy_version 8350 (0.0008) [2023-10-10 04:58:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17072128. Throughput: 0: 1709.1, 1: 1684.7. Samples: 4273594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:58:16,784][52050] Avg episode reward: [(0, '8.160'), (1, '10.260')] [2023-10-10 04:58:17,026][53268] Updated weights for policy 1, policy_version 8330 (0.0007) [2023-10-10 04:58:17,398][53268] Updated weights for policy 1, policy_version 8340 (0.0009) [2023-10-10 04:58:17,770][53268] Updated weights for policy 1, policy_version 8350 (0.0009) [2023-10-10 04:58:19,342][53252] Updated weights for policy 0, policy_version 8360 (0.0008) [2023-10-10 04:58:19,724][53252] Updated weights for policy 0, policy_version 8370 (0.0009) [2023-10-10 04:58:20,092][53252] Updated weights for policy 0, policy_version 8380 (0.0009) [2023-10-10 04:58:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17137664. Throughput: 0: 1672.3, 1: 1684.7. Samples: 4293010. Policy #0 lag: (min: 31.0, avg: 33.2, max: 59.0) [2023-10-10 04:58:21,784][52050] Avg episode reward: [(0, '8.330'), (1, '10.020')] [2023-10-10 04:58:21,904][53268] Updated weights for policy 1, policy_version 8360 (0.0008) [2023-10-10 04:58:22,281][53268] Updated weights for policy 1, policy_version 8370 (0.0010) [2023-10-10 04:58:22,647][53268] Updated weights for policy 1, policy_version 8380 (0.0010) [2023-10-10 04:58:24,300][53252] Updated weights for policy 0, policy_version 8390 (0.0009) [2023-10-10 04:58:24,679][53252] Updated weights for policy 0, policy_version 8400 (0.0009) [2023-10-10 04:58:25,059][53252] Updated weights for policy 0, policy_version 8410 (0.0007) [2023-10-10 04:58:26,440][53268] Updated weights for policy 1, policy_version 8390 (0.0009) [2023-10-10 04:58:26,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17203200. Throughput: 0: 1690.1, 1: 1685.3. Samples: 4313752. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 59.0) [2023-10-10 04:58:26,785][52050] Avg episode reward: [(0, '8.900'), (1, '10.830')] [2023-10-10 04:58:26,795][52846] Saving new best policy, reward=8.900! [2023-10-10 04:58:26,806][53268] Updated weights for policy 1, policy_version 8400 (0.0010) [2023-10-10 04:58:27,173][53268] Updated weights for policy 1, policy_version 8410 (0.0009) [2023-10-10 04:58:29,036][53252] Updated weights for policy 0, policy_version 8420 (0.0009) [2023-10-10 04:58:29,415][53252] Updated weights for policy 0, policy_version 8430 (0.0011) [2023-10-10 04:58:29,794][53252] Updated weights for policy 0, policy_version 8440 (0.0008) [2023-10-10 04:58:31,156][53268] Updated weights for policy 1, policy_version 8420 (0.0008) [2023-10-10 04:58:31,526][53268] Updated weights for policy 1, policy_version 8430 (0.0008) [2023-10-10 04:58:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17268736. Throughput: 0: 1685.2, 1: 1687.5. Samples: 4323726. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 04:58:31,784][52050] Avg episode reward: [(0, '8.540'), (1, '10.250')] [2023-10-10 04:58:31,895][53268] Updated weights for policy 1, policy_version 8440 (0.0009) [2023-10-10 04:58:33,765][53252] Updated weights for policy 0, policy_version 8450 (0.0009) [2023-10-10 04:58:34,152][53252] Updated weights for policy 0, policy_version 8460 (0.0008) [2023-10-10 04:58:34,515][53252] Updated weights for policy 0, policy_version 8470 (0.0010) [2023-10-10 04:58:34,891][53252] Updated weights for policy 0, policy_version 8480 (0.0007) [2023-10-10 04:58:35,927][53268] Updated weights for policy 1, policy_version 8450 (0.0008) [2023-10-10 04:58:36,291][53268] Updated weights for policy 1, policy_version 8460 (0.0007) [2023-10-10 04:58:36,660][53268] Updated weights for policy 1, policy_version 8470 (0.0007) [2023-10-10 04:58:36,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13440.4). 
Total num frames: 17334272. Throughput: 0: 1672.0, 1: 1693.4. Samples: 4343962. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 04:58:36,784][52050] Avg episode reward: [(0, '8.320'), (1, '12.690')] [2023-10-10 04:58:37,023][53061] Saving new best policy, reward=12.690! [2023-10-10 04:58:37,024][53268] Updated weights for policy 1, policy_version 8480 (0.0007) [2023-10-10 04:58:38,899][53252] Updated weights for policy 0, policy_version 8490 (0.0008) [2023-10-10 04:58:39,273][53252] Updated weights for policy 0, policy_version 8500 (0.0009) [2023-10-10 04:58:39,655][53252] Updated weights for policy 0, policy_version 8510 (0.0008) [2023-10-10 04:58:41,367][53268] Updated weights for policy 1, policy_version 8490 (0.0008) [2023-10-10 04:58:41,739][53268] Updated weights for policy 1, policy_version 8500 (0.0008) [2023-10-10 04:58:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17399808. Throughput: 0: 1697.3, 1: 1689.7. Samples: 4364476. Policy #0 lag: (min: 26.0, avg: 26.6, max: 40.0) [2023-10-10 04:58:41,784][52050] Avg episode reward: [(0, '8.780'), (1, '13.150')] [2023-10-10 04:58:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000008512_8716288.pth... [2023-10-10 04:58:41,828][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000006944_7110656.pth [2023-10-10 04:58:42,114][53268] Updated weights for policy 1, policy_version 8510 (0.0008) [2023-10-10 04:58:42,186][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000008512_8716288.pth... [2023-10-10 04:58:42,226][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000006944_7110656.pth [2023-10-10 04:58:42,231][53061] Saving new best policy, reward=13.150! 
[2023-10-10 04:58:43,718][53252] Updated weights for policy 0, policy_version 8520 (0.0009) [2023-10-10 04:58:44,083][53252] Updated weights for policy 0, policy_version 8530 (0.0009) [2023-10-10 04:58:44,461][53252] Updated weights for policy 0, policy_version 8540 (0.0011) [2023-10-10 04:58:46,293][53268] Updated weights for policy 1, policy_version 8520 (0.0009) [2023-10-10 04:58:46,655][53268] Updated weights for policy 1, policy_version 8530 (0.0008) [2023-10-10 04:58:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17465344. Throughput: 0: 1671.8, 1: 1690.1. Samples: 4374140. Policy #0 lag: (min: 26.0, avg: 26.6, max: 40.0) [2023-10-10 04:58:46,784][52050] Avg episode reward: [(0, '9.390'), (1, '13.340')] [2023-10-10 04:58:46,785][52846] Saving new best policy, reward=9.390! [2023-10-10 04:58:47,018][53268] Updated weights for policy 1, policy_version 8540 (0.0010) [2023-10-10 04:58:47,170][53061] Saving new best policy, reward=13.340! [2023-10-10 04:58:48,559][53252] Updated weights for policy 0, policy_version 8550 (0.0008) [2023-10-10 04:58:48,930][53252] Updated weights for policy 0, policy_version 8560 (0.0010) [2023-10-10 04:58:49,300][53252] Updated weights for policy 0, policy_version 8570 (0.0009) [2023-10-10 04:58:51,132][53268] Updated weights for policy 1, policy_version 8550 (0.0011) [2023-10-10 04:58:51,505][53268] Updated weights for policy 1, policy_version 8560 (0.0007) [2023-10-10 04:58:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17530880. Throughput: 0: 1677.1, 1: 1686.0. Samples: 4394218. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:58:51,784][52050] Avg episode reward: [(0, '8.430'), (1, '12.730')] [2023-10-10 04:58:51,870][53268] Updated weights for policy 1, policy_version 8570 (0.0009) [2023-10-10 04:58:53,472][53252] Updated weights for policy 0, policy_version 8580 (0.0010) [2023-10-10 04:58:53,838][53252] Updated weights for policy 0, policy_version 8590 (0.0009) [2023-10-10 04:58:54,217][53252] Updated weights for policy 0, policy_version 8600 (0.0009) [2023-10-10 04:58:55,976][53268] Updated weights for policy 1, policy_version 8580 (0.0008) [2023-10-10 04:58:56,343][53268] Updated weights for policy 1, policy_version 8590 (0.0007) [2023-10-10 04:58:56,711][53268] Updated weights for policy 1, policy_version 8600 (0.0007) [2023-10-10 04:58:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17596416. Throughput: 0: 1677.4, 1: 1673.8. Samples: 4414176. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:58:56,785][52050] Avg episode reward: [(0, '8.600'), (1, '12.810')] [2023-10-10 04:58:58,351][53252] Updated weights for policy 0, policy_version 8610 (0.0007) [2023-10-10 04:58:58,722][53252] Updated weights for policy 0, policy_version 8620 (0.0008) [2023-10-10 04:58:59,105][53252] Updated weights for policy 0, policy_version 8630 (0.0009) [2023-10-10 04:58:59,475][53252] Updated weights for policy 0, policy_version 8640 (0.0007) [2023-10-10 04:59:00,793][53268] Updated weights for policy 1, policy_version 8610 (0.0008) [2023-10-10 04:59:01,166][53268] Updated weights for policy 1, policy_version 8620 (0.0008) [2023-10-10 04:59:01,530][53268] Updated weights for policy 1, policy_version 8630 (0.0009) [2023-10-10 04:59:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 17661952. Throughput: 0: 1655.4, 1: 1680.7. Samples: 4423720. 
Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-10 04:59:01,784][52050] Avg episode reward: [(0, '7.420'), (1, '12.760')] [2023-10-10 04:59:01,905][53268] Updated weights for policy 1, policy_version 8640 (0.0011) [2023-10-10 04:59:03,670][53252] Updated weights for policy 0, policy_version 8650 (0.0007) [2023-10-10 04:59:04,035][53252] Updated weights for policy 0, policy_version 8660 (0.0008) [2023-10-10 04:59:04,415][53252] Updated weights for policy 0, policy_version 8670 (0.0009) [2023-10-10 04:59:05,997][53268] Updated weights for policy 1, policy_version 8650 (0.0007) [2023-10-10 04:59:06,362][53268] Updated weights for policy 1, policy_version 8660 (0.0009) [2023-10-10 04:59:06,722][53268] Updated weights for policy 1, policy_version 8670 (0.0011) [2023-10-10 04:59:06,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 17727488. Throughput: 0: 1682.2, 1: 1675.2. Samples: 4444094. Policy #0 lag: (min: 31.0, avg: 42.1, max: 63.0) [2023-10-10 04:59:06,784][52050] Avg episode reward: [(0, '8.020'), (1, '11.770')] [2023-10-10 04:59:08,492][53252] Updated weights for policy 0, policy_version 8680 (0.0009) [2023-10-10 04:59:08,859][53252] Updated weights for policy 0, policy_version 8690 (0.0007) [2023-10-10 04:59:09,238][53252] Updated weights for policy 0, policy_version 8700 (0.0007) [2023-10-10 04:59:10,872][53268] Updated weights for policy 1, policy_version 8680 (0.0009) [2023-10-10 04:59:11,243][53268] Updated weights for policy 1, policy_version 8690 (0.0010) [2023-10-10 04:59:11,609][53268] Updated weights for policy 1, policy_version 8700 (0.0007) [2023-10-10 04:59:11,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17825792. Throughput: 0: 1689.3, 1: 1651.9. Samples: 4464104. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 04:59:11,784][52050] Avg episode reward: [(0, '7.810'), (1, '11.910')] [2023-10-10 04:59:13,136][53252] Updated weights for policy 0, policy_version 8710 (0.0009) [2023-10-10 04:59:13,512][53252] Updated weights for policy 0, policy_version 8720 (0.0009) [2023-10-10 04:59:13,887][53252] Updated weights for policy 0, policy_version 8730 (0.0010) [2023-10-10 04:59:15,712][53268] Updated weights for policy 1, policy_version 8710 (0.0009) [2023-10-10 04:59:16,075][53268] Updated weights for policy 1, policy_version 8720 (0.0008) [2023-10-10 04:59:16,449][53268] Updated weights for policy 1, policy_version 8730 (0.0008) [2023-10-10 04:59:16,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17891328. Throughput: 0: 1672.1, 1: 1666.4. Samples: 4473958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 04:59:16,784][52050] Avg episode reward: [(0, '8.100'), (1, '10.720')] [2023-10-10 04:59:18,047][53252] Updated weights for policy 0, policy_version 8740 (0.0009) [2023-10-10 04:59:18,413][53252] Updated weights for policy 0, policy_version 8750 (0.0010) [2023-10-10 04:59:18,788][53252] Updated weights for policy 0, policy_version 8760 (0.0009) [2023-10-10 04:59:20,538][53268] Updated weights for policy 1, policy_version 8740 (0.0008) [2023-10-10 04:59:20,902][53268] Updated weights for policy 1, policy_version 8750 (0.0009) [2023-10-10 04:59:21,269][53268] Updated weights for policy 1, policy_version 8760 (0.0009) [2023-10-10 04:59:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17956864. Throughput: 0: 1684.8, 1: 1663.5. Samples: 4494634. 
Policy #0 lag: (min: 9.0, avg: 27.7, max: 41.0) [2023-10-10 04:59:21,784][52050] Avg episode reward: [(0, '8.180'), (1, '11.770')] [2023-10-10 04:59:22,818][53252] Updated weights for policy 0, policy_version 8770 (0.0010) [2023-10-10 04:59:23,195][53252] Updated weights for policy 0, policy_version 8780 (0.0009) [2023-10-10 04:59:23,566][53252] Updated weights for policy 0, policy_version 8790 (0.0008) [2023-10-10 04:59:23,927][53252] Updated weights for policy 0, policy_version 8800 (0.0009) [2023-10-10 04:59:25,417][53268] Updated weights for policy 1, policy_version 8770 (0.0010) [2023-10-10 04:59:25,820][53268] Updated weights for policy 1, policy_version 8780 (0.0008) [2023-10-10 04:59:26,197][53268] Updated weights for policy 1, policy_version 8790 (0.0009) [2023-10-10 04:59:26,557][53268] Updated weights for policy 1, policy_version 8800 (0.0007) [2023-10-10 04:59:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 18022400. Throughput: 0: 1684.4, 1: 1647.2. Samples: 4514396. Policy #0 lag: (min: 9.0, avg: 27.7, max: 41.0) [2023-10-10 04:59:26,784][52050] Avg episode reward: [(0, '7.440'), (1, '11.420')] [2023-10-10 04:59:28,063][53252] Updated weights for policy 0, policy_version 8810 (0.0010) [2023-10-10 04:59:28,435][53252] Updated weights for policy 0, policy_version 8820 (0.0009) [2023-10-10 04:59:28,821][53252] Updated weights for policy 0, policy_version 8830 (0.0011) [2023-10-10 04:59:30,370][53268] Updated weights for policy 1, policy_version 8810 (0.0009) [2023-10-10 04:59:30,741][53268] Updated weights for policy 1, policy_version 8820 (0.0009) [2023-10-10 04:59:31,119][53268] Updated weights for policy 1, policy_version 8830 (0.0009) [2023-10-10 04:59:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18087936. Throughput: 0: 1674.4, 1: 1664.8. Samples: 4524402. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-10 04:59:31,784][52050] Avg episode reward: [(0, '7.470'), (1, '12.560')] [2023-10-10 04:59:32,847][53252] Updated weights for policy 0, policy_version 8840 (0.0009) [2023-10-10 04:59:33,219][53252] Updated weights for policy 0, policy_version 8850 (0.0008) [2023-10-10 04:59:33,606][53252] Updated weights for policy 0, policy_version 8860 (0.0009) [2023-10-10 04:59:35,185][53268] Updated weights for policy 1, policy_version 8840 (0.0008) [2023-10-10 04:59:35,551][53268] Updated weights for policy 1, policy_version 8850 (0.0010) [2023-10-10 04:59:35,923][53268] Updated weights for policy 1, policy_version 8860 (0.0011) [2023-10-10 04:59:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18153472. Throughput: 0: 1684.6, 1: 1663.5. Samples: 4544884. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-10 04:59:36,785][52050] Avg episode reward: [(0, '7.780'), (1, '12.040')] [2023-10-10 04:59:37,690][53252] Updated weights for policy 0, policy_version 8870 (0.0009) [2023-10-10 04:59:38,072][53252] Updated weights for policy 0, policy_version 8880 (0.0009) [2023-10-10 04:59:38,437][53252] Updated weights for policy 0, policy_version 8890 (0.0008) [2023-10-10 04:59:39,912][53268] Updated weights for policy 1, policy_version 8870 (0.0008) [2023-10-10 04:59:40,282][53268] Updated weights for policy 1, policy_version 8880 (0.0010) [2023-10-10 04:59:40,653][53268] Updated weights for policy 1, policy_version 8890 (0.0009) [2023-10-10 04:59:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18219008. Throughput: 0: 1686.1, 1: 1652.7. Samples: 4564424. 
Policy #0 lag: (min: 17.0, avg: 28.1, max: 49.0) [2023-10-10 04:59:41,784][52050] Avg episode reward: [(0, '8.730'), (1, '12.680')] [2023-10-10 04:59:42,393][53252] Updated weights for policy 0, policy_version 8900 (0.0008) [2023-10-10 04:59:42,762][53252] Updated weights for policy 0, policy_version 8910 (0.0009) [2023-10-10 04:59:43,137][53252] Updated weights for policy 0, policy_version 8920 (0.0009) [2023-10-10 04:59:44,661][53268] Updated weights for policy 1, policy_version 8900 (0.0008) [2023-10-10 04:59:45,028][53268] Updated weights for policy 1, policy_version 8910 (0.0008) [2023-10-10 04:59:45,390][53268] Updated weights for policy 1, policy_version 8920 (0.0009) [2023-10-10 04:59:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18284544. Throughput: 0: 1681.1, 1: 1677.6. Samples: 4574864. Policy #0 lag: (min: 17.0, avg: 28.1, max: 49.0) [2023-10-10 04:59:46,785][52050] Avg episode reward: [(0, '9.080'), (1, '11.500')] [2023-10-10 04:59:47,266][53252] Updated weights for policy 0, policy_version 8930 (0.0007) [2023-10-10 04:59:47,640][53252] Updated weights for policy 0, policy_version 8940 (0.0011) [2023-10-10 04:59:48,011][53252] Updated weights for policy 0, policy_version 8950 (0.0009) [2023-10-10 04:59:48,376][53252] Updated weights for policy 0, policy_version 8960 (0.0008) [2023-10-10 04:59:49,522][53268] Updated weights for policy 1, policy_version 8930 (0.0009) [2023-10-10 04:59:49,891][53268] Updated weights for policy 1, policy_version 8940 (0.0007) [2023-10-10 04:59:50,248][53268] Updated weights for policy 1, policy_version 8950 (0.0010) [2023-10-10 04:59:50,618][53268] Updated weights for policy 1, policy_version 8960 (0.0008) [2023-10-10 04:59:51,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 18350080. Throughput: 0: 1686.9, 1: 1666.0. Samples: 4594974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:59:51,784][52050] Avg episode reward: [(0, '9.250'), (1, '10.820')] [2023-10-10 04:59:52,377][53252] Updated weights for policy 0, policy_version 8970 (0.0007) [2023-10-10 04:59:52,756][53252] Updated weights for policy 0, policy_version 8980 (0.0008) [2023-10-10 04:59:53,127][53252] Updated weights for policy 0, policy_version 8990 (0.0008) [2023-10-10 04:59:54,898][53268] Updated weights for policy 1, policy_version 8970 (0.0009) [2023-10-10 04:59:55,264][53268] Updated weights for policy 1, policy_version 8980 (0.0007) [2023-10-10 04:59:55,636][53268] Updated weights for policy 1, policy_version 8990 (0.0008) [2023-10-10 04:59:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18415616. Throughput: 0: 1676.8, 1: 1673.8. Samples: 4614882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 04:59:56,784][52050] Avg episode reward: [(0, '8.460'), (1, '10.740')] [2023-10-10 04:59:57,351][53252] Updated weights for policy 0, policy_version 9000 (0.0007) [2023-10-10 04:59:57,734][53252] Updated weights for policy 0, policy_version 9010 (0.0008) [2023-10-10 04:59:58,102][53252] Updated weights for policy 0, policy_version 9020 (0.0011) [2023-10-10 04:59:59,497][53268] Updated weights for policy 1, policy_version 9000 (0.0009) [2023-10-10 04:59:59,860][53268] Updated weights for policy 1, policy_version 9010 (0.0007) [2023-10-10 05:00:00,221][53268] Updated weights for policy 1, policy_version 9020 (0.0008) [2023-10-10 05:00:01,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 18481152. Throughput: 0: 1672.8, 1: 1691.3. Samples: 4625340. 
Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-10 05:00:01,784][52050] Avg episode reward: [(0, '8.230'), (1, '11.310')] [2023-10-10 05:00:02,130][53252] Updated weights for policy 0, policy_version 9030 (0.0009) [2023-10-10 05:00:02,512][53252] Updated weights for policy 0, policy_version 9040 (0.0009) [2023-10-10 05:00:02,887][53252] Updated weights for policy 0, policy_version 9050 (0.0009) [2023-10-10 05:00:04,430][53268] Updated weights for policy 1, policy_version 9030 (0.0008) [2023-10-10 05:00:04,802][53268] Updated weights for policy 1, policy_version 9040 (0.0009) [2023-10-10 05:00:05,163][53268] Updated weights for policy 1, policy_version 9050 (0.0008) [2023-10-10 05:00:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.2, 300 sec: 13329.3). Total num frames: 18546688. Throughput: 0: 1674.3, 1: 1668.7. Samples: 4645070. Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-10 05:00:06,785][52050] Avg episode reward: [(0, '8.750'), (1, '12.220')] [2023-10-10 05:00:06,944][53252] Updated weights for policy 0, policy_version 9060 (0.0009) [2023-10-10 05:00:07,314][53252] Updated weights for policy 0, policy_version 9070 (0.0007) [2023-10-10 05:00:07,682][53252] Updated weights for policy 0, policy_version 9080 (0.0009) [2023-10-10 05:00:09,312][53268] Updated weights for policy 1, policy_version 9060 (0.0009) [2023-10-10 05:00:09,677][53268] Updated weights for policy 1, policy_version 9070 (0.0007) [2023-10-10 05:00:10,049][53268] Updated weights for policy 1, policy_version 9080 (0.0007) [2023-10-10 05:00:11,580][53252] Updated weights for policy 0, policy_version 9090 (0.0008) [2023-10-10 05:00:11,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18612224. Throughput: 0: 1675.4, 1: 1679.6. Samples: 4665372. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 05:00:11,785][52050] Avg episode reward: [(0, '9.040'), (1, '12.830')] [2023-10-10 05:00:11,957][53252] Updated weights for policy 0, policy_version 9100 (0.0008) [2023-10-10 05:00:12,337][53252] Updated weights for policy 0, policy_version 9110 (0.0009) [2023-10-10 05:00:12,715][53252] Updated weights for policy 0, policy_version 9120 (0.0010) [2023-10-10 05:00:14,172][53268] Updated weights for policy 1, policy_version 9090 (0.0009) [2023-10-10 05:00:14,554][53268] Updated weights for policy 1, policy_version 9100 (0.0009) [2023-10-10 05:00:14,916][53268] Updated weights for policy 1, policy_version 9110 (0.0010) [2023-10-10 05:00:15,282][53268] Updated weights for policy 1, policy_version 9120 (0.0008) [2023-10-10 05:00:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18677760. Throughput: 0: 1674.3, 1: 1685.5. Samples: 4675596. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 05:00:16,784][52050] Avg episode reward: [(0, '8.360'), (1, '11.940')] [2023-10-10 05:00:16,831][53252] Updated weights for policy 0, policy_version 9130 (0.0012) [2023-10-10 05:00:17,200][53252] Updated weights for policy 0, policy_version 9140 (0.0010) [2023-10-10 05:00:17,563][53252] Updated weights for policy 0, policy_version 9150 (0.0010) [2023-10-10 05:00:19,286][53268] Updated weights for policy 1, policy_version 9130 (0.0010) [2023-10-10 05:00:19,655][53268] Updated weights for policy 1, policy_version 9140 (0.0010) [2023-10-10 05:00:20,018][53268] Updated weights for policy 1, policy_version 9150 (0.0010) [2023-10-10 05:00:21,694][53252] Updated weights for policy 0, policy_version 9160 (0.0010) [2023-10-10 05:00:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18743296. Throughput: 0: 1670.7, 1: 1665.8. Samples: 4695024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:00:21,784][52050] Avg episode reward: [(0, '8.980'), (1, '11.280')] [2023-10-10 05:00:22,071][53252] Updated weights for policy 0, policy_version 9170 (0.0009) [2023-10-10 05:00:22,448][53252] Updated weights for policy 0, policy_version 9180 (0.0008) [2023-10-10 05:00:24,176][53268] Updated weights for policy 1, policy_version 9160 (0.0009) [2023-10-10 05:00:24,550][53268] Updated weights for policy 1, policy_version 9170 (0.0012) [2023-10-10 05:00:24,924][53268] Updated weights for policy 1, policy_version 9180 (0.0009) [2023-10-10 05:00:26,478][53252] Updated weights for policy 0, policy_version 9190 (0.0009) [2023-10-10 05:00:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18808832. Throughput: 0: 1672.9, 1: 1684.1. Samples: 4715490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:00:26,784][52050] Avg episode reward: [(0, '10.110'), (1, '10.990')] [2023-10-10 05:00:26,858][53252] Updated weights for policy 0, policy_version 9200 (0.0009) [2023-10-10 05:00:27,228][53252] Updated weights for policy 0, policy_version 9210 (0.0007) [2023-10-10 05:00:27,456][52846] Saving new best policy, reward=10.110! [2023-10-10 05:00:28,992][53268] Updated weights for policy 1, policy_version 9190 (0.0009) [2023-10-10 05:00:29,364][53268] Updated weights for policy 1, policy_version 9200 (0.0008) [2023-10-10 05:00:29,728][53268] Updated weights for policy 1, policy_version 9210 (0.0007) [2023-10-10 05:00:31,306][53252] Updated weights for policy 0, policy_version 9220 (0.0008) [2023-10-10 05:00:31,681][53252] Updated weights for policy 0, policy_version 9230 (0.0009) [2023-10-10 05:00:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 18874368. Throughput: 0: 1675.7, 1: 1676.1. Samples: 4725690. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 42.0) [2023-10-10 05:00:31,784][52050] Avg episode reward: [(0, '10.130'), (1, '11.200')] [2023-10-10 05:00:32,051][53252] Updated weights for policy 0, policy_version 9240 (0.0010) [2023-10-10 05:00:32,348][52846] Saving new best policy, reward=10.130! [2023-10-10 05:00:33,694][53268] Updated weights for policy 1, policy_version 9220 (0.0009) [2023-10-10 05:00:34,054][53268] Updated weights for policy 1, policy_version 9230 (0.0009) [2023-10-10 05:00:34,431][53268] Updated weights for policy 1, policy_version 9240 (0.0009) [2023-10-10 05:00:36,189][53252] Updated weights for policy 0, policy_version 9250 (0.0009) [2023-10-10 05:00:36,571][53252] Updated weights for policy 0, policy_version 9260 (0.0007) [2023-10-10 05:00:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 18939904. Throughput: 0: 1669.1, 1: 1677.9. Samples: 4745590. Policy #0 lag: (min: 31.0, avg: 31.3, max: 42.0) [2023-10-10 05:00:36,784][52050] Avg episode reward: [(0, '10.410'), (1, '11.200')] [2023-10-10 05:00:36,937][53252] Updated weights for policy 0, policy_version 9270 (0.0008) [2023-10-10 05:00:37,306][52846] Saving new best policy, reward=10.410! [2023-10-10 05:00:37,307][53252] Updated weights for policy 0, policy_version 9280 (0.0010) [2023-10-10 05:00:38,470][53268] Updated weights for policy 1, policy_version 9250 (0.0009) [2023-10-10 05:00:38,838][53268] Updated weights for policy 1, policy_version 9260 (0.0008) [2023-10-10 05:00:39,205][53268] Updated weights for policy 1, policy_version 9270 (0.0008) [2023-10-10 05:00:39,574][53268] Updated weights for policy 1, policy_version 9280 (0.0011) [2023-10-10 05:00:41,616][53252] Updated weights for policy 0, policy_version 9290 (0.0008) [2023-10-10 05:00:41,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 19005440. Throughput: 0: 1667.9, 1: 1688.0. Samples: 4765898. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-10 05:00:41,785][52050] Avg episode reward: [(0, '9.500'), (1, '12.080')] [2023-10-10 05:00:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000009280_9502720.pth... [2023-10-10 05:00:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000007712_7897088.pth [2023-10-10 05:00:41,985][53252] Updated weights for policy 0, policy_version 9300 (0.0009) [2023-10-10 05:00:42,369][53252] Updated weights for policy 0, policy_version 9310 (0.0008) [2023-10-10 05:00:42,443][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000009312_9535488.pth... [2023-10-10 05:00:42,481][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000007712_7897088.pth [2023-10-10 05:00:43,702][53268] Updated weights for policy 1, policy_version 9290 (0.0008) [2023-10-10 05:00:44,070][53268] Updated weights for policy 1, policy_version 9300 (0.0011) [2023-10-10 05:00:44,432][53268] Updated weights for policy 1, policy_version 9310 (0.0010) [2023-10-10 05:00:46,410][53252] Updated weights for policy 0, policy_version 9320 (0.0007) [2023-10-10 05:00:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19070976. Throughput: 0: 1673.2, 1: 1664.6. Samples: 4775542. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-10 05:00:46,784][52050] Avg episode reward: [(0, '8.790'), (1, '11.650')] [2023-10-10 05:00:46,789][53252] Updated weights for policy 0, policy_version 9330 (0.0007) [2023-10-10 05:00:47,159][53252] Updated weights for policy 0, policy_version 9340 (0.0007) [2023-10-10 05:00:48,505][53268] Updated weights for policy 1, policy_version 9320 (0.0007) [2023-10-10 05:00:48,864][53268] Updated weights for policy 1, policy_version 9330 (0.0007) [2023-10-10 05:00:49,243][53268] Updated weights for policy 1, policy_version 9340 (0.0007) [2023-10-10 05:00:51,300][53252] Updated weights for policy 0, policy_version 9350 (0.0007) [2023-10-10 05:00:51,675][53252] Updated weights for policy 0, policy_version 9360 (0.0009) [2023-10-10 05:00:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 19136512. Throughput: 0: 1669.8, 1: 1678.7. Samples: 4795752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:00:51,784][52050] Avg episode reward: [(0, '8.600'), (1, '11.930')] [2023-10-10 05:00:52,030][53252] Updated weights for policy 0, policy_version 9370 (0.0011) [2023-10-10 05:00:53,389][53268] Updated weights for policy 1, policy_version 9350 (0.0007) [2023-10-10 05:00:53,759][53268] Updated weights for policy 1, policy_version 9360 (0.0011) [2023-10-10 05:00:54,140][53268] Updated weights for policy 1, policy_version 9370 (0.0010) [2023-10-10 05:00:56,164][53252] Updated weights for policy 0, policy_version 9380 (0.0009) [2023-10-10 05:00:56,540][53252] Updated weights for policy 0, policy_version 9390 (0.0007) [2023-10-10 05:00:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 19202048. Throughput: 0: 1658.5, 1: 1693.2. Samples: 4816194. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:00:56,784][52050] Avg episode reward: [(0, '9.620'), (1, '10.770')] [2023-10-10 05:00:56,913][53252] Updated weights for policy 0, policy_version 9400 (0.0007) [2023-10-10 05:00:58,109][53268] Updated weights for policy 1, policy_version 9380 (0.0009) [2023-10-10 05:00:58,475][53268] Updated weights for policy 1, policy_version 9390 (0.0008) [2023-10-10 05:00:58,852][53268] Updated weights for policy 1, policy_version 9400 (0.0010) [2023-10-10 05:01:00,959][53252] Updated weights for policy 0, policy_version 9410 (0.0009) [2023-10-10 05:01:01,332][53252] Updated weights for policy 0, policy_version 9420 (0.0008) [2023-10-10 05:01:01,706][53252] Updated weights for policy 0, policy_version 9430 (0.0008) [2023-10-10 05:01:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19267584. Throughput: 0: 1669.8, 1: 1668.5. Samples: 4825818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:01,784][52050] Avg episode reward: [(0, '8.930'), (1, '10.690')] [2023-10-10 05:01:02,072][53252] Updated weights for policy 0, policy_version 9440 (0.0008) [2023-10-10 05:01:02,869][53268] Updated weights for policy 1, policy_version 9410 (0.0009) [2023-10-10 05:01:03,268][53268] Updated weights for policy 1, policy_version 9420 (0.0008) [2023-10-10 05:01:03,632][53268] Updated weights for policy 1, policy_version 9430 (0.0008) [2023-10-10 05:01:04,001][53268] Updated weights for policy 1, policy_version 9440 (0.0009) [2023-10-10 05:01:06,177][53252] Updated weights for policy 0, policy_version 9450 (0.0008) [2023-10-10 05:01:06,546][53252] Updated weights for policy 0, policy_version 9460 (0.0008) [2023-10-10 05:01:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 19333120. Throughput: 0: 1673.2, 1: 1692.2. Samples: 4846470. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:06,784][52050] Avg episode reward: [(0, '9.020'), (1, '10.930')] [2023-10-10 05:01:06,930][53252] Updated weights for policy 0, policy_version 9470 (0.0009) [2023-10-10 05:01:07,864][53268] Updated weights for policy 1, policy_version 9450 (0.0009) [2023-10-10 05:01:08,225][53268] Updated weights for policy 1, policy_version 9460 (0.0008) [2023-10-10 05:01:08,602][53268] Updated weights for policy 1, policy_version 9470 (0.0008) [2023-10-10 05:01:10,927][53252] Updated weights for policy 0, policy_version 9480 (0.0009) [2023-10-10 05:01:11,298][53252] Updated weights for policy 0, policy_version 9490 (0.0010) [2023-10-10 05:01:11,666][53252] Updated weights for policy 0, policy_version 9500 (0.0011) [2023-10-10 05:01:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 19398656. Throughput: 0: 1655.5, 1: 1700.8. Samples: 4866520. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:11,784][52050] Avg episode reward: [(0, '8.790'), (1, '11.510')] [2023-10-10 05:01:12,568][53268] Updated weights for policy 1, policy_version 9480 (0.0008) [2023-10-10 05:01:12,945][53268] Updated weights for policy 1, policy_version 9490 (0.0008) [2023-10-10 05:01:13,314][53268] Updated weights for policy 1, policy_version 9500 (0.0008) [2023-10-10 05:01:15,595][53252] Updated weights for policy 0, policy_version 9510 (0.0008) [2023-10-10 05:01:15,964][53252] Updated weights for policy 0, policy_version 9520 (0.0008) [2023-10-10 05:01:16,336][53252] Updated weights for policy 0, policy_version 9530 (0.0008) [2023-10-10 05:01:16,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19496960. Throughput: 0: 1677.0, 1: 1678.0. Samples: 4876668. 
Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 05:01:16,784][52050] Avg episode reward: [(0, '9.370'), (1, '11.210')] [2023-10-10 05:01:17,518][53268] Updated weights for policy 1, policy_version 9510 (0.0009) [2023-10-10 05:01:17,888][53268] Updated weights for policy 1, policy_version 9520 (0.0007) [2023-10-10 05:01:18,261][53268] Updated weights for policy 1, policy_version 9530 (0.0009) [2023-10-10 05:01:20,305][53252] Updated weights for policy 0, policy_version 9540 (0.0008) [2023-10-10 05:01:20,678][53252] Updated weights for policy 0, policy_version 9550 (0.0008) [2023-10-10 05:01:21,058][53252] Updated weights for policy 0, policy_version 9560 (0.0008) [2023-10-10 05:01:21,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19562496. Throughput: 0: 1675.1, 1: 1690.2. Samples: 4897030. Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 05:01:21,784][52050] Avg episode reward: [(0, '9.660'), (1, '12.200')] [2023-10-10 05:01:22,326][53268] Updated weights for policy 1, policy_version 9540 (0.0010) [2023-10-10 05:01:22,695][53268] Updated weights for policy 1, policy_version 9550 (0.0008) [2023-10-10 05:01:23,057][53268] Updated weights for policy 1, policy_version 9560 (0.0009) [2023-10-10 05:01:25,014][53252] Updated weights for policy 0, policy_version 9570 (0.0009) [2023-10-10 05:01:25,390][53252] Updated weights for policy 0, policy_version 9580 (0.0010) [2023-10-10 05:01:25,761][53252] Updated weights for policy 0, policy_version 9590 (0.0009) [2023-10-10 05:01:26,133][53252] Updated weights for policy 0, policy_version 9600 (0.0008) [2023-10-10 05:01:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19628032. Throughput: 0: 1660.7, 1: 1689.8. Samples: 4916670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:26,784][52050] Avg episode reward: [(0, '9.590'), (1, '12.920')] [2023-10-10 05:01:27,198][53268] Updated weights for policy 1, policy_version 9570 (0.0009) [2023-10-10 05:01:27,572][53268] Updated weights for policy 1, policy_version 9580 (0.0009) [2023-10-10 05:01:27,950][53268] Updated weights for policy 1, policy_version 9590 (0.0008) [2023-10-10 05:01:28,322][53268] Updated weights for policy 1, policy_version 9600 (0.0010) [2023-10-10 05:01:30,170][53252] Updated weights for policy 0, policy_version 9610 (0.0008) [2023-10-10 05:01:30,537][53252] Updated weights for policy 0, policy_version 9620 (0.0008) [2023-10-10 05:01:30,908][53252] Updated weights for policy 0, policy_version 9630 (0.0010) [2023-10-10 05:01:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19693568. Throughput: 0: 1686.0, 1: 1679.9. Samples: 4927006. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:31,784][52050] Avg episode reward: [(0, '8.900'), (1, '13.020')] [2023-10-10 05:01:32,395][53268] Updated weights for policy 1, policy_version 9610 (0.0007) [2023-10-10 05:01:32,769][53268] Updated weights for policy 1, policy_version 9620 (0.0008) [2023-10-10 05:01:33,139][53268] Updated weights for policy 1, policy_version 9630 (0.0009) [2023-10-10 05:01:34,943][53252] Updated weights for policy 0, policy_version 9640 (0.0012) [2023-10-10 05:01:35,306][53252] Updated weights for policy 0, policy_version 9650 (0.0008) [2023-10-10 05:01:35,681][53252] Updated weights for policy 0, policy_version 9660 (0.0009) [2023-10-10 05:01:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19759104. Throughput: 0: 1674.8, 1: 1688.8. Samples: 4947114. 
Policy #0 lag: (min: 25.0, avg: 37.3, max: 57.0) [2023-10-10 05:01:36,784][52050] Avg episode reward: [(0, '9.440'), (1, '13.570')] [2023-10-10 05:01:36,785][53061] Saving new best policy, reward=13.570! [2023-10-10 05:01:37,256][53268] Updated weights for policy 1, policy_version 9640 (0.0008) [2023-10-10 05:01:37,617][53268] Updated weights for policy 1, policy_version 9650 (0.0008) [2023-10-10 05:01:37,988][53268] Updated weights for policy 1, policy_version 9660 (0.0008) [2023-10-10 05:01:39,710][53252] Updated weights for policy 0, policy_version 9670 (0.0007) [2023-10-10 05:01:40,086][53252] Updated weights for policy 0, policy_version 9680 (0.0007) [2023-10-10 05:01:40,463][53252] Updated weights for policy 0, policy_version 9690 (0.0008) [2023-10-10 05:01:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19824640. Throughput: 0: 1677.1, 1: 1688.3. Samples: 4967636. Policy #0 lag: (min: 25.0, avg: 37.3, max: 57.0) [2023-10-10 05:01:41,784][52050] Avg episode reward: [(0, '9.180'), (1, '13.170')] [2023-10-10 05:01:41,896][53268] Updated weights for policy 1, policy_version 9670 (0.0008) [2023-10-10 05:01:42,266][53268] Updated weights for policy 1, policy_version 9680 (0.0009) [2023-10-10 05:01:42,645][53268] Updated weights for policy 1, policy_version 9690 (0.0008) [2023-10-10 05:01:44,399][53252] Updated weights for policy 0, policy_version 9700 (0.0008) [2023-10-10 05:01:44,769][53252] Updated weights for policy 0, policy_version 9710 (0.0007) [2023-10-10 05:01:45,136][53252] Updated weights for policy 0, policy_version 9720 (0.0007) [2023-10-10 05:01:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19890176. Throughput: 0: 1698.7, 1: 1682.5. Samples: 4977974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:46,784][52050] Avg episode reward: [(0, '9.090'), (1, '12.920')] [2023-10-10 05:01:46,802][53268] Updated weights for policy 1, policy_version 9700 (0.0008) [2023-10-10 05:01:47,160][53268] Updated weights for policy 1, policy_version 9710 (0.0007) [2023-10-10 05:01:47,532][53268] Updated weights for policy 1, policy_version 9720 (0.0009) [2023-10-10 05:01:49,400][53252] Updated weights for policy 0, policy_version 9730 (0.0009) [2023-10-10 05:01:49,770][53252] Updated weights for policy 0, policy_version 9740 (0.0008) [2023-10-10 05:01:50,141][53252] Updated weights for policy 0, policy_version 9750 (0.0007) [2023-10-10 05:01:50,509][53252] Updated weights for policy 0, policy_version 9760 (0.0011) [2023-10-10 05:01:51,557][53268] Updated weights for policy 1, policy_version 9730 (0.0009) [2023-10-10 05:01:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19955712. Throughput: 0: 1681.6, 1: 1684.9. Samples: 4997966. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:51,784][52050] Avg episode reward: [(0, '9.180'), (1, '12.810')] [2023-10-10 05:01:51,929][53268] Updated weights for policy 1, policy_version 9740 (0.0009) [2023-10-10 05:01:52,306][53268] Updated weights for policy 1, policy_version 9750 (0.0011) [2023-10-10 05:01:52,674][53268] Updated weights for policy 1, policy_version 9760 (0.0009) [2023-10-10 05:01:54,500][53252] Updated weights for policy 0, policy_version 9770 (0.0010) [2023-10-10 05:01:54,872][53252] Updated weights for policy 0, policy_version 9780 (0.0010) [2023-10-10 05:01:55,246][53252] Updated weights for policy 0, policy_version 9790 (0.0009) [2023-10-10 05:01:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20021248. Throughput: 0: 1691.6, 1: 1681.9. Samples: 5018326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:01:56,784][52050] Avg episode reward: [(0, '9.680'), (1, '12.400')] [2023-10-10 05:01:56,785][53268] Updated weights for policy 1, policy_version 9770 (0.0010) [2023-10-10 05:01:57,156][53268] Updated weights for policy 1, policy_version 9780 (0.0008) [2023-10-10 05:01:57,532][53268] Updated weights for policy 1, policy_version 9790 (0.0009) [2023-10-10 05:01:59,335][53252] Updated weights for policy 0, policy_version 9800 (0.0010) [2023-10-10 05:01:59,712][53252] Updated weights for policy 0, policy_version 9810 (0.0010) [2023-10-10 05:02:00,081][53252] Updated weights for policy 0, policy_version 9820 (0.0008) [2023-10-10 05:02:01,521][53268] Updated weights for policy 1, policy_version 9800 (0.0008) [2023-10-10 05:02:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20086784. Throughput: 0: 1691.7, 1: 1679.6. Samples: 5028376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:02:01,784][52050] Avg episode reward: [(0, '10.600'), (1, '12.630')] [2023-10-10 05:02:01,786][52846] Saving new best policy, reward=10.600! [2023-10-10 05:02:01,890][53268] Updated weights for policy 1, policy_version 9810 (0.0009) [2023-10-10 05:02:02,263][53268] Updated weights for policy 1, policy_version 9820 (0.0008) [2023-10-10 05:02:04,084][53252] Updated weights for policy 0, policy_version 9830 (0.0008) [2023-10-10 05:02:04,462][53252] Updated weights for policy 0, policy_version 9840 (0.0007) [2023-10-10 05:02:04,840][53252] Updated weights for policy 0, policy_version 9850 (0.0009) [2023-10-10 05:02:06,465][53268] Updated weights for policy 1, policy_version 9830 (0.0008) [2023-10-10 05:02:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20152320. Throughput: 0: 1677.0, 1: 1680.4. Samples: 5048116. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:02:06,784][52050] Avg episode reward: [(0, '9.850'), (1, '12.550')] [2023-10-10 05:02:06,825][53268] Updated weights for policy 1, policy_version 9840 (0.0009) [2023-10-10 05:02:07,187][53268] Updated weights for policy 1, policy_version 9850 (0.0007) [2023-10-10 05:02:08,741][53252] Updated weights for policy 0, policy_version 9860 (0.0009) [2023-10-10 05:02:09,108][53252] Updated weights for policy 0, policy_version 9870 (0.0009) [2023-10-10 05:02:09,476][53252] Updated weights for policy 0, policy_version 9880 (0.0008) [2023-10-10 05:02:11,234][53268] Updated weights for policy 1, policy_version 9860 (0.0009) [2023-10-10 05:02:11,604][53268] Updated weights for policy 1, policy_version 9870 (0.0011) [2023-10-10 05:02:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20217856. Throughput: 0: 1704.1, 1: 1676.5. Samples: 5068800. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:02:11,784][52050] Avg episode reward: [(0, '10.230'), (1, '13.120')] [2023-10-10 05:02:11,970][53268] Updated weights for policy 1, policy_version 9880 (0.0008) [2023-10-10 05:02:13,559][53252] Updated weights for policy 0, policy_version 9890 (0.0007) [2023-10-10 05:02:13,932][53252] Updated weights for policy 0, policy_version 9900 (0.0009) [2023-10-10 05:02:14,302][53252] Updated weights for policy 0, policy_version 9910 (0.0010) [2023-10-10 05:02:14,675][53252] Updated weights for policy 0, policy_version 9920 (0.0008) [2023-10-10 05:02:16,106][53268] Updated weights for policy 1, policy_version 9890 (0.0008) [2023-10-10 05:02:16,473][53268] Updated weights for policy 1, policy_version 9900 (0.0009) [2023-10-10 05:02:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20283392. Throughput: 0: 1684.6, 1: 1678.0. Samples: 5078324. 
Policy #0 lag: (min: 24.0, avg: 50.0, max: 56.0) [2023-10-10 05:02:16,784][52050] Avg episode reward: [(0, '10.410'), (1, '13.370')] [2023-10-10 05:02:16,850][53268] Updated weights for policy 1, policy_version 9910 (0.0008) [2023-10-10 05:02:17,214][53268] Updated weights for policy 1, policy_version 9920 (0.0008) [2023-10-10 05:02:18,520][53252] Updated weights for policy 0, policy_version 9930 (0.0011) [2023-10-10 05:02:18,899][53252] Updated weights for policy 0, policy_version 9940 (0.0010) [2023-10-10 05:02:19,273][53252] Updated weights for policy 0, policy_version 9950 (0.0007) [2023-10-10 05:02:21,228][53268] Updated weights for policy 1, policy_version 9930 (0.0011) [2023-10-10 05:02:21,591][53268] Updated weights for policy 1, policy_version 9940 (0.0010) [2023-10-10 05:02:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20348928. Throughput: 0: 1695.0, 1: 1677.3. Samples: 5098866. Policy #0 lag: (min: 24.0, avg: 50.0, max: 56.0) [2023-10-10 05:02:21,784][52050] Avg episode reward: [(0, '10.480'), (1, '13.140')] [2023-10-10 05:02:21,965][53268] Updated weights for policy 1, policy_version 9950 (0.0008) [2023-10-10 05:02:23,388][53252] Updated weights for policy 0, policy_version 9960 (0.0007) [2023-10-10 05:02:23,760][53252] Updated weights for policy 0, policy_version 9970 (0.0007) [2023-10-10 05:02:24,139][53252] Updated weights for policy 0, policy_version 9980 (0.0007) [2023-10-10 05:02:26,142][53268] Updated weights for policy 1, policy_version 9960 (0.0007) [2023-10-10 05:02:26,521][53268] Updated weights for policy 1, policy_version 9970 (0.0008) [2023-10-10 05:02:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20414464. Throughput: 0: 1702.9, 1: 1662.9. Samples: 5119096. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:02:26,784][52050] Avg episode reward: [(0, '10.090'), (1, '13.550')] [2023-10-10 05:02:26,885][53268] Updated weights for policy 1, policy_version 9980 (0.0009) [2023-10-10 05:02:28,040][53252] Updated weights for policy 0, policy_version 9990 (0.0009) [2023-10-10 05:02:28,413][53252] Updated weights for policy 0, policy_version 10000 (0.0011) [2023-10-10 05:02:28,785][53252] Updated weights for policy 0, policy_version 10010 (0.0011) [2023-10-10 05:02:31,031][53268] Updated weights for policy 1, policy_version 9990 (0.0010) [2023-10-10 05:02:31,394][53268] Updated weights for policy 1, policy_version 10000 (0.0010) [2023-10-10 05:02:31,768][53268] Updated weights for policy 1, policy_version 10010 (0.0010) [2023-10-10 05:02:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20480000. Throughput: 0: 1673.2, 1: 1670.1. Samples: 5128426. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:02:31,784][52050] Avg episode reward: [(0, '9.080'), (1, '13.780')] [2023-10-10 05:02:31,981][53061] Saving new best policy, reward=13.780! [2023-10-10 05:02:32,844][53252] Updated weights for policy 0, policy_version 10020 (0.0010) [2023-10-10 05:02:33,222][53252] Updated weights for policy 0, policy_version 10030 (0.0010) [2023-10-10 05:02:33,598][53252] Updated weights for policy 0, policy_version 10040 (0.0009) [2023-10-10 05:02:35,942][53268] Updated weights for policy 1, policy_version 10020 (0.0008) [2023-10-10 05:02:36,308][53268] Updated weights for policy 1, policy_version 10030 (0.0008) [2023-10-10 05:02:36,688][53268] Updated weights for policy 1, policy_version 10040 (0.0010) [2023-10-10 05:02:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20545536. Throughput: 0: 1689.8, 1: 1665.8. Samples: 5148970. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 05:02:36,784][52050] Avg episode reward: [(0, '8.280'), (1, '12.530')] [2023-10-10 05:02:37,682][53252] Updated weights for policy 0, policy_version 10050 (0.0007) [2023-10-10 05:02:38,058][53252] Updated weights for policy 0, policy_version 10060 (0.0008) [2023-10-10 05:02:38,426][53252] Updated weights for policy 0, policy_version 10070 (0.0010) [2023-10-10 05:02:38,795][53252] Updated weights for policy 0, policy_version 10080 (0.0009) [2023-10-10 05:02:40,895][53268] Updated weights for policy 1, policy_version 10050 (0.0010) [2023-10-10 05:02:41,304][53268] Updated weights for policy 1, policy_version 10060 (0.0009) [2023-10-10 05:02:41,670][53268] Updated weights for policy 1, policy_version 10070 (0.0008) [2023-10-10 05:02:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20611072. Throughput: 0: 1697.7, 1: 1652.4. Samples: 5169082. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 05:02:41,785][52050] Avg episode reward: [(0, '8.220'), (1, '12.800')] [2023-10-10 05:02:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000010080_10321920.pth... [2023-10-10 05:02:41,828][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000008512_8716288.pth [2023-10-10 05:02:42,042][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000010080_10321920.pth... 
[2023-10-10 05:02:42,045][53268] Updated weights for policy 1, policy_version 10080 (0.0009) [2023-10-10 05:02:42,081][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000008512_8716288.pth [2023-10-10 05:02:42,855][53252] Updated weights for policy 0, policy_version 10090 (0.0008) [2023-10-10 05:02:43,236][53252] Updated weights for policy 0, policy_version 10100 (0.0008) [2023-10-10 05:02:43,607][53252] Updated weights for policy 0, policy_version 10110 (0.0008) [2023-10-10 05:02:46,038][53268] Updated weights for policy 1, policy_version 10090 (0.0008) [2023-10-10 05:02:46,402][53268] Updated weights for policy 1, policy_version 10100 (0.0009) [2023-10-10 05:02:46,772][53268] Updated weights for policy 1, policy_version 10110 (0.0009) [2023-10-10 05:02:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20676608. Throughput: 0: 1672.3, 1: 1661.5. Samples: 5178396. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 05:02:46,784][52050] Avg episode reward: [(0, '8.690'), (1, '13.420')] [2023-10-10 05:02:47,736][53252] Updated weights for policy 0, policy_version 10120 (0.0007) [2023-10-10 05:02:48,101][53252] Updated weights for policy 0, policy_version 10130 (0.0009) [2023-10-10 05:02:48,482][53252] Updated weights for policy 0, policy_version 10140 (0.0010) [2023-10-10 05:02:50,993][53268] Updated weights for policy 1, policy_version 10120 (0.0011) [2023-10-10 05:02:51,363][53268] Updated weights for policy 1, policy_version 10130 (0.0008) [2023-10-10 05:02:51,735][53268] Updated weights for policy 1, policy_version 10140 (0.0009) [2023-10-10 05:02:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 20742144. Throughput: 0: 1689.3, 1: 1659.8. Samples: 5198826. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 05:02:51,784][52050] Avg episode reward: [(0, '9.460'), (1, '12.890')] [2023-10-10 05:02:52,708][53252] Updated weights for policy 0, policy_version 10150 (0.0009) [2023-10-10 05:02:53,082][53252] Updated weights for policy 0, policy_version 10160 (0.0007) [2023-10-10 05:02:53,462][53252] Updated weights for policy 0, policy_version 10170 (0.0008) [2023-10-10 05:02:55,701][53268] Updated weights for policy 1, policy_version 10150 (0.0011) [2023-10-10 05:02:56,075][53268] Updated weights for policy 1, policy_version 10160 (0.0010) [2023-10-10 05:02:56,442][53268] Updated weights for policy 1, policy_version 10170 (0.0009) [2023-10-10 05:02:56,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20840448. Throughput: 0: 1682.8, 1: 1652.7. Samples: 5218900. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 05:02:56,784][52050] Avg episode reward: [(0, '10.000'), (1, '13.800')] [2023-10-10 05:02:56,792][53061] Saving new best policy, reward=13.800! [2023-10-10 05:02:57,591][53252] Updated weights for policy 0, policy_version 10180 (0.0008) [2023-10-10 05:02:57,971][53252] Updated weights for policy 0, policy_version 10190 (0.0007) [2023-10-10 05:02:58,353][53252] Updated weights for policy 0, policy_version 10200 (0.0010) [2023-10-10 05:03:00,623][53268] Updated weights for policy 1, policy_version 10180 (0.0010) [2023-10-10 05:03:00,998][53268] Updated weights for policy 1, policy_version 10190 (0.0009) [2023-10-10 05:03:01,359][53268] Updated weights for policy 1, policy_version 10200 (0.0007) [2023-10-10 05:03:01,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20905984. Throughput: 0: 1672.6, 1: 1663.8. Samples: 5228462. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 05:03:01,785][52050] Avg episode reward: [(0, '9.170'), (1, '12.640')] [2023-10-10 05:03:02,510][53252] Updated weights for policy 0, policy_version 10210 (0.0008) [2023-10-10 05:03:02,879][53252] Updated weights for policy 0, policy_version 10220 (0.0008) [2023-10-10 05:03:03,266][53252] Updated weights for policy 0, policy_version 10230 (0.0010) [2023-10-10 05:03:03,641][53252] Updated weights for policy 0, policy_version 10240 (0.0009) [2023-10-10 05:03:05,528][53268] Updated weights for policy 1, policy_version 10210 (0.0009) [2023-10-10 05:03:05,887][53268] Updated weights for policy 1, policy_version 10220 (0.0008) [2023-10-10 05:03:06,256][53268] Updated weights for policy 1, policy_version 10230 (0.0007) [2023-10-10 05:03:06,623][53268] Updated weights for policy 1, policy_version 10240 (0.0007) [2023-10-10 05:03:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20971520. Throughput: 0: 1674.4, 1: 1662.8. Samples: 5249042. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:03:06,784][52050] Avg episode reward: [(0, '8.440'), (1, '12.270')] [2023-10-10 05:03:07,864][53252] Updated weights for policy 0, policy_version 10250 (0.0009) [2023-10-10 05:03:08,246][53252] Updated weights for policy 0, policy_version 10260 (0.0007) [2023-10-10 05:03:08,618][53252] Updated weights for policy 0, policy_version 10270 (0.0007) [2023-10-10 05:03:10,876][53268] Updated weights for policy 1, policy_version 10250 (0.0011) [2023-10-10 05:03:11,236][53268] Updated weights for policy 1, policy_version 10260 (0.0011) [2023-10-10 05:03:11,609][53268] Updated weights for policy 1, policy_version 10270 (0.0010) [2023-10-10 05:03:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21037056. Throughput: 0: 1674.6, 1: 1654.8. Samples: 5268920. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:03:11,784][52050] Avg episode reward: [(0, '8.790'), (1, '12.260')] [2023-10-10 05:03:12,581][53252] Updated weights for policy 0, policy_version 10280 (0.0007) [2023-10-10 05:03:12,954][53252] Updated weights for policy 0, policy_version 10290 (0.0008) [2023-10-10 05:03:13,329][53252] Updated weights for policy 0, policy_version 10300 (0.0007) [2023-10-10 05:03:15,652][53268] Updated weights for policy 1, policy_version 10280 (0.0008) [2023-10-10 05:03:16,017][53268] Updated weights for policy 1, policy_version 10290 (0.0007) [2023-10-10 05:03:16,390][53268] Updated weights for policy 1, policy_version 10300 (0.0008) [2023-10-10 05:03:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21102592. Throughput: 0: 1672.1, 1: 1666.6. Samples: 5278666. Policy #0 lag: (min: 19.0, avg: 23.1, max: 51.0) [2023-10-10 05:03:16,784][52050] Avg episode reward: [(0, '9.590'), (1, '11.840')] [2023-10-10 05:03:17,261][53252] Updated weights for policy 0, policy_version 10310 (0.0009) [2023-10-10 05:03:17,640][53252] Updated weights for policy 0, policy_version 10320 (0.0009) [2023-10-10 05:03:18,012][53252] Updated weights for policy 0, policy_version 10330 (0.0007) [2023-10-10 05:03:20,404][53268] Updated weights for policy 1, policy_version 10310 (0.0010) [2023-10-10 05:03:20,762][53268] Updated weights for policy 1, policy_version 10320 (0.0010) [2023-10-10 05:03:21,131][53268] Updated weights for policy 1, policy_version 10330 (0.0010) [2023-10-10 05:03:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 21168128. Throughput: 0: 1671.7, 1: 1674.1. Samples: 5299532. 
Policy #0 lag: (min: 19.0, avg: 23.1, max: 51.0) [2023-10-10 05:03:21,784][52050] Avg episode reward: [(0, '10.440'), (1, '12.780')] [2023-10-10 05:03:22,030][53252] Updated weights for policy 0, policy_version 10340 (0.0010) [2023-10-10 05:03:22,396][53252] Updated weights for policy 0, policy_version 10350 (0.0009) [2023-10-10 05:03:22,781][53252] Updated weights for policy 0, policy_version 10360 (0.0009) [2023-10-10 05:03:25,158][53268] Updated weights for policy 1, policy_version 10340 (0.0008) [2023-10-10 05:03:25,559][53268] Updated weights for policy 1, policy_version 10350 (0.0009) [2023-10-10 05:03:25,929][53268] Updated weights for policy 1, policy_version 10360 (0.0008) [2023-10-10 05:03:26,717][53252] Updated weights for policy 0, policy_version 10370 (0.0008) [2023-10-10 05:03:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21233664. Throughput: 0: 1672.7, 1: 1657.4. Samples: 5318938. Policy #0 lag: (min: 22.0, avg: 34.6, max: 54.0) [2023-10-10 05:03:26,784][52050] Avg episode reward: [(0, '11.570'), (1, '12.830')] [2023-10-10 05:03:27,095][53252] Updated weights for policy 0, policy_version 10380 (0.0007) [2023-10-10 05:03:27,470][53252] Updated weights for policy 0, policy_version 10390 (0.0007) [2023-10-10 05:03:27,834][52846] Saving new best policy, reward=11.570! [2023-10-10 05:03:27,834][53252] Updated weights for policy 0, policy_version 10400 (0.0007) [2023-10-10 05:03:29,938][53268] Updated weights for policy 1, policy_version 10370 (0.0007) [2023-10-10 05:03:30,305][53268] Updated weights for policy 1, policy_version 10380 (0.0009) [2023-10-10 05:03:30,679][53268] Updated weights for policy 1, policy_version 10390 (0.0008) [2023-10-10 05:03:31,040][53268] Updated weights for policy 1, policy_version 10400 (0.0009) [2023-10-10 05:03:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21299200. Throughput: 0: 1679.2, 1: 1675.8. Samples: 5329372. 
Policy #0 lag: (min: 22.0, avg: 34.6, max: 54.0) [2023-10-10 05:03:31,784][52050] Avg episode reward: [(0, '10.150'), (1, '12.590')] [2023-10-10 05:03:31,797][53252] Updated weights for policy 0, policy_version 10410 (0.0008) [2023-10-10 05:03:32,170][53252] Updated weights for policy 0, policy_version 10420 (0.0009) [2023-10-10 05:03:32,543][53252] Updated weights for policy 0, policy_version 10430 (0.0008) [2023-10-10 05:03:35,114][53268] Updated weights for policy 1, policy_version 10410 (0.0008) [2023-10-10 05:03:35,478][53268] Updated weights for policy 1, policy_version 10420 (0.0009) [2023-10-10 05:03:35,852][53268] Updated weights for policy 1, policy_version 10430 (0.0008) [2023-10-10 05:03:36,619][53252] Updated weights for policy 0, policy_version 10440 (0.0008) [2023-10-10 05:03:36,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21364736. Throughput: 0: 1684.4, 1: 1671.3. Samples: 5349834. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:03:36,785][52050] Avg episode reward: [(0, '10.470'), (1, '11.950')] [2023-10-10 05:03:37,005][53252] Updated weights for policy 0, policy_version 10450 (0.0008) [2023-10-10 05:03:37,374][53252] Updated weights for policy 0, policy_version 10460 (0.0008) [2023-10-10 05:03:39,788][53268] Updated weights for policy 1, policy_version 10440 (0.0010) [2023-10-10 05:03:40,162][53268] Updated weights for policy 1, policy_version 10450 (0.0009) [2023-10-10 05:03:40,535][53268] Updated weights for policy 1, policy_version 10460 (0.0009) [2023-10-10 05:03:41,390][53252] Updated weights for policy 0, policy_version 10470 (0.0009) [2023-10-10 05:03:41,765][53252] Updated weights for policy 0, policy_version 10480 (0.0008) [2023-10-10 05:03:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21430272. Throughput: 0: 1680.6, 1: 1665.6. Samples: 5369480. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:03:41,785][52050] Avg episode reward: [(0, '10.050'), (1, '11.680')] [2023-10-10 05:03:42,128][53252] Updated weights for policy 0, policy_version 10490 (0.0008) [2023-10-10 05:03:44,747][53268] Updated weights for policy 1, policy_version 10470 (0.0009) [2023-10-10 05:03:45,103][53268] Updated weights for policy 1, policy_version 10480 (0.0009) [2023-10-10 05:03:45,476][53268] Updated weights for policy 1, policy_version 10490 (0.0009) [2023-10-10 05:03:46,214][53252] Updated weights for policy 0, policy_version 10500 (0.0009) [2023-10-10 05:03:46,585][53252] Updated weights for policy 0, policy_version 10510 (0.0009) [2023-10-10 05:03:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21495808. Throughput: 0: 1684.2, 1: 1682.6. Samples: 5379968. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:03:46,784][52050] Avg episode reward: [(0, '9.720'), (1, '12.340')] [2023-10-10 05:03:46,962][53252] Updated weights for policy 0, policy_version 10520 (0.0008) [2023-10-10 05:03:49,572][53268] Updated weights for policy 1, policy_version 10500 (0.0010) [2023-10-10 05:03:49,934][53268] Updated weights for policy 1, policy_version 10510 (0.0011) [2023-10-10 05:03:50,302][53268] Updated weights for policy 1, policy_version 10520 (0.0010) [2023-10-10 05:03:51,037][53252] Updated weights for policy 0, policy_version 10530 (0.0007) [2023-10-10 05:03:51,415][53252] Updated weights for policy 0, policy_version 10540 (0.0008) [2023-10-10 05:03:51,784][53252] Updated weights for policy 0, policy_version 10550 (0.0008) [2023-10-10 05:03:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21561344. Throughput: 0: 1690.4, 1: 1666.5. Samples: 5400100. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:03:51,784][52050] Avg episode reward: [(0, '11.160'), (1, '13.390')] [2023-10-10 05:03:52,152][53252] Updated weights for policy 0, policy_version 10560 (0.0008) [2023-10-10 05:03:54,134][53268] Updated weights for policy 1, policy_version 10530 (0.0010) [2023-10-10 05:03:54,503][53268] Updated weights for policy 1, policy_version 10540 (0.0010) [2023-10-10 05:03:54,871][53268] Updated weights for policy 1, policy_version 10550 (0.0009) [2023-10-10 05:03:55,240][53268] Updated weights for policy 1, policy_version 10560 (0.0009) [2023-10-10 05:03:56,339][53252] Updated weights for policy 0, policy_version 10570 (0.0007) [2023-10-10 05:03:56,716][53252] Updated weights for policy 0, policy_version 10580 (0.0008) [2023-10-10 05:03:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 21626880. Throughput: 0: 1675.6, 1: 1677.9. Samples: 5419828. Policy #0 lag: (min: 25.0, avg: 29.5, max: 57.0) [2023-10-10 05:03:56,784][52050] Avg episode reward: [(0, '10.850'), (1, '13.600')] [2023-10-10 05:03:57,105][53252] Updated weights for policy 0, policy_version 10590 (0.0008) [2023-10-10 05:03:59,282][53268] Updated weights for policy 1, policy_version 10570 (0.0010) [2023-10-10 05:03:59,650][53268] Updated weights for policy 1, policy_version 10580 (0.0010) [2023-10-10 05:04:00,018][53268] Updated weights for policy 1, policy_version 10590 (0.0010) [2023-10-10 05:04:01,181][53252] Updated weights for policy 0, policy_version 10600 (0.0008) [2023-10-10 05:04:01,556][53252] Updated weights for policy 0, policy_version 10610 (0.0009) [2023-10-10 05:04:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 21692416. Throughput: 0: 1687.1, 1: 1685.1. Samples: 5430414. 
Policy #0 lag: (min: 25.0, avg: 29.5, max: 57.0) [2023-10-10 05:04:01,784][52050] Avg episode reward: [(0, '11.390'), (1, '12.940')] [2023-10-10 05:04:01,937][53252] Updated weights for policy 0, policy_version 10620 (0.0010) [2023-10-10 05:04:04,081][53268] Updated weights for policy 1, policy_version 10600 (0.0009) [2023-10-10 05:04:04,458][53268] Updated weights for policy 1, policy_version 10610 (0.0008) [2023-10-10 05:04:04,838][53268] Updated weights for policy 1, policy_version 10620 (0.0008) [2023-10-10 05:04:05,914][53252] Updated weights for policy 0, policy_version 10630 (0.0007) [2023-10-10 05:04:06,287][53252] Updated weights for policy 0, policy_version 10640 (0.0011) [2023-10-10 05:04:06,657][53252] Updated weights for policy 0, policy_version 10650 (0.0007) [2023-10-10 05:04:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21757952. Throughput: 0: 1691.8, 1: 1654.9. Samples: 5450134. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 05:04:06,785][52050] Avg episode reward: [(0, '10.990'), (1, '12.380')] [2023-10-10 05:04:08,898][53268] Updated weights for policy 1, policy_version 10630 (0.0009) [2023-10-10 05:04:09,264][53268] Updated weights for policy 1, policy_version 10640 (0.0009) [2023-10-10 05:04:09,627][53268] Updated weights for policy 1, policy_version 10650 (0.0007) [2023-10-10 05:04:10,728][53252] Updated weights for policy 0, policy_version 10660 (0.0008) [2023-10-10 05:04:11,092][53252] Updated weights for policy 0, policy_version 10670 (0.0008) [2023-10-10 05:04:11,467][53252] Updated weights for policy 0, policy_version 10680 (0.0008) [2023-10-10 05:04:11,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21856256. Throughput: 0: 1673.6, 1: 1688.5. Samples: 5470236. 
Policy #0 lag: (min: 27.0, avg: 35.0, max: 59.0) [2023-10-10 05:04:11,784][52050] Avg episode reward: [(0, '10.060'), (1, '12.660')] [2023-10-10 05:04:13,611][53268] Updated weights for policy 1, policy_version 10660 (0.0008) [2023-10-10 05:04:14,006][53268] Updated weights for policy 1, policy_version 10670 (0.0010) [2023-10-10 05:04:14,365][53268] Updated weights for policy 1, policy_version 10680 (0.0009) [2023-10-10 05:04:15,400][53252] Updated weights for policy 0, policy_version 10690 (0.0008) [2023-10-10 05:04:15,771][53252] Updated weights for policy 0, policy_version 10700 (0.0009) [2023-10-10 05:04:16,143][53252] Updated weights for policy 0, policy_version 10710 (0.0007) [2023-10-10 05:04:16,518][53252] Updated weights for policy 0, policy_version 10720 (0.0011) [2023-10-10 05:04:16,784][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21921792. Throughput: 0: 1691.6, 1: 1678.5. Samples: 5481028. Policy #0 lag: (min: 27.0, avg: 35.0, max: 59.0) [2023-10-10 05:04:16,785][52050] Avg episode reward: [(0, '8.860'), (1, '13.250')] [2023-10-10 05:04:18,502][53268] Updated weights for policy 1, policy_version 10690 (0.0009) [2023-10-10 05:04:18,878][53268] Updated weights for policy 1, policy_version 10700 (0.0008) [2023-10-10 05:04:19,252][53268] Updated weights for policy 1, policy_version 10710 (0.0009) [2023-10-10 05:04:19,622][53268] Updated weights for policy 1, policy_version 10720 (0.0008) [2023-10-10 05:04:20,635][53252] Updated weights for policy 0, policy_version 10730 (0.0009) [2023-10-10 05:04:21,003][53252] Updated weights for policy 0, policy_version 10740 (0.0008) [2023-10-10 05:04:21,368][53252] Updated weights for policy 0, policy_version 10750 (0.0010) [2023-10-10 05:04:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21987328. Throughput: 0: 1686.7, 1: 1671.8. Samples: 5500966. 
Policy #0 lag: (min: 2.0, avg: 8.0, max: 34.0) [2023-10-10 05:04:21,784][52050] Avg episode reward: [(0, '9.220'), (1, '13.550')] [2023-10-10 05:04:23,534][53268] Updated weights for policy 1, policy_version 10730 (0.0008) [2023-10-10 05:04:23,910][53268] Updated weights for policy 1, policy_version 10740 (0.0010) [2023-10-10 05:04:24,274][53268] Updated weights for policy 1, policy_version 10750 (0.0009) [2023-10-10 05:04:25,489][53252] Updated weights for policy 0, policy_version 10760 (0.0009) [2023-10-10 05:04:25,866][53252] Updated weights for policy 0, policy_version 10770 (0.0009) [2023-10-10 05:04:26,246][53252] Updated weights for policy 0, policy_version 10780 (0.0010) [2023-10-10 05:04:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22052864. Throughput: 0: 1667.8, 1: 1691.1. Samples: 5520630. Policy #0 lag: (min: 2.0, avg: 8.0, max: 34.0) [2023-10-10 05:04:26,784][52050] Avg episode reward: [(0, '8.650'), (1, '13.780')] [2023-10-10 05:04:28,457][53268] Updated weights for policy 1, policy_version 10760 (0.0010) [2023-10-10 05:04:28,820][53268] Updated weights for policy 1, policy_version 10770 (0.0009) [2023-10-10 05:04:29,204][53268] Updated weights for policy 1, policy_version 10780 (0.0007) [2023-10-10 05:04:30,274][53252] Updated weights for policy 0, policy_version 10790 (0.0008) [2023-10-10 05:04:30,644][53252] Updated weights for policy 0, policy_version 10800 (0.0007) [2023-10-10 05:04:31,007][53252] Updated weights for policy 0, policy_version 10810 (0.0008) [2023-10-10 05:04:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22118400. Throughput: 0: 1691.6, 1: 1668.3. Samples: 5531162. 
Policy #0 lag: (min: 29.0, avg: 34.4, max: 61.0) [2023-10-10 05:04:31,784][52050] Avg episode reward: [(0, '9.020'), (1, '13.200')] [2023-10-10 05:04:33,233][53268] Updated weights for policy 1, policy_version 10790 (0.0008) [2023-10-10 05:04:33,605][53268] Updated weights for policy 1, policy_version 10800 (0.0008) [2023-10-10 05:04:33,974][53268] Updated weights for policy 1, policy_version 10810 (0.0007) [2023-10-10 05:04:34,991][53252] Updated weights for policy 0, policy_version 10820 (0.0010) [2023-10-10 05:04:35,372][53252] Updated weights for policy 0, policy_version 10830 (0.0009) [2023-10-10 05:04:35,736][53252] Updated weights for policy 0, policy_version 10840 (0.0008) [2023-10-10 05:04:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 22183936. Throughput: 0: 1676.6, 1: 1681.5. Samples: 5551216. Policy #0 lag: (min: 29.0, avg: 34.4, max: 61.0) [2023-10-10 05:04:36,784][52050] Avg episode reward: [(0, '9.840'), (1, '12.970')] [2023-10-10 05:04:38,054][53268] Updated weights for policy 1, policy_version 10820 (0.0008) [2023-10-10 05:04:38,426][53268] Updated weights for policy 1, policy_version 10830 (0.0007) [2023-10-10 05:04:38,800][53268] Updated weights for policy 1, policy_version 10840 (0.0007) [2023-10-10 05:04:39,881][53252] Updated weights for policy 0, policy_version 10850 (0.0008) [2023-10-10 05:04:40,261][53252] Updated weights for policy 0, policy_version 10860 (0.0007) [2023-10-10 05:04:40,641][53252] Updated weights for policy 0, policy_version 10870 (0.0011) [2023-10-10 05:04:41,016][53252] Updated weights for policy 0, policy_version 10880 (0.0008) [2023-10-10 05:04:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 22249472. Throughput: 0: 1673.7, 1: 1690.0. Samples: 5571196. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 05:04:41,784][52050] Avg episode reward: [(0, '10.530'), (1, '12.820')] [2023-10-10 05:04:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000010848_11108352.pth... [2023-10-10 05:04:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000010880_11141120.pth... [2023-10-10 05:04:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000009280_9502720.pth [2023-10-10 05:04:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000009312_9535488.pth [2023-10-10 05:04:42,968][53268] Updated weights for policy 1, policy_version 10850 (0.0008) [2023-10-10 05:04:43,340][53268] Updated weights for policy 1, policy_version 10860 (0.0008) [2023-10-10 05:04:43,700][53268] Updated weights for policy 1, policy_version 10870 (0.0009) [2023-10-10 05:04:44,077][53268] Updated weights for policy 1, policy_version 10880 (0.0007) [2023-10-10 05:04:45,021][53252] Updated weights for policy 0, policy_version 10890 (0.0007) [2023-10-10 05:04:45,394][53252] Updated weights for policy 0, policy_version 10900 (0.0008) [2023-10-10 05:04:45,766][53252] Updated weights for policy 0, policy_version 10910 (0.0007) [2023-10-10 05:04:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 22315008. Throughput: 0: 1696.7, 1: 1664.3. Samples: 5581656. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 05:04:46,784][52050] Avg episode reward: [(0, '11.070'), (1, '12.920')] [2023-10-10 05:04:48,188][53268] Updated weights for policy 1, policy_version 10890 (0.0010) [2023-10-10 05:04:48,561][53268] Updated weights for policy 1, policy_version 10900 (0.0009) [2023-10-10 05:04:48,921][53268] Updated weights for policy 1, policy_version 10910 (0.0009) [2023-10-10 05:04:49,876][53252] Updated weights for policy 0, policy_version 10920 (0.0007) [2023-10-10 05:04:50,239][53252] Updated weights for policy 0, policy_version 10930 (0.0008) [2023-10-10 05:04:50,615][53252] Updated weights for policy 0, policy_version 10940 (0.0007) [2023-10-10 05:04:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22380544. Throughput: 0: 1672.6, 1: 1691.6. Samples: 5601522. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 05:04:51,784][52050] Avg episode reward: [(0, '10.890'), (1, '12.510')] [2023-10-10 05:04:52,868][53268] Updated weights for policy 1, policy_version 10920 (0.0009) [2023-10-10 05:04:53,228][53268] Updated weights for policy 1, policy_version 10930 (0.0009) [2023-10-10 05:04:53,603][53268] Updated weights for policy 1, policy_version 10940 (0.0009) [2023-10-10 05:04:54,564][53252] Updated weights for policy 0, policy_version 10950 (0.0008) [2023-10-10 05:04:54,941][53252] Updated weights for policy 0, policy_version 10960 (0.0008) [2023-10-10 05:04:55,305][53252] Updated weights for policy 0, policy_version 10970 (0.0009) [2023-10-10 05:04:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22446080. Throughput: 0: 1682.3, 1: 1692.2. Samples: 5622090. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 05:04:56,784][52050] Avg episode reward: [(0, '10.800'), (1, '13.220')] [2023-10-10 05:04:57,611][53268] Updated weights for policy 1, policy_version 10950 (0.0009) [2023-10-10 05:04:57,969][53268] Updated weights for policy 1, policy_version 10960 (0.0008) [2023-10-10 05:04:58,348][53268] Updated weights for policy 1, policy_version 10970 (0.0010) [2023-10-10 05:04:59,312][53252] Updated weights for policy 0, policy_version 10980 (0.0008) [2023-10-10 05:04:59,682][53252] Updated weights for policy 0, policy_version 10990 (0.0008) [2023-10-10 05:05:00,051][53252] Updated weights for policy 0, policy_version 11000 (0.0008) [2023-10-10 05:05:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 22511616. Throughput: 0: 1682.9, 1: 1673.3. Samples: 5632052. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:05:01,784][52050] Avg episode reward: [(0, '10.560'), (1, '13.600')] [2023-10-10 05:05:02,471][53268] Updated weights for policy 1, policy_version 10980 (0.0009) [2023-10-10 05:05:02,870][53268] Updated weights for policy 1, policy_version 10990 (0.0009) [2023-10-10 05:05:03,231][53268] Updated weights for policy 1, policy_version 11000 (0.0009) [2023-10-10 05:05:04,055][53252] Updated weights for policy 0, policy_version 11010 (0.0008) [2023-10-10 05:05:04,429][53252] Updated weights for policy 0, policy_version 11020 (0.0008) [2023-10-10 05:05:04,798][53252] Updated weights for policy 0, policy_version 11030 (0.0007) [2023-10-10 05:05:05,162][53252] Updated weights for policy 0, policy_version 11040 (0.0007) [2023-10-10 05:05:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 22577152. Throughput: 0: 1663.7, 1: 1688.9. Samples: 5651832. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:05:06,784][52050] Avg episode reward: [(0, '9.810'), (1, '13.970')] [2023-10-10 05:05:06,784][53061] Saving new best policy, reward=13.970! [2023-10-10 05:05:07,187][53268] Updated weights for policy 1, policy_version 11010 (0.0009) [2023-10-10 05:05:07,554][53268] Updated weights for policy 1, policy_version 11020 (0.0010) [2023-10-10 05:05:07,921][53268] Updated weights for policy 1, policy_version 11030 (0.0010) [2023-10-10 05:05:08,291][53268] Updated weights for policy 1, policy_version 11040 (0.0010) [2023-10-10 05:05:09,199][53252] Updated weights for policy 0, policy_version 11050 (0.0009) [2023-10-10 05:05:09,571][53252] Updated weights for policy 0, policy_version 11060 (0.0009) [2023-10-10 05:05:09,948][53252] Updated weights for policy 0, policy_version 11070 (0.0010) [2023-10-10 05:05:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22642688. Throughput: 0: 1687.3, 1: 1686.8. Samples: 5672462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:05:11,784][52050] Avg episode reward: [(0, '10.000'), (1, '14.060')] [2023-10-10 05:05:11,793][53061] Saving new best policy, reward=14.060! [2023-10-10 05:05:12,434][53268] Updated weights for policy 1, policy_version 11050 (0.0009) [2023-10-10 05:05:12,803][53268] Updated weights for policy 1, policy_version 11060 (0.0009) [2023-10-10 05:05:13,179][53268] Updated weights for policy 1, policy_version 11070 (0.0009) [2023-10-10 05:05:13,849][53252] Updated weights for policy 0, policy_version 11080 (0.0009) [2023-10-10 05:05:14,214][53252] Updated weights for policy 0, policy_version 11090 (0.0007) [2023-10-10 05:05:14,594][53252] Updated weights for policy 0, policy_version 11100 (0.0007) [2023-10-10 05:05:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22708224. Throughput: 0: 1676.8, 1: 1680.9. Samples: 5682258. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:05:16,784][52050] Avg episode reward: [(0, '9.600'), (1, '13.970')] [2023-10-10 05:05:17,418][53268] Updated weights for policy 1, policy_version 11080 (0.0010) [2023-10-10 05:05:17,778][53268] Updated weights for policy 1, policy_version 11090 (0.0009) [2023-10-10 05:05:18,156][53268] Updated weights for policy 1, policy_version 11100 (0.0008) [2023-10-10 05:05:18,856][53252] Updated weights for policy 0, policy_version 11110 (0.0009) [2023-10-10 05:05:19,235][53252] Updated weights for policy 0, policy_version 11120 (0.0010) [2023-10-10 05:05:19,615][53252] Updated weights for policy 0, policy_version 11130 (0.0008) [2023-10-10 05:05:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22773760. Throughput: 0: 1677.1, 1: 1682.6. Samples: 5702402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:05:21,784][52050] Avg episode reward: [(0, '11.310'), (1, '12.600')] [2023-10-10 05:05:22,216][53268] Updated weights for policy 1, policy_version 11110 (0.0007) [2023-10-10 05:05:22,594][53268] Updated weights for policy 1, policy_version 11120 (0.0007) [2023-10-10 05:05:22,960][53268] Updated weights for policy 1, policy_version 11130 (0.0008) [2023-10-10 05:05:23,599][53252] Updated weights for policy 0, policy_version 11140 (0.0009) [2023-10-10 05:05:23,966][53252] Updated weights for policy 0, policy_version 11150 (0.0007) [2023-10-10 05:05:24,346][53252] Updated weights for policy 0, policy_version 11160 (0.0007) [2023-10-10 05:05:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 22839296. Throughput: 0: 1696.0, 1: 1678.2. Samples: 5723036. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:05:26,784][52050] Avg episode reward: [(0, '10.970'), (1, '12.530')] [2023-10-10 05:05:26,986][53268] Updated weights for policy 1, policy_version 11140 (0.0010) [2023-10-10 05:05:27,352][53268] Updated weights for policy 1, policy_version 11150 (0.0011) [2023-10-10 05:05:27,722][53268] Updated weights for policy 1, policy_version 11160 (0.0010) [2023-10-10 05:05:28,405][53252] Updated weights for policy 0, policy_version 11170 (0.0007) [2023-10-10 05:05:28,771][53252] Updated weights for policy 0, policy_version 11180 (0.0008) [2023-10-10 05:05:29,136][53252] Updated weights for policy 0, policy_version 11190 (0.0008) [2023-10-10 05:05:29,508][53252] Updated weights for policy 0, policy_version 11200 (0.0009) [2023-10-10 05:05:31,730][53268] Updated weights for policy 1, policy_version 11170 (0.0009) [2023-10-10 05:05:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 22904832. Throughput: 0: 1667.1, 1: 1682.0. Samples: 5732362. Policy #0 lag: (min: 10.0, avg: 25.9, max: 42.0) [2023-10-10 05:05:31,784][52050] Avg episode reward: [(0, '11.310'), (1, '13.680')] [2023-10-10 05:05:32,105][53268] Updated weights for policy 1, policy_version 11180 (0.0008) [2023-10-10 05:05:32,472][53268] Updated weights for policy 1, policy_version 11190 (0.0007) [2023-10-10 05:05:32,839][53268] Updated weights for policy 1, policy_version 11200 (0.0008) [2023-10-10 05:05:33,638][53252] Updated weights for policy 0, policy_version 11210 (0.0009) [2023-10-10 05:05:34,013][53252] Updated weights for policy 0, policy_version 11220 (0.0008) [2023-10-10 05:05:34,387][53252] Updated weights for policy 0, policy_version 11230 (0.0008) [2023-10-10 05:05:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22970368. Throughput: 0: 1681.0, 1: 1682.8. Samples: 5752894. 
Policy #0 lag: (min: 10.0, avg: 25.9, max: 42.0) [2023-10-10 05:05:36,784][52050] Avg episode reward: [(0, '10.900'), (1, '12.770')] [2023-10-10 05:05:36,877][53268] Updated weights for policy 1, policy_version 11210 (0.0008) [2023-10-10 05:05:37,255][53268] Updated weights for policy 1, policy_version 11220 (0.0007) [2023-10-10 05:05:37,621][53268] Updated weights for policy 1, policy_version 11230 (0.0008) [2023-10-10 05:05:38,446][53252] Updated weights for policy 0, policy_version 11240 (0.0007) [2023-10-10 05:05:38,817][53252] Updated weights for policy 0, policy_version 11250 (0.0008) [2023-10-10 05:05:39,189][53252] Updated weights for policy 0, policy_version 11260 (0.0007) [2023-10-10 05:05:41,601][53268] Updated weights for policy 1, policy_version 11240 (0.0010) [2023-10-10 05:05:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23035904. Throughput: 0: 1685.3, 1: 1681.9. Samples: 5773616. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:05:41,784][52050] Avg episode reward: [(0, '9.430'), (1, '13.500')] [2023-10-10 05:05:41,973][53268] Updated weights for policy 1, policy_version 11250 (0.0011) [2023-10-10 05:05:42,336][53268] Updated weights for policy 1, policy_version 11260 (0.0009) [2023-10-10 05:05:43,250][53252] Updated weights for policy 0, policy_version 11270 (0.0008) [2023-10-10 05:05:43,611][53252] Updated weights for policy 0, policy_version 11280 (0.0008) [2023-10-10 05:05:43,989][53252] Updated weights for policy 0, policy_version 11290 (0.0008) [2023-10-10 05:05:46,303][53268] Updated weights for policy 1, policy_version 11270 (0.0008) [2023-10-10 05:05:46,659][53268] Updated weights for policy 1, policy_version 11280 (0.0011) [2023-10-10 05:05:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23101440. Throughput: 0: 1663.6, 1: 1688.5. Samples: 5782896. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:05:46,784][52050] Avg episode reward: [(0, '10.190'), (1, '12.640')] [2023-10-10 05:05:47,031][53268] Updated weights for policy 1, policy_version 11290 (0.0011) [2023-10-10 05:05:48,031][53252] Updated weights for policy 0, policy_version 11300 (0.0009) [2023-10-10 05:05:48,404][53252] Updated weights for policy 0, policy_version 11310 (0.0007) [2023-10-10 05:05:48,781][53252] Updated weights for policy 0, policy_version 11320 (0.0009) [2023-10-10 05:05:51,222][53268] Updated weights for policy 1, policy_version 11300 (0.0009) [2023-10-10 05:05:51,593][53268] Updated weights for policy 1, policy_version 11310 (0.0007) [2023-10-10 05:05:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23166976. Throughput: 0: 1690.2, 1: 1684.8. Samples: 5803708. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:05:51,784][52050] Avg episode reward: [(0, '10.470'), (1, '12.410')] [2023-10-10 05:05:51,955][53268] Updated weights for policy 1, policy_version 11320 (0.0008) [2023-10-10 05:05:52,863][53252] Updated weights for policy 0, policy_version 11330 (0.0007) [2023-10-10 05:05:53,232][53252] Updated weights for policy 0, policy_version 11340 (0.0007) [2023-10-10 05:05:53,608][53252] Updated weights for policy 0, policy_version 11350 (0.0008) [2023-10-10 05:05:53,976][53252] Updated weights for policy 0, policy_version 11360 (0.0008) [2023-10-10 05:05:56,175][53268] Updated weights for policy 1, policy_version 11330 (0.0010) [2023-10-10 05:05:56,534][53268] Updated weights for policy 1, policy_version 11340 (0.0012) [2023-10-10 05:05:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23232512. Throughput: 0: 1693.8, 1: 1683.0. Samples: 5824420. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:05:56,784][52050] Avg episode reward: [(0, '10.430'), (1, '13.140')] [2023-10-10 05:05:56,898][53268] Updated weights for policy 1, policy_version 11350 (0.0011) [2023-10-10 05:05:57,271][53268] Updated weights for policy 1, policy_version 11360 (0.0010) [2023-10-10 05:05:57,983][53252] Updated weights for policy 0, policy_version 11370 (0.0010) [2023-10-10 05:05:58,358][53252] Updated weights for policy 0, policy_version 11380 (0.0008) [2023-10-10 05:05:58,728][53252] Updated weights for policy 0, policy_version 11390 (0.0007) [2023-10-10 05:06:01,321][53268] Updated weights for policy 1, policy_version 11370 (0.0007) [2023-10-10 05:06:01,693][53268] Updated weights for policy 1, policy_version 11380 (0.0008) [2023-10-10 05:06:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23298048. Throughput: 0: 1680.0, 1: 1684.1. Samples: 5833642. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:06:01,784][52050] Avg episode reward: [(0, '9.290'), (1, '13.520')] [2023-10-10 05:06:02,064][53268] Updated weights for policy 1, policy_version 11390 (0.0008) [2023-10-10 05:06:02,573][53252] Updated weights for policy 0, policy_version 11400 (0.0008) [2023-10-10 05:06:02,943][53252] Updated weights for policy 0, policy_version 11410 (0.0007) [2023-10-10 05:06:03,315][53252] Updated weights for policy 0, policy_version 11420 (0.0009) [2023-10-10 05:06:06,089][53268] Updated weights for policy 1, policy_version 11400 (0.0008) [2023-10-10 05:06:06,468][53268] Updated weights for policy 1, policy_version 11410 (0.0010) [2023-10-10 05:06:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23363584. Throughput: 0: 1695.9, 1: 1682.1. Samples: 5854414. 
Policy #0 lag: (min: 32.0, avg: 53.9, max: 56.0) [2023-10-10 05:06:06,784][52050] Avg episode reward: [(0, '8.480'), (1, '13.440')] [2023-10-10 05:06:06,836][53268] Updated weights for policy 1, policy_version 11420 (0.0007) [2023-10-10 05:06:07,132][53252] Updated weights for policy 0, policy_version 11430 (0.0008) [2023-10-10 05:06:07,506][53252] Updated weights for policy 0, policy_version 11440 (0.0007) [2023-10-10 05:06:07,882][53252] Updated weights for policy 0, policy_version 11450 (0.0007) [2023-10-10 05:06:11,116][53268] Updated weights for policy 1, policy_version 11430 (0.0009) [2023-10-10 05:06:11,479][53268] Updated weights for policy 1, policy_version 11440 (0.0010) [2023-10-10 05:06:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23429120. Throughput: 0: 1694.8, 1: 1676.9. Samples: 5874764. Policy #0 lag: (min: 32.0, avg: 53.9, max: 56.0) [2023-10-10 05:06:11,784][52050] Avg episode reward: [(0, '9.060'), (1, '13.540')] [2023-10-10 05:06:11,850][53268] Updated weights for policy 1, policy_version 11450 (0.0009) [2023-10-10 05:06:11,949][53252] Updated weights for policy 0, policy_version 11460 (0.0009) [2023-10-10 05:06:12,316][53252] Updated weights for policy 0, policy_version 11470 (0.0007) [2023-10-10 05:06:12,699][53252] Updated weights for policy 0, policy_version 11480 (0.0007) [2023-10-10 05:06:15,917][53268] Updated weights for policy 1, policy_version 11460 (0.0008) [2023-10-10 05:06:16,281][53268] Updated weights for policy 1, policy_version 11470 (0.0008) [2023-10-10 05:06:16,651][53268] Updated weights for policy 1, policy_version 11480 (0.0007) [2023-10-10 05:06:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 23494656. Throughput: 0: 1692.1, 1: 1681.2. Samples: 5884164. 
Policy #0 lag: (min: 32.0, avg: 53.9, max: 56.0) [2023-10-10 05:06:16,784][52050] Avg episode reward: [(0, '10.250'), (1, '13.280')] [2023-10-10 05:06:16,835][53252] Updated weights for policy 0, policy_version 11490 (0.0007) [2023-10-10 05:06:17,214][53252] Updated weights for policy 0, policy_version 11500 (0.0009) [2023-10-10 05:06:17,595][53252] Updated weights for policy 0, policy_version 11510 (0.0007) [2023-10-10 05:06:17,964][53252] Updated weights for policy 0, policy_version 11520 (0.0010) [2023-10-10 05:06:20,761][53268] Updated weights for policy 1, policy_version 11490 (0.0009) [2023-10-10 05:06:21,131][53268] Updated weights for policy 1, policy_version 11500 (0.0009) [2023-10-10 05:06:21,502][53268] Updated weights for policy 1, policy_version 11510 (0.0008) [2023-10-10 05:06:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23560192. Throughput: 0: 1697.7, 1: 1676.3. Samples: 5904724. Policy #0 lag: (min: 4.0, avg: 8.0, max: 36.0) [2023-10-10 05:06:21,784][52050] Avg episode reward: [(0, '10.930'), (1, '12.160')] [2023-10-10 05:06:21,868][53268] Updated weights for policy 1, policy_version 11520 (0.0009) [2023-10-10 05:06:22,044][53252] Updated weights for policy 0, policy_version 11530 (0.0008) [2023-10-10 05:06:22,420][53252] Updated weights for policy 0, policy_version 11540 (0.0009) [2023-10-10 05:06:22,802][53252] Updated weights for policy 0, policy_version 11550 (0.0009) [2023-10-10 05:06:25,877][53268] Updated weights for policy 1, policy_version 11530 (0.0009) [2023-10-10 05:06:26,241][53268] Updated weights for policy 1, policy_version 11540 (0.0011) [2023-10-10 05:06:26,612][53268] Updated weights for policy 1, policy_version 11550 (0.0009) [2023-10-10 05:06:26,783][52050] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23658496. Throughput: 0: 1696.5, 1: 1659.4. Samples: 5924632. 
Policy #0 lag: (min: 4.0, avg: 8.0, max: 36.0) [2023-10-10 05:06:26,784][52050] Avg episode reward: [(0, '10.420'), (1, '12.020')] [2023-10-10 05:06:26,909][53252] Updated weights for policy 0, policy_version 11560 (0.0009) [2023-10-10 05:06:27,272][53252] Updated weights for policy 0, policy_version 11570 (0.0008) [2023-10-10 05:06:27,646][53252] Updated weights for policy 0, policy_version 11580 (0.0007) [2023-10-10 05:06:30,791][53268] Updated weights for policy 1, policy_version 11560 (0.0008) [2023-10-10 05:06:31,154][53268] Updated weights for policy 1, policy_version 11570 (0.0010) [2023-10-10 05:06:31,524][53268] Updated weights for policy 1, policy_version 11580 (0.0007) [2023-10-10 05:06:31,717][53252] Updated weights for policy 0, policy_version 11590 (0.0008) [2023-10-10 05:06:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23724032. Throughput: 0: 1693.6, 1: 1669.7. Samples: 5934246. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) [2023-10-10 05:06:31,784][52050] Avg episode reward: [(0, '10.750'), (1, '11.470')] [2023-10-10 05:06:32,094][53252] Updated weights for policy 0, policy_version 11600 (0.0010) [2023-10-10 05:06:32,468][53252] Updated weights for policy 0, policy_version 11610 (0.0008) [2023-10-10 05:06:35,532][53268] Updated weights for policy 1, policy_version 11590 (0.0007) [2023-10-10 05:06:35,899][53268] Updated weights for policy 1, policy_version 11600 (0.0009) [2023-10-10 05:06:36,271][53268] Updated weights for policy 1, policy_version 11610 (0.0009) [2023-10-10 05:06:36,472][53252] Updated weights for policy 0, policy_version 11620 (0.0010) [2023-10-10 05:06:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23789568. Throughput: 0: 1691.1, 1: 1672.1. Samples: 5955052. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) [2023-10-10 05:06:36,784][52050] Avg episode reward: [(0, '11.280'), (1, '12.140')] [2023-10-10 05:06:36,849][53252] Updated weights for policy 0, policy_version 11630 (0.0010) [2023-10-10 05:06:37,220][53252] Updated weights for policy 0, policy_version 11640 (0.0008) [2023-10-10 05:06:40,379][53268] Updated weights for policy 1, policy_version 11620 (0.0008) [2023-10-10 05:06:40,739][53268] Updated weights for policy 1, policy_version 11630 (0.0008) [2023-10-10 05:06:41,119][53268] Updated weights for policy 1, policy_version 11640 (0.0010) [2023-10-10 05:06:41,336][53252] Updated weights for policy 0, policy_version 11650 (0.0009) [2023-10-10 05:06:41,710][53252] Updated weights for policy 0, policy_version 11660 (0.0008) [2023-10-10 05:06:41,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23855104. Throughput: 0: 1684.4, 1: 1656.3. Samples: 5974754. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) [2023-10-10 05:06:41,785][52050] Avg episode reward: [(0, '11.720'), (1, '12.200')] [2023-10-10 05:06:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000011648_11927552.pth... [2023-10-10 05:06:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000010080_10321920.pth [2023-10-10 05:06:42,082][53252] Updated weights for policy 0, policy_version 11670 (0.0008) [2023-10-10 05:06:42,462][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000011680_11960320.pth... [2023-10-10 05:06:42,463][53252] Updated weights for policy 0, policy_version 11680 (0.0008) [2023-10-10 05:06:42,491][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000010080_10321920.pth [2023-10-10 05:06:42,495][52846] Saving new best policy, reward=11.720! 
[2023-10-10 05:06:45,158][53268] Updated weights for policy 1, policy_version 11650 (0.0007) [2023-10-10 05:06:45,531][53268] Updated weights for policy 1, policy_version 11660 (0.0010) [2023-10-10 05:06:45,896][53268] Updated weights for policy 1, policy_version 11670 (0.0008) [2023-10-10 05:06:46,265][53268] Updated weights for policy 1, policy_version 11680 (0.0008) [2023-10-10 05:06:46,595][53252] Updated weights for policy 0, policy_version 11690 (0.0008) [2023-10-10 05:06:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 23920640. Throughput: 0: 1686.3, 1: 1679.0. Samples: 5985078. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 05:06:46,784][52050] Avg episode reward: [(0, '12.250'), (1, '12.340')] [2023-10-10 05:06:46,965][53252] Updated weights for policy 0, policy_version 11700 (0.0007) [2023-10-10 05:06:47,337][53252] Updated weights for policy 0, policy_version 11710 (0.0008) [2023-10-10 05:06:47,410][52846] Saving new best policy, reward=12.250! [2023-10-10 05:06:50,345][53268] Updated weights for policy 1, policy_version 11690 (0.0010) [2023-10-10 05:06:50,714][53268] Updated weights for policy 1, policy_version 11700 (0.0007) [2023-10-10 05:06:51,082][53268] Updated weights for policy 1, policy_version 11710 (0.0007) [2023-10-10 05:06:51,409][53252] Updated weights for policy 0, policy_version 11720 (0.0007) [2023-10-10 05:06:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23986176. Throughput: 0: 1683.0, 1: 1680.3. Samples: 6005760. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 05:06:51,784][53252] Updated weights for policy 0, policy_version 11730 (0.0007) [2023-10-10 05:06:51,784][52050] Avg episode reward: [(0, '11.650'), (1, '12.100')] [2023-10-10 05:06:52,151][53252] Updated weights for policy 0, policy_version 11740 (0.0007) [2023-10-10 05:06:55,243][53268] Updated weights for policy 1, policy_version 11720 (0.0008) [2023-10-10 05:06:55,616][53268] Updated weights for policy 1, policy_version 11730 (0.0011) [2023-10-10 05:06:55,990][53268] Updated weights for policy 1, policy_version 11740 (0.0011) [2023-10-10 05:06:56,202][53252] Updated weights for policy 0, policy_version 11750 (0.0007) [2023-10-10 05:06:56,574][53252] Updated weights for policy 0, policy_version 11760 (0.0008) [2023-10-10 05:06:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24051712. Throughput: 0: 1672.8, 1: 1666.7. Samples: 6025042. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 05:06:56,784][52050] Avg episode reward: [(0, '11.240'), (1, '12.290')] [2023-10-10 05:06:56,943][53252] Updated weights for policy 0, policy_version 11770 (0.0007) [2023-10-10 05:06:59,827][53268] Updated weights for policy 1, policy_version 11750 (0.0009) [2023-10-10 05:07:00,187][53268] Updated weights for policy 1, policy_version 11760 (0.0009) [2023-10-10 05:07:00,561][53268] Updated weights for policy 1, policy_version 11770 (0.0009) [2023-10-10 05:07:01,130][53252] Updated weights for policy 0, policy_version 11780 (0.0009) [2023-10-10 05:07:01,502][53252] Updated weights for policy 0, policy_version 11790 (0.0010) [2023-10-10 05:07:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24117248. Throughput: 0: 1678.1, 1: 1695.7. Samples: 6035986. 
Policy #0 lag: (min: 9.0, avg: 21.1, max: 41.0) [2023-10-10 05:07:01,784][52050] Avg episode reward: [(0, '11.030'), (1, '11.870')] [2023-10-10 05:07:01,867][53252] Updated weights for policy 0, policy_version 11800 (0.0010) [2023-10-10 05:07:04,504][53268] Updated weights for policy 1, policy_version 11780 (0.0011) [2023-10-10 05:07:04,875][53268] Updated weights for policy 1, policy_version 11790 (0.0010) [2023-10-10 05:07:05,247][53268] Updated weights for policy 1, policy_version 11800 (0.0008) [2023-10-10 05:07:05,819][53252] Updated weights for policy 0, policy_version 11810 (0.0010) [2023-10-10 05:07:06,184][53252] Updated weights for policy 0, policy_version 11820 (0.0009) [2023-10-10 05:07:06,557][53252] Updated weights for policy 0, policy_version 11830 (0.0007) [2023-10-10 05:07:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 24182784. Throughput: 0: 1686.0, 1: 1680.1. Samples: 6056200. Policy #0 lag: (min: 9.0, avg: 21.1, max: 41.0) [2023-10-10 05:07:06,784][52050] Avg episode reward: [(0, '10.640'), (1, '13.100')] [2023-10-10 05:07:06,936][53252] Updated weights for policy 0, policy_version 11840 (0.0007) [2023-10-10 05:07:09,336][53268] Updated weights for policy 1, policy_version 11810 (0.0008) [2023-10-10 05:07:09,700][53268] Updated weights for policy 1, policy_version 11820 (0.0007) [2023-10-10 05:07:10,072][53268] Updated weights for policy 1, policy_version 11830 (0.0008) [2023-10-10 05:07:10,439][53268] Updated weights for policy 1, policy_version 11840 (0.0009) [2023-10-10 05:07:11,071][53252] Updated weights for policy 0, policy_version 11850 (0.0008) [2023-10-10 05:07:11,446][53252] Updated weights for policy 0, policy_version 11860 (0.0008) [2023-10-10 05:07:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 24248320. Throughput: 0: 1672.5, 1: 1682.6. Samples: 6075610. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-10 05:07:11,784][52050] Avg episode reward: [(0, '10.720'), (1, '12.550')] [2023-10-10 05:07:11,809][53252] Updated weights for policy 0, policy_version 11870 (0.0008) [2023-10-10 05:07:14,413][53268] Updated weights for policy 1, policy_version 11850 (0.0008) [2023-10-10 05:07:14,780][53268] Updated weights for policy 1, policy_version 11860 (0.0010) [2023-10-10 05:07:15,156][53268] Updated weights for policy 1, policy_version 11870 (0.0010) [2023-10-10 05:07:15,872][53252] Updated weights for policy 0, policy_version 11880 (0.0008) [2023-10-10 05:07:16,241][53252] Updated weights for policy 0, policy_version 11890 (0.0009) [2023-10-10 05:07:16,623][53252] Updated weights for policy 0, policy_version 11900 (0.0008) [2023-10-10 05:07:16,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 24346624. Throughput: 0: 1686.4, 1: 1694.9. Samples: 6086406. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-10 05:07:16,784][52050] Avg episode reward: [(0, '11.460'), (1, '12.700')] [2023-10-10 05:07:19,230][53268] Updated weights for policy 1, policy_version 11880 (0.0010) [2023-10-10 05:07:19,609][53268] Updated weights for policy 1, policy_version 11890 (0.0010) [2023-10-10 05:07:19,986][53268] Updated weights for policy 1, policy_version 11900 (0.0008) [2023-10-10 05:07:20,748][53252] Updated weights for policy 0, policy_version 11910 (0.0009) [2023-10-10 05:07:21,126][53252] Updated weights for policy 0, policy_version 11920 (0.0010) [2023-10-10 05:07:21,491][53252] Updated weights for policy 0, policy_version 11930 (0.0010) [2023-10-10 05:07:21,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 24412160. Throughput: 0: 1688.5, 1: 1666.4. Samples: 6106024. 
Policy #0 lag: (min: 1.0, avg: 16.2, max: 33.0) [2023-10-10 05:07:21,784][52050] Avg episode reward: [(0, '12.330'), (1, '12.470')] [2023-10-10 05:07:21,785][52846] Saving new best policy, reward=12.330! [2023-10-10 05:07:24,290][53268] Updated weights for policy 1, policy_version 11910 (0.0009) [2023-10-10 05:07:24,684][53268] Updated weights for policy 1, policy_version 11920 (0.0007) [2023-10-10 05:07:25,052][53268] Updated weights for policy 1, policy_version 11930 (0.0009) [2023-10-10 05:07:25,720][53252] Updated weights for policy 0, policy_version 11940 (0.0008) [2023-10-10 05:07:26,102][53252] Updated weights for policy 0, policy_version 11950 (0.0008) [2023-10-10 05:07:26,462][53252] Updated weights for policy 0, policy_version 11960 (0.0008) [2023-10-10 05:07:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 24477696. Throughput: 0: 1666.9, 1: 1680.5. Samples: 6125384. Policy #0 lag: (min: 1.0, avg: 16.2, max: 33.0) [2023-10-10 05:07:26,784][52050] Avg episode reward: [(0, '11.040'), (1, '12.910')] [2023-10-10 05:07:28,860][53268] Updated weights for policy 1, policy_version 11940 (0.0010) [2023-10-10 05:07:29,234][53268] Updated weights for policy 1, policy_version 11950 (0.0010) [2023-10-10 05:07:29,591][53268] Updated weights for policy 1, policy_version 11960 (0.0009) [2023-10-10 05:07:30,327][53252] Updated weights for policy 0, policy_version 11970 (0.0010) [2023-10-10 05:07:30,703][53252] Updated weights for policy 0, policy_version 11980 (0.0010) [2023-10-10 05:07:31,073][53252] Updated weights for policy 0, policy_version 11990 (0.0010) [2023-10-10 05:07:31,447][53252] Updated weights for policy 0, policy_version 12000 (0.0007) [2023-10-10 05:07:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24543232. Throughput: 0: 1680.8, 1: 1681.0. Samples: 6136358. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 05:07:31,784][52050] Avg episode reward: [(0, '10.720'), (1, '13.640')] [2023-10-10 05:07:33,709][53268] Updated weights for policy 1, policy_version 11970 (0.0011) [2023-10-10 05:07:34,077][53268] Updated weights for policy 1, policy_version 11980 (0.0008) [2023-10-10 05:07:34,441][53268] Updated weights for policy 1, policy_version 11990 (0.0007) [2023-10-10 05:07:34,820][53268] Updated weights for policy 1, policy_version 12000 (0.0007) [2023-10-10 05:07:35,664][53252] Updated weights for policy 0, policy_version 12010 (0.0010) [2023-10-10 05:07:36,057][53252] Updated weights for policy 0, policy_version 12020 (0.0009) [2023-10-10 05:07:36,423][53252] Updated weights for policy 0, policy_version 12030 (0.0009) [2023-10-10 05:07:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24608768. Throughput: 0: 1673.6, 1: 1668.6. Samples: 6156160. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 05:07:36,785][52050] Avg episode reward: [(0, '11.310'), (1, '13.390')] [2023-10-10 05:07:38,706][53268] Updated weights for policy 1, policy_version 12010 (0.0009) [2023-10-10 05:07:39,074][53268] Updated weights for policy 1, policy_version 12020 (0.0008) [2023-10-10 05:07:39,456][53268] Updated weights for policy 1, policy_version 12030 (0.0008) [2023-10-10 05:07:40,713][53252] Updated weights for policy 0, policy_version 12040 (0.0008) [2023-10-10 05:07:41,082][53252] Updated weights for policy 0, policy_version 12050 (0.0009) [2023-10-10 05:07:41,463][53252] Updated weights for policy 0, policy_version 12060 (0.0009) [2023-10-10 05:07:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 24674304. Throughput: 0: 1659.7, 1: 1694.8. Samples: 6175994. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:07:41,784][52050] Avg episode reward: [(0, '11.260'), (1, '13.850')] [2023-10-10 05:07:43,470][53268] Updated weights for policy 1, policy_version 12040 (0.0007) [2023-10-10 05:07:43,838][53268] Updated weights for policy 1, policy_version 12050 (0.0007) [2023-10-10 05:07:44,205][53268] Updated weights for policy 1, policy_version 12060 (0.0008) [2023-10-10 05:07:45,438][53252] Updated weights for policy 0, policy_version 12070 (0.0008) [2023-10-10 05:07:45,803][53252] Updated weights for policy 0, policy_version 12080 (0.0008) [2023-10-10 05:07:46,172][53252] Updated weights for policy 0, policy_version 12090 (0.0009) [2023-10-10 05:07:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 24739840. Throughput: 0: 1674.8, 1: 1666.6. Samples: 6186348. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:07:46,784][52050] Avg episode reward: [(0, '12.630'), (1, '12.870')] [2023-10-10 05:07:46,785][52846] Saving new best policy, reward=12.630! [2023-10-10 05:07:48,274][53268] Updated weights for policy 1, policy_version 12070 (0.0008) [2023-10-10 05:07:48,637][53268] Updated weights for policy 1, policy_version 12080 (0.0007) [2023-10-10 05:07:49,015][53268] Updated weights for policy 1, policy_version 12090 (0.0009) [2023-10-10 05:07:50,178][53252] Updated weights for policy 0, policy_version 12100 (0.0007) [2023-10-10 05:07:50,545][53252] Updated weights for policy 0, policy_version 12110 (0.0010) [2023-10-10 05:07:50,924][53252] Updated weights for policy 0, policy_version 12120 (0.0009) [2023-10-10 05:07:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24805376. Throughput: 0: 1661.6, 1: 1677.9. Samples: 6206478. 
Policy #0 lag: (min: 31.0, avg: 40.7, max: 63.0) [2023-10-10 05:07:51,784][52050] Avg episode reward: [(0, '12.860'), (1, '12.180')] [2023-10-10 05:07:51,786][52846] Saving new best policy, reward=12.860! [2023-10-10 05:07:53,146][53268] Updated weights for policy 1, policy_version 12100 (0.0008) [2023-10-10 05:07:53,512][53268] Updated weights for policy 1, policy_version 12110 (0.0008) [2023-10-10 05:07:53,871][53268] Updated weights for policy 1, policy_version 12120 (0.0010) [2023-10-10 05:07:54,948][53252] Updated weights for policy 0, policy_version 12130 (0.0009) [2023-10-10 05:07:55,327][53252] Updated weights for policy 0, policy_version 12140 (0.0009) [2023-10-10 05:07:55,686][53252] Updated weights for policy 0, policy_version 12150 (0.0010) [2023-10-10 05:07:56,067][53252] Updated weights for policy 0, policy_version 12160 (0.0010) [2023-10-10 05:07:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24870912. Throughput: 0: 1658.5, 1: 1688.3. Samples: 6226218. Policy #0 lag: (min: 31.0, avg: 40.7, max: 63.0) [2023-10-10 05:07:56,784][52050] Avg episode reward: [(0, '11.550'), (1, '12.440')] [2023-10-10 05:07:57,848][53268] Updated weights for policy 1, policy_version 12130 (0.0008) [2023-10-10 05:07:58,219][53268] Updated weights for policy 1, policy_version 12140 (0.0009) [2023-10-10 05:07:58,598][53268] Updated weights for policy 1, policy_version 12150 (0.0009) [2023-10-10 05:07:58,956][53268] Updated weights for policy 1, policy_version 12160 (0.0008) [2023-10-10 05:08:00,088][53252] Updated weights for policy 0, policy_version 12170 (0.0007) [2023-10-10 05:08:00,463][53252] Updated weights for policy 0, policy_version 12180 (0.0008) [2023-10-10 05:08:00,833][53252] Updated weights for policy 0, policy_version 12190 (0.0007) [2023-10-10 05:08:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24936448. Throughput: 0: 1679.5, 1: 1663.2. Samples: 6236832. 
Policy #0 lag: (min: 31.0, avg: 40.7, max: 63.0) [2023-10-10 05:08:01,784][52050] Avg episode reward: [(0, '11.600'), (1, '12.450')] [2023-10-10 05:08:03,307][53268] Updated weights for policy 1, policy_version 12170 (0.0009) [2023-10-10 05:08:03,678][53268] Updated weights for policy 1, policy_version 12180 (0.0008) [2023-10-10 05:08:04,038][53268] Updated weights for policy 1, policy_version 12190 (0.0011) [2023-10-10 05:08:04,588][53252] Updated weights for policy 0, policy_version 12200 (0.0009) [2023-10-10 05:08:04,961][53252] Updated weights for policy 0, policy_version 12210 (0.0008) [2023-10-10 05:08:05,332][53252] Updated weights for policy 0, policy_version 12220 (0.0009) [2023-10-10 05:08:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 25001984. Throughput: 0: 1653.8, 1: 1687.6. Samples: 6256384. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:08:06,784][52050] Avg episode reward: [(0, '10.280'), (1, '11.860')] [2023-10-10 05:08:08,201][53268] Updated weights for policy 1, policy_version 12200 (0.0010) [2023-10-10 05:08:08,576][53268] Updated weights for policy 1, policy_version 12210 (0.0008) [2023-10-10 05:08:08,949][53268] Updated weights for policy 1, policy_version 12220 (0.0008) [2023-10-10 05:08:09,524][53252] Updated weights for policy 0, policy_version 12230 (0.0011) [2023-10-10 05:08:09,903][53252] Updated weights for policy 0, policy_version 12240 (0.0009) [2023-10-10 05:08:10,267][53252] Updated weights for policy 0, policy_version 12250 (0.0009) [2023-10-10 05:08:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 25067520. Throughput: 0: 1673.2, 1: 1695.6. Samples: 6276976. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:08:11,784][52050] Avg episode reward: [(0, '10.630'), (1, '12.800')] [2023-10-10 05:08:13,048][53268] Updated weights for policy 1, policy_version 12230 (0.0008) [2023-10-10 05:08:13,427][53268] Updated weights for policy 1, policy_version 12240 (0.0007) [2023-10-10 05:08:13,793][53268] Updated weights for policy 1, policy_version 12250 (0.0008) [2023-10-10 05:08:14,520][53252] Updated weights for policy 0, policy_version 12260 (0.0009) [2023-10-10 05:08:14,904][53252] Updated weights for policy 0, policy_version 12270 (0.0009) [2023-10-10 05:08:15,270][53252] Updated weights for policy 0, policy_version 12280 (0.0008) [2023-10-10 05:08:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25133056. Throughput: 0: 1683.6, 1: 1667.2. Samples: 6287148. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 05:08:16,784][52050] Avg episode reward: [(0, '10.010'), (1, '11.240')] [2023-10-10 05:08:17,845][53268] Updated weights for policy 1, policy_version 12260 (0.0009) [2023-10-10 05:08:18,209][53268] Updated weights for policy 1, policy_version 12270 (0.0010) [2023-10-10 05:08:18,579][53268] Updated weights for policy 1, policy_version 12280 (0.0010) [2023-10-10 05:08:19,433][53252] Updated weights for policy 0, policy_version 12290 (0.0010) [2023-10-10 05:08:19,812][53252] Updated weights for policy 0, policy_version 12300 (0.0007) [2023-10-10 05:08:20,182][53252] Updated weights for policy 0, policy_version 12310 (0.0009) [2023-10-10 05:08:20,551][53252] Updated weights for policy 0, policy_version 12320 (0.0010) [2023-10-10 05:08:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25198592. Throughput: 0: 1663.4, 1: 1683.3. Samples: 6306762. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 05:08:21,784][52050] Avg episode reward: [(0, '11.230'), (1, '12.320')] [2023-10-10 05:08:22,783][53268] Updated weights for policy 1, policy_version 12290 (0.0010) [2023-10-10 05:08:23,164][53268] Updated weights for policy 1, policy_version 12300 (0.0009) [2023-10-10 05:08:23,538][53268] Updated weights for policy 1, policy_version 12310 (0.0009) [2023-10-10 05:08:23,903][53268] Updated weights for policy 1, policy_version 12320 (0.0008) [2023-10-10 05:08:24,629][53252] Updated weights for policy 0, policy_version 12330 (0.0010) [2023-10-10 05:08:25,001][53252] Updated weights for policy 0, policy_version 12340 (0.0007) [2023-10-10 05:08:25,380][53252] Updated weights for policy 0, policy_version 12350 (0.0009) [2023-10-10 05:08:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 25264128. Throughput: 0: 1678.5, 1: 1676.2. Samples: 6326958. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 05:08:26,785][52050] Avg episode reward: [(0, '11.890'), (1, '12.420')] [2023-10-10 05:08:28,073][53268] Updated weights for policy 1, policy_version 12330 (0.0010) [2023-10-10 05:08:28,438][53268] Updated weights for policy 1, policy_version 12340 (0.0010) [2023-10-10 05:08:28,803][53268] Updated weights for policy 1, policy_version 12350 (0.0012) [2023-10-10 05:08:29,407][53252] Updated weights for policy 0, policy_version 12360 (0.0008) [2023-10-10 05:08:29,788][53252] Updated weights for policy 0, policy_version 12370 (0.0008) [2023-10-10 05:08:30,165][53252] Updated weights for policy 0, policy_version 12380 (0.0008) [2023-10-10 05:08:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25329664. Throughput: 0: 1680.2, 1: 1667.4. Samples: 6336992. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:08:31,784][52050] Avg episode reward: [(0, '12.050'), (1, '12.020')] [2023-10-10 05:08:32,803][53268] Updated weights for policy 1, policy_version 12360 (0.0010) [2023-10-10 05:08:33,179][53268] Updated weights for policy 1, policy_version 12370 (0.0011) [2023-10-10 05:08:33,536][53268] Updated weights for policy 1, policy_version 12380 (0.0010) [2023-10-10 05:08:34,315][53252] Updated weights for policy 0, policy_version 12390 (0.0007) [2023-10-10 05:08:34,695][53252] Updated weights for policy 0, policy_version 12400 (0.0007) [2023-10-10 05:08:35,059][53252] Updated weights for policy 0, policy_version 12410 (0.0009) [2023-10-10 05:08:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25395200. Throughput: 0: 1661.4, 1: 1675.6. Samples: 6356644. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:08:36,784][52050] Avg episode reward: [(0, '13.050'), (1, '13.050')] [2023-10-10 05:08:36,786][52846] Saving new best policy, reward=13.050! [2023-10-10 05:08:37,603][53268] Updated weights for policy 1, policy_version 12390 (0.0010) [2023-10-10 05:08:37,970][53268] Updated weights for policy 1, policy_version 12400 (0.0008) [2023-10-10 05:08:38,342][53268] Updated weights for policy 1, policy_version 12410 (0.0011) [2023-10-10 05:08:39,112][53252] Updated weights for policy 0, policy_version 12420 (0.0010) [2023-10-10 05:08:39,481][53252] Updated weights for policy 0, policy_version 12430 (0.0007) [2023-10-10 05:08:39,852][53252] Updated weights for policy 0, policy_version 12440 (0.0007) [2023-10-10 05:08:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25460736. Throughput: 0: 1680.0, 1: 1676.7. Samples: 6377272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:08:41,784][52050] Avg episode reward: [(0, '12.240'), (1, '13.170')] [2023-10-10 05:08:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000012416_12713984.pth... [2023-10-10 05:08:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000012448_12746752.pth... [2023-10-10 05:08:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000010848_11108352.pth [2023-10-10 05:08:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000010880_11141120.pth [2023-10-10 05:08:42,394][53268] Updated weights for policy 1, policy_version 12420 (0.0009) [2023-10-10 05:08:42,765][53268] Updated weights for policy 1, policy_version 12430 (0.0008) [2023-10-10 05:08:43,137][53268] Updated weights for policy 1, policy_version 12440 (0.0008) [2023-10-10 05:08:44,028][53252] Updated weights for policy 0, policy_version 12450 (0.0009) [2023-10-10 05:08:44,392][53252] Updated weights for policy 0, policy_version 12460 (0.0008) [2023-10-10 05:08:44,760][53252] Updated weights for policy 0, policy_version 12470 (0.0008) [2023-10-10 05:08:45,140][53252] Updated weights for policy 0, policy_version 12480 (0.0008) [2023-10-10 05:08:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25526272. Throughput: 0: 1665.0, 1: 1676.5. Samples: 6387198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:08:46,784][52050] Avg episode reward: [(0, '12.330'), (1, '13.730')] [2023-10-10 05:08:47,087][53268] Updated weights for policy 1, policy_version 12450 (0.0008) [2023-10-10 05:08:47,460][53268] Updated weights for policy 1, policy_version 12460 (0.0010) [2023-10-10 05:08:47,828][53268] Updated weights for policy 1, policy_version 12470 (0.0008) [2023-10-10 05:08:48,190][53268] Updated weights for policy 1, policy_version 12480 (0.0007) [2023-10-10 05:08:49,166][53252] Updated weights for policy 0, policy_version 12490 (0.0009) [2023-10-10 05:08:49,540][53252] Updated weights for policy 0, policy_version 12500 (0.0008) [2023-10-10 05:08:49,902][53252] Updated weights for policy 0, policy_version 12510 (0.0008) [2023-10-10 05:08:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25591808. Throughput: 0: 1668.0, 1: 1683.3. Samples: 6407194. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:08:51,784][52050] Avg episode reward: [(0, '11.090'), (1, '13.680')] [2023-10-10 05:08:52,227][53268] Updated weights for policy 1, policy_version 12490 (0.0009) [2023-10-10 05:08:52,602][53268] Updated weights for policy 1, policy_version 12500 (0.0008) [2023-10-10 05:08:52,971][53268] Updated weights for policy 1, policy_version 12510 (0.0007) [2023-10-10 05:08:54,088][53252] Updated weights for policy 0, policy_version 12520 (0.0007) [2023-10-10 05:08:54,475][53252] Updated weights for policy 0, policy_version 12530 (0.0007) [2023-10-10 05:08:54,850][53252] Updated weights for policy 0, policy_version 12540 (0.0008) [2023-10-10 05:08:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25657344. Throughput: 0: 1672.3, 1: 1679.6. Samples: 6427810. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:08:56,784][52050] Avg episode reward: [(0, '11.370'), (1, '13.780')] [2023-10-10 05:08:57,065][53268] Updated weights for policy 1, policy_version 12520 (0.0009) [2023-10-10 05:08:57,446][53268] Updated weights for policy 1, policy_version 12530 (0.0008) [2023-10-10 05:08:57,810][53268] Updated weights for policy 1, policy_version 12540 (0.0010) [2023-10-10 05:08:58,676][53252] Updated weights for policy 0, policy_version 12550 (0.0009) [2023-10-10 05:08:59,040][53252] Updated weights for policy 0, policy_version 12560 (0.0010) [2023-10-10 05:08:59,422][53252] Updated weights for policy 0, policy_version 12570 (0.0008) [2023-10-10 05:09:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25722880. Throughput: 0: 1656.7, 1: 1681.1. Samples: 6437348. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 05:09:01,784][52050] Avg episode reward: [(0, '10.840'), (1, '13.560')] [2023-10-10 05:09:01,887][53268] Updated weights for policy 1, policy_version 12550 (0.0008) [2023-10-10 05:09:02,281][53268] Updated weights for policy 1, policy_version 12560 (0.0009) [2023-10-10 05:09:02,652][53268] Updated weights for policy 1, policy_version 12570 (0.0007) [2023-10-10 05:09:03,436][53252] Updated weights for policy 0, policy_version 12580 (0.0009) [2023-10-10 05:09:03,811][53252] Updated weights for policy 0, policy_version 12590 (0.0007) [2023-10-10 05:09:04,177][53252] Updated weights for policy 0, policy_version 12600 (0.0007) [2023-10-10 05:09:06,548][53268] Updated weights for policy 1, policy_version 12580 (0.0008) [2023-10-10 05:09:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25788416. Throughput: 0: 1675.2, 1: 1683.9. Samples: 6457922. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 05:09:06,784][52050] Avg episode reward: [(0, '11.170'), (1, '13.160')] [2023-10-10 05:09:06,914][53268] Updated weights for policy 1, policy_version 12590 (0.0009) [2023-10-10 05:09:07,279][53268] Updated weights for policy 1, policy_version 12600 (0.0010) [2023-10-10 05:09:08,220][53252] Updated weights for policy 0, policy_version 12610 (0.0007) [2023-10-10 05:09:08,599][53252] Updated weights for policy 0, policy_version 12620 (0.0009) [2023-10-10 05:09:08,974][53252] Updated weights for policy 0, policy_version 12630 (0.0007) [2023-10-10 05:09:09,347][53252] Updated weights for policy 0, policy_version 12640 (0.0009) [2023-10-10 05:09:11,327][53268] Updated weights for policy 1, policy_version 12610 (0.0007) [2023-10-10 05:09:11,684][53268] Updated weights for policy 1, policy_version 12620 (0.0009) [2023-10-10 05:09:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25853952. Throughput: 0: 1684.3, 1: 1686.5. Samples: 6478644. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 05:09:11,784][52050] Avg episode reward: [(0, '11.290'), (1, '13.970')] [2023-10-10 05:09:12,049][53268] Updated weights for policy 1, policy_version 12630 (0.0011) [2023-10-10 05:09:12,416][53268] Updated weights for policy 1, policy_version 12640 (0.0007) [2023-10-10 05:09:13,380][53252] Updated weights for policy 0, policy_version 12650 (0.0007) [2023-10-10 05:09:13,749][53252] Updated weights for policy 0, policy_version 12660 (0.0008) [2023-10-10 05:09:14,122][53252] Updated weights for policy 0, policy_version 12670 (0.0009) [2023-10-10 05:09:16,415][53268] Updated weights for policy 1, policy_version 12650 (0.0007) [2023-10-10 05:09:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25919488. Throughput: 0: 1660.8, 1: 1689.0. Samples: 6487730. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:09:16,784][52050] Avg episode reward: [(0, '10.670'), (1, '13.760')] [2023-10-10 05:09:16,784][53268] Updated weights for policy 1, policy_version 12660 (0.0007) [2023-10-10 05:09:17,144][53268] Updated weights for policy 1, policy_version 12670 (0.0008) [2023-10-10 05:09:18,108][53252] Updated weights for policy 0, policy_version 12680 (0.0008) [2023-10-10 05:09:18,473][53252] Updated weights for policy 0, policy_version 12690 (0.0011) [2023-10-10 05:09:18,855][53252] Updated weights for policy 0, policy_version 12700 (0.0007) [2023-10-10 05:09:21,250][53268] Updated weights for policy 1, policy_version 12680 (0.0010) [2023-10-10 05:09:21,626][53268] Updated weights for policy 1, policy_version 12690 (0.0008) [2023-10-10 05:09:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25985024. Throughput: 0: 1686.6, 1: 1687.6. Samples: 6508480. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:09:21,784][52050] Avg episode reward: [(0, '12.470'), (1, '14.830')] [2023-10-10 05:09:21,991][53268] Updated weights for policy 1, policy_version 12700 (0.0008) [2023-10-10 05:09:22,143][53061] Saving new best policy, reward=14.830! [2023-10-10 05:09:22,906][53252] Updated weights for policy 0, policy_version 12710 (0.0008) [2023-10-10 05:09:23,277][53252] Updated weights for policy 0, policy_version 12720 (0.0007) [2023-10-10 05:09:23,660][53252] Updated weights for policy 0, policy_version 12730 (0.0007) [2023-10-10 05:09:26,131][53268] Updated weights for policy 1, policy_version 12710 (0.0011) [2023-10-10 05:09:26,490][53268] Updated weights for policy 1, policy_version 12720 (0.0008) [2023-10-10 05:09:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26050560. Throughput: 0: 1689.2, 1: 1681.0. Samples: 6528934. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:09:26,784][52050] Avg episode reward: [(0, '11.730'), (1, '14.000')] [2023-10-10 05:09:26,859][53268] Updated weights for policy 1, policy_version 12730 (0.0007) [2023-10-10 05:09:27,655][53252] Updated weights for policy 0, policy_version 12740 (0.0007) [2023-10-10 05:09:28,038][53252] Updated weights for policy 0, policy_version 12750 (0.0009) [2023-10-10 05:09:28,405][53252] Updated weights for policy 0, policy_version 12760 (0.0008) [2023-10-10 05:09:31,047][53268] Updated weights for policy 1, policy_version 12740 (0.0009) [2023-10-10 05:09:31,416][53268] Updated weights for policy 1, policy_version 12750 (0.0010) [2023-10-10 05:09:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 26116096. Throughput: 0: 1672.0, 1: 1686.5. Samples: 6538334. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-10 05:09:31,784][53268] Updated weights for policy 1, policy_version 12760 (0.0010) [2023-10-10 05:09:31,784][52050] Avg episode reward: [(0, '12.530'), (1, '14.240')] [2023-10-10 05:09:32,549][53252] Updated weights for policy 0, policy_version 12770 (0.0007) [2023-10-10 05:09:32,928][53252] Updated weights for policy 0, policy_version 12780 (0.0007) [2023-10-10 05:09:33,293][53252] Updated weights for policy 0, policy_version 12790 (0.0009) [2023-10-10 05:09:33,667][53252] Updated weights for policy 0, policy_version 12800 (0.0007) [2023-10-10 05:09:35,922][53268] Updated weights for policy 1, policy_version 12770 (0.0009) [2023-10-10 05:09:36,301][53268] Updated weights for policy 1, policy_version 12780 (0.0007) [2023-10-10 05:09:36,662][53268] Updated weights for policy 1, policy_version 12790 (0.0009) [2023-10-10 05:09:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26181632. Throughput: 0: 1692.1, 1: 1684.7. Samples: 6559148. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-10 05:09:36,784][52050] Avg episode reward: [(0, '13.980'), (1, '13.900')] [2023-10-10 05:09:36,786][52846] Saving new best policy, reward=13.980! [2023-10-10 05:09:37,033][53268] Updated weights for policy 1, policy_version 12800 (0.0007) [2023-10-10 05:09:37,802][53252] Updated weights for policy 0, policy_version 12810 (0.0007) [2023-10-10 05:09:38,176][53252] Updated weights for policy 0, policy_version 12820 (0.0008) [2023-10-10 05:09:38,548][53252] Updated weights for policy 0, policy_version 12830 (0.0011) [2023-10-10 05:09:40,939][53268] Updated weights for policy 1, policy_version 12810 (0.0008) [2023-10-10 05:09:41,311][53268] Updated weights for policy 1, policy_version 12820 (0.0007) [2023-10-10 05:09:41,681][53268] Updated weights for policy 1, policy_version 12830 (0.0007) [2023-10-10 05:09:41,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26279936. Throughput: 0: 1692.8, 1: 1677.9. Samples: 6579490. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-10 05:09:41,784][52050] Avg episode reward: [(0, '13.780'), (1, '13.740')] [2023-10-10 05:09:42,658][53252] Updated weights for policy 0, policy_version 12840 (0.0008) [2023-10-10 05:09:43,029][53252] Updated weights for policy 0, policy_version 12850 (0.0009) [2023-10-10 05:09:43,410][53252] Updated weights for policy 0, policy_version 12860 (0.0008) [2023-10-10 05:09:45,774][53268] Updated weights for policy 1, policy_version 12840 (0.0011) [2023-10-10 05:09:46,131][53268] Updated weights for policy 1, policy_version 12850 (0.0011) [2023-10-10 05:09:46,500][53268] Updated weights for policy 1, policy_version 12860 (0.0009) [2023-10-10 05:09:46,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26345472. Throughput: 0: 1677.6, 1: 1693.3. Samples: 6589040. 
Policy #0 lag: (min: 24.0, avg: 52.8, max: 56.0) [2023-10-10 05:09:46,784][52050] Avg episode reward: [(0, '13.200'), (1, '14.420')] [2023-10-10 05:09:47,529][53252] Updated weights for policy 0, policy_version 12870 (0.0009) [2023-10-10 05:09:47,888][53252] Updated weights for policy 0, policy_version 12880 (0.0009) [2023-10-10 05:09:48,258][53252] Updated weights for policy 0, policy_version 12890 (0.0010) [2023-10-10 05:09:50,696][53268] Updated weights for policy 1, policy_version 12870 (0.0011) [2023-10-10 05:09:51,063][53268] Updated weights for policy 1, policy_version 12880 (0.0009) [2023-10-10 05:09:51,436][53268] Updated weights for policy 1, policy_version 12890 (0.0008) [2023-10-10 05:09:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26411008. Throughput: 0: 1681.6, 1: 1686.5. Samples: 6609488. Policy #0 lag: (min: 24.0, avg: 52.8, max: 56.0) [2023-10-10 05:09:51,784][52050] Avg episode reward: [(0, '12.550'), (1, '13.590')] [2023-10-10 05:09:52,286][53252] Updated weights for policy 0, policy_version 12900 (0.0009) [2023-10-10 05:09:52,655][53252] Updated weights for policy 0, policy_version 12910 (0.0009) [2023-10-10 05:09:53,034][53252] Updated weights for policy 0, policy_version 12920 (0.0010) [2023-10-10 05:09:55,441][53268] Updated weights for policy 1, policy_version 12900 (0.0008) [2023-10-10 05:09:55,804][53268] Updated weights for policy 1, policy_version 12910 (0.0010) [2023-10-10 05:09:56,172][53268] Updated weights for policy 1, policy_version 12920 (0.0011) [2023-10-10 05:09:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26476544. Throughput: 0: 1681.7, 1: 1668.8. Samples: 6629418. 
Policy #0 lag: (min: 24.0, avg: 52.8, max: 56.0) [2023-10-10 05:09:56,784][52050] Avg episode reward: [(0, '12.600'), (1, '13.730')] [2023-10-10 05:09:57,007][53252] Updated weights for policy 0, policy_version 12930 (0.0009) [2023-10-10 05:09:57,378][53252] Updated weights for policy 0, policy_version 12940 (0.0010) [2023-10-10 05:09:57,759][53252] Updated weights for policy 0, policy_version 12950 (0.0011) [2023-10-10 05:09:58,116][53252] Updated weights for policy 0, policy_version 12960 (0.0011) [2023-10-10 05:10:00,010][53268] Updated weights for policy 1, policy_version 12930 (0.0011) [2023-10-10 05:10:00,376][53268] Updated weights for policy 1, policy_version 12940 (0.0010) [2023-10-10 05:10:00,756][53268] Updated weights for policy 1, policy_version 12950 (0.0011) [2023-10-10 05:10:01,132][53268] Updated weights for policy 1, policy_version 12960 (0.0009) [2023-10-10 05:10:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26542080. Throughput: 0: 1679.4, 1: 1689.1. Samples: 6639312. Policy #0 lag: (min: 1.0, avg: 11.9, max: 33.0) [2023-10-10 05:10:01,785][52050] Avg episode reward: [(0, '12.610'), (1, '14.070')] [2023-10-10 05:10:02,232][53252] Updated weights for policy 0, policy_version 12970 (0.0009) [2023-10-10 05:10:02,598][53252] Updated weights for policy 0, policy_version 12980 (0.0008) [2023-10-10 05:10:02,977][53252] Updated weights for policy 0, policy_version 12990 (0.0007) [2023-10-10 05:10:05,255][53268] Updated weights for policy 1, policy_version 12970 (0.0008) [2023-10-10 05:10:05,623][53268] Updated weights for policy 1, policy_version 12980 (0.0007) [2023-10-10 05:10:05,982][53268] Updated weights for policy 1, policy_version 12990 (0.0008) [2023-10-10 05:10:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26607616. Throughput: 0: 1684.8, 1: 1679.4. Samples: 6659866. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 33.0) [2023-10-10 05:10:06,784][52050] Avg episode reward: [(0, '13.310'), (1, '13.540')] [2023-10-10 05:10:06,835][53252] Updated weights for policy 0, policy_version 13000 (0.0007) [2023-10-10 05:10:07,212][53252] Updated weights for policy 0, policy_version 13010 (0.0008) [2023-10-10 05:10:07,594][53252] Updated weights for policy 0, policy_version 13020 (0.0007) [2023-10-10 05:10:10,012][53268] Updated weights for policy 1, policy_version 13000 (0.0008) [2023-10-10 05:10:10,379][53268] Updated weights for policy 1, policy_version 13010 (0.0007) [2023-10-10 05:10:10,747][53268] Updated weights for policy 1, policy_version 13020 (0.0009) [2023-10-10 05:10:11,484][53252] Updated weights for policy 0, policy_version 13030 (0.0009) [2023-10-10 05:10:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26673152. Throughput: 0: 1688.0, 1: 1666.5. Samples: 6679884. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-10 05:10:11,784][52050] Avg episode reward: [(0, '13.200'), (1, '13.900')] [2023-10-10 05:10:11,852][53252] Updated weights for policy 0, policy_version 13040 (0.0011) [2023-10-10 05:10:12,228][53252] Updated weights for policy 0, policy_version 13050 (0.0009) [2023-10-10 05:10:14,926][53268] Updated weights for policy 1, policy_version 13030 (0.0007) [2023-10-10 05:10:15,293][53268] Updated weights for policy 1, policy_version 13040 (0.0009) [2023-10-10 05:10:15,666][53268] Updated weights for policy 1, policy_version 13050 (0.0008) [2023-10-10 05:10:16,267][53252] Updated weights for policy 0, policy_version 13060 (0.0009) [2023-10-10 05:10:16,633][53252] Updated weights for policy 0, policy_version 13070 (0.0009) [2023-10-10 05:10:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26738688. Throughput: 0: 1689.3, 1: 1686.9. Samples: 6690264. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-10 05:10:16,784][52050] Avg episode reward: [(0, '12.910'), (1, '14.600')] [2023-10-10 05:10:17,014][53252] Updated weights for policy 0, policy_version 13080 (0.0009) [2023-10-10 05:10:19,777][53268] Updated weights for policy 1, policy_version 13060 (0.0009) [2023-10-10 05:10:20,147][53268] Updated weights for policy 1, policy_version 13070 (0.0007) [2023-10-10 05:10:20,522][53268] Updated weights for policy 1, policy_version 13080 (0.0008) [2023-10-10 05:10:21,173][53252] Updated weights for policy 0, policy_version 13090 (0.0009) [2023-10-10 05:10:21,539][53252] Updated weights for policy 0, policy_version 13100 (0.0009) [2023-10-10 05:10:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26804224. Throughput: 0: 1690.1, 1: 1674.5. Samples: 6710558. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-10 05:10:21,785][52050] Avg episode reward: [(0, '13.530'), (1, '14.110')] [2023-10-10 05:10:21,910][53252] Updated weights for policy 0, policy_version 13110 (0.0009) [2023-10-10 05:10:22,293][53252] Updated weights for policy 0, policy_version 13120 (0.0009) [2023-10-10 05:10:24,608][53268] Updated weights for policy 1, policy_version 13090 (0.0009) [2023-10-10 05:10:24,977][53268] Updated weights for policy 1, policy_version 13100 (0.0009) [2023-10-10 05:10:25,351][53268] Updated weights for policy 1, policy_version 13110 (0.0008) [2023-10-10 05:10:25,723][53268] Updated weights for policy 1, policy_version 13120 (0.0009) [2023-10-10 05:10:26,203][53252] Updated weights for policy 0, policy_version 13130 (0.0007) [2023-10-10 05:10:26,579][53252] Updated weights for policy 0, policy_version 13140 (0.0008) [2023-10-10 05:10:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26869760. Throughput: 0: 1681.6, 1: 1664.2. Samples: 6730050. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 05:10:26,784][52050] Avg episode reward: [(0, '12.980'), (1, '13.530')] [2023-10-10 05:10:26,954][53252] Updated weights for policy 0, policy_version 13150 (0.0008) [2023-10-10 05:10:29,756][53268] Updated weights for policy 1, policy_version 13130 (0.0008) [2023-10-10 05:10:30,133][53268] Updated weights for policy 1, policy_version 13140 (0.0007) [2023-10-10 05:10:30,502][53268] Updated weights for policy 1, policy_version 13150 (0.0010) [2023-10-10 05:10:31,093][53252] Updated weights for policy 0, policy_version 13160 (0.0010) [2023-10-10 05:10:31,464][53252] Updated weights for policy 0, policy_version 13170 (0.0007) [2023-10-10 05:10:31,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26935296. Throughput: 0: 1695.2, 1: 1682.0. Samples: 6741014. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 05:10:31,784][52050] Avg episode reward: [(0, '11.890'), (1, '13.120')] [2023-10-10 05:10:31,836][53252] Updated weights for policy 0, policy_version 13180 (0.0007) [2023-10-10 05:10:34,484][53268] Updated weights for policy 1, policy_version 13160 (0.0008) [2023-10-10 05:10:34,857][53268] Updated weights for policy 1, policy_version 13170 (0.0009) [2023-10-10 05:10:35,232][53268] Updated weights for policy 1, policy_version 13180 (0.0009) [2023-10-10 05:10:35,863][53252] Updated weights for policy 0, policy_version 13190 (0.0010) [2023-10-10 05:10:36,242][53252] Updated weights for policy 0, policy_version 13200 (0.0010) [2023-10-10 05:10:36,608][53252] Updated weights for policy 0, policy_version 13210 (0.0010) [2023-10-10 05:10:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 27000832. Throughput: 0: 1701.6, 1: 1665.0. Samples: 6760988. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 05:10:36,784][52050] Avg episode reward: [(0, '11.630'), (1, '12.030')] [2023-10-10 05:10:39,536][53268] Updated weights for policy 1, policy_version 13190 (0.0009) [2023-10-10 05:10:39,923][53268] Updated weights for policy 1, policy_version 13200 (0.0008) [2023-10-10 05:10:40,296][53268] Updated weights for policy 1, policy_version 13210 (0.0008) [2023-10-10 05:10:40,650][53252] Updated weights for policy 0, policy_version 13220 (0.0009) [2023-10-10 05:10:41,031][53252] Updated weights for policy 0, policy_version 13230 (0.0008) [2023-10-10 05:10:41,403][53252] Updated weights for policy 0, policy_version 13240 (0.0007) [2023-10-10 05:10:41,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 27099136. Throughput: 0: 1680.4, 1: 1671.8. Samples: 6780270. Policy #0 lag: (min: 11.0, avg: 16.5, max: 43.0) [2023-10-10 05:10:41,784][52050] Avg episode reward: [(0, '11.980'), (1, '13.070')] [2023-10-10 05:10:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000013216_13533184.pth... [2023-10-10 05:10:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000013248_13565952.pth... 
[2023-10-10 05:10:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000011680_11960320.pth [2023-10-10 05:10:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000011648_11927552.pth [2023-10-10 05:10:44,368][53268] Updated weights for policy 1, policy_version 13220 (0.0009) [2023-10-10 05:10:44,733][53268] Updated weights for policy 1, policy_version 13230 (0.0008) [2023-10-10 05:10:45,098][53268] Updated weights for policy 1, policy_version 13240 (0.0007) [2023-10-10 05:10:45,558][53252] Updated weights for policy 0, policy_version 13250 (0.0008) [2023-10-10 05:10:45,930][53252] Updated weights for policy 0, policy_version 13260 (0.0010) [2023-10-10 05:10:46,295][53252] Updated weights for policy 0, policy_version 13270 (0.0010) [2023-10-10 05:10:46,663][53252] Updated weights for policy 0, policy_version 13280 (0.0009) [2023-10-10 05:10:46,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 27164672. Throughput: 0: 1705.3, 1: 1676.9. Samples: 6791510. Policy #0 lag: (min: 11.0, avg: 16.5, max: 43.0) [2023-10-10 05:10:46,784][52050] Avg episode reward: [(0, '12.700'), (1, '13.740')] [2023-10-10 05:10:49,160][53268] Updated weights for policy 1, policy_version 13250 (0.0008) [2023-10-10 05:10:49,526][53268] Updated weights for policy 1, policy_version 13260 (0.0008) [2023-10-10 05:10:49,900][53268] Updated weights for policy 1, policy_version 13270 (0.0007) [2023-10-10 05:10:50,261][53268] Updated weights for policy 1, policy_version 13280 (0.0009) [2023-10-10 05:10:50,682][53252] Updated weights for policy 0, policy_version 13290 (0.0007) [2023-10-10 05:10:51,056][53252] Updated weights for policy 0, policy_version 13300 (0.0009) [2023-10-10 05:10:51,428][53252] Updated weights for policy 0, policy_version 13310 (0.0007) [2023-10-10 05:10:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 27230208. 
Throughput: 0: 1694.6, 1: 1661.6. Samples: 6810894. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:10:51,784][52050] Avg episode reward: [(0, '12.550'), (1, '13.850')] [2023-10-10 05:10:54,187][53268] Updated weights for policy 1, policy_version 13290 (0.0007) [2023-10-10 05:10:54,562][53268] Updated weights for policy 1, policy_version 13300 (0.0011) [2023-10-10 05:10:54,927][53268] Updated weights for policy 1, policy_version 13310 (0.0007) [2023-10-10 05:10:55,376][53252] Updated weights for policy 0, policy_version 13320 (0.0009) [2023-10-10 05:10:55,744][53252] Updated weights for policy 0, policy_version 13330 (0.0008) [2023-10-10 05:10:56,129][53252] Updated weights for policy 0, policy_version 13340 (0.0009) [2023-10-10 05:10:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 27295744. Throughput: 0: 1666.0, 1: 1681.7. Samples: 6830534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:10:56,784][52050] Avg episode reward: [(0, '12.480'), (1, '13.460')] [2023-10-10 05:10:59,017][53268] Updated weights for policy 1, policy_version 13320 (0.0009) [2023-10-10 05:10:59,396][53268] Updated weights for policy 1, policy_version 13330 (0.0009) [2023-10-10 05:10:59,768][53268] Updated weights for policy 1, policy_version 13340 (0.0008) [2023-10-10 05:11:00,049][53252] Updated weights for policy 0, policy_version 13350 (0.0008) [2023-10-10 05:11:00,419][53252] Updated weights for policy 0, policy_version 13360 (0.0007) [2023-10-10 05:11:00,784][53252] Updated weights for policy 0, policy_version 13370 (0.0007) [2023-10-10 05:11:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 27361280. Throughput: 0: 1698.2, 1: 1671.1. Samples: 6841880. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:01,784][52050] Avg episode reward: [(0, '12.430'), (1, '13.270')] [2023-10-10 05:11:03,795][53268] Updated weights for policy 1, policy_version 13350 (0.0009) [2023-10-10 05:11:04,166][53268] Updated weights for policy 1, policy_version 13360 (0.0009) [2023-10-10 05:11:04,525][53268] Updated weights for policy 1, policy_version 13370 (0.0008) [2023-10-10 05:11:04,869][53252] Updated weights for policy 0, policy_version 13380 (0.0009) [2023-10-10 05:11:05,245][53252] Updated weights for policy 0, policy_version 13390 (0.0007) [2023-10-10 05:11:05,620][53252] Updated weights for policy 0, policy_version 13400 (0.0009) [2023-10-10 05:11:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 27426816. Throughput: 0: 1679.2, 1: 1664.0. Samples: 6861002. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:11:06,784][52050] Avg episode reward: [(0, '13.890'), (1, '13.440')] [2023-10-10 05:11:08,638][53268] Updated weights for policy 1, policy_version 13380 (0.0008) [2023-10-10 05:11:09,004][53268] Updated weights for policy 1, policy_version 13390 (0.0009) [2023-10-10 05:11:09,372][53268] Updated weights for policy 1, policy_version 13400 (0.0009) [2023-10-10 05:11:09,576][53252] Updated weights for policy 0, policy_version 13410 (0.0010) [2023-10-10 05:11:09,950][53252] Updated weights for policy 0, policy_version 13420 (0.0008) [2023-10-10 05:11:10,315][53252] Updated weights for policy 0, policy_version 13430 (0.0009) [2023-10-10 05:11:10,691][53252] Updated weights for policy 0, policy_version 13440 (0.0008) [2023-10-10 05:11:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 27492352. Throughput: 0: 1677.3, 1: 1679.9. Samples: 6881126. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:11:11,784][52050] Avg episode reward: [(0, '12.900'), (1, '14.880')] [2023-10-10 05:11:11,797][53061] Saving new best policy, reward=14.880! [2023-10-10 05:11:13,580][53268] Updated weights for policy 1, policy_version 13410 (0.0010) [2023-10-10 05:11:13,940][53268] Updated weights for policy 1, policy_version 13420 (0.0010) [2023-10-10 05:11:14,312][53268] Updated weights for policy 1, policy_version 13430 (0.0008) [2023-10-10 05:11:14,673][53268] Updated weights for policy 1, policy_version 13440 (0.0009) [2023-10-10 05:11:14,756][53252] Updated weights for policy 0, policy_version 13450 (0.0007) [2023-10-10 05:11:15,137][53252] Updated weights for policy 0, policy_version 13460 (0.0009) [2023-10-10 05:11:15,500][53252] Updated weights for policy 0, policy_version 13470 (0.0009) [2023-10-10 05:11:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 27557888. Throughput: 0: 1699.2, 1: 1661.0. Samples: 6892226. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:11:16,784][52050] Avg episode reward: [(0, '14.650'), (1, '14.900')] [2023-10-10 05:11:16,786][53061] Saving new best policy, reward=14.900! [2023-10-10 05:11:16,786][52846] Saving new best policy, reward=14.650! [2023-10-10 05:11:18,705][53268] Updated weights for policy 1, policy_version 13450 (0.0009) [2023-10-10 05:11:19,077][53268] Updated weights for policy 1, policy_version 13460 (0.0010) [2023-10-10 05:11:19,447][53268] Updated weights for policy 1, policy_version 13470 (0.0009) [2023-10-10 05:11:19,756][53252] Updated weights for policy 0, policy_version 13480 (0.0008) [2023-10-10 05:11:20,140][53252] Updated weights for policy 0, policy_version 13490 (0.0008) [2023-10-10 05:11:20,507][53252] Updated weights for policy 0, policy_version 13500 (0.0008) [2023-10-10 05:11:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 27623424. 
Throughput: 0: 1667.7, 1: 1669.6. Samples: 6911166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:21,784][52050] Avg episode reward: [(0, '15.440'), (1, '15.150')] [2023-10-10 05:11:21,785][52846] Saving new best policy, reward=15.440! [2023-10-10 05:11:21,785][53061] Saving new best policy, reward=15.150! [2023-10-10 05:11:23,644][53268] Updated weights for policy 1, policy_version 13480 (0.0008) [2023-10-10 05:11:24,012][53268] Updated weights for policy 1, policy_version 13490 (0.0009) [2023-10-10 05:11:24,384][53268] Updated weights for policy 1, policy_version 13500 (0.0010) [2023-10-10 05:11:24,533][53252] Updated weights for policy 0, policy_version 13510 (0.0008) [2023-10-10 05:11:24,903][53252] Updated weights for policy 0, policy_version 13520 (0.0008) [2023-10-10 05:11:25,278][53252] Updated weights for policy 0, policy_version 13530 (0.0007) [2023-10-10 05:11:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 27688960. Throughput: 0: 1679.7, 1: 1684.5. Samples: 6931658. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:26,784][52050] Avg episode reward: [(0, '13.940'), (1, '14.640')] [2023-10-10 05:11:28,435][53268] Updated weights for policy 1, policy_version 13510 (0.0009) [2023-10-10 05:11:28,816][53268] Updated weights for policy 1, policy_version 13520 (0.0008) [2023-10-10 05:11:29,190][53268] Updated weights for policy 1, policy_version 13530 (0.0009) [2023-10-10 05:11:29,356][53252] Updated weights for policy 0, policy_version 13540 (0.0009) [2023-10-10 05:11:29,736][53252] Updated weights for policy 0, policy_version 13550 (0.0009) [2023-10-10 05:11:30,107][53252] Updated weights for policy 0, policy_version 13560 (0.0009) [2023-10-10 05:11:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 27754496. Throughput: 0: 1681.6, 1: 1664.9. Samples: 6942104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:31,784][52050] Avg episode reward: [(0, '14.550'), (1, '12.970')] [2023-10-10 05:11:33,228][53268] Updated weights for policy 1, policy_version 13540 (0.0009) [2023-10-10 05:11:33,593][53268] Updated weights for policy 1, policy_version 13550 (0.0009) [2023-10-10 05:11:33,958][53268] Updated weights for policy 1, policy_version 13560 (0.0010) [2023-10-10 05:11:34,258][53252] Updated weights for policy 0, policy_version 13570 (0.0010) [2023-10-10 05:11:34,635][53252] Updated weights for policy 0, policy_version 13580 (0.0007) [2023-10-10 05:11:35,002][53252] Updated weights for policy 0, policy_version 13590 (0.0007) [2023-10-10 05:11:35,377][53252] Updated weights for policy 0, policy_version 13600 (0.0008) [2023-10-10 05:11:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 27820032. Throughput: 0: 1662.4, 1: 1683.1. Samples: 6961442. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:36,784][52050] Avg episode reward: [(0, '13.470'), (1, '13.580')] [2023-10-10 05:11:37,766][53268] Updated weights for policy 1, policy_version 13570 (0.0009) [2023-10-10 05:11:38,136][53268] Updated weights for policy 1, policy_version 13580 (0.0011) [2023-10-10 05:11:38,510][53268] Updated weights for policy 1, policy_version 13590 (0.0009) [2023-10-10 05:11:38,874][53268] Updated weights for policy 1, policy_version 13600 (0.0009) [2023-10-10 05:11:39,486][53252] Updated weights for policy 0, policy_version 13610 (0.0008) [2023-10-10 05:11:39,859][53252] Updated weights for policy 0, policy_version 13620 (0.0007) [2023-10-10 05:11:40,234][53252] Updated weights for policy 0, policy_version 13630 (0.0009) [2023-10-10 05:11:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27885568. Throughput: 0: 1682.0, 1: 1685.7. Samples: 6982080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:41,784][52050] Avg episode reward: [(0, '12.440'), (1, '13.570')] [2023-10-10 05:11:42,878][53268] Updated weights for policy 1, policy_version 13610 (0.0007) [2023-10-10 05:11:43,247][53268] Updated weights for policy 1, policy_version 13620 (0.0007) [2023-10-10 05:11:43,623][53268] Updated weights for policy 1, policy_version 13630 (0.0007) [2023-10-10 05:11:44,377][53252] Updated weights for policy 0, policy_version 13640 (0.0008) [2023-10-10 05:11:44,751][53252] Updated weights for policy 0, policy_version 13650 (0.0007) [2023-10-10 05:11:45,127][53252] Updated weights for policy 0, policy_version 13660 (0.0009) [2023-10-10 05:11:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27951104. Throughput: 0: 1668.3, 1: 1670.6. Samples: 6992132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:11:46,784][52050] Avg episode reward: [(0, '13.650'), (1, '14.560')] [2023-10-10 05:11:47,617][53268] Updated weights for policy 1, policy_version 13640 (0.0010) [2023-10-10 05:11:47,986][53268] Updated weights for policy 1, policy_version 13650 (0.0008) [2023-10-10 05:11:48,348][53268] Updated weights for policy 1, policy_version 13660 (0.0008) [2023-10-10 05:11:49,074][53252] Updated weights for policy 0, policy_version 13670 (0.0007) [2023-10-10 05:11:49,458][53252] Updated weights for policy 0, policy_version 13680 (0.0007) [2023-10-10 05:11:49,835][53252] Updated weights for policy 0, policy_version 13690 (0.0008) [2023-10-10 05:11:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 28016640. Throughput: 0: 1666.7, 1: 1692.0. Samples: 7012144. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:11:51,784][52050] Avg episode reward: [(0, '13.710'), (1, '14.030')] [2023-10-10 05:11:52,328][53268] Updated weights for policy 1, policy_version 13670 (0.0011) [2023-10-10 05:11:52,699][53268] Updated weights for policy 1, policy_version 13680 (0.0008) [2023-10-10 05:11:53,071][53268] Updated weights for policy 1, policy_version 13690 (0.0007) [2023-10-10 05:11:53,871][53252] Updated weights for policy 0, policy_version 13700 (0.0008) [2023-10-10 05:11:54,248][53252] Updated weights for policy 0, policy_version 13710 (0.0008) [2023-10-10 05:11:54,615][53252] Updated weights for policy 0, policy_version 13720 (0.0008) [2023-10-10 05:11:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 28082176. Throughput: 0: 1676.3, 1: 1697.2. Samples: 7032932. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:11:56,784][52050] Avg episode reward: [(0, '12.300'), (1, '14.520')] [2023-10-10 05:11:57,202][53268] Updated weights for policy 1, policy_version 13700 (0.0009) [2023-10-10 05:11:57,569][53268] Updated weights for policy 1, policy_version 13710 (0.0009) [2023-10-10 05:11:57,942][53268] Updated weights for policy 1, policy_version 13720 (0.0010) [2023-10-10 05:11:58,664][53252] Updated weights for policy 0, policy_version 13730 (0.0008) [2023-10-10 05:11:59,028][53252] Updated weights for policy 0, policy_version 13740 (0.0009) [2023-10-10 05:11:59,405][53252] Updated weights for policy 0, policy_version 13750 (0.0007) [2023-10-10 05:11:59,781][53252] Updated weights for policy 0, policy_version 13760 (0.0010) [2023-10-10 05:12:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 28147712. Throughput: 0: 1656.3, 1: 1686.1. Samples: 7042634. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:12:01,784][52050] Avg episode reward: [(0, '12.430'), (1, '14.060')] [2023-10-10 05:12:01,891][53268] Updated weights for policy 1, policy_version 13730 (0.0009) [2023-10-10 05:12:02,262][53268] Updated weights for policy 1, policy_version 13740 (0.0008) [2023-10-10 05:12:02,637][53268] Updated weights for policy 1, policy_version 13750 (0.0007) [2023-10-10 05:12:02,998][53268] Updated weights for policy 1, policy_version 13760 (0.0011) [2023-10-10 05:12:03,806][53252] Updated weights for policy 0, policy_version 13770 (0.0011) [2023-10-10 05:12:04,182][53252] Updated weights for policy 0, policy_version 13780 (0.0010) [2023-10-10 05:12:04,546][53252] Updated weights for policy 0, policy_version 13790 (0.0010) [2023-10-10 05:12:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 28213248. Throughput: 0: 1673.8, 1: 1700.8. Samples: 7063022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:12:06,784][52050] Avg episode reward: [(0, '12.080'), (1, '13.810')] [2023-10-10 05:12:06,944][53268] Updated weights for policy 1, policy_version 13770 (0.0008) [2023-10-10 05:12:07,302][53268] Updated weights for policy 1, policy_version 13780 (0.0008) [2023-10-10 05:12:07,670][53268] Updated weights for policy 1, policy_version 13790 (0.0010) [2023-10-10 05:12:08,547][53252] Updated weights for policy 0, policy_version 13800 (0.0008) [2023-10-10 05:12:08,928][53252] Updated weights for policy 0, policy_version 13810 (0.0010) [2023-10-10 05:12:09,298][53252] Updated weights for policy 0, policy_version 13820 (0.0008) [2023-10-10 05:12:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 28278784. Throughput: 0: 1684.4, 1: 1697.0. Samples: 7083818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:12:11,784][52050] Avg episode reward: [(0, '12.930'), (1, '15.110')] [2023-10-10 05:12:11,852][53268] Updated weights for policy 1, policy_version 13800 (0.0010) [2023-10-10 05:12:12,225][53268] Updated weights for policy 1, policy_version 13810 (0.0007) [2023-10-10 05:12:12,586][53268] Updated weights for policy 1, policy_version 13820 (0.0007) [2023-10-10 05:12:13,360][53252] Updated weights for policy 0, policy_version 13830 (0.0009) [2023-10-10 05:12:13,731][53252] Updated weights for policy 0, policy_version 13840 (0.0009) [2023-10-10 05:12:14,099][53252] Updated weights for policy 0, policy_version 13850 (0.0007) [2023-10-10 05:12:16,644][53268] Updated weights for policy 1, policy_version 13830 (0.0007) [2023-10-10 05:12:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28344320. Throughput: 0: 1662.3, 1: 1692.1. Samples: 7093054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:12:16,784][52050] Avg episode reward: [(0, '12.950'), (1, '13.350')] [2023-10-10 05:12:17,024][53268] Updated weights for policy 1, policy_version 13840 (0.0009) [2023-10-10 05:12:17,395][53268] Updated weights for policy 1, policy_version 13850 (0.0009) [2023-10-10 05:12:17,988][53252] Updated weights for policy 0, policy_version 13860 (0.0010) [2023-10-10 05:12:18,354][53252] Updated weights for policy 0, policy_version 13870 (0.0007) [2023-10-10 05:12:18,729][53252] Updated weights for policy 0, policy_version 13880 (0.0007) [2023-10-10 05:12:21,560][53268] Updated weights for policy 1, policy_version 13860 (0.0009) [2023-10-10 05:12:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28409856. Throughput: 0: 1698.2, 1: 1694.0. Samples: 7114092. 
Policy #0 lag: (min: 18.0, avg: 33.7, max: 50.0) [2023-10-10 05:12:21,784][52050] Avg episode reward: [(0, '12.740'), (1, '14.920')] [2023-10-10 05:12:21,930][53268] Updated weights for policy 1, policy_version 13870 (0.0010) [2023-10-10 05:12:22,305][53268] Updated weights for policy 1, policy_version 13880 (0.0009) [2023-10-10 05:12:22,636][53252] Updated weights for policy 0, policy_version 13890 (0.0009) [2023-10-10 05:12:23,020][53252] Updated weights for policy 0, policy_version 13900 (0.0010) [2023-10-10 05:12:23,395][53252] Updated weights for policy 0, policy_version 13910 (0.0011) [2023-10-10 05:12:23,770][53252] Updated weights for policy 0, policy_version 13920 (0.0011) [2023-10-10 05:12:26,389][53268] Updated weights for policy 1, policy_version 13890 (0.0008) [2023-10-10 05:12:26,766][53268] Updated weights for policy 1, policy_version 13900 (0.0008) [2023-10-10 05:12:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28475392. Throughput: 0: 1702.1, 1: 1693.7. Samples: 7134892. Policy #0 lag: (min: 18.0, avg: 33.7, max: 50.0) [2023-10-10 05:12:26,784][52050] Avg episode reward: [(0, '13.650'), (1, '14.400')] [2023-10-10 05:12:27,130][53268] Updated weights for policy 1, policy_version 13910 (0.0008) [2023-10-10 05:12:27,495][53268] Updated weights for policy 1, policy_version 13920 (0.0008) [2023-10-10 05:12:27,841][53252] Updated weights for policy 0, policy_version 13930 (0.0008) [2023-10-10 05:12:28,214][53252] Updated weights for policy 0, policy_version 13940 (0.0007) [2023-10-10 05:12:28,587][53252] Updated weights for policy 0, policy_version 13950 (0.0007) [2023-10-10 05:12:31,563][53268] Updated weights for policy 1, policy_version 13930 (0.0009) [2023-10-10 05:12:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28540928. Throughput: 0: 1681.9, 1: 1690.9. Samples: 7143906. 
Policy #0 lag: (min: 18.0, avg: 33.7, max: 50.0) [2023-10-10 05:12:31,784][52050] Avg episode reward: [(0, '13.590'), (1, '14.350')] [2023-10-10 05:12:31,927][53268] Updated weights for policy 1, policy_version 13940 (0.0008) [2023-10-10 05:12:32,300][53268] Updated weights for policy 1, policy_version 13950 (0.0008) [2023-10-10 05:12:32,618][53252] Updated weights for policy 0, policy_version 13960 (0.0009) [2023-10-10 05:12:32,995][53252] Updated weights for policy 0, policy_version 13970 (0.0007) [2023-10-10 05:12:33,374][53252] Updated weights for policy 0, policy_version 13980 (0.0008) [2023-10-10 05:12:36,353][53268] Updated weights for policy 1, policy_version 13960 (0.0008) [2023-10-10 05:12:36,718][53268] Updated weights for policy 1, policy_version 13970 (0.0007) [2023-10-10 05:12:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28606464. Throughput: 0: 1701.5, 1: 1685.6. Samples: 7164562. Policy #0 lag: (min: 24.0, avg: 45.0, max: 56.0) [2023-10-10 05:12:36,784][52050] Avg episode reward: [(0, '14.060'), (1, '15.740')] [2023-10-10 05:12:37,093][53268] Updated weights for policy 1, policy_version 13980 (0.0010) [2023-10-10 05:12:37,234][53061] Saving new best policy, reward=15.740! [2023-10-10 05:12:37,412][53252] Updated weights for policy 0, policy_version 13990 (0.0009) [2023-10-10 05:12:37,786][53252] Updated weights for policy 0, policy_version 14000 (0.0010) [2023-10-10 05:12:38,153][53252] Updated weights for policy 0, policy_version 14010 (0.0009) [2023-10-10 05:12:41,174][53268] Updated weights for policy 1, policy_version 13990 (0.0009) [2023-10-10 05:12:41,533][53268] Updated weights for policy 1, policy_version 14000 (0.0008) [2023-10-10 05:12:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 28672000. Throughput: 0: 1706.1, 1: 1672.0. Samples: 7184950. 
Policy #0 lag: (min: 24.0, avg: 45.0, max: 56.0) [2023-10-10 05:12:41,784][52050] Avg episode reward: [(0, '13.240'), (1, '16.050')] [2023-10-10 05:12:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000014016_14352384.pth... [2023-10-10 05:12:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000012448_12746752.pth [2023-10-10 05:12:41,900][53268] Updated weights for policy 1, policy_version 14010 (0.0007) [2023-10-10 05:12:42,120][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000014016_14352384.pth... [2023-10-10 05:12:42,158][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000012416_12713984.pth [2023-10-10 05:12:42,163][53061] Saving new best policy, reward=16.050! [2023-10-10 05:12:42,203][53252] Updated weights for policy 0, policy_version 14020 (0.0007) [2023-10-10 05:12:42,577][53252] Updated weights for policy 0, policy_version 14030 (0.0009) [2023-10-10 05:12:42,943][53252] Updated weights for policy 0, policy_version 14040 (0.0007) [2023-10-10 05:12:46,051][53268] Updated weights for policy 1, policy_version 14020 (0.0010) [2023-10-10 05:12:46,419][53268] Updated weights for policy 1, policy_version 14030 (0.0011) [2023-10-10 05:12:46,776][53268] Updated weights for policy 1, policy_version 14040 (0.0010) [2023-10-10 05:12:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28737536. Throughput: 0: 1689.6, 1: 1678.8. Samples: 7194214. 
Policy #0 lag: (min: 24.0, avg: 45.0, max: 56.0) [2023-10-10 05:12:46,784][52050] Avg episode reward: [(0, '12.940'), (1, '15.720')] [2023-10-10 05:12:47,153][53252] Updated weights for policy 0, policy_version 14050 (0.0007) [2023-10-10 05:12:47,521][53252] Updated weights for policy 0, policy_version 14060 (0.0008) [2023-10-10 05:12:47,895][53252] Updated weights for policy 0, policy_version 14070 (0.0008) [2023-10-10 05:12:48,258][53252] Updated weights for policy 0, policy_version 14080 (0.0007) [2023-10-10 05:12:50,869][53268] Updated weights for policy 1, policy_version 14050 (0.0009) [2023-10-10 05:12:51,244][53268] Updated weights for policy 1, policy_version 14060 (0.0008) [2023-10-10 05:12:51,609][53268] Updated weights for policy 1, policy_version 14070 (0.0007) [2023-10-10 05:12:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28803072. Throughput: 0: 1702.4, 1: 1672.6. Samples: 7214896. Policy #0 lag: (min: 3.0, avg: 7.9, max: 35.0) [2023-10-10 05:12:51,784][52050] Avg episode reward: [(0, '13.220'), (1, '15.690')] [2023-10-10 05:12:51,973][53268] Updated weights for policy 1, policy_version 14080 (0.0008) [2023-10-10 05:12:52,273][53252] Updated weights for policy 0, policy_version 14090 (0.0007) [2023-10-10 05:12:52,653][53252] Updated weights for policy 0, policy_version 14100 (0.0009) [2023-10-10 05:12:53,033][53252] Updated weights for policy 0, policy_version 14110 (0.0008) [2023-10-10 05:12:56,019][53268] Updated weights for policy 1, policy_version 14090 (0.0007) [2023-10-10 05:12:56,385][53268] Updated weights for policy 1, policy_version 14100 (0.0008) [2023-10-10 05:12:56,749][53268] Updated weights for policy 1, policy_version 14110 (0.0009) [2023-10-10 05:12:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 28868608. Throughput: 0: 1699.7, 1: 1664.0. Samples: 7235182. 
Policy #0 lag: (min: 3.0, avg: 7.9, max: 35.0) [2023-10-10 05:12:56,784][52050] Avg episode reward: [(0, '12.800'), (1, '15.490')] [2023-10-10 05:12:57,051][53252] Updated weights for policy 0, policy_version 14120 (0.0009) [2023-10-10 05:12:57,422][53252] Updated weights for policy 0, policy_version 14130 (0.0008) [2023-10-10 05:12:57,795][53252] Updated weights for policy 0, policy_version 14140 (0.0008) [2023-10-10 05:13:00,660][53268] Updated weights for policy 1, policy_version 14120 (0.0008) [2023-10-10 05:13:01,033][53268] Updated weights for policy 1, policy_version 14130 (0.0007) [2023-10-10 05:13:01,403][53268] Updated weights for policy 1, policy_version 14140 (0.0008) [2023-10-10 05:13:01,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 28966912. Throughput: 0: 1693.5, 1: 1678.5. Samples: 7244792. Policy #0 lag: (min: 3.0, avg: 7.9, max: 35.0) [2023-10-10 05:13:01,784][52050] Avg episode reward: [(0, '14.340'), (1, '14.500')] [2023-10-10 05:13:01,971][53252] Updated weights for policy 0, policy_version 14150 (0.0007) [2023-10-10 05:13:02,346][53252] Updated weights for policy 0, policy_version 14160 (0.0008) [2023-10-10 05:13:02,717][53252] Updated weights for policy 0, policy_version 14170 (0.0008) [2023-10-10 05:13:05,414][53268] Updated weights for policy 1, policy_version 14150 (0.0010) [2023-10-10 05:13:05,787][53268] Updated weights for policy 1, policy_version 14160 (0.0009) [2023-10-10 05:13:06,144][53268] Updated weights for policy 1, policy_version 14170 (0.0009) [2023-10-10 05:13:06,664][53252] Updated weights for policy 0, policy_version 14180 (0.0009) [2023-10-10 05:13:06,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29032448. Throughput: 0: 1685.6, 1: 1690.6. Samples: 7266022. 
Policy #0 lag: (min: 30.0, avg: 34.1, max: 62.0) [2023-10-10 05:13:06,784][52050] Avg episode reward: [(0, '14.880'), (1, '14.210')] [2023-10-10 05:13:07,040][53252] Updated weights for policy 0, policy_version 14190 (0.0010) [2023-10-10 05:13:07,410][53252] Updated weights for policy 0, policy_version 14200 (0.0007) [2023-10-10 05:13:10,343][53268] Updated weights for policy 1, policy_version 14180 (0.0009) [2023-10-10 05:13:10,703][53268] Updated weights for policy 1, policy_version 14190 (0.0009) [2023-10-10 05:13:11,072][53268] Updated weights for policy 1, policy_version 14200 (0.0010) [2023-10-10 05:13:11,342][53252] Updated weights for policy 0, policy_version 14210 (0.0009) [2023-10-10 05:13:11,709][53252] Updated weights for policy 0, policy_version 14220 (0.0011) [2023-10-10 05:13:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29097984. Throughput: 0: 1685.4, 1: 1663.5. Samples: 7285590. Policy #0 lag: (min: 30.0, avg: 34.1, max: 62.0) [2023-10-10 05:13:11,784][52050] Avg episode reward: [(0, '15.190'), (1, '14.800')] [2023-10-10 05:13:12,081][53252] Updated weights for policy 0, policy_version 14230 (0.0010) [2023-10-10 05:13:12,456][53252] Updated weights for policy 0, policy_version 14240 (0.0009) [2023-10-10 05:13:15,164][53268] Updated weights for policy 1, policy_version 14210 (0.0009) [2023-10-10 05:13:15,528][53268] Updated weights for policy 1, policy_version 14220 (0.0010) [2023-10-10 05:13:15,898][53268] Updated weights for policy 1, policy_version 14230 (0.0009) [2023-10-10 05:13:16,265][53268] Updated weights for policy 1, policy_version 14240 (0.0008) [2023-10-10 05:13:16,509][53252] Updated weights for policy 0, policy_version 14250 (0.0008) [2023-10-10 05:13:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 29163520. Throughput: 0: 1685.6, 1: 1691.1. Samples: 7295858. 
Policy #0 lag: (min: 30.0, avg: 34.1, max: 62.0) [2023-10-10 05:13:16,784][52050] Avg episode reward: [(0, '15.000'), (1, '15.060')] [2023-10-10 05:13:16,888][53252] Updated weights for policy 0, policy_version 14260 (0.0010) [2023-10-10 05:13:17,258][53252] Updated weights for policy 0, policy_version 14270 (0.0009) [2023-10-10 05:13:20,272][53268] Updated weights for policy 1, policy_version 14250 (0.0008) [2023-10-10 05:13:20,645][53268] Updated weights for policy 1, policy_version 14260 (0.0009) [2023-10-10 05:13:21,021][53268] Updated weights for policy 1, policy_version 14270 (0.0010) [2023-10-10 05:13:21,618][53252] Updated weights for policy 0, policy_version 14280 (0.0008) [2023-10-10 05:13:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29229056. Throughput: 0: 1683.7, 1: 1690.3. Samples: 7316392. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:13:21,785][52050] Avg episode reward: [(0, '14.710'), (1, '14.600')] [2023-10-10 05:13:21,993][53252] Updated weights for policy 0, policy_version 14290 (0.0008) [2023-10-10 05:13:22,365][53252] Updated weights for policy 0, policy_version 14300 (0.0007) [2023-10-10 05:13:25,054][53268] Updated weights for policy 1, policy_version 14280 (0.0009) [2023-10-10 05:13:25,420][53268] Updated weights for policy 1, policy_version 14290 (0.0008) [2023-10-10 05:13:25,789][53268] Updated weights for policy 1, policy_version 14300 (0.0009) [2023-10-10 05:13:26,220][53252] Updated weights for policy 0, policy_version 14310 (0.0007) [2023-10-10 05:13:26,599][53252] Updated weights for policy 0, policy_version 14320 (0.0008) [2023-10-10 05:13:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29294592. Throughput: 0: 1674.1, 1: 1677.0. Samples: 7335752. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:13:26,784][52050] Avg episode reward: [(0, '14.950'), (1, '15.430')] [2023-10-10 05:13:26,966][53252] Updated weights for policy 0, policy_version 14330 (0.0007) [2023-10-10 05:13:29,877][53268] Updated weights for policy 1, policy_version 14310 (0.0009) [2023-10-10 05:13:30,251][53268] Updated weights for policy 1, policy_version 14320 (0.0011) [2023-10-10 05:13:30,628][53268] Updated weights for policy 1, policy_version 14330 (0.0010) [2023-10-10 05:13:30,907][53252] Updated weights for policy 0, policy_version 14340 (0.0007) [2023-10-10 05:13:31,285][53252] Updated weights for policy 0, policy_version 14350 (0.0009) [2023-10-10 05:13:31,652][53252] Updated weights for policy 0, policy_version 14360 (0.0011) [2023-10-10 05:13:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 29360128. Throughput: 0: 1689.2, 1: 1700.1. Samples: 7346730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:13:31,784][52050] Avg episode reward: [(0, '15.680'), (1, '14.980')] [2023-10-10 05:13:31,956][52846] Saving new best policy, reward=15.680! [2023-10-10 05:13:34,692][53268] Updated weights for policy 1, policy_version 14340 (0.0009) [2023-10-10 05:13:35,059][53268] Updated weights for policy 1, policy_version 14350 (0.0007) [2023-10-10 05:13:35,429][53268] Updated weights for policy 1, policy_version 14360 (0.0008) [2023-10-10 05:13:35,699][53252] Updated weights for policy 0, policy_version 14370 (0.0008) [2023-10-10 05:13:36,068][53252] Updated weights for policy 0, policy_version 14380 (0.0008) [2023-10-10 05:13:36,440][53252] Updated weights for policy 0, policy_version 14390 (0.0007) [2023-10-10 05:13:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29425664. Throughput: 0: 1690.7, 1: 1691.0. Samples: 7367072. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 05:13:36,784][52050] Avg episode reward: [(0, '15.500'), (1, '14.590')] [2023-10-10 05:13:36,822][53252] Updated weights for policy 0, policy_version 14400 (0.0007) [2023-10-10 05:13:39,425][53268] Updated weights for policy 1, policy_version 14370 (0.0009) [2023-10-10 05:13:39,781][53268] Updated weights for policy 1, policy_version 14380 (0.0010) [2023-10-10 05:13:40,152][53268] Updated weights for policy 1, policy_version 14390 (0.0009) [2023-10-10 05:13:40,518][53268] Updated weights for policy 1, policy_version 14400 (0.0009) [2023-10-10 05:13:40,944][53252] Updated weights for policy 0, policy_version 14410 (0.0008) [2023-10-10 05:13:41,321][53252] Updated weights for policy 0, policy_version 14420 (0.0009) [2023-10-10 05:13:41,703][53252] Updated weights for policy 0, policy_version 14430 (0.0007) [2023-10-10 05:13:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29523968. Throughput: 0: 1670.4, 1: 1686.7. Samples: 7386254. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 05:13:41,784][52050] Avg episode reward: [(0, '15.390'), (1, '14.880')] [2023-10-10 05:13:44,764][53268] Updated weights for policy 1, policy_version 14410 (0.0009) [2023-10-10 05:13:45,124][53268] Updated weights for policy 1, policy_version 14420 (0.0009) [2023-10-10 05:13:45,494][53268] Updated weights for policy 1, policy_version 14430 (0.0010) [2023-10-10 05:13:45,889][53252] Updated weights for policy 0, policy_version 14440 (0.0009) [2023-10-10 05:13:46,263][53252] Updated weights for policy 0, policy_version 14450 (0.0009) [2023-10-10 05:13:46,628][53252] Updated weights for policy 0, policy_version 14460 (0.0009) [2023-10-10 05:13:46,785][52050] Fps is (10 sec: 16381.9, 60 sec: 14199.2, 300 sec: 13551.4). Total num frames: 29589504. Throughput: 0: 1691.9, 1: 1695.7. Samples: 7397238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:13:46,786][52050] Avg episode reward: [(0, '13.610'), (1, '15.300')] [2023-10-10 05:13:49,486][53268] Updated weights for policy 1, policy_version 14440 (0.0008) [2023-10-10 05:13:49,864][53268] Updated weights for policy 1, policy_version 14450 (0.0008) [2023-10-10 05:13:50,228][53268] Updated weights for policy 1, policy_version 14460 (0.0008) [2023-10-10 05:13:50,768][53252] Updated weights for policy 0, policy_version 14470 (0.0007) [2023-10-10 05:13:51,147][53252] Updated weights for policy 0, policy_version 14480 (0.0007) [2023-10-10 05:13:51,522][53252] Updated weights for policy 0, policy_version 14490 (0.0009) [2023-10-10 05:13:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29655040. Throughput: 0: 1684.6, 1: 1664.4. Samples: 7416724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:13:51,784][52050] Avg episode reward: [(0, '13.940'), (1, '14.470')] [2023-10-10 05:13:54,235][53268] Updated weights for policy 1, policy_version 14470 (0.0008) [2023-10-10 05:13:54,617][53268] Updated weights for policy 1, policy_version 14480 (0.0010) [2023-10-10 05:13:54,992][53268] Updated weights for policy 1, policy_version 14490 (0.0009) [2023-10-10 05:13:55,611][53252] Updated weights for policy 0, policy_version 14500 (0.0008) [2023-10-10 05:13:55,988][53252] Updated weights for policy 0, policy_version 14510 (0.0007) [2023-10-10 05:13:56,363][53252] Updated weights for policy 0, policy_version 14520 (0.0008) [2023-10-10 05:13:56,783][52050] Fps is (10 sec: 13109.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 29720576. Throughput: 0: 1661.9, 1: 1679.3. Samples: 7435942. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:13:56,784][52050] Avg episode reward: [(0, '14.090'), (1, '14.630')] [2023-10-10 05:13:59,015][53268] Updated weights for policy 1, policy_version 14500 (0.0009) [2023-10-10 05:13:59,384][53268] Updated weights for policy 1, policy_version 14510 (0.0010) [2023-10-10 05:13:59,754][53268] Updated weights for policy 1, policy_version 14520 (0.0010) [2023-10-10 05:14:00,341][53252] Updated weights for policy 0, policy_version 14530 (0.0009) [2023-10-10 05:14:00,704][53252] Updated weights for policy 0, policy_version 14540 (0.0010) [2023-10-10 05:14:01,074][53252] Updated weights for policy 0, policy_version 14550 (0.0011) [2023-10-10 05:14:01,452][53252] Updated weights for policy 0, policy_version 14560 (0.0010) [2023-10-10 05:14:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29786112. Throughput: 0: 1685.4, 1: 1675.1. Samples: 7447080. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:14:01,784][52050] Avg episode reward: [(0, '14.420'), (1, '14.380')] [2023-10-10 05:14:03,609][53268] Updated weights for policy 1, policy_version 14530 (0.0008) [2023-10-10 05:14:03,974][53268] Updated weights for policy 1, policy_version 14540 (0.0008) [2023-10-10 05:14:04,343][53268] Updated weights for policy 1, policy_version 14550 (0.0009) [2023-10-10 05:14:04,724][53268] Updated weights for policy 1, policy_version 14560 (0.0008) [2023-10-10 05:14:05,458][53252] Updated weights for policy 0, policy_version 14570 (0.0010) [2023-10-10 05:14:05,824][53252] Updated weights for policy 0, policy_version 14580 (0.0011) [2023-10-10 05:14:06,198][53252] Updated weights for policy 0, policy_version 14590 (0.0008) [2023-10-10 05:14:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 29851648. Throughput: 0: 1683.3, 1: 1661.1. Samples: 7466890. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:14:06,784][52050] Avg episode reward: [(0, '14.480'), (1, '14.850')] [2023-10-10 05:14:08,710][53268] Updated weights for policy 1, policy_version 14570 (0.0010) [2023-10-10 05:14:09,074][53268] Updated weights for policy 1, policy_version 14580 (0.0009) [2023-10-10 05:14:09,437][53268] Updated weights for policy 1, policy_version 14590 (0.0008) [2023-10-10 05:14:10,246][53252] Updated weights for policy 0, policy_version 14600 (0.0009) [2023-10-10 05:14:10,628][53252] Updated weights for policy 0, policy_version 14610 (0.0007) [2023-10-10 05:14:11,007][53252] Updated weights for policy 0, policy_version 14620 (0.0009) [2023-10-10 05:14:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29917184. Throughput: 0: 1666.9, 1: 1686.5. Samples: 7486656. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:14:11,784][52050] Avg episode reward: [(0, '14.530'), (1, '15.510')] [2023-10-10 05:14:13,505][53268] Updated weights for policy 1, policy_version 14600 (0.0010) [2023-10-10 05:14:13,876][53268] Updated weights for policy 1, policy_version 14610 (0.0008) [2023-10-10 05:14:14,251][53268] Updated weights for policy 1, policy_version 14620 (0.0010) [2023-10-10 05:14:15,167][53252] Updated weights for policy 0, policy_version 14630 (0.0009) [2023-10-10 05:14:15,529][53252] Updated weights for policy 0, policy_version 14640 (0.0009) [2023-10-10 05:14:15,900][53252] Updated weights for policy 0, policy_version 14650 (0.0008) [2023-10-10 05:14:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 29982720. Throughput: 0: 1684.6, 1: 1663.9. Samples: 7497414. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-10 05:14:16,784][52050] Avg episode reward: [(0, '15.010'), (1, '15.060')] [2023-10-10 05:14:18,305][53268] Updated weights for policy 1, policy_version 14630 (0.0010) [2023-10-10 05:14:18,669][53268] Updated weights for policy 1, policy_version 14640 (0.0007) [2023-10-10 05:14:19,044][53268] Updated weights for policy 1, policy_version 14650 (0.0008) [2023-10-10 05:14:19,875][53252] Updated weights for policy 0, policy_version 14660 (0.0008) [2023-10-10 05:14:20,241][53252] Updated weights for policy 0, policy_version 14670 (0.0008) [2023-10-10 05:14:20,613][53252] Updated weights for policy 0, policy_version 14680 (0.0007) [2023-10-10 05:14:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 30048256. Throughput: 0: 1665.3, 1: 1674.0. Samples: 7517340. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-10 05:14:21,784][52050] Avg episode reward: [(0, '15.340'), (1, '14.490')] [2023-10-10 05:14:23,168][53268] Updated weights for policy 1, policy_version 14660 (0.0010) [2023-10-10 05:14:23,533][53268] Updated weights for policy 1, policy_version 14670 (0.0009) [2023-10-10 05:14:23,905][53268] Updated weights for policy 1, policy_version 14680 (0.0013) [2023-10-10 05:14:24,711][53252] Updated weights for policy 0, policy_version 14690 (0.0009) [2023-10-10 05:14:25,091][53252] Updated weights for policy 0, policy_version 14700 (0.0008) [2023-10-10 05:14:25,455][53252] Updated weights for policy 0, policy_version 14710 (0.0010) [2023-10-10 05:14:25,821][53252] Updated weights for policy 0, policy_version 14720 (0.0007) [2023-10-10 05:14:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 30113792. Throughput: 0: 1673.5, 1: 1691.5. Samples: 7537678. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-10 05:14:26,784][52050] Avg episode reward: [(0, '15.630'), (1, '13.830')] [2023-10-10 05:14:28,051][53268] Updated weights for policy 1, policy_version 14690 (0.0009) [2023-10-10 05:14:28,423][53268] Updated weights for policy 1, policy_version 14700 (0.0010) [2023-10-10 05:14:28,784][53268] Updated weights for policy 1, policy_version 14710 (0.0008) [2023-10-10 05:14:29,148][53268] Updated weights for policy 1, policy_version 14720 (0.0009) [2023-10-10 05:14:29,738][53252] Updated weights for policy 0, policy_version 14730 (0.0008) [2023-10-10 05:14:30,115][53252] Updated weights for policy 0, policy_version 14740 (0.0007) [2023-10-10 05:14:30,491][53252] Updated weights for policy 0, policy_version 14750 (0.0008) [2023-10-10 05:14:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 30179328. Throughput: 0: 1683.1, 1: 1667.7. Samples: 7548018. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:14:31,784][52050] Avg episode reward: [(0, '14.580'), (1, '13.650')] [2023-10-10 05:14:33,187][53268] Updated weights for policy 1, policy_version 14730 (0.0011) [2023-10-10 05:14:33,561][53268] Updated weights for policy 1, policy_version 14740 (0.0009) [2023-10-10 05:14:33,930][53268] Updated weights for policy 1, policy_version 14750 (0.0010) [2023-10-10 05:14:34,621][53252] Updated weights for policy 0, policy_version 14760 (0.0008) [2023-10-10 05:14:34,997][53252] Updated weights for policy 0, policy_version 14770 (0.0007) [2023-10-10 05:14:35,363][53252] Updated weights for policy 0, policy_version 14780 (0.0007) [2023-10-10 05:14:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 30244864. Throughput: 0: 1663.8, 1: 1691.5. Samples: 7567712. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:14:36,784][52050] Avg episode reward: [(0, '14.070'), (1, '14.700')] [2023-10-10 05:14:37,935][53268] Updated weights for policy 1, policy_version 14760 (0.0009) [2023-10-10 05:14:38,309][53268] Updated weights for policy 1, policy_version 14770 (0.0009) [2023-10-10 05:14:38,665][53268] Updated weights for policy 1, policy_version 14780 (0.0009) [2023-10-10 05:14:39,297][53252] Updated weights for policy 0, policy_version 14790 (0.0007) [2023-10-10 05:14:39,670][53252] Updated weights for policy 0, policy_version 14800 (0.0009) [2023-10-10 05:14:40,036][53252] Updated weights for policy 0, policy_version 14810 (0.0007) [2023-10-10 05:14:41,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 30310400. Throughput: 0: 1688.1, 1: 1702.0. Samples: 7588496. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:14:41,785][52050] Avg episode reward: [(0, '13.350'), (1, '14.940')] [2023-10-10 05:14:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000014784_15138816.pth... [2023-10-10 05:14:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000014816_15171584.pth... 
[2023-10-10 05:14:41,826][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000013216_13533184.pth [2023-10-10 05:14:41,826][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000013248_13565952.pth [2023-10-10 05:14:42,747][53268] Updated weights for policy 1, policy_version 14790 (0.0009) [2023-10-10 05:14:43,136][53268] Updated weights for policy 1, policy_version 14800 (0.0007) [2023-10-10 05:14:43,506][53268] Updated weights for policy 1, policy_version 14810 (0.0007) [2023-10-10 05:14:44,122][53252] Updated weights for policy 0, policy_version 14820 (0.0007) [2023-10-10 05:14:44,487][53252] Updated weights for policy 0, policy_version 14830 (0.0007) [2023-10-10 05:14:44,861][53252] Updated weights for policy 0, policy_version 14840 (0.0007) [2023-10-10 05:14:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.5, 300 sec: 13440.4). Total num frames: 30375936. Throughput: 0: 1687.7, 1: 1677.8. Samples: 7598528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:14:46,784][52050] Avg episode reward: [(0, '12.910'), (1, '14.140')] [2023-10-10 05:14:47,542][53268] Updated weights for policy 1, policy_version 14820 (0.0009) [2023-10-10 05:14:47,910][53268] Updated weights for policy 1, policy_version 14830 (0.0008) [2023-10-10 05:14:48,281][53268] Updated weights for policy 1, policy_version 14840 (0.0009) [2023-10-10 05:14:48,825][53252] Updated weights for policy 0, policy_version 14850 (0.0007) [2023-10-10 05:14:49,198][53252] Updated weights for policy 0, policy_version 14860 (0.0007) [2023-10-10 05:14:49,574][53252] Updated weights for policy 0, policy_version 14870 (0.0007) [2023-10-10 05:14:49,940][53252] Updated weights for policy 0, policy_version 14880 (0.0008) [2023-10-10 05:14:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30441472. Throughput: 0: 1670.5, 1: 1696.5. Samples: 7618406. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:14:51,784][52050] Avg episode reward: [(0, '14.170'), (1, '14.240')] [2023-10-10 05:14:52,393][53268] Updated weights for policy 1, policy_version 14850 (0.0007) [2023-10-10 05:14:52,758][53268] Updated weights for policy 1, policy_version 14860 (0.0009) [2023-10-10 05:14:53,126][53268] Updated weights for policy 1, policy_version 14870 (0.0008) [2023-10-10 05:14:53,489][53268] Updated weights for policy 1, policy_version 14880 (0.0008) [2023-10-10 05:14:54,015][53252] Updated weights for policy 0, policy_version 14890 (0.0007) [2023-10-10 05:14:54,381][53252] Updated weights for policy 0, policy_version 14900 (0.0007) [2023-10-10 05:14:54,753][53252] Updated weights for policy 0, policy_version 14910 (0.0007) [2023-10-10 05:14:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30507008. Throughput: 0: 1696.3, 1: 1692.0. Samples: 7639126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:14:56,784][52050] Avg episode reward: [(0, '13.510'), (1, '13.990')] [2023-10-10 05:14:57,556][53268] Updated weights for policy 1, policy_version 14890 (0.0010) [2023-10-10 05:14:57,925][53268] Updated weights for policy 1, policy_version 14900 (0.0010) [2023-10-10 05:14:58,286][53268] Updated weights for policy 1, policy_version 14910 (0.0009) [2023-10-10 05:14:58,674][53252] Updated weights for policy 0, policy_version 14920 (0.0008) [2023-10-10 05:14:59,048][53252] Updated weights for policy 0, policy_version 14930 (0.0010) [2023-10-10 05:14:59,429][53252] Updated weights for policy 0, policy_version 14940 (0.0007) [2023-10-10 05:15:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30572544. Throughput: 0: 1676.1, 1: 1683.0. Samples: 7648574. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:15:01,784][52050] Avg episode reward: [(0, '14.410'), (1, '14.060')] [2023-10-10 05:15:02,440][53268] Updated weights for policy 1, policy_version 14920 (0.0011) [2023-10-10 05:15:02,811][53268] Updated weights for policy 1, policy_version 14930 (0.0007) [2023-10-10 05:15:03,178][53268] Updated weights for policy 1, policy_version 14940 (0.0007) [2023-10-10 05:15:03,469][53252] Updated weights for policy 0, policy_version 14950 (0.0009) [2023-10-10 05:15:03,829][53252] Updated weights for policy 0, policy_version 14960 (0.0009) [2023-10-10 05:15:04,204][53252] Updated weights for policy 0, policy_version 14970 (0.0010) [2023-10-10 05:15:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30638080. Throughput: 0: 1687.2, 1: 1684.3. Samples: 7669058. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:15:06,784][52050] Avg episode reward: [(0, '13.990'), (1, '14.280')] [2023-10-10 05:15:07,185][53268] Updated weights for policy 1, policy_version 14950 (0.0010) [2023-10-10 05:15:07,548][53268] Updated weights for policy 1, policy_version 14960 (0.0010) [2023-10-10 05:15:07,921][53268] Updated weights for policy 1, policy_version 14970 (0.0010) [2023-10-10 05:15:08,230][53252] Updated weights for policy 0, policy_version 14980 (0.0008) [2023-10-10 05:15:08,596][53252] Updated weights for policy 0, policy_version 14990 (0.0010) [2023-10-10 05:15:08,965][53252] Updated weights for policy 0, policy_version 15000 (0.0012) [2023-10-10 05:15:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 30703616. Throughput: 0: 1699.2, 1: 1679.0. Samples: 7689696. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:15:11,784][52050] Avg episode reward: [(0, '12.790'), (1, '14.650')] [2023-10-10 05:15:11,915][53268] Updated weights for policy 1, policy_version 14980 (0.0009) [2023-10-10 05:15:12,277][53268] Updated weights for policy 1, policy_version 14990 (0.0009) [2023-10-10 05:15:12,643][53268] Updated weights for policy 1, policy_version 15000 (0.0008) [2023-10-10 05:15:13,122][53252] Updated weights for policy 0, policy_version 15010 (0.0007) [2023-10-10 05:15:13,501][53252] Updated weights for policy 0, policy_version 15020 (0.0008) [2023-10-10 05:15:13,866][53252] Updated weights for policy 0, policy_version 15030 (0.0008) [2023-10-10 05:15:14,235][53252] Updated weights for policy 0, policy_version 15040 (0.0008) [2023-10-10 05:15:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30769152. Throughput: 0: 1671.2, 1: 1674.8. Samples: 7698586. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:15:16,784][52050] Avg episode reward: [(0, '12.390'), (1, '14.980')] [2023-10-10 05:15:16,929][53268] Updated weights for policy 1, policy_version 15010 (0.0009) [2023-10-10 05:15:17,298][53268] Updated weights for policy 1, policy_version 15020 (0.0007) [2023-10-10 05:15:17,666][53268] Updated weights for policy 1, policy_version 15030 (0.0009) [2023-10-10 05:15:18,025][53268] Updated weights for policy 1, policy_version 15040 (0.0009) [2023-10-10 05:15:18,297][53252] Updated weights for policy 0, policy_version 15050 (0.0009) [2023-10-10 05:15:18,665][53252] Updated weights for policy 0, policy_version 15060 (0.0008) [2023-10-10 05:15:19,036][53252] Updated weights for policy 0, policy_version 15070 (0.0010) [2023-10-10 05:15:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30834688. Throughput: 0: 1692.8, 1: 1671.7. Samples: 7719114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:15:21,784][52050] Avg episode reward: [(0, '13.210'), (1, '14.520')] [2023-10-10 05:15:22,238][53268] Updated weights for policy 1, policy_version 15050 (0.0007) [2023-10-10 05:15:22,610][53268] Updated weights for policy 1, policy_version 15060 (0.0007) [2023-10-10 05:15:22,977][53268] Updated weights for policy 1, policy_version 15070 (0.0008) [2023-10-10 05:15:23,081][53252] Updated weights for policy 0, policy_version 15080 (0.0007) [2023-10-10 05:15:23,459][53252] Updated weights for policy 0, policy_version 15090 (0.0008) [2023-10-10 05:15:23,827][53252] Updated weights for policy 0, policy_version 15100 (0.0009) [2023-10-10 05:15:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30900224. Throughput: 0: 1692.1, 1: 1669.1. Samples: 7739746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:15:26,784][52050] Avg episode reward: [(0, '11.930'), (1, '14.910')] [2023-10-10 05:15:27,071][53268] Updated weights for policy 1, policy_version 15080 (0.0008) [2023-10-10 05:15:27,429][53268] Updated weights for policy 1, policy_version 15090 (0.0008) [2023-10-10 05:15:27,798][53268] Updated weights for policy 1, policy_version 15100 (0.0008) [2023-10-10 05:15:27,844][53252] Updated weights for policy 0, policy_version 15110 (0.0008) [2023-10-10 05:15:28,221][53252] Updated weights for policy 0, policy_version 15120 (0.0011) [2023-10-10 05:15:28,583][53252] Updated weights for policy 0, policy_version 15130 (0.0010) [2023-10-10 05:15:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30965760. Throughput: 0: 1666.3, 1: 1674.0. Samples: 7748844. 
Policy #0 lag: (min: 24.0, avg: 50.7, max: 56.0) [2023-10-10 05:15:31,784][52050] Avg episode reward: [(0, '12.480'), (1, '15.200')] [2023-10-10 05:15:31,976][53268] Updated weights for policy 1, policy_version 15110 (0.0009) [2023-10-10 05:15:32,361][53268] Updated weights for policy 1, policy_version 15120 (0.0008) [2023-10-10 05:15:32,701][53252] Updated weights for policy 0, policy_version 15140 (0.0008) [2023-10-10 05:15:32,728][53268] Updated weights for policy 1, policy_version 15130 (0.0007) [2023-10-10 05:15:33,077][53252] Updated weights for policy 0, policy_version 15150 (0.0009) [2023-10-10 05:15:33,454][53252] Updated weights for policy 0, policy_version 15160 (0.0007) [2023-10-10 05:15:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31031296. Throughput: 0: 1687.4, 1: 1666.0. Samples: 7769308. Policy #0 lag: (min: 24.0, avg: 50.7, max: 56.0) [2023-10-10 05:15:36,784][52050] Avg episode reward: [(0, '12.550'), (1, '14.800')] [2023-10-10 05:15:36,817][53268] Updated weights for policy 1, policy_version 15140 (0.0008) [2023-10-10 05:15:37,188][53268] Updated weights for policy 1, policy_version 15150 (0.0007) [2023-10-10 05:15:37,523][53252] Updated weights for policy 0, policy_version 15170 (0.0008) [2023-10-10 05:15:37,552][53268] Updated weights for policy 1, policy_version 15160 (0.0008) [2023-10-10 05:15:37,882][53252] Updated weights for policy 0, policy_version 15180 (0.0007) [2023-10-10 05:15:38,257][53252] Updated weights for policy 0, policy_version 15190 (0.0008) [2023-10-10 05:15:38,628][53252] Updated weights for policy 0, policy_version 15200 (0.0009) [2023-10-10 05:15:41,621][53268] Updated weights for policy 1, policy_version 15170 (0.0009) [2023-10-10 05:15:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 31096832. Throughput: 0: 1680.7, 1: 1665.2. Samples: 7789694. 
Policy #0 lag: (min: 24.0, avg: 50.7, max: 56.0) [2023-10-10 05:15:41,784][52050] Avg episode reward: [(0, '13.050'), (1, '15.130')] [2023-10-10 05:15:41,986][53268] Updated weights for policy 1, policy_version 15180 (0.0010) [2023-10-10 05:15:42,360][53268] Updated weights for policy 1, policy_version 15190 (0.0011) [2023-10-10 05:15:42,720][53268] Updated weights for policy 1, policy_version 15200 (0.0008) [2023-10-10 05:15:42,863][53252] Updated weights for policy 0, policy_version 15210 (0.0007) [2023-10-10 05:15:43,238][53252] Updated weights for policy 0, policy_version 15220 (0.0007) [2023-10-10 05:15:43,612][53252] Updated weights for policy 0, policy_version 15230 (0.0007) [2023-10-10 05:15:46,694][53268] Updated weights for policy 1, policy_version 15210 (0.0007) [2023-10-10 05:15:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31162368. Throughput: 0: 1673.3, 1: 1668.6. Samples: 7798958. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:15:46,784][52050] Avg episode reward: [(0, '13.970'), (1, '14.210')] [2023-10-10 05:15:47,055][53268] Updated weights for policy 1, policy_version 15220 (0.0007) [2023-10-10 05:15:47,425][53268] Updated weights for policy 1, policy_version 15230 (0.0008) [2023-10-10 05:15:47,610][53252] Updated weights for policy 0, policy_version 15240 (0.0008) [2023-10-10 05:15:47,977][53252] Updated weights for policy 0, policy_version 15250 (0.0009) [2023-10-10 05:15:48,352][53252] Updated weights for policy 0, policy_version 15260 (0.0008) [2023-10-10 05:15:51,454][53268] Updated weights for policy 1, policy_version 15240 (0.0010) [2023-10-10 05:15:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31227904. Throughput: 0: 1680.4, 1: 1674.9. Samples: 7820046. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:15:51,784][52050] Avg episode reward: [(0, '13.510'), (1, '12.670')] [2023-10-10 05:15:51,828][53268] Updated weights for policy 1, policy_version 15250 (0.0010) [2023-10-10 05:15:52,198][53268] Updated weights for policy 1, policy_version 15260 (0.0009) [2023-10-10 05:15:52,362][53252] Updated weights for policy 0, policy_version 15270 (0.0008) [2023-10-10 05:15:52,726][53252] Updated weights for policy 0, policy_version 15280 (0.0007) [2023-10-10 05:15:53,098][53252] Updated weights for policy 0, policy_version 15290 (0.0007) [2023-10-10 05:15:56,139][53268] Updated weights for policy 1, policy_version 15270 (0.0009) [2023-10-10 05:15:56,508][53268] Updated weights for policy 1, policy_version 15280 (0.0007) [2023-10-10 05:15:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31293440. Throughput: 0: 1684.6, 1: 1671.7. Samples: 7840728. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:15:56,784][52050] Avg episode reward: [(0, '14.110'), (1, '14.100')] [2023-10-10 05:15:56,875][53268] Updated weights for policy 1, policy_version 15290 (0.0008) [2023-10-10 05:15:57,069][53252] Updated weights for policy 0, policy_version 15300 (0.0007) [2023-10-10 05:15:57,443][53252] Updated weights for policy 0, policy_version 15310 (0.0007) [2023-10-10 05:15:57,812][53252] Updated weights for policy 0, policy_version 15320 (0.0008) [2023-10-10 05:16:00,943][53268] Updated weights for policy 1, policy_version 15300 (0.0009) [2023-10-10 05:16:01,318][53268] Updated weights for policy 1, policy_version 15310 (0.0010) [2023-10-10 05:16:01,685][53268] Updated weights for policy 1, policy_version 15320 (0.0009) [2023-10-10 05:16:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31358976. Throughput: 0: 1684.3, 1: 1682.5. Samples: 7850094. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:16:01,784][52050] Avg episode reward: [(0, '15.150'), (1, '13.980')] [2023-10-10 05:16:01,892][53252] Updated weights for policy 0, policy_version 15330 (0.0010) [2023-10-10 05:16:02,268][53252] Updated weights for policy 0, policy_version 15340 (0.0007) [2023-10-10 05:16:02,634][53252] Updated weights for policy 0, policy_version 15350 (0.0007) [2023-10-10 05:16:03,005][53252] Updated weights for policy 0, policy_version 15360 (0.0007) [2023-10-10 05:16:05,863][53268] Updated weights for policy 1, policy_version 15330 (0.0011) [2023-10-10 05:16:06,236][53268] Updated weights for policy 1, policy_version 15340 (0.0010) [2023-10-10 05:16:06,601][53268] Updated weights for policy 1, policy_version 15350 (0.0010) [2023-10-10 05:16:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 31424512. Throughput: 0: 1689.1, 1: 1685.7. Samples: 7870980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:16:06,784][52050] Avg episode reward: [(0, '14.970'), (1, '15.150')] [2023-10-10 05:16:06,974][53268] Updated weights for policy 1, policy_version 15360 (0.0007) [2023-10-10 05:16:07,047][53252] Updated weights for policy 0, policy_version 15370 (0.0009) [2023-10-10 05:16:07,416][53252] Updated weights for policy 0, policy_version 15380 (0.0008) [2023-10-10 05:16:07,784][53252] Updated weights for policy 0, policy_version 15390 (0.0010) [2023-10-10 05:16:10,939][53268] Updated weights for policy 1, policy_version 15370 (0.0010) [2023-10-10 05:16:11,305][53268] Updated weights for policy 1, policy_version 15380 (0.0009) [2023-10-10 05:16:11,686][53268] Updated weights for policy 1, policy_version 15390 (0.0009) [2023-10-10 05:16:11,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31522816. Throughput: 0: 1690.7, 1: 1674.8. Samples: 7891192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:16:11,784][52050] Avg episode reward: [(0, '14.860'), (1, '15.940')] [2023-10-10 05:16:11,941][53252] Updated weights for policy 0, policy_version 15400 (0.0008) [2023-10-10 05:16:12,308][53252] Updated weights for policy 0, policy_version 15410 (0.0007) [2023-10-10 05:16:12,674][53252] Updated weights for policy 0, policy_version 15420 (0.0007) [2023-10-10 05:16:15,784][53268] Updated weights for policy 1, policy_version 15400 (0.0010) [2023-10-10 05:16:16,153][53268] Updated weights for policy 1, policy_version 15410 (0.0009) [2023-10-10 05:16:16,528][53268] Updated weights for policy 1, policy_version 15420 (0.0008) [2023-10-10 05:16:16,676][53252] Updated weights for policy 0, policy_version 15430 (0.0007) [2023-10-10 05:16:16,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31588352. Throughput: 0: 1689.0, 1: 1684.6. Samples: 7900654. Policy #0 lag: (min: 31.0, avg: 43.2, max: 63.0) [2023-10-10 05:16:16,784][52050] Avg episode reward: [(0, '16.040'), (1, '15.760')] [2023-10-10 05:16:17,040][53252] Updated weights for policy 0, policy_version 15440 (0.0010) [2023-10-10 05:16:17,417][53252] Updated weights for policy 0, policy_version 15450 (0.0009) [2023-10-10 05:16:17,633][52846] Saving new best policy, reward=16.040! [2023-10-10 05:16:20,611][53268] Updated weights for policy 1, policy_version 15430 (0.0009) [2023-10-10 05:16:20,981][53268] Updated weights for policy 1, policy_version 15440 (0.0008) [2023-10-10 05:16:21,346][53268] Updated weights for policy 1, policy_version 15450 (0.0008) [2023-10-10 05:16:21,619][53252] Updated weights for policy 0, policy_version 15460 (0.0009) [2023-10-10 05:16:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31653888. Throughput: 0: 1688.6, 1: 1688.3. Samples: 7921270. 
Policy #0 lag: (min: 31.0, avg: 43.2, max: 63.0) [2023-10-10 05:16:21,785][52050] Avg episode reward: [(0, '15.080'), (1, '15.020')] [2023-10-10 05:16:21,992][53252] Updated weights for policy 0, policy_version 15470 (0.0010) [2023-10-10 05:16:22,358][53252] Updated weights for policy 0, policy_version 15480 (0.0008) [2023-10-10 05:16:25,497][53268] Updated weights for policy 1, policy_version 15460 (0.0009) [2023-10-10 05:16:25,862][53268] Updated weights for policy 1, policy_version 15470 (0.0012) [2023-10-10 05:16:26,239][53268] Updated weights for policy 1, policy_version 15480 (0.0009) [2023-10-10 05:16:26,375][53252] Updated weights for policy 0, policy_version 15490 (0.0007) [2023-10-10 05:16:26,739][53252] Updated weights for policy 0, policy_version 15500 (0.0009) [2023-10-10 05:16:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 31719424. Throughput: 0: 1691.7, 1: 1672.5. Samples: 7941082. Policy #0 lag: (min: 31.0, avg: 43.2, max: 63.0) [2023-10-10 05:16:26,784][52050] Avg episode reward: [(0, '14.900'), (1, '15.070')] [2023-10-10 05:16:27,115][53252] Updated weights for policy 0, policy_version 15510 (0.0009) [2023-10-10 05:16:27,491][53252] Updated weights for policy 0, policy_version 15520 (0.0008) [2023-10-10 05:16:30,265][53268] Updated weights for policy 1, policy_version 15490 (0.0008) [2023-10-10 05:16:30,633][53268] Updated weights for policy 1, policy_version 15500 (0.0008) [2023-10-10 05:16:31,004][53268] Updated weights for policy 1, policy_version 15510 (0.0008) [2023-10-10 05:16:31,371][53268] Updated weights for policy 1, policy_version 15520 (0.0010) [2023-10-10 05:16:31,574][53252] Updated weights for policy 0, policy_version 15530 (0.0009) [2023-10-10 05:16:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31784960. Throughput: 0: 1690.9, 1: 1687.8. Samples: 7950998. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:16:31,784][52050] Avg episode reward: [(0, '13.550'), (1, '15.030')] [2023-10-10 05:16:31,943][53252] Updated weights for policy 0, policy_version 15540 (0.0009) [2023-10-10 05:16:32,329][53252] Updated weights for policy 0, policy_version 15550 (0.0008) [2023-10-10 05:16:35,347][53268] Updated weights for policy 1, policy_version 15530 (0.0008) [2023-10-10 05:16:35,720][53268] Updated weights for policy 1, policy_version 15540 (0.0008) [2023-10-10 05:16:36,082][53268] Updated weights for policy 1, policy_version 15550 (0.0007) [2023-10-10 05:16:36,284][53252] Updated weights for policy 0, policy_version 15560 (0.0008) [2023-10-10 05:16:36,659][53252] Updated weights for policy 0, policy_version 15570 (0.0007) [2023-10-10 05:16:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31850496. Throughput: 0: 1692.5, 1: 1678.9. Samples: 7971758. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:16:36,784][52050] Avg episode reward: [(0, '13.600'), (1, '14.730')] [2023-10-10 05:16:37,029][53252] Updated weights for policy 0, policy_version 15580 (0.0009) [2023-10-10 05:16:40,329][53268] Updated weights for policy 1, policy_version 15560 (0.0008) [2023-10-10 05:16:40,694][53268] Updated weights for policy 1, policy_version 15570 (0.0008) [2023-10-10 05:16:41,053][53252] Updated weights for policy 0, policy_version 15590 (0.0008) [2023-10-10 05:16:41,058][53268] Updated weights for policy 1, policy_version 15580 (0.0009) [2023-10-10 05:16:41,430][53252] Updated weights for policy 0, policy_version 15600 (0.0007) [2023-10-10 05:16:41,784][52050] Fps is (10 sec: 13106.6, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 31916032. Throughput: 0: 1676.7, 1: 1658.1. Samples: 7990798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:16:41,785][52050] Avg episode reward: [(0, '13.450'), (1, '16.270')] [2023-10-10 05:16:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000015584_15958016.pth... [2023-10-10 05:16:41,813][53252] Updated weights for policy 0, policy_version 15610 (0.0009) [2023-10-10 05:16:41,823][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000014016_14352384.pth [2023-10-10 05:16:41,827][53061] Saving new best policy, reward=16.270! [2023-10-10 05:16:41,859][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000015584_15958016.pth [2023-10-10 05:16:42,034][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000015616_15990784.pth... [2023-10-10 05:16:42,063][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000014016_14352384.pth [2023-10-10 05:16:42,067][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000015616_15990784.pth [2023-10-10 05:16:45,227][53268] Updated weights for policy 1, policy_version 15590 (0.0008) [2023-10-10 05:16:45,591][53268] Updated weights for policy 1, policy_version 15600 (0.0009) [2023-10-10 05:16:45,896][53252] Updated weights for policy 0, policy_version 15620 (0.0009) [2023-10-10 05:16:45,961][53268] Updated weights for policy 1, policy_version 15610 (0.0008) [2023-10-10 05:16:46,262][53252] Updated weights for policy 0, policy_version 15630 (0.0009) [2023-10-10 05:16:46,644][53252] Updated weights for policy 0, policy_version 15640 (0.0007) [2023-10-10 05:16:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31981568. Throughput: 0: 1688.4, 1: 1676.9. Samples: 8001534. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:16:46,784][52050] Avg episode reward: [(0, '13.960'), (1, '17.130')] [2023-10-10 05:16:46,786][53061] Saving new best policy, reward=17.130! [2023-10-10 05:16:50,167][53268] Updated weights for policy 1, policy_version 15620 (0.0009) [2023-10-10 05:16:50,542][53268] Updated weights for policy 1, policy_version 15630 (0.0009) [2023-10-10 05:16:50,720][53252] Updated weights for policy 0, policy_version 15650 (0.0008) [2023-10-10 05:16:50,906][53268] Updated weights for policy 1, policy_version 15640 (0.0009) [2023-10-10 05:16:51,092][53252] Updated weights for policy 0, policy_version 15660 (0.0008) [2023-10-10 05:16:51,456][53252] Updated weights for policy 0, policy_version 15670 (0.0010) [2023-10-10 05:16:51,783][52050] Fps is (10 sec: 13108.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 32047104. Throughput: 0: 1685.9, 1: 1672.8. Samples: 8022120. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:16:51,784][52050] Avg episode reward: [(0, '15.240'), (1, '16.150')] [2023-10-10 05:16:51,823][53252] Updated weights for policy 0, policy_version 15680 (0.0011) [2023-10-10 05:16:55,066][53268] Updated weights for policy 1, policy_version 15650 (0.0009) [2023-10-10 05:16:55,433][53268] Updated weights for policy 1, policy_version 15660 (0.0008) [2023-10-10 05:16:55,736][53252] Updated weights for policy 0, policy_version 15690 (0.0007) [2023-10-10 05:16:55,804][53268] Updated weights for policy 1, policy_version 15670 (0.0009) [2023-10-10 05:16:56,100][53252] Updated weights for policy 0, policy_version 15700 (0.0010) [2023-10-10 05:16:56,174][53268] Updated weights for policy 1, policy_version 15680 (0.0008) [2023-10-10 05:16:56,480][53252] Updated weights for policy 0, policy_version 15710 (0.0010) [2023-10-10 05:16:56,783][52050] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 32145408. Throughput: 0: 1665.0, 1: 1660.3. Samples: 8040830. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 33.0) [2023-10-10 05:16:56,784][52050] Avg episode reward: [(0, '15.410'), (1, '15.150')] [2023-10-10 05:17:00,362][53268] Updated weights for policy 1, policy_version 15690 (0.0008) [2023-10-10 05:17:00,619][53252] Updated weights for policy 0, policy_version 15720 (0.0008) [2023-10-10 05:17:00,730][53268] Updated weights for policy 1, policy_version 15700 (0.0009) [2023-10-10 05:17:00,987][53252] Updated weights for policy 0, policy_version 15730 (0.0009) [2023-10-10 05:17:01,096][53268] Updated weights for policy 1, policy_version 15710 (0.0007) [2023-10-10 05:17:01,363][53252] Updated weights for policy 0, policy_version 15740 (0.0010) [2023-10-10 05:17:01,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 32210944. Throughput: 0: 1691.1, 1: 1673.1. Samples: 8052044. Policy #0 lag: (min: 1.0, avg: 12.9, max: 33.0) [2023-10-10 05:17:01,784][52050] Avg episode reward: [(0, '16.080'), (1, '15.310')] [2023-10-10 05:17:01,784][52846] Saving new best policy, reward=16.080! [2023-10-10 05:17:05,070][53268] Updated weights for policy 1, policy_version 15720 (0.0009) [2023-10-10 05:17:05,434][53268] Updated weights for policy 1, policy_version 15730 (0.0008) [2023-10-10 05:17:05,529][53252] Updated weights for policy 0, policy_version 15750 (0.0010) [2023-10-10 05:17:05,800][53268] Updated weights for policy 1, policy_version 15740 (0.0009) [2023-10-10 05:17:05,899][53252] Updated weights for policy 0, policy_version 15760 (0.0007) [2023-10-10 05:17:06,263][53252] Updated weights for policy 0, policy_version 15770 (0.0007) [2023-10-10 05:17:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 32276480. Throughput: 0: 1688.5, 1: 1663.3. Samples: 8072096. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 33.0) [2023-10-10 05:17:06,784][52050] Avg episode reward: [(0, '16.540'), (1, '13.850')] [2023-10-10 05:17:06,785][52846] Saving new best policy, reward=16.540! [2023-10-10 05:17:10,042][53268] Updated weights for policy 1, policy_version 15750 (0.0009) [2023-10-10 05:17:10,302][53252] Updated weights for policy 0, policy_version 15780 (0.0007) [2023-10-10 05:17:10,415][53268] Updated weights for policy 1, policy_version 15760 (0.0008) [2023-10-10 05:17:10,672][53252] Updated weights for policy 0, policy_version 15790 (0.0009) [2023-10-10 05:17:10,776][53268] Updated weights for policy 1, policy_version 15770 (0.0009) [2023-10-10 05:17:11,045][53252] Updated weights for policy 0, policy_version 15800 (0.0009) [2023-10-10 05:17:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 32342016. Throughput: 0: 1665.2, 1: 1660.0. Samples: 8090712. Policy #0 lag: (min: 2.0, avg: 2.8, max: 18.0) [2023-10-10 05:17:11,784][52050] Avg episode reward: [(0, '16.990'), (1, '14.910')] [2023-10-10 05:17:11,794][52846] Saving new best policy, reward=16.990! [2023-10-10 05:17:14,863][53268] Updated weights for policy 1, policy_version 15780 (0.0010) [2023-10-10 05:17:15,005][53252] Updated weights for policy 0, policy_version 15810 (0.0007) [2023-10-10 05:17:15,220][53268] Updated weights for policy 1, policy_version 15790 (0.0010) [2023-10-10 05:17:15,369][53252] Updated weights for policy 0, policy_version 15820 (0.0009) [2023-10-10 05:17:15,586][53268] Updated weights for policy 1, policy_version 15800 (0.0008) [2023-10-10 05:17:15,748][53252] Updated weights for policy 0, policy_version 15830 (0.0007) [2023-10-10 05:17:16,109][53252] Updated weights for policy 0, policy_version 15840 (0.0007) [2023-10-10 05:17:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 32407552. Throughput: 0: 1696.7, 1: 1668.8. Samples: 8102442. 
Policy #0 lag: (min: 2.0, avg: 2.8, max: 18.0) [2023-10-10 05:17:16,784][52050] Avg episode reward: [(0, '15.250'), (1, '15.190')] [2023-10-10 05:17:19,672][53268] Updated weights for policy 1, policy_version 15810 (0.0009) [2023-10-10 05:17:20,034][53268] Updated weights for policy 1, policy_version 15820 (0.0007) [2023-10-10 05:17:20,100][53252] Updated weights for policy 0, policy_version 15850 (0.0008) [2023-10-10 05:17:20,401][53268] Updated weights for policy 1, policy_version 15830 (0.0009) [2023-10-10 05:17:20,466][53252] Updated weights for policy 0, policy_version 15860 (0.0008) [2023-10-10 05:17:20,771][53268] Updated weights for policy 1, policy_version 15840 (0.0008) [2023-10-10 05:17:20,836][53252] Updated weights for policy 0, policy_version 15870 (0.0010) [2023-10-10 05:17:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 32473088. Throughput: 0: 1674.3, 1: 1659.1. Samples: 8121760. Policy #0 lag: (min: 2.0, avg: 2.8, max: 18.0) [2023-10-10 05:17:21,784][52050] Avg episode reward: [(0, '14.570'), (1, '15.470')] [2023-10-10 05:17:24,587][53268] Updated weights for policy 1, policy_version 15850 (0.0009) [2023-10-10 05:17:24,921][53252] Updated weights for policy 0, policy_version 15880 (0.0010) [2023-10-10 05:17:24,956][53268] Updated weights for policy 1, policy_version 15860 (0.0008) [2023-10-10 05:17:25,289][53252] Updated weights for policy 0, policy_version 15890 (0.0009) [2023-10-10 05:17:25,321][53268] Updated weights for policy 1, policy_version 15870 (0.0008) [2023-10-10 05:17:25,649][53252] Updated weights for policy 0, policy_version 15900 (0.0008) [2023-10-10 05:17:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 32538624. Throughput: 0: 1669.9, 1: 1672.8. Samples: 8141218. 
Policy #0 lag: (min: 0.0, avg: 21.2, max: 32.0) [2023-10-10 05:17:26,784][52050] Avg episode reward: [(0, '14.770'), (1, '14.940')] [2023-10-10 05:17:29,357][53268] Updated weights for policy 1, policy_version 15880 (0.0008) [2023-10-10 05:17:29,735][53268] Updated weights for policy 1, policy_version 15890 (0.0010) [2023-10-10 05:17:29,760][53252] Updated weights for policy 0, policy_version 15910 (0.0007) [2023-10-10 05:17:30,106][53268] Updated weights for policy 1, policy_version 15900 (0.0008) [2023-10-10 05:17:30,137][53252] Updated weights for policy 0, policy_version 15920 (0.0008) [2023-10-10 05:17:30,506][53252] Updated weights for policy 0, policy_version 15930 (0.0009) [2023-10-10 05:17:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 32604160. Throughput: 0: 1689.8, 1: 1671.2. Samples: 8152776. Policy #0 lag: (min: 0.0, avg: 21.2, max: 32.0) [2023-10-10 05:17:31,784][52050] Avg episode reward: [(0, '13.600'), (1, '14.960')] [2023-10-10 05:17:34,398][53268] Updated weights for policy 1, policy_version 15910 (0.0009) [2023-10-10 05:17:34,701][53252] Updated weights for policy 0, policy_version 15940 (0.0008) [2023-10-10 05:17:34,764][53268] Updated weights for policy 1, policy_version 15920 (0.0009) [2023-10-10 05:17:35,066][53252] Updated weights for policy 0, policy_version 15950 (0.0007) [2023-10-10 05:17:35,134][53268] Updated weights for policy 1, policy_version 15930 (0.0008) [2023-10-10 05:17:35,440][53252] Updated weights for policy 0, policy_version 15960 (0.0009) [2023-10-10 05:17:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 32669696. Throughput: 0: 1669.0, 1: 1649.6. Samples: 8171456. 
Policy #0 lag: (min: 0.0, avg: 21.2, max: 32.0) [2023-10-10 05:17:36,784][52050] Avg episode reward: [(0, '14.080'), (1, '14.400')] [2023-10-10 05:17:39,296][53268] Updated weights for policy 1, policy_version 15940 (0.0009) [2023-10-10 05:17:39,468][53252] Updated weights for policy 0, policy_version 15970 (0.0008) [2023-10-10 05:17:39,670][53268] Updated weights for policy 1, policy_version 15950 (0.0008) [2023-10-10 05:17:39,844][53252] Updated weights for policy 0, policy_version 15980 (0.0009) [2023-10-10 05:17:40,041][53268] Updated weights for policy 1, policy_version 15960 (0.0009) [2023-10-10 05:17:40,217][53252] Updated weights for policy 0, policy_version 15990 (0.0007) [2023-10-10 05:17:40,586][53252] Updated weights for policy 0, policy_version 16000 (0.0007) [2023-10-10 05:17:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 32735232. Throughput: 0: 1678.3, 1: 1666.3. Samples: 8191336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:17:41,785][52050] Avg episode reward: [(0, '15.430'), (1, '13.910')] [2023-10-10 05:17:44,006][53268] Updated weights for policy 1, policy_version 15970 (0.0009) [2023-10-10 05:17:44,381][53268] Updated weights for policy 1, policy_version 15980 (0.0009) [2023-10-10 05:17:44,700][53252] Updated weights for policy 0, policy_version 16010 (0.0009) [2023-10-10 05:17:44,742][53268] Updated weights for policy 1, policy_version 15990 (0.0008) [2023-10-10 05:17:45,069][53252] Updated weights for policy 0, policy_version 16020 (0.0008) [2023-10-10 05:17:45,111][53268] Updated weights for policy 1, policy_version 16000 (0.0009) [2023-10-10 05:17:45,435][53252] Updated weights for policy 0, policy_version 16030 (0.0009) [2023-10-10 05:17:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 32800768. Throughput: 0: 1678.8, 1: 1667.4. Samples: 8202622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:17:46,784][52050] Avg episode reward: [(0, '14.750'), (1, '14.980')] [2023-10-10 05:17:49,224][53268] Updated weights for policy 1, policy_version 16010 (0.0007) [2023-10-10 05:17:49,588][53268] Updated weights for policy 1, policy_version 16020 (0.0007) [2023-10-10 05:17:49,791][53252] Updated weights for policy 0, policy_version 16040 (0.0008) [2023-10-10 05:17:49,951][53268] Updated weights for policy 1, policy_version 16030 (0.0008) [2023-10-10 05:17:50,171][53252] Updated weights for policy 0, policy_version 16050 (0.0009) [2023-10-10 05:17:50,537][53252] Updated weights for policy 0, policy_version 16060 (0.0010) [2023-10-10 05:17:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 32866304. Throughput: 0: 1655.2, 1: 1655.2. Samples: 8221066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:17:51,784][52050] Avg episode reward: [(0, '14.950'), (1, '15.280')] [2023-10-10 05:17:54,089][53268] Updated weights for policy 1, policy_version 16040 (0.0009) [2023-10-10 05:17:54,455][53268] Updated weights for policy 1, policy_version 16050 (0.0009) [2023-10-10 05:17:54,590][53252] Updated weights for policy 0, policy_version 16070 (0.0009) [2023-10-10 05:17:54,818][53268] Updated weights for policy 1, policy_version 16060 (0.0009) [2023-10-10 05:17:54,962][53252] Updated weights for policy 0, policy_version 16080 (0.0007) [2023-10-10 05:17:55,339][53252] Updated weights for policy 0, policy_version 16090 (0.0009) [2023-10-10 05:17:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32931840. Throughput: 0: 1671.1, 1: 1676.1. Samples: 8241340. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:17:56,784][52050] Avg episode reward: [(0, '15.160'), (1, '15.230')] [2023-10-10 05:17:59,024][53268] Updated weights for policy 1, policy_version 16070 (0.0009) [2023-10-10 05:17:59,406][53268] Updated weights for policy 1, policy_version 16080 (0.0010) [2023-10-10 05:17:59,481][53252] Updated weights for policy 0, policy_version 16100 (0.0008) [2023-10-10 05:17:59,778][53268] Updated weights for policy 1, policy_version 16090 (0.0008) [2023-10-10 05:17:59,852][53252] Updated weights for policy 0, policy_version 16110 (0.0007) [2023-10-10 05:18:00,224][53252] Updated weights for policy 0, policy_version 16120 (0.0007) [2023-10-10 05:18:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32997376. Throughput: 0: 1665.6, 1: 1664.6. Samples: 8252300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:18:01,784][52050] Avg episode reward: [(0, '14.980'), (1, '16.030')] [2023-10-10 05:18:03,659][53268] Updated weights for policy 1, policy_version 16100 (0.0007) [2023-10-10 05:18:04,030][53268] Updated weights for policy 1, policy_version 16110 (0.0009) [2023-10-10 05:18:04,255][53252] Updated weights for policy 0, policy_version 16130 (0.0009) [2023-10-10 05:18:04,400][53268] Updated weights for policy 1, policy_version 16120 (0.0007) [2023-10-10 05:18:04,629][53252] Updated weights for policy 0, policy_version 16140 (0.0008) [2023-10-10 05:18:05,004][53252] Updated weights for policy 0, policy_version 16150 (0.0008) [2023-10-10 05:18:05,378][53252] Updated weights for policy 0, policy_version 16160 (0.0010) [2023-10-10 05:18:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 33062912. Throughput: 0: 1656.4, 1: 1660.0. Samples: 8270994. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:18:06,784][52050] Avg episode reward: [(0, '14.540'), (1, '14.640')] [2023-10-10 05:18:08,470][53268] Updated weights for policy 1, policy_version 16130 (0.0007) [2023-10-10 05:18:08,847][53268] Updated weights for policy 1, policy_version 16140 (0.0010) [2023-10-10 05:18:09,212][53268] Updated weights for policy 1, policy_version 16150 (0.0009) [2023-10-10 05:18:09,361][53252] Updated weights for policy 0, policy_version 16170 (0.0009) [2023-10-10 05:18:09,582][53268] Updated weights for policy 1, policy_version 16160 (0.0009) [2023-10-10 05:18:09,737][53252] Updated weights for policy 0, policy_version 16180 (0.0009) [2023-10-10 05:18:10,115][53252] Updated weights for policy 0, policy_version 16190 (0.0009) [2023-10-10 05:18:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 33128448. Throughput: 0: 1669.4, 1: 1674.6. Samples: 8291698. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:18:11,784][52050] Avg episode reward: [(0, '14.100'), (1, '14.350')] [2023-10-10 05:18:13,656][53268] Updated weights for policy 1, policy_version 16170 (0.0010) [2023-10-10 05:18:13,978][53252] Updated weights for policy 0, policy_version 16200 (0.0008) [2023-10-10 05:18:14,028][53268] Updated weights for policy 1, policy_version 16180 (0.0008) [2023-10-10 05:18:14,345][53252] Updated weights for policy 0, policy_version 16210 (0.0010) [2023-10-10 05:18:14,400][53268] Updated weights for policy 1, policy_version 16190 (0.0009) [2023-10-10 05:18:14,723][53252] Updated weights for policy 0, policy_version 16220 (0.0009) [2023-10-10 05:18:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 33193984. Throughput: 0: 1652.3, 1: 1656.0. Samples: 8301648. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:18:16,784][52050] Avg episode reward: [(0, '14.380'), (1, '14.160')] [2023-10-10 05:18:18,584][53268] Updated weights for policy 1, policy_version 16200 (0.0009) [2023-10-10 05:18:18,717][53252] Updated weights for policy 0, policy_version 16230 (0.0008) [2023-10-10 05:18:18,961][53268] Updated weights for policy 1, policy_version 16210 (0.0008) [2023-10-10 05:18:19,082][53252] Updated weights for policy 0, policy_version 16240 (0.0007) [2023-10-10 05:18:19,320][53268] Updated weights for policy 1, policy_version 16220 (0.0009) [2023-10-10 05:18:19,458][53252] Updated weights for policy 0, policy_version 16250 (0.0008) [2023-10-10 05:18:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 33259520. Throughput: 0: 1657.6, 1: 1670.5. Samples: 8321220. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:18:21,784][52050] Avg episode reward: [(0, '13.040'), (1, '13.700')] [2023-10-10 05:18:23,410][53268] Updated weights for policy 1, policy_version 16230 (0.0008) [2023-10-10 05:18:23,631][53252] Updated weights for policy 0, policy_version 16260 (0.0009) [2023-10-10 05:18:23,780][53268] Updated weights for policy 1, policy_version 16240 (0.0009) [2023-10-10 05:18:24,006][53252] Updated weights for policy 0, policy_version 16270 (0.0008) [2023-10-10 05:18:24,148][53268] Updated weights for policy 1, policy_version 16250 (0.0009) [2023-10-10 05:18:24,365][53252] Updated weights for policy 0, policy_version 16280 (0.0009) [2023-10-10 05:18:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 33325056. Throughput: 0: 1664.9, 1: 1679.1. Samples: 8341816. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 05:18:26,784][52050] Avg episode reward: [(0, '13.710'), (1, '14.200')] [2023-10-10 05:18:28,298][53268] Updated weights for policy 1, policy_version 16260 (0.0009) [2023-10-10 05:18:28,546][53252] Updated weights for policy 0, policy_version 16290 (0.0009) [2023-10-10 05:18:28,657][53268] Updated weights for policy 1, policy_version 16270 (0.0009) [2023-10-10 05:18:28,922][53252] Updated weights for policy 0, policy_version 16300 (0.0009) [2023-10-10 05:18:29,025][53268] Updated weights for policy 1, policy_version 16280 (0.0008) [2023-10-10 05:18:29,286][53252] Updated weights for policy 0, policy_version 16310 (0.0010) [2023-10-10 05:18:29,655][53252] Updated weights for policy 0, policy_version 16320 (0.0009) [2023-10-10 05:18:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 33390592. Throughput: 0: 1650.4, 1: 1660.3. Samples: 8351600. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 05:18:31,784][52050] Avg episode reward: [(0, '14.370'), (1, '16.120')] [2023-10-10 05:18:33,105][53268] Updated weights for policy 1, policy_version 16290 (0.0009) [2023-10-10 05:18:33,460][53268] Updated weights for policy 1, policy_version 16300 (0.0009) [2023-10-10 05:18:33,828][53268] Updated weights for policy 1, policy_version 16310 (0.0007) [2023-10-10 05:18:33,901][53252] Updated weights for policy 0, policy_version 16330 (0.0007) [2023-10-10 05:18:34,185][53268] Updated weights for policy 1, policy_version 16320 (0.0008) [2023-10-10 05:18:34,274][53252] Updated weights for policy 0, policy_version 16340 (0.0008) [2023-10-10 05:18:34,643][53252] Updated weights for policy 0, policy_version 16350 (0.0008) [2023-10-10 05:18:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 33456128. Throughput: 0: 1667.6, 1: 1678.3. Samples: 8371632. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 05:18:36,784][52050] Avg episode reward: [(0, '14.020'), (1, '16.060')] [2023-10-10 05:18:38,223][53268] Updated weights for policy 1, policy_version 16330 (0.0007) [2023-10-10 05:18:38,585][53268] Updated weights for policy 1, policy_version 16340 (0.0007) [2023-10-10 05:18:38,894][53252] Updated weights for policy 0, policy_version 16360 (0.0008) [2023-10-10 05:18:38,955][53268] Updated weights for policy 1, policy_version 16350 (0.0007) [2023-10-10 05:18:39,269][53252] Updated weights for policy 0, policy_version 16370 (0.0007) [2023-10-10 05:18:39,634][53252] Updated weights for policy 0, policy_version 16380 (0.0008) [2023-10-10 05:18:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33521664. Throughput: 0: 1671.5, 1: 1684.4. Samples: 8392356. Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) [2023-10-10 05:18:41,784][52050] Avg episode reward: [(0, '15.060'), (1, '15.860')] [2023-10-10 05:18:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000016352_16744448.pth... [2023-10-10 05:18:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000016384_16777216.pth... 
[2023-10-10 05:18:41,822][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000014784_15138816.pth [2023-10-10 05:18:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000014816_15171584.pth [2023-10-10 05:18:43,022][53268] Updated weights for policy 1, policy_version 16360 (0.0008) [2023-10-10 05:18:43,391][53268] Updated weights for policy 1, policy_version 16370 (0.0007) [2023-10-10 05:18:43,660][53252] Updated weights for policy 0, policy_version 16390 (0.0008) [2023-10-10 05:18:43,749][53268] Updated weights for policy 1, policy_version 16380 (0.0008) [2023-10-10 05:18:44,037][53252] Updated weights for policy 0, policy_version 16400 (0.0008) [2023-10-10 05:18:44,398][53252] Updated weights for policy 0, policy_version 16410 (0.0009) [2023-10-10 05:18:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33587200. Throughput: 0: 1652.9, 1: 1671.1. Samples: 8401880. Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) [2023-10-10 05:18:46,784][52050] Avg episode reward: [(0, '15.010'), (1, '16.280')] [2023-10-10 05:18:47,744][53268] Updated weights for policy 1, policy_version 16390 (0.0008) [2023-10-10 05:18:48,104][53268] Updated weights for policy 1, policy_version 16400 (0.0010) [2023-10-10 05:18:48,413][53252] Updated weights for policy 0, policy_version 16420 (0.0007) [2023-10-10 05:18:48,466][53268] Updated weights for policy 1, policy_version 16410 (0.0008) [2023-10-10 05:18:48,788][53252] Updated weights for policy 0, policy_version 16430 (0.0007) [2023-10-10 05:18:49,164][53252] Updated weights for policy 0, policy_version 16440 (0.0008) [2023-10-10 05:18:51,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 33652736. Throughput: 0: 1675.1, 1: 1689.6. Samples: 8422404. 
Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) [2023-10-10 05:18:51,785][52050] Avg episode reward: [(0, '15.530'), (1, '16.380')] [2023-10-10 05:18:52,733][53268] Updated weights for policy 1, policy_version 16420 (0.0008) [2023-10-10 05:18:53,047][53252] Updated weights for policy 0, policy_version 16450 (0.0007) [2023-10-10 05:18:53,131][53268] Updated weights for policy 1, policy_version 16430 (0.0008) [2023-10-10 05:18:53,426][53252] Updated weights for policy 0, policy_version 16460 (0.0008) [2023-10-10 05:18:53,495][53268] Updated weights for policy 1, policy_version 16440 (0.0007) [2023-10-10 05:18:53,788][53252] Updated weights for policy 0, policy_version 16470 (0.0009) [2023-10-10 05:18:54,161][53252] Updated weights for policy 0, policy_version 16480 (0.0008) [2023-10-10 05:18:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33718272. Throughput: 0: 1680.6, 1: 1688.8. Samples: 8443320. Policy #0 lag: (min: 6.0, avg: 10.5, max: 38.0) [2023-10-10 05:18:56,784][52050] Avg episode reward: [(0, '14.730'), (1, '16.340')] [2023-10-10 05:18:57,295][53268] Updated weights for policy 1, policy_version 16450 (0.0009) [2023-10-10 05:18:57,664][53268] Updated weights for policy 1, policy_version 16460 (0.0007) [2023-10-10 05:18:58,018][53268] Updated weights for policy 1, policy_version 16470 (0.0009) [2023-10-10 05:18:58,134][53252] Updated weights for policy 0, policy_version 16490 (0.0008) [2023-10-10 05:18:58,398][53268] Updated weights for policy 1, policy_version 16480 (0.0009) [2023-10-10 05:18:58,501][53252] Updated weights for policy 0, policy_version 16500 (0.0008) [2023-10-10 05:18:58,874][53252] Updated weights for policy 0, policy_version 16510 (0.0010) [2023-10-10 05:19:01,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33783808. Throughput: 0: 1665.4, 1: 1683.4. Samples: 8452346. 
Policy #0 lag: (min: 6.0, avg: 10.5, max: 38.0) [2023-10-10 05:19:01,784][52050] Avg episode reward: [(0, '13.260'), (1, '16.540')] [2023-10-10 05:19:02,321][53268] Updated weights for policy 1, policy_version 16490 (0.0008) [2023-10-10 05:19:02,694][53268] Updated weights for policy 1, policy_version 16500 (0.0008) [2023-10-10 05:19:02,918][53252] Updated weights for policy 0, policy_version 16520 (0.0008) [2023-10-10 05:19:03,059][53268] Updated weights for policy 1, policy_version 16510 (0.0008) [2023-10-10 05:19:03,293][53252] Updated weights for policy 0, policy_version 16530 (0.0007) [2023-10-10 05:19:03,669][53252] Updated weights for policy 0, policy_version 16540 (0.0008) [2023-10-10 05:19:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 33849344. Throughput: 0: 1682.2, 1: 1700.0. Samples: 8473418. Policy #0 lag: (min: 6.0, avg: 10.5, max: 38.0) [2023-10-10 05:19:06,784][52050] Avg episode reward: [(0, '15.400'), (1, '16.090')] [2023-10-10 05:19:07,092][53268] Updated weights for policy 1, policy_version 16520 (0.0008) [2023-10-10 05:19:07,461][53268] Updated weights for policy 1, policy_version 16530 (0.0009) [2023-10-10 05:19:07,606][53252] Updated weights for policy 0, policy_version 16550 (0.0009) [2023-10-10 05:19:07,816][53268] Updated weights for policy 1, policy_version 16540 (0.0009) [2023-10-10 05:19:07,967][53252] Updated weights for policy 0, policy_version 16560 (0.0008) [2023-10-10 05:19:08,343][53252] Updated weights for policy 0, policy_version 16570 (0.0007) [2023-10-10 05:19:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33914880. Throughput: 0: 1688.4, 1: 1698.7. Samples: 8494234. 
Policy #0 lag: (min: 9.0, avg: 11.5, max: 41.0) [2023-10-10 05:19:11,784][52050] Avg episode reward: [(0, '14.510'), (1, '15.920')] [2023-10-10 05:19:11,902][53268] Updated weights for policy 1, policy_version 16550 (0.0008) [2023-10-10 05:19:12,265][53268] Updated weights for policy 1, policy_version 16560 (0.0009) [2023-10-10 05:19:12,338][53252] Updated weights for policy 0, policy_version 16580 (0.0007) [2023-10-10 05:19:12,634][53268] Updated weights for policy 1, policy_version 16570 (0.0008) [2023-10-10 05:19:12,716][53252] Updated weights for policy 0, policy_version 16590 (0.0007) [2023-10-10 05:19:13,085][53252] Updated weights for policy 0, policy_version 16600 (0.0008) [2023-10-10 05:19:16,655][53268] Updated weights for policy 1, policy_version 16580 (0.0008) [2023-10-10 05:19:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 33980416. Throughput: 0: 1681.3, 1: 1692.4. Samples: 8503412. Policy #0 lag: (min: 9.0, avg: 11.5, max: 41.0) [2023-10-10 05:19:16,784][52050] Avg episode reward: [(0, '15.390'), (1, '15.500')] [2023-10-10 05:19:17,024][53268] Updated weights for policy 1, policy_version 16590 (0.0008) [2023-10-10 05:19:17,295][53252] Updated weights for policy 0, policy_version 16610 (0.0010) [2023-10-10 05:19:17,392][53268] Updated weights for policy 1, policy_version 16600 (0.0008) [2023-10-10 05:19:17,671][53252] Updated weights for policy 0, policy_version 16620 (0.0009) [2023-10-10 05:19:18,039][53252] Updated weights for policy 0, policy_version 16630 (0.0011) [2023-10-10 05:19:18,402][53252] Updated weights for policy 0, policy_version 16640 (0.0009) [2023-10-10 05:19:21,467][53268] Updated weights for policy 1, policy_version 16610 (0.0007) [2023-10-10 05:19:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 34045952. Throughput: 0: 1688.7, 1: 1697.3. Samples: 8524002. 
Policy #0 lag: (min: 9.0, avg: 11.5, max: 41.0) [2023-10-10 05:19:21,784][52050] Avg episode reward: [(0, '13.780'), (1, '15.510')] [2023-10-10 05:19:21,830][53268] Updated weights for policy 1, policy_version 16620 (0.0010) [2023-10-10 05:19:22,203][53268] Updated weights for policy 1, policy_version 16630 (0.0008) [2023-10-10 05:19:22,372][53252] Updated weights for policy 0, policy_version 16650 (0.0007) [2023-10-10 05:19:22,566][53268] Updated weights for policy 1, policy_version 16640 (0.0008) [2023-10-10 05:19:22,752][53252] Updated weights for policy 0, policy_version 16660 (0.0009) [2023-10-10 05:19:23,133][53252] Updated weights for policy 0, policy_version 16670 (0.0008) [2023-10-10 05:19:26,716][53268] Updated weights for policy 1, policy_version 16650 (0.0007) [2023-10-10 05:19:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 34111488. Throughput: 0: 1693.2, 1: 1694.8. Samples: 8544816. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:19:26,784][52050] Avg episode reward: [(0, '13.510'), (1, '15.510')] [2023-10-10 05:19:27,089][53268] Updated weights for policy 1, policy_version 16660 (0.0008) [2023-10-10 05:19:27,197][53252] Updated weights for policy 0, policy_version 16680 (0.0010) [2023-10-10 05:19:27,458][53268] Updated weights for policy 1, policy_version 16670 (0.0008) [2023-10-10 05:19:27,573][53252] Updated weights for policy 0, policy_version 16690 (0.0009) [2023-10-10 05:19:27,956][53252] Updated weights for policy 0, policy_version 16700 (0.0009) [2023-10-10 05:19:31,640][53268] Updated weights for policy 1, policy_version 16680 (0.0008) [2023-10-10 05:19:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34177024. Throughput: 0: 1683.2, 1: 1690.5. Samples: 8553696. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:19:31,784][52050] Avg episode reward: [(0, '14.320'), (1, '15.500')] [2023-10-10 05:19:32,014][53268] Updated weights for policy 1, policy_version 16690 (0.0008) [2023-10-10 05:19:32,038][53252] Updated weights for policy 0, policy_version 16710 (0.0010) [2023-10-10 05:19:32,381][53268] Updated weights for policy 1, policy_version 16700 (0.0009) [2023-10-10 05:19:32,414][53252] Updated weights for policy 0, policy_version 16720 (0.0008) [2023-10-10 05:19:32,790][53252] Updated weights for policy 0, policy_version 16730 (0.0011) [2023-10-10 05:19:36,438][53268] Updated weights for policy 1, policy_version 16710 (0.0009) [2023-10-10 05:19:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34242560. Throughput: 0: 1689.6, 1: 1687.2. Samples: 8574358. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:19:36,784][52050] Avg episode reward: [(0, '15.360'), (1, '14.850')] [2023-10-10 05:19:36,803][53268] Updated weights for policy 1, policy_version 16720 (0.0009) [2023-10-10 05:19:37,026][53252] Updated weights for policy 0, policy_version 16740 (0.0008) [2023-10-10 05:19:37,176][53268] Updated weights for policy 1, policy_version 16730 (0.0009) [2023-10-10 05:19:37,396][53252] Updated weights for policy 0, policy_version 16750 (0.0009) [2023-10-10 05:19:37,766][53252] Updated weights for policy 0, policy_version 16760 (0.0008) [2023-10-10 05:19:41,305][53268] Updated weights for policy 1, policy_version 16740 (0.0009) [2023-10-10 05:19:41,696][53268] Updated weights for policy 1, policy_version 16750 (0.0010) [2023-10-10 05:19:41,752][53252] Updated weights for policy 0, policy_version 16770 (0.0007) [2023-10-10 05:19:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34308096. Throughput: 0: 1691.1, 1: 1680.6. Samples: 8595046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:19:41,784][52050] Avg episode reward: [(0, '16.790'), (1, '15.450')] [2023-10-10 05:19:42,064][53268] Updated weights for policy 1, policy_version 16760 (0.0009) [2023-10-10 05:19:42,122][53252] Updated weights for policy 0, policy_version 16780 (0.0008) [2023-10-10 05:19:42,504][53252] Updated weights for policy 0, policy_version 16790 (0.0009) [2023-10-10 05:19:42,870][53252] Updated weights for policy 0, policy_version 16800 (0.0008) [2023-10-10 05:19:46,033][53268] Updated weights for policy 1, policy_version 16770 (0.0009) [2023-10-10 05:19:46,402][53268] Updated weights for policy 1, policy_version 16780 (0.0008) [2023-10-10 05:19:46,782][53268] Updated weights for policy 1, policy_version 16790 (0.0010) [2023-10-10 05:19:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34373632. Throughput: 0: 1692.5, 1: 1680.0. Samples: 8604106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:19:46,784][52050] Avg episode reward: [(0, '16.650'), (1, '15.080')] [2023-10-10 05:19:46,829][53252] Updated weights for policy 0, policy_version 16810 (0.0008) [2023-10-10 05:19:47,143][53268] Updated weights for policy 1, policy_version 16800 (0.0009) [2023-10-10 05:19:47,200][53252] Updated weights for policy 0, policy_version 16820 (0.0009) [2023-10-10 05:19:47,563][53252] Updated weights for policy 0, policy_version 16830 (0.0011) [2023-10-10 05:19:51,291][53268] Updated weights for policy 1, policy_version 16810 (0.0009) [2023-10-10 05:19:51,611][53252] Updated weights for policy 0, policy_version 16840 (0.0008) [2023-10-10 05:19:51,652][53268] Updated weights for policy 1, policy_version 16820 (0.0010) [2023-10-10 05:19:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 34439168. Throughput: 0: 1689.4, 1: 1671.6. Samples: 8624660. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:19:51,784][52050] Avg episode reward: [(0, '16.190'), (1, '14.590')] [2023-10-10 05:19:51,985][53252] Updated weights for policy 0, policy_version 16850 (0.0008) [2023-10-10 05:19:52,018][53268] Updated weights for policy 1, policy_version 16830 (0.0008) [2023-10-10 05:19:52,365][53252] Updated weights for policy 0, policy_version 16860 (0.0008) [2023-10-10 05:19:56,000][53268] Updated weights for policy 1, policy_version 16840 (0.0007) [2023-10-10 05:19:56,369][53268] Updated weights for policy 1, policy_version 16850 (0.0008) [2023-10-10 05:19:56,449][53252] Updated weights for policy 0, policy_version 16870 (0.0009) [2023-10-10 05:19:56,734][53268] Updated weights for policy 1, policy_version 16860 (0.0009) [2023-10-10 05:19:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34504704. Throughput: 0: 1684.1, 1: 1664.5. Samples: 8644922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:19:56,784][52050] Avg episode reward: [(0, '16.210'), (1, '16.400')] [2023-10-10 05:19:56,823][53252] Updated weights for policy 0, policy_version 16880 (0.0007) [2023-10-10 05:19:57,197][53252] Updated weights for policy 0, policy_version 16890 (0.0010) [2023-10-10 05:20:00,876][53268] Updated weights for policy 1, policy_version 16870 (0.0008) [2023-10-10 05:20:01,250][53268] Updated weights for policy 1, policy_version 16880 (0.0008) [2023-10-10 05:20:01,445][53252] Updated weights for policy 0, policy_version 16900 (0.0009) [2023-10-10 05:20:01,605][53268] Updated weights for policy 1, policy_version 16890 (0.0009) [2023-10-10 05:20:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 34570240. Throughput: 0: 1687.2, 1: 1674.4. Samples: 8654684. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:20:01,784][52050] Avg episode reward: [(0, '15.230'), (1, '15.110')] [2023-10-10 05:20:01,822][53252] Updated weights for policy 0, policy_version 16910 (0.0008) [2023-10-10 05:20:02,195][53252] Updated weights for policy 0, policy_version 16920 (0.0010) [2023-10-10 05:20:05,616][53268] Updated weights for policy 1, policy_version 16900 (0.0008) [2023-10-10 05:20:05,989][53268] Updated weights for policy 1, policy_version 16910 (0.0010) [2023-10-10 05:20:06,217][53252] Updated weights for policy 0, policy_version 16930 (0.0009) [2023-10-10 05:20:06,364][53268] Updated weights for policy 1, policy_version 16920 (0.0008) [2023-10-10 05:20:06,579][53252] Updated weights for policy 0, policy_version 16940 (0.0008) [2023-10-10 05:20:06,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 34668544. Throughput: 0: 1687.1, 1: 1673.4. Samples: 8675226. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:20:06,784][52050] Avg episode reward: [(0, '14.950'), (1, '14.640')] [2023-10-10 05:20:06,954][53252] Updated weights for policy 0, policy_version 16950 (0.0008) [2023-10-10 05:20:07,329][53252] Updated weights for policy 0, policy_version 16960 (0.0009) [2023-10-10 05:20:10,530][53268] Updated weights for policy 1, policy_version 16930 (0.0008) [2023-10-10 05:20:10,892][53268] Updated weights for policy 1, policy_version 16940 (0.0008) [2023-10-10 05:20:11,271][53268] Updated weights for policy 1, policy_version 16950 (0.0010) [2023-10-10 05:20:11,526][53252] Updated weights for policy 0, policy_version 16970 (0.0007) [2023-10-10 05:20:11,635][53268] Updated weights for policy 1, policy_version 16960 (0.0008) [2023-10-10 05:20:11,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 34734080. Throughput: 0: 1679.5, 1: 1652.5. Samples: 8694756. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) [2023-10-10 05:20:11,784][52050] Avg episode reward: [(0, '16.070'), (1, '16.240')] [2023-10-10 05:20:11,894][53252] Updated weights for policy 0, policy_version 16980 (0.0008) [2023-10-10 05:20:12,259][53252] Updated weights for policy 0, policy_version 16990 (0.0008) [2023-10-10 05:20:15,666][53268] Updated weights for policy 1, policy_version 16970 (0.0007) [2023-10-10 05:20:16,030][53268] Updated weights for policy 1, policy_version 16980 (0.0008) [2023-10-10 05:20:16,361][53252] Updated weights for policy 0, policy_version 17000 (0.0009) [2023-10-10 05:20:16,404][53268] Updated weights for policy 1, policy_version 16990 (0.0008) [2023-10-10 05:20:16,732][53252] Updated weights for policy 0, policy_version 17010 (0.0010) [2023-10-10 05:20:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 34799616. Throughput: 0: 1687.5, 1: 1674.1. Samples: 8704966. Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) [2023-10-10 05:20:16,784][52050] Avg episode reward: [(0, '15.530'), (1, '15.270')] [2023-10-10 05:20:17,112][53252] Updated weights for policy 0, policy_version 17020 (0.0009) [2023-10-10 05:20:20,691][53268] Updated weights for policy 1, policy_version 17000 (0.0009) [2023-10-10 05:20:21,061][53268] Updated weights for policy 1, policy_version 17010 (0.0009) [2023-10-10 05:20:21,201][53252] Updated weights for policy 0, policy_version 17030 (0.0009) [2023-10-10 05:20:21,424][53268] Updated weights for policy 1, policy_version 17020 (0.0007) [2023-10-10 05:20:21,568][53252] Updated weights for policy 0, policy_version 17040 (0.0008) [2023-10-10 05:20:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 34865152. Throughput: 0: 1684.8, 1: 1675.9. Samples: 8725586. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) [2023-10-10 05:20:21,784][52050] Avg episode reward: [(0, '16.100'), (1, '16.150')] [2023-10-10 05:20:21,942][53252] Updated weights for policy 0, policy_version 17050 (0.0010) [2023-10-10 05:20:25,759][53268] Updated weights for policy 1, policy_version 17030 (0.0009) [2023-10-10 05:20:25,998][53252] Updated weights for policy 0, policy_version 17060 (0.0010) [2023-10-10 05:20:26,141][53268] Updated weights for policy 1, policy_version 17040 (0.0007) [2023-10-10 05:20:26,359][53252] Updated weights for policy 0, policy_version 17070 (0.0008) [2023-10-10 05:20:26,501][53268] Updated weights for policy 1, policy_version 17050 (0.0007) [2023-10-10 05:20:26,740][53252] Updated weights for policy 0, policy_version 17080 (0.0007) [2023-10-10 05:20:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 34930688. Throughput: 0: 1669.0, 1: 1660.8. Samples: 8744886. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:20:26,784][52050] Avg episode reward: [(0, '16.790'), (1, '15.710')] [2023-10-10 05:20:30,436][53268] Updated weights for policy 1, policy_version 17060 (0.0008) [2023-10-10 05:20:30,798][53268] Updated weights for policy 1, policy_version 17070 (0.0008) [2023-10-10 05:20:30,884][53252] Updated weights for policy 0, policy_version 17090 (0.0007) [2023-10-10 05:20:31,165][53268] Updated weights for policy 1, policy_version 17080 (0.0007) [2023-10-10 05:20:31,254][53252] Updated weights for policy 0, policy_version 17100 (0.0008) [2023-10-10 05:20:31,613][53252] Updated weights for policy 0, policy_version 17110 (0.0009) [2023-10-10 05:20:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 34996224. Throughput: 0: 1678.3, 1: 1676.4. Samples: 8755068. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:20:31,784][52050] Avg episode reward: [(0, '17.450'), (1, '16.910')] [2023-10-10 05:20:31,985][52846] Saving new best policy, reward=17.450! [2023-10-10 05:20:31,986][53252] Updated weights for policy 0, policy_version 17120 (0.0011) [2023-10-10 05:20:35,287][53268] Updated weights for policy 1, policy_version 17090 (0.0008) [2023-10-10 05:20:35,656][53268] Updated weights for policy 1, policy_version 17100 (0.0007) [2023-10-10 05:20:35,959][53252] Updated weights for policy 0, policy_version 17130 (0.0008) [2023-10-10 05:20:36,020][53268] Updated weights for policy 1, policy_version 17110 (0.0007) [2023-10-10 05:20:36,327][53252] Updated weights for policy 0, policy_version 17140 (0.0007) [2023-10-10 05:20:36,377][53268] Updated weights for policy 1, policy_version 17120 (0.0008) [2023-10-10 05:20:36,691][53252] Updated weights for policy 0, policy_version 17150 (0.0008) [2023-10-10 05:20:36,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35094528. Throughput: 0: 1679.9, 1: 1677.6. Samples: 8775750. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 05:20:36,784][52050] Avg episode reward: [(0, '16.620'), (1, '15.740')] [2023-10-10 05:20:40,455][53268] Updated weights for policy 1, policy_version 17130 (0.0011) [2023-10-10 05:20:40,642][53252] Updated weights for policy 0, policy_version 17160 (0.0009) [2023-10-10 05:20:40,823][53268] Updated weights for policy 1, policy_version 17140 (0.0009) [2023-10-10 05:20:41,006][53252] Updated weights for policy 0, policy_version 17170 (0.0008) [2023-10-10 05:20:41,193][53268] Updated weights for policy 1, policy_version 17150 (0.0008) [2023-10-10 05:20:41,375][53252] Updated weights for policy 0, policy_version 17180 (0.0007) [2023-10-10 05:20:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35160064. Throughput: 0: 1660.9, 1: 1660.3. Samples: 8794374. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 05:20:41,784][52050] Avg episode reward: [(0, '17.690'), (1, '14.920')] [2023-10-10 05:20:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000017184_17596416.pth... [2023-10-10 05:20:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000017152_17563648.pth... [2023-10-10 05:20:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000015616_15990784.pth [2023-10-10 05:20:41,827][52846] Saving new best policy, reward=17.690! [2023-10-10 05:20:41,830][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000015584_15958016.pth [2023-10-10 05:20:45,240][53268] Updated weights for policy 1, policy_version 17160 (0.0009) [2023-10-10 05:20:45,395][53252] Updated weights for policy 0, policy_version 17190 (0.0008) [2023-10-10 05:20:45,592][53268] Updated weights for policy 1, policy_version 17170 (0.0009) [2023-10-10 05:20:45,766][53252] Updated weights for policy 0, policy_version 17200 (0.0007) [2023-10-10 05:20:45,964][53268] Updated weights for policy 1, policy_version 17180 (0.0008) [2023-10-10 05:20:46,138][53252] Updated weights for policy 0, policy_version 17210 (0.0008) [2023-10-10 05:20:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 35225600. Throughput: 0: 1681.9, 1: 1671.8. Samples: 8805600. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 05:20:46,784][52050] Avg episode reward: [(0, '17.840'), (1, '16.690')] [2023-10-10 05:20:46,786][52846] Saving new best policy, reward=17.840! 
[2023-10-10 05:20:50,049][53268] Updated weights for policy 1, policy_version 17190 (0.0008) [2023-10-10 05:20:50,117][53252] Updated weights for policy 0, policy_version 17220 (0.0009) [2023-10-10 05:20:50,417][53268] Updated weights for policy 1, policy_version 17200 (0.0009) [2023-10-10 05:20:50,491][53252] Updated weights for policy 0, policy_version 17230 (0.0007) [2023-10-10 05:20:50,786][53268] Updated weights for policy 1, policy_version 17210 (0.0009) [2023-10-10 05:20:50,862][53252] Updated weights for policy 0, policy_version 17240 (0.0009) [2023-10-10 05:20:51,784][52050] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 35291136. Throughput: 0: 1673.9, 1: 1668.3. Samples: 8825626. Policy #0 lag: (min: 26.0, avg: 31.6, max: 58.0) [2023-10-10 05:20:51,785][52050] Avg episode reward: [(0, '16.350'), (1, '15.530')] [2023-10-10 05:20:54,763][53252] Updated weights for policy 0, policy_version 17250 (0.0010) [2023-10-10 05:20:54,957][53268] Updated weights for policy 1, policy_version 17220 (0.0008) [2023-10-10 05:20:55,125][53252] Updated weights for policy 0, policy_version 17260 (0.0009) [2023-10-10 05:20:55,327][53268] Updated weights for policy 1, policy_version 17230 (0.0008) [2023-10-10 05:20:55,494][53252] Updated weights for policy 0, policy_version 17270 (0.0007) [2023-10-10 05:20:55,704][53268] Updated weights for policy 1, policy_version 17240 (0.0007) [2023-10-10 05:20:55,866][53252] Updated weights for policy 0, policy_version 17280 (0.0007) [2023-10-10 05:20:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 35356672. Throughput: 0: 1668.5, 1: 1664.0. Samples: 8844720. 
Policy #0 lag: (min: 26.0, avg: 31.6, max: 58.0) [2023-10-10 05:20:56,784][52050] Avg episode reward: [(0, '15.860'), (1, '15.100')] [2023-10-10 05:20:59,724][53268] Updated weights for policy 1, policy_version 17250 (0.0009) [2023-10-10 05:21:00,074][53252] Updated weights for policy 0, policy_version 17290 (0.0008) [2023-10-10 05:21:00,084][53268] Updated weights for policy 1, policy_version 17260 (0.0008) [2023-10-10 05:21:00,450][53252] Updated weights for policy 0, policy_version 17300 (0.0007) [2023-10-10 05:21:00,454][53268] Updated weights for policy 1, policy_version 17270 (0.0009) [2023-10-10 05:21:00,817][53252] Updated weights for policy 0, policy_version 17310 (0.0007) [2023-10-10 05:21:00,826][53268] Updated weights for policy 1, policy_version 17280 (0.0009) [2023-10-10 05:21:01,783][52050] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 35422208. Throughput: 0: 1689.1, 1: 1675.0. Samples: 8856352. Policy #0 lag: (min: 26.0, avg: 31.6, max: 58.0) [2023-10-10 05:21:01,784][52050] Avg episode reward: [(0, '14.350'), (1, '15.430')] [2023-10-10 05:21:04,697][53268] Updated weights for policy 1, policy_version 17290 (0.0010) [2023-10-10 05:21:04,963][53252] Updated weights for policy 0, policy_version 17320 (0.0007) [2023-10-10 05:21:05,062][53268] Updated weights for policy 1, policy_version 17300 (0.0009) [2023-10-10 05:21:05,328][53252] Updated weights for policy 0, policy_version 17330 (0.0007) [2023-10-10 05:21:05,434][53268] Updated weights for policy 1, policy_version 17310 (0.0010) [2023-10-10 05:21:05,700][53252] Updated weights for policy 0, policy_version 17340 (0.0009) [2023-10-10 05:21:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35487744. Throughput: 0: 1674.1, 1: 1664.7. Samples: 8875834. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:21:06,784][52050] Avg episode reward: [(0, '14.630'), (1, '15.910')] [2023-10-10 05:21:09,474][53268] Updated weights for policy 1, policy_version 17320 (0.0008) [2023-10-10 05:21:09,750][53252] Updated weights for policy 0, policy_version 17350 (0.0007) [2023-10-10 05:21:09,836][53268] Updated weights for policy 1, policy_version 17330 (0.0009) [2023-10-10 05:21:10,126][53252] Updated weights for policy 0, policy_version 17360 (0.0009) [2023-10-10 05:21:10,207][53268] Updated weights for policy 1, policy_version 17340 (0.0008) [2023-10-10 05:21:10,487][53252] Updated weights for policy 0, policy_version 17370 (0.0010) [2023-10-10 05:21:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35553280. Throughput: 0: 1671.2, 1: 1675.2. Samples: 8895476. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:21:11,784][52050] Avg episode reward: [(0, '15.430'), (1, '15.800')] [2023-10-10 05:21:14,300][53268] Updated weights for policy 1, policy_version 17350 (0.0008) [2023-10-10 05:21:14,610][53252] Updated weights for policy 0, policy_version 17380 (0.0009) [2023-10-10 05:21:14,677][53268] Updated weights for policy 1, policy_version 17360 (0.0010) [2023-10-10 05:21:14,981][53252] Updated weights for policy 0, policy_version 17390 (0.0008) [2023-10-10 05:21:15,041][53268] Updated weights for policy 1, policy_version 17370 (0.0009) [2023-10-10 05:21:15,350][53252] Updated weights for policy 0, policy_version 17400 (0.0008) [2023-10-10 05:21:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35618816. Throughput: 0: 1689.5, 1: 1687.6. Samples: 8907038. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:21:16,784][52050] Avg episode reward: [(0, '14.900'), (1, '15.540')] [2023-10-10 05:21:19,165][53268] Updated weights for policy 1, policy_version 17380 (0.0008) [2023-10-10 05:21:19,410][53252] Updated weights for policy 0, policy_version 17410 (0.0008) [2023-10-10 05:21:19,534][53268] Updated weights for policy 1, policy_version 17390 (0.0008) [2023-10-10 05:21:19,781][53252] Updated weights for policy 0, policy_version 17420 (0.0007) [2023-10-10 05:21:19,905][53268] Updated weights for policy 1, policy_version 17400 (0.0008) [2023-10-10 05:21:20,158][53252] Updated weights for policy 0, policy_version 17430 (0.0007) [2023-10-10 05:21:20,530][53252] Updated weights for policy 0, policy_version 17440 (0.0007) [2023-10-10 05:21:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35684352. Throughput: 0: 1666.5, 1: 1663.5. Samples: 8925598. Policy #0 lag: (min: 31.0, avg: 32.8, max: 61.0) [2023-10-10 05:21:21,784][52050] Avg episode reward: [(0, '15.890'), (1, '14.550')] [2023-10-10 05:21:23,909][53268] Updated weights for policy 1, policy_version 17410 (0.0009) [2023-10-10 05:21:24,271][53268] Updated weights for policy 1, policy_version 17420 (0.0008) [2023-10-10 05:21:24,642][53268] Updated weights for policy 1, policy_version 17430 (0.0009) [2023-10-10 05:21:24,654][53252] Updated weights for policy 0, policy_version 17450 (0.0008) [2023-10-10 05:21:25,020][53268] Updated weights for policy 1, policy_version 17440 (0.0007) [2023-10-10 05:21:25,037][53252] Updated weights for policy 0, policy_version 17460 (0.0008) [2023-10-10 05:21:25,413][53252] Updated weights for policy 0, policy_version 17470 (0.0008) [2023-10-10 05:21:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35749888. Throughput: 0: 1680.6, 1: 1681.2. Samples: 8945652. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 61.0) [2023-10-10 05:21:26,784][52050] Avg episode reward: [(0, '15.510'), (1, '14.610')] [2023-10-10 05:21:28,990][53268] Updated weights for policy 1, policy_version 17450 (0.0008) [2023-10-10 05:21:29,356][53268] Updated weights for policy 1, policy_version 17460 (0.0009) [2023-10-10 05:21:29,495][53252] Updated weights for policy 0, policy_version 17480 (0.0009) [2023-10-10 05:21:29,730][53268] Updated weights for policy 1, policy_version 17470 (0.0007) [2023-10-10 05:21:29,874][53252] Updated weights for policy 0, policy_version 17490 (0.0008) [2023-10-10 05:21:30,240][53252] Updated weights for policy 0, policy_version 17500 (0.0007) [2023-10-10 05:21:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 35815424. Throughput: 0: 1680.8, 1: 1674.3. Samples: 8956578. Policy #0 lag: (min: 31.0, avg: 32.8, max: 61.0) [2023-10-10 05:21:31,784][52050] Avg episode reward: [(0, '15.750'), (1, '14.230')] [2023-10-10 05:21:33,713][53268] Updated weights for policy 1, policy_version 17480 (0.0008) [2023-10-10 05:21:34,087][53268] Updated weights for policy 1, policy_version 17490 (0.0009) [2023-10-10 05:21:34,173][53252] Updated weights for policy 0, policy_version 17510 (0.0009) [2023-10-10 05:21:34,458][53268] Updated weights for policy 1, policy_version 17500 (0.0008) [2023-10-10 05:21:34,549][53252] Updated weights for policy 0, policy_version 17520 (0.0011) [2023-10-10 05:21:34,909][53252] Updated weights for policy 0, policy_version 17530 (0.0008) [2023-10-10 05:21:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 35880960. Throughput: 0: 1664.5, 1: 1665.5. Samples: 8975476. 
Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) [2023-10-10 05:21:36,785][52050] Avg episode reward: [(0, '15.980'), (1, '14.610')] [2023-10-10 05:21:38,647][53268] Updated weights for policy 1, policy_version 17510 (0.0007) [2023-10-10 05:21:39,013][53268] Updated weights for policy 1, policy_version 17520 (0.0008) [2023-10-10 05:21:39,029][53252] Updated weights for policy 0, policy_version 17540 (0.0008) [2023-10-10 05:21:39,383][53268] Updated weights for policy 1, policy_version 17530 (0.0007) [2023-10-10 05:21:39,391][53252] Updated weights for policy 0, policy_version 17550 (0.0007) [2023-10-10 05:21:39,762][53252] Updated weights for policy 0, policy_version 17560 (0.0008) [2023-10-10 05:21:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 35946496. Throughput: 0: 1677.0, 1: 1690.7. Samples: 8996266. Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) [2023-10-10 05:21:41,784][52050] Avg episode reward: [(0, '15.790'), (1, '16.170')] [2023-10-10 05:21:43,491][53268] Updated weights for policy 1, policy_version 17540 (0.0008) [2023-10-10 05:21:43,679][53252] Updated weights for policy 0, policy_version 17570 (0.0007) [2023-10-10 05:21:43,857][53268] Updated weights for policy 1, policy_version 17550 (0.0008) [2023-10-10 05:21:44,053][53252] Updated weights for policy 0, policy_version 17580 (0.0007) [2023-10-10 05:21:44,223][53268] Updated weights for policy 1, policy_version 17560 (0.0007) [2023-10-10 05:21:44,421][53252] Updated weights for policy 0, policy_version 17590 (0.0007) [2023-10-10 05:21:44,788][53252] Updated weights for policy 0, policy_version 17600 (0.0007) [2023-10-10 05:21:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 36012032. Throughput: 0: 1658.4, 1: 1671.2. Samples: 9006186. 
Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) [2023-10-10 05:21:46,784][52050] Avg episode reward: [(0, '16.230'), (1, '15.140')] [2023-10-10 05:21:48,316][53268] Updated weights for policy 1, policy_version 17570 (0.0008) [2023-10-10 05:21:48,686][53268] Updated weights for policy 1, policy_version 17580 (0.0008) [2023-10-10 05:21:48,950][53252] Updated weights for policy 0, policy_version 17610 (0.0007) [2023-10-10 05:21:49,051][53268] Updated weights for policy 1, policy_version 17590 (0.0007) [2023-10-10 05:21:49,325][53252] Updated weights for policy 0, policy_version 17620 (0.0008) [2023-10-10 05:21:49,415][53268] Updated weights for policy 1, policy_version 17600 (0.0007) [2023-10-10 05:21:49,698][53252] Updated weights for policy 0, policy_version 17630 (0.0008) [2023-10-10 05:21:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 36077568. Throughput: 0: 1664.4, 1: 1673.0. Samples: 9026014. Policy #0 lag: (min: 24.0, avg: 42.0, max: 56.0) [2023-10-10 05:21:51,784][52050] Avg episode reward: [(0, '16.220'), (1, '14.970')] [2023-10-10 05:21:53,644][53268] Updated weights for policy 1, policy_version 17610 (0.0007) [2023-10-10 05:21:53,797][53252] Updated weights for policy 0, policy_version 17640 (0.0008) [2023-10-10 05:21:54,000][53268] Updated weights for policy 1, policy_version 17620 (0.0010) [2023-10-10 05:21:54,164][53252] Updated weights for policy 0, policy_version 17650 (0.0010) [2023-10-10 05:21:54,377][53268] Updated weights for policy 1, policy_version 17630 (0.0007) [2023-10-10 05:21:54,535][53252] Updated weights for policy 0, policy_version 17660 (0.0010) [2023-10-10 05:21:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 36143104. Throughput: 0: 1677.1, 1: 1679.3. Samples: 9046514. 
Policy #0 lag: (min: 24.0, avg: 42.0, max: 56.0) [2023-10-10 05:21:56,784][52050] Avg episode reward: [(0, '16.150'), (1, '16.360')] [2023-10-10 05:21:58,482][53252] Updated weights for policy 0, policy_version 17670 (0.0009) [2023-10-10 05:21:58,582][53268] Updated weights for policy 1, policy_version 17640 (0.0008) [2023-10-10 05:21:58,848][53252] Updated weights for policy 0, policy_version 17680 (0.0008) [2023-10-10 05:21:58,943][53268] Updated weights for policy 1, policy_version 17650 (0.0009) [2023-10-10 05:21:59,211][53252] Updated weights for policy 0, policy_version 17690 (0.0008) [2023-10-10 05:21:59,311][53268] Updated weights for policy 1, policy_version 17660 (0.0008) [2023-10-10 05:22:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36208640. Throughput: 0: 1655.7, 1: 1654.1. Samples: 9055976. Policy #0 lag: (min: 24.0, avg: 42.0, max: 56.0) [2023-10-10 05:22:01,784][52050] Avg episode reward: [(0, '16.450'), (1, '14.760')] [2023-10-10 05:22:03,401][53268] Updated weights for policy 1, policy_version 17670 (0.0007) [2023-10-10 05:22:03,437][53252] Updated weights for policy 0, policy_version 17700 (0.0008) [2023-10-10 05:22:03,762][53268] Updated weights for policy 1, policy_version 17680 (0.0008) [2023-10-10 05:22:03,797][53252] Updated weights for policy 0, policy_version 17710 (0.0008) [2023-10-10 05:22:04,127][53268] Updated weights for policy 1, policy_version 17690 (0.0009) [2023-10-10 05:22:04,173][53252] Updated weights for policy 0, policy_version 17720 (0.0009) [2023-10-10 05:22:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36274176. Throughput: 0: 1676.2, 1: 1673.3. Samples: 9076326. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-10 05:22:06,784][52050] Avg episode reward: [(0, '14.670'), (1, '15.240')] [2023-10-10 05:22:08,243][53268] Updated weights for policy 1, policy_version 17700 (0.0008) [2023-10-10 05:22:08,293][53252] Updated weights for policy 0, policy_version 17730 (0.0008) [2023-10-10 05:22:08,610][53268] Updated weights for policy 1, policy_version 17710 (0.0010) [2023-10-10 05:22:08,661][53252] Updated weights for policy 0, policy_version 17740 (0.0008) [2023-10-10 05:22:08,984][53268] Updated weights for policy 1, policy_version 17720 (0.0009) [2023-10-10 05:22:09,035][53252] Updated weights for policy 0, policy_version 17750 (0.0010) [2023-10-10 05:22:09,395][53252] Updated weights for policy 0, policy_version 17760 (0.0009) [2023-10-10 05:22:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36339712. Throughput: 0: 1680.2, 1: 1678.0. Samples: 9096768. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-10 05:22:11,784][52050] Avg episode reward: [(0, '14.660'), (1, '14.860')] [2023-10-10 05:22:13,017][53268] Updated weights for policy 1, policy_version 17730 (0.0007) [2023-10-10 05:22:13,377][53268] Updated weights for policy 1, policy_version 17740 (0.0009) [2023-10-10 05:22:13,431][53252] Updated weights for policy 0, policy_version 17770 (0.0007) [2023-10-10 05:22:13,760][53268] Updated weights for policy 1, policy_version 17750 (0.0009) [2023-10-10 05:22:13,797][53252] Updated weights for policy 0, policy_version 17780 (0.0008) [2023-10-10 05:22:14,119][53268] Updated weights for policy 1, policy_version 17760 (0.0008) [2023-10-10 05:22:14,177][53252] Updated weights for policy 0, policy_version 17790 (0.0007) [2023-10-10 05:22:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36405248. Throughput: 0: 1657.6, 1: 1661.8. Samples: 9105954. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-10 05:22:16,784][52050] Avg episode reward: [(0, '15.710'), (1, '13.980')] [2023-10-10 05:22:18,128][53252] Updated weights for policy 0, policy_version 17800 (0.0007) [2023-10-10 05:22:18,192][53268] Updated weights for policy 1, policy_version 17770 (0.0009) [2023-10-10 05:22:18,492][53252] Updated weights for policy 0, policy_version 17810 (0.0009) [2023-10-10 05:22:18,555][53268] Updated weights for policy 1, policy_version 17780 (0.0008) [2023-10-10 05:22:18,867][53252] Updated weights for policy 0, policy_version 17820 (0.0008) [2023-10-10 05:22:18,921][53268] Updated weights for policy 1, policy_version 17790 (0.0009) [2023-10-10 05:22:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36470784. Throughput: 0: 1686.4, 1: 1676.9. Samples: 9126826. Policy #0 lag: (min: 18.0, avg: 21.3, max: 50.0) [2023-10-10 05:22:21,784][52050] Avg episode reward: [(0, '16.280'), (1, '15.200')] [2023-10-10 05:22:23,007][53268] Updated weights for policy 1, policy_version 17800 (0.0008) [2023-10-10 05:22:23,094][53252] Updated weights for policy 0, policy_version 17830 (0.0008) [2023-10-10 05:22:23,375][53268] Updated weights for policy 1, policy_version 17810 (0.0007) [2023-10-10 05:22:23,455][53252] Updated weights for policy 0, policy_version 17840 (0.0008) [2023-10-10 05:22:23,742][53268] Updated weights for policy 1, policy_version 17820 (0.0008) [2023-10-10 05:22:23,838][53252] Updated weights for policy 0, policy_version 17850 (0.0007) [2023-10-10 05:22:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36536320. Throughput: 0: 1682.3, 1: 1668.1. Samples: 9147038. 
Policy #0 lag: (min: 18.0, avg: 21.3, max: 50.0) [2023-10-10 05:22:26,785][52050] Avg episode reward: [(0, '16.310'), (1, '14.920')] [2023-10-10 05:22:27,883][53268] Updated weights for policy 1, policy_version 17830 (0.0009) [2023-10-10 05:22:27,988][53252] Updated weights for policy 0, policy_version 17860 (0.0009) [2023-10-10 05:22:28,254][53268] Updated weights for policy 1, policy_version 17840 (0.0009) [2023-10-10 05:22:28,361][53252] Updated weights for policy 0, policy_version 17870 (0.0010) [2023-10-10 05:22:28,620][53268] Updated weights for policy 1, policy_version 17850 (0.0010) [2023-10-10 05:22:28,728][53252] Updated weights for policy 0, policy_version 17880 (0.0009) [2023-10-10 05:22:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36601856. Throughput: 0: 1672.3, 1: 1658.6. Samples: 9156076. Policy #0 lag: (min: 18.0, avg: 21.3, max: 50.0) [2023-10-10 05:22:31,784][52050] Avg episode reward: [(0, '16.190'), (1, '15.070')] [2023-10-10 05:22:32,801][53252] Updated weights for policy 0, policy_version 17890 (0.0009) [2023-10-10 05:22:32,834][53268] Updated weights for policy 1, policy_version 17860 (0.0007) [2023-10-10 05:22:33,179][53252] Updated weights for policy 0, policy_version 17900 (0.0007) [2023-10-10 05:22:33,210][53268] Updated weights for policy 1, policy_version 17870 (0.0009) [2023-10-10 05:22:33,553][53252] Updated weights for policy 0, policy_version 17910 (0.0007) [2023-10-10 05:22:33,573][53268] Updated weights for policy 1, policy_version 17880 (0.0009) [2023-10-10 05:22:33,922][53252] Updated weights for policy 0, policy_version 17920 (0.0009) [2023-10-10 05:22:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36667392. Throughput: 0: 1678.7, 1: 1667.3. Samples: 9176582. 
Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 05:22:36,784][52050] Avg episode reward: [(0, '15.440'), (1, '15.030')] [2023-10-10 05:22:37,496][53268] Updated weights for policy 1, policy_version 17890 (0.0007) [2023-10-10 05:22:37,871][53268] Updated weights for policy 1, policy_version 17900 (0.0010) [2023-10-10 05:22:38,103][53252] Updated weights for policy 0, policy_version 17930 (0.0008) [2023-10-10 05:22:38,240][53268] Updated weights for policy 1, policy_version 17910 (0.0008) [2023-10-10 05:22:38,469][53252] Updated weights for policy 0, policy_version 17940 (0.0008) [2023-10-10 05:22:38,598][53268] Updated weights for policy 1, policy_version 17920 (0.0007) [2023-10-10 05:22:38,836][53252] Updated weights for policy 0, policy_version 17950 (0.0008) [2023-10-10 05:22:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36732928. Throughput: 0: 1679.1, 1: 1676.3. Samples: 9197504. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 05:22:41,784][52050] Avg episode reward: [(0, '15.600'), (1, '16.040')] [2023-10-10 05:22:41,791][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000017920_18350080.pth... [2023-10-10 05:22:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000017952_18382848.pth... 
[2023-10-10 05:22:41,830][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000016352_16744448.pth [2023-10-10 05:22:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000016384_16777216.pth [2023-10-10 05:22:42,518][53268] Updated weights for policy 1, policy_version 17930 (0.0008) [2023-10-10 05:22:42,878][53268] Updated weights for policy 1, policy_version 17940 (0.0009) [2023-10-10 05:22:43,005][53252] Updated weights for policy 0, policy_version 17960 (0.0009) [2023-10-10 05:22:43,251][53268] Updated weights for policy 1, policy_version 17950 (0.0007) [2023-10-10 05:22:43,385][53252] Updated weights for policy 0, policy_version 17970 (0.0007) [2023-10-10 05:22:43,755][53252] Updated weights for policy 0, policy_version 17980 (0.0007) [2023-10-10 05:22:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36798464. Throughput: 0: 1673.2, 1: 1676.6. Samples: 9206714. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 05:22:46,784][52050] Avg episode reward: [(0, '16.570'), (1, '16.470')] [2023-10-10 05:22:47,281][53268] Updated weights for policy 1, policy_version 17960 (0.0007) [2023-10-10 05:22:47,644][53252] Updated weights for policy 0, policy_version 17990 (0.0008) [2023-10-10 05:22:47,650][53268] Updated weights for policy 1, policy_version 17970 (0.0009) [2023-10-10 05:22:48,013][53252] Updated weights for policy 0, policy_version 18000 (0.0009) [2023-10-10 05:22:48,024][53268] Updated weights for policy 1, policy_version 17980 (0.0011) [2023-10-10 05:22:48,386][53252] Updated weights for policy 0, policy_version 18010 (0.0008) [2023-10-10 05:22:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36864000. Throughput: 0: 1677.6, 1: 1684.1. Samples: 9227604. 
Policy #0 lag: (min: 24.0, avg: 53.1, max: 56.0) [2023-10-10 05:22:51,784][52050] Avg episode reward: [(0, '16.750'), (1, '17.550')] [2023-10-10 05:22:52,077][53268] Updated weights for policy 1, policy_version 17990 (0.0008) [2023-10-10 05:22:52,449][53268] Updated weights for policy 1, policy_version 18000 (0.0008) [2023-10-10 05:22:52,496][53252] Updated weights for policy 0, policy_version 18020 (0.0008) [2023-10-10 05:22:52,819][53268] Updated weights for policy 1, policy_version 18010 (0.0007) [2023-10-10 05:22:52,861][53252] Updated weights for policy 0, policy_version 18030 (0.0008) [2023-10-10 05:22:53,038][53061] Saving new best policy, reward=17.550! [2023-10-10 05:22:53,236][53252] Updated weights for policy 0, policy_version 18040 (0.0009) [2023-10-10 05:22:56,764][53268] Updated weights for policy 1, policy_version 18020 (0.0010) [2023-10-10 05:22:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36929536. Throughput: 0: 1676.1, 1: 1689.5. Samples: 9248218. Policy #0 lag: (min: 24.0, avg: 53.1, max: 56.0) [2023-10-10 05:22:56,784][52050] Avg episode reward: [(0, '15.500'), (1, '16.970')] [2023-10-10 05:22:57,130][53268] Updated weights for policy 1, policy_version 18030 (0.0009) [2023-10-10 05:22:57,248][53252] Updated weights for policy 0, policy_version 18050 (0.0008) [2023-10-10 05:22:57,508][53268] Updated weights for policy 1, policy_version 18040 (0.0007) [2023-10-10 05:22:57,623][53252] Updated weights for policy 0, policy_version 18060 (0.0009) [2023-10-10 05:22:57,993][53252] Updated weights for policy 0, policy_version 18070 (0.0008) [2023-10-10 05:22:58,359][53252] Updated weights for policy 0, policy_version 18080 (0.0007) [2023-10-10 05:23:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 36995072. Throughput: 0: 1674.3, 1: 1689.3. Samples: 9257316. 
Policy #0 lag: (min: 24.0, avg: 53.1, max: 56.0) [2023-10-10 05:23:01,784][52050] Avg episode reward: [(0, '14.820'), (1, '16.310')] [2023-10-10 05:23:01,805][53268] Updated weights for policy 1, policy_version 18050 (0.0009) [2023-10-10 05:23:02,181][53268] Updated weights for policy 1, policy_version 18060 (0.0007) [2023-10-10 05:23:02,245][53252] Updated weights for policy 0, policy_version 18090 (0.0007) [2023-10-10 05:23:02,552][53268] Updated weights for policy 1, policy_version 18070 (0.0007) [2023-10-10 05:23:02,614][53252] Updated weights for policy 0, policy_version 18100 (0.0007) [2023-10-10 05:23:02,917][53268] Updated weights for policy 1, policy_version 18080 (0.0008) [2023-10-10 05:23:02,990][53252] Updated weights for policy 0, policy_version 18110 (0.0008) [2023-10-10 05:23:06,761][53268] Updated weights for policy 1, policy_version 18090 (0.0007) [2023-10-10 05:23:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37060608. Throughput: 0: 1673.2, 1: 1689.3. Samples: 9278140. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-10 05:23:06,784][52050] Avg episode reward: [(0, '15.190'), (1, '16.650')] [2023-10-10 05:23:07,058][53252] Updated weights for policy 0, policy_version 18120 (0.0007) [2023-10-10 05:23:07,134][53268] Updated weights for policy 1, policy_version 18100 (0.0008) [2023-10-10 05:23:07,428][53252] Updated weights for policy 0, policy_version 18130 (0.0007) [2023-10-10 05:23:07,505][53268] Updated weights for policy 1, policy_version 18110 (0.0007) [2023-10-10 05:23:07,783][53252] Updated weights for policy 0, policy_version 18140 (0.0008) [2023-10-10 05:23:11,580][53268] Updated weights for policy 1, policy_version 18120 (0.0008) [2023-10-10 05:23:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 37126144. Throughput: 0: 1678.9, 1: 1694.8. Samples: 9298854. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-10 05:23:11,784][52050] Avg episode reward: [(0, '16.190'), (1, '15.930')] [2023-10-10 05:23:11,801][53252] Updated weights for policy 0, policy_version 18150 (0.0009) [2023-10-10 05:23:11,945][53268] Updated weights for policy 1, policy_version 18130 (0.0007) [2023-10-10 05:23:12,175][53252] Updated weights for policy 0, policy_version 18160 (0.0008) [2023-10-10 05:23:12,306][53268] Updated weights for policy 1, policy_version 18140 (0.0008) [2023-10-10 05:23:12,543][53252] Updated weights for policy 0, policy_version 18170 (0.0008) [2023-10-10 05:23:16,413][53268] Updated weights for policy 1, policy_version 18150 (0.0008) [2023-10-10 05:23:16,599][53252] Updated weights for policy 0, policy_version 18180 (0.0008) [2023-10-10 05:23:16,777][53268] Updated weights for policy 1, policy_version 18160 (0.0007) [2023-10-10 05:23:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37191680. Throughput: 0: 1682.3, 1: 1694.1. Samples: 9308014. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-10 05:23:16,784][52050] Avg episode reward: [(0, '17.320'), (1, '16.200')] [2023-10-10 05:23:16,970][53252] Updated weights for policy 0, policy_version 18190 (0.0009) [2023-10-10 05:23:17,140][53268] Updated weights for policy 1, policy_version 18170 (0.0008) [2023-10-10 05:23:17,336][53252] Updated weights for policy 0, policy_version 18200 (0.0009) [2023-10-10 05:23:21,188][53268] Updated weights for policy 1, policy_version 18180 (0.0008) [2023-10-10 05:23:21,410][53252] Updated weights for policy 0, policy_version 18210 (0.0009) [2023-10-10 05:23:21,551][53268] Updated weights for policy 1, policy_version 18190 (0.0008) [2023-10-10 05:23:21,776][53252] Updated weights for policy 0, policy_version 18220 (0.0008) [2023-10-10 05:23:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37257216. Throughput: 0: 1685.7, 1: 1697.3. 
Samples: 9328820. Policy #0 lag: (min: 16.0, avg: 40.2, max: 48.0) [2023-10-10 05:23:21,784][52050] Avg episode reward: [(0, '17.620'), (1, '16.770')] [2023-10-10 05:23:21,913][53268] Updated weights for policy 1, policy_version 18200 (0.0007) [2023-10-10 05:23:22,137][53252] Updated weights for policy 0, policy_version 18230 (0.0007) [2023-10-10 05:23:22,510][53252] Updated weights for policy 0, policy_version 18240 (0.0008) [2023-10-10 05:23:26,012][53268] Updated weights for policy 1, policy_version 18210 (0.0008) [2023-10-10 05:23:26,371][53268] Updated weights for policy 1, policy_version 18220 (0.0007) [2023-10-10 05:23:26,636][53252] Updated weights for policy 0, policy_version 18250 (0.0009) [2023-10-10 05:23:26,740][53268] Updated weights for policy 1, policy_version 18230 (0.0009) [2023-10-10 05:23:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37322752. Throughput: 0: 1682.7, 1: 1686.3. Samples: 9349108. Policy #0 lag: (min: 16.0, avg: 40.2, max: 48.0) [2023-10-10 05:23:26,784][52050] Avg episode reward: [(0, '17.740'), (1, '16.560')] [2023-10-10 05:23:27,000][53252] Updated weights for policy 0, policy_version 18260 (0.0008) [2023-10-10 05:23:27,110][53268] Updated weights for policy 1, policy_version 18240 (0.0007) [2023-10-10 05:23:27,382][53252] Updated weights for policy 0, policy_version 18270 (0.0010) [2023-10-10 05:23:31,056][53268] Updated weights for policy 1, policy_version 18250 (0.0008) [2023-10-10 05:23:31,415][53268] Updated weights for policy 1, policy_version 18260 (0.0007) [2023-10-10 05:23:31,582][53252] Updated weights for policy 0, policy_version 18280 (0.0008) [2023-10-10 05:23:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 37388288. Throughput: 0: 1684.8, 1: 1692.9. Samples: 9358710. 
Policy #0 lag: (min: 16.0, avg: 40.2, max: 48.0) [2023-10-10 05:23:31,784][52050] Avg episode reward: [(0, '16.830'), (1, '17.120')] [2023-10-10 05:23:31,788][53268] Updated weights for policy 1, policy_version 18270 (0.0009) [2023-10-10 05:23:31,955][53252] Updated weights for policy 0, policy_version 18290 (0.0009) [2023-10-10 05:23:32,323][53252] Updated weights for policy 0, policy_version 18300 (0.0011) [2023-10-10 05:23:35,790][53268] Updated weights for policy 1, policy_version 18280 (0.0009) [2023-10-10 05:23:36,162][53268] Updated weights for policy 1, policy_version 18290 (0.0007) [2023-10-10 05:23:36,412][53252] Updated weights for policy 0, policy_version 18310 (0.0008) [2023-10-10 05:23:36,529][53268] Updated weights for policy 1, policy_version 18300 (0.0007) [2023-10-10 05:23:36,783][52050] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 37486592. Throughput: 0: 1677.7, 1: 1701.7. Samples: 9379678. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:23:36,784][52050] Avg episode reward: [(0, '16.490'), (1, '17.570')] [2023-10-10 05:23:36,784][53061] Saving new best policy, reward=17.570! [2023-10-10 05:23:36,788][53252] Updated weights for policy 0, policy_version 18320 (0.0009) [2023-10-10 05:23:37,160][53252] Updated weights for policy 0, policy_version 18330 (0.0011) [2023-10-10 05:23:40,731][53268] Updated weights for policy 1, policy_version 18310 (0.0010) [2023-10-10 05:23:41,108][53268] Updated weights for policy 1, policy_version 18320 (0.0012) [2023-10-10 05:23:41,320][53252] Updated weights for policy 0, policy_version 18340 (0.0008) [2023-10-10 05:23:41,461][53268] Updated weights for policy 1, policy_version 18330 (0.0010) [2023-10-10 05:23:41,692][53252] Updated weights for policy 0, policy_version 18350 (0.0008) [2023-10-10 05:23:41,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37552128. Throughput: 0: 1679.8, 1: 1675.9. Samples: 9399224. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:23:41,784][52050] Avg episode reward: [(0, '15.960'), (1, '15.880')] [2023-10-10 05:23:42,066][53252] Updated weights for policy 0, policy_version 18360 (0.0011) [2023-10-10 05:23:45,624][53268] Updated weights for policy 1, policy_version 18340 (0.0009) [2023-10-10 05:23:45,986][53268] Updated weights for policy 1, policy_version 18350 (0.0009) [2023-10-10 05:23:46,066][53252] Updated weights for policy 0, policy_version 18370 (0.0007) [2023-10-10 05:23:46,358][53268] Updated weights for policy 1, policy_version 18360 (0.0007) [2023-10-10 05:23:46,432][53252] Updated weights for policy 0, policy_version 18380 (0.0008) [2023-10-10 05:23:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37617664. Throughput: 0: 1682.0, 1: 1692.8. Samples: 9409182. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:23:46,784][52050] Avg episode reward: [(0, '15.550'), (1, '16.790')] [2023-10-10 05:23:46,803][53252] Updated weights for policy 0, policy_version 18390 (0.0007) [2023-10-10 05:23:47,177][53252] Updated weights for policy 0, policy_version 18400 (0.0007) [2023-10-10 05:23:50,279][53268] Updated weights for policy 1, policy_version 18370 (0.0008) [2023-10-10 05:23:50,644][53268] Updated weights for policy 1, policy_version 18380 (0.0009) [2023-10-10 05:23:51,011][53268] Updated weights for policy 1, policy_version 18390 (0.0011) [2023-10-10 05:23:51,365][53252] Updated weights for policy 0, policy_version 18410 (0.0008) [2023-10-10 05:23:51,383][53268] Updated weights for policy 1, policy_version 18400 (0.0009) [2023-10-10 05:23:51,730][53252] Updated weights for policy 0, policy_version 18420 (0.0011) [2023-10-10 05:23:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37683200. Throughput: 0: 1675.8, 1: 1691.0. Samples: 9429646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:23:51,784][52050] Avg episode reward: [(0, '15.950'), (1, '16.070')] [2023-10-10 05:23:52,111][53252] Updated weights for policy 0, policy_version 18430 (0.0010) [2023-10-10 05:23:55,236][53268] Updated weights for policy 1, policy_version 18410 (0.0008) [2023-10-10 05:23:55,597][53268] Updated weights for policy 1, policy_version 18420 (0.0008) [2023-10-10 05:23:55,969][53268] Updated weights for policy 1, policy_version 18430 (0.0008) [2023-10-10 05:23:56,221][53252] Updated weights for policy 0, policy_version 18440 (0.0008) [2023-10-10 05:23:56,594][53252] Updated weights for policy 0, policy_version 18450 (0.0008) [2023-10-10 05:23:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37748736. Throughput: 0: 1660.4, 1: 1664.7. Samples: 9448484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:23:56,784][52050] Avg episode reward: [(0, '14.610'), (1, '15.430')] [2023-10-10 05:23:56,968][53252] Updated weights for policy 0, policy_version 18460 (0.0007) [2023-10-10 05:24:00,055][53268] Updated weights for policy 1, policy_version 18440 (0.0009) [2023-10-10 05:24:00,416][53268] Updated weights for policy 1, policy_version 18450 (0.0009) [2023-10-10 05:24:00,783][53268] Updated weights for policy 1, policy_version 18460 (0.0009) [2023-10-10 05:24:00,984][53252] Updated weights for policy 0, policy_version 18470 (0.0008) [2023-10-10 05:24:01,350][53252] Updated weights for policy 0, policy_version 18480 (0.0010) [2023-10-10 05:24:01,720][53252] Updated weights for policy 0, policy_version 18490 (0.0010) [2023-10-10 05:24:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37814272. Throughput: 0: 1671.5, 1: 1697.1. Samples: 9459600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:24:01,784][52050] Avg episode reward: [(0, '15.510'), (1, '16.850')] [2023-10-10 05:24:04,927][53268] Updated weights for policy 1, policy_version 18470 (0.0009) [2023-10-10 05:24:05,299][53268] Updated weights for policy 1, policy_version 18480 (0.0007) [2023-10-10 05:24:05,660][53268] Updated weights for policy 1, policy_version 18490 (0.0009) [2023-10-10 05:24:05,692][53252] Updated weights for policy 0, policy_version 18500 (0.0009) [2023-10-10 05:24:06,069][53252] Updated weights for policy 0, policy_version 18510 (0.0009) [2023-10-10 05:24:06,441][53252] Updated weights for policy 0, policy_version 18520 (0.0009) [2023-10-10 05:24:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 37912576. Throughput: 0: 1680.0, 1: 1685.7. Samples: 9480278. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-10 05:24:06,784][52050] Avg episode reward: [(0, '14.140'), (1, '16.760')] [2023-10-10 05:24:09,865][53268] Updated weights for policy 1, policy_version 18500 (0.0008) [2023-10-10 05:24:10,245][53268] Updated weights for policy 1, policy_version 18510 (0.0011) [2023-10-10 05:24:10,612][53268] Updated weights for policy 1, policy_version 18520 (0.0010) [2023-10-10 05:24:10,651][53252] Updated weights for policy 0, policy_version 18530 (0.0008) [2023-10-10 05:24:11,024][53252] Updated weights for policy 0, policy_version 18540 (0.0008) [2023-10-10 05:24:11,399][53252] Updated weights for policy 0, policy_version 18550 (0.0007) [2023-10-10 05:24:11,772][53252] Updated weights for policy 0, policy_version 18560 (0.0007) [2023-10-10 05:24:11,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 37978112. Throughput: 0: 1665.0, 1: 1672.8. Samples: 9499306. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-10 05:24:11,784][52050] Avg episode reward: [(0, '14.780'), (1, '16.720')] [2023-10-10 05:24:14,750][53268] Updated weights for policy 1, policy_version 18530 (0.0009) [2023-10-10 05:24:15,111][53268] Updated weights for policy 1, policy_version 18540 (0.0008) [2023-10-10 05:24:15,485][53268] Updated weights for policy 1, policy_version 18550 (0.0010) [2023-10-10 05:24:15,854][53268] Updated weights for policy 1, policy_version 18560 (0.0010) [2023-10-10 05:24:16,057][53252] Updated weights for policy 0, policy_version 18570 (0.0008) [2023-10-10 05:24:16,427][53252] Updated weights for policy 0, policy_version 18580 (0.0007) [2023-10-10 05:24:16,783][52050] Fps is (10 sec: 9830.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38010880. Throughput: 0: 1678.0, 1: 1691.4. Samples: 9510332. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-10 05:24:16,785][52050] Avg episode reward: [(0, '16.470'), (1, '15.630')] [2023-10-10 05:24:16,801][53252] Updated weights for policy 0, policy_version 18590 (0.0008) [2023-10-10 05:24:20,010][53268] Updated weights for policy 1, policy_version 18570 (0.0009) [2023-10-10 05:24:20,373][53268] Updated weights for policy 1, policy_version 18580 (0.0008) [2023-10-10 05:24:20,737][53268] Updated weights for policy 1, policy_version 18590 (0.0009) [2023-10-10 05:24:20,945][53252] Updated weights for policy 0, policy_version 18600 (0.0009) [2023-10-10 05:24:21,324][53252] Updated weights for policy 0, policy_version 18610 (0.0009) [2023-10-10 05:24:21,703][53252] Updated weights for policy 0, policy_version 18620 (0.0009) [2023-10-10 05:24:21,783][52050] Fps is (10 sec: 9830.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38076416. Throughput: 0: 1674.2, 1: 1667.6. Samples: 9530060. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 59.0) [2023-10-10 05:24:21,784][52050] Avg episode reward: [(0, '14.900'), (1, '15.540')] [2023-10-10 05:24:24,856][53268] Updated weights for policy 1, policy_version 18600 (0.0007) [2023-10-10 05:24:25,239][53268] Updated weights for policy 1, policy_version 18610 (0.0008) [2023-10-10 05:24:25,606][53268] Updated weights for policy 1, policy_version 18620 (0.0008) [2023-10-10 05:24:25,645][53252] Updated weights for policy 0, policy_version 18630 (0.0009) [2023-10-10 05:24:26,016][53252] Updated weights for policy 0, policy_version 18640 (0.0010) [2023-10-10 05:24:26,392][53252] Updated weights for policy 0, policy_version 18650 (0.0009) [2023-10-10 05:24:26,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 38174720. Throughput: 0: 1659.7, 1: 1674.4. Samples: 9549258. Policy #0 lag: (min: 31.0, avg: 32.9, max: 59.0) [2023-10-10 05:24:26,785][52050] Avg episode reward: [(0, '16.830'), (1, '14.620')] [2023-10-10 05:24:29,580][53268] Updated weights for policy 1, policy_version 18630 (0.0008) [2023-10-10 05:24:29,948][53268] Updated weights for policy 1, policy_version 18640 (0.0007) [2023-10-10 05:24:30,220][53252] Updated weights for policy 0, policy_version 18660 (0.0008) [2023-10-10 05:24:30,316][53268] Updated weights for policy 1, policy_version 18650 (0.0009) [2023-10-10 05:24:30,578][53252] Updated weights for policy 0, policy_version 18670 (0.0009) [2023-10-10 05:24:30,958][53252] Updated weights for policy 0, policy_version 18680 (0.0011) [2023-10-10 05:24:31,783][52050] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 38240256. Throughput: 0: 1678.1, 1: 1688.0. Samples: 9560660. 
Policy #0 lag: (min: 11.0, avg: 19.8, max: 43.0) [2023-10-10 05:24:31,784][52050] Avg episode reward: [(0, '17.040'), (1, '15.370')] [2023-10-10 05:24:34,233][53268] Updated weights for policy 1, policy_version 18660 (0.0008) [2023-10-10 05:24:34,610][53268] Updated weights for policy 1, policy_version 18670 (0.0009) [2023-10-10 05:24:34,966][53268] Updated weights for policy 1, policy_version 18680 (0.0010) [2023-10-10 05:24:35,198][53252] Updated weights for policy 0, policy_version 18690 (0.0008) [2023-10-10 05:24:35,578][53252] Updated weights for policy 0, policy_version 18700 (0.0010) [2023-10-10 05:24:35,948][53252] Updated weights for policy 0, policy_version 18710 (0.0007) [2023-10-10 05:24:36,321][53252] Updated weights for policy 0, policy_version 18720 (0.0010) [2023-10-10 05:24:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 38305792. Throughput: 0: 1672.2, 1: 1663.7. Samples: 9579762. Policy #0 lag: (min: 11.0, avg: 19.8, max: 43.0) [2023-10-10 05:24:36,784][52050] Avg episode reward: [(0, '16.370'), (1, '16.020')] [2023-10-10 05:24:39,127][53268] Updated weights for policy 1, policy_version 18690 (0.0009) [2023-10-10 05:24:39,496][53268] Updated weights for policy 1, policy_version 18700 (0.0010) [2023-10-10 05:24:39,873][53268] Updated weights for policy 1, policy_version 18710 (0.0010) [2023-10-10 05:24:40,241][53268] Updated weights for policy 1, policy_version 18720 (0.0010) [2023-10-10 05:24:40,263][53252] Updated weights for policy 0, policy_version 18730 (0.0008) [2023-10-10 05:24:40,638][53252] Updated weights for policy 0, policy_version 18740 (0.0007) [2023-10-10 05:24:41,011][53252] Updated weights for policy 0, policy_version 18750 (0.0008) [2023-10-10 05:24:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 38371328. Throughput: 0: 1668.2, 1: 1685.4. Samples: 9599398. 
Policy #0 lag: (min: 11.0, avg: 19.8, max: 43.0) [2023-10-10 05:24:41,784][52050] Avg episode reward: [(0, '17.350'), (1, '15.820')] [2023-10-10 05:24:41,791][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000018720_19169280.pth... [2023-10-10 05:24:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000018752_19202048.pth... [2023-10-10 05:24:41,824][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000017152_17563648.pth [2023-10-10 05:24:41,831][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000017184_17596416.pth [2023-10-10 05:24:44,586][53268] Updated weights for policy 1, policy_version 18730 (0.0008) [2023-10-10 05:24:44,951][53268] Updated weights for policy 1, policy_version 18740 (0.0007) [2023-10-10 05:24:44,986][53252] Updated weights for policy 0, policy_version 18760 (0.0010) [2023-10-10 05:24:45,315][53268] Updated weights for policy 1, policy_version 18750 (0.0010) [2023-10-10 05:24:45,363][53252] Updated weights for policy 0, policy_version 18770 (0.0009) [2023-10-10 05:24:45,734][53252] Updated weights for policy 0, policy_version 18780 (0.0008) [2023-10-10 05:24:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 38436864. Throughput: 0: 1683.9, 1: 1674.8. Samples: 9610742. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-10 05:24:46,784][52050] Avg episode reward: [(0, '17.370'), (1, '15.020')] [2023-10-10 05:24:49,525][53268] Updated weights for policy 1, policy_version 18760 (0.0010) [2023-10-10 05:24:49,775][53252] Updated weights for policy 0, policy_version 18790 (0.0007) [2023-10-10 05:24:49,896][53268] Updated weights for policy 1, policy_version 18770 (0.0008) [2023-10-10 05:24:50,145][53252] Updated weights for policy 0, policy_version 18800 (0.0008) [2023-10-10 05:24:50,259][53268] Updated weights for policy 1, policy_version 18780 (0.0010) [2023-10-10 05:24:50,518][53252] Updated weights for policy 0, policy_version 18810 (0.0008) [2023-10-10 05:24:51,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 38502400. Throughput: 0: 1656.0, 1: 1657.2. Samples: 9629372. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-10 05:24:51,784][52050] Avg episode reward: [(0, '17.390'), (1, '14.740')] [2023-10-10 05:24:54,384][53268] Updated weights for policy 1, policy_version 18790 (0.0008) [2023-10-10 05:24:54,654][53252] Updated weights for policy 0, policy_version 18820 (0.0008) [2023-10-10 05:24:54,756][53268] Updated weights for policy 1, policy_version 18800 (0.0009) [2023-10-10 05:24:55,027][53252] Updated weights for policy 0, policy_version 18830 (0.0007) [2023-10-10 05:24:55,123][53268] Updated weights for policy 1, policy_version 18810 (0.0007) [2023-10-10 05:24:55,392][53252] Updated weights for policy 0, policy_version 18840 (0.0009) [2023-10-10 05:24:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 38567936. Throughput: 0: 1664.0, 1: 1666.4. Samples: 9649174. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-10 05:24:56,784][52050] Avg episode reward: [(0, '16.160'), (1, '15.130')] [2023-10-10 05:24:59,322][53268] Updated weights for policy 1, policy_version 18820 (0.0009) [2023-10-10 05:24:59,504][53252] Updated weights for policy 0, policy_version 18850 (0.0007) [2023-10-10 05:24:59,694][53268] Updated weights for policy 1, policy_version 18830 (0.0009) [2023-10-10 05:24:59,876][53252] Updated weights for policy 0, policy_version 18860 (0.0008) [2023-10-10 05:25:00,059][53268] Updated weights for policy 1, policy_version 18840 (0.0009) [2023-10-10 05:25:00,251][53252] Updated weights for policy 0, policy_version 18870 (0.0010) [2023-10-10 05:25:00,614][53252] Updated weights for policy 0, policy_version 18880 (0.0009) [2023-10-10 05:25:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38633472. Throughput: 0: 1676.7, 1: 1660.9. Samples: 9660524. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:25:01,784][52050] Avg episode reward: [(0, '16.330'), (1, '15.460')] [2023-10-10 05:25:04,012][53268] Updated weights for policy 1, policy_version 18850 (0.0008) [2023-10-10 05:25:04,385][53268] Updated weights for policy 1, policy_version 18860 (0.0008) [2023-10-10 05:25:04,749][53268] Updated weights for policy 1, policy_version 18870 (0.0007) [2023-10-10 05:25:04,786][53252] Updated weights for policy 0, policy_version 18890 (0.0008) [2023-10-10 05:25:05,111][53268] Updated weights for policy 1, policy_version 18880 (0.0007) [2023-10-10 05:25:05,168][53252] Updated weights for policy 0, policy_version 18900 (0.0008) [2023-10-10 05:25:05,530][53252] Updated weights for policy 0, policy_version 18910 (0.0010) [2023-10-10 05:25:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38699008. Throughput: 0: 1661.2, 1: 1648.0. Samples: 9678972. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:25:06,785][52050] Avg episode reward: [(0, '17.280'), (1, '15.200')] [2023-10-10 05:25:09,049][53268] Updated weights for policy 1, policy_version 18890 (0.0008) [2023-10-10 05:25:09,419][53268] Updated weights for policy 1, policy_version 18900 (0.0008) [2023-10-10 05:25:09,529][53252] Updated weights for policy 0, policy_version 18920 (0.0008) [2023-10-10 05:25:09,780][53268] Updated weights for policy 1, policy_version 18910 (0.0007) [2023-10-10 05:25:09,908][53252] Updated weights for policy 0, policy_version 18930 (0.0010) [2023-10-10 05:25:10,278][53252] Updated weights for policy 0, policy_version 18940 (0.0011) [2023-10-10 05:25:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38764544. Throughput: 0: 1671.8, 1: 1667.9. Samples: 9699546. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 05:25:11,784][52050] Avg episode reward: [(0, '16.970'), (1, '15.910')] [2023-10-10 05:25:13,787][53268] Updated weights for policy 1, policy_version 18920 (0.0009) [2023-10-10 05:25:14,158][53268] Updated weights for policy 1, policy_version 18930 (0.0010) [2023-10-10 05:25:14,449][53252] Updated weights for policy 0, policy_version 18950 (0.0009) [2023-10-10 05:25:14,524][53268] Updated weights for policy 1, policy_version 18940 (0.0009) [2023-10-10 05:25:14,818][53252] Updated weights for policy 0, policy_version 18960 (0.0009) [2023-10-10 05:25:15,193][53252] Updated weights for policy 0, policy_version 18970 (0.0009) [2023-10-10 05:25:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38830080. Throughput: 0: 1672.3, 1: 1651.2. Samples: 9710216. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 05:25:16,784][52050] Avg episode reward: [(0, '16.890'), (1, '16.770')] [2023-10-10 05:25:18,667][53268] Updated weights for policy 1, policy_version 18950 (0.0009) [2023-10-10 05:25:19,033][53268] Updated weights for policy 1, policy_version 18960 (0.0009) [2023-10-10 05:25:19,389][53252] Updated weights for policy 0, policy_version 18980 (0.0008) [2023-10-10 05:25:19,396][53268] Updated weights for policy 1, policy_version 18970 (0.0009) [2023-10-10 05:25:19,758][53252] Updated weights for policy 0, policy_version 18990 (0.0007) [2023-10-10 05:25:20,124][53252] Updated weights for policy 0, policy_version 19000 (0.0008) [2023-10-10 05:25:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38895616. Throughput: 0: 1659.9, 1: 1666.7. Samples: 9729458. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 05:25:21,784][52050] Avg episode reward: [(0, '18.340'), (1, '16.620')] [2023-10-10 05:25:21,786][52846] Saving new best policy, reward=18.340! [2023-10-10 05:25:23,491][53268] Updated weights for policy 1, policy_version 18980 (0.0008) [2023-10-10 05:25:23,865][53268] Updated weights for policy 1, policy_version 18990 (0.0008) [2023-10-10 05:25:24,138][53252] Updated weights for policy 0, policy_version 19010 (0.0010) [2023-10-10 05:25:24,237][53268] Updated weights for policy 1, policy_version 19000 (0.0008) [2023-10-10 05:25:24,505][53252] Updated weights for policy 0, policy_version 19020 (0.0007) [2023-10-10 05:25:24,869][53252] Updated weights for policy 0, policy_version 19030 (0.0007) [2023-10-10 05:25:25,249][53252] Updated weights for policy 0, policy_version 19040 (0.0009) [2023-10-10 05:25:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38961152. Throughput: 0: 1675.6, 1: 1668.8. Samples: 9749896. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 05:25:26,784][52050] Avg episode reward: [(0, '16.110'), (1, '15.370')] [2023-10-10 05:25:28,308][53268] Updated weights for policy 1, policy_version 19010 (0.0008) [2023-10-10 05:25:28,672][53268] Updated weights for policy 1, policy_version 19020 (0.0008) [2023-10-10 05:25:29,045][53268] Updated weights for policy 1, policy_version 19030 (0.0010) [2023-10-10 05:25:29,297][53252] Updated weights for policy 0, policy_version 19050 (0.0007) [2023-10-10 05:25:29,413][53268] Updated weights for policy 1, policy_version 19040 (0.0008) [2023-10-10 05:25:29,677][53252] Updated weights for policy 0, policy_version 19060 (0.0007) [2023-10-10 05:25:30,050][53252] Updated weights for policy 0, policy_version 19070 (0.0007) [2023-10-10 05:25:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39026688. Throughput: 0: 1670.3, 1: 1649.7. Samples: 9760142. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 05:25:31,784][52050] Avg episode reward: [(0, '16.280'), (1, '15.490')] [2023-10-10 05:25:33,334][53268] Updated weights for policy 1, policy_version 19050 (0.0008) [2023-10-10 05:25:33,700][53268] Updated weights for policy 1, policy_version 19060 (0.0008) [2023-10-10 05:25:33,906][53252] Updated weights for policy 0, policy_version 19080 (0.0008) [2023-10-10 05:25:34,069][53268] Updated weights for policy 1, policy_version 19070 (0.0010) [2023-10-10 05:25:34,277][53252] Updated weights for policy 0, policy_version 19090 (0.0008) [2023-10-10 05:25:34,653][53252] Updated weights for policy 0, policy_version 19100 (0.0008) [2023-10-10 05:25:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 39092224. Throughput: 0: 1674.9, 1: 1674.0. Samples: 9780072. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 05:25:36,784][52050] Avg episode reward: [(0, '17.940'), (1, '15.330')] [2023-10-10 05:25:38,198][53268] Updated weights for policy 1, policy_version 19080 (0.0009) [2023-10-10 05:25:38,561][53268] Updated weights for policy 1, policy_version 19090 (0.0009) [2023-10-10 05:25:38,879][53252] Updated weights for policy 0, policy_version 19110 (0.0008) [2023-10-10 05:25:38,925][53268] Updated weights for policy 1, policy_version 19100 (0.0009) [2023-10-10 05:25:39,257][53252] Updated weights for policy 0, policy_version 19120 (0.0009) [2023-10-10 05:25:39,622][53252] Updated weights for policy 0, policy_version 19130 (0.0007) [2023-10-10 05:25:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39157760. Throughput: 0: 1684.6, 1: 1681.8. Samples: 9800664. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 05:25:41,784][52050] Avg episode reward: [(0, '16.750'), (1, '15.290')] [2023-10-10 05:25:43,075][53268] Updated weights for policy 1, policy_version 19110 (0.0010) [2023-10-10 05:25:43,443][53268] Updated weights for policy 1, policy_version 19120 (0.0009) [2023-10-10 05:25:43,472][53252] Updated weights for policy 0, policy_version 19140 (0.0008) [2023-10-10 05:25:43,807][53268] Updated weights for policy 1, policy_version 19130 (0.0008) [2023-10-10 05:25:43,848][53252] Updated weights for policy 0, policy_version 19150 (0.0009) [2023-10-10 05:25:44,215][53252] Updated weights for policy 0, policy_version 19160 (0.0009) [2023-10-10 05:25:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39223296. Throughput: 0: 1663.3, 1: 1656.6. Samples: 9809920. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:25:46,784][52050] Avg episode reward: [(0, '16.490'), (1, '16.370')] [2023-10-10 05:25:47,915][53268] Updated weights for policy 1, policy_version 19140 (0.0008) [2023-10-10 05:25:48,274][53268] Updated weights for policy 1, policy_version 19150 (0.0007) [2023-10-10 05:25:48,277][53252] Updated weights for policy 0, policy_version 19170 (0.0009) [2023-10-10 05:25:48,646][53268] Updated weights for policy 1, policy_version 19160 (0.0007) [2023-10-10 05:25:48,654][53252] Updated weights for policy 0, policy_version 19180 (0.0010) [2023-10-10 05:25:49,013][53252] Updated weights for policy 0, policy_version 19190 (0.0009) [2023-10-10 05:25:49,385][53252] Updated weights for policy 0, policy_version 19200 (0.0009) [2023-10-10 05:25:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39288832. Throughput: 0: 1682.1, 1: 1685.5. Samples: 9830512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:25:51,784][52050] Avg episode reward: [(0, '16.450'), (1, '16.720')] [2023-10-10 05:25:52,695][53268] Updated weights for policy 1, policy_version 19170 (0.0008) [2023-10-10 05:25:53,075][53268] Updated weights for policy 1, policy_version 19180 (0.0008) [2023-10-10 05:25:53,431][53268] Updated weights for policy 1, policy_version 19190 (0.0008) [2023-10-10 05:25:53,605][53252] Updated weights for policy 0, policy_version 19210 (0.0008) [2023-10-10 05:25:53,798][53268] Updated weights for policy 1, policy_version 19200 (0.0009) [2023-10-10 05:25:53,979][53252] Updated weights for policy 0, policy_version 19220 (0.0008) [2023-10-10 05:25:54,348][53252] Updated weights for policy 0, policy_version 19230 (0.0007) [2023-10-10 05:25:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 39354368. Throughput: 0: 1684.5, 1: 1681.1. Samples: 9851002. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:25:56,784][52050] Avg episode reward: [(0, '16.400'), (1, '17.290')] [2023-10-10 05:25:57,941][53268] Updated weights for policy 1, policy_version 19210 (0.0009) [2023-10-10 05:25:58,302][53268] Updated weights for policy 1, policy_version 19220 (0.0009) [2023-10-10 05:25:58,324][53252] Updated weights for policy 0, policy_version 19240 (0.0009) [2023-10-10 05:25:58,667][53268] Updated weights for policy 1, policy_version 19230 (0.0007) [2023-10-10 05:25:58,695][53252] Updated weights for policy 0, policy_version 19250 (0.0008) [2023-10-10 05:25:59,055][53252] Updated weights for policy 0, policy_version 19260 (0.0008) [2023-10-10 05:26:01,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 39419904. Throughput: 0: 1665.5, 1: 1665.8. Samples: 9860124. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:26:01,785][52050] Avg episode reward: [(0, '16.370'), (1, '16.420')] [2023-10-10 05:26:02,913][53268] Updated weights for policy 1, policy_version 19240 (0.0008) [2023-10-10 05:26:03,233][53252] Updated weights for policy 0, policy_version 19270 (0.0008) [2023-10-10 05:26:03,293][53268] Updated weights for policy 1, policy_version 19250 (0.0008) [2023-10-10 05:26:03,606][53252] Updated weights for policy 0, policy_version 19280 (0.0009) [2023-10-10 05:26:03,659][53268] Updated weights for policy 1, policy_version 19260 (0.0007) [2023-10-10 05:26:03,968][53252] Updated weights for policy 0, policy_version 19290 (0.0009) [2023-10-10 05:26:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39485440. Throughput: 0: 1688.2, 1: 1670.9. Samples: 9880614. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:26:06,784][52050] Avg episode reward: [(0, '16.560'), (1, '17.020')] [2023-10-10 05:26:07,684][53268] Updated weights for policy 1, policy_version 19270 (0.0009) [2023-10-10 05:26:07,991][53252] Updated weights for policy 0, policy_version 19300 (0.0008) [2023-10-10 05:26:08,056][53268] Updated weights for policy 1, policy_version 19280 (0.0008) [2023-10-10 05:26:08,357][53252] Updated weights for policy 0, policy_version 19310 (0.0009) [2023-10-10 05:26:08,415][53268] Updated weights for policy 1, policy_version 19290 (0.0007) [2023-10-10 05:26:08,739][53252] Updated weights for policy 0, policy_version 19320 (0.0007) [2023-10-10 05:26:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 39550976. Throughput: 0: 1697.0, 1: 1677.9. Samples: 9901768. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 05:26:11,785][52050] Avg episode reward: [(0, '17.870'), (1, '16.450')] [2023-10-10 05:26:12,461][53268] Updated weights for policy 1, policy_version 19300 (0.0010) [2023-10-10 05:26:12,796][53252] Updated weights for policy 0, policy_version 19330 (0.0009) [2023-10-10 05:26:12,840][53268] Updated weights for policy 1, policy_version 19310 (0.0008) [2023-10-10 05:26:13,154][53252] Updated weights for policy 0, policy_version 19340 (0.0009) [2023-10-10 05:26:13,202][53268] Updated weights for policy 1, policy_version 19320 (0.0007) [2023-10-10 05:26:13,536][53252] Updated weights for policy 0, policy_version 19350 (0.0007) [2023-10-10 05:26:13,902][53252] Updated weights for policy 0, policy_version 19360 (0.0009) [2023-10-10 05:26:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39616512. Throughput: 0: 1674.0, 1: 1674.3. Samples: 9910818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:16,784][52050] Avg episode reward: [(0, '17.410'), (1, '15.470')] [2023-10-10 05:26:17,251][53268] Updated weights for policy 1, policy_version 19330 (0.0007) [2023-10-10 05:26:17,620][53268] Updated weights for policy 1, policy_version 19340 (0.0007) [2023-10-10 05:26:17,990][53268] Updated weights for policy 1, policy_version 19350 (0.0008) [2023-10-10 05:26:18,002][53252] Updated weights for policy 0, policy_version 19370 (0.0008) [2023-10-10 05:26:18,355][53268] Updated weights for policy 1, policy_version 19360 (0.0008) [2023-10-10 05:26:18,375][53252] Updated weights for policy 0, policy_version 19380 (0.0007) [2023-10-10 05:26:18,752][53252] Updated weights for policy 0, policy_version 19390 (0.0009) [2023-10-10 05:26:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39682048. Throughput: 0: 1690.0, 1: 1677.6. Samples: 9931614. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:21,784][52050] Avg episode reward: [(0, '17.600'), (1, '16.250')] [2023-10-10 05:26:22,428][53268] Updated weights for policy 1, policy_version 19370 (0.0008) [2023-10-10 05:26:22,497][53252] Updated weights for policy 0, policy_version 19400 (0.0009) [2023-10-10 05:26:22,798][53268] Updated weights for policy 1, policy_version 19380 (0.0007) [2023-10-10 05:26:22,881][53252] Updated weights for policy 0, policy_version 19410 (0.0008) [2023-10-10 05:26:23,171][53268] Updated weights for policy 1, policy_version 19390 (0.0008) [2023-10-10 05:26:23,251][53252] Updated weights for policy 0, policy_version 19420 (0.0008) [2023-10-10 05:26:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39747584. Throughput: 0: 1689.6, 1: 1678.3. Samples: 9952220. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:26,784][52050] Avg episode reward: [(0, '17.330'), (1, '15.670')] [2023-10-10 05:26:27,115][53268] Updated weights for policy 1, policy_version 19400 (0.0008) [2023-10-10 05:26:27,353][53252] Updated weights for policy 0, policy_version 19430 (0.0008) [2023-10-10 05:26:27,472][53268] Updated weights for policy 1, policy_version 19410 (0.0008) [2023-10-10 05:26:27,719][53252] Updated weights for policy 0, policy_version 19440 (0.0008) [2023-10-10 05:26:27,834][53268] Updated weights for policy 1, policy_version 19420 (0.0008) [2023-10-10 05:26:28,099][53252] Updated weights for policy 0, policy_version 19450 (0.0007) [2023-10-10 05:26:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39813120. Throughput: 0: 1684.0, 1: 1679.3. Samples: 9961268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:31,784][52050] Avg episode reward: [(0, '16.390'), (1, '17.170')] [2023-10-10 05:26:32,034][53268] Updated weights for policy 1, policy_version 19430 (0.0009) [2023-10-10 05:26:32,065][53252] Updated weights for policy 0, policy_version 19460 (0.0010) [2023-10-10 05:26:32,406][53268] Updated weights for policy 1, policy_version 19440 (0.0009) [2023-10-10 05:26:32,448][53252] Updated weights for policy 0, policy_version 19470 (0.0009) [2023-10-10 05:26:32,771][53268] Updated weights for policy 1, policy_version 19450 (0.0009) [2023-10-10 05:26:32,813][53252] Updated weights for policy 0, policy_version 19480 (0.0008) [2023-10-10 05:26:36,684][53268] Updated weights for policy 1, policy_version 19460 (0.0008) [2023-10-10 05:26:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39878656. Throughput: 0: 1687.8, 1: 1679.9. Samples: 9982056. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:36,784][52050] Avg episode reward: [(0, '16.210'), (1, '16.040')] [2023-10-10 05:26:36,959][53252] Updated weights for policy 0, policy_version 19490 (0.0008) [2023-10-10 05:26:37,052][53268] Updated weights for policy 1, policy_version 19470 (0.0007) [2023-10-10 05:26:37,325][53252] Updated weights for policy 0, policy_version 19500 (0.0008) [2023-10-10 05:26:37,412][53268] Updated weights for policy 1, policy_version 19480 (0.0007) [2023-10-10 05:26:37,694][53252] Updated weights for policy 0, policy_version 19510 (0.0009) [2023-10-10 05:26:38,069][53252] Updated weights for policy 0, policy_version 19520 (0.0010) [2023-10-10 05:26:41,682][53268] Updated weights for policy 1, policy_version 19490 (0.0007) [2023-10-10 05:26:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 39944192. Throughput: 0: 1698.4, 1: 1679.2. Samples: 10002990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:41,784][52050] Avg episode reward: [(0, '16.310'), (1, '16.920')] [2023-10-10 05:26:42,044][53268] Updated weights for policy 1, policy_version 19500 (0.0008) [2023-10-10 05:26:42,061][53252] Updated weights for policy 0, policy_version 19530 (0.0009) [2023-10-10 05:26:42,413][53268] Updated weights for policy 1, policy_version 19510 (0.0009) [2023-10-10 05:26:42,424][53252] Updated weights for policy 0, policy_version 19540 (0.0009) [2023-10-10 05:26:42,778][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000019520_19988480.pth... 
[2023-10-10 05:26:42,782][53268] Updated weights for policy 1, policy_version 19520 (0.0008) [2023-10-10 05:26:42,803][53252] Updated weights for policy 0, policy_version 19550 (0.0008) [2023-10-10 05:26:42,807][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000017920_18350080.pth [2023-10-10 05:26:42,868][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000019552_20021248.pth... [2023-10-10 05:26:42,897][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000017952_18382848.pth [2023-10-10 05:26:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40009728. Throughput: 0: 1695.2, 1: 1679.7. Samples: 10011994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:46,784][52050] Avg episode reward: [(0, '17.320'), (1, '16.960')] [2023-10-10 05:26:46,832][53252] Updated weights for policy 0, policy_version 19560 (0.0008) [2023-10-10 05:26:46,884][53268] Updated weights for policy 1, policy_version 19530 (0.0010) [2023-10-10 05:26:47,202][53252] Updated weights for policy 0, policy_version 19570 (0.0008) [2023-10-10 05:26:47,255][53268] Updated weights for policy 1, policy_version 19540 (0.0008) [2023-10-10 05:26:47,575][53252] Updated weights for policy 0, policy_version 19580 (0.0007) [2023-10-10 05:26:47,623][53268] Updated weights for policy 1, policy_version 19550 (0.0008) [2023-10-10 05:26:51,670][53252] Updated weights for policy 0, policy_version 19590 (0.0007) [2023-10-10 05:26:51,691][53268] Updated weights for policy 1, policy_version 19560 (0.0008) [2023-10-10 05:26:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40075264. Throughput: 0: 1698.9, 1: 1684.9. Samples: 10032886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:51,784][52050] Avg episode reward: [(0, '17.020'), (1, '15.680')] [2023-10-10 05:26:52,041][53252] Updated weights for policy 0, policy_version 19600 (0.0007) [2023-10-10 05:26:52,062][53268] Updated weights for policy 1, policy_version 19570 (0.0008) [2023-10-10 05:26:52,406][53252] Updated weights for policy 0, policy_version 19610 (0.0008) [2023-10-10 05:26:52,433][53268] Updated weights for policy 1, policy_version 19580 (0.0009) [2023-10-10 05:26:56,394][53252] Updated weights for policy 0, policy_version 19620 (0.0009) [2023-10-10 05:26:56,620][53268] Updated weights for policy 1, policy_version 19590 (0.0009) [2023-10-10 05:26:56,772][53252] Updated weights for policy 0, policy_version 19630 (0.0007) [2023-10-10 05:26:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 40140800. Throughput: 0: 1692.4, 1: 1676.5. Samples: 10053366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:26:56,784][52050] Avg episode reward: [(0, '18.330'), (1, '17.760')] [2023-10-10 05:26:56,992][53268] Updated weights for policy 1, policy_version 19600 (0.0009) [2023-10-10 05:26:57,141][53252] Updated weights for policy 0, policy_version 19640 (0.0008) [2023-10-10 05:26:57,354][53268] Updated weights for policy 1, policy_version 19610 (0.0007) [2023-10-10 05:26:57,571][53061] Saving new best policy, reward=17.760! [2023-10-10 05:27:01,299][53268] Updated weights for policy 1, policy_version 19620 (0.0009) [2023-10-10 05:27:01,504][53252] Updated weights for policy 0, policy_version 19650 (0.0010) [2023-10-10 05:27:01,669][53268] Updated weights for policy 1, policy_version 19630 (0.0008) [2023-10-10 05:27:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 40206336. Throughput: 0: 1693.3, 1: 1677.6. Samples: 10062508. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:27:01,784][52050] Avg episode reward: [(0, '18.510'), (1, '16.620')] [2023-10-10 05:27:01,873][53252] Updated weights for policy 0, policy_version 19660 (0.0008) [2023-10-10 05:27:02,028][53268] Updated weights for policy 1, policy_version 19640 (0.0008) [2023-10-10 05:27:02,246][53252] Updated weights for policy 0, policy_version 19670 (0.0009) [2023-10-10 05:27:02,621][52846] Saving new best policy, reward=18.510! [2023-10-10 05:27:02,621][53252] Updated weights for policy 0, policy_version 19680 (0.0010) [2023-10-10 05:27:06,151][53268] Updated weights for policy 1, policy_version 19650 (0.0008) [2023-10-10 05:27:06,524][53268] Updated weights for policy 1, policy_version 19660 (0.0007) [2023-10-10 05:27:06,682][53252] Updated weights for policy 0, policy_version 19690 (0.0009) [2023-10-10 05:27:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40271872. Throughput: 0: 1687.3, 1: 1676.9. Samples: 10083004. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 05:27:06,784][52050] Avg episode reward: [(0, '18.060'), (1, '16.710')] [2023-10-10 05:27:06,888][53268] Updated weights for policy 1, policy_version 19670 (0.0007) [2023-10-10 05:27:07,052][53252] Updated weights for policy 0, policy_version 19700 (0.0009) [2023-10-10 05:27:07,244][53268] Updated weights for policy 1, policy_version 19680 (0.0008) [2023-10-10 05:27:07,419][53252] Updated weights for policy 0, policy_version 19710 (0.0007) [2023-10-10 05:27:11,354][53252] Updated weights for policy 0, policy_version 19720 (0.0008) [2023-10-10 05:27:11,400][53268] Updated weights for policy 1, policy_version 19690 (0.0008) [2023-10-10 05:27:11,720][53252] Updated weights for policy 0, policy_version 19730 (0.0008) [2023-10-10 05:27:11,763][53268] Updated weights for policy 1, policy_version 19700 (0.0008) [2023-10-10 05:27:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 40337408. Throughput: 0: 1683.3, 1: 1678.3. Samples: 10103494. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 05:27:11,784][52050] Avg episode reward: [(0, '16.800'), (1, '15.910')] [2023-10-10 05:27:12,100][53252] Updated weights for policy 0, policy_version 19740 (0.0007) [2023-10-10 05:27:12,136][53268] Updated weights for policy 1, policy_version 19710 (0.0007) [2023-10-10 05:27:16,072][53252] Updated weights for policy 0, policy_version 19750 (0.0008) [2023-10-10 05:27:16,256][53268] Updated weights for policy 1, policy_version 19720 (0.0007) [2023-10-10 05:27:16,437][53252] Updated weights for policy 0, policy_version 19760 (0.0008) [2023-10-10 05:27:16,621][53268] Updated weights for policy 1, policy_version 19730 (0.0010) [2023-10-10 05:27:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40402944. Throughput: 0: 1692.1, 1: 1681.4. Samples: 10113078. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 05:27:16,784][52050] Avg episode reward: [(0, '17.620'), (1, '16.490')] [2023-10-10 05:27:16,809][53252] Updated weights for policy 0, policy_version 19770 (0.0007) [2023-10-10 05:27:16,982][53268] Updated weights for policy 1, policy_version 19740 (0.0008) [2023-10-10 05:27:20,871][53252] Updated weights for policy 0, policy_version 19780 (0.0008) [2023-10-10 05:27:21,036][53268] Updated weights for policy 1, policy_version 19750 (0.0008) [2023-10-10 05:27:21,251][53252] Updated weights for policy 0, policy_version 19790 (0.0008) [2023-10-10 05:27:21,401][53268] Updated weights for policy 1, policy_version 19760 (0.0008) [2023-10-10 05:27:21,630][53252] Updated weights for policy 0, policy_version 19800 (0.0008) [2023-10-10 05:27:21,774][53268] Updated weights for policy 1, policy_version 19770 (0.0009) [2023-10-10 05:27:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40468480. Throughput: 0: 1698.7, 1: 1678.7. Samples: 10134038. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 05:27:21,784][52050] Avg episode reward: [(0, '16.420'), (1, '15.870')] [2023-10-10 05:27:25,739][53252] Updated weights for policy 0, policy_version 19810 (0.0007) [2023-10-10 05:27:25,785][53268] Updated weights for policy 1, policy_version 19780 (0.0009) [2023-10-10 05:27:26,107][53252] Updated weights for policy 0, policy_version 19820 (0.0008) [2023-10-10 05:27:26,153][53268] Updated weights for policy 1, policy_version 19790 (0.0009) [2023-10-10 05:27:26,474][53252] Updated weights for policy 0, policy_version 19830 (0.0007) [2023-10-10 05:27:26,520][53268] Updated weights for policy 1, policy_version 19800 (0.0009) [2023-10-10 05:27:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40534016. Throughput: 0: 1673.0, 1: 1669.1. Samples: 10153384. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 33.0) [2023-10-10 05:27:26,784][52050] Avg episode reward: [(0, '17.620'), (1, '15.540')] [2023-10-10 05:27:26,844][53252] Updated weights for policy 0, policy_version 19840 (0.0007) [2023-10-10 05:27:30,567][53268] Updated weights for policy 1, policy_version 19810 (0.0009) [2023-10-10 05:27:30,817][53252] Updated weights for policy 0, policy_version 19850 (0.0009) [2023-10-10 05:27:30,934][53268] Updated weights for policy 1, policy_version 19820 (0.0009) [2023-10-10 05:27:31,186][53252] Updated weights for policy 0, policy_version 19860 (0.0008) [2023-10-10 05:27:31,307][53268] Updated weights for policy 1, policy_version 19830 (0.0008) [2023-10-10 05:27:31,557][53252] Updated weights for policy 0, policy_version 19870 (0.0008) [2023-10-10 05:27:31,662][53268] Updated weights for policy 1, policy_version 19840 (0.0009) [2023-10-10 05:27:31,783][52050] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 40665088. Throughput: 0: 1690.9, 1: 1681.2. Samples: 10163738. Policy #0 lag: (min: 29.0, avg: 29.0, max: 33.0) [2023-10-10 05:27:31,784][52050] Avg episode reward: [(0, '18.590'), (1, '16.090')] [2023-10-10 05:27:31,785][52846] Saving new best policy, reward=18.590! [2023-10-10 05:27:35,452][53252] Updated weights for policy 0, policy_version 19880 (0.0009) [2023-10-10 05:27:35,823][53252] Updated weights for policy 0, policy_version 19890 (0.0009) [2023-10-10 05:27:35,875][53268] Updated weights for policy 1, policy_version 19850 (0.0009) [2023-10-10 05:27:36,200][53252] Updated weights for policy 0, policy_version 19900 (0.0008) [2023-10-10 05:27:36,234][53268] Updated weights for policy 1, policy_version 19860 (0.0009) [2023-10-10 05:27:36,606][53268] Updated weights for policy 1, policy_version 19870 (0.0008) [2023-10-10 05:27:36,783][52050] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 40730624. Throughput: 0: 1681.2, 1: 1676.9. Samples: 10184004. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 33.0) [2023-10-10 05:27:36,784][52050] Avg episode reward: [(0, '18.780'), (1, '14.700')] [2023-10-10 05:27:36,785][52846] Saving new best policy, reward=18.780! [2023-10-10 05:27:40,395][53252] Updated weights for policy 0, policy_version 19910 (0.0009) [2023-10-10 05:27:40,717][53268] Updated weights for policy 1, policy_version 19880 (0.0008) [2023-10-10 05:27:40,767][53252] Updated weights for policy 0, policy_version 19920 (0.0007) [2023-10-10 05:27:41,089][53268] Updated weights for policy 1, policy_version 19890 (0.0010) [2023-10-10 05:27:41,139][53252] Updated weights for policy 0, policy_version 19930 (0.0007) [2023-10-10 05:27:41,456][53268] Updated weights for policy 1, policy_version 19900 (0.0008) [2023-10-10 05:27:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 40796160. Throughput: 0: 1656.9, 1: 1663.7. Samples: 10202792. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:27:41,784][52050] Avg episode reward: [(0, '18.380'), (1, '15.350')] [2023-10-10 05:27:45,053][53252] Updated weights for policy 0, policy_version 19940 (0.0007) [2023-10-10 05:27:45,410][53252] Updated weights for policy 0, policy_version 19950 (0.0008) [2023-10-10 05:27:45,607][53268] Updated weights for policy 1, policy_version 19910 (0.0008) [2023-10-10 05:27:45,786][53252] Updated weights for policy 0, policy_version 19960 (0.0007) [2023-10-10 05:27:45,978][53268] Updated weights for policy 1, policy_version 19920 (0.0008) [2023-10-10 05:27:46,361][53268] Updated weights for policy 1, policy_version 19930 (0.0009) [2023-10-10 05:27:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 40861696. Throughput: 0: 1687.1, 1: 1676.6. Samples: 10213876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:27:46,784][52050] Avg episode reward: [(0, '17.710'), (1, '15.030')] [2023-10-10 05:27:49,811][53252] Updated weights for policy 0, policy_version 19970 (0.0008) [2023-10-10 05:27:50,188][53252] Updated weights for policy 0, policy_version 19980 (0.0008) [2023-10-10 05:27:50,529][53268] Updated weights for policy 1, policy_version 19940 (0.0010) [2023-10-10 05:27:50,567][53252] Updated weights for policy 0, policy_version 19990 (0.0008) [2023-10-10 05:27:50,896][53268] Updated weights for policy 1, policy_version 19950 (0.0009) [2023-10-10 05:27:50,938][53252] Updated weights for policy 0, policy_version 20000 (0.0008) [2023-10-10 05:27:51,255][53268] Updated weights for policy 1, policy_version 19960 (0.0010) [2023-10-10 05:27:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 40927232. Throughput: 0: 1681.0, 1: 1676.7. Samples: 10234100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:27:51,784][52050] Avg episode reward: [(0, '16.600'), (1, '15.120')] [2023-10-10 05:27:55,037][53252] Updated weights for policy 0, policy_version 20010 (0.0010) [2023-10-10 05:27:55,207][53268] Updated weights for policy 1, policy_version 19970 (0.0010) [2023-10-10 05:27:55,406][53252] Updated weights for policy 0, policy_version 20020 (0.0007) [2023-10-10 05:27:55,562][53268] Updated weights for policy 1, policy_version 19980 (0.0008) [2023-10-10 05:27:55,787][53252] Updated weights for policy 0, policy_version 20030 (0.0009) [2023-10-10 05:27:55,938][53268] Updated weights for policy 1, policy_version 19990 (0.0009) [2023-10-10 05:27:56,296][53268] Updated weights for policy 1, policy_version 20000 (0.0008) [2023-10-10 05:27:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 40992768. Throughput: 0: 1672.8, 1: 1653.7. Samples: 10253186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:27:56,784][52050] Avg episode reward: [(0, '16.320'), (1, '15.880')] [2023-10-10 05:27:59,874][53252] Updated weights for policy 0, policy_version 20040 (0.0008) [2023-10-10 05:28:00,240][53252] Updated weights for policy 0, policy_version 20050 (0.0007) [2023-10-10 05:28:00,249][53268] Updated weights for policy 1, policy_version 20010 (0.0009) [2023-10-10 05:28:00,612][53252] Updated weights for policy 0, policy_version 20060 (0.0009) [2023-10-10 05:28:00,613][53268] Updated weights for policy 1, policy_version 20020 (0.0007) [2023-10-10 05:28:00,988][53268] Updated weights for policy 1, policy_version 20030 (0.0008) [2023-10-10 05:28:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 41058304. Throughput: 0: 1691.4, 1: 1681.4. Samples: 10264852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:28:01,784][52050] Avg episode reward: [(0, '16.290'), (1, '16.330')] [2023-10-10 05:28:04,907][53252] Updated weights for policy 0, policy_version 20070 (0.0009) [2023-10-10 05:28:05,173][53268] Updated weights for policy 1, policy_version 20040 (0.0008) [2023-10-10 05:28:05,279][53252] Updated weights for policy 0, policy_version 20080 (0.0009) [2023-10-10 05:28:05,531][53268] Updated weights for policy 1, policy_version 20050 (0.0010) [2023-10-10 05:28:05,656][53252] Updated weights for policy 0, policy_version 20090 (0.0009) [2023-10-10 05:28:05,903][53268] Updated weights for policy 1, policy_version 20060 (0.0009) [2023-10-10 05:28:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 41123840. Throughput: 0: 1667.1, 1: 1673.3. Samples: 10284356. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:28:06,784][52050] Avg episode reward: [(0, '17.220'), (1, '16.360')] [2023-10-10 05:28:09,730][53252] Updated weights for policy 0, policy_version 20100 (0.0008) [2023-10-10 05:28:09,879][53268] Updated weights for policy 1, policy_version 20070 (0.0008) [2023-10-10 05:28:10,110][53252] Updated weights for policy 0, policy_version 20110 (0.0008) [2023-10-10 05:28:10,239][53268] Updated weights for policy 1, policy_version 20080 (0.0007) [2023-10-10 05:28:10,478][53252] Updated weights for policy 0, policy_version 20120 (0.0010) [2023-10-10 05:28:10,609][53268] Updated weights for policy 1, policy_version 20090 (0.0007) [2023-10-10 05:28:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 41189376. Throughput: 0: 1672.3, 1: 1662.0. Samples: 10303430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:28:11,784][52050] Avg episode reward: [(0, '16.020'), (1, '17.230')] [2023-10-10 05:28:14,534][53252] Updated weights for policy 0, policy_version 20130 (0.0007) [2023-10-10 05:28:14,632][53268] Updated weights for policy 1, policy_version 20100 (0.0009) [2023-10-10 05:28:14,908][53252] Updated weights for policy 0, policy_version 20140 (0.0008) [2023-10-10 05:28:14,998][53268] Updated weights for policy 1, policy_version 20110 (0.0008) [2023-10-10 05:28:15,286][53252] Updated weights for policy 0, policy_version 20150 (0.0007) [2023-10-10 05:28:15,363][53268] Updated weights for policy 1, policy_version 20120 (0.0008) [2023-10-10 05:28:15,653][53252] Updated weights for policy 0, policy_version 20160 (0.0008) [2023-10-10 05:28:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 41254912. Throughput: 0: 1682.1, 1: 1679.9. Samples: 10315026. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-10 05:28:16,784][52050] Avg episode reward: [(0, '16.600'), (1, '17.470')] [2023-10-10 05:28:19,507][53268] Updated weights for policy 1, policy_version 20130 (0.0009) [2023-10-10 05:28:19,661][53252] Updated weights for policy 0, policy_version 20170 (0.0009) [2023-10-10 05:28:19,872][53268] Updated weights for policy 1, policy_version 20140 (0.0007) [2023-10-10 05:28:20,037][53252] Updated weights for policy 0, policy_version 20180 (0.0007) [2023-10-10 05:28:20,238][53268] Updated weights for policy 1, policy_version 20150 (0.0008) [2023-10-10 05:28:20,403][53252] Updated weights for policy 0, policy_version 20190 (0.0009) [2023-10-10 05:28:20,608][53268] Updated weights for policy 1, policy_version 20160 (0.0008) [2023-10-10 05:28:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 41320448. Throughput: 0: 1659.4, 1: 1666.3. Samples: 10333660. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-10 05:28:21,784][52050] Avg episode reward: [(0, '17.800'), (1, '15.920')] [2023-10-10 05:28:24,541][53252] Updated weights for policy 0, policy_version 20200 (0.0008) [2023-10-10 05:28:24,728][53268] Updated weights for policy 1, policy_version 20170 (0.0007) [2023-10-10 05:28:24,905][53252] Updated weights for policy 0, policy_version 20210 (0.0009) [2023-10-10 05:28:25,097][53268] Updated weights for policy 1, policy_version 20180 (0.0007) [2023-10-10 05:28:25,282][53252] Updated weights for policy 0, policy_version 20220 (0.0009) [2023-10-10 05:28:25,461][53268] Updated weights for policy 1, policy_version 20190 (0.0008) [2023-10-10 05:28:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 41385984. Throughput: 0: 1672.3, 1: 1673.1. Samples: 10353334. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-10 05:28:26,784][52050] Avg episode reward: [(0, '17.300'), (1, '16.490')] [2023-10-10 05:28:29,407][53252] Updated weights for policy 0, policy_version 20230 (0.0007) [2023-10-10 05:28:29,646][53268] Updated weights for policy 1, policy_version 20200 (0.0009) [2023-10-10 05:28:29,771][53252] Updated weights for policy 0, policy_version 20240 (0.0008) [2023-10-10 05:28:30,015][53268] Updated weights for policy 1, policy_version 20210 (0.0010) [2023-10-10 05:28:30,138][53252] Updated weights for policy 0, policy_version 20250 (0.0011) [2023-10-10 05:28:30,385][53268] Updated weights for policy 1, policy_version 20220 (0.0011) [2023-10-10 05:28:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41451520. Throughput: 0: 1664.8, 1: 1688.8. Samples: 10364784. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-10 05:28:31,785][52050] Avg episode reward: [(0, '18.160'), (1, '16.510')] [2023-10-10 05:28:34,313][53252] Updated weights for policy 0, policy_version 20260 (0.0009) [2023-10-10 05:28:34,366][53268] Updated weights for policy 1, policy_version 20230 (0.0009) [2023-10-10 05:28:34,697][53252] Updated weights for policy 0, policy_version 20270 (0.0007) [2023-10-10 05:28:34,735][53268] Updated weights for policy 1, policy_version 20240 (0.0010) [2023-10-10 05:28:35,077][53252] Updated weights for policy 0, policy_version 20280 (0.0007) [2023-10-10 05:28:35,113][53268] Updated weights for policy 1, policy_version 20250 (0.0009) [2023-10-10 05:28:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41517056. Throughput: 0: 1650.9, 1: 1662.0. Samples: 10383176. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:28:36,784][52050] Avg episode reward: [(0, '18.320'), (1, '15.150')] [2023-10-10 05:28:39,202][53252] Updated weights for policy 0, policy_version 20290 (0.0009) [2023-10-10 05:28:39,216][53268] Updated weights for policy 1, policy_version 20260 (0.0008) [2023-10-10 05:28:39,574][53252] Updated weights for policy 0, policy_version 20300 (0.0009) [2023-10-10 05:28:39,581][53268] Updated weights for policy 1, policy_version 20270 (0.0008) [2023-10-10 05:28:39,939][53268] Updated weights for policy 1, policy_version 20280 (0.0008) [2023-10-10 05:28:39,944][53252] Updated weights for policy 0, policy_version 20310 (0.0008) [2023-10-10 05:28:40,322][53252] Updated weights for policy 0, policy_version 20320 (0.0009) [2023-10-10 05:28:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 41582592. Throughput: 0: 1662.2, 1: 1678.0. Samples: 10403496. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:28:41,784][52050] Avg episode reward: [(0, '17.690'), (1, '16.890')] [2023-10-10 05:28:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000020320_20807680.pth... [2023-10-10 05:28:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000020288_20774912.pth... 
[2023-10-10 05:28:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000018752_19202048.pth [2023-10-10 05:28:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000018720_19169280.pth [2023-10-10 05:28:44,135][53268] Updated weights for policy 1, policy_version 20290 (0.0008) [2023-10-10 05:28:44,507][53268] Updated weights for policy 1, policy_version 20300 (0.0008) [2023-10-10 05:28:44,584][53252] Updated weights for policy 0, policy_version 20330 (0.0008) [2023-10-10 05:28:44,874][53268] Updated weights for policy 1, policy_version 20310 (0.0007) [2023-10-10 05:28:44,954][53252] Updated weights for policy 0, policy_version 20340 (0.0008) [2023-10-10 05:28:45,250][53268] Updated weights for policy 1, policy_version 20320 (0.0008) [2023-10-10 05:28:45,316][53252] Updated weights for policy 0, policy_version 20350 (0.0008) [2023-10-10 05:28:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41648128. Throughput: 0: 1661.7, 1: 1672.9. Samples: 10414908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:28:46,784][52050] Avg episode reward: [(0, '17.520'), (1, '17.330')] [2023-10-10 05:28:49,183][53268] Updated weights for policy 1, policy_version 20330 (0.0009) [2023-10-10 05:28:49,329][53252] Updated weights for policy 0, policy_version 20360 (0.0007) [2023-10-10 05:28:49,551][53268] Updated weights for policy 1, policy_version 20340 (0.0010) [2023-10-10 05:28:49,714][53252] Updated weights for policy 0, policy_version 20370 (0.0007) [2023-10-10 05:28:49,927][53268] Updated weights for policy 1, policy_version 20350 (0.0007) [2023-10-10 05:28:50,092][53252] Updated weights for policy 0, policy_version 20380 (0.0010) [2023-10-10 05:28:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41713664. Throughput: 0: 1655.7, 1: 1653.8. Samples: 10433284. 
Policy #0 lag: (min: 18.0, avg: 25.6, max: 50.0) [2023-10-10 05:28:51,784][52050] Avg episode reward: [(0, '18.000'), (1, '16.660')] [2023-10-10 05:28:54,048][53268] Updated weights for policy 1, policy_version 20360 (0.0008) [2023-10-10 05:28:54,175][53252] Updated weights for policy 0, policy_version 20390 (0.0010) [2023-10-10 05:28:54,423][53268] Updated weights for policy 1, policy_version 20370 (0.0009) [2023-10-10 05:28:54,536][53252] Updated weights for policy 0, policy_version 20400 (0.0009) [2023-10-10 05:28:54,791][53268] Updated weights for policy 1, policy_version 20380 (0.0009) [2023-10-10 05:28:54,918][53252] Updated weights for policy 0, policy_version 20410 (0.0008) [2023-10-10 05:28:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41779200. Throughput: 0: 1672.2, 1: 1672.6. Samples: 10453944. Policy #0 lag: (min: 18.0, avg: 25.6, max: 50.0) [2023-10-10 05:28:56,784][52050] Avg episode reward: [(0, '16.830'), (1, '17.250')] [2023-10-10 05:28:58,860][53252] Updated weights for policy 0, policy_version 20420 (0.0008) [2023-10-10 05:28:58,946][53268] Updated weights for policy 1, policy_version 20390 (0.0009) [2023-10-10 05:28:59,232][53252] Updated weights for policy 0, policy_version 20430 (0.0007) [2023-10-10 05:28:59,314][53268] Updated weights for policy 1, policy_version 20400 (0.0008) [2023-10-10 05:28:59,606][53252] Updated weights for policy 0, policy_version 20440 (0.0007) [2023-10-10 05:28:59,682][53268] Updated weights for policy 1, policy_version 20410 (0.0009) [2023-10-10 05:29:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 41844736. Throughput: 0: 1656.6, 1: 1661.2. Samples: 10464330. 
Policy #0 lag: (min: 18.0, avg: 25.6, max: 50.0) [2023-10-10 05:29:01,785][52050] Avg episode reward: [(0, '16.390'), (1, '16.450')] [2023-10-10 05:29:03,575][53252] Updated weights for policy 0, policy_version 20450 (0.0007) [2023-10-10 05:29:03,753][53268] Updated weights for policy 1, policy_version 20420 (0.0008) [2023-10-10 05:29:03,946][53252] Updated weights for policy 0, policy_version 20460 (0.0008) [2023-10-10 05:29:04,116][53268] Updated weights for policy 1, policy_version 20430 (0.0007) [2023-10-10 05:29:04,322][53252] Updated weights for policy 0, policy_version 20470 (0.0007) [2023-10-10 05:29:04,491][53268] Updated weights for policy 1, policy_version 20440 (0.0008) [2023-10-10 05:29:04,697][53252] Updated weights for policy 0, policy_version 20480 (0.0007) [2023-10-10 05:29:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 41910272. Throughput: 0: 1672.2, 1: 1659.6. Samples: 10483594. Policy #0 lag: (min: 18.0, avg: 25.6, max: 50.0) [2023-10-10 05:29:06,784][52050] Avg episode reward: [(0, '16.370'), (1, '16.170')] [2023-10-10 05:29:08,731][53268] Updated weights for policy 1, policy_version 20450 (0.0010) [2023-10-10 05:29:08,764][53252] Updated weights for policy 0, policy_version 20490 (0.0009) [2023-10-10 05:29:09,094][53268] Updated weights for policy 1, policy_version 20460 (0.0008) [2023-10-10 05:29:09,138][53252] Updated weights for policy 0, policy_version 20500 (0.0009) [2023-10-10 05:29:09,456][53268] Updated weights for policy 1, policy_version 20470 (0.0008) [2023-10-10 05:29:09,505][53252] Updated weights for policy 0, policy_version 20510 (0.0009) [2023-10-10 05:29:09,826][53268] Updated weights for policy 1, policy_version 20480 (0.0010) [2023-10-10 05:29:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41975808. Throughput: 0: 1685.1, 1: 1663.3. Samples: 10504014. 
Policy #0 lag: (min: 30.0, avg: 32.8, max: 62.0) [2023-10-10 05:29:11,784][52050] Avg episode reward: [(0, '17.580'), (1, '17.510')] [2023-10-10 05:29:13,530][53252] Updated weights for policy 0, policy_version 20520 (0.0008) [2023-10-10 05:29:13,891][53252] Updated weights for policy 0, policy_version 20530 (0.0010) [2023-10-10 05:29:13,969][53268] Updated weights for policy 1, policy_version 20490 (0.0008) [2023-10-10 05:29:14,267][53252] Updated weights for policy 0, policy_version 20540 (0.0008) [2023-10-10 05:29:14,330][53268] Updated weights for policy 1, policy_version 20500 (0.0008) [2023-10-10 05:29:14,695][53268] Updated weights for policy 1, policy_version 20510 (0.0007) [2023-10-10 05:29:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42041344. Throughput: 0: 1663.8, 1: 1649.6. Samples: 10513886. Policy #0 lag: (min: 30.0, avg: 32.8, max: 62.0) [2023-10-10 05:29:16,784][52050] Avg episode reward: [(0, '17.020'), (1, '17.000')] [2023-10-10 05:29:18,338][53252] Updated weights for policy 0, policy_version 20550 (0.0010) [2023-10-10 05:29:18,701][53252] Updated weights for policy 0, policy_version 20560 (0.0008) [2023-10-10 05:29:18,742][53268] Updated weights for policy 1, policy_version 20520 (0.0008) [2023-10-10 05:29:19,063][53252] Updated weights for policy 0, policy_version 20570 (0.0008) [2023-10-10 05:29:19,101][53268] Updated weights for policy 1, policy_version 20530 (0.0008) [2023-10-10 05:29:19,469][53268] Updated weights for policy 1, policy_version 20540 (0.0009) [2023-10-10 05:29:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42106880. Throughput: 0: 1688.2, 1: 1660.0. Samples: 10533846. 
Policy #0 lag: (min: 30.0, avg: 32.8, max: 62.0) [2023-10-10 05:29:21,785][52050] Avg episode reward: [(0, '17.370'), (1, '17.310')] [2023-10-10 05:29:23,109][53252] Updated weights for policy 0, policy_version 20580 (0.0007) [2023-10-10 05:29:23,485][53252] Updated weights for policy 0, policy_version 20590 (0.0008) [2023-10-10 05:29:23,540][53268] Updated weights for policy 1, policy_version 20550 (0.0009) [2023-10-10 05:29:23,845][53252] Updated weights for policy 0, policy_version 20600 (0.0008) [2023-10-10 05:29:23,907][53268] Updated weights for policy 1, policy_version 20560 (0.0008) [2023-10-10 05:29:24,277][53268] Updated weights for policy 1, policy_version 20570 (0.0007) [2023-10-10 05:29:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42172416. Throughput: 0: 1691.5, 1: 1667.5. Samples: 10554650. Policy #0 lag: (min: 30.0, avg: 32.8, max: 62.0) [2023-10-10 05:29:26,784][52050] Avg episode reward: [(0, '18.840'), (1, '17.800')] [2023-10-10 05:29:26,795][52846] Saving new best policy, reward=18.840! [2023-10-10 05:29:26,795][53061] Saving new best policy, reward=17.800! [2023-10-10 05:29:27,913][53252] Updated weights for policy 0, policy_version 20610 (0.0008) [2023-10-10 05:29:28,283][53252] Updated weights for policy 0, policy_version 20620 (0.0009) [2023-10-10 05:29:28,360][53268] Updated weights for policy 1, policy_version 20580 (0.0009) [2023-10-10 05:29:28,662][53252] Updated weights for policy 0, policy_version 20630 (0.0007) [2023-10-10 05:29:28,725][53268] Updated weights for policy 1, policy_version 20590 (0.0008) [2023-10-10 05:29:29,039][53252] Updated weights for policy 0, policy_version 20640 (0.0009) [2023-10-10 05:29:29,090][53268] Updated weights for policy 1, policy_version 20600 (0.0010) [2023-10-10 05:29:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42237952. Throughput: 0: 1663.2, 1: 1648.6. Samples: 10563940. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:29:31,784][52050] Avg episode reward: [(0, '20.140'), (1, '16.590')] [2023-10-10 05:29:31,785][52846] Saving new best policy, reward=20.140! [2023-10-10 05:29:33,074][53252] Updated weights for policy 0, policy_version 20650 (0.0009) [2023-10-10 05:29:33,113][53268] Updated weights for policy 1, policy_version 20610 (0.0009) [2023-10-10 05:29:33,450][53252] Updated weights for policy 0, policy_version 20660 (0.0007) [2023-10-10 05:29:33,476][53268] Updated weights for policy 1, policy_version 20620 (0.0010) [2023-10-10 05:29:33,817][53252] Updated weights for policy 0, policy_version 20670 (0.0008) [2023-10-10 05:29:33,843][53268] Updated weights for policy 1, policy_version 20630 (0.0008) [2023-10-10 05:29:34,213][53268] Updated weights for policy 1, policy_version 20640 (0.0010) [2023-10-10 05:29:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42303488. Throughput: 0: 1687.4, 1: 1671.2. Samples: 10584424. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:29:36,784][52050] Avg episode reward: [(0, '18.270'), (1, '17.290')] [2023-10-10 05:29:37,805][53252] Updated weights for policy 0, policy_version 20680 (0.0009) [2023-10-10 05:29:38,178][53252] Updated weights for policy 0, policy_version 20690 (0.0009) [2023-10-10 05:29:38,426][53268] Updated weights for policy 1, policy_version 20650 (0.0008) [2023-10-10 05:29:38,541][53252] Updated weights for policy 0, policy_version 20700 (0.0009) [2023-10-10 05:29:38,791][53268] Updated weights for policy 1, policy_version 20660 (0.0009) [2023-10-10 05:29:39,166][53268] Updated weights for policy 1, policy_version 20670 (0.0011) [2023-10-10 05:29:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42369024. Throughput: 0: 1688.3, 1: 1669.3. Samples: 10605034. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:29:41,785][52050] Avg episode reward: [(0, '18.430'), (1, '18.730')] [2023-10-10 05:29:41,796][53061] Saving new best policy, reward=18.730! [2023-10-10 05:29:42,722][53252] Updated weights for policy 0, policy_version 20710 (0.0008) [2023-10-10 05:29:43,098][53252] Updated weights for policy 0, policy_version 20720 (0.0009) [2023-10-10 05:29:43,393][53268] Updated weights for policy 1, policy_version 20680 (0.0009) [2023-10-10 05:29:43,477][53252] Updated weights for policy 0, policy_version 20730 (0.0008) [2023-10-10 05:29:43,764][53268] Updated weights for policy 1, policy_version 20690 (0.0008) [2023-10-10 05:29:44,137][53268] Updated weights for policy 1, policy_version 20700 (0.0008) [2023-10-10 05:29:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42434560. Throughput: 0: 1676.7, 1: 1654.9. Samples: 10614252. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:29:46,784][52050] Avg episode reward: [(0, '15.420'), (1, '18.500')] [2023-10-10 05:29:47,441][53252] Updated weights for policy 0, policy_version 20740 (0.0007) [2023-10-10 05:29:47,809][53252] Updated weights for policy 0, policy_version 20750 (0.0008) [2023-10-10 05:29:47,990][53268] Updated weights for policy 1, policy_version 20710 (0.0008) [2023-10-10 05:29:48,171][53252] Updated weights for policy 0, policy_version 20760 (0.0009) [2023-10-10 05:29:48,359][53268] Updated weights for policy 1, policy_version 20720 (0.0008) [2023-10-10 05:29:48,720][53268] Updated weights for policy 1, policy_version 20730 (0.0009) [2023-10-10 05:29:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42500096. Throughput: 0: 1689.6, 1: 1671.3. Samples: 10634832. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 05:29:51,784][52050] Avg episode reward: [(0, '15.820'), (1, '18.140')] [2023-10-10 05:29:52,295][53252] Updated weights for policy 0, policy_version 20770 (0.0008) [2023-10-10 05:29:52,665][53252] Updated weights for policy 0, policy_version 20780 (0.0008) [2023-10-10 05:29:52,823][53268] Updated weights for policy 1, policy_version 20740 (0.0008) [2023-10-10 05:29:53,036][53252] Updated weights for policy 0, policy_version 20790 (0.0009) [2023-10-10 05:29:53,198][53268] Updated weights for policy 1, policy_version 20750 (0.0009) [2023-10-10 05:29:53,407][53252] Updated weights for policy 0, policy_version 20800 (0.0009) [2023-10-10 05:29:53,561][53268] Updated weights for policy 1, policy_version 20760 (0.0009) [2023-10-10 05:29:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42565632. Throughput: 0: 1691.9, 1: 1675.6. Samples: 10655554. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 05:29:56,784][52050] Avg episode reward: [(0, '16.490'), (1, '18.790')] [2023-10-10 05:29:56,794][53061] Saving new best policy, reward=18.790! [2023-10-10 05:29:57,577][53252] Updated weights for policy 0, policy_version 20810 (0.0009) [2023-10-10 05:29:57,663][53268] Updated weights for policy 1, policy_version 20770 (0.0008) [2023-10-10 05:29:57,953][53252] Updated weights for policy 0, policy_version 20820 (0.0007) [2023-10-10 05:29:58,037][53268] Updated weights for policy 1, policy_version 20780 (0.0008) [2023-10-10 05:29:58,317][53252] Updated weights for policy 0, policy_version 20830 (0.0008) [2023-10-10 05:29:58,402][53268] Updated weights for policy 1, policy_version 20790 (0.0008) [2023-10-10 05:29:58,763][53268] Updated weights for policy 1, policy_version 20800 (0.0008) [2023-10-10 05:30:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 42631168. Throughput: 0: 1687.9, 1: 1660.9. Samples: 10664582. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 05:30:01,784][52050] Avg episode reward: [(0, '16.940'), (1, '17.360')] [2023-10-10 05:30:02,367][53252] Updated weights for policy 0, policy_version 20840 (0.0007) [2023-10-10 05:30:02,749][53252] Updated weights for policy 0, policy_version 20850 (0.0008) [2023-10-10 05:30:02,861][53268] Updated weights for policy 1, policy_version 20810 (0.0008) [2023-10-10 05:30:03,112][53252] Updated weights for policy 0, policy_version 20860 (0.0009) [2023-10-10 05:30:03,233][53268] Updated weights for policy 1, policy_version 20820 (0.0009) [2023-10-10 05:30:03,590][53268] Updated weights for policy 1, policy_version 20830 (0.0009) [2023-10-10 05:30:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42696704. Throughput: 0: 1689.3, 1: 1677.2. Samples: 10685338. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 05:30:06,785][52050] Avg episode reward: [(0, '16.810'), (1, '17.730')] [2023-10-10 05:30:07,097][53252] Updated weights for policy 0, policy_version 20870 (0.0007) [2023-10-10 05:30:07,467][53252] Updated weights for policy 0, policy_version 20880 (0.0007) [2023-10-10 05:30:07,834][53252] Updated weights for policy 0, policy_version 20890 (0.0008) [2023-10-10 05:30:07,873][53268] Updated weights for policy 1, policy_version 20840 (0.0007) [2023-10-10 05:30:08,251][53268] Updated weights for policy 1, policy_version 20850 (0.0009) [2023-10-10 05:30:08,620][53268] Updated weights for policy 1, policy_version 20860 (0.0010) [2023-10-10 05:30:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42762240. Throughput: 0: 1689.7, 1: 1674.8. Samples: 10706048. 
Policy #0 lag: (min: 3.0, avg: 3.2, max: 13.0) [2023-10-10 05:30:11,784][52050] Avg episode reward: [(0, '17.220'), (1, '17.040')] [2023-10-10 05:30:11,882][53252] Updated weights for policy 0, policy_version 20900 (0.0009) [2023-10-10 05:30:12,251][53252] Updated weights for policy 0, policy_version 20910 (0.0008) [2023-10-10 05:30:12,523][53268] Updated weights for policy 1, policy_version 20870 (0.0010) [2023-10-10 05:30:12,628][53252] Updated weights for policy 0, policy_version 20920 (0.0008) [2023-10-10 05:30:12,888][53268] Updated weights for policy 1, policy_version 20880 (0.0008) [2023-10-10 05:30:13,255][53268] Updated weights for policy 1, policy_version 20890 (0.0010) [2023-10-10 05:30:16,710][53252] Updated weights for policy 0, policy_version 20930 (0.0009) [2023-10-10 05:30:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42827776. Throughput: 0: 1688.8, 1: 1667.9. Samples: 10714994. Policy #0 lag: (min: 3.0, avg: 3.2, max: 13.0) [2023-10-10 05:30:16,784][52050] Avg episode reward: [(0, '17.020'), (1, '15.730')] [2023-10-10 05:30:17,079][53252] Updated weights for policy 0, policy_version 20940 (0.0010) [2023-10-10 05:30:17,443][53252] Updated weights for policy 0, policy_version 20950 (0.0009) [2023-10-10 05:30:17,458][53268] Updated weights for policy 1, policy_version 20900 (0.0008) [2023-10-10 05:30:17,817][53268] Updated weights for policy 1, policy_version 20910 (0.0007) [2023-10-10 05:30:17,819][53252] Updated weights for policy 0, policy_version 20960 (0.0009) [2023-10-10 05:30:18,188][53268] Updated weights for policy 1, policy_version 20920 (0.0007) [2023-10-10 05:30:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42893312. Throughput: 0: 1686.6, 1: 1671.0. Samples: 10735516. 
Policy #0 lag: (min: 3.0, avg: 3.2, max: 13.0) [2023-10-10 05:30:21,784][52050] Avg episode reward: [(0, '16.490'), (1, '16.550')] [2023-10-10 05:30:21,809][53252] Updated weights for policy 0, policy_version 20970 (0.0007) [2023-10-10 05:30:22,185][53252] Updated weights for policy 0, policy_version 20980 (0.0007) [2023-10-10 05:30:22,281][53268] Updated weights for policy 1, policy_version 20930 (0.0008) [2023-10-10 05:30:22,546][53252] Updated weights for policy 0, policy_version 20990 (0.0007) [2023-10-10 05:30:22,648][53268] Updated weights for policy 1, policy_version 20940 (0.0007) [2023-10-10 05:30:23,021][53268] Updated weights for policy 1, policy_version 20950 (0.0008) [2023-10-10 05:30:23,388][53268] Updated weights for policy 1, policy_version 20960 (0.0009) [2023-10-10 05:30:26,571][53252] Updated weights for policy 0, policy_version 21000 (0.0008) [2023-10-10 05:30:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 42958848. Throughput: 0: 1678.4, 1: 1675.5. Samples: 10755958. Policy #0 lag: (min: 1.0, avg: 11.2, max: 33.0) [2023-10-10 05:30:26,784][52050] Avg episode reward: [(0, '17.180'), (1, '16.330')] [2023-10-10 05:30:26,940][53252] Updated weights for policy 0, policy_version 21010 (0.0007) [2023-10-10 05:30:27,306][53252] Updated weights for policy 0, policy_version 21020 (0.0008) [2023-10-10 05:30:27,469][53268] Updated weights for policy 1, policy_version 20970 (0.0010) [2023-10-10 05:30:27,846][53268] Updated weights for policy 1, policy_version 20980 (0.0009) [2023-10-10 05:30:28,202][53268] Updated weights for policy 1, policy_version 20990 (0.0007) [2023-10-10 05:30:31,577][53252] Updated weights for policy 0, policy_version 21030 (0.0009) [2023-10-10 05:30:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43024384. Throughput: 0: 1681.2, 1: 1672.4. Samples: 10765166. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 33.0) [2023-10-10 05:30:31,784][52050] Avg episode reward: [(0, '17.220'), (1, '17.280')] [2023-10-10 05:30:31,953][53252] Updated weights for policy 0, policy_version 21040 (0.0009) [2023-10-10 05:30:32,184][53268] Updated weights for policy 1, policy_version 21000 (0.0007) [2023-10-10 05:30:32,329][53252] Updated weights for policy 0, policy_version 21050 (0.0009) [2023-10-10 05:30:32,550][53268] Updated weights for policy 1, policy_version 21010 (0.0007) [2023-10-10 05:30:32,918][53268] Updated weights for policy 1, policy_version 21020 (0.0007) [2023-10-10 05:30:36,327][53252] Updated weights for policy 0, policy_version 21060 (0.0008) [2023-10-10 05:30:36,692][53252] Updated weights for policy 0, policy_version 21070 (0.0010) [2023-10-10 05:30:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43089920. Throughput: 0: 1682.8, 1: 1674.8. Samples: 10785924. Policy #0 lag: (min: 1.0, avg: 11.2, max: 33.0) [2023-10-10 05:30:36,784][52050] Avg episode reward: [(0, '17.260'), (1, '17.600')] [2023-10-10 05:30:37,057][53252] Updated weights for policy 0, policy_version 21080 (0.0010) [2023-10-10 05:30:37,057][53268] Updated weights for policy 1, policy_version 21030 (0.0008) [2023-10-10 05:30:37,435][53268] Updated weights for policy 1, policy_version 21040 (0.0009) [2023-10-10 05:30:37,798][53268] Updated weights for policy 1, policy_version 21050 (0.0008) [2023-10-10 05:30:41,005][53252] Updated weights for policy 0, policy_version 21090 (0.0008) [2023-10-10 05:30:41,364][53252] Updated weights for policy 0, policy_version 21100 (0.0010) [2023-10-10 05:30:41,734][53252] Updated weights for policy 0, policy_version 21110 (0.0011) [2023-10-10 05:30:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 43155456. Throughput: 0: 1667.2, 1: 1679.3. Samples: 10806146. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:30:41,784][52050] Avg episode reward: [(0, '18.200'), (1, '17.500')] [2023-10-10 05:30:42,018][53268] Updated weights for policy 1, policy_version 21060 (0.0008) [2023-10-10 05:30:42,103][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000021120_21626880.pth... [2023-10-10 05:30:42,107][53252] Updated weights for policy 0, policy_version 21120 (0.0008) [2023-10-10 05:30:42,131][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000019552_20021248.pth [2023-10-10 05:30:42,387][53268] Updated weights for policy 1, policy_version 21070 (0.0009) [2023-10-10 05:30:42,753][53268] Updated weights for policy 1, policy_version 21080 (0.0010) [2023-10-10 05:30:43,047][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000021088_21594112.pth... [2023-10-10 05:30:43,085][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000019520_19988480.pth [2023-10-10 05:30:46,374][53252] Updated weights for policy 0, policy_version 21130 (0.0008) [2023-10-10 05:30:46,743][53252] Updated weights for policy 0, policy_version 21140 (0.0008) [2023-10-10 05:30:46,777][53268] Updated weights for policy 1, policy_version 21090 (0.0008) [2023-10-10 05:30:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43220992. Throughput: 0: 1682.6, 1: 1679.9. Samples: 10815896. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:30:46,784][52050] Avg episode reward: [(0, '17.300'), (1, '17.750')] [2023-10-10 05:30:47,111][53252] Updated weights for policy 0, policy_version 21150 (0.0007) [2023-10-10 05:30:47,142][53268] Updated weights for policy 1, policy_version 21100 (0.0007) [2023-10-10 05:30:47,510][53268] Updated weights for policy 1, policy_version 21110 (0.0007) [2023-10-10 05:30:47,879][53268] Updated weights for policy 1, policy_version 21120 (0.0011) [2023-10-10 05:30:51,161][53252] Updated weights for policy 0, policy_version 21160 (0.0007) [2023-10-10 05:30:51,541][53252] Updated weights for policy 0, policy_version 21170 (0.0007) [2023-10-10 05:30:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43286528. Throughput: 0: 1679.6, 1: 1681.5. Samples: 10836586. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:30:51,784][52050] Avg episode reward: [(0, '17.130'), (1, '17.530')] [2023-10-10 05:30:51,891][53268] Updated weights for policy 1, policy_version 21130 (0.0009) [2023-10-10 05:30:51,903][53252] Updated weights for policy 0, policy_version 21180 (0.0010) [2023-10-10 05:30:52,263][53268] Updated weights for policy 1, policy_version 21140 (0.0009) [2023-10-10 05:30:52,630][53268] Updated weights for policy 1, policy_version 21150 (0.0009) [2023-10-10 05:30:55,906][53252] Updated weights for policy 0, policy_version 21190 (0.0008) [2023-10-10 05:30:56,270][53252] Updated weights for policy 0, policy_version 21200 (0.0008) [2023-10-10 05:30:56,643][53252] Updated weights for policy 0, policy_version 21210 (0.0008) [2023-10-10 05:30:56,691][53268] Updated weights for policy 1, policy_version 21160 (0.0010) [2023-10-10 05:30:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43352064. Throughput: 0: 1664.9, 1: 1681.2. Samples: 10856622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:30:56,784][52050] Avg episode reward: [(0, '17.010'), (1, '17.080')] [2023-10-10 05:30:57,062][53268] Updated weights for policy 1, policy_version 21170 (0.0008) [2023-10-10 05:30:57,425][53268] Updated weights for policy 1, policy_version 21180 (0.0008) [2023-10-10 05:31:00,660][53252] Updated weights for policy 0, policy_version 21220 (0.0008) [2023-10-10 05:31:01,024][53252] Updated weights for policy 0, policy_version 21230 (0.0010) [2023-10-10 05:31:01,414][53252] Updated weights for policy 0, policy_version 21240 (0.0010) [2023-10-10 05:31:01,611][53268] Updated weights for policy 1, policy_version 21190 (0.0009) [2023-10-10 05:31:01,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43450368. Throughput: 0: 1685.1, 1: 1676.9. Samples: 10866280. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 05:31:01,784][52050] Avg episode reward: [(0, '15.350'), (1, '16.430')] [2023-10-10 05:31:01,969][53268] Updated weights for policy 1, policy_version 21200 (0.0008) [2023-10-10 05:31:02,349][53268] Updated weights for policy 1, policy_version 21210 (0.0007) [2023-10-10 05:31:05,406][53252] Updated weights for policy 0, policy_version 21250 (0.0008) [2023-10-10 05:31:05,775][53252] Updated weights for policy 0, policy_version 21260 (0.0007) [2023-10-10 05:31:06,153][53252] Updated weights for policy 0, policy_version 21270 (0.0007) [2023-10-10 05:31:06,515][53268] Updated weights for policy 1, policy_version 21220 (0.0008) [2023-10-10 05:31:06,525][53252] Updated weights for policy 0, policy_version 21280 (0.0007) [2023-10-10 05:31:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43515904. Throughput: 0: 1687.0, 1: 1676.1. Samples: 10886858. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 05:31:06,784][52050] Avg episode reward: [(0, '18.090'), (1, '17.030')] [2023-10-10 05:31:06,881][53268] Updated weights for policy 1, policy_version 21230 (0.0007) [2023-10-10 05:31:07,246][53268] Updated weights for policy 1, policy_version 21240 (0.0010) [2023-10-10 05:31:10,546][53252] Updated weights for policy 0, policy_version 21290 (0.0009) [2023-10-10 05:31:10,913][53252] Updated weights for policy 0, policy_version 21300 (0.0008) [2023-10-10 05:31:11,285][53252] Updated weights for policy 0, policy_version 21310 (0.0007) [2023-10-10 05:31:11,459][53268] Updated weights for policy 1, policy_version 21250 (0.0010) [2023-10-10 05:31:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43581440. Throughput: 0: 1664.3, 1: 1677.8. Samples: 10906354. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 05:31:11,784][52050] Avg episode reward: [(0, '18.230'), (1, '16.660')] [2023-10-10 05:31:11,824][53268] Updated weights for policy 1, policy_version 21260 (0.0010) [2023-10-10 05:31:12,195][53268] Updated weights for policy 1, policy_version 21270 (0.0008) [2023-10-10 05:31:12,559][53268] Updated weights for policy 1, policy_version 21280 (0.0008) [2023-10-10 05:31:15,441][53252] Updated weights for policy 0, policy_version 21320 (0.0009) [2023-10-10 05:31:15,815][53252] Updated weights for policy 0, policy_version 21330 (0.0009) [2023-10-10 05:31:16,194][53252] Updated weights for policy 0, policy_version 21340 (0.0007) [2023-10-10 05:31:16,486][53268] Updated weights for policy 1, policy_version 21290 (0.0009) [2023-10-10 05:31:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43646976. Throughput: 0: 1690.8, 1: 1678.7. Samples: 10916792. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 31.0) [2023-10-10 05:31:16,784][52050] Avg episode reward: [(0, '18.870'), (1, '15.550')] [2023-10-10 05:31:16,853][53268] Updated weights for policy 1, policy_version 21300 (0.0007) [2023-10-10 05:31:17,225][53268] Updated weights for policy 1, policy_version 21310 (0.0007) [2023-10-10 05:31:20,160][53252] Updated weights for policy 0, policy_version 21350 (0.0007) [2023-10-10 05:31:20,536][53252] Updated weights for policy 0, policy_version 21360 (0.0010) [2023-10-10 05:31:20,897][53252] Updated weights for policy 0, policy_version 21370 (0.0008) [2023-10-10 05:31:21,292][53268] Updated weights for policy 1, policy_version 21320 (0.0008) [2023-10-10 05:31:21,667][53268] Updated weights for policy 1, policy_version 21330 (0.0008) [2023-10-10 05:31:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43712512. Throughput: 0: 1678.5, 1: 1682.3. Samples: 10937162. Policy #0 lag: (min: 27.0, avg: 27.0, max: 31.0) [2023-10-10 05:31:21,784][52050] Avg episode reward: [(0, '18.840'), (1, '15.750')] [2023-10-10 05:31:22,038][53268] Updated weights for policy 1, policy_version 21340 (0.0008) [2023-10-10 05:31:24,827][53252] Updated weights for policy 0, policy_version 21380 (0.0008) [2023-10-10 05:31:25,204][53252] Updated weights for policy 0, policy_version 21390 (0.0008) [2023-10-10 05:31:25,578][53252] Updated weights for policy 0, policy_version 21400 (0.0010) [2023-10-10 05:31:26,007][53268] Updated weights for policy 1, policy_version 21350 (0.0008) [2023-10-10 05:31:26,372][53268] Updated weights for policy 1, policy_version 21360 (0.0011) [2023-10-10 05:31:26,739][53268] Updated weights for policy 1, policy_version 21370 (0.0010) [2023-10-10 05:31:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43778048. Throughput: 0: 1675.0, 1: 1674.8. Samples: 10956888. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 31.0) [2023-10-10 05:31:26,784][52050] Avg episode reward: [(0, '16.600'), (1, '17.620')] [2023-10-10 05:31:29,768][53252] Updated weights for policy 0, policy_version 21410 (0.0008) [2023-10-10 05:31:30,126][53252] Updated weights for policy 0, policy_version 21420 (0.0010) [2023-10-10 05:31:30,497][53252] Updated weights for policy 0, policy_version 21430 (0.0010) [2023-10-10 05:31:30,692][53268] Updated weights for policy 1, policy_version 21380 (0.0009) [2023-10-10 05:31:30,873][53252] Updated weights for policy 0, policy_version 21440 (0.0011) [2023-10-10 05:31:31,063][53268] Updated weights for policy 1, policy_version 21390 (0.0010) [2023-10-10 05:31:31,423][53268] Updated weights for policy 1, policy_version 21400 (0.0011) [2023-10-10 05:31:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 43876352. Throughput: 0: 1689.0, 1: 1682.0. Samples: 10967590. Policy #0 lag: (min: 27.0, avg: 27.0, max: 31.0) [2023-10-10 05:31:31,784][52050] Avg episode reward: [(0, '18.520'), (1, '16.150')] [2023-10-10 05:31:35,021][53252] Updated weights for policy 0, policy_version 21450 (0.0007) [2023-10-10 05:31:35,392][53252] Updated weights for policy 0, policy_version 21460 (0.0008) [2023-10-10 05:31:35,442][53268] Updated weights for policy 1, policy_version 21410 (0.0009) [2023-10-10 05:31:35,757][53252] Updated weights for policy 0, policy_version 21470 (0.0008) [2023-10-10 05:31:35,815][53268] Updated weights for policy 1, policy_version 21420 (0.0010) [2023-10-10 05:31:36,183][53268] Updated weights for policy 1, policy_version 21430 (0.0009) [2023-10-10 05:31:36,547][53268] Updated weights for policy 1, policy_version 21440 (0.0008) [2023-10-10 05:31:36,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 43941888. Throughput: 0: 1671.8, 1: 1686.2. Samples: 10987698. 
Policy #0 lag: (min: 17.0, avg: 26.1, max: 49.0) [2023-10-10 05:31:36,784][52050] Avg episode reward: [(0, '17.380'), (1, '16.490')] [2023-10-10 05:31:39,849][53252] Updated weights for policy 0, policy_version 21480 (0.0007) [2023-10-10 05:31:40,229][53252] Updated weights for policy 0, policy_version 21490 (0.0009) [2023-10-10 05:31:40,597][53252] Updated weights for policy 0, policy_version 21500 (0.0009) [2023-10-10 05:31:40,667][53268] Updated weights for policy 1, policy_version 21450 (0.0008) [2023-10-10 05:31:41,033][53268] Updated weights for policy 1, policy_version 21460 (0.0011) [2023-10-10 05:31:41,419][53268] Updated weights for policy 1, policy_version 21470 (0.0011) [2023-10-10 05:31:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 44007424. Throughput: 0: 1674.8, 1: 1666.9. Samples: 11006996. Policy #0 lag: (min: 17.0, avg: 26.1, max: 49.0) [2023-10-10 05:31:41,784][52050] Avg episode reward: [(0, '18.040'), (1, '17.600')] [2023-10-10 05:31:44,588][53252] Updated weights for policy 0, policy_version 21510 (0.0007) [2023-10-10 05:31:44,970][53252] Updated weights for policy 0, policy_version 21520 (0.0007) [2023-10-10 05:31:45,338][53252] Updated weights for policy 0, policy_version 21530 (0.0009) [2023-10-10 05:31:45,651][53268] Updated weights for policy 1, policy_version 21480 (0.0010) [2023-10-10 05:31:46,026][53268] Updated weights for policy 1, policy_version 21490 (0.0007) [2023-10-10 05:31:46,389][53268] Updated weights for policy 1, policy_version 21500 (0.0008) [2023-10-10 05:31:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 44072960. Throughput: 0: 1683.1, 1: 1691.4. Samples: 11018134. 
Policy #0 lag: (min: 17.0, avg: 26.1, max: 49.0) [2023-10-10 05:31:46,784][52050] Avg episode reward: [(0, '17.600'), (1, '15.270')] [2023-10-10 05:31:49,434][53252] Updated weights for policy 0, policy_version 21540 (0.0008) [2023-10-10 05:31:49,800][53252] Updated weights for policy 0, policy_version 21550 (0.0007) [2023-10-10 05:31:50,176][53252] Updated weights for policy 0, policy_version 21560 (0.0008) [2023-10-10 05:31:50,464][53268] Updated weights for policy 1, policy_version 21510 (0.0008) [2023-10-10 05:31:50,830][53268] Updated weights for policy 1, policy_version 21520 (0.0008) [2023-10-10 05:31:51,215][53268] Updated weights for policy 1, policy_version 21530 (0.0009) [2023-10-10 05:31:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 44138496. Throughput: 0: 1658.0, 1: 1690.6. Samples: 11037544. Policy #0 lag: (min: 17.0, avg: 26.1, max: 49.0) [2023-10-10 05:31:51,784][52050] Avg episode reward: [(0, '17.080'), (1, '16.080')] [2023-10-10 05:31:54,258][53252] Updated weights for policy 0, policy_version 21570 (0.0009) [2023-10-10 05:31:54,627][53252] Updated weights for policy 0, policy_version 21580 (0.0008) [2023-10-10 05:31:55,001][53252] Updated weights for policy 0, policy_version 21590 (0.0007) [2023-10-10 05:31:55,297][53268] Updated weights for policy 1, policy_version 21540 (0.0010) [2023-10-10 05:31:55,370][53252] Updated weights for policy 0, policy_version 21600 (0.0007) [2023-10-10 05:31:55,669][53268] Updated weights for policy 1, policy_version 21550 (0.0010) [2023-10-10 05:31:56,031][53268] Updated weights for policy 1, policy_version 21560 (0.0007) [2023-10-10 05:31:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 44204032. Throughput: 0: 1680.9, 1: 1667.4. Samples: 11057028. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:31:56,785][52050] Avg episode reward: [(0, '18.180'), (1, '17.160')] [2023-10-10 05:31:59,366][53252] Updated weights for policy 0, policy_version 21610 (0.0007) [2023-10-10 05:31:59,742][53252] Updated weights for policy 0, policy_version 21620 (0.0009) [2023-10-10 05:32:00,093][53268] Updated weights for policy 1, policy_version 21570 (0.0008) [2023-10-10 05:32:00,108][53252] Updated weights for policy 0, policy_version 21630 (0.0008) [2023-10-10 05:32:00,451][53268] Updated weights for policy 1, policy_version 21580 (0.0008) [2023-10-10 05:32:00,819][53268] Updated weights for policy 1, policy_version 21590 (0.0010) [2023-10-10 05:32:01,191][53268] Updated weights for policy 1, policy_version 21600 (0.0009) [2023-10-10 05:32:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44269568. Throughput: 0: 1670.5, 1: 1693.2. Samples: 11068158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:32:01,784][52050] Avg episode reward: [(0, '17.410'), (1, '16.210')] [2023-10-10 05:32:04,209][53252] Updated weights for policy 0, policy_version 21640 (0.0009) [2023-10-10 05:32:04,581][53252] Updated weights for policy 0, policy_version 21650 (0.0007) [2023-10-10 05:32:04,965][53252] Updated weights for policy 0, policy_version 21660 (0.0009) [2023-10-10 05:32:05,283][53268] Updated weights for policy 1, policy_version 21610 (0.0010) [2023-10-10 05:32:05,655][53268] Updated weights for policy 1, policy_version 21620 (0.0009) [2023-10-10 05:32:06,025][53268] Updated weights for policy 1, policy_version 21630 (0.0010) [2023-10-10 05:32:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44335104. Throughput: 0: 1660.0, 1: 1681.1. Samples: 11087510. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:32:06,784][52050] Avg episode reward: [(0, '17.580'), (1, '16.890')] [2023-10-10 05:32:09,123][53252] Updated weights for policy 0, policy_version 21670 (0.0008) [2023-10-10 05:32:09,491][53252] Updated weights for policy 0, policy_version 21680 (0.0010) [2023-10-10 05:32:09,866][53252] Updated weights for policy 0, policy_version 21690 (0.0008) [2023-10-10 05:32:10,086][53268] Updated weights for policy 1, policy_version 21640 (0.0009) [2023-10-10 05:32:10,456][53268] Updated weights for policy 1, policy_version 21650 (0.0009) [2023-10-10 05:32:10,835][53268] Updated weights for policy 1, policy_version 21660 (0.0007) [2023-10-10 05:32:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44400640. Throughput: 0: 1675.3, 1: 1660.9. Samples: 11107020. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:32:11,784][52050] Avg episode reward: [(0, '17.500'), (1, '18.090')] [2023-10-10 05:32:14,092][53252] Updated weights for policy 0, policy_version 21700 (0.0007) [2023-10-10 05:32:14,453][53252] Updated weights for policy 0, policy_version 21710 (0.0007) [2023-10-10 05:32:14,828][53252] Updated weights for policy 0, policy_version 21720 (0.0007) [2023-10-10 05:32:14,979][53268] Updated weights for policy 1, policy_version 21670 (0.0010) [2023-10-10 05:32:15,350][53268] Updated weights for policy 1, policy_version 21680 (0.0009) [2023-10-10 05:32:15,716][53268] Updated weights for policy 1, policy_version 21690 (0.0009) [2023-10-10 05:32:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44466176. Throughput: 0: 1661.9, 1: 1680.6. Samples: 11118004. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:32:16,784][52050] Avg episode reward: [(0, '17.350'), (1, '17.370')] [2023-10-10 05:32:18,746][53252] Updated weights for policy 0, policy_version 21730 (0.0007) [2023-10-10 05:32:19,116][53252] Updated weights for policy 0, policy_version 21740 (0.0008) [2023-10-10 05:32:19,476][53252] Updated weights for policy 0, policy_version 21750 (0.0007) [2023-10-10 05:32:19,769][53268] Updated weights for policy 1, policy_version 21700 (0.0009) [2023-10-10 05:32:19,843][53252] Updated weights for policy 0, policy_version 21760 (0.0009) [2023-10-10 05:32:20,143][53268] Updated weights for policy 1, policy_version 21710 (0.0009) [2023-10-10 05:32:20,510][53268] Updated weights for policy 1, policy_version 21720 (0.0010) [2023-10-10 05:32:21,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 44531712. Throughput: 0: 1666.7, 1: 1661.9. Samples: 11137490. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:32:21,784][52050] Avg episode reward: [(0, '16.410'), (1, '17.930')] [2023-10-10 05:32:23,912][53252] Updated weights for policy 0, policy_version 21770 (0.0009) [2023-10-10 05:32:24,288][53252] Updated weights for policy 0, policy_version 21780 (0.0008) [2023-10-10 05:32:24,492][53268] Updated weights for policy 1, policy_version 21730 (0.0009) [2023-10-10 05:32:24,669][53252] Updated weights for policy 0, policy_version 21790 (0.0007) [2023-10-10 05:32:24,853][53268] Updated weights for policy 1, policy_version 21740 (0.0007) [2023-10-10 05:32:25,215][53268] Updated weights for policy 1, policy_version 21750 (0.0010) [2023-10-10 05:32:25,578][53268] Updated weights for policy 1, policy_version 21760 (0.0009) [2023-10-10 05:32:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 44597248. Throughput: 0: 1681.2, 1: 1664.1. Samples: 11157532. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:32:26,784][52050] Avg episode reward: [(0, '17.520'), (1, '17.930')] [2023-10-10 05:32:28,674][53252] Updated weights for policy 0, policy_version 21800 (0.0007) [2023-10-10 05:32:29,039][53252] Updated weights for policy 0, policy_version 21810 (0.0007) [2023-10-10 05:32:29,409][53252] Updated weights for policy 0, policy_version 21820 (0.0007) [2023-10-10 05:32:29,770][53268] Updated weights for policy 1, policy_version 21770 (0.0007) [2023-10-10 05:32:30,146][53268] Updated weights for policy 1, policy_version 21780 (0.0007) [2023-10-10 05:32:30,505][53268] Updated weights for policy 1, policy_version 21790 (0.0009) [2023-10-10 05:32:31,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44662784. Throughput: 0: 1663.9, 1: 1672.5. Samples: 11168270. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:32:31,784][52050] Avg episode reward: [(0, '17.310'), (1, '16.810')] [2023-10-10 05:32:33,352][53252] Updated weights for policy 0, policy_version 21830 (0.0009) [2023-10-10 05:32:33,733][53252] Updated weights for policy 0, policy_version 21840 (0.0008) [2023-10-10 05:32:34,106][53252] Updated weights for policy 0, policy_version 21850 (0.0009) [2023-10-10 05:32:34,504][53268] Updated weights for policy 1, policy_version 21800 (0.0008) [2023-10-10 05:32:34,861][53268] Updated weights for policy 1, policy_version 21810 (0.0008) [2023-10-10 05:32:35,229][53268] Updated weights for policy 1, policy_version 21820 (0.0009) [2023-10-10 05:32:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44728320. Throughput: 0: 1688.3, 1: 1656.0. Samples: 11188036. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:32:36,784][52050] Avg episode reward: [(0, '17.190'), (1, '16.690')] [2023-10-10 05:32:38,102][53252] Updated weights for policy 0, policy_version 21860 (0.0010) [2023-10-10 05:32:38,471][53252] Updated weights for policy 0, policy_version 21870 (0.0010) [2023-10-10 05:32:38,851][53252] Updated weights for policy 0, policy_version 21880 (0.0010) [2023-10-10 05:32:39,302][53268] Updated weights for policy 1, policy_version 21830 (0.0011) [2023-10-10 05:32:39,675][53268] Updated weights for policy 1, policy_version 21840 (0.0008) [2023-10-10 05:32:40,048][53268] Updated weights for policy 1, policy_version 21850 (0.0007) [2023-10-10 05:32:41,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 44793856. Throughput: 0: 1693.3, 1: 1670.4. Samples: 11208396. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:32:41,785][52050] Avg episode reward: [(0, '17.520'), (1, '17.950')] [2023-10-10 05:32:41,798][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000021856_22380544.pth... [2023-10-10 05:32:41,798][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000021888_22413312.pth... 
[2023-10-10 05:32:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000020288_20774912.pth [2023-10-10 05:32:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000020320_20807680.pth [2023-10-10 05:32:42,942][53252] Updated weights for policy 0, policy_version 21890 (0.0009) [2023-10-10 05:32:43,319][53252] Updated weights for policy 0, policy_version 21900 (0.0007) [2023-10-10 05:32:43,687][53252] Updated weights for policy 0, policy_version 21910 (0.0007) [2023-10-10 05:32:44,055][53252] Updated weights for policy 0, policy_version 21920 (0.0009) [2023-10-10 05:32:44,136][53268] Updated weights for policy 1, policy_version 21860 (0.0008) [2023-10-10 05:32:44,501][53268] Updated weights for policy 1, policy_version 21870 (0.0008) [2023-10-10 05:32:44,872][53268] Updated weights for policy 1, policy_version 21880 (0.0008) [2023-10-10 05:32:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44859392. Throughput: 0: 1675.1, 1: 1670.8. Samples: 11218722. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:32:46,784][52050] Avg episode reward: [(0, '16.840'), (1, '18.110')] [2023-10-10 05:32:47,936][53252] Updated weights for policy 0, policy_version 21930 (0.0008) [2023-10-10 05:32:48,299][53252] Updated weights for policy 0, policy_version 21940 (0.0009) [2023-10-10 05:32:48,675][53252] Updated weights for policy 0, policy_version 21950 (0.0007) [2023-10-10 05:32:48,774][53268] Updated weights for policy 1, policy_version 21890 (0.0009) [2023-10-10 05:32:49,136][53268] Updated weights for policy 1, policy_version 21900 (0.0009) [2023-10-10 05:32:49,509][53268] Updated weights for policy 1, policy_version 21910 (0.0009) [2023-10-10 05:32:49,872][53268] Updated weights for policy 1, policy_version 21920 (0.0009) [2023-10-10 05:32:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44924928. 
Throughput: 0: 1701.3, 1: 1659.4. Samples: 11238742. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:32:51,784][52050] Avg episode reward: [(0, '17.020'), (1, '18.550')] [2023-10-10 05:32:52,841][53252] Updated weights for policy 0, policy_version 21960 (0.0009) [2023-10-10 05:32:53,220][53252] Updated weights for policy 0, policy_version 21970 (0.0009) [2023-10-10 05:32:53,589][53252] Updated weights for policy 0, policy_version 21980 (0.0008) [2023-10-10 05:32:54,065][53268] Updated weights for policy 1, policy_version 21930 (0.0011) [2023-10-10 05:32:54,443][53268] Updated weights for policy 1, policy_version 21940 (0.0010) [2023-10-10 05:32:54,801][53268] Updated weights for policy 1, policy_version 21950 (0.0008) [2023-10-10 05:32:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44990464. Throughput: 0: 1701.6, 1: 1682.6. Samples: 11259306. Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-10 05:32:56,784][52050] Avg episode reward: [(0, '17.390'), (1, '19.260')] [2023-10-10 05:32:56,793][53061] Saving new best policy, reward=19.260! [2023-10-10 05:32:57,739][53252] Updated weights for policy 0, policy_version 21990 (0.0008) [2023-10-10 05:32:58,108][53252] Updated weights for policy 0, policy_version 22000 (0.0010) [2023-10-10 05:32:58,485][53252] Updated weights for policy 0, policy_version 22010 (0.0010) [2023-10-10 05:32:58,841][53268] Updated weights for policy 1, policy_version 21960 (0.0008) [2023-10-10 05:32:59,204][53268] Updated weights for policy 1, policy_version 21970 (0.0009) [2023-10-10 05:32:59,577][53268] Updated weights for policy 1, policy_version 21980 (0.0009) [2023-10-10 05:33:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45056000. Throughput: 0: 1688.7, 1: 1676.0. Samples: 11269416. 
Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-10 05:33:01,784][52050] Avg episode reward: [(0, '18.460'), (1, '16.810')] [2023-10-10 05:33:02,463][53252] Updated weights for policy 0, policy_version 22020 (0.0008) [2023-10-10 05:33:02,837][53252] Updated weights for policy 0, policy_version 22030 (0.0011) [2023-10-10 05:33:03,204][53252] Updated weights for policy 0, policy_version 22040 (0.0011) [2023-10-10 05:33:03,643][53268] Updated weights for policy 1, policy_version 21990 (0.0009) [2023-10-10 05:33:04,005][53268] Updated weights for policy 1, policy_version 22000 (0.0011) [2023-10-10 05:33:04,375][53268] Updated weights for policy 1, policy_version 22010 (0.0011) [2023-10-10 05:33:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45121536. Throughput: 0: 1711.8, 1: 1675.6. Samples: 11289920. Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-10 05:33:06,784][52050] Avg episode reward: [(0, '17.970'), (1, '17.020')] [2023-10-10 05:33:07,260][53252] Updated weights for policy 0, policy_version 22050 (0.0009) [2023-10-10 05:33:07,625][53252] Updated weights for policy 0, policy_version 22060 (0.0007) [2023-10-10 05:33:07,999][53252] Updated weights for policy 0, policy_version 22070 (0.0007) [2023-10-10 05:33:08,364][53252] Updated weights for policy 0, policy_version 22080 (0.0008) [2023-10-10 05:33:08,365][53268] Updated weights for policy 1, policy_version 22020 (0.0009) [2023-10-10 05:33:08,735][53268] Updated weights for policy 1, policy_version 22030 (0.0010) [2023-10-10 05:33:09,096][53268] Updated weights for policy 1, policy_version 22040 (0.0009) [2023-10-10 05:33:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45187072. Throughput: 0: 1707.2, 1: 1689.6. Samples: 11310388. 
Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-10 05:33:11,784][52050] Avg episode reward: [(0, '18.440'), (1, '17.150')] [2023-10-10 05:33:12,455][53252] Updated weights for policy 0, policy_version 22090 (0.0009) [2023-10-10 05:33:12,819][53252] Updated weights for policy 0, policy_version 22100 (0.0009) [2023-10-10 05:33:13,148][53268] Updated weights for policy 1, policy_version 22050 (0.0011) [2023-10-10 05:33:13,187][53252] Updated weights for policy 0, policy_version 22110 (0.0007) [2023-10-10 05:33:13,522][53268] Updated weights for policy 1, policy_version 22060 (0.0011) [2023-10-10 05:33:13,883][53268] Updated weights for policy 1, policy_version 22070 (0.0010) [2023-10-10 05:33:14,254][53268] Updated weights for policy 1, policy_version 22080 (0.0010) [2023-10-10 05:33:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45252608. Throughput: 0: 1693.2, 1: 1667.5. Samples: 11319498. Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) [2023-10-10 05:33:16,784][52050] Avg episode reward: [(0, '17.630'), (1, '15.490')] [2023-10-10 05:33:17,291][53252] Updated weights for policy 0, policy_version 22120 (0.0008) [2023-10-10 05:33:17,659][53252] Updated weights for policy 0, policy_version 22130 (0.0008) [2023-10-10 05:33:18,038][53252] Updated weights for policy 0, policy_version 22140 (0.0008) [2023-10-10 05:33:18,252][53268] Updated weights for policy 1, policy_version 22090 (0.0008) [2023-10-10 05:33:18,615][53268] Updated weights for policy 1, policy_version 22100 (0.0007) [2023-10-10 05:33:18,976][53268] Updated weights for policy 1, policy_version 22110 (0.0008) [2023-10-10 05:33:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45318144. Throughput: 0: 1691.1, 1: 1685.7. Samples: 11339992. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) [2023-10-10 05:33:21,784][52050] Avg episode reward: [(0, '17.520'), (1, '16.480')] [2023-10-10 05:33:22,160][53252] Updated weights for policy 0, policy_version 22150 (0.0008) [2023-10-10 05:33:22,541][53252] Updated weights for policy 0, policy_version 22160 (0.0009) [2023-10-10 05:33:22,914][53252] Updated weights for policy 0, policy_version 22170 (0.0008) [2023-10-10 05:33:23,156][53268] Updated weights for policy 1, policy_version 22120 (0.0007) [2023-10-10 05:33:23,525][53268] Updated weights for policy 1, policy_version 22130 (0.0007) [2023-10-10 05:33:23,898][53268] Updated weights for policy 1, policy_version 22140 (0.0007) [2023-10-10 05:33:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45383680. Throughput: 0: 1687.4, 1: 1693.4. Samples: 11360532. Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) [2023-10-10 05:33:26,784][52050] Avg episode reward: [(0, '17.580'), (1, '17.010')] [2023-10-10 05:33:26,794][53252] Updated weights for policy 0, policy_version 22180 (0.0009) [2023-10-10 05:33:27,169][53252] Updated weights for policy 0, policy_version 22190 (0.0009) [2023-10-10 05:33:27,542][53252] Updated weights for policy 0, policy_version 22200 (0.0010) [2023-10-10 05:33:27,973][53268] Updated weights for policy 1, policy_version 22150 (0.0009) [2023-10-10 05:33:28,347][53268] Updated weights for policy 1, policy_version 22160 (0.0009) [2023-10-10 05:33:28,707][53268] Updated weights for policy 1, policy_version 22170 (0.0009) [2023-10-10 05:33:31,702][53252] Updated weights for policy 0, policy_version 22210 (0.0008) [2023-10-10 05:33:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 45449216. Throughput: 0: 1684.8, 1: 1665.9. Samples: 11369500. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) [2023-10-10 05:33:31,784][52050] Avg episode reward: [(0, '16.770'), (1, '15.270')] [2023-10-10 05:33:32,077][53252] Updated weights for policy 0, policy_version 22220 (0.0008) [2023-10-10 05:33:32,444][53252] Updated weights for policy 0, policy_version 22230 (0.0007) [2023-10-10 05:33:32,585][53268] Updated weights for policy 1, policy_version 22180 (0.0008) [2023-10-10 05:33:32,818][53252] Updated weights for policy 0, policy_version 22240 (0.0008) [2023-10-10 05:33:32,952][53268] Updated weights for policy 1, policy_version 22190 (0.0008) [2023-10-10 05:33:33,323][53268] Updated weights for policy 1, policy_version 22200 (0.0011) [2023-10-10 05:33:36,735][53252] Updated weights for policy 0, policy_version 22250 (0.0009) [2023-10-10 05:33:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45514752. Throughput: 0: 1681.2, 1: 1686.0. Samples: 11390264. Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) [2023-10-10 05:33:36,784][52050] Avg episode reward: [(0, '18.130'), (1, '16.410')] [2023-10-10 05:33:37,105][53252] Updated weights for policy 0, policy_version 22260 (0.0009) [2023-10-10 05:33:37,479][53252] Updated weights for policy 0, policy_version 22270 (0.0009) [2023-10-10 05:33:37,503][53268] Updated weights for policy 1, policy_version 22210 (0.0008) [2023-10-10 05:33:37,879][53268] Updated weights for policy 1, policy_version 22220 (0.0008) [2023-10-10 05:33:38,245][53268] Updated weights for policy 1, policy_version 22230 (0.0010) [2023-10-10 05:33:38,612][53268] Updated weights for policy 1, policy_version 22240 (0.0010) [2023-10-10 05:33:41,695][53252] Updated weights for policy 0, policy_version 22280 (0.0009) [2023-10-10 05:33:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 45580288. Throughput: 0: 1677.7, 1: 1690.2. Samples: 11410860. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) [2023-10-10 05:33:41,784][52050] Avg episode reward: [(0, '17.950'), (1, '16.660')] [2023-10-10 05:33:42,069][53252] Updated weights for policy 0, policy_version 22290 (0.0010) [2023-10-10 05:33:42,443][53252] Updated weights for policy 0, policy_version 22300 (0.0010) [2023-10-10 05:33:42,691][53268] Updated weights for policy 1, policy_version 22250 (0.0008) [2023-10-10 05:33:43,049][53268] Updated weights for policy 1, policy_version 22260 (0.0008) [2023-10-10 05:33:43,417][53268] Updated weights for policy 1, policy_version 22270 (0.0009) [2023-10-10 05:33:46,634][53252] Updated weights for policy 0, policy_version 22310 (0.0010) [2023-10-10 05:33:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45645824. Throughput: 0: 1674.7, 1: 1669.8. Samples: 11419918. Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) [2023-10-10 05:33:46,784][52050] Avg episode reward: [(0, '17.540'), (1, '17.000')] [2023-10-10 05:33:47,009][53252] Updated weights for policy 0, policy_version 22320 (0.0008) [2023-10-10 05:33:47,377][53252] Updated weights for policy 0, policy_version 22330 (0.0008) [2023-10-10 05:33:47,493][53268] Updated weights for policy 1, policy_version 22280 (0.0009) [2023-10-10 05:33:47,859][53268] Updated weights for policy 1, policy_version 22290 (0.0010) [2023-10-10 05:33:48,218][53268] Updated weights for policy 1, policy_version 22300 (0.0008) [2023-10-10 05:33:51,395][53252] Updated weights for policy 0, policy_version 22340 (0.0008) [2023-10-10 05:33:51,768][53252] Updated weights for policy 0, policy_version 22350 (0.0007) [2023-10-10 05:33:51,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 45711360. Throughput: 0: 1663.8, 1: 1685.2. Samples: 11440624. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) [2023-10-10 05:33:51,785][52050] Avg episode reward: [(0, '18.470'), (1, '16.920')] [2023-10-10 05:33:52,144][53252] Updated weights for policy 0, policy_version 22360 (0.0007) [2023-10-10 05:33:52,485][53268] Updated weights for policy 1, policy_version 22310 (0.0008) [2023-10-10 05:33:52,859][53268] Updated weights for policy 1, policy_version 22320 (0.0008) [2023-10-10 05:33:53,227][53268] Updated weights for policy 1, policy_version 22330 (0.0009) [2023-10-10 05:33:56,317][53252] Updated weights for policy 0, policy_version 22370 (0.0007) [2023-10-10 05:33:56,689][53252] Updated weights for policy 0, policy_version 22380 (0.0008) [2023-10-10 05:33:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45776896. Throughput: 0: 1661.9, 1: 1688.3. Samples: 11461146. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:33:56,784][52050] Avg episode reward: [(0, '17.580'), (1, '16.360')] [2023-10-10 05:33:57,064][53252] Updated weights for policy 0, policy_version 22390 (0.0009) [2023-10-10 05:33:57,327][53268] Updated weights for policy 1, policy_version 22340 (0.0008) [2023-10-10 05:33:57,431][53252] Updated weights for policy 0, policy_version 22400 (0.0008) [2023-10-10 05:33:57,697][53268] Updated weights for policy 1, policy_version 22350 (0.0007) [2023-10-10 05:33:58,068][53268] Updated weights for policy 1, policy_version 22360 (0.0008) [2023-10-10 05:34:01,593][53252] Updated weights for policy 0, policy_version 22410 (0.0007) [2023-10-10 05:34:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45842432. Throughput: 0: 1670.0, 1: 1682.3. Samples: 11470352. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:34:01,784][52050] Avg episode reward: [(0, '17.530'), (1, '16.240')] [2023-10-10 05:34:01,974][53252] Updated weights for policy 0, policy_version 22420 (0.0009) [2023-10-10 05:34:02,109][53268] Updated weights for policy 1, policy_version 22370 (0.0008) [2023-10-10 05:34:02,340][53252] Updated weights for policy 0, policy_version 22430 (0.0009) [2023-10-10 05:34:02,468][53268] Updated weights for policy 1, policy_version 22380 (0.0008) [2023-10-10 05:34:02,838][53268] Updated weights for policy 1, policy_version 22390 (0.0011) [2023-10-10 05:34:03,214][53268] Updated weights for policy 1, policy_version 22400 (0.0011) [2023-10-10 05:34:06,362][53252] Updated weights for policy 0, policy_version 22440 (0.0008) [2023-10-10 05:34:06,738][53252] Updated weights for policy 0, policy_version 22450 (0.0008) [2023-10-10 05:34:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45907968. Throughput: 0: 1674.8, 1: 1683.3. Samples: 11491106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:34:06,784][52050] Avg episode reward: [(0, '18.680'), (1, '18.150')] [2023-10-10 05:34:07,110][53252] Updated weights for policy 0, policy_version 22460 (0.0008) [2023-10-10 05:34:07,365][53268] Updated weights for policy 1, policy_version 22410 (0.0008) [2023-10-10 05:34:07,738][53268] Updated weights for policy 1, policy_version 22420 (0.0008) [2023-10-10 05:34:08,119][53268] Updated weights for policy 1, policy_version 22430 (0.0007) [2023-10-10 05:34:11,064][53252] Updated weights for policy 0, policy_version 22470 (0.0008) [2023-10-10 05:34:11,431][53252] Updated weights for policy 0, policy_version 22480 (0.0007) [2023-10-10 05:34:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 45973504. Throughput: 0: 1665.2, 1: 1684.5. Samples: 11511270. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:34:11,784][52050] Avg episode reward: [(0, '18.710'), (1, '18.520')] [2023-10-10 05:34:11,806][53252] Updated weights for policy 0, policy_version 22490 (0.0009) [2023-10-10 05:34:12,279][53268] Updated weights for policy 1, policy_version 22440 (0.0008) [2023-10-10 05:34:12,672][53268] Updated weights for policy 1, policy_version 22450 (0.0009) [2023-10-10 05:34:13,042][53268] Updated weights for policy 1, policy_version 22460 (0.0009) [2023-10-10 05:34:15,726][53252] Updated weights for policy 0, policy_version 22500 (0.0007) [2023-10-10 05:34:16,105][53252] Updated weights for policy 0, policy_version 22510 (0.0008) [2023-10-10 05:34:16,485][53252] Updated weights for policy 0, policy_version 22520 (0.0008) [2023-10-10 05:34:16,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46071808. Throughput: 0: 1680.8, 1: 1680.5. Samples: 11520758. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:34:16,784][52050] Avg episode reward: [(0, '19.540'), (1, '18.350')] [2023-10-10 05:34:17,001][53268] Updated weights for policy 1, policy_version 22470 (0.0007) [2023-10-10 05:34:17,375][53268] Updated weights for policy 1, policy_version 22480 (0.0009) [2023-10-10 05:34:17,748][53268] Updated weights for policy 1, policy_version 22490 (0.0011) [2023-10-10 05:34:20,555][53252] Updated weights for policy 0, policy_version 22530 (0.0008) [2023-10-10 05:34:20,927][53252] Updated weights for policy 0, policy_version 22540 (0.0008) [2023-10-10 05:34:21,293][53252] Updated weights for policy 0, policy_version 22550 (0.0007) [2023-10-10 05:34:21,608][53268] Updated weights for policy 1, policy_version 22500 (0.0010) [2023-10-10 05:34:21,666][53252] Updated weights for policy 0, policy_version 22560 (0.0007) [2023-10-10 05:34:21,783][52050] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46137344. Throughput: 0: 1680.4, 1: 1680.5. 
Samples: 11541504. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:34:21,784][52050] Avg episode reward: [(0, '19.500'), (1, '19.900')] [2023-10-10 05:34:21,976][53268] Updated weights for policy 1, policy_version 22510 (0.0009) [2023-10-10 05:34:22,343][53268] Updated weights for policy 1, policy_version 22520 (0.0011) [2023-10-10 05:34:22,636][53061] Saving new best policy, reward=19.900! [2023-10-10 05:34:25,761][53252] Updated weights for policy 0, policy_version 22570 (0.0009) [2023-10-10 05:34:26,118][53252] Updated weights for policy 0, policy_version 22580 (0.0009) [2023-10-10 05:34:26,491][53252] Updated weights for policy 0, policy_version 22590 (0.0007) [2023-10-10 05:34:26,528][53268] Updated weights for policy 1, policy_version 22530 (0.0009) [2023-10-10 05:34:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46202880. Throughput: 0: 1661.7, 1: 1680.9. Samples: 11561276. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:34:26,784][52050] Avg episode reward: [(0, '19.560'), (1, '17.930')] [2023-10-10 05:34:26,902][53268] Updated weights for policy 1, policy_version 22540 (0.0009) [2023-10-10 05:34:27,267][53268] Updated weights for policy 1, policy_version 22550 (0.0012) [2023-10-10 05:34:27,632][53268] Updated weights for policy 1, policy_version 22560 (0.0008) [2023-10-10 05:34:30,395][53252] Updated weights for policy 0, policy_version 22600 (0.0008) [2023-10-10 05:34:30,773][53252] Updated weights for policy 0, policy_version 22610 (0.0008) [2023-10-10 05:34:31,138][53252] Updated weights for policy 0, policy_version 22620 (0.0008) [2023-10-10 05:34:31,738][53268] Updated weights for policy 1, policy_version 22570 (0.0007) [2023-10-10 05:34:31,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46268416. Throughput: 0: 1689.5, 1: 1679.5. Samples: 11571524. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 05:34:31,785][52050] Avg episode reward: [(0, '17.410'), (1, '17.660')] [2023-10-10 05:34:32,104][53268] Updated weights for policy 1, policy_version 22580 (0.0009) [2023-10-10 05:34:32,472][53268] Updated weights for policy 1, policy_version 22590 (0.0009) [2023-10-10 05:34:35,119][53252] Updated weights for policy 0, policy_version 22630 (0.0007) [2023-10-10 05:34:35,481][53252] Updated weights for policy 0, policy_version 22640 (0.0009) [2023-10-10 05:34:35,849][53252] Updated weights for policy 0, policy_version 22650 (0.0009) [2023-10-10 05:34:36,475][53268] Updated weights for policy 1, policy_version 22600 (0.0008) [2023-10-10 05:34:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46333952. Throughput: 0: 1683.1, 1: 1676.7. Samples: 11591814. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 05:34:36,784][52050] Avg episode reward: [(0, '17.050'), (1, '17.800')] [2023-10-10 05:34:36,853][53268] Updated weights for policy 1, policy_version 22610 (0.0007) [2023-10-10 05:34:37,218][53268] Updated weights for policy 1, policy_version 22620 (0.0009) [2023-10-10 05:34:39,881][53252] Updated weights for policy 0, policy_version 22660 (0.0009) [2023-10-10 05:34:40,262][53252] Updated weights for policy 0, policy_version 22670 (0.0010) [2023-10-10 05:34:40,635][53252] Updated weights for policy 0, policy_version 22680 (0.0010) [2023-10-10 05:34:41,193][53268] Updated weights for policy 1, policy_version 22630 (0.0010) [2023-10-10 05:34:41,552][53268] Updated weights for policy 1, policy_version 22640 (0.0010) [2023-10-10 05:34:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46399488. Throughput: 0: 1668.8, 1: 1677.0. Samples: 11611708. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 05:34:41,784][52050] Avg episode reward: [(0, '16.980'), (1, '17.900')] [2023-10-10 05:34:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000022688_23232512.pth... [2023-10-10 05:34:41,827][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000021120_21626880.pth [2023-10-10 05:34:41,916][53268] Updated weights for policy 1, policy_version 22650 (0.0011) [2023-10-10 05:34:42,138][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000022656_23199744.pth... [2023-10-10 05:34:42,177][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000021088_21594112.pth [2023-10-10 05:34:44,600][53252] Updated weights for policy 0, policy_version 22690 (0.0010) [2023-10-10 05:34:44,970][53252] Updated weights for policy 0, policy_version 22700 (0.0008) [2023-10-10 05:34:45,343][53252] Updated weights for policy 0, policy_version 22710 (0.0008) [2023-10-10 05:34:45,709][53252] Updated weights for policy 0, policy_version 22720 (0.0010) [2023-10-10 05:34:45,917][53268] Updated weights for policy 1, policy_version 22660 (0.0009) [2023-10-10 05:34:46,279][53268] Updated weights for policy 1, policy_version 22670 (0.0009) [2023-10-10 05:34:46,637][53268] Updated weights for policy 1, policy_version 22680 (0.0007) [2023-10-10 05:34:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46465024. Throughput: 0: 1700.2, 1: 1681.9. Samples: 11622546. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 05:34:46,784][52050] Avg episode reward: [(0, '15.440'), (1, '17.630')] [2023-10-10 05:34:49,850][53252] Updated weights for policy 0, policy_version 22730 (0.0007) [2023-10-10 05:34:50,222][53252] Updated weights for policy 0, policy_version 22740 (0.0007) [2023-10-10 05:34:50,598][53252] Updated weights for policy 0, policy_version 22750 (0.0009) [2023-10-10 05:34:50,934][53268] Updated weights for policy 1, policy_version 22690 (0.0008) [2023-10-10 05:34:51,306][53268] Updated weights for policy 1, policy_version 22700 (0.0009) [2023-10-10 05:34:51,668][53268] Updated weights for policy 1, policy_version 22710 (0.0009) [2023-10-10 05:34:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46530560. Throughput: 0: 1675.8, 1: 1681.9. Samples: 11642202. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:34:51,784][52050] Avg episode reward: [(0, '16.540'), (1, '17.120')] [2023-10-10 05:34:52,041][53268] Updated weights for policy 1, policy_version 22720 (0.0007) [2023-10-10 05:34:54,576][53252] Updated weights for policy 0, policy_version 22760 (0.0008) [2023-10-10 05:34:54,959][53252] Updated weights for policy 0, policy_version 22770 (0.0007) [2023-10-10 05:34:55,327][53252] Updated weights for policy 0, policy_version 22780 (0.0008) [2023-10-10 05:34:56,238][53268] Updated weights for policy 1, policy_version 22730 (0.0009) [2023-10-10 05:34:56,600][53268] Updated weights for policy 1, policy_version 22740 (0.0010) [2023-10-10 05:34:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46596096. Throughput: 0: 1683.5, 1: 1674.5. Samples: 11662376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:34:56,784][52050] Avg episode reward: [(0, '16.280'), (1, '17.290')] [2023-10-10 05:34:56,967][53268] Updated weights for policy 1, policy_version 22750 (0.0010) [2023-10-10 05:34:59,334][53252] Updated weights for policy 0, policy_version 22790 (0.0007) [2023-10-10 05:34:59,707][53252] Updated weights for policy 0, policy_version 22800 (0.0009) [2023-10-10 05:35:00,081][53252] Updated weights for policy 0, policy_version 22810 (0.0009) [2023-10-10 05:35:01,141][53268] Updated weights for policy 1, policy_version 22760 (0.0011) [2023-10-10 05:35:01,509][53268] Updated weights for policy 1, policy_version 22770 (0.0009) [2023-10-10 05:35:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46661632. Throughput: 0: 1693.5, 1: 1682.8. Samples: 11672694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:35:01,784][52050] Avg episode reward: [(0, '15.700'), (1, '16.860')] [2023-10-10 05:35:01,868][53268] Updated weights for policy 1, policy_version 22780 (0.0008) [2023-10-10 05:35:04,184][53252] Updated weights for policy 0, policy_version 22820 (0.0010) [2023-10-10 05:35:04,552][53252] Updated weights for policy 0, policy_version 22830 (0.0007) [2023-10-10 05:35:04,918][53252] Updated weights for policy 0, policy_version 22840 (0.0008) [2023-10-10 05:35:05,973][53268] Updated weights for policy 1, policy_version 22790 (0.0009) [2023-10-10 05:35:06,345][53268] Updated weights for policy 1, policy_version 22800 (0.0009) [2023-10-10 05:35:06,713][53268] Updated weights for policy 1, policy_version 22810 (0.0011) [2023-10-10 05:35:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 46727168. Throughput: 0: 1669.2, 1: 1680.4. Samples: 11692236. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:35:06,784][52050] Avg episode reward: [(0, '15.270'), (1, '15.840')] [2023-10-10 05:35:09,029][53252] Updated weights for policy 0, policy_version 22850 (0.0008) [2023-10-10 05:35:09,397][53252] Updated weights for policy 0, policy_version 22860 (0.0008) [2023-10-10 05:35:09,774][53252] Updated weights for policy 0, policy_version 22870 (0.0007) [2023-10-10 05:35:10,155][53252] Updated weights for policy 0, policy_version 22880 (0.0007) [2023-10-10 05:35:10,725][53268] Updated weights for policy 1, policy_version 22820 (0.0009) [2023-10-10 05:35:11,088][53268] Updated weights for policy 1, policy_version 22830 (0.0007) [2023-10-10 05:35:11,461][53268] Updated weights for policy 1, policy_version 22840 (0.0008) [2023-10-10 05:35:11,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 46825472. Throughput: 0: 1691.6, 1: 1670.5. Samples: 11712572. Policy #0 lag: (min: 16.0, avg: 40.9, max: 48.0) [2023-10-10 05:35:11,784][52050] Avg episode reward: [(0, '16.010'), (1, '16.020')] [2023-10-10 05:35:14,296][53252] Updated weights for policy 0, policy_version 22890 (0.0010) [2023-10-10 05:35:14,677][53252] Updated weights for policy 0, policy_version 22900 (0.0008) [2023-10-10 05:35:15,058][53252] Updated weights for policy 0, policy_version 22910 (0.0007) [2023-10-10 05:35:15,578][53268] Updated weights for policy 1, policy_version 22850 (0.0008) [2023-10-10 05:35:15,941][53268] Updated weights for policy 1, policy_version 22860 (0.0008) [2023-10-10 05:35:16,304][53268] Updated weights for policy 1, policy_version 22870 (0.0010) [2023-10-10 05:35:16,672][53268] Updated weights for policy 1, policy_version 22880 (0.0008) [2023-10-10 05:35:16,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 46891008. Throughput: 0: 1680.2, 1: 1685.8. Samples: 11722994. 
Policy #0 lag: (min: 16.0, avg: 40.9, max: 48.0) [2023-10-10 05:35:16,784][52050] Avg episode reward: [(0, '16.380'), (1, '15.610')] [2023-10-10 05:35:19,058][53252] Updated weights for policy 0, policy_version 22920 (0.0008) [2023-10-10 05:35:19,433][53252] Updated weights for policy 0, policy_version 22930 (0.0007) [2023-10-10 05:35:19,803][53252] Updated weights for policy 0, policy_version 22940 (0.0008) [2023-10-10 05:35:20,690][53268] Updated weights for policy 1, policy_version 22890 (0.0008) [2023-10-10 05:35:21,060][53268] Updated weights for policy 1, policy_version 22900 (0.0008) [2023-10-10 05:35:21,420][53268] Updated weights for policy 1, policy_version 22910 (0.0008) [2023-10-10 05:35:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 46956544. Throughput: 0: 1670.6, 1: 1688.3. Samples: 11742964. Policy #0 lag: (min: 16.0, avg: 40.9, max: 48.0) [2023-10-10 05:35:21,784][52050] Avg episode reward: [(0, '15.890'), (1, '16.660')] [2023-10-10 05:35:23,814][53252] Updated weights for policy 0, policy_version 22950 (0.0009) [2023-10-10 05:35:24,200][53252] Updated weights for policy 0, policy_version 22960 (0.0010) [2023-10-10 05:35:24,569][53252] Updated weights for policy 0, policy_version 22970 (0.0010) [2023-10-10 05:35:25,410][53268] Updated weights for policy 1, policy_version 22920 (0.0010) [2023-10-10 05:35:25,778][53268] Updated weights for policy 1, policy_version 22930 (0.0010) [2023-10-10 05:35:26,157][53268] Updated weights for policy 1, policy_version 22940 (0.0009) [2023-10-10 05:35:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 47022080. Throughput: 0: 1689.7, 1: 1667.6. Samples: 11762784. 
Policy #0 lag: (min: 16.0, avg: 40.9, max: 48.0) [2023-10-10 05:35:26,784][52050] Avg episode reward: [(0, '15.560'), (1, '16.310')] [2023-10-10 05:35:28,646][53252] Updated weights for policy 0, policy_version 22980 (0.0007) [2023-10-10 05:35:29,010][53252] Updated weights for policy 0, policy_version 22990 (0.0007) [2023-10-10 05:35:29,383][53252] Updated weights for policy 0, policy_version 23000 (0.0007) [2023-10-10 05:35:30,135][53268] Updated weights for policy 1, policy_version 22950 (0.0009) [2023-10-10 05:35:30,502][53268] Updated weights for policy 1, policy_version 22960 (0.0008) [2023-10-10 05:35:30,867][53268] Updated weights for policy 1, policy_version 22970 (0.0009) [2023-10-10 05:35:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 47087616. Throughput: 0: 1664.2, 1: 1686.7. Samples: 11773336. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 05:35:31,784][52050] Avg episode reward: [(0, '16.170'), (1, '17.480')] [2023-10-10 05:35:33,330][53252] Updated weights for policy 0, policy_version 23010 (0.0008) [2023-10-10 05:35:33,705][53252] Updated weights for policy 0, policy_version 23020 (0.0009) [2023-10-10 05:35:34,074][53252] Updated weights for policy 0, policy_version 23030 (0.0011) [2023-10-10 05:35:34,446][53252] Updated weights for policy 0, policy_version 23040 (0.0009) [2023-10-10 05:35:34,828][53268] Updated weights for policy 1, policy_version 22980 (0.0010) [2023-10-10 05:35:35,192][53268] Updated weights for policy 1, policy_version 22990 (0.0009) [2023-10-10 05:35:35,561][53268] Updated weights for policy 1, policy_version 23000 (0.0010) [2023-10-10 05:35:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 47153152. Throughput: 0: 1680.3, 1: 1676.6. Samples: 11793264. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 05:35:36,785][52050] Avg episode reward: [(0, '15.910'), (1, '17.730')] [2023-10-10 05:35:38,556][53252] Updated weights for policy 0, policy_version 23050 (0.0009) [2023-10-10 05:35:38,931][53252] Updated weights for policy 0, policy_version 23060 (0.0008) [2023-10-10 05:35:39,307][53252] Updated weights for policy 0, policy_version 23070 (0.0009) [2023-10-10 05:35:39,625][53268] Updated weights for policy 1, policy_version 23010 (0.0008) [2023-10-10 05:35:39,991][53268] Updated weights for policy 1, policy_version 23020 (0.0008) [2023-10-10 05:35:40,352][53268] Updated weights for policy 1, policy_version 23030 (0.0010) [2023-10-10 05:35:40,726][53268] Updated weights for policy 1, policy_version 23040 (0.0011) [2023-10-10 05:35:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 47218688. Throughput: 0: 1684.7, 1: 1665.2. Samples: 11813118. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 05:35:41,784][52050] Avg episode reward: [(0, '18.600'), (1, '16.750')] [2023-10-10 05:35:43,453][53252] Updated weights for policy 0, policy_version 23080 (0.0008) [2023-10-10 05:35:43,824][53252] Updated weights for policy 0, policy_version 23090 (0.0007) [2023-10-10 05:35:44,208][53252] Updated weights for policy 0, policy_version 23100 (0.0009) [2023-10-10 05:35:44,882][53268] Updated weights for policy 1, policy_version 23050 (0.0010) [2023-10-10 05:35:45,250][53268] Updated weights for policy 1, policy_version 23060 (0.0008) [2023-10-10 05:35:45,629][53268] Updated weights for policy 1, policy_version 23070 (0.0009) [2023-10-10 05:35:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 47284224. Throughput: 0: 1660.6, 1: 1691.0. Samples: 11823516. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 05:35:46,784][52050] Avg episode reward: [(0, '18.390'), (1, '16.560')] [2023-10-10 05:35:47,982][53252] Updated weights for policy 0, policy_version 23110 (0.0009) [2023-10-10 05:35:48,353][53252] Updated weights for policy 0, policy_version 23120 (0.0008) [2023-10-10 05:35:48,732][53252] Updated weights for policy 0, policy_version 23130 (0.0008) [2023-10-10 05:35:49,803][53268] Updated weights for policy 1, policy_version 23080 (0.0009) [2023-10-10 05:35:50,178][53268] Updated weights for policy 1, policy_version 23090 (0.0009) [2023-10-10 05:35:50,550][53268] Updated weights for policy 1, policy_version 23100 (0.0011) [2023-10-10 05:35:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 47349760. Throughput: 0: 1690.1, 1: 1675.0. Samples: 11843666. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 05:35:51,785][52050] Avg episode reward: [(0, '19.260'), (1, '18.530')] [2023-10-10 05:35:52,911][53252] Updated weights for policy 0, policy_version 23140 (0.0007) [2023-10-10 05:35:53,282][53252] Updated weights for policy 0, policy_version 23150 (0.0007) [2023-10-10 05:35:53,648][53252] Updated weights for policy 0, policy_version 23160 (0.0007) [2023-10-10 05:35:54,551][53268] Updated weights for policy 1, policy_version 23110 (0.0010) [2023-10-10 05:35:54,917][53268] Updated weights for policy 1, policy_version 23120 (0.0007) [2023-10-10 05:35:55,292][53268] Updated weights for policy 1, policy_version 23130 (0.0008) [2023-10-10 05:35:56,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 47415296. Throughput: 0: 1690.8, 1: 1670.4. Samples: 11863828. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 05:35:56,785][52050] Avg episode reward: [(0, '19.580'), (1, '17.550')] [2023-10-10 05:35:57,686][53252] Updated weights for policy 0, policy_version 23170 (0.0008) [2023-10-10 05:35:58,058][53252] Updated weights for policy 0, policy_version 23180 (0.0007) [2023-10-10 05:35:58,433][53252] Updated weights for policy 0, policy_version 23190 (0.0007) [2023-10-10 05:35:58,806][53252] Updated weights for policy 0, policy_version 23200 (0.0007) [2023-10-10 05:35:59,313][53268] Updated weights for policy 1, policy_version 23140 (0.0009) [2023-10-10 05:35:59,681][53268] Updated weights for policy 1, policy_version 23150 (0.0009) [2023-10-10 05:36:00,062][53268] Updated weights for policy 1, policy_version 23160 (0.0008) [2023-10-10 05:36:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 47480832. Throughput: 0: 1675.0, 1: 1684.1. Samples: 11874152. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 05:36:01,784][52050] Avg episode reward: [(0, '17.960'), (1, '17.850')] [2023-10-10 05:36:02,932][53252] Updated weights for policy 0, policy_version 23210 (0.0008) [2023-10-10 05:36:03,291][53252] Updated weights for policy 0, policy_version 23220 (0.0009) [2023-10-10 05:36:03,670][53252] Updated weights for policy 0, policy_version 23230 (0.0009) [2023-10-10 05:36:04,161][53268] Updated weights for policy 1, policy_version 23170 (0.0008) [2023-10-10 05:36:04,520][53268] Updated weights for policy 1, policy_version 23180 (0.0009) [2023-10-10 05:36:04,886][53268] Updated weights for policy 1, policy_version 23190 (0.0010) [2023-10-10 05:36:05,253][53268] Updated weights for policy 1, policy_version 23200 (0.0010) [2023-10-10 05:36:06,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 47546368. Throughput: 0: 1697.3, 1: 1657.2. Samples: 11893916. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 05:36:06,784][52050] Avg episode reward: [(0, '18.730'), (1, '18.380')] [2023-10-10 05:36:07,784][53252] Updated weights for policy 0, policy_version 23240 (0.0009) [2023-10-10 05:36:08,154][53252] Updated weights for policy 0, policy_version 23250 (0.0009) [2023-10-10 05:36:08,537][53252] Updated weights for policy 0, policy_version 23260 (0.0008) [2023-10-10 05:36:09,188][53268] Updated weights for policy 1, policy_version 23210 (0.0008) [2023-10-10 05:36:09,561][53268] Updated weights for policy 1, policy_version 23220 (0.0010) [2023-10-10 05:36:09,927][53268] Updated weights for policy 1, policy_version 23230 (0.0008) [2023-10-10 05:36:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47611904. Throughput: 0: 1693.2, 1: 1674.7. Samples: 11914338. Policy #0 lag: (min: 0.0, avg: 11.6, max: 32.0) [2023-10-10 05:36:11,784][52050] Avg episode reward: [(0, '16.260'), (1, '16.090')] [2023-10-10 05:36:12,726][53252] Updated weights for policy 0, policy_version 23270 (0.0010) [2023-10-10 05:36:13,101][53252] Updated weights for policy 0, policy_version 23280 (0.0009) [2023-10-10 05:36:13,468][53252] Updated weights for policy 0, policy_version 23290 (0.0008) [2023-10-10 05:36:14,117][53268] Updated weights for policy 1, policy_version 23240 (0.0007) [2023-10-10 05:36:14,484][53268] Updated weights for policy 1, policy_version 23250 (0.0008) [2023-10-10 05:36:14,856][53268] Updated weights for policy 1, policy_version 23260 (0.0008) [2023-10-10 05:36:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47677440. Throughput: 0: 1681.8, 1: 1675.5. Samples: 11924414. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 32.0) [2023-10-10 05:36:16,784][52050] Avg episode reward: [(0, '16.800'), (1, '17.220')] [2023-10-10 05:36:17,555][53252] Updated weights for policy 0, policy_version 23300 (0.0008) [2023-10-10 05:36:17,922][53252] Updated weights for policy 0, policy_version 23310 (0.0009) [2023-10-10 05:36:18,291][53252] Updated weights for policy 0, policy_version 23320 (0.0008) [2023-10-10 05:36:19,045][53268] Updated weights for policy 1, policy_version 23270 (0.0010) [2023-10-10 05:36:19,409][53268] Updated weights for policy 1, policy_version 23280 (0.0008) [2023-10-10 05:36:19,777][53268] Updated weights for policy 1, policy_version 23290 (0.0007) [2023-10-10 05:36:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47742976. Throughput: 0: 1690.5, 1: 1664.8. Samples: 11944250. Policy #0 lag: (min: 0.0, avg: 11.6, max: 32.0) [2023-10-10 05:36:21,784][52050] Avg episode reward: [(0, '17.350'), (1, '16.590')] [2023-10-10 05:36:22,332][53252] Updated weights for policy 0, policy_version 23330 (0.0008) [2023-10-10 05:36:22,707][53252] Updated weights for policy 0, policy_version 23340 (0.0007) [2023-10-10 05:36:23,076][53252] Updated weights for policy 0, policy_version 23350 (0.0007) [2023-10-10 05:36:23,440][53252] Updated weights for policy 0, policy_version 23360 (0.0008) [2023-10-10 05:36:23,613][53268] Updated weights for policy 1, policy_version 23300 (0.0007) [2023-10-10 05:36:23,988][53268] Updated weights for policy 1, policy_version 23310 (0.0008) [2023-10-10 05:36:24,356][53268] Updated weights for policy 1, policy_version 23320 (0.0007) [2023-10-10 05:36:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47808512. Throughput: 0: 1691.1, 1: 1687.1. Samples: 11965136. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 32.0) [2023-10-10 05:36:26,784][52050] Avg episode reward: [(0, '16.860'), (1, '16.570')] [2023-10-10 05:36:27,603][53252] Updated weights for policy 0, policy_version 23370 (0.0010) [2023-10-10 05:36:27,984][53252] Updated weights for policy 0, policy_version 23380 (0.0008) [2023-10-10 05:36:28,278][53268] Updated weights for policy 1, policy_version 23330 (0.0008) [2023-10-10 05:36:28,350][53252] Updated weights for policy 0, policy_version 23390 (0.0009) [2023-10-10 05:36:28,633][53268] Updated weights for policy 1, policy_version 23340 (0.0008) [2023-10-10 05:36:29,005][53268] Updated weights for policy 1, policy_version 23350 (0.0009) [2023-10-10 05:36:29,375][53268] Updated weights for policy 1, policy_version 23360 (0.0008) [2023-10-10 05:36:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47874048. Throughput: 0: 1686.8, 1: 1667.3. Samples: 11974452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:36:31,784][52050] Avg episode reward: [(0, '17.060'), (1, '16.760')] [2023-10-10 05:36:32,382][53252] Updated weights for policy 0, policy_version 23400 (0.0008) [2023-10-10 05:36:32,766][53252] Updated weights for policy 0, policy_version 23410 (0.0010) [2023-10-10 05:36:33,140][53252] Updated weights for policy 0, policy_version 23420 (0.0008) [2023-10-10 05:36:33,337][53268] Updated weights for policy 1, policy_version 23370 (0.0009) [2023-10-10 05:36:33,705][53268] Updated weights for policy 1, policy_version 23380 (0.0008) [2023-10-10 05:36:34,072][53268] Updated weights for policy 1, policy_version 23390 (0.0007) [2023-10-10 05:36:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 47939584. Throughput: 0: 1677.3, 1: 1679.9. Samples: 11994742. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:36:36,785][52050] Avg episode reward: [(0, '18.300'), (1, '16.860')] [2023-10-10 05:36:37,293][53252] Updated weights for policy 0, policy_version 23430 (0.0010) [2023-10-10 05:36:37,670][53252] Updated weights for policy 0, policy_version 23440 (0.0010) [2023-10-10 05:36:38,032][53252] Updated weights for policy 0, policy_version 23450 (0.0008) [2023-10-10 05:36:38,179][53268] Updated weights for policy 1, policy_version 23400 (0.0009) [2023-10-10 05:36:38,542][53268] Updated weights for policy 1, policy_version 23410 (0.0010) [2023-10-10 05:36:38,915][53268] Updated weights for policy 1, policy_version 23420 (0.0010) [2023-10-10 05:36:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48005120. Throughput: 0: 1673.1, 1: 1697.5. Samples: 12015506. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:36:41,784][52050] Avg episode reward: [(0, '17.470'), (1, '17.470')] [2023-10-10 05:36:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000023424_23986176.pth... [2023-10-10 05:36:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000023456_24018944.pth... 
[2023-10-10 05:36:41,825][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000021888_22413312.pth [2023-10-10 05:36:41,826][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000021856_22380544.pth [2023-10-10 05:36:41,829][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000023456_24018944.pth [2023-10-10 05:36:41,830][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000023424_23986176.pth [2023-10-10 05:36:42,150][53252] Updated weights for policy 0, policy_version 23460 (0.0008) [2023-10-10 05:36:42,513][53252] Updated weights for policy 0, policy_version 23470 (0.0007) [2023-10-10 05:36:42,888][53252] Updated weights for policy 0, policy_version 23480 (0.0008) [2023-10-10 05:36:43,037][53268] Updated weights for policy 1, policy_version 23430 (0.0008) [2023-10-10 05:36:43,405][53268] Updated weights for policy 1, policy_version 23440 (0.0009) [2023-10-10 05:36:43,767][53268] Updated weights for policy 1, policy_version 23450 (0.0011) [2023-10-10 05:36:46,747][53252] Updated weights for policy 0, policy_version 23490 (0.0008) [2023-10-10 05:36:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48070656. Throughput: 0: 1675.2, 1: 1668.1. Samples: 12024600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:36:46,784][52050] Avg episode reward: [(0, '17.630'), (1, '16.650')] [2023-10-10 05:36:47,122][53252] Updated weights for policy 0, policy_version 23500 (0.0009) [2023-10-10 05:36:47,499][53252] Updated weights for policy 0, policy_version 23510 (0.0008) [2023-10-10 05:36:47,751][53268] Updated weights for policy 1, policy_version 23460 (0.0010) [2023-10-10 05:36:47,869][53252] Updated weights for policy 0, policy_version 23520 (0.0008) [2023-10-10 05:36:48,114][53268] Updated weights for policy 1, policy_version 23470 (0.0009) [2023-10-10 05:36:48,481][53268] Updated weights for policy 1, policy_version 23480 (0.0009) [2023-10-10 05:36:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48136192. Throughput: 0: 1667.5, 1: 1698.0. Samples: 12045364. Policy #0 lag: (min: 12.0, avg: 38.1, max: 40.0) [2023-10-10 05:36:51,784][52050] Avg episode reward: [(0, '18.860'), (1, '17.470')] [2023-10-10 05:36:51,964][53252] Updated weights for policy 0, policy_version 23530 (0.0008) [2023-10-10 05:36:52,328][53252] Updated weights for policy 0, policy_version 23540 (0.0007) [2023-10-10 05:36:52,376][53268] Updated weights for policy 1, policy_version 23490 (0.0008) [2023-10-10 05:36:52,709][53252] Updated weights for policy 0, policy_version 23550 (0.0007) [2023-10-10 05:36:52,738][53268] Updated weights for policy 1, policy_version 23500 (0.0007) [2023-10-10 05:36:53,108][53268] Updated weights for policy 1, policy_version 23510 (0.0008) [2023-10-10 05:36:53,466][53268] Updated weights for policy 1, policy_version 23520 (0.0009) [2023-10-10 05:36:56,735][53252] Updated weights for policy 0, policy_version 23560 (0.0010) [2023-10-10 05:36:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48201728. Throughput: 0: 1671.4, 1: 1704.5. Samples: 12066256. 
Policy #0 lag: (min: 12.0, avg: 38.1, max: 40.0) [2023-10-10 05:36:56,784][52050] Avg episode reward: [(0, '20.370'), (1, '17.690')] [2023-10-10 05:36:57,101][53252] Updated weights for policy 0, policy_version 23570 (0.0008) [2023-10-10 05:36:57,471][53252] Updated weights for policy 0, policy_version 23580 (0.0007) [2023-10-10 05:36:57,618][52846] Saving new best policy, reward=20.370! [2023-10-10 05:36:57,659][53268] Updated weights for policy 1, policy_version 23530 (0.0009) [2023-10-10 05:36:58,027][53268] Updated weights for policy 1, policy_version 23540 (0.0007) [2023-10-10 05:36:58,393][53268] Updated weights for policy 1, policy_version 23550 (0.0008) [2023-10-10 05:37:01,625][53252] Updated weights for policy 0, policy_version 23590 (0.0007) [2023-10-10 05:37:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48267264. Throughput: 0: 1672.7, 1: 1678.9. Samples: 12075238. Policy #0 lag: (min: 12.0, avg: 38.1, max: 40.0) [2023-10-10 05:37:01,784][52050] Avg episode reward: [(0, '18.520'), (1, '17.810')] [2023-10-10 05:37:02,003][53252] Updated weights for policy 0, policy_version 23600 (0.0007) [2023-10-10 05:37:02,380][53252] Updated weights for policy 0, policy_version 23610 (0.0007) [2023-10-10 05:37:02,625][53268] Updated weights for policy 1, policy_version 23560 (0.0010) [2023-10-10 05:37:02,997][53268] Updated weights for policy 1, policy_version 23570 (0.0009) [2023-10-10 05:37:03,366][53268] Updated weights for policy 1, policy_version 23580 (0.0008) [2023-10-10 05:37:06,684][53252] Updated weights for policy 0, policy_version 23620 (0.0010) [2023-10-10 05:37:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48332800. Throughput: 0: 1669.3, 1: 1698.9. Samples: 12095820. 
Policy #0 lag: (min: 12.0, avg: 38.1, max: 40.0) [2023-10-10 05:37:06,784][52050] Avg episode reward: [(0, '17.970'), (1, '17.560')] [2023-10-10 05:37:07,064][53252] Updated weights for policy 0, policy_version 23630 (0.0009) [2023-10-10 05:37:07,346][53268] Updated weights for policy 1, policy_version 23590 (0.0008) [2023-10-10 05:37:07,441][53252] Updated weights for policy 0, policy_version 23640 (0.0007) [2023-10-10 05:37:07,709][53268] Updated weights for policy 1, policy_version 23600 (0.0009) [2023-10-10 05:37:08,069][53268] Updated weights for policy 1, policy_version 23610 (0.0011) [2023-10-10 05:37:11,430][53252] Updated weights for policy 0, policy_version 23650 (0.0008) [2023-10-10 05:37:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48398336. Throughput: 0: 1665.7, 1: 1694.6. Samples: 12116348. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-10 05:37:11,784][52050] Avg episode reward: [(0, '17.590'), (1, '18.140')] [2023-10-10 05:37:11,818][53252] Updated weights for policy 0, policy_version 23660 (0.0009) [2023-10-10 05:37:12,019][53268] Updated weights for policy 1, policy_version 23620 (0.0008) [2023-10-10 05:37:12,196][53252] Updated weights for policy 0, policy_version 23670 (0.0008) [2023-10-10 05:37:12,390][53268] Updated weights for policy 1, policy_version 23630 (0.0007) [2023-10-10 05:37:12,567][53252] Updated weights for policy 0, policy_version 23680 (0.0009) [2023-10-10 05:37:12,758][53268] Updated weights for policy 1, policy_version 23640 (0.0008) [2023-10-10 05:37:16,454][53252] Updated weights for policy 0, policy_version 23690 (0.0008) [2023-10-10 05:37:16,757][53268] Updated weights for policy 1, policy_version 23650 (0.0007) [2023-10-10 05:37:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48463872. Throughput: 0: 1671.9, 1: 1684.8. Samples: 12125504. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-10 05:37:16,784][52050] Avg episode reward: [(0, '17.340'), (1, '17.170')] [2023-10-10 05:37:16,817][53252] Updated weights for policy 0, policy_version 23700 (0.0007) [2023-10-10 05:37:17,120][53268] Updated weights for policy 1, policy_version 23660 (0.0010) [2023-10-10 05:37:17,197][53252] Updated weights for policy 0, policy_version 23710 (0.0007) [2023-10-10 05:37:17,485][53268] Updated weights for policy 1, policy_version 23670 (0.0008) [2023-10-10 05:37:17,850][53268] Updated weights for policy 1, policy_version 23680 (0.0010) [2023-10-10 05:37:21,297][53252] Updated weights for policy 0, policy_version 23720 (0.0008) [2023-10-10 05:37:21,659][53252] Updated weights for policy 0, policy_version 23730 (0.0008) [2023-10-10 05:37:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48529408. Throughput: 0: 1676.0, 1: 1689.2. Samples: 12146176. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-10 05:37:21,784][52050] Avg episode reward: [(0, '17.940'), (1, '17.470')] [2023-10-10 05:37:21,989][53268] Updated weights for policy 1, policy_version 23690 (0.0007) [2023-10-10 05:37:22,031][53252] Updated weights for policy 0, policy_version 23740 (0.0008) [2023-10-10 05:37:22,356][53268] Updated weights for policy 1, policy_version 23700 (0.0007) [2023-10-10 05:37:22,726][53268] Updated weights for policy 1, policy_version 23710 (0.0007) [2023-10-10 05:37:26,157][53252] Updated weights for policy 0, policy_version 23750 (0.0009) [2023-10-10 05:37:26,533][53252] Updated weights for policy 0, policy_version 23760 (0.0008) [2023-10-10 05:37:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 48594944. Throughput: 0: 1666.3, 1: 1682.8. Samples: 12166214. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-10 05:37:26,785][52050] Avg episode reward: [(0, '19.670'), (1, '18.120')] [2023-10-10 05:37:26,911][53252] Updated weights for policy 0, policy_version 23770 (0.0008) [2023-10-10 05:37:27,023][53268] Updated weights for policy 1, policy_version 23720 (0.0007) [2023-10-10 05:37:27,401][53268] Updated weights for policy 1, policy_version 23730 (0.0008) [2023-10-10 05:37:27,772][53268] Updated weights for policy 1, policy_version 23740 (0.0007) [2023-10-10 05:37:30,881][53252] Updated weights for policy 0, policy_version 23780 (0.0009) [2023-10-10 05:37:31,252][53252] Updated weights for policy 0, policy_version 23790 (0.0010) [2023-10-10 05:37:31,620][53252] Updated weights for policy 0, policy_version 23800 (0.0009) [2023-10-10 05:37:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48660480. Throughput: 0: 1678.3, 1: 1678.2. Samples: 12175640. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) [2023-10-10 05:37:31,784][52050] Avg episode reward: [(0, '18.380'), (1, '18.140')] [2023-10-10 05:37:31,870][53268] Updated weights for policy 1, policy_version 23750 (0.0008) [2023-10-10 05:37:32,237][53268] Updated weights for policy 1, policy_version 23760 (0.0008) [2023-10-10 05:37:32,603][53268] Updated weights for policy 1, policy_version 23770 (0.0007) [2023-10-10 05:37:35,630][53252] Updated weights for policy 0, policy_version 23810 (0.0009) [2023-10-10 05:37:36,006][53252] Updated weights for policy 0, policy_version 23820 (0.0010) [2023-10-10 05:37:36,375][53252] Updated weights for policy 0, policy_version 23830 (0.0007) [2023-10-10 05:37:36,747][53252] Updated weights for policy 0, policy_version 23840 (0.0007) [2023-10-10 05:37:36,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48758784. Throughput: 0: 1684.2, 1: 1674.7. Samples: 12196514. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) [2023-10-10 05:37:36,784][52050] Avg episode reward: [(0, '17.790'), (1, '17.350')] [2023-10-10 05:37:36,816][53268] Updated weights for policy 1, policy_version 23780 (0.0007) [2023-10-10 05:37:37,182][53268] Updated weights for policy 1, policy_version 23790 (0.0008) [2023-10-10 05:37:37,550][53268] Updated weights for policy 1, policy_version 23800 (0.0008) [2023-10-10 05:37:40,708][53252] Updated weights for policy 0, policy_version 23850 (0.0009) [2023-10-10 05:37:41,083][53252] Updated weights for policy 0, policy_version 23860 (0.0007) [2023-10-10 05:37:41,445][53252] Updated weights for policy 0, policy_version 23870 (0.0010) [2023-10-10 05:37:41,581][53268] Updated weights for policy 1, policy_version 23810 (0.0008) [2023-10-10 05:37:41,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48824320. Throughput: 0: 1661.4, 1: 1669.7. Samples: 12216156. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) [2023-10-10 05:37:41,784][52050] Avg episode reward: [(0, '17.220'), (1, '17.960')] [2023-10-10 05:37:41,948][53268] Updated weights for policy 1, policy_version 23820 (0.0010) [2023-10-10 05:37:42,317][53268] Updated weights for policy 1, policy_version 23830 (0.0008) [2023-10-10 05:37:42,686][53268] Updated weights for policy 1, policy_version 23840 (0.0008) [2023-10-10 05:37:45,519][53252] Updated weights for policy 0, policy_version 23880 (0.0007) [2023-10-10 05:37:45,883][53252] Updated weights for policy 0, policy_version 23890 (0.0010) [2023-10-10 05:37:46,250][53252] Updated weights for policy 0, policy_version 23900 (0.0007) [2023-10-10 05:37:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48889856. Throughput: 0: 1686.5, 1: 1672.6. Samples: 12226398. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:37:46,784][52050] Avg episode reward: [(0, '16.480'), (1, '18.370')] [2023-10-10 05:37:46,864][53268] Updated weights for policy 1, policy_version 23850 (0.0009) [2023-10-10 05:37:47,220][53268] Updated weights for policy 1, policy_version 23860 (0.0009) [2023-10-10 05:37:47,589][53268] Updated weights for policy 1, policy_version 23870 (0.0009) [2023-10-10 05:37:50,379][53252] Updated weights for policy 0, policy_version 23910 (0.0007) [2023-10-10 05:37:50,741][53252] Updated weights for policy 0, policy_version 23920 (0.0009) [2023-10-10 05:37:51,127][53252] Updated weights for policy 0, policy_version 23930 (0.0011) [2023-10-10 05:37:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48955392. Throughput: 0: 1685.2, 1: 1669.7. Samples: 12246790. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:37:51,784][52050] Avg episode reward: [(0, '18.930'), (1, '16.980')] [2023-10-10 05:37:51,849][53268] Updated weights for policy 1, policy_version 23880 (0.0008) [2023-10-10 05:37:52,228][53268] Updated weights for policy 1, policy_version 23890 (0.0007) [2023-10-10 05:37:52,586][53268] Updated weights for policy 1, policy_version 23900 (0.0008) [2023-10-10 05:37:55,128][53252] Updated weights for policy 0, policy_version 23940 (0.0010) [2023-10-10 05:37:55,499][53252] Updated weights for policy 0, policy_version 23950 (0.0011) [2023-10-10 05:37:55,869][53252] Updated weights for policy 0, policy_version 23960 (0.0010) [2023-10-10 05:37:56,444][53268] Updated weights for policy 1, policy_version 23910 (0.0008) [2023-10-10 05:37:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 49020928. Throughput: 0: 1663.8, 1: 1672.0. Samples: 12266456. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:37:56,784][52050] Avg episode reward: [(0, '18.690'), (1, '15.980')] [2023-10-10 05:37:56,813][53268] Updated weights for policy 1, policy_version 23920 (0.0009) [2023-10-10 05:37:57,176][53268] Updated weights for policy 1, policy_version 23930 (0.0010) [2023-10-10 05:37:59,944][53252] Updated weights for policy 0, policy_version 23970 (0.0007) [2023-10-10 05:38:00,307][53252] Updated weights for policy 0, policy_version 23980 (0.0009) [2023-10-10 05:38:00,674][53252] Updated weights for policy 0, policy_version 23990 (0.0007) [2023-10-10 05:38:01,047][53252] Updated weights for policy 0, policy_version 24000 (0.0007) [2023-10-10 05:38:01,400][53268] Updated weights for policy 1, policy_version 23940 (0.0010) [2023-10-10 05:38:01,774][53268] Updated weights for policy 1, policy_version 23950 (0.0010) [2023-10-10 05:38:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 49086464. Throughput: 0: 1687.2, 1: 1670.0. Samples: 12276574. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 05:38:01,784][52050] Avg episode reward: [(0, '17.370'), (1, '16.480')] [2023-10-10 05:38:02,137][53268] Updated weights for policy 1, policy_version 23960 (0.0011) [2023-10-10 05:38:05,173][53252] Updated weights for policy 0, policy_version 24010 (0.0010) [2023-10-10 05:38:05,535][53252] Updated weights for policy 0, policy_version 24020 (0.0009) [2023-10-10 05:38:05,910][53252] Updated weights for policy 0, policy_version 24030 (0.0008) [2023-10-10 05:38:06,117][53268] Updated weights for policy 1, policy_version 23970 (0.0011) [2023-10-10 05:38:06,494][53268] Updated weights for policy 1, policy_version 23980 (0.0009) [2023-10-10 05:38:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49152000. Throughput: 0: 1675.7, 1: 1670.8. Samples: 12296772. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:38:06,784][52050] Avg episode reward: [(0, '19.190'), (1, '16.580')] [2023-10-10 05:38:06,857][53268] Updated weights for policy 1, policy_version 23990 (0.0010) [2023-10-10 05:38:07,227][53268] Updated weights for policy 1, policy_version 24000 (0.0011) [2023-10-10 05:38:09,919][53252] Updated weights for policy 0, policy_version 24040 (0.0007) [2023-10-10 05:38:10,295][53252] Updated weights for policy 0, policy_version 24050 (0.0009) [2023-10-10 05:38:10,670][53252] Updated weights for policy 0, policy_version 24060 (0.0011) [2023-10-10 05:38:11,374][53268] Updated weights for policy 1, policy_version 24010 (0.0008) [2023-10-10 05:38:11,744][53268] Updated weights for policy 1, policy_version 24020 (0.0010) [2023-10-10 05:38:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49217536. Throughput: 0: 1675.8, 1: 1667.0. Samples: 12316638. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:38:11,784][52050] Avg episode reward: [(0, '18.000'), (1, '17.410')] [2023-10-10 05:38:12,109][53268] Updated weights for policy 1, policy_version 24030 (0.0008) [2023-10-10 05:38:14,764][53252] Updated weights for policy 0, policy_version 24070 (0.0010) [2023-10-10 05:38:15,136][53252] Updated weights for policy 0, policy_version 24080 (0.0008) [2023-10-10 05:38:15,515][53252] Updated weights for policy 0, policy_version 24090 (0.0008) [2023-10-10 05:38:16,317][53268] Updated weights for policy 1, policy_version 24040 (0.0008) [2023-10-10 05:38:16,689][53268] Updated weights for policy 1, policy_version 24050 (0.0010) [2023-10-10 05:38:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49283072. Throughput: 0: 1695.1, 1: 1674.0. Samples: 12327248. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:38:16,784][52050] Avg episode reward: [(0, '18.090'), (1, '18.180')] [2023-10-10 05:38:17,055][53268] Updated weights for policy 1, policy_version 24060 (0.0009) [2023-10-10 05:38:19,572][53252] Updated weights for policy 0, policy_version 24100 (0.0008) [2023-10-10 05:38:19,946][53252] Updated weights for policy 0, policy_version 24110 (0.0008) [2023-10-10 05:38:20,317][53252] Updated weights for policy 0, policy_version 24120 (0.0007) [2023-10-10 05:38:21,027][53268] Updated weights for policy 1, policy_version 24070 (0.0008) [2023-10-10 05:38:21,388][53268] Updated weights for policy 1, policy_version 24080 (0.0008) [2023-10-10 05:38:21,764][53268] Updated weights for policy 1, policy_version 24090 (0.0009) [2023-10-10 05:38:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49348608. Throughput: 0: 1670.1, 1: 1669.3. Samples: 12346786. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 05:38:21,784][52050] Avg episode reward: [(0, '18.500'), (1, '19.030')] [2023-10-10 05:38:24,373][53252] Updated weights for policy 0, policy_version 24130 (0.0009) [2023-10-10 05:38:24,747][53252] Updated weights for policy 0, policy_version 24140 (0.0008) [2023-10-10 05:38:25,124][53252] Updated weights for policy 0, policy_version 24150 (0.0008) [2023-10-10 05:38:25,496][53252] Updated weights for policy 0, policy_version 24160 (0.0011) [2023-10-10 05:38:26,080][53268] Updated weights for policy 1, policy_version 24100 (0.0008) [2023-10-10 05:38:26,458][53268] Updated weights for policy 1, policy_version 24110 (0.0008) [2023-10-10 05:38:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 49414144. Throughput: 0: 1683.3, 1: 1660.1. Samples: 12366612. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:38:26,784][52050] Avg episode reward: [(0, '17.070'), (1, '17.000')] [2023-10-10 05:38:26,823][53268] Updated weights for policy 1, policy_version 24120 (0.0007) [2023-10-10 05:38:29,622][53252] Updated weights for policy 0, policy_version 24170 (0.0007) [2023-10-10 05:38:29,997][53252] Updated weights for policy 0, policy_version 24180 (0.0008) [2023-10-10 05:38:30,356][53252] Updated weights for policy 0, policy_version 24190 (0.0010) [2023-10-10 05:38:30,918][53268] Updated weights for policy 1, policy_version 24130 (0.0008) [2023-10-10 05:38:31,296][53268] Updated weights for policy 1, policy_version 24140 (0.0010) [2023-10-10 05:38:31,660][53268] Updated weights for policy 1, policy_version 24150 (0.0010) [2023-10-10 05:38:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49479680. Throughput: 0: 1685.0, 1: 1660.2. Samples: 12376934. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:38:31,784][52050] Avg episode reward: [(0, '17.620'), (1, '16.900')] [2023-10-10 05:38:32,038][53268] Updated weights for policy 1, policy_version 24160 (0.0010) [2023-10-10 05:38:34,344][53252] Updated weights for policy 0, policy_version 24200 (0.0009) [2023-10-10 05:38:34,713][53252] Updated weights for policy 0, policy_version 24210 (0.0008) [2023-10-10 05:38:35,093][53252] Updated weights for policy 0, policy_version 24220 (0.0008) [2023-10-10 05:38:36,169][53268] Updated weights for policy 1, policy_version 24170 (0.0008) [2023-10-10 05:38:36,542][53268] Updated weights for policy 1, policy_version 24180 (0.0009) [2023-10-10 05:38:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49545216. Throughput: 0: 1660.4, 1: 1665.8. Samples: 12396470. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:38:36,784][52050] Avg episode reward: [(0, '18.300'), (1, '16.870')] [2023-10-10 05:38:36,908][53268] Updated weights for policy 1, policy_version 24190 (0.0008) [2023-10-10 05:38:39,134][53252] Updated weights for policy 0, policy_version 24230 (0.0009) [2023-10-10 05:38:39,507][53252] Updated weights for policy 0, policy_version 24240 (0.0007) [2023-10-10 05:38:39,873][53252] Updated weights for policy 0, policy_version 24250 (0.0007) [2023-10-10 05:38:41,163][53268] Updated weights for policy 1, policy_version 24200 (0.0007) [2023-10-10 05:38:41,539][53268] Updated weights for policy 1, policy_version 24210 (0.0007) [2023-10-10 05:38:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49610752. Throughput: 0: 1687.0, 1: 1657.0. Samples: 12416936. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 05:38:41,784][52050] Avg episode reward: [(0, '17.340'), (1, '16.340')] [2023-10-10 05:38:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000024256_24838144.pth... [2023-10-10 05:38:41,828][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000022688_23232512.pth [2023-10-10 05:38:41,903][53268] Updated weights for policy 1, policy_version 24220 (0.0008) [2023-10-10 05:38:42,048][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000024224_24805376.pth... 
[2023-10-10 05:38:42,087][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000022656_23199744.pth [2023-10-10 05:38:43,784][53252] Updated weights for policy 0, policy_version 24260 (0.0007) [2023-10-10 05:38:44,158][53252] Updated weights for policy 0, policy_version 24270 (0.0011) [2023-10-10 05:38:44,531][53252] Updated weights for policy 0, policy_version 24280 (0.0010) [2023-10-10 05:38:45,836][53268] Updated weights for policy 1, policy_version 24230 (0.0010) [2023-10-10 05:38:46,205][53268] Updated weights for policy 1, policy_version 24240 (0.0007) [2023-10-10 05:38:46,576][53268] Updated weights for policy 1, policy_version 24250 (0.0007) [2023-10-10 05:38:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 49676288. Throughput: 0: 1677.0, 1: 1665.6. Samples: 12426992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-10 05:38:46,784][52050] Avg episode reward: [(0, '20.200'), (1, '16.990')] [2023-10-10 05:38:48,678][53252] Updated weights for policy 0, policy_version 24290 (0.0009) [2023-10-10 05:38:49,051][53252] Updated weights for policy 0, policy_version 24300 (0.0008) [2023-10-10 05:38:49,421][53252] Updated weights for policy 0, policy_version 24310 (0.0007) [2023-10-10 05:38:49,792][53252] Updated weights for policy 0, policy_version 24320 (0.0007) [2023-10-10 05:38:50,609][53268] Updated weights for policy 1, policy_version 24260 (0.0007) [2023-10-10 05:38:50,980][53268] Updated weights for policy 1, policy_version 24270 (0.0009) [2023-10-10 05:38:51,356][53268] Updated weights for policy 1, policy_version 24280 (0.0008) [2023-10-10 05:38:51,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49774592. Throughput: 0: 1676.2, 1: 1668.8. Samples: 12447300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-10 05:38:51,785][52050] Avg episode reward: [(0, '20.150'), (1, '17.190')] [2023-10-10 05:38:53,854][53252] Updated weights for policy 0, policy_version 24330 (0.0007) [2023-10-10 05:38:54,234][53252] Updated weights for policy 0, policy_version 24340 (0.0010) [2023-10-10 05:38:54,605][53252] Updated weights for policy 0, policy_version 24350 (0.0010) [2023-10-10 05:38:55,465][53268] Updated weights for policy 1, policy_version 24290 (0.0009) [2023-10-10 05:38:55,827][53268] Updated weights for policy 1, policy_version 24300 (0.0010) [2023-10-10 05:38:56,190][53268] Updated weights for policy 1, policy_version 24310 (0.0009) [2023-10-10 05:38:56,567][53268] Updated weights for policy 1, policy_version 24320 (0.0007) [2023-10-10 05:38:56,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 49840128. Throughput: 0: 1691.7, 1: 1656.1. Samples: 12467288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-10 05:38:56,784][52050] Avg episode reward: [(0, '18.640'), (1, '17.310')] [2023-10-10 05:38:58,431][53252] Updated weights for policy 0, policy_version 24360 (0.0007) [2023-10-10 05:38:58,807][53252] Updated weights for policy 0, policy_version 24370 (0.0010) [2023-10-10 05:38:59,174][53252] Updated weights for policy 0, policy_version 24380 (0.0008) [2023-10-10 05:39:00,610][53268] Updated weights for policy 1, policy_version 24330 (0.0009) [2023-10-10 05:39:00,981][53268] Updated weights for policy 1, policy_version 24340 (0.0007) [2023-10-10 05:39:01,349][53268] Updated weights for policy 1, policy_version 24350 (0.0008) [2023-10-10 05:39:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49905664. Throughput: 0: 1662.0, 1: 1674.4. Samples: 12477390. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-10 05:39:01,784][52050] Avg episode reward: [(0, '19.750'), (1, '17.160')] [2023-10-10 05:39:03,080][53252] Updated weights for policy 0, policy_version 24390 (0.0007) [2023-10-10 05:39:03,456][53252] Updated weights for policy 0, policy_version 24400 (0.0007) [2023-10-10 05:39:03,819][53252] Updated weights for policy 0, policy_version 24410 (0.0009) [2023-10-10 05:39:05,345][53268] Updated weights for policy 1, policy_version 24360 (0.0008) [2023-10-10 05:39:05,707][53268] Updated weights for policy 1, policy_version 24370 (0.0010) [2023-10-10 05:39:06,075][53268] Updated weights for policy 1, policy_version 24380 (0.0009) [2023-10-10 05:39:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 49971200. Throughput: 0: 1686.5, 1: 1678.9. Samples: 12498232. Policy #0 lag: (min: 16.0, avg: 39.4, max: 48.0) [2023-10-10 05:39:06,784][52050] Avg episode reward: [(0, '17.790'), (1, '18.350')] [2023-10-10 05:39:07,951][53252] Updated weights for policy 0, policy_version 24420 (0.0009) [2023-10-10 05:39:08,336][53252] Updated weights for policy 0, policy_version 24430 (0.0009) [2023-10-10 05:39:08,710][53252] Updated weights for policy 0, policy_version 24440 (0.0008) [2023-10-10 05:39:10,225][53268] Updated weights for policy 1, policy_version 24390 (0.0008) [2023-10-10 05:39:10,590][53268] Updated weights for policy 1, policy_version 24400 (0.0009) [2023-10-10 05:39:10,958][53268] Updated weights for policy 1, policy_version 24410 (0.0008) [2023-10-10 05:39:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50036736. Throughput: 0: 1695.1, 1: 1666.9. Samples: 12517904. 
Policy #0 lag: (min: 16.0, avg: 39.4, max: 48.0) [2023-10-10 05:39:11,784][52050] Avg episode reward: [(0, '18.010'), (1, '17.990')] [2023-10-10 05:39:12,805][53252] Updated weights for policy 0, policy_version 24450 (0.0009) [2023-10-10 05:39:13,178][53252] Updated weights for policy 0, policy_version 24460 (0.0007) [2023-10-10 05:39:13,556][53252] Updated weights for policy 0, policy_version 24470 (0.0008) [2023-10-10 05:39:13,920][53252] Updated weights for policy 0, policy_version 24480 (0.0007) [2023-10-10 05:39:14,905][53268] Updated weights for policy 1, policy_version 24420 (0.0009) [2023-10-10 05:39:15,279][53268] Updated weights for policy 1, policy_version 24430 (0.0010) [2023-10-10 05:39:15,648][53268] Updated weights for policy 1, policy_version 24440 (0.0009) [2023-10-10 05:39:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50102272. Throughput: 0: 1666.4, 1: 1692.8. Samples: 12528100. Policy #0 lag: (min: 16.0, avg: 39.4, max: 48.0) [2023-10-10 05:39:16,784][52050] Avg episode reward: [(0, '18.480'), (1, '17.630')] [2023-10-10 05:39:18,138][53252] Updated weights for policy 0, policy_version 24490 (0.0010) [2023-10-10 05:39:18,528][53252] Updated weights for policy 0, policy_version 24500 (0.0011) [2023-10-10 05:39:18,885][53252] Updated weights for policy 0, policy_version 24510 (0.0009) [2023-10-10 05:39:19,513][53268] Updated weights for policy 1, policy_version 24450 (0.0010) [2023-10-10 05:39:19,877][53268] Updated weights for policy 1, policy_version 24460 (0.0008) [2023-10-10 05:39:20,247][53268] Updated weights for policy 1, policy_version 24470 (0.0010) [2023-10-10 05:39:20,615][53268] Updated weights for policy 1, policy_version 24480 (0.0008) [2023-10-10 05:39:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50167808. Throughput: 0: 1692.8, 1: 1676.0. Samples: 12548064. 
Policy #0 lag: (min: 16.0, avg: 39.4, max: 48.0) [2023-10-10 05:39:21,784][52050] Avg episode reward: [(0, '17.150'), (1, '16.670')] [2023-10-10 05:39:22,974][53252] Updated weights for policy 0, policy_version 24520 (0.0009) [2023-10-10 05:39:23,342][53252] Updated weights for policy 0, policy_version 24530 (0.0009) [2023-10-10 05:39:23,726][53252] Updated weights for policy 0, policy_version 24540 (0.0009) [2023-10-10 05:39:24,579][53268] Updated weights for policy 1, policy_version 24490 (0.0008) [2023-10-10 05:39:24,937][53268] Updated weights for policy 1, policy_version 24500 (0.0010) [2023-10-10 05:39:25,309][53268] Updated weights for policy 1, policy_version 24510 (0.0008) [2023-10-10 05:39:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50233344. Throughput: 0: 1693.5, 1: 1674.0. Samples: 12568472. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:39:26,784][52050] Avg episode reward: [(0, '17.200'), (1, '18.530')] [2023-10-10 05:39:27,784][53252] Updated weights for policy 0, policy_version 24550 (0.0009) [2023-10-10 05:39:28,157][53252] Updated weights for policy 0, policy_version 24560 (0.0010) [2023-10-10 05:39:28,533][53252] Updated weights for policy 0, policy_version 24570 (0.0010) [2023-10-10 05:39:29,421][53268] Updated weights for policy 1, policy_version 24520 (0.0009) [2023-10-10 05:39:29,804][53268] Updated weights for policy 1, policy_version 24530 (0.0010) [2023-10-10 05:39:30,170][53268] Updated weights for policy 1, policy_version 24540 (0.0010) [2023-10-10 05:39:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50298880. Throughput: 0: 1676.3, 1: 1695.2. Samples: 12578712. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:39:31,784][52050] Avg episode reward: [(0, '17.380'), (1, '17.460')] [2023-10-10 05:39:32,609][53252] Updated weights for policy 0, policy_version 24580 (0.0009) [2023-10-10 05:39:32,977][53252] Updated weights for policy 0, policy_version 24590 (0.0011) [2023-10-10 05:39:33,355][53252] Updated weights for policy 0, policy_version 24600 (0.0008) [2023-10-10 05:39:34,313][53268] Updated weights for policy 1, policy_version 24550 (0.0009) [2023-10-10 05:39:34,681][53268] Updated weights for policy 1, policy_version 24560 (0.0009) [2023-10-10 05:39:35,042][53268] Updated weights for policy 1, policy_version 24570 (0.0007) [2023-10-10 05:39:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50364416. Throughput: 0: 1690.4, 1: 1668.8. Samples: 12598462. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:39:36,784][52050] Avg episode reward: [(0, '18.140'), (1, '17.860')] [2023-10-10 05:39:37,376][53252] Updated weights for policy 0, policy_version 24610 (0.0007) [2023-10-10 05:39:37,750][53252] Updated weights for policy 0, policy_version 24620 (0.0010) [2023-10-10 05:39:38,124][53252] Updated weights for policy 0, policy_version 24630 (0.0008) [2023-10-10 05:39:38,489][53252] Updated weights for policy 0, policy_version 24640 (0.0009) [2023-10-10 05:39:39,208][53268] Updated weights for policy 1, policy_version 24580 (0.0010) [2023-10-10 05:39:39,568][53268] Updated weights for policy 1, policy_version 24590 (0.0009) [2023-10-10 05:39:39,930][53268] Updated weights for policy 1, policy_version 24600 (0.0011) [2023-10-10 05:39:41,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50429952. Throughput: 0: 1686.1, 1: 1682.5. Samples: 12618876. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:39:41,785][52050] Avg episode reward: [(0, '17.070'), (1, '18.620')] [2023-10-10 05:39:42,690][53252] Updated weights for policy 0, policy_version 24650 (0.0007) [2023-10-10 05:39:43,057][53252] Updated weights for policy 0, policy_version 24660 (0.0011) [2023-10-10 05:39:43,436][53252] Updated weights for policy 0, policy_version 24670 (0.0011) [2023-10-10 05:39:43,878][53268] Updated weights for policy 1, policy_version 24610 (0.0009) [2023-10-10 05:39:44,243][53268] Updated weights for policy 1, policy_version 24620 (0.0008) [2023-10-10 05:39:44,615][53268] Updated weights for policy 1, policy_version 24630 (0.0008) [2023-10-10 05:39:44,986][53268] Updated weights for policy 1, policy_version 24640 (0.0007) [2023-10-10 05:39:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50495488. Throughput: 0: 1680.0, 1: 1683.4. Samples: 12628740. Policy #0 lag: (min: 4.0, avg: 11.4, max: 36.0) [2023-10-10 05:39:46,784][52050] Avg episode reward: [(0, '18.270'), (1, '16.980')] [2023-10-10 05:39:47,352][53252] Updated weights for policy 0, policy_version 24680 (0.0009) [2023-10-10 05:39:47,722][53252] Updated weights for policy 0, policy_version 24690 (0.0009) [2023-10-10 05:39:48,097][53252] Updated weights for policy 0, policy_version 24700 (0.0007) [2023-10-10 05:39:48,901][53268] Updated weights for policy 1, policy_version 24650 (0.0008) [2023-10-10 05:39:49,277][53268] Updated weights for policy 1, policy_version 24660 (0.0008) [2023-10-10 05:39:49,651][53268] Updated weights for policy 1, policy_version 24670 (0.0008) [2023-10-10 05:39:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 50561024. Throughput: 0: 1681.3, 1: 1666.2. Samples: 12648870. 
Policy #0 lag: (min: 4.0, avg: 11.4, max: 36.0) [2023-10-10 05:39:51,784][52050] Avg episode reward: [(0, '19.520'), (1, '17.100')] [2023-10-10 05:39:52,183][53252] Updated weights for policy 0, policy_version 24710 (0.0008) [2023-10-10 05:39:52,552][53252] Updated weights for policy 0, policy_version 24720 (0.0008) [2023-10-10 05:39:52,922][53252] Updated weights for policy 0, policy_version 24730 (0.0007) [2023-10-10 05:39:53,685][53268] Updated weights for policy 1, policy_version 24680 (0.0010) [2023-10-10 05:39:54,051][53268] Updated weights for policy 1, policy_version 24690 (0.0008) [2023-10-10 05:39:54,431][53268] Updated weights for policy 1, policy_version 24700 (0.0009) [2023-10-10 05:39:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 50626560. Throughput: 0: 1680.0, 1: 1691.8. Samples: 12669634. Policy #0 lag: (min: 4.0, avg: 11.4, max: 36.0) [2023-10-10 05:39:56,784][52050] Avg episode reward: [(0, '17.730'), (1, '17.950')] [2023-10-10 05:39:57,175][53252] Updated weights for policy 0, policy_version 24740 (0.0010) [2023-10-10 05:39:57,540][53252] Updated weights for policy 0, policy_version 24750 (0.0008) [2023-10-10 05:39:57,901][53252] Updated weights for policy 0, policy_version 24760 (0.0008) [2023-10-10 05:39:58,531][53268] Updated weights for policy 1, policy_version 24710 (0.0008) [2023-10-10 05:39:58,894][53268] Updated weights for policy 1, policy_version 24720 (0.0008) [2023-10-10 05:39:59,258][53268] Updated weights for policy 1, policy_version 24730 (0.0009) [2023-10-10 05:40:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 50692096. Throughput: 0: 1680.4, 1: 1672.8. Samples: 12678994. 
Policy #0 lag: (min: 4.0, avg: 11.4, max: 36.0) [2023-10-10 05:40:01,784][52050] Avg episode reward: [(0, '19.380'), (1, '17.600')] [2023-10-10 05:40:02,037][53252] Updated weights for policy 0, policy_version 24770 (0.0009) [2023-10-10 05:40:02,409][53252] Updated weights for policy 0, policy_version 24780 (0.0009) [2023-10-10 05:40:02,787][53252] Updated weights for policy 0, policy_version 24790 (0.0009) [2023-10-10 05:40:03,162][53252] Updated weights for policy 0, policy_version 24800 (0.0009) [2023-10-10 05:40:03,382][53268] Updated weights for policy 1, policy_version 24740 (0.0008) [2023-10-10 05:40:03,752][53268] Updated weights for policy 1, policy_version 24750 (0.0009) [2023-10-10 05:40:04,117][53268] Updated weights for policy 1, policy_version 24760 (0.0011) [2023-10-10 05:40:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 50757632. Throughput: 0: 1682.3, 1: 1675.9. Samples: 12699182. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:40:06,784][52050] Avg episode reward: [(0, '19.920'), (1, '16.190')] [2023-10-10 05:40:07,137][53252] Updated weights for policy 0, policy_version 24810 (0.0012) [2023-10-10 05:40:07,502][53252] Updated weights for policy 0, policy_version 24820 (0.0008) [2023-10-10 05:40:07,884][53252] Updated weights for policy 0, policy_version 24830 (0.0008) [2023-10-10 05:40:08,320][53268] Updated weights for policy 1, policy_version 24770 (0.0011) [2023-10-10 05:40:08,686][53268] Updated weights for policy 1, policy_version 24780 (0.0008) [2023-10-10 05:40:09,054][53268] Updated weights for policy 1, policy_version 24790 (0.0009) [2023-10-10 05:40:09,416][53268] Updated weights for policy 1, policy_version 24800 (0.0009) [2023-10-10 05:40:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 50823168. Throughput: 0: 1675.5, 1: 1683.7. Samples: 12719638. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:40:11,784][52050] Avg episode reward: [(0, '17.350'), (1, '17.390')] [2023-10-10 05:40:12,115][53252] Updated weights for policy 0, policy_version 24840 (0.0010) [2023-10-10 05:40:12,487][53252] Updated weights for policy 0, policy_version 24850 (0.0007) [2023-10-10 05:40:12,859][53252] Updated weights for policy 0, policy_version 24860 (0.0008) [2023-10-10 05:40:13,500][53268] Updated weights for policy 1, policy_version 24810 (0.0008) [2023-10-10 05:40:13,864][53268] Updated weights for policy 1, policy_version 24820 (0.0009) [2023-10-10 05:40:14,240][53268] Updated weights for policy 1, policy_version 24830 (0.0011) [2023-10-10 05:40:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 50888704. Throughput: 0: 1679.2, 1: 1662.9. Samples: 12729106. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:40:16,784][52050] Avg episode reward: [(0, '18.630'), (1, '18.170')] [2023-10-10 05:40:16,919][53252] Updated weights for policy 0, policy_version 24870 (0.0007) [2023-10-10 05:40:17,300][53252] Updated weights for policy 0, policy_version 24880 (0.0007) [2023-10-10 05:40:17,672][53252] Updated weights for policy 0, policy_version 24890 (0.0008) [2023-10-10 05:40:18,316][53268] Updated weights for policy 1, policy_version 24840 (0.0007) [2023-10-10 05:40:18,690][53268] Updated weights for policy 1, policy_version 24850 (0.0009) [2023-10-10 05:40:19,059][53268] Updated weights for policy 1, policy_version 24860 (0.0008) [2023-10-10 05:40:21,662][53252] Updated weights for policy 0, policy_version 24900 (0.0009) [2023-10-10 05:40:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 50954240. Throughput: 0: 1673.3, 1: 1680.3. Samples: 12749374. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:40:21,785][52050] Avg episode reward: [(0, '17.930'), (1, '18.340')] [2023-10-10 05:40:22,034][53252] Updated weights for policy 0, policy_version 24910 (0.0008) [2023-10-10 05:40:22,417][53252] Updated weights for policy 0, policy_version 24920 (0.0007) [2023-10-10 05:40:23,192][53268] Updated weights for policy 1, policy_version 24870 (0.0010) [2023-10-10 05:40:23,555][53268] Updated weights for policy 1, policy_version 24880 (0.0011) [2023-10-10 05:40:23,927][53268] Updated weights for policy 1, policy_version 24890 (0.0009) [2023-10-10 05:40:26,465][53252] Updated weights for policy 0, policy_version 24930 (0.0009) [2023-10-10 05:40:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 51019776. Throughput: 0: 1670.8, 1: 1687.3. Samples: 12769990. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:40:26,785][52050] Avg episode reward: [(0, '17.510'), (1, '17.910')] [2023-10-10 05:40:26,835][53252] Updated weights for policy 0, policy_version 24940 (0.0009) [2023-10-10 05:40:27,205][53252] Updated weights for policy 0, policy_version 24950 (0.0012) [2023-10-10 05:40:27,579][53252] Updated weights for policy 0, policy_version 24960 (0.0009) [2023-10-10 05:40:28,063][53268] Updated weights for policy 1, policy_version 24900 (0.0011) [2023-10-10 05:40:28,437][53268] Updated weights for policy 1, policy_version 24910 (0.0010) [2023-10-10 05:40:28,796][53268] Updated weights for policy 1, policy_version 24920 (0.0010) [2023-10-10 05:40:31,657][53252] Updated weights for policy 0, policy_version 24970 (0.0007) [2023-10-10 05:40:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 51085312. Throughput: 0: 1673.5, 1: 1664.4. Samples: 12778946. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:40:31,784][52050] Avg episode reward: [(0, '17.650'), (1, '17.790')] [2023-10-10 05:40:32,034][53252] Updated weights for policy 0, policy_version 24980 (0.0008) [2023-10-10 05:40:32,404][53252] Updated weights for policy 0, policy_version 24990 (0.0008) [2023-10-10 05:40:32,847][53268] Updated weights for policy 1, policy_version 24930 (0.0010) [2023-10-10 05:40:33,222][53268] Updated weights for policy 1, policy_version 24940 (0.0009) [2023-10-10 05:40:33,596][53268] Updated weights for policy 1, policy_version 24950 (0.0007) [2023-10-10 05:40:33,966][53268] Updated weights for policy 1, policy_version 24960 (0.0009) [2023-10-10 05:40:36,454][53252] Updated weights for policy 0, policy_version 25000 (0.0008) [2023-10-10 05:40:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 51150848. Throughput: 0: 1670.4, 1: 1682.2. Samples: 12799738. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:40:36,784][52050] Avg episode reward: [(0, '19.420'), (1, '18.200')] [2023-10-10 05:40:36,829][53252] Updated weights for policy 0, policy_version 25010 (0.0009) [2023-10-10 05:40:37,206][53252] Updated weights for policy 0, policy_version 25020 (0.0007) [2023-10-10 05:40:38,141][53268] Updated weights for policy 1, policy_version 24970 (0.0009) [2023-10-10 05:40:38,516][53268] Updated weights for policy 1, policy_version 24980 (0.0010) [2023-10-10 05:40:38,881][53268] Updated weights for policy 1, policy_version 24990 (0.0009) [2023-10-10 05:40:41,162][53252] Updated weights for policy 0, policy_version 25030 (0.0008) [2023-10-10 05:40:41,545][53252] Updated weights for policy 0, policy_version 25040 (0.0008) [2023-10-10 05:40:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 51216384. Throughput: 0: 1662.2, 1: 1680.0. Samples: 12820032. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 05:40:41,784][52050] Avg episode reward: [(0, '19.480'), (1, '16.670')] [2023-10-10 05:40:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000024992_25591808.pth... [2023-10-10 05:40:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000023424_23986176.pth [2023-10-10 05:40:41,916][53252] Updated weights for policy 0, policy_version 25050 (0.0008) [2023-10-10 05:40:42,128][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000025056_25657344.pth... [2023-10-10 05:40:42,164][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000023456_24018944.pth [2023-10-10 05:40:42,861][53268] Updated weights for policy 1, policy_version 25000 (0.0008) [2023-10-10 05:40:43,227][53268] Updated weights for policy 1, policy_version 25010 (0.0008) [2023-10-10 05:40:43,602][53268] Updated weights for policy 1, policy_version 25020 (0.0010) [2023-10-10 05:40:45,879][53252] Updated weights for policy 0, policy_version 25060 (0.0008) [2023-10-10 05:40:46,261][53252] Updated weights for policy 0, policy_version 25070 (0.0007) [2023-10-10 05:40:46,620][53252] Updated weights for policy 0, policy_version 25080 (0.0007) [2023-10-10 05:40:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 51281920. Throughput: 0: 1678.2, 1: 1672.2. Samples: 12829764. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 33.0) [2023-10-10 05:40:46,784][52050] Avg episode reward: [(0, '17.940'), (1, '16.560')] [2023-10-10 05:40:47,557][53268] Updated weights for policy 1, policy_version 25030 (0.0010) [2023-10-10 05:40:47,922][53268] Updated weights for policy 1, policy_version 25040 (0.0008) [2023-10-10 05:40:48,287][53268] Updated weights for policy 1, policy_version 25050 (0.0007) [2023-10-10 05:40:50,634][53252] Updated weights for policy 0, policy_version 25090 (0.0008) [2023-10-10 05:40:51,004][53252] Updated weights for policy 0, policy_version 25100 (0.0008) [2023-10-10 05:40:51,377][53252] Updated weights for policy 0, policy_version 25110 (0.0009) [2023-10-10 05:40:51,746][53252] Updated weights for policy 0, policy_version 25120 (0.0008) [2023-10-10 05:40:51,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51380224. Throughput: 0: 1679.6, 1: 1687.0. Samples: 12850676. Policy #0 lag: (min: 1.0, avg: 3.5, max: 33.0) [2023-10-10 05:40:51,784][52050] Avg episode reward: [(0, '19.060'), (1, '17.320')] [2023-10-10 05:40:52,193][53268] Updated weights for policy 1, policy_version 25060 (0.0008) [2023-10-10 05:40:52,573][53268] Updated weights for policy 1, policy_version 25070 (0.0009) [2023-10-10 05:40:52,930][53268] Updated weights for policy 1, policy_version 25080 (0.0007) [2023-10-10 05:40:55,926][53252] Updated weights for policy 0, policy_version 25130 (0.0009) [2023-10-10 05:40:56,299][53252] Updated weights for policy 0, policy_version 25140 (0.0009) [2023-10-10 05:40:56,673][53252] Updated weights for policy 0, policy_version 25150 (0.0009) [2023-10-10 05:40:56,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 51445760. Throughput: 0: 1663.3, 1: 1687.8. Samples: 12870438. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 33.0) [2023-10-10 05:40:56,784][52050] Avg episode reward: [(0, '18.030'), (1, '17.460')] [2023-10-10 05:40:57,077][53268] Updated weights for policy 1, policy_version 25090 (0.0008) [2023-10-10 05:40:57,448][53268] Updated weights for policy 1, policy_version 25100 (0.0008) [2023-10-10 05:40:57,803][53268] Updated weights for policy 1, policy_version 25110 (0.0007) [2023-10-10 05:40:58,173][53268] Updated weights for policy 1, policy_version 25120 (0.0009) [2023-10-10 05:41:00,655][53252] Updated weights for policy 0, policy_version 25160 (0.0011) [2023-10-10 05:41:01,036][53252] Updated weights for policy 0, policy_version 25170 (0.0010) [2023-10-10 05:41:01,412][53252] Updated weights for policy 0, policy_version 25180 (0.0007) [2023-10-10 05:41:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51511296. Throughput: 0: 1680.5, 1: 1685.4. Samples: 12880570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:41:01,784][52050] Avg episode reward: [(0, '16.400'), (1, '17.560')] [2023-10-10 05:41:02,066][53268] Updated weights for policy 1, policy_version 25130 (0.0011) [2023-10-10 05:41:02,437][53268] Updated weights for policy 1, policy_version 25140 (0.0010) [2023-10-10 05:41:02,814][53268] Updated weights for policy 1, policy_version 25150 (0.0009) [2023-10-10 05:41:05,538][53252] Updated weights for policy 0, policy_version 25190 (0.0009) [2023-10-10 05:41:05,909][53252] Updated weights for policy 0, policy_version 25200 (0.0009) [2023-10-10 05:41:06,277][53252] Updated weights for policy 0, policy_version 25210 (0.0007) [2023-10-10 05:41:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51576832. Throughput: 0: 1687.8, 1: 1694.7. Samples: 12901586. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:41:06,784][52050] Avg episode reward: [(0, '17.380'), (1, '18.050')] [2023-10-10 05:41:06,971][53268] Updated weights for policy 1, policy_version 25160 (0.0008) [2023-10-10 05:41:07,328][53268] Updated weights for policy 1, policy_version 25170 (0.0009) [2023-10-10 05:41:07,707][53268] Updated weights for policy 1, policy_version 25180 (0.0009) [2023-10-10 05:41:10,389][53252] Updated weights for policy 0, policy_version 25220 (0.0008) [2023-10-10 05:41:10,763][53252] Updated weights for policy 0, policy_version 25230 (0.0008) [2023-10-10 05:41:11,133][53252] Updated weights for policy 0, policy_version 25240 (0.0008) [2023-10-10 05:41:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 51642368. Throughput: 0: 1666.4, 1: 1695.1. Samples: 12921256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:41:11,784][52050] Avg episode reward: [(0, '16.880'), (1, '18.630')] [2023-10-10 05:41:11,814][53268] Updated weights for policy 1, policy_version 25190 (0.0008) [2023-10-10 05:41:12,179][53268] Updated weights for policy 1, policy_version 25200 (0.0010) [2023-10-10 05:41:12,535][53268] Updated weights for policy 1, policy_version 25210 (0.0009) [2023-10-10 05:41:15,302][53252] Updated weights for policy 0, policy_version 25250 (0.0008) [2023-10-10 05:41:15,705][53252] Updated weights for policy 0, policy_version 25260 (0.0010) [2023-10-10 05:41:16,081][53252] Updated weights for policy 0, policy_version 25270 (0.0010) [2023-10-10 05:41:16,451][53252] Updated weights for policy 0, policy_version 25280 (0.0011) [2023-10-10 05:41:16,649][53268] Updated weights for policy 1, policy_version 25220 (0.0009) [2023-10-10 05:41:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51707904. Throughput: 0: 1695.7, 1: 1696.0. Samples: 12931572. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:41:16,784][52050] Avg episode reward: [(0, '16.360'), (1, '16.050')] [2023-10-10 05:41:17,029][53268] Updated weights for policy 1, policy_version 25230 (0.0009) [2023-10-10 05:41:17,395][53268] Updated weights for policy 1, policy_version 25240 (0.0008) [2023-10-10 05:41:20,343][53252] Updated weights for policy 0, policy_version 25290 (0.0008) [2023-10-10 05:41:20,721][53252] Updated weights for policy 0, policy_version 25300 (0.0008) [2023-10-10 05:41:21,094][53252] Updated weights for policy 0, policy_version 25310 (0.0007) [2023-10-10 05:41:21,487][53268] Updated weights for policy 1, policy_version 25250 (0.0009) [2023-10-10 05:41:21,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51773440. Throughput: 0: 1687.9, 1: 1693.5. Samples: 12951898. Policy #0 lag: (min: 3.0, avg: 4.6, max: 31.0) [2023-10-10 05:41:21,785][52050] Avg episode reward: [(0, '19.380'), (1, '17.420')] [2023-10-10 05:41:21,856][53268] Updated weights for policy 1, policy_version 25260 (0.0008) [2023-10-10 05:41:22,221][53268] Updated weights for policy 1, policy_version 25270 (0.0008) [2023-10-10 05:41:22,591][53268] Updated weights for policy 1, policy_version 25280 (0.0008) [2023-10-10 05:41:25,144][53252] Updated weights for policy 0, policy_version 25320 (0.0007) [2023-10-10 05:41:25,531][53252] Updated weights for policy 0, policy_version 25330 (0.0009) [2023-10-10 05:41:25,907][53252] Updated weights for policy 0, policy_version 25340 (0.0010) [2023-10-10 05:41:26,757][53268] Updated weights for policy 1, policy_version 25290 (0.0008) [2023-10-10 05:41:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 51838976. Throughput: 0: 1677.9, 1: 1694.6. Samples: 12971794. 
Policy #0 lag: (min: 3.0, avg: 4.6, max: 31.0) [2023-10-10 05:41:26,784][52050] Avg episode reward: [(0, '18.420'), (1, '16.370')] [2023-10-10 05:41:27,126][53268] Updated weights for policy 1, policy_version 25300 (0.0008) [2023-10-10 05:41:27,493][53268] Updated weights for policy 1, policy_version 25310 (0.0007) [2023-10-10 05:41:29,989][53252] Updated weights for policy 0, policy_version 25350 (0.0008) [2023-10-10 05:41:30,363][53252] Updated weights for policy 0, policy_version 25360 (0.0008) [2023-10-10 05:41:30,737][53252] Updated weights for policy 0, policy_version 25370 (0.0009) [2023-10-10 05:41:31,515][53268] Updated weights for policy 1, policy_version 25320 (0.0008) [2023-10-10 05:41:31,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51904512. Throughput: 0: 1691.7, 1: 1691.2. Samples: 12981998. Policy #0 lag: (min: 3.0, avg: 4.6, max: 31.0) [2023-10-10 05:41:31,784][52050] Avg episode reward: [(0, '18.150'), (1, '16.340')] [2023-10-10 05:41:31,885][53268] Updated weights for policy 1, policy_version 25330 (0.0009) [2023-10-10 05:41:32,250][53268] Updated weights for policy 1, policy_version 25340 (0.0007) [2023-10-10 05:41:34,681][53252] Updated weights for policy 0, policy_version 25380 (0.0010) [2023-10-10 05:41:35,052][53252] Updated weights for policy 0, policy_version 25390 (0.0009) [2023-10-10 05:41:35,429][53252] Updated weights for policy 0, policy_version 25400 (0.0008) [2023-10-10 05:41:36,201][53268] Updated weights for policy 1, policy_version 25350 (0.0010) [2023-10-10 05:41:36,566][53268] Updated weights for policy 1, policy_version 25360 (0.0011) [2023-10-10 05:41:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51970048. Throughput: 0: 1670.2, 1: 1688.6. Samples: 13001824. 
Policy #0 lag: (min: 3.0, avg: 4.6, max: 31.0) [2023-10-10 05:41:36,784][52050] Avg episode reward: [(0, '18.300'), (1, '16.220')] [2023-10-10 05:41:36,941][53268] Updated weights for policy 1, policy_version 25370 (0.0009) [2023-10-10 05:41:39,378][53252] Updated weights for policy 0, policy_version 25410 (0.0009) [2023-10-10 05:41:39,755][53252] Updated weights for policy 0, policy_version 25420 (0.0008) [2023-10-10 05:41:40,136][53252] Updated weights for policy 0, policy_version 25430 (0.0007) [2023-10-10 05:41:40,505][53252] Updated weights for policy 0, policy_version 25440 (0.0008) [2023-10-10 05:41:40,907][53268] Updated weights for policy 1, policy_version 25380 (0.0008) [2023-10-10 05:41:41,277][53268] Updated weights for policy 1, policy_version 25390 (0.0007) [2023-10-10 05:41:41,644][53268] Updated weights for policy 1, policy_version 25400 (0.0009) [2023-10-10 05:41:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52035584. Throughput: 0: 1683.7, 1: 1680.6. Samples: 13021832. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 05:41:41,784][52050] Avg episode reward: [(0, '18.190'), (1, '16.470')] [2023-10-10 05:41:44,614][53252] Updated weights for policy 0, policy_version 25450 (0.0009) [2023-10-10 05:41:44,987][53252] Updated weights for policy 0, policy_version 25460 (0.0009) [2023-10-10 05:41:45,365][53252] Updated weights for policy 0, policy_version 25470 (0.0010) [2023-10-10 05:41:45,450][53268] Updated weights for policy 1, policy_version 25410 (0.0009) [2023-10-10 05:41:45,821][53268] Updated weights for policy 1, policy_version 25420 (0.0009) [2023-10-10 05:41:46,189][53268] Updated weights for policy 1, policy_version 25430 (0.0007) [2023-10-10 05:41:46,557][53268] Updated weights for policy 1, policy_version 25440 (0.0008) [2023-10-10 05:41:46,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 52133888. Throughput: 0: 1687.1, 1: 1690.5. 
Samples: 13032560. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 05:41:46,784][52050] Avg episode reward: [(0, '18.660'), (1, '16.640')] [2023-10-10 05:41:49,526][53252] Updated weights for policy 0, policy_version 25480 (0.0009) [2023-10-10 05:41:49,900][53252] Updated weights for policy 0, policy_version 25490 (0.0008) [2023-10-10 05:41:50,279][53252] Updated weights for policy 0, policy_version 25500 (0.0008) [2023-10-10 05:41:50,625][53268] Updated weights for policy 1, policy_version 25450 (0.0011) [2023-10-10 05:41:50,991][53268] Updated weights for policy 1, policy_version 25460 (0.0007) [2023-10-10 05:41:51,363][53268] Updated weights for policy 1, policy_version 25470 (0.0007) [2023-10-10 05:41:51,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52199424. Throughput: 0: 1658.9, 1: 1691.1. Samples: 13052336. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 05:41:51,784][52050] Avg episode reward: [(0, '18.920'), (1, '16.600')] [2023-10-10 05:41:54,294][53252] Updated weights for policy 0, policy_version 25510 (0.0008) [2023-10-10 05:41:54,666][53252] Updated weights for policy 0, policy_version 25520 (0.0008) [2023-10-10 05:41:55,040][53252] Updated weights for policy 0, policy_version 25530 (0.0007) [2023-10-10 05:41:55,415][53268] Updated weights for policy 1, policy_version 25480 (0.0009) [2023-10-10 05:41:55,770][53268] Updated weights for policy 1, policy_version 25490 (0.0010) [2023-10-10 05:41:56,145][53268] Updated weights for policy 1, policy_version 25500 (0.0009) [2023-10-10 05:41:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52264960. Throughput: 0: 1680.6, 1: 1668.0. Samples: 13071946. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 05:41:56,784][52050] Avg episode reward: [(0, '18.920'), (1, '16.190')] [2023-10-10 05:41:59,064][53252] Updated weights for policy 0, policy_version 25540 (0.0008) [2023-10-10 05:41:59,433][53252] Updated weights for policy 0, policy_version 25550 (0.0007) [2023-10-10 05:41:59,809][53252] Updated weights for policy 0, policy_version 25560 (0.0007) [2023-10-10 05:42:00,196][53268] Updated weights for policy 1, policy_version 25510 (0.0009) [2023-10-10 05:42:00,562][53268] Updated weights for policy 1, policy_version 25520 (0.0007) [2023-10-10 05:42:00,938][53268] Updated weights for policy 1, policy_version 25530 (0.0008) [2023-10-10 05:42:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52330496. Throughput: 0: 1674.6, 1: 1692.3. Samples: 13083080. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:42:01,784][52050] Avg episode reward: [(0, '17.800'), (1, '16.720')] [2023-10-10 05:42:04,011][53252] Updated weights for policy 0, policy_version 25570 (0.0008) [2023-10-10 05:42:04,411][53252] Updated weights for policy 0, policy_version 25580 (0.0009) [2023-10-10 05:42:04,790][53252] Updated weights for policy 0, policy_version 25590 (0.0007) [2023-10-10 05:42:04,939][53268] Updated weights for policy 1, policy_version 25540 (0.0007) [2023-10-10 05:42:05,158][53252] Updated weights for policy 0, policy_version 25600 (0.0007) [2023-10-10 05:42:05,306][53268] Updated weights for policy 1, policy_version 25550 (0.0010) [2023-10-10 05:42:05,671][53268] Updated weights for policy 1, policy_version 25560 (0.0010) [2023-10-10 05:42:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52396032. Throughput: 0: 1658.9, 1: 1687.8. Samples: 13102498. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:42:06,784][52050] Avg episode reward: [(0, '19.070'), (1, '15.260')] [2023-10-10 05:42:09,133][53252] Updated weights for policy 0, policy_version 25610 (0.0008) [2023-10-10 05:42:09,509][53252] Updated weights for policy 0, policy_version 25620 (0.0010) [2023-10-10 05:42:09,609][53268] Updated weights for policy 1, policy_version 25570 (0.0008) [2023-10-10 05:42:09,882][53252] Updated weights for policy 0, policy_version 25630 (0.0008) [2023-10-10 05:42:09,973][53268] Updated weights for policy 1, policy_version 25580 (0.0008) [2023-10-10 05:42:10,347][53268] Updated weights for policy 1, policy_version 25590 (0.0009) [2023-10-10 05:42:10,715][53268] Updated weights for policy 1, policy_version 25600 (0.0008) [2023-10-10 05:42:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52461568. Throughput: 0: 1683.2, 1: 1668.5. Samples: 13122620. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:42:11,784][52050] Avg episode reward: [(0, '19.450'), (1, '15.970')] [2023-10-10 05:42:13,881][53252] Updated weights for policy 0, policy_version 25640 (0.0008) [2023-10-10 05:42:14,255][53252] Updated weights for policy 0, policy_version 25650 (0.0007) [2023-10-10 05:42:14,630][53252] Updated weights for policy 0, policy_version 25660 (0.0008) [2023-10-10 05:42:14,963][53268] Updated weights for policy 1, policy_version 25610 (0.0010) [2023-10-10 05:42:15,334][53268] Updated weights for policy 1, policy_version 25620 (0.0009) [2023-10-10 05:42:15,707][53268] Updated weights for policy 1, policy_version 25630 (0.0009) [2023-10-10 05:42:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52527104. Throughput: 0: 1668.0, 1: 1701.1. Samples: 13133606. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 05:42:16,784][52050] Avg episode reward: [(0, '20.940'), (1, '17.930')] [2023-10-10 05:42:16,786][52846] Saving new best policy, reward=20.940! [2023-10-10 05:42:18,715][53252] Updated weights for policy 0, policy_version 25670 (0.0008) [2023-10-10 05:42:19,101][53252] Updated weights for policy 0, policy_version 25680 (0.0008) [2023-10-10 05:42:19,468][53252] Updated weights for policy 0, policy_version 25690 (0.0007) [2023-10-10 05:42:19,858][53268] Updated weights for policy 1, policy_version 25640 (0.0009) [2023-10-10 05:42:20,225][53268] Updated weights for policy 1, policy_version 25650 (0.0010) [2023-10-10 05:42:20,606][53268] Updated weights for policy 1, policy_version 25660 (0.0010) [2023-10-10 05:42:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 52592640. Throughput: 0: 1678.3, 1: 1685.4. Samples: 13153188. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:21,784][52050] Avg episode reward: [(0, '20.470'), (1, '16.510')] [2023-10-10 05:42:23,652][53252] Updated weights for policy 0, policy_version 25700 (0.0008) [2023-10-10 05:42:24,024][53252] Updated weights for policy 0, policy_version 25710 (0.0009) [2023-10-10 05:42:24,397][53252] Updated weights for policy 0, policy_version 25720 (0.0008) [2023-10-10 05:42:24,613][53268] Updated weights for policy 1, policy_version 25670 (0.0009) [2023-10-10 05:42:24,993][53268] Updated weights for policy 1, policy_version 25680 (0.0010) [2023-10-10 05:42:25,348][53268] Updated weights for policy 1, policy_version 25690 (0.0008) [2023-10-10 05:42:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 52658176. Throughput: 0: 1683.5, 1: 1679.9. Samples: 13173186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:26,784][52050] Avg episode reward: [(0, '19.100'), (1, '17.990')] [2023-10-10 05:42:28,464][53252] Updated weights for policy 0, policy_version 25730 (0.0009) [2023-10-10 05:42:28,838][53252] Updated weights for policy 0, policy_version 25740 (0.0011) [2023-10-10 05:42:29,219][53252] Updated weights for policy 0, policy_version 25750 (0.0009) [2023-10-10 05:42:29,444][53268] Updated weights for policy 1, policy_version 25700 (0.0007) [2023-10-10 05:42:29,589][53252] Updated weights for policy 0, policy_version 25760 (0.0007) [2023-10-10 05:42:29,815][53268] Updated weights for policy 1, policy_version 25710 (0.0008) [2023-10-10 05:42:30,187][53268] Updated weights for policy 1, policy_version 25720 (0.0009) [2023-10-10 05:42:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 52723712. Throughput: 0: 1665.9, 1: 1692.3. Samples: 13183680. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:31,784][52050] Avg episode reward: [(0, '18.840'), (1, '16.610')] [2023-10-10 05:42:33,402][53252] Updated weights for policy 0, policy_version 25770 (0.0007) [2023-10-10 05:42:33,766][53252] Updated weights for policy 0, policy_version 25780 (0.0007) [2023-10-10 05:42:34,146][53252] Updated weights for policy 0, policy_version 25790 (0.0007) [2023-10-10 05:42:34,159][53268] Updated weights for policy 1, policy_version 25730 (0.0009) [2023-10-10 05:42:34,518][53268] Updated weights for policy 1, policy_version 25740 (0.0009) [2023-10-10 05:42:34,876][53268] Updated weights for policy 1, policy_version 25750 (0.0009) [2023-10-10 05:42:35,244][53268] Updated weights for policy 1, policy_version 25760 (0.0009) [2023-10-10 05:42:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52789248. Throughput: 0: 1688.7, 1: 1664.6. Samples: 13203238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:36,784][52050] Avg episode reward: [(0, '17.140'), (1, '15.540')] [2023-10-10 05:42:38,124][53252] Updated weights for policy 0, policy_version 25800 (0.0010) [2023-10-10 05:42:38,511][53252] Updated weights for policy 0, policy_version 25810 (0.0009) [2023-10-10 05:42:38,884][53252] Updated weights for policy 0, policy_version 25820 (0.0008) [2023-10-10 05:42:39,317][53268] Updated weights for policy 1, policy_version 25770 (0.0009) [2023-10-10 05:42:39,688][53268] Updated weights for policy 1, policy_version 25780 (0.0007) [2023-10-10 05:42:40,049][53268] Updated weights for policy 1, policy_version 25790 (0.0007) [2023-10-10 05:42:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52854784. Throughput: 0: 1690.3, 1: 1685.9. Samples: 13223874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:41,784][52050] Avg episode reward: [(0, '18.980'), (1, '18.000')] [2023-10-10 05:42:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000025792_26411008.pth... [2023-10-10 05:42:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000025824_26443776.pth... 
[2023-10-10 05:42:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000024224_24805376.pth [2023-10-10 05:42:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000024256_24838144.pth [2023-10-10 05:42:43,180][53252] Updated weights for policy 0, policy_version 25830 (0.0010) [2023-10-10 05:42:43,551][53252] Updated weights for policy 0, policy_version 25840 (0.0010) [2023-10-10 05:42:43,919][53252] Updated weights for policy 0, policy_version 25850 (0.0007) [2023-10-10 05:42:44,167][53268] Updated weights for policy 1, policy_version 25800 (0.0008) [2023-10-10 05:42:44,530][53268] Updated weights for policy 1, policy_version 25810 (0.0009) [2023-10-10 05:42:44,897][53268] Updated weights for policy 1, policy_version 25820 (0.0008) [2023-10-10 05:42:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52920320. Throughput: 0: 1665.4, 1: 1686.4. Samples: 13233910. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:46,784][52050] Avg episode reward: [(0, '17.350'), (1, '16.710')] [2023-10-10 05:42:47,894][53252] Updated weights for policy 0, policy_version 25860 (0.0008) [2023-10-10 05:42:48,278][53252] Updated weights for policy 0, policy_version 25870 (0.0009) [2023-10-10 05:42:48,657][53252] Updated weights for policy 0, policy_version 25880 (0.0009) [2023-10-10 05:42:48,957][53268] Updated weights for policy 1, policy_version 25830 (0.0010) [2023-10-10 05:42:49,324][53268] Updated weights for policy 1, policy_version 25840 (0.0009) [2023-10-10 05:42:49,692][53268] Updated weights for policy 1, policy_version 25850 (0.0009) [2023-10-10 05:42:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 52985856. Throughput: 0: 1693.3, 1: 1668.6. Samples: 13253784. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:51,784][52050] Avg episode reward: [(0, '18.490'), (1, '15.890')] [2023-10-10 05:42:52,634][53252] Updated weights for policy 0, policy_version 25890 (0.0007) [2023-10-10 05:42:53,022][53252] Updated weights for policy 0, policy_version 25900 (0.0008) [2023-10-10 05:42:53,391][53252] Updated weights for policy 0, policy_version 25910 (0.0009) [2023-10-10 05:42:53,707][53268] Updated weights for policy 1, policy_version 25860 (0.0008) [2023-10-10 05:42:53,759][53252] Updated weights for policy 0, policy_version 25920 (0.0008) [2023-10-10 05:42:54,071][53268] Updated weights for policy 1, policy_version 25870 (0.0007) [2023-10-10 05:42:54,446][53268] Updated weights for policy 1, policy_version 25880 (0.0008) [2023-10-10 05:42:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53051392. Throughput: 0: 1689.8, 1: 1690.5. Samples: 13274732. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:42:56,784][52050] Avg episode reward: [(0, '19.460'), (1, '16.360')] [2023-10-10 05:42:57,781][53252] Updated weights for policy 0, policy_version 25930 (0.0007) [2023-10-10 05:42:58,151][53252] Updated weights for policy 0, policy_version 25940 (0.0008) [2023-10-10 05:42:58,452][53268] Updated weights for policy 1, policy_version 25890 (0.0008) [2023-10-10 05:42:58,519][53252] Updated weights for policy 0, policy_version 25950 (0.0008) [2023-10-10 05:42:58,829][53268] Updated weights for policy 1, policy_version 25900 (0.0009) [2023-10-10 05:42:59,195][53268] Updated weights for policy 1, policy_version 25910 (0.0008) [2023-10-10 05:42:59,556][53268] Updated weights for policy 1, policy_version 25920 (0.0008) [2023-10-10 05:43:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53116928. Throughput: 0: 1677.3, 1: 1671.0. Samples: 13284282. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-10 05:43:01,784][52050] Avg episode reward: [(0, '18.000'), (1, '15.860')] [2023-10-10 05:43:02,605][53252] Updated weights for policy 0, policy_version 25960 (0.0008) [2023-10-10 05:43:02,977][53252] Updated weights for policy 0, policy_version 25970 (0.0011) [2023-10-10 05:43:03,360][53252] Updated weights for policy 0, policy_version 25980 (0.0009) [2023-10-10 05:43:03,770][53268] Updated weights for policy 1, policy_version 25930 (0.0010) [2023-10-10 05:43:04,137][53268] Updated weights for policy 1, policy_version 25940 (0.0008) [2023-10-10 05:43:04,516][53268] Updated weights for policy 1, policy_version 25950 (0.0009) [2023-10-10 05:43:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53182464. Throughput: 0: 1686.1, 1: 1673.6. Samples: 13304374. Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-10 05:43:06,784][52050] Avg episode reward: [(0, '19.730'), (1, '17.240')] [2023-10-10 05:43:07,442][53252] Updated weights for policy 0, policy_version 25990 (0.0009) [2023-10-10 05:43:07,815][53252] Updated weights for policy 0, policy_version 26000 (0.0009) [2023-10-10 05:43:08,190][53252] Updated weights for policy 0, policy_version 26010 (0.0008) [2023-10-10 05:43:08,492][53268] Updated weights for policy 1, policy_version 25960 (0.0010) [2023-10-10 05:43:08,854][53268] Updated weights for policy 1, policy_version 25970 (0.0010) [2023-10-10 05:43:09,219][53268] Updated weights for policy 1, policy_version 25980 (0.0011) [2023-10-10 05:43:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53248000. Throughput: 0: 1688.1, 1: 1687.5. Samples: 13325088. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-10 05:43:11,785][52050] Avg episode reward: [(0, '18.970'), (1, '16.240')] [2023-10-10 05:43:12,253][53252] Updated weights for policy 0, policy_version 26020 (0.0010) [2023-10-10 05:43:12,624][53252] Updated weights for policy 0, policy_version 26030 (0.0010) [2023-10-10 05:43:12,995][53252] Updated weights for policy 0, policy_version 26040 (0.0007) [2023-10-10 05:43:13,384][53268] Updated weights for policy 1, policy_version 25990 (0.0009) [2023-10-10 05:43:13,748][53268] Updated weights for policy 1, policy_version 26000 (0.0008) [2023-10-10 05:43:14,114][53268] Updated weights for policy 1, policy_version 26010 (0.0010) [2023-10-10 05:43:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 53313536. Throughput: 0: 1681.7, 1: 1664.6. Samples: 13334264. Policy #0 lag: (min: 31.0, avg: 31.2, max: 39.0) [2023-10-10 05:43:16,784][52050] Avg episode reward: [(0, '18.750'), (1, '17.690')] [2023-10-10 05:43:17,064][53252] Updated weights for policy 0, policy_version 26050 (0.0009) [2023-10-10 05:43:17,428][53252] Updated weights for policy 0, policy_version 26060 (0.0010) [2023-10-10 05:43:17,803][53252] Updated weights for policy 0, policy_version 26070 (0.0009) [2023-10-10 05:43:18,001][53268] Updated weights for policy 1, policy_version 26020 (0.0010) [2023-10-10 05:43:18,179][53252] Updated weights for policy 0, policy_version 26080 (0.0009) [2023-10-10 05:43:18,376][53268] Updated weights for policy 1, policy_version 26030 (0.0010) [2023-10-10 05:43:18,738][53268] Updated weights for policy 1, policy_version 26040 (0.0009) [2023-10-10 05:43:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53379072. Throughput: 0: 1681.6, 1: 1688.1. Samples: 13354876. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:43:21,784][52050] Avg episode reward: [(0, '19.820'), (1, '16.650')] [2023-10-10 05:43:22,394][53252] Updated weights for policy 0, policy_version 26090 (0.0007) [2023-10-10 05:43:22,688][53268] Updated weights for policy 1, policy_version 26050 (0.0008) [2023-10-10 05:43:22,770][53252] Updated weights for policy 0, policy_version 26100 (0.0007) [2023-10-10 05:43:23,054][53268] Updated weights for policy 1, policy_version 26060 (0.0008) [2023-10-10 05:43:23,148][53252] Updated weights for policy 0, policy_version 26110 (0.0009) [2023-10-10 05:43:23,428][53268] Updated weights for policy 1, policy_version 26070 (0.0009) [2023-10-10 05:43:23,789][53268] Updated weights for policy 1, policy_version 26080 (0.0010) [2023-10-10 05:43:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53444608. Throughput: 0: 1687.6, 1: 1689.2. Samples: 13375832. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:43:26,784][52050] Avg episode reward: [(0, '19.370'), (1, '16.600')] [2023-10-10 05:43:27,181][53252] Updated weights for policy 0, policy_version 26120 (0.0010) [2023-10-10 05:43:27,542][53252] Updated weights for policy 0, policy_version 26130 (0.0009) [2023-10-10 05:43:27,921][53252] Updated weights for policy 0, policy_version 26140 (0.0008) [2023-10-10 05:43:28,013][53268] Updated weights for policy 1, policy_version 26090 (0.0007) [2023-10-10 05:43:28,376][53268] Updated weights for policy 1, policy_version 26100 (0.0008) [2023-10-10 05:43:28,755][53268] Updated weights for policy 1, policy_version 26110 (0.0007) [2023-10-10 05:43:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53510144. Throughput: 0: 1687.4, 1: 1663.7. Samples: 13384712. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:43:31,784][52050] Avg episode reward: [(0, '18.330'), (1, '16.240')] [2023-10-10 05:43:32,134][53252] Updated weights for policy 0, policy_version 26150 (0.0007) [2023-10-10 05:43:32,522][53252] Updated weights for policy 0, policy_version 26160 (0.0008) [2023-10-10 05:43:32,894][53252] Updated weights for policy 0, policy_version 26170 (0.0008) [2023-10-10 05:43:33,028][53268] Updated weights for policy 1, policy_version 26120 (0.0008) [2023-10-10 05:43:33,395][53268] Updated weights for policy 1, policy_version 26130 (0.0008) [2023-10-10 05:43:33,755][53268] Updated weights for policy 1, policy_version 26140 (0.0008) [2023-10-10 05:43:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 53575680. Throughput: 0: 1677.5, 1: 1682.9. Samples: 13405002. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 05:43:36,784][52050] Avg episode reward: [(0, '18.290'), (1, '17.290')] [2023-10-10 05:43:37,033][53252] Updated weights for policy 0, policy_version 26180 (0.0007) [2023-10-10 05:43:37,409][53252] Updated weights for policy 0, policy_version 26190 (0.0009) [2023-10-10 05:43:37,789][53252] Updated weights for policy 0, policy_version 26200 (0.0011) [2023-10-10 05:43:37,959][53268] Updated weights for policy 1, policy_version 26150 (0.0008) [2023-10-10 05:43:38,327][53268] Updated weights for policy 1, policy_version 26160 (0.0009) [2023-10-10 05:43:38,692][53268] Updated weights for policy 1, policy_version 26170 (0.0008) [2023-10-10 05:43:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53641216. Throughput: 0: 1672.8, 1: 1677.0. Samples: 13425474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:43:41,784][52050] Avg episode reward: [(0, '17.880'), (1, '16.530')] [2023-10-10 05:43:42,000][53252] Updated weights for policy 0, policy_version 26210 (0.0009) [2023-10-10 05:43:42,386][53252] Updated weights for policy 0, policy_version 26220 (0.0007) [2023-10-10 05:43:42,757][53252] Updated weights for policy 0, policy_version 26230 (0.0008) [2023-10-10 05:43:42,839][53268] Updated weights for policy 1, policy_version 26180 (0.0008) [2023-10-10 05:43:43,136][53252] Updated weights for policy 0, policy_version 26240 (0.0009) [2023-10-10 05:43:43,207][53268] Updated weights for policy 1, policy_version 26190 (0.0010) [2023-10-10 05:43:43,577][53268] Updated weights for policy 1, policy_version 26200 (0.0012) [2023-10-10 05:43:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53706752. Throughput: 0: 1668.4, 1: 1666.5. Samples: 13434350. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:43:46,784][52050] Avg episode reward: [(0, '17.650'), (1, '16.870')] [2023-10-10 05:43:47,199][53252] Updated weights for policy 0, policy_version 26250 (0.0011) [2023-10-10 05:43:47,439][53268] Updated weights for policy 1, policy_version 26210 (0.0008) [2023-10-10 05:43:47,572][53252] Updated weights for policy 0, policy_version 26260 (0.0009) [2023-10-10 05:43:47,816][53268] Updated weights for policy 1, policy_version 26220 (0.0008) [2023-10-10 05:43:47,940][53252] Updated weights for policy 0, policy_version 26270 (0.0009) [2023-10-10 05:43:48,178][53268] Updated weights for policy 1, policy_version 26230 (0.0008) [2023-10-10 05:43:48,543][53268] Updated weights for policy 1, policy_version 26240 (0.0008) [2023-10-10 05:43:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53772288. Throughput: 0: 1669.9, 1: 1681.3. Samples: 13455176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:43:51,784][52050] Avg episode reward: [(0, '17.830'), (1, '19.460')] [2023-10-10 05:43:51,916][53252] Updated weights for policy 0, policy_version 26280 (0.0010) [2023-10-10 05:43:52,289][53252] Updated weights for policy 0, policy_version 26290 (0.0008) [2023-10-10 05:43:52,667][53252] Updated weights for policy 0, policy_version 26300 (0.0009) [2023-10-10 05:43:52,881][53268] Updated weights for policy 1, policy_version 26250 (0.0009) [2023-10-10 05:43:53,259][53268] Updated weights for policy 1, policy_version 26260 (0.0010) [2023-10-10 05:43:53,629][53268] Updated weights for policy 1, policy_version 26270 (0.0009) [2023-10-10 05:43:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53837824. Throughput: 0: 1672.9, 1: 1672.6. Samples: 13475636. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:43:56,784][52050] Avg episode reward: [(0, '19.930'), (1, '18.250')] [2023-10-10 05:43:56,800][53252] Updated weights for policy 0, policy_version 26310 (0.0008) [2023-10-10 05:43:57,165][53252] Updated weights for policy 0, policy_version 26320 (0.0009) [2023-10-10 05:43:57,539][53252] Updated weights for policy 0, policy_version 26330 (0.0008) [2023-10-10 05:43:57,665][53268] Updated weights for policy 1, policy_version 26280 (0.0009) [2023-10-10 05:43:58,034][53268] Updated weights for policy 1, policy_version 26290 (0.0009) [2023-10-10 05:43:58,405][53268] Updated weights for policy 1, policy_version 26300 (0.0007) [2023-10-10 05:44:01,574][53252] Updated weights for policy 0, policy_version 26340 (0.0009) [2023-10-10 05:44:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53903360. Throughput: 0: 1672.4, 1: 1672.6. Samples: 13484786. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:01,784][52050] Avg episode reward: [(0, '20.040'), (1, '17.160')] [2023-10-10 05:44:01,947][53252] Updated weights for policy 0, policy_version 26350 (0.0009) [2023-10-10 05:44:02,316][53252] Updated weights for policy 0, policy_version 26360 (0.0008) [2023-10-10 05:44:02,382][53268] Updated weights for policy 1, policy_version 26310 (0.0007) [2023-10-10 05:44:02,743][53268] Updated weights for policy 1, policy_version 26320 (0.0007) [2023-10-10 05:44:03,105][53268] Updated weights for policy 1, policy_version 26330 (0.0007) [2023-10-10 05:44:06,317][53252] Updated weights for policy 0, policy_version 26370 (0.0007) [2023-10-10 05:44:06,692][53252] Updated weights for policy 0, policy_version 26380 (0.0009) [2023-10-10 05:44:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53968896. Throughput: 0: 1676.8, 1: 1672.8. Samples: 13505610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:06,784][52050] Avg episode reward: [(0, '19.320'), (1, '17.970')] [2023-10-10 05:44:07,053][53252] Updated weights for policy 0, policy_version 26390 (0.0009) [2023-10-10 05:44:07,232][53268] Updated weights for policy 1, policy_version 26340 (0.0008) [2023-10-10 05:44:07,423][53252] Updated weights for policy 0, policy_version 26400 (0.0007) [2023-10-10 05:44:07,597][53268] Updated weights for policy 1, policy_version 26350 (0.0007) [2023-10-10 05:44:07,970][53268] Updated weights for policy 1, policy_version 26360 (0.0008) [2023-10-10 05:44:11,441][53252] Updated weights for policy 0, policy_version 26410 (0.0007) [2023-10-10 05:44:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 54034432. Throughput: 0: 1663.9, 1: 1672.6. Samples: 13525972. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:11,784][52050] Avg episode reward: [(0, '19.380'), (1, '15.750')] [2023-10-10 05:44:11,826][53252] Updated weights for policy 0, policy_version 26420 (0.0008) [2023-10-10 05:44:12,046][53268] Updated weights for policy 1, policy_version 26370 (0.0010) [2023-10-10 05:44:12,195][53252] Updated weights for policy 0, policy_version 26430 (0.0007) [2023-10-10 05:44:12,418][53268] Updated weights for policy 1, policy_version 26380 (0.0010) [2023-10-10 05:44:12,793][53268] Updated weights for policy 1, policy_version 26390 (0.0010) [2023-10-10 05:44:13,165][53268] Updated weights for policy 1, policy_version 26400 (0.0009) [2023-10-10 05:44:16,208][53252] Updated weights for policy 0, policy_version 26440 (0.0007) [2023-10-10 05:44:16,576][53252] Updated weights for policy 0, policy_version 26450 (0.0007) [2023-10-10 05:44:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 54099968. Throughput: 0: 1675.2, 1: 1675.3. Samples: 13535482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:16,784][52050] Avg episode reward: [(0, '18.140'), (1, '17.030')] [2023-10-10 05:44:16,952][53252] Updated weights for policy 0, policy_version 26460 (0.0008) [2023-10-10 05:44:17,308][53268] Updated weights for policy 1, policy_version 26410 (0.0007) [2023-10-10 05:44:17,664][53268] Updated weights for policy 1, policy_version 26420 (0.0008) [2023-10-10 05:44:18,036][53268] Updated weights for policy 1, policy_version 26430 (0.0009) [2023-10-10 05:44:21,038][53252] Updated weights for policy 0, policy_version 26470 (0.0008) [2023-10-10 05:44:21,405][53252] Updated weights for policy 0, policy_version 26480 (0.0007) [2023-10-10 05:44:21,779][53252] Updated weights for policy 0, policy_version 26490 (0.0009) [2023-10-10 05:44:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 54165504. Throughput: 0: 1684.8, 1: 1678.0. 
Samples: 13556330. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:21,784][52050] Avg episode reward: [(0, '18.820'), (1, '17.150')] [2023-10-10 05:44:22,059][53268] Updated weights for policy 1, policy_version 26440 (0.0009) [2023-10-10 05:44:22,432][53268] Updated weights for policy 1, policy_version 26450 (0.0010) [2023-10-10 05:44:22,794][53268] Updated weights for policy 1, policy_version 26460 (0.0009) [2023-10-10 05:44:25,614][53252] Updated weights for policy 0, policy_version 26500 (0.0008) [2023-10-10 05:44:25,991][53252] Updated weights for policy 0, policy_version 26510 (0.0009) [2023-10-10 05:44:26,362][53252] Updated weights for policy 0, policy_version 26520 (0.0007) [2023-10-10 05:44:26,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54263808. Throughput: 0: 1670.0, 1: 1682.3. Samples: 13576328. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:26,784][52050] Avg episode reward: [(0, '18.050'), (1, '17.490')] [2023-10-10 05:44:26,794][53268] Updated weights for policy 1, policy_version 26470 (0.0008) [2023-10-10 05:44:27,166][53268] Updated weights for policy 1, policy_version 26480 (0.0009) [2023-10-10 05:44:27,544][53268] Updated weights for policy 1, policy_version 26490 (0.0009) [2023-10-10 05:44:30,475][53252] Updated weights for policy 0, policy_version 26530 (0.0007) [2023-10-10 05:44:30,875][53252] Updated weights for policy 0, policy_version 26540 (0.0009) [2023-10-10 05:44:31,251][53252] Updated weights for policy 0, policy_version 26550 (0.0007) [2023-10-10 05:44:31,620][53252] Updated weights for policy 0, policy_version 26560 (0.0007) [2023-10-10 05:44:31,627][53268] Updated weights for policy 1, policy_version 26500 (0.0009) [2023-10-10 05:44:31,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54329344. Throughput: 0: 1696.9, 1: 1681.5. Samples: 13586378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:44:31,784][52050] Avg episode reward: [(0, '21.200'), (1, '17.370')] [2023-10-10 05:44:31,784][52846] Saving new best policy, reward=21.200! [2023-10-10 05:44:31,993][53268] Updated weights for policy 1, policy_version 26510 (0.0009) [2023-10-10 05:44:32,361][53268] Updated weights for policy 1, policy_version 26520 (0.0010) [2023-10-10 05:44:35,600][53252] Updated weights for policy 0, policy_version 26570 (0.0008) [2023-10-10 05:44:35,976][53252] Updated weights for policy 0, policy_version 26580 (0.0007) [2023-10-10 05:44:36,356][53252] Updated weights for policy 0, policy_version 26590 (0.0008) [2023-10-10 05:44:36,456][53268] Updated weights for policy 1, policy_version 26530 (0.0007) [2023-10-10 05:44:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 54394880. Throughput: 0: 1694.9, 1: 1679.3. Samples: 13607016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:44:36,784][52050] Avg episode reward: [(0, '20.690'), (1, '16.300')] [2023-10-10 05:44:36,841][53268] Updated weights for policy 1, policy_version 26540 (0.0009) [2023-10-10 05:44:37,207][53268] Updated weights for policy 1, policy_version 26550 (0.0011) [2023-10-10 05:44:37,581][53268] Updated weights for policy 1, policy_version 26560 (0.0008) [2023-10-10 05:44:40,238][53252] Updated weights for policy 0, policy_version 26600 (0.0009) [2023-10-10 05:44:40,614][53252] Updated weights for policy 0, policy_version 26610 (0.0008) [2023-10-10 05:44:40,977][53252] Updated weights for policy 0, policy_version 26620 (0.0009) [2023-10-10 05:44:41,617][53268] Updated weights for policy 1, policy_version 26570 (0.0009) [2023-10-10 05:44:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54460416. Throughput: 0: 1671.5, 1: 1691.3. Samples: 13626960. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:44:41,784][52050] Avg episode reward: [(0, '20.080'), (1, '15.670')] [2023-10-10 05:44:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000026624_27262976.pth... [2023-10-10 05:44:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000025056_25657344.pth [2023-10-10 05:44:41,982][53268] Updated weights for policy 1, policy_version 26580 (0.0009) [2023-10-10 05:44:42,355][53268] Updated weights for policy 1, policy_version 26590 (0.0008) [2023-10-10 05:44:42,425][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000026592_27230208.pth... [2023-10-10 05:44:42,460][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000024992_25591808.pth [2023-10-10 05:44:44,955][53252] Updated weights for policy 0, policy_version 26630 (0.0011) [2023-10-10 05:44:45,334][53252] Updated weights for policy 0, policy_version 26640 (0.0008) [2023-10-10 05:44:45,704][53252] Updated weights for policy 0, policy_version 26650 (0.0008) [2023-10-10 05:44:46,313][53268] Updated weights for policy 1, policy_version 26600 (0.0010) [2023-10-10 05:44:46,691][53268] Updated weights for policy 1, policy_version 26610 (0.0010) [2023-10-10 05:44:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54525952. Throughput: 0: 1708.4, 1: 1683.6. Samples: 13637422. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:44:46,784][52050] Avg episode reward: [(0, '19.230'), (1, '15.140')] [2023-10-10 05:44:47,060][53268] Updated weights for policy 1, policy_version 26620 (0.0009) [2023-10-10 05:44:49,545][53252] Updated weights for policy 0, policy_version 26660 (0.0009) [2023-10-10 05:44:49,919][53252] Updated weights for policy 0, policy_version 26670 (0.0009) [2023-10-10 05:44:50,300][53252] Updated weights for policy 0, policy_version 26680 (0.0009) [2023-10-10 05:44:51,088][53268] Updated weights for policy 1, policy_version 26630 (0.0010) [2023-10-10 05:44:51,460][53268] Updated weights for policy 1, policy_version 26640 (0.0011) [2023-10-10 05:44:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54591488. Throughput: 0: 1685.3, 1: 1689.2. Samples: 13657466. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-10 05:44:51,784][52050] Avg episode reward: [(0, '19.000'), (1, '15.520')] [2023-10-10 05:44:51,816][53268] Updated weights for policy 1, policy_version 26650 (0.0007) [2023-10-10 05:44:54,480][53252] Updated weights for policy 0, policy_version 26690 (0.0009) [2023-10-10 05:44:54,837][53252] Updated weights for policy 0, policy_version 26700 (0.0008) [2023-10-10 05:44:55,209][53252] Updated weights for policy 0, policy_version 26710 (0.0007) [2023-10-10 05:44:55,573][53252] Updated weights for policy 0, policy_version 26720 (0.0007) [2023-10-10 05:44:55,846][53268] Updated weights for policy 1, policy_version 26660 (0.0009) [2023-10-10 05:44:56,214][53268] Updated weights for policy 1, policy_version 26670 (0.0007) [2023-10-10 05:44:56,580][53268] Updated weights for policy 1, policy_version 26680 (0.0007) [2023-10-10 05:44:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54657024. Throughput: 0: 1687.3, 1: 1681.2. Samples: 13677554. 
Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 05:44:56,784][52050] Avg episode reward: [(0, '18.830'), (1, '16.050')] [2023-10-10 05:44:59,606][53252] Updated weights for policy 0, policy_version 26730 (0.0007) [2023-10-10 05:44:59,982][53252] Updated weights for policy 0, policy_version 26740 (0.0009) [2023-10-10 05:45:00,346][53252] Updated weights for policy 0, policy_version 26750 (0.0010) [2023-10-10 05:45:00,552][53268] Updated weights for policy 1, policy_version 26690 (0.0008) [2023-10-10 05:45:00,920][53268] Updated weights for policy 1, policy_version 26700 (0.0011) [2023-10-10 05:45:01,287][53268] Updated weights for policy 1, policy_version 26710 (0.0010) [2023-10-10 05:45:01,655][53268] Updated weights for policy 1, policy_version 26720 (0.0010) [2023-10-10 05:45:01,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 54755328. Throughput: 0: 1706.4, 1: 1690.0. Samples: 13688320. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 05:45:01,784][52050] Avg episode reward: [(0, '18.650'), (1, '17.200')] [2023-10-10 05:45:04,455][53252] Updated weights for policy 0, policy_version 26760 (0.0009) [2023-10-10 05:45:04,820][53252] Updated weights for policy 0, policy_version 26770 (0.0009) [2023-10-10 05:45:05,199][53252] Updated weights for policy 0, policy_version 26780 (0.0009) [2023-10-10 05:45:05,886][53268] Updated weights for policy 1, policy_version 26730 (0.0010) [2023-10-10 05:45:06,254][53268] Updated weights for policy 1, policy_version 26740 (0.0011) [2023-10-10 05:45:06,624][53268] Updated weights for policy 1, policy_version 26750 (0.0010) [2023-10-10 05:45:06,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 54820864. Throughput: 0: 1676.5, 1: 1693.9. Samples: 13707998. 
Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 05:45:06,784][52050] Avg episode reward: [(0, '19.450'), (1, '17.400')] [2023-10-10 05:45:09,243][53252] Updated weights for policy 0, policy_version 26790 (0.0007) [2023-10-10 05:45:09,611][53252] Updated weights for policy 0, policy_version 26800 (0.0009) [2023-10-10 05:45:09,992][53252] Updated weights for policy 0, policy_version 26810 (0.0008) [2023-10-10 05:45:10,736][53268] Updated weights for policy 1, policy_version 26760 (0.0008) [2023-10-10 05:45:11,097][53268] Updated weights for policy 1, policy_version 26770 (0.0011) [2023-10-10 05:45:11,467][53268] Updated weights for policy 1, policy_version 26780 (0.0011) [2023-10-10 05:45:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 54886400. Throughput: 0: 1693.0, 1: 1674.2. Samples: 13727854. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 05:45:11,784][52050] Avg episode reward: [(0, '19.800'), (1, '16.550')] [2023-10-10 05:45:13,958][53252] Updated weights for policy 0, policy_version 26820 (0.0009) [2023-10-10 05:45:14,326][53252] Updated weights for policy 0, policy_version 26830 (0.0007) [2023-10-10 05:45:14,689][53252] Updated weights for policy 0, policy_version 26840 (0.0007) [2023-10-10 05:45:15,564][53268] Updated weights for policy 1, policy_version 26790 (0.0010) [2023-10-10 05:45:15,925][53268] Updated weights for policy 1, policy_version 26800 (0.0011) [2023-10-10 05:45:16,295][53268] Updated weights for policy 1, policy_version 26810 (0.0010) [2023-10-10 05:45:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 54951936. Throughput: 0: 1691.4, 1: 1687.5. Samples: 13738430. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:16,784][52050] Avg episode reward: [(0, '19.190'), (1, '17.570')] [2023-10-10 05:45:18,567][53252] Updated weights for policy 0, policy_version 26850 (0.0008) [2023-10-10 05:45:18,946][53252] Updated weights for policy 0, policy_version 26860 (0.0008) [2023-10-10 05:45:19,307][53252] Updated weights for policy 0, policy_version 26870 (0.0007) [2023-10-10 05:45:19,675][53252] Updated weights for policy 0, policy_version 26880 (0.0007) [2023-10-10 05:45:20,507][53268] Updated weights for policy 1, policy_version 26820 (0.0010) [2023-10-10 05:45:20,875][53268] Updated weights for policy 1, policy_version 26830 (0.0009) [2023-10-10 05:45:21,244][53268] Updated weights for policy 1, policy_version 26840 (0.0008) [2023-10-10 05:45:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 55017472. Throughput: 0: 1677.9, 1: 1683.5. Samples: 13758280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:21,785][52050] Avg episode reward: [(0, '19.160'), (1, '17.500')] [2023-10-10 05:45:23,885][53252] Updated weights for policy 0, policy_version 26890 (0.0009) [2023-10-10 05:45:24,268][53252] Updated weights for policy 0, policy_version 26900 (0.0009) [2023-10-10 05:45:24,642][53252] Updated weights for policy 0, policy_version 26910 (0.0009) [2023-10-10 05:45:25,187][53268] Updated weights for policy 1, policy_version 26850 (0.0007) [2023-10-10 05:45:25,551][53268] Updated weights for policy 1, policy_version 26860 (0.0007) [2023-10-10 05:45:25,920][53268] Updated weights for policy 1, policy_version 26870 (0.0008) [2023-10-10 05:45:26,290][53268] Updated weights for policy 1, policy_version 26880 (0.0007) [2023-10-10 05:45:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 55083008. Throughput: 0: 1698.3, 1: 1658.0. Samples: 13777996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:26,784][52050] Avg episode reward: [(0, '18.600'), (1, '16.850')] [2023-10-10 05:45:28,634][53252] Updated weights for policy 0, policy_version 26920 (0.0009) [2023-10-10 05:45:29,015][53252] Updated weights for policy 0, policy_version 26930 (0.0009) [2023-10-10 05:45:29,386][53252] Updated weights for policy 0, policy_version 26940 (0.0007) [2023-10-10 05:45:30,628][53268] Updated weights for policy 1, policy_version 26890 (0.0008) [2023-10-10 05:45:30,995][53268] Updated weights for policy 1, policy_version 26900 (0.0009) [2023-10-10 05:45:31,356][53268] Updated weights for policy 1, policy_version 26910 (0.0009) [2023-10-10 05:45:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 55148544. Throughput: 0: 1663.5, 1: 1685.5. Samples: 13788126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:31,784][52050] Avg episode reward: [(0, '18.700'), (1, '16.840')] [2023-10-10 05:45:33,400][53252] Updated weights for policy 0, policy_version 26950 (0.0009) [2023-10-10 05:45:33,776][53252] Updated weights for policy 0, policy_version 26960 (0.0011) [2023-10-10 05:45:34,147][53252] Updated weights for policy 0, policy_version 26970 (0.0009) [2023-10-10 05:45:35,150][53268] Updated weights for policy 1, policy_version 26920 (0.0007) [2023-10-10 05:45:35,515][53268] Updated weights for policy 1, policy_version 26930 (0.0009) [2023-10-10 05:45:35,887][53268] Updated weights for policy 1, policy_version 26940 (0.0008) [2023-10-10 05:45:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 55214080. Throughput: 0: 1679.6, 1: 1671.5. Samples: 13808270. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:36,784][52050] Avg episode reward: [(0, '18.830'), (1, '17.570')] [2023-10-10 05:45:38,217][53252] Updated weights for policy 0, policy_version 26980 (0.0008) [2023-10-10 05:45:38,588][53252] Updated weights for policy 0, policy_version 26990 (0.0008) [2023-10-10 05:45:38,957][53252] Updated weights for policy 0, policy_version 27000 (0.0009) [2023-10-10 05:45:39,945][53268] Updated weights for policy 1, policy_version 26950 (0.0008) [2023-10-10 05:45:40,321][53268] Updated weights for policy 1, policy_version 26960 (0.0009) [2023-10-10 05:45:40,681][53268] Updated weights for policy 1, policy_version 26970 (0.0009) [2023-10-10 05:45:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 55279616. Throughput: 0: 1686.2, 1: 1659.0. Samples: 13828090. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:41,784][52050] Avg episode reward: [(0, '18.450'), (1, '16.160')] [2023-10-10 05:45:43,116][53252] Updated weights for policy 0, policy_version 27010 (0.0009) [2023-10-10 05:45:43,475][53252] Updated weights for policy 0, policy_version 27020 (0.0008) [2023-10-10 05:45:43,853][53252] Updated weights for policy 0, policy_version 27030 (0.0008) [2023-10-10 05:45:44,217][53252] Updated weights for policy 0, policy_version 27040 (0.0007) [2023-10-10 05:45:44,804][53268] Updated weights for policy 1, policy_version 26980 (0.0009) [2023-10-10 05:45:45,173][53268] Updated weights for policy 1, policy_version 26990 (0.0008) [2023-10-10 05:45:45,547][53268] Updated weights for policy 1, policy_version 27000 (0.0009) [2023-10-10 05:45:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 55345152. Throughput: 0: 1657.7, 1: 1677.9. Samples: 13838422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:46,784][52050] Avg episode reward: [(0, '18.040'), (1, '15.740')] [2023-10-10 05:45:48,520][53252] Updated weights for policy 0, policy_version 27050 (0.0011) [2023-10-10 05:45:48,890][53252] Updated weights for policy 0, policy_version 27060 (0.0008) [2023-10-10 05:45:49,271][53252] Updated weights for policy 0, policy_version 27070 (0.0009) [2023-10-10 05:45:49,604][53268] Updated weights for policy 1, policy_version 27010 (0.0007) [2023-10-10 05:45:49,963][53268] Updated weights for policy 1, policy_version 27020 (0.0007) [2023-10-10 05:45:50,334][53268] Updated weights for policy 1, policy_version 27030 (0.0009) [2023-10-10 05:45:50,712][53268] Updated weights for policy 1, policy_version 27040 (0.0011) [2023-10-10 05:45:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55410688. Throughput: 0: 1678.9, 1: 1660.4. Samples: 13858266. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:45:51,784][52050] Avg episode reward: [(0, '17.300'), (1, '16.850')] [2023-10-10 05:45:53,259][53252] Updated weights for policy 0, policy_version 27080 (0.0007) [2023-10-10 05:45:53,629][53252] Updated weights for policy 0, policy_version 27090 (0.0007) [2023-10-10 05:45:54,000][53252] Updated weights for policy 0, policy_version 27100 (0.0010) [2023-10-10 05:45:54,800][53268] Updated weights for policy 1, policy_version 27050 (0.0010) [2023-10-10 05:45:55,170][53268] Updated weights for policy 1, policy_version 27060 (0.0010) [2023-10-10 05:45:55,546][53268] Updated weights for policy 1, policy_version 27070 (0.0008) [2023-10-10 05:45:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55476224. Throughput: 0: 1683.4, 1: 1666.9. Samples: 13878618. 
Policy #0 lag: (min: 9.0, avg: 36.4, max: 40.0) [2023-10-10 05:45:56,784][52050] Avg episode reward: [(0, '18.240'), (1, '15.420')] [2023-10-10 05:45:58,240][53252] Updated weights for policy 0, policy_version 27110 (0.0009) [2023-10-10 05:45:58,616][53252] Updated weights for policy 0, policy_version 27120 (0.0010) [2023-10-10 05:45:58,988][53252] Updated weights for policy 0, policy_version 27130 (0.0010) [2023-10-10 05:45:59,477][53268] Updated weights for policy 1, policy_version 27080 (0.0008) [2023-10-10 05:45:59,841][53268] Updated weights for policy 1, policy_version 27090 (0.0007) [2023-10-10 05:46:00,209][53268] Updated weights for policy 1, policy_version 27100 (0.0010) [2023-10-10 05:46:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55541760. Throughput: 0: 1659.3, 1: 1687.9. Samples: 13889054. Policy #0 lag: (min: 9.0, avg: 36.4, max: 40.0) [2023-10-10 05:46:01,784][52050] Avg episode reward: [(0, '19.580'), (1, '15.830')] [2023-10-10 05:46:02,893][53252] Updated weights for policy 0, policy_version 27140 (0.0008) [2023-10-10 05:46:03,259][53252] Updated weights for policy 0, policy_version 27150 (0.0009) [2023-10-10 05:46:03,635][53252] Updated weights for policy 0, policy_version 27160 (0.0009) [2023-10-10 05:46:04,291][53268] Updated weights for policy 1, policy_version 27110 (0.0010) [2023-10-10 05:46:04,655][53268] Updated weights for policy 1, policy_version 27120 (0.0007) [2023-10-10 05:46:05,024][53268] Updated weights for policy 1, policy_version 27130 (0.0007) [2023-10-10 05:46:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55607296. Throughput: 0: 1676.8, 1: 1670.2. Samples: 13908898. 
Policy #0 lag: (min: 9.0, avg: 36.4, max: 40.0) [2023-10-10 05:46:06,784][52050] Avg episode reward: [(0, '19.350'), (1, '17.720')] [2023-10-10 05:46:07,703][53252] Updated weights for policy 0, policy_version 27170 (0.0008) [2023-10-10 05:46:08,079][53252] Updated weights for policy 0, policy_version 27180 (0.0009) [2023-10-10 05:46:08,452][53252] Updated weights for policy 0, policy_version 27190 (0.0008) [2023-10-10 05:46:08,817][53252] Updated weights for policy 0, policy_version 27200 (0.0007) [2023-10-10 05:46:08,881][53268] Updated weights for policy 1, policy_version 27140 (0.0008) [2023-10-10 05:46:09,249][53268] Updated weights for policy 1, policy_version 27150 (0.0010) [2023-10-10 05:46:09,615][53268] Updated weights for policy 1, policy_version 27160 (0.0009) [2023-10-10 05:46:11,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55672832. Throughput: 0: 1677.8, 1: 1695.3. Samples: 13929788. Policy #0 lag: (min: 9.0, avg: 36.4, max: 40.0) [2023-10-10 05:46:11,785][52050] Avg episode reward: [(0, '20.490'), (1, '19.870')] [2023-10-10 05:46:12,833][53252] Updated weights for policy 0, policy_version 27210 (0.0007) [2023-10-10 05:46:13,212][53252] Updated weights for policy 0, policy_version 27220 (0.0007) [2023-10-10 05:46:13,583][53252] Updated weights for policy 0, policy_version 27230 (0.0010) [2023-10-10 05:46:13,668][53268] Updated weights for policy 1, policy_version 27170 (0.0007) [2023-10-10 05:46:14,027][53268] Updated weights for policy 1, policy_version 27180 (0.0009) [2023-10-10 05:46:14,390][53268] Updated weights for policy 1, policy_version 27190 (0.0009) [2023-10-10 05:46:14,755][53268] Updated weights for policy 1, policy_version 27200 (0.0007) [2023-10-10 05:46:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55738368. Throughput: 0: 1677.6, 1: 1688.7. Samples: 13939612. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 05:46:16,784][52050] Avg episode reward: [(0, '20.560'), (1, '18.860')] [2023-10-10 05:46:17,595][53252] Updated weights for policy 0, policy_version 27240 (0.0009) [2023-10-10 05:46:17,973][53252] Updated weights for policy 0, policy_version 27250 (0.0007) [2023-10-10 05:46:18,351][53252] Updated weights for policy 0, policy_version 27260 (0.0008) [2023-10-10 05:46:18,709][53268] Updated weights for policy 1, policy_version 27210 (0.0009) [2023-10-10 05:46:19,071][53268] Updated weights for policy 1, policy_version 27220 (0.0011) [2023-10-10 05:46:19,448][53268] Updated weights for policy 1, policy_version 27230 (0.0009) [2023-10-10 05:46:21,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 55803904. Throughput: 0: 1683.9, 1: 1683.4. Samples: 13959798. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 05:46:21,784][52050] Avg episode reward: [(0, '19.960'), (1, '18.490')] [2023-10-10 05:46:22,294][53252] Updated weights for policy 0, policy_version 27270 (0.0007) [2023-10-10 05:46:22,662][53252] Updated weights for policy 0, policy_version 27280 (0.0008) [2023-10-10 05:46:23,039][53252] Updated weights for policy 0, policy_version 27290 (0.0009) [2023-10-10 05:46:23,588][53268] Updated weights for policy 1, policy_version 27240 (0.0010) [2023-10-10 05:46:23,965][53268] Updated weights for policy 1, policy_version 27250 (0.0009) [2023-10-10 05:46:24,332][53268] Updated weights for policy 1, policy_version 27260 (0.0009) [2023-10-10 05:46:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 55869440. Throughput: 0: 1688.9, 1: 1693.4. Samples: 13980294. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 05:46:26,784][52050] Avg episode reward: [(0, '18.640'), (1, '18.810')] [2023-10-10 05:46:27,019][53252] Updated weights for policy 0, policy_version 27300 (0.0009) [2023-10-10 05:46:27,401][53252] Updated weights for policy 0, policy_version 27310 (0.0008) [2023-10-10 05:46:27,765][53252] Updated weights for policy 0, policy_version 27320 (0.0009) [2023-10-10 05:46:28,493][53268] Updated weights for policy 1, policy_version 27270 (0.0008) [2023-10-10 05:46:28,850][53268] Updated weights for policy 1, policy_version 27280 (0.0008) [2023-10-10 05:46:29,226][53268] Updated weights for policy 1, policy_version 27290 (0.0007) [2023-10-10 05:46:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55934976. Throughput: 0: 1689.6, 1: 1674.0. Samples: 13989782. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 05:46:31,784][52050] Avg episode reward: [(0, '18.260'), (1, '16.220')] [2023-10-10 05:46:31,926][53252] Updated weights for policy 0, policy_version 27330 (0.0008) [2023-10-10 05:46:32,288][53252] Updated weights for policy 0, policy_version 27340 (0.0010) [2023-10-10 05:46:32,656][53252] Updated weights for policy 0, policy_version 27350 (0.0008) [2023-10-10 05:46:33,037][53252] Updated weights for policy 0, policy_version 27360 (0.0007) [2023-10-10 05:46:33,264][53268] Updated weights for policy 1, policy_version 27300 (0.0008) [2023-10-10 05:46:33,634][53268] Updated weights for policy 1, policy_version 27310 (0.0010) [2023-10-10 05:46:34,000][53268] Updated weights for policy 1, policy_version 27320 (0.0008) [2023-10-10 05:46:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56000512. Throughput: 0: 1698.9, 1: 1683.5. Samples: 14010474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:46:36,784][52050] Avg episode reward: [(0, '17.070'), (1, '15.950')] [2023-10-10 05:46:37,075][53252] Updated weights for policy 0, policy_version 27370 (0.0007) [2023-10-10 05:46:37,456][53252] Updated weights for policy 0, policy_version 27380 (0.0008) [2023-10-10 05:46:37,828][53252] Updated weights for policy 0, policy_version 27390 (0.0008) [2023-10-10 05:46:38,024][53268] Updated weights for policy 1, policy_version 27330 (0.0007) [2023-10-10 05:46:38,381][53268] Updated weights for policy 1, policy_version 27340 (0.0008) [2023-10-10 05:46:38,757][53268] Updated weights for policy 1, policy_version 27350 (0.0007) [2023-10-10 05:46:39,115][53268] Updated weights for policy 1, policy_version 27360 (0.0010) [2023-10-10 05:46:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56066048. Throughput: 0: 1698.3, 1: 1696.5. Samples: 14031384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:46:41,784][52050] Avg episode reward: [(0, '17.900'), (1, '15.650')] [2023-10-10 05:46:41,791][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000027360_28016640.pth... [2023-10-10 05:46:41,801][53252] Updated weights for policy 0, policy_version 27400 (0.0007) [2023-10-10 05:46:41,825][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000025792_26411008.pth [2023-10-10 05:46:42,179][53252] Updated weights for policy 0, policy_version 27410 (0.0007) [2023-10-10 05:46:42,552][53252] Updated weights for policy 0, policy_version 27420 (0.0009) [2023-10-10 05:46:42,701][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000027424_28082176.pth... 
[2023-10-10 05:46:42,732][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000025824_26443776.pth [2023-10-10 05:46:43,160][53268] Updated weights for policy 1, policy_version 27370 (0.0010) [2023-10-10 05:46:43,515][53268] Updated weights for policy 1, policy_version 27380 (0.0010) [2023-10-10 05:46:43,885][53268] Updated weights for policy 1, policy_version 27390 (0.0010) [2023-10-10 05:46:46,663][53252] Updated weights for policy 0, policy_version 27430 (0.0008) [2023-10-10 05:46:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 56131584. Throughput: 0: 1702.2, 1: 1666.0. Samples: 14040624. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:46:46,784][52050] Avg episode reward: [(0, '18.590'), (1, '15.950')] [2023-10-10 05:46:47,037][53252] Updated weights for policy 0, policy_version 27440 (0.0009) [2023-10-10 05:46:47,410][53252] Updated weights for policy 0, policy_version 27450 (0.0009) [2023-10-10 05:46:47,988][53268] Updated weights for policy 1, policy_version 27400 (0.0009) [2023-10-10 05:46:48,362][53268] Updated weights for policy 1, policy_version 27410 (0.0009) [2023-10-10 05:46:48,737][53268] Updated weights for policy 1, policy_version 27420 (0.0009) [2023-10-10 05:46:51,339][53252] Updated weights for policy 0, policy_version 27460 (0.0010) [2023-10-10 05:46:51,710][53252] Updated weights for policy 0, policy_version 27470 (0.0009) [2023-10-10 05:46:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56197120. Throughput: 0: 1698.1, 1: 1689.6. Samples: 14061342. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:46:51,784][52050] Avg episode reward: [(0, '18.580'), (1, '17.050')] [2023-10-10 05:46:52,077][53252] Updated weights for policy 0, policy_version 27480 (0.0009) [2023-10-10 05:46:52,866][53268] Updated weights for policy 1, policy_version 27430 (0.0008) [2023-10-10 05:46:53,244][53268] Updated weights for policy 1, policy_version 27440 (0.0008) [2023-10-10 05:46:53,613][53268] Updated weights for policy 1, policy_version 27450 (0.0007) [2023-10-10 05:46:56,044][53252] Updated weights for policy 0, policy_version 27490 (0.0007) [2023-10-10 05:46:56,419][53252] Updated weights for policy 0, policy_version 27500 (0.0011) [2023-10-10 05:46:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56262656. Throughput: 0: 1689.9, 1: 1683.5. Samples: 14081592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:46:56,784][52050] Avg episode reward: [(0, '17.920'), (1, '16.020')] [2023-10-10 05:46:56,786][53252] Updated weights for policy 0, policy_version 27510 (0.0010) [2023-10-10 05:46:57,160][53252] Updated weights for policy 0, policy_version 27520 (0.0008) [2023-10-10 05:46:57,611][53268] Updated weights for policy 1, policy_version 27460 (0.0009) [2023-10-10 05:46:57,975][53268] Updated weights for policy 1, policy_version 27470 (0.0008) [2023-10-10 05:46:58,346][53268] Updated weights for policy 1, policy_version 27480 (0.0010) [2023-10-10 05:47:01,382][53252] Updated weights for policy 0, policy_version 27530 (0.0007) [2023-10-10 05:47:01,751][53252] Updated weights for policy 0, policy_version 27540 (0.0007) [2023-10-10 05:47:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56328192. Throughput: 0: 1698.2, 1: 1669.9. Samples: 14091176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:47:01,784][52050] Avg episode reward: [(0, '20.490'), (1, '16.960')] [2023-10-10 05:47:02,124][53252] Updated weights for policy 0, policy_version 27550 (0.0008) [2023-10-10 05:47:02,468][53268] Updated weights for policy 1, policy_version 27490 (0.0008) [2023-10-10 05:47:02,837][53268] Updated weights for policy 1, policy_version 27500 (0.0009) [2023-10-10 05:47:03,198][53268] Updated weights for policy 1, policy_version 27510 (0.0007) [2023-10-10 05:47:03,570][53268] Updated weights for policy 1, policy_version 27520 (0.0008) [2023-10-10 05:47:06,452][53252] Updated weights for policy 0, policy_version 27560 (0.0008) [2023-10-10 05:47:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 56393728. Throughput: 0: 1692.1, 1: 1686.1. Samples: 14111818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:47:06,785][52050] Avg episode reward: [(0, '20.900'), (1, '17.600')] [2023-10-10 05:47:06,822][53252] Updated weights for policy 0, policy_version 27570 (0.0008) [2023-10-10 05:47:07,197][53252] Updated weights for policy 0, policy_version 27580 (0.0007) [2023-10-10 05:47:07,622][53268] Updated weights for policy 1, policy_version 27530 (0.0010) [2023-10-10 05:47:07,992][53268] Updated weights for policy 1, policy_version 27540 (0.0010) [2023-10-10 05:47:08,369][53268] Updated weights for policy 1, policy_version 27550 (0.0010) [2023-10-10 05:47:11,280][53252] Updated weights for policy 0, policy_version 27590 (0.0008) [2023-10-10 05:47:11,649][53252] Updated weights for policy 0, policy_version 27600 (0.0007) [2023-10-10 05:47:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56459264. Throughput: 0: 1678.4, 1: 1694.4. Samples: 14132072. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:47:11,785][52050] Avg episode reward: [(0, '20.100'), (1, '18.350')] [2023-10-10 05:47:12,029][53252] Updated weights for policy 0, policy_version 27610 (0.0007) [2023-10-10 05:47:12,447][53268] Updated weights for policy 1, policy_version 27560 (0.0009) [2023-10-10 05:47:12,837][53268] Updated weights for policy 1, policy_version 27570 (0.0009) [2023-10-10 05:47:13,198][53268] Updated weights for policy 1, policy_version 27580 (0.0009) [2023-10-10 05:47:15,894][53252] Updated weights for policy 0, policy_version 27620 (0.0007) [2023-10-10 05:47:16,271][53252] Updated weights for policy 0, policy_version 27630 (0.0009) [2023-10-10 05:47:16,648][53252] Updated weights for policy 0, policy_version 27640 (0.0008) [2023-10-10 05:47:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56524800. Throughput: 0: 1692.2, 1: 1678.4. Samples: 14141462. Policy #0 lag: (min: 10.0, avg: 16.6, max: 42.0) [2023-10-10 05:47:16,784][52050] Avg episode reward: [(0, '19.860'), (1, '18.380')] [2023-10-10 05:47:17,330][53268] Updated weights for policy 1, policy_version 27590 (0.0009) [2023-10-10 05:47:17,701][53268] Updated weights for policy 1, policy_version 27600 (0.0008) [2023-10-10 05:47:18,064][53268] Updated weights for policy 1, policy_version 27610 (0.0008) [2023-10-10 05:47:20,721][53252] Updated weights for policy 0, policy_version 27650 (0.0010) [2023-10-10 05:47:21,094][53252] Updated weights for policy 0, policy_version 27660 (0.0008) [2023-10-10 05:47:21,468][53252] Updated weights for policy 0, policy_version 27670 (0.0009) [2023-10-10 05:47:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 56590336. Throughput: 0: 1684.5, 1: 1682.0. Samples: 14161968. 
Policy #0 lag: (min: 10.0, avg: 16.6, max: 42.0) [2023-10-10 05:47:21,785][52050] Avg episode reward: [(0, '18.820'), (1, '17.760')] [2023-10-10 05:47:21,844][53252] Updated weights for policy 0, policy_version 27680 (0.0007) [2023-10-10 05:47:22,220][53268] Updated weights for policy 1, policy_version 27620 (0.0008) [2023-10-10 05:47:22,598][53268] Updated weights for policy 1, policy_version 27630 (0.0009) [2023-10-10 05:47:22,968][53268] Updated weights for policy 1, policy_version 27640 (0.0008) [2023-10-10 05:47:25,858][53252] Updated weights for policy 0, policy_version 27690 (0.0009) [2023-10-10 05:47:26,232][53252] Updated weights for policy 0, policy_version 27700 (0.0010) [2023-10-10 05:47:26,591][53252] Updated weights for policy 0, policy_version 27710 (0.0010) [2023-10-10 05:47:26,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56688640. Throughput: 0: 1663.7, 1: 1686.6. Samples: 14182148. Policy #0 lag: (min: 10.0, avg: 16.6, max: 42.0) [2023-10-10 05:47:26,784][52050] Avg episode reward: [(0, '18.980'), (1, '16.400')] [2023-10-10 05:47:26,902][53268] Updated weights for policy 1, policy_version 27650 (0.0007) [2023-10-10 05:47:27,274][53268] Updated weights for policy 1, policy_version 27660 (0.0009) [2023-10-10 05:47:27,655][53268] Updated weights for policy 1, policy_version 27670 (0.0008) [2023-10-10 05:47:28,013][53268] Updated weights for policy 1, policy_version 27680 (0.0011) [2023-10-10 05:47:30,770][53252] Updated weights for policy 0, policy_version 27720 (0.0010) [2023-10-10 05:47:31,152][53252] Updated weights for policy 0, policy_version 27730 (0.0007) [2023-10-10 05:47:31,535][53252] Updated weights for policy 0, policy_version 27740 (0.0007) [2023-10-10 05:47:31,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56754176. Throughput: 0: 1680.3, 1: 1687.0. Samples: 14192152. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) [2023-10-10 05:47:31,784][52050] Avg episode reward: [(0, '17.800'), (1, '17.040')] [2023-10-10 05:47:32,086][53268] Updated weights for policy 1, policy_version 27690 (0.0008) [2023-10-10 05:47:32,453][53268] Updated weights for policy 1, policy_version 27700 (0.0009) [2023-10-10 05:47:32,826][53268] Updated weights for policy 1, policy_version 27710 (0.0011) [2023-10-10 05:47:35,526][53252] Updated weights for policy 0, policy_version 27750 (0.0009) [2023-10-10 05:47:35,898][53252] Updated weights for policy 0, policy_version 27760 (0.0008) [2023-10-10 05:47:36,273][53252] Updated weights for policy 0, policy_version 27770 (0.0008) [2023-10-10 05:47:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 56819712. Throughput: 0: 1680.2, 1: 1684.6. Samples: 14212756. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) [2023-10-10 05:47:36,784][52050] Avg episode reward: [(0, '19.510'), (1, '18.450')] [2023-10-10 05:47:36,902][53268] Updated weights for policy 1, policy_version 27720 (0.0009) [2023-10-10 05:47:37,257][53268] Updated weights for policy 1, policy_version 27730 (0.0011) [2023-10-10 05:47:37,624][53268] Updated weights for policy 1, policy_version 27740 (0.0011) [2023-10-10 05:47:40,137][53252] Updated weights for policy 0, policy_version 27780 (0.0009) [2023-10-10 05:47:40,509][53252] Updated weights for policy 0, policy_version 27790 (0.0008) [2023-10-10 05:47:40,872][53252] Updated weights for policy 0, policy_version 27800 (0.0010) [2023-10-10 05:47:41,576][53268] Updated weights for policy 1, policy_version 27750 (0.0010) [2023-10-10 05:47:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56885248. Throughput: 0: 1662.1, 1: 1689.0. Samples: 14232388. 
Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) [2023-10-10 05:47:41,784][52050] Avg episode reward: [(0, '20.230'), (1, '18.440')] [2023-10-10 05:47:41,941][53268] Updated weights for policy 1, policy_version 27760 (0.0008) [2023-10-10 05:47:42,314][53268] Updated weights for policy 1, policy_version 27770 (0.0009) [2023-10-10 05:47:44,912][53252] Updated weights for policy 0, policy_version 27810 (0.0010) [2023-10-10 05:47:45,283][53252] Updated weights for policy 0, policy_version 27820 (0.0009) [2023-10-10 05:47:45,654][53252] Updated weights for policy 0, policy_version 27830 (0.0008) [2023-10-10 05:47:46,023][53252] Updated weights for policy 0, policy_version 27840 (0.0007) [2023-10-10 05:47:46,578][53268] Updated weights for policy 1, policy_version 27780 (0.0009) [2023-10-10 05:47:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56950784. Throughput: 0: 1684.8, 1: 1687.0. Samples: 14242906. Policy #0 lag: (min: 10.0, avg: 18.0, max: 42.0) [2023-10-10 05:47:46,784][52050] Avg episode reward: [(0, '18.560'), (1, '18.030')] [2023-10-10 05:47:46,959][53268] Updated weights for policy 1, policy_version 27790 (0.0009) [2023-10-10 05:47:47,320][53268] Updated weights for policy 1, policy_version 27800 (0.0009) [2023-10-10 05:47:50,163][53252] Updated weights for policy 0, policy_version 27850 (0.0007) [2023-10-10 05:47:50,527][53252] Updated weights for policy 0, policy_version 27860 (0.0008) [2023-10-10 05:47:50,893][53252] Updated weights for policy 0, policy_version 27870 (0.0007) [2023-10-10 05:47:51,339][53268] Updated weights for policy 1, policy_version 27810 (0.0010) [2023-10-10 05:47:51,707][53268] Updated weights for policy 1, policy_version 27820 (0.0009) [2023-10-10 05:47:51,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57016320. Throughput: 0: 1673.8, 1: 1682.4. Samples: 14262848. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 05:47:51,785][52050] Avg episode reward: [(0, '20.070'), (1, '18.130')] [2023-10-10 05:47:52,075][53268] Updated weights for policy 1, policy_version 27830 (0.0007) [2023-10-10 05:47:52,434][53268] Updated weights for policy 1, policy_version 27840 (0.0009) [2023-10-10 05:47:55,004][53252] Updated weights for policy 0, policy_version 27880 (0.0007) [2023-10-10 05:47:55,379][53252] Updated weights for policy 0, policy_version 27890 (0.0009) [2023-10-10 05:47:55,750][53252] Updated weights for policy 0, policy_version 27900 (0.0010) [2023-10-10 05:47:56,515][53268] Updated weights for policy 1, policy_version 27850 (0.0008) [2023-10-10 05:47:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57081856. Throughput: 0: 1666.3, 1: 1682.3. Samples: 14282760. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 05:47:56,784][52050] Avg episode reward: [(0, '19.880'), (1, '16.500')] [2023-10-10 05:47:56,876][53268] Updated weights for policy 1, policy_version 27860 (0.0008) [2023-10-10 05:47:57,246][53268] Updated weights for policy 1, policy_version 27870 (0.0008) [2023-10-10 05:47:59,880][53252] Updated weights for policy 0, policy_version 27910 (0.0008) [2023-10-10 05:48:00,248][53252] Updated weights for policy 0, policy_version 27920 (0.0009) [2023-10-10 05:48:00,623][53252] Updated weights for policy 0, policy_version 27930 (0.0010) [2023-10-10 05:48:01,481][53268] Updated weights for policy 1, policy_version 27880 (0.0008) [2023-10-10 05:48:01,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57147392. Throughput: 0: 1680.3, 1: 1688.0. Samples: 14293036. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 05:48:01,785][52050] Avg episode reward: [(0, '18.060'), (1, '16.100')] [2023-10-10 05:48:01,845][53268] Updated weights for policy 1, policy_version 27890 (0.0009) [2023-10-10 05:48:02,226][53268] Updated weights for policy 1, policy_version 27900 (0.0007) [2023-10-10 05:48:04,627][53252] Updated weights for policy 0, policy_version 27940 (0.0010) [2023-10-10 05:48:05,005][53252] Updated weights for policy 0, policy_version 27950 (0.0008) [2023-10-10 05:48:05,378][53252] Updated weights for policy 0, policy_version 27960 (0.0008) [2023-10-10 05:48:06,071][53268] Updated weights for policy 1, policy_version 27910 (0.0008) [2023-10-10 05:48:06,442][53268] Updated weights for policy 1, policy_version 27920 (0.0009) [2023-10-10 05:48:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57212928. Throughput: 0: 1660.9, 1: 1687.6. Samples: 14312648. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 05:48:06,784][52050] Avg episode reward: [(0, '17.860'), (1, '18.330')] [2023-10-10 05:48:06,815][53268] Updated weights for policy 1, policy_version 27930 (0.0008) [2023-10-10 05:48:09,575][53252] Updated weights for policy 0, policy_version 27970 (0.0009) [2023-10-10 05:48:09,941][53252] Updated weights for policy 0, policy_version 27980 (0.0009) [2023-10-10 05:48:10,306][53252] Updated weights for policy 0, policy_version 27990 (0.0009) [2023-10-10 05:48:10,682][53252] Updated weights for policy 0, policy_version 28000 (0.0008) [2023-10-10 05:48:10,950][53268] Updated weights for policy 1, policy_version 27940 (0.0007) [2023-10-10 05:48:11,325][53268] Updated weights for policy 1, policy_version 27950 (0.0009) [2023-10-10 05:48:11,685][53268] Updated weights for policy 1, policy_version 27960 (0.0007) [2023-10-10 05:48:11,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57278464. Throughput: 0: 1668.8, 1: 1674.4. 
Samples: 14332588. Policy #0 lag: (min: 2.0, avg: 7.6, max: 34.0) [2023-10-10 05:48:11,784][52050] Avg episode reward: [(0, '18.380'), (1, '17.850')] [2023-10-10 05:48:14,770][53252] Updated weights for policy 0, policy_version 28010 (0.0010) [2023-10-10 05:48:15,131][53252] Updated weights for policy 0, policy_version 28020 (0.0011) [2023-10-10 05:48:15,511][53252] Updated weights for policy 0, policy_version 28030 (0.0011) [2023-10-10 05:48:15,779][53268] Updated weights for policy 1, policy_version 27970 (0.0007) [2023-10-10 05:48:16,149][53268] Updated weights for policy 1, policy_version 27980 (0.0008) [2023-10-10 05:48:16,519][53268] Updated weights for policy 1, policy_version 27990 (0.0008) [2023-10-10 05:48:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57344000. Throughput: 0: 1678.4, 1: 1683.2. Samples: 14343422. Policy #0 lag: (min: 2.0, avg: 7.6, max: 34.0) [2023-10-10 05:48:16,784][52050] Avg episode reward: [(0, '17.540'), (1, '18.400')] [2023-10-10 05:48:16,885][53268] Updated weights for policy 1, policy_version 28000 (0.0010) [2023-10-10 05:48:19,600][53252] Updated weights for policy 0, policy_version 28040 (0.0009) [2023-10-10 05:48:19,969][53252] Updated weights for policy 0, policy_version 28050 (0.0008) [2023-10-10 05:48:20,339][53252] Updated weights for policy 0, policy_version 28060 (0.0008) [2023-10-10 05:48:20,909][53268] Updated weights for policy 1, policy_version 28010 (0.0008) [2023-10-10 05:48:21,269][53268] Updated weights for policy 1, policy_version 28020 (0.0009) [2023-10-10 05:48:21,643][53268] Updated weights for policy 1, policy_version 28030 (0.0008) [2023-10-10 05:48:21,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 57442304. Throughput: 0: 1654.8, 1: 1686.5. Samples: 14363114. 
Policy #0 lag: (min: 2.0, avg: 7.6, max: 34.0) [2023-10-10 05:48:21,784][52050] Avg episode reward: [(0, '18.800'), (1, '18.740')] [2023-10-10 05:48:24,609][53252] Updated weights for policy 0, policy_version 28070 (0.0008) [2023-10-10 05:48:24,970][53252] Updated weights for policy 0, policy_version 28080 (0.0008) [2023-10-10 05:48:25,340][53252] Updated weights for policy 0, policy_version 28090 (0.0009) [2023-10-10 05:48:25,770][53268] Updated weights for policy 1, policy_version 28040 (0.0009) [2023-10-10 05:48:26,144][53268] Updated weights for policy 1, policy_version 28050 (0.0007) [2023-10-10 05:48:26,518][53268] Updated weights for policy 1, policy_version 28060 (0.0008) [2023-10-10 05:48:26,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57507840. Throughput: 0: 1675.9, 1: 1668.0. Samples: 14382860. Policy #0 lag: (min: 2.0, avg: 7.6, max: 34.0) [2023-10-10 05:48:26,784][52050] Avg episode reward: [(0, '19.390'), (1, '18.910')] [2023-10-10 05:48:29,166][53252] Updated weights for policy 0, policy_version 28100 (0.0008) [2023-10-10 05:48:29,532][53252] Updated weights for policy 0, policy_version 28110 (0.0008) [2023-10-10 05:48:29,904][53252] Updated weights for policy 0, policy_version 28120 (0.0009) [2023-10-10 05:48:30,503][53268] Updated weights for policy 1, policy_version 28070 (0.0009) [2023-10-10 05:48:30,877][53268] Updated weights for policy 1, policy_version 28080 (0.0008) [2023-10-10 05:48:31,247][53268] Updated weights for policy 1, policy_version 28090 (0.0008) [2023-10-10 05:48:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57573376. Throughput: 0: 1668.3, 1: 1681.0. Samples: 14393628. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 05:48:31,784][52050] Avg episode reward: [(0, '18.080'), (1, '18.300')] [2023-10-10 05:48:33,901][53252] Updated weights for policy 0, policy_version 28130 (0.0010) [2023-10-10 05:48:34,279][53252] Updated weights for policy 0, policy_version 28140 (0.0008) [2023-10-10 05:48:34,642][53252] Updated weights for policy 0, policy_version 28150 (0.0008) [2023-10-10 05:48:35,012][53252] Updated weights for policy 0, policy_version 28160 (0.0008) [2023-10-10 05:48:35,208][53268] Updated weights for policy 1, policy_version 28100 (0.0010) [2023-10-10 05:48:35,580][53268] Updated weights for policy 1, policy_version 28110 (0.0009) [2023-10-10 05:48:35,950][53268] Updated weights for policy 1, policy_version 28120 (0.0010) [2023-10-10 05:48:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57638912. Throughput: 0: 1662.2, 1: 1686.1. Samples: 14413520. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 05:48:36,785][52050] Avg episode reward: [(0, '19.380'), (1, '17.160')] [2023-10-10 05:48:39,346][53252] Updated weights for policy 0, policy_version 28170 (0.0008) [2023-10-10 05:48:39,715][53252] Updated weights for policy 0, policy_version 28180 (0.0007) [2023-10-10 05:48:39,880][53268] Updated weights for policy 1, policy_version 28130 (0.0010) [2023-10-10 05:48:40,090][53252] Updated weights for policy 0, policy_version 28190 (0.0008) [2023-10-10 05:48:40,248][53268] Updated weights for policy 1, policy_version 28140 (0.0008) [2023-10-10 05:48:40,625][53268] Updated weights for policy 1, policy_version 28150 (0.0010) [2023-10-10 05:48:40,994][53268] Updated weights for policy 1, policy_version 28160 (0.0010) [2023-10-10 05:48:41,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57704448. Throughput: 0: 1674.1, 1: 1664.8. Samples: 14433010. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 05:48:41,784][52050] Avg episode reward: [(0, '20.300'), (1, '18.370')] [2023-10-10 05:48:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000028192_28868608.pth... [2023-10-10 05:48:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000028160_28835840.pth... [2023-10-10 05:48:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000026592_27230208.pth [2023-10-10 05:48:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000026624_27262976.pth [2023-10-10 05:48:44,090][53252] Updated weights for policy 0, policy_version 28200 (0.0007) [2023-10-10 05:48:44,465][53252] Updated weights for policy 0, policy_version 28210 (0.0007) [2023-10-10 05:48:44,834][53252] Updated weights for policy 0, policy_version 28220 (0.0007) [2023-10-10 05:48:45,160][53268] Updated weights for policy 1, policy_version 28170 (0.0008) [2023-10-10 05:48:45,527][53268] Updated weights for policy 1, policy_version 28180 (0.0010) [2023-10-10 05:48:45,908][53268] Updated weights for policy 1, policy_version 28190 (0.0010) [2023-10-10 05:48:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57769984. Throughput: 0: 1665.0, 1: 1692.7. Samples: 14444132. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 05:48:46,784][52050] Avg episode reward: [(0, '20.240'), (1, '16.680')] [2023-10-10 05:48:48,769][53252] Updated weights for policy 0, policy_version 28230 (0.0009) [2023-10-10 05:48:49,133][53252] Updated weights for policy 0, policy_version 28240 (0.0010) [2023-10-10 05:48:49,512][53252] Updated weights for policy 0, policy_version 28250 (0.0008) [2023-10-10 05:48:50,091][53268] Updated weights for policy 1, policy_version 28200 (0.0010) [2023-10-10 05:48:50,463][53268] Updated weights for policy 1, policy_version 28210 (0.0011) [2023-10-10 05:48:50,825][53268] Updated weights for policy 1, policy_version 28220 (0.0011) [2023-10-10 05:48:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57835520. Throughput: 0: 1672.3, 1: 1680.9. Samples: 14463544. Policy #0 lag: (min: 31.0, avg: 32.6, max: 59.0) [2023-10-10 05:48:51,785][52050] Avg episode reward: [(0, '20.870'), (1, '18.370')] [2023-10-10 05:48:53,491][53252] Updated weights for policy 0, policy_version 28260 (0.0007) [2023-10-10 05:48:53,856][53252] Updated weights for policy 0, policy_version 28270 (0.0010) [2023-10-10 05:48:54,229][53252] Updated weights for policy 0, policy_version 28280 (0.0009) [2023-10-10 05:48:54,989][53268] Updated weights for policy 1, policy_version 28230 (0.0009) [2023-10-10 05:48:55,354][53268] Updated weights for policy 1, policy_version 28240 (0.0007) [2023-10-10 05:48:55,728][53268] Updated weights for policy 1, policy_version 28250 (0.0009) [2023-10-10 05:48:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 57901056. Throughput: 0: 1683.1, 1: 1662.0. Samples: 14483120. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:48:56,784][52050] Avg episode reward: [(0, '20.550'), (1, '18.760')] [2023-10-10 05:48:58,492][53252] Updated weights for policy 0, policy_version 28290 (0.0008) [2023-10-10 05:48:58,863][53252] Updated weights for policy 0, policy_version 28300 (0.0008) [2023-10-10 05:48:59,250][53252] Updated weights for policy 0, policy_version 28310 (0.0007) [2023-10-10 05:48:59,627][53252] Updated weights for policy 0, policy_version 28320 (0.0009) [2023-10-10 05:48:59,787][53268] Updated weights for policy 1, policy_version 28260 (0.0009) [2023-10-10 05:49:00,154][53268] Updated weights for policy 1, policy_version 28270 (0.0008) [2023-10-10 05:49:00,522][53268] Updated weights for policy 1, policy_version 28280 (0.0009) [2023-10-10 05:49:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 57966592. Throughput: 0: 1662.1, 1: 1677.7. Samples: 14493712. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:49:01,785][52050] Avg episode reward: [(0, '19.780'), (1, '18.120')] [2023-10-10 05:49:03,675][53252] Updated weights for policy 0, policy_version 28330 (0.0008) [2023-10-10 05:49:04,043][53252] Updated weights for policy 0, policy_version 28340 (0.0007) [2023-10-10 05:49:04,409][53252] Updated weights for policy 0, policy_version 28350 (0.0007) [2023-10-10 05:49:04,675][53268] Updated weights for policy 1, policy_version 28290 (0.0008) [2023-10-10 05:49:05,045][53268] Updated weights for policy 1, policy_version 28300 (0.0007) [2023-10-10 05:49:05,403][53268] Updated weights for policy 1, policy_version 28310 (0.0008) [2023-10-10 05:49:05,771][53268] Updated weights for policy 1, policy_version 28320 (0.0011) [2023-10-10 05:49:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 58032128. Throughput: 0: 1678.1, 1: 1663.2. Samples: 14513474. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:49:06,784][52050] Avg episode reward: [(0, '19.960'), (1, '18.600')] [2023-10-10 05:49:08,694][53252] Updated weights for policy 0, policy_version 28360 (0.0008) [2023-10-10 05:49:09,061][53252] Updated weights for policy 0, policy_version 28370 (0.0007) [2023-10-10 05:49:09,426][53252] Updated weights for policy 0, policy_version 28380 (0.0007) [2023-10-10 05:49:09,684][53268] Updated weights for policy 1, policy_version 28330 (0.0008) [2023-10-10 05:49:10,047][53268] Updated weights for policy 1, policy_version 28340 (0.0010) [2023-10-10 05:49:10,415][53268] Updated weights for policy 1, policy_version 28350 (0.0008) [2023-10-10 05:49:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 58097664. Throughput: 0: 1679.2, 1: 1667.8. Samples: 14533472. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:49:11,784][52050] Avg episode reward: [(0, '19.160'), (1, '18.030')] [2023-10-10 05:49:13,367][53252] Updated weights for policy 0, policy_version 28390 (0.0009) [2023-10-10 05:49:13,731][53252] Updated weights for policy 0, policy_version 28400 (0.0008) [2023-10-10 05:49:14,102][53252] Updated weights for policy 0, policy_version 28410 (0.0008) [2023-10-10 05:49:14,510][53268] Updated weights for policy 1, policy_version 28360 (0.0008) [2023-10-10 05:49:14,878][53268] Updated weights for policy 1, policy_version 28370 (0.0009) [2023-10-10 05:49:15,236][53268] Updated weights for policy 1, policy_version 28380 (0.0009) [2023-10-10 05:49:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 58163200. Throughput: 0: 1654.6, 1: 1683.0. Samples: 14543820. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:49:16,784][52050] Avg episode reward: [(0, '19.120'), (1, '18.660')] [2023-10-10 05:49:18,060][53252] Updated weights for policy 0, policy_version 28420 (0.0008) [2023-10-10 05:49:18,439][53252] Updated weights for policy 0, policy_version 28430 (0.0007) [2023-10-10 05:49:18,812][53252] Updated weights for policy 0, policy_version 28440 (0.0007) [2023-10-10 05:49:19,252][53268] Updated weights for policy 1, policy_version 28390 (0.0011) [2023-10-10 05:49:19,623][53268] Updated weights for policy 1, policy_version 28400 (0.0010) [2023-10-10 05:49:19,986][53268] Updated weights for policy 1, policy_version 28410 (0.0011) [2023-10-10 05:49:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58228736. Throughput: 0: 1680.9, 1: 1654.3. Samples: 14563602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:49:21,784][52050] Avg episode reward: [(0, '20.240'), (1, '19.330')] [2023-10-10 05:49:22,984][53252] Updated weights for policy 0, policy_version 28450 (0.0008) [2023-10-10 05:49:23,353][53252] Updated weights for policy 0, policy_version 28460 (0.0007) [2023-10-10 05:49:23,726][53252] Updated weights for policy 0, policy_version 28470 (0.0009) [2023-10-10 05:49:24,094][53252] Updated weights for policy 0, policy_version 28480 (0.0007) [2023-10-10 05:49:24,311][53268] Updated weights for policy 1, policy_version 28420 (0.0009) [2023-10-10 05:49:24,682][53268] Updated weights for policy 1, policy_version 28430 (0.0009) [2023-10-10 05:49:25,049][53268] Updated weights for policy 1, policy_version 28440 (0.0010) [2023-10-10 05:49:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58294272. Throughput: 0: 1684.4, 1: 1671.1. Samples: 14584008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:49:26,784][52050] Avg episode reward: [(0, '19.510'), (1, '18.860')] [2023-10-10 05:49:28,156][53252] Updated weights for policy 0, policy_version 28490 (0.0009) [2023-10-10 05:49:28,531][53252] Updated weights for policy 0, policy_version 28500 (0.0011) [2023-10-10 05:49:28,909][53252] Updated weights for policy 0, policy_version 28510 (0.0009) [2023-10-10 05:49:29,129][53268] Updated weights for policy 1, policy_version 28450 (0.0008) [2023-10-10 05:49:29,505][53268] Updated weights for policy 1, policy_version 28460 (0.0010) [2023-10-10 05:49:29,874][53268] Updated weights for policy 1, policy_version 28470 (0.0009) [2023-10-10 05:49:30,244][53268] Updated weights for policy 1, policy_version 28480 (0.0009) [2023-10-10 05:49:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58359808. Throughput: 0: 1665.8, 1: 1672.0. Samples: 14594336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:49:31,784][52050] Avg episode reward: [(0, '19.100'), (1, '17.690')] [2023-10-10 05:49:32,978][53252] Updated weights for policy 0, policy_version 28520 (0.0010) [2023-10-10 05:49:33,348][53252] Updated weights for policy 0, policy_version 28530 (0.0010) [2023-10-10 05:49:33,718][53252] Updated weights for policy 0, policy_version 28540 (0.0008) [2023-10-10 05:49:34,292][53268] Updated weights for policy 1, policy_version 28490 (0.0008) [2023-10-10 05:49:34,654][53268] Updated weights for policy 1, policy_version 28500 (0.0007) [2023-10-10 05:49:35,020][53268] Updated weights for policy 1, policy_version 28510 (0.0008) [2023-10-10 05:49:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58425344. Throughput: 0: 1680.0, 1: 1662.9. Samples: 14613974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:49:36,784][52050] Avg episode reward: [(0, '18.420'), (1, '17.740')] [2023-10-10 05:49:37,680][53252] Updated weights for policy 0, policy_version 28550 (0.0009) [2023-10-10 05:49:38,052][53252] Updated weights for policy 0, policy_version 28560 (0.0011) [2023-10-10 05:49:38,431][53252] Updated weights for policy 0, policy_version 28570 (0.0010) [2023-10-10 05:49:38,941][53268] Updated weights for policy 1, policy_version 28520 (0.0008) [2023-10-10 05:49:39,332][53268] Updated weights for policy 1, policy_version 28530 (0.0007) [2023-10-10 05:49:39,704][53268] Updated weights for policy 1, policy_version 28540 (0.0008) [2023-10-10 05:49:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58490880. Throughput: 0: 1681.0, 1: 1690.0. Samples: 14634816. Policy #0 lag: (min: 1.0, avg: 16.6, max: 33.0) [2023-10-10 05:49:41,784][52050] Avg episode reward: [(0, '17.990'), (1, '17.150')] [2023-10-10 05:49:42,480][53252] Updated weights for policy 0, policy_version 28580 (0.0008) [2023-10-10 05:49:42,853][53252] Updated weights for policy 0, policy_version 28590 (0.0008) [2023-10-10 05:49:43,220][53252] Updated weights for policy 0, policy_version 28600 (0.0008) [2023-10-10 05:49:43,508][53268] Updated weights for policy 1, policy_version 28550 (0.0008) [2023-10-10 05:49:43,876][53268] Updated weights for policy 1, policy_version 28560 (0.0010) [2023-10-10 05:49:44,239][53268] Updated weights for policy 1, policy_version 28570 (0.0011) [2023-10-10 05:49:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58556416. Throughput: 0: 1672.4, 1: 1673.1. Samples: 14644256. 
Policy #0 lag: (min: 1.0, avg: 16.6, max: 33.0) [2023-10-10 05:49:46,785][52050] Avg episode reward: [(0, '18.110'), (1, '18.330')] [2023-10-10 05:49:47,412][53252] Updated weights for policy 0, policy_version 28610 (0.0008) [2023-10-10 05:49:47,788][53252] Updated weights for policy 0, policy_version 28620 (0.0011) [2023-10-10 05:49:48,167][53252] Updated weights for policy 0, policy_version 28630 (0.0009) [2023-10-10 05:49:48,453][53268] Updated weights for policy 1, policy_version 28580 (0.0009) [2023-10-10 05:49:48,529][53252] Updated weights for policy 0, policy_version 28640 (0.0009) [2023-10-10 05:49:48,816][53268] Updated weights for policy 1, policy_version 28590 (0.0010) [2023-10-10 05:49:49,190][53268] Updated weights for policy 1, policy_version 28600 (0.0009) [2023-10-10 05:49:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 58621952. Throughput: 0: 1683.1, 1: 1673.0. Samples: 14664500. Policy #0 lag: (min: 1.0, avg: 16.6, max: 33.0) [2023-10-10 05:49:51,784][52050] Avg episode reward: [(0, '18.900'), (1, '19.130')] [2023-10-10 05:49:52,427][53252] Updated weights for policy 0, policy_version 28650 (0.0009) [2023-10-10 05:49:52,803][53252] Updated weights for policy 0, policy_version 28660 (0.0008) [2023-10-10 05:49:53,177][53252] Updated weights for policy 0, policy_version 28670 (0.0008) [2023-10-10 05:49:53,358][53268] Updated weights for policy 1, policy_version 28610 (0.0009) [2023-10-10 05:49:53,725][53268] Updated weights for policy 1, policy_version 28620 (0.0011) [2023-10-10 05:49:54,088][53268] Updated weights for policy 1, policy_version 28630 (0.0008) [2023-10-10 05:49:54,455][53268] Updated weights for policy 1, policy_version 28640 (0.0007) [2023-10-10 05:49:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58687488. Throughput: 0: 1690.2, 1: 1684.0. Samples: 14685308. 
Policy #0 lag: (min: 1.0, avg: 16.6, max: 33.0) [2023-10-10 05:49:56,784][52050] Avg episode reward: [(0, '18.820'), (1, '18.700')] [2023-10-10 05:49:57,076][53252] Updated weights for policy 0, policy_version 28680 (0.0007) [2023-10-10 05:49:57,436][53252] Updated weights for policy 0, policy_version 28690 (0.0007) [2023-10-10 05:49:57,813][53252] Updated weights for policy 0, policy_version 28700 (0.0007) [2023-10-10 05:49:58,508][53268] Updated weights for policy 1, policy_version 28650 (0.0010) [2023-10-10 05:49:58,867][53268] Updated weights for policy 1, policy_version 28660 (0.0009) [2023-10-10 05:49:59,241][53268] Updated weights for policy 1, policy_version 28670 (0.0008) [2023-10-10 05:50:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 58753024. Throughput: 0: 1692.6, 1: 1662.7. Samples: 14694810. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:50:01,784][52050] Avg episode reward: [(0, '18.780'), (1, '19.350')] [2023-10-10 05:50:01,980][53252] Updated weights for policy 0, policy_version 28710 (0.0007) [2023-10-10 05:50:02,347][53252] Updated weights for policy 0, policy_version 28720 (0.0007) [2023-10-10 05:50:02,723][53252] Updated weights for policy 0, policy_version 28730 (0.0009) [2023-10-10 05:50:03,273][53268] Updated weights for policy 1, policy_version 28680 (0.0010) [2023-10-10 05:50:03,643][53268] Updated weights for policy 1, policy_version 28690 (0.0011) [2023-10-10 05:50:04,004][53268] Updated weights for policy 1, policy_version 28700 (0.0008) [2023-10-10 05:50:06,552][53252] Updated weights for policy 0, policy_version 28740 (0.0008) [2023-10-10 05:50:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58818560. Throughput: 0: 1689.8, 1: 1683.6. Samples: 14715404. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:50:06,784][52050] Avg episode reward: [(0, '19.400'), (1, '18.380')] [2023-10-10 05:50:06,926][53252] Updated weights for policy 0, policy_version 28750 (0.0010) [2023-10-10 05:50:07,287][53252] Updated weights for policy 0, policy_version 28760 (0.0009) [2023-10-10 05:50:08,059][53268] Updated weights for policy 1, policy_version 28710 (0.0009) [2023-10-10 05:50:08,427][53268] Updated weights for policy 1, policy_version 28720 (0.0007) [2023-10-10 05:50:08,795][53268] Updated weights for policy 1, policy_version 28730 (0.0007) [2023-10-10 05:50:11,450][53252] Updated weights for policy 0, policy_version 28770 (0.0011) [2023-10-10 05:50:11,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58884096. Throughput: 0: 1692.4, 1: 1693.2. Samples: 14736358. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:50:11,784][52050] Avg episode reward: [(0, '21.820'), (1, '17.350')] [2023-10-10 05:50:11,820][53252] Updated weights for policy 0, policy_version 28780 (0.0007) [2023-10-10 05:50:12,186][53252] Updated weights for policy 0, policy_version 28790 (0.0010) [2023-10-10 05:50:12,552][52846] Saving new best policy, reward=21.820! [2023-10-10 05:50:12,554][53252] Updated weights for policy 0, policy_version 28800 (0.0010) [2023-10-10 05:50:12,704][53268] Updated weights for policy 1, policy_version 28740 (0.0009) [2023-10-10 05:50:13,062][53268] Updated weights for policy 1, policy_version 28750 (0.0009) [2023-10-10 05:50:13,436][53268] Updated weights for policy 1, policy_version 28760 (0.0008) [2023-10-10 05:50:16,596][53252] Updated weights for policy 0, policy_version 28810 (0.0008) [2023-10-10 05:50:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58949632. Throughput: 0: 1693.6, 1: 1666.0. Samples: 14745516. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:50:16,784][52050] Avg episode reward: [(0, '20.850'), (1, '17.600')] [2023-10-10 05:50:16,954][53252] Updated weights for policy 0, policy_version 28820 (0.0008) [2023-10-10 05:50:17,339][53252] Updated weights for policy 0, policy_version 28830 (0.0009) [2023-10-10 05:50:17,472][53268] Updated weights for policy 1, policy_version 28770 (0.0008) [2023-10-10 05:50:17,853][53268] Updated weights for policy 1, policy_version 28780 (0.0007) [2023-10-10 05:50:18,225][53268] Updated weights for policy 1, policy_version 28790 (0.0008) [2023-10-10 05:50:18,593][53268] Updated weights for policy 1, policy_version 28800 (0.0008) [2023-10-10 05:50:21,259][53252] Updated weights for policy 0, policy_version 28840 (0.0008) [2023-10-10 05:50:21,632][53252] Updated weights for policy 0, policy_version 28850 (0.0007) [2023-10-10 05:50:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59015168. Throughput: 0: 1697.8, 1: 1690.7. Samples: 14766458. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-10 05:50:21,784][52050] Avg episode reward: [(0, '20.170'), (1, '17.660')] [2023-10-10 05:50:22,001][53252] Updated weights for policy 0, policy_version 28860 (0.0007) [2023-10-10 05:50:22,699][53268] Updated weights for policy 1, policy_version 28810 (0.0010) [2023-10-10 05:50:23,075][53268] Updated weights for policy 1, policy_version 28820 (0.0009) [2023-10-10 05:50:23,442][53268] Updated weights for policy 1, policy_version 28830 (0.0010) [2023-10-10 05:50:25,971][53252] Updated weights for policy 0, policy_version 28870 (0.0008) [2023-10-10 05:50:26,334][53252] Updated weights for policy 0, policy_version 28880 (0.0009) [2023-10-10 05:50:26,716][53252] Updated weights for policy 0, policy_version 28890 (0.0007) [2023-10-10 05:50:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 59080704. Throughput: 0: 1686.4, 1: 1692.1. 
Samples: 14786850. Policy #0 lag: (min: 15.0, avg: 34.3, max: 47.0) [2023-10-10 05:50:26,784][52050] Avg episode reward: [(0, '19.850'), (1, '17.600')] [2023-10-10 05:50:27,557][53268] Updated weights for policy 1, policy_version 28840 (0.0009) [2023-10-10 05:50:27,948][53268] Updated weights for policy 1, policy_version 28850 (0.0010) [2023-10-10 05:50:28,306][53268] Updated weights for policy 1, policy_version 28860 (0.0008) [2023-10-10 05:50:30,817][53252] Updated weights for policy 0, policy_version 28900 (0.0010) [2023-10-10 05:50:31,192][53252] Updated weights for policy 0, policy_version 28910 (0.0010) [2023-10-10 05:50:31,568][53252] Updated weights for policy 0, policy_version 28920 (0.0010) [2023-10-10 05:50:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59146240. Throughput: 0: 1705.0, 1: 1680.2. Samples: 14796590. Policy #0 lag: (min: 15.0, avg: 34.3, max: 47.0) [2023-10-10 05:50:31,784][52050] Avg episode reward: [(0, '19.880'), (1, '18.600')] [2023-10-10 05:50:32,534][53268] Updated weights for policy 1, policy_version 28870 (0.0009) [2023-10-10 05:50:32,893][53268] Updated weights for policy 1, policy_version 28880 (0.0008) [2023-10-10 05:50:33,265][53268] Updated weights for policy 1, policy_version 28890 (0.0008) [2023-10-10 05:50:35,810][53252] Updated weights for policy 0, policy_version 28930 (0.0008) [2023-10-10 05:50:36,177][53252] Updated weights for policy 0, policy_version 28940 (0.0007) [2023-10-10 05:50:36,550][53252] Updated weights for policy 0, policy_version 28950 (0.0010) [2023-10-10 05:50:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59211776. Throughput: 0: 1697.6, 1: 1688.7. Samples: 14816884. 
Policy #0 lag: (min: 15.0, avg: 34.3, max: 47.0) [2023-10-10 05:50:36,784][52050] Avg episode reward: [(0, '18.790'), (1, '17.490')] [2023-10-10 05:50:36,921][53252] Updated weights for policy 0, policy_version 28960 (0.0007) [2023-10-10 05:50:37,227][53268] Updated weights for policy 1, policy_version 28900 (0.0008) [2023-10-10 05:50:37,589][53268] Updated weights for policy 1, policy_version 28910 (0.0008) [2023-10-10 05:50:37,957][53268] Updated weights for policy 1, policy_version 28920 (0.0010) [2023-10-10 05:50:40,890][53252] Updated weights for policy 0, policy_version 28970 (0.0007) [2023-10-10 05:50:41,261][53252] Updated weights for policy 0, policy_version 28980 (0.0009) [2023-10-10 05:50:41,638][53252] Updated weights for policy 0, policy_version 28990 (0.0007) [2023-10-10 05:50:41,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59310080. Throughput: 0: 1674.8, 1: 1695.5. Samples: 14836976. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:50:41,785][52050] Avg episode reward: [(0, '19.570'), (1, '18.520')] [2023-10-10 05:50:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000028992_29687808.pth... [2023-10-10 05:50:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000028928_29622272.pth... 
[2023-10-10 05:50:41,830][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000027360_28016640.pth [2023-10-10 05:50:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000027424_28082176.pth [2023-10-10 05:50:41,994][53268] Updated weights for policy 1, policy_version 28930 (0.0010) [2023-10-10 05:50:42,363][53268] Updated weights for policy 1, policy_version 28940 (0.0009) [2023-10-10 05:50:42,732][53268] Updated weights for policy 1, policy_version 28950 (0.0007) [2023-10-10 05:50:43,096][53268] Updated weights for policy 1, policy_version 28960 (0.0007) [2023-10-10 05:50:45,670][53252] Updated weights for policy 0, policy_version 29000 (0.0009) [2023-10-10 05:50:46,051][53252] Updated weights for policy 0, policy_version 29010 (0.0009) [2023-10-10 05:50:46,423][53252] Updated weights for policy 0, policy_version 29020 (0.0007) [2023-10-10 05:50:46,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59375616. Throughput: 0: 1695.2, 1: 1690.0. Samples: 14847144. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:50:46,784][52050] Avg episode reward: [(0, '20.720'), (1, '18.630')] [2023-10-10 05:50:47,008][53268] Updated weights for policy 1, policy_version 28970 (0.0008) [2023-10-10 05:50:47,377][53268] Updated weights for policy 1, policy_version 28980 (0.0010) [2023-10-10 05:50:47,733][53268] Updated weights for policy 1, policy_version 28990 (0.0009) [2023-10-10 05:50:50,424][53252] Updated weights for policy 0, policy_version 29030 (0.0008) [2023-10-10 05:50:50,792][53252] Updated weights for policy 0, policy_version 29040 (0.0008) [2023-10-10 05:50:51,171][53252] Updated weights for policy 0, policy_version 29050 (0.0009) [2023-10-10 05:50:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59441152. Throughput: 0: 1692.1, 1: 1695.5. Samples: 14867848. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:50:51,784][52050] Avg episode reward: [(0, '19.670'), (1, '17.250')] [2023-10-10 05:50:51,834][53268] Updated weights for policy 1, policy_version 29000 (0.0009) [2023-10-10 05:50:52,201][53268] Updated weights for policy 1, policy_version 29010 (0.0009) [2023-10-10 05:50:52,563][53268] Updated weights for policy 1, policy_version 29020 (0.0007) [2023-10-10 05:50:55,329][53252] Updated weights for policy 0, policy_version 29060 (0.0008) [2023-10-10 05:50:55,687][53252] Updated weights for policy 0, policy_version 29070 (0.0007) [2023-10-10 05:50:56,063][53252] Updated weights for policy 0, policy_version 29080 (0.0011) [2023-10-10 05:50:56,561][53268] Updated weights for policy 1, policy_version 29030 (0.0008) [2023-10-10 05:50:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59506688. Throughput: 0: 1668.7, 1: 1692.4. Samples: 14887606. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:50:56,784][52050] Avg episode reward: [(0, '19.080'), (1, '17.690')] [2023-10-10 05:50:56,936][53268] Updated weights for policy 1, policy_version 29040 (0.0007) [2023-10-10 05:50:57,299][53268] Updated weights for policy 1, policy_version 29050 (0.0009) [2023-10-10 05:51:00,047][53252] Updated weights for policy 0, policy_version 29090 (0.0008) [2023-10-10 05:51:00,427][53252] Updated weights for policy 0, policy_version 29100 (0.0010) [2023-10-10 05:51:00,804][53252] Updated weights for policy 0, policy_version 29110 (0.0009) [2023-10-10 05:51:01,181][53252] Updated weights for policy 0, policy_version 29120 (0.0008) [2023-10-10 05:51:01,475][53268] Updated weights for policy 1, policy_version 29060 (0.0010) [2023-10-10 05:51:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59572224. Throughput: 0: 1694.2, 1: 1693.6. Samples: 14897968. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:51:01,784][52050] Avg episode reward: [(0, '18.900'), (1, '17.090')] [2023-10-10 05:51:01,841][53268] Updated weights for policy 1, policy_version 29070 (0.0010) [2023-10-10 05:51:02,214][53268] Updated weights for policy 1, policy_version 29080 (0.0010) [2023-10-10 05:51:05,319][53252] Updated weights for policy 0, policy_version 29130 (0.0010) [2023-10-10 05:51:05,691][53252] Updated weights for policy 0, policy_version 29140 (0.0010) [2023-10-10 05:51:06,063][53252] Updated weights for policy 0, policy_version 29150 (0.0011) [2023-10-10 05:51:06,329][53268] Updated weights for policy 1, policy_version 29090 (0.0009) [2023-10-10 05:51:06,696][53268] Updated weights for policy 1, policy_version 29100 (0.0008) [2023-10-10 05:51:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59637760. Throughput: 0: 1676.8, 1: 1690.8. Samples: 14918002. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:51:06,784][52050] Avg episode reward: [(0, '17.340'), (1, '17.820')] [2023-10-10 05:51:07,065][53268] Updated weights for policy 1, policy_version 29110 (0.0008) [2023-10-10 05:51:07,428][53268] Updated weights for policy 1, policy_version 29120 (0.0011) [2023-10-10 05:51:10,077][53252] Updated weights for policy 0, policy_version 29160 (0.0008) [2023-10-10 05:51:10,451][53252] Updated weights for policy 0, policy_version 29170 (0.0007) [2023-10-10 05:51:10,826][53252] Updated weights for policy 0, policy_version 29180 (0.0008) [2023-10-10 05:51:11,520][53268] Updated weights for policy 1, policy_version 29130 (0.0010) [2023-10-10 05:51:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59703296. Throughput: 0: 1672.5, 1: 1687.7. Samples: 14938060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:51:11,784][52050] Avg episode reward: [(0, '18.280'), (1, '18.190')] [2023-10-10 05:51:11,893][53268] Updated weights for policy 1, policy_version 29140 (0.0009) [2023-10-10 05:51:12,259][53268] Updated weights for policy 1, policy_version 29150 (0.0010) [2023-10-10 05:51:15,003][53252] Updated weights for policy 0, policy_version 29190 (0.0009) [2023-10-10 05:51:15,376][53252] Updated weights for policy 0, policy_version 29200 (0.0009) [2023-10-10 05:51:15,746][53252] Updated weights for policy 0, policy_version 29210 (0.0009) [2023-10-10 05:51:16,325][53268] Updated weights for policy 1, policy_version 29160 (0.0008) [2023-10-10 05:51:16,705][53268] Updated weights for policy 1, policy_version 29170 (0.0007) [2023-10-10 05:51:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59768832. Throughput: 0: 1684.9, 1: 1686.9. Samples: 14948320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:51:16,784][52050] Avg episode reward: [(0, '21.110'), (1, '18.600')] [2023-10-10 05:51:17,078][53268] Updated weights for policy 1, policy_version 29180 (0.0008) [2023-10-10 05:51:19,621][53252] Updated weights for policy 0, policy_version 29220 (0.0009) [2023-10-10 05:51:19,995][53252] Updated weights for policy 0, policy_version 29230 (0.0008) [2023-10-10 05:51:20,378][53252] Updated weights for policy 0, policy_version 29240 (0.0009) [2023-10-10 05:51:21,070][53268] Updated weights for policy 1, policy_version 29190 (0.0011) [2023-10-10 05:51:21,449][53268] Updated weights for policy 1, policy_version 29200 (0.0007) [2023-10-10 05:51:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59834368. Throughput: 0: 1675.5, 1: 1690.0. Samples: 14968334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:51:21,784][52050] Avg episode reward: [(0, '21.470'), (1, '19.200')] [2023-10-10 05:51:21,815][53268] Updated weights for policy 1, policy_version 29210 (0.0008) [2023-10-10 05:51:24,397][53252] Updated weights for policy 0, policy_version 29250 (0.0008) [2023-10-10 05:51:24,768][53252] Updated weights for policy 0, policy_version 29260 (0.0008) [2023-10-10 05:51:25,133][53252] Updated weights for policy 0, policy_version 29270 (0.0008) [2023-10-10 05:51:25,511][53252] Updated weights for policy 0, policy_version 29280 (0.0010) [2023-10-10 05:51:25,917][53268] Updated weights for policy 1, policy_version 29220 (0.0008) [2023-10-10 05:51:26,283][53268] Updated weights for policy 1, policy_version 29230 (0.0010) [2023-10-10 05:51:26,663][53268] Updated weights for policy 1, policy_version 29240 (0.0010) [2023-10-10 05:51:26,784][52050] Fps is (10 sec: 13106.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59899904. Throughput: 0: 1688.7, 1: 1678.6. Samples: 14988504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:51:26,785][52050] Avg episode reward: [(0, '20.720'), (1, '20.040')] [2023-10-10 05:51:26,952][53061] Saving new best policy, reward=20.040! 
[2023-10-10 05:51:29,691][53252] Updated weights for policy 0, policy_version 29290 (0.0009) [2023-10-10 05:51:30,069][53252] Updated weights for policy 0, policy_version 29300 (0.0008) [2023-10-10 05:51:30,441][53252] Updated weights for policy 0, policy_version 29310 (0.0009) [2023-10-10 05:51:30,655][53268] Updated weights for policy 1, policy_version 29250 (0.0010) [2023-10-10 05:51:31,011][53268] Updated weights for policy 1, policy_version 29260 (0.0010) [2023-10-10 05:51:31,382][53268] Updated weights for policy 1, policy_version 29270 (0.0008) [2023-10-10 05:51:31,744][53268] Updated weights for policy 1, policy_version 29280 (0.0007) [2023-10-10 05:51:31,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 59998208. Throughput: 0: 1691.2, 1: 1681.5. Samples: 14998912. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:51:31,784][52050] Avg episode reward: [(0, '18.450'), (1, '18.790')] [2023-10-10 05:51:34,352][53252] Updated weights for policy 0, policy_version 29320 (0.0008) [2023-10-10 05:51:34,724][53252] Updated weights for policy 0, policy_version 29330 (0.0010) [2023-10-10 05:51:35,094][53252] Updated weights for policy 0, policy_version 29340 (0.0010) [2023-10-10 05:51:35,737][53268] Updated weights for policy 1, policy_version 29290 (0.0011) [2023-10-10 05:51:36,099][53268] Updated weights for policy 1, policy_version 29300 (0.0011) [2023-10-10 05:51:36,471][53268] Updated weights for policy 1, policy_version 29310 (0.0011) [2023-10-10 05:51:36,783][52050] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 60063744. Throughput: 0: 1668.1, 1: 1689.3. Samples: 15018932. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:51:36,784][52050] Avg episode reward: [(0, '18.990'), (1, '19.270')] [2023-10-10 05:51:38,988][53252] Updated weights for policy 0, policy_version 29350 (0.0009) [2023-10-10 05:51:39,352][53252] Updated weights for policy 0, policy_version 29360 (0.0007) [2023-10-10 05:51:39,723][53252] Updated weights for policy 0, policy_version 29370 (0.0007) [2023-10-10 05:51:40,531][53268] Updated weights for policy 1, policy_version 29320 (0.0010) [2023-10-10 05:51:40,898][53268] Updated weights for policy 1, policy_version 29330 (0.0012) [2023-10-10 05:51:41,260][53268] Updated weights for policy 1, policy_version 29340 (0.0010) [2023-10-10 05:51:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 60129280. Throughput: 0: 1692.4, 1: 1666.8. Samples: 15038774. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:51:41,784][52050] Avg episode reward: [(0, '19.070'), (1, '19.290')] [2023-10-10 05:51:43,789][53252] Updated weights for policy 0, policy_version 29380 (0.0008) [2023-10-10 05:51:44,166][53252] Updated weights for policy 0, policy_version 29390 (0.0008) [2023-10-10 05:51:44,532][53252] Updated weights for policy 0, policy_version 29400 (0.0008) [2023-10-10 05:51:45,197][53268] Updated weights for policy 1, policy_version 29350 (0.0007) [2023-10-10 05:51:45,570][53268] Updated weights for policy 1, policy_version 29360 (0.0009) [2023-10-10 05:51:45,929][53268] Updated weights for policy 1, policy_version 29370 (0.0008) [2023-10-10 05:51:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60194816. Throughput: 0: 1676.4, 1: 1689.3. Samples: 15049420. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:51:46,784][52050] Avg episode reward: [(0, '19.860'), (1, '18.880')] [2023-10-10 05:51:48,493][53252] Updated weights for policy 0, policy_version 29410 (0.0007) [2023-10-10 05:51:48,857][53252] Updated weights for policy 0, policy_version 29420 (0.0007) [2023-10-10 05:51:49,232][53252] Updated weights for policy 0, policy_version 29430 (0.0007) [2023-10-10 05:51:49,604][53252] Updated weights for policy 0, policy_version 29440 (0.0008) [2023-10-10 05:51:50,001][53268] Updated weights for policy 1, policy_version 29380 (0.0007) [2023-10-10 05:51:50,361][53268] Updated weights for policy 1, policy_version 29390 (0.0008) [2023-10-10 05:51:50,734][53268] Updated weights for policy 1, policy_version 29400 (0.0011) [2023-10-10 05:51:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60260352. Throughput: 0: 1684.7, 1: 1685.4. Samples: 15069658. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) [2023-10-10 05:51:51,784][52050] Avg episode reward: [(0, '20.690'), (1, '18.580')] [2023-10-10 05:51:53,729][53252] Updated weights for policy 0, policy_version 29450 (0.0009) [2023-10-10 05:51:54,104][53252] Updated weights for policy 0, policy_version 29460 (0.0010) [2023-10-10 05:51:54,484][53252] Updated weights for policy 0, policy_version 29470 (0.0007) [2023-10-10 05:51:54,689][53268] Updated weights for policy 1, policy_version 29410 (0.0010) [2023-10-10 05:51:55,046][53268] Updated weights for policy 1, policy_version 29420 (0.0010) [2023-10-10 05:51:55,422][53268] Updated weights for policy 1, policy_version 29430 (0.0011) [2023-10-10 05:51:55,785][53268] Updated weights for policy 1, policy_version 29440 (0.0008) [2023-10-10 05:51:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60325888. Throughput: 0: 1696.0, 1: 1667.3. Samples: 15089410. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 05:51:56,784][52050] Avg episode reward: [(0, '20.500'), (1, '18.210')] [2023-10-10 05:51:58,536][53252] Updated weights for policy 0, policy_version 29480 (0.0008) [2023-10-10 05:51:58,909][53252] Updated weights for policy 0, policy_version 29490 (0.0010) [2023-10-10 05:51:59,285][53252] Updated weights for policy 0, policy_version 29500 (0.0008) [2023-10-10 05:51:59,862][53268] Updated weights for policy 1, policy_version 29450 (0.0011) [2023-10-10 05:52:00,218][53268] Updated weights for policy 1, policy_version 29460 (0.0012) [2023-10-10 05:52:00,580][53268] Updated weights for policy 1, policy_version 29470 (0.0011) [2023-10-10 05:52:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60391424. Throughput: 0: 1669.4, 1: 1698.5. Samples: 15099874. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 05:52:01,784][52050] Avg episode reward: [(0, '20.590'), (1, '17.730')] [2023-10-10 05:52:03,459][53252] Updated weights for policy 0, policy_version 29510 (0.0007) [2023-10-10 05:52:03,822][53252] Updated weights for policy 0, policy_version 29520 (0.0008) [2023-10-10 05:52:04,203][53252] Updated weights for policy 0, policy_version 29530 (0.0007) [2023-10-10 05:52:04,672][53268] Updated weights for policy 1, policy_version 29480 (0.0008) [2023-10-10 05:52:05,035][53268] Updated weights for policy 1, policy_version 29490 (0.0009) [2023-10-10 05:52:05,414][53268] Updated weights for policy 1, policy_version 29500 (0.0008) [2023-10-10 05:52:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60456960. Throughput: 0: 1679.5, 1: 1680.0. Samples: 15119514. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 05:52:06,784][52050] Avg episode reward: [(0, '19.350'), (1, '17.310')] [2023-10-10 05:52:08,242][53252] Updated weights for policy 0, policy_version 29540 (0.0008) [2023-10-10 05:52:08,616][53252] Updated weights for policy 0, policy_version 29550 (0.0007) [2023-10-10 05:52:08,981][53252] Updated weights for policy 0, policy_version 29560 (0.0007) [2023-10-10 05:52:09,525][53268] Updated weights for policy 1, policy_version 29510 (0.0009) [2023-10-10 05:52:09,904][53268] Updated weights for policy 1, policy_version 29520 (0.0009) [2023-10-10 05:52:10,268][53268] Updated weights for policy 1, policy_version 29530 (0.0009) [2023-10-10 05:52:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60522496. Throughput: 0: 1678.8, 1: 1671.3. Samples: 15139256. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 05:52:11,784][52050] Avg episode reward: [(0, '18.770'), (1, '17.750')] [2023-10-10 05:52:13,102][53252] Updated weights for policy 0, policy_version 29570 (0.0007) [2023-10-10 05:52:13,472][53252] Updated weights for policy 0, policy_version 29580 (0.0008) [2023-10-10 05:52:13,850][53252] Updated weights for policy 0, policy_version 29590 (0.0010) [2023-10-10 05:52:14,215][53252] Updated weights for policy 0, policy_version 29600 (0.0008) [2023-10-10 05:52:14,443][53268] Updated weights for policy 1, policy_version 29540 (0.0007) [2023-10-10 05:52:14,806][53268] Updated weights for policy 1, policy_version 29550 (0.0010) [2023-10-10 05:52:15,170][53268] Updated weights for policy 1, policy_version 29560 (0.0009) [2023-10-10 05:52:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 60588032. Throughput: 0: 1653.4, 1: 1693.2. Samples: 15149510. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 05:52:16,784][52050] Avg episode reward: [(0, '19.860'), (1, '16.660')] [2023-10-10 05:52:18,279][53252] Updated weights for policy 0, policy_version 29610 (0.0010) [2023-10-10 05:52:18,642][53252] Updated weights for policy 0, policy_version 29620 (0.0009) [2023-10-10 05:52:19,020][53252] Updated weights for policy 0, policy_version 29630 (0.0008) [2023-10-10 05:52:19,193][53268] Updated weights for policy 1, policy_version 29570 (0.0009) [2023-10-10 05:52:19,559][53268] Updated weights for policy 1, policy_version 29580 (0.0011) [2023-10-10 05:52:19,930][53268] Updated weights for policy 1, policy_version 29590 (0.0010) [2023-10-10 05:52:20,289][53268] Updated weights for policy 1, policy_version 29600 (0.0010) [2023-10-10 05:52:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60653568. Throughput: 0: 1681.7, 1: 1662.8. Samples: 15169436. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 05:52:21,784][52050] Avg episode reward: [(0, '19.100'), (1, '17.660')] [2023-10-10 05:52:22,963][53252] Updated weights for policy 0, policy_version 29640 (0.0007) [2023-10-10 05:52:23,340][53252] Updated weights for policy 0, policy_version 29650 (0.0007) [2023-10-10 05:52:23,705][53252] Updated weights for policy 0, policy_version 29660 (0.0007) [2023-10-10 05:52:24,555][53268] Updated weights for policy 1, policy_version 29610 (0.0009) [2023-10-10 05:52:24,926][53268] Updated weights for policy 1, policy_version 29620 (0.0008) [2023-10-10 05:52:25,290][53268] Updated weights for policy 1, policy_version 29630 (0.0007) [2023-10-10 05:52:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 60719104. Throughput: 0: 1680.4, 1: 1678.5. Samples: 15189928. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 05:52:26,784][52050] Avg episode reward: [(0, '19.830'), (1, '19.200')] [2023-10-10 05:52:27,853][53252] Updated weights for policy 0, policy_version 29670 (0.0007) [2023-10-10 05:52:28,230][53252] Updated weights for policy 0, policy_version 29680 (0.0008) [2023-10-10 05:52:28,598][53252] Updated weights for policy 0, policy_version 29690 (0.0010) [2023-10-10 05:52:29,453][53268] Updated weights for policy 1, policy_version 29640 (0.0008) [2023-10-10 05:52:29,820][53268] Updated weights for policy 1, policy_version 29650 (0.0007) [2023-10-10 05:52:30,193][53268] Updated weights for policy 1, policy_version 29660 (0.0011) [2023-10-10 05:52:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60784640. Throughput: 0: 1670.1, 1: 1684.8. Samples: 15200390. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 05:52:31,784][52050] Avg episode reward: [(0, '20.430'), (1, '19.180')] [2023-10-10 05:52:32,711][53252] Updated weights for policy 0, policy_version 29700 (0.0009) [2023-10-10 05:52:33,092][53252] Updated weights for policy 0, policy_version 29710 (0.0009) [2023-10-10 05:52:33,467][53252] Updated weights for policy 0, policy_version 29720 (0.0008) [2023-10-10 05:52:34,369][53268] Updated weights for policy 1, policy_version 29670 (0.0009) [2023-10-10 05:52:34,743][53268] Updated weights for policy 1, policy_version 29680 (0.0007) [2023-10-10 05:52:35,103][53268] Updated weights for policy 1, policy_version 29690 (0.0008) [2023-10-10 05:52:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60850176. Throughput: 0: 1677.7, 1: 1664.7. Samples: 15220068. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 05:52:36,784][52050] Avg episode reward: [(0, '21.370'), (1, '17.990')] [2023-10-10 05:52:37,720][53252] Updated weights for policy 0, policy_version 29730 (0.0009) [2023-10-10 05:52:38,133][53252] Updated weights for policy 0, policy_version 29740 (0.0010) [2023-10-10 05:52:38,506][53252] Updated weights for policy 0, policy_version 29750 (0.0009) [2023-10-10 05:52:38,871][53252] Updated weights for policy 0, policy_version 29760 (0.0007) [2023-10-10 05:52:39,141][53268] Updated weights for policy 1, policy_version 29700 (0.0009) [2023-10-10 05:52:39,507][53268] Updated weights for policy 1, policy_version 29710 (0.0011) [2023-10-10 05:52:39,873][53268] Updated weights for policy 1, policy_version 29720 (0.0009) [2023-10-10 05:52:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60915712. Throughput: 0: 1677.5, 1: 1680.8. Samples: 15240534. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 05:52:41,784][52050] Avg episode reward: [(0, '20.050'), (1, '18.760')] [2023-10-10 05:52:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000029760_30474240.pth... [2023-10-10 05:52:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000029728_30441472.pth... 
[2023-10-10 05:52:41,837][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000028160_28835840.pth [2023-10-10 05:52:41,839][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000028192_28868608.pth [2023-10-10 05:52:42,987][53252] Updated weights for policy 0, policy_version 29770 (0.0007) [2023-10-10 05:52:43,357][53252] Updated weights for policy 0, policy_version 29780 (0.0010) [2023-10-10 05:52:43,737][53252] Updated weights for policy 0, policy_version 29790 (0.0009) [2023-10-10 05:52:43,769][53268] Updated weights for policy 1, policy_version 29730 (0.0009) [2023-10-10 05:52:44,132][53268] Updated weights for policy 1, policy_version 29740 (0.0011) [2023-10-10 05:52:44,508][53268] Updated weights for policy 1, policy_version 29750 (0.0010) [2023-10-10 05:52:44,879][53268] Updated weights for policy 1, policy_version 29760 (0.0008) [2023-10-10 05:52:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60981248. Throughput: 0: 1674.6, 1: 1673.8. Samples: 15250552. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:52:46,784][52050] Avg episode reward: [(0, '20.080'), (1, '19.730')] [2023-10-10 05:52:47,766][53252] Updated weights for policy 0, policy_version 29800 (0.0007) [2023-10-10 05:52:48,138][53252] Updated weights for policy 0, policy_version 29810 (0.0009) [2023-10-10 05:52:48,509][53252] Updated weights for policy 0, policy_version 29820 (0.0010) [2023-10-10 05:52:48,842][53268] Updated weights for policy 1, policy_version 29770 (0.0007) [2023-10-10 05:52:49,198][53268] Updated weights for policy 1, policy_version 29780 (0.0007) [2023-10-10 05:52:49,562][53268] Updated weights for policy 1, policy_version 29790 (0.0007) [2023-10-10 05:52:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 61046784. Throughput: 0: 1683.0, 1: 1678.1. Samples: 15270760. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:52:51,784][52050] Avg episode reward: [(0, '19.470'), (1, '18.210')] [2023-10-10 05:52:52,573][53252] Updated weights for policy 0, policy_version 29830 (0.0008) [2023-10-10 05:52:52,946][53252] Updated weights for policy 0, policy_version 29840 (0.0009) [2023-10-10 05:52:53,319][53252] Updated weights for policy 0, policy_version 29850 (0.0009) [2023-10-10 05:52:53,720][53268] Updated weights for policy 1, policy_version 29800 (0.0009) [2023-10-10 05:52:54,091][53268] Updated weights for policy 1, policy_version 29810 (0.0009) [2023-10-10 05:52:54,465][53268] Updated weights for policy 1, policy_version 29820 (0.0008) [2023-10-10 05:52:56,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 61112320. Throughput: 0: 1693.3, 1: 1691.3. Samples: 15291564. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:52:56,785][52050] Avg episode reward: [(0, '19.050'), (1, '19.130')] [2023-10-10 05:52:57,202][53252] Updated weights for policy 0, policy_version 29860 (0.0008) [2023-10-10 05:52:57,572][53252] Updated weights for policy 0, policy_version 29870 (0.0010) [2023-10-10 05:52:57,934][53252] Updated weights for policy 0, policy_version 29880 (0.0010) [2023-10-10 05:52:58,425][53268] Updated weights for policy 1, policy_version 29830 (0.0008) [2023-10-10 05:52:58,797][53268] Updated weights for policy 1, policy_version 29840 (0.0009) [2023-10-10 05:52:59,171][53268] Updated weights for policy 1, policy_version 29850 (0.0007) [2023-10-10 05:53:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 61177856. Throughput: 0: 1696.8, 1: 1676.8. Samples: 15301324. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:53:01,784][52050] Avg episode reward: [(0, '18.430'), (1, '18.550')] [2023-10-10 05:53:01,802][53252] Updated weights for policy 0, policy_version 29890 (0.0009) [2023-10-10 05:53:02,169][53252] Updated weights for policy 0, policy_version 29900 (0.0008) [2023-10-10 05:53:02,546][53252] Updated weights for policy 0, policy_version 29910 (0.0008) [2023-10-10 05:53:02,910][53252] Updated weights for policy 0, policy_version 29920 (0.0009) [2023-10-10 05:53:03,045][53268] Updated weights for policy 1, policy_version 29860 (0.0008) [2023-10-10 05:53:03,407][53268] Updated weights for policy 1, policy_version 29870 (0.0009) [2023-10-10 05:53:03,782][53268] Updated weights for policy 1, policy_version 29880 (0.0007) [2023-10-10 05:53:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 61243392. Throughput: 0: 1692.4, 1: 1691.8. Samples: 15321726. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 05:53:06,784][52050] Avg episode reward: [(0, '18.650'), (1, '19.510')] [2023-10-10 05:53:06,953][53252] Updated weights for policy 0, policy_version 29930 (0.0007) [2023-10-10 05:53:07,324][53252] Updated weights for policy 0, policy_version 29940 (0.0007) [2023-10-10 05:53:07,689][53252] Updated weights for policy 0, policy_version 29950 (0.0008) [2023-10-10 05:53:07,872][53268] Updated weights for policy 1, policy_version 29890 (0.0008) [2023-10-10 05:53:08,241][53268] Updated weights for policy 1, policy_version 29900 (0.0010) [2023-10-10 05:53:08,609][53268] Updated weights for policy 1, policy_version 29910 (0.0009) [2023-10-10 05:53:08,983][53268] Updated weights for policy 1, policy_version 29920 (0.0009) [2023-10-10 05:53:11,693][53252] Updated weights for policy 0, policy_version 29960 (0.0007) [2023-10-10 05:53:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 61308928. Throughput: 0: 1693.2, 1: 1699.2. 
Samples: 15342590. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:53:11,784][52050] Avg episode reward: [(0, '19.720'), (1, '19.120')] [2023-10-10 05:53:12,078][53252] Updated weights for policy 0, policy_version 29970 (0.0009) [2023-10-10 05:53:12,446][53252] Updated weights for policy 0, policy_version 29980 (0.0008) [2023-10-10 05:53:13,042][53268] Updated weights for policy 1, policy_version 29930 (0.0009) [2023-10-10 05:53:13,413][53268] Updated weights for policy 1, policy_version 29940 (0.0010) [2023-10-10 05:53:13,781][53268] Updated weights for policy 1, policy_version 29950 (0.0009) [2023-10-10 05:53:16,678][53252] Updated weights for policy 0, policy_version 29990 (0.0009) [2023-10-10 05:53:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61374464. Throughput: 0: 1694.4, 1: 1668.4. Samples: 15351716. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:53:16,784][52050] Avg episode reward: [(0, '18.240'), (1, '19.290')] [2023-10-10 05:53:17,059][53252] Updated weights for policy 0, policy_version 30000 (0.0007) [2023-10-10 05:53:17,430][53252] Updated weights for policy 0, policy_version 30010 (0.0008) [2023-10-10 05:53:17,940][53268] Updated weights for policy 1, policy_version 29960 (0.0008) [2023-10-10 05:53:18,305][53268] Updated weights for policy 1, policy_version 29970 (0.0009) [2023-10-10 05:53:18,660][53268] Updated weights for policy 1, policy_version 29980 (0.0008) [2023-10-10 05:53:21,445][53252] Updated weights for policy 0, policy_version 30020 (0.0009) [2023-10-10 05:53:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61440000. Throughput: 0: 1694.4, 1: 1694.0. Samples: 15372550. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:53:21,784][52050] Avg episode reward: [(0, '18.770'), (1, '19.560')] [2023-10-10 05:53:21,826][53252] Updated weights for policy 0, policy_version 30030 (0.0009) [2023-10-10 05:53:22,196][53252] Updated weights for policy 0, policy_version 30040 (0.0007) [2023-10-10 05:53:22,620][53268] Updated weights for policy 1, policy_version 29990 (0.0008) [2023-10-10 05:53:22,980][53268] Updated weights for policy 1, policy_version 30000 (0.0007) [2023-10-10 05:53:23,351][53268] Updated weights for policy 1, policy_version 30010 (0.0008) [2023-10-10 05:53:26,244][53252] Updated weights for policy 0, policy_version 30050 (0.0007) [2023-10-10 05:53:26,657][53252] Updated weights for policy 0, policy_version 30060 (0.0009) [2023-10-10 05:53:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61505536. Throughput: 0: 1693.1, 1: 1694.9. Samples: 15392996. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:53:26,784][52050] Avg episode reward: [(0, '19.700'), (1, '17.950')] [2023-10-10 05:53:27,029][53252] Updated weights for policy 0, policy_version 30070 (0.0008) [2023-10-10 05:53:27,375][53268] Updated weights for policy 1, policy_version 30020 (0.0007) [2023-10-10 05:53:27,407][53252] Updated weights for policy 0, policy_version 30080 (0.0008) [2023-10-10 05:53:27,750][53268] Updated weights for policy 1, policy_version 30030 (0.0007) [2023-10-10 05:53:28,115][53268] Updated weights for policy 1, policy_version 30040 (0.0008) [2023-10-10 05:53:31,234][53252] Updated weights for policy 0, policy_version 30090 (0.0008) [2023-10-10 05:53:31,612][53252] Updated weights for policy 0, policy_version 30100 (0.0009) [2023-10-10 05:53:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61571072. Throughput: 0: 1695.7, 1: 1675.1. Samples: 15402238. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 05:53:31,784][52050] Avg episode reward: [(0, '18.410'), (1, '18.440')] [2023-10-10 05:53:31,979][53252] Updated weights for policy 0, policy_version 30110 (0.0007) [2023-10-10 05:53:32,228][53268] Updated weights for policy 1, policy_version 30050 (0.0009) [2023-10-10 05:53:32,601][53268] Updated weights for policy 1, policy_version 30060 (0.0010) [2023-10-10 05:53:32,973][53268] Updated weights for policy 1, policy_version 30070 (0.0011) [2023-10-10 05:53:33,343][53268] Updated weights for policy 1, policy_version 30080 (0.0009) [2023-10-10 05:53:35,982][53252] Updated weights for policy 0, policy_version 30120 (0.0008) [2023-10-10 05:53:36,349][53252] Updated weights for policy 0, policy_version 30130 (0.0010) [2023-10-10 05:53:36,725][53252] Updated weights for policy 0, policy_version 30140 (0.0007) [2023-10-10 05:53:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61636608. Throughput: 0: 1702.6, 1: 1684.2. Samples: 15423168. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:53:36,784][52050] Avg episode reward: [(0, '19.480'), (1, '17.240')] [2023-10-10 05:53:37,481][53268] Updated weights for policy 1, policy_version 30090 (0.0009) [2023-10-10 05:53:37,846][53268] Updated weights for policy 1, policy_version 30100 (0.0007) [2023-10-10 05:53:38,211][53268] Updated weights for policy 1, policy_version 30110 (0.0010) [2023-10-10 05:53:40,846][53252] Updated weights for policy 0, policy_version 30150 (0.0008) [2023-10-10 05:53:41,218][53252] Updated weights for policy 0, policy_version 30160 (0.0008) [2023-10-10 05:53:41,591][53252] Updated weights for policy 0, policy_version 30170 (0.0007) [2023-10-10 05:53:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61702144. Throughput: 0: 1681.6, 1: 1686.3. Samples: 15443120. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:53:41,784][52050] Avg episode reward: [(0, '20.810'), (1, '16.590')] [2023-10-10 05:53:42,336][53268] Updated weights for policy 1, policy_version 30120 (0.0008) [2023-10-10 05:53:42,708][53268] Updated weights for policy 1, policy_version 30130 (0.0007) [2023-10-10 05:53:43,086][53268] Updated weights for policy 1, policy_version 30140 (0.0009) [2023-10-10 05:53:45,608][53252] Updated weights for policy 0, policy_version 30180 (0.0009) [2023-10-10 05:53:45,979][53252] Updated weights for policy 0, policy_version 30190 (0.0007) [2023-10-10 05:53:46,353][53252] Updated weights for policy 0, policy_version 30200 (0.0007) [2023-10-10 05:53:46,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61800448. Throughput: 0: 1699.2, 1: 1671.6. Samples: 15453010. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:53:46,784][52050] Avg episode reward: [(0, '20.370'), (1, '16.900')] [2023-10-10 05:53:47,158][53268] Updated weights for policy 1, policy_version 30150 (0.0009) [2023-10-10 05:53:47,523][53268] Updated weights for policy 1, policy_version 30160 (0.0011) [2023-10-10 05:53:47,894][53268] Updated weights for policy 1, policy_version 30170 (0.0009) [2023-10-10 05:53:50,335][53252] Updated weights for policy 0, policy_version 30210 (0.0009) [2023-10-10 05:53:50,700][53252] Updated weights for policy 0, policy_version 30220 (0.0009) [2023-10-10 05:53:51,076][53252] Updated weights for policy 0, policy_version 30230 (0.0009) [2023-10-10 05:53:51,432][53252] Updated weights for policy 0, policy_version 30240 (0.0007) [2023-10-10 05:53:51,784][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61865984. Throughput: 0: 1695.5, 1: 1678.9. Samples: 15473574. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:53:51,785][52050] Avg episode reward: [(0, '19.380'), (1, '17.780')] [2023-10-10 05:53:52,100][53268] Updated weights for policy 1, policy_version 30180 (0.0007) [2023-10-10 05:53:52,468][53268] Updated weights for policy 1, policy_version 30190 (0.0008) [2023-10-10 05:53:52,835][53268] Updated weights for policy 1, policy_version 30200 (0.0009) [2023-10-10 05:53:55,353][53252] Updated weights for policy 0, policy_version 30250 (0.0010) [2023-10-10 05:53:55,730][53252] Updated weights for policy 0, policy_version 30260 (0.0008) [2023-10-10 05:53:56,096][53252] Updated weights for policy 0, policy_version 30270 (0.0008) [2023-10-10 05:53:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61931520. Throughput: 0: 1669.9, 1: 1678.7. Samples: 15493276. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 05:53:56,784][52050] Avg episode reward: [(0, '20.410'), (1, '17.420')] [2023-10-10 05:53:56,864][53268] Updated weights for policy 1, policy_version 30210 (0.0008) [2023-10-10 05:53:57,230][53268] Updated weights for policy 1, policy_version 30220 (0.0007) [2023-10-10 05:53:57,603][53268] Updated weights for policy 1, policy_version 30230 (0.0009) [2023-10-10 05:53:57,971][53268] Updated weights for policy 1, policy_version 30240 (0.0008) [2023-10-10 05:54:00,162][53252] Updated weights for policy 0, policy_version 30280 (0.0010) [2023-10-10 05:54:00,535][53252] Updated weights for policy 0, policy_version 30290 (0.0007) [2023-10-10 05:54:00,902][53252] Updated weights for policy 0, policy_version 30300 (0.0008) [2023-10-10 05:54:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61997056. Throughput: 0: 1700.9, 1: 1678.0. Samples: 15503768. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 05:54:01,784][52050] Avg episode reward: [(0, '19.330'), (1, '18.260')] [2023-10-10 05:54:02,245][53268] Updated weights for policy 1, policy_version 30250 (0.0008) [2023-10-10 05:54:02,619][53268] Updated weights for policy 1, policy_version 30260 (0.0007) [2023-10-10 05:54:02,982][53268] Updated weights for policy 1, policy_version 30270 (0.0008) [2023-10-10 05:54:04,903][53252] Updated weights for policy 0, policy_version 30310 (0.0007) [2023-10-10 05:54:05,278][53252] Updated weights for policy 0, policy_version 30320 (0.0008) [2023-10-10 05:54:05,638][53252] Updated weights for policy 0, policy_version 30330 (0.0010) [2023-10-10 05:54:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 62062592. Throughput: 0: 1685.2, 1: 1673.1. Samples: 15523674. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 05:54:06,784][52050] Avg episode reward: [(0, '18.770'), (1, '17.890')] [2023-10-10 05:54:07,091][53268] Updated weights for policy 1, policy_version 30280 (0.0010) [2023-10-10 05:54:07,458][53268] Updated weights for policy 1, policy_version 30290 (0.0010) [2023-10-10 05:54:07,819][53268] Updated weights for policy 1, policy_version 30300 (0.0010) [2023-10-10 05:54:09,536][53252] Updated weights for policy 0, policy_version 30340 (0.0009) [2023-10-10 05:54:09,909][53252] Updated weights for policy 0, policy_version 30350 (0.0008) [2023-10-10 05:54:10,292][53252] Updated weights for policy 0, policy_version 30360 (0.0007) [2023-10-10 05:54:11,708][53268] Updated weights for policy 1, policy_version 30310 (0.0010) [2023-10-10 05:54:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62128128. Throughput: 0: 1684.0, 1: 1675.4. Samples: 15544168. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 05:54:11,784][52050] Avg episode reward: [(0, '20.450'), (1, '18.220')] [2023-10-10 05:54:12,081][53268] Updated weights for policy 1, policy_version 30320 (0.0008) [2023-10-10 05:54:12,436][53268] Updated weights for policy 1, policy_version 30330 (0.0007) [2023-10-10 05:54:14,391][53252] Updated weights for policy 0, policy_version 30370 (0.0009) [2023-10-10 05:54:14,784][53252] Updated weights for policy 0, policy_version 30380 (0.0009) [2023-10-10 05:54:15,155][53252] Updated weights for policy 0, policy_version 30390 (0.0007) [2023-10-10 05:54:15,519][53252] Updated weights for policy 0, policy_version 30400 (0.0009) [2023-10-10 05:54:16,607][53268] Updated weights for policy 1, policy_version 30340 (0.0008) [2023-10-10 05:54:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62193664. Throughput: 0: 1706.8, 1: 1673.3. Samples: 15554344. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 05:54:16,784][52050] Avg episode reward: [(0, '20.240'), (1, '18.450')] [2023-10-10 05:54:16,979][53268] Updated weights for policy 1, policy_version 30350 (0.0009) [2023-10-10 05:54:17,352][53268] Updated weights for policy 1, policy_version 30360 (0.0008) [2023-10-10 05:54:19,556][53252] Updated weights for policy 0, policy_version 30410 (0.0011) [2023-10-10 05:54:19,919][53252] Updated weights for policy 0, policy_version 30420 (0.0010) [2023-10-10 05:54:20,290][53252] Updated weights for policy 0, policy_version 30430 (0.0007) [2023-10-10 05:54:21,384][53268] Updated weights for policy 1, policy_version 30370 (0.0008) [2023-10-10 05:54:21,761][53268] Updated weights for policy 1, policy_version 30380 (0.0011) [2023-10-10 05:54:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62259200. Throughput: 0: 1673.2, 1: 1677.9. Samples: 15573968. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 05:54:21,784][52050] Avg episode reward: [(0, '19.330'), (1, '17.930')] [2023-10-10 05:54:22,135][53268] Updated weights for policy 1, policy_version 30390 (0.0009) [2023-10-10 05:54:22,503][53268] Updated weights for policy 1, policy_version 30400 (0.0008) [2023-10-10 05:54:24,428][53252] Updated weights for policy 0, policy_version 30440 (0.0009) [2023-10-10 05:54:24,807][53252] Updated weights for policy 0, policy_version 30450 (0.0007) [2023-10-10 05:54:25,173][53252] Updated weights for policy 0, policy_version 30460 (0.0008) [2023-10-10 05:54:26,750][53268] Updated weights for policy 1, policy_version 30410 (0.0008) [2023-10-10 05:54:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62324736. Throughput: 0: 1682.3, 1: 1674.6. Samples: 15594180. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 05:54:26,784][52050] Avg episode reward: [(0, '21.520'), (1, '19.510')] [2023-10-10 05:54:27,127][53268] Updated weights for policy 1, policy_version 30420 (0.0011) [2023-10-10 05:54:27,492][53268] Updated weights for policy 1, policy_version 30430 (0.0010) [2023-10-10 05:54:29,389][53252] Updated weights for policy 0, policy_version 30470 (0.0008) [2023-10-10 05:54:29,756][53252] Updated weights for policy 0, policy_version 30480 (0.0008) [2023-10-10 05:54:30,128][53252] Updated weights for policy 0, policy_version 30490 (0.0007) [2023-10-10 05:54:31,520][53268] Updated weights for policy 1, policy_version 30440 (0.0009) [2023-10-10 05:54:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62390272. Throughput: 0: 1685.9, 1: 1673.8. Samples: 15604196. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 05:54:31,784][52050] Avg episode reward: [(0, '19.440'), (1, '18.850')] [2023-10-10 05:54:31,892][53268] Updated weights for policy 1, policy_version 30450 (0.0010) [2023-10-10 05:54:32,263][53268] Updated weights for policy 1, policy_version 30460 (0.0008) [2023-10-10 05:54:34,228][53252] Updated weights for policy 0, policy_version 30500 (0.0007) [2023-10-10 05:54:34,604][53252] Updated weights for policy 0, policy_version 30510 (0.0008) [2023-10-10 05:54:34,978][53252] Updated weights for policy 0, policy_version 30520 (0.0010) [2023-10-10 05:54:36,304][53268] Updated weights for policy 1, policy_version 30470 (0.0009) [2023-10-10 05:54:36,659][53268] Updated weights for policy 1, policy_version 30480 (0.0011) [2023-10-10 05:54:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62455808. Throughput: 0: 1667.6, 1: 1676.2. Samples: 15624044. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 05:54:36,784][52050] Avg episode reward: [(0, '18.520'), (1, '18.200')] [2023-10-10 05:54:37,026][53268] Updated weights for policy 1, policy_version 30490 (0.0011) [2023-10-10 05:54:38,882][53252] Updated weights for policy 0, policy_version 30530 (0.0010) [2023-10-10 05:54:39,247][53252] Updated weights for policy 0, policy_version 30540 (0.0008) [2023-10-10 05:54:39,614][53252] Updated weights for policy 0, policy_version 30550 (0.0008) [2023-10-10 05:54:39,984][53252] Updated weights for policy 0, policy_version 30560 (0.0008) [2023-10-10 05:54:41,156][53268] Updated weights for policy 1, policy_version 30500 (0.0009) [2023-10-10 05:54:41,516][53268] Updated weights for policy 1, policy_version 30510 (0.0008) [2023-10-10 05:54:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62521344. Throughput: 0: 1693.4, 1: 1669.3. Samples: 15644600. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 05:54:41,784][52050] Avg episode reward: [(0, '20.890'), (1, '19.500')] [2023-10-10 05:54:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000030560_31293440.pth... [2023-10-10 05:54:41,837][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000028992_29687808.pth [2023-10-10 05:54:41,891][53268] Updated weights for policy 1, policy_version 30520 (0.0011) [2023-10-10 05:54:42,180][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000030528_31260672.pth... [2023-10-10 05:54:42,220][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000028928_29622272.pth [2023-10-10 05:54:43,815][53252] Updated weights for policy 0, policy_version 30570 (0.0008) [2023-10-10 05:54:44,174][53252] Updated weights for policy 0, policy_version 30580 (0.0008) [2023-10-10 05:54:44,557][53252] Updated weights for policy 0, policy_version 30590 (0.0007) [2023-10-10 05:54:45,968][53268] Updated weights for policy 1, policy_version 30530 (0.0008) [2023-10-10 05:54:46,343][53268] Updated weights for policy 1, policy_version 30540 (0.0009) [2023-10-10 05:54:46,711][53268] Updated weights for policy 1, policy_version 30550 (0.0008) [2023-10-10 05:54:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62586880. Throughput: 0: 1673.0, 1: 1672.6. Samples: 15654320. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:54:46,784][52050] Avg episode reward: [(0, '16.620'), (1, '19.100')] [2023-10-10 05:54:47,080][53268] Updated weights for policy 1, policy_version 30560 (0.0007) [2023-10-10 05:54:48,541][53252] Updated weights for policy 0, policy_version 30600 (0.0009) [2023-10-10 05:54:48,915][53252] Updated weights for policy 0, policy_version 30610 (0.0007) [2023-10-10 05:54:49,293][53252] Updated weights for policy 0, policy_version 30620 (0.0009) [2023-10-10 05:54:51,151][53268] Updated weights for policy 1, policy_version 30570 (0.0007) [2023-10-10 05:54:51,515][53268] Updated weights for policy 1, policy_version 30580 (0.0007) [2023-10-10 05:54:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62652416. Throughput: 0: 1683.9, 1: 1674.1. Samples: 15674782. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:54:51,784][52050] Avg episode reward: [(0, '17.040'), (1, '19.270')] [2023-10-10 05:54:51,882][53268] Updated weights for policy 1, policy_version 30590 (0.0010) [2023-10-10 05:54:53,412][53252] Updated weights for policy 0, policy_version 30630 (0.0009) [2023-10-10 05:54:53,782][53252] Updated weights for policy 0, policy_version 30640 (0.0009) [2023-10-10 05:54:54,165][53252] Updated weights for policy 0, policy_version 30650 (0.0011) [2023-10-10 05:54:55,950][53268] Updated weights for policy 1, policy_version 30600 (0.0008) [2023-10-10 05:54:56,316][53268] Updated weights for policy 1, policy_version 30610 (0.0009) [2023-10-10 05:54:56,687][53268] Updated weights for policy 1, policy_version 30620 (0.0007) [2023-10-10 05:54:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62717952. Throughput: 0: 1687.8, 1: 1660.4. Samples: 15694836. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:54:56,784][52050] Avg episode reward: [(0, '20.330'), (1, '18.170')] [2023-10-10 05:54:58,336][53252] Updated weights for policy 0, policy_version 30660 (0.0008) [2023-10-10 05:54:58,716][53252] Updated weights for policy 0, policy_version 30670 (0.0008) [2023-10-10 05:54:59,093][53252] Updated weights for policy 0, policy_version 30680 (0.0008) [2023-10-10 05:55:00,795][53268] Updated weights for policy 1, policy_version 30630 (0.0008) [2023-10-10 05:55:01,159][53268] Updated weights for policy 1, policy_version 30640 (0.0008) [2023-10-10 05:55:01,532][53268] Updated weights for policy 1, policy_version 30650 (0.0007) [2023-10-10 05:55:01,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 62816256. Throughput: 0: 1665.1, 1: 1676.6. Samples: 15704722. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:55:01,784][52050] Avg episode reward: [(0, '18.700'), (1, '18.240')] [2023-10-10 05:55:03,164][53252] Updated weights for policy 0, policy_version 30690 (0.0008) [2023-10-10 05:55:03,568][53252] Updated weights for policy 0, policy_version 30700 (0.0008) [2023-10-10 05:55:03,943][53252] Updated weights for policy 0, policy_version 30710 (0.0010) [2023-10-10 05:55:04,316][53252] Updated weights for policy 0, policy_version 30720 (0.0009) [2023-10-10 05:55:05,581][53268] Updated weights for policy 1, policy_version 30660 (0.0009) [2023-10-10 05:55:05,955][53268] Updated weights for policy 1, policy_version 30670 (0.0007) [2023-10-10 05:55:06,318][53268] Updated weights for policy 1, policy_version 30680 (0.0010) [2023-10-10 05:55:06,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 62881792. Throughput: 0: 1684.3, 1: 1680.8. Samples: 15725396. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:55:06,784][52050] Avg episode reward: [(0, '19.690'), (1, '18.250')] [2023-10-10 05:55:08,304][53252] Updated weights for policy 0, policy_version 30730 (0.0009) [2023-10-10 05:55:08,675][53252] Updated weights for policy 0, policy_version 30740 (0.0008) [2023-10-10 05:55:09,042][53252] Updated weights for policy 0, policy_version 30750 (0.0007) [2023-10-10 05:55:10,452][53268] Updated weights for policy 1, policy_version 30690 (0.0010) [2023-10-10 05:55:10,815][53268] Updated weights for policy 1, policy_version 30700 (0.0010) [2023-10-10 05:55:11,178][53268] Updated weights for policy 1, policy_version 30710 (0.0008) [2023-10-10 05:55:11,554][53268] Updated weights for policy 1, policy_version 30720 (0.0008) [2023-10-10 05:55:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 62947328. Throughput: 0: 1694.0, 1: 1662.0. Samples: 15745198. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 05:55:11,784][52050] Avg episode reward: [(0, '19.810'), (1, '18.180')] [2023-10-10 05:55:13,201][53252] Updated weights for policy 0, policy_version 30760 (0.0008) [2023-10-10 05:55:13,576][53252] Updated weights for policy 0, policy_version 30770 (0.0007) [2023-10-10 05:55:13,954][53252] Updated weights for policy 0, policy_version 30780 (0.0008) [2023-10-10 05:55:15,592][53268] Updated weights for policy 1, policy_version 30730 (0.0010) [2023-10-10 05:55:15,967][53268] Updated weights for policy 1, policy_version 30740 (0.0010) [2023-10-10 05:55:16,336][53268] Updated weights for policy 1, policy_version 30750 (0.0008) [2023-10-10 05:55:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 63012864. Throughput: 0: 1669.1, 1: 1683.4. Samples: 15755058. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 05:55:16,784][52050] Avg episode reward: [(0, '18.720'), (1, '17.320')] [2023-10-10 05:55:17,906][53252] Updated weights for policy 0, policy_version 30790 (0.0007) [2023-10-10 05:55:18,273][53252] Updated weights for policy 0, policy_version 30800 (0.0007) [2023-10-10 05:55:18,646][53252] Updated weights for policy 0, policy_version 30810 (0.0009) [2023-10-10 05:55:20,385][53268] Updated weights for policy 1, policy_version 30760 (0.0008) [2023-10-10 05:55:20,757][53268] Updated weights for policy 1, policy_version 30770 (0.0010) [2023-10-10 05:55:21,121][53268] Updated weights for policy 1, policy_version 30780 (0.0009) [2023-10-10 05:55:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 63078400. Throughput: 0: 1690.8, 1: 1679.8. Samples: 15775720. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 05:55:21,784][52050] Avg episode reward: [(0, '19.170'), (1, '18.560')] [2023-10-10 05:55:22,755][53252] Updated weights for policy 0, policy_version 30820 (0.0008) [2023-10-10 05:55:23,131][53252] Updated weights for policy 0, policy_version 30830 (0.0008) [2023-10-10 05:55:23,507][53252] Updated weights for policy 0, policy_version 30840 (0.0007) [2023-10-10 05:55:25,182][53268] Updated weights for policy 1, policy_version 30790 (0.0009) [2023-10-10 05:55:25,546][53268] Updated weights for policy 1, policy_version 30800 (0.0010) [2023-10-10 05:55:25,928][53268] Updated weights for policy 1, policy_version 30810 (0.0010) [2023-10-10 05:55:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 63143936. Throughput: 0: 1688.9, 1: 1660.6. Samples: 15795330. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 05:55:26,784][52050] Avg episode reward: [(0, '19.830'), (1, '17.760')] [2023-10-10 05:55:27,486][53252] Updated weights for policy 0, policy_version 30850 (0.0007) [2023-10-10 05:55:27,855][53252] Updated weights for policy 0, policy_version 30860 (0.0008) [2023-10-10 05:55:28,222][53252] Updated weights for policy 0, policy_version 30870 (0.0009) [2023-10-10 05:55:28,591][53252] Updated weights for policy 0, policy_version 30880 (0.0009) [2023-10-10 05:55:29,930][53268] Updated weights for policy 1, policy_version 30820 (0.0011) [2023-10-10 05:55:30,295][53268] Updated weights for policy 1, policy_version 30830 (0.0010) [2023-10-10 05:55:30,668][53268] Updated weights for policy 1, policy_version 30840 (0.0011) [2023-10-10 05:55:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 63209472. Throughput: 0: 1679.5, 1: 1686.6. Samples: 15805798. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 05:55:31,784][52050] Avg episode reward: [(0, '20.230'), (1, '17.290')] [2023-10-10 05:55:32,601][53252] Updated weights for policy 0, policy_version 30890 (0.0007) [2023-10-10 05:55:32,962][53252] Updated weights for policy 0, policy_version 30900 (0.0008) [2023-10-10 05:55:33,336][53252] Updated weights for policy 0, policy_version 30910 (0.0008) [2023-10-10 05:55:34,634][53268] Updated weights for policy 1, policy_version 30850 (0.0009) [2023-10-10 05:55:34,998][53268] Updated weights for policy 1, policy_version 30860 (0.0007) [2023-10-10 05:55:35,362][53268] Updated weights for policy 1, policy_version 30870 (0.0008) [2023-10-10 05:55:35,735][53268] Updated weights for policy 1, policy_version 30880 (0.0011) [2023-10-10 05:55:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 63275008. Throughput: 0: 1687.7, 1: 1678.9. Samples: 15826280. 
Policy #0 lag: (min: 26.0, avg: 26.9, max: 46.0) [2023-10-10 05:55:36,784][52050] Avg episode reward: [(0, '20.920'), (1, '18.550')] [2023-10-10 05:55:37,203][53252] Updated weights for policy 0, policy_version 30920 (0.0009) [2023-10-10 05:55:37,580][53252] Updated weights for policy 0, policy_version 30930 (0.0009) [2023-10-10 05:55:37,951][53252] Updated weights for policy 0, policy_version 30940 (0.0009) [2023-10-10 05:55:39,571][53268] Updated weights for policy 1, policy_version 30890 (0.0010) [2023-10-10 05:55:39,938][53268] Updated weights for policy 1, policy_version 30900 (0.0007) [2023-10-10 05:55:40,303][53268] Updated weights for policy 1, policy_version 30910 (0.0009) [2023-10-10 05:55:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63340544. Throughput: 0: 1690.2, 1: 1683.2. Samples: 15846636. Policy #0 lag: (min: 26.0, avg: 26.9, max: 46.0) [2023-10-10 05:55:41,784][52050] Avg episode reward: [(0, '20.880'), (1, '18.680')] [2023-10-10 05:55:41,998][53252] Updated weights for policy 0, policy_version 30950 (0.0009) [2023-10-10 05:55:42,367][53252] Updated weights for policy 0, policy_version 30960 (0.0007) [2023-10-10 05:55:42,742][53252] Updated weights for policy 0, policy_version 30970 (0.0008) [2023-10-10 05:55:44,313][53268] Updated weights for policy 1, policy_version 30920 (0.0009) [2023-10-10 05:55:44,683][53268] Updated weights for policy 1, policy_version 30930 (0.0008) [2023-10-10 05:55:45,053][53268] Updated weights for policy 1, policy_version 30940 (0.0008) [2023-10-10 05:55:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63406080. Throughput: 0: 1688.7, 1: 1695.0. Samples: 15856986. 
Policy #0 lag: (min: 26.0, avg: 26.9, max: 46.0) [2023-10-10 05:55:46,784][52050] Avg episode reward: [(0, '20.150'), (1, '17.890')] [2023-10-10 05:55:46,834][53252] Updated weights for policy 0, policy_version 30980 (0.0007) [2023-10-10 05:55:47,196][53252] Updated weights for policy 0, policy_version 30990 (0.0007) [2023-10-10 05:55:47,574][53252] Updated weights for policy 0, policy_version 31000 (0.0007) [2023-10-10 05:55:49,093][53268] Updated weights for policy 1, policy_version 30950 (0.0008) [2023-10-10 05:55:49,467][53268] Updated weights for policy 1, policy_version 30960 (0.0007) [2023-10-10 05:55:49,827][53268] Updated weights for policy 1, policy_version 30970 (0.0007) [2023-10-10 05:55:51,723][53252] Updated weights for policy 0, policy_version 31010 (0.0008) [2023-10-10 05:55:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 63471616. Throughput: 0: 1694.0, 1: 1664.0. Samples: 15876502. Policy #0 lag: (min: 26.0, avg: 26.9, max: 46.0) [2023-10-10 05:55:51,784][52050] Avg episode reward: [(0, '20.240'), (1, '16.810')] [2023-10-10 05:55:52,134][53252] Updated weights for policy 0, policy_version 31020 (0.0008) [2023-10-10 05:55:52,505][53252] Updated weights for policy 0, policy_version 31030 (0.0009) [2023-10-10 05:55:52,868][53252] Updated weights for policy 0, policy_version 31040 (0.0009) [2023-10-10 05:55:53,870][53268] Updated weights for policy 1, policy_version 30980 (0.0010) [2023-10-10 05:55:54,236][53268] Updated weights for policy 1, policy_version 30990 (0.0012) [2023-10-10 05:55:54,600][53268] Updated weights for policy 1, policy_version 31000 (0.0011) [2023-10-10 05:55:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63537152. Throughput: 0: 1686.7, 1: 1685.4. Samples: 15896942. 
Policy #0 lag: (min: 26.0, avg: 26.9, max: 46.0) [2023-10-10 05:55:56,784][52050] Avg episode reward: [(0, '20.640'), (1, '17.130')] [2023-10-10 05:55:57,096][53252] Updated weights for policy 0, policy_version 31050 (0.0008) [2023-10-10 05:55:57,466][53252] Updated weights for policy 0, policy_version 31060 (0.0009) [2023-10-10 05:55:57,857][53252] Updated weights for policy 0, policy_version 31070 (0.0010) [2023-10-10 05:55:58,553][53268] Updated weights for policy 1, policy_version 31010 (0.0010) [2023-10-10 05:55:58,924][53268] Updated weights for policy 1, policy_version 31020 (0.0008) [2023-10-10 05:55:59,279][53268] Updated weights for policy 1, policy_version 31030 (0.0009) [2023-10-10 05:55:59,648][53268] Updated weights for policy 1, policy_version 31040 (0.0009) [2023-10-10 05:56:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63602688. Throughput: 0: 1687.5, 1: 1682.2. Samples: 15906694. Policy #0 lag: (min: 0.0, avg: 21.5, max: 32.0) [2023-10-10 05:56:01,784][52050] Avg episode reward: [(0, '19.010'), (1, '17.760')] [2023-10-10 05:56:01,853][53252] Updated weights for policy 0, policy_version 31080 (0.0009) [2023-10-10 05:56:02,233][53252] Updated weights for policy 0, policy_version 31090 (0.0007) [2023-10-10 05:56:02,601][53252] Updated weights for policy 0, policy_version 31100 (0.0010) [2023-10-10 05:56:03,866][53268] Updated weights for policy 1, policy_version 31050 (0.0007) [2023-10-10 05:56:04,243][53268] Updated weights for policy 1, policy_version 31060 (0.0008) [2023-10-10 05:56:04,609][53268] Updated weights for policy 1, policy_version 31070 (0.0008) [2023-10-10 05:56:06,747][53252] Updated weights for policy 0, policy_version 31110 (0.0009) [2023-10-10 05:56:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63668224. Throughput: 0: 1685.1, 1: 1667.6. Samples: 15926588. 
Policy #0 lag: (min: 0.0, avg: 21.5, max: 32.0) [2023-10-10 05:56:06,784][52050] Avg episode reward: [(0, '18.950'), (1, '17.760')] [2023-10-10 05:56:07,116][53252] Updated weights for policy 0, policy_version 31120 (0.0008) [2023-10-10 05:56:07,503][53252] Updated weights for policy 0, policy_version 31130 (0.0008) [2023-10-10 05:56:08,538][53268] Updated weights for policy 1, policy_version 31080 (0.0010) [2023-10-10 05:56:08,907][53268] Updated weights for policy 1, policy_version 31090 (0.0011) [2023-10-10 05:56:09,277][53268] Updated weights for policy 1, policy_version 31100 (0.0009) [2023-10-10 05:56:11,671][53252] Updated weights for policy 0, policy_version 31140 (0.0007) [2023-10-10 05:56:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63733760. Throughput: 0: 1688.3, 1: 1698.6. Samples: 15947742. Policy #0 lag: (min: 0.0, avg: 21.5, max: 32.0) [2023-10-10 05:56:11,784][52050] Avg episode reward: [(0, '19.700'), (1, '19.540')] [2023-10-10 05:56:12,042][53252] Updated weights for policy 0, policy_version 31150 (0.0009) [2023-10-10 05:56:12,418][53252] Updated weights for policy 0, policy_version 31160 (0.0008) [2023-10-10 05:56:13,388][53268] Updated weights for policy 1, policy_version 31110 (0.0008) [2023-10-10 05:56:13,752][53268] Updated weights for policy 1, policy_version 31120 (0.0009) [2023-10-10 05:56:14,111][53268] Updated weights for policy 1, policy_version 31130 (0.0007) [2023-10-10 05:56:16,315][53252] Updated weights for policy 0, policy_version 31170 (0.0008) [2023-10-10 05:56:16,688][53252] Updated weights for policy 0, policy_version 31180 (0.0008) [2023-10-10 05:56:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63799296. Throughput: 0: 1691.1, 1: 1678.5. Samples: 15957428. 
Policy #0 lag: (min: 0.0, avg: 21.5, max: 32.0) [2023-10-10 05:56:16,784][52050] Avg episode reward: [(0, '19.880'), (1, '18.670')] [2023-10-10 05:56:17,065][53252] Updated weights for policy 0, policy_version 31190 (0.0007) [2023-10-10 05:56:17,431][53252] Updated weights for policy 0, policy_version 31200 (0.0007) [2023-10-10 05:56:18,076][53268] Updated weights for policy 1, policy_version 31140 (0.0010) [2023-10-10 05:56:18,445][53268] Updated weights for policy 1, policy_version 31150 (0.0010) [2023-10-10 05:56:18,819][53268] Updated weights for policy 1, policy_version 31160 (0.0008) [2023-10-10 05:56:21,696][53252] Updated weights for policy 0, policy_version 31210 (0.0010) [2023-10-10 05:56:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 63864832. Throughput: 0: 1683.0, 1: 1690.6. Samples: 15978094. Policy #0 lag: (min: 0.0, avg: 21.5, max: 32.0) [2023-10-10 05:56:21,784][52050] Avg episode reward: [(0, '19.880'), (1, '18.510')] [2023-10-10 05:56:22,060][53252] Updated weights for policy 0, policy_version 31220 (0.0009) [2023-10-10 05:56:22,425][53252] Updated weights for policy 0, policy_version 31230 (0.0007) [2023-10-10 05:56:22,940][53268] Updated weights for policy 1, policy_version 31170 (0.0008) [2023-10-10 05:56:23,316][53268] Updated weights for policy 1, policy_version 31180 (0.0010) [2023-10-10 05:56:23,690][53268] Updated weights for policy 1, policy_version 31190 (0.0010) [2023-10-10 05:56:24,052][53268] Updated weights for policy 1, policy_version 31200 (0.0009) [2023-10-10 05:56:26,286][53252] Updated weights for policy 0, policy_version 31240 (0.0007) [2023-10-10 05:56:26,662][53252] Updated weights for policy 0, policy_version 31250 (0.0007) [2023-10-10 05:56:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 63930368. Throughput: 0: 1674.6, 1: 1697.2. Samples: 15998366. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:56:26,785][52050] Avg episode reward: [(0, '20.950'), (1, '18.020')] [2023-10-10 05:56:27,041][53252] Updated weights for policy 0, policy_version 31260 (0.0007) [2023-10-10 05:56:27,969][53268] Updated weights for policy 1, policy_version 31210 (0.0007) [2023-10-10 05:56:28,332][53268] Updated weights for policy 1, policy_version 31220 (0.0007) [2023-10-10 05:56:28,699][53268] Updated weights for policy 1, policy_version 31230 (0.0007) [2023-10-10 05:56:31,177][53252] Updated weights for policy 0, policy_version 31270 (0.0008) [2023-10-10 05:56:31,544][53252] Updated weights for policy 0, policy_version 31280 (0.0007) [2023-10-10 05:56:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 63995904. Throughput: 0: 1682.9, 1: 1673.6. Samples: 16008026. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:56:31,784][52050] Avg episode reward: [(0, '22.290'), (1, '18.200')] [2023-10-10 05:56:31,923][53252] Updated weights for policy 0, policy_version 31290 (0.0008) [2023-10-10 05:56:32,138][52846] Saving new best policy, reward=22.290! [2023-10-10 05:56:32,773][53268] Updated weights for policy 1, policy_version 31240 (0.0010) [2023-10-10 05:56:33,134][53268] Updated weights for policy 1, policy_version 31250 (0.0010) [2023-10-10 05:56:33,506][53268] Updated weights for policy 1, policy_version 31260 (0.0009) [2023-10-10 05:56:35,925][53252] Updated weights for policy 0, policy_version 31300 (0.0008) [2023-10-10 05:56:36,287][53252] Updated weights for policy 0, policy_version 31310 (0.0007) [2023-10-10 05:56:36,660][53252] Updated weights for policy 0, policy_version 31320 (0.0008) [2023-10-10 05:56:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64061440. Throughput: 0: 1682.4, 1: 1701.2. Samples: 16028768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:56:36,784][52050] Avg episode reward: [(0, '18.830'), (1, '18.720')] [2023-10-10 05:56:37,647][53268] Updated weights for policy 1, policy_version 31270 (0.0007) [2023-10-10 05:56:38,016][53268] Updated weights for policy 1, policy_version 31280 (0.0009) [2023-10-10 05:56:38,388][53268] Updated weights for policy 1, policy_version 31290 (0.0009) [2023-10-10 05:56:40,955][53252] Updated weights for policy 0, policy_version 31330 (0.0009) [2023-10-10 05:56:41,358][53252] Updated weights for policy 0, policy_version 31340 (0.0010) [2023-10-10 05:56:41,733][53252] Updated weights for policy 0, policy_version 31350 (0.0011) [2023-10-10 05:56:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 64126976. Throughput: 0: 1673.3, 1: 1702.9. Samples: 16048870. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 05:56:41,784][52050] Avg episode reward: [(0, '19.080'), (1, '18.050')] [2023-10-10 05:56:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000031296_32047104.pth... [2023-10-10 05:56:41,836][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000029728_30441472.pth [2023-10-10 05:56:41,842][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000031296_32047104.pth [2023-10-10 05:56:42,103][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000031360_32112640.pth... 
[2023-10-10 05:56:42,108][53252] Updated weights for policy 0, policy_version 31360 (0.0007) [2023-10-10 05:56:42,142][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000029760_30474240.pth [2023-10-10 05:56:42,147][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000031360_32112640.pth [2023-10-10 05:56:42,442][53268] Updated weights for policy 1, policy_version 31300 (0.0009) [2023-10-10 05:56:42,811][53268] Updated weights for policy 1, policy_version 31310 (0.0010) [2023-10-10 05:56:43,181][53268] Updated weights for policy 1, policy_version 31320 (0.0010) [2023-10-10 05:56:45,838][53252] Updated weights for policy 0, policy_version 31370 (0.0007) [2023-10-10 05:56:46,221][53252] Updated weights for policy 0, policy_version 31380 (0.0007) [2023-10-10 05:56:46,593][53252] Updated weights for policy 0, policy_version 31390 (0.0008) [2023-10-10 05:56:46,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64225280. Throughput: 0: 1688.5, 1: 1688.0. Samples: 16058638. Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) [2023-10-10 05:56:46,784][52050] Avg episode reward: [(0, '19.650'), (1, '17.480')] [2023-10-10 05:56:47,267][53268] Updated weights for policy 1, policy_version 31330 (0.0010) [2023-10-10 05:56:47,637][53268] Updated weights for policy 1, policy_version 31340 (0.0008) [2023-10-10 05:56:48,009][53268] Updated weights for policy 1, policy_version 31350 (0.0008) [2023-10-10 05:56:48,375][53268] Updated weights for policy 1, policy_version 31360 (0.0007) [2023-10-10 05:56:50,699][53252] Updated weights for policy 0, policy_version 31400 (0.0008) [2023-10-10 05:56:51,073][53252] Updated weights for policy 0, policy_version 31410 (0.0010) [2023-10-10 05:56:51,428][53252] Updated weights for policy 0, policy_version 31420 (0.0010) [2023-10-10 05:56:51,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). 
Total num frames: 64290816. Throughput: 0: 1692.2, 1: 1705.9. Samples: 16079502. Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) [2023-10-10 05:56:51,785][52050] Avg episode reward: [(0, '19.010'), (1, '19.090')] [2023-10-10 05:56:52,511][53268] Updated weights for policy 1, policy_version 31370 (0.0010) [2023-10-10 05:56:52,881][53268] Updated weights for policy 1, policy_version 31380 (0.0010) [2023-10-10 05:56:53,248][53268] Updated weights for policy 1, policy_version 31390 (0.0010) [2023-10-10 05:56:55,283][53252] Updated weights for policy 0, policy_version 31430 (0.0009) [2023-10-10 05:56:55,654][53252] Updated weights for policy 0, policy_version 31440 (0.0008) [2023-10-10 05:56:56,034][53252] Updated weights for policy 0, policy_version 31450 (0.0009) [2023-10-10 05:56:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64356352. Throughput: 0: 1663.4, 1: 1696.4. Samples: 16098936. Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) [2023-10-10 05:56:56,785][52050] Avg episode reward: [(0, '19.660'), (1, '19.490')] [2023-10-10 05:56:57,341][53268] Updated weights for policy 1, policy_version 31400 (0.0008) [2023-10-10 05:56:57,718][53268] Updated weights for policy 1, policy_version 31410 (0.0008) [2023-10-10 05:56:58,085][53268] Updated weights for policy 1, policy_version 31420 (0.0009) [2023-10-10 05:57:00,152][53252] Updated weights for policy 0, policy_version 31460 (0.0007) [2023-10-10 05:57:00,519][53252] Updated weights for policy 0, policy_version 31470 (0.0007) [2023-10-10 05:57:00,890][53252] Updated weights for policy 0, policy_version 31480 (0.0007) [2023-10-10 05:57:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64421888. Throughput: 0: 1689.0, 1: 1686.7. Samples: 16109336. 
Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) [2023-10-10 05:57:01,784][52050] Avg episode reward: [(0, '20.600'), (1, '18.580')] [2023-10-10 05:57:02,181][53268] Updated weights for policy 1, policy_version 31430 (0.0010) [2023-10-10 05:57:02,545][53268] Updated weights for policy 1, policy_version 31440 (0.0008) [2023-10-10 05:57:02,922][53268] Updated weights for policy 1, policy_version 31450 (0.0011) [2023-10-10 05:57:04,927][53252] Updated weights for policy 0, policy_version 31490 (0.0008) [2023-10-10 05:57:05,301][53252] Updated weights for policy 0, policy_version 31500 (0.0008) [2023-10-10 05:57:05,663][53252] Updated weights for policy 0, policy_version 31510 (0.0010) [2023-10-10 05:57:06,033][53252] Updated weights for policy 0, policy_version 31520 (0.0008) [2023-10-10 05:57:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64487424. Throughput: 0: 1681.0, 1: 1681.5. Samples: 16129404. Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) [2023-10-10 05:57:06,784][52050] Avg episode reward: [(0, '19.200'), (1, '19.380')] [2023-10-10 05:57:06,987][53268] Updated weights for policy 1, policy_version 31460 (0.0009) [2023-10-10 05:57:07,357][53268] Updated weights for policy 1, policy_version 31470 (0.0009) [2023-10-10 05:57:07,718][53268] Updated weights for policy 1, policy_version 31480 (0.0007) [2023-10-10 05:57:10,239][53252] Updated weights for policy 0, policy_version 31530 (0.0008) [2023-10-10 05:57:10,617][53252] Updated weights for policy 0, policy_version 31540 (0.0008) [2023-10-10 05:57:10,983][53252] Updated weights for policy 0, policy_version 31550 (0.0010) [2023-10-10 05:57:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64552960. Throughput: 0: 1673.2, 1: 1688.7. Samples: 16149654. 
Policy #0 lag: (min: 14.0, avg: 27.1, max: 46.0) [2023-10-10 05:57:11,784][52050] Avg episode reward: [(0, '18.800'), (1, '18.600')] [2023-10-10 05:57:11,798][53268] Updated weights for policy 1, policy_version 31490 (0.0007) [2023-10-10 05:57:12,162][53268] Updated weights for policy 1, policy_version 31500 (0.0007) [2023-10-10 05:57:12,532][53268] Updated weights for policy 1, policy_version 31510 (0.0007) [2023-10-10 05:57:12,903][53268] Updated weights for policy 1, policy_version 31520 (0.0009) [2023-10-10 05:57:14,987][53252] Updated weights for policy 0, policy_version 31560 (0.0008) [2023-10-10 05:57:15,361][53252] Updated weights for policy 0, policy_version 31570 (0.0008) [2023-10-10 05:57:15,723][53252] Updated weights for policy 0, policy_version 31580 (0.0008) [2023-10-10 05:57:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64618496. Throughput: 0: 1694.0, 1: 1683.8. Samples: 16160026. Policy #0 lag: (min: 14.0, avg: 27.1, max: 46.0) [2023-10-10 05:57:16,784][52050] Avg episode reward: [(0, '19.290'), (1, '18.230')] [2023-10-10 05:57:16,988][53268] Updated weights for policy 1, policy_version 31530 (0.0007) [2023-10-10 05:57:17,359][53268] Updated weights for policy 1, policy_version 31540 (0.0007) [2023-10-10 05:57:17,718][53268] Updated weights for policy 1, policy_version 31550 (0.0007) [2023-10-10 05:57:19,661][53252] Updated weights for policy 0, policy_version 31590 (0.0009) [2023-10-10 05:57:20,037][53252] Updated weights for policy 0, policy_version 31600 (0.0009) [2023-10-10 05:57:20,399][53252] Updated weights for policy 0, policy_version 31610 (0.0009) [2023-10-10 05:57:21,701][53268] Updated weights for policy 1, policy_version 31560 (0.0007) [2023-10-10 05:57:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64684032. Throughput: 0: 1671.3, 1: 1684.7. Samples: 16179790. 
Policy #0 lag: (min: 14.0, avg: 27.1, max: 46.0) [2023-10-10 05:57:21,784][52050] Avg episode reward: [(0, '18.930'), (1, '18.170')] [2023-10-10 05:57:22,076][53268] Updated weights for policy 1, policy_version 31570 (0.0007) [2023-10-10 05:57:22,440][53268] Updated weights for policy 1, policy_version 31580 (0.0008) [2023-10-10 05:57:24,324][53252] Updated weights for policy 0, policy_version 31620 (0.0007) [2023-10-10 05:57:24,701][53252] Updated weights for policy 0, policy_version 31630 (0.0007) [2023-10-10 05:57:25,073][53252] Updated weights for policy 0, policy_version 31640 (0.0008) [2023-10-10 05:57:26,577][53268] Updated weights for policy 1, policy_version 31590 (0.0009) [2023-10-10 05:57:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64749568. Throughput: 0: 1679.3, 1: 1684.8. Samples: 16200252. Policy #0 lag: (min: 14.0, avg: 27.1, max: 46.0) [2023-10-10 05:57:26,784][52050] Avg episode reward: [(0, '19.740'), (1, '18.400')] [2023-10-10 05:57:26,942][53268] Updated weights for policy 1, policy_version 31600 (0.0009) [2023-10-10 05:57:27,305][53268] Updated weights for policy 1, policy_version 31610 (0.0012) [2023-10-10 05:57:29,361][53252] Updated weights for policy 0, policy_version 31650 (0.0008) [2023-10-10 05:57:29,750][53252] Updated weights for policy 0, policy_version 31660 (0.0008) [2023-10-10 05:57:30,129][53252] Updated weights for policy 0, policy_version 31670 (0.0008) [2023-10-10 05:57:30,499][53252] Updated weights for policy 0, policy_version 31680 (0.0008) [2023-10-10 05:57:31,365][53268] Updated weights for policy 1, policy_version 31620 (0.0009) [2023-10-10 05:57:31,735][53268] Updated weights for policy 1, policy_version 31630 (0.0010) [2023-10-10 05:57:31,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 64815104. Throughput: 0: 1691.3, 1: 1684.3. Samples: 16210542. 
Policy #0 lag: (min: 14.0, avg: 27.1, max: 46.0) [2023-10-10 05:57:31,785][52050] Avg episode reward: [(0, '19.650'), (1, '18.540')] [2023-10-10 05:57:32,104][53268] Updated weights for policy 1, policy_version 31640 (0.0008) [2023-10-10 05:57:34,419][53252] Updated weights for policy 0, policy_version 31690 (0.0007) [2023-10-10 05:57:34,795][53252] Updated weights for policy 0, policy_version 31700 (0.0007) [2023-10-10 05:57:35,159][53252] Updated weights for policy 0, policy_version 31710 (0.0010) [2023-10-10 05:57:36,116][53268] Updated weights for policy 1, policy_version 31650 (0.0009) [2023-10-10 05:57:36,480][53268] Updated weights for policy 1, policy_version 31660 (0.0007) [2023-10-10 05:57:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64880640. Throughput: 0: 1663.9, 1: 1682.5. Samples: 16230092. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-10 05:57:36,784][52050] Avg episode reward: [(0, '20.070'), (1, '18.540')] [2023-10-10 05:57:36,856][53268] Updated weights for policy 1, policy_version 31670 (0.0008) [2023-10-10 05:57:37,213][53268] Updated weights for policy 1, policy_version 31680 (0.0009) [2023-10-10 05:57:39,232][53252] Updated weights for policy 0, policy_version 31720 (0.0010) [2023-10-10 05:57:39,600][53252] Updated weights for policy 0, policy_version 31730 (0.0009) [2023-10-10 05:57:39,974][53252] Updated weights for policy 0, policy_version 31740 (0.0011) [2023-10-10 05:57:41,314][53268] Updated weights for policy 1, policy_version 31690 (0.0011) [2023-10-10 05:57:41,685][53268] Updated weights for policy 1, policy_version 31700 (0.0011) [2023-10-10 05:57:41,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64946176. Throughput: 0: 1685.0, 1: 1678.6. Samples: 16250296. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-10 05:57:41,784][52050] Avg episode reward: [(0, '20.790'), (1, '18.530')] [2023-10-10 05:57:42,055][53268] Updated weights for policy 1, policy_version 31710 (0.0009) [2023-10-10 05:57:44,216][53252] Updated weights for policy 0, policy_version 31750 (0.0010) [2023-10-10 05:57:44,594][53252] Updated weights for policy 0, policy_version 31760 (0.0009) [2023-10-10 05:57:44,964][53252] Updated weights for policy 0, policy_version 31770 (0.0009) [2023-10-10 05:57:46,104][53268] Updated weights for policy 1, policy_version 31720 (0.0007) [2023-10-10 05:57:46,476][53268] Updated weights for policy 1, policy_version 31730 (0.0007) [2023-10-10 05:57:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65011712. Throughput: 0: 1672.9, 1: 1683.0. Samples: 16260354. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-10 05:57:46,784][52050] Avg episode reward: [(0, '20.990'), (1, '19.120')] [2023-10-10 05:57:46,841][53268] Updated weights for policy 1, policy_version 31740 (0.0008) [2023-10-10 05:57:49,047][53252] Updated weights for policy 0, policy_version 31780 (0.0007) [2023-10-10 05:57:49,426][53252] Updated weights for policy 0, policy_version 31790 (0.0007) [2023-10-10 05:57:49,796][53252] Updated weights for policy 0, policy_version 31800 (0.0007) [2023-10-10 05:57:50,775][53268] Updated weights for policy 1, policy_version 31750 (0.0008) [2023-10-10 05:57:51,147][53268] Updated weights for policy 1, policy_version 31760 (0.0008) [2023-10-10 05:57:51,512][53268] Updated weights for policy 1, policy_version 31770 (0.0008) [2023-10-10 05:57:51,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65110016. Throughput: 0: 1662.4, 1: 1691.2. Samples: 16280320. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-10 05:57:51,785][52050] Avg episode reward: [(0, '19.620'), (1, '17.180')] [2023-10-10 05:57:53,836][53252] Updated weights for policy 0, policy_version 31810 (0.0008) [2023-10-10 05:57:54,213][53252] Updated weights for policy 0, policy_version 31820 (0.0008) [2023-10-10 05:57:54,582][53252] Updated weights for policy 0, policy_version 31830 (0.0009) [2023-10-10 05:57:54,951][53252] Updated weights for policy 0, policy_version 31840 (0.0007) [2023-10-10 05:57:55,450][53268] Updated weights for policy 1, policy_version 31780 (0.0009) [2023-10-10 05:57:55,809][53268] Updated weights for policy 1, policy_version 31790 (0.0008) [2023-10-10 05:57:56,182][53268] Updated weights for policy 1, policy_version 31800 (0.0007) [2023-10-10 05:57:56,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 65175552. Throughput: 0: 1679.5, 1: 1670.3. Samples: 16300392. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-10 05:57:56,784][52050] Avg episode reward: [(0, '19.100'), (1, '17.440')] [2023-10-10 05:57:58,988][53252] Updated weights for policy 0, policy_version 31850 (0.0009) [2023-10-10 05:57:59,357][53252] Updated weights for policy 0, policy_version 31860 (0.0009) [2023-10-10 05:57:59,728][53252] Updated weights for policy 0, policy_version 31870 (0.0007) [2023-10-10 05:58:00,342][53268] Updated weights for policy 1, policy_version 31810 (0.0009) [2023-10-10 05:58:00,704][53268] Updated weights for policy 1, policy_version 31820 (0.0010) [2023-10-10 05:58:01,076][53268] Updated weights for policy 1, policy_version 31830 (0.0009) [2023-10-10 05:58:01,438][53268] Updated weights for policy 1, policy_version 31840 (0.0009) [2023-10-10 05:58:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65241088. Throughput: 0: 1663.6, 1: 1692.6. Samples: 16311056. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:58:01,784][52050] Avg episode reward: [(0, '19.840'), (1, '18.490')] [2023-10-10 05:58:03,771][53252] Updated weights for policy 0, policy_version 31880 (0.0007) [2023-10-10 05:58:04,135][53252] Updated weights for policy 0, policy_version 31890 (0.0008) [2023-10-10 05:58:04,512][53252] Updated weights for policy 0, policy_version 31900 (0.0009) [2023-10-10 05:58:05,544][53268] Updated weights for policy 1, policy_version 31850 (0.0009) [2023-10-10 05:58:05,909][53268] Updated weights for policy 1, policy_version 31860 (0.0010) [2023-10-10 05:58:06,277][53268] Updated weights for policy 1, policy_version 31870 (0.0009) [2023-10-10 05:58:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 65306624. Throughput: 0: 1670.6, 1: 1692.7. Samples: 16331136. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:58:06,784][52050] Avg episode reward: [(0, '18.650'), (1, '18.280')] [2023-10-10 05:58:08,483][53252] Updated weights for policy 0, policy_version 31910 (0.0010) [2023-10-10 05:58:08,856][53252] Updated weights for policy 0, policy_version 31920 (0.0010) [2023-10-10 05:58:09,226][53252] Updated weights for policy 0, policy_version 31930 (0.0008) [2023-10-10 05:58:10,477][53268] Updated weights for policy 1, policy_version 31880 (0.0009) [2023-10-10 05:58:10,850][53268] Updated weights for policy 1, policy_version 31890 (0.0009) [2023-10-10 05:58:11,215][53268] Updated weights for policy 1, policy_version 31900 (0.0010) [2023-10-10 05:58:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65372160. Throughput: 0: 1678.3, 1: 1666.3. Samples: 16350760. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:58:11,784][52050] Avg episode reward: [(0, '18.170'), (1, '17.730')] [2023-10-10 05:58:13,346][53252] Updated weights for policy 0, policy_version 31940 (0.0008) [2023-10-10 05:58:13,725][53252] Updated weights for policy 0, policy_version 31950 (0.0008) [2023-10-10 05:58:14,095][53252] Updated weights for policy 0, policy_version 31960 (0.0007) [2023-10-10 05:58:15,356][53268] Updated weights for policy 1, policy_version 31910 (0.0010) [2023-10-10 05:58:15,722][53268] Updated weights for policy 1, policy_version 31920 (0.0009) [2023-10-10 05:58:16,098][53268] Updated weights for policy 1, policy_version 31930 (0.0010) [2023-10-10 05:58:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 65437696. Throughput: 0: 1656.1, 1: 1690.5. Samples: 16361136. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:58:16,784][52050] Avg episode reward: [(0, '19.490'), (1, '18.770')] [2023-10-10 05:58:18,204][53252] Updated weights for policy 0, policy_version 31970 (0.0008) [2023-10-10 05:58:18,585][53252] Updated weights for policy 0, policy_version 31980 (0.0009) [2023-10-10 05:58:18,965][53252] Updated weights for policy 0, policy_version 31990 (0.0009) [2023-10-10 05:58:19,332][53252] Updated weights for policy 0, policy_version 32000 (0.0008) [2023-10-10 05:58:20,232][53268] Updated weights for policy 1, policy_version 31940 (0.0009) [2023-10-10 05:58:20,592][53268] Updated weights for policy 1, policy_version 31950 (0.0008) [2023-10-10 05:58:20,954][53268] Updated weights for policy 1, policy_version 31960 (0.0008) [2023-10-10 05:58:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65503232. Throughput: 0: 1679.4, 1: 1684.8. Samples: 16381484. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 05:58:21,784][52050] Avg episode reward: [(0, '19.160'), (1, '17.930')] [2023-10-10 05:58:23,333][53252] Updated weights for policy 0, policy_version 32010 (0.0009) [2023-10-10 05:58:23,706][53252] Updated weights for policy 0, policy_version 32020 (0.0007) [2023-10-10 05:58:24,074][53252] Updated weights for policy 0, policy_version 32030 (0.0007) [2023-10-10 05:58:25,019][53268] Updated weights for policy 1, policy_version 31970 (0.0010) [2023-10-10 05:58:25,382][53268] Updated weights for policy 1, policy_version 31980 (0.0010) [2023-10-10 05:58:25,750][53268] Updated weights for policy 1, policy_version 31990 (0.0011) [2023-10-10 05:58:26,118][53268] Updated weights for policy 1, policy_version 32000 (0.0007) [2023-10-10 05:58:26,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65568768. Throughput: 0: 1687.5, 1: 1668.5. Samples: 16401316. Policy #0 lag: (min: 35.0, avg: 54.1, max: 56.0) [2023-10-10 05:58:26,785][52050] Avg episode reward: [(0, '18.940'), (1, '19.040')] [2023-10-10 05:58:28,061][53252] Updated weights for policy 0, policy_version 32040 (0.0008) [2023-10-10 05:58:28,423][53252] Updated weights for policy 0, policy_version 32050 (0.0011) [2023-10-10 05:58:28,792][53252] Updated weights for policy 0, policy_version 32060 (0.0011) [2023-10-10 05:58:29,968][53268] Updated weights for policy 1, policy_version 32010 (0.0007) [2023-10-10 05:58:30,336][53268] Updated weights for policy 1, policy_version 32020 (0.0009) [2023-10-10 05:58:30,696][53268] Updated weights for policy 1, policy_version 32030 (0.0009) [2023-10-10 05:58:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 65634304. Throughput: 0: 1667.2, 1: 1692.7. Samples: 16411550. 
Policy #0 lag: (min: 35.0, avg: 54.1, max: 56.0) [2023-10-10 05:58:31,784][52050] Avg episode reward: [(0, '19.130'), (1, '19.020')] [2023-10-10 05:58:33,022][53252] Updated weights for policy 0, policy_version 32070 (0.0010) [2023-10-10 05:58:33,399][53252] Updated weights for policy 0, policy_version 32080 (0.0008) [2023-10-10 05:58:33,758][53252] Updated weights for policy 0, policy_version 32090 (0.0008) [2023-10-10 05:58:34,826][53268] Updated weights for policy 1, policy_version 32040 (0.0007) [2023-10-10 05:58:35,194][53268] Updated weights for policy 1, policy_version 32050 (0.0010) [2023-10-10 05:58:35,551][53268] Updated weights for policy 1, policy_version 32060 (0.0010) [2023-10-10 05:58:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 65699840. Throughput: 0: 1691.5, 1: 1669.4. Samples: 16431558. Policy #0 lag: (min: 35.0, avg: 54.1, max: 56.0) [2023-10-10 05:58:36,784][52050] Avg episode reward: [(0, '18.630'), (1, '18.860')] [2023-10-10 05:58:37,734][53252] Updated weights for policy 0, policy_version 32100 (0.0010) [2023-10-10 05:58:38,113][53252] Updated weights for policy 0, policy_version 32110 (0.0009) [2023-10-10 05:58:38,489][53252] Updated weights for policy 0, policy_version 32120 (0.0007) [2023-10-10 05:58:39,626][53268] Updated weights for policy 1, policy_version 32070 (0.0010) [2023-10-10 05:58:39,995][53268] Updated weights for policy 1, policy_version 32080 (0.0008) [2023-10-10 05:58:40,367][53268] Updated weights for policy 1, policy_version 32090 (0.0008) [2023-10-10 05:58:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65765376. Throughput: 0: 1692.0, 1: 1670.7. Samples: 16451712. Policy #0 lag: (min: 35.0, avg: 54.1, max: 56.0) [2023-10-10 05:58:41,784][52050] Avg episode reward: [(0, '18.230'), (1, '19.230')] [2023-10-10 05:58:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000032128_32899072.pth... 
[2023-10-10 05:58:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000032096_32866304.pth... [2023-10-10 05:58:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000030528_31260672.pth [2023-10-10 05:58:41,835][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000030560_31293440.pth [2023-10-10 05:58:42,483][53252] Updated weights for policy 0, policy_version 32130 (0.0008) [2023-10-10 05:58:42,856][53252] Updated weights for policy 0, policy_version 32140 (0.0008) [2023-10-10 05:58:43,235][53252] Updated weights for policy 0, policy_version 32150 (0.0007) [2023-10-10 05:58:43,611][53252] Updated weights for policy 0, policy_version 32160 (0.0007) [2023-10-10 05:58:44,616][53268] Updated weights for policy 1, policy_version 32100 (0.0007) [2023-10-10 05:58:44,977][53268] Updated weights for policy 1, policy_version 32110 (0.0008) [2023-10-10 05:58:45,350][53268] Updated weights for policy 1, policy_version 32120 (0.0010) [2023-10-10 05:58:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 65830912. Throughput: 0: 1679.4, 1: 1678.2. Samples: 16462146. 
Policy #0 lag: (min: 35.0, avg: 54.1, max: 56.0) [2023-10-10 05:58:46,784][52050] Avg episode reward: [(0, '18.480'), (1, '19.190')] [2023-10-10 05:58:47,626][53252] Updated weights for policy 0, policy_version 32170 (0.0008) [2023-10-10 05:58:48,004][53252] Updated weights for policy 0, policy_version 32180 (0.0011) [2023-10-10 05:58:48,378][53252] Updated weights for policy 0, policy_version 32190 (0.0012) [2023-10-10 05:58:49,363][53268] Updated weights for policy 1, policy_version 32130 (0.0009) [2023-10-10 05:58:49,731][53268] Updated weights for policy 1, policy_version 32140 (0.0007) [2023-10-10 05:58:50,105][53268] Updated weights for policy 1, policy_version 32150 (0.0008) [2023-10-10 05:58:50,476][53268] Updated weights for policy 1, policy_version 32160 (0.0009) [2023-10-10 05:58:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 65896448. Throughput: 0: 1688.7, 1: 1657.2. Samples: 16481700. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 05:58:51,784][52050] Avg episode reward: [(0, '20.200'), (1, '17.890')] [2023-10-10 05:58:52,461][53252] Updated weights for policy 0, policy_version 32200 (0.0009) [2023-10-10 05:58:52,832][53252] Updated weights for policy 0, policy_version 32210 (0.0008) [2023-10-10 05:58:53,205][53252] Updated weights for policy 0, policy_version 32220 (0.0010) [2023-10-10 05:58:54,418][53268] Updated weights for policy 1, policy_version 32170 (0.0007) [2023-10-10 05:58:54,790][53268] Updated weights for policy 1, policy_version 32180 (0.0007) [2023-10-10 05:58:55,151][53268] Updated weights for policy 1, policy_version 32190 (0.0008) [2023-10-10 05:58:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 65961984. Throughput: 0: 1690.2, 1: 1680.1. Samples: 16502426. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 05:58:56,785][52050] Avg episode reward: [(0, '20.200'), (1, '18.980')] [2023-10-10 05:58:57,184][53252] Updated weights for policy 0, policy_version 32230 (0.0008) [2023-10-10 05:58:57,567][53252] Updated weights for policy 0, policy_version 32240 (0.0010) [2023-10-10 05:58:57,937][53252] Updated weights for policy 0, policy_version 32250 (0.0008) [2023-10-10 05:58:59,080][53268] Updated weights for policy 1, policy_version 32200 (0.0008) [2023-10-10 05:58:59,462][53268] Updated weights for policy 1, policy_version 32210 (0.0009) [2023-10-10 05:58:59,828][53268] Updated weights for policy 1, policy_version 32220 (0.0009) [2023-10-10 05:59:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66027520. Throughput: 0: 1685.5, 1: 1682.4. Samples: 16512694. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 05:59:01,784][52050] Avg episode reward: [(0, '20.500'), (1, '20.230')] [2023-10-10 05:59:01,785][53061] Saving new best policy, reward=20.230! [2023-10-10 05:59:02,119][53252] Updated weights for policy 0, policy_version 32260 (0.0008) [2023-10-10 05:59:02,482][53252] Updated weights for policy 0, policy_version 32270 (0.0009) [2023-10-10 05:59:02,847][53252] Updated weights for policy 0, policy_version 32280 (0.0009) [2023-10-10 05:59:03,845][53268] Updated weights for policy 1, policy_version 32230 (0.0009) [2023-10-10 05:59:04,215][53268] Updated weights for policy 1, policy_version 32240 (0.0010) [2023-10-10 05:59:04,589][53268] Updated weights for policy 1, policy_version 32250 (0.0008) [2023-10-10 05:59:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66093056. Throughput: 0: 1692.8, 1: 1667.8. Samples: 16532712. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 05:59:06,784][52050] Avg episode reward: [(0, '20.230'), (1, '19.050')] [2023-10-10 05:59:06,889][53252] Updated weights for policy 0, policy_version 32290 (0.0009) [2023-10-10 05:59:07,242][53252] Updated weights for policy 0, policy_version 32300 (0.0010) [2023-10-10 05:59:07,606][53252] Updated weights for policy 0, policy_version 32310 (0.0012) [2023-10-10 05:59:07,972][53252] Updated weights for policy 0, policy_version 32320 (0.0009) [2023-10-10 05:59:08,632][53268] Updated weights for policy 1, policy_version 32260 (0.0011) [2023-10-10 05:59:09,003][53268] Updated weights for policy 1, policy_version 32270 (0.0011) [2023-10-10 05:59:09,364][53268] Updated weights for policy 1, policy_version 32280 (0.0008) [2023-10-10 05:59:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66158592. Throughput: 0: 1692.1, 1: 1689.1. Samples: 16553470. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 05:59:11,784][52050] Avg episode reward: [(0, '21.690'), (1, '19.470')] [2023-10-10 05:59:12,086][53252] Updated weights for policy 0, policy_version 32330 (0.0009) [2023-10-10 05:59:12,459][53252] Updated weights for policy 0, policy_version 32340 (0.0009) [2023-10-10 05:59:12,821][53252] Updated weights for policy 0, policy_version 32350 (0.0008) [2023-10-10 05:59:13,221][53268] Updated weights for policy 1, policy_version 32290 (0.0009) [2023-10-10 05:59:13,591][53268] Updated weights for policy 1, policy_version 32300 (0.0007) [2023-10-10 05:59:13,965][53268] Updated weights for policy 1, policy_version 32310 (0.0008) [2023-10-10 05:59:14,331][53268] Updated weights for policy 1, policy_version 32320 (0.0009) [2023-10-10 05:59:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66224128. Throughput: 0: 1690.5, 1: 1673.1. Samples: 16562912. 
Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 05:59:16,784][52050] Avg episode reward: [(0, '20.500'), (1, '19.450')] [2023-10-10 05:59:16,800][53252] Updated weights for policy 0, policy_version 32360 (0.0008) [2023-10-10 05:59:17,168][53252] Updated weights for policy 0, policy_version 32370 (0.0008) [2023-10-10 05:59:17,536][53252] Updated weights for policy 0, policy_version 32380 (0.0007) [2023-10-10 05:59:18,285][53268] Updated weights for policy 1, policy_version 32330 (0.0011) [2023-10-10 05:59:18,661][53268] Updated weights for policy 1, policy_version 32340 (0.0008) [2023-10-10 05:59:19,034][53268] Updated weights for policy 1, policy_version 32350 (0.0007) [2023-10-10 05:59:21,715][53252] Updated weights for policy 0, policy_version 32390 (0.0008) [2023-10-10 05:59:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66289664. Throughput: 0: 1688.7, 1: 1687.2. Samples: 16583470. Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 05:59:21,784][52050] Avg episode reward: [(0, '19.260'), (1, '18.440')] [2023-10-10 05:59:22,085][53252] Updated weights for policy 0, policy_version 32400 (0.0007) [2023-10-10 05:59:22,454][53252] Updated weights for policy 0, policy_version 32410 (0.0008) [2023-10-10 05:59:23,287][53268] Updated weights for policy 1, policy_version 32360 (0.0011) [2023-10-10 05:59:23,658][53268] Updated weights for policy 1, policy_version 32370 (0.0008) [2023-10-10 05:59:24,032][53268] Updated weights for policy 1, policy_version 32380 (0.0008) [2023-10-10 05:59:26,495][53252] Updated weights for policy 0, policy_version 32420 (0.0008) [2023-10-10 05:59:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 66355200. Throughput: 0: 1685.1, 1: 1697.1. Samples: 16603908. 
Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 05:59:26,784][52050] Avg episode reward: [(0, '20.530'), (1, '17.970')] [2023-10-10 05:59:26,874][53252] Updated weights for policy 0, policy_version 32430 (0.0008) [2023-10-10 05:59:27,250][53252] Updated weights for policy 0, policy_version 32440 (0.0009) [2023-10-10 05:59:28,179][53268] Updated weights for policy 1, policy_version 32390 (0.0008) [2023-10-10 05:59:28,549][53268] Updated weights for policy 1, policy_version 32400 (0.0010) [2023-10-10 05:59:28,905][53268] Updated weights for policy 1, policy_version 32410 (0.0010) [2023-10-10 05:59:31,177][53252] Updated weights for policy 0, policy_version 32450 (0.0009) [2023-10-10 05:59:31,549][53252] Updated weights for policy 0, policy_version 32460 (0.0009) [2023-10-10 05:59:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66420736. Throughput: 0: 1687.5, 1: 1667.2. Samples: 16613106. Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 05:59:31,784][52050] Avg episode reward: [(0, '20.870'), (1, '17.930')] [2023-10-10 05:59:31,914][53252] Updated weights for policy 0, policy_version 32470 (0.0009) [2023-10-10 05:59:32,282][53252] Updated weights for policy 0, policy_version 32480 (0.0009) [2023-10-10 05:59:33,155][53268] Updated weights for policy 1, policy_version 32420 (0.0008) [2023-10-10 05:59:33,526][53268] Updated weights for policy 1, policy_version 32430 (0.0008) [2023-10-10 05:59:33,891][53268] Updated weights for policy 1, policy_version 32440 (0.0009) [2023-10-10 05:59:36,320][53252] Updated weights for policy 0, policy_version 32490 (0.0009) [2023-10-10 05:59:36,690][53252] Updated weights for policy 0, policy_version 32500 (0.0008) [2023-10-10 05:59:36,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66486272. Throughput: 0: 1698.1, 1: 1683.9. Samples: 16633892. 
Policy #0 lag: (min: 31.0, avg: 44.0, max: 63.0) [2023-10-10 05:59:36,784][52050] Avg episode reward: [(0, '20.260'), (1, '17.280')] [2023-10-10 05:59:37,061][53252] Updated weights for policy 0, policy_version 32510 (0.0007) [2023-10-10 05:59:38,008][53268] Updated weights for policy 1, policy_version 32450 (0.0010) [2023-10-10 05:59:38,373][53268] Updated weights for policy 1, policy_version 32460 (0.0011) [2023-10-10 05:59:38,740][53268] Updated weights for policy 1, policy_version 32470 (0.0010) [2023-10-10 05:59:39,111][53268] Updated weights for policy 1, policy_version 32480 (0.0010) [2023-10-10 05:59:41,067][53252] Updated weights for policy 0, policy_version 32520 (0.0008) [2023-10-10 05:59:41,434][53252] Updated weights for policy 0, policy_version 32530 (0.0011) [2023-10-10 05:59:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 66551808. Throughput: 0: 1681.4, 1: 1685.0. Samples: 16653914. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:59:41,784][52050] Avg episode reward: [(0, '20.640'), (1, '17.480')] [2023-10-10 05:59:41,800][53252] Updated weights for policy 0, policy_version 32540 (0.0008) [2023-10-10 05:59:43,127][53268] Updated weights for policy 1, policy_version 32490 (0.0010) [2023-10-10 05:59:43,506][53268] Updated weights for policy 1, policy_version 32500 (0.0011) [2023-10-10 05:59:43,872][53268] Updated weights for policy 1, policy_version 32510 (0.0011) [2023-10-10 05:59:45,779][53252] Updated weights for policy 0, policy_version 32550 (0.0010) [2023-10-10 05:59:46,154][53252] Updated weights for policy 0, policy_version 32560 (0.0011) [2023-10-10 05:59:46,526][53252] Updated weights for policy 0, policy_version 32570 (0.0007) [2023-10-10 05:59:46,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 66650112. Throughput: 0: 1697.2, 1: 1661.0. Samples: 16663816. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:59:46,784][52050] Avg episode reward: [(0, '20.800'), (1, '16.730')] [2023-10-10 05:59:47,873][53268] Updated weights for policy 1, policy_version 32520 (0.0010) [2023-10-10 05:59:48,246][53268] Updated weights for policy 1, policy_version 32530 (0.0007) [2023-10-10 05:59:48,619][53268] Updated weights for policy 1, policy_version 32540 (0.0007) [2023-10-10 05:59:50,569][53252] Updated weights for policy 0, policy_version 32580 (0.0008) [2023-10-10 05:59:50,936][53252] Updated weights for policy 0, policy_version 32590 (0.0007) [2023-10-10 05:59:51,306][53252] Updated weights for policy 0, policy_version 32600 (0.0007) [2023-10-10 05:59:51,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 66715648. Throughput: 0: 1696.0, 1: 1683.8. Samples: 16684800. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:59:51,784][52050] Avg episode reward: [(0, '21.260'), (1, '18.820')] [2023-10-10 05:59:52,681][53268] Updated weights for policy 1, policy_version 32550 (0.0008) [2023-10-10 05:59:53,050][53268] Updated weights for policy 1, policy_version 32560 (0.0007) [2023-10-10 05:59:53,419][53268] Updated weights for policy 1, policy_version 32570 (0.0007) [2023-10-10 05:59:55,318][53252] Updated weights for policy 0, policy_version 32610 (0.0007) [2023-10-10 05:59:55,684][53252] Updated weights for policy 0, policy_version 32620 (0.0007) [2023-10-10 05:59:56,065][53252] Updated weights for policy 0, policy_version 32630 (0.0008) [2023-10-10 05:59:56,429][53252] Updated weights for policy 0, policy_version 32640 (0.0007) [2023-10-10 05:59:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66781184. Throughput: 0: 1667.0, 1: 1685.5. Samples: 16704332. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) [2023-10-10 05:59:56,784][52050] Avg episode reward: [(0, '20.250'), (1, '18.970')] [2023-10-10 05:59:57,572][53268] Updated weights for policy 1, policy_version 32580 (0.0009) [2023-10-10 05:59:57,943][53268] Updated weights for policy 1, policy_version 32590 (0.0010) [2023-10-10 05:59:58,309][53268] Updated weights for policy 1, policy_version 32600 (0.0010) [2023-10-10 06:00:00,434][53252] Updated weights for policy 0, policy_version 32650 (0.0009) [2023-10-10 06:00:00,812][53252] Updated weights for policy 0, policy_version 32660 (0.0008) [2023-10-10 06:00:01,172][53252] Updated weights for policy 0, policy_version 32670 (0.0008) [2023-10-10 06:00:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66846720. Throughput: 0: 1697.2, 1: 1672.8. Samples: 16714566. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:00:01,784][52050] Avg episode reward: [(0, '21.160'), (1, '17.450')] [2023-10-10 06:00:02,375][53268] Updated weights for policy 1, policy_version 32610 (0.0008) [2023-10-10 06:00:02,746][53268] Updated weights for policy 1, policy_version 32620 (0.0008) [2023-10-10 06:00:03,120][53268] Updated weights for policy 1, policy_version 32630 (0.0007) [2023-10-10 06:00:03,497][53268] Updated weights for policy 1, policy_version 32640 (0.0009) [2023-10-10 06:00:05,245][53252] Updated weights for policy 0, policy_version 32680 (0.0009) [2023-10-10 06:00:05,613][53252] Updated weights for policy 0, policy_version 32690 (0.0008) [2023-10-10 06:00:05,982][53252] Updated weights for policy 0, policy_version 32700 (0.0007) [2023-10-10 06:00:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 66912256. Throughput: 0: 1684.8, 1: 1682.4. Samples: 16734998. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:00:06,784][52050] Avg episode reward: [(0, '22.370'), (1, '18.160')] [2023-10-10 06:00:06,785][52846] Saving new best policy, reward=22.370! [2023-10-10 06:00:07,474][53268] Updated weights for policy 1, policy_version 32650 (0.0008) [2023-10-10 06:00:07,839][53268] Updated weights for policy 1, policy_version 32660 (0.0010) [2023-10-10 06:00:08,204][53268] Updated weights for policy 1, policy_version 32670 (0.0009) [2023-10-10 06:00:10,009][53252] Updated weights for policy 0, policy_version 32710 (0.0009) [2023-10-10 06:00:10,389][53252] Updated weights for policy 0, policy_version 32720 (0.0007) [2023-10-10 06:00:10,764][53252] Updated weights for policy 0, policy_version 32730 (0.0007) [2023-10-10 06:00:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 66977792. Throughput: 0: 1670.0, 1: 1689.2. Samples: 16755072. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:00:11,784][52050] Avg episode reward: [(0, '22.550'), (1, '19.870')] [2023-10-10 06:00:11,793][52846] Saving new best policy, reward=22.550! [2023-10-10 06:00:12,328][53268] Updated weights for policy 1, policy_version 32680 (0.0008) [2023-10-10 06:00:12,698][53268] Updated weights for policy 1, policy_version 32690 (0.0008) [2023-10-10 06:00:13,075][53268] Updated weights for policy 1, policy_version 32700 (0.0009) [2023-10-10 06:00:15,059][53252] Updated weights for policy 0, policy_version 32740 (0.0008) [2023-10-10 06:00:15,429][53252] Updated weights for policy 0, policy_version 32750 (0.0008) [2023-10-10 06:00:15,787][53252] Updated weights for policy 0, policy_version 32760 (0.0007) [2023-10-10 06:00:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67043328. Throughput: 0: 1693.3, 1: 1685.8. Samples: 16765164. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:00:16,784][52050] Avg episode reward: [(0, '21.320'), (1, '17.930')] [2023-10-10 06:00:17,096][53268] Updated weights for policy 1, policy_version 32710 (0.0009) [2023-10-10 06:00:17,465][53268] Updated weights for policy 1, policy_version 32720 (0.0009) [2023-10-10 06:00:17,829][53268] Updated weights for policy 1, policy_version 32730 (0.0011) [2023-10-10 06:00:20,024][53252] Updated weights for policy 0, policy_version 32770 (0.0007) [2023-10-10 06:00:20,410][53252] Updated weights for policy 0, policy_version 32780 (0.0010) [2023-10-10 06:00:20,769][53252] Updated weights for policy 0, policy_version 32790 (0.0009) [2023-10-10 06:00:21,147][53252] Updated weights for policy 0, policy_version 32800 (0.0010) [2023-10-10 06:00:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67108864. Throughput: 0: 1674.3, 1: 1687.5. Samples: 16785170. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:00:21,784][52050] Avg episode reward: [(0, '21.010'), (1, '18.910')] [2023-10-10 06:00:21,953][53268] Updated weights for policy 1, policy_version 32740 (0.0008) [2023-10-10 06:00:22,329][53268] Updated weights for policy 1, policy_version 32750 (0.0008) [2023-10-10 06:00:22,694][53268] Updated weights for policy 1, policy_version 32760 (0.0008) [2023-10-10 06:00:25,214][53252] Updated weights for policy 0, policy_version 32810 (0.0007) [2023-10-10 06:00:25,579][53252] Updated weights for policy 0, policy_version 32820 (0.0009) [2023-10-10 06:00:25,958][53252] Updated weights for policy 0, policy_version 32830 (0.0008) [2023-10-10 06:00:26,731][53268] Updated weights for policy 1, policy_version 32770 (0.0011) [2023-10-10 06:00:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67174400. Throughput: 0: 1671.9, 1: 1686.8. Samples: 16805056. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:00:26,784][52050] Avg episode reward: [(0, '20.400'), (1, '20.250')] [2023-10-10 06:00:27,099][53268] Updated weights for policy 1, policy_version 32780 (0.0007) [2023-10-10 06:00:27,467][53268] Updated weights for policy 1, policy_version 32790 (0.0008) [2023-10-10 06:00:27,837][53061] Saving new best policy, reward=20.250! [2023-10-10 06:00:27,842][53268] Updated weights for policy 1, policy_version 32800 (0.0007) [2023-10-10 06:00:30,079][53252] Updated weights for policy 0, policy_version 32840 (0.0009) [2023-10-10 06:00:30,460][53252] Updated weights for policy 0, policy_version 32850 (0.0009) [2023-10-10 06:00:30,831][53252] Updated weights for policy 0, policy_version 32860 (0.0009) [2023-10-10 06:00:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67239936. Throughput: 0: 1682.9, 1: 1685.5. Samples: 16815394. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:00:31,784][52050] Avg episode reward: [(0, '19.410'), (1, '16.960')] [2023-10-10 06:00:31,961][53268] Updated weights for policy 1, policy_version 32810 (0.0009) [2023-10-10 06:00:32,323][53268] Updated weights for policy 1, policy_version 32820 (0.0010) [2023-10-10 06:00:32,694][53268] Updated weights for policy 1, policy_version 32830 (0.0010) [2023-10-10 06:00:34,956][53252] Updated weights for policy 0, policy_version 32870 (0.0009) [2023-10-10 06:00:35,333][53252] Updated weights for policy 0, policy_version 32880 (0.0010) [2023-10-10 06:00:35,695][53252] Updated weights for policy 0, policy_version 32890 (0.0007) [2023-10-10 06:00:36,596][53268] Updated weights for policy 1, policy_version 32840 (0.0009) [2023-10-10 06:00:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 67305472. Throughput: 0: 1661.3, 1: 1684.3. Samples: 16835350. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:00:36,784][52050] Avg episode reward: [(0, '19.780'), (1, '17.260')] [2023-10-10 06:00:36,969][53268] Updated weights for policy 1, policy_version 32850 (0.0009) [2023-10-10 06:00:37,335][53268] Updated weights for policy 1, policy_version 32860 (0.0007) [2023-10-10 06:00:39,661][53252] Updated weights for policy 0, policy_version 32900 (0.0007) [2023-10-10 06:00:40,037][53252] Updated weights for policy 0, policy_version 32910 (0.0010) [2023-10-10 06:00:40,415][53252] Updated weights for policy 0, policy_version 32920 (0.0010) [2023-10-10 06:00:41,515][53268] Updated weights for policy 1, policy_version 32870 (0.0009) [2023-10-10 06:00:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67371008. Throughput: 0: 1671.9, 1: 1682.9. Samples: 16855300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:00:41,784][52050] Avg episode reward: [(0, '20.300'), (1, '18.790')] [2023-10-10 06:00:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000032928_33718272.pth... [2023-10-10 06:00:41,829][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000031360_32112640.pth [2023-10-10 06:00:41,875][53268] Updated weights for policy 1, policy_version 32880 (0.0011) [2023-10-10 06:00:42,251][53268] Updated weights for policy 1, policy_version 32890 (0.0008) [2023-10-10 06:00:42,464][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000032896_33685504.pth... 
[2023-10-10 06:00:42,493][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000031296_32047104.pth [2023-10-10 06:00:44,517][53252] Updated weights for policy 0, policy_version 32930 (0.0010) [2023-10-10 06:00:44,882][53252] Updated weights for policy 0, policy_version 32940 (0.0007) [2023-10-10 06:00:45,263][53252] Updated weights for policy 0, policy_version 32950 (0.0008) [2023-10-10 06:00:45,634][53252] Updated weights for policy 0, policy_version 32960 (0.0009) [2023-10-10 06:00:46,260][53268] Updated weights for policy 1, policy_version 32900 (0.0011) [2023-10-10 06:00:46,633][53268] Updated weights for policy 1, policy_version 32910 (0.0010) [2023-10-10 06:00:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67436544. Throughput: 0: 1672.1, 1: 1685.7. Samples: 16865668. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:00:46,784][52050] Avg episode reward: [(0, '19.830'), (1, '17.080')] [2023-10-10 06:00:47,014][53268] Updated weights for policy 1, policy_version 32920 (0.0010) [2023-10-10 06:00:49,765][53252] Updated weights for policy 0, policy_version 32970 (0.0008) [2023-10-10 06:00:50,135][53252] Updated weights for policy 0, policy_version 32980 (0.0010) [2023-10-10 06:00:50,512][53252] Updated weights for policy 0, policy_version 32990 (0.0010) [2023-10-10 06:00:51,070][53268] Updated weights for policy 1, policy_version 32930 (0.0010) [2023-10-10 06:00:51,433][53268] Updated weights for policy 1, policy_version 32940 (0.0008) [2023-10-10 06:00:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67502080. Throughput: 0: 1658.9, 1: 1679.6. Samples: 16885234. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 06:00:51,784][52050] Avg episode reward: [(0, '19.190'), (1, '18.960')] [2023-10-10 06:00:51,809][53268] Updated weights for policy 1, policy_version 32950 (0.0010) [2023-10-10 06:00:52,170][53268] Updated weights for policy 1, policy_version 32960 (0.0010) [2023-10-10 06:00:54,359][53252] Updated weights for policy 0, policy_version 33000 (0.0007) [2023-10-10 06:00:54,728][53252] Updated weights for policy 0, policy_version 33010 (0.0009) [2023-10-10 06:00:55,100][53252] Updated weights for policy 0, policy_version 33020 (0.0008) [2023-10-10 06:00:56,058][53268] Updated weights for policy 1, policy_version 32970 (0.0008) [2023-10-10 06:00:56,429][53268] Updated weights for policy 1, policy_version 32980 (0.0009) [2023-10-10 06:00:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67567616. Throughput: 0: 1674.1, 1: 1666.8. Samples: 16905412. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 06:00:56,784][52050] Avg episode reward: [(0, '19.920'), (1, '20.020')] [2023-10-10 06:00:56,791][53268] Updated weights for policy 1, policy_version 32990 (0.0009) [2023-10-10 06:00:59,261][53252] Updated weights for policy 0, policy_version 33030 (0.0007) [2023-10-10 06:00:59,640][53252] Updated weights for policy 0, policy_version 33040 (0.0007) [2023-10-10 06:01:00,009][53252] Updated weights for policy 0, policy_version 33050 (0.0007) [2023-10-10 06:01:01,041][53268] Updated weights for policy 1, policy_version 33000 (0.0010) [2023-10-10 06:01:01,411][53268] Updated weights for policy 1, policy_version 33010 (0.0009) [2023-10-10 06:01:01,780][53268] Updated weights for policy 1, policy_version 33020 (0.0010) [2023-10-10 06:01:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67633152. Throughput: 0: 1670.5, 1: 1680.3. Samples: 16915950. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 06:01:01,784][52050] Avg episode reward: [(0, '20.000'), (1, '19.500')] [2023-10-10 06:01:04,043][53252] Updated weights for policy 0, policy_version 33060 (0.0007) [2023-10-10 06:01:04,412][53252] Updated weights for policy 0, policy_version 33070 (0.0008) [2023-10-10 06:01:04,783][53252] Updated weights for policy 0, policy_version 33080 (0.0010) [2023-10-10 06:01:05,919][53268] Updated weights for policy 1, policy_version 33030 (0.0010) [2023-10-10 06:01:06,283][53268] Updated weights for policy 1, policy_version 33040 (0.0008) [2023-10-10 06:01:06,657][53268] Updated weights for policy 1, policy_version 33050 (0.0008) [2023-10-10 06:01:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67698688. Throughput: 0: 1660.4, 1: 1683.5. Samples: 16935646. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 06:01:06,784][52050] Avg episode reward: [(0, '19.650'), (1, '18.950')] [2023-10-10 06:01:08,874][53252] Updated weights for policy 0, policy_version 33090 (0.0009) [2023-10-10 06:01:09,242][53252] Updated weights for policy 0, policy_version 33100 (0.0008) [2023-10-10 06:01:09,615][53252] Updated weights for policy 0, policy_version 33110 (0.0007) [2023-10-10 06:01:09,974][53252] Updated weights for policy 0, policy_version 33120 (0.0008) [2023-10-10 06:01:10,722][53268] Updated weights for policy 1, policy_version 33060 (0.0009) [2023-10-10 06:01:11,087][53268] Updated weights for policy 1, policy_version 33070 (0.0009) [2023-10-10 06:01:11,455][53268] Updated weights for policy 1, policy_version 33080 (0.0008) [2023-10-10 06:01:11,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 67796992. Throughput: 0: 1678.8, 1: 1665.6. Samples: 16955556. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-10 06:01:11,784][52050] Avg episode reward: [(0, '19.620'), (1, '18.320')] [2023-10-10 06:01:13,783][53252] Updated weights for policy 0, policy_version 33130 (0.0008) [2023-10-10 06:01:14,160][53252] Updated weights for policy 0, policy_version 33140 (0.0009) [2023-10-10 06:01:14,540][53252] Updated weights for policy 0, policy_version 33150 (0.0008) [2023-10-10 06:01:15,573][53268] Updated weights for policy 1, policy_version 33090 (0.0008) [2023-10-10 06:01:15,940][53268] Updated weights for policy 1, policy_version 33100 (0.0010) [2023-10-10 06:01:16,311][53268] Updated weights for policy 1, policy_version 33110 (0.0008) [2023-10-10 06:01:16,679][53268] Updated weights for policy 1, policy_version 33120 (0.0007) [2023-10-10 06:01:16,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 67862528. Throughput: 0: 1662.3, 1: 1678.6. Samples: 16965734. Policy #0 lag: (min: 9.0, avg: 17.7, max: 41.0) [2023-10-10 06:01:16,784][52050] Avg episode reward: [(0, '19.870'), (1, '17.920')] [2023-10-10 06:01:18,703][53252] Updated weights for policy 0, policy_version 33160 (0.0008) [2023-10-10 06:01:19,070][53252] Updated weights for policy 0, policy_version 33170 (0.0008) [2023-10-10 06:01:19,445][53252] Updated weights for policy 0, policy_version 33180 (0.0007) [2023-10-10 06:01:20,795][53268] Updated weights for policy 1, policy_version 33130 (0.0010) [2023-10-10 06:01:21,160][53268] Updated weights for policy 1, policy_version 33140 (0.0009) [2023-10-10 06:01:21,538][53268] Updated weights for policy 1, policy_version 33150 (0.0010) [2023-10-10 06:01:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 67928064. Throughput: 0: 1676.8, 1: 1678.8. Samples: 16986352. 
Policy #0 lag: (min: 9.0, avg: 17.7, max: 41.0) [2023-10-10 06:01:21,784][52050] Avg episode reward: [(0, '20.260'), (1, '17.790')] [2023-10-10 06:01:23,452][53252] Updated weights for policy 0, policy_version 33190 (0.0009) [2023-10-10 06:01:23,814][53252] Updated weights for policy 0, policy_version 33200 (0.0008) [2023-10-10 06:01:24,200][53252] Updated weights for policy 0, policy_version 33210 (0.0008) [2023-10-10 06:01:25,645][53268] Updated weights for policy 1, policy_version 33160 (0.0010) [2023-10-10 06:01:26,013][53268] Updated weights for policy 1, policy_version 33170 (0.0009) [2023-10-10 06:01:26,383][53268] Updated weights for policy 1, policy_version 33180 (0.0009) [2023-10-10 06:01:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 67993600. Throughput: 0: 1694.0, 1: 1664.7. Samples: 17006442. Policy #0 lag: (min: 9.0, avg: 17.7, max: 41.0) [2023-10-10 06:01:26,784][52050] Avg episode reward: [(0, '18.940'), (1, '18.110')] [2023-10-10 06:01:28,260][53252] Updated weights for policy 0, policy_version 33220 (0.0008) [2023-10-10 06:01:28,627][53252] Updated weights for policy 0, policy_version 33230 (0.0007) [2023-10-10 06:01:29,007][53252] Updated weights for policy 0, policy_version 33240 (0.0007) [2023-10-10 06:01:30,229][53268] Updated weights for policy 1, policy_version 33190 (0.0009) [2023-10-10 06:01:30,603][53268] Updated weights for policy 1, policy_version 33200 (0.0008) [2023-10-10 06:01:30,971][53268] Updated weights for policy 1, policy_version 33210 (0.0008) [2023-10-10 06:01:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 68059136. Throughput: 0: 1666.3, 1: 1682.9. Samples: 17016380. 
Policy #0 lag: (min: 9.0, avg: 17.7, max: 41.0) [2023-10-10 06:01:31,784][52050] Avg episode reward: [(0, '19.730'), (1, '18.360')] [2023-10-10 06:01:33,021][53252] Updated weights for policy 0, policy_version 33250 (0.0007) [2023-10-10 06:01:33,401][53252] Updated weights for policy 0, policy_version 33260 (0.0009) [2023-10-10 06:01:33,765][53252] Updated weights for policy 0, policy_version 33270 (0.0008) [2023-10-10 06:01:34,144][53252] Updated weights for policy 0, policy_version 33280 (0.0008) [2023-10-10 06:01:34,913][53268] Updated weights for policy 1, policy_version 33220 (0.0008) [2023-10-10 06:01:35,281][53268] Updated weights for policy 1, policy_version 33230 (0.0007) [2023-10-10 06:01:35,644][53268] Updated weights for policy 1, policy_version 33240 (0.0009) [2023-10-10 06:01:36,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 68124672. Throughput: 0: 1695.7, 1: 1679.2. Samples: 17037108. Policy #0 lag: (min: 9.0, avg: 17.7, max: 41.0) [2023-10-10 06:01:36,785][52050] Avg episode reward: [(0, '19.580'), (1, '18.170')] [2023-10-10 06:01:38,228][53252] Updated weights for policy 0, policy_version 33290 (0.0010) [2023-10-10 06:01:38,614][53252] Updated weights for policy 0, policy_version 33300 (0.0007) [2023-10-10 06:01:38,983][53252] Updated weights for policy 0, policy_version 33310 (0.0007) [2023-10-10 06:01:39,704][53268] Updated weights for policy 1, policy_version 33250 (0.0009) [2023-10-10 06:01:40,072][53268] Updated weights for policy 1, policy_version 33260 (0.0007) [2023-10-10 06:01:40,452][53268] Updated weights for policy 1, policy_version 33270 (0.0008) [2023-10-10 06:01:40,813][53268] Updated weights for policy 1, policy_version 33280 (0.0010) [2023-10-10 06:01:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68190208. Throughput: 0: 1694.1, 1: 1669.7. Samples: 17056782. 
Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) [2023-10-10 06:01:41,784][52050] Avg episode reward: [(0, '20.920'), (1, '18.490')] [2023-10-10 06:01:43,021][53252] Updated weights for policy 0, policy_version 33320 (0.0007) [2023-10-10 06:01:43,396][53252] Updated weights for policy 0, policy_version 33330 (0.0007) [2023-10-10 06:01:43,769][53252] Updated weights for policy 0, policy_version 33340 (0.0007) [2023-10-10 06:01:44,917][53268] Updated weights for policy 1, policy_version 33290 (0.0009) [2023-10-10 06:01:45,283][53268] Updated weights for policy 1, policy_version 33300 (0.0010) [2023-10-10 06:01:45,656][53268] Updated weights for policy 1, policy_version 33310 (0.0008) [2023-10-10 06:01:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68255744. Throughput: 0: 1672.3, 1: 1691.2. Samples: 17067306. Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) [2023-10-10 06:01:46,784][52050] Avg episode reward: [(0, '20.800'), (1, '18.690')] [2023-10-10 06:01:47,678][53252] Updated weights for policy 0, policy_version 33350 (0.0009) [2023-10-10 06:01:48,052][53252] Updated weights for policy 0, policy_version 33360 (0.0009) [2023-10-10 06:01:48,421][53252] Updated weights for policy 0, policy_version 33370 (0.0011) [2023-10-10 06:01:49,781][53268] Updated weights for policy 1, policy_version 33320 (0.0008) [2023-10-10 06:01:50,155][53268] Updated weights for policy 1, policy_version 33330 (0.0008) [2023-10-10 06:01:50,527][53268] Updated weights for policy 1, policy_version 33340 (0.0009) [2023-10-10 06:01:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68321280. Throughput: 0: 1698.0, 1: 1670.4. Samples: 17087220. 
Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) [2023-10-10 06:01:51,784][52050] Avg episode reward: [(0, '19.800'), (1, '18.420')] [2023-10-10 06:01:52,437][53252] Updated weights for policy 0, policy_version 33380 (0.0011) [2023-10-10 06:01:52,807][53252] Updated weights for policy 0, policy_version 33390 (0.0010) [2023-10-10 06:01:53,172][53252] Updated weights for policy 0, policy_version 33400 (0.0011) [2023-10-10 06:01:54,462][53268] Updated weights for policy 1, policy_version 33350 (0.0010) [2023-10-10 06:01:54,830][53268] Updated weights for policy 1, policy_version 33360 (0.0008) [2023-10-10 06:01:55,194][53268] Updated weights for policy 1, policy_version 33370 (0.0007) [2023-10-10 06:01:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68386816. Throughput: 0: 1696.6, 1: 1681.3. Samples: 17107562. Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) [2023-10-10 06:01:56,785][52050] Avg episode reward: [(0, '21.690'), (1, '18.840')] [2023-10-10 06:01:57,254][53252] Updated weights for policy 0, policy_version 33410 (0.0010) [2023-10-10 06:01:57,628][53252] Updated weights for policy 0, policy_version 33420 (0.0007) [2023-10-10 06:01:58,003][53252] Updated weights for policy 0, policy_version 33430 (0.0009) [2023-10-10 06:01:58,372][53252] Updated weights for policy 0, policy_version 33440 (0.0009) [2023-10-10 06:01:59,245][53268] Updated weights for policy 1, policy_version 33380 (0.0009) [2023-10-10 06:01:59,624][53268] Updated weights for policy 1, policy_version 33390 (0.0009) [2023-10-10 06:01:59,989][53268] Updated weights for policy 1, policy_version 33400 (0.0010) [2023-10-10 06:02:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68452352. Throughput: 0: 1685.7, 1: 1694.5. Samples: 17117844. 
Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) [2023-10-10 06:02:01,784][52050] Avg episode reward: [(0, '22.280'), (1, '18.450')] [2023-10-10 06:02:02,591][53252] Updated weights for policy 0, policy_version 33450 (0.0008) [2023-10-10 06:02:02,960][53252] Updated weights for policy 0, policy_version 33460 (0.0009) [2023-10-10 06:02:03,323][53252] Updated weights for policy 0, policy_version 33470 (0.0008) [2023-10-10 06:02:04,072][53268] Updated weights for policy 1, policy_version 33410 (0.0008) [2023-10-10 06:02:04,450][53268] Updated weights for policy 1, policy_version 33420 (0.0007) [2023-10-10 06:02:04,828][53268] Updated weights for policy 1, policy_version 33430 (0.0009) [2023-10-10 06:02:05,192][53268] Updated weights for policy 1, policy_version 33440 (0.0007) [2023-10-10 06:02:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68517888. Throughput: 0: 1686.8, 1: 1671.2. Samples: 17137462. Policy #0 lag: (min: 14.0, avg: 21.3, max: 46.0) [2023-10-10 06:02:06,784][52050] Avg episode reward: [(0, '21.710'), (1, '18.490')] [2023-10-10 06:02:07,521][53252] Updated weights for policy 0, policy_version 33480 (0.0008) [2023-10-10 06:02:07,885][53252] Updated weights for policy 0, policy_version 33490 (0.0010) [2023-10-10 06:02:08,255][53252] Updated weights for policy 0, policy_version 33500 (0.0009) [2023-10-10 06:02:09,188][53268] Updated weights for policy 1, policy_version 33450 (0.0007) [2023-10-10 06:02:09,545][53268] Updated weights for policy 1, policy_version 33460 (0.0008) [2023-10-10 06:02:09,916][53268] Updated weights for policy 1, policy_version 33470 (0.0008) [2023-10-10 06:02:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68583424. Throughput: 0: 1684.7, 1: 1684.7. Samples: 17158064. 
Policy #0 lag: (min: 14.0, avg: 21.3, max: 46.0) [2023-10-10 06:02:11,784][52050] Avg episode reward: [(0, '20.940'), (1, '18.080')] [2023-10-10 06:02:12,223][53252] Updated weights for policy 0, policy_version 33510 (0.0011) [2023-10-10 06:02:12,599][53252] Updated weights for policy 0, policy_version 33520 (0.0010) [2023-10-10 06:02:12,972][53252] Updated weights for policy 0, policy_version 33530 (0.0007) [2023-10-10 06:02:14,093][53268] Updated weights for policy 1, policy_version 33480 (0.0009) [2023-10-10 06:02:14,477][53268] Updated weights for policy 1, policy_version 33490 (0.0010) [2023-10-10 06:02:14,840][53268] Updated weights for policy 1, policy_version 33500 (0.0009) [2023-10-10 06:02:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 68648960. Throughput: 0: 1684.8, 1: 1684.8. Samples: 17168012. Policy #0 lag: (min: 14.0, avg: 21.3, max: 46.0) [2023-10-10 06:02:16,784][52050] Avg episode reward: [(0, '20.150'), (1, '18.340')] [2023-10-10 06:02:17,133][53252] Updated weights for policy 0, policy_version 33540 (0.0007) [2023-10-10 06:02:17,511][53252] Updated weights for policy 0, policy_version 33550 (0.0007) [2023-10-10 06:02:17,886][53252] Updated weights for policy 0, policy_version 33560 (0.0007) [2023-10-10 06:02:18,963][53268] Updated weights for policy 1, policy_version 33510 (0.0010) [2023-10-10 06:02:19,330][53268] Updated weights for policy 1, policy_version 33520 (0.0010) [2023-10-10 06:02:19,704][53268] Updated weights for policy 1, policy_version 33530 (0.0009) [2023-10-10 06:02:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68714496. Throughput: 0: 1679.9, 1: 1673.7. Samples: 17188016. 
Policy #0 lag: (min: 14.0, avg: 21.3, max: 46.0) [2023-10-10 06:02:21,784][52050] Avg episode reward: [(0, '20.200'), (1, '19.130')] [2023-10-10 06:02:21,986][53252] Updated weights for policy 0, policy_version 33570 (0.0009) [2023-10-10 06:02:22,358][53252] Updated weights for policy 0, policy_version 33580 (0.0007) [2023-10-10 06:02:22,719][53252] Updated weights for policy 0, policy_version 33590 (0.0007) [2023-10-10 06:02:23,092][53252] Updated weights for policy 0, policy_version 33600 (0.0008) [2023-10-10 06:02:23,709][53268] Updated weights for policy 1, policy_version 33540 (0.0008) [2023-10-10 06:02:24,068][53268] Updated weights for policy 1, policy_version 33550 (0.0007) [2023-10-10 06:02:24,431][53268] Updated weights for policy 1, policy_version 33560 (0.0008) [2023-10-10 06:02:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68780032. Throughput: 0: 1687.5, 1: 1697.7. Samples: 17209112. Policy #0 lag: (min: 14.0, avg: 21.3, max: 46.0) [2023-10-10 06:02:26,784][52050] Avg episode reward: [(0, '20.260'), (1, '17.970')] [2023-10-10 06:02:27,161][53252] Updated weights for policy 0, policy_version 33610 (0.0007) [2023-10-10 06:02:27,525][53252] Updated weights for policy 0, policy_version 33620 (0.0009) [2023-10-10 06:02:27,905][53252] Updated weights for policy 0, policy_version 33630 (0.0007) [2023-10-10 06:02:28,294][53268] Updated weights for policy 1, policy_version 33570 (0.0010) [2023-10-10 06:02:28,661][53268] Updated weights for policy 1, policy_version 33580 (0.0009) [2023-10-10 06:02:29,020][53268] Updated weights for policy 1, policy_version 33590 (0.0010) [2023-10-10 06:02:29,381][53268] Updated weights for policy 1, policy_version 33600 (0.0009) [2023-10-10 06:02:31,692][53252] Updated weights for policy 0, policy_version 33640 (0.0009) [2023-10-10 06:02:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68845568. Throughput: 0: 1686.0, 1: 1675.3. 
Samples: 17218564. Policy #0 lag: (min: 17.0, avg: 39.9, max: 49.0) [2023-10-10 06:02:31,784][52050] Avg episode reward: [(0, '20.320'), (1, '18.260')] [2023-10-10 06:02:32,063][53252] Updated weights for policy 0, policy_version 33650 (0.0008) [2023-10-10 06:02:32,425][53252] Updated weights for policy 0, policy_version 33660 (0.0009) [2023-10-10 06:02:33,438][53268] Updated weights for policy 1, policy_version 33610 (0.0010) [2023-10-10 06:02:33,804][53268] Updated weights for policy 1, policy_version 33620 (0.0009) [2023-10-10 06:02:34,181][53268] Updated weights for policy 1, policy_version 33630 (0.0008) [2023-10-10 06:02:36,392][53252] Updated weights for policy 0, policy_version 33670 (0.0009) [2023-10-10 06:02:36,779][53252] Updated weights for policy 0, policy_version 33680 (0.0010) [2023-10-10 06:02:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 68911104. Throughput: 0: 1688.8, 1: 1685.3. Samples: 17239056. Policy #0 lag: (min: 17.0, avg: 39.9, max: 49.0) [2023-10-10 06:02:36,784][52050] Avg episode reward: [(0, '20.020'), (1, '18.400')] [2023-10-10 06:02:37,147][53252] Updated weights for policy 0, policy_version 33690 (0.0011) [2023-10-10 06:02:38,185][53268] Updated weights for policy 1, policy_version 33640 (0.0009) [2023-10-10 06:02:38,556][53268] Updated weights for policy 1, policy_version 33650 (0.0007) [2023-10-10 06:02:38,911][53268] Updated weights for policy 1, policy_version 33660 (0.0009) [2023-10-10 06:02:41,427][53252] Updated weights for policy 0, policy_version 33700 (0.0009) [2023-10-10 06:02:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68976640. Throughput: 0: 1683.1, 1: 1699.0. Samples: 17259754. 
Policy #0 lag: (min: 17.0, avg: 39.9, max: 49.0) [2023-10-10 06:02:41,784][52050] Avg episode reward: [(0, '19.280'), (1, '18.500')] [2023-10-10 06:02:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000033664_34471936.pth... [2023-10-10 06:02:41,802][53252] Updated weights for policy 0, policy_version 33710 (0.0009) [2023-10-10 06:02:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000032096_32866304.pth [2023-10-10 06:02:42,167][53252] Updated weights for policy 0, policy_version 33720 (0.0007) [2023-10-10 06:02:42,466][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000033728_34537472.pth... [2023-10-10 06:02:42,495][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000032128_32899072.pth [2023-10-10 06:02:42,980][53268] Updated weights for policy 1, policy_version 33670 (0.0009) [2023-10-10 06:02:43,342][53268] Updated weights for policy 1, policy_version 33680 (0.0010) [2023-10-10 06:02:43,714][53268] Updated weights for policy 1, policy_version 33690 (0.0008) [2023-10-10 06:02:46,219][53252] Updated weights for policy 0, policy_version 33730 (0.0007) [2023-10-10 06:02:46,595][53252] Updated weights for policy 0, policy_version 33740 (0.0007) [2023-10-10 06:02:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69042176. Throughput: 0: 1688.4, 1: 1674.0. Samples: 17269148. 
Policy #0 lag: (min: 17.0, avg: 39.9, max: 49.0) [2023-10-10 06:02:46,784][52050] Avg episode reward: [(0, '18.400'), (1, '18.640')] [2023-10-10 06:02:46,972][53252] Updated weights for policy 0, policy_version 33750 (0.0009) [2023-10-10 06:02:47,343][53252] Updated weights for policy 0, policy_version 33760 (0.0008) [2023-10-10 06:02:47,722][53268] Updated weights for policy 1, policy_version 33700 (0.0007) [2023-10-10 06:02:48,080][53268] Updated weights for policy 1, policy_version 33710 (0.0010) [2023-10-10 06:02:48,441][53268] Updated weights for policy 1, policy_version 33720 (0.0010) [2023-10-10 06:02:51,480][53252] Updated weights for policy 0, policy_version 33770 (0.0007) [2023-10-10 06:02:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69107712. Throughput: 0: 1690.1, 1: 1701.9. Samples: 17290104. Policy #0 lag: (min: 17.0, avg: 39.9, max: 49.0) [2023-10-10 06:02:51,784][52050] Avg episode reward: [(0, '18.420'), (1, '17.440')] [2023-10-10 06:02:51,858][53252] Updated weights for policy 0, policy_version 33780 (0.0007) [2023-10-10 06:02:52,234][53252] Updated weights for policy 0, policy_version 33790 (0.0007) [2023-10-10 06:02:52,383][53268] Updated weights for policy 1, policy_version 33730 (0.0009) [2023-10-10 06:02:52,755][53268] Updated weights for policy 1, policy_version 33740 (0.0009) [2023-10-10 06:02:53,124][53268] Updated weights for policy 1, policy_version 33750 (0.0008) [2023-10-10 06:02:53,498][53268] Updated weights for policy 1, policy_version 33760 (0.0009) [2023-10-10 06:02:56,195][53252] Updated weights for policy 0, policy_version 33800 (0.0009) [2023-10-10 06:02:56,566][53252] Updated weights for policy 0, policy_version 33810 (0.0009) [2023-10-10 06:02:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 69173248. Throughput: 0: 1676.1, 1: 1702.7. Samples: 17310110. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:02:56,784][52050] Avg episode reward: [(0, '19.690'), (1, '17.240')] [2023-10-10 06:02:56,945][53252] Updated weights for policy 0, policy_version 33820 (0.0011) [2023-10-10 06:02:57,444][53268] Updated weights for policy 1, policy_version 33770 (0.0008) [2023-10-10 06:02:57,815][53268] Updated weights for policy 1, policy_version 33780 (0.0008) [2023-10-10 06:02:58,192][53268] Updated weights for policy 1, policy_version 33790 (0.0010) [2023-10-10 06:03:01,176][53252] Updated weights for policy 0, policy_version 33830 (0.0009) [2023-10-10 06:03:01,554][53252] Updated weights for policy 0, policy_version 33840 (0.0007) [2023-10-10 06:03:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69238784. Throughput: 0: 1686.0, 1: 1684.5. Samples: 17319686. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:03:01,784][52050] Avg episode reward: [(0, '20.560'), (1, '16.900')] [2023-10-10 06:03:01,925][53252] Updated weights for policy 0, policy_version 33850 (0.0008) [2023-10-10 06:03:02,329][53268] Updated weights for policy 1, policy_version 33800 (0.0008) [2023-10-10 06:03:02,700][53268] Updated weights for policy 1, policy_version 33810 (0.0007) [2023-10-10 06:03:03,067][53268] Updated weights for policy 1, policy_version 33820 (0.0008) [2023-10-10 06:03:06,093][53252] Updated weights for policy 0, policy_version 33860 (0.0008) [2023-10-10 06:03:06,468][53252] Updated weights for policy 0, policy_version 33870 (0.0009) [2023-10-10 06:03:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69304320. Throughput: 0: 1682.1, 1: 1699.6. Samples: 17340196. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:03:06,784][52050] Avg episode reward: [(0, '20.680'), (1, '17.730')] [2023-10-10 06:03:06,834][53252] Updated weights for policy 0, policy_version 33880 (0.0008) [2023-10-10 06:03:06,978][53268] Updated weights for policy 1, policy_version 33830 (0.0008) [2023-10-10 06:03:07,343][53268] Updated weights for policy 1, policy_version 33840 (0.0010) [2023-10-10 06:03:07,703][53268] Updated weights for policy 1, policy_version 33850 (0.0010) [2023-10-10 06:03:10,828][53252] Updated weights for policy 0, policy_version 33890 (0.0008) [2023-10-10 06:03:11,194][53252] Updated weights for policy 0, policy_version 33900 (0.0008) [2023-10-10 06:03:11,572][53252] Updated weights for policy 0, policy_version 33910 (0.0009) [2023-10-10 06:03:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 69369856. Throughput: 0: 1660.2, 1: 1700.4. Samples: 17360338. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:03:11,784][52050] Avg episode reward: [(0, '21.160'), (1, '18.080')] [2023-10-10 06:03:11,902][53268] Updated weights for policy 1, policy_version 33860 (0.0008) [2023-10-10 06:03:11,938][53252] Updated weights for policy 0, policy_version 33920 (0.0007) [2023-10-10 06:03:12,274][53268] Updated weights for policy 1, policy_version 33870 (0.0008) [2023-10-10 06:03:12,633][53268] Updated weights for policy 1, policy_version 33880 (0.0008) [2023-10-10 06:03:16,094][53252] Updated weights for policy 0, policy_version 33930 (0.0009) [2023-10-10 06:03:16,465][53252] Updated weights for policy 0, policy_version 33940 (0.0009) [2023-10-10 06:03:16,761][53268] Updated weights for policy 1, policy_version 33890 (0.0009) [2023-10-10 06:03:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 69435392. Throughput: 0: 1673.8, 1: 1691.0. Samples: 17369980. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:03:16,784][52050] Avg episode reward: [(0, '20.850'), (1, '18.490')] [2023-10-10 06:03:16,839][53252] Updated weights for policy 0, policy_version 33950 (0.0007) [2023-10-10 06:03:17,125][53268] Updated weights for policy 1, policy_version 33900 (0.0008) [2023-10-10 06:03:17,497][53268] Updated weights for policy 1, policy_version 33910 (0.0011) [2023-10-10 06:03:17,855][53268] Updated weights for policy 1, policy_version 33920 (0.0010) [2023-10-10 06:03:21,037][53252] Updated weights for policy 0, policy_version 33960 (0.0007) [2023-10-10 06:03:21,416][53252] Updated weights for policy 0, policy_version 33970 (0.0008) [2023-10-10 06:03:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69500928. Throughput: 0: 1664.8, 1: 1700.8. Samples: 17390504. Policy #0 lag: (min: 6.0, avg: 29.6, max: 32.0) [2023-10-10 06:03:21,784][52050] Avg episode reward: [(0, '21.580'), (1, '19.610')] [2023-10-10 06:03:21,793][53252] Updated weights for policy 0, policy_version 33980 (0.0007) [2023-10-10 06:03:21,985][53268] Updated weights for policy 1, policy_version 33930 (0.0009) [2023-10-10 06:03:22,353][53268] Updated weights for policy 1, policy_version 33940 (0.0011) [2023-10-10 06:03:22,719][53268] Updated weights for policy 1, policy_version 33950 (0.0008) [2023-10-10 06:03:25,911][53252] Updated weights for policy 0, policy_version 33990 (0.0008) [2023-10-10 06:03:26,287][53252] Updated weights for policy 0, policy_version 34000 (0.0010) [2023-10-10 06:03:26,648][53252] Updated weights for policy 0, policy_version 34010 (0.0009) [2023-10-10 06:03:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69566464. Throughput: 0: 1650.5, 1: 1695.3. Samples: 17410314. 
Policy #0 lag: (min: 6.0, avg: 29.6, max: 32.0) [2023-10-10 06:03:26,784][52050] Avg episode reward: [(0, '19.570'), (1, '18.800')] [2023-10-10 06:03:26,889][53268] Updated weights for policy 1, policy_version 33960 (0.0008) [2023-10-10 06:03:27,274][53268] Updated weights for policy 1, policy_version 33970 (0.0008) [2023-10-10 06:03:27,638][53268] Updated weights for policy 1, policy_version 33980 (0.0008) [2023-10-10 06:03:30,602][53252] Updated weights for policy 0, policy_version 34020 (0.0008) [2023-10-10 06:03:30,975][53252] Updated weights for policy 0, policy_version 34030 (0.0010) [2023-10-10 06:03:31,347][53252] Updated weights for policy 0, policy_version 34040 (0.0007) [2023-10-10 06:03:31,761][53268] Updated weights for policy 1, policy_version 33990 (0.0008) [2023-10-10 06:03:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 69664768. Throughput: 0: 1663.5, 1: 1688.7. Samples: 17419996. Policy #0 lag: (min: 6.0, avg: 29.6, max: 32.0) [2023-10-10 06:03:31,784][52050] Avg episode reward: [(0, '19.890'), (1, '19.280')] [2023-10-10 06:03:32,123][53268] Updated weights for policy 1, policy_version 34000 (0.0010) [2023-10-10 06:03:32,496][53268] Updated weights for policy 1, policy_version 34010 (0.0010) [2023-10-10 06:03:35,348][53252] Updated weights for policy 0, policy_version 34050 (0.0007) [2023-10-10 06:03:35,725][53252] Updated weights for policy 0, policy_version 34060 (0.0009) [2023-10-10 06:03:36,097][53252] Updated weights for policy 0, policy_version 34070 (0.0007) [2023-10-10 06:03:36,475][53252] Updated weights for policy 0, policy_version 34080 (0.0008) [2023-10-10 06:03:36,693][53268] Updated weights for policy 1, policy_version 34020 (0.0009) [2023-10-10 06:03:36,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 69730304. Throughput: 0: 1665.7, 1: 1682.4. Samples: 17440770. 
Policy #0 lag: (min: 6.0, avg: 29.6, max: 32.0) [2023-10-10 06:03:36,784][52050] Avg episode reward: [(0, '21.180'), (1, '19.480')] [2023-10-10 06:03:37,047][53268] Updated weights for policy 1, policy_version 34030 (0.0009) [2023-10-10 06:03:37,412][53268] Updated weights for policy 1, policy_version 34040 (0.0009) [2023-10-10 06:03:40,302][53252] Updated weights for policy 0, policy_version 34090 (0.0007) [2023-10-10 06:03:40,679][53252] Updated weights for policy 0, policy_version 34100 (0.0007) [2023-10-10 06:03:41,056][53252] Updated weights for policy 0, policy_version 34110 (0.0008) [2023-10-10 06:03:41,432][53268] Updated weights for policy 1, policy_version 34050 (0.0009) [2023-10-10 06:03:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 69795840. Throughput: 0: 1655.1, 1: 1686.9. Samples: 17460500. Policy #0 lag: (min: 10.0, avg: 14.3, max: 42.0) [2023-10-10 06:03:41,784][52050] Avg episode reward: [(0, '21.420'), (1, '18.670')] [2023-10-10 06:03:41,800][53268] Updated weights for policy 1, policy_version 34060 (0.0008) [2023-10-10 06:03:42,159][53268] Updated weights for policy 1, policy_version 34070 (0.0008) [2023-10-10 06:03:42,531][53268] Updated weights for policy 1, policy_version 34080 (0.0008) [2023-10-10 06:03:45,029][53252] Updated weights for policy 0, policy_version 34120 (0.0009) [2023-10-10 06:03:45,397][53252] Updated weights for policy 0, policy_version 34130 (0.0009) [2023-10-10 06:03:45,768][53252] Updated weights for policy 0, policy_version 34140 (0.0007) [2023-10-10 06:03:46,434][53268] Updated weights for policy 1, policy_version 34090 (0.0007) [2023-10-10 06:03:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 69861376. Throughput: 0: 1678.0, 1: 1685.1. Samples: 17471024. 
Policy #0 lag: (min: 10.0, avg: 14.3, max: 42.0) [2023-10-10 06:03:46,784][52050] Avg episode reward: [(0, '21.220'), (1, '17.940')] [2023-10-10 06:03:46,802][53268] Updated weights for policy 1, policy_version 34100 (0.0011) [2023-10-10 06:03:47,168][53268] Updated weights for policy 1, policy_version 34110 (0.0010) [2023-10-10 06:03:49,826][53252] Updated weights for policy 0, policy_version 34150 (0.0007) [2023-10-10 06:03:50,186][53252] Updated weights for policy 0, policy_version 34160 (0.0008) [2023-10-10 06:03:50,564][53252] Updated weights for policy 0, policy_version 34170 (0.0009) [2023-10-10 06:03:51,247][53268] Updated weights for policy 1, policy_version 34120 (0.0010) [2023-10-10 06:03:51,620][53268] Updated weights for policy 1, policy_version 34130 (0.0009) [2023-10-10 06:03:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 69926912. Throughput: 0: 1665.0, 1: 1689.8. Samples: 17491164. Policy #0 lag: (min: 10.0, avg: 14.3, max: 42.0) [2023-10-10 06:03:51,784][52050] Avg episode reward: [(0, '22.120'), (1, '18.220')] [2023-10-10 06:03:51,988][53268] Updated weights for policy 1, policy_version 34140 (0.0008) [2023-10-10 06:03:54,644][53252] Updated weights for policy 0, policy_version 34180 (0.0010) [2023-10-10 06:03:55,023][53252] Updated weights for policy 0, policy_version 34190 (0.0009) [2023-10-10 06:03:55,389][53252] Updated weights for policy 0, policy_version 34200 (0.0009) [2023-10-10 06:03:56,033][53268] Updated weights for policy 1, policy_version 34150 (0.0012) [2023-10-10 06:03:56,407][53268] Updated weights for policy 1, policy_version 34160 (0.0010) [2023-10-10 06:03:56,763][53268] Updated weights for policy 1, policy_version 34170 (0.0011) [2023-10-10 06:03:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 69992448. Throughput: 0: 1674.0, 1: 1677.7. Samples: 17511162. 
Policy #0 lag: (min: 10.0, avg: 14.3, max: 42.0) [2023-10-10 06:03:56,784][52050] Avg episode reward: [(0, '20.060'), (1, '18.090')] [2023-10-10 06:03:59,426][53252] Updated weights for policy 0, policy_version 34210 (0.0007) [2023-10-10 06:03:59,793][53252] Updated weights for policy 0, policy_version 34220 (0.0009) [2023-10-10 06:04:00,167][53252] Updated weights for policy 0, policy_version 34230 (0.0009) [2023-10-10 06:04:00,539][53252] Updated weights for policy 0, policy_version 34240 (0.0010) [2023-10-10 06:04:00,717][53268] Updated weights for policy 1, policy_version 34180 (0.0009) [2023-10-10 06:04:01,091][53268] Updated weights for policy 1, policy_version 34190 (0.0009) [2023-10-10 06:04:01,455][53268] Updated weights for policy 1, policy_version 34200 (0.0007) [2023-10-10 06:04:01,783][52050] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 70090752. Throughput: 0: 1688.3, 1: 1686.8. Samples: 17521862. Policy #0 lag: (min: 10.0, avg: 14.3, max: 42.0) [2023-10-10 06:04:01,784][52050] Avg episode reward: [(0, '18.850'), (1, '18.540')] [2023-10-10 06:04:04,779][53252] Updated weights for policy 0, policy_version 34250 (0.0009) [2023-10-10 06:04:05,164][53252] Updated weights for policy 0, policy_version 34260 (0.0008) [2023-10-10 06:04:05,419][53268] Updated weights for policy 1, policy_version 34210 (0.0010) [2023-10-10 06:04:05,534][53252] Updated weights for policy 0, policy_version 34270 (0.0007) [2023-10-10 06:04:05,783][53268] Updated weights for policy 1, policy_version 34220 (0.0008) [2023-10-10 06:04:06,146][53268] Updated weights for policy 1, policy_version 34230 (0.0008) [2023-10-10 06:04:06,507][53268] Updated weights for policy 1, policy_version 34240 (0.0009) [2023-10-10 06:04:06,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 70156288. Throughput: 0: 1666.1, 1: 1687.1. Samples: 17541396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:04:06,784][52050] Avg episode reward: [(0, '19.780'), (1, '19.580')] [2023-10-10 06:04:09,526][53252] Updated weights for policy 0, policy_version 34280 (0.0009) [2023-10-10 06:04:09,904][53252] Updated weights for policy 0, policy_version 34290 (0.0008) [2023-10-10 06:04:10,276][53252] Updated weights for policy 0, policy_version 34300 (0.0008) [2023-10-10 06:04:10,675][53268] Updated weights for policy 1, policy_version 34250 (0.0011) [2023-10-10 06:04:11,041][53268] Updated weights for policy 1, policy_version 34260 (0.0011) [2023-10-10 06:04:11,415][53268] Updated weights for policy 1, policy_version 34270 (0.0009) [2023-10-10 06:04:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 70221824. Throughput: 0: 1678.7, 1: 1666.2. Samples: 17560832. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:04:11,784][52050] Avg episode reward: [(0, '18.940'), (1, '18.690')] [2023-10-10 06:04:14,260][53252] Updated weights for policy 0, policy_version 34310 (0.0009) [2023-10-10 06:04:14,639][53252] Updated weights for policy 0, policy_version 34320 (0.0009) [2023-10-10 06:04:15,009][53252] Updated weights for policy 0, policy_version 34330 (0.0008) [2023-10-10 06:04:15,600][53268] Updated weights for policy 1, policy_version 34280 (0.0009) [2023-10-10 06:04:15,983][53268] Updated weights for policy 1, policy_version 34290 (0.0007) [2023-10-10 06:04:16,356][53268] Updated weights for policy 1, policy_version 34300 (0.0008) [2023-10-10 06:04:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 70287360. Throughput: 0: 1680.7, 1: 1692.8. Samples: 17571802. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:04:16,784][52050] Avg episode reward: [(0, '19.670'), (1, '19.140')] [2023-10-10 06:04:19,248][53252] Updated weights for policy 0, policy_version 34340 (0.0007) [2023-10-10 06:04:19,623][53252] Updated weights for policy 0, policy_version 34350 (0.0007) [2023-10-10 06:04:20,000][53252] Updated weights for policy 0, policy_version 34360 (0.0008) [2023-10-10 06:04:20,192][53268] Updated weights for policy 1, policy_version 34310 (0.0008) [2023-10-10 06:04:20,564][53268] Updated weights for policy 1, policy_version 34320 (0.0007) [2023-10-10 06:04:20,927][53268] Updated weights for policy 1, policy_version 34330 (0.0008) [2023-10-10 06:04:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 70352896. Throughput: 0: 1660.5, 1: 1687.8. Samples: 17591444. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:04:21,784][52050] Avg episode reward: [(0, '20.080'), (1, '19.440')] [2023-10-10 06:04:24,194][53252] Updated weights for policy 0, policy_version 34370 (0.0009) [2023-10-10 06:04:24,575][53252] Updated weights for policy 0, policy_version 34380 (0.0010) [2023-10-10 06:04:24,949][53252] Updated weights for policy 0, policy_version 34390 (0.0009) [2023-10-10 06:04:25,029][53268] Updated weights for policy 1, policy_version 34340 (0.0009) [2023-10-10 06:04:25,316][53252] Updated weights for policy 0, policy_version 34400 (0.0007) [2023-10-10 06:04:25,402][53268] Updated weights for policy 1, policy_version 34350 (0.0008) [2023-10-10 06:04:25,764][53268] Updated weights for policy 1, policy_version 34360 (0.0010) [2023-10-10 06:04:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 70418432. Throughput: 0: 1684.8, 1: 1661.0. Samples: 17611062. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:04:26,784][52050] Avg episode reward: [(0, '19.280'), (1, '17.300')] [2023-10-10 06:04:29,486][53252] Updated weights for policy 0, policy_version 34410 (0.0008) [2023-10-10 06:04:29,812][53268] Updated weights for policy 1, policy_version 34370 (0.0010) [2023-10-10 06:04:29,861][53252] Updated weights for policy 0, policy_version 34420 (0.0007) [2023-10-10 06:04:30,181][53268] Updated weights for policy 1, policy_version 34380 (0.0009) [2023-10-10 06:04:30,238][53252] Updated weights for policy 0, policy_version 34430 (0.0008) [2023-10-10 06:04:30,559][53268] Updated weights for policy 1, policy_version 34390 (0.0007) [2023-10-10 06:04:30,921][53268] Updated weights for policy 1, policy_version 34400 (0.0009) [2023-10-10 06:04:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 70483968. Throughput: 0: 1672.3, 1: 1688.2. Samples: 17622248. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-10 06:04:31,784][52050] Avg episode reward: [(0, '17.980'), (1, '19.230')] [2023-10-10 06:04:34,251][53252] Updated weights for policy 0, policy_version 34440 (0.0009) [2023-10-10 06:04:34,618][53252] Updated weights for policy 0, policy_version 34450 (0.0010) [2023-10-10 06:04:34,992][53252] Updated weights for policy 0, policy_version 34460 (0.0007) [2023-10-10 06:04:35,007][53268] Updated weights for policy 1, policy_version 34410 (0.0009) [2023-10-10 06:04:35,369][53268] Updated weights for policy 1, policy_version 34420 (0.0009) [2023-10-10 06:04:35,732][53268] Updated weights for policy 1, policy_version 34430 (0.0010) [2023-10-10 06:04:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 70549504. Throughput: 0: 1666.4, 1: 1672.0. Samples: 17641390. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-10 06:04:36,784][52050] Avg episode reward: [(0, '17.420'), (1, '18.050')] [2023-10-10 06:04:39,094][53252] Updated weights for policy 0, policy_version 34470 (0.0009) [2023-10-10 06:04:39,462][53252] Updated weights for policy 0, policy_version 34480 (0.0008) [2023-10-10 06:04:39,835][53252] Updated weights for policy 0, policy_version 34490 (0.0007) [2023-10-10 06:04:39,855][53268] Updated weights for policy 1, policy_version 34440 (0.0008) [2023-10-10 06:04:40,219][53268] Updated weights for policy 1, policy_version 34450 (0.0008) [2023-10-10 06:04:40,589][53268] Updated weights for policy 1, policy_version 34460 (0.0010) [2023-10-10 06:04:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70615040. Throughput: 0: 1673.0, 1: 1664.0. Samples: 17661326. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-10 06:04:41,784][52050] Avg episode reward: [(0, '17.840'), (1, '17.800')] [2023-10-10 06:04:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000034496_35323904.pth... [2023-10-10 06:04:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000034464_35291136.pth... 
[2023-10-10 06:04:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000032928_33718272.pth [2023-10-10 06:04:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000032896_33685504.pth [2023-10-10 06:04:44,038][53252] Updated weights for policy 0, policy_version 34500 (0.0007) [2023-10-10 06:04:44,409][53252] Updated weights for policy 0, policy_version 34510 (0.0009) [2023-10-10 06:04:44,634][53268] Updated weights for policy 1, policy_version 34470 (0.0009) [2023-10-10 06:04:44,784][53252] Updated weights for policy 0, policy_version 34520 (0.0008) [2023-10-10 06:04:44,997][53268] Updated weights for policy 1, policy_version 34480 (0.0009) [2023-10-10 06:04:45,365][53268] Updated weights for policy 1, policy_version 34490 (0.0008) [2023-10-10 06:04:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70680576. Throughput: 0: 1661.7, 1: 1685.6. Samples: 17672494. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-10 06:04:46,784][52050] Avg episode reward: [(0, '19.580'), (1, '18.910')] [2023-10-10 06:04:48,874][53252] Updated weights for policy 0, policy_version 34530 (0.0008) [2023-10-10 06:04:49,238][53252] Updated weights for policy 0, policy_version 34540 (0.0008) [2023-10-10 06:04:49,554][53268] Updated weights for policy 1, policy_version 34500 (0.0009) [2023-10-10 06:04:49,607][53252] Updated weights for policy 0, policy_version 34550 (0.0008) [2023-10-10 06:04:49,914][53268] Updated weights for policy 1, policy_version 34510 (0.0009) [2023-10-10 06:04:49,993][53252] Updated weights for policy 0, policy_version 34560 (0.0008) [2023-10-10 06:04:50,277][53268] Updated weights for policy 1, policy_version 34520 (0.0009) [2023-10-10 06:04:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70746112. Throughput: 0: 1670.1, 1: 1664.9. Samples: 17691472. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-10 06:04:51,784][52050] Avg episode reward: [(0, '19.580'), (1, '19.190')] [2023-10-10 06:04:54,190][53252] Updated weights for policy 0, policy_version 34570 (0.0007) [2023-10-10 06:04:54,515][53268] Updated weights for policy 1, policy_version 34530 (0.0009) [2023-10-10 06:04:54,555][53252] Updated weights for policy 0, policy_version 34580 (0.0007) [2023-10-10 06:04:54,886][53268] Updated weights for policy 1, policy_version 34540 (0.0009) [2023-10-10 06:04:54,929][53252] Updated weights for policy 0, policy_version 34590 (0.0008) [2023-10-10 06:04:55,249][53268] Updated weights for policy 1, policy_version 34550 (0.0009) [2023-10-10 06:04:55,618][53268] Updated weights for policy 1, policy_version 34560 (0.0009) [2023-10-10 06:04:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70811648. Throughput: 0: 1673.1, 1: 1677.5. Samples: 17711606. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:04:56,784][52050] Avg episode reward: [(0, '19.510'), (1, '17.640')] [2023-10-10 06:04:58,794][53252] Updated weights for policy 0, policy_version 34600 (0.0008) [2023-10-10 06:04:59,163][53252] Updated weights for policy 0, policy_version 34610 (0.0009) [2023-10-10 06:04:59,523][53252] Updated weights for policy 0, policy_version 34620 (0.0008) [2023-10-10 06:04:59,874][53268] Updated weights for policy 1, policy_version 34570 (0.0009) [2023-10-10 06:05:00,243][53268] Updated weights for policy 1, policy_version 34580 (0.0009) [2023-10-10 06:05:00,612][53268] Updated weights for policy 1, policy_version 34590 (0.0009) [2023-10-10 06:05:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 70877184. Throughput: 0: 1660.1, 1: 1681.6. Samples: 17722182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:01,784][52050] Avg episode reward: [(0, '20.280'), (1, '19.610')] [2023-10-10 06:05:03,336][53252] Updated weights for policy 0, policy_version 34630 (0.0009) [2023-10-10 06:05:03,708][53252] Updated weights for policy 0, policy_version 34640 (0.0009) [2023-10-10 06:05:04,084][53252] Updated weights for policy 0, policy_version 34650 (0.0007) [2023-10-10 06:05:04,536][53268] Updated weights for policy 1, policy_version 34600 (0.0010) [2023-10-10 06:05:04,896][53268] Updated weights for policy 1, policy_version 34610 (0.0011) [2023-10-10 06:05:05,273][53268] Updated weights for policy 1, policy_version 34620 (0.0008) [2023-10-10 06:05:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 70942720. Throughput: 0: 1674.5, 1: 1664.0. Samples: 17741676. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:06,784][52050] Avg episode reward: [(0, '19.370'), (1, '18.840')] [2023-10-10 06:05:08,029][53252] Updated weights for policy 0, policy_version 34660 (0.0008) [2023-10-10 06:05:08,402][53252] Updated weights for policy 0, policy_version 34670 (0.0009) [2023-10-10 06:05:08,781][53252] Updated weights for policy 0, policy_version 34680 (0.0008) [2023-10-10 06:05:09,283][53268] Updated weights for policy 1, policy_version 34630 (0.0010) [2023-10-10 06:05:09,640][53268] Updated weights for policy 1, policy_version 34640 (0.0008) [2023-10-10 06:05:10,013][53268] Updated weights for policy 1, policy_version 34650 (0.0009) [2023-10-10 06:05:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71008256. Throughput: 0: 1675.4, 1: 1681.1. Samples: 17762104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:11,785][52050] Avg episode reward: [(0, '21.150'), (1, '18.280')] [2023-10-10 06:05:12,946][53252] Updated weights for policy 0, policy_version 34690 (0.0007) [2023-10-10 06:05:13,333][53252] Updated weights for policy 0, policy_version 34700 (0.0009) [2023-10-10 06:05:13,691][53252] Updated weights for policy 0, policy_version 34710 (0.0008) [2023-10-10 06:05:14,054][53268] Updated weights for policy 1, policy_version 34660 (0.0009) [2023-10-10 06:05:14,069][53252] Updated weights for policy 0, policy_version 34720 (0.0007) [2023-10-10 06:05:14,421][53268] Updated weights for policy 1, policy_version 34670 (0.0008) [2023-10-10 06:05:14,782][53268] Updated weights for policy 1, policy_version 34680 (0.0007) [2023-10-10 06:05:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71073792. Throughput: 0: 1652.4, 1: 1677.3. Samples: 17772084. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:16,784][52050] Avg episode reward: [(0, '20.610'), (1, '19.800')] [2023-10-10 06:05:18,014][53252] Updated weights for policy 0, policy_version 34730 (0.0008) [2023-10-10 06:05:18,381][53252] Updated weights for policy 0, policy_version 34740 (0.0009) [2023-10-10 06:05:18,757][53252] Updated weights for policy 0, policy_version 34750 (0.0011) [2023-10-10 06:05:18,971][53268] Updated weights for policy 1, policy_version 34690 (0.0008) [2023-10-10 06:05:19,324][53268] Updated weights for policy 1, policy_version 34700 (0.0009) [2023-10-10 06:05:19,694][53268] Updated weights for policy 1, policy_version 34710 (0.0010) [2023-10-10 06:05:20,057][53268] Updated weights for policy 1, policy_version 34720 (0.0009) [2023-10-10 06:05:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71139328. Throughput: 0: 1684.1, 1: 1668.6. Samples: 17792262. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 06:05:21,784][52050] Avg episode reward: [(0, '21.030'), (1, '19.950')] [2023-10-10 06:05:22,728][53252] Updated weights for policy 0, policy_version 34760 (0.0008) [2023-10-10 06:05:23,106][53252] Updated weights for policy 0, policy_version 34770 (0.0007) [2023-10-10 06:05:23,473][53252] Updated weights for policy 0, policy_version 34780 (0.0007) [2023-10-10 06:05:23,952][53268] Updated weights for policy 1, policy_version 34730 (0.0007) [2023-10-10 06:05:24,315][53268] Updated weights for policy 1, policy_version 34740 (0.0008) [2023-10-10 06:05:24,686][53268] Updated weights for policy 1, policy_version 34750 (0.0007) [2023-10-10 06:05:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71204864. Throughput: 0: 1686.7, 1: 1686.7. Samples: 17813130. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 06:05:26,784][52050] Avg episode reward: [(0, '20.380'), (1, '20.440')] [2023-10-10 06:05:26,797][53061] Saving new best policy, reward=20.440! [2023-10-10 06:05:27,609][53252] Updated weights for policy 0, policy_version 34790 (0.0008) [2023-10-10 06:05:27,986][53252] Updated weights for policy 0, policy_version 34800 (0.0010) [2023-10-10 06:05:28,362][53252] Updated weights for policy 0, policy_version 34810 (0.0010) [2023-10-10 06:05:28,798][53268] Updated weights for policy 1, policy_version 34760 (0.0008) [2023-10-10 06:05:29,169][53268] Updated weights for policy 1, policy_version 34770 (0.0008) [2023-10-10 06:05:29,540][53268] Updated weights for policy 1, policy_version 34780 (0.0007) [2023-10-10 06:05:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71270400. Throughput: 0: 1669.0, 1: 1667.9. Samples: 17822656. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 06:05:31,784][52050] Avg episode reward: [(0, '19.830'), (1, '20.020')] [2023-10-10 06:05:32,564][53252] Updated weights for policy 0, policy_version 34820 (0.0009) [2023-10-10 06:05:32,937][53252] Updated weights for policy 0, policy_version 34830 (0.0008) [2023-10-10 06:05:33,309][53252] Updated weights for policy 0, policy_version 34840 (0.0008) [2023-10-10 06:05:33,538][53268] Updated weights for policy 1, policy_version 34790 (0.0008) [2023-10-10 06:05:33,909][53268] Updated weights for policy 1, policy_version 34800 (0.0008) [2023-10-10 06:05:34,279][53268] Updated weights for policy 1, policy_version 34810 (0.0008) [2023-10-10 06:05:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71335936. Throughput: 0: 1683.5, 1: 1674.8. Samples: 17842596. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 06:05:36,785][52050] Avg episode reward: [(0, '18.770'), (1, '21.000')] [2023-10-10 06:05:36,786][53061] Saving new best policy, reward=21.000! [2023-10-10 06:05:37,474][53252] Updated weights for policy 0, policy_version 34850 (0.0008) [2023-10-10 06:05:37,846][53252] Updated weights for policy 0, policy_version 34860 (0.0009) [2023-10-10 06:05:38,216][53252] Updated weights for policy 0, policy_version 34870 (0.0009) [2023-10-10 06:05:38,290][53268] Updated weights for policy 1, policy_version 34820 (0.0009) [2023-10-10 06:05:38,578][53252] Updated weights for policy 0, policy_version 34880 (0.0007) [2023-10-10 06:05:38,662][53268] Updated weights for policy 1, policy_version 34830 (0.0007) [2023-10-10 06:05:39,030][53268] Updated weights for policy 1, policy_version 34840 (0.0011) [2023-10-10 06:05:41,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71401472. Throughput: 0: 1684.5, 1: 1685.6. Samples: 17863258. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 06:05:41,784][52050] Avg episode reward: [(0, '18.500'), (1, '20.050')] [2023-10-10 06:05:42,977][53252] Updated weights for policy 0, policy_version 34890 (0.0007) [2023-10-10 06:05:43,230][53268] Updated weights for policy 1, policy_version 34850 (0.0009) [2023-10-10 06:05:43,345][53252] Updated weights for policy 0, policy_version 34900 (0.0007) [2023-10-10 06:05:43,600][53268] Updated weights for policy 1, policy_version 34860 (0.0008) [2023-10-10 06:05:43,721][53252] Updated weights for policy 0, policy_version 34910 (0.0008) [2023-10-10 06:05:43,956][53268] Updated weights for policy 1, policy_version 34870 (0.0010) [2023-10-10 06:05:44,328][53268] Updated weights for policy 1, policy_version 34880 (0.0008) [2023-10-10 06:05:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71467008. Throughput: 0: 1670.6, 1: 1662.9. Samples: 17872190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:46,785][52050] Avg episode reward: [(0, '18.310'), (1, '19.080')] [2023-10-10 06:05:47,863][53252] Updated weights for policy 0, policy_version 34920 (0.0007) [2023-10-10 06:05:48,237][53252] Updated weights for policy 0, policy_version 34930 (0.0007) [2023-10-10 06:05:48,389][53268] Updated weights for policy 1, policy_version 34890 (0.0008) [2023-10-10 06:05:48,600][53252] Updated weights for policy 0, policy_version 34940 (0.0007) [2023-10-10 06:05:48,762][53268] Updated weights for policy 1, policy_version 34900 (0.0009) [2023-10-10 06:05:49,126][53268] Updated weights for policy 1, policy_version 34910 (0.0010) [2023-10-10 06:05:51,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71532544. Throughput: 0: 1671.0, 1: 1681.4. Samples: 17892534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:51,785][52050] Avg episode reward: [(0, '17.950'), (1, '18.450')] [2023-10-10 06:05:52,560][53252] Updated weights for policy 0, policy_version 34950 (0.0007) [2023-10-10 06:05:52,931][53252] Updated weights for policy 0, policy_version 34960 (0.0008) [2023-10-10 06:05:53,314][53252] Updated weights for policy 0, policy_version 34970 (0.0008) [2023-10-10 06:05:53,360][53268] Updated weights for policy 1, policy_version 34920 (0.0008) [2023-10-10 06:05:53,733][53268] Updated weights for policy 1, policy_version 34930 (0.0007) [2023-10-10 06:05:54,099][53268] Updated weights for policy 1, policy_version 34940 (0.0007) [2023-10-10 06:05:56,784][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 71598080. Throughput: 0: 1668.1, 1: 1686.4. Samples: 17913058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:05:56,785][52050] Avg episode reward: [(0, '18.770'), (1, '18.560')] [2023-10-10 06:05:57,463][53252] Updated weights for policy 0, policy_version 34980 (0.0008) [2023-10-10 06:05:57,840][53252] Updated weights for policy 0, policy_version 34990 (0.0007) [2023-10-10 06:05:58,127][53268] Updated weights for policy 1, policy_version 34950 (0.0007) [2023-10-10 06:05:58,207][53252] Updated weights for policy 0, policy_version 35000 (0.0008) [2023-10-10 06:05:58,498][53268] Updated weights for policy 1, policy_version 34960 (0.0009) [2023-10-10 06:05:58,873][53268] Updated weights for policy 1, policy_version 34970 (0.0009) [2023-10-10 06:06:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71663616. Throughput: 0: 1672.0, 1: 1662.2. Samples: 17922124. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:06:01,784][52050] Avg episode reward: [(0, '20.170'), (1, '19.160')] [2023-10-10 06:06:02,368][53252] Updated weights for policy 0, policy_version 35010 (0.0009) [2023-10-10 06:06:02,746][53252] Updated weights for policy 0, policy_version 35020 (0.0009) [2023-10-10 06:06:02,844][53268] Updated weights for policy 1, policy_version 34980 (0.0009) [2023-10-10 06:06:03,116][53252] Updated weights for policy 0, policy_version 35030 (0.0009) [2023-10-10 06:06:03,214][53268] Updated weights for policy 1, policy_version 34990 (0.0008) [2023-10-10 06:06:03,488][53252] Updated weights for policy 0, policy_version 35040 (0.0009) [2023-10-10 06:06:03,592][53268] Updated weights for policy 1, policy_version 35000 (0.0008) [2023-10-10 06:06:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 71729152. Throughput: 0: 1659.2, 1: 1689.6. Samples: 17942958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:06:06,784][52050] Avg episode reward: [(0, '19.320'), (1, '18.890')] [2023-10-10 06:06:07,563][53268] Updated weights for policy 1, policy_version 35010 (0.0008) [2023-10-10 06:06:07,605][53252] Updated weights for policy 0, policy_version 35050 (0.0009) [2023-10-10 06:06:07,927][53268] Updated weights for policy 1, policy_version 35020 (0.0007) [2023-10-10 06:06:07,965][53252] Updated weights for policy 0, policy_version 35060 (0.0008) [2023-10-10 06:06:08,303][53268] Updated weights for policy 1, policy_version 35030 (0.0008) [2023-10-10 06:06:08,337][53252] Updated weights for policy 0, policy_version 35070 (0.0008) [2023-10-10 06:06:08,666][53268] Updated weights for policy 1, policy_version 35040 (0.0007) [2023-10-10 06:06:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 71794688. Throughput: 0: 1658.2, 1: 1688.5. Samples: 17963730. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:06:11,784][52050] Avg episode reward: [(0, '22.560'), (1, '19.980')] [2023-10-10 06:06:11,793][52846] Saving new best policy, reward=22.560! [2023-10-10 06:06:12,470][53252] Updated weights for policy 0, policy_version 35080 (0.0008) [2023-10-10 06:06:12,675][53268] Updated weights for policy 1, policy_version 35050 (0.0009) [2023-10-10 06:06:12,838][53252] Updated weights for policy 0, policy_version 35090 (0.0008) [2023-10-10 06:06:13,035][53268] Updated weights for policy 1, policy_version 35060 (0.0011) [2023-10-10 06:06:13,205][53252] Updated weights for policy 0, policy_version 35100 (0.0007) [2023-10-10 06:06:13,406][53268] Updated weights for policy 1, policy_version 35070 (0.0010) [2023-10-10 06:06:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 71860224. Throughput: 0: 1658.8, 1: 1675.2. Samples: 17972684. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:06:16,784][52050] Avg episode reward: [(0, '21.490'), (1, '20.000')] [2023-10-10 06:06:17,330][53252] Updated weights for policy 0, policy_version 35110 (0.0008) [2023-10-10 06:06:17,555][53268] Updated weights for policy 1, policy_version 35080 (0.0008) [2023-10-10 06:06:17,713][53252] Updated weights for policy 0, policy_version 35120 (0.0007) [2023-10-10 06:06:17,930][53268] Updated weights for policy 1, policy_version 35090 (0.0009) [2023-10-10 06:06:18,078][53252] Updated weights for policy 0, policy_version 35130 (0.0007) [2023-10-10 06:06:18,294][53268] Updated weights for policy 1, policy_version 35100 (0.0008) [2023-10-10 06:06:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 71925760. Throughput: 0: 1659.0, 1: 1686.8. Samples: 17993158. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:06:21,785][52050] Avg episode reward: [(0, '19.320'), (1, '19.010')] [2023-10-10 06:06:22,267][53252] Updated weights for policy 0, policy_version 35140 (0.0007) [2023-10-10 06:06:22,321][53268] Updated weights for policy 1, policy_version 35110 (0.0009) [2023-10-10 06:06:22,638][53252] Updated weights for policy 0, policy_version 35150 (0.0007) [2023-10-10 06:06:22,681][53268] Updated weights for policy 1, policy_version 35120 (0.0007) [2023-10-10 06:06:23,012][53252] Updated weights for policy 0, policy_version 35160 (0.0008) [2023-10-10 06:06:23,045][53268] Updated weights for policy 1, policy_version 35130 (0.0008) [2023-10-10 06:06:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 71991296. Throughput: 0: 1660.2, 1: 1683.8. Samples: 18013738. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:06:26,784][52050] Avg episode reward: [(0, '20.620'), (1, '17.880')] [2023-10-10 06:06:27,035][53252] Updated weights for policy 0, policy_version 35170 (0.0007) [2023-10-10 06:06:27,314][53268] Updated weights for policy 1, policy_version 35140 (0.0008) [2023-10-10 06:06:27,411][53252] Updated weights for policy 0, policy_version 35180 (0.0008) [2023-10-10 06:06:27,673][53268] Updated weights for policy 1, policy_version 35150 (0.0007) [2023-10-10 06:06:27,790][53252] Updated weights for policy 0, policy_version 35190 (0.0008) [2023-10-10 06:06:28,050][53268] Updated weights for policy 1, policy_version 35160 (0.0007) [2023-10-10 06:06:28,147][53252] Updated weights for policy 0, policy_version 35200 (0.0010) [2023-10-10 06:06:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72056832. Throughput: 0: 1667.1, 1: 1678.8. Samples: 18022754. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:06:31,784][52050] Avg episode reward: [(0, '21.180'), (1, '18.100')] [2023-10-10 06:06:32,098][53252] Updated weights for policy 0, policy_version 35210 (0.0009) [2023-10-10 06:06:32,219][53268] Updated weights for policy 1, policy_version 35170 (0.0009) [2023-10-10 06:06:32,467][53252] Updated weights for policy 0, policy_version 35220 (0.0009) [2023-10-10 06:06:32,596][53268] Updated weights for policy 1, policy_version 35180 (0.0007) [2023-10-10 06:06:32,834][53252] Updated weights for policy 0, policy_version 35230 (0.0007) [2023-10-10 06:06:32,966][53268] Updated weights for policy 1, policy_version 35190 (0.0008) [2023-10-10 06:06:33,323][53268] Updated weights for policy 1, policy_version 35200 (0.0008) [2023-10-10 06:06:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 72122368. Throughput: 0: 1675.9, 1: 1676.3. Samples: 18043380. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 06:06:36,784][52050] Avg episode reward: [(0, '19.890'), (1, '16.900')] [2023-10-10 06:06:36,944][53252] Updated weights for policy 0, policy_version 35240 (0.0008) [2023-10-10 06:06:37,303][53252] Updated weights for policy 0, policy_version 35250 (0.0009) [2023-10-10 06:06:37,510][53268] Updated weights for policy 1, policy_version 35210 (0.0009) [2023-10-10 06:06:37,679][53252] Updated weights for policy 0, policy_version 35260 (0.0007) [2023-10-10 06:06:37,873][53268] Updated weights for policy 1, policy_version 35220 (0.0009) [2023-10-10 06:06:38,249][53268] Updated weights for policy 1, policy_version 35230 (0.0009) [2023-10-10 06:06:41,655][53252] Updated weights for policy 0, policy_version 35270 (0.0007) [2023-10-10 06:06:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72187904. Throughput: 0: 1676.4, 1: 1675.7. Samples: 18063900. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 06:06:41,784][52050] Avg episode reward: [(0, '20.210'), (1, '16.800')] [2023-10-10 06:06:41,791][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000035232_36077568.pth... [2023-10-10 06:06:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000033664_34471936.pth [2023-10-10 06:06:42,021][53252] Updated weights for policy 0, policy_version 35280 (0.0008) [2023-10-10 06:06:42,380][53252] Updated weights for policy 0, policy_version 35290 (0.0008) [2023-10-10 06:06:42,444][53268] Updated weights for policy 1, policy_version 35240 (0.0007) [2023-10-10 06:06:42,605][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000035296_36143104.pth... [2023-10-10 06:06:42,646][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000033728_34537472.pth [2023-10-10 06:06:42,804][53268] Updated weights for policy 1, policy_version 35250 (0.0009) [2023-10-10 06:06:43,175][53268] Updated weights for policy 1, policy_version 35260 (0.0008) [2023-10-10 06:06:46,339][53252] Updated weights for policy 0, policy_version 35300 (0.0007) [2023-10-10 06:06:46,711][53252] Updated weights for policy 0, policy_version 35310 (0.0007) [2023-10-10 06:06:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72253440. Throughput: 0: 1677.3, 1: 1673.8. Samples: 18072926. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 06:06:46,784][52050] Avg episode reward: [(0, '19.270'), (1, '17.610')] [2023-10-10 06:06:47,087][53252] Updated weights for policy 0, policy_version 35320 (0.0008) [2023-10-10 06:06:47,152][53268] Updated weights for policy 1, policy_version 35270 (0.0009) [2023-10-10 06:06:47,523][53268] Updated weights for policy 1, policy_version 35280 (0.0007) [2023-10-10 06:06:47,904][53268] Updated weights for policy 1, policy_version 35290 (0.0008) [2023-10-10 06:06:51,243][53252] Updated weights for policy 0, policy_version 35330 (0.0007) [2023-10-10 06:06:51,628][53252] Updated weights for policy 0, policy_version 35340 (0.0011) [2023-10-10 06:06:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72318976. Throughput: 0: 1681.7, 1: 1666.4. Samples: 18093626. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 06:06:51,784][52050] Avg episode reward: [(0, '20.000'), (1, '17.860')] [2023-10-10 06:06:51,996][53252] Updated weights for policy 0, policy_version 35350 (0.0009) [2023-10-10 06:06:52,111][53268] Updated weights for policy 1, policy_version 35300 (0.0008) [2023-10-10 06:06:52,365][53252] Updated weights for policy 0, policy_version 35360 (0.0009) [2023-10-10 06:06:52,483][53268] Updated weights for policy 1, policy_version 35310 (0.0008) [2023-10-10 06:06:52,848][53268] Updated weights for policy 1, policy_version 35320 (0.0007) [2023-10-10 06:06:56,457][53252] Updated weights for policy 0, policy_version 35370 (0.0007) [2023-10-10 06:06:56,755][53268] Updated weights for policy 1, policy_version 35330 (0.0007) [2023-10-10 06:06:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 72384512. Throughput: 0: 1672.7, 1: 1666.2. Samples: 18113984. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 06:06:56,785][52050] Avg episode reward: [(0, '19.840'), (1, '17.350')] [2023-10-10 06:06:56,828][53252] Updated weights for policy 0, policy_version 35380 (0.0009) [2023-10-10 06:06:57,118][53268] Updated weights for policy 1, policy_version 35340 (0.0007) [2023-10-10 06:06:57,199][53252] Updated weights for policy 0, policy_version 35390 (0.0008) [2023-10-10 06:06:57,481][53268] Updated weights for policy 1, policy_version 35350 (0.0009) [2023-10-10 06:06:57,844][53268] Updated weights for policy 1, policy_version 35360 (0.0010) [2023-10-10 06:07:01,187][53252] Updated weights for policy 0, policy_version 35400 (0.0007) [2023-10-10 06:07:01,559][53252] Updated weights for policy 0, policy_version 35410 (0.0007) [2023-10-10 06:07:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 72450048. Throughput: 0: 1680.7, 1: 1666.1. Samples: 18123288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:01,784][52050] Avg episode reward: [(0, '19.280'), (1, '16.800')] [2023-10-10 06:07:01,936][53252] Updated weights for policy 0, policy_version 35420 (0.0007) [2023-10-10 06:07:01,965][53268] Updated weights for policy 1, policy_version 35370 (0.0009) [2023-10-10 06:07:02,326][53268] Updated weights for policy 1, policy_version 35380 (0.0010) [2023-10-10 06:07:02,696][53268] Updated weights for policy 1, policy_version 35390 (0.0008) [2023-10-10 06:07:06,050][53252] Updated weights for policy 0, policy_version 35430 (0.0008) [2023-10-10 06:07:06,417][53252] Updated weights for policy 0, policy_version 35440 (0.0007) [2023-10-10 06:07:06,779][53268] Updated weights for policy 1, policy_version 35400 (0.0007) [2023-10-10 06:07:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72515584. Throughput: 0: 1681.9, 1: 1669.6. Samples: 18143974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:06,784][52050] Avg episode reward: [(0, '18.440'), (1, '17.870')] [2023-10-10 06:07:06,792][53252] Updated weights for policy 0, policy_version 35450 (0.0007) [2023-10-10 06:07:07,143][53268] Updated weights for policy 1, policy_version 35410 (0.0008) [2023-10-10 06:07:07,518][53268] Updated weights for policy 1, policy_version 35420 (0.0007) [2023-10-10 06:07:10,829][53252] Updated weights for policy 0, policy_version 35460 (0.0007) [2023-10-10 06:07:11,191][53252] Updated weights for policy 0, policy_version 35470 (0.0007) [2023-10-10 06:07:11,556][53268] Updated weights for policy 1, policy_version 35430 (0.0009) [2023-10-10 06:07:11,569][53252] Updated weights for policy 0, policy_version 35480 (0.0008) [2023-10-10 06:07:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72581120. Throughput: 0: 1670.5, 1: 1671.0. Samples: 18164106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:11,784][52050] Avg episode reward: [(0, '18.790'), (1, '18.610')] [2023-10-10 06:07:11,927][53268] Updated weights for policy 1, policy_version 35440 (0.0009) [2023-10-10 06:07:12,293][53268] Updated weights for policy 1, policy_version 35450 (0.0010) [2023-10-10 06:07:15,655][53252] Updated weights for policy 0, policy_version 35490 (0.0008) [2023-10-10 06:07:16,030][53252] Updated weights for policy 0, policy_version 35500 (0.0009) [2023-10-10 06:07:16,228][53268] Updated weights for policy 1, policy_version 35460 (0.0007) [2023-10-10 06:07:16,403][53252] Updated weights for policy 0, policy_version 35510 (0.0008) [2023-10-10 06:07:16,588][53268] Updated weights for policy 1, policy_version 35470 (0.0008) [2023-10-10 06:07:16,763][53252] Updated weights for policy 0, policy_version 35520 (0.0007) [2023-10-10 06:07:16,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72679424. Throughput: 0: 1688.5, 1: 1674.2. 
Samples: 18174076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:16,784][52050] Avg episode reward: [(0, '17.640'), (1, '18.860')] [2023-10-10 06:07:16,958][53268] Updated weights for policy 1, policy_version 35480 (0.0007) [2023-10-10 06:07:20,773][53268] Updated weights for policy 1, policy_version 35490 (0.0009) [2023-10-10 06:07:20,888][53252] Updated weights for policy 0, policy_version 35530 (0.0008) [2023-10-10 06:07:21,146][53268] Updated weights for policy 1, policy_version 35500 (0.0009) [2023-10-10 06:07:21,258][53252] Updated weights for policy 0, policy_version 35540 (0.0008) [2023-10-10 06:07:21,504][53268] Updated weights for policy 1, policy_version 35510 (0.0008) [2023-10-10 06:07:21,629][53252] Updated weights for policy 0, policy_version 35550 (0.0009) [2023-10-10 06:07:21,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 72744960. Throughput: 0: 1687.2, 1: 1685.1. Samples: 18195132. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:07:21,784][52050] Avg episode reward: [(0, '19.230'), (1, '19.520')] [2023-10-10 06:07:21,865][53268] Updated weights for policy 1, policy_version 35520 (0.0010) [2023-10-10 06:07:25,799][53252] Updated weights for policy 0, policy_version 35560 (0.0008) [2023-10-10 06:07:26,090][53268] Updated weights for policy 1, policy_version 35530 (0.0007) [2023-10-10 06:07:26,169][53252] Updated weights for policy 0, policy_version 35570 (0.0008) [2023-10-10 06:07:26,453][53268] Updated weights for policy 1, policy_version 35540 (0.0008) [2023-10-10 06:07:26,538][53252] Updated weights for policy 0, policy_version 35580 (0.0007) [2023-10-10 06:07:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72810496. Throughput: 0: 1666.1, 1: 1678.0. Samples: 18214386. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:07:26,784][52050] Avg episode reward: [(0, '20.220'), (1, '19.960')] [2023-10-10 06:07:26,808][53268] Updated weights for policy 1, policy_version 35550 (0.0008) [2023-10-10 06:07:30,505][53252] Updated weights for policy 0, policy_version 35590 (0.0008) [2023-10-10 06:07:30,866][53252] Updated weights for policy 0, policy_version 35600 (0.0009) [2023-10-10 06:07:30,903][53268] Updated weights for policy 1, policy_version 35560 (0.0009) [2023-10-10 06:07:31,243][53252] Updated weights for policy 0, policy_version 35610 (0.0009) [2023-10-10 06:07:31,281][53268] Updated weights for policy 1, policy_version 35570 (0.0007) [2023-10-10 06:07:31,640][53268] Updated weights for policy 1, policy_version 35580 (0.0009) [2023-10-10 06:07:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72876032. Throughput: 0: 1686.3, 1: 1690.0. Samples: 18224862. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:07:31,784][52050] Avg episode reward: [(0, '21.230'), (1, '19.410')] [2023-10-10 06:07:35,208][53252] Updated weights for policy 0, policy_version 35620 (0.0008) [2023-10-10 06:07:35,582][53252] Updated weights for policy 0, policy_version 35630 (0.0009) [2023-10-10 06:07:35,584][53268] Updated weights for policy 1, policy_version 35590 (0.0009) [2023-10-10 06:07:35,955][53268] Updated weights for policy 1, policy_version 35600 (0.0008) [2023-10-10 06:07:35,956][53252] Updated weights for policy 0, policy_version 35640 (0.0007) [2023-10-10 06:07:36,316][53268] Updated weights for policy 1, policy_version 35610 (0.0008) [2023-10-10 06:07:36,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 72974336. Throughput: 0: 1678.2, 1: 1691.8. Samples: 18245276. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:07:36,784][52050] Avg episode reward: [(0, '20.890'), (1, '20.400')] [2023-10-10 06:07:39,979][53252] Updated weights for policy 0, policy_version 35650 (0.0008) [2023-10-10 06:07:40,345][53252] Updated weights for policy 0, policy_version 35660 (0.0008) [2023-10-10 06:07:40,486][53268] Updated weights for policy 1, policy_version 35620 (0.0009) [2023-10-10 06:07:40,713][53252] Updated weights for policy 0, policy_version 35670 (0.0007) [2023-10-10 06:07:40,856][53268] Updated weights for policy 1, policy_version 35630 (0.0007) [2023-10-10 06:07:41,086][53252] Updated weights for policy 0, policy_version 35680 (0.0010) [2023-10-10 06:07:41,221][53268] Updated weights for policy 1, policy_version 35640 (0.0009) [2023-10-10 06:07:41,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73039872. Throughput: 0: 1667.5, 1: 1668.7. Samples: 18264112. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 06:07:41,784][52050] Avg episode reward: [(0, '20.910'), (1, '18.730')] [2023-10-10 06:07:45,276][53268] Updated weights for policy 1, policy_version 35650 (0.0009) [2023-10-10 06:07:45,287][53252] Updated weights for policy 0, policy_version 35690 (0.0010) [2023-10-10 06:07:45,643][53268] Updated weights for policy 1, policy_version 35660 (0.0008) [2023-10-10 06:07:45,654][53252] Updated weights for policy 0, policy_version 35700 (0.0008) [2023-10-10 06:07:46,012][53268] Updated weights for policy 1, policy_version 35670 (0.0008) [2023-10-10 06:07:46,030][53252] Updated weights for policy 0, policy_version 35710 (0.0008) [2023-10-10 06:07:46,371][53268] Updated weights for policy 1, policy_version 35680 (0.0007) [2023-10-10 06:07:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73105408. Throughput: 0: 1688.8, 1: 1689.9. Samples: 18275326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:46,784][52050] Avg episode reward: [(0, '21.200'), (1, '18.040')] [2023-10-10 06:07:49,970][53252] Updated weights for policy 0, policy_version 35720 (0.0007) [2023-10-10 06:07:50,339][53252] Updated weights for policy 0, policy_version 35730 (0.0007) [2023-10-10 06:07:50,568][53268] Updated weights for policy 1, policy_version 35690 (0.0008) [2023-10-10 06:07:50,701][53252] Updated weights for policy 0, policy_version 35740 (0.0007) [2023-10-10 06:07:50,931][53268] Updated weights for policy 1, policy_version 35700 (0.0010) [2023-10-10 06:07:51,297][53268] Updated weights for policy 1, policy_version 35710 (0.0011) [2023-10-10 06:07:51,783][52050] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 73170944. Throughput: 0: 1672.5, 1: 1687.6. Samples: 18295178. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:51,784][52050] Avg episode reward: [(0, '20.050'), (1, '19.260')] [2023-10-10 06:07:54,764][53252] Updated weights for policy 0, policy_version 35750 (0.0008) [2023-10-10 06:07:55,131][53252] Updated weights for policy 0, policy_version 35760 (0.0008) [2023-10-10 06:07:55,370][53268] Updated weights for policy 1, policy_version 35720 (0.0009) [2023-10-10 06:07:55,502][53252] Updated weights for policy 0, policy_version 35770 (0.0008) [2023-10-10 06:07:55,739][53268] Updated weights for policy 1, policy_version 35730 (0.0010) [2023-10-10 06:07:56,102][53268] Updated weights for policy 1, policy_version 35740 (0.0010) [2023-10-10 06:07:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73236480. Throughput: 0: 1678.9, 1: 1663.3. Samples: 18314504. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:07:56,784][52050] Avg episode reward: [(0, '18.970'), (1, '17.650')] [2023-10-10 06:07:59,463][53252] Updated weights for policy 0, policy_version 35780 (0.0009) [2023-10-10 06:07:59,838][53252] Updated weights for policy 0, policy_version 35790 (0.0007) [2023-10-10 06:08:00,130][53268] Updated weights for policy 1, policy_version 35750 (0.0009) [2023-10-10 06:08:00,204][53252] Updated weights for policy 0, policy_version 35800 (0.0008) [2023-10-10 06:08:00,495][53268] Updated weights for policy 1, policy_version 35760 (0.0009) [2023-10-10 06:08:00,874][53268] Updated weights for policy 1, policy_version 35770 (0.0008) [2023-10-10 06:08:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73302016. Throughput: 0: 1687.8, 1: 1683.1. Samples: 18325768. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:01,785][52050] Avg episode reward: [(0, '18.820'), (1, '18.630')] [2023-10-10 06:08:04,361][53252] Updated weights for policy 0, policy_version 35810 (0.0010) [2023-10-10 06:08:04,763][53252] Updated weights for policy 0, policy_version 35820 (0.0010) [2023-10-10 06:08:04,860][53268] Updated weights for policy 1, policy_version 35780 (0.0008) [2023-10-10 06:08:05,137][53252] Updated weights for policy 0, policy_version 35830 (0.0007) [2023-10-10 06:08:05,223][53268] Updated weights for policy 1, policy_version 35790 (0.0008) [2023-10-10 06:08:05,503][53252] Updated weights for policy 0, policy_version 35840 (0.0008) [2023-10-10 06:08:05,595][53268] Updated weights for policy 1, policy_version 35800 (0.0008) [2023-10-10 06:08:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 73367552. Throughput: 0: 1660.1, 1: 1671.2. Samples: 18345040. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:06,784][52050] Avg episode reward: [(0, '18.720'), (1, '19.500')] [2023-10-10 06:08:09,599][53252] Updated weights for policy 0, policy_version 35850 (0.0009) [2023-10-10 06:08:09,805][53268] Updated weights for policy 1, policy_version 35810 (0.0009) [2023-10-10 06:08:09,964][53252] Updated weights for policy 0, policy_version 35860 (0.0008) [2023-10-10 06:08:10,172][53268] Updated weights for policy 1, policy_version 35820 (0.0008) [2023-10-10 06:08:10,339][53252] Updated weights for policy 0, policy_version 35870 (0.0008) [2023-10-10 06:08:10,538][53268] Updated weights for policy 1, policy_version 35830 (0.0009) [2023-10-10 06:08:10,908][53268] Updated weights for policy 1, policy_version 35840 (0.0008) [2023-10-10 06:08:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 73433088. Throughput: 0: 1674.7, 1: 1660.6. Samples: 18364478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:11,785][52050] Avg episode reward: [(0, '19.340'), (1, '17.810')] [2023-10-10 06:08:14,485][53252] Updated weights for policy 0, policy_version 35880 (0.0009) [2023-10-10 06:08:14,850][53252] Updated weights for policy 0, policy_version 35890 (0.0008) [2023-10-10 06:08:14,964][53268] Updated weights for policy 1, policy_version 35850 (0.0008) [2023-10-10 06:08:15,223][53252] Updated weights for policy 0, policy_version 35900 (0.0008) [2023-10-10 06:08:15,327][53268] Updated weights for policy 1, policy_version 35860 (0.0008) [2023-10-10 06:08:15,701][53268] Updated weights for policy 1, policy_version 35870 (0.0011) [2023-10-10 06:08:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 73498624. Throughput: 0: 1678.6, 1: 1680.0. Samples: 18375998. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:16,784][52050] Avg episode reward: [(0, '19.160'), (1, '18.500')] [2023-10-10 06:08:19,299][53252] Updated weights for policy 0, policy_version 35910 (0.0007) [2023-10-10 06:08:19,680][53252] Updated weights for policy 0, policy_version 35920 (0.0007) [2023-10-10 06:08:19,930][53268] Updated weights for policy 1, policy_version 35880 (0.0009) [2023-10-10 06:08:20,044][53252] Updated weights for policy 0, policy_version 35930 (0.0009) [2023-10-10 06:08:20,309][53268] Updated weights for policy 1, policy_version 35890 (0.0008) [2023-10-10 06:08:20,677][53268] Updated weights for policy 1, policy_version 35900 (0.0009) [2023-10-10 06:08:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 73564160. Throughput: 0: 1664.6, 1: 1663.4. Samples: 18395034. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:21,784][52050] Avg episode reward: [(0, '20.480'), (1, '19.010')] [2023-10-10 06:08:24,028][53252] Updated weights for policy 0, policy_version 35940 (0.0009) [2023-10-10 06:08:24,405][53252] Updated weights for policy 0, policy_version 35950 (0.0009) [2023-10-10 06:08:24,665][53268] Updated weights for policy 1, policy_version 35910 (0.0009) [2023-10-10 06:08:24,777][53252] Updated weights for policy 0, policy_version 35960 (0.0009) [2023-10-10 06:08:25,024][53268] Updated weights for policy 1, policy_version 35920 (0.0008) [2023-10-10 06:08:25,392][53268] Updated weights for policy 1, policy_version 35930 (0.0009) [2023-10-10 06:08:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73629696. Throughput: 0: 1684.4, 1: 1672.7. Samples: 18415184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:26,784][52050] Avg episode reward: [(0, '21.070'), (1, '18.730')] [2023-10-10 06:08:28,861][53252] Updated weights for policy 0, policy_version 35970 (0.0008) [2023-10-10 06:08:29,239][53252] Updated weights for policy 0, policy_version 35980 (0.0009) [2023-10-10 06:08:29,540][53268] Updated weights for policy 1, policy_version 35940 (0.0008) [2023-10-10 06:08:29,603][53252] Updated weights for policy 0, policy_version 35990 (0.0009) [2023-10-10 06:08:29,901][53268] Updated weights for policy 1, policy_version 35950 (0.0008) [2023-10-10 06:08:29,973][53252] Updated weights for policy 0, policy_version 36000 (0.0007) [2023-10-10 06:08:30,270][53268] Updated weights for policy 1, policy_version 35960 (0.0011) [2023-10-10 06:08:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73695232. Throughput: 0: 1671.4, 1: 1682.0. Samples: 18426228. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:08:31,784][52050] Avg episode reward: [(0, '21.500'), (1, '19.610')] [2023-10-10 06:08:34,049][53252] Updated weights for policy 0, policy_version 36010 (0.0010) [2023-10-10 06:08:34,250][53268] Updated weights for policy 1, policy_version 35970 (0.0009) [2023-10-10 06:08:34,426][53252] Updated weights for policy 0, policy_version 36020 (0.0009) [2023-10-10 06:08:34,610][53268] Updated weights for policy 1, policy_version 35980 (0.0007) [2023-10-10 06:08:34,793][53252] Updated weights for policy 0, policy_version 36030 (0.0010) [2023-10-10 06:08:34,971][53268] Updated weights for policy 1, policy_version 35990 (0.0007) [2023-10-10 06:08:35,334][53268] Updated weights for policy 1, policy_version 36000 (0.0010) [2023-10-10 06:08:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73760768. Throughput: 0: 1672.6, 1: 1662.2. Samples: 18445244. 
Policy #0 lag: (min: 34.0, avg: 54.0, max: 56.0) [2023-10-10 06:08:36,784][52050] Avg episode reward: [(0, '20.930'), (1, '18.970')] [2023-10-10 06:08:38,393][53252] Updated weights for policy 0, policy_version 36040 (0.0008) [2023-10-10 06:08:38,764][53252] Updated weights for policy 0, policy_version 36050 (0.0008) [2023-10-10 06:08:38,817][53268] Updated weights for policy 1, policy_version 36010 (0.0008) [2023-10-10 06:08:39,137][53252] Updated weights for policy 0, policy_version 36060 (0.0007) [2023-10-10 06:08:39,193][53268] Updated weights for policy 1, policy_version 36020 (0.0010) [2023-10-10 06:08:39,551][53268] Updated weights for policy 1, policy_version 36030 (0.0010) [2023-10-10 06:08:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 73826304. Throughput: 0: 1705.5, 1: 1701.6. Samples: 18467828. Policy #0 lag: (min: 34.0, avg: 54.0, max: 56.0) [2023-10-10 06:08:41,784][52050] Avg episode reward: [(0, '19.370'), (1, '18.360')] [2023-10-10 06:08:41,799][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000036032_36896768.pth... [2023-10-10 06:08:41,800][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000036064_36929536.pth... 
[2023-10-10 06:08:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000034496_35323904.pth [2023-10-10 06:08:41,837][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000034464_35291136.pth [2023-10-10 06:08:43,165][53252] Updated weights for policy 0, policy_version 36070 (0.0008) [2023-10-10 06:08:43,533][53252] Updated weights for policy 0, policy_version 36080 (0.0008) [2023-10-10 06:08:43,668][53268] Updated weights for policy 1, policy_version 36040 (0.0009) [2023-10-10 06:08:43,895][53252] Updated weights for policy 0, policy_version 36090 (0.0007) [2023-10-10 06:08:44,034][53268] Updated weights for policy 1, policy_version 36050 (0.0008) [2023-10-10 06:08:44,403][53268] Updated weights for policy 1, policy_version 36060 (0.0011) [2023-10-10 06:08:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73891840. Throughput: 0: 1678.5, 1: 1690.9. Samples: 18477386. Policy #0 lag: (min: 34.0, avg: 54.0, max: 56.0) [2023-10-10 06:08:46,784][52050] Avg episode reward: [(0, '20.020'), (1, '18.170')] [2023-10-10 06:08:48,002][53252] Updated weights for policy 0, policy_version 36100 (0.0008) [2023-10-10 06:08:48,371][53252] Updated weights for policy 0, policy_version 36110 (0.0008) [2023-10-10 06:08:48,373][53268] Updated weights for policy 1, policy_version 36070 (0.0009) [2023-10-10 06:08:48,733][53268] Updated weights for policy 1, policy_version 36080 (0.0010) [2023-10-10 06:08:48,737][53252] Updated weights for policy 0, policy_version 36120 (0.0008) [2023-10-10 06:08:49,111][53268] Updated weights for policy 1, policy_version 36090 (0.0010) [2023-10-10 06:08:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 73957376. Throughput: 0: 1700.8, 1: 1692.2. Samples: 18497724. 
Policy #0 lag: (min: 34.0, avg: 54.0, max: 56.0) [2023-10-10 06:08:51,784][52050] Avg episode reward: [(0, '18.100'), (1, '17.050')] [2023-10-10 06:08:52,768][53252] Updated weights for policy 0, policy_version 36130 (0.0008) [2023-10-10 06:08:52,985][53268] Updated weights for policy 1, policy_version 36100 (0.0009) [2023-10-10 06:08:53,166][53252] Updated weights for policy 0, policy_version 36140 (0.0008) [2023-10-10 06:08:53,356][53268] Updated weights for policy 1, policy_version 36110 (0.0008) [2023-10-10 06:08:53,535][53252] Updated weights for policy 0, policy_version 36150 (0.0008) [2023-10-10 06:08:53,721][53268] Updated weights for policy 1, policy_version 36120 (0.0009) [2023-10-10 06:08:53,898][53252] Updated weights for policy 0, policy_version 36160 (0.0007) [2023-10-10 06:08:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74022912. Throughput: 0: 1709.4, 1: 1714.0. Samples: 18518528. Policy #0 lag: (min: 34.0, avg: 54.0, max: 56.0) [2023-10-10 06:08:56,784][52050] Avg episode reward: [(0, '18.560'), (1, '17.220')] [2023-10-10 06:08:57,703][53268] Updated weights for policy 1, policy_version 36130 (0.0010) [2023-10-10 06:08:57,877][53252] Updated weights for policy 0, policy_version 36170 (0.0009) [2023-10-10 06:08:58,068][53268] Updated weights for policy 1, policy_version 36140 (0.0008) [2023-10-10 06:08:58,244][53252] Updated weights for policy 0, policy_version 36180 (0.0007) [2023-10-10 06:08:58,431][53268] Updated weights for policy 1, policy_version 36150 (0.0008) [2023-10-10 06:08:58,615][53252] Updated weights for policy 0, policy_version 36190 (0.0008) [2023-10-10 06:08:58,790][53268] Updated weights for policy 1, policy_version 36160 (0.0010) [2023-10-10 06:09:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74088448. Throughput: 0: 1685.7, 1: 1688.2. Samples: 18527824. 
Policy #0 lag: (min: 34.0, avg: 54.0, max: 56.0) [2023-10-10 06:09:01,784][52050] Avg episode reward: [(0, '21.810'), (1, '17.250')] [2023-10-10 06:09:02,651][53252] Updated weights for policy 0, policy_version 36200 (0.0009) [2023-10-10 06:09:03,020][53252] Updated weights for policy 0, policy_version 36210 (0.0007) [2023-10-10 06:09:03,096][53268] Updated weights for policy 1, policy_version 36170 (0.0008) [2023-10-10 06:09:03,387][53252] Updated weights for policy 0, policy_version 36220 (0.0008) [2023-10-10 06:09:03,465][53268] Updated weights for policy 1, policy_version 36180 (0.0009) [2023-10-10 06:09:03,833][53268] Updated weights for policy 1, policy_version 36190 (0.0009) [2023-10-10 06:09:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74153984. Throughput: 0: 1706.1, 1: 1700.2. Samples: 18548320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:09:06,784][52050] Avg episode reward: [(0, '20.450'), (1, '18.960')] [2023-10-10 06:09:07,462][53252] Updated weights for policy 0, policy_version 36230 (0.0007) [2023-10-10 06:09:07,793][53268] Updated weights for policy 1, policy_version 36200 (0.0008) [2023-10-10 06:09:07,825][53252] Updated weights for policy 0, policy_version 36240 (0.0009) [2023-10-10 06:09:08,160][53268] Updated weights for policy 1, policy_version 36210 (0.0009) [2023-10-10 06:09:08,195][53252] Updated weights for policy 0, policy_version 36250 (0.0008) [2023-10-10 06:09:08,533][53268] Updated weights for policy 1, policy_version 36220 (0.0008) [2023-10-10 06:09:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74219520. Throughput: 0: 1703.5, 1: 1721.8. Samples: 18569322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:09:11,784][52050] Avg episode reward: [(0, '19.870'), (1, '18.590')] [2023-10-10 06:09:12,220][53252] Updated weights for policy 0, policy_version 36260 (0.0007) [2023-10-10 06:09:12,593][53252] Updated weights for policy 0, policy_version 36270 (0.0009) [2023-10-10 06:09:12,672][53268] Updated weights for policy 1, policy_version 36230 (0.0008) [2023-10-10 06:09:12,957][53252] Updated weights for policy 0, policy_version 36280 (0.0007) [2023-10-10 06:09:13,025][53268] Updated weights for policy 1, policy_version 36240 (0.0009) [2023-10-10 06:09:13,392][53268] Updated weights for policy 1, policy_version 36250 (0.0010) [2023-10-10 06:09:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74285056. Throughput: 0: 1688.8, 1: 1691.4. Samples: 18578336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:09:16,784][52050] Avg episode reward: [(0, '22.020'), (1, '20.270')] [2023-10-10 06:09:17,014][53252] Updated weights for policy 0, policy_version 36290 (0.0007) [2023-10-10 06:09:17,394][53252] Updated weights for policy 0, policy_version 36300 (0.0009) [2023-10-10 06:09:17,601][53268] Updated weights for policy 1, policy_version 36260 (0.0011) [2023-10-10 06:09:17,770][53252] Updated weights for policy 0, policy_version 36310 (0.0008) [2023-10-10 06:09:17,961][53268] Updated weights for policy 1, policy_version 36270 (0.0008) [2023-10-10 06:09:18,144][53252] Updated weights for policy 0, policy_version 36320 (0.0008) [2023-10-10 06:09:18,319][53268] Updated weights for policy 1, policy_version 36280 (0.0008) [2023-10-10 06:09:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74350592. Throughput: 0: 1708.3, 1: 1709.2. Samples: 18599028. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:09:21,784][52050] Avg episode reward: [(0, '20.930'), (1, '19.850')] [2023-10-10 06:09:22,387][53252] Updated weights for policy 0, policy_version 36330 (0.0009) [2023-10-10 06:09:22,390][53268] Updated weights for policy 1, policy_version 36290 (0.0009) [2023-10-10 06:09:22,755][53252] Updated weights for policy 0, policy_version 36340 (0.0008) [2023-10-10 06:09:22,756][53268] Updated weights for policy 1, policy_version 36300 (0.0007) [2023-10-10 06:09:23,125][53268] Updated weights for policy 1, policy_version 36310 (0.0007) [2023-10-10 06:09:23,126][53252] Updated weights for policy 0, policy_version 36350 (0.0009) [2023-10-10 06:09:23,499][53268] Updated weights for policy 1, policy_version 36320 (0.0009) [2023-10-10 06:09:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74416128. Throughput: 0: 1679.6, 1: 1692.3. Samples: 18619562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:09:26,784][52050] Avg episode reward: [(0, '19.980'), (1, '19.560')] [2023-10-10 06:09:27,211][53252] Updated weights for policy 0, policy_version 36360 (0.0008) [2023-10-10 06:09:27,473][53268] Updated weights for policy 1, policy_version 36330 (0.0010) [2023-10-10 06:09:27,583][53252] Updated weights for policy 0, policy_version 36370 (0.0008) [2023-10-10 06:09:27,843][53268] Updated weights for policy 1, policy_version 36340 (0.0009) [2023-10-10 06:09:27,949][53252] Updated weights for policy 0, policy_version 36380 (0.0010) [2023-10-10 06:09:28,212][53268] Updated weights for policy 1, policy_version 36350 (0.0011) [2023-10-10 06:09:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74481664. Throughput: 0: 1679.6, 1: 1679.9. Samples: 18628562. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 60.0) [2023-10-10 06:09:31,784][52050] Avg episode reward: [(0, '21.050'), (1, '18.530')] [2023-10-10 06:09:32,008][53252] Updated weights for policy 0, policy_version 36390 (0.0008) [2023-10-10 06:09:32,378][53252] Updated weights for policy 0, policy_version 36400 (0.0008) [2023-10-10 06:09:32,430][53268] Updated weights for policy 1, policy_version 36360 (0.0009) [2023-10-10 06:09:32,743][53252] Updated weights for policy 0, policy_version 36410 (0.0008) [2023-10-10 06:09:32,806][53268] Updated weights for policy 1, policy_version 36370 (0.0007) [2023-10-10 06:09:33,169][53268] Updated weights for policy 1, policy_version 36380 (0.0007) [2023-10-10 06:09:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74547200. Throughput: 0: 1678.3, 1: 1683.5. Samples: 18649002. Policy #0 lag: (min: 31.0, avg: 32.9, max: 60.0) [2023-10-10 06:09:36,784][52050] Avg episode reward: [(0, '20.330'), (1, '19.410')] [2023-10-10 06:09:36,856][53252] Updated weights for policy 0, policy_version 36420 (0.0007) [2023-10-10 06:09:37,223][53252] Updated weights for policy 0, policy_version 36430 (0.0007) [2023-10-10 06:09:37,270][53268] Updated weights for policy 1, policy_version 36390 (0.0007) [2023-10-10 06:09:37,592][53252] Updated weights for policy 0, policy_version 36440 (0.0009) [2023-10-10 06:09:37,635][53268] Updated weights for policy 1, policy_version 36400 (0.0009) [2023-10-10 06:09:37,998][53268] Updated weights for policy 1, policy_version 36410 (0.0008) [2023-10-10 06:09:41,651][53252] Updated weights for policy 0, policy_version 36450 (0.0007) [2023-10-10 06:09:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74612736. Throughput: 0: 1682.9, 1: 1678.7. Samples: 18669800. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 60.0) [2023-10-10 06:09:41,784][52050] Avg episode reward: [(0, '20.450'), (1, '17.690')] [2023-10-10 06:09:42,043][53252] Updated weights for policy 0, policy_version 36460 (0.0010) [2023-10-10 06:09:42,111][53268] Updated weights for policy 1, policy_version 36420 (0.0009) [2023-10-10 06:09:42,400][53252] Updated weights for policy 0, policy_version 36470 (0.0010) [2023-10-10 06:09:42,482][53268] Updated weights for policy 1, policy_version 36430 (0.0009) [2023-10-10 06:09:42,775][53252] Updated weights for policy 0, policy_version 36480 (0.0007) [2023-10-10 06:09:42,839][53268] Updated weights for policy 1, policy_version 36440 (0.0009) [2023-10-10 06:09:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74678272. Throughput: 0: 1678.8, 1: 1673.8. Samples: 18678688. Policy #0 lag: (min: 31.0, avg: 32.9, max: 60.0) [2023-10-10 06:09:46,784][52050] Avg episode reward: [(0, '21.020'), (1, '17.870')] [2023-10-10 06:09:46,845][53252] Updated weights for policy 0, policy_version 36490 (0.0009) [2023-10-10 06:09:46,933][53268] Updated weights for policy 1, policy_version 36450 (0.0009) [2023-10-10 06:09:47,217][53252] Updated weights for policy 0, policy_version 36500 (0.0008) [2023-10-10 06:09:47,295][53268] Updated weights for policy 1, policy_version 36460 (0.0008) [2023-10-10 06:09:47,589][53252] Updated weights for policy 0, policy_version 36510 (0.0009) [2023-10-10 06:09:47,663][53268] Updated weights for policy 1, policy_version 36470 (0.0008) [2023-10-10 06:09:48,032][53268] Updated weights for policy 1, policy_version 36480 (0.0009) [2023-10-10 06:09:51,591][53252] Updated weights for policy 0, policy_version 36520 (0.0010) [2023-10-10 06:09:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74743808. Throughput: 0: 1676.4, 1: 1679.6. Samples: 18699338. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 60.0) [2023-10-10 06:09:51,784][52050] Avg episode reward: [(0, '20.190'), (1, '18.070')] [2023-10-10 06:09:51,962][53252] Updated weights for policy 0, policy_version 36530 (0.0010) [2023-10-10 06:09:52,034][53268] Updated weights for policy 1, policy_version 36490 (0.0008) [2023-10-10 06:09:52,337][53252] Updated weights for policy 0, policy_version 36540 (0.0008) [2023-10-10 06:09:52,397][53268] Updated weights for policy 1, policy_version 36500 (0.0007) [2023-10-10 06:09:52,768][53268] Updated weights for policy 1, policy_version 36510 (0.0008) [2023-10-10 06:09:56,535][53252] Updated weights for policy 0, policy_version 36550 (0.0008) [2023-10-10 06:09:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 74809344. Throughput: 0: 1675.6, 1: 1676.1. Samples: 18720152. Policy #0 lag: (min: 15.0, avg: 19.3, max: 47.0) [2023-10-10 06:09:56,784][52050] Avg episode reward: [(0, '19.420'), (1, '17.850')] [2023-10-10 06:09:56,902][53252] Updated weights for policy 0, policy_version 36560 (0.0007) [2023-10-10 06:09:56,972][53268] Updated weights for policy 1, policy_version 36520 (0.0007) [2023-10-10 06:09:57,274][53252] Updated weights for policy 0, policy_version 36570 (0.0008) [2023-10-10 06:09:57,350][53268] Updated weights for policy 1, policy_version 36530 (0.0009) [2023-10-10 06:09:57,709][53268] Updated weights for policy 1, policy_version 36540 (0.0008) [2023-10-10 06:10:01,248][53252] Updated weights for policy 0, policy_version 36580 (0.0007) [2023-10-10 06:10:01,622][53252] Updated weights for policy 0, policy_version 36590 (0.0008) [2023-10-10 06:10:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74874880. Throughput: 0: 1679.2, 1: 1676.0. Samples: 18729318. 
Policy #0 lag: (min: 15.0, avg: 19.3, max: 47.0) [2023-10-10 06:10:01,784][52050] Avg episode reward: [(0, '18.930'), (1, '17.220')] [2023-10-10 06:10:01,811][53268] Updated weights for policy 1, policy_version 36550 (0.0008) [2023-10-10 06:10:01,998][53252] Updated weights for policy 0, policy_version 36600 (0.0008) [2023-10-10 06:10:02,167][53268] Updated weights for policy 1, policy_version 36560 (0.0008) [2023-10-10 06:10:02,537][53268] Updated weights for policy 1, policy_version 36570 (0.0009) [2023-10-10 06:10:06,150][53252] Updated weights for policy 0, policy_version 36610 (0.0008) [2023-10-10 06:10:06,528][53252] Updated weights for policy 0, policy_version 36620 (0.0009) [2023-10-10 06:10:06,664][53268] Updated weights for policy 1, policy_version 36580 (0.0010) [2023-10-10 06:10:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74940416. Throughput: 0: 1670.0, 1: 1676.4. Samples: 18749616. Policy #0 lag: (min: 15.0, avg: 19.3, max: 47.0) [2023-10-10 06:10:06,784][52050] Avg episode reward: [(0, '18.460'), (1, '19.190')] [2023-10-10 06:10:06,887][53252] Updated weights for policy 0, policy_version 36630 (0.0007) [2023-10-10 06:10:07,033][53268] Updated weights for policy 1, policy_version 36590 (0.0007) [2023-10-10 06:10:07,260][53252] Updated weights for policy 0, policy_version 36640 (0.0007) [2023-10-10 06:10:07,413][53268] Updated weights for policy 1, policy_version 36600 (0.0008) [2023-10-10 06:10:11,437][53252] Updated weights for policy 0, policy_version 36650 (0.0007) [2023-10-10 06:10:11,634][53268] Updated weights for policy 1, policy_version 36610 (0.0009) [2023-10-10 06:10:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 75005952. Throughput: 0: 1661.8, 1: 1674.0. Samples: 18769670. 
Policy #0 lag: (min: 15.0, avg: 19.3, max: 47.0) [2023-10-10 06:10:11,784][52050] Avg episode reward: [(0, '18.860'), (1, '18.790')] [2023-10-10 06:10:11,804][53252] Updated weights for policy 0, policy_version 36660 (0.0007) [2023-10-10 06:10:11,988][53268] Updated weights for policy 1, policy_version 36620 (0.0009) [2023-10-10 06:10:12,185][53252] Updated weights for policy 0, policy_version 36670 (0.0009) [2023-10-10 06:10:12,355][53268] Updated weights for policy 1, policy_version 36630 (0.0010) [2023-10-10 06:10:12,722][53268] Updated weights for policy 1, policy_version 36640 (0.0008) [2023-10-10 06:10:16,227][53252] Updated weights for policy 0, policy_version 36680 (0.0008) [2023-10-10 06:10:16,601][53252] Updated weights for policy 0, policy_version 36690 (0.0007) [2023-10-10 06:10:16,696][53268] Updated weights for policy 1, policy_version 36650 (0.0008) [2023-10-10 06:10:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75071488. Throughput: 0: 1669.2, 1: 1674.4. Samples: 18779026. Policy #0 lag: (min: 15.0, avg: 19.3, max: 47.0) [2023-10-10 06:10:16,784][52050] Avg episode reward: [(0, '18.400'), (1, '18.410')] [2023-10-10 06:10:16,976][53252] Updated weights for policy 0, policy_version 36700 (0.0007) [2023-10-10 06:10:17,060][53268] Updated weights for policy 1, policy_version 36660 (0.0009) [2023-10-10 06:10:17,431][53268] Updated weights for policy 1, policy_version 36670 (0.0010) [2023-10-10 06:10:21,048][53252] Updated weights for policy 0, policy_version 36710 (0.0009) [2023-10-10 06:10:21,414][53252] Updated weights for policy 0, policy_version 36720 (0.0007) [2023-10-10 06:10:21,577][53268] Updated weights for policy 1, policy_version 36680 (0.0009) [2023-10-10 06:10:21,778][53252] Updated weights for policy 0, policy_version 36730 (0.0007) [2023-10-10 06:10:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75137024. Throughput: 0: 1670.3, 1: 1674.4. 
Samples: 18799516. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-10 06:10:21,784][52050] Avg episode reward: [(0, '19.910'), (1, '19.120')] [2023-10-10 06:10:21,946][53268] Updated weights for policy 1, policy_version 36690 (0.0009) [2023-10-10 06:10:22,304][53268] Updated weights for policy 1, policy_version 36700 (0.0011) [2023-10-10 06:10:25,744][53252] Updated weights for policy 0, policy_version 36740 (0.0008) [2023-10-10 06:10:26,119][53252] Updated weights for policy 0, policy_version 36750 (0.0009) [2023-10-10 06:10:26,339][53268] Updated weights for policy 1, policy_version 36710 (0.0008) [2023-10-10 06:10:26,497][53252] Updated weights for policy 0, policy_version 36760 (0.0007) [2023-10-10 06:10:26,700][53268] Updated weights for policy 1, policy_version 36720 (0.0007) [2023-10-10 06:10:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 75202560. Throughput: 0: 1650.6, 1: 1683.3. Samples: 18819826. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-10 06:10:26,784][52050] Avg episode reward: [(0, '20.680'), (1, '18.220')] [2023-10-10 06:10:27,068][53268] Updated weights for policy 1, policy_version 36730 (0.0010) [2023-10-10 06:10:30,575][53252] Updated weights for policy 0, policy_version 36770 (0.0008) [2023-10-10 06:10:30,963][53252] Updated weights for policy 0, policy_version 36780 (0.0008) [2023-10-10 06:10:31,014][53268] Updated weights for policy 1, policy_version 36740 (0.0008) [2023-10-10 06:10:31,342][53252] Updated weights for policy 0, policy_version 36790 (0.0008) [2023-10-10 06:10:31,391][53268] Updated weights for policy 1, policy_version 36750 (0.0009) [2023-10-10 06:10:31,705][53252] Updated weights for policy 0, policy_version 36800 (0.0008) [2023-10-10 06:10:31,747][53268] Updated weights for policy 1, policy_version 36760 (0.0008) [2023-10-10 06:10:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75300864. 
Throughput: 0: 1671.7, 1: 1689.1. Samples: 18829926. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-10 06:10:31,784][52050] Avg episode reward: [(0, '19.900'), (1, '17.840')] [2023-10-10 06:10:35,776][53268] Updated weights for policy 1, policy_version 36770 (0.0010) [2023-10-10 06:10:35,849][53252] Updated weights for policy 0, policy_version 36810 (0.0007) [2023-10-10 06:10:36,143][53268] Updated weights for policy 1, policy_version 36780 (0.0008) [2023-10-10 06:10:36,216][53252] Updated weights for policy 0, policy_version 36820 (0.0008) [2023-10-10 06:10:36,499][53268] Updated weights for policy 1, policy_version 36790 (0.0008) [2023-10-10 06:10:36,595][53252] Updated weights for policy 0, policy_version 36830 (0.0009) [2023-10-10 06:10:36,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75366400. Throughput: 0: 1672.8, 1: 1685.5. Samples: 18850462. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-10 06:10:36,784][52050] Avg episode reward: [(0, '21.360'), (1, '18.550')] [2023-10-10 06:10:36,868][53268] Updated weights for policy 1, policy_version 36800 (0.0009) [2023-10-10 06:10:40,588][53252] Updated weights for policy 0, policy_version 36840 (0.0008) [2023-10-10 06:10:40,963][53252] Updated weights for policy 0, policy_version 36850 (0.0009) [2023-10-10 06:10:41,027][53268] Updated weights for policy 1, policy_version 36810 (0.0008) [2023-10-10 06:10:41,327][53252] Updated weights for policy 0, policy_version 36860 (0.0008) [2023-10-10 06:10:41,395][53268] Updated weights for policy 1, policy_version 36820 (0.0009) [2023-10-10 06:10:41,763][53268] Updated weights for policy 1, policy_version 36830 (0.0009) [2023-10-10 06:10:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75431936. Throughput: 0: 1652.0, 1: 1664.6. Samples: 18869396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:10:41,784][52050] Avg episode reward: [(0, '20.090'), (1, '18.200')] [2023-10-10 06:10:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000036864_37748736.pth... [2023-10-10 06:10:41,829][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000035296_36143104.pth [2023-10-10 06:10:41,832][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000036832_37715968.pth... [2023-10-10 06:10:41,862][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000035232_36077568.pth [2023-10-10 06:10:45,479][53252] Updated weights for policy 0, policy_version 36870 (0.0009) [2023-10-10 06:10:45,849][53252] Updated weights for policy 0, policy_version 36880 (0.0009) [2023-10-10 06:10:45,970][53268] Updated weights for policy 1, policy_version 36840 (0.0007) [2023-10-10 06:10:46,216][53252] Updated weights for policy 0, policy_version 36890 (0.0009) [2023-10-10 06:10:46,336][53268] Updated weights for policy 1, policy_version 36850 (0.0007) [2023-10-10 06:10:46,701][53268] Updated weights for policy 1, policy_version 36860 (0.0007) [2023-10-10 06:10:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75497472. Throughput: 0: 1670.8, 1: 1674.7. Samples: 18879868. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:10:46,784][52050] Avg episode reward: [(0, '18.860'), (1, '18.300')] [2023-10-10 06:10:50,454][53252] Updated weights for policy 0, policy_version 36900 (0.0010) [2023-10-10 06:10:50,799][53268] Updated weights for policy 1, policy_version 36870 (0.0008) [2023-10-10 06:10:50,826][53252] Updated weights for policy 0, policy_version 36910 (0.0009) [2023-10-10 06:10:51,172][53268] Updated weights for policy 1, policy_version 36880 (0.0008) [2023-10-10 06:10:51,192][53252] Updated weights for policy 0, policy_version 36920 (0.0007) [2023-10-10 06:10:51,535][53268] Updated weights for policy 1, policy_version 36890 (0.0009) [2023-10-10 06:10:51,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 75595776. Throughput: 0: 1670.4, 1: 1676.9. Samples: 18900242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:10:51,784][52050] Avg episode reward: [(0, '19.720'), (1, '18.710')] [2023-10-10 06:10:55,397][53252] Updated weights for policy 0, policy_version 36930 (0.0010) [2023-10-10 06:10:55,730][53268] Updated weights for policy 1, policy_version 36900 (0.0008) [2023-10-10 06:10:55,765][53252] Updated weights for policy 0, policy_version 36940 (0.0008) [2023-10-10 06:10:56,097][53268] Updated weights for policy 1, policy_version 36910 (0.0009) [2023-10-10 06:10:56,134][53252] Updated weights for policy 0, policy_version 36950 (0.0008) [2023-10-10 06:10:56,477][53268] Updated weights for policy 1, policy_version 36920 (0.0009) [2023-10-10 06:10:56,508][53252] Updated weights for policy 0, policy_version 36960 (0.0008) [2023-10-10 06:10:56,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 75661312. Throughput: 0: 1659.5, 1: 1662.9. Samples: 18919180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:10:56,784][52050] Avg episode reward: [(0, '21.460'), (1, '19.770')] [2023-10-10 06:11:00,493][53268] Updated weights for policy 1, policy_version 36930 (0.0009) [2023-10-10 06:11:00,741][53252] Updated weights for policy 0, policy_version 36970 (0.0009) [2023-10-10 06:11:00,866][53268] Updated weights for policy 1, policy_version 36940 (0.0009) [2023-10-10 06:11:01,115][53252] Updated weights for policy 0, policy_version 36980 (0.0009) [2023-10-10 06:11:01,226][53268] Updated weights for policy 1, policy_version 36950 (0.0009) [2023-10-10 06:11:01,481][53252] Updated weights for policy 0, policy_version 36990 (0.0008) [2023-10-10 06:11:01,588][53268] Updated weights for policy 1, policy_version 36960 (0.0008) [2023-10-10 06:11:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 75726848. Throughput: 0: 1675.6, 1: 1677.5. Samples: 18929914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:01,784][52050] Avg episode reward: [(0, '19.940'), (1, '19.550')] [2023-10-10 06:11:05,443][53252] Updated weights for policy 0, policy_version 37000 (0.0008) [2023-10-10 06:11:05,698][53268] Updated weights for policy 1, policy_version 36970 (0.0009) [2023-10-10 06:11:05,819][53252] Updated weights for policy 0, policy_version 37010 (0.0009) [2023-10-10 06:11:06,068][53268] Updated weights for policy 1, policy_version 36980 (0.0008) [2023-10-10 06:11:06,189][53252] Updated weights for policy 0, policy_version 37020 (0.0008) [2023-10-10 06:11:06,434][53268] Updated weights for policy 1, policy_version 36990 (0.0009) [2023-10-10 06:11:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 75792384. Throughput: 0: 1671.2, 1: 1677.3. Samples: 18950202. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 34.0) [2023-10-10 06:11:06,784][52050] Avg episode reward: [(0, '21.010'), (1, '19.750')] [2023-10-10 06:11:10,209][53252] Updated weights for policy 0, policy_version 37030 (0.0008) [2023-10-10 06:11:10,569][53268] Updated weights for policy 1, policy_version 37000 (0.0010) [2023-10-10 06:11:10,579][53252] Updated weights for policy 0, policy_version 37040 (0.0007) [2023-10-10 06:11:10,936][53268] Updated weights for policy 1, policy_version 37010 (0.0009) [2023-10-10 06:11:10,949][53252] Updated weights for policy 0, policy_version 37050 (0.0007) [2023-10-10 06:11:11,310][53268] Updated weights for policy 1, policy_version 37020 (0.0008) [2023-10-10 06:11:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 75857920. Throughput: 0: 1663.5, 1: 1652.0. Samples: 18969020. Policy #0 lag: (min: 2.0, avg: 11.9, max: 34.0) [2023-10-10 06:11:11,785][52050] Avg episode reward: [(0, '20.770'), (1, '18.870')] [2023-10-10 06:11:15,107][53252] Updated weights for policy 0, policy_version 37060 (0.0007) [2023-10-10 06:11:15,307][53268] Updated weights for policy 1, policy_version 37030 (0.0008) [2023-10-10 06:11:15,483][53252] Updated weights for policy 0, policy_version 37070 (0.0009) [2023-10-10 06:11:15,670][53268] Updated weights for policy 1, policy_version 37040 (0.0007) [2023-10-10 06:11:15,851][53252] Updated weights for policy 0, policy_version 37080 (0.0009) [2023-10-10 06:11:16,034][53268] Updated weights for policy 1, policy_version 37050 (0.0008) [2023-10-10 06:11:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 75923456. Throughput: 0: 1669.3, 1: 1668.8. Samples: 18980140. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 34.0) [2023-10-10 06:11:16,784][52050] Avg episode reward: [(0, '20.070'), (1, '17.620')] [2023-10-10 06:11:19,913][53252] Updated weights for policy 0, policy_version 37090 (0.0008) [2023-10-10 06:11:20,118][53268] Updated weights for policy 1, policy_version 37060 (0.0009) [2023-10-10 06:11:20,309][53252] Updated weights for policy 0, policy_version 37100 (0.0008) [2023-10-10 06:11:20,479][53268] Updated weights for policy 1, policy_version 37070 (0.0008) [2023-10-10 06:11:20,680][53252] Updated weights for policy 0, policy_version 37110 (0.0009) [2023-10-10 06:11:20,845][53268] Updated weights for policy 1, policy_version 37080 (0.0008) [2023-10-10 06:11:21,045][53252] Updated weights for policy 0, policy_version 37120 (0.0008) [2023-10-10 06:11:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 75988992. Throughput: 0: 1656.0, 1: 1670.0. Samples: 19000136. Policy #0 lag: (min: 2.0, avg: 11.9, max: 34.0) [2023-10-10 06:11:21,784][52050] Avg episode reward: [(0, '20.070'), (1, '18.280')] [2023-10-10 06:11:24,964][53268] Updated weights for policy 1, policy_version 37090 (0.0008) [2023-10-10 06:11:25,083][53252] Updated weights for policy 0, policy_version 37130 (0.0008) [2023-10-10 06:11:25,327][53268] Updated weights for policy 1, policy_version 37100 (0.0008) [2023-10-10 06:11:25,452][53252] Updated weights for policy 0, policy_version 37140 (0.0010) [2023-10-10 06:11:25,692][53268] Updated weights for policy 1, policy_version 37110 (0.0009) [2023-10-10 06:11:25,831][53252] Updated weights for policy 0, policy_version 37150 (0.0008) [2023-10-10 06:11:26,064][53268] Updated weights for policy 1, policy_version 37120 (0.0008) [2023-10-10 06:11:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 76054528. Throughput: 0: 1663.9, 1: 1656.6. Samples: 19018818. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 34.0) [2023-10-10 06:11:26,784][52050] Avg episode reward: [(0, '22.120'), (1, '17.750')] [2023-10-10 06:11:29,838][53252] Updated weights for policy 0, policy_version 37160 (0.0008) [2023-10-10 06:11:29,936][53268] Updated weights for policy 1, policy_version 37130 (0.0008) [2023-10-10 06:11:30,207][53252] Updated weights for policy 0, policy_version 37170 (0.0008) [2023-10-10 06:11:30,300][53268] Updated weights for policy 1, policy_version 37140 (0.0008) [2023-10-10 06:11:30,575][53252] Updated weights for policy 0, policy_version 37180 (0.0008) [2023-10-10 06:11:30,668][53268] Updated weights for policy 1, policy_version 37150 (0.0008) [2023-10-10 06:11:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 76120064. Throughput: 0: 1670.0, 1: 1680.2. Samples: 19030628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:31,784][52050] Avg episode reward: [(0, '22.940'), (1, '17.190')] [2023-10-10 06:11:31,785][52846] Saving new best policy, reward=22.940! [2023-10-10 06:11:34,564][53252] Updated weights for policy 0, policy_version 37190 (0.0007) [2023-10-10 06:11:34,786][53268] Updated weights for policy 1, policy_version 37160 (0.0009) [2023-10-10 06:11:34,932][53252] Updated weights for policy 0, policy_version 37200 (0.0008) [2023-10-10 06:11:35,141][53268] Updated weights for policy 1, policy_version 37170 (0.0009) [2023-10-10 06:11:35,302][53252] Updated weights for policy 0, policy_version 37210 (0.0009) [2023-10-10 06:11:35,507][53268] Updated weights for policy 1, policy_version 37180 (0.0007) [2023-10-10 06:11:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 76185600. Throughput: 0: 1653.9, 1: 1661.3. Samples: 19049428. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:36,784][52050] Avg episode reward: [(0, '20.900'), (1, '16.950')] [2023-10-10 06:11:39,276][53252] Updated weights for policy 0, policy_version 37220 (0.0008) [2023-10-10 06:11:39,552][53268] Updated weights for policy 1, policy_version 37190 (0.0009) [2023-10-10 06:11:39,648][53252] Updated weights for policy 0, policy_version 37230 (0.0008) [2023-10-10 06:11:39,915][53268] Updated weights for policy 1, policy_version 37200 (0.0009) [2023-10-10 06:11:40,027][53252] Updated weights for policy 0, policy_version 37240 (0.0008) [2023-10-10 06:11:40,285][53268] Updated weights for policy 1, policy_version 37210 (0.0009) [2023-10-10 06:11:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 76251136. Throughput: 0: 1674.1, 1: 1667.5. Samples: 19069552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:41,784][52050] Avg episode reward: [(0, '22.090'), (1, '17.260')] [2023-10-10 06:11:44,086][53252] Updated weights for policy 0, policy_version 37250 (0.0009) [2023-10-10 06:11:44,434][53268] Updated weights for policy 1, policy_version 37220 (0.0008) [2023-10-10 06:11:44,447][53252] Updated weights for policy 0, policy_version 37260 (0.0009) [2023-10-10 06:11:44,803][53268] Updated weights for policy 1, policy_version 37230 (0.0008) [2023-10-10 06:11:44,810][53252] Updated weights for policy 0, policy_version 37270 (0.0009) [2023-10-10 06:11:45,170][53268] Updated weights for policy 1, policy_version 37240 (0.0008) [2023-10-10 06:11:45,182][53252] Updated weights for policy 0, policy_version 37280 (0.0011) [2023-10-10 06:11:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 76316672. Throughput: 0: 1672.5, 1: 1678.9. Samples: 19080728. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:46,784][52050] Avg episode reward: [(0, '21.130'), (1, '17.520')] [2023-10-10 06:11:49,157][53268] Updated weights for policy 1, policy_version 37250 (0.0010) [2023-10-10 06:11:49,302][53252] Updated weights for policy 0, policy_version 37290 (0.0007) [2023-10-10 06:11:49,516][53268] Updated weights for policy 1, policy_version 37260 (0.0008) [2023-10-10 06:11:49,671][53252] Updated weights for policy 0, policy_version 37300 (0.0008) [2023-10-10 06:11:49,888][53268] Updated weights for policy 1, policy_version 37270 (0.0009) [2023-10-10 06:11:50,040][53252] Updated weights for policy 0, policy_version 37310 (0.0009) [2023-10-10 06:11:50,262][53268] Updated weights for policy 1, policy_version 37280 (0.0010) [2023-10-10 06:11:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 76382208. Throughput: 0: 1657.9, 1: 1658.7. Samples: 19099448. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:51,784][52050] Avg episode reward: [(0, '19.470'), (1, '18.710')] [2023-10-10 06:11:54,150][53252] Updated weights for policy 0, policy_version 37320 (0.0008) [2023-10-10 06:11:54,335][53268] Updated weights for policy 1, policy_version 37290 (0.0009) [2023-10-10 06:11:54,516][53252] Updated weights for policy 0, policy_version 37330 (0.0007) [2023-10-10 06:11:54,703][53268] Updated weights for policy 1, policy_version 37300 (0.0007) [2023-10-10 06:11:54,886][53252] Updated weights for policy 0, policy_version 37340 (0.0008) [2023-10-10 06:11:55,060][53268] Updated weights for policy 1, policy_version 37310 (0.0010) [2023-10-10 06:11:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 76447744. Throughput: 0: 1679.7, 1: 1681.0. Samples: 19120252. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:11:56,784][52050] Avg episode reward: [(0, '19.550'), (1, '18.650')] [2023-10-10 06:11:58,941][53252] Updated weights for policy 0, policy_version 37350 (0.0009) [2023-10-10 06:11:59,038][53268] Updated weights for policy 1, policy_version 37320 (0.0008) [2023-10-10 06:11:59,312][53252] Updated weights for policy 0, policy_version 37360 (0.0009) [2023-10-10 06:11:59,404][53268] Updated weights for policy 1, policy_version 37330 (0.0008) [2023-10-10 06:11:59,685][53252] Updated weights for policy 0, policy_version 37370 (0.0009) [2023-10-10 06:11:59,770][53268] Updated weights for policy 1, policy_version 37340 (0.0007) [2023-10-10 06:12:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 76513280. Throughput: 0: 1668.9, 1: 1682.4. Samples: 19130948. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:12:01,784][52050] Avg episode reward: [(0, '18.780'), (1, '18.790')] [2023-10-10 06:12:03,729][53268] Updated weights for policy 1, policy_version 37350 (0.0009) [2023-10-10 06:12:03,869][53252] Updated weights for policy 0, policy_version 37380 (0.0009) [2023-10-10 06:12:04,107][53268] Updated weights for policy 1, policy_version 37360 (0.0009) [2023-10-10 06:12:04,235][53252] Updated weights for policy 0, policy_version 37390 (0.0008) [2023-10-10 06:12:04,467][53268] Updated weights for policy 1, policy_version 37370 (0.0008) [2023-10-10 06:12:04,594][53252] Updated weights for policy 0, policy_version 37400 (0.0008) [2023-10-10 06:12:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 76578816. Throughput: 0: 1666.9, 1: 1670.4. Samples: 19150316. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:12:06,784][52050] Avg episode reward: [(0, '20.220'), (1, '20.640')] [2023-10-10 06:12:08,486][53268] Updated weights for policy 1, policy_version 37380 (0.0009) [2023-10-10 06:12:08,747][53252] Updated weights for policy 0, policy_version 37410 (0.0009) [2023-10-10 06:12:08,855][53268] Updated weights for policy 1, policy_version 37390 (0.0008) [2023-10-10 06:12:09,152][53252] Updated weights for policy 0, policy_version 37420 (0.0010) [2023-10-10 06:12:09,221][53268] Updated weights for policy 1, policy_version 37400 (0.0009) [2023-10-10 06:12:09,530][53252] Updated weights for policy 0, policy_version 37430 (0.0007) [2023-10-10 06:12:09,899][53252] Updated weights for policy 0, policy_version 37440 (0.0009) [2023-10-10 06:12:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76644352. Throughput: 0: 1676.6, 1: 1701.5. Samples: 19170832. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:12:11,784][52050] Avg episode reward: [(0, '19.920'), (1, '20.190')] [2023-10-10 06:12:13,320][53268] Updated weights for policy 1, policy_version 37410 (0.0009) [2023-10-10 06:12:13,685][53268] Updated weights for policy 1, policy_version 37420 (0.0009) [2023-10-10 06:12:14,047][53268] Updated weights for policy 1, policy_version 37430 (0.0009) [2023-10-10 06:12:14,181][53252] Updated weights for policy 0, policy_version 37450 (0.0009) [2023-10-10 06:12:14,413][53268] Updated weights for policy 1, policy_version 37440 (0.0009) [2023-10-10 06:12:14,548][53252] Updated weights for policy 0, policy_version 37460 (0.0009) [2023-10-10 06:12:14,920][53252] Updated weights for policy 0, policy_version 37470 (0.0010) [2023-10-10 06:12:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76709888. Throughput: 0: 1661.3, 1: 1678.7. Samples: 19180930. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:12:16,784][52050] Avg episode reward: [(0, '20.040'), (1, '20.470')] [2023-10-10 06:12:18,576][53268] Updated weights for policy 1, policy_version 37450 (0.0010) [2023-10-10 06:12:18,910][53252] Updated weights for policy 0, policy_version 37480 (0.0009) [2023-10-10 06:12:18,935][53268] Updated weights for policy 1, policy_version 37460 (0.0008) [2023-10-10 06:12:19,284][53252] Updated weights for policy 0, policy_version 37490 (0.0007) [2023-10-10 06:12:19,298][53268] Updated weights for policy 1, policy_version 37470 (0.0007) [2023-10-10 06:12:19,650][53252] Updated weights for policy 0, policy_version 37500 (0.0007) [2023-10-10 06:12:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76775424. Throughput: 0: 1673.2, 1: 1689.1. Samples: 19200728. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:12:21,784][52050] Avg episode reward: [(0, '19.650'), (1, '20.920')] [2023-10-10 06:12:23,578][53252] Updated weights for policy 0, policy_version 37510 (0.0008) [2023-10-10 06:12:23,608][53268] Updated weights for policy 1, policy_version 37480 (0.0007) [2023-10-10 06:12:23,953][53252] Updated weights for policy 0, policy_version 37520 (0.0007) [2023-10-10 06:12:23,984][53268] Updated weights for policy 1, policy_version 37490 (0.0008) [2023-10-10 06:12:24,322][53252] Updated weights for policy 0, policy_version 37530 (0.0008) [2023-10-10 06:12:24,342][53268] Updated weights for policy 1, policy_version 37500 (0.0008) [2023-10-10 06:12:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76840960. Throughput: 0: 1678.9, 1: 1691.6. Samples: 19221224. Policy #0 lag: (min: 24.0, avg: 53.3, max: 56.0) [2023-10-10 06:12:26,784][52050] Avg episode reward: [(0, '19.810'), (1, '22.120')] [2023-10-10 06:12:26,797][53061] Saving new best policy, reward=22.120! 
[2023-10-10 06:12:28,433][53268] Updated weights for policy 1, policy_version 37510 (0.0008) [2023-10-10 06:12:28,459][53252] Updated weights for policy 0, policy_version 37540 (0.0008) [2023-10-10 06:12:28,793][53268] Updated weights for policy 1, policy_version 37520 (0.0007) [2023-10-10 06:12:28,827][53252] Updated weights for policy 0, policy_version 37550 (0.0008) [2023-10-10 06:12:29,155][53268] Updated weights for policy 1, policy_version 37530 (0.0009) [2023-10-10 06:12:29,203][53252] Updated weights for policy 0, policy_version 37560 (0.0009) [2023-10-10 06:12:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76906496. Throughput: 0: 1660.0, 1: 1673.7. Samples: 19230746. Policy #0 lag: (min: 24.0, avg: 53.3, max: 56.0) [2023-10-10 06:12:31,784][52050] Avg episode reward: [(0, '18.880'), (1, '20.940')] [2023-10-10 06:12:33,167][53268] Updated weights for policy 1, policy_version 37540 (0.0009) [2023-10-10 06:12:33,452][53252] Updated weights for policy 0, policy_version 37570 (0.0009) [2023-10-10 06:12:33,529][53268] Updated weights for policy 1, policy_version 37550 (0.0007) [2023-10-10 06:12:33,819][53252] Updated weights for policy 0, policy_version 37580 (0.0010) [2023-10-10 06:12:33,887][53268] Updated weights for policy 1, policy_version 37560 (0.0008) [2023-10-10 06:12:34,180][53252] Updated weights for policy 0, policy_version 37590 (0.0008) [2023-10-10 06:12:34,557][53252] Updated weights for policy 0, policy_version 37600 (0.0007) [2023-10-10 06:12:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76972032. Throughput: 0: 1674.1, 1: 1690.6. Samples: 19250858. 
Policy #0 lag: (min: 24.0, avg: 53.3, max: 56.0) [2023-10-10 06:12:36,784][52050] Avg episode reward: [(0, '19.060'), (1, '20.440')] [2023-10-10 06:12:38,089][53268] Updated weights for policy 1, policy_version 37570 (0.0009) [2023-10-10 06:12:38,458][53268] Updated weights for policy 1, policy_version 37580 (0.0007) [2023-10-10 06:12:38,600][53252] Updated weights for policy 0, policy_version 37610 (0.0008) [2023-10-10 06:12:38,814][53268] Updated weights for policy 1, policy_version 37590 (0.0009) [2023-10-10 06:12:38,966][53252] Updated weights for policy 0, policy_version 37620 (0.0009) [2023-10-10 06:12:39,184][53268] Updated weights for policy 1, policy_version 37600 (0.0009) [2023-10-10 06:12:39,334][53252] Updated weights for policy 0, policy_version 37630 (0.0010) [2023-10-10 06:12:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77037568. Throughput: 0: 1674.2, 1: 1682.3. Samples: 19271294. Policy #0 lag: (min: 24.0, avg: 53.3, max: 56.0) [2023-10-10 06:12:41,784][52050] Avg episode reward: [(0, '19.980'), (1, '18.950')] [2023-10-10 06:12:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000037632_38535168.pth... [2023-10-10 06:12:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000037600_38502400.pth... 
[2023-10-10 06:12:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000036032_36896768.pth [2023-10-10 06:12:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000036064_36929536.pth [2023-10-10 06:12:43,358][53268] Updated weights for policy 1, policy_version 37610 (0.0010) [2023-10-10 06:12:43,361][53252] Updated weights for policy 0, policy_version 37640 (0.0009) [2023-10-10 06:12:43,715][53268] Updated weights for policy 1, policy_version 37620 (0.0009) [2023-10-10 06:12:43,729][53252] Updated weights for policy 0, policy_version 37650 (0.0007) [2023-10-10 06:12:44,079][53268] Updated weights for policy 1, policy_version 37630 (0.0010) [2023-10-10 06:12:44,095][53252] Updated weights for policy 0, policy_version 37660 (0.0007) [2023-10-10 06:12:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77103104. Throughput: 0: 1656.5, 1: 1664.2. Samples: 19280382. Policy #0 lag: (min: 24.0, avg: 53.3, max: 56.0) [2023-10-10 06:12:46,784][52050] Avg episode reward: [(0, '20.640'), (1, '18.750')] [2023-10-10 06:12:48,095][53268] Updated weights for policy 1, policy_version 37640 (0.0009) [2023-10-10 06:12:48,207][53252] Updated weights for policy 0, policy_version 37670 (0.0008) [2023-10-10 06:12:48,453][53268] Updated weights for policy 1, policy_version 37650 (0.0009) [2023-10-10 06:12:48,569][53252] Updated weights for policy 0, policy_version 37680 (0.0009) [2023-10-10 06:12:48,821][53268] Updated weights for policy 1, policy_version 37660 (0.0008) [2023-10-10 06:12:48,936][53252] Updated weights for policy 0, policy_version 37690 (0.0008) [2023-10-10 06:12:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77168640. Throughput: 0: 1672.8, 1: 1674.4. Samples: 19300940. 
Policy #0 lag: (min: 24.0, avg: 53.3, max: 56.0) [2023-10-10 06:12:51,784][52050] Avg episode reward: [(0, '19.090'), (1, '18.440')] [2023-10-10 06:12:52,882][53268] Updated weights for policy 1, policy_version 37670 (0.0008) [2023-10-10 06:12:53,050][53252] Updated weights for policy 0, policy_version 37700 (0.0008) [2023-10-10 06:12:53,243][53268] Updated weights for policy 1, policy_version 37680 (0.0008) [2023-10-10 06:12:53,426][53252] Updated weights for policy 0, policy_version 37710 (0.0009) [2023-10-10 06:12:53,599][53268] Updated weights for policy 1, policy_version 37690 (0.0008) [2023-10-10 06:12:53,793][53252] Updated weights for policy 0, policy_version 37720 (0.0009) [2023-10-10 06:12:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77234176. Throughput: 0: 1681.2, 1: 1671.5. Samples: 19321704. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 06:12:56,784][52050] Avg episode reward: [(0, '19.420'), (1, '19.310')] [2023-10-10 06:12:57,675][53268] Updated weights for policy 1, policy_version 37700 (0.0010) [2023-10-10 06:12:57,894][53252] Updated weights for policy 0, policy_version 37730 (0.0007) [2023-10-10 06:12:58,033][53268] Updated weights for policy 1, policy_version 37710 (0.0009) [2023-10-10 06:12:58,293][53252] Updated weights for policy 0, policy_version 37740 (0.0009) [2023-10-10 06:12:58,406][53268] Updated weights for policy 1, policy_version 37720 (0.0008) [2023-10-10 06:12:58,661][53252] Updated weights for policy 0, policy_version 37750 (0.0008) [2023-10-10 06:12:59,033][53252] Updated weights for policy 0, policy_version 37760 (0.0010) [2023-10-10 06:13:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 77299712. Throughput: 0: 1666.4, 1: 1661.8. Samples: 19330696. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 06:13:01,785][52050] Avg episode reward: [(0, '18.740'), (1, '20.730')] [2023-10-10 06:13:02,447][53268] Updated weights for policy 1, policy_version 37730 (0.0008) [2023-10-10 06:13:02,818][53268] Updated weights for policy 1, policy_version 37740 (0.0007) [2023-10-10 06:13:03,035][53252] Updated weights for policy 0, policy_version 37770 (0.0008) [2023-10-10 06:13:03,190][53268] Updated weights for policy 1, policy_version 37750 (0.0008) [2023-10-10 06:13:03,403][53252] Updated weights for policy 0, policy_version 37780 (0.0009) [2023-10-10 06:13:03,547][53268] Updated weights for policy 1, policy_version 37760 (0.0009) [2023-10-10 06:13:03,781][53252] Updated weights for policy 0, policy_version 37790 (0.0010) [2023-10-10 06:13:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 77365248. Throughput: 0: 1677.2, 1: 1671.2. Samples: 19351406. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 06:13:06,784][52050] Avg episode reward: [(0, '17.980'), (1, '19.080')] [2023-10-10 06:13:07,556][53268] Updated weights for policy 1, policy_version 37770 (0.0008) [2023-10-10 06:13:07,921][53268] Updated weights for policy 1, policy_version 37780 (0.0007) [2023-10-10 06:13:07,989][53252] Updated weights for policy 0, policy_version 37800 (0.0008) [2023-10-10 06:13:08,286][53268] Updated weights for policy 1, policy_version 37790 (0.0007) [2023-10-10 06:13:08,356][53252] Updated weights for policy 0, policy_version 37810 (0.0010) [2023-10-10 06:13:08,725][53252] Updated weights for policy 0, policy_version 37820 (0.0008) [2023-10-10 06:13:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 77430784. Throughput: 0: 1671.7, 1: 1682.6. Samples: 19372166. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 06:13:11,784][52050] Avg episode reward: [(0, '18.540'), (1, '19.470')] [2023-10-10 06:13:12,336][53268] Updated weights for policy 1, policy_version 37800 (0.0008) [2023-10-10 06:13:12,701][53268] Updated weights for policy 1, policy_version 37810 (0.0007) [2023-10-10 06:13:12,712][53252] Updated weights for policy 0, policy_version 37830 (0.0008) [2023-10-10 06:13:13,067][53268] Updated weights for policy 1, policy_version 37820 (0.0007) [2023-10-10 06:13:13,085][53252] Updated weights for policy 0, policy_version 37840 (0.0008) [2023-10-10 06:13:13,449][53252] Updated weights for policy 0, policy_version 37850 (0.0009) [2023-10-10 06:13:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77496320. Throughput: 0: 1669.6, 1: 1678.3. Samples: 19381400. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 06:13:16,784][52050] Avg episode reward: [(0, '18.940'), (1, '19.000')] [2023-10-10 06:13:17,252][53268] Updated weights for policy 1, policy_version 37830 (0.0007) [2023-10-10 06:13:17,545][53252] Updated weights for policy 0, policy_version 37860 (0.0009) [2023-10-10 06:13:17,620][53268] Updated weights for policy 1, policy_version 37840 (0.0007) [2023-10-10 06:13:17,906][53252] Updated weights for policy 0, policy_version 37870 (0.0008) [2023-10-10 06:13:17,989][53268] Updated weights for policy 1, policy_version 37850 (0.0009) [2023-10-10 06:13:18,280][53252] Updated weights for policy 0, policy_version 37880 (0.0008) [2023-10-10 06:13:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 77561856. Throughput: 0: 1676.3, 1: 1686.7. Samples: 19402196. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) [2023-10-10 06:13:21,784][52050] Avg episode reward: [(0, '20.750'), (1, '18.040')] [2023-10-10 06:13:22,046][53268] Updated weights for policy 1, policy_version 37860 (0.0008) [2023-10-10 06:13:22,382][53252] Updated weights for policy 0, policy_version 37890 (0.0010) [2023-10-10 06:13:22,418][53268] Updated weights for policy 1, policy_version 37870 (0.0009) [2023-10-10 06:13:22,754][53252] Updated weights for policy 0, policy_version 37900 (0.0008) [2023-10-10 06:13:22,786][53268] Updated weights for policy 1, policy_version 37880 (0.0009) [2023-10-10 06:13:23,111][53252] Updated weights for policy 0, policy_version 37910 (0.0008) [2023-10-10 06:13:23,483][53252] Updated weights for policy 0, policy_version 37920 (0.0007) [2023-10-10 06:13:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77627392. Throughput: 0: 1676.0, 1: 1690.6. Samples: 19422792. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) [2023-10-10 06:13:26,784][52050] Avg episode reward: [(0, '20.740'), (1, '17.330')] [2023-10-10 06:13:26,810][53268] Updated weights for policy 1, policy_version 37890 (0.0009) [2023-10-10 06:13:27,182][53268] Updated weights for policy 1, policy_version 37900 (0.0009) [2023-10-10 06:13:27,540][53268] Updated weights for policy 1, policy_version 37910 (0.0008) [2023-10-10 06:13:27,644][53252] Updated weights for policy 0, policy_version 37930 (0.0008) [2023-10-10 06:13:27,914][53268] Updated weights for policy 1, policy_version 37920 (0.0008) [2023-10-10 06:13:28,022][53252] Updated weights for policy 0, policy_version 37940 (0.0009) [2023-10-10 06:13:28,392][53252] Updated weights for policy 0, policy_version 37950 (0.0008) [2023-10-10 06:13:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 77692928. Throughput: 0: 1680.4, 1: 1687.9. Samples: 19431954. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) [2023-10-10 06:13:31,784][52050] Avg episode reward: [(0, '21.080'), (1, '18.630')] [2023-10-10 06:13:32,027][53268] Updated weights for policy 1, policy_version 37930 (0.0009) [2023-10-10 06:13:32,400][53268] Updated weights for policy 1, policy_version 37940 (0.0007) [2023-10-10 06:13:32,548][53252] Updated weights for policy 0, policy_version 37960 (0.0009) [2023-10-10 06:13:32,774][53268] Updated weights for policy 1, policy_version 37950 (0.0009) [2023-10-10 06:13:32,915][53252] Updated weights for policy 0, policy_version 37970 (0.0009) [2023-10-10 06:13:33,298][53252] Updated weights for policy 0, policy_version 37980 (0.0007) [2023-10-10 06:13:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77758464. Throughput: 0: 1680.9, 1: 1689.4. Samples: 19452604. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) [2023-10-10 06:13:36,784][52050] Avg episode reward: [(0, '20.410'), (1, '18.070')] [2023-10-10 06:13:36,838][53268] Updated weights for policy 1, policy_version 37960 (0.0009) [2023-10-10 06:13:37,205][53268] Updated weights for policy 1, policy_version 37970 (0.0008) [2023-10-10 06:13:37,254][53252] Updated weights for policy 0, policy_version 37990 (0.0010) [2023-10-10 06:13:37,580][53268] Updated weights for policy 1, policy_version 37980 (0.0008) [2023-10-10 06:13:37,622][53252] Updated weights for policy 0, policy_version 38000 (0.0008) [2023-10-10 06:13:37,992][53252] Updated weights for policy 0, policy_version 38010 (0.0009) [2023-10-10 06:13:41,672][53268] Updated weights for policy 1, policy_version 37990 (0.0009) [2023-10-10 06:13:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77824000. Throughput: 0: 1685.7, 1: 1691.6. Samples: 19473680. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) [2023-10-10 06:13:41,784][52050] Avg episode reward: [(0, '19.010'), (1, '18.830')] [2023-10-10 06:13:41,887][53252] Updated weights for policy 0, policy_version 38020 (0.0007) [2023-10-10 06:13:42,049][53268] Updated weights for policy 1, policy_version 38000 (0.0009) [2023-10-10 06:13:42,270][53252] Updated weights for policy 0, policy_version 38030 (0.0008) [2023-10-10 06:13:42,412][53268] Updated weights for policy 1, policy_version 38010 (0.0008) [2023-10-10 06:13:42,646][53252] Updated weights for policy 0, policy_version 38040 (0.0008) [2023-10-10 06:13:46,467][53268] Updated weights for policy 1, policy_version 38020 (0.0008) [2023-10-10 06:13:46,606][53252] Updated weights for policy 0, policy_version 38050 (0.0010) [2023-10-10 06:13:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77889536. Throughput: 0: 1686.1, 1: 1690.8. Samples: 19482654. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) [2023-10-10 06:13:46,784][52050] Avg episode reward: [(0, '18.820'), (1, '19.550')] [2023-10-10 06:13:46,832][53268] Updated weights for policy 1, policy_version 38030 (0.0008) [2023-10-10 06:13:46,987][53252] Updated weights for policy 0, policy_version 38060 (0.0007) [2023-10-10 06:13:47,197][53268] Updated weights for policy 1, policy_version 38040 (0.0008) [2023-10-10 06:13:47,357][53252] Updated weights for policy 0, policy_version 38070 (0.0008) [2023-10-10 06:13:47,724][53252] Updated weights for policy 0, policy_version 38080 (0.0009) [2023-10-10 06:13:51,266][53268] Updated weights for policy 1, policy_version 38050 (0.0008) [2023-10-10 06:13:51,635][53268] Updated weights for policy 1, policy_version 38060 (0.0007) [2023-10-10 06:13:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77955072. Throughput: 0: 1686.4, 1: 1690.2. Samples: 19503352. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) [2023-10-10 06:13:51,784][52050] Avg episode reward: [(0, '19.840'), (1, '20.400')] [2023-10-10 06:13:51,799][53252] Updated weights for policy 0, policy_version 38090 (0.0009) [2023-10-10 06:13:52,010][53268] Updated weights for policy 1, policy_version 38070 (0.0008) [2023-10-10 06:13:52,167][53252] Updated weights for policy 0, policy_version 38100 (0.0009) [2023-10-10 06:13:52,376][53268] Updated weights for policy 1, policy_version 38080 (0.0008) [2023-10-10 06:13:52,535][53252] Updated weights for policy 0, policy_version 38110 (0.0007) [2023-10-10 06:13:56,358][53268] Updated weights for policy 1, policy_version 38090 (0.0008) [2023-10-10 06:13:56,513][53252] Updated weights for policy 0, policy_version 38120 (0.0009) [2023-10-10 06:13:56,718][53268] Updated weights for policy 1, policy_version 38100 (0.0008) [2023-10-10 06:13:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78020608. Throughput: 0: 1688.1, 1: 1683.0. Samples: 19523868. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) [2023-10-10 06:13:56,784][52050] Avg episode reward: [(0, '19.930'), (1, '20.340')] [2023-10-10 06:13:56,884][53252] Updated weights for policy 0, policy_version 38130 (0.0009) [2023-10-10 06:13:57,083][53268] Updated weights for policy 1, policy_version 38110 (0.0008) [2023-10-10 06:13:57,255][53252] Updated weights for policy 0, policy_version 38140 (0.0007) [2023-10-10 06:14:01,256][53268] Updated weights for policy 1, policy_version 38120 (0.0010) [2023-10-10 06:14:01,431][53252] Updated weights for policy 0, policy_version 38150 (0.0008) [2023-10-10 06:14:01,638][53268] Updated weights for policy 1, policy_version 38130 (0.0009) [2023-10-10 06:14:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78086144. Throughput: 0: 1690.8, 1: 1685.2. Samples: 19533322. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) [2023-10-10 06:14:01,784][52050] Avg episode reward: [(0, '20.230'), (1, '21.530')] [2023-10-10 06:14:01,809][53252] Updated weights for policy 0, policy_version 38160 (0.0008) [2023-10-10 06:14:02,007][53268] Updated weights for policy 1, policy_version 38140 (0.0010) [2023-10-10 06:14:02,180][53252] Updated weights for policy 0, policy_version 38170 (0.0007) [2023-10-10 06:14:06,021][53268] Updated weights for policy 1, policy_version 38150 (0.0008) [2023-10-10 06:14:06,295][53252] Updated weights for policy 0, policy_version 38180 (0.0008) [2023-10-10 06:14:06,391][53268] Updated weights for policy 1, policy_version 38160 (0.0007) [2023-10-10 06:14:06,662][53252] Updated weights for policy 0, policy_version 38190 (0.0009) [2023-10-10 06:14:06,760][53268] Updated weights for policy 1, policy_version 38170 (0.0007) [2023-10-10 06:14:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78151680. Throughput: 0: 1689.7, 1: 1678.5. Samples: 19553766. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) [2023-10-10 06:14:06,784][52050] Avg episode reward: [(0, '19.930'), (1, '20.610')] [2023-10-10 06:14:07,034][53252] Updated weights for policy 0, policy_version 38200 (0.0009) [2023-10-10 06:14:10,949][53268] Updated weights for policy 1, policy_version 38180 (0.0009) [2023-10-10 06:14:11,177][53252] Updated weights for policy 0, policy_version 38210 (0.0008) [2023-10-10 06:14:11,317][53268] Updated weights for policy 1, policy_version 38190 (0.0009) [2023-10-10 06:14:11,546][53252] Updated weights for policy 0, policy_version 38220 (0.0008) [2023-10-10 06:14:11,681][53268] Updated weights for policy 1, policy_version 38200 (0.0009) [2023-10-10 06:14:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78217216. Throughput: 0: 1680.7, 1: 1668.9. Samples: 19573524. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) [2023-10-10 06:14:11,784][52050] Avg episode reward: [(0, '19.870'), (1, '21.410')] [2023-10-10 06:14:11,929][53252] Updated weights for policy 0, policy_version 38230 (0.0008) [2023-10-10 06:14:12,291][53252] Updated weights for policy 0, policy_version 38240 (0.0010) [2023-10-10 06:14:15,860][53268] Updated weights for policy 1, policy_version 38210 (0.0009) [2023-10-10 06:14:16,221][53268] Updated weights for policy 1, policy_version 38220 (0.0009) [2023-10-10 06:14:16,265][53252] Updated weights for policy 0, policy_version 38250 (0.0008) [2023-10-10 06:14:16,591][53268] Updated weights for policy 1, policy_version 38230 (0.0010) [2023-10-10 06:14:16,630][53252] Updated weights for policy 0, policy_version 38260 (0.0007) [2023-10-10 06:14:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 78282752. Throughput: 0: 1688.8, 1: 1674.0. Samples: 19583278. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) [2023-10-10 06:14:16,784][52050] Avg episode reward: [(0, '19.450'), (1, '21.550')] [2023-10-10 06:14:16,966][53268] Updated weights for policy 1, policy_version 38240 (0.0009) [2023-10-10 06:14:17,002][53252] Updated weights for policy 0, policy_version 38270 (0.0008) [2023-10-10 06:14:20,986][53268] Updated weights for policy 1, policy_version 38250 (0.0007) [2023-10-10 06:14:21,145][53252] Updated weights for policy 0, policy_version 38280 (0.0008) [2023-10-10 06:14:21,344][53268] Updated weights for policy 1, policy_version 38260 (0.0009) [2023-10-10 06:14:21,510][53252] Updated weights for policy 0, policy_version 38290 (0.0007) [2023-10-10 06:14:21,708][53268] Updated weights for policy 1, policy_version 38270 (0.0008) [2023-10-10 06:14:21,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 78381056. Throughput: 0: 1687.4, 1: 1668.8. Samples: 19603630. 
Policy #0 lag: (min: 10.0, avg: 10.8, max: 29.0) [2023-10-10 06:14:21,784][52050] Avg episode reward: [(0, '18.740'), (1, '19.450')] [2023-10-10 06:14:21,885][53252] Updated weights for policy 0, policy_version 38300 (0.0007) [2023-10-10 06:14:25,876][53268] Updated weights for policy 1, policy_version 38280 (0.0008) [2023-10-10 06:14:26,017][53252] Updated weights for policy 0, policy_version 38310 (0.0011) [2023-10-10 06:14:26,239][53268] Updated weights for policy 1, policy_version 38290 (0.0009) [2023-10-10 06:14:26,389][53252] Updated weights for policy 0, policy_version 38320 (0.0009) [2023-10-10 06:14:26,601][53268] Updated weights for policy 1, policy_version 38300 (0.0009) [2023-10-10 06:14:26,757][53252] Updated weights for policy 0, policy_version 38330 (0.0008) [2023-10-10 06:14:26,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 78446592. Throughput: 0: 1666.1, 1: 1653.1. Samples: 19623042. Policy #0 lag: (min: 10.0, avg: 10.8, max: 29.0) [2023-10-10 06:14:26,784][52050] Avg episode reward: [(0, '20.710'), (1, '20.600')] [2023-10-10 06:14:30,675][53268] Updated weights for policy 1, policy_version 38310 (0.0009) [2023-10-10 06:14:30,756][53252] Updated weights for policy 0, policy_version 38340 (0.0010) [2023-10-10 06:14:31,046][53268] Updated weights for policy 1, policy_version 38320 (0.0009) [2023-10-10 06:14:31,127][53252] Updated weights for policy 0, policy_version 38350 (0.0007) [2023-10-10 06:14:31,412][53268] Updated weights for policy 1, policy_version 38330 (0.0008) [2023-10-10 06:14:31,494][53252] Updated weights for policy 0, policy_version 38360 (0.0008) [2023-10-10 06:14:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 78512128. Throughput: 0: 1675.9, 1: 1669.9. Samples: 19633214. 
Policy #0 lag: (min: 10.0, avg: 10.8, max: 29.0) [2023-10-10 06:14:31,784][52050] Avg episode reward: [(0, '19.110'), (1, '21.910')] [2023-10-10 06:14:35,389][53268] Updated weights for policy 1, policy_version 38340 (0.0009) [2023-10-10 06:14:35,668][53252] Updated weights for policy 0, policy_version 38370 (0.0007) [2023-10-10 06:14:35,757][53268] Updated weights for policy 1, policy_version 38350 (0.0009) [2023-10-10 06:14:36,068][53252] Updated weights for policy 0, policy_version 38380 (0.0007) [2023-10-10 06:14:36,127][53268] Updated weights for policy 1, policy_version 38360 (0.0008) [2023-10-10 06:14:36,444][53252] Updated weights for policy 0, policy_version 38390 (0.0007) [2023-10-10 06:14:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 78577664. Throughput: 0: 1676.0, 1: 1675.3. Samples: 19654158. Policy #0 lag: (min: 10.0, avg: 10.8, max: 29.0) [2023-10-10 06:14:36,784][52050] Avg episode reward: [(0, '21.180'), (1, '21.780')] [2023-10-10 06:14:36,808][53252] Updated weights for policy 0, policy_version 38400 (0.0008) [2023-10-10 06:14:40,113][53268] Updated weights for policy 1, policy_version 38370 (0.0007) [2023-10-10 06:14:40,469][53268] Updated weights for policy 1, policy_version 38380 (0.0008) [2023-10-10 06:14:40,839][53268] Updated weights for policy 1, policy_version 38390 (0.0009) [2023-10-10 06:14:40,965][53252] Updated weights for policy 0, policy_version 38410 (0.0008) [2023-10-10 06:14:41,198][53268] Updated weights for policy 1, policy_version 38400 (0.0009) [2023-10-10 06:14:41,331][53252] Updated weights for policy 0, policy_version 38420 (0.0007) [2023-10-10 06:14:41,708][53252] Updated weights for policy 0, policy_version 38430 (0.0009) [2023-10-10 06:14:41,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 78675968. Throughput: 0: 1653.0, 1: 1655.0. Samples: 19672728. 
Policy #0 lag: (min: 20.0, avg: 26.6, max: 52.0) [2023-10-10 06:14:41,784][52050] Avg episode reward: [(0, '21.130'), (1, '19.940')] [2023-10-10 06:14:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000038432_39354368.pth... [2023-10-10 06:14:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000038400_39321600.pth... [2023-10-10 06:14:41,824][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000036864_37748736.pth [2023-10-10 06:14:41,824][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000036832_37715968.pth [2023-10-10 06:14:45,305][53268] Updated weights for policy 1, policy_version 38410 (0.0011) [2023-10-10 06:14:45,669][53268] Updated weights for policy 1, policy_version 38420 (0.0008) [2023-10-10 06:14:45,790][53252] Updated weights for policy 0, policy_version 38440 (0.0009) [2023-10-10 06:14:46,037][53268] Updated weights for policy 1, policy_version 38430 (0.0008) [2023-10-10 06:14:46,155][53252] Updated weights for policy 0, policy_version 38450 (0.0009) [2023-10-10 06:14:46,535][53252] Updated weights for policy 0, policy_version 38460 (0.0007) [2023-10-10 06:14:46,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 78741504. Throughput: 0: 1665.6, 1: 1673.4. Samples: 19683578. 
Policy #0 lag: (min: 20.0, avg: 26.6, max: 52.0) [2023-10-10 06:14:46,784][52050] Avg episode reward: [(0, '19.800'), (1, '21.120')] [2023-10-10 06:14:50,347][53268] Updated weights for policy 1, policy_version 38440 (0.0007) [2023-10-10 06:14:50,611][53252] Updated weights for policy 0, policy_version 38470 (0.0009) [2023-10-10 06:14:50,721][53268] Updated weights for policy 1, policy_version 38450 (0.0009) [2023-10-10 06:14:50,989][53252] Updated weights for policy 0, policy_version 38480 (0.0007) [2023-10-10 06:14:51,096][53268] Updated weights for policy 1, policy_version 38460 (0.0010) [2023-10-10 06:14:51,358][53252] Updated weights for policy 0, policy_version 38490 (0.0009) [2023-10-10 06:14:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 78807040. Throughput: 0: 1662.7, 1: 1668.5. Samples: 19703670. Policy #0 lag: (min: 20.0, avg: 26.6, max: 52.0) [2023-10-10 06:14:51,784][52050] Avg episode reward: [(0, '20.230'), (1, '19.160')] [2023-10-10 06:14:55,138][53268] Updated weights for policy 1, policy_version 38470 (0.0009) [2023-10-10 06:14:55,332][53252] Updated weights for policy 0, policy_version 38500 (0.0007) [2023-10-10 06:14:55,494][53268] Updated weights for policy 1, policy_version 38480 (0.0007) [2023-10-10 06:14:55,692][53252] Updated weights for policy 0, policy_version 38510 (0.0007) [2023-10-10 06:14:55,860][53268] Updated weights for policy 1, policy_version 38490 (0.0007) [2023-10-10 06:14:56,066][53252] Updated weights for policy 0, policy_version 38520 (0.0008) [2023-10-10 06:14:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 78872576. Throughput: 0: 1645.8, 1: 1654.4. Samples: 19722032. 
Policy #0 lag: (min: 20.0, avg: 26.6, max: 52.0) [2023-10-10 06:14:56,784][52050] Avg episode reward: [(0, '20.350'), (1, '19.420')] [2023-10-10 06:14:59,927][53268] Updated weights for policy 1, policy_version 38500 (0.0008) [2023-10-10 06:15:00,207][53252] Updated weights for policy 0, policy_version 38530 (0.0009) [2023-10-10 06:15:00,294][53268] Updated weights for policy 1, policy_version 38510 (0.0009) [2023-10-10 06:15:00,578][53252] Updated weights for policy 0, policy_version 38540 (0.0008) [2023-10-10 06:15:00,669][53268] Updated weights for policy 1, policy_version 38520 (0.0009) [2023-10-10 06:15:00,943][53252] Updated weights for policy 0, policy_version 38550 (0.0008) [2023-10-10 06:15:01,318][53252] Updated weights for policy 0, policy_version 38560 (0.0010) [2023-10-10 06:15:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 78938112. Throughput: 0: 1664.8, 1: 1674.2. Samples: 19733532. Policy #0 lag: (min: 20.0, avg: 26.6, max: 52.0) [2023-10-10 06:15:01,785][52050] Avg episode reward: [(0, '21.330'), (1, '18.340')] [2023-10-10 06:15:04,900][53268] Updated weights for policy 1, policy_version 38530 (0.0009) [2023-10-10 06:15:05,267][53268] Updated weights for policy 1, policy_version 38540 (0.0009) [2023-10-10 06:15:05,464][53252] Updated weights for policy 0, policy_version 38570 (0.0007) [2023-10-10 06:15:05,633][53268] Updated weights for policy 1, policy_version 38550 (0.0010) [2023-10-10 06:15:05,844][53252] Updated weights for policy 0, policy_version 38580 (0.0008) [2023-10-10 06:15:05,994][53268] Updated weights for policy 1, policy_version 38560 (0.0007) [2023-10-10 06:15:06,211][53252] Updated weights for policy 0, policy_version 38590 (0.0008) [2023-10-10 06:15:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 79003648. Throughput: 0: 1658.3, 1: 1674.8. Samples: 19753620. 
Policy #0 lag: (min: 20.0, avg: 26.6, max: 52.0) [2023-10-10 06:15:06,784][52050] Avg episode reward: [(0, '20.060'), (1, '20.050')] [2023-10-10 06:15:10,156][53268] Updated weights for policy 1, policy_version 38570 (0.0008) [2023-10-10 06:15:10,299][53252] Updated weights for policy 0, policy_version 38600 (0.0008) [2023-10-10 06:15:10,528][53268] Updated weights for policy 1, policy_version 38580 (0.0009) [2023-10-10 06:15:10,678][53252] Updated weights for policy 0, policy_version 38610 (0.0009) [2023-10-10 06:15:10,884][53268] Updated weights for policy 1, policy_version 38590 (0.0009) [2023-10-10 06:15:11,052][53252] Updated weights for policy 0, policy_version 38620 (0.0007) [2023-10-10 06:15:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 79069184. Throughput: 0: 1652.9, 1: 1664.6. Samples: 19772328. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:15:11,784][52050] Avg episode reward: [(0, '19.550'), (1, '19.580')] [2023-10-10 06:15:15,034][53268] Updated weights for policy 1, policy_version 38600 (0.0010) [2023-10-10 06:15:15,242][53252] Updated weights for policy 0, policy_version 38630 (0.0008) [2023-10-10 06:15:15,394][53268] Updated weights for policy 1, policy_version 38610 (0.0009) [2023-10-10 06:15:15,615][53252] Updated weights for policy 0, policy_version 38640 (0.0007) [2023-10-10 06:15:15,757][53268] Updated weights for policy 1, policy_version 38620 (0.0011) [2023-10-10 06:15:15,988][53252] Updated weights for policy 0, policy_version 38650 (0.0007) [2023-10-10 06:15:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 79134720. Throughput: 0: 1668.7, 1: 1677.2. Samples: 19783780. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:15:16,784][52050] Avg episode reward: [(0, '20.270'), (1, '20.600')] [2023-10-10 06:15:19,815][53268] Updated weights for policy 1, policy_version 38630 (0.0008) [2023-10-10 06:15:20,054][53252] Updated weights for policy 0, policy_version 38660 (0.0007) [2023-10-10 06:15:20,182][53268] Updated weights for policy 1, policy_version 38640 (0.0007) [2023-10-10 06:15:20,428][53252] Updated weights for policy 0, policy_version 38670 (0.0008) [2023-10-10 06:15:20,547][53268] Updated weights for policy 1, policy_version 38650 (0.0009) [2023-10-10 06:15:20,798][53252] Updated weights for policy 0, policy_version 38680 (0.0007) [2023-10-10 06:15:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 79200256. Throughput: 0: 1658.5, 1: 1653.3. Samples: 19803190. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:15:21,784][52050] Avg episode reward: [(0, '20.190'), (1, '19.390')] [2023-10-10 06:15:24,639][53268] Updated weights for policy 1, policy_version 38660 (0.0010) [2023-10-10 06:15:24,775][53252] Updated weights for policy 0, policy_version 38690 (0.0010) [2023-10-10 06:15:24,992][53268] Updated weights for policy 1, policy_version 38670 (0.0009) [2023-10-10 06:15:25,147][53252] Updated weights for policy 0, policy_version 38700 (0.0007) [2023-10-10 06:15:25,355][53268] Updated weights for policy 1, policy_version 38680 (0.0008) [2023-10-10 06:15:25,514][53252] Updated weights for policy 0, policy_version 38710 (0.0010) [2023-10-10 06:15:25,886][53252] Updated weights for policy 0, policy_version 38720 (0.0008) [2023-10-10 06:15:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 79265792. Throughput: 0: 1665.2, 1: 1658.7. Samples: 19822304. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:15:26,784][52050] Avg episode reward: [(0, '19.740'), (1, '21.110')] [2023-10-10 06:15:29,280][53268] Updated weights for policy 1, policy_version 38690 (0.0008) [2023-10-10 06:15:29,652][53268] Updated weights for policy 1, policy_version 38700 (0.0010) [2023-10-10 06:15:29,964][53252] Updated weights for policy 0, policy_version 38730 (0.0007) [2023-10-10 06:15:30,024][53268] Updated weights for policy 1, policy_version 38710 (0.0008) [2023-10-10 06:15:30,328][53252] Updated weights for policy 0, policy_version 38740 (0.0008) [2023-10-10 06:15:30,385][53268] Updated weights for policy 1, policy_version 38720 (0.0009) [2023-10-10 06:15:30,697][53252] Updated weights for policy 0, policy_version 38750 (0.0009) [2023-10-10 06:15:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79331328. Throughput: 0: 1678.8, 1: 1670.0. Samples: 19834274. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:15:31,785][52050] Avg episode reward: [(0, '20.660'), (1, '18.330')] [2023-10-10 06:15:34,551][53268] Updated weights for policy 1, policy_version 38730 (0.0008) [2023-10-10 06:15:34,834][53252] Updated weights for policy 0, policy_version 38760 (0.0008) [2023-10-10 06:15:34,914][53268] Updated weights for policy 1, policy_version 38740 (0.0008) [2023-10-10 06:15:35,215][53252] Updated weights for policy 0, policy_version 38770 (0.0008) [2023-10-10 06:15:35,281][53268] Updated weights for policy 1, policy_version 38750 (0.0008) [2023-10-10 06:15:35,574][53252] Updated weights for policy 0, policy_version 38780 (0.0007) [2023-10-10 06:15:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79396864. Throughput: 0: 1661.3, 1: 1654.5. Samples: 19852884. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:15:36,784][52050] Avg episode reward: [(0, '20.090'), (1, '19.290')] [2023-10-10 06:15:39,499][53268] Updated weights for policy 1, policy_version 38760 (0.0009) [2023-10-10 06:15:39,795][53252] Updated weights for policy 0, policy_version 38790 (0.0008) [2023-10-10 06:15:39,875][53268] Updated weights for policy 1, policy_version 38770 (0.0007) [2023-10-10 06:15:40,172][53252] Updated weights for policy 0, policy_version 38800 (0.0008) [2023-10-10 06:15:40,237][53268] Updated weights for policy 1, policy_version 38780 (0.0008) [2023-10-10 06:15:40,537][53252] Updated weights for policy 0, policy_version 38810 (0.0008) [2023-10-10 06:15:41,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 79462400. Throughput: 0: 1671.6, 1: 1671.4. Samples: 19872466. Policy #0 lag: (min: 3.0, avg: 10.9, max: 35.0) [2023-10-10 06:15:41,784][52050] Avg episode reward: [(0, '19.250'), (1, '20.080')] [2023-10-10 06:15:44,295][53268] Updated weights for policy 1, policy_version 38790 (0.0010) [2023-10-10 06:15:44,651][53252] Updated weights for policy 0, policy_version 38820 (0.0008) [2023-10-10 06:15:44,668][53268] Updated weights for policy 1, policy_version 38800 (0.0009) [2023-10-10 06:15:45,020][53252] Updated weights for policy 0, policy_version 38830 (0.0008) [2023-10-10 06:15:45,032][53268] Updated weights for policy 1, policy_version 38810 (0.0009) [2023-10-10 06:15:45,395][53252] Updated weights for policy 0, policy_version 38840 (0.0010) [2023-10-10 06:15:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79527936. Throughput: 0: 1675.3, 1: 1670.4. Samples: 19884088. 
Policy #0 lag: (min: 3.0, avg: 10.9, max: 35.0) [2023-10-10 06:15:46,784][52050] Avg episode reward: [(0, '20.640'), (1, '19.310')] [2023-10-10 06:15:49,141][53268] Updated weights for policy 1, policy_version 38820 (0.0009) [2023-10-10 06:15:49,503][53268] Updated weights for policy 1, policy_version 38830 (0.0008) [2023-10-10 06:15:49,594][53252] Updated weights for policy 0, policy_version 38850 (0.0008) [2023-10-10 06:15:49,874][53268] Updated weights for policy 1, policy_version 38840 (0.0008) [2023-10-10 06:15:49,963][53252] Updated weights for policy 0, policy_version 38860 (0.0009) [2023-10-10 06:15:50,339][53252] Updated weights for policy 0, policy_version 38870 (0.0010) [2023-10-10 06:15:50,719][53252] Updated weights for policy 0, policy_version 38880 (0.0009) [2023-10-10 06:15:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79593472. Throughput: 0: 1658.9, 1: 1653.9. Samples: 19902698. Policy #0 lag: (min: 3.0, avg: 10.9, max: 35.0) [2023-10-10 06:15:51,784][52050] Avg episode reward: [(0, '19.090'), (1, '19.670')] [2023-10-10 06:15:53,830][53268] Updated weights for policy 1, policy_version 38850 (0.0008) [2023-10-10 06:15:54,191][53268] Updated weights for policy 1, policy_version 38860 (0.0007) [2023-10-10 06:15:54,566][53268] Updated weights for policy 1, policy_version 38870 (0.0007) [2023-10-10 06:15:54,783][53252] Updated weights for policy 0, policy_version 38890 (0.0007) [2023-10-10 06:15:54,932][53268] Updated weights for policy 1, policy_version 38880 (0.0007) [2023-10-10 06:15:55,164][53252] Updated weights for policy 0, policy_version 38900 (0.0009) [2023-10-10 06:15:55,535][53252] Updated weights for policy 0, policy_version 38910 (0.0008) [2023-10-10 06:15:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 79659008. Throughput: 0: 1670.3, 1: 1677.5. Samples: 19922980. 
Policy #0 lag: (min: 3.0, avg: 10.9, max: 35.0) [2023-10-10 06:15:56,784][52050] Avg episode reward: [(0, '18.150'), (1, '21.050')] [2023-10-10 06:15:59,102][53268] Updated weights for policy 1, policy_version 38890 (0.0008) [2023-10-10 06:15:59,379][53252] Updated weights for policy 0, policy_version 38920 (0.0007) [2023-10-10 06:15:59,467][53268] Updated weights for policy 1, policy_version 38900 (0.0010) [2023-10-10 06:15:59,754][53252] Updated weights for policy 0, policy_version 38930 (0.0007) [2023-10-10 06:15:59,836][53268] Updated weights for policy 1, policy_version 38910 (0.0008) [2023-10-10 06:16:00,135][53252] Updated weights for policy 0, policy_version 38940 (0.0008) [2023-10-10 06:16:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79724544. Throughput: 0: 1668.3, 1: 1669.1. Samples: 19933960. Policy #0 lag: (min: 3.0, avg: 10.9, max: 35.0) [2023-10-10 06:16:01,784][52050] Avg episode reward: [(0, '18.520'), (1, '21.130')] [2023-10-10 06:16:03,965][53268] Updated weights for policy 1, policy_version 38920 (0.0009) [2023-10-10 06:16:04,091][53252] Updated weights for policy 0, policy_version 38950 (0.0007) [2023-10-10 06:16:04,338][53268] Updated weights for policy 1, policy_version 38930 (0.0008) [2023-10-10 06:16:04,462][53252] Updated weights for policy 0, policy_version 38960 (0.0007) [2023-10-10 06:16:04,702][53268] Updated weights for policy 1, policy_version 38940 (0.0008) [2023-10-10 06:16:04,830][53252] Updated weights for policy 0, policy_version 38970 (0.0007) [2023-10-10 06:16:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79790080. Throughput: 0: 1660.0, 1: 1667.4. Samples: 19952924. 
Policy #0 lag: (min: 3.0, avg: 10.9, max: 35.0) [2023-10-10 06:16:06,784][52050] Avg episode reward: [(0, '20.190'), (1, '20.720')] [2023-10-10 06:16:08,644][53268] Updated weights for policy 1, policy_version 38950 (0.0009) [2023-10-10 06:16:08,784][53252] Updated weights for policy 0, policy_version 38980 (0.0010) [2023-10-10 06:16:08,999][53268] Updated weights for policy 1, policy_version 38960 (0.0009) [2023-10-10 06:16:09,148][53252] Updated weights for policy 0, policy_version 38990 (0.0009) [2023-10-10 06:16:09,376][53268] Updated weights for policy 1, policy_version 38970 (0.0009) [2023-10-10 06:16:09,512][53252] Updated weights for policy 0, policy_version 39000 (0.0008) [2023-10-10 06:16:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79855616. Throughput: 0: 1684.5, 1: 1687.8. Samples: 19974060. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:16:11,784][52050] Avg episode reward: [(0, '21.620'), (1, '19.850')] [2023-10-10 06:16:13,471][53268] Updated weights for policy 1, policy_version 38980 (0.0008) [2023-10-10 06:16:13,626][53252] Updated weights for policy 0, policy_version 39010 (0.0007) [2023-10-10 06:16:13,833][53268] Updated weights for policy 1, policy_version 38990 (0.0008) [2023-10-10 06:16:14,020][53252] Updated weights for policy 0, policy_version 39020 (0.0008) [2023-10-10 06:16:14,206][53268] Updated weights for policy 1, policy_version 39000 (0.0008) [2023-10-10 06:16:14,395][53252] Updated weights for policy 0, policy_version 39030 (0.0008) [2023-10-10 06:16:14,758][53252] Updated weights for policy 0, policy_version 39040 (0.0008) [2023-10-10 06:16:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 79921152. Throughput: 0: 1663.7, 1: 1660.5. Samples: 19983864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:16:16,784][52050] Avg episode reward: [(0, '21.520'), (1, '20.240')] [2023-10-10 06:16:18,317][53268] Updated weights for policy 1, policy_version 39010 (0.0009) [2023-10-10 06:16:18,687][53268] Updated weights for policy 1, policy_version 39020 (0.0008) [2023-10-10 06:16:18,971][53252] Updated weights for policy 0, policy_version 39050 (0.0009) [2023-10-10 06:16:19,059][53268] Updated weights for policy 1, policy_version 39030 (0.0008) [2023-10-10 06:16:19,335][53252] Updated weights for policy 0, policy_version 39060 (0.0007) [2023-10-10 06:16:19,417][53268] Updated weights for policy 1, policy_version 39040 (0.0007) [2023-10-10 06:16:19,702][53252] Updated weights for policy 0, policy_version 39070 (0.0009) [2023-10-10 06:16:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79986688. Throughput: 0: 1675.4, 1: 1672.4. Samples: 20003534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:16:21,784][52050] Avg episode reward: [(0, '21.130'), (1, '18.670')] [2023-10-10 06:16:23,572][53268] Updated weights for policy 1, policy_version 39050 (0.0008) [2023-10-10 06:16:23,755][53252] Updated weights for policy 0, policy_version 39080 (0.0007) [2023-10-10 06:16:23,932][53268] Updated weights for policy 1, policy_version 39060 (0.0009) [2023-10-10 06:16:24,129][53252] Updated weights for policy 0, policy_version 39090 (0.0009) [2023-10-10 06:16:24,297][53268] Updated weights for policy 1, policy_version 39070 (0.0010) [2023-10-10 06:16:24,495][53252] Updated weights for policy 0, policy_version 39100 (0.0010) [2023-10-10 06:16:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 80052224. Throughput: 0: 1700.6, 1: 1679.8. Samples: 20024586. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:16:26,784][52050] Avg episode reward: [(0, '21.520'), (1, '18.850')] [2023-10-10 06:16:28,362][53268] Updated weights for policy 1, policy_version 39080 (0.0009) [2023-10-10 06:16:28,450][53252] Updated weights for policy 0, policy_version 39110 (0.0009) [2023-10-10 06:16:28,748][53268] Updated weights for policy 1, policy_version 39090 (0.0007) [2023-10-10 06:16:28,834][53252] Updated weights for policy 0, policy_version 39120 (0.0008) [2023-10-10 06:16:29,113][53268] Updated weights for policy 1, policy_version 39100 (0.0008) [2023-10-10 06:16:29,203][53252] Updated weights for policy 0, policy_version 39130 (0.0007) [2023-10-10 06:16:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 80117760. Throughput: 0: 1673.4, 1: 1657.1. Samples: 20033958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:16:31,784][52050] Avg episode reward: [(0, '20.790'), (1, '19.020')] [2023-10-10 06:16:33,280][53268] Updated weights for policy 1, policy_version 39110 (0.0009) [2023-10-10 06:16:33,288][53252] Updated weights for policy 0, policy_version 39140 (0.0008) [2023-10-10 06:16:33,649][53268] Updated weights for policy 1, policy_version 39120 (0.0008) [2023-10-10 06:16:33,664][53252] Updated weights for policy 0, policy_version 39150 (0.0008) [2023-10-10 06:16:34,007][53268] Updated weights for policy 1, policy_version 39130 (0.0009) [2023-10-10 06:16:34,036][53252] Updated weights for policy 0, policy_version 39160 (0.0010) [2023-10-10 06:16:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80183296. Throughput: 0: 1694.5, 1: 1667.9. Samples: 20054008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:16:36,784][52050] Avg episode reward: [(0, '20.080'), (1, '18.750')] [2023-10-10 06:16:38,012][53252] Updated weights for policy 0, policy_version 39170 (0.0009) [2023-10-10 06:16:38,161][53268] Updated weights for policy 1, policy_version 39140 (0.0009) [2023-10-10 06:16:38,391][53252] Updated weights for policy 0, policy_version 39180 (0.0009) [2023-10-10 06:16:38,523][53268] Updated weights for policy 1, policy_version 39150 (0.0008) [2023-10-10 06:16:38,753][53252] Updated weights for policy 0, policy_version 39190 (0.0008) [2023-10-10 06:16:38,880][53268] Updated weights for policy 1, policy_version 39160 (0.0008) [2023-10-10 06:16:39,126][53252] Updated weights for policy 0, policy_version 39200 (0.0007) [2023-10-10 06:16:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 80248832. Throughput: 0: 1704.8, 1: 1665.0. Samples: 20074624. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 06:16:41,785][52050] Avg episode reward: [(0, '21.060'), (1, '18.060')] [2023-10-10 06:16:41,797][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000039168_40108032.pth... [2023-10-10 06:16:41,797][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000039200_40140800.pth... 
[2023-10-10 06:16:41,837][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000037632_38535168.pth [2023-10-10 06:16:41,838][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000037600_38502400.pth [2023-10-10 06:16:41,841][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000039200_40140800.pth [2023-10-10 06:16:41,844][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000039168_40108032.pth [2023-10-10 06:16:43,037][53268] Updated weights for policy 1, policy_version 39170 (0.0007) [2023-10-10 06:16:43,122][53252] Updated weights for policy 0, policy_version 39210 (0.0007) [2023-10-10 06:16:43,404][53268] Updated weights for policy 1, policy_version 39180 (0.0007) [2023-10-10 06:16:43,498][53252] Updated weights for policy 0, policy_version 39220 (0.0007) [2023-10-10 06:16:43,770][53268] Updated weights for policy 1, policy_version 39190 (0.0011) [2023-10-10 06:16:43,864][53252] Updated weights for policy 0, policy_version 39230 (0.0007) [2023-10-10 06:16:44,131][53268] Updated weights for policy 1, policy_version 39200 (0.0009) [2023-10-10 06:16:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80314368. Throughput: 0: 1682.0, 1: 1646.2. Samples: 20083732. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 06:16:46,784][52050] Avg episode reward: [(0, '18.830'), (1, '19.740')] [2023-10-10 06:16:47,894][53252] Updated weights for policy 0, policy_version 39240 (0.0009) [2023-10-10 06:16:48,253][53252] Updated weights for policy 0, policy_version 39250 (0.0008) [2023-10-10 06:16:48,303][53268] Updated weights for policy 1, policy_version 39210 (0.0009) [2023-10-10 06:16:48,634][53252] Updated weights for policy 0, policy_version 39260 (0.0007) [2023-10-10 06:16:48,658][53268] Updated weights for policy 1, policy_version 39220 (0.0008) [2023-10-10 06:16:49,025][53268] Updated weights for policy 1, policy_version 39230 (0.0009) [2023-10-10 06:16:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80379904. Throughput: 0: 1700.6, 1: 1663.8. Samples: 20104322. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 06:16:51,784][52050] Avg episode reward: [(0, '19.040'), (1, '19.710')] [2023-10-10 06:16:52,660][53252] Updated weights for policy 0, policy_version 39270 (0.0009) [2023-10-10 06:16:52,994][53268] Updated weights for policy 1, policy_version 39240 (0.0007) [2023-10-10 06:16:53,030][53252] Updated weights for policy 0, policy_version 39280 (0.0007) [2023-10-10 06:16:53,362][53268] Updated weights for policy 1, policy_version 39250 (0.0007) [2023-10-10 06:16:53,398][53252] Updated weights for policy 0, policy_version 39290 (0.0008) [2023-10-10 06:16:53,729][53268] Updated weights for policy 1, policy_version 39260 (0.0009) [2023-10-10 06:16:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 80445440. Throughput: 0: 1695.9, 1: 1661.3. Samples: 20125134. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 06:16:56,784][52050] Avg episode reward: [(0, '20.810'), (1, '19.200')] [2023-10-10 06:16:57,383][53252] Updated weights for policy 0, policy_version 39300 (0.0007) [2023-10-10 06:16:57,759][53252] Updated weights for policy 0, policy_version 39310 (0.0008) [2023-10-10 06:16:57,782][53268] Updated weights for policy 1, policy_version 39270 (0.0007) [2023-10-10 06:16:58,129][53252] Updated weights for policy 0, policy_version 39320 (0.0007) [2023-10-10 06:16:58,145][53268] Updated weights for policy 1, policy_version 39280 (0.0007) [2023-10-10 06:16:58,512][53268] Updated weights for policy 1, policy_version 39290 (0.0008) [2023-10-10 06:17:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80510976. Throughput: 0: 1692.7, 1: 1653.2. Samples: 20134428. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 06:17:01,784][52050] Avg episode reward: [(0, '22.270'), (1, '20.140')] [2023-10-10 06:17:02,257][53252] Updated weights for policy 0, policy_version 39330 (0.0009) [2023-10-10 06:17:02,636][53268] Updated weights for policy 1, policy_version 39300 (0.0008) [2023-10-10 06:17:02,645][53252] Updated weights for policy 0, policy_version 39340 (0.0007) [2023-10-10 06:17:03,008][53268] Updated weights for policy 1, policy_version 39310 (0.0007) [2023-10-10 06:17:03,021][53252] Updated weights for policy 0, policy_version 39350 (0.0007) [2023-10-10 06:17:03,369][53268] Updated weights for policy 1, policy_version 39320 (0.0009) [2023-10-10 06:17:03,392][53252] Updated weights for policy 0, policy_version 39360 (0.0008) [2023-10-10 06:17:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80576512. Throughput: 0: 1697.6, 1: 1667.8. Samples: 20154978. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 06:17:06,785][52050] Avg episode reward: [(0, '21.850'), (1, '20.460')] [2023-10-10 06:17:07,476][53252] Updated weights for policy 0, policy_version 39370 (0.0008) [2023-10-10 06:17:07,546][53268] Updated weights for policy 1, policy_version 39330 (0.0011) [2023-10-10 06:17:07,853][53252] Updated weights for policy 0, policy_version 39380 (0.0009) [2023-10-10 06:17:07,911][53268] Updated weights for policy 1, policy_version 39340 (0.0008) [2023-10-10 06:17:08,227][53252] Updated weights for policy 0, policy_version 39390 (0.0007) [2023-10-10 06:17:08,275][53268] Updated weights for policy 1, policy_version 39350 (0.0008) [2023-10-10 06:17:08,647][53268] Updated weights for policy 1, policy_version 39360 (0.0007) [2023-10-10 06:17:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80642048. Throughput: 0: 1686.8, 1: 1668.3. Samples: 20175564. Policy #0 lag: (min: 3.0, avg: 11.7, max: 35.0) [2023-10-10 06:17:11,784][52050] Avg episode reward: [(0, '21.230'), (1, '21.390')] [2023-10-10 06:17:12,322][53252] Updated weights for policy 0, policy_version 39400 (0.0009) [2023-10-10 06:17:12,694][53252] Updated weights for policy 0, policy_version 39410 (0.0008) [2023-10-10 06:17:12,800][53268] Updated weights for policy 1, policy_version 39370 (0.0009) [2023-10-10 06:17:13,070][53252] Updated weights for policy 0, policy_version 39420 (0.0008) [2023-10-10 06:17:13,171][53268] Updated weights for policy 1, policy_version 39380 (0.0009) [2023-10-10 06:17:13,529][53268] Updated weights for policy 1, policy_version 39390 (0.0011) [2023-10-10 06:17:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80707584. Throughput: 0: 1681.5, 1: 1665.9. Samples: 20184592. 
Policy #0 lag: (min: 3.0, avg: 11.7, max: 35.0) [2023-10-10 06:17:16,784][52050] Avg episode reward: [(0, '22.410'), (1, '20.500')] [2023-10-10 06:17:17,110][53252] Updated weights for policy 0, policy_version 39430 (0.0008) [2023-10-10 06:17:17,475][53252] Updated weights for policy 0, policy_version 39440 (0.0007) [2023-10-10 06:17:17,718][53268] Updated weights for policy 1, policy_version 39400 (0.0010) [2023-10-10 06:17:17,845][53252] Updated weights for policy 0, policy_version 39450 (0.0008) [2023-10-10 06:17:18,086][53268] Updated weights for policy 1, policy_version 39410 (0.0009) [2023-10-10 06:17:18,461][53268] Updated weights for policy 1, policy_version 39420 (0.0010) [2023-10-10 06:17:21,733][53252] Updated weights for policy 0, policy_version 39460 (0.0008) [2023-10-10 06:17:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80773120. Throughput: 0: 1686.7, 1: 1671.3. Samples: 20205118. Policy #0 lag: (min: 3.0, avg: 11.7, max: 35.0) [2023-10-10 06:17:21,784][52050] Avg episode reward: [(0, '20.520'), (1, '20.240')] [2023-10-10 06:17:22,100][53252] Updated weights for policy 0, policy_version 39470 (0.0011) [2023-10-10 06:17:22,467][53252] Updated weights for policy 0, policy_version 39480 (0.0009) [2023-10-10 06:17:22,552][53268] Updated weights for policy 1, policy_version 39430 (0.0007) [2023-10-10 06:17:22,921][53268] Updated weights for policy 1, policy_version 39440 (0.0008) [2023-10-10 06:17:23,292][53268] Updated weights for policy 1, policy_version 39450 (0.0010) [2023-10-10 06:17:26,510][53252] Updated weights for policy 0, policy_version 39490 (0.0008) [2023-10-10 06:17:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 80838656. Throughput: 0: 1688.9, 1: 1674.3. Samples: 20225966. 
Policy #0 lag: (min: 3.0, avg: 11.7, max: 35.0) [2023-10-10 06:17:26,784][52050] Avg episode reward: [(0, '20.020'), (1, '18.950')] [2023-10-10 06:17:26,887][53252] Updated weights for policy 0, policy_version 39500 (0.0008) [2023-10-10 06:17:27,265][53252] Updated weights for policy 0, policy_version 39510 (0.0007) [2023-10-10 06:17:27,327][53268] Updated weights for policy 1, policy_version 39460 (0.0010) [2023-10-10 06:17:27,624][53252] Updated weights for policy 0, policy_version 39520 (0.0008) [2023-10-10 06:17:27,691][53268] Updated weights for policy 1, policy_version 39470 (0.0009) [2023-10-10 06:17:28,051][53268] Updated weights for policy 1, policy_version 39480 (0.0010) [2023-10-10 06:17:31,560][53252] Updated weights for policy 0, policy_version 39530 (0.0010) [2023-10-10 06:17:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80904192. Throughput: 0: 1689.5, 1: 1674.4. Samples: 20235106. Policy #0 lag: (min: 3.0, avg: 11.7, max: 35.0) [2023-10-10 06:17:31,784][52050] Avg episode reward: [(0, '20.510'), (1, '20.550')] [2023-10-10 06:17:31,928][53252] Updated weights for policy 0, policy_version 39540 (0.0008) [2023-10-10 06:17:32,023][53268] Updated weights for policy 1, policy_version 39490 (0.0010) [2023-10-10 06:17:32,291][53252] Updated weights for policy 0, policy_version 39550 (0.0010) [2023-10-10 06:17:32,389][53268] Updated weights for policy 1, policy_version 39500 (0.0008) [2023-10-10 06:17:32,764][53268] Updated weights for policy 1, policy_version 39510 (0.0010) [2023-10-10 06:17:33,136][53268] Updated weights for policy 1, policy_version 39520 (0.0008) [2023-10-10 06:17:36,328][53252] Updated weights for policy 0, policy_version 39560 (0.0007) [2023-10-10 06:17:36,697][53252] Updated weights for policy 0, policy_version 39570 (0.0007) [2023-10-10 06:17:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 80969728. Throughput: 0: 1689.5, 1: 1679.1. 
Samples: 20255910. Policy #0 lag: (min: 3.0, avg: 11.7, max: 35.0) [2023-10-10 06:17:36,784][52050] Avg episode reward: [(0, '20.070'), (1, '18.800')] [2023-10-10 06:17:37,085][53252] Updated weights for policy 0, policy_version 39580 (0.0009) [2023-10-10 06:17:37,107][53268] Updated weights for policy 1, policy_version 39530 (0.0008) [2023-10-10 06:17:37,466][53268] Updated weights for policy 1, policy_version 39540 (0.0007) [2023-10-10 06:17:37,840][53268] Updated weights for policy 1, policy_version 39550 (0.0009) [2023-10-10 06:17:41,158][53252] Updated weights for policy 0, policy_version 39590 (0.0009) [2023-10-10 06:17:41,523][53252] Updated weights for policy 0, policy_version 39600 (0.0008) [2023-10-10 06:17:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 81035264. Throughput: 0: 1678.9, 1: 1679.6. Samples: 20276270. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 06:17:41,784][52050] Avg episode reward: [(0, '20.250'), (1, '20.970')] [2023-10-10 06:17:41,899][53252] Updated weights for policy 0, policy_version 39610 (0.0008) [2023-10-10 06:17:41,941][53268] Updated weights for policy 1, policy_version 39560 (0.0009) [2023-10-10 06:17:42,314][53268] Updated weights for policy 1, policy_version 39570 (0.0009) [2023-10-10 06:17:42,684][53268] Updated weights for policy 1, policy_version 39580 (0.0008) [2023-10-10 06:17:46,002][53252] Updated weights for policy 0, policy_version 39620 (0.0008) [2023-10-10 06:17:46,361][53252] Updated weights for policy 0, policy_version 39630 (0.0008) [2023-10-10 06:17:46,731][53252] Updated weights for policy 0, policy_version 39640 (0.0008) [2023-10-10 06:17:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81100800. Throughput: 0: 1684.8, 1: 1681.5. Samples: 20285910. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 06:17:46,784][52050] Avg episode reward: [(0, '19.650'), (1, '19.490')] [2023-10-10 06:17:46,862][53268] Updated weights for policy 1, policy_version 39590 (0.0008) [2023-10-10 06:17:47,237][53268] Updated weights for policy 1, policy_version 39600 (0.0007) [2023-10-10 06:17:47,594][53268] Updated weights for policy 1, policy_version 39610 (0.0009) [2023-10-10 06:17:50,939][53252] Updated weights for policy 0, policy_version 39650 (0.0007) [2023-10-10 06:17:51,331][53252] Updated weights for policy 0, policy_version 39660 (0.0009) [2023-10-10 06:17:51,705][53252] Updated weights for policy 0, policy_version 39670 (0.0008) [2023-10-10 06:17:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81166336. Throughput: 0: 1684.0, 1: 1673.2. Samples: 20306054. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 06:17:51,784][52050] Avg episode reward: [(0, '21.580'), (1, '19.780')] [2023-10-10 06:17:51,791][53268] Updated weights for policy 1, policy_version 39620 (0.0009) [2023-10-10 06:17:52,079][53252] Updated weights for policy 0, policy_version 39680 (0.0007) [2023-10-10 06:17:52,149][53268] Updated weights for policy 1, policy_version 39630 (0.0008) [2023-10-10 06:17:52,513][53268] Updated weights for policy 1, policy_version 39640 (0.0007) [2023-10-10 06:17:56,305][53252] Updated weights for policy 0, policy_version 39690 (0.0008) [2023-10-10 06:17:56,597][53268] Updated weights for policy 1, policy_version 39650 (0.0008) [2023-10-10 06:17:56,670][53252] Updated weights for policy 0, policy_version 39700 (0.0009) [2023-10-10 06:17:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81231872. Throughput: 0: 1668.4, 1: 1675.1. Samples: 20326022. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 06:17:56,784][52050] Avg episode reward: [(0, '20.070'), (1, '18.790')] [2023-10-10 06:17:56,969][53268] Updated weights for policy 1, policy_version 39660 (0.0007) [2023-10-10 06:17:57,043][53252] Updated weights for policy 0, policy_version 39710 (0.0007) [2023-10-10 06:17:57,322][53268] Updated weights for policy 1, policy_version 39670 (0.0008) [2023-10-10 06:17:57,689][53268] Updated weights for policy 1, policy_version 39680 (0.0008) [2023-10-10 06:18:01,112][53252] Updated weights for policy 0, policy_version 39720 (0.0009) [2023-10-10 06:18:01,482][53252] Updated weights for policy 0, policy_version 39730 (0.0009) [2023-10-10 06:18:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81297408. Throughput: 0: 1680.5, 1: 1677.3. Samples: 20335694. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-10 06:18:01,784][52050] Avg episode reward: [(0, '19.440'), (1, '17.870')] [2023-10-10 06:18:01,802][53268] Updated weights for policy 1, policy_version 39690 (0.0008) [2023-10-10 06:18:01,853][53252] Updated weights for policy 0, policy_version 39740 (0.0007) [2023-10-10 06:18:02,178][53268] Updated weights for policy 1, policy_version 39700 (0.0007) [2023-10-10 06:18:02,538][53268] Updated weights for policy 1, policy_version 39710 (0.0009) [2023-10-10 06:18:05,742][53252] Updated weights for policy 0, policy_version 39750 (0.0008) [2023-10-10 06:18:06,113][53252] Updated weights for policy 0, policy_version 39760 (0.0008) [2023-10-10 06:18:06,489][53252] Updated weights for policy 0, policy_version 39770 (0.0009) [2023-10-10 06:18:06,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81395712. Throughput: 0: 1683.4, 1: 1679.9. Samples: 20356464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:18:06,784][52050] Avg episode reward: [(0, '20.590'), (1, '19.080')] [2023-10-10 06:18:06,822][53268] Updated weights for policy 1, policy_version 39720 (0.0009) [2023-10-10 06:18:07,200][53268] Updated weights for policy 1, policy_version 39730 (0.0009) [2023-10-10 06:18:07,570][53268] Updated weights for policy 1, policy_version 39740 (0.0008) [2023-10-10 06:18:10,733][53252] Updated weights for policy 0, policy_version 39780 (0.0009) [2023-10-10 06:18:11,110][53252] Updated weights for policy 0, policy_version 39790 (0.0011) [2023-10-10 06:18:11,478][53268] Updated weights for policy 1, policy_version 39750 (0.0007) [2023-10-10 06:18:11,487][53252] Updated weights for policy 0, policy_version 39800 (0.0009) [2023-10-10 06:18:11,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81461248. Throughput: 0: 1657.6, 1: 1678.9. Samples: 20376110. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:18:11,784][52050] Avg episode reward: [(0, '19.020'), (1, '19.020')] [2023-10-10 06:18:11,841][53268] Updated weights for policy 1, policy_version 39760 (0.0007) [2023-10-10 06:18:12,213][53268] Updated weights for policy 1, policy_version 39770 (0.0008) [2023-10-10 06:18:15,535][53252] Updated weights for policy 0, policy_version 39810 (0.0008) [2023-10-10 06:18:15,904][53252] Updated weights for policy 0, policy_version 39820 (0.0010) [2023-10-10 06:18:16,278][53252] Updated weights for policy 0, policy_version 39830 (0.0009) [2023-10-10 06:18:16,434][53268] Updated weights for policy 1, policy_version 39780 (0.0009) [2023-10-10 06:18:16,641][53252] Updated weights for policy 0, policy_version 39840 (0.0007) [2023-10-10 06:18:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81526784. Throughput: 0: 1676.7, 1: 1678.4. Samples: 20386084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:18:16,784][52050] Avg episode reward: [(0, '19.690'), (1, '19.270')] [2023-10-10 06:18:16,804][53268] Updated weights for policy 1, policy_version 39790 (0.0007) [2023-10-10 06:18:17,175][53268] Updated weights for policy 1, policy_version 39800 (0.0008) [2023-10-10 06:18:20,797][53252] Updated weights for policy 0, policy_version 39850 (0.0011) [2023-10-10 06:18:21,150][53268] Updated weights for policy 1, policy_version 39810 (0.0008) [2023-10-10 06:18:21,162][53252] Updated weights for policy 0, policy_version 39860 (0.0008) [2023-10-10 06:18:21,523][53268] Updated weights for policy 1, policy_version 39820 (0.0008) [2023-10-10 06:18:21,537][53252] Updated weights for policy 0, policy_version 39870 (0.0008) [2023-10-10 06:18:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81592320. Throughput: 0: 1677.4, 1: 1674.5. Samples: 20406746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:18:21,784][52050] Avg episode reward: [(0, '20.000'), (1, '18.370')] [2023-10-10 06:18:21,890][53268] Updated weights for policy 1, policy_version 39830 (0.0011) [2023-10-10 06:18:22,260][53268] Updated weights for policy 1, policy_version 39840 (0.0009) [2023-10-10 06:18:25,745][53252] Updated weights for policy 0, policy_version 39880 (0.0007) [2023-10-10 06:18:26,122][53252] Updated weights for policy 0, policy_version 39890 (0.0007) [2023-10-10 06:18:26,436][53268] Updated weights for policy 1, policy_version 39850 (0.0008) [2023-10-10 06:18:26,499][53252] Updated weights for policy 0, policy_version 39900 (0.0008) [2023-10-10 06:18:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81657856. Throughput: 0: 1661.3, 1: 1671.4. Samples: 20426242. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:18:26,784][52050] Avg episode reward: [(0, '19.190'), (1, '17.920')] [2023-10-10 06:18:26,801][53268] Updated weights for policy 1, policy_version 39860 (0.0009) [2023-10-10 06:18:27,174][53268] Updated weights for policy 1, policy_version 39870 (0.0007) [2023-10-10 06:18:30,615][53252] Updated weights for policy 0, policy_version 39910 (0.0008) [2023-10-10 06:18:30,986][53252] Updated weights for policy 0, policy_version 39920 (0.0008) [2023-10-10 06:18:31,240][53268] Updated weights for policy 1, policy_version 39880 (0.0009) [2023-10-10 06:18:31,351][53252] Updated weights for policy 0, policy_version 39930 (0.0009) [2023-10-10 06:18:31,611][53268] Updated weights for policy 1, policy_version 39890 (0.0008) [2023-10-10 06:18:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81723392. Throughput: 0: 1669.6, 1: 1671.0. Samples: 20436238. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:18:31,784][52050] Avg episode reward: [(0, '21.210'), (1, '17.850')] [2023-10-10 06:18:31,987][53268] Updated weights for policy 1, policy_version 39900 (0.0008) [2023-10-10 06:18:35,382][53252] Updated weights for policy 0, policy_version 39940 (0.0009) [2023-10-10 06:18:35,772][53252] Updated weights for policy 0, policy_version 39950 (0.0007) [2023-10-10 06:18:36,136][53252] Updated weights for policy 0, policy_version 39960 (0.0010) [2023-10-10 06:18:36,210][53268] Updated weights for policy 1, policy_version 39910 (0.0009) [2023-10-10 06:18:36,569][53268] Updated weights for policy 1, policy_version 39920 (0.0008) [2023-10-10 06:18:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 81788928. Throughput: 0: 1670.7, 1: 1674.0. Samples: 20456568. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 06:18:36,785][52050] Avg episode reward: [(0, '20.980'), (1, '17.660')] [2023-10-10 06:18:36,934][53268] Updated weights for policy 1, policy_version 39930 (0.0008) [2023-10-10 06:18:40,141][53252] Updated weights for policy 0, policy_version 39970 (0.0008) [2023-10-10 06:18:40,510][53252] Updated weights for policy 0, policy_version 39980 (0.0009) [2023-10-10 06:18:40,885][53252] Updated weights for policy 0, policy_version 39990 (0.0010) [2023-10-10 06:18:40,958][53268] Updated weights for policy 1, policy_version 39940 (0.0008) [2023-10-10 06:18:41,251][53252] Updated weights for policy 0, policy_version 40000 (0.0009) [2023-10-10 06:18:41,331][53268] Updated weights for policy 1, policy_version 39950 (0.0007) [2023-10-10 06:18:41,697][53268] Updated weights for policy 1, policy_version 39960 (0.0008) [2023-10-10 06:18:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81854464. Throughput: 0: 1666.3, 1: 1666.6. Samples: 20476004. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 06:18:41,784][52050] Avg episode reward: [(0, '20.850'), (1, '19.450')] [2023-10-10 06:18:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000040000_40960000.pth... [2023-10-10 06:18:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000038432_39354368.pth [2023-10-10 06:18:41,979][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000039968_40927232.pth... 
[2023-10-10 06:18:42,008][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000038400_39321600.pth [2023-10-10 06:18:45,067][53252] Updated weights for policy 0, policy_version 40010 (0.0009) [2023-10-10 06:18:45,443][53252] Updated weights for policy 0, policy_version 40020 (0.0009) [2023-10-10 06:18:45,736][53268] Updated weights for policy 1, policy_version 39970 (0.0008) [2023-10-10 06:18:45,821][53252] Updated weights for policy 0, policy_version 40030 (0.0010) [2023-10-10 06:18:46,114][53268] Updated weights for policy 1, policy_version 39980 (0.0009) [2023-10-10 06:18:46,487][53268] Updated weights for policy 1, policy_version 39990 (0.0008) [2023-10-10 06:18:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81920000. Throughput: 0: 1684.8, 1: 1670.4. Samples: 20486674. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 06:18:46,784][52050] Avg episode reward: [(0, '20.750'), (1, '20.530')] [2023-10-10 06:18:46,845][53268] Updated weights for policy 1, policy_version 40000 (0.0008) [2023-10-10 06:18:49,936][53252] Updated weights for policy 0, policy_version 40040 (0.0008) [2023-10-10 06:18:50,301][53252] Updated weights for policy 0, policy_version 40050 (0.0009) [2023-10-10 06:18:50,677][53252] Updated weights for policy 0, policy_version 40060 (0.0010) [2023-10-10 06:18:51,027][53268] Updated weights for policy 1, policy_version 40010 (0.0007) [2023-10-10 06:18:51,397][53268] Updated weights for policy 1, policy_version 40020 (0.0007) [2023-10-10 06:18:51,767][53268] Updated weights for policy 1, policy_version 40030 (0.0007) [2023-10-10 06:18:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 81985536. Throughput: 0: 1661.9, 1: 1671.7. Samples: 20506472. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 06:18:51,784][52050] Avg episode reward: [(0, '22.510'), (1, '21.380')] [2023-10-10 06:18:54,545][53252] Updated weights for policy 0, policy_version 40070 (0.0007) [2023-10-10 06:18:54,918][53252] Updated weights for policy 0, policy_version 40080 (0.0009) [2023-10-10 06:18:55,294][53252] Updated weights for policy 0, policy_version 40090 (0.0009) [2023-10-10 06:18:55,746][53268] Updated weights for policy 1, policy_version 40040 (0.0008) [2023-10-10 06:18:56,115][53268] Updated weights for policy 1, policy_version 40050 (0.0008) [2023-10-10 06:18:56,488][53268] Updated weights for policy 1, policy_version 40060 (0.0010) [2023-10-10 06:18:56,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 82083840. Throughput: 0: 1673.3, 1: 1659.8. Samples: 20526100. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 06:18:56,784][52050] Avg episode reward: [(0, '21.810'), (1, '21.240')] [2023-10-10 06:18:59,406][53252] Updated weights for policy 0, policy_version 40100 (0.0007) [2023-10-10 06:18:59,776][53252] Updated weights for policy 0, policy_version 40110 (0.0009) [2023-10-10 06:19:00,150][53252] Updated weights for policy 0, policy_version 40120 (0.0009) [2023-10-10 06:19:00,606][53268] Updated weights for policy 1, policy_version 40070 (0.0009) [2023-10-10 06:19:00,959][53268] Updated weights for policy 1, policy_version 40080 (0.0010) [2023-10-10 06:19:01,335][53268] Updated weights for policy 1, policy_version 40090 (0.0009) [2023-10-10 06:19:01,783][52050] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 82149376. Throughput: 0: 1679.5, 1: 1674.6. Samples: 20537020. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 06:19:01,785][52050] Avg episode reward: [(0, '21.970'), (1, '20.260')] [2023-10-10 06:19:04,257][53252] Updated weights for policy 0, policy_version 40130 (0.0007) [2023-10-10 06:19:04,627][53252] Updated weights for policy 0, policy_version 40140 (0.0009) [2023-10-10 06:19:04,992][53252] Updated weights for policy 0, policy_version 40150 (0.0010) [2023-10-10 06:19:05,359][53268] Updated weights for policy 1, policy_version 40100 (0.0008) [2023-10-10 06:19:05,366][53252] Updated weights for policy 0, policy_version 40160 (0.0009) [2023-10-10 06:19:05,727][53268] Updated weights for policy 1, policy_version 40110 (0.0007) [2023-10-10 06:19:06,095][53268] Updated weights for policy 1, policy_version 40120 (0.0009) [2023-10-10 06:19:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82214912. Throughput: 0: 1659.3, 1: 1671.1. Samples: 20556614. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 06:19:06,784][52050] Avg episode reward: [(0, '21.330'), (1, '20.700')] [2023-10-10 06:19:09,509][53252] Updated weights for policy 0, policy_version 40170 (0.0008) [2023-10-10 06:19:09,885][53252] Updated weights for policy 0, policy_version 40180 (0.0010) [2023-10-10 06:19:10,095][53268] Updated weights for policy 1, policy_version 40130 (0.0008) [2023-10-10 06:19:10,256][53252] Updated weights for policy 0, policy_version 40190 (0.0008) [2023-10-10 06:19:10,464][53268] Updated weights for policy 1, policy_version 40140 (0.0010) [2023-10-10 06:19:10,823][53268] Updated weights for policy 1, policy_version 40150 (0.0009) [2023-10-10 06:19:11,191][53268] Updated weights for policy 1, policy_version 40160 (0.0007) [2023-10-10 06:19:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 82280448. Throughput: 0: 1686.5, 1: 1651.4. Samples: 20576446. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 06:19:11,784][52050] Avg episode reward: [(0, '21.910'), (1, '18.840')] [2023-10-10 06:19:14,224][53252] Updated weights for policy 0, policy_version 40200 (0.0008) [2023-10-10 06:19:14,598][53252] Updated weights for policy 0, policy_version 40210 (0.0008) [2023-10-10 06:19:14,978][53252] Updated weights for policy 0, policy_version 40220 (0.0010) [2023-10-10 06:19:15,397][53268] Updated weights for policy 1, policy_version 40170 (0.0008) [2023-10-10 06:19:15,756][53268] Updated weights for policy 1, policy_version 40180 (0.0009) [2023-10-10 06:19:16,118][53268] Updated weights for policy 1, policy_version 40190 (0.0009) [2023-10-10 06:19:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82345984. Throughput: 0: 1688.0, 1: 1675.2. Samples: 20587582. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 06:19:16,784][52050] Avg episode reward: [(0, '19.680'), (1, '18.790')] [2023-10-10 06:19:19,066][53252] Updated weights for policy 0, policy_version 40230 (0.0008) [2023-10-10 06:19:19,442][53252] Updated weights for policy 0, policy_version 40240 (0.0007) [2023-10-10 06:19:19,806][53252] Updated weights for policy 0, policy_version 40250 (0.0008) [2023-10-10 06:19:20,174][53268] Updated weights for policy 1, policy_version 40200 (0.0009) [2023-10-10 06:19:20,537][53268] Updated weights for policy 1, policy_version 40210 (0.0007) [2023-10-10 06:19:20,910][53268] Updated weights for policy 1, policy_version 40220 (0.0007) [2023-10-10 06:19:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 82411520. Throughput: 0: 1674.4, 1: 1671.3. Samples: 20607126. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 06:19:21,784][52050] Avg episode reward: [(0, '21.050'), (1, '18.990')] [2023-10-10 06:19:23,929][53252] Updated weights for policy 0, policy_version 40260 (0.0008) [2023-10-10 06:19:24,315][53252] Updated weights for policy 0, policy_version 40270 (0.0007) [2023-10-10 06:19:24,703][53252] Updated weights for policy 0, policy_version 40280 (0.0009) [2023-10-10 06:19:24,992][53268] Updated weights for policy 1, policy_version 40230 (0.0007) [2023-10-10 06:19:25,355][53268] Updated weights for policy 1, policy_version 40240 (0.0008) [2023-10-10 06:19:25,732][53268] Updated weights for policy 1, policy_version 40250 (0.0009) [2023-10-10 06:19:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82477056. Throughput: 0: 1694.9, 1: 1657.5. Samples: 20626862. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 06:19:26,784][52050] Avg episode reward: [(0, '21.210'), (1, '18.800')] [2023-10-10 06:19:28,580][53252] Updated weights for policy 0, policy_version 40290 (0.0010) [2023-10-10 06:19:28,948][53252] Updated weights for policy 0, policy_version 40300 (0.0009) [2023-10-10 06:19:29,323][53252] Updated weights for policy 0, policy_version 40310 (0.0010) [2023-10-10 06:19:29,685][53252] Updated weights for policy 0, policy_version 40320 (0.0009) [2023-10-10 06:19:29,779][53268] Updated weights for policy 1, policy_version 40260 (0.0007) [2023-10-10 06:19:30,146][53268] Updated weights for policy 1, policy_version 40270 (0.0007) [2023-10-10 06:19:30,514][53268] Updated weights for policy 1, policy_version 40280 (0.0009) [2023-10-10 06:19:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82542592. Throughput: 0: 1674.2, 1: 1681.9. Samples: 20637696. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-10 06:19:31,784][52050] Avg episode reward: [(0, '21.690'), (1, '19.030')] [2023-10-10 06:19:33,780][53252] Updated weights for policy 0, policy_version 40330 (0.0010) [2023-10-10 06:19:34,144][53252] Updated weights for policy 0, policy_version 40340 (0.0007) [2023-10-10 06:19:34,524][53252] Updated weights for policy 0, policy_version 40350 (0.0008) [2023-10-10 06:19:34,541][53268] Updated weights for policy 1, policy_version 40290 (0.0008) [2023-10-10 06:19:34,903][53268] Updated weights for policy 1, policy_version 40300 (0.0010) [2023-10-10 06:19:35,272][53268] Updated weights for policy 1, policy_version 40310 (0.0007) [2023-10-10 06:19:35,638][53268] Updated weights for policy 1, policy_version 40320 (0.0009) [2023-10-10 06:19:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 82608128. Throughput: 0: 1683.8, 1: 1669.3. Samples: 20657360. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:19:36,784][52050] Avg episode reward: [(0, '21.860'), (1, '19.840')] [2023-10-10 06:19:38,520][53252] Updated weights for policy 0, policy_version 40360 (0.0010) [2023-10-10 06:19:38,889][53252] Updated weights for policy 0, policy_version 40370 (0.0008) [2023-10-10 06:19:39,266][53252] Updated weights for policy 0, policy_version 40380 (0.0009) [2023-10-10 06:19:39,735][53268] Updated weights for policy 1, policy_version 40330 (0.0007) [2023-10-10 06:19:40,098][53268] Updated weights for policy 1, policy_version 40340 (0.0011) [2023-10-10 06:19:40,460][53268] Updated weights for policy 1, policy_version 40350 (0.0008) [2023-10-10 06:19:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 82673664. Throughput: 0: 1693.7, 1: 1670.0. Samples: 20677470. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:19:41,784][52050] Avg episode reward: [(0, '22.170'), (1, '20.870')] [2023-10-10 06:19:43,346][53252] Updated weights for policy 0, policy_version 40390 (0.0008) [2023-10-10 06:19:43,720][53252] Updated weights for policy 0, policy_version 40400 (0.0007) [2023-10-10 06:19:44,102][53252] Updated weights for policy 0, policy_version 40410 (0.0008) [2023-10-10 06:19:44,649][53268] Updated weights for policy 1, policy_version 40360 (0.0010) [2023-10-10 06:19:45,017][53268] Updated weights for policy 1, policy_version 40370 (0.0009) [2023-10-10 06:19:45,383][53268] Updated weights for policy 1, policy_version 40380 (0.0008) [2023-10-10 06:19:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 82739200. Throughput: 0: 1665.9, 1: 1686.1. Samples: 20687860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:19:46,784][52050] Avg episode reward: [(0, '21.860'), (1, '20.340')] [2023-10-10 06:19:48,019][53252] Updated weights for policy 0, policy_version 40420 (0.0009) [2023-10-10 06:19:48,397][53252] Updated weights for policy 0, policy_version 40430 (0.0009) [2023-10-10 06:19:48,753][53252] Updated weights for policy 0, policy_version 40440 (0.0009) [2023-10-10 06:19:49,574][53268] Updated weights for policy 1, policy_version 40390 (0.0009) [2023-10-10 06:19:49,956][53268] Updated weights for policy 1, policy_version 40400 (0.0010) [2023-10-10 06:19:50,312][53268] Updated weights for policy 1, policy_version 40410 (0.0011) [2023-10-10 06:19:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 82804736. Throughput: 0: 1694.0, 1: 1664.6. Samples: 20707752. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:19:51,784][52050] Avg episode reward: [(0, '21.440'), (1, '20.720')] [2023-10-10 06:19:52,567][53252] Updated weights for policy 0, policy_version 40450 (0.0007) [2023-10-10 06:19:52,931][53252] Updated weights for policy 0, policy_version 40460 (0.0009) [2023-10-10 06:19:53,301][53252] Updated weights for policy 0, policy_version 40470 (0.0008) [2023-10-10 06:19:53,672][53252] Updated weights for policy 0, policy_version 40480 (0.0009) [2023-10-10 06:19:54,243][53268] Updated weights for policy 1, policy_version 40420 (0.0008) [2023-10-10 06:19:54,616][53268] Updated weights for policy 1, policy_version 40430 (0.0007) [2023-10-10 06:19:54,981][53268] Updated weights for policy 1, policy_version 40440 (0.0008) [2023-10-10 06:19:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82870272. Throughput: 0: 1696.3, 1: 1677.8. Samples: 20728278. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:19:56,784][52050] Avg episode reward: [(0, '19.640'), (1, '20.530')] [2023-10-10 06:19:57,688][53252] Updated weights for policy 0, policy_version 40490 (0.0011) [2023-10-10 06:19:58,058][53252] Updated weights for policy 0, policy_version 40500 (0.0010) [2023-10-10 06:19:58,431][53252] Updated weights for policy 0, policy_version 40510 (0.0009) [2023-10-10 06:19:58,950][53268] Updated weights for policy 1, policy_version 40450 (0.0010) [2023-10-10 06:19:59,318][53268] Updated weights for policy 1, policy_version 40460 (0.0007) [2023-10-10 06:19:59,686][53268] Updated weights for policy 1, policy_version 40470 (0.0009) [2023-10-10 06:20:00,055][53268] Updated weights for policy 1, policy_version 40480 (0.0010) [2023-10-10 06:20:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 82935808. Throughput: 0: 1671.6, 1: 1677.5. Samples: 20738292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:01,784][52050] Avg episode reward: [(0, '21.190'), (1, '21.520')] [2023-10-10 06:20:02,664][53252] Updated weights for policy 0, policy_version 40520 (0.0008) [2023-10-10 06:20:03,030][53252] Updated weights for policy 0, policy_version 40530 (0.0008) [2023-10-10 06:20:03,403][53252] Updated weights for policy 0, policy_version 40540 (0.0007) [2023-10-10 06:20:04,113][53268] Updated weights for policy 1, policy_version 40490 (0.0010) [2023-10-10 06:20:04,484][53268] Updated weights for policy 1, policy_version 40500 (0.0009) [2023-10-10 06:20:04,845][53268] Updated weights for policy 1, policy_version 40510 (0.0008) [2023-10-10 06:20:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83001344. Throughput: 0: 1687.9, 1: 1666.4. Samples: 20758068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:06,784][52050] Avg episode reward: [(0, '21.010'), (1, '19.490')] [2023-10-10 06:20:07,541][53252] Updated weights for policy 0, policy_version 40550 (0.0008) [2023-10-10 06:20:07,923][53252] Updated weights for policy 0, policy_version 40560 (0.0010) [2023-10-10 06:20:08,298][53252] Updated weights for policy 0, policy_version 40570 (0.0008) [2023-10-10 06:20:08,923][53268] Updated weights for policy 1, policy_version 40520 (0.0009) [2023-10-10 06:20:09,282][53268] Updated weights for policy 1, policy_version 40530 (0.0009) [2023-10-10 06:20:09,652][53268] Updated weights for policy 1, policy_version 40540 (0.0007) [2023-10-10 06:20:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83066880. Throughput: 0: 1689.1, 1: 1682.2. Samples: 20778568. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:11,784][52050] Avg episode reward: [(0, '20.090'), (1, '18.870')] [2023-10-10 06:20:12,458][53252] Updated weights for policy 0, policy_version 40580 (0.0007) [2023-10-10 06:20:12,850][53252] Updated weights for policy 0, policy_version 40590 (0.0008) [2023-10-10 06:20:13,212][53252] Updated weights for policy 0, policy_version 40600 (0.0008) [2023-10-10 06:20:13,804][53268] Updated weights for policy 1, policy_version 40550 (0.0010) [2023-10-10 06:20:14,173][53268] Updated weights for policy 1, policy_version 40560 (0.0010) [2023-10-10 06:20:14,538][53268] Updated weights for policy 1, policy_version 40570 (0.0010) [2023-10-10 06:20:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83132416. Throughput: 0: 1677.5, 1: 1668.9. Samples: 20788284. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:16,784][52050] Avg episode reward: [(0, '20.550'), (1, '18.700')] [2023-10-10 06:20:17,347][53252] Updated weights for policy 0, policy_version 40610 (0.0009) [2023-10-10 06:20:17,716][53252] Updated weights for policy 0, policy_version 40620 (0.0009) [2023-10-10 06:20:18,094][53252] Updated weights for policy 0, policy_version 40630 (0.0009) [2023-10-10 06:20:18,458][53252] Updated weights for policy 0, policy_version 40640 (0.0009) [2023-10-10 06:20:18,727][53268] Updated weights for policy 1, policy_version 40580 (0.0011) [2023-10-10 06:20:19,094][53268] Updated weights for policy 1, policy_version 40590 (0.0010) [2023-10-10 06:20:19,456][53268] Updated weights for policy 1, policy_version 40600 (0.0011) [2023-10-10 06:20:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83197952. Throughput: 0: 1683.9, 1: 1668.2. Samples: 20808202. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:21,784][52050] Avg episode reward: [(0, '20.170'), (1, '20.070')] [2023-10-10 06:20:22,328][53252] Updated weights for policy 0, policy_version 40650 (0.0010) [2023-10-10 06:20:22,701][53252] Updated weights for policy 0, policy_version 40660 (0.0010) [2023-10-10 06:20:23,083][53252] Updated weights for policy 0, policy_version 40670 (0.0009) [2023-10-10 06:20:23,672][53268] Updated weights for policy 1, policy_version 40610 (0.0008) [2023-10-10 06:20:24,050][53268] Updated weights for policy 1, policy_version 40620 (0.0008) [2023-10-10 06:20:24,411][53268] Updated weights for policy 1, policy_version 40630 (0.0008) [2023-10-10 06:20:24,785][53268] Updated weights for policy 1, policy_version 40640 (0.0008) [2023-10-10 06:20:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 83263488. Throughput: 0: 1691.2, 1: 1680.2. Samples: 20829186. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:26,784][52050] Avg episode reward: [(0, '20.020'), (1, '19.150')] [2023-10-10 06:20:26,998][53252] Updated weights for policy 0, policy_version 40680 (0.0007) [2023-10-10 06:20:27,376][53252] Updated weights for policy 0, policy_version 40690 (0.0007) [2023-10-10 06:20:27,748][53252] Updated weights for policy 0, policy_version 40700 (0.0009) [2023-10-10 06:20:28,748][53268] Updated weights for policy 1, policy_version 40650 (0.0010) [2023-10-10 06:20:29,126][53268] Updated weights for policy 1, policy_version 40660 (0.0010) [2023-10-10 06:20:29,489][53268] Updated weights for policy 1, policy_version 40670 (0.0007) [2023-10-10 06:20:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83329024. Throughput: 0: 1694.0, 1: 1661.3. Samples: 20838846. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:20:31,784][52050] Avg episode reward: [(0, '21.140'), (1, '20.460')] [2023-10-10 06:20:31,934][53252] Updated weights for policy 0, policy_version 40710 (0.0008) [2023-10-10 06:20:32,304][53252] Updated weights for policy 0, policy_version 40720 (0.0007) [2023-10-10 06:20:32,672][53252] Updated weights for policy 0, policy_version 40730 (0.0007) [2023-10-10 06:20:33,539][53268] Updated weights for policy 1, policy_version 40680 (0.0007) [2023-10-10 06:20:33,915][53268] Updated weights for policy 1, policy_version 40690 (0.0007) [2023-10-10 06:20:34,278][53268] Updated weights for policy 1, policy_version 40700 (0.0007) [2023-10-10 06:20:36,751][53252] Updated weights for policy 0, policy_version 40740 (0.0007) [2023-10-10 06:20:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 83394560. Throughput: 0: 1685.2, 1: 1678.0. Samples: 20859098. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:20:36,784][52050] Avg episode reward: [(0, '20.880'), (1, '19.800')] [2023-10-10 06:20:37,119][53252] Updated weights for policy 0, policy_version 40750 (0.0007) [2023-10-10 06:20:37,496][53252] Updated weights for policy 0, policy_version 40760 (0.0008) [2023-10-10 06:20:38,258][53268] Updated weights for policy 1, policy_version 40710 (0.0008) [2023-10-10 06:20:38,624][53268] Updated weights for policy 1, policy_version 40720 (0.0008) [2023-10-10 06:20:38,993][53268] Updated weights for policy 1, policy_version 40730 (0.0008) [2023-10-10 06:20:41,777][53252] Updated weights for policy 0, policy_version 40770 (0.0009) [2023-10-10 06:20:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 83460096. Throughput: 0: 1675.3, 1: 1689.8. Samples: 20879710. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:20:41,784][52050] Avg episode reward: [(0, '21.520'), (1, '20.390')] [2023-10-10 06:20:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000040736_41713664.pth... [2023-10-10 06:20:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000039168_40108032.pth [2023-10-10 06:20:42,144][53252] Updated weights for policy 0, policy_version 40780 (0.0009) [2023-10-10 06:20:42,517][53252] Updated weights for policy 0, policy_version 40790 (0.0007) [2023-10-10 06:20:42,873][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000040800_41779200.pth... [2023-10-10 06:20:42,879][53252] Updated weights for policy 0, policy_version 40800 (0.0009) [2023-10-10 06:20:42,914][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000039200_40140800.pth [2023-10-10 06:20:43,130][53268] Updated weights for policy 1, policy_version 40740 (0.0008) [2023-10-10 06:20:43,497][53268] Updated weights for policy 1, policy_version 40750 (0.0011) [2023-10-10 06:20:43,857][53268] Updated weights for policy 1, policy_version 40760 (0.0009) [2023-10-10 06:20:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83525632. Throughput: 0: 1683.5, 1: 1668.7. Samples: 20889138. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:20:46,784][52050] Avg episode reward: [(0, '22.520'), (1, '18.050')] [2023-10-10 06:20:46,828][53252] Updated weights for policy 0, policy_version 40810 (0.0010) [2023-10-10 06:20:47,202][53252] Updated weights for policy 0, policy_version 40820 (0.0010) [2023-10-10 06:20:47,570][53252] Updated weights for policy 0, policy_version 40830 (0.0011) [2023-10-10 06:20:47,872][53268] Updated weights for policy 1, policy_version 40770 (0.0008) [2023-10-10 06:20:48,238][53268] Updated weights for policy 1, policy_version 40780 (0.0010) [2023-10-10 06:20:48,605][53268] Updated weights for policy 1, policy_version 40790 (0.0009) [2023-10-10 06:20:48,967][53268] Updated weights for policy 1, policy_version 40800 (0.0009) [2023-10-10 06:20:51,560][53252] Updated weights for policy 0, policy_version 40840 (0.0008) [2023-10-10 06:20:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83591168. Throughput: 0: 1679.0, 1: 1688.5. Samples: 20909608. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:20:51,784][52050] Avg episode reward: [(0, '21.550'), (1, '17.550')] [2023-10-10 06:20:51,931][53252] Updated weights for policy 0, policy_version 40850 (0.0008) [2023-10-10 06:20:52,296][53252] Updated weights for policy 0, policy_version 40860 (0.0008) [2023-10-10 06:20:53,061][53268] Updated weights for policy 1, policy_version 40810 (0.0007) [2023-10-10 06:20:53,433][53268] Updated weights for policy 1, policy_version 40820 (0.0008) [2023-10-10 06:20:53,796][53268] Updated weights for policy 1, policy_version 40830 (0.0007) [2023-10-10 06:20:56,470][53252] Updated weights for policy 0, policy_version 40870 (0.0009) [2023-10-10 06:20:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 83656704. Throughput: 0: 1676.9, 1: 1697.0. Samples: 20930394. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:20:56,784][52050] Avg episode reward: [(0, '21.550'), (1, '19.240')] [2023-10-10 06:20:56,851][53252] Updated weights for policy 0, policy_version 40880 (0.0008) [2023-10-10 06:20:57,216][53252] Updated weights for policy 0, policy_version 40890 (0.0008) [2023-10-10 06:20:57,645][53268] Updated weights for policy 1, policy_version 40840 (0.0010) [2023-10-10 06:20:58,016][53268] Updated weights for policy 1, policy_version 40850 (0.0010) [2023-10-10 06:20:58,389][53268] Updated weights for policy 1, policy_version 40860 (0.0009) [2023-10-10 06:21:01,316][53252] Updated weights for policy 0, policy_version 40900 (0.0007) [2023-10-10 06:21:01,720][53252] Updated weights for policy 0, policy_version 40910 (0.0009) [2023-10-10 06:21:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 83722240. Throughput: 0: 1686.6, 1: 1680.6. Samples: 20939806. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:21:01,784][52050] Avg episode reward: [(0, '21.940'), (1, '18.830')] [2023-10-10 06:21:02,097][53252] Updated weights for policy 0, policy_version 40920 (0.0008) [2023-10-10 06:21:02,413][53268] Updated weights for policy 1, policy_version 40870 (0.0009) [2023-10-10 06:21:02,778][53268] Updated weights for policy 1, policy_version 40880 (0.0009) [2023-10-10 06:21:03,147][53268] Updated weights for policy 1, policy_version 40890 (0.0008) [2023-10-10 06:21:06,138][53252] Updated weights for policy 0, policy_version 40930 (0.0008) [2023-10-10 06:21:06,515][53252] Updated weights for policy 0, policy_version 40940 (0.0007) [2023-10-10 06:21:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83787776. Throughput: 0: 1685.1, 1: 1697.1. Samples: 20960398. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:21:06,784][52050] Avg episode reward: [(0, '22.130'), (1, '19.580')] [2023-10-10 06:21:06,882][53252] Updated weights for policy 0, policy_version 40950 (0.0008) [2023-10-10 06:21:07,253][53252] Updated weights for policy 0, policy_version 40960 (0.0009) [2023-10-10 06:21:07,316][53268] Updated weights for policy 1, policy_version 40900 (0.0007) [2023-10-10 06:21:07,682][53268] Updated weights for policy 1, policy_version 40910 (0.0009) [2023-10-10 06:21:08,046][53268] Updated weights for policy 1, policy_version 40920 (0.0009) [2023-10-10 06:21:11,356][53252] Updated weights for policy 0, policy_version 40970 (0.0007) [2023-10-10 06:21:11,727][53252] Updated weights for policy 0, policy_version 40980 (0.0007) [2023-10-10 06:21:11,764][53268] Updated weights for policy 1, policy_version 40930 (0.0009) [2023-10-10 06:21:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83853312. Throughput: 0: 1666.0, 1: 1706.2. Samples: 20980934. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:21:11,784][52050] Avg episode reward: [(0, '20.660'), (1, '19.060')] [2023-10-10 06:21:12,103][53252] Updated weights for policy 0, policy_version 40990 (0.0007) [2023-10-10 06:21:12,134][53268] Updated weights for policy 1, policy_version 40940 (0.0007) [2023-10-10 06:21:12,496][53268] Updated weights for policy 1, policy_version 40950 (0.0007) [2023-10-10 06:21:12,859][53268] Updated weights for policy 1, policy_version 40960 (0.0008) [2023-10-10 06:21:16,197][53252] Updated weights for policy 0, policy_version 41000 (0.0007) [2023-10-10 06:21:16,570][53252] Updated weights for policy 0, policy_version 41010 (0.0007) [2023-10-10 06:21:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 83918848. Throughput: 0: 1673.4, 1: 1696.2. Samples: 20990480. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:21:16,784][52050] Avg episode reward: [(0, '21.110'), (1, '19.350')] [2023-10-10 06:21:16,833][53268] Updated weights for policy 1, policy_version 40970 (0.0009) [2023-10-10 06:21:16,933][53252] Updated weights for policy 0, policy_version 41020 (0.0008) [2023-10-10 06:21:17,201][53268] Updated weights for policy 1, policy_version 40980 (0.0009) [2023-10-10 06:21:17,564][53268] Updated weights for policy 1, policy_version 40990 (0.0011) [2023-10-10 06:21:21,110][53252] Updated weights for policy 0, policy_version 41030 (0.0009) [2023-10-10 06:21:21,484][53252] Updated weights for policy 0, policy_version 41040 (0.0008) [2023-10-10 06:21:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83984384. Throughput: 0: 1670.5, 1: 1705.6. Samples: 21011020. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:21:21,784][52050] Avg episode reward: [(0, '22.200'), (1, '19.340')] [2023-10-10 06:21:21,859][53268] Updated weights for policy 1, policy_version 41000 (0.0007) [2023-10-10 06:21:21,861][53252] Updated weights for policy 0, policy_version 41050 (0.0007) [2023-10-10 06:21:22,233][53268] Updated weights for policy 1, policy_version 41010 (0.0009) [2023-10-10 06:21:22,601][53268] Updated weights for policy 1, policy_version 41020 (0.0008) [2023-10-10 06:21:26,044][53252] Updated weights for policy 0, policy_version 41060 (0.0008) [2023-10-10 06:21:26,419][53252] Updated weights for policy 0, policy_version 41070 (0.0009) [2023-10-10 06:21:26,463][53268] Updated weights for policy 1, policy_version 41030 (0.0009) [2023-10-10 06:21:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84049920. Throughput: 0: 1664.6, 1: 1705.3. Samples: 21031352. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:21:26,784][52050] Avg episode reward: [(0, '21.520'), (1, '19.500')] [2023-10-10 06:21:26,793][53252] Updated weights for policy 0, policy_version 41080 (0.0008) [2023-10-10 06:21:26,835][53268] Updated weights for policy 1, policy_version 41040 (0.0009) [2023-10-10 06:21:27,198][53268] Updated weights for policy 1, policy_version 41050 (0.0007) [2023-10-10 06:21:30,893][53252] Updated weights for policy 0, policy_version 41090 (0.0010) [2023-10-10 06:21:31,265][53252] Updated weights for policy 0, policy_version 41100 (0.0009) [2023-10-10 06:21:31,280][53268] Updated weights for policy 1, policy_version 41060 (0.0008) [2023-10-10 06:21:31,638][53268] Updated weights for policy 1, policy_version 41070 (0.0008) [2023-10-10 06:21:31,639][53252] Updated weights for policy 0, policy_version 41110 (0.0008) [2023-10-10 06:21:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84115456. Throughput: 0: 1668.9, 1: 1700.5. Samples: 21040760. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:21:31,784][52050] Avg episode reward: [(0, '20.620'), (1, '19.240')] [2023-10-10 06:21:32,008][53252] Updated weights for policy 0, policy_version 41120 (0.0009) [2023-10-10 06:21:32,010][53268] Updated weights for policy 1, policy_version 41080 (0.0008) [2023-10-10 06:21:36,001][53268] Updated weights for policy 1, policy_version 41090 (0.0008) [2023-10-10 06:21:36,279][53252] Updated weights for policy 0, policy_version 41130 (0.0007) [2023-10-10 06:21:36,368][53268] Updated weights for policy 1, policy_version 41100 (0.0010) [2023-10-10 06:21:36,657][53252] Updated weights for policy 0, policy_version 41140 (0.0008) [2023-10-10 06:21:36,726][53268] Updated weights for policy 1, policy_version 41110 (0.0010) [2023-10-10 06:21:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84180992. Throughput: 0: 1671.7, 1: 1702.6. 
Samples: 21061450. Policy #0 lag: (min: 37.0, avg: 47.4, max: 48.0) [2023-10-10 06:21:36,784][52050] Avg episode reward: [(0, '21.200'), (1, '20.300')] [2023-10-10 06:21:37,021][53252] Updated weights for policy 0, policy_version 41150 (0.0010) [2023-10-10 06:21:37,098][53268] Updated weights for policy 1, policy_version 41120 (0.0008) [2023-10-10 06:21:41,022][53252] Updated weights for policy 0, policy_version 41160 (0.0008) [2023-10-10 06:21:41,197][53268] Updated weights for policy 1, policy_version 41130 (0.0008) [2023-10-10 06:21:41,391][53252] Updated weights for policy 0, policy_version 41170 (0.0008) [2023-10-10 06:21:41,564][53268] Updated weights for policy 1, policy_version 41140 (0.0007) [2023-10-10 06:21:41,765][53252] Updated weights for policy 0, policy_version 41180 (0.0009) [2023-10-10 06:21:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 84246528. Throughput: 0: 1656.7, 1: 1690.4. Samples: 21081014. Policy #0 lag: (min: 37.0, avg: 47.4, max: 48.0) [2023-10-10 06:21:41,784][52050] Avg episode reward: [(0, '21.540'), (1, '17.990')] [2023-10-10 06:21:41,931][53268] Updated weights for policy 1, policy_version 41150 (0.0008) [2023-10-10 06:21:45,851][53252] Updated weights for policy 0, policy_version 41190 (0.0008) [2023-10-10 06:21:46,033][53268] Updated weights for policy 1, policy_version 41160 (0.0009) [2023-10-10 06:21:46,224][53252] Updated weights for policy 0, policy_version 41200 (0.0007) [2023-10-10 06:21:46,405][53268] Updated weights for policy 1, policy_version 41170 (0.0007) [2023-10-10 06:21:46,595][53252] Updated weights for policy 0, policy_version 41210 (0.0009) [2023-10-10 06:21:46,775][53268] Updated weights for policy 1, policy_version 41180 (0.0009) [2023-10-10 06:21:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84312064. Throughput: 0: 1669.5, 1: 1696.8. Samples: 21091288. 
Policy #0 lag: (min: 37.0, avg: 47.4, max: 48.0) [2023-10-10 06:21:46,784][52050] Avg episode reward: [(0, '21.690'), (1, '17.510')] [2023-10-10 06:21:50,887][53252] Updated weights for policy 0, policy_version 41220 (0.0009) [2023-10-10 06:21:50,924][53268] Updated weights for policy 1, policy_version 41190 (0.0009) [2023-10-10 06:21:51,257][53252] Updated weights for policy 0, policy_version 41230 (0.0008) [2023-10-10 06:21:51,296][53268] Updated weights for policy 1, policy_version 41200 (0.0009) [2023-10-10 06:21:51,620][53252] Updated weights for policy 0, policy_version 41240 (0.0007) [2023-10-10 06:21:51,665][53268] Updated weights for policy 1, policy_version 41210 (0.0008) [2023-10-10 06:21:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 84377600. Throughput: 0: 1660.2, 1: 1692.9. Samples: 21111288. Policy #0 lag: (min: 37.0, avg: 47.4, max: 48.0) [2023-10-10 06:21:51,784][52050] Avg episode reward: [(0, '22.920'), (1, '18.140')] [2023-10-10 06:21:55,574][53268] Updated weights for policy 1, policy_version 41220 (0.0007) [2023-10-10 06:21:55,704][53252] Updated weights for policy 0, policy_version 41250 (0.0007) [2023-10-10 06:21:55,939][53268] Updated weights for policy 1, policy_version 41230 (0.0009) [2023-10-10 06:21:56,084][53252] Updated weights for policy 0, policy_version 41260 (0.0009) [2023-10-10 06:21:56,309][53268] Updated weights for policy 1, policy_version 41240 (0.0009) [2023-10-10 06:21:56,451][53252] Updated weights for policy 0, policy_version 41270 (0.0009) [2023-10-10 06:21:56,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 84475904. Throughput: 0: 1654.8, 1: 1675.0. Samples: 21130774. 
Policy #0 lag: (min: 37.0, avg: 47.4, max: 48.0) [2023-10-10 06:21:56,784][52050] Avg episode reward: [(0, '20.920'), (1, '17.540')] [2023-10-10 06:21:56,816][53252] Updated weights for policy 0, policy_version 41280 (0.0007) [2023-10-10 06:22:00,481][53268] Updated weights for policy 1, policy_version 41250 (0.0010) [2023-10-10 06:22:00,842][53252] Updated weights for policy 0, policy_version 41290 (0.0008) [2023-10-10 06:22:00,848][53268] Updated weights for policy 1, policy_version 41260 (0.0007) [2023-10-10 06:22:01,210][53252] Updated weights for policy 0, policy_version 41300 (0.0009) [2023-10-10 06:22:01,214][53268] Updated weights for policy 1, policy_version 41270 (0.0008) [2023-10-10 06:22:01,573][53268] Updated weights for policy 1, policy_version 41280 (0.0009) [2023-10-10 06:22:01,588][53252] Updated weights for policy 0, policy_version 41310 (0.0010) [2023-10-10 06:22:01,783][52050] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 84574208. Throughput: 0: 1666.5, 1: 1687.0. Samples: 21141388. Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) [2023-10-10 06:22:01,784][52050] Avg episode reward: [(0, '22.420'), (1, '17.710')] [2023-10-10 06:22:05,568][53252] Updated weights for policy 0, policy_version 41320 (0.0007) [2023-10-10 06:22:05,722][53268] Updated weights for policy 1, policy_version 41290 (0.0008) [2023-10-10 06:22:05,933][53252] Updated weights for policy 0, policy_version 41330 (0.0011) [2023-10-10 06:22:06,088][53268] Updated weights for policy 1, policy_version 41300 (0.0009) [2023-10-10 06:22:06,305][53252] Updated weights for policy 0, policy_version 41340 (0.0008) [2023-10-10 06:22:06,452][53268] Updated weights for policy 1, policy_version 41310 (0.0009) [2023-10-10 06:22:06,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 84639744. Throughput: 0: 1664.5, 1: 1685.2. Samples: 21161760. 
Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) [2023-10-10 06:22:06,784][52050] Avg episode reward: [(0, '19.260'), (1, '18.450')] [2023-10-10 06:22:10,402][53252] Updated weights for policy 0, policy_version 41350 (0.0007) [2023-10-10 06:22:10,561][53268] Updated weights for policy 1, policy_version 41320 (0.0009) [2023-10-10 06:22:10,782][53252] Updated weights for policy 0, policy_version 41360 (0.0007) [2023-10-10 06:22:10,945][53268] Updated weights for policy 1, policy_version 41330 (0.0009) [2023-10-10 06:22:11,162][53252] Updated weights for policy 0, policy_version 41370 (0.0008) [2023-10-10 06:22:11,309][53268] Updated weights for policy 1, policy_version 41340 (0.0009) [2023-10-10 06:22:11,784][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 84705280. Throughput: 0: 1647.0, 1: 1659.7. Samples: 21180154. Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) [2023-10-10 06:22:11,785][52050] Avg episode reward: [(0, '19.350'), (1, '19.530')] [2023-10-10 06:22:15,326][53252] Updated weights for policy 0, policy_version 41380 (0.0008) [2023-10-10 06:22:15,568][53268] Updated weights for policy 1, policy_version 41350 (0.0007) [2023-10-10 06:22:15,703][53252] Updated weights for policy 0, policy_version 41390 (0.0008) [2023-10-10 06:22:15,937][53268] Updated weights for policy 1, policy_version 41360 (0.0009) [2023-10-10 06:22:16,065][53252] Updated weights for policy 0, policy_version 41400 (0.0007) [2023-10-10 06:22:16,298][53268] Updated weights for policy 1, policy_version 41370 (0.0009) [2023-10-10 06:22:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 84770816. Throughput: 0: 1663.6, 1: 1680.3. Samples: 21191236. 
Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) [2023-10-10 06:22:16,784][52050] Avg episode reward: [(0, '18.980'), (1, '18.910')] [2023-10-10 06:22:20,141][53252] Updated weights for policy 0, policy_version 41410 (0.0007) [2023-10-10 06:22:20,315][53268] Updated weights for policy 1, policy_version 41380 (0.0009) [2023-10-10 06:22:20,516][53252] Updated weights for policy 0, policy_version 41420 (0.0009) [2023-10-10 06:22:20,685][53268] Updated weights for policy 1, policy_version 41390 (0.0008) [2023-10-10 06:22:20,886][53252] Updated weights for policy 0, policy_version 41430 (0.0008) [2023-10-10 06:22:21,055][53268] Updated weights for policy 1, policy_version 41400 (0.0007) [2023-10-10 06:22:21,256][53252] Updated weights for policy 0, policy_version 41440 (0.0009) [2023-10-10 06:22:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 84836352. Throughput: 0: 1657.1, 1: 1675.5. Samples: 21211416. Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) [2023-10-10 06:22:21,784][52050] Avg episode reward: [(0, '20.400'), (1, '19.650')] [2023-10-10 06:22:25,125][53268] Updated weights for policy 1, policy_version 41410 (0.0010) [2023-10-10 06:22:25,447][53252] Updated weights for policy 0, policy_version 41450 (0.0009) [2023-10-10 06:22:25,487][53268] Updated weights for policy 1, policy_version 41420 (0.0009) [2023-10-10 06:22:25,818][53252] Updated weights for policy 0, policy_version 41460 (0.0008) [2023-10-10 06:22:25,861][53268] Updated weights for policy 1, policy_version 41430 (0.0008) [2023-10-10 06:22:26,182][53252] Updated weights for policy 0, policy_version 41470 (0.0007) [2023-10-10 06:22:26,227][53268] Updated weights for policy 1, policy_version 41440 (0.0008) [2023-10-10 06:22:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 84901888. Throughput: 0: 1647.1, 1: 1662.1. Samples: 21229928. 
Policy #0 lag: (min: 5.0, avg: 7.7, max: 37.0) [2023-10-10 06:22:26,784][52050] Avg episode reward: [(0, '20.460'), (1, '19.740')] [2023-10-10 06:22:30,259][53252] Updated weights for policy 0, policy_version 41480 (0.0008) [2023-10-10 06:22:30,275][53268] Updated weights for policy 1, policy_version 41450 (0.0010) [2023-10-10 06:22:30,634][53252] Updated weights for policy 0, policy_version 41490 (0.0008) [2023-10-10 06:22:30,638][53268] Updated weights for policy 1, policy_version 41460 (0.0009) [2023-10-10 06:22:31,004][53252] Updated weights for policy 0, policy_version 41500 (0.0008) [2023-10-10 06:22:31,012][53268] Updated weights for policy 1, policy_version 41470 (0.0009) [2023-10-10 06:22:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 84967424. Throughput: 0: 1654.8, 1: 1679.7. Samples: 21241342. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) [2023-10-10 06:22:31,784][52050] Avg episode reward: [(0, '20.630'), (1, '19.660')] [2023-10-10 06:22:34,985][53268] Updated weights for policy 1, policy_version 41480 (0.0008) [2023-10-10 06:22:35,132][53252] Updated weights for policy 0, policy_version 41510 (0.0009) [2023-10-10 06:22:35,359][53268] Updated weights for policy 1, policy_version 41490 (0.0010) [2023-10-10 06:22:35,517][53252] Updated weights for policy 0, policy_version 41520 (0.0008) [2023-10-10 06:22:35,721][53268] Updated weights for policy 1, policy_version 41500 (0.0008) [2023-10-10 06:22:35,897][53252] Updated weights for policy 0, policy_version 41530 (0.0009) [2023-10-10 06:22:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 85032960. Throughput: 0: 1656.3, 1: 1676.4. Samples: 21261260. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) [2023-10-10 06:22:36,784][52050] Avg episode reward: [(0, '21.380'), (1, '20.070')] [2023-10-10 06:22:39,877][53268] Updated weights for policy 1, policy_version 41510 (0.0008) [2023-10-10 06:22:40,045][53252] Updated weights for policy 0, policy_version 41540 (0.0007) [2023-10-10 06:22:40,237][53268] Updated weights for policy 1, policy_version 41520 (0.0007) [2023-10-10 06:22:40,410][53252] Updated weights for policy 0, policy_version 41550 (0.0008) [2023-10-10 06:22:40,603][53268] Updated weights for policy 1, policy_version 41530 (0.0009) [2023-10-10 06:22:40,793][53252] Updated weights for policy 0, policy_version 41560 (0.0009) [2023-10-10 06:22:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 85098496. Throughput: 0: 1654.2, 1: 1664.9. Samples: 21280136. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) [2023-10-10 06:22:41,784][52050] Avg episode reward: [(0, '22.780'), (1, '19.480')] [2023-10-10 06:22:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000041536_42532864.pth... [2023-10-10 06:22:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000041568_42565632.pth... 
[2023-10-10 06:22:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000039968_40927232.pth [2023-10-10 06:22:41,835][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000040000_40960000.pth [2023-10-10 06:22:44,476][53268] Updated weights for policy 1, policy_version 41540 (0.0007) [2023-10-10 06:22:44,629][53252] Updated weights for policy 0, policy_version 41570 (0.0008) [2023-10-10 06:22:44,841][53268] Updated weights for policy 1, policy_version 41550 (0.0008) [2023-10-10 06:22:44,996][53252] Updated weights for policy 0, policy_version 41580 (0.0007) [2023-10-10 06:22:45,216][53268] Updated weights for policy 1, policy_version 41560 (0.0008) [2023-10-10 06:22:45,368][53252] Updated weights for policy 0, policy_version 41590 (0.0007) [2023-10-10 06:22:45,734][53252] Updated weights for policy 0, policy_version 41600 (0.0007) [2023-10-10 06:22:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 85164032. Throughput: 0: 1665.9, 1: 1682.1. Samples: 21292046. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) [2023-10-10 06:22:46,784][52050] Avg episode reward: [(0, '21.730'), (1, '19.620')] [2023-10-10 06:22:49,335][53268] Updated weights for policy 1, policy_version 41570 (0.0008) [2023-10-10 06:22:49,705][53268] Updated weights for policy 1, policy_version 41580 (0.0007) [2023-10-10 06:22:49,819][53252] Updated weights for policy 0, policy_version 41610 (0.0007) [2023-10-10 06:22:50,058][53268] Updated weights for policy 1, policy_version 41590 (0.0008) [2023-10-10 06:22:50,192][53252] Updated weights for policy 0, policy_version 41620 (0.0007) [2023-10-10 06:22:50,426][53268] Updated weights for policy 1, policy_version 41600 (0.0010) [2023-10-10 06:22:50,559][53252] Updated weights for policy 0, policy_version 41630 (0.0008) [2023-10-10 06:22:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 85229568. 
Throughput: 0: 1648.8, 1: 1664.0. Samples: 21310834. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) [2023-10-10 06:22:51,784][52050] Avg episode reward: [(0, '19.560'), (1, '18.500')] [2023-10-10 06:22:54,582][53268] Updated weights for policy 1, policy_version 41610 (0.0010) [2023-10-10 06:22:54,757][53252] Updated weights for policy 0, policy_version 41640 (0.0011) [2023-10-10 06:22:54,949][53268] Updated weights for policy 1, policy_version 41620 (0.0009) [2023-10-10 06:22:55,129][53252] Updated weights for policy 0, policy_version 41650 (0.0010) [2023-10-10 06:22:55,313][53268] Updated weights for policy 1, policy_version 41630 (0.0010) [2023-10-10 06:22:55,510][53252] Updated weights for policy 0, policy_version 41660 (0.0008) [2023-10-10 06:22:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 85295104. Throughput: 0: 1668.8, 1: 1680.1. Samples: 21330852. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) [2023-10-10 06:22:56,784][52050] Avg episode reward: [(0, '20.770'), (1, '17.710')] [2023-10-10 06:22:59,457][53268] Updated weights for policy 1, policy_version 41640 (0.0007) [2023-10-10 06:22:59,629][53252] Updated weights for policy 0, policy_version 41670 (0.0009) [2023-10-10 06:22:59,837][53268] Updated weights for policy 1, policy_version 41650 (0.0007) [2023-10-10 06:22:59,995][53252] Updated weights for policy 0, policy_version 41680 (0.0008) [2023-10-10 06:23:00,202][53268] Updated weights for policy 1, policy_version 41660 (0.0007) [2023-10-10 06:23:00,370][53252] Updated weights for policy 0, policy_version 41690 (0.0009) [2023-10-10 06:23:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85360640. Throughput: 0: 1669.3, 1: 1687.1. Samples: 21342272. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) [2023-10-10 06:23:01,784][52050] Avg episode reward: [(0, '20.450'), (1, '17.170')] [2023-10-10 06:23:04,312][53268] Updated weights for policy 1, policy_version 41670 (0.0007) [2023-10-10 06:23:04,349][53252] Updated weights for policy 0, policy_version 41700 (0.0009) [2023-10-10 06:23:04,678][53268] Updated weights for policy 1, policy_version 41680 (0.0008) [2023-10-10 06:23:04,725][53252] Updated weights for policy 0, policy_version 41710 (0.0008) [2023-10-10 06:23:05,040][53268] Updated weights for policy 1, policy_version 41690 (0.0008) [2023-10-10 06:23:05,089][53252] Updated weights for policy 0, policy_version 41720 (0.0008) [2023-10-10 06:23:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85426176. Throughput: 0: 1653.8, 1: 1662.4. Samples: 21360644. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) [2023-10-10 06:23:06,784][52050] Avg episode reward: [(0, '21.290'), (1, '17.640')] [2023-10-10 06:23:09,166][53268] Updated weights for policy 1, policy_version 41700 (0.0009) [2023-10-10 06:23:09,304][53252] Updated weights for policy 0, policy_version 41730 (0.0007) [2023-10-10 06:23:09,539][53268] Updated weights for policy 1, policy_version 41710 (0.0009) [2023-10-10 06:23:09,673][53252] Updated weights for policy 0, policy_version 41740 (0.0008) [2023-10-10 06:23:09,910][53268] Updated weights for policy 1, policy_version 41720 (0.0009) [2023-10-10 06:23:10,042][53252] Updated weights for policy 0, policy_version 41750 (0.0007) [2023-10-10 06:23:10,411][53252] Updated weights for policy 0, policy_version 41760 (0.0009) [2023-10-10 06:23:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 85491712. Throughput: 0: 1679.6, 1: 1682.4. Samples: 21381218. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) [2023-10-10 06:23:11,784][52050] Avg episode reward: [(0, '21.070'), (1, '18.110')] [2023-10-10 06:23:13,913][53268] Updated weights for policy 1, policy_version 41730 (0.0009) [2023-10-10 06:23:14,274][53268] Updated weights for policy 1, policy_version 41740 (0.0009) [2023-10-10 06:23:14,415][53252] Updated weights for policy 0, policy_version 41770 (0.0007) [2023-10-10 06:23:14,643][53268] Updated weights for policy 1, policy_version 41750 (0.0008) [2023-10-10 06:23:14,783][53252] Updated weights for policy 0, policy_version 41780 (0.0008) [2023-10-10 06:23:15,003][53268] Updated weights for policy 1, policy_version 41760 (0.0009) [2023-10-10 06:23:15,153][53252] Updated weights for policy 0, policy_version 41790 (0.0008) [2023-10-10 06:23:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85557248. Throughput: 0: 1674.9, 1: 1677.4. Samples: 21392196. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) [2023-10-10 06:23:16,784][52050] Avg episode reward: [(0, '22.030'), (1, '18.500')] [2023-10-10 06:23:18,906][53268] Updated weights for policy 1, policy_version 41770 (0.0009) [2023-10-10 06:23:19,165][53252] Updated weights for policy 0, policy_version 41800 (0.0007) [2023-10-10 06:23:19,265][53268] Updated weights for policy 1, policy_version 41780 (0.0010) [2023-10-10 06:23:19,538][53252] Updated weights for policy 0, policy_version 41810 (0.0007) [2023-10-10 06:23:19,640][53268] Updated weights for policy 1, policy_version 41790 (0.0008) [2023-10-10 06:23:19,918][53252] Updated weights for policy 0, policy_version 41820 (0.0008) [2023-10-10 06:23:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85622784. Throughput: 0: 1666.0, 1: 1666.0. Samples: 21411200. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) [2023-10-10 06:23:21,784][52050] Avg episode reward: [(0, '22.040'), (1, '19.530')] [2023-10-10 06:23:23,669][53268] Updated weights for policy 1, policy_version 41800 (0.0011) [2023-10-10 06:23:23,926][53252] Updated weights for policy 0, policy_version 41830 (0.0009) [2023-10-10 06:23:24,038][53268] Updated weights for policy 1, policy_version 41810 (0.0008) [2023-10-10 06:23:24,310][53252] Updated weights for policy 0, policy_version 41840 (0.0008) [2023-10-10 06:23:24,415][53268] Updated weights for policy 1, policy_version 41820 (0.0009) [2023-10-10 06:23:24,679][53252] Updated weights for policy 0, policy_version 41850 (0.0008) [2023-10-10 06:23:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85688320. Throughput: 0: 1688.4, 1: 1683.7. Samples: 21431880. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) [2023-10-10 06:23:26,784][52050] Avg episode reward: [(0, '20.850'), (1, '18.630')] [2023-10-10 06:23:28,549][53268] Updated weights for policy 1, policy_version 41830 (0.0008) [2023-10-10 06:23:28,694][53252] Updated weights for policy 0, policy_version 41860 (0.0008) [2023-10-10 06:23:28,918][53268] Updated weights for policy 1, policy_version 41840 (0.0008) [2023-10-10 06:23:29,065][53252] Updated weights for policy 0, policy_version 41870 (0.0009) [2023-10-10 06:23:29,283][53268] Updated weights for policy 1, policy_version 41850 (0.0009) [2023-10-10 06:23:29,444][53252] Updated weights for policy 0, policy_version 41880 (0.0007) [2023-10-10 06:23:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85753856. Throughput: 0: 1668.0, 1: 1664.1. Samples: 21441992. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:23:31,784][52050] Avg episode reward: [(0, '20.730'), (1, '17.940')] [2023-10-10 06:23:33,485][53252] Updated weights for policy 0, policy_version 41890 (0.0008) [2023-10-10 06:23:33,494][53268] Updated weights for policy 1, policy_version 41860 (0.0010) [2023-10-10 06:23:33,854][53252] Updated weights for policy 0, policy_version 41900 (0.0008) [2023-10-10 06:23:33,860][53268] Updated weights for policy 1, policy_version 41870 (0.0008) [2023-10-10 06:23:34,220][53268] Updated weights for policy 1, policy_version 41880 (0.0008) [2023-10-10 06:23:34,226][53252] Updated weights for policy 0, policy_version 41910 (0.0009) [2023-10-10 06:23:34,598][53252] Updated weights for policy 0, policy_version 41920 (0.0008) [2023-10-10 06:23:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85819392. Throughput: 0: 1677.4, 1: 1675.7. Samples: 21461726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:23:36,784][52050] Avg episode reward: [(0, '20.990'), (1, '18.740')] [2023-10-10 06:23:38,273][53268] Updated weights for policy 1, policy_version 41890 (0.0009) [2023-10-10 06:23:38,638][53268] Updated weights for policy 1, policy_version 41900 (0.0008) [2023-10-10 06:23:38,723][53252] Updated weights for policy 0, policy_version 41930 (0.0008) [2023-10-10 06:23:39,003][53268] Updated weights for policy 1, policy_version 41910 (0.0007) [2023-10-10 06:23:39,098][53252] Updated weights for policy 0, policy_version 41940 (0.0008) [2023-10-10 06:23:39,367][53268] Updated weights for policy 1, policy_version 41920 (0.0008) [2023-10-10 06:23:39,465][53252] Updated weights for policy 0, policy_version 41950 (0.0009) [2023-10-10 06:23:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85884928. Throughput: 0: 1686.9, 1: 1685.2. Samples: 21482600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:23:41,784][52050] Avg episode reward: [(0, '20.270'), (1, '17.940')] [2023-10-10 06:23:43,226][53268] Updated weights for policy 1, policy_version 41930 (0.0009) [2023-10-10 06:23:43,370][53252] Updated weights for policy 0, policy_version 41960 (0.0008) [2023-10-10 06:23:43,601][53268] Updated weights for policy 1, policy_version 41940 (0.0008) [2023-10-10 06:23:43,737][53252] Updated weights for policy 0, policy_version 41970 (0.0007) [2023-10-10 06:23:43,965][53268] Updated weights for policy 1, policy_version 41950 (0.0010) [2023-10-10 06:23:44,112][53252] Updated weights for policy 0, policy_version 41980 (0.0010) [2023-10-10 06:23:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85950464. Throughput: 0: 1663.0, 1: 1661.6. Samples: 21491876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:23:46,784][52050] Avg episode reward: [(0, '20.420'), (1, '18.420')] [2023-10-10 06:23:48,032][53268] Updated weights for policy 1, policy_version 41960 (0.0008) [2023-10-10 06:23:48,262][53252] Updated weights for policy 0, policy_version 41990 (0.0009) [2023-10-10 06:23:48,397][53268] Updated weights for policy 1, policy_version 41970 (0.0008) [2023-10-10 06:23:48,638][53252] Updated weights for policy 0, policy_version 42000 (0.0009) [2023-10-10 06:23:48,765][53268] Updated weights for policy 1, policy_version 41980 (0.0008) [2023-10-10 06:23:48,997][53252] Updated weights for policy 0, policy_version 42010 (0.0009) [2023-10-10 06:23:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 86016000. Throughput: 0: 1690.3, 1: 1686.5. Samples: 21512602. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:23:51,785][52050] Avg episode reward: [(0, '21.200'), (1, '19.070')] [2023-10-10 06:23:52,988][53268] Updated weights for policy 1, policy_version 41990 (0.0010) [2023-10-10 06:23:53,128][53252] Updated weights for policy 0, policy_version 42020 (0.0008) [2023-10-10 06:23:53,373][53268] Updated weights for policy 1, policy_version 42000 (0.0010) [2023-10-10 06:23:53,502][53252] Updated weights for policy 0, policy_version 42030 (0.0008) [2023-10-10 06:23:53,744][53268] Updated weights for policy 1, policy_version 42010 (0.0007) [2023-10-10 06:23:53,862][53252] Updated weights for policy 0, policy_version 42040 (0.0007) [2023-10-10 06:23:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86081536. Throughput: 0: 1695.2, 1: 1683.6. Samples: 21533264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:23:56,784][52050] Avg episode reward: [(0, '19.260'), (1, '19.140')] [2023-10-10 06:23:57,747][53268] Updated weights for policy 1, policy_version 42020 (0.0007) [2023-10-10 06:23:57,881][53252] Updated weights for policy 0, policy_version 42050 (0.0007) [2023-10-10 06:23:58,111][53268] Updated weights for policy 1, policy_version 42030 (0.0009) [2023-10-10 06:23:58,253][53252] Updated weights for policy 0, policy_version 42060 (0.0007) [2023-10-10 06:23:58,481][53268] Updated weights for policy 1, policy_version 42040 (0.0009) [2023-10-10 06:23:58,630][53252] Updated weights for policy 0, policy_version 42070 (0.0007) [2023-10-10 06:23:58,997][53252] Updated weights for policy 0, policy_version 42080 (0.0008) [2023-10-10 06:24:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86147072. Throughput: 0: 1668.9, 1: 1668.6. Samples: 21542382. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:01,784][52050] Avg episode reward: [(0, '20.970'), (1, '19.750')] [2023-10-10 06:24:02,594][53268] Updated weights for policy 1, policy_version 42050 (0.0010) [2023-10-10 06:24:02,960][53268] Updated weights for policy 1, policy_version 42060 (0.0008) [2023-10-10 06:24:03,023][53252] Updated weights for policy 0, policy_version 42090 (0.0008) [2023-10-10 06:24:03,324][53268] Updated weights for policy 1, policy_version 42070 (0.0008) [2023-10-10 06:24:03,399][53252] Updated weights for policy 0, policy_version 42100 (0.0008) [2023-10-10 06:24:03,687][53268] Updated weights for policy 1, policy_version 42080 (0.0009) [2023-10-10 06:24:03,765][53252] Updated weights for policy 0, policy_version 42110 (0.0010) [2023-10-10 06:24:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86212608. Throughput: 0: 1693.2, 1: 1687.9. Samples: 21563352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:06,784][52050] Avg episode reward: [(0, '21.110'), (1, '19.740')] [2023-10-10 06:24:07,794][53268] Updated weights for policy 1, policy_version 42090 (0.0007) [2023-10-10 06:24:08,037][53252] Updated weights for policy 0, policy_version 42120 (0.0008) [2023-10-10 06:24:08,160][53268] Updated weights for policy 1, policy_version 42100 (0.0008) [2023-10-10 06:24:08,414][53252] Updated weights for policy 0, policy_version 42130 (0.0007) [2023-10-10 06:24:08,532][53268] Updated weights for policy 1, policy_version 42110 (0.0009) [2023-10-10 06:24:08,778][53252] Updated weights for policy 0, policy_version 42140 (0.0009) [2023-10-10 06:24:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 86278144. Throughput: 0: 1688.5, 1: 1688.4. Samples: 21583840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:11,784][52050] Avg episode reward: [(0, '20.110'), (1, '19.290')] [2023-10-10 06:24:12,657][53268] Updated weights for policy 1, policy_version 42120 (0.0010) [2023-10-10 06:24:12,789][53252] Updated weights for policy 0, policy_version 42150 (0.0007) [2023-10-10 06:24:13,026][53268] Updated weights for policy 1, policy_version 42130 (0.0009) [2023-10-10 06:24:13,174][53252] Updated weights for policy 0, policy_version 42160 (0.0009) [2023-10-10 06:24:13,390][53268] Updated weights for policy 1, policy_version 42140 (0.0009) [2023-10-10 06:24:13,551][53252] Updated weights for policy 0, policy_version 42170 (0.0008) [2023-10-10 06:24:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86343680. Throughput: 0: 1675.8, 1: 1675.5. Samples: 21592802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:16,784][52050] Avg episode reward: [(0, '20.130'), (1, '18.190')] [2023-10-10 06:24:17,490][53268] Updated weights for policy 1, policy_version 42150 (0.0008) [2023-10-10 06:24:17,509][53252] Updated weights for policy 0, policy_version 42180 (0.0010) [2023-10-10 06:24:17,854][53268] Updated weights for policy 1, policy_version 42160 (0.0009) [2023-10-10 06:24:17,877][53252] Updated weights for policy 0, policy_version 42190 (0.0007) [2023-10-10 06:24:18,219][53268] Updated weights for policy 1, policy_version 42170 (0.0007) [2023-10-10 06:24:18,252][53252] Updated weights for policy 0, policy_version 42200 (0.0007) [2023-10-10 06:24:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86409216. Throughput: 0: 1692.0, 1: 1681.8. Samples: 21613548. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:21,784][52050] Avg episode reward: [(0, '20.380'), (1, '18.300')] [2023-10-10 06:24:22,257][53252] Updated weights for policy 0, policy_version 42210 (0.0009) [2023-10-10 06:24:22,351][53268] Updated weights for policy 1, policy_version 42180 (0.0011) [2023-10-10 06:24:22,619][53252] Updated weights for policy 0, policy_version 42220 (0.0007) [2023-10-10 06:24:22,719][53268] Updated weights for policy 1, policy_version 42190 (0.0011) [2023-10-10 06:24:22,988][53252] Updated weights for policy 0, policy_version 42230 (0.0007) [2023-10-10 06:24:23,079][53268] Updated weights for policy 1, policy_version 42200 (0.0009) [2023-10-10 06:24:23,349][53252] Updated weights for policy 0, policy_version 42240 (0.0007) [2023-10-10 06:24:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86474752. Throughput: 0: 1690.2, 1: 1677.6. Samples: 21634152. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:26,784][52050] Avg episode reward: [(0, '19.780'), (1, '18.290')] [2023-10-10 06:24:27,100][53268] Updated weights for policy 1, policy_version 42210 (0.0009) [2023-10-10 06:24:27,379][53252] Updated weights for policy 0, policy_version 42250 (0.0009) [2023-10-10 06:24:27,463][53268] Updated weights for policy 1, policy_version 42220 (0.0007) [2023-10-10 06:24:27,741][53252] Updated weights for policy 0, policy_version 42260 (0.0008) [2023-10-10 06:24:27,830][53268] Updated weights for policy 1, policy_version 42230 (0.0009) [2023-10-10 06:24:28,115][53252] Updated weights for policy 0, policy_version 42270 (0.0008) [2023-10-10 06:24:28,191][53268] Updated weights for policy 1, policy_version 42240 (0.0010) [2023-10-10 06:24:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 86540288. Throughput: 0: 1689.1, 1: 1671.9. Samples: 21643122. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:31,784][52050] Avg episode reward: [(0, '19.840'), (1, '19.430')] [2023-10-10 06:24:32,063][53252] Updated weights for policy 0, policy_version 42280 (0.0008) [2023-10-10 06:24:32,403][53268] Updated weights for policy 1, policy_version 42250 (0.0007) [2023-10-10 06:24:32,427][53252] Updated weights for policy 0, policy_version 42290 (0.0010) [2023-10-10 06:24:32,763][53268] Updated weights for policy 1, policy_version 42260 (0.0009) [2023-10-10 06:24:32,803][53252] Updated weights for policy 0, policy_version 42300 (0.0009) [2023-10-10 06:24:33,129][53268] Updated weights for policy 1, policy_version 42270 (0.0008) [2023-10-10 06:24:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86605824. Throughput: 0: 1688.2, 1: 1672.5. Samples: 21663834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:36,784][52050] Avg episode reward: [(0, '20.240'), (1, '20.370')] [2023-10-10 06:24:36,827][53252] Updated weights for policy 0, policy_version 42310 (0.0007) [2023-10-10 06:24:37,192][53268] Updated weights for policy 1, policy_version 42280 (0.0008) [2023-10-10 06:24:37,192][53252] Updated weights for policy 0, policy_version 42320 (0.0008) [2023-10-10 06:24:37,555][53268] Updated weights for policy 1, policy_version 42290 (0.0008) [2023-10-10 06:24:37,570][53252] Updated weights for policy 0, policy_version 42330 (0.0009) [2023-10-10 06:24:37,925][53268] Updated weights for policy 1, policy_version 42300 (0.0009) [2023-10-10 06:24:41,730][53252] Updated weights for policy 0, policy_version 42340 (0.0009) [2023-10-10 06:24:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86671360. Throughput: 0: 1683.5, 1: 1672.9. Samples: 21684300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:41,784][52050] Avg episode reward: [(0, '20.320'), (1, '21.160')] [2023-10-10 06:24:42,087][53268] Updated weights for policy 1, policy_version 42310 (0.0009) [2023-10-10 06:24:42,098][53252] Updated weights for policy 0, policy_version 42350 (0.0009) [2023-10-10 06:24:42,474][53268] Updated weights for policy 1, policy_version 42320 (0.0007) [2023-10-10 06:24:42,480][53252] Updated weights for policy 0, policy_version 42360 (0.0009) [2023-10-10 06:24:42,766][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000042368_43384832.pth... [2023-10-10 06:24:42,803][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000040800_41779200.pth [2023-10-10 06:24:42,852][53268] Updated weights for policy 1, policy_version 42330 (0.0009) [2023-10-10 06:24:43,065][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000042336_43352064.pth... [2023-10-10 06:24:43,096][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000040736_41713664.pth [2023-10-10 06:24:46,544][53252] Updated weights for policy 0, policy_version 42370 (0.0008) [2023-10-10 06:24:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86736896. Throughput: 0: 1687.1, 1: 1667.4. Samples: 21693336. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:46,784][52050] Avg episode reward: [(0, '21.050'), (1, '20.360')] [2023-10-10 06:24:46,793][53268] Updated weights for policy 1, policy_version 42340 (0.0009) [2023-10-10 06:24:46,912][53252] Updated weights for policy 0, policy_version 42380 (0.0008) [2023-10-10 06:24:47,153][53268] Updated weights for policy 1, policy_version 42350 (0.0007) [2023-10-10 06:24:47,289][53252] Updated weights for policy 0, policy_version 42390 (0.0009) [2023-10-10 06:24:47,522][53268] Updated weights for policy 1, policy_version 42360 (0.0009) [2023-10-10 06:24:47,656][53252] Updated weights for policy 0, policy_version 42400 (0.0008) [2023-10-10 06:24:51,731][53252] Updated weights for policy 0, policy_version 42410 (0.0008) [2023-10-10 06:24:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 86802432. Throughput: 0: 1688.2, 1: 1661.2. Samples: 21714074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:51,784][52050] Avg episode reward: [(0, '21.080'), (1, '18.740')] [2023-10-10 06:24:51,805][53268] Updated weights for policy 1, policy_version 42370 (0.0010) [2023-10-10 06:24:52,090][53252] Updated weights for policy 0, policy_version 42420 (0.0007) [2023-10-10 06:24:52,180][53268] Updated weights for policy 1, policy_version 42380 (0.0008) [2023-10-10 06:24:52,465][53252] Updated weights for policy 0, policy_version 42430 (0.0008) [2023-10-10 06:24:52,541][53268] Updated weights for policy 1, policy_version 42390 (0.0007) [2023-10-10 06:24:52,899][53268] Updated weights for policy 1, policy_version 42400 (0.0007) [2023-10-10 06:24:56,508][53252] Updated weights for policy 0, policy_version 42440 (0.0010) [2023-10-10 06:24:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86867968. Throughput: 0: 1687.4, 1: 1662.9. Samples: 21734600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:24:56,784][52050] Avg episode reward: [(0, '21.900'), (1, '19.200')] [2023-10-10 06:24:56,888][53252] Updated weights for policy 0, policy_version 42450 (0.0008) [2023-10-10 06:24:56,956][53268] Updated weights for policy 1, policy_version 42410 (0.0008) [2023-10-10 06:24:57,250][53252] Updated weights for policy 0, policy_version 42460 (0.0008) [2023-10-10 06:24:57,315][53268] Updated weights for policy 1, policy_version 42420 (0.0008) [2023-10-10 06:24:57,680][53268] Updated weights for policy 1, policy_version 42430 (0.0009) [2023-10-10 06:25:01,363][53252] Updated weights for policy 0, policy_version 42470 (0.0008) [2023-10-10 06:25:01,745][53252] Updated weights for policy 0, policy_version 42480 (0.0009) [2023-10-10 06:25:01,783][53268] Updated weights for policy 1, policy_version 42440 (0.0008) [2023-10-10 06:25:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86933504. Throughput: 0: 1695.5, 1: 1662.1. Samples: 21743892. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) [2023-10-10 06:25:01,784][52050] Avg episode reward: [(0, '20.110'), (1, '20.130')] [2023-10-10 06:25:02,112][53252] Updated weights for policy 0, policy_version 42490 (0.0009) [2023-10-10 06:25:02,151][53268] Updated weights for policy 1, policy_version 42450 (0.0008) [2023-10-10 06:25:02,519][53268] Updated weights for policy 1, policy_version 42460 (0.0007) [2023-10-10 06:25:06,059][53252] Updated weights for policy 0, policy_version 42500 (0.0008) [2023-10-10 06:25:06,430][53252] Updated weights for policy 0, policy_version 42510 (0.0007) [2023-10-10 06:25:06,616][53268] Updated weights for policy 1, policy_version 42470 (0.0008) [2023-10-10 06:25:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 86999040. Throughput: 0: 1696.3, 1: 1664.3. Samples: 21764776. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) [2023-10-10 06:25:06,784][52050] Avg episode reward: [(0, '20.490'), (1, '20.070')] [2023-10-10 06:25:06,801][53252] Updated weights for policy 0, policy_version 42520 (0.0007) [2023-10-10 06:25:06,982][53268] Updated weights for policy 1, policy_version 42480 (0.0009) [2023-10-10 06:25:07,344][53268] Updated weights for policy 1, policy_version 42490 (0.0010) [2023-10-10 06:25:10,921][53252] Updated weights for policy 0, policy_version 42530 (0.0007) [2023-10-10 06:25:11,296][53252] Updated weights for policy 0, policy_version 42540 (0.0007) [2023-10-10 06:25:11,623][53268] Updated weights for policy 1, policy_version 42500 (0.0009) [2023-10-10 06:25:11,659][53252] Updated weights for policy 0, policy_version 42550 (0.0007) [2023-10-10 06:25:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87064576. Throughput: 0: 1683.8, 1: 1664.8. Samples: 21784838. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) [2023-10-10 06:25:11,784][52050] Avg episode reward: [(0, '20.030'), (1, '19.610')] [2023-10-10 06:25:11,982][53268] Updated weights for policy 1, policy_version 42510 (0.0009) [2023-10-10 06:25:12,031][53252] Updated weights for policy 0, policy_version 42560 (0.0007) [2023-10-10 06:25:12,359][53268] Updated weights for policy 1, policy_version 42520 (0.0008) [2023-10-10 06:25:16,051][53252] Updated weights for policy 0, policy_version 42570 (0.0010) [2023-10-10 06:25:16,418][53252] Updated weights for policy 0, policy_version 42580 (0.0009) [2023-10-10 06:25:16,535][53268] Updated weights for policy 1, policy_version 42530 (0.0008) [2023-10-10 06:25:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 87130112. Throughput: 0: 1693.6, 1: 1666.0. Samples: 21794304. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) [2023-10-10 06:25:16,784][53252] Updated weights for policy 0, policy_version 42590 (0.0009) [2023-10-10 06:25:16,784][52050] Avg episode reward: [(0, '23.020'), (1, '19.630')] [2023-10-10 06:25:16,853][52846] Saving new best policy, reward=23.020! [2023-10-10 06:25:16,888][53268] Updated weights for policy 1, policy_version 42540 (0.0007) [2023-10-10 06:25:17,258][53268] Updated weights for policy 1, policy_version 42550 (0.0007) [2023-10-10 06:25:17,621][53268] Updated weights for policy 1, policy_version 42560 (0.0009) [2023-10-10 06:25:20,950][53252] Updated weights for policy 0, policy_version 42600 (0.0008) [2023-10-10 06:25:21,317][53252] Updated weights for policy 0, policy_version 42610 (0.0007) [2023-10-10 06:25:21,702][53252] Updated weights for policy 0, policy_version 42620 (0.0007) [2023-10-10 06:25:21,725][53268] Updated weights for policy 1, policy_version 42570 (0.0010) [2023-10-10 06:25:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 87195648. Throughput: 0: 1692.6, 1: 1664.0. Samples: 21814880. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) [2023-10-10 06:25:21,784][52050] Avg episode reward: [(0, '21.810'), (1, '18.770')] [2023-10-10 06:25:22,090][53268] Updated weights for policy 1, policy_version 42580 (0.0008) [2023-10-10 06:25:22,455][53268] Updated weights for policy 1, policy_version 42590 (0.0011) [2023-10-10 06:25:25,694][53252] Updated weights for policy 0, policy_version 42630 (0.0007) [2023-10-10 06:25:26,067][53252] Updated weights for policy 0, policy_version 42640 (0.0007) [2023-10-10 06:25:26,444][53252] Updated weights for policy 0, policy_version 42650 (0.0008) [2023-10-10 06:25:26,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87293952. Throughput: 0: 1675.7, 1: 1662.9. Samples: 21834538. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 06:25:26,784][52050] Avg episode reward: [(0, '20.420'), (1, '18.050')] [2023-10-10 06:25:26,804][53268] Updated weights for policy 1, policy_version 42600 (0.0010) [2023-10-10 06:25:27,186][53268] Updated weights for policy 1, policy_version 42610 (0.0009) [2023-10-10 06:25:27,554][53268] Updated weights for policy 1, policy_version 42620 (0.0011) [2023-10-10 06:25:30,381][53252] Updated weights for policy 0, policy_version 42660 (0.0009) [2023-10-10 06:25:30,748][53252] Updated weights for policy 0, policy_version 42670 (0.0007) [2023-10-10 06:25:31,118][53252] Updated weights for policy 0, policy_version 42680 (0.0007) [2023-10-10 06:25:31,678][53268] Updated weights for policy 1, policy_version 42630 (0.0008) [2023-10-10 06:25:31,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87359488. Throughput: 0: 1702.8, 1: 1658.5. Samples: 21844594. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 06:25:31,784][52050] Avg episode reward: [(0, '19.290'), (1, '19.010')] [2023-10-10 06:25:32,051][53268] Updated weights for policy 1, policy_version 42640 (0.0008) [2023-10-10 06:25:32,429][53268] Updated weights for policy 1, policy_version 42650 (0.0007) [2023-10-10 06:25:34,936][53252] Updated weights for policy 0, policy_version 42690 (0.0009) [2023-10-10 06:25:35,300][53252] Updated weights for policy 0, policy_version 42700 (0.0008) [2023-10-10 06:25:35,666][53252] Updated weights for policy 0, policy_version 42710 (0.0008) [2023-10-10 06:25:36,038][53252] Updated weights for policy 0, policy_version 42720 (0.0009) [2023-10-10 06:25:36,458][53268] Updated weights for policy 1, policy_version 42660 (0.0009) [2023-10-10 06:25:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87425024. Throughput: 0: 1697.6, 1: 1659.6. Samples: 21865144. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 06:25:36,784][52050] Avg episode reward: [(0, '19.160'), (1, '20.450')] [2023-10-10 06:25:36,824][53268] Updated weights for policy 1, policy_version 42670 (0.0008) [2023-10-10 06:25:37,191][53268] Updated weights for policy 1, policy_version 42680 (0.0008) [2023-10-10 06:25:40,079][53252] Updated weights for policy 0, policy_version 42730 (0.0008) [2023-10-10 06:25:40,452][53252] Updated weights for policy 0, policy_version 42740 (0.0007) [2023-10-10 06:25:40,813][53252] Updated weights for policy 0, policy_version 42750 (0.0007) [2023-10-10 06:25:41,304][53268] Updated weights for policy 1, policy_version 42690 (0.0009) [2023-10-10 06:25:41,675][53268] Updated weights for policy 1, policy_version 42700 (0.0007) [2023-10-10 06:25:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87490560. Throughput: 0: 1686.7, 1: 1659.6. Samples: 21885186. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 06:25:41,784][52050] Avg episode reward: [(0, '19.830'), (1, '20.190')] [2023-10-10 06:25:42,040][53268] Updated weights for policy 1, policy_version 42710 (0.0008) [2023-10-10 06:25:42,418][53268] Updated weights for policy 1, policy_version 42720 (0.0009) [2023-10-10 06:25:44,659][53252] Updated weights for policy 0, policy_version 42760 (0.0010) [2023-10-10 06:25:45,018][53252] Updated weights for policy 0, policy_version 42770 (0.0010) [2023-10-10 06:25:45,392][53252] Updated weights for policy 0, policy_version 42780 (0.0009) [2023-10-10 06:25:46,330][53268] Updated weights for policy 1, policy_version 42730 (0.0009) [2023-10-10 06:25:46,704][53268] Updated weights for policy 1, policy_version 42740 (0.0008) [2023-10-10 06:25:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87556096. Throughput: 0: 1712.0, 1: 1659.6. Samples: 21895616. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 06:25:46,784][52050] Avg episode reward: [(0, '20.820'), (1, '20.590')] [2023-10-10 06:25:47,078][53268] Updated weights for policy 1, policy_version 42750 (0.0010) [2023-10-10 06:25:49,587][53252] Updated weights for policy 0, policy_version 42790 (0.0008) [2023-10-10 06:25:49,982][53252] Updated weights for policy 0, policy_version 42800 (0.0010) [2023-10-10 06:25:50,354][53252] Updated weights for policy 0, policy_version 42810 (0.0008) [2023-10-10 06:25:51,277][53268] Updated weights for policy 1, policy_version 42760 (0.0010) [2023-10-10 06:25:51,648][53268] Updated weights for policy 1, policy_version 42770 (0.0010) [2023-10-10 06:25:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87621632. Throughput: 0: 1681.3, 1: 1661.9. Samples: 21915222. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-10 06:25:51,784][52050] Avg episode reward: [(0, '20.580'), (1, '21.160')] [2023-10-10 06:25:52,015][53268] Updated weights for policy 1, policy_version 42780 (0.0009) [2023-10-10 06:25:54,436][53252] Updated weights for policy 0, policy_version 42820 (0.0009) [2023-10-10 06:25:54,800][53252] Updated weights for policy 0, policy_version 42830 (0.0008) [2023-10-10 06:25:55,165][53252] Updated weights for policy 0, policy_version 42840 (0.0009) [2023-10-10 06:25:56,087][53268] Updated weights for policy 1, policy_version 42790 (0.0010) [2023-10-10 06:25:56,451][53268] Updated weights for policy 1, policy_version 42800 (0.0008) [2023-10-10 06:25:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87687168. Throughput: 0: 1687.0, 1: 1655.6. Samples: 21935254. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 06:25:56,784][52050] Avg episode reward: [(0, '19.940'), (1, '21.670')] [2023-10-10 06:25:56,823][53268] Updated weights for policy 1, policy_version 42810 (0.0009) [2023-10-10 06:25:59,246][53252] Updated weights for policy 0, policy_version 42850 (0.0007) [2023-10-10 06:25:59,608][53252] Updated weights for policy 0, policy_version 42860 (0.0007) [2023-10-10 06:25:59,976][53252] Updated weights for policy 0, policy_version 42870 (0.0008) [2023-10-10 06:26:00,351][53252] Updated weights for policy 0, policy_version 42880 (0.0010) [2023-10-10 06:26:00,782][53268] Updated weights for policy 1, policy_version 42820 (0.0009) [2023-10-10 06:26:01,140][53268] Updated weights for policy 1, policy_version 42830 (0.0008) [2023-10-10 06:26:01,521][53268] Updated weights for policy 1, policy_version 42840 (0.0010) [2023-10-10 06:26:01,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87752704. Throughput: 0: 1702.2, 1: 1665.6. Samples: 21945856. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 06:26:01,784][52050] Avg episode reward: [(0, '19.260'), (1, '19.700')] [2023-10-10 06:26:04,441][53252] Updated weights for policy 0, policy_version 42890 (0.0010) [2023-10-10 06:26:04,814][53252] Updated weights for policy 0, policy_version 42900 (0.0009) [2023-10-10 06:26:05,186][53252] Updated weights for policy 0, policy_version 42910 (0.0010) [2023-10-10 06:26:05,426][53268] Updated weights for policy 1, policy_version 42850 (0.0009) [2023-10-10 06:26:05,784][53268] Updated weights for policy 1, policy_version 42860 (0.0010) [2023-10-10 06:26:06,160][53268] Updated weights for policy 1, policy_version 42870 (0.0009) [2023-10-10 06:26:06,530][53268] Updated weights for policy 1, policy_version 42880 (0.0007) [2023-10-10 06:26:06,784][52050] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 87851008. Throughput: 0: 1679.0, 1: 1674.9. 
Samples: 21965808. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 06:26:06,785][52050] Avg episode reward: [(0, '18.580'), (1, '20.400')] [2023-10-10 06:26:09,128][53252] Updated weights for policy 0, policy_version 42920 (0.0007) [2023-10-10 06:26:09,496][53252] Updated weights for policy 0, policy_version 42930 (0.0009) [2023-10-10 06:26:09,867][53252] Updated weights for policy 0, policy_version 42940 (0.0009) [2023-10-10 06:26:10,678][53268] Updated weights for policy 1, policy_version 42890 (0.0009) [2023-10-10 06:26:11,043][53268] Updated weights for policy 1, policy_version 42900 (0.0008) [2023-10-10 06:26:11,415][53268] Updated weights for policy 1, policy_version 42910 (0.0008) [2023-10-10 06:26:11,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 87916544. Throughput: 0: 1698.0, 1: 1659.2. Samples: 21985610. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 06:26:11,784][52050] Avg episode reward: [(0, '18.720'), (1, '19.470')] [2023-10-10 06:26:13,914][53252] Updated weights for policy 0, policy_version 42950 (0.0008) [2023-10-10 06:26:14,285][53252] Updated weights for policy 0, policy_version 42960 (0.0010) [2023-10-10 06:26:14,655][53252] Updated weights for policy 0, policy_version 42970 (0.0008) [2023-10-10 06:26:15,627][53268] Updated weights for policy 1, policy_version 42920 (0.0008) [2023-10-10 06:26:16,011][53268] Updated weights for policy 1, policy_version 42930 (0.0010) [2023-10-10 06:26:16,383][53268] Updated weights for policy 1, policy_version 42940 (0.0009) [2023-10-10 06:26:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 87982080. Throughput: 0: 1681.6, 1: 1686.0. Samples: 21996138. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 06:26:16,784][52050] Avg episode reward: [(0, '20.410'), (1, '17.810')] [2023-10-10 06:26:18,645][53252] Updated weights for policy 0, policy_version 42980 (0.0009) [2023-10-10 06:26:18,998][53252] Updated weights for policy 0, policy_version 42990 (0.0008) [2023-10-10 06:26:19,368][53252] Updated weights for policy 0, policy_version 43000 (0.0007) [2023-10-10 06:26:20,422][53268] Updated weights for policy 1, policy_version 42950 (0.0009) [2023-10-10 06:26:20,787][53268] Updated weights for policy 1, policy_version 42960 (0.0011) [2023-10-10 06:26:21,157][53268] Updated weights for policy 1, policy_version 42970 (0.0011) [2023-10-10 06:26:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 88047616. Throughput: 0: 1669.3, 1: 1683.3. Samples: 22016012. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-10 06:26:21,784][52050] Avg episode reward: [(0, '20.850'), (1, '18.440')] [2023-10-10 06:26:23,255][53252] Updated weights for policy 0, policy_version 43010 (0.0007) [2023-10-10 06:26:23,627][53252] Updated weights for policy 0, policy_version 43020 (0.0008) [2023-10-10 06:26:24,000][53252] Updated weights for policy 0, policy_version 43030 (0.0008) [2023-10-10 06:26:24,366][53252] Updated weights for policy 0, policy_version 43040 (0.0007) [2023-10-10 06:26:25,191][53268] Updated weights for policy 1, policy_version 42980 (0.0008) [2023-10-10 06:26:25,561][53268] Updated weights for policy 1, policy_version 42990 (0.0007) [2023-10-10 06:26:25,926][53268] Updated weights for policy 1, policy_version 43000 (0.0009) [2023-10-10 06:26:26,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 88113152. Throughput: 0: 1689.5, 1: 1657.2. Samples: 22035788. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:26:26,784][52050] Avg episode reward: [(0, '21.500'), (1, '20.070')] [2023-10-10 06:26:28,431][53252] Updated weights for policy 0, policy_version 43050 (0.0009) [2023-10-10 06:26:28,802][53252] Updated weights for policy 0, policy_version 43060 (0.0009) [2023-10-10 06:26:29,172][53252] Updated weights for policy 0, policy_version 43070 (0.0010) [2023-10-10 06:26:29,805][53268] Updated weights for policy 1, policy_version 43010 (0.0008) [2023-10-10 06:26:30,174][53268] Updated weights for policy 1, policy_version 43020 (0.0008) [2023-10-10 06:26:30,535][53268] Updated weights for policy 1, policy_version 43030 (0.0009) [2023-10-10 06:26:30,905][53268] Updated weights for policy 1, policy_version 43040 (0.0009) [2023-10-10 06:26:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 88178688. Throughput: 0: 1658.5, 1: 1689.8. Samples: 22046290. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:26:31,785][52050] Avg episode reward: [(0, '21.350'), (1, '20.830')] [2023-10-10 06:26:33,413][53252] Updated weights for policy 0, policy_version 43080 (0.0009) [2023-10-10 06:26:33,774][53252] Updated weights for policy 0, policy_version 43090 (0.0010) [2023-10-10 06:26:34,156][53252] Updated weights for policy 0, policy_version 43100 (0.0007) [2023-10-10 06:26:34,990][53268] Updated weights for policy 1, policy_version 43050 (0.0010) [2023-10-10 06:26:35,366][53268] Updated weights for policy 1, policy_version 43060 (0.0008) [2023-10-10 06:26:35,728][53268] Updated weights for policy 1, policy_version 43070 (0.0009) [2023-10-10 06:26:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 88244224. Throughput: 0: 1685.1, 1: 1674.2. Samples: 22066388. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:26:36,784][52050] Avg episode reward: [(0, '20.710'), (1, '20.430')] [2023-10-10 06:26:38,311][53252] Updated weights for policy 0, policy_version 43110 (0.0009) [2023-10-10 06:26:38,703][53252] Updated weights for policy 0, policy_version 43120 (0.0008) [2023-10-10 06:26:39,073][53252] Updated weights for policy 0, policy_version 43130 (0.0007) [2023-10-10 06:26:39,742][53268] Updated weights for policy 1, policy_version 43080 (0.0009) [2023-10-10 06:26:40,117][53268] Updated weights for policy 1, policy_version 43090 (0.0009) [2023-10-10 06:26:40,484][53268] Updated weights for policy 1, policy_version 43100 (0.0007) [2023-10-10 06:26:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 88309760. Throughput: 0: 1686.6, 1: 1669.9. Samples: 22086294. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:26:41,785][52050] Avg episode reward: [(0, '21.340'), (1, '21.690')] [2023-10-10 06:26:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000043136_44171264.pth... [2023-10-10 06:26:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000043104_44138496.pth... 
[2023-10-10 06:26:41,824][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000041568_42565632.pth [2023-10-10 06:26:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000041536_42532864.pth [2023-10-10 06:26:43,293][53252] Updated weights for policy 0, policy_version 43140 (0.0010) [2023-10-10 06:26:43,669][53252] Updated weights for policy 0, policy_version 43150 (0.0009) [2023-10-10 06:26:44,042][53252] Updated weights for policy 0, policy_version 43160 (0.0008) [2023-10-10 06:26:44,752][53268] Updated weights for policy 1, policy_version 43110 (0.0008) [2023-10-10 06:26:45,113][53268] Updated weights for policy 1, policy_version 43120 (0.0009) [2023-10-10 06:26:45,477][53268] Updated weights for policy 1, policy_version 43130 (0.0007) [2023-10-10 06:26:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 88375296. Throughput: 0: 1659.9, 1: 1689.0. Samples: 22096554. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:26:46,784][52050] Avg episode reward: [(0, '22.280'), (1, '19.180')] [2023-10-10 06:26:48,155][53252] Updated weights for policy 0, policy_version 43170 (0.0008) [2023-10-10 06:26:48,518][53252] Updated weights for policy 0, policy_version 43180 (0.0008) [2023-10-10 06:26:48,889][53252] Updated weights for policy 0, policy_version 43190 (0.0007) [2023-10-10 06:26:49,267][53252] Updated weights for policy 0, policy_version 43200 (0.0009) [2023-10-10 06:26:49,642][53268] Updated weights for policy 1, policy_version 43140 (0.0009) [2023-10-10 06:26:50,010][53268] Updated weights for policy 1, policy_version 43150 (0.0007) [2023-10-10 06:26:50,382][53268] Updated weights for policy 1, policy_version 43160 (0.0007) [2023-10-10 06:26:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88440832. Throughput: 0: 1682.9, 1: 1672.7. Samples: 22116810. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 06:26:51,784][52050] Avg episode reward: [(0, '21.930'), (1, '17.360')] [2023-10-10 06:26:53,233][53252] Updated weights for policy 0, policy_version 43210 (0.0008) [2023-10-10 06:26:53,603][53252] Updated weights for policy 0, policy_version 43220 (0.0009) [2023-10-10 06:26:53,980][53252] Updated weights for policy 0, policy_version 43230 (0.0009) [2023-10-10 06:26:54,342][53268] Updated weights for policy 1, policy_version 43170 (0.0009) [2023-10-10 06:26:54,717][53268] Updated weights for policy 1, policy_version 43180 (0.0007) [2023-10-10 06:26:55,080][53268] Updated weights for policy 1, policy_version 43190 (0.0010) [2023-10-10 06:26:55,441][53268] Updated weights for policy 1, policy_version 43200 (0.0007) [2023-10-10 06:26:56,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 88506368. Throughput: 0: 1684.5, 1: 1682.1. Samples: 22137106. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 06:26:56,785][52050] Avg episode reward: [(0, '21.970'), (1, '20.010')] [2023-10-10 06:26:58,038][53252] Updated weights for policy 0, policy_version 43240 (0.0010) [2023-10-10 06:26:58,412][53252] Updated weights for policy 0, policy_version 43250 (0.0008) [2023-10-10 06:26:58,784][53252] Updated weights for policy 0, policy_version 43260 (0.0007) [2023-10-10 06:26:59,556][53268] Updated weights for policy 1, policy_version 43210 (0.0010) [2023-10-10 06:26:59,929][53268] Updated weights for policy 1, policy_version 43220 (0.0008) [2023-10-10 06:27:00,286][53268] Updated weights for policy 1, policy_version 43230 (0.0011) [2023-10-10 06:27:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 88571904. Throughput: 0: 1670.5, 1: 1690.8. Samples: 22147392. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 06:27:01,784][52050] Avg episode reward: [(0, '22.980'), (1, '20.280')] [2023-10-10 06:27:02,669][53252] Updated weights for policy 0, policy_version 43270 (0.0009) [2023-10-10 06:27:03,042][53252] Updated weights for policy 0, policy_version 43280 (0.0008) [2023-10-10 06:27:03,410][53252] Updated weights for policy 0, policy_version 43290 (0.0009) [2023-10-10 06:27:04,358][53268] Updated weights for policy 1, policy_version 43240 (0.0008) [2023-10-10 06:27:04,729][53268] Updated weights for policy 1, policy_version 43250 (0.0009) [2023-10-10 06:27:05,087][53268] Updated weights for policy 1, policy_version 43260 (0.0008) [2023-10-10 06:27:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 88637440. Throughput: 0: 1691.2, 1: 1668.0. Samples: 22167172. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 06:27:06,784][52050] Avg episode reward: [(0, '19.730'), (1, '20.270')] [2023-10-10 06:27:07,238][53252] Updated weights for policy 0, policy_version 43300 (0.0011) [2023-10-10 06:27:07,613][53252] Updated weights for policy 0, policy_version 43310 (0.0010) [2023-10-10 06:27:07,992][53252] Updated weights for policy 0, policy_version 43320 (0.0010) [2023-10-10 06:27:08,996][53268] Updated weights for policy 1, policy_version 43270 (0.0010) [2023-10-10 06:27:09,371][53268] Updated weights for policy 1, policy_version 43280 (0.0009) [2023-10-10 06:27:09,739][53268] Updated weights for policy 1, policy_version 43290 (0.0010) [2023-10-10 06:27:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 88702976. Throughput: 0: 1686.8, 1: 1698.9. Samples: 22188146. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 06:27:11,784][52050] Avg episode reward: [(0, '20.730'), (1, '21.180')] [2023-10-10 06:27:11,936][53252] Updated weights for policy 0, policy_version 43330 (0.0010) [2023-10-10 06:27:12,310][53252] Updated weights for policy 0, policy_version 43340 (0.0009) [2023-10-10 06:27:12,678][53252] Updated weights for policy 0, policy_version 43350 (0.0010) [2023-10-10 06:27:13,055][53252] Updated weights for policy 0, policy_version 43360 (0.0011) [2023-10-10 06:27:13,592][53268] Updated weights for policy 1, policy_version 43300 (0.0009) [2023-10-10 06:27:13,958][53268] Updated weights for policy 1, policy_version 43310 (0.0010) [2023-10-10 06:27:14,329][53268] Updated weights for policy 1, policy_version 43320 (0.0007) [2023-10-10 06:27:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 88768512. Throughput: 0: 1685.6, 1: 1681.3. Samples: 22197796. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 06:27:16,784][52050] Avg episode reward: [(0, '22.540'), (1, '21.490')] [2023-10-10 06:27:17,112][53252] Updated weights for policy 0, policy_version 43370 (0.0009) [2023-10-10 06:27:17,487][53252] Updated weights for policy 0, policy_version 43380 (0.0008) [2023-10-10 06:27:17,858][53252] Updated weights for policy 0, policy_version 43390 (0.0009) [2023-10-10 06:27:18,506][53268] Updated weights for policy 1, policy_version 43330 (0.0008) [2023-10-10 06:27:18,866][53268] Updated weights for policy 1, policy_version 43340 (0.0010) [2023-10-10 06:27:19,245][53268] Updated weights for policy 1, policy_version 43350 (0.0011) [2023-10-10 06:27:19,612][53268] Updated weights for policy 1, policy_version 43360 (0.0008) [2023-10-10 06:27:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 88834048. Throughput: 0: 1685.0, 1: 1679.2. Samples: 22217778. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-10 06:27:21,785][52050] Avg episode reward: [(0, '22.190'), (1, '20.510')] [2023-10-10 06:27:21,856][53252] Updated weights for policy 0, policy_version 43400 (0.0010) [2023-10-10 06:27:22,237][53252] Updated weights for policy 0, policy_version 43410 (0.0010) [2023-10-10 06:27:22,609][53252] Updated weights for policy 0, policy_version 43420 (0.0009) [2023-10-10 06:27:23,555][53268] Updated weights for policy 1, policy_version 43370 (0.0010) [2023-10-10 06:27:23,916][53268] Updated weights for policy 1, policy_version 43380 (0.0010) [2023-10-10 06:27:24,289][53268] Updated weights for policy 1, policy_version 43390 (0.0010) [2023-10-10 06:27:26,731][53252] Updated weights for policy 0, policy_version 43430 (0.0007) [2023-10-10 06:27:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 88899584. Throughput: 0: 1696.8, 1: 1690.0. Samples: 22238698. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:27:26,784][52050] Avg episode reward: [(0, '22.320'), (1, '19.450')] [2023-10-10 06:27:27,111][53252] Updated weights for policy 0, policy_version 43440 (0.0010) [2023-10-10 06:27:27,482][53252] Updated weights for policy 0, policy_version 43450 (0.0011) [2023-10-10 06:27:28,358][53268] Updated weights for policy 1, policy_version 43400 (0.0010) [2023-10-10 06:27:28,733][53268] Updated weights for policy 1, policy_version 43410 (0.0009) [2023-10-10 06:27:29,106][53268] Updated weights for policy 1, policy_version 43420 (0.0008) [2023-10-10 06:27:31,613][53252] Updated weights for policy 0, policy_version 43460 (0.0011) [2023-10-10 06:27:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 88965120. Throughput: 0: 1693.4, 1: 1670.7. Samples: 22247936. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:27:31,784][52050] Avg episode reward: [(0, '22.190'), (1, '19.140')] [2023-10-10 06:27:31,976][53252] Updated weights for policy 0, policy_version 43470 (0.0007) [2023-10-10 06:27:32,332][53252] Updated weights for policy 0, policy_version 43480 (0.0010) [2023-10-10 06:27:33,266][53268] Updated weights for policy 1, policy_version 43430 (0.0011) [2023-10-10 06:27:33,628][53268] Updated weights for policy 1, policy_version 43440 (0.0010) [2023-10-10 06:27:33,998][53268] Updated weights for policy 1, policy_version 43450 (0.0007) [2023-10-10 06:27:36,463][53252] Updated weights for policy 0, policy_version 43490 (0.0007) [2023-10-10 06:27:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 89030656. Throughput: 0: 1692.2, 1: 1676.2. Samples: 22268386. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:27:36,784][52050] Avg episode reward: [(0, '20.780'), (1, '20.420')] [2023-10-10 06:27:36,834][53252] Updated weights for policy 0, policy_version 43500 (0.0008) [2023-10-10 06:27:37,206][53252] Updated weights for policy 0, policy_version 43510 (0.0007) [2023-10-10 06:27:37,583][53252] Updated weights for policy 0, policy_version 43520 (0.0008) [2023-10-10 06:27:38,165][53268] Updated weights for policy 1, policy_version 43460 (0.0008) [2023-10-10 06:27:38,522][53268] Updated weights for policy 1, policy_version 43470 (0.0007) [2023-10-10 06:27:38,893][53268] Updated weights for policy 1, policy_version 43480 (0.0007) [2023-10-10 06:27:41,703][53252] Updated weights for policy 0, policy_version 43530 (0.0007) [2023-10-10 06:27:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89096192. Throughput: 0: 1687.3, 1: 1686.1. Samples: 22288910. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:27:41,784][52050] Avg episode reward: [(0, '21.080'), (1, '20.240')] [2023-10-10 06:27:42,078][53252] Updated weights for policy 0, policy_version 43540 (0.0007) [2023-10-10 06:27:42,448][53252] Updated weights for policy 0, policy_version 43550 (0.0010) [2023-10-10 06:27:43,057][53268] Updated weights for policy 1, policy_version 43490 (0.0008) [2023-10-10 06:27:43,429][53268] Updated weights for policy 1, policy_version 43500 (0.0010) [2023-10-10 06:27:43,803][53268] Updated weights for policy 1, policy_version 43510 (0.0007) [2023-10-10 06:27:44,167][53268] Updated weights for policy 1, policy_version 43520 (0.0007) [2023-10-10 06:27:46,512][53252] Updated weights for policy 0, policy_version 43560 (0.0007) [2023-10-10 06:27:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89161728. Throughput: 0: 1694.4, 1: 1659.5. Samples: 22298318. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:27:46,784][52050] Avg episode reward: [(0, '23.240'), (1, '20.970')] [2023-10-10 06:27:46,879][53252] Updated weights for policy 0, policy_version 43570 (0.0007) [2023-10-10 06:27:47,245][53252] Updated weights for policy 0, policy_version 43580 (0.0007) [2023-10-10 06:27:47,391][52846] Saving new best policy, reward=23.240! [2023-10-10 06:27:48,105][53268] Updated weights for policy 1, policy_version 43530 (0.0014) [2023-10-10 06:27:48,471][53268] Updated weights for policy 1, policy_version 43540 (0.0010) [2023-10-10 06:27:48,840][53268] Updated weights for policy 1, policy_version 43550 (0.0009) [2023-10-10 06:27:51,364][53252] Updated weights for policy 0, policy_version 43590 (0.0008) [2023-10-10 06:27:51,736][53252] Updated weights for policy 0, policy_version 43600 (0.0008) [2023-10-10 06:27:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89227264. Throughput: 0: 1684.4, 1: 1692.6. Samples: 22319140. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-10 06:27:51,784][52050] Avg episode reward: [(0, '21.480'), (1, '20.320')] [2023-10-10 06:27:52,106][53252] Updated weights for policy 0, policy_version 43610 (0.0007) [2023-10-10 06:27:52,891][53268] Updated weights for policy 1, policy_version 43560 (0.0008) [2023-10-10 06:27:53,261][53268] Updated weights for policy 1, policy_version 43570 (0.0009) [2023-10-10 06:27:53,627][53268] Updated weights for policy 1, policy_version 43580 (0.0009) [2023-10-10 06:27:56,287][53252] Updated weights for policy 0, policy_version 43620 (0.0009) [2023-10-10 06:27:56,660][53252] Updated weights for policy 0, policy_version 43630 (0.0007) [2023-10-10 06:27:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 89292800. Throughput: 0: 1675.0, 1: 1688.1. Samples: 22339486. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:27:56,784][52050] Avg episode reward: [(0, '19.430'), (1, '19.940')] [2023-10-10 06:27:57,029][53252] Updated weights for policy 0, policy_version 43640 (0.0007) [2023-10-10 06:27:57,716][53268] Updated weights for policy 1, policy_version 43590 (0.0007) [2023-10-10 06:27:58,079][53268] Updated weights for policy 1, policy_version 43600 (0.0008) [2023-10-10 06:27:58,445][53268] Updated weights for policy 1, policy_version 43610 (0.0008) [2023-10-10 06:28:01,032][53252] Updated weights for policy 0, policy_version 43650 (0.0007) [2023-10-10 06:28:01,411][53252] Updated weights for policy 0, policy_version 43660 (0.0008) [2023-10-10 06:28:01,772][53252] Updated weights for policy 0, policy_version 43670 (0.0007) [2023-10-10 06:28:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 89358336. Throughput: 0: 1681.2, 1: 1676.4. Samples: 22348890. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:28:01,784][52050] Avg episode reward: [(0, '20.370'), (1, '19.050')] [2023-10-10 06:28:02,151][53252] Updated weights for policy 0, policy_version 43680 (0.0008) [2023-10-10 06:28:02,635][53268] Updated weights for policy 1, policy_version 43620 (0.0009) [2023-10-10 06:28:03,001][53268] Updated weights for policy 1, policy_version 43630 (0.0009) [2023-10-10 06:28:03,363][53268] Updated weights for policy 1, policy_version 43640 (0.0010) [2023-10-10 06:28:06,273][53252] Updated weights for policy 0, policy_version 43690 (0.0007) [2023-10-10 06:28:06,648][53252] Updated weights for policy 0, policy_version 43700 (0.0008) [2023-10-10 06:28:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89423872. Throughput: 0: 1679.9, 1: 1692.4. Samples: 22369532. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:28:06,784][52050] Avg episode reward: [(0, '19.840'), (1, '18.260')] [2023-10-10 06:28:07,005][53252] Updated weights for policy 0, policy_version 43710 (0.0009) [2023-10-10 06:28:07,438][53268] Updated weights for policy 1, policy_version 43650 (0.0008) [2023-10-10 06:28:07,804][53268] Updated weights for policy 1, policy_version 43660 (0.0009) [2023-10-10 06:28:08,176][53268] Updated weights for policy 1, policy_version 43670 (0.0007) [2023-10-10 06:28:08,543][53268] Updated weights for policy 1, policy_version 43680 (0.0007) [2023-10-10 06:28:11,133][53252] Updated weights for policy 0, policy_version 43720 (0.0009) [2023-10-10 06:28:11,495][53252] Updated weights for policy 0, policy_version 43730 (0.0010) [2023-10-10 06:28:11,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 89489408. Throughput: 0: 1659.9, 1: 1694.8. Samples: 22389662. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:28:11,785][52050] Avg episode reward: [(0, '19.660'), (1, '18.890')] [2023-10-10 06:28:11,865][53252] Updated weights for policy 0, policy_version 43740 (0.0009) [2023-10-10 06:28:12,528][53268] Updated weights for policy 1, policy_version 43690 (0.0008) [2023-10-10 06:28:12,896][53268] Updated weights for policy 1, policy_version 43700 (0.0008) [2023-10-10 06:28:13,251][53268] Updated weights for policy 1, policy_version 43710 (0.0008) [2023-10-10 06:28:15,994][53252] Updated weights for policy 0, policy_version 43750 (0.0010) [2023-10-10 06:28:16,372][53252] Updated weights for policy 0, policy_version 43760 (0.0009) [2023-10-10 06:28:16,744][53252] Updated weights for policy 0, policy_version 43770 (0.0007) [2023-10-10 06:28:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 89554944. Throughput: 0: 1677.5, 1: 1689.7. Samples: 22399458. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:28:16,784][52050] Avg episode reward: [(0, '21.300'), (1, '19.260')] [2023-10-10 06:28:17,076][53268] Updated weights for policy 1, policy_version 43720 (0.0008) [2023-10-10 06:28:17,448][53268] Updated weights for policy 1, policy_version 43730 (0.0007) [2023-10-10 06:28:17,813][53268] Updated weights for policy 1, policy_version 43740 (0.0009) [2023-10-10 06:28:20,911][53252] Updated weights for policy 0, policy_version 43780 (0.0008) [2023-10-10 06:28:21,270][53252] Updated weights for policy 0, policy_version 43790 (0.0009) [2023-10-10 06:28:21,646][53252] Updated weights for policy 0, policy_version 43800 (0.0007) [2023-10-10 06:28:21,681][53268] Updated weights for policy 1, policy_version 43750 (0.0008) [2023-10-10 06:28:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89620480. Throughput: 0: 1674.2, 1: 1698.8. Samples: 22420170. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-10 06:28:21,784][52050] Avg episode reward: [(0, '22.570'), (1, '19.240')] [2023-10-10 06:28:22,038][53268] Updated weights for policy 1, policy_version 43760 (0.0008) [2023-10-10 06:28:22,410][53268] Updated weights for policy 1, policy_version 43770 (0.0009) [2023-10-10 06:28:25,757][53252] Updated weights for policy 0, policy_version 43810 (0.0009) [2023-10-10 06:28:26,128][53252] Updated weights for policy 0, policy_version 43820 (0.0008) [2023-10-10 06:28:26,507][53252] Updated weights for policy 0, policy_version 43830 (0.0008) [2023-10-10 06:28:26,546][53268] Updated weights for policy 1, policy_version 43780 (0.0008) [2023-10-10 06:28:26,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89686016. Throughput: 0: 1658.9, 1: 1701.2. Samples: 22440116. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 06:28:26,784][52050] Avg episode reward: [(0, '20.960'), (1, '17.890')] [2023-10-10 06:28:26,871][53252] Updated weights for policy 0, policy_version 43840 (0.0007) [2023-10-10 06:28:26,900][53268] Updated weights for policy 1, policy_version 43790 (0.0009) [2023-10-10 06:28:27,274][53268] Updated weights for policy 1, policy_version 43800 (0.0010) [2023-10-10 06:28:30,898][53252] Updated weights for policy 0, policy_version 43850 (0.0008) [2023-10-10 06:28:31,270][53252] Updated weights for policy 0, policy_version 43860 (0.0009) [2023-10-10 06:28:31,552][53268] Updated weights for policy 1, policy_version 43810 (0.0009) [2023-10-10 06:28:31,634][53252] Updated weights for policy 0, policy_version 43870 (0.0008) [2023-10-10 06:28:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89784320. Throughput: 0: 1670.8, 1: 1699.5. Samples: 22449980. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 06:28:31,784][52050] Avg episode reward: [(0, '21.020'), (1, '17.900')] [2023-10-10 06:28:31,920][53268] Updated weights for policy 1, policy_version 43820 (0.0008) [2023-10-10 06:28:32,292][53268] Updated weights for policy 1, policy_version 43830 (0.0008) [2023-10-10 06:28:32,657][53268] Updated weights for policy 1, policy_version 43840 (0.0008) [2023-10-10 06:28:35,590][53252] Updated weights for policy 0, policy_version 43880 (0.0007) [2023-10-10 06:28:35,970][53252] Updated weights for policy 0, policy_version 43890 (0.0009) [2023-10-10 06:28:36,332][53252] Updated weights for policy 0, policy_version 43900 (0.0008) [2023-10-10 06:28:36,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 89849856. Throughput: 0: 1672.1, 1: 1690.6. Samples: 22470458. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 06:28:36,784][52050] Avg episode reward: [(0, '20.870'), (1, '17.940')] [2023-10-10 06:28:36,786][53268] Updated weights for policy 1, policy_version 43850 (0.0007) [2023-10-10 06:28:37,155][53268] Updated weights for policy 1, policy_version 43860 (0.0008) [2023-10-10 06:28:37,519][53268] Updated weights for policy 1, policy_version 43870 (0.0009) [2023-10-10 06:28:40,440][53252] Updated weights for policy 0, policy_version 43910 (0.0008) [2023-10-10 06:28:40,820][53252] Updated weights for policy 0, policy_version 43920 (0.0010) [2023-10-10 06:28:41,189][53252] Updated weights for policy 0, policy_version 43930 (0.0010) [2023-10-10 06:28:41,772][53268] Updated weights for policy 1, policy_version 43880 (0.0009) [2023-10-10 06:28:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 89915392. Throughput: 0: 1652.2, 1: 1689.7. Samples: 22489870. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 06:28:41,784][52050] Avg episode reward: [(0, '21.820'), (1, '19.600')] [2023-10-10 06:28:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000043936_44990464.pth... [2023-10-10 06:28:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000042368_43384832.pth [2023-10-10 06:28:42,150][53268] Updated weights for policy 1, policy_version 43890 (0.0007) [2023-10-10 06:28:42,524][53268] Updated weights for policy 1, policy_version 43900 (0.0009) [2023-10-10 06:28:42,667][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000043904_44957696.pth... [2023-10-10 06:28:42,706][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000042336_43352064.pth [2023-10-10 06:28:45,256][53252] Updated weights for policy 0, policy_version 43940 (0.0008) [2023-10-10 06:28:45,636][53252] Updated weights for policy 0, policy_version 43950 (0.0009) [2023-10-10 06:28:46,003][53252] Updated weights for policy 0, policy_version 43960 (0.0009) [2023-10-10 06:28:46,569][53268] Updated weights for policy 1, policy_version 43910 (0.0008) [2023-10-10 06:28:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 89980928. Throughput: 0: 1678.9, 1: 1685.1. Samples: 22500270. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 06:28:46,784][52050] Avg episode reward: [(0, '20.910'), (1, '18.470')] [2023-10-10 06:28:46,937][53268] Updated weights for policy 1, policy_version 43920 (0.0009) [2023-10-10 06:28:47,306][53268] Updated weights for policy 1, policy_version 43930 (0.0007) [2023-10-10 06:28:50,091][53252] Updated weights for policy 0, policy_version 43970 (0.0009) [2023-10-10 06:28:50,469][53252] Updated weights for policy 0, policy_version 43980 (0.0009) [2023-10-10 06:28:50,833][53252] Updated weights for policy 0, policy_version 43990 (0.0008) [2023-10-10 06:28:51,211][53268] Updated weights for policy 1, policy_version 43940 (0.0008) [2023-10-10 06:28:51,211][53252] Updated weights for policy 0, policy_version 44000 (0.0008) [2023-10-10 06:28:51,580][53268] Updated weights for policy 1, policy_version 43950 (0.0009) [2023-10-10 06:28:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90046464. Throughput: 0: 1668.9, 1: 1684.6. Samples: 22520438. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:28:51,784][52050] Avg episode reward: [(0, '20.400'), (1, '18.350')] [2023-10-10 06:28:51,946][53268] Updated weights for policy 1, policy_version 43960 (0.0009) [2023-10-10 06:28:55,148][53252] Updated weights for policy 0, policy_version 44010 (0.0007) [2023-10-10 06:28:55,519][53252] Updated weights for policy 0, policy_version 44020 (0.0007) [2023-10-10 06:28:55,893][53252] Updated weights for policy 0, policy_version 44030 (0.0007) [2023-10-10 06:28:56,030][53268] Updated weights for policy 1, policy_version 43970 (0.0010) [2023-10-10 06:28:56,399][53268] Updated weights for policy 1, policy_version 43980 (0.0009) [2023-10-10 06:28:56,766][53268] Updated weights for policy 1, policy_version 43990 (0.0009) [2023-10-10 06:28:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90112000. Throughput: 0: 1665.3, 1: 1680.4. 
Samples: 22540214. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:28:56,785][52050] Avg episode reward: [(0, '22.210'), (1, '19.020')] [2023-10-10 06:28:57,142][53268] Updated weights for policy 1, policy_version 44000 (0.0009) [2023-10-10 06:28:59,991][53252] Updated weights for policy 0, policy_version 44040 (0.0007) [2023-10-10 06:29:00,359][53252] Updated weights for policy 0, policy_version 44050 (0.0008) [2023-10-10 06:29:00,729][53252] Updated weights for policy 0, policy_version 44060 (0.0009) [2023-10-10 06:29:01,094][53268] Updated weights for policy 1, policy_version 44010 (0.0009) [2023-10-10 06:29:01,460][53268] Updated weights for policy 1, policy_version 44020 (0.0012) [2023-10-10 06:29:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90177536. Throughput: 0: 1678.5, 1: 1683.7. Samples: 22550758. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:29:01,784][52050] Avg episode reward: [(0, '23.040'), (1, '18.290')] [2023-10-10 06:29:01,842][53268] Updated weights for policy 1, policy_version 44030 (0.0010) [2023-10-10 06:29:04,936][53252] Updated weights for policy 0, policy_version 44070 (0.0009) [2023-10-10 06:29:05,318][53252] Updated weights for policy 0, policy_version 44080 (0.0010) [2023-10-10 06:29:05,691][53252] Updated weights for policy 0, policy_version 44090 (0.0008) [2023-10-10 06:29:05,861][53268] Updated weights for policy 1, policy_version 44040 (0.0009) [2023-10-10 06:29:06,233][53268] Updated weights for policy 1, policy_version 44050 (0.0009) [2023-10-10 06:29:06,608][53268] Updated weights for policy 1, policy_version 44060 (0.0010) [2023-10-10 06:29:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 90275840. Throughput: 0: 1669.1, 1: 1673.8. Samples: 22570600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:29:06,784][52050] Avg episode reward: [(0, '20.830'), (1, '18.080')] [2023-10-10 06:29:09,595][53252] Updated weights for policy 0, policy_version 44100 (0.0007) [2023-10-10 06:29:09,959][53252] Updated weights for policy 0, policy_version 44110 (0.0009) [2023-10-10 06:29:10,339][53252] Updated weights for policy 0, policy_version 44120 (0.0009) [2023-10-10 06:29:10,837][53268] Updated weights for policy 1, policy_version 44070 (0.0007) [2023-10-10 06:29:11,204][53268] Updated weights for policy 1, policy_version 44080 (0.0008) [2023-10-10 06:29:11,568][53268] Updated weights for policy 1, policy_version 44090 (0.0009) [2023-10-10 06:29:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90308608. Throughput: 0: 1677.4, 1: 1659.4. Samples: 22590274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:29:11,784][52050] Avg episode reward: [(0, '21.540'), (1, '19.390')] [2023-10-10 06:29:14,380][53252] Updated weights for policy 0, policy_version 44130 (0.0009) [2023-10-10 06:29:14,760][53252] Updated weights for policy 0, policy_version 44140 (0.0007) [2023-10-10 06:29:15,119][53252] Updated weights for policy 0, policy_version 44150 (0.0009) [2023-10-10 06:29:15,486][53252] Updated weights for policy 0, policy_version 44160 (0.0009) [2023-10-10 06:29:15,659][53268] Updated weights for policy 1, policy_version 44100 (0.0008) [2023-10-10 06:29:16,024][53268] Updated weights for policy 1, policy_version 44110 (0.0007) [2023-10-10 06:29:16,398][53268] Updated weights for policy 1, policy_version 44120 (0.0008) [2023-10-10 06:29:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 90406912. Throughput: 0: 1690.3, 1: 1670.1. Samples: 22601198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:29:16,784][52050] Avg episode reward: [(0, '20.870'), (1, '20.190')] [2023-10-10 06:29:19,396][53252] Updated weights for policy 0, policy_version 44170 (0.0007) [2023-10-10 06:29:19,771][53252] Updated weights for policy 0, policy_version 44180 (0.0008) [2023-10-10 06:29:20,138][53252] Updated weights for policy 0, policy_version 44190 (0.0007) [2023-10-10 06:29:20,369][53268] Updated weights for policy 1, policy_version 44130 (0.0008) [2023-10-10 06:29:20,739][53268] Updated weights for policy 1, policy_version 44140 (0.0009) [2023-10-10 06:29:21,112][53268] Updated weights for policy 1, policy_version 44150 (0.0010) [2023-10-10 06:29:21,480][53268] Updated weights for policy 1, policy_version 44160 (0.0010) [2023-10-10 06:29:21,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 90472448. Throughput: 0: 1668.8, 1: 1680.3. Samples: 22621168. Policy #0 lag: (min: 22.0, avg: 34.5, max: 54.0) [2023-10-10 06:29:21,784][52050] Avg episode reward: [(0, '19.980'), (1, '22.270')] [2023-10-10 06:29:21,786][53061] Saving new best policy, reward=22.270! [2023-10-10 06:29:24,189][53252] Updated weights for policy 0, policy_version 44200 (0.0010) [2023-10-10 06:29:24,554][53252] Updated weights for policy 0, policy_version 44210 (0.0010) [2023-10-10 06:29:24,935][53252] Updated weights for policy 0, policy_version 44220 (0.0010) [2023-10-10 06:29:25,631][53268] Updated weights for policy 1, policy_version 44170 (0.0008) [2023-10-10 06:29:26,001][53268] Updated weights for policy 1, policy_version 44180 (0.0008) [2023-10-10 06:29:26,366][53268] Updated weights for policy 1, policy_version 44190 (0.0008) [2023-10-10 06:29:26,783][52050] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 90537984. Throughput: 0: 1699.7, 1: 1656.3. Samples: 22640890. 
Policy #0 lag: (min: 22.0, avg: 34.5, max: 54.0) [2023-10-10 06:29:26,784][52050] Avg episode reward: [(0, '19.710'), (1, '22.410')] [2023-10-10 06:29:26,797][53061] Saving new best policy, reward=22.410! [2023-10-10 06:29:29,015][53252] Updated weights for policy 0, policy_version 44230 (0.0008) [2023-10-10 06:29:29,398][53252] Updated weights for policy 0, policy_version 44240 (0.0007) [2023-10-10 06:29:29,757][53252] Updated weights for policy 0, policy_version 44250 (0.0010) [2023-10-10 06:29:30,361][53268] Updated weights for policy 1, policy_version 44200 (0.0010) [2023-10-10 06:29:30,739][53268] Updated weights for policy 1, policy_version 44210 (0.0007) [2023-10-10 06:29:31,110][53268] Updated weights for policy 1, policy_version 44220 (0.0007) [2023-10-10 06:29:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 90603520. Throughput: 0: 1683.1, 1: 1683.1. Samples: 22651750. Policy #0 lag: (min: 22.0, avg: 34.5, max: 54.0) [2023-10-10 06:29:31,784][52050] Avg episode reward: [(0, '20.150'), (1, '20.150')] [2023-10-10 06:29:33,777][53252] Updated weights for policy 0, policy_version 44260 (0.0009) [2023-10-10 06:29:34,161][53252] Updated weights for policy 0, policy_version 44270 (0.0007) [2023-10-10 06:29:34,526][53252] Updated weights for policy 0, policy_version 44280 (0.0008) [2023-10-10 06:29:35,068][53268] Updated weights for policy 1, policy_version 44230 (0.0008) [2023-10-10 06:29:35,444][53268] Updated weights for policy 1, policy_version 44240 (0.0007) [2023-10-10 06:29:35,812][53268] Updated weights for policy 1, policy_version 44250 (0.0008) [2023-10-10 06:29:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90669056. Throughput: 0: 1682.7, 1: 1675.8. Samples: 22671572. 
Policy #0 lag: (min: 22.0, avg: 34.5, max: 54.0) [2023-10-10 06:29:36,784][52050] Avg episode reward: [(0, '19.380'), (1, '20.760')] [2023-10-10 06:29:38,615][53252] Updated weights for policy 0, policy_version 44290 (0.0009) [2023-10-10 06:29:38,984][53252] Updated weights for policy 0, policy_version 44300 (0.0008) [2023-10-10 06:29:39,354][53252] Updated weights for policy 0, policy_version 44310 (0.0010) [2023-10-10 06:29:39,724][53252] Updated weights for policy 0, policy_version 44320 (0.0011) [2023-10-10 06:29:40,137][53268] Updated weights for policy 1, policy_version 44260 (0.0008) [2023-10-10 06:29:40,513][53268] Updated weights for policy 1, policy_version 44270 (0.0011) [2023-10-10 06:29:40,877][53268] Updated weights for policy 1, policy_version 44280 (0.0012) [2023-10-10 06:29:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90734592. Throughput: 0: 1700.0, 1: 1657.0. Samples: 22691278. Policy #0 lag: (min: 22.0, avg: 34.5, max: 54.0) [2023-10-10 06:29:41,784][52050] Avg episode reward: [(0, '18.450'), (1, '20.370')] [2023-10-10 06:29:43,645][53252] Updated weights for policy 0, policy_version 44330 (0.0008) [2023-10-10 06:29:44,013][53252] Updated weights for policy 0, policy_version 44340 (0.0007) [2023-10-10 06:29:44,376][53252] Updated weights for policy 0, policy_version 44350 (0.0007) [2023-10-10 06:29:44,820][53268] Updated weights for policy 1, policy_version 44290 (0.0011) [2023-10-10 06:29:45,192][53268] Updated weights for policy 1, policy_version 44300 (0.0009) [2023-10-10 06:29:45,561][53268] Updated weights for policy 1, policy_version 44310 (0.0009) [2023-10-10 06:29:45,922][53268] Updated weights for policy 1, policy_version 44320 (0.0009) [2023-10-10 06:29:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90800128. Throughput: 0: 1678.4, 1: 1681.6. Samples: 22701960. 
Policy #0 lag: (min: 22.0, avg: 34.5, max: 54.0) [2023-10-10 06:29:46,784][52050] Avg episode reward: [(0, '19.740'), (1, '18.910')] [2023-10-10 06:29:48,428][53252] Updated weights for policy 0, policy_version 44360 (0.0008) [2023-10-10 06:29:48,792][53252] Updated weights for policy 0, policy_version 44370 (0.0009) [2023-10-10 06:29:49,169][53252] Updated weights for policy 0, policy_version 44380 (0.0008) [2023-10-10 06:29:49,920][53268] Updated weights for policy 1, policy_version 44330 (0.0010) [2023-10-10 06:29:50,283][53268] Updated weights for policy 1, policy_version 44340 (0.0010) [2023-10-10 06:29:50,658][53268] Updated weights for policy 1, policy_version 44350 (0.0010) [2023-10-10 06:29:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90865664. Throughput: 0: 1686.8, 1: 1675.8. Samples: 22721914. Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 06:29:51,784][52050] Avg episode reward: [(0, '22.210'), (1, '18.940')] [2023-10-10 06:29:53,321][53252] Updated weights for policy 0, policy_version 44390 (0.0009) [2023-10-10 06:29:53,708][53252] Updated weights for policy 0, policy_version 44400 (0.0010) [2023-10-10 06:29:54,083][53252] Updated weights for policy 0, policy_version 44410 (0.0011) [2023-10-10 06:29:54,765][53268] Updated weights for policy 1, policy_version 44360 (0.0010) [2023-10-10 06:29:55,137][53268] Updated weights for policy 1, policy_version 44370 (0.0008) [2023-10-10 06:29:55,493][53268] Updated weights for policy 1, policy_version 44380 (0.0009) [2023-10-10 06:29:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90931200. Throughput: 0: 1695.2, 1: 1673.8. Samples: 22741876. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 06:29:56,785][52050] Avg episode reward: [(0, '21.770'), (1, '21.730')] [2023-10-10 06:29:57,903][53252] Updated weights for policy 0, policy_version 44420 (0.0010) [2023-10-10 06:29:58,276][53252] Updated weights for policy 0, policy_version 44430 (0.0007) [2023-10-10 06:29:58,642][53252] Updated weights for policy 0, policy_version 44440 (0.0007) [2023-10-10 06:29:59,688][53268] Updated weights for policy 1, policy_version 44390 (0.0008) [2023-10-10 06:30:00,048][53268] Updated weights for policy 1, policy_version 44400 (0.0009) [2023-10-10 06:30:00,418][53268] Updated weights for policy 1, policy_version 44410 (0.0009) [2023-10-10 06:30:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 90996736. Throughput: 0: 1669.2, 1: 1687.3. Samples: 22752240. Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 06:30:01,784][52050] Avg episode reward: [(0, '23.400'), (1, '20.460')] [2023-10-10 06:30:01,785][52846] Saving new best policy, reward=23.400! [2023-10-10 06:30:02,777][53252] Updated weights for policy 0, policy_version 44450 (0.0008) [2023-10-10 06:30:03,149][53252] Updated weights for policy 0, policy_version 44460 (0.0007) [2023-10-10 06:30:03,520][53252] Updated weights for policy 0, policy_version 44470 (0.0009) [2023-10-10 06:30:03,884][53252] Updated weights for policy 0, policy_version 44480 (0.0008) [2023-10-10 06:30:04,594][53268] Updated weights for policy 1, policy_version 44420 (0.0008) [2023-10-10 06:30:04,961][53268] Updated weights for policy 1, policy_version 44430 (0.0008) [2023-10-10 06:30:05,329][53268] Updated weights for policy 1, policy_version 44440 (0.0007) [2023-10-10 06:30:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 91062272. Throughput: 0: 1692.6, 1: 1660.2. Samples: 22772044. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 06:30:06,784][52050] Avg episode reward: [(0, '23.680'), (1, '19.330')] [2023-10-10 06:30:06,786][52846] Saving new best policy, reward=23.680! [2023-10-10 06:30:07,956][53252] Updated weights for policy 0, policy_version 44490 (0.0009) [2023-10-10 06:30:08,321][53252] Updated weights for policy 0, policy_version 44500 (0.0009) [2023-10-10 06:30:08,689][53252] Updated weights for policy 0, policy_version 44510 (0.0009) [2023-10-10 06:30:09,282][53268] Updated weights for policy 1, policy_version 44450 (0.0008) [2023-10-10 06:30:09,638][53268] Updated weights for policy 1, policy_version 44460 (0.0008) [2023-10-10 06:30:10,013][53268] Updated weights for policy 1, policy_version 44470 (0.0007) [2023-10-10 06:30:10,374][53268] Updated weights for policy 1, policy_version 44480 (0.0008) [2023-10-10 06:30:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 91127808. Throughput: 0: 1687.6, 1: 1676.0. Samples: 22792252. Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 06:30:11,784][52050] Avg episode reward: [(0, '21.010'), (1, '19.430')] [2023-10-10 06:30:12,787][53252] Updated weights for policy 0, policy_version 44520 (0.0009) [2023-10-10 06:30:13,146][53252] Updated weights for policy 0, policy_version 44530 (0.0009) [2023-10-10 06:30:13,517][53252] Updated weights for policy 0, policy_version 44540 (0.0008) [2023-10-10 06:30:14,499][53268] Updated weights for policy 1, policy_version 44490 (0.0009) [2023-10-10 06:30:14,858][53268] Updated weights for policy 1, policy_version 44500 (0.0008) [2023-10-10 06:30:15,232][53268] Updated weights for policy 1, policy_version 44510 (0.0008) [2023-10-10 06:30:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 91193344. Throughput: 0: 1671.5, 1: 1680.3. Samples: 22802578. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 52.0) [2023-10-10 06:30:16,784][52050] Avg episode reward: [(0, '21.650'), (1, '17.990')] [2023-10-10 06:30:17,729][53252] Updated weights for policy 0, policy_version 44550 (0.0008) [2023-10-10 06:30:18,095][53252] Updated weights for policy 0, policy_version 44560 (0.0010) [2023-10-10 06:30:18,466][53252] Updated weights for policy 0, policy_version 44570 (0.0009) [2023-10-10 06:30:19,430][53268] Updated weights for policy 1, policy_version 44520 (0.0009) [2023-10-10 06:30:19,798][53268] Updated weights for policy 1, policy_version 44530 (0.0008) [2023-10-10 06:30:20,160][53268] Updated weights for policy 1, policy_version 44540 (0.0010) [2023-10-10 06:30:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91258880. Throughput: 0: 1682.5, 1: 1660.5. Samples: 22822008. Policy #0 lag: (min: 16.0, avg: 32.0, max: 48.0) [2023-10-10 06:30:21,784][52050] Avg episode reward: [(0, '21.410'), (1, '19.140')] [2023-10-10 06:30:22,600][53252] Updated weights for policy 0, policy_version 44580 (0.0009) [2023-10-10 06:30:22,969][53252] Updated weights for policy 0, policy_version 44590 (0.0007) [2023-10-10 06:30:23,336][53252] Updated weights for policy 0, policy_version 44600 (0.0008) [2023-10-10 06:30:24,138][53268] Updated weights for policy 1, policy_version 44550 (0.0010) [2023-10-10 06:30:24,503][53268] Updated weights for policy 1, policy_version 44560 (0.0007) [2023-10-10 06:30:24,870][53268] Updated weights for policy 1, policy_version 44570 (0.0007) [2023-10-10 06:30:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 91324416. Throughput: 0: 1683.7, 1: 1688.1. Samples: 22843010. 
Policy #0 lag: (min: 16.0, avg: 32.0, max: 48.0) [2023-10-10 06:30:26,784][52050] Avg episode reward: [(0, '21.050'), (1, '19.950')] [2023-10-10 06:30:27,318][53252] Updated weights for policy 0, policy_version 44610 (0.0009) [2023-10-10 06:30:27,696][53252] Updated weights for policy 0, policy_version 44620 (0.0008) [2023-10-10 06:30:28,066][53252] Updated weights for policy 0, policy_version 44630 (0.0008) [2023-10-10 06:30:28,441][53252] Updated weights for policy 0, policy_version 44640 (0.0007) [2023-10-10 06:30:28,926][53268] Updated weights for policy 1, policy_version 44580 (0.0008) [2023-10-10 06:30:29,297][53268] Updated weights for policy 1, policy_version 44590 (0.0009) [2023-10-10 06:30:29,653][53268] Updated weights for policy 1, policy_version 44600 (0.0008) [2023-10-10 06:30:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91389952. Throughput: 0: 1679.2, 1: 1676.1. Samples: 22852948. Policy #0 lag: (min: 16.0, avg: 32.0, max: 48.0) [2023-10-10 06:30:31,784][52050] Avg episode reward: [(0, '20.910'), (1, '19.850')] [2023-10-10 06:30:32,515][53252] Updated weights for policy 0, policy_version 44650 (0.0008) [2023-10-10 06:30:32,889][53252] Updated weights for policy 0, policy_version 44660 (0.0008) [2023-10-10 06:30:33,257][53252] Updated weights for policy 0, policy_version 44670 (0.0007) [2023-10-10 06:30:33,708][53268] Updated weights for policy 1, policy_version 44610 (0.0008) [2023-10-10 06:30:34,073][53268] Updated weights for policy 1, policy_version 44620 (0.0009) [2023-10-10 06:30:34,443][53268] Updated weights for policy 1, policy_version 44630 (0.0008) [2023-10-10 06:30:34,814][53268] Updated weights for policy 1, policy_version 44640 (0.0009) [2023-10-10 06:30:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91455488. Throughput: 0: 1685.4, 1: 1667.0. Samples: 22872772. 
Policy #0 lag: (min: 16.0, avg: 32.0, max: 48.0) [2023-10-10 06:30:36,784][52050] Avg episode reward: [(0, '23.810'), (1, '20.920')] [2023-10-10 06:30:36,786][52846] Saving new best policy, reward=23.810! [2023-10-10 06:30:37,338][53252] Updated weights for policy 0, policy_version 44680 (0.0010) [2023-10-10 06:30:37,698][53252] Updated weights for policy 0, policy_version 44690 (0.0007) [2023-10-10 06:30:38,077][53252] Updated weights for policy 0, policy_version 44700 (0.0009) [2023-10-10 06:30:38,945][53268] Updated weights for policy 1, policy_version 44650 (0.0011) [2023-10-10 06:30:39,315][53268] Updated weights for policy 1, policy_version 44660 (0.0010) [2023-10-10 06:30:39,680][53268] Updated weights for policy 1, policy_version 44670 (0.0009) [2023-10-10 06:30:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91521024. Throughput: 0: 1683.5, 1: 1682.0. Samples: 22893322. Policy #0 lag: (min: 16.0, avg: 32.0, max: 48.0) [2023-10-10 06:30:41,784][52050] Avg episode reward: [(0, '21.750'), (1, '20.610')] [2023-10-10 06:30:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000044704_45776896.pth... [2023-10-10 06:30:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000044672_45744128.pth... 
[2023-10-10 06:30:41,824][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000043136_44171264.pth [2023-10-10 06:30:41,835][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000043104_44138496.pth [2023-10-10 06:30:42,211][53252] Updated weights for policy 0, policy_version 44710 (0.0010) [2023-10-10 06:30:42,579][53252] Updated weights for policy 0, policy_version 44720 (0.0009) [2023-10-10 06:30:42,949][53252] Updated weights for policy 0, policy_version 44730 (0.0007) [2023-10-10 06:30:43,797][53268] Updated weights for policy 1, policy_version 44680 (0.0011) [2023-10-10 06:30:44,160][53268] Updated weights for policy 1, policy_version 44690 (0.0009) [2023-10-10 06:30:44,523][53268] Updated weights for policy 1, policy_version 44700 (0.0008) [2023-10-10 06:30:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91586560. Throughput: 0: 1682.6, 1: 1672.0. Samples: 22903200. Policy #0 lag: (min: 16.0, avg: 32.0, max: 48.0) [2023-10-10 06:30:46,784][52050] Avg episode reward: [(0, '21.720'), (1, '19.800')] [2023-10-10 06:30:47,055][53252] Updated weights for policy 0, policy_version 44740 (0.0009) [2023-10-10 06:30:47,420][53252] Updated weights for policy 0, policy_version 44750 (0.0007) [2023-10-10 06:30:47,786][53252] Updated weights for policy 0, policy_version 44760 (0.0010) [2023-10-10 06:30:48,501][53268] Updated weights for policy 1, policy_version 44710 (0.0009) [2023-10-10 06:30:48,872][53268] Updated weights for policy 1, policy_version 44720 (0.0009) [2023-10-10 06:30:49,243][53268] Updated weights for policy 1, policy_version 44730 (0.0010) [2023-10-10 06:30:51,669][53252] Updated weights for policy 0, policy_version 44770 (0.0010) [2023-10-10 06:30:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91652096. Throughput: 0: 1682.4, 1: 1680.7. Samples: 22923384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:30:51,784][52050] Avg episode reward: [(0, '22.150'), (1, '19.660')] [2023-10-10 06:30:52,034][53252] Updated weights for policy 0, policy_version 44780 (0.0007) [2023-10-10 06:30:52,407][53252] Updated weights for policy 0, policy_version 44790 (0.0010) [2023-10-10 06:30:52,782][53252] Updated weights for policy 0, policy_version 44800 (0.0009) [2023-10-10 06:30:53,259][53268] Updated weights for policy 1, policy_version 44740 (0.0008) [2023-10-10 06:30:53,617][53268] Updated weights for policy 1, policy_version 44750 (0.0009) [2023-10-10 06:30:53,979][53268] Updated weights for policy 1, policy_version 44760 (0.0008) [2023-10-10 06:30:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91717632. Throughput: 0: 1684.0, 1: 1694.7. Samples: 22944292. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:30:56,784][52050] Avg episode reward: [(0, '21.180'), (1, '19.340')] [2023-10-10 06:30:57,077][53252] Updated weights for policy 0, policy_version 44810 (0.0009) [2023-10-10 06:30:57,458][53252] Updated weights for policy 0, policy_version 44820 (0.0009) [2023-10-10 06:30:57,822][53252] Updated weights for policy 0, policy_version 44830 (0.0008) [2023-10-10 06:30:57,939][53268] Updated weights for policy 1, policy_version 44770 (0.0009) [2023-10-10 06:30:58,312][53268] Updated weights for policy 1, policy_version 44780 (0.0010) [2023-10-10 06:30:58,668][53268] Updated weights for policy 1, policy_version 44790 (0.0010) [2023-10-10 06:30:59,032][53268] Updated weights for policy 1, policy_version 44800 (0.0011) [2023-10-10 06:31:01,734][53252] Updated weights for policy 0, policy_version 44840 (0.0007) [2023-10-10 06:31:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91783168. Throughput: 0: 1687.9, 1: 1666.4. Samples: 22953518. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:31:01,784][52050] Avg episode reward: [(0, '20.200'), (1, '18.580')] [2023-10-10 06:31:02,097][53252] Updated weights for policy 0, policy_version 44850 (0.0008) [2023-10-10 06:31:02,471][53252] Updated weights for policy 0, policy_version 44860 (0.0008) [2023-10-10 06:31:02,983][53268] Updated weights for policy 1, policy_version 44810 (0.0009) [2023-10-10 06:31:03,352][53268] Updated weights for policy 1, policy_version 44820 (0.0011) [2023-10-10 06:31:03,727][53268] Updated weights for policy 1, policy_version 44830 (0.0009) [2023-10-10 06:31:06,368][53252] Updated weights for policy 0, policy_version 44870 (0.0008) [2023-10-10 06:31:06,745][53252] Updated weights for policy 0, policy_version 44880 (0.0008) [2023-10-10 06:31:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 91848704. Throughput: 0: 1697.6, 1: 1693.5. Samples: 22974608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:31:06,784][52050] Avg episode reward: [(0, '21.680'), (1, '18.650')] [2023-10-10 06:31:07,125][53252] Updated weights for policy 0, policy_version 44890 (0.0007) [2023-10-10 06:31:07,877][53268] Updated weights for policy 1, policy_version 44840 (0.0007) [2023-10-10 06:31:08,245][53268] Updated weights for policy 1, policy_version 44850 (0.0007) [2023-10-10 06:31:08,618][53268] Updated weights for policy 1, policy_version 44860 (0.0010) [2023-10-10 06:31:11,155][53252] Updated weights for policy 0, policy_version 44900 (0.0007) [2023-10-10 06:31:11,523][53252] Updated weights for policy 0, policy_version 44910 (0.0007) [2023-10-10 06:31:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91914240. Throughput: 0: 1684.7, 1: 1691.5. Samples: 22994940. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:31:11,784][52050] Avg episode reward: [(0, '23.310'), (1, '20.060')] [2023-10-10 06:31:11,890][53252] Updated weights for policy 0, policy_version 44920 (0.0008) [2023-10-10 06:31:12,775][53268] Updated weights for policy 1, policy_version 44870 (0.0009) [2023-10-10 06:31:13,153][53268] Updated weights for policy 1, policy_version 44880 (0.0007) [2023-10-10 06:31:13,513][53268] Updated weights for policy 1, policy_version 44890 (0.0008) [2023-10-10 06:31:15,924][53252] Updated weights for policy 0, policy_version 44930 (0.0008) [2023-10-10 06:31:16,302][53252] Updated weights for policy 0, policy_version 44940 (0.0009) [2023-10-10 06:31:16,678][53252] Updated weights for policy 0, policy_version 44950 (0.0010) [2023-10-10 06:31:16,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91979776. Throughput: 0: 1696.4, 1: 1675.8. Samples: 23004698. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:31:16,784][52050] Avg episode reward: [(0, '22.690'), (1, '19.670')] [2023-10-10 06:31:17,042][53252] Updated weights for policy 0, policy_version 44960 (0.0007) [2023-10-10 06:31:17,381][53268] Updated weights for policy 1, policy_version 44900 (0.0009) [2023-10-10 06:31:17,746][53268] Updated weights for policy 1, policy_version 44910 (0.0008) [2023-10-10 06:31:18,121][53268] Updated weights for policy 1, policy_version 44920 (0.0007) [2023-10-10 06:31:21,061][53252] Updated weights for policy 0, policy_version 44970 (0.0010) [2023-10-10 06:31:21,441][53252] Updated weights for policy 0, policy_version 44980 (0.0009) [2023-10-10 06:31:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 92045312. Throughput: 0: 1694.5, 1: 1702.7. Samples: 23025646. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:31:21,784][52050] Avg episode reward: [(0, '22.050'), (1, '19.540')] [2023-10-10 06:31:21,808][53252] Updated weights for policy 0, policy_version 44990 (0.0008) [2023-10-10 06:31:22,389][53268] Updated weights for policy 1, policy_version 44930 (0.0009) [2023-10-10 06:31:22,749][53268] Updated weights for policy 1, policy_version 44940 (0.0008) [2023-10-10 06:31:23,110][53268] Updated weights for policy 1, policy_version 44950 (0.0009) [2023-10-10 06:31:23,478][53268] Updated weights for policy 1, policy_version 44960 (0.0008) [2023-10-10 06:31:25,832][53252] Updated weights for policy 0, policy_version 45000 (0.0010) [2023-10-10 06:31:26,204][53252] Updated weights for policy 0, policy_version 45010 (0.0010) [2023-10-10 06:31:26,589][53252] Updated weights for policy 0, policy_version 45020 (0.0008) [2023-10-10 06:31:26,783][52050] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92143616. Throughput: 0: 1679.1, 1: 1703.4. Samples: 23045532. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-10 06:31:26,785][52050] Avg episode reward: [(0, '22.430'), (1, '21.080')] [2023-10-10 06:31:27,477][53268] Updated weights for policy 1, policy_version 44970 (0.0010) [2023-10-10 06:31:27,846][53268] Updated weights for policy 1, policy_version 44980 (0.0010) [2023-10-10 06:31:28,213][53268] Updated weights for policy 1, policy_version 44990 (0.0010) [2023-10-10 06:31:30,590][53252] Updated weights for policy 0, policy_version 45030 (0.0009) [2023-10-10 06:31:30,951][53252] Updated weights for policy 0, policy_version 45040 (0.0010) [2023-10-10 06:31:31,323][53252] Updated weights for policy 0, policy_version 45050 (0.0009) [2023-10-10 06:31:31,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92209152. Throughput: 0: 1698.9, 1: 1686.8. Samples: 23055558. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-10 06:31:31,784][52050] Avg episode reward: [(0, '20.370'), (1, '21.790')] [2023-10-10 06:31:32,248][53268] Updated weights for policy 1, policy_version 45000 (0.0008) [2023-10-10 06:31:32,605][53268] Updated weights for policy 1, policy_version 45010 (0.0009) [2023-10-10 06:31:32,981][53268] Updated weights for policy 1, policy_version 45020 (0.0007) [2023-10-10 06:31:35,212][53252] Updated weights for policy 0, policy_version 45060 (0.0009) [2023-10-10 06:31:35,572][53252] Updated weights for policy 0, policy_version 45070 (0.0007) [2023-10-10 06:31:35,955][53252] Updated weights for policy 0, policy_version 45080 (0.0009) [2023-10-10 06:31:36,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 92274688. Throughput: 0: 1696.3, 1: 1694.7. Samples: 23075978. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-10 06:31:36,784][52050] Avg episode reward: [(0, '19.020'), (1, '21.810')] [2023-10-10 06:31:37,052][53268] Updated weights for policy 1, policy_version 45030 (0.0010) [2023-10-10 06:31:37,413][53268] Updated weights for policy 1, policy_version 45040 (0.0010) [2023-10-10 06:31:37,776][53268] Updated weights for policy 1, policy_version 45050 (0.0010) [2023-10-10 06:31:40,035][53252] Updated weights for policy 0, policy_version 45090 (0.0009) [2023-10-10 06:31:40,412][53252] Updated weights for policy 0, policy_version 45100 (0.0007) [2023-10-10 06:31:40,774][53252] Updated weights for policy 0, policy_version 45110 (0.0009) [2023-10-10 06:31:41,148][53252] Updated weights for policy 0, policy_version 45120 (0.0007) [2023-10-10 06:31:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92340224. Throughput: 0: 1674.8, 1: 1691.2. Samples: 23095762. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-10 06:31:41,784][52050] Avg episode reward: [(0, '19.540'), (1, '19.530')] [2023-10-10 06:31:41,799][53268] Updated weights for policy 1, policy_version 45060 (0.0009) [2023-10-10 06:31:42,168][53268] Updated weights for policy 1, policy_version 45070 (0.0007) [2023-10-10 06:31:42,539][53268] Updated weights for policy 1, policy_version 45080 (0.0009) [2023-10-10 06:31:45,289][53252] Updated weights for policy 0, policy_version 45130 (0.0009) [2023-10-10 06:31:45,654][53252] Updated weights for policy 0, policy_version 45140 (0.0007) [2023-10-10 06:31:46,027][53252] Updated weights for policy 0, policy_version 45150 (0.0007) [2023-10-10 06:31:46,625][53268] Updated weights for policy 1, policy_version 45090 (0.0007) [2023-10-10 06:31:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92405760. Throughput: 0: 1698.4, 1: 1692.4. Samples: 23106104. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-10 06:31:46,784][52050] Avg episode reward: [(0, '20.320'), (1, '19.700')] [2023-10-10 06:31:46,992][53268] Updated weights for policy 1, policy_version 45100 (0.0007) [2023-10-10 06:31:47,364][53268] Updated weights for policy 1, policy_version 45110 (0.0007) [2023-10-10 06:31:47,729][53268] Updated weights for policy 1, policy_version 45120 (0.0011) [2023-10-10 06:31:50,069][53252] Updated weights for policy 0, policy_version 45160 (0.0007) [2023-10-10 06:31:50,434][53252] Updated weights for policy 0, policy_version 45170 (0.0008) [2023-10-10 06:31:50,806][53252] Updated weights for policy 0, policy_version 45180 (0.0008) [2023-10-10 06:31:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92471296. Throughput: 0: 1678.9, 1: 1688.4. Samples: 23126136. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-10 06:31:51,784][52050] Avg episode reward: [(0, '20.720'), (1, '19.850')] [2023-10-10 06:31:51,787][53268] Updated weights for policy 1, policy_version 45130 (0.0007) [2023-10-10 06:31:52,147][53268] Updated weights for policy 1, policy_version 45140 (0.0008) [2023-10-10 06:31:52,507][53268] Updated weights for policy 1, policy_version 45150 (0.0007) [2023-10-10 06:31:54,879][53252] Updated weights for policy 0, policy_version 45190 (0.0007) [2023-10-10 06:31:55,249][53252] Updated weights for policy 0, policy_version 45200 (0.0007) [2023-10-10 06:31:55,613][53252] Updated weights for policy 0, policy_version 45210 (0.0009) [2023-10-10 06:31:56,554][53268] Updated weights for policy 1, policy_version 45160 (0.0009) [2023-10-10 06:31:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92536832. Throughput: 0: 1675.6, 1: 1692.7. Samples: 23146514. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:31:56,784][52050] Avg episode reward: [(0, '20.270'), (1, '19.700')] [2023-10-10 06:31:56,924][53268] Updated weights for policy 1, policy_version 45170 (0.0008) [2023-10-10 06:31:57,295][53268] Updated weights for policy 1, policy_version 45180 (0.0008) [2023-10-10 06:31:59,558][53252] Updated weights for policy 0, policy_version 45220 (0.0007) [2023-10-10 06:31:59,924][53252] Updated weights for policy 0, policy_version 45230 (0.0008) [2023-10-10 06:32:00,289][53252] Updated weights for policy 0, policy_version 45240 (0.0007) [2023-10-10 06:32:01,345][53268] Updated weights for policy 1, policy_version 45190 (0.0009) [2023-10-10 06:32:01,715][53268] Updated weights for policy 1, policy_version 45200 (0.0007) [2023-10-10 06:32:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92602368. Throughput: 0: 1694.3, 1: 1688.2. Samples: 23156912. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:32:01,784][52050] Avg episode reward: [(0, '21.240'), (1, '19.930')] [2023-10-10 06:32:02,073][53268] Updated weights for policy 1, policy_version 45210 (0.0010) [2023-10-10 06:32:04,276][53252] Updated weights for policy 0, policy_version 45250 (0.0008) [2023-10-10 06:32:04,653][53252] Updated weights for policy 0, policy_version 45260 (0.0007) [2023-10-10 06:32:05,020][53252] Updated weights for policy 0, policy_version 45270 (0.0009) [2023-10-10 06:32:05,394][53252] Updated weights for policy 0, policy_version 45280 (0.0007) [2023-10-10 06:32:06,022][53268] Updated weights for policy 1, policy_version 45220 (0.0008) [2023-10-10 06:32:06,407][53268] Updated weights for policy 1, policy_version 45230 (0.0011) [2023-10-10 06:32:06,773][53268] Updated weights for policy 1, policy_version 45240 (0.0008) [2023-10-10 06:32:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92667904. Throughput: 0: 1672.8, 1: 1683.9. Samples: 23176698. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:32:06,784][52050] Avg episode reward: [(0, '20.350'), (1, '19.150')] [2023-10-10 06:32:09,457][53252] Updated weights for policy 0, policy_version 45290 (0.0009) [2023-10-10 06:32:09,825][53252] Updated weights for policy 0, policy_version 45300 (0.0009) [2023-10-10 06:32:10,204][53252] Updated weights for policy 0, policy_version 45310 (0.0008) [2023-10-10 06:32:10,781][53268] Updated weights for policy 1, policy_version 45250 (0.0009) [2023-10-10 06:32:11,153][53268] Updated weights for policy 1, policy_version 45260 (0.0010) [2023-10-10 06:32:11,509][53268] Updated weights for policy 1, policy_version 45270 (0.0011) [2023-10-10 06:32:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92733440. Throughput: 0: 1692.5, 1: 1668.6. Samples: 23196782. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:32:11,784][52050] Avg episode reward: [(0, '21.080'), (1, '19.510')] [2023-10-10 06:32:11,872][53268] Updated weights for policy 1, policy_version 45280 (0.0011) [2023-10-10 06:32:14,201][53252] Updated weights for policy 0, policy_version 45320 (0.0008) [2023-10-10 06:32:14,582][53252] Updated weights for policy 0, policy_version 45330 (0.0009) [2023-10-10 06:32:14,950][53252] Updated weights for policy 0, policy_version 45340 (0.0008) [2023-10-10 06:32:15,988][53268] Updated weights for policy 1, policy_version 45290 (0.0010) [2023-10-10 06:32:16,356][53268] Updated weights for policy 1, policy_version 45300 (0.0009) [2023-10-10 06:32:16,728][53268] Updated weights for policy 1, policy_version 45310 (0.0010) [2023-10-10 06:32:16,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 92798976. Throughput: 0: 1692.1, 1: 1678.6. Samples: 23207240. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:32:16,784][52050] Avg episode reward: [(0, '20.840'), (1, '19.950')] [2023-10-10 06:32:18,999][53252] Updated weights for policy 0, policy_version 45350 (0.0008) [2023-10-10 06:32:19,369][53252] Updated weights for policy 0, policy_version 45360 (0.0008) [2023-10-10 06:32:19,729][53252] Updated weights for policy 0, policy_version 45370 (0.0008) [2023-10-10 06:32:20,811][53268] Updated weights for policy 1, policy_version 45320 (0.0008) [2023-10-10 06:32:21,172][53268] Updated weights for policy 1, policy_version 45330 (0.0009) [2023-10-10 06:32:21,542][53268] Updated weights for policy 1, policy_version 45340 (0.0009) [2023-10-10 06:32:21,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 92897280. Throughput: 0: 1678.6, 1: 1684.8. Samples: 23227330. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:32:21,784][52050] Avg episode reward: [(0, '21.510'), (1, '19.300')] [2023-10-10 06:32:23,924][53252] Updated weights for policy 0, policy_version 45380 (0.0008) [2023-10-10 06:32:24,306][53252] Updated weights for policy 0, policy_version 45390 (0.0009) [2023-10-10 06:32:24,678][53252] Updated weights for policy 0, policy_version 45400 (0.0007) [2023-10-10 06:32:25,623][53268] Updated weights for policy 1, policy_version 45350 (0.0011) [2023-10-10 06:32:25,989][53268] Updated weights for policy 1, policy_version 45360 (0.0009) [2023-10-10 06:32:26,350][53268] Updated weights for policy 1, policy_version 45370 (0.0007) [2023-10-10 06:32:26,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 92962816. Throughput: 0: 1705.0, 1: 1668.9. Samples: 23247588. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 06:32:26,784][52050] Avg episode reward: [(0, '21.600'), (1, '19.260')] [2023-10-10 06:32:28,455][53252] Updated weights for policy 0, policy_version 45410 (0.0010) [2023-10-10 06:32:28,835][53252] Updated weights for policy 0, policy_version 45420 (0.0008) [2023-10-10 06:32:29,198][53252] Updated weights for policy 0, policy_version 45430 (0.0007) [2023-10-10 06:32:29,568][53252] Updated weights for policy 0, policy_version 45440 (0.0007) [2023-10-10 06:32:30,469][53268] Updated weights for policy 1, policy_version 45380 (0.0009) [2023-10-10 06:32:30,830][53268] Updated weights for policy 1, policy_version 45390 (0.0007) [2023-10-10 06:32:31,204][53268] Updated weights for policy 1, policy_version 45400 (0.0009) [2023-10-10 06:32:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93028352. Throughput: 0: 1685.9, 1: 1685.2. Samples: 23257806. 
Policy #0 lag: (min: 9.0, avg: 15.0, max: 41.0) [2023-10-10 06:32:31,784][52050] Avg episode reward: [(0, '23.630'), (1, '20.610')] [2023-10-10 06:32:33,398][53252] Updated weights for policy 0, policy_version 45450 (0.0007) [2023-10-10 06:32:33,776][53252] Updated weights for policy 0, policy_version 45460 (0.0007) [2023-10-10 06:32:34,145][53252] Updated weights for policy 0, policy_version 45470 (0.0009) [2023-10-10 06:32:35,340][53268] Updated weights for policy 1, policy_version 45410 (0.0010) [2023-10-10 06:32:35,708][53268] Updated weights for policy 1, policy_version 45420 (0.0009) [2023-10-10 06:32:36,069][53268] Updated weights for policy 1, policy_version 45430 (0.0008) [2023-10-10 06:32:36,445][53268] Updated weights for policy 1, policy_version 45440 (0.0007) [2023-10-10 06:32:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93093888. Throughput: 0: 1696.0, 1: 1682.6. Samples: 23278174. Policy #0 lag: (min: 9.0, avg: 15.0, max: 41.0) [2023-10-10 06:32:36,784][52050] Avg episode reward: [(0, '22.680'), (1, '18.890')] [2023-10-10 06:32:38,135][53252] Updated weights for policy 0, policy_version 45480 (0.0007) [2023-10-10 06:32:38,507][53252] Updated weights for policy 0, policy_version 45490 (0.0008) [2023-10-10 06:32:38,881][53252] Updated weights for policy 0, policy_version 45500 (0.0010) [2023-10-10 06:32:40,653][53268] Updated weights for policy 1, policy_version 45450 (0.0009) [2023-10-10 06:32:41,021][53268] Updated weights for policy 1, policy_version 45460 (0.0008) [2023-10-10 06:32:41,382][53268] Updated weights for policy 1, policy_version 45470 (0.0008) [2023-10-10 06:32:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 93159424. Throughput: 0: 1712.4, 1: 1653.6. Samples: 23297982. 
Policy #0 lag: (min: 9.0, avg: 15.0, max: 41.0) [2023-10-10 06:32:41,784][52050] Avg episode reward: [(0, '22.370'), (1, '18.080')] [2023-10-10 06:32:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000045504_46596096.pth... [2023-10-10 06:32:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000045472_46563328.pth... [2023-10-10 06:32:41,828][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000043904_44957696.pth [2023-10-10 06:32:41,831][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000043936_44990464.pth [2023-10-10 06:32:42,983][53252] Updated weights for policy 0, policy_version 45510 (0.0008) [2023-10-10 06:32:43,357][53252] Updated weights for policy 0, policy_version 45520 (0.0007) [2023-10-10 06:32:43,728][53252] Updated weights for policy 0, policy_version 45530 (0.0007) [2023-10-10 06:32:45,627][53268] Updated weights for policy 1, policy_version 45480 (0.0008) [2023-10-10 06:32:45,998][53268] Updated weights for policy 1, policy_version 45490 (0.0008) [2023-10-10 06:32:46,367][53268] Updated weights for policy 1, policy_version 45500 (0.0008) [2023-10-10 06:32:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93224960. Throughput: 0: 1683.6, 1: 1681.3. Samples: 23308334. 
Policy #0 lag: (min: 9.0, avg: 15.0, max: 41.0) [2023-10-10 06:32:46,784][52050] Avg episode reward: [(0, '22.560'), (1, '18.610')] [2023-10-10 06:32:47,695][53252] Updated weights for policy 0, policy_version 45540 (0.0008) [2023-10-10 06:32:48,063][53252] Updated weights for policy 0, policy_version 45550 (0.0008) [2023-10-10 06:32:48,438][53252] Updated weights for policy 0, policy_version 45560 (0.0008) [2023-10-10 06:32:50,403][53268] Updated weights for policy 1, policy_version 45510 (0.0009) [2023-10-10 06:32:50,777][53268] Updated weights for policy 1, policy_version 45520 (0.0009) [2023-10-10 06:32:51,132][53268] Updated weights for policy 1, policy_version 45530 (0.0009) [2023-10-10 06:32:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93290496. Throughput: 0: 1709.4, 1: 1675.6. Samples: 23329020. Policy #0 lag: (min: 9.0, avg: 15.0, max: 41.0) [2023-10-10 06:32:51,784][52050] Avg episode reward: [(0, '20.760'), (1, '20.050')] [2023-10-10 06:32:52,503][53252] Updated weights for policy 0, policy_version 45570 (0.0008) [2023-10-10 06:32:52,877][53252] Updated weights for policy 0, policy_version 45580 (0.0008) [2023-10-10 06:32:53,257][53252] Updated weights for policy 0, policy_version 45590 (0.0010) [2023-10-10 06:32:53,622][53252] Updated weights for policy 0, policy_version 45600 (0.0007) [2023-10-10 06:32:55,207][53268] Updated weights for policy 1, policy_version 45540 (0.0009) [2023-10-10 06:32:55,568][53268] Updated weights for policy 1, policy_version 45550 (0.0008) [2023-10-10 06:32:55,933][53268] Updated weights for policy 1, policy_version 45560 (0.0010) [2023-10-10 06:32:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93356032. Throughput: 0: 1715.2, 1: 1665.2. Samples: 23348904. 
Policy #0 lag: (min: 9.0, avg: 15.0, max: 41.0) [2023-10-10 06:32:56,784][52050] Avg episode reward: [(0, '19.330'), (1, '18.610')] [2023-10-10 06:32:57,617][53252] Updated weights for policy 0, policy_version 45610 (0.0008) [2023-10-10 06:32:58,001][53252] Updated weights for policy 0, policy_version 45620 (0.0007) [2023-10-10 06:32:58,366][53252] Updated weights for policy 0, policy_version 45630 (0.0008) [2023-10-10 06:33:00,027][53268] Updated weights for policy 1, policy_version 45570 (0.0010) [2023-10-10 06:33:00,400][53268] Updated weights for policy 1, policy_version 45580 (0.0010) [2023-10-10 06:33:00,775][53268] Updated weights for policy 1, policy_version 45590 (0.0010) [2023-10-10 06:33:01,142][53268] Updated weights for policy 1, policy_version 45600 (0.0009) [2023-10-10 06:33:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93421568. Throughput: 0: 1695.0, 1: 1680.4. Samples: 23359134. Policy #0 lag: (min: 6.0, avg: 13.2, max: 38.0) [2023-10-10 06:33:01,784][52050] Avg episode reward: [(0, '19.510'), (1, '18.970')] [2023-10-10 06:33:02,489][53252] Updated weights for policy 0, policy_version 45640 (0.0008) [2023-10-10 06:33:02,869][53252] Updated weights for policy 0, policy_version 45650 (0.0008) [2023-10-10 06:33:03,240][53252] Updated weights for policy 0, policy_version 45660 (0.0008) [2023-10-10 06:33:05,149][53268] Updated weights for policy 1, policy_version 45610 (0.0009) [2023-10-10 06:33:05,515][53268] Updated weights for policy 1, policy_version 45620 (0.0008) [2023-10-10 06:33:05,870][53268] Updated weights for policy 1, policy_version 45630 (0.0010) [2023-10-10 06:33:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93487104. Throughput: 0: 1710.4, 1: 1674.1. Samples: 23379634. 
Policy #0 lag: (min: 6.0, avg: 13.2, max: 38.0) [2023-10-10 06:33:06,784][52050] Avg episode reward: [(0, '20.590'), (1, '18.310')] [2023-10-10 06:33:07,233][53252] Updated weights for policy 0, policy_version 45670 (0.0009) [2023-10-10 06:33:07,598][53252] Updated weights for policy 0, policy_version 45680 (0.0010) [2023-10-10 06:33:07,970][53252] Updated weights for policy 0, policy_version 45690 (0.0010) [2023-10-10 06:33:09,716][53268] Updated weights for policy 1, policy_version 45640 (0.0008) [2023-10-10 06:33:10,084][53268] Updated weights for policy 1, policy_version 45650 (0.0009) [2023-10-10 06:33:10,450][53268] Updated weights for policy 1, policy_version 45660 (0.0009) [2023-10-10 06:33:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 93552640. Throughput: 0: 1705.2, 1: 1672.1. Samples: 23399566. Policy #0 lag: (min: 6.0, avg: 13.2, max: 38.0) [2023-10-10 06:33:11,784][52050] Avg episode reward: [(0, '22.130'), (1, '17.070')] [2023-10-10 06:33:11,949][53252] Updated weights for policy 0, policy_version 45700 (0.0010) [2023-10-10 06:33:12,342][53252] Updated weights for policy 0, policy_version 45710 (0.0008) [2023-10-10 06:33:12,724][53252] Updated weights for policy 0, policy_version 45720 (0.0008) [2023-10-10 06:33:14,468][53268] Updated weights for policy 1, policy_version 45670 (0.0009) [2023-10-10 06:33:14,834][53268] Updated weights for policy 1, policy_version 45680 (0.0012) [2023-10-10 06:33:15,200][53268] Updated weights for policy 1, policy_version 45690 (0.0011) [2023-10-10 06:33:16,780][53252] Updated weights for policy 0, policy_version 45730 (0.0008) [2023-10-10 06:33:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 93618176. Throughput: 0: 1696.3, 1: 1682.4. Samples: 23409846. 
Policy #0 lag: (min: 6.0, avg: 13.2, max: 38.0) [2023-10-10 06:33:16,784][52050] Avg episode reward: [(0, '21.360'), (1, '18.470')] [2023-10-10 06:33:17,144][53252] Updated weights for policy 0, policy_version 45740 (0.0008) [2023-10-10 06:33:17,517][53252] Updated weights for policy 0, policy_version 45750 (0.0008) [2023-10-10 06:33:17,894][53252] Updated weights for policy 0, policy_version 45760 (0.0008) [2023-10-10 06:33:19,319][53268] Updated weights for policy 1, policy_version 45700 (0.0008) [2023-10-10 06:33:19,683][53268] Updated weights for policy 1, policy_version 45710 (0.0007) [2023-10-10 06:33:20,058][53268] Updated weights for policy 1, policy_version 45720 (0.0008) [2023-10-10 06:33:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 93683712. Throughput: 0: 1696.7, 1: 1667.9. Samples: 23429582. Policy #0 lag: (min: 6.0, avg: 13.2, max: 38.0) [2023-10-10 06:33:21,784][52050] Avg episode reward: [(0, '21.420'), (1, '18.690')] [2023-10-10 06:33:21,941][53252] Updated weights for policy 0, policy_version 45770 (0.0007) [2023-10-10 06:33:22,320][53252] Updated weights for policy 0, policy_version 45780 (0.0010) [2023-10-10 06:33:22,699][53252] Updated weights for policy 0, policy_version 45790 (0.0008) [2023-10-10 06:33:24,269][53268] Updated weights for policy 1, policy_version 45730 (0.0009) [2023-10-10 06:33:24,631][53268] Updated weights for policy 1, policy_version 45740 (0.0008) [2023-10-10 06:33:24,998][53268] Updated weights for policy 1, policy_version 45750 (0.0010) [2023-10-10 06:33:25,360][53268] Updated weights for policy 1, policy_version 45760 (0.0010) [2023-10-10 06:33:26,704][53252] Updated weights for policy 0, policy_version 45800 (0.0010) [2023-10-10 06:33:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93749248. Throughput: 0: 1699.2, 1: 1680.0. Samples: 23450044. 
Policy #0 lag: (min: 6.0, avg: 13.2, max: 38.0) [2023-10-10 06:33:26,784][52050] Avg episode reward: [(0, '21.430'), (1, '19.360')] [2023-10-10 06:33:27,072][53252] Updated weights for policy 0, policy_version 45810 (0.0008) [2023-10-10 06:33:27,448][53252] Updated weights for policy 0, policy_version 45820 (0.0009) [2023-10-10 06:33:29,386][53268] Updated weights for policy 1, policy_version 45770 (0.0008) [2023-10-10 06:33:29,750][53268] Updated weights for policy 1, policy_version 45780 (0.0009) [2023-10-10 06:33:30,116][53268] Updated weights for policy 1, policy_version 45790 (0.0008) [2023-10-10 06:33:31,533][53252] Updated weights for policy 0, policy_version 45830 (0.0007) [2023-10-10 06:33:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93814784. Throughput: 0: 1694.9, 1: 1681.9. Samples: 23460290. Policy #0 lag: (min: 23.0, avg: 23.6, max: 38.0) [2023-10-10 06:33:31,784][52050] Avg episode reward: [(0, '21.070'), (1, '20.570')] [2023-10-10 06:33:31,892][53252] Updated weights for policy 0, policy_version 45840 (0.0011) [2023-10-10 06:33:32,265][53252] Updated weights for policy 0, policy_version 45850 (0.0008) [2023-10-10 06:33:34,354][53268] Updated weights for policy 1, policy_version 45800 (0.0010) [2023-10-10 06:33:34,731][53268] Updated weights for policy 1, policy_version 45810 (0.0009) [2023-10-10 06:33:35,101][53268] Updated weights for policy 1, policy_version 45820 (0.0009) [2023-10-10 06:33:36,279][53252] Updated weights for policy 0, policy_version 45860 (0.0008) [2023-10-10 06:33:36,654][53252] Updated weights for policy 0, policy_version 45870 (0.0007) [2023-10-10 06:33:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93880320. Throughput: 0: 1699.2, 1: 1656.7. Samples: 23480036. 
Policy #0 lag: (min: 23.0, avg: 23.6, max: 38.0) [2023-10-10 06:33:36,784][52050] Avg episode reward: [(0, '20.150'), (1, '21.220')] [2023-10-10 06:33:37,024][53252] Updated weights for policy 0, policy_version 45880 (0.0009) [2023-10-10 06:33:39,254][53268] Updated weights for policy 1, policy_version 45830 (0.0010) [2023-10-10 06:33:39,622][53268] Updated weights for policy 1, policy_version 45840 (0.0009) [2023-10-10 06:33:39,990][53268] Updated weights for policy 1, policy_version 45850 (0.0009) [2023-10-10 06:33:41,084][53252] Updated weights for policy 0, policy_version 45890 (0.0010) [2023-10-10 06:33:41,466][53252] Updated weights for policy 0, policy_version 45900 (0.0010) [2023-10-10 06:33:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93945856. Throughput: 0: 1686.1, 1: 1681.1. Samples: 23500426. Policy #0 lag: (min: 23.0, avg: 23.6, max: 38.0) [2023-10-10 06:33:41,784][52050] Avg episode reward: [(0, '20.440'), (1, '21.000')] [2023-10-10 06:33:41,833][53252] Updated weights for policy 0, policy_version 45910 (0.0007) [2023-10-10 06:33:42,199][53252] Updated weights for policy 0, policy_version 45920 (0.0009) [2023-10-10 06:33:44,113][53268] Updated weights for policy 1, policy_version 45860 (0.0009) [2023-10-10 06:33:44,479][53268] Updated weights for policy 1, policy_version 45870 (0.0008) [2023-10-10 06:33:44,848][53268] Updated weights for policy 1, policy_version 45880 (0.0009) [2023-10-10 06:33:46,241][53252] Updated weights for policy 0, policy_version 45930 (0.0007) [2023-10-10 06:33:46,613][53252] Updated weights for policy 0, policy_version 45940 (0.0007) [2023-10-10 06:33:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94011392. Throughput: 0: 1695.6, 1: 1681.1. Samples: 23511088. 
Policy #0 lag: (min: 23.0, avg: 23.6, max: 38.0) [2023-10-10 06:33:46,784][52050] Avg episode reward: [(0, '21.520'), (1, '20.170')] [2023-10-10 06:33:46,995][53252] Updated weights for policy 0, policy_version 45950 (0.0008) [2023-10-10 06:33:48,957][53268] Updated weights for policy 1, policy_version 45890 (0.0010) [2023-10-10 06:33:49,312][53268] Updated weights for policy 1, policy_version 45900 (0.0009) [2023-10-10 06:33:49,675][53268] Updated weights for policy 1, policy_version 45910 (0.0008) [2023-10-10 06:33:50,046][53268] Updated weights for policy 1, policy_version 45920 (0.0009) [2023-10-10 06:33:50,945][53252] Updated weights for policy 0, policy_version 45960 (0.0010) [2023-10-10 06:33:51,312][53252] Updated weights for policy 0, policy_version 45970 (0.0010) [2023-10-10 06:33:51,690][53252] Updated weights for policy 0, policy_version 45980 (0.0009) [2023-10-10 06:33:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94076928. Throughput: 0: 1693.4, 1: 1659.5. Samples: 23530512. Policy #0 lag: (min: 23.0, avg: 23.6, max: 38.0) [2023-10-10 06:33:51,784][52050] Avg episode reward: [(0, '21.450'), (1, '21.390')] [2023-10-10 06:33:53,802][53268] Updated weights for policy 1, policy_version 45930 (0.0007) [2023-10-10 06:33:54,173][53268] Updated weights for policy 1, policy_version 45940 (0.0007) [2023-10-10 06:33:54,544][53268] Updated weights for policy 1, policy_version 45950 (0.0009) [2023-10-10 06:33:55,632][53252] Updated weights for policy 0, policy_version 45990 (0.0007) [2023-10-10 06:33:55,997][53252] Updated weights for policy 0, policy_version 46000 (0.0007) [2023-10-10 06:33:56,370][53252] Updated weights for policy 0, policy_version 46010 (0.0009) [2023-10-10 06:33:56,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 94175232. Throughput: 0: 1669.2, 1: 1685.5. Samples: 23550530. 
Policy #0 lag: (min: 23.0, avg: 23.6, max: 38.0) [2023-10-10 06:33:56,784][52050] Avg episode reward: [(0, '21.440'), (1, '19.810')] [2023-10-10 06:33:58,523][53268] Updated weights for policy 1, policy_version 45960 (0.0009) [2023-10-10 06:33:58,883][53268] Updated weights for policy 1, policy_version 45970 (0.0010) [2023-10-10 06:33:59,258][53268] Updated weights for policy 1, policy_version 45980 (0.0011) [2023-10-10 06:34:00,630][53252] Updated weights for policy 0, policy_version 46020 (0.0007) [2023-10-10 06:34:01,031][53252] Updated weights for policy 0, policy_version 46030 (0.0008) [2023-10-10 06:34:01,403][53252] Updated weights for policy 0, policy_version 46040 (0.0010) [2023-10-10 06:34:01,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 94240768. Throughput: 0: 1690.9, 1: 1670.8. Samples: 23561122. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:01,784][52050] Avg episode reward: [(0, '21.110'), (1, '19.760')] [2023-10-10 06:34:03,292][53268] Updated weights for policy 1, policy_version 45990 (0.0009) [2023-10-10 06:34:03,672][53268] Updated weights for policy 1, policy_version 46000 (0.0011) [2023-10-10 06:34:04,041][53268] Updated weights for policy 1, policy_version 46010 (0.0009) [2023-10-10 06:34:05,317][53252] Updated weights for policy 0, policy_version 46050 (0.0007) [2023-10-10 06:34:05,687][53252] Updated weights for policy 0, policy_version 46060 (0.0007) [2023-10-10 06:34:06,054][53252] Updated weights for policy 0, policy_version 46070 (0.0007) [2023-10-10 06:34:06,423][53252] Updated weights for policy 0, policy_version 46080 (0.0007) [2023-10-10 06:34:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 94306304. Throughput: 0: 1692.0, 1: 1684.7. Samples: 23581532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:06,784][52050] Avg episode reward: [(0, '21.960'), (1, '19.590')] [2023-10-10 06:34:08,048][53268] Updated weights for policy 1, policy_version 46020 (0.0008) [2023-10-10 06:34:08,413][53268] Updated weights for policy 1, policy_version 46030 (0.0007) [2023-10-10 06:34:08,781][53268] Updated weights for policy 1, policy_version 46040 (0.0009) [2023-10-10 06:34:10,416][53252] Updated weights for policy 0, policy_version 46090 (0.0010) [2023-10-10 06:34:10,795][53252] Updated weights for policy 0, policy_version 46100 (0.0010) [2023-10-10 06:34:11,170][53252] Updated weights for policy 0, policy_version 46110 (0.0009) [2023-10-10 06:34:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94371840. Throughput: 0: 1667.0, 1: 1699.1. Samples: 23601520. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:11,785][52050] Avg episode reward: [(0, '22.240'), (1, '18.440')] [2023-10-10 06:34:12,917][53268] Updated weights for policy 1, policy_version 46050 (0.0010) [2023-10-10 06:34:13,277][53268] Updated weights for policy 1, policy_version 46060 (0.0008) [2023-10-10 06:34:13,642][53268] Updated weights for policy 1, policy_version 46070 (0.0007) [2023-10-10 06:34:14,007][53268] Updated weights for policy 1, policy_version 46080 (0.0007) [2023-10-10 06:34:15,392][53252] Updated weights for policy 0, policy_version 46120 (0.0010) [2023-10-10 06:34:15,764][53252] Updated weights for policy 0, policy_version 46130 (0.0010) [2023-10-10 06:34:16,124][53252] Updated weights for policy 0, policy_version 46140 (0.0008) [2023-10-10 06:34:16,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94437376. Throughput: 0: 1699.4, 1: 1672.4. Samples: 23612022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:16,785][52050] Avg episode reward: [(0, '21.780'), (1, '19.600')] [2023-10-10 06:34:18,038][53268] Updated weights for policy 1, policy_version 46090 (0.0009) [2023-10-10 06:34:18,404][53268] Updated weights for policy 1, policy_version 46100 (0.0009) [2023-10-10 06:34:18,782][53268] Updated weights for policy 1, policy_version 46110 (0.0008) [2023-10-10 06:34:20,069][53252] Updated weights for policy 0, policy_version 46150 (0.0009) [2023-10-10 06:34:20,433][53252] Updated weights for policy 0, policy_version 46160 (0.0008) [2023-10-10 06:34:20,808][53252] Updated weights for policy 0, policy_version 46170 (0.0007) [2023-10-10 06:34:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94502912. Throughput: 0: 1680.3, 1: 1700.6. Samples: 23632178. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:21,784][52050] Avg episode reward: [(0, '20.430'), (1, '18.900')] [2023-10-10 06:34:22,875][53268] Updated weights for policy 1, policy_version 46120 (0.0008) [2023-10-10 06:34:23,252][53268] Updated weights for policy 1, policy_version 46130 (0.0009) [2023-10-10 06:34:23,620][53268] Updated weights for policy 1, policy_version 46140 (0.0010) [2023-10-10 06:34:24,576][53252] Updated weights for policy 0, policy_version 46180 (0.0007) [2023-10-10 06:34:24,939][53252] Updated weights for policy 0, policy_version 46190 (0.0010) [2023-10-10 06:34:25,311][53252] Updated weights for policy 0, policy_version 46200 (0.0009) [2023-10-10 06:34:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94568448. Throughput: 0: 1677.2, 1: 1702.0. Samples: 23652488. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:26,784][52050] Avg episode reward: [(0, '21.050'), (1, '20.800')] [2023-10-10 06:34:27,548][53268] Updated weights for policy 1, policy_version 46150 (0.0011) [2023-10-10 06:34:27,915][53268] Updated weights for policy 1, policy_version 46160 (0.0010) [2023-10-10 06:34:28,274][53268] Updated weights for policy 1, policy_version 46170 (0.0009) [2023-10-10 06:34:29,479][53252] Updated weights for policy 0, policy_version 46210 (0.0010) [2023-10-10 06:34:29,854][53252] Updated weights for policy 0, policy_version 46220 (0.0010) [2023-10-10 06:34:30,232][53252] Updated weights for policy 0, policy_version 46230 (0.0011) [2023-10-10 06:34:30,603][53252] Updated weights for policy 0, policy_version 46240 (0.0009) [2023-10-10 06:34:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94633984. Throughput: 0: 1697.5, 1: 1676.0. Samples: 23662898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:34:31,784][52050] Avg episode reward: [(0, '20.860'), (1, '20.100')] [2023-10-10 06:34:32,404][53268] Updated weights for policy 1, policy_version 46180 (0.0007) [2023-10-10 06:34:32,765][53268] Updated weights for policy 1, policy_version 46190 (0.0009) [2023-10-10 06:34:33,128][53268] Updated weights for policy 1, policy_version 46200 (0.0010) [2023-10-10 06:34:34,710][53252] Updated weights for policy 0, policy_version 46250 (0.0010) [2023-10-10 06:34:35,085][53252] Updated weights for policy 0, policy_version 46260 (0.0009) [2023-10-10 06:34:35,463][53252] Updated weights for policy 0, policy_version 46270 (0.0007) [2023-10-10 06:34:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94699520. Throughput: 0: 1676.0, 1: 1703.2. Samples: 23682578. 
Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:34:36,784][52050] Avg episode reward: [(0, '22.280'), (1, '19.200')] [2023-10-10 06:34:37,018][53268] Updated weights for policy 1, policy_version 46210 (0.0009) [2023-10-10 06:34:37,382][53268] Updated weights for policy 1, policy_version 46220 (0.0008) [2023-10-10 06:34:37,759][53268] Updated weights for policy 1, policy_version 46230 (0.0010) [2023-10-10 06:34:38,124][53268] Updated weights for policy 1, policy_version 46240 (0.0008) [2023-10-10 06:34:39,521][53252] Updated weights for policy 0, policy_version 46280 (0.0007) [2023-10-10 06:34:39,900][53252] Updated weights for policy 0, policy_version 46290 (0.0007) [2023-10-10 06:34:40,273][53252] Updated weights for policy 0, policy_version 46300 (0.0009) [2023-10-10 06:34:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94765056. Throughput: 0: 1694.0, 1: 1691.9. Samples: 23702894. Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:34:41,785][52050] Avg episode reward: [(0, '21.390'), (1, '19.610')] [2023-10-10 06:34:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000046304_47415296.pth... [2023-10-10 06:34:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000046240_47349760.pth... 
[2023-10-10 06:34:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000044672_45744128.pth [2023-10-10 06:34:41,837][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000044704_45776896.pth [2023-10-10 06:34:42,376][53268] Updated weights for policy 1, policy_version 46250 (0.0007) [2023-10-10 06:34:42,747][53268] Updated weights for policy 1, policy_version 46260 (0.0009) [2023-10-10 06:34:43,116][53268] Updated weights for policy 1, policy_version 46270 (0.0009) [2023-10-10 06:34:44,311][53252] Updated weights for policy 0, policy_version 46310 (0.0008) [2023-10-10 06:34:44,691][53252] Updated weights for policy 0, policy_version 46320 (0.0008) [2023-10-10 06:34:45,056][53252] Updated weights for policy 0, policy_version 46330 (0.0009) [2023-10-10 06:34:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 94830592. Throughput: 0: 1697.2, 1: 1677.7. Samples: 23712994. Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:34:46,784][52050] Avg episode reward: [(0, '24.630'), (1, '19.580')] [2023-10-10 06:34:46,784][52846] Saving new best policy, reward=24.630! [2023-10-10 06:34:47,142][53268] Updated weights for policy 1, policy_version 46280 (0.0008) [2023-10-10 06:34:47,506][53268] Updated weights for policy 1, policy_version 46290 (0.0008) [2023-10-10 06:34:47,872][53268] Updated weights for policy 1, policy_version 46300 (0.0010) [2023-10-10 06:34:49,058][53252] Updated weights for policy 0, policy_version 46340 (0.0008) [2023-10-10 06:34:49,433][53252] Updated weights for policy 0, policy_version 46350 (0.0007) [2023-10-10 06:34:49,803][53252] Updated weights for policy 0, policy_version 46360 (0.0007) [2023-10-10 06:34:51,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 94896128. Throughput: 0: 1678.7, 1: 1684.9. Samples: 23732894. 
Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:34:51,784][52050] Avg episode reward: [(0, '24.030'), (1, '19.970')] [2023-10-10 06:34:52,081][53268] Updated weights for policy 1, policy_version 46310 (0.0010) [2023-10-10 06:34:52,441][53268] Updated weights for policy 1, policy_version 46320 (0.0010) [2023-10-10 06:34:52,814][53268] Updated weights for policy 1, policy_version 46330 (0.0010) [2023-10-10 06:34:54,009][53252] Updated weights for policy 0, policy_version 46370 (0.0010) [2023-10-10 06:34:54,389][53252] Updated weights for policy 0, policy_version 46380 (0.0009) [2023-10-10 06:34:54,756][53252] Updated weights for policy 0, policy_version 46390 (0.0009) [2023-10-10 06:34:55,119][53252] Updated weights for policy 0, policy_version 46400 (0.0010) [2023-10-10 06:34:56,760][53268] Updated weights for policy 1, policy_version 46340 (0.0008) [2023-10-10 06:34:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94961664. Throughput: 0: 1692.3, 1: 1684.0. Samples: 23753452. Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:34:56,784][52050] Avg episode reward: [(0, '22.640'), (1, '20.120')] [2023-10-10 06:34:57,125][53268] Updated weights for policy 1, policy_version 46350 (0.0008) [2023-10-10 06:34:57,490][53268] Updated weights for policy 1, policy_version 46360 (0.0009) [2023-10-10 06:34:59,162][53252] Updated weights for policy 0, policy_version 46410 (0.0008) [2023-10-10 06:34:59,532][53252] Updated weights for policy 0, policy_version 46420 (0.0007) [2023-10-10 06:34:59,897][53252] Updated weights for policy 0, policy_version 46430 (0.0010) [2023-10-10 06:35:01,611][53268] Updated weights for policy 1, policy_version 46370 (0.0009) [2023-10-10 06:35:01,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 95027200. Throughput: 0: 1676.5, 1: 1682.8. Samples: 23763188. 
Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:35:01,784][52050] Avg episode reward: [(0, '22.560'), (1, '19.340')] [2023-10-10 06:35:01,985][53268] Updated weights for policy 1, policy_version 46380 (0.0009) [2023-10-10 06:35:02,355][53268] Updated weights for policy 1, policy_version 46390 (0.0008) [2023-10-10 06:35:02,727][53268] Updated weights for policy 1, policy_version 46400 (0.0009) [2023-10-10 06:35:03,907][53252] Updated weights for policy 0, policy_version 46440 (0.0008) [2023-10-10 06:35:04,276][53252] Updated weights for policy 0, policy_version 46450 (0.0007) [2023-10-10 06:35:04,639][53252] Updated weights for policy 0, policy_version 46460 (0.0009) [2023-10-10 06:35:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95092736. Throughput: 0: 1671.5, 1: 1684.8. Samples: 23783208. Policy #0 lag: (min: 14.0, avg: 14.2, max: 24.0) [2023-10-10 06:35:06,784][52050] Avg episode reward: [(0, '22.690'), (1, '20.920')] [2023-10-10 06:35:06,806][53268] Updated weights for policy 1, policy_version 46410 (0.0007) [2023-10-10 06:35:07,168][53268] Updated weights for policy 1, policy_version 46420 (0.0008) [2023-10-10 06:35:07,538][53268] Updated weights for policy 1, policy_version 46430 (0.0008) [2023-10-10 06:35:08,555][53252] Updated weights for policy 0, policy_version 46470 (0.0008) [2023-10-10 06:35:08,918][53252] Updated weights for policy 0, policy_version 46480 (0.0009) [2023-10-10 06:35:09,296][53252] Updated weights for policy 0, policy_version 46490 (0.0007) [2023-10-10 06:35:11,613][53268] Updated weights for policy 1, policy_version 46440 (0.0010) [2023-10-10 06:35:11,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95158272. Throughput: 0: 1682.1, 1: 1683.9. Samples: 23803958. 
Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:11,785][52050] Avg episode reward: [(0, '23.050'), (1, '18.590')] [2023-10-10 06:35:11,989][53268] Updated weights for policy 1, policy_version 46450 (0.0009) [2023-10-10 06:35:12,356][53268] Updated weights for policy 1, policy_version 46460 (0.0010) [2023-10-10 06:35:13,458][53252] Updated weights for policy 0, policy_version 46500 (0.0012) [2023-10-10 06:35:13,829][53252] Updated weights for policy 0, policy_version 46510 (0.0008) [2023-10-10 06:35:14,197][53252] Updated weights for policy 0, policy_version 46520 (0.0008) [2023-10-10 06:35:16,335][53268] Updated weights for policy 1, policy_version 46470 (0.0011) [2023-10-10 06:35:16,703][53268] Updated weights for policy 1, policy_version 46480 (0.0008) [2023-10-10 06:35:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 95223808. Throughput: 0: 1654.4, 1: 1687.6. Samples: 23813288. Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:16,784][52050] Avg episode reward: [(0, '21.270'), (1, '19.320')] [2023-10-10 06:35:17,069][53268] Updated weights for policy 1, policy_version 46490 (0.0008) [2023-10-10 06:35:18,248][53252] Updated weights for policy 0, policy_version 46530 (0.0007) [2023-10-10 06:35:18,621][53252] Updated weights for policy 0, policy_version 46540 (0.0009) [2023-10-10 06:35:18,990][53252] Updated weights for policy 0, policy_version 46550 (0.0008) [2023-10-10 06:35:19,366][53252] Updated weights for policy 0, policy_version 46560 (0.0008) [2023-10-10 06:35:21,147][53268] Updated weights for policy 1, policy_version 46500 (0.0009) [2023-10-10 06:35:21,512][53268] Updated weights for policy 1, policy_version 46510 (0.0008) [2023-10-10 06:35:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95289344. Throughput: 0: 1676.8, 1: 1686.2. Samples: 23833912. 
Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:21,784][52050] Avg episode reward: [(0, '22.200'), (1, '20.670')] [2023-10-10 06:35:21,876][53268] Updated weights for policy 1, policy_version 46520 (0.0007) [2023-10-10 06:35:23,358][53252] Updated weights for policy 0, policy_version 46570 (0.0009) [2023-10-10 06:35:23,733][53252] Updated weights for policy 0, policy_version 46580 (0.0009) [2023-10-10 06:35:24,113][53252] Updated weights for policy 0, policy_version 46590 (0.0009) [2023-10-10 06:35:25,958][53268] Updated weights for policy 1, policy_version 46530 (0.0010) [2023-10-10 06:35:26,322][53268] Updated weights for policy 1, policy_version 46540 (0.0010) [2023-10-10 06:35:26,690][53268] Updated weights for policy 1, policy_version 46550 (0.0008) [2023-10-10 06:35:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 95354880. Throughput: 0: 1690.8, 1: 1682.8. Samples: 23854704. Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:26,784][52050] Avg episode reward: [(0, '22.710'), (1, '18.900')] [2023-10-10 06:35:27,062][53268] Updated weights for policy 1, policy_version 46560 (0.0008) [2023-10-10 06:35:27,926][53252] Updated weights for policy 0, policy_version 46600 (0.0007) [2023-10-10 06:35:28,299][53252] Updated weights for policy 0, policy_version 46610 (0.0008) [2023-10-10 06:35:28,661][53252] Updated weights for policy 0, policy_version 46620 (0.0010) [2023-10-10 06:35:31,084][53268] Updated weights for policy 1, policy_version 46570 (0.0008) [2023-10-10 06:35:31,445][53268] Updated weights for policy 1, policy_version 46580 (0.0009) [2023-10-10 06:35:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95420416. Throughput: 0: 1668.8, 1: 1691.5. Samples: 23864208. 
Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:31,784][52050] Avg episode reward: [(0, '23.180'), (1, '18.650')] [2023-10-10 06:35:31,807][53268] Updated weights for policy 1, policy_version 46590 (0.0007) [2023-10-10 06:35:32,734][53252] Updated weights for policy 0, policy_version 46630 (0.0010) [2023-10-10 06:35:33,102][53252] Updated weights for policy 0, policy_version 46640 (0.0010) [2023-10-10 06:35:33,471][53252] Updated weights for policy 0, policy_version 46650 (0.0010) [2023-10-10 06:35:35,740][53268] Updated weights for policy 1, policy_version 46600 (0.0009) [2023-10-10 06:35:36,106][53268] Updated weights for policy 1, policy_version 46610 (0.0007) [2023-10-10 06:35:36,469][53268] Updated weights for policy 1, policy_version 46620 (0.0010) [2023-10-10 06:35:36,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95518720. Throughput: 0: 1685.6, 1: 1698.1. Samples: 23885158. Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:36,784][52050] Avg episode reward: [(0, '21.370'), (1, '18.440')] [2023-10-10 06:35:37,446][53252] Updated weights for policy 0, policy_version 46660 (0.0010) [2023-10-10 06:35:37,818][53252] Updated weights for policy 0, policy_version 46670 (0.0011) [2023-10-10 06:35:38,195][53252] Updated weights for policy 0, policy_version 46680 (0.0008) [2023-10-10 06:35:40,430][53268] Updated weights for policy 1, policy_version 46630 (0.0009) [2023-10-10 06:35:40,797][53268] Updated weights for policy 1, policy_version 46640 (0.0009) [2023-10-10 06:35:41,150][53268] Updated weights for policy 1, policy_version 46650 (0.0009) [2023-10-10 06:35:41,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95584256. Throughput: 0: 1692.4, 1: 1675.1. Samples: 23904990. 
Policy #0 lag: (min: 21.0, avg: 28.4, max: 53.0) [2023-10-10 06:35:41,785][52050] Avg episode reward: [(0, '23.030'), (1, '18.090')] [2023-10-10 06:35:42,469][53252] Updated weights for policy 0, policy_version 46690 (0.0009) [2023-10-10 06:35:42,849][53252] Updated weights for policy 0, policy_version 46700 (0.0009) [2023-10-10 06:35:43,222][53252] Updated weights for policy 0, policy_version 46710 (0.0007) [2023-10-10 06:35:43,597][53252] Updated weights for policy 0, policy_version 46720 (0.0010) [2023-10-10 06:35:45,073][53268] Updated weights for policy 1, policy_version 46660 (0.0009) [2023-10-10 06:35:45,443][53268] Updated weights for policy 1, policy_version 46670 (0.0010) [2023-10-10 06:35:45,805][53268] Updated weights for policy 1, policy_version 46680 (0.0007) [2023-10-10 06:35:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95649792. Throughput: 0: 1675.9, 1: 1702.0. Samples: 23915194. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:35:46,784][52050] Avg episode reward: [(0, '22.410'), (1, '18.140')] [2023-10-10 06:35:47,619][53252] Updated weights for policy 0, policy_version 46730 (0.0009) [2023-10-10 06:35:47,988][53252] Updated weights for policy 0, policy_version 46740 (0.0008) [2023-10-10 06:35:48,357][53252] Updated weights for policy 0, policy_version 46750 (0.0008) [2023-10-10 06:35:49,959][53268] Updated weights for policy 1, policy_version 46690 (0.0010) [2023-10-10 06:35:50,311][53268] Updated weights for policy 1, policy_version 46700 (0.0008) [2023-10-10 06:35:50,685][53268] Updated weights for policy 1, policy_version 46710 (0.0007) [2023-10-10 06:35:51,048][53268] Updated weights for policy 1, policy_version 46720 (0.0009) [2023-10-10 06:35:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95715328. Throughput: 0: 1695.1, 1: 1693.2. Samples: 23935684. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:35:51,784][52050] Avg episode reward: [(0, '20.420'), (1, '19.370')] [2023-10-10 06:35:52,401][53252] Updated weights for policy 0, policy_version 46760 (0.0010) [2023-10-10 06:35:52,774][53252] Updated weights for policy 0, policy_version 46770 (0.0010) [2023-10-10 06:35:53,145][53252] Updated weights for policy 0, policy_version 46780 (0.0010) [2023-10-10 06:35:54,993][53268] Updated weights for policy 1, policy_version 46730 (0.0009) [2023-10-10 06:35:55,355][53268] Updated weights for policy 1, policy_version 46740 (0.0007) [2023-10-10 06:35:55,719][53268] Updated weights for policy 1, policy_version 46750 (0.0007) [2023-10-10 06:35:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95780864. Throughput: 0: 1694.3, 1: 1673.3. Samples: 23955500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:35:56,784][52050] Avg episode reward: [(0, '20.740'), (1, '19.970')] [2023-10-10 06:35:57,336][53252] Updated weights for policy 0, policy_version 46790 (0.0009) [2023-10-10 06:35:57,696][53252] Updated weights for policy 0, policy_version 46800 (0.0011) [2023-10-10 06:35:58,070][53252] Updated weights for policy 0, policy_version 46810 (0.0009) [2023-10-10 06:35:59,890][53268] Updated weights for policy 1, policy_version 46760 (0.0009) [2023-10-10 06:36:00,262][53268] Updated weights for policy 1, policy_version 46770 (0.0011) [2023-10-10 06:36:00,632][53268] Updated weights for policy 1, policy_version 46780 (0.0009) [2023-10-10 06:36:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 95846400. Throughput: 0: 1686.6, 1: 1700.6. Samples: 23965710. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:36:01,784][52050] Avg episode reward: [(0, '20.460'), (1, '19.170')] [2023-10-10 06:36:02,039][53252] Updated weights for policy 0, policy_version 46820 (0.0007) [2023-10-10 06:36:02,405][53252] Updated weights for policy 0, policy_version 46830 (0.0009) [2023-10-10 06:36:02,778][53252] Updated weights for policy 0, policy_version 46840 (0.0008) [2023-10-10 06:36:04,808][53268] Updated weights for policy 1, policy_version 46790 (0.0008) [2023-10-10 06:36:05,170][53268] Updated weights for policy 1, policy_version 46800 (0.0007) [2023-10-10 06:36:05,544][53268] Updated weights for policy 1, policy_version 46810 (0.0007) [2023-10-10 06:36:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 95911936. Throughput: 0: 1692.0, 1: 1684.4. Samples: 23985850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:36:06,784][52050] Avg episode reward: [(0, '21.750'), (1, '18.720')] [2023-10-10 06:36:06,806][53252] Updated weights for policy 0, policy_version 46850 (0.0007) [2023-10-10 06:36:07,184][53252] Updated weights for policy 0, policy_version 46860 (0.0007) [2023-10-10 06:36:07,551][53252] Updated weights for policy 0, policy_version 46870 (0.0010) [2023-10-10 06:36:07,917][53252] Updated weights for policy 0, policy_version 46880 (0.0010) [2023-10-10 06:36:09,325][53268] Updated weights for policy 1, policy_version 46820 (0.0009) [2023-10-10 06:36:09,693][53268] Updated weights for policy 1, policy_version 46830 (0.0009) [2023-10-10 06:36:10,051][53268] Updated weights for policy 1, policy_version 46840 (0.0008) [2023-10-10 06:36:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 95977472. Throughput: 0: 1684.5, 1: 1681.4. Samples: 24006172. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:36:11,784][52050] Avg episode reward: [(0, '20.680'), (1, '21.560')] [2023-10-10 06:36:12,050][53252] Updated weights for policy 0, policy_version 46890 (0.0010) [2023-10-10 06:36:12,425][53252] Updated weights for policy 0, policy_version 46900 (0.0008) [2023-10-10 06:36:12,796][53252] Updated weights for policy 0, policy_version 46910 (0.0007) [2023-10-10 06:36:14,212][53268] Updated weights for policy 1, policy_version 46850 (0.0007) [2023-10-10 06:36:14,583][53268] Updated weights for policy 1, policy_version 46860 (0.0008) [2023-10-10 06:36:14,956][53268] Updated weights for policy 1, policy_version 46870 (0.0008) [2023-10-10 06:36:15,326][53268] Updated weights for policy 1, policy_version 46880 (0.0009) [2023-10-10 06:36:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96043008. Throughput: 0: 1682.4, 1: 1700.0. Samples: 24016416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:36:16,784][52050] Avg episode reward: [(0, '21.190'), (1, '19.470')] [2023-10-10 06:36:16,836][53252] Updated weights for policy 0, policy_version 46920 (0.0011) [2023-10-10 06:36:17,208][53252] Updated weights for policy 0, policy_version 46930 (0.0007) [2023-10-10 06:36:17,592][53252] Updated weights for policy 0, policy_version 46940 (0.0007) [2023-10-10 06:36:19,532][53268] Updated weights for policy 1, policy_version 46890 (0.0008) [2023-10-10 06:36:19,908][53268] Updated weights for policy 1, policy_version 46900 (0.0007) [2023-10-10 06:36:20,278][53268] Updated weights for policy 1, policy_version 46910 (0.0008) [2023-10-10 06:36:21,583][53252] Updated weights for policy 0, policy_version 46950 (0.0009) [2023-10-10 06:36:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96108544. Throughput: 0: 1688.1, 1: 1671.1. Samples: 24036324. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:21,784][52050] Avg episode reward: [(0, '22.250'), (1, '19.340')] [2023-10-10 06:36:21,951][53252] Updated weights for policy 0, policy_version 46960 (0.0009) [2023-10-10 06:36:22,317][53252] Updated weights for policy 0, policy_version 46970 (0.0008) [2023-10-10 06:36:24,343][53268] Updated weights for policy 1, policy_version 46920 (0.0008) [2023-10-10 06:36:24,708][53268] Updated weights for policy 1, policy_version 46930 (0.0009) [2023-10-10 06:36:25,073][53268] Updated weights for policy 1, policy_version 46940 (0.0010) [2023-10-10 06:36:26,480][53252] Updated weights for policy 0, policy_version 46980 (0.0008) [2023-10-10 06:36:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96174080. Throughput: 0: 1683.5, 1: 1688.8. Samples: 24056742. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:26,784][52050] Avg episode reward: [(0, '20.740'), (1, '19.250')] [2023-10-10 06:36:26,852][53252] Updated weights for policy 0, policy_version 46990 (0.0008) [2023-10-10 06:36:27,232][53252] Updated weights for policy 0, policy_version 47000 (0.0009) [2023-10-10 06:36:29,011][53268] Updated weights for policy 1, policy_version 46950 (0.0009) [2023-10-10 06:36:29,382][53268] Updated weights for policy 1, policy_version 46960 (0.0008) [2023-10-10 06:36:29,757][53268] Updated weights for policy 1, policy_version 46970 (0.0008) [2023-10-10 06:36:31,253][53252] Updated weights for policy 0, policy_version 47010 (0.0008) [2023-10-10 06:36:31,661][53252] Updated weights for policy 0, policy_version 47020 (0.0007) [2023-10-10 06:36:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 96239616. Throughput: 0: 1686.4, 1: 1684.7. Samples: 24066894. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:31,784][52050] Avg episode reward: [(0, '22.070'), (1, '19.050')] [2023-10-10 06:36:32,044][53252] Updated weights for policy 0, policy_version 47030 (0.0009) [2023-10-10 06:36:32,414][53252] Updated weights for policy 0, policy_version 47040 (0.0007) [2023-10-10 06:36:33,774][53268] Updated weights for policy 1, policy_version 46980 (0.0008) [2023-10-10 06:36:34,136][53268] Updated weights for policy 1, policy_version 46990 (0.0011) [2023-10-10 06:36:34,505][53268] Updated weights for policy 1, policy_version 47000 (0.0011) [2023-10-10 06:36:36,480][53252] Updated weights for policy 0, policy_version 47050 (0.0008) [2023-10-10 06:36:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96305152. Throughput: 0: 1683.2, 1: 1669.5. Samples: 24086554. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:36,784][52050] Avg episode reward: [(0, '22.950'), (1, '18.340')] [2023-10-10 06:36:36,854][53252] Updated weights for policy 0, policy_version 47060 (0.0009) [2023-10-10 06:36:37,217][53252] Updated weights for policy 0, policy_version 47070 (0.0007) [2023-10-10 06:36:38,704][53268] Updated weights for policy 1, policy_version 47010 (0.0010) [2023-10-10 06:36:39,071][53268] Updated weights for policy 1, policy_version 47020 (0.0008) [2023-10-10 06:36:39,437][53268] Updated weights for policy 1, policy_version 47030 (0.0007) [2023-10-10 06:36:39,806][53268] Updated weights for policy 1, policy_version 47040 (0.0008) [2023-10-10 06:36:41,385][53252] Updated weights for policy 0, policy_version 47080 (0.0007) [2023-10-10 06:36:41,757][53252] Updated weights for policy 0, policy_version 47090 (0.0008) [2023-10-10 06:36:41,784][52050] Fps is (10 sec: 13106.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96370688. Throughput: 0: 1676.2, 1: 1689.8. Samples: 24106970. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:41,785][52050] Avg episode reward: [(0, '23.140'), (1, '20.930')] [2023-10-10 06:36:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000047040_48168960.pth... [2023-10-10 06:36:41,828][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000045472_46563328.pth [2023-10-10 06:36:41,834][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000047040_48168960.pth [2023-10-10 06:36:42,140][53252] Updated weights for policy 0, policy_version 47100 (0.0009) [2023-10-10 06:36:42,279][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000047104_48234496.pth... [2023-10-10 06:36:42,312][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000045504_46596096.pth [2023-10-10 06:36:42,317][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000047104_48234496.pth [2023-10-10 06:36:43,878][53268] Updated weights for policy 1, policy_version 47050 (0.0007) [2023-10-10 06:36:44,240][53268] Updated weights for policy 1, policy_version 47060 (0.0007) [2023-10-10 06:36:44,606][53268] Updated weights for policy 1, policy_version 47070 (0.0007) [2023-10-10 06:36:46,118][53252] Updated weights for policy 0, policy_version 47110 (0.0009) [2023-10-10 06:36:46,489][53252] Updated weights for policy 0, policy_version 47120 (0.0007) [2023-10-10 06:36:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96436224. Throughput: 0: 1689.8, 1: 1675.9. Samples: 24117166. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:46,784][52050] Avg episode reward: [(0, '22.950'), (1, '19.890')] [2023-10-10 06:36:46,862][53252] Updated weights for policy 0, policy_version 47130 (0.0008) [2023-10-10 06:36:48,558][53268] Updated weights for policy 1, policy_version 47080 (0.0011) [2023-10-10 06:36:48,919][53268] Updated weights for policy 1, policy_version 47090 (0.0010) [2023-10-10 06:36:49,285][53268] Updated weights for policy 1, policy_version 47100 (0.0010) [2023-10-10 06:36:50,821][53252] Updated weights for policy 0, policy_version 47140 (0.0008) [2023-10-10 06:36:51,178][53252] Updated weights for policy 0, policy_version 47150 (0.0008) [2023-10-10 06:36:51,545][53252] Updated weights for policy 0, policy_version 47160 (0.0009) [2023-10-10 06:36:51,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96501760. Throughput: 0: 1690.5, 1: 1677.2. Samples: 24137400. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-10 06:36:51,784][52050] Avg episode reward: [(0, '23.150'), (1, '19.220')] [2023-10-10 06:36:53,562][53268] Updated weights for policy 1, policy_version 47110 (0.0011) [2023-10-10 06:36:53,931][53268] Updated weights for policy 1, policy_version 47120 (0.0010) [2023-10-10 06:36:54,303][53268] Updated weights for policy 1, policy_version 47130 (0.0009) [2023-10-10 06:36:55,588][53252] Updated weights for policy 0, policy_version 47170 (0.0010) [2023-10-10 06:36:55,964][53252] Updated weights for policy 0, policy_version 47180 (0.0007) [2023-10-10 06:36:56,329][53252] Updated weights for policy 0, policy_version 47190 (0.0008) [2023-10-10 06:36:56,701][53252] Updated weights for policy 0, policy_version 47200 (0.0008) [2023-10-10 06:36:56,783][52050] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96600064. Throughput: 0: 1676.7, 1: 1682.4. Samples: 24157328. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 06:36:56,784][52050] Avg episode reward: [(0, '22.210'), (1, '19.560')] [2023-10-10 06:36:58,270][53268] Updated weights for policy 1, policy_version 47140 (0.0007) [2023-10-10 06:36:58,637][53268] Updated weights for policy 1, policy_version 47150 (0.0008) [2023-10-10 06:36:59,006][53268] Updated weights for policy 1, policy_version 47160 (0.0007) [2023-10-10 06:37:00,719][53252] Updated weights for policy 0, policy_version 47210 (0.0008) [2023-10-10 06:37:01,088][53252] Updated weights for policy 0, policy_version 47220 (0.0007) [2023-10-10 06:37:01,472][53252] Updated weights for policy 0, policy_version 47230 (0.0010) [2023-10-10 06:37:01,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96665600. Throughput: 0: 1696.0, 1: 1662.2. Samples: 24167534. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 06:37:01,784][52050] Avg episode reward: [(0, '20.550'), (1, '18.760')] [2023-10-10 06:37:03,139][53268] Updated weights for policy 1, policy_version 47170 (0.0008) [2023-10-10 06:37:03,508][53268] Updated weights for policy 1, policy_version 47180 (0.0008) [2023-10-10 06:37:03,875][53268] Updated weights for policy 1, policy_version 47190 (0.0008) [2023-10-10 06:37:04,232][53268] Updated weights for policy 1, policy_version 47200 (0.0009) [2023-10-10 06:37:05,483][53252] Updated weights for policy 0, policy_version 47240 (0.0008) [2023-10-10 06:37:05,864][53252] Updated weights for policy 0, policy_version 47250 (0.0009) [2023-10-10 06:37:06,233][53252] Updated weights for policy 0, policy_version 47260 (0.0008) [2023-10-10 06:37:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96731136. Throughput: 0: 1691.1, 1: 1673.2. Samples: 24187720. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 06:37:06,784][52050] Avg episode reward: [(0, '19.470'), (1, '19.370')] [2023-10-10 06:37:08,357][53268] Updated weights for policy 1, policy_version 47210 (0.0011) [2023-10-10 06:37:08,728][53268] Updated weights for policy 1, policy_version 47220 (0.0010) [2023-10-10 06:37:09,088][53268] Updated weights for policy 1, policy_version 47230 (0.0007) [2023-10-10 06:37:10,282][53252] Updated weights for policy 0, policy_version 47270 (0.0009) [2023-10-10 06:37:10,657][53252] Updated weights for policy 0, policy_version 47280 (0.0009) [2023-10-10 06:37:11,026][53252] Updated weights for policy 0, policy_version 47290 (0.0007) [2023-10-10 06:37:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 96796672. Throughput: 0: 1674.4, 1: 1676.1. Samples: 24207512. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 06:37:11,784][52050] Avg episode reward: [(0, '20.980'), (1, '19.730')] [2023-10-10 06:37:13,180][53268] Updated weights for policy 1, policy_version 47240 (0.0007) [2023-10-10 06:37:13,551][53268] Updated weights for policy 1, policy_version 47250 (0.0009) [2023-10-10 06:37:13,930][53268] Updated weights for policy 1, policy_version 47260 (0.0009) [2023-10-10 06:37:15,059][53252] Updated weights for policy 0, policy_version 47300 (0.0010) [2023-10-10 06:37:15,424][53252] Updated weights for policy 0, policy_version 47310 (0.0010) [2023-10-10 06:37:15,794][53252] Updated weights for policy 0, policy_version 47320 (0.0008) [2023-10-10 06:37:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96862208. Throughput: 0: 1701.7, 1: 1654.5. Samples: 24217922. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 06:37:16,784][52050] Avg episode reward: [(0, '20.850'), (1, '19.340')] [2023-10-10 06:37:17,964][53268] Updated weights for policy 1, policy_version 47270 (0.0010) [2023-10-10 06:37:18,334][53268] Updated weights for policy 1, policy_version 47280 (0.0009) [2023-10-10 06:37:18,700][53268] Updated weights for policy 1, policy_version 47290 (0.0009) [2023-10-10 06:37:19,804][53252] Updated weights for policy 0, policy_version 47330 (0.0007) [2023-10-10 06:37:20,214][53252] Updated weights for policy 0, policy_version 47340 (0.0008) [2023-10-10 06:37:20,581][53252] Updated weights for policy 0, policy_version 47350 (0.0011) [2023-10-10 06:37:20,951][53252] Updated weights for policy 0, policy_version 47360 (0.0007) [2023-10-10 06:37:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 96927744. Throughput: 0: 1686.2, 1: 1677.7. Samples: 24237930. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-10 06:37:21,784][52050] Avg episode reward: [(0, '20.900'), (1, '19.330')] [2023-10-10 06:37:22,792][53268] Updated weights for policy 1, policy_version 47300 (0.0010) [2023-10-10 06:37:23,154][53268] Updated weights for policy 1, policy_version 47310 (0.0011) [2023-10-10 06:37:23,520][53268] Updated weights for policy 1, policy_version 47320 (0.0009) [2023-10-10 06:37:24,895][53252] Updated weights for policy 0, policy_version 47370 (0.0009) [2023-10-10 06:37:25,259][53252] Updated weights for policy 0, policy_version 47380 (0.0008) [2023-10-10 06:37:25,627][53252] Updated weights for policy 0, policy_version 47390 (0.0008) [2023-10-10 06:37:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 96993280. Throughput: 0: 1679.3, 1: 1674.6. Samples: 24257896. 
Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:26,784][52050] Avg episode reward: [(0, '21.510'), (1, '18.800')] [2023-10-10 06:37:27,575][53268] Updated weights for policy 1, policy_version 47330 (0.0009) [2023-10-10 06:37:27,948][53268] Updated weights for policy 1, policy_version 47340 (0.0009) [2023-10-10 06:37:28,308][53268] Updated weights for policy 1, policy_version 47350 (0.0008) [2023-10-10 06:37:28,681][53268] Updated weights for policy 1, policy_version 47360 (0.0007) [2023-10-10 06:37:29,728][53252] Updated weights for policy 0, policy_version 47400 (0.0009) [2023-10-10 06:37:30,104][53252] Updated weights for policy 0, policy_version 47410 (0.0009) [2023-10-10 06:37:30,484][53252] Updated weights for policy 0, policy_version 47420 (0.0008) [2023-10-10 06:37:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97058816. Throughput: 0: 1701.9, 1: 1656.8. Samples: 24268310. Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:31,784][52050] Avg episode reward: [(0, '21.930'), (1, '18.950')] [2023-10-10 06:37:32,988][53268] Updated weights for policy 1, policy_version 47370 (0.0007) [2023-10-10 06:37:33,350][53268] Updated weights for policy 1, policy_version 47380 (0.0007) [2023-10-10 06:37:33,721][53268] Updated weights for policy 1, policy_version 47390 (0.0011) [2023-10-10 06:37:34,494][53252] Updated weights for policy 0, policy_version 47430 (0.0010) [2023-10-10 06:37:34,862][53252] Updated weights for policy 0, policy_version 47440 (0.0009) [2023-10-10 06:37:35,222][53252] Updated weights for policy 0, policy_version 47450 (0.0010) [2023-10-10 06:37:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97124352. Throughput: 0: 1674.7, 1: 1668.7. Samples: 24287852. 
Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:36,784][52050] Avg episode reward: [(0, '20.140'), (1, '19.750')] [2023-10-10 06:37:37,800][53268] Updated weights for policy 1, policy_version 47400 (0.0008) [2023-10-10 06:37:38,161][53268] Updated weights for policy 1, policy_version 47410 (0.0008) [2023-10-10 06:37:38,535][53268] Updated weights for policy 1, policy_version 47420 (0.0009) [2023-10-10 06:37:39,143][53252] Updated weights for policy 0, policy_version 47460 (0.0008) [2023-10-10 06:37:39,505][53252] Updated weights for policy 0, policy_version 47470 (0.0008) [2023-10-10 06:37:39,874][53252] Updated weights for policy 0, policy_version 47480 (0.0007) [2023-10-10 06:37:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97189888. Throughput: 0: 1690.7, 1: 1673.3. Samples: 24308706. Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:41,784][52050] Avg episode reward: [(0, '21.630'), (1, '18.610')] [2023-10-10 06:37:42,637][53268] Updated weights for policy 1, policy_version 47430 (0.0009) [2023-10-10 06:37:42,999][53268] Updated weights for policy 1, policy_version 47440 (0.0010) [2023-10-10 06:37:43,375][53268] Updated weights for policy 1, policy_version 47450 (0.0010) [2023-10-10 06:37:44,017][53252] Updated weights for policy 0, policy_version 47490 (0.0007) [2023-10-10 06:37:44,388][53252] Updated weights for policy 0, policy_version 47500 (0.0007) [2023-10-10 06:37:44,759][53252] Updated weights for policy 0, policy_version 47510 (0.0007) [2023-10-10 06:37:45,129][53252] Updated weights for policy 0, policy_version 47520 (0.0008) [2023-10-10 06:37:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97255424. Throughput: 0: 1687.8, 1: 1671.9. Samples: 24318718. 
Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:46,784][52050] Avg episode reward: [(0, '20.410'), (1, '20.150')] [2023-10-10 06:37:47,409][53268] Updated weights for policy 1, policy_version 47460 (0.0009) [2023-10-10 06:37:47,773][53268] Updated weights for policy 1, policy_version 47470 (0.0010) [2023-10-10 06:37:48,142][53268] Updated weights for policy 1, policy_version 47480 (0.0009) [2023-10-10 06:37:49,297][53252] Updated weights for policy 0, policy_version 47530 (0.0007) [2023-10-10 06:37:49,665][53252] Updated weights for policy 0, policy_version 47540 (0.0007) [2023-10-10 06:37:50,029][53252] Updated weights for policy 0, policy_version 47550 (0.0008) [2023-10-10 06:37:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97320960. Throughput: 0: 1666.5, 1: 1682.1. Samples: 24338406. Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:51,784][52050] Avg episode reward: [(0, '21.580'), (1, '19.580')] [2023-10-10 06:37:52,173][53268] Updated weights for policy 1, policy_version 47490 (0.0008) [2023-10-10 06:37:52,531][53268] Updated weights for policy 1, policy_version 47500 (0.0007) [2023-10-10 06:37:52,909][53268] Updated weights for policy 1, policy_version 47510 (0.0007) [2023-10-10 06:37:53,282][53268] Updated weights for policy 1, policy_version 47520 (0.0009) [2023-10-10 06:37:54,219][53252] Updated weights for policy 0, policy_version 47560 (0.0009) [2023-10-10 06:37:54,598][53252] Updated weights for policy 0, policy_version 47570 (0.0009) [2023-10-10 06:37:54,960][53252] Updated weights for policy 0, policy_version 47580 (0.0011) [2023-10-10 06:37:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97386496. Throughput: 0: 1690.4, 1: 1686.0. Samples: 24359454. 
Policy #0 lag: (min: 20.0, avg: 21.1, max: 43.0) [2023-10-10 06:37:56,784][52050] Avg episode reward: [(0, '21.830'), (1, '19.810')] [2023-10-10 06:37:57,257][53268] Updated weights for policy 1, policy_version 47530 (0.0008) [2023-10-10 06:37:57,626][53268] Updated weights for policy 1, policy_version 47540 (0.0008) [2023-10-10 06:37:57,995][53268] Updated weights for policy 1, policy_version 47550 (0.0009) [2023-10-10 06:37:58,737][53252] Updated weights for policy 0, policy_version 47590 (0.0007) [2023-10-10 06:37:59,102][53252] Updated weights for policy 0, policy_version 47600 (0.0009) [2023-10-10 06:37:59,481][53252] Updated weights for policy 0, policy_version 47610 (0.0007) [2023-10-10 06:38:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97452032. Throughput: 0: 1673.7, 1: 1683.4. Samples: 24368990. Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:01,784][52050] Avg episode reward: [(0, '20.570'), (1, '20.010')] [2023-10-10 06:38:02,211][53268] Updated weights for policy 1, policy_version 47560 (0.0009) [2023-10-10 06:38:02,583][53268] Updated weights for policy 1, policy_version 47570 (0.0008) [2023-10-10 06:38:02,950][53268] Updated weights for policy 1, policy_version 47580 (0.0010) [2023-10-10 06:38:03,357][53252] Updated weights for policy 0, policy_version 47620 (0.0008) [2023-10-10 06:38:03,733][53252] Updated weights for policy 0, policy_version 47630 (0.0009) [2023-10-10 06:38:04,113][53252] Updated weights for policy 0, policy_version 47640 (0.0007) [2023-10-10 06:38:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97517568. Throughput: 0: 1688.3, 1: 1680.5. Samples: 24389526. 
Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:06,784][52050] Avg episode reward: [(0, '20.690'), (1, '19.190')] [2023-10-10 06:38:06,981][53268] Updated weights for policy 1, policy_version 47590 (0.0009) [2023-10-10 06:38:07,346][53268] Updated weights for policy 1, policy_version 47600 (0.0010) [2023-10-10 06:38:07,713][53268] Updated weights for policy 1, policy_version 47610 (0.0008) [2023-10-10 06:38:08,138][53252] Updated weights for policy 0, policy_version 47650 (0.0007) [2023-10-10 06:38:08,515][53252] Updated weights for policy 0, policy_version 47660 (0.0010) [2023-10-10 06:38:08,884][53252] Updated weights for policy 0, policy_version 47670 (0.0011) [2023-10-10 06:38:09,252][53252] Updated weights for policy 0, policy_version 47680 (0.0010) [2023-10-10 06:38:11,601][53268] Updated weights for policy 1, policy_version 47620 (0.0008) [2023-10-10 06:38:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97583104. Throughput: 0: 1705.5, 1: 1686.6. Samples: 24410542. Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:11,784][52050] Avg episode reward: [(0, '22.860'), (1, '18.570')] [2023-10-10 06:38:11,970][53268] Updated weights for policy 1, policy_version 47630 (0.0007) [2023-10-10 06:38:12,337][53268] Updated weights for policy 1, policy_version 47640 (0.0008) [2023-10-10 06:38:13,175][53252] Updated weights for policy 0, policy_version 47690 (0.0008) [2023-10-10 06:38:13,538][53252] Updated weights for policy 0, policy_version 47700 (0.0007) [2023-10-10 06:38:13,910][53252] Updated weights for policy 0, policy_version 47710 (0.0008) [2023-10-10 06:38:16,358][53268] Updated weights for policy 1, policy_version 47650 (0.0009) [2023-10-10 06:38:16,726][53268] Updated weights for policy 1, policy_version 47660 (0.0009) [2023-10-10 06:38:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97648640. Throughput: 0: 1676.5, 1: 1688.0. 
Samples: 24419714. Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:16,784][52050] Avg episode reward: [(0, '23.060'), (1, '19.650')] [2023-10-10 06:38:17,097][53268] Updated weights for policy 1, policy_version 47670 (0.0010) [2023-10-10 06:38:17,454][53268] Updated weights for policy 1, policy_version 47680 (0.0010) [2023-10-10 06:38:17,967][53252] Updated weights for policy 0, policy_version 47720 (0.0010) [2023-10-10 06:38:18,332][53252] Updated weights for policy 0, policy_version 47730 (0.0009) [2023-10-10 06:38:18,707][53252] Updated weights for policy 0, policy_version 47740 (0.0009) [2023-10-10 06:38:21,596][53268] Updated weights for policy 1, policy_version 47690 (0.0009) [2023-10-10 06:38:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97714176. Throughput: 0: 1698.6, 1: 1692.7. Samples: 24440460. Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:21,784][52050] Avg episode reward: [(0, '22.770'), (1, '20.550')] [2023-10-10 06:38:21,967][53268] Updated weights for policy 1, policy_version 47700 (0.0008) [2023-10-10 06:38:22,333][53268] Updated weights for policy 1, policy_version 47710 (0.0010) [2023-10-10 06:38:22,788][53252] Updated weights for policy 0, policy_version 47750 (0.0008) [2023-10-10 06:38:23,155][53252] Updated weights for policy 0, policy_version 47760 (0.0009) [2023-10-10 06:38:23,535][53252] Updated weights for policy 0, policy_version 47770 (0.0009) [2023-10-10 06:38:26,525][53268] Updated weights for policy 1, policy_version 47720 (0.0009) [2023-10-10 06:38:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97779712. Throughput: 0: 1701.6, 1: 1692.1. Samples: 24461422. 
Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:26,784][52050] Avg episode reward: [(0, '23.870'), (1, '20.280')] [2023-10-10 06:38:26,901][53268] Updated weights for policy 1, policy_version 47730 (0.0008) [2023-10-10 06:38:27,271][53268] Updated weights for policy 1, policy_version 47740 (0.0009) [2023-10-10 06:38:27,420][53252] Updated weights for policy 0, policy_version 47780 (0.0007) [2023-10-10 06:38:27,791][53252] Updated weights for policy 0, policy_version 47790 (0.0008) [2023-10-10 06:38:28,157][53252] Updated weights for policy 0, policy_version 47800 (0.0007) [2023-10-10 06:38:31,336][53268] Updated weights for policy 1, policy_version 47750 (0.0009) [2023-10-10 06:38:31,707][53268] Updated weights for policy 1, policy_version 47760 (0.0007) [2023-10-10 06:38:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 97845248. Throughput: 0: 1691.6, 1: 1686.9. Samples: 24470750. Policy #0 lag: (min: 19.0, avg: 25.3, max: 51.0) [2023-10-10 06:38:31,784][52050] Avg episode reward: [(0, '21.560'), (1, '18.720')] [2023-10-10 06:38:32,077][53268] Updated weights for policy 1, policy_version 47770 (0.0007) [2023-10-10 06:38:32,163][53252] Updated weights for policy 0, policy_version 47810 (0.0008) [2023-10-10 06:38:32,534][53252] Updated weights for policy 0, policy_version 47820 (0.0007) [2023-10-10 06:38:32,905][53252] Updated weights for policy 0, policy_version 47830 (0.0010) [2023-10-10 06:38:33,272][53252] Updated weights for policy 0, policy_version 47840 (0.0010) [2023-10-10 06:38:36,180][53268] Updated weights for policy 1, policy_version 47780 (0.0007) [2023-10-10 06:38:36,551][53268] Updated weights for policy 1, policy_version 47790 (0.0007) [2023-10-10 06:38:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97910784. Throughput: 0: 1714.5, 1: 1685.3. Samples: 24491398. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:38:36,784][52050] Avg episode reward: [(0, '20.830'), (1, '18.180')] [2023-10-10 06:38:36,913][53268] Updated weights for policy 1, policy_version 47800 (0.0009) [2023-10-10 06:38:37,375][53252] Updated weights for policy 0, policy_version 47850 (0.0008) [2023-10-10 06:38:37,744][53252] Updated weights for policy 0, policy_version 47860 (0.0009) [2023-10-10 06:38:38,118][53252] Updated weights for policy 0, policy_version 47870 (0.0008) [2023-10-10 06:38:40,855][53268] Updated weights for policy 1, policy_version 47810 (0.0010) [2023-10-10 06:38:41,223][53268] Updated weights for policy 1, policy_version 47820 (0.0009) [2023-10-10 06:38:41,583][53268] Updated weights for policy 1, policy_version 47830 (0.0008) [2023-10-10 06:38:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 97976320. Throughput: 0: 1709.7, 1: 1674.7. Samples: 24511752. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:38:41,784][52050] Avg episode reward: [(0, '20.410'), (1, '18.080')] [2023-10-10 06:38:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000047872_49020928.pth... [2023-10-10 06:38:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000046304_47415296.pth [2023-10-10 06:38:41,942][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000047840_48988160.pth... 
[2023-10-10 06:38:41,944][53268] Updated weights for policy 1, policy_version 47840 (0.0008) [2023-10-10 06:38:41,970][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000046240_47349760.pth [2023-10-10 06:38:42,243][53252] Updated weights for policy 0, policy_version 47880 (0.0007) [2023-10-10 06:38:42,612][53252] Updated weights for policy 0, policy_version 47890 (0.0007) [2023-10-10 06:38:42,989][53252] Updated weights for policy 0, policy_version 47900 (0.0008) [2023-10-10 06:38:45,952][53268] Updated weights for policy 1, policy_version 47850 (0.0009) [2023-10-10 06:38:46,316][53268] Updated weights for policy 1, policy_version 47860 (0.0008) [2023-10-10 06:38:46,691][53268] Updated weights for policy 1, policy_version 47870 (0.0010) [2023-10-10 06:38:46,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 98074624. Throughput: 0: 1696.5, 1: 1683.1. Samples: 24521074. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:38:46,784][52050] Avg episode reward: [(0, '18.740'), (1, '16.860')] [2023-10-10 06:38:47,002][53252] Updated weights for policy 0, policy_version 47910 (0.0009) [2023-10-10 06:38:47,368][53252] Updated weights for policy 0, policy_version 47920 (0.0010) [2023-10-10 06:38:47,734][53252] Updated weights for policy 0, policy_version 47930 (0.0008) [2023-10-10 06:38:50,849][53268] Updated weights for policy 1, policy_version 47880 (0.0009) [2023-10-10 06:38:51,223][53268] Updated weights for policy 1, policy_version 47890 (0.0007) [2023-10-10 06:38:51,588][53268] Updated weights for policy 1, policy_version 47900 (0.0007) [2023-10-10 06:38:51,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98140160. Throughput: 0: 1695.9, 1: 1685.5. Samples: 24541688. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:38:51,784][52050] Avg episode reward: [(0, '19.970'), (1, '17.300')] [2023-10-10 06:38:51,912][53252] Updated weights for policy 0, policy_version 47940 (0.0009) [2023-10-10 06:38:52,282][53252] Updated weights for policy 0, policy_version 47950 (0.0008) [2023-10-10 06:38:52,654][53252] Updated weights for policy 0, policy_version 47960 (0.0009) [2023-10-10 06:38:55,486][53268] Updated weights for policy 1, policy_version 47910 (0.0007) [2023-10-10 06:38:55,847][53268] Updated weights for policy 1, policy_version 47920 (0.0008) [2023-10-10 06:38:56,213][53268] Updated weights for policy 1, policy_version 47930 (0.0010) [2023-10-10 06:38:56,656][53252] Updated weights for policy 0, policy_version 47970 (0.0009) [2023-10-10 06:38:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98205696. Throughput: 0: 1698.5, 1: 1671.1. Samples: 24562178. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:38:56,784][52050] Avg episode reward: [(0, '21.580'), (1, '17.290')] [2023-10-10 06:38:57,056][53252] Updated weights for policy 0, policy_version 47980 (0.0008) [2023-10-10 06:38:57,416][53252] Updated weights for policy 0, policy_version 47990 (0.0008) [2023-10-10 06:38:57,791][53252] Updated weights for policy 0, policy_version 48000 (0.0011) [2023-10-10 06:39:00,257][53268] Updated weights for policy 1, policy_version 47940 (0.0010) [2023-10-10 06:39:00,629][53268] Updated weights for policy 1, policy_version 47950 (0.0008) [2023-10-10 06:39:00,995][53268] Updated weights for policy 1, policy_version 47960 (0.0007) [2023-10-10 06:39:01,752][53252] Updated weights for policy 0, policy_version 48010 (0.0007) [2023-10-10 06:39:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98271232. Throughput: 0: 1693.2, 1: 1694.1. Samples: 24572144. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:39:01,784][52050] Avg episode reward: [(0, '22.160'), (1, '17.140')] [2023-10-10 06:39:02,124][53252] Updated weights for policy 0, policy_version 48020 (0.0007) [2023-10-10 06:39:02,500][53252] Updated weights for policy 0, policy_version 48030 (0.0007) [2023-10-10 06:39:05,137][53268] Updated weights for policy 1, policy_version 47970 (0.0008) [2023-10-10 06:39:05,505][53268] Updated weights for policy 1, policy_version 47980 (0.0009) [2023-10-10 06:39:05,873][53268] Updated weights for policy 1, policy_version 47990 (0.0010) [2023-10-10 06:39:06,231][53268] Updated weights for policy 1, policy_version 48000 (0.0009) [2023-10-10 06:39:06,521][53252] Updated weights for policy 0, policy_version 48040 (0.0010) [2023-10-10 06:39:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98336768. Throughput: 0: 1692.6, 1: 1685.6. Samples: 24592480. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-10 06:39:06,784][52050] Avg episode reward: [(0, '22.080'), (1, '18.040')] [2023-10-10 06:39:06,891][53252] Updated weights for policy 0, policy_version 48050 (0.0010) [2023-10-10 06:39:07,270][53252] Updated weights for policy 0, policy_version 48060 (0.0008) [2023-10-10 06:39:10,261][53268] Updated weights for policy 1, policy_version 48010 (0.0007) [2023-10-10 06:39:10,627][53268] Updated weights for policy 1, policy_version 48020 (0.0009) [2023-10-10 06:39:10,989][53268] Updated weights for policy 1, policy_version 48030 (0.0007) [2023-10-10 06:39:11,295][53252] Updated weights for policy 0, policy_version 48070 (0.0008) [2023-10-10 06:39:11,666][53252] Updated weights for policy 0, policy_version 48080 (0.0009) [2023-10-10 06:39:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98402304. Throughput: 0: 1679.4, 1: 1660.1. Samples: 24611700. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-10 06:39:11,784][52050] Avg episode reward: [(0, '21.760'), (1, '19.140')] [2023-10-10 06:39:12,034][53252] Updated weights for policy 0, policy_version 48090 (0.0009) [2023-10-10 06:39:15,219][53268] Updated weights for policy 1, policy_version 48040 (0.0010) [2023-10-10 06:39:15,586][53268] Updated weights for policy 1, policy_version 48050 (0.0008) [2023-10-10 06:39:15,952][53268] Updated weights for policy 1, policy_version 48060 (0.0008) [2023-10-10 06:39:16,087][53252] Updated weights for policy 0, policy_version 48100 (0.0007) [2023-10-10 06:39:16,452][53252] Updated weights for policy 0, policy_version 48110 (0.0009) [2023-10-10 06:39:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98467840. Throughput: 0: 1682.9, 1: 1687.3. Samples: 24622408. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-10 06:39:16,784][52050] Avg episode reward: [(0, '20.230'), (1, '19.480')] [2023-10-10 06:39:16,822][53252] Updated weights for policy 0, policy_version 48120 (0.0007) [2023-10-10 06:39:20,178][53268] Updated weights for policy 1, policy_version 48070 (0.0009) [2023-10-10 06:39:20,541][53268] Updated weights for policy 1, policy_version 48080 (0.0011) [2023-10-10 06:39:20,913][53268] Updated weights for policy 1, policy_version 48090 (0.0010) [2023-10-10 06:39:21,003][53252] Updated weights for policy 0, policy_version 48130 (0.0010) [2023-10-10 06:39:21,386][53252] Updated weights for policy 0, policy_version 48140 (0.0009) [2023-10-10 06:39:21,759][53252] Updated weights for policy 0, policy_version 48150 (0.0008) [2023-10-10 06:39:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98533376. Throughput: 0: 1678.4, 1: 1682.8. Samples: 24642654. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-10 06:39:21,784][52050] Avg episode reward: [(0, '20.450'), (1, '19.620')] [2023-10-10 06:39:22,132][53252] Updated weights for policy 0, policy_version 48160 (0.0007) [2023-10-10 06:39:24,856][53268] Updated weights for policy 1, policy_version 48100 (0.0009) [2023-10-10 06:39:25,229][53268] Updated weights for policy 1, policy_version 48110 (0.0010) [2023-10-10 06:39:25,586][53268] Updated weights for policy 1, policy_version 48120 (0.0009) [2023-10-10 06:39:26,028][53252] Updated weights for policy 0, policy_version 48170 (0.0007) [2023-10-10 06:39:26,401][53252] Updated weights for policy 0, policy_version 48180 (0.0007) [2023-10-10 06:39:26,773][53252] Updated weights for policy 0, policy_version 48190 (0.0010) [2023-10-10 06:39:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98598912. Throughput: 0: 1670.0, 1: 1667.4. Samples: 24661936. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-10 06:39:26,784][52050] Avg episode reward: [(0, '20.420'), (1, '18.730')] [2023-10-10 06:39:29,708][53268] Updated weights for policy 1, policy_version 48130 (0.0007) [2023-10-10 06:39:30,075][53268] Updated weights for policy 1, policy_version 48140 (0.0008) [2023-10-10 06:39:30,443][53268] Updated weights for policy 1, policy_version 48150 (0.0008) [2023-10-10 06:39:30,779][53252] Updated weights for policy 0, policy_version 48200 (0.0008) [2023-10-10 06:39:30,807][53268] Updated weights for policy 1, policy_version 48160 (0.0008) [2023-10-10 06:39:31,154][53252] Updated weights for policy 0, policy_version 48210 (0.0009) [2023-10-10 06:39:31,530][53252] Updated weights for policy 0, policy_version 48220 (0.0009) [2023-10-10 06:39:31,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 98697216. Throughput: 0: 1689.2, 1: 1688.9. Samples: 24673090. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-10 06:39:31,784][52050] Avg episode reward: [(0, '20.300'), (1, '18.440')] [2023-10-10 06:39:34,768][53268] Updated weights for policy 1, policy_version 48170 (0.0008) [2023-10-10 06:39:35,129][53268] Updated weights for policy 1, policy_version 48180 (0.0009) [2023-10-10 06:39:35,508][53268] Updated weights for policy 1, policy_version 48190 (0.0010) [2023-10-10 06:39:35,786][53252] Updated weights for policy 0, policy_version 48230 (0.0007) [2023-10-10 06:39:36,158][53252] Updated weights for policy 0, policy_version 48240 (0.0008) [2023-10-10 06:39:36,536][53252] Updated weights for policy 0, policy_version 48250 (0.0008) [2023-10-10 06:39:36,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 98762752. Throughput: 0: 1693.7, 1: 1678.7. Samples: 24693444. Policy #0 lag: (min: 31.0, avg: 32.4, max: 56.0) [2023-10-10 06:39:36,784][52050] Avg episode reward: [(0, '22.700'), (1, '19.400')] [2023-10-10 06:39:39,450][53268] Updated weights for policy 1, policy_version 48200 (0.0008) [2023-10-10 06:39:39,820][53268] Updated weights for policy 1, policy_version 48210 (0.0008) [2023-10-10 06:39:40,175][53268] Updated weights for policy 1, policy_version 48220 (0.0008) [2023-10-10 06:39:40,577][53252] Updated weights for policy 0, policy_version 48260 (0.0010) [2023-10-10 06:39:40,950][53252] Updated weights for policy 0, policy_version 48270 (0.0008) [2023-10-10 06:39:41,329][53252] Updated weights for policy 0, policy_version 48280 (0.0008) [2023-10-10 06:39:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 98828288. Throughput: 0: 1662.5, 1: 1683.6. Samples: 24712752. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:39:41,784][52050] Avg episode reward: [(0, '22.940'), (1, '18.840')] [2023-10-10 06:39:44,173][53268] Updated weights for policy 1, policy_version 48230 (0.0009) [2023-10-10 06:39:44,535][53268] Updated weights for policy 1, policy_version 48240 (0.0007) [2023-10-10 06:39:44,900][53268] Updated weights for policy 1, policy_version 48250 (0.0009) [2023-10-10 06:39:45,432][53252] Updated weights for policy 0, policy_version 48290 (0.0009) [2023-10-10 06:39:45,826][53252] Updated weights for policy 0, policy_version 48300 (0.0007) [2023-10-10 06:39:46,183][53252] Updated weights for policy 0, policy_version 48310 (0.0007) [2023-10-10 06:39:46,552][53252] Updated weights for policy 0, policy_version 48320 (0.0007) [2023-10-10 06:39:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98893824. Throughput: 0: 1688.3, 1: 1684.0. Samples: 24723898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:39:46,784][52050] Avg episode reward: [(0, '22.070'), (1, '19.370')] [2023-10-10 06:39:48,872][53268] Updated weights for policy 1, policy_version 48260 (0.0009) [2023-10-10 06:39:49,245][53268] Updated weights for policy 1, policy_version 48270 (0.0007) [2023-10-10 06:39:49,606][53268] Updated weights for policy 1, policy_version 48280 (0.0007) [2023-10-10 06:39:50,609][53252] Updated weights for policy 0, policy_version 48330 (0.0007) [2023-10-10 06:39:50,987][53252] Updated weights for policy 0, policy_version 48340 (0.0008) [2023-10-10 06:39:51,346][53252] Updated weights for policy 0, policy_version 48350 (0.0008) [2023-10-10 06:39:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 98959360. Throughput: 0: 1683.6, 1: 1669.6. Samples: 24743376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:39:51,784][52050] Avg episode reward: [(0, '22.370'), (1, '20.400')] [2023-10-10 06:39:53,568][53268] Updated weights for policy 1, policy_version 48290 (0.0008) [2023-10-10 06:39:53,936][53268] Updated weights for policy 1, policy_version 48300 (0.0009) [2023-10-10 06:39:54,304][53268] Updated weights for policy 1, policy_version 48310 (0.0009) [2023-10-10 06:39:54,667][53268] Updated weights for policy 1, policy_version 48320 (0.0009) [2023-10-10 06:39:55,356][53252] Updated weights for policy 0, policy_version 48360 (0.0009) [2023-10-10 06:39:55,732][53252] Updated weights for policy 0, policy_version 48370 (0.0008) [2023-10-10 06:39:56,108][53252] Updated weights for policy 0, policy_version 48380 (0.0010) [2023-10-10 06:39:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 99024896. Throughput: 0: 1668.1, 1: 1698.7. Samples: 24763208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:39:56,784][52050] Avg episode reward: [(0, '23.380'), (1, '19.530')] [2023-10-10 06:39:58,925][53268] Updated weights for policy 1, policy_version 48330 (0.0008) [2023-10-10 06:39:59,297][53268] Updated weights for policy 1, policy_version 48340 (0.0010) [2023-10-10 06:39:59,674][53268] Updated weights for policy 1, policy_version 48350 (0.0008) [2023-10-10 06:39:59,977][53252] Updated weights for policy 0, policy_version 48390 (0.0008) [2023-10-10 06:40:00,345][53252] Updated weights for policy 0, policy_version 48400 (0.0007) [2023-10-10 06:40:00,713][53252] Updated weights for policy 0, policy_version 48410 (0.0008) [2023-10-10 06:40:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 99090432. Throughput: 0: 1693.0, 1: 1688.3. Samples: 24774566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:40:01,784][52050] Avg episode reward: [(0, '22.520'), (1, '19.120')] [2023-10-10 06:40:03,830][53268] Updated weights for policy 1, policy_version 48360 (0.0009) [2023-10-10 06:40:04,196][53268] Updated weights for policy 1, policy_version 48370 (0.0008) [2023-10-10 06:40:04,564][53268] Updated weights for policy 1, policy_version 48380 (0.0008) [2023-10-10 06:40:04,853][53252] Updated weights for policy 0, policy_version 48420 (0.0009) [2023-10-10 06:40:05,226][53252] Updated weights for policy 0, policy_version 48430 (0.0009) [2023-10-10 06:40:05,587][53252] Updated weights for policy 0, policy_version 48440 (0.0007) [2023-10-10 06:40:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 99155968. Throughput: 0: 1680.4, 1: 1675.2. Samples: 24793658. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:40:06,784][52050] Avg episode reward: [(0, '23.980'), (1, '19.430')] [2023-10-10 06:40:08,593][53268] Updated weights for policy 1, policy_version 48390 (0.0009) [2023-10-10 06:40:08,964][53268] Updated weights for policy 1, policy_version 48400 (0.0009) [2023-10-10 06:40:09,327][53268] Updated weights for policy 1, policy_version 48410 (0.0007) [2023-10-10 06:40:09,766][53252] Updated weights for policy 0, policy_version 48450 (0.0010) [2023-10-10 06:40:10,139][53252] Updated weights for policy 0, policy_version 48460 (0.0009) [2023-10-10 06:40:10,510][53252] Updated weights for policy 0, policy_version 48470 (0.0009) [2023-10-10 06:40:10,889][53252] Updated weights for policy 0, policy_version 48480 (0.0010) [2023-10-10 06:40:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 99221504. Throughput: 0: 1673.6, 1: 1702.3. Samples: 24813854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:40:11,784][52050] Avg episode reward: [(0, '21.760'), (1, '19.590')] [2023-10-10 06:40:13,426][53268] Updated weights for policy 1, policy_version 48420 (0.0009) [2023-10-10 06:40:13,787][53268] Updated weights for policy 1, policy_version 48430 (0.0010) [2023-10-10 06:40:14,154][53268] Updated weights for policy 1, policy_version 48440 (0.0011) [2023-10-10 06:40:14,895][53252] Updated weights for policy 0, policy_version 48490 (0.0008) [2023-10-10 06:40:15,263][53252] Updated weights for policy 0, policy_version 48500 (0.0007) [2023-10-10 06:40:15,642][53252] Updated weights for policy 0, policy_version 48510 (0.0009) [2023-10-10 06:40:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 99287040. Throughput: 0: 1683.8, 1: 1677.2. Samples: 24824336. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:16,784][52050] Avg episode reward: [(0, '21.630'), (1, '19.350')] [2023-10-10 06:40:18,202][53268] Updated weights for policy 1, policy_version 48450 (0.0008) [2023-10-10 06:40:18,575][53268] Updated weights for policy 1, policy_version 48460 (0.0008) [2023-10-10 06:40:18,940][53268] Updated weights for policy 1, policy_version 48470 (0.0008) [2023-10-10 06:40:19,308][53268] Updated weights for policy 1, policy_version 48480 (0.0009) [2023-10-10 06:40:19,728][53252] Updated weights for policy 0, policy_version 48520 (0.0009) [2023-10-10 06:40:20,097][53252] Updated weights for policy 0, policy_version 48530 (0.0008) [2023-10-10 06:40:20,470][53252] Updated weights for policy 0, policy_version 48540 (0.0009) [2023-10-10 06:40:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 99352576. Throughput: 0: 1660.1, 1: 1681.2. Samples: 24843802. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:21,784][52050] Avg episode reward: [(0, '21.040'), (1, '20.470')] [2023-10-10 06:40:23,324][53268] Updated weights for policy 1, policy_version 48490 (0.0010) [2023-10-10 06:40:23,685][53268] Updated weights for policy 1, policy_version 48500 (0.0007) [2023-10-10 06:40:24,054][53268] Updated weights for policy 1, policy_version 48510 (0.0009) [2023-10-10 06:40:24,567][53252] Updated weights for policy 0, policy_version 48550 (0.0008) [2023-10-10 06:40:24,938][53252] Updated weights for policy 0, policy_version 48560 (0.0007) [2023-10-10 06:40:25,315][53252] Updated weights for policy 0, policy_version 48570 (0.0009) [2023-10-10 06:40:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 99418112. Throughput: 0: 1679.6, 1: 1689.4. Samples: 24864356. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:26,784][52050] Avg episode reward: [(0, '19.990'), (1, '19.680')] [2023-10-10 06:40:28,060][53268] Updated weights for policy 1, policy_version 48520 (0.0009) [2023-10-10 06:40:28,420][53268] Updated weights for policy 1, policy_version 48530 (0.0010) [2023-10-10 06:40:28,791][53268] Updated weights for policy 1, policy_version 48540 (0.0010) [2023-10-10 06:40:29,355][53252] Updated weights for policy 0, policy_version 48580 (0.0008) [2023-10-10 06:40:29,732][53252] Updated weights for policy 0, policy_version 48590 (0.0010) [2023-10-10 06:40:30,101][53252] Updated weights for policy 0, policy_version 48600 (0.0008) [2023-10-10 06:40:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99483648. Throughput: 0: 1678.6, 1: 1666.1. Samples: 24874410. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:31,784][52050] Avg episode reward: [(0, '22.470'), (1, '18.790')] [2023-10-10 06:40:32,936][53268] Updated weights for policy 1, policy_version 48550 (0.0009) [2023-10-10 06:40:33,298][53268] Updated weights for policy 1, policy_version 48560 (0.0009) [2023-10-10 06:40:33,661][53268] Updated weights for policy 1, policy_version 48570 (0.0010) [2023-10-10 06:40:34,227][53252] Updated weights for policy 0, policy_version 48610 (0.0010) [2023-10-10 06:40:34,637][53252] Updated weights for policy 0, policy_version 48620 (0.0008) [2023-10-10 06:40:35,013][53252] Updated weights for policy 0, policy_version 48630 (0.0010) [2023-10-10 06:40:35,376][53252] Updated weights for policy 0, policy_version 48640 (0.0010) [2023-10-10 06:40:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99549184. Throughput: 0: 1660.9, 1: 1686.8. Samples: 24894024. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:36,784][52050] Avg episode reward: [(0, '22.630'), (1, '18.480')] [2023-10-10 06:40:37,665][53268] Updated weights for policy 1, policy_version 48580 (0.0008) [2023-10-10 06:40:38,036][53268] Updated weights for policy 1, policy_version 48590 (0.0010) [2023-10-10 06:40:38,402][53268] Updated weights for policy 1, policy_version 48600 (0.0008) [2023-10-10 06:40:39,354][53252] Updated weights for policy 0, policy_version 48650 (0.0007) [2023-10-10 06:40:39,723][53252] Updated weights for policy 0, policy_version 48660 (0.0007) [2023-10-10 06:40:40,100][53252] Updated weights for policy 0, policy_version 48670 (0.0007) [2023-10-10 06:40:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99614720. Throughput: 0: 1687.0, 1: 1684.6. Samples: 24914932. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:41,784][52050] Avg episode reward: [(0, '22.390'), (1, '18.250')] [2023-10-10 06:40:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000048672_49840128.pth... [2023-10-10 06:40:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000048608_49774592.pth... [2023-10-10 06:40:41,825][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000047104_48234496.pth [2023-10-10 06:40:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000047040_48168960.pth [2023-10-10 06:40:42,542][53268] Updated weights for policy 1, policy_version 48610 (0.0008) [2023-10-10 06:40:42,905][53268] Updated weights for policy 1, policy_version 48620 (0.0010) [2023-10-10 06:40:43,280][53268] Updated weights for policy 1, policy_version 48630 (0.0010) [2023-10-10 06:40:43,643][53268] Updated weights for policy 1, policy_version 48640 (0.0007) [2023-10-10 06:40:44,049][53252] Updated weights for policy 0, policy_version 48680 (0.0008) [2023-10-10 06:40:44,422][53252] Updated weights for policy 0, policy_version 48690 (0.0008) [2023-10-10 06:40:44,788][53252] Updated weights for policy 0, policy_version 48700 (0.0007) [2023-10-10 06:40:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99680256. Throughput: 0: 1666.8, 1: 1668.5. Samples: 24924652. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 06:40:46,784][52050] Avg episode reward: [(0, '23.040'), (1, '18.990')] [2023-10-10 06:40:47,762][53268] Updated weights for policy 1, policy_version 48650 (0.0010) [2023-10-10 06:40:48,127][53268] Updated weights for policy 1, policy_version 48660 (0.0008) [2023-10-10 06:40:48,492][53268] Updated weights for policy 1, policy_version 48670 (0.0008) [2023-10-10 06:40:49,000][53252] Updated weights for policy 0, policy_version 48710 (0.0009) [2023-10-10 06:40:49,364][53252] Updated weights for policy 0, policy_version 48720 (0.0007) [2023-10-10 06:40:49,741][53252] Updated weights for policy 0, policy_version 48730 (0.0009) [2023-10-10 06:40:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99745792. Throughput: 0: 1667.0, 1: 1693.7. Samples: 24944888. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:40:51,784][52050] Avg episode reward: [(0, '21.790'), (1, '19.020')] [2023-10-10 06:40:52,399][53268] Updated weights for policy 1, policy_version 48680 (0.0009) [2023-10-10 06:40:52,771][53268] Updated weights for policy 1, policy_version 48690 (0.0010) [2023-10-10 06:40:53,141][53268] Updated weights for policy 1, policy_version 48700 (0.0009) [2023-10-10 06:40:53,751][53252] Updated weights for policy 0, policy_version 48740 (0.0009) [2023-10-10 06:40:54,121][53252] Updated weights for policy 0, policy_version 48750 (0.0007) [2023-10-10 06:40:54,485][53252] Updated weights for policy 0, policy_version 48760 (0.0010) [2023-10-10 06:40:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 99811328. Throughput: 0: 1685.8, 1: 1686.2. Samples: 24965592. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:40:56,785][52050] Avg episode reward: [(0, '20.500'), (1, '20.270')] [2023-10-10 06:40:57,252][53268] Updated weights for policy 1, policy_version 48710 (0.0007) [2023-10-10 06:40:57,614][53268] Updated weights for policy 1, policy_version 48720 (0.0008) [2023-10-10 06:40:57,977][53268] Updated weights for policy 1, policy_version 48730 (0.0008) [2023-10-10 06:40:58,439][53252] Updated weights for policy 0, policy_version 48770 (0.0009) [2023-10-10 06:40:58,808][53252] Updated weights for policy 0, policy_version 48780 (0.0008) [2023-10-10 06:40:59,180][53252] Updated weights for policy 0, policy_version 48790 (0.0008) [2023-10-10 06:40:59,557][53252] Updated weights for policy 0, policy_version 48800 (0.0011) [2023-10-10 06:41:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 99876864. Throughput: 0: 1666.8, 1: 1684.2. Samples: 24975128. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:41:01,784][52050] Avg episode reward: [(0, '20.970'), (1, '19.950')] [2023-10-10 06:41:02,124][53268] Updated weights for policy 1, policy_version 48740 (0.0010) [2023-10-10 06:41:02,504][53268] Updated weights for policy 1, policy_version 48750 (0.0007) [2023-10-10 06:41:02,873][53268] Updated weights for policy 1, policy_version 48760 (0.0008) [2023-10-10 06:41:03,600][53252] Updated weights for policy 0, policy_version 48810 (0.0008) [2023-10-10 06:41:03,974][53252] Updated weights for policy 0, policy_version 48820 (0.0008) [2023-10-10 06:41:04,337][53252] Updated weights for policy 0, policy_version 48830 (0.0008) [2023-10-10 06:41:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99942400. Throughput: 0: 1685.9, 1: 1688.1. Samples: 24995632. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:41:06,784][52050] Avg episode reward: [(0, '22.350'), (1, '18.860')] [2023-10-10 06:41:06,860][53268] Updated weights for policy 1, policy_version 48770 (0.0009) [2023-10-10 06:41:07,219][53268] Updated weights for policy 1, policy_version 48780 (0.0008) [2023-10-10 06:41:07,589][53268] Updated weights for policy 1, policy_version 48790 (0.0009) [2023-10-10 06:41:07,953][53268] Updated weights for policy 1, policy_version 48800 (0.0008) [2023-10-10 06:41:08,261][53252] Updated weights for policy 0, policy_version 48840 (0.0008) [2023-10-10 06:41:08,628][53252] Updated weights for policy 0, policy_version 48850 (0.0007) [2023-10-10 06:41:08,993][53252] Updated weights for policy 0, policy_version 48860 (0.0008) [2023-10-10 06:41:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100007936. Throughput: 0: 1698.7, 1: 1685.1. Samples: 25016624. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:41:11,784][52050] Avg episode reward: [(0, '21.060'), (1, '17.940')] [2023-10-10 06:41:12,076][53268] Updated weights for policy 1, policy_version 48810 (0.0010) [2023-10-10 06:41:12,449][53268] Updated weights for policy 1, policy_version 48820 (0.0007) [2023-10-10 06:41:12,812][53268] Updated weights for policy 1, policy_version 48830 (0.0007) [2023-10-10 06:41:12,940][53252] Updated weights for policy 0, policy_version 48870 (0.0009) [2023-10-10 06:41:13,325][53252] Updated weights for policy 0, policy_version 48880 (0.0010) [2023-10-10 06:41:13,687][53252] Updated weights for policy 0, policy_version 48890 (0.0010) [2023-10-10 06:41:16,780][53268] Updated weights for policy 1, policy_version 48840 (0.0007) [2023-10-10 06:41:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100073472. Throughput: 0: 1679.0, 1: 1687.6. Samples: 25025910. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:41:16,784][52050] Avg episode reward: [(0, '20.870'), (1, '19.510')] [2023-10-10 06:41:17,141][53268] Updated weights for policy 1, policy_version 48850 (0.0007) [2023-10-10 06:41:17,515][53268] Updated weights for policy 1, policy_version 48860 (0.0007) [2023-10-10 06:41:17,794][53252] Updated weights for policy 0, policy_version 48900 (0.0011) [2023-10-10 06:41:18,169][53252] Updated weights for policy 0, policy_version 48910 (0.0009) [2023-10-10 06:41:18,544][53252] Updated weights for policy 0, policy_version 48920 (0.0007) [2023-10-10 06:41:21,724][53268] Updated weights for policy 1, policy_version 48870 (0.0008) [2023-10-10 06:41:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100139008. Throughput: 0: 1703.5, 1: 1688.2. Samples: 25046650. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-10 06:41:21,784][52050] Avg episode reward: [(0, '22.720'), (1, '18.220')] [2023-10-10 06:41:22,102][53268] Updated weights for policy 1, policy_version 48880 (0.0007) [2023-10-10 06:41:22,463][53268] Updated weights for policy 1, policy_version 48890 (0.0007) [2023-10-10 06:41:22,581][53252] Updated weights for policy 0, policy_version 48930 (0.0009) [2023-10-10 06:41:22,985][53252] Updated weights for policy 0, policy_version 48940 (0.0007) [2023-10-10 06:41:23,360][53252] Updated weights for policy 0, policy_version 48950 (0.0008) [2023-10-10 06:41:23,726][53252] Updated weights for policy 0, policy_version 48960 (0.0009) [2023-10-10 06:41:26,647][53268] Updated weights for policy 1, policy_version 48900 (0.0009) [2023-10-10 06:41:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100204544. Throughput: 0: 1705.4, 1: 1684.3. Samples: 25067470. 
Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:26,784][52050] Avg episode reward: [(0, '21.780'), (1, '18.840')] [2023-10-10 06:41:27,021][53268] Updated weights for policy 1, policy_version 48910 (0.0011) [2023-10-10 06:41:27,373][53268] Updated weights for policy 1, policy_version 48920 (0.0007) [2023-10-10 06:41:27,590][53252] Updated weights for policy 0, policy_version 48970 (0.0009) [2023-10-10 06:41:27,973][53252] Updated weights for policy 0, policy_version 48980 (0.0009) [2023-10-10 06:41:28,350][53252] Updated weights for policy 0, policy_version 48990 (0.0010) [2023-10-10 06:41:31,405][53268] Updated weights for policy 1, policy_version 48930 (0.0009) [2023-10-10 06:41:31,758][53268] Updated weights for policy 1, policy_version 48940 (0.0011) [2023-10-10 06:41:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100270080. Throughput: 0: 1690.2, 1: 1683.9. Samples: 25076488. Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:31,784][52050] Avg episode reward: [(0, '21.610'), (1, '19.250')] [2023-10-10 06:41:32,121][53268] Updated weights for policy 1, policy_version 48950 (0.0008) [2023-10-10 06:41:32,232][53252] Updated weights for policy 0, policy_version 49000 (0.0008) [2023-10-10 06:41:32,489][53268] Updated weights for policy 1, policy_version 48960 (0.0007) [2023-10-10 06:41:32,603][53252] Updated weights for policy 0, policy_version 49010 (0.0008) [2023-10-10 06:41:32,989][53252] Updated weights for policy 0, policy_version 49020 (0.0009) [2023-10-10 06:41:36,536][53268] Updated weights for policy 1, policy_version 48970 (0.0008) [2023-10-10 06:41:36,784][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100335616. Throughput: 0: 1704.1, 1: 1679.9. Samples: 25097166. 
Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:36,785][52050] Avg episode reward: [(0, '21.220'), (1, '19.250')] [2023-10-10 06:41:36,907][53268] Updated weights for policy 1, policy_version 48980 (0.0009) [2023-10-10 06:41:37,093][53252] Updated weights for policy 0, policy_version 49030 (0.0008) [2023-10-10 06:41:37,277][53268] Updated weights for policy 1, policy_version 48990 (0.0007) [2023-10-10 06:41:37,452][53252] Updated weights for policy 0, policy_version 49040 (0.0009) [2023-10-10 06:41:37,823][53252] Updated weights for policy 0, policy_version 49050 (0.0010) [2023-10-10 06:41:41,281][53268] Updated weights for policy 1, policy_version 49000 (0.0009) [2023-10-10 06:41:41,650][53268] Updated weights for policy 1, policy_version 49010 (0.0011) [2023-10-10 06:41:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100401152. Throughput: 0: 1706.7, 1: 1673.8. Samples: 25117714. Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:41,784][52050] Avg episode reward: [(0, '20.710'), (1, '19.350')] [2023-10-10 06:41:42,024][53268] Updated weights for policy 1, policy_version 49020 (0.0011) [2023-10-10 06:41:42,026][53252] Updated weights for policy 0, policy_version 49060 (0.0008) [2023-10-10 06:41:42,400][53252] Updated weights for policy 0, policy_version 49070 (0.0008) [2023-10-10 06:41:42,767][53252] Updated weights for policy 0, policy_version 49080 (0.0007) [2023-10-10 06:41:46,128][53268] Updated weights for policy 1, policy_version 49030 (0.0009) [2023-10-10 06:41:46,499][53268] Updated weights for policy 1, policy_version 49040 (0.0007) [2023-10-10 06:41:46,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100466688. Throughput: 0: 1699.2, 1: 1673.9. Samples: 25126920. 
Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:46,784][52050] Avg episode reward: [(0, '21.490'), (1, '18.980')] [2023-10-10 06:41:46,831][53252] Updated weights for policy 0, policy_version 49090 (0.0009) [2023-10-10 06:41:46,853][53268] Updated weights for policy 1, policy_version 49050 (0.0008) [2023-10-10 06:41:47,200][53252] Updated weights for policy 0, policy_version 49100 (0.0009) [2023-10-10 06:41:47,569][53252] Updated weights for policy 0, policy_version 49110 (0.0008) [2023-10-10 06:41:47,947][53252] Updated weights for policy 0, policy_version 49120 (0.0009) [2023-10-10 06:41:50,848][53268] Updated weights for policy 1, policy_version 49060 (0.0008) [2023-10-10 06:41:51,219][53268] Updated weights for policy 1, policy_version 49070 (0.0010) [2023-10-10 06:41:51,589][53268] Updated weights for policy 1, policy_version 49080 (0.0010) [2023-10-10 06:41:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 100532224. Throughput: 0: 1703.2, 1: 1676.0. Samples: 25147696. Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:51,784][52050] Avg episode reward: [(0, '20.660'), (1, '18.780')] [2023-10-10 06:41:52,011][53252] Updated weights for policy 0, policy_version 49130 (0.0008) [2023-10-10 06:41:52,377][53252] Updated weights for policy 0, policy_version 49140 (0.0008) [2023-10-10 06:41:52,751][53252] Updated weights for policy 0, policy_version 49150 (0.0007) [2023-10-10 06:41:55,631][53268] Updated weights for policy 1, policy_version 49090 (0.0008) [2023-10-10 06:41:55,990][53268] Updated weights for policy 1, policy_version 49100 (0.0010) [2023-10-10 06:41:56,360][53268] Updated weights for policy 1, policy_version 49110 (0.0007) [2023-10-10 06:41:56,727][53268] Updated weights for policy 1, policy_version 49120 (0.0008) [2023-10-10 06:41:56,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100630528. Throughput: 0: 1698.7, 1: 1662.6. 
Samples: 25167882. Policy #0 lag: (min: 21.0, avg: 27.8, max: 53.0) [2023-10-10 06:41:56,784][52050] Avg episode reward: [(0, '20.980'), (1, '18.970')] [2023-10-10 06:41:56,871][53252] Updated weights for policy 0, policy_version 49160 (0.0008) [2023-10-10 06:41:57,242][53252] Updated weights for policy 0, policy_version 49170 (0.0008) [2023-10-10 06:41:57,613][53252] Updated weights for policy 0, policy_version 49180 (0.0011) [2023-10-10 06:42:00,960][53268] Updated weights for policy 1, policy_version 49130 (0.0010) [2023-10-10 06:42:01,329][53268] Updated weights for policy 1, policy_version 49140 (0.0010) [2023-10-10 06:42:01,695][53268] Updated weights for policy 1, policy_version 49150 (0.0009) [2023-10-10 06:42:01,724][53252] Updated weights for policy 0, policy_version 49190 (0.0009) [2023-10-10 06:42:01,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100696064. Throughput: 0: 1695.2, 1: 1672.9. Samples: 25177474. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-10 06:42:01,784][52050] Avg episode reward: [(0, '22.110'), (1, '21.210')] [2023-10-10 06:42:02,090][53252] Updated weights for policy 0, policy_version 49200 (0.0008) [2023-10-10 06:42:02,458][53252] Updated weights for policy 0, policy_version 49210 (0.0008) [2023-10-10 06:42:05,763][53268] Updated weights for policy 1, policy_version 49160 (0.0009) [2023-10-10 06:42:06,129][53268] Updated weights for policy 1, policy_version 49170 (0.0007) [2023-10-10 06:42:06,482][53252] Updated weights for policy 0, policy_version 49220 (0.0007) [2023-10-10 06:42:06,485][53268] Updated weights for policy 1, policy_version 49180 (0.0007) [2023-10-10 06:42:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100761600. Throughput: 0: 1694.2, 1: 1675.6. Samples: 25198288. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-10 06:42:06,784][52050] Avg episode reward: [(0, '21.030'), (1, '19.000')] [2023-10-10 06:42:06,853][53252] Updated weights for policy 0, policy_version 49230 (0.0008) [2023-10-10 06:42:07,235][53252] Updated weights for policy 0, policy_version 49240 (0.0010) [2023-10-10 06:42:10,604][53268] Updated weights for policy 1, policy_version 49190 (0.0008) [2023-10-10 06:42:10,975][53268] Updated weights for policy 1, policy_version 49200 (0.0009) [2023-10-10 06:42:11,345][53268] Updated weights for policy 1, policy_version 49210 (0.0009) [2023-10-10 06:42:11,396][53252] Updated weights for policy 0, policy_version 49250 (0.0009) [2023-10-10 06:42:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100827136. Throughput: 0: 1686.3, 1: 1661.6. Samples: 25218124. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-10 06:42:11,784][52050] Avg episode reward: [(0, '19.550'), (1, '19.780')] [2023-10-10 06:42:11,798][53252] Updated weights for policy 0, policy_version 49260 (0.0008) [2023-10-10 06:42:12,162][53252] Updated weights for policy 0, policy_version 49270 (0.0007) [2023-10-10 06:42:12,540][53252] Updated weights for policy 0, policy_version 49280 (0.0008) [2023-10-10 06:42:15,539][53268] Updated weights for policy 1, policy_version 49220 (0.0008) [2023-10-10 06:42:15,906][53268] Updated weights for policy 1, policy_version 49230 (0.0008) [2023-10-10 06:42:16,271][53268] Updated weights for policy 1, policy_version 49240 (0.0007) [2023-10-10 06:42:16,522][53252] Updated weights for policy 0, policy_version 49290 (0.0008) [2023-10-10 06:42:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100892672. Throughput: 0: 1686.5, 1: 1682.0. Samples: 25228068. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-10 06:42:16,784][52050] Avg episode reward: [(0, '21.150'), (1, '20.340')] [2023-10-10 06:42:16,892][53252] Updated weights for policy 0, policy_version 49300 (0.0010) [2023-10-10 06:42:17,275][53252] Updated weights for policy 0, policy_version 49310 (0.0010) [2023-10-10 06:42:20,364][53268] Updated weights for policy 1, policy_version 49250 (0.0010) [2023-10-10 06:42:20,733][53268] Updated weights for policy 1, policy_version 49260 (0.0009) [2023-10-10 06:42:21,109][53268] Updated weights for policy 1, policy_version 49270 (0.0009) [2023-10-10 06:42:21,249][53252] Updated weights for policy 0, policy_version 49320 (0.0011) [2023-10-10 06:42:21,464][53268] Updated weights for policy 1, policy_version 49280 (0.0008) [2023-10-10 06:42:21,616][53252] Updated weights for policy 0, policy_version 49330 (0.0008) [2023-10-10 06:42:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100958208. Throughput: 0: 1689.8, 1: 1678.9. Samples: 25248756. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-10 06:42:21,784][52050] Avg episode reward: [(0, '20.380'), (1, '21.430')] [2023-10-10 06:42:21,993][53252] Updated weights for policy 0, policy_version 49340 (0.0007) [2023-10-10 06:42:25,455][53268] Updated weights for policy 1, policy_version 49290 (0.0009) [2023-10-10 06:42:25,813][53268] Updated weights for policy 1, policy_version 49300 (0.0009) [2023-10-10 06:42:25,924][53252] Updated weights for policy 0, policy_version 49350 (0.0008) [2023-10-10 06:42:26,182][53268] Updated weights for policy 1, policy_version 49310 (0.0008) [2023-10-10 06:42:26,296][53252] Updated weights for policy 0, policy_version 49360 (0.0008) [2023-10-10 06:42:26,657][53252] Updated weights for policy 0, policy_version 49370 (0.0009) [2023-10-10 06:42:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101023744. Throughput: 0: 1669.6, 1: 1660.7. 
Samples: 25267576. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-10 06:42:26,785][52050] Avg episode reward: [(0, '19.040'), (1, '21.020')] [2023-10-10 06:42:30,461][53268] Updated weights for policy 1, policy_version 49320 (0.0008) [2023-10-10 06:42:30,550][53252] Updated weights for policy 0, policy_version 49380 (0.0008) [2023-10-10 06:42:30,838][53268] Updated weights for policy 1, policy_version 49330 (0.0008) [2023-10-10 06:42:30,921][53252] Updated weights for policy 0, policy_version 49390 (0.0009) [2023-10-10 06:42:31,211][53268] Updated weights for policy 1, policy_version 49340 (0.0008) [2023-10-10 06:42:31,296][53252] Updated weights for policy 0, policy_version 49400 (0.0008) [2023-10-10 06:42:31,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 101122048. Throughput: 0: 1685.6, 1: 1686.4. Samples: 25278662. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:42:31,784][52050] Avg episode reward: [(0, '20.110'), (1, '21.000')] [2023-10-10 06:42:35,216][53268] Updated weights for policy 1, policy_version 49350 (0.0010) [2023-10-10 06:42:35,518][53252] Updated weights for policy 0, policy_version 49410 (0.0007) [2023-10-10 06:42:35,588][53268] Updated weights for policy 1, policy_version 49360 (0.0009) [2023-10-10 06:42:35,894][53252] Updated weights for policy 0, policy_version 49420 (0.0007) [2023-10-10 06:42:35,952][53268] Updated weights for policy 1, policy_version 49370 (0.0008) [2023-10-10 06:42:36,267][53252] Updated weights for policy 0, policy_version 49430 (0.0008) [2023-10-10 06:42:36,640][53252] Updated weights for policy 0, policy_version 49440 (0.0011) [2023-10-10 06:42:36,783][52050] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 101187584. Throughput: 0: 1684.7, 1: 1678.2. Samples: 25299024. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:42:36,784][52050] Avg episode reward: [(0, '21.380'), (1, '20.760')] [2023-10-10 06:42:39,947][53268] Updated weights for policy 1, policy_version 49380 (0.0009) [2023-10-10 06:42:40,321][53268] Updated weights for policy 1, policy_version 49390 (0.0011) [2023-10-10 06:42:40,685][53268] Updated weights for policy 1, policy_version 49400 (0.0009) [2023-10-10 06:42:40,780][53252] Updated weights for policy 0, policy_version 49450 (0.0007) [2023-10-10 06:42:41,148][53252] Updated weights for policy 0, policy_version 49460 (0.0010) [2023-10-10 06:42:41,528][53252] Updated weights for policy 0, policy_version 49470 (0.0007) [2023-10-10 06:42:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 101253120. Throughput: 0: 1658.3, 1: 1668.4. Samples: 25317584. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:42:41,784][52050] Avg episode reward: [(0, '22.370'), (1, '19.980')] [2023-10-10 06:42:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000049408_50593792.pth... [2023-10-10 06:42:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000049472_50659328.pth... 
[2023-10-10 06:42:41,829][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000047872_49020928.pth [2023-10-10 06:42:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000047840_48988160.pth [2023-10-10 06:42:44,663][53268] Updated weights for policy 1, policy_version 49410 (0.0007) [2023-10-10 06:42:45,038][53268] Updated weights for policy 1, policy_version 49420 (0.0009) [2023-10-10 06:42:45,401][53268] Updated weights for policy 1, policy_version 49430 (0.0008) [2023-10-10 06:42:45,609][53252] Updated weights for policy 0, policy_version 49480 (0.0007) [2023-10-10 06:42:45,761][53268] Updated weights for policy 1, policy_version 49440 (0.0008) [2023-10-10 06:42:45,986][53252] Updated weights for policy 0, policy_version 49490 (0.0007) [2023-10-10 06:42:46,366][53252] Updated weights for policy 0, policy_version 49500 (0.0007) [2023-10-10 06:42:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 101318656. Throughput: 0: 1681.7, 1: 1684.8. Samples: 25328966. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:42:46,784][52050] Avg episode reward: [(0, '22.740'), (1, '18.270')] [2023-10-10 06:42:49,872][53268] Updated weights for policy 1, policy_version 49450 (0.0007) [2023-10-10 06:42:50,237][53268] Updated weights for policy 1, policy_version 49460 (0.0009) [2023-10-10 06:42:50,424][53252] Updated weights for policy 0, policy_version 49510 (0.0008) [2023-10-10 06:42:50,610][53268] Updated weights for policy 1, policy_version 49470 (0.0008) [2023-10-10 06:42:50,796][53252] Updated weights for policy 0, policy_version 49520 (0.0009) [2023-10-10 06:42:51,165][53252] Updated weights for policy 0, policy_version 49530 (0.0010) [2023-10-10 06:42:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 101384192. Throughput: 0: 1675.2, 1: 1669.3. Samples: 25348794. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:42:51,784][52050] Avg episode reward: [(0, '22.620'), (1, '19.070')] [2023-10-10 06:42:54,580][53268] Updated weights for policy 1, policy_version 49480 (0.0009) [2023-10-10 06:42:54,943][53268] Updated weights for policy 1, policy_version 49490 (0.0009) [2023-10-10 06:42:55,222][53252] Updated weights for policy 0, policy_version 49540 (0.0008) [2023-10-10 06:42:55,301][53268] Updated weights for policy 1, policy_version 49500 (0.0010) [2023-10-10 06:42:55,588][53252] Updated weights for policy 0, policy_version 49550 (0.0007) [2023-10-10 06:42:55,955][53252] Updated weights for policy 0, policy_version 49560 (0.0008) [2023-10-10 06:42:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 101449728. Throughput: 0: 1653.7, 1: 1676.1. Samples: 25367960. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:42:56,784][52050] Avg episode reward: [(0, '21.030'), (1, '19.080')] [2023-10-10 06:42:59,320][53268] Updated weights for policy 1, policy_version 49510 (0.0008) [2023-10-10 06:42:59,696][53268] Updated weights for policy 1, policy_version 49520 (0.0009) [2023-10-10 06:42:59,998][53252] Updated weights for policy 0, policy_version 49570 (0.0007) [2023-10-10 06:43:00,056][53268] Updated weights for policy 1, policy_version 49530 (0.0009) [2023-10-10 06:43:00,385][53252] Updated weights for policy 0, policy_version 49580 (0.0008) [2023-10-10 06:43:00,758][53252] Updated weights for policy 0, policy_version 49590 (0.0007) [2023-10-10 06:43:01,137][53252] Updated weights for policy 0, policy_version 49600 (0.0008) [2023-10-10 06:43:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101515264. Throughput: 0: 1684.0, 1: 1682.6. Samples: 25379564. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-10 06:43:01,784][52050] Avg episode reward: [(0, '21.000'), (1, '19.090')] [2023-10-10 06:43:04,100][53268] Updated weights for policy 1, policy_version 49540 (0.0008) [2023-10-10 06:43:04,453][53268] Updated weights for policy 1, policy_version 49550 (0.0009) [2023-10-10 06:43:04,830][53268] Updated weights for policy 1, policy_version 49560 (0.0009) [2023-10-10 06:43:05,186][53252] Updated weights for policy 0, policy_version 49610 (0.0010) [2023-10-10 06:43:05,555][53252] Updated weights for policy 0, policy_version 49620 (0.0008) [2023-10-10 06:43:05,934][53252] Updated weights for policy 0, policy_version 49630 (0.0007) [2023-10-10 06:43:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 101580800. Throughput: 0: 1672.4, 1: 1656.2. Samples: 25398544. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:06,784][52050] Avg episode reward: [(0, '21.000'), (1, '19.260')] [2023-10-10 06:43:08,952][53268] Updated weights for policy 1, policy_version 49570 (0.0009) [2023-10-10 06:43:09,320][53268] Updated weights for policy 1, policy_version 49580 (0.0007) [2023-10-10 06:43:09,692][53268] Updated weights for policy 1, policy_version 49590 (0.0008) [2023-10-10 06:43:09,935][53252] Updated weights for policy 0, policy_version 49640 (0.0009) [2023-10-10 06:43:10,054][53268] Updated weights for policy 1, policy_version 49600 (0.0007) [2023-10-10 06:43:10,309][53252] Updated weights for policy 0, policy_version 49650 (0.0010) [2023-10-10 06:43:10,674][53252] Updated weights for policy 0, policy_version 49660 (0.0007) [2023-10-10 06:43:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 101646336. Throughput: 0: 1672.5, 1: 1682.0. Samples: 25418530. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:11,784][52050] Avg episode reward: [(0, '21.990'), (1, '19.660')] [2023-10-10 06:43:14,184][53268] Updated weights for policy 1, policy_version 49610 (0.0009) [2023-10-10 06:43:14,542][53268] Updated weights for policy 1, policy_version 49620 (0.0011) [2023-10-10 06:43:14,811][53252] Updated weights for policy 0, policy_version 49670 (0.0008) [2023-10-10 06:43:14,914][53268] Updated weights for policy 1, policy_version 49630 (0.0009) [2023-10-10 06:43:15,180][53252] Updated weights for policy 0, policy_version 49680 (0.0007) [2023-10-10 06:43:15,550][53252] Updated weights for policy 0, policy_version 49690 (0.0010) [2023-10-10 06:43:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101711872. Throughput: 0: 1689.5, 1: 1673.1. Samples: 25429980. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:16,785][52050] Avg episode reward: [(0, '22.590'), (1, '18.430')] [2023-10-10 06:43:19,149][53268] Updated weights for policy 1, policy_version 49640 (0.0008) [2023-10-10 06:43:19,517][53268] Updated weights for policy 1, policy_version 49650 (0.0010) [2023-10-10 06:43:19,561][53252] Updated weights for policy 0, policy_version 49700 (0.0007) [2023-10-10 06:43:19,885][53268] Updated weights for policy 1, policy_version 49660 (0.0009) [2023-10-10 06:43:19,926][53252] Updated weights for policy 0, policy_version 49710 (0.0009) [2023-10-10 06:43:20,286][53252] Updated weights for policy 0, policy_version 49720 (0.0009) [2023-10-10 06:43:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 101777408. Throughput: 0: 1669.3, 1: 1656.2. Samples: 25448672. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:21,784][52050] Avg episode reward: [(0, '23.010'), (1, '19.190')] [2023-10-10 06:43:24,044][53268] Updated weights for policy 1, policy_version 49670 (0.0011) [2023-10-10 06:43:24,418][53268] Updated weights for policy 1, policy_version 49680 (0.0010) [2023-10-10 06:43:24,431][53252] Updated weights for policy 0, policy_version 49730 (0.0008) [2023-10-10 06:43:24,790][53268] Updated weights for policy 1, policy_version 49690 (0.0009) [2023-10-10 06:43:24,801][53252] Updated weights for policy 0, policy_version 49740 (0.0008) [2023-10-10 06:43:25,169][53252] Updated weights for policy 0, policy_version 49750 (0.0008) [2023-10-10 06:43:25,545][53252] Updated weights for policy 0, policy_version 49760 (0.0008) [2023-10-10 06:43:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 101842944. Throughput: 0: 1685.5, 1: 1678.0. Samples: 25468942. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:26,784][52050] Avg episode reward: [(0, '22.590'), (1, '19.930')] [2023-10-10 06:43:28,933][53268] Updated weights for policy 1, policy_version 49700 (0.0010) [2023-10-10 06:43:29,304][53268] Updated weights for policy 1, policy_version 49710 (0.0009) [2023-10-10 06:43:29,559][53252] Updated weights for policy 0, policy_version 49770 (0.0009) [2023-10-10 06:43:29,666][53268] Updated weights for policy 1, policy_version 49720 (0.0009) [2023-10-10 06:43:29,931][53252] Updated weights for policy 0, policy_version 49780 (0.0008) [2023-10-10 06:43:30,297][53252] Updated weights for policy 0, policy_version 49790 (0.0011) [2023-10-10 06:43:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 101908480. Throughput: 0: 1687.6, 1: 1664.1. Samples: 25479792. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:31,784][52050] Avg episode reward: [(0, '20.600'), (1, '19.870')] [2023-10-10 06:43:33,696][53268] Updated weights for policy 1, policy_version 49730 (0.0008) [2023-10-10 06:43:34,066][53268] Updated weights for policy 1, policy_version 49740 (0.0009) [2023-10-10 06:43:34,281][53252] Updated weights for policy 0, policy_version 49800 (0.0009) [2023-10-10 06:43:34,438][53268] Updated weights for policy 1, policy_version 49750 (0.0009) [2023-10-10 06:43:34,649][53252] Updated weights for policy 0, policy_version 49810 (0.0007) [2023-10-10 06:43:34,800][53268] Updated weights for policy 1, policy_version 49760 (0.0009) [2023-10-10 06:43:35,024][53252] Updated weights for policy 0, policy_version 49820 (0.0008) [2023-10-10 06:43:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 101974016. Throughput: 0: 1668.5, 1: 1660.3. Samples: 25498592. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:36,784][52050] Avg episode reward: [(0, '20.700'), (1, '19.230')] [2023-10-10 06:43:38,858][53268] Updated weights for policy 1, policy_version 49770 (0.0008) [2023-10-10 06:43:39,048][53252] Updated weights for policy 0, policy_version 49830 (0.0008) [2023-10-10 06:43:39,218][53268] Updated weights for policy 1, policy_version 49780 (0.0008) [2023-10-10 06:43:39,414][53252] Updated weights for policy 0, policy_version 49840 (0.0009) [2023-10-10 06:43:39,588][53268] Updated weights for policy 1, policy_version 49790 (0.0008) [2023-10-10 06:43:39,787][53252] Updated weights for policy 0, policy_version 49850 (0.0009) [2023-10-10 06:43:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102039552. Throughput: 0: 1693.4, 1: 1670.7. Samples: 25519346. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:41,785][52050] Avg episode reward: [(0, '21.550'), (1, '20.630')] [2023-10-10 06:43:43,713][53268] Updated weights for policy 1, policy_version 49800 (0.0009) [2023-10-10 06:43:43,815][53252] Updated weights for policy 0, policy_version 49860 (0.0008) [2023-10-10 06:43:44,080][53268] Updated weights for policy 1, policy_version 49810 (0.0008) [2023-10-10 06:43:44,179][53252] Updated weights for policy 0, policy_version 49870 (0.0007) [2023-10-10 06:43:44,450][53268] Updated weights for policy 1, policy_version 49820 (0.0009) [2023-10-10 06:43:44,554][53252] Updated weights for policy 0, policy_version 49880 (0.0007) [2023-10-10 06:43:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 102105088. Throughput: 0: 1675.3, 1: 1656.3. Samples: 25529488. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:46,784][52050] Avg episode reward: [(0, '21.150'), (1, '21.090')] [2023-10-10 06:43:48,417][53268] Updated weights for policy 1, policy_version 49830 (0.0010) [2023-10-10 06:43:48,758][53252] Updated weights for policy 0, policy_version 49890 (0.0008) [2023-10-10 06:43:48,771][53268] Updated weights for policy 1, policy_version 49840 (0.0010) [2023-10-10 06:43:49,131][53252] Updated weights for policy 0, policy_version 49900 (0.0007) [2023-10-10 06:43:49,143][53268] Updated weights for policy 1, policy_version 49850 (0.0010) [2023-10-10 06:43:49,493][53252] Updated weights for policy 0, policy_version 49910 (0.0010) [2023-10-10 06:43:49,858][53252] Updated weights for policy 0, policy_version 49920 (0.0009) [2023-10-10 06:43:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102170624. Throughput: 0: 1665.9, 1: 1673.7. Samples: 25548828. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:51,785][52050] Avg episode reward: [(0, '21.480'), (1, '20.620')] [2023-10-10 06:43:53,150][53268] Updated weights for policy 1, policy_version 49860 (0.0008) [2023-10-10 06:43:53,519][53268] Updated weights for policy 1, policy_version 49870 (0.0010) [2023-10-10 06:43:53,879][53268] Updated weights for policy 1, policy_version 49880 (0.0008) [2023-10-10 06:43:54,140][53252] Updated weights for policy 0, policy_version 49930 (0.0007) [2023-10-10 06:43:54,523][53252] Updated weights for policy 0, policy_version 49940 (0.0008) [2023-10-10 06:43:54,882][53252] Updated weights for policy 0, policy_version 49950 (0.0007) [2023-10-10 06:43:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 102236160. Throughput: 0: 1681.4, 1: 1668.7. Samples: 25569282. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:43:56,784][52050] Avg episode reward: [(0, '21.270'), (1, '20.100')] [2023-10-10 06:43:58,091][53268] Updated weights for policy 1, policy_version 49890 (0.0008) [2023-10-10 06:43:58,459][53268] Updated weights for policy 1, policy_version 49900 (0.0010) [2023-10-10 06:43:58,826][53268] Updated weights for policy 1, policy_version 49910 (0.0009) [2023-10-10 06:43:59,121][53252] Updated weights for policy 0, policy_version 49960 (0.0007) [2023-10-10 06:43:59,192][53268] Updated weights for policy 1, policy_version 49920 (0.0009) [2023-10-10 06:43:59,491][53252] Updated weights for policy 0, policy_version 49970 (0.0007) [2023-10-10 06:43:59,863][53252] Updated weights for policy 0, policy_version 49980 (0.0008) [2023-10-10 06:44:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102301696. Throughput: 0: 1661.6, 1: 1651.4. Samples: 25579066. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:44:01,784][52050] Avg episode reward: [(0, '20.000'), (1, '20.740')] [2023-10-10 06:44:03,295][53268] Updated weights for policy 1, policy_version 49930 (0.0008) [2023-10-10 06:44:03,664][53268] Updated weights for policy 1, policy_version 49940 (0.0008) [2023-10-10 06:44:03,976][53252] Updated weights for policy 0, policy_version 49990 (0.0010) [2023-10-10 06:44:04,039][53268] Updated weights for policy 1, policy_version 49950 (0.0010) [2023-10-10 06:44:04,341][53252] Updated weights for policy 0, policy_version 50000 (0.0009) [2023-10-10 06:44:04,714][53252] Updated weights for policy 0, policy_version 50010 (0.0007) [2023-10-10 06:44:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102367232. Throughput: 0: 1663.2, 1: 1671.5. Samples: 25598734. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:44:06,784][52050] Avg episode reward: [(0, '20.610'), (1, '19.650')] [2023-10-10 06:44:08,122][53268] Updated weights for policy 1, policy_version 49960 (0.0010) [2023-10-10 06:44:08,491][53268] Updated weights for policy 1, policy_version 49970 (0.0008) [2023-10-10 06:44:08,676][53252] Updated weights for policy 0, policy_version 50020 (0.0008) [2023-10-10 06:44:08,848][53268] Updated weights for policy 1, policy_version 49980 (0.0009) [2023-10-10 06:44:09,065][53252] Updated weights for policy 0, policy_version 50030 (0.0008) [2023-10-10 06:44:09,427][53252] Updated weights for policy 0, policy_version 50040 (0.0007) [2023-10-10 06:44:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102432768. Throughput: 0: 1670.7, 1: 1680.2. Samples: 25619730. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 06:44:11,784][52050] Avg episode reward: [(0, '19.530'), (1, '19.420')] [2023-10-10 06:44:13,109][53268] Updated weights for policy 1, policy_version 49990 (0.0009) [2023-10-10 06:44:13,427][53252] Updated weights for policy 0, policy_version 50050 (0.0008) [2023-10-10 06:44:13,483][53268] Updated weights for policy 1, policy_version 50000 (0.0009) [2023-10-10 06:44:13,792][53252] Updated weights for policy 0, policy_version 50060 (0.0008) [2023-10-10 06:44:13,855][53268] Updated weights for policy 1, policy_version 50010 (0.0008) [2023-10-10 06:44:14,177][53252] Updated weights for policy 0, policy_version 50070 (0.0010) [2023-10-10 06:44:14,549][53252] Updated weights for policy 0, policy_version 50080 (0.0007) [2023-10-10 06:44:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102498304. Throughput: 0: 1651.3, 1: 1663.9. Samples: 25628976. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:16,784][52050] Avg episode reward: [(0, '20.770'), (1, '18.540')] [2023-10-10 06:44:17,887][53268] Updated weights for policy 1, policy_version 50020 (0.0009) [2023-10-10 06:44:18,249][53268] Updated weights for policy 1, policy_version 50030 (0.0007) [2023-10-10 06:44:18,608][53252] Updated weights for policy 0, policy_version 50090 (0.0008) [2023-10-10 06:44:18,612][53268] Updated weights for policy 1, policy_version 50040 (0.0010) [2023-10-10 06:44:18,983][53252] Updated weights for policy 0, policy_version 50100 (0.0009) [2023-10-10 06:44:19,356][53252] Updated weights for policy 0, policy_version 50110 (0.0007) [2023-10-10 06:44:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 102563840. Throughput: 0: 1669.6, 1: 1681.2. Samples: 25649376. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:21,784][52050] Avg episode reward: [(0, '20.810'), (1, '19.440')] [2023-10-10 06:44:22,797][53268] Updated weights for policy 1, policy_version 50050 (0.0009) [2023-10-10 06:44:23,167][53268] Updated weights for policy 1, policy_version 50060 (0.0010) [2023-10-10 06:44:23,480][53252] Updated weights for policy 0, policy_version 50120 (0.0008) [2023-10-10 06:44:23,519][53268] Updated weights for policy 1, policy_version 50070 (0.0008) [2023-10-10 06:44:23,856][53252] Updated weights for policy 0, policy_version 50130 (0.0007) [2023-10-10 06:44:23,891][53268] Updated weights for policy 1, policy_version 50080 (0.0008) [2023-10-10 06:44:24,228][53252] Updated weights for policy 0, policy_version 50140 (0.0007) [2023-10-10 06:44:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102629376. Throughput: 0: 1669.7, 1: 1680.5. Samples: 25670102. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:26,784][52050] Avg episode reward: [(0, '21.300'), (1, '20.350')] [2023-10-10 06:44:28,060][53268] Updated weights for policy 1, policy_version 50090 (0.0008) [2023-10-10 06:44:28,254][53252] Updated weights for policy 0, policy_version 50150 (0.0008) [2023-10-10 06:44:28,436][53268] Updated weights for policy 1, policy_version 50100 (0.0009) [2023-10-10 06:44:28,631][53252] Updated weights for policy 0, policy_version 50160 (0.0007) [2023-10-10 06:44:28,802][53268] Updated weights for policy 1, policy_version 50110 (0.0008) [2023-10-10 06:44:28,997][53252] Updated weights for policy 0, policy_version 50170 (0.0007) [2023-10-10 06:44:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102694912. Throughput: 0: 1659.4, 1: 1669.4. Samples: 25679282. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:31,784][52050] Avg episode reward: [(0, '22.530'), (1, '21.190')] [2023-10-10 06:44:32,879][53268] Updated weights for policy 1, policy_version 50120 (0.0009) [2023-10-10 06:44:33,099][53252] Updated weights for policy 0, policy_version 50180 (0.0008) [2023-10-10 06:44:33,252][53268] Updated weights for policy 1, policy_version 50130 (0.0008) [2023-10-10 06:44:33,476][53252] Updated weights for policy 0, policy_version 50190 (0.0007) [2023-10-10 06:44:33,625][53268] Updated weights for policy 1, policy_version 50140 (0.0009) [2023-10-10 06:44:33,840][53252] Updated weights for policy 0, policy_version 50200 (0.0007) [2023-10-10 06:44:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 102760448. Throughput: 0: 1682.2, 1: 1675.2. Samples: 25699912. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:36,784][52050] Avg episode reward: [(0, '22.230'), (1, '19.990')] [2023-10-10 06:44:37,792][53268] Updated weights for policy 1, policy_version 50150 (0.0008) [2023-10-10 06:44:37,933][53252] Updated weights for policy 0, policy_version 50210 (0.0007) [2023-10-10 06:44:38,154][53268] Updated weights for policy 1, policy_version 50160 (0.0010) [2023-10-10 06:44:38,312][53252] Updated weights for policy 0, policy_version 50220 (0.0007) [2023-10-10 06:44:38,521][53268] Updated weights for policy 1, policy_version 50170 (0.0010) [2023-10-10 06:44:38,688][53252] Updated weights for policy 0, policy_version 50230 (0.0007) [2023-10-10 06:44:39,065][53252] Updated weights for policy 0, policy_version 50240 (0.0010) [2023-10-10 06:44:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 102825984. Throughput: 0: 1681.3, 1: 1680.1. Samples: 25720548. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:41,784][52050] Avg episode reward: [(0, '21.920'), (1, '20.910')] [2023-10-10 06:44:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000050240_51445760.pth... [2023-10-10 06:44:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000050176_51380224.pth... [2023-10-10 06:44:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000048672_49840128.pth [2023-10-10 06:44:41,837][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000048608_49774592.pth [2023-10-10 06:44:42,625][53268] Updated weights for policy 1, policy_version 50180 (0.0010) [2023-10-10 06:44:43,000][53268] Updated weights for policy 1, policy_version 50190 (0.0007) [2023-10-10 06:44:43,126][53252] Updated weights for policy 0, policy_version 50250 (0.0007) [2023-10-10 06:44:43,373][53268] Updated weights for policy 1, policy_version 50200 (0.0009) [2023-10-10 06:44:43,492][53252] Updated weights for policy 0, policy_version 50260 (0.0008) [2023-10-10 06:44:43,854][53252] Updated weights for policy 0, policy_version 50270 (0.0010) [2023-10-10 06:44:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 102891520. Throughput: 0: 1665.8, 1: 1679.5. Samples: 25729604. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-10 06:44:46,784][52050] Avg episode reward: [(0, '22.870'), (1, '20.200')] [2023-10-10 06:44:47,280][53268] Updated weights for policy 1, policy_version 50210 (0.0008) [2023-10-10 06:44:47,632][53268] Updated weights for policy 1, policy_version 50220 (0.0010) [2023-10-10 06:44:47,836][53252] Updated weights for policy 0, policy_version 50280 (0.0008) [2023-10-10 06:44:47,999][53268] Updated weights for policy 1, policy_version 50230 (0.0010) [2023-10-10 06:44:48,205][53252] Updated weights for policy 0, policy_version 50290 (0.0008) [2023-10-10 06:44:48,364][53268] Updated weights for policy 1, policy_version 50240 (0.0007) [2023-10-10 06:44:48,578][53252] Updated weights for policy 0, policy_version 50300 (0.0008) [2023-10-10 06:44:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 102957056. Throughput: 0: 1689.8, 1: 1685.6. Samples: 25750628. Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:44:51,784][52050] Avg episode reward: [(0, '20.460'), (1, '18.830')] [2023-10-10 06:44:52,297][53268] Updated weights for policy 1, policy_version 50250 (0.0010) [2023-10-10 06:44:52,404][53252] Updated weights for policy 0, policy_version 50310 (0.0010) [2023-10-10 06:44:52,668][53268] Updated weights for policy 1, policy_version 50260 (0.0010) [2023-10-10 06:44:52,778][53252] Updated weights for policy 0, policy_version 50320 (0.0009) [2023-10-10 06:44:53,040][53268] Updated weights for policy 1, policy_version 50270 (0.0009) [2023-10-10 06:44:53,152][53252] Updated weights for policy 0, policy_version 50330 (0.0008) [2023-10-10 06:44:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103022592. Throughput: 0: 1689.2, 1: 1677.9. Samples: 25771254. 
Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:44:56,784][52050] Avg episode reward: [(0, '20.080'), (1, '20.220')] [2023-10-10 06:44:57,089][53268] Updated weights for policy 1, policy_version 50280 (0.0007) [2023-10-10 06:44:57,335][53252] Updated weights for policy 0, policy_version 50340 (0.0009) [2023-10-10 06:44:57,456][53268] Updated weights for policy 1, policy_version 50290 (0.0008) [2023-10-10 06:44:57,704][53252] Updated weights for policy 0, policy_version 50350 (0.0010) [2023-10-10 06:44:57,828][53268] Updated weights for policy 1, policy_version 50300 (0.0007) [2023-10-10 06:44:58,067][53252] Updated weights for policy 0, policy_version 50360 (0.0009) [2023-10-10 06:45:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103088128. Throughput: 0: 1681.5, 1: 1681.3. Samples: 25780300. Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:45:01,784][52050] Avg episode reward: [(0, '21.000'), (1, '19.520')] [2023-10-10 06:45:02,073][53252] Updated weights for policy 0, policy_version 50370 (0.0009) [2023-10-10 06:45:02,074][53268] Updated weights for policy 1, policy_version 50310 (0.0008) [2023-10-10 06:45:02,438][53252] Updated weights for policy 0, policy_version 50380 (0.0009) [2023-10-10 06:45:02,465][53268] Updated weights for policy 1, policy_version 50320 (0.0007) [2023-10-10 06:45:02,817][53252] Updated weights for policy 0, policy_version 50390 (0.0008) [2023-10-10 06:45:02,826][53268] Updated weights for policy 1, policy_version 50330 (0.0007) [2023-10-10 06:45:03,189][53252] Updated weights for policy 0, policy_version 50400 (0.0008) [2023-10-10 06:45:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103153664. Throughput: 0: 1689.2, 1: 1674.8. Samples: 25800756. 
Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:45:06,784][52050] Avg episode reward: [(0, '19.340'), (1, '18.970')] [2023-10-10 06:45:06,852][53268] Updated weights for policy 1, policy_version 50340 (0.0008) [2023-10-10 06:45:07,228][53268] Updated weights for policy 1, policy_version 50350 (0.0009) [2023-10-10 06:45:07,256][53252] Updated weights for policy 0, policy_version 50410 (0.0007) [2023-10-10 06:45:07,586][53268] Updated weights for policy 1, policy_version 50360 (0.0010) [2023-10-10 06:45:07,622][53252] Updated weights for policy 0, policy_version 50420 (0.0007) [2023-10-10 06:45:08,005][53252] Updated weights for policy 0, policy_version 50430 (0.0008) [2023-10-10 06:45:11,706][53268] Updated weights for policy 1, policy_version 50370 (0.0008) [2023-10-10 06:45:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 103219200. Throughput: 0: 1687.8, 1: 1674.8. Samples: 25821422. Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:45:11,784][52050] Avg episode reward: [(0, '19.930'), (1, '20.340')] [2023-10-10 06:45:12,083][53268] Updated weights for policy 1, policy_version 50380 (0.0008) [2023-10-10 06:45:12,164][53252] Updated weights for policy 0, policy_version 50440 (0.0008) [2023-10-10 06:45:12,453][53268] Updated weights for policy 1, policy_version 50390 (0.0008) [2023-10-10 06:45:12,530][53252] Updated weights for policy 0, policy_version 50450 (0.0007) [2023-10-10 06:45:12,822][53268] Updated weights for policy 1, policy_version 50400 (0.0008) [2023-10-10 06:45:12,915][53252] Updated weights for policy 0, policy_version 50460 (0.0009) [2023-10-10 06:45:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103284736. Throughput: 0: 1684.3, 1: 1671.9. Samples: 25830310. 
Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:45:16,784][52050] Avg episode reward: [(0, '20.180'), (1, '19.860')] [2023-10-10 06:45:16,855][53268] Updated weights for policy 1, policy_version 50410 (0.0009) [2023-10-10 06:45:17,048][53252] Updated weights for policy 0, policy_version 50470 (0.0010) [2023-10-10 06:45:17,233][53268] Updated weights for policy 1, policy_version 50420 (0.0008) [2023-10-10 06:45:17,430][53252] Updated weights for policy 0, policy_version 50480 (0.0010) [2023-10-10 06:45:17,606][53268] Updated weights for policy 1, policy_version 50430 (0.0008) [2023-10-10 06:45:17,803][53252] Updated weights for policy 0, policy_version 50490 (0.0009) [2023-10-10 06:45:21,741][53268] Updated weights for policy 1, policy_version 50440 (0.0008) [2023-10-10 06:45:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103350272. Throughput: 0: 1674.8, 1: 1674.3. Samples: 25850620. Policy #0 lag: (min: 22.0, avg: 29.4, max: 54.0) [2023-10-10 06:45:21,784][52050] Avg episode reward: [(0, '19.690'), (1, '19.610')] [2023-10-10 06:45:22,057][53252] Updated weights for policy 0, policy_version 50500 (0.0007) [2023-10-10 06:45:22,101][53268] Updated weights for policy 1, policy_version 50450 (0.0010) [2023-10-10 06:45:22,431][53252] Updated weights for policy 0, policy_version 50510 (0.0008) [2023-10-10 06:45:22,467][53268] Updated weights for policy 1, policy_version 50460 (0.0007) [2023-10-10 06:45:22,804][53252] Updated weights for policy 0, policy_version 50520 (0.0008) [2023-10-10 06:45:26,469][53268] Updated weights for policy 1, policy_version 50470 (0.0008) [2023-10-10 06:45:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103415808. Throughput: 0: 1676.5, 1: 1676.6. Samples: 25871440. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:26,784][52050] Avg episode reward: [(0, '20.580'), (1, '19.940')] [2023-10-10 06:45:26,795][53252] Updated weights for policy 0, policy_version 50530 (0.0008) [2023-10-10 06:45:26,839][53268] Updated weights for policy 1, policy_version 50480 (0.0007) [2023-10-10 06:45:27,182][53252] Updated weights for policy 0, policy_version 50540 (0.0008) [2023-10-10 06:45:27,213][53268] Updated weights for policy 1, policy_version 50490 (0.0007) [2023-10-10 06:45:27,552][53252] Updated weights for policy 0, policy_version 50550 (0.0008) [2023-10-10 06:45:27,927][53252] Updated weights for policy 0, policy_version 50560 (0.0007) [2023-10-10 06:45:31,190][53268] Updated weights for policy 1, policy_version 50500 (0.0009) [2023-10-10 06:45:31,566][53268] Updated weights for policy 1, policy_version 50510 (0.0011) [2023-10-10 06:45:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103481344. Throughput: 0: 1673.3, 1: 1674.7. Samples: 25880264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:31,784][52050] Avg episode reward: [(0, '20.390'), (1, '19.450')] [2023-10-10 06:45:31,938][53268] Updated weights for policy 1, policy_version 50520 (0.0009) [2023-10-10 06:45:32,094][53252] Updated weights for policy 0, policy_version 50570 (0.0007) [2023-10-10 06:45:32,477][53252] Updated weights for policy 0, policy_version 50580 (0.0008) [2023-10-10 06:45:32,847][53252] Updated weights for policy 0, policy_version 50590 (0.0008) [2023-10-10 06:45:36,118][53268] Updated weights for policy 1, policy_version 50530 (0.0008) [2023-10-10 06:45:36,482][53268] Updated weights for policy 1, policy_version 50540 (0.0010) [2023-10-10 06:45:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103546880. Throughput: 0: 1666.0, 1: 1672.3. Samples: 25900852. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:36,784][52050] Avg episode reward: [(0, '21.610'), (1, '20.420')] [2023-10-10 06:45:36,842][53252] Updated weights for policy 0, policy_version 50600 (0.0007) [2023-10-10 06:45:36,851][53268] Updated weights for policy 1, policy_version 50550 (0.0007) [2023-10-10 06:45:37,204][53252] Updated weights for policy 0, policy_version 50610 (0.0007) [2023-10-10 06:45:37,212][53268] Updated weights for policy 1, policy_version 50560 (0.0009) [2023-10-10 06:45:37,579][53252] Updated weights for policy 0, policy_version 50620 (0.0009) [2023-10-10 06:45:41,516][53268] Updated weights for policy 1, policy_version 50570 (0.0009) [2023-10-10 06:45:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 103612416. Throughput: 0: 1660.5, 1: 1669.9. Samples: 25921122. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:41,784][52050] Avg episode reward: [(0, '21.900'), (1, '20.930')] [2023-10-10 06:45:41,801][53252] Updated weights for policy 0, policy_version 50630 (0.0009) [2023-10-10 06:45:41,879][53268] Updated weights for policy 1, policy_version 50580 (0.0007) [2023-10-10 06:45:42,165][53252] Updated weights for policy 0, policy_version 50640 (0.0009) [2023-10-10 06:45:42,255][53268] Updated weights for policy 1, policy_version 50590 (0.0008) [2023-10-10 06:45:42,534][53252] Updated weights for policy 0, policy_version 50650 (0.0007) [2023-10-10 06:45:46,435][53268] Updated weights for policy 1, policy_version 50600 (0.0007) [2023-10-10 06:45:46,649][53252] Updated weights for policy 0, policy_version 50660 (0.0008) [2023-10-10 06:45:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103677952. Throughput: 0: 1662.1, 1: 1671.0. Samples: 25930288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:46,784][52050] Avg episode reward: [(0, '21.720'), (1, '20.530')] [2023-10-10 06:45:46,798][53268] Updated weights for policy 1, policy_version 50610 (0.0009) [2023-10-10 06:45:47,019][53252] Updated weights for policy 0, policy_version 50670 (0.0009) [2023-10-10 06:45:47,173][53268] Updated weights for policy 1, policy_version 50620 (0.0008) [2023-10-10 06:45:47,380][53252] Updated weights for policy 0, policy_version 50680 (0.0008) [2023-10-10 06:45:51,324][53268] Updated weights for policy 1, policy_version 50630 (0.0009) [2023-10-10 06:45:51,643][53252] Updated weights for policy 0, policy_version 50690 (0.0010) [2023-10-10 06:45:51,706][53268] Updated weights for policy 1, policy_version 50640 (0.0007) [2023-10-10 06:45:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 103743488. Throughput: 0: 1659.2, 1: 1672.0. Samples: 25950660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:51,784][52050] Avg episode reward: [(0, '22.080'), (1, '20.820')] [2023-10-10 06:45:52,011][53252] Updated weights for policy 0, policy_version 50700 (0.0008) [2023-10-10 06:45:52,082][53268] Updated weights for policy 1, policy_version 50650 (0.0008) [2023-10-10 06:45:52,377][53252] Updated weights for policy 0, policy_version 50710 (0.0009) [2023-10-10 06:45:52,753][53252] Updated weights for policy 0, policy_version 50720 (0.0008) [2023-10-10 06:45:55,922][53268] Updated weights for policy 1, policy_version 50660 (0.0010) [2023-10-10 06:45:56,291][53268] Updated weights for policy 1, policy_version 50670 (0.0007) [2023-10-10 06:45:56,657][53268] Updated weights for policy 1, policy_version 50680 (0.0007) [2023-10-10 06:45:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 103809024. Throughput: 0: 1655.9, 1: 1666.9. Samples: 25970948. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:45:56,784][52050] Avg episode reward: [(0, '20.000'), (1, '22.130')] [2023-10-10 06:45:56,787][53252] Updated weights for policy 0, policy_version 50730 (0.0009) [2023-10-10 06:45:57,153][53252] Updated weights for policy 0, policy_version 50740 (0.0007) [2023-10-10 06:45:57,525][53252] Updated weights for policy 0, policy_version 50750 (0.0009) [2023-10-10 06:46:00,548][53268] Updated weights for policy 1, policy_version 50690 (0.0007) [2023-10-10 06:46:00,917][53268] Updated weights for policy 1, policy_version 50700 (0.0007) [2023-10-10 06:46:01,277][53268] Updated weights for policy 1, policy_version 50710 (0.0008) [2023-10-10 06:46:01,640][53268] Updated weights for policy 1, policy_version 50720 (0.0008) [2023-10-10 06:46:01,686][53252] Updated weights for policy 0, policy_version 50760 (0.0008) [2023-10-10 06:46:01,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 103907328. Throughput: 0: 1657.1, 1: 1682.5. Samples: 25980588. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:01,784][52050] Avg episode reward: [(0, '21.700'), (1, '20.560')] [2023-10-10 06:46:02,053][53252] Updated weights for policy 0, policy_version 50770 (0.0010) [2023-10-10 06:46:02,420][53252] Updated weights for policy 0, policy_version 50780 (0.0011) [2023-10-10 06:46:05,642][53268] Updated weights for policy 1, policy_version 50730 (0.0011) [2023-10-10 06:46:06,005][53268] Updated weights for policy 1, policy_version 50740 (0.0011) [2023-10-10 06:46:06,372][53268] Updated weights for policy 1, policy_version 50750 (0.0010) [2023-10-10 06:46:06,649][53252] Updated weights for policy 0, policy_version 50790 (0.0008) [2023-10-10 06:46:06,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 103972864. Throughput: 0: 1663.9, 1: 1688.0. Samples: 26001454. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:06,784][52050] Avg episode reward: [(0, '22.100'), (1, '20.240')] [2023-10-10 06:46:07,022][53252] Updated weights for policy 0, policy_version 50800 (0.0008) [2023-10-10 06:46:07,388][53252] Updated weights for policy 0, policy_version 50810 (0.0008) [2023-10-10 06:46:10,462][53268] Updated weights for policy 1, policy_version 50760 (0.0009) [2023-10-10 06:46:10,828][53268] Updated weights for policy 1, policy_version 50770 (0.0008) [2023-10-10 06:46:11,189][53268] Updated weights for policy 1, policy_version 50780 (0.0008) [2023-10-10 06:46:11,455][53252] Updated weights for policy 0, policy_version 50820 (0.0008) [2023-10-10 06:46:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104038400. Throughput: 0: 1658.0, 1: 1663.5. Samples: 26020908. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:11,785][52050] Avg episode reward: [(0, '20.450'), (1, '20.110')] [2023-10-10 06:46:11,824][53252] Updated weights for policy 0, policy_version 50830 (0.0007) [2023-10-10 06:46:12,201][53252] Updated weights for policy 0, policy_version 50840 (0.0007) [2023-10-10 06:46:15,232][53268] Updated weights for policy 1, policy_version 50790 (0.0010) [2023-10-10 06:46:15,601][53268] Updated weights for policy 1, policy_version 50800 (0.0007) [2023-10-10 06:46:15,963][53268] Updated weights for policy 1, policy_version 50810 (0.0009) [2023-10-10 06:46:16,177][53252] Updated weights for policy 0, policy_version 50850 (0.0007) [2023-10-10 06:46:16,571][53252] Updated weights for policy 0, policy_version 50860 (0.0007) [2023-10-10 06:46:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 104103936. Throughput: 0: 1664.3, 1: 1687.8. Samples: 26031108. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:16,784][52050] Avg episode reward: [(0, '20.830'), (1, '19.660')] [2023-10-10 06:46:16,943][53252] Updated weights for policy 0, policy_version 50870 (0.0010) [2023-10-10 06:46:17,321][53252] Updated weights for policy 0, policy_version 50880 (0.0010) [2023-10-10 06:46:19,933][53268] Updated weights for policy 1, policy_version 50820 (0.0007) [2023-10-10 06:46:20,292][53268] Updated weights for policy 1, policy_version 50830 (0.0008) [2023-10-10 06:46:20,662][53268] Updated weights for policy 1, policy_version 50840 (0.0007) [2023-10-10 06:46:21,301][53252] Updated weights for policy 0, policy_version 50890 (0.0009) [2023-10-10 06:46:21,679][53252] Updated weights for policy 0, policy_version 50900 (0.0008) [2023-10-10 06:46:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104169472. Throughput: 0: 1664.2, 1: 1681.6. Samples: 26051412. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:21,784][52050] Avg episode reward: [(0, '21.780'), (1, '19.980')] [2023-10-10 06:46:22,043][53252] Updated weights for policy 0, policy_version 50910 (0.0008) [2023-10-10 06:46:24,814][53268] Updated weights for policy 1, policy_version 50850 (0.0009) [2023-10-10 06:46:25,184][53268] Updated weights for policy 1, policy_version 50860 (0.0010) [2023-10-10 06:46:25,542][53268] Updated weights for policy 1, policy_version 50870 (0.0009) [2023-10-10 06:46:25,917][53268] Updated weights for policy 1, policy_version 50880 (0.0008) [2023-10-10 06:46:26,207][53252] Updated weights for policy 0, policy_version 50920 (0.0010) [2023-10-10 06:46:26,584][53252] Updated weights for policy 0, policy_version 50930 (0.0010) [2023-10-10 06:46:26,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104235008. Throughput: 0: 1659.2, 1: 1665.6. Samples: 26070734. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:26,784][52050] Avg episode reward: [(0, '21.280'), (1, '20.640')] [2023-10-10 06:46:26,960][53252] Updated weights for policy 0, policy_version 50940 (0.0010) [2023-10-10 06:46:29,951][53268] Updated weights for policy 1, policy_version 50890 (0.0009) [2023-10-10 06:46:30,322][53268] Updated weights for policy 1, policy_version 50900 (0.0009) [2023-10-10 06:46:30,682][53268] Updated weights for policy 1, policy_version 50910 (0.0008) [2023-10-10 06:46:31,003][53252] Updated weights for policy 0, policy_version 50950 (0.0009) [2023-10-10 06:46:31,373][53252] Updated weights for policy 0, policy_version 50960 (0.0008) [2023-10-10 06:46:31,742][53252] Updated weights for policy 0, policy_version 50970 (0.0009) [2023-10-10 06:46:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104300544. Throughput: 0: 1666.7, 1: 1693.5. Samples: 26081498. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-10 06:46:31,784][52050] Avg episode reward: [(0, '20.890'), (1, '20.500')] [2023-10-10 06:46:34,801][53268] Updated weights for policy 1, policy_version 50920 (0.0010) [2023-10-10 06:46:35,169][53268] Updated weights for policy 1, policy_version 50930 (0.0010) [2023-10-10 06:46:35,535][53268] Updated weights for policy 1, policy_version 50940 (0.0008) [2023-10-10 06:46:35,792][53252] Updated weights for policy 0, policy_version 50980 (0.0009) [2023-10-10 06:46:36,162][53252] Updated weights for policy 0, policy_version 50990 (0.0009) [2023-10-10 06:46:36,538][53252] Updated weights for policy 0, policy_version 51000 (0.0010) [2023-10-10 06:46:36,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 104366080. Throughput: 0: 1673.1, 1: 1686.3. Samples: 26101834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:46:36,785][52050] Avg episode reward: [(0, '20.230'), (1, '21.150')] [2023-10-10 06:46:39,596][53268] Updated weights for policy 1, policy_version 50950 (0.0008) [2023-10-10 06:46:39,976][53268] Updated weights for policy 1, policy_version 50960 (0.0007) [2023-10-10 06:46:40,341][53268] Updated weights for policy 1, policy_version 50970 (0.0007) [2023-10-10 06:46:40,534][53252] Updated weights for policy 0, policy_version 51010 (0.0009) [2023-10-10 06:46:40,906][53252] Updated weights for policy 0, policy_version 51020 (0.0009) [2023-10-10 06:46:41,275][53252] Updated weights for policy 0, policy_version 51030 (0.0010) [2023-10-10 06:46:41,653][53252] Updated weights for policy 0, policy_version 51040 (0.0010) [2023-10-10 06:46:41,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 104464384. Throughput: 0: 1656.7, 1: 1679.6. Samples: 26121080. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:46:41,784][52050] Avg episode reward: [(0, '21.000'), (1, '19.630')] [2023-10-10 06:46:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000050976_52199424.pth... [2023-10-10 06:46:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000051040_52264960.pth... 
[2023-10-10 06:46:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000049408_50593792.pth [2023-10-10 06:46:41,840][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000049472_50659328.pth [2023-10-10 06:46:44,297][53268] Updated weights for policy 1, policy_version 50980 (0.0008) [2023-10-10 06:46:44,664][53268] Updated weights for policy 1, policy_version 50990 (0.0008) [2023-10-10 06:46:45,024][53268] Updated weights for policy 1, policy_version 51000 (0.0008) [2023-10-10 06:46:45,790][53252] Updated weights for policy 0, policy_version 51050 (0.0009) [2023-10-10 06:46:46,159][53252] Updated weights for policy 0, policy_version 51060 (0.0007) [2023-10-10 06:46:46,535][53252] Updated weights for policy 0, policy_version 51070 (0.0007) [2023-10-10 06:46:46,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 104529920. Throughput: 0: 1676.8, 1: 1698.3. Samples: 26132468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:46:46,784][52050] Avg episode reward: [(0, '21.130'), (1, '20.410')] [2023-10-10 06:46:48,872][53268] Updated weights for policy 1, policy_version 51010 (0.0010) [2023-10-10 06:46:49,242][53268] Updated weights for policy 1, policy_version 51020 (0.0007) [2023-10-10 06:46:49,597][53268] Updated weights for policy 1, policy_version 51030 (0.0007) [2023-10-10 06:46:49,958][53268] Updated weights for policy 1, policy_version 51040 (0.0008) [2023-10-10 06:46:50,625][53252] Updated weights for policy 0, policy_version 51080 (0.0008) [2023-10-10 06:46:50,993][53252] Updated weights for policy 0, policy_version 51090 (0.0007) [2023-10-10 06:46:51,365][53252] Updated weights for policy 0, policy_version 51100 (0.0008) [2023-10-10 06:46:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 104595456. Throughput: 0: 1670.5, 1: 1671.1. Samples: 26151826. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:46:51,784][52050] Avg episode reward: [(0, '19.720'), (1, '18.670')] [2023-10-10 06:46:54,033][53268] Updated weights for policy 1, policy_version 51050 (0.0007) [2023-10-10 06:46:54,393][53268] Updated weights for policy 1, policy_version 51060 (0.0007) [2023-10-10 06:46:54,757][53268] Updated weights for policy 1, policy_version 51070 (0.0008) [2023-10-10 06:46:55,436][53252] Updated weights for policy 0, policy_version 51110 (0.0007) [2023-10-10 06:46:55,813][53252] Updated weights for policy 0, policy_version 51120 (0.0007) [2023-10-10 06:46:56,181][53252] Updated weights for policy 0, policy_version 51130 (0.0007) [2023-10-10 06:46:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 104660992. Throughput: 0: 1653.7, 1: 1698.4. Samples: 26171750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:46:56,784][52050] Avg episode reward: [(0, '20.040'), (1, '19.490')] [2023-10-10 06:46:58,726][53268] Updated weights for policy 1, policy_version 51080 (0.0008) [2023-10-10 06:46:59,101][53268] Updated weights for policy 1, policy_version 51090 (0.0008) [2023-10-10 06:46:59,471][53268] Updated weights for policy 1, policy_version 51100 (0.0008) [2023-10-10 06:47:00,216][53252] Updated weights for policy 0, policy_version 51140 (0.0008) [2023-10-10 06:47:00,583][53252] Updated weights for policy 0, policy_version 51150 (0.0009) [2023-10-10 06:47:00,954][53252] Updated weights for policy 0, policy_version 51160 (0.0008) [2023-10-10 06:47:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104726528. Throughput: 0: 1682.1, 1: 1687.2. Samples: 26182726. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:47:01,784][52050] Avg episode reward: [(0, '19.360'), (1, '19.170')] [2023-10-10 06:47:03,656][53268] Updated weights for policy 1, policy_version 51110 (0.0008) [2023-10-10 06:47:04,021][53268] Updated weights for policy 1, policy_version 51120 (0.0009) [2023-10-10 06:47:04,392][53268] Updated weights for policy 1, policy_version 51130 (0.0008) [2023-10-10 06:47:04,964][53252] Updated weights for policy 0, policy_version 51170 (0.0008) [2023-10-10 06:47:05,348][53252] Updated weights for policy 0, policy_version 51180 (0.0007) [2023-10-10 06:47:05,721][53252] Updated weights for policy 0, policy_version 51190 (0.0010) [2023-10-10 06:47:06,102][53252] Updated weights for policy 0, policy_version 51200 (0.0008) [2023-10-10 06:47:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104792064. Throughput: 0: 1677.4, 1: 1683.4. Samples: 26202648. Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:06,785][52050] Avg episode reward: [(0, '20.090'), (1, '19.510')] [2023-10-10 06:47:08,284][53268] Updated weights for policy 1, policy_version 51140 (0.0009) [2023-10-10 06:47:08,651][53268] Updated weights for policy 1, policy_version 51150 (0.0009) [2023-10-10 06:47:09,017][53268] Updated weights for policy 1, policy_version 51160 (0.0008) [2023-10-10 06:47:10,062][53252] Updated weights for policy 0, policy_version 51210 (0.0008) [2023-10-10 06:47:10,440][53252] Updated weights for policy 0, policy_version 51220 (0.0010) [2023-10-10 06:47:10,816][53252] Updated weights for policy 0, policy_version 51230 (0.0008) [2023-10-10 06:47:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 104857600. Throughput: 0: 1672.4, 1: 1710.4. Samples: 26222956. 
Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:11,784][52050] Avg episode reward: [(0, '21.860'), (1, '18.890')] [2023-10-10 06:47:13,028][53268] Updated weights for policy 1, policy_version 51170 (0.0010) [2023-10-10 06:47:13,388][53268] Updated weights for policy 1, policy_version 51180 (0.0011) [2023-10-10 06:47:13,767][53268] Updated weights for policy 1, policy_version 51190 (0.0009) [2023-10-10 06:47:14,130][53268] Updated weights for policy 1, policy_version 51200 (0.0011) [2023-10-10 06:47:14,945][53252] Updated weights for policy 0, policy_version 51240 (0.0007) [2023-10-10 06:47:15,315][53252] Updated weights for policy 0, policy_version 51250 (0.0009) [2023-10-10 06:47:15,684][53252] Updated weights for policy 0, policy_version 51260 (0.0008) [2023-10-10 06:47:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104923136. Throughput: 0: 1692.6, 1: 1680.9. Samples: 26233306. Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:16,784][52050] Avg episode reward: [(0, '19.880'), (1, '18.290')] [2023-10-10 06:47:18,257][53268] Updated weights for policy 1, policy_version 51210 (0.0007) [2023-10-10 06:47:18,618][53268] Updated weights for policy 1, policy_version 51220 (0.0009) [2023-10-10 06:47:18,985][53268] Updated weights for policy 1, policy_version 51230 (0.0009) [2023-10-10 06:47:19,730][53252] Updated weights for policy 0, policy_version 51270 (0.0008) [2023-10-10 06:47:20,109][53252] Updated weights for policy 0, policy_version 51280 (0.0009) [2023-10-10 06:47:20,486][53252] Updated weights for policy 0, policy_version 51290 (0.0009) [2023-10-10 06:47:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104988672. Throughput: 0: 1667.8, 1: 1694.8. Samples: 26253150. 
Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:21,784][52050] Avg episode reward: [(0, '19.670'), (1, '19.110')] [2023-10-10 06:47:23,063][53268] Updated weights for policy 1, policy_version 51240 (0.0008) [2023-10-10 06:47:23,429][53268] Updated weights for policy 1, policy_version 51250 (0.0009) [2023-10-10 06:47:23,798][53268] Updated weights for policy 1, policy_version 51260 (0.0007) [2023-10-10 06:47:24,526][53252] Updated weights for policy 0, policy_version 51300 (0.0008) [2023-10-10 06:47:24,888][53252] Updated weights for policy 0, policy_version 51310 (0.0008) [2023-10-10 06:47:25,254][53252] Updated weights for policy 0, policy_version 51320 (0.0009) [2023-10-10 06:47:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 105054208. Throughput: 0: 1683.6, 1: 1705.2. Samples: 26273578. Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:26,784][52050] Avg episode reward: [(0, '21.850'), (1, '21.090')] [2023-10-10 06:47:27,760][53268] Updated weights for policy 1, policy_version 51270 (0.0010) [2023-10-10 06:47:28,139][53268] Updated weights for policy 1, policy_version 51280 (0.0008) [2023-10-10 06:47:28,512][53268] Updated weights for policy 1, policy_version 51290 (0.0008) [2023-10-10 06:47:29,429][53252] Updated weights for policy 0, policy_version 51330 (0.0010) [2023-10-10 06:47:29,801][53252] Updated weights for policy 0, policy_version 51340 (0.0010) [2023-10-10 06:47:30,178][53252] Updated weights for policy 0, policy_version 51350 (0.0010) [2023-10-10 06:47:30,537][53252] Updated weights for policy 0, policy_version 51360 (0.0008) [2023-10-10 06:47:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 105119744. Throughput: 0: 1695.2, 1: 1669.0. Samples: 26283856. 
Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:31,784][52050] Avg episode reward: [(0, '18.920'), (1, '20.300')] [2023-10-10 06:47:32,701][53268] Updated weights for policy 1, policy_version 51300 (0.0008) [2023-10-10 06:47:33,069][53268] Updated weights for policy 1, policy_version 51310 (0.0007) [2023-10-10 06:47:33,444][53268] Updated weights for policy 1, policy_version 51320 (0.0009) [2023-10-10 06:47:34,609][53252] Updated weights for policy 0, policy_version 51370 (0.0007) [2023-10-10 06:47:34,975][53252] Updated weights for policy 0, policy_version 51380 (0.0007) [2023-10-10 06:47:35,341][53252] Updated weights for policy 0, policy_version 51390 (0.0008) [2023-10-10 06:47:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 105185280. Throughput: 0: 1677.9, 1: 1693.3. Samples: 26303532. Policy #0 lag: (min: 28.0, avg: 38.8, max: 60.0) [2023-10-10 06:47:36,784][52050] Avg episode reward: [(0, '20.250'), (1, '19.840')] [2023-10-10 06:47:37,340][53268] Updated weights for policy 1, policy_version 51330 (0.0008) [2023-10-10 06:47:37,705][53268] Updated weights for policy 1, policy_version 51340 (0.0008) [2023-10-10 06:47:38,076][53268] Updated weights for policy 1, policy_version 51350 (0.0008) [2023-10-10 06:47:38,444][53268] Updated weights for policy 1, policy_version 51360 (0.0009) [2023-10-10 06:47:39,438][53252] Updated weights for policy 0, policy_version 51400 (0.0008) [2023-10-10 06:47:39,813][53252] Updated weights for policy 0, policy_version 51410 (0.0008) [2023-10-10 06:47:40,184][53252] Updated weights for policy 0, policy_version 51420 (0.0010) [2023-10-10 06:47:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 105250816. Throughput: 0: 1692.6, 1: 1692.6. Samples: 26324084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:47:41,784][52050] Avg episode reward: [(0, '21.560'), (1, '19.710')] [2023-10-10 06:47:42,540][53268] Updated weights for policy 1, policy_version 51370 (0.0010) [2023-10-10 06:47:42,909][53268] Updated weights for policy 1, policy_version 51380 (0.0011) [2023-10-10 06:47:43,282][53268] Updated weights for policy 1, policy_version 51390 (0.0008) [2023-10-10 06:47:44,212][53252] Updated weights for policy 0, policy_version 51430 (0.0009) [2023-10-10 06:47:44,578][53252] Updated weights for policy 0, policy_version 51440 (0.0010) [2023-10-10 06:47:44,959][53252] Updated weights for policy 0, policy_version 51450 (0.0009) [2023-10-10 06:47:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105316352. Throughput: 0: 1681.1, 1: 1680.8. Samples: 26334010. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:47:46,784][52050] Avg episode reward: [(0, '21.190'), (1, '19.180')] [2023-10-10 06:47:47,457][53268] Updated weights for policy 1, policy_version 51400 (0.0010) [2023-10-10 06:47:47,820][53268] Updated weights for policy 1, policy_version 51410 (0.0009) [2023-10-10 06:47:48,183][53268] Updated weights for policy 1, policy_version 51420 (0.0010) [2023-10-10 06:47:48,953][53252] Updated weights for policy 0, policy_version 51460 (0.0008) [2023-10-10 06:47:49,318][53252] Updated weights for policy 0, policy_version 51470 (0.0007) [2023-10-10 06:47:49,690][53252] Updated weights for policy 0, policy_version 51480 (0.0010) [2023-10-10 06:47:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105381888. Throughput: 0: 1666.7, 1: 1690.1. Samples: 26353706. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:47:51,784][52050] Avg episode reward: [(0, '21.610'), (1, '18.310')] [2023-10-10 06:47:52,287][53268] Updated weights for policy 1, policy_version 51430 (0.0008) [2023-10-10 06:47:52,645][53268] Updated weights for policy 1, policy_version 51440 (0.0009) [2023-10-10 06:47:53,013][53268] Updated weights for policy 1, policy_version 51450 (0.0007) [2023-10-10 06:47:53,941][53252] Updated weights for policy 0, policy_version 51490 (0.0009) [2023-10-10 06:47:54,345][53252] Updated weights for policy 0, policy_version 51500 (0.0007) [2023-10-10 06:47:54,715][53252] Updated weights for policy 0, policy_version 51510 (0.0008) [2023-10-10 06:47:55,093][53252] Updated weights for policy 0, policy_version 51520 (0.0010) [2023-10-10 06:47:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105447424. Throughput: 0: 1679.3, 1: 1686.7. Samples: 26374424. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:47:56,784][52050] Avg episode reward: [(0, '21.040'), (1, '19.450')] [2023-10-10 06:47:57,064][53268] Updated weights for policy 1, policy_version 51460 (0.0009) [2023-10-10 06:47:57,429][53268] Updated weights for policy 1, policy_version 51470 (0.0009) [2023-10-10 06:47:57,802][53268] Updated weights for policy 1, policy_version 51480 (0.0008) [2023-10-10 06:47:59,132][53252] Updated weights for policy 0, policy_version 51530 (0.0008) [2023-10-10 06:47:59,509][53252] Updated weights for policy 0, policy_version 51540 (0.0007) [2023-10-10 06:47:59,878][53252] Updated weights for policy 0, policy_version 51550 (0.0009) [2023-10-10 06:48:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105512960. Throughput: 0: 1668.7, 1: 1684.4. Samples: 26384198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:48:01,784][52050] Avg episode reward: [(0, '20.680'), (1, '18.830')] [2023-10-10 06:48:02,024][53268] Updated weights for policy 1, policy_version 51490 (0.0008) [2023-10-10 06:48:02,395][53268] Updated weights for policy 1, policy_version 51500 (0.0009) [2023-10-10 06:48:02,765][53268] Updated weights for policy 1, policy_version 51510 (0.0008) [2023-10-10 06:48:03,134][53268] Updated weights for policy 1, policy_version 51520 (0.0009) [2023-10-10 06:48:03,768][53252] Updated weights for policy 0, policy_version 51560 (0.0010) [2023-10-10 06:48:04,138][53252] Updated weights for policy 0, policy_version 51570 (0.0007) [2023-10-10 06:48:04,512][53252] Updated weights for policy 0, policy_version 51580 (0.0007) [2023-10-10 06:48:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 105578496. Throughput: 0: 1677.1, 1: 1680.8. Samples: 26404252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:48:06,784][52050] Avg episode reward: [(0, '20.430'), (1, '19.230')] [2023-10-10 06:48:07,102][53268] Updated weights for policy 1, policy_version 51530 (0.0008) [2023-10-10 06:48:07,462][53268] Updated weights for policy 1, policy_version 51540 (0.0008) [2023-10-10 06:48:07,828][53268] Updated weights for policy 1, policy_version 51550 (0.0009) [2023-10-10 06:48:08,558][53252] Updated weights for policy 0, policy_version 51590 (0.0008) [2023-10-10 06:48:08,939][53252] Updated weights for policy 0, policy_version 51600 (0.0011) [2023-10-10 06:48:09,309][53252] Updated weights for policy 0, policy_version 51610 (0.0010) [2023-10-10 06:48:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 105644032. Throughput: 0: 1684.7, 1: 1684.4. Samples: 26425190. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:48:11,784][52050] Avg episode reward: [(0, '18.870'), (1, '19.710')] [2023-10-10 06:48:11,898][53268] Updated weights for policy 1, policy_version 51560 (0.0007) [2023-10-10 06:48:12,266][53268] Updated weights for policy 1, policy_version 51570 (0.0007) [2023-10-10 06:48:12,644][53268] Updated weights for policy 1, policy_version 51580 (0.0009) [2023-10-10 06:48:13,253][53252] Updated weights for policy 0, policy_version 51620 (0.0008) [2023-10-10 06:48:13,619][53252] Updated weights for policy 0, policy_version 51630 (0.0010) [2023-10-10 06:48:13,989][53252] Updated weights for policy 0, policy_version 51640 (0.0007) [2023-10-10 06:48:16,518][53268] Updated weights for policy 1, policy_version 51590 (0.0008) [2023-10-10 06:48:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105709568. Throughput: 0: 1651.6, 1: 1694.1. Samples: 26434414. Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:16,784][52050] Avg episode reward: [(0, '20.330'), (1, '19.370')] [2023-10-10 06:48:16,903][53268] Updated weights for policy 1, policy_version 51600 (0.0009) [2023-10-10 06:48:17,258][53268] Updated weights for policy 1, policy_version 51610 (0.0009) [2023-10-10 06:48:18,124][53252] Updated weights for policy 0, policy_version 51650 (0.0010) [2023-10-10 06:48:18,487][53252] Updated weights for policy 0, policy_version 51660 (0.0009) [2023-10-10 06:48:18,863][53252] Updated weights for policy 0, policy_version 51670 (0.0010) [2023-10-10 06:48:19,241][53252] Updated weights for policy 0, policy_version 51680 (0.0009) [2023-10-10 06:48:21,400][53268] Updated weights for policy 1, policy_version 51620 (0.0008) [2023-10-10 06:48:21,780][53268] Updated weights for policy 1, policy_version 51630 (0.0008) [2023-10-10 06:48:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105775104. Throughput: 0: 1672.1, 1: 1692.6. 
Samples: 26454944. Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:21,784][52050] Avg episode reward: [(0, '21.730'), (1, '19.140')] [2023-10-10 06:48:22,147][53268] Updated weights for policy 1, policy_version 51640 (0.0008) [2023-10-10 06:48:23,420][53252] Updated weights for policy 0, policy_version 51690 (0.0007) [2023-10-10 06:48:23,795][53252] Updated weights for policy 0, policy_version 51700 (0.0008) [2023-10-10 06:48:24,167][53252] Updated weights for policy 0, policy_version 51710 (0.0009) [2023-10-10 06:48:26,165][53268] Updated weights for policy 1, policy_version 51650 (0.0009) [2023-10-10 06:48:26,524][53268] Updated weights for policy 1, policy_version 51660 (0.0010) [2023-10-10 06:48:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105840640. Throughput: 0: 1677.7, 1: 1688.5. Samples: 26475562. Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:26,784][52050] Avg episode reward: [(0, '21.060'), (1, '19.100')] [2023-10-10 06:48:26,895][53268] Updated weights for policy 1, policy_version 51670 (0.0009) [2023-10-10 06:48:27,267][53268] Updated weights for policy 1, policy_version 51680 (0.0009) [2023-10-10 06:48:28,298][53252] Updated weights for policy 0, policy_version 51720 (0.0008) [2023-10-10 06:48:28,672][53252] Updated weights for policy 0, policy_version 51730 (0.0009) [2023-10-10 06:48:29,035][53252] Updated weights for policy 0, policy_version 51740 (0.0007) [2023-10-10 06:48:31,346][53268] Updated weights for policy 1, policy_version 51690 (0.0008) [2023-10-10 06:48:31,703][53268] Updated weights for policy 1, policy_version 51700 (0.0010) [2023-10-10 06:48:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105906176. Throughput: 0: 1662.5, 1: 1694.0. Samples: 26485054. 
Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:31,784][52050] Avg episode reward: [(0, '21.300'), (1, '19.890')] [2023-10-10 06:48:32,069][53268] Updated weights for policy 1, policy_version 51710 (0.0011) [2023-10-10 06:48:33,205][53252] Updated weights for policy 0, policy_version 51750 (0.0009) [2023-10-10 06:48:33,582][53252] Updated weights for policy 0, policy_version 51760 (0.0007) [2023-10-10 06:48:33,947][53252] Updated weights for policy 0, policy_version 51770 (0.0007) [2023-10-10 06:48:36,067][53268] Updated weights for policy 1, policy_version 51720 (0.0008) [2023-10-10 06:48:36,425][53268] Updated weights for policy 1, policy_version 51730 (0.0009) [2023-10-10 06:48:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105971712. Throughput: 0: 1678.9, 1: 1692.6. Samples: 26505426. Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:36,784][52050] Avg episode reward: [(0, '22.090'), (1, '19.500')] [2023-10-10 06:48:36,792][53268] Updated weights for policy 1, policy_version 51740 (0.0010) [2023-10-10 06:48:37,967][53252] Updated weights for policy 0, policy_version 51780 (0.0007) [2023-10-10 06:48:38,340][53252] Updated weights for policy 0, policy_version 51790 (0.0008) [2023-10-10 06:48:38,706][53252] Updated weights for policy 0, policy_version 51800 (0.0011) [2023-10-10 06:48:40,766][53268] Updated weights for policy 1, policy_version 51750 (0.0009) [2023-10-10 06:48:41,131][53268] Updated weights for policy 1, policy_version 51760 (0.0009) [2023-10-10 06:48:41,502][53268] Updated weights for policy 1, policy_version 51770 (0.0008) [2023-10-10 06:48:41,784][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106070016. Throughput: 0: 1682.4, 1: 1679.3. Samples: 26525700. 
Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:41,785][52050] Avg episode reward: [(0, '22.550'), (1, '20.560')] [2023-10-10 06:48:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000051776_53018624.pth... [2023-10-10 06:48:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000051808_53051392.pth... [2023-10-10 06:48:41,842][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000050240_51445760.pth [2023-10-10 06:48:41,842][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000050176_51380224.pth [2023-10-10 06:48:42,751][53252] Updated weights for policy 0, policy_version 51810 (0.0010) [2023-10-10 06:48:43,144][53252] Updated weights for policy 0, policy_version 51820 (0.0010) [2023-10-10 06:48:43,508][53252] Updated weights for policy 0, policy_version 51830 (0.0007) [2023-10-10 06:48:43,882][53252] Updated weights for policy 0, policy_version 51840 (0.0010) [2023-10-10 06:48:45,535][53268] Updated weights for policy 1, policy_version 51780 (0.0008) [2023-10-10 06:48:45,909][53268] Updated weights for policy 1, policy_version 51790 (0.0009) [2023-10-10 06:48:46,279][53268] Updated weights for policy 1, policy_version 51800 (0.0008) [2023-10-10 06:48:46,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106135552. Throughput: 0: 1665.5, 1: 1696.2. Samples: 26535474. 
Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) [2023-10-10 06:48:46,784][52050] Avg episode reward: [(0, '22.920'), (1, '19.850')] [2023-10-10 06:48:47,911][53252] Updated weights for policy 0, policy_version 51850 (0.0009) [2023-10-10 06:48:48,276][53252] Updated weights for policy 0, policy_version 51860 (0.0008) [2023-10-10 06:48:48,651][53252] Updated weights for policy 0, policy_version 51870 (0.0007) [2023-10-10 06:48:50,418][53268] Updated weights for policy 1, policy_version 51810 (0.0009) [2023-10-10 06:48:50,795][53268] Updated weights for policy 1, policy_version 51820 (0.0008) [2023-10-10 06:48:51,172][53268] Updated weights for policy 1, policy_version 51830 (0.0011) [2023-10-10 06:48:51,526][53268] Updated weights for policy 1, policy_version 51840 (0.0011) [2023-10-10 06:48:51,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106201088. Throughput: 0: 1678.2, 1: 1696.1. Samples: 26556096. Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:48:51,784][52050] Avg episode reward: [(0, '23.120'), (1, '17.600')] [2023-10-10 06:48:52,785][53252] Updated weights for policy 0, policy_version 51880 (0.0008) [2023-10-10 06:48:53,164][53252] Updated weights for policy 0, policy_version 51890 (0.0009) [2023-10-10 06:48:53,546][53252] Updated weights for policy 0, policy_version 51900 (0.0010) [2023-10-10 06:48:55,542][53268] Updated weights for policy 1, policy_version 51850 (0.0008) [2023-10-10 06:48:55,905][53268] Updated weights for policy 1, policy_version 51860 (0.0009) [2023-10-10 06:48:56,275][53268] Updated weights for policy 1, policy_version 51870 (0.0007) [2023-10-10 06:48:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 106266624. Throughput: 0: 1680.3, 1: 1669.6. Samples: 26575938. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:48:56,784][52050] Avg episode reward: [(0, '22.540'), (1, '19.450')] [2023-10-10 06:48:57,615][53252] Updated weights for policy 0, policy_version 51910 (0.0010) [2023-10-10 06:48:57,993][53252] Updated weights for policy 0, policy_version 51920 (0.0010) [2023-10-10 06:48:58,360][53252] Updated weights for policy 0, policy_version 51930 (0.0010) [2023-10-10 06:49:00,303][53268] Updated weights for policy 1, policy_version 51880 (0.0008) [2023-10-10 06:49:00,662][53268] Updated weights for policy 1, policy_version 51890 (0.0010) [2023-10-10 06:49:01,025][53268] Updated weights for policy 1, policy_version 51900 (0.0009) [2023-10-10 06:49:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 106332160. Throughput: 0: 1680.0, 1: 1689.5. Samples: 26586042. Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:49:01,784][52050] Avg episode reward: [(0, '19.440'), (1, '19.890')] [2023-10-10 06:49:02,304][53252] Updated weights for policy 0, policy_version 51940 (0.0009) [2023-10-10 06:49:02,673][53252] Updated weights for policy 0, policy_version 51950 (0.0008) [2023-10-10 06:49:03,046][53252] Updated weights for policy 0, policy_version 51960 (0.0009) [2023-10-10 06:49:05,275][53268] Updated weights for policy 1, policy_version 51910 (0.0010) [2023-10-10 06:49:05,658][53268] Updated weights for policy 1, policy_version 51920 (0.0009) [2023-10-10 06:49:06,017][53268] Updated weights for policy 1, policy_version 51930 (0.0008) [2023-10-10 06:49:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106397696. Throughput: 0: 1679.9, 1: 1683.8. Samples: 26606310. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:49:06,784][52050] Avg episode reward: [(0, '19.160'), (1, '19.190')] [2023-10-10 06:49:07,028][53252] Updated weights for policy 0, policy_version 51970 (0.0009) [2023-10-10 06:49:07,404][53252] Updated weights for policy 0, policy_version 51980 (0.0009) [2023-10-10 06:49:07,766][53252] Updated weights for policy 0, policy_version 51990 (0.0009) [2023-10-10 06:49:08,134][53252] Updated weights for policy 0, policy_version 52000 (0.0007) [2023-10-10 06:49:10,071][53268] Updated weights for policy 1, policy_version 51940 (0.0008) [2023-10-10 06:49:10,430][53268] Updated weights for policy 1, policy_version 51950 (0.0008) [2023-10-10 06:49:10,805][53268] Updated weights for policy 1, policy_version 51960 (0.0008) [2023-10-10 06:49:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 106463232. Throughput: 0: 1686.6, 1: 1661.0. Samples: 26626204. Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:49:11,784][52050] Avg episode reward: [(0, '20.170'), (1, '21.000')] [2023-10-10 06:49:12,263][53252] Updated weights for policy 0, policy_version 52010 (0.0007) [2023-10-10 06:49:12,631][53252] Updated weights for policy 0, policy_version 52020 (0.0008) [2023-10-10 06:49:13,009][53252] Updated weights for policy 0, policy_version 52030 (0.0007) [2023-10-10 06:49:14,947][53268] Updated weights for policy 1, policy_version 51970 (0.0010) [2023-10-10 06:49:15,321][53268] Updated weights for policy 1, policy_version 51980 (0.0011) [2023-10-10 06:49:15,691][53268] Updated weights for policy 1, policy_version 51990 (0.0010) [2023-10-10 06:49:16,060][53268] Updated weights for policy 1, policy_version 52000 (0.0010) [2023-10-10 06:49:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106528768. Throughput: 0: 1682.8, 1: 1683.1. Samples: 26636516. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:49:16,784][52050] Avg episode reward: [(0, '20.480'), (1, '21.730')] [2023-10-10 06:49:17,078][53252] Updated weights for policy 0, policy_version 52040 (0.0009) [2023-10-10 06:49:17,456][53252] Updated weights for policy 0, policy_version 52050 (0.0011) [2023-10-10 06:49:17,831][53252] Updated weights for policy 0, policy_version 52060 (0.0010) [2023-10-10 06:49:20,278][53268] Updated weights for policy 1, policy_version 52010 (0.0009) [2023-10-10 06:49:20,645][53268] Updated weights for policy 1, policy_version 52020 (0.0007) [2023-10-10 06:49:21,008][53268] Updated weights for policy 1, policy_version 52030 (0.0011) [2023-10-10 06:49:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106594304. Throughput: 0: 1684.8, 1: 1673.6. Samples: 26656550. Policy #0 lag: (min: 31.0, avg: 33.1, max: 63.0) [2023-10-10 06:49:21,784][52050] Avg episode reward: [(0, '20.660'), (1, '21.810')] [2023-10-10 06:49:22,018][53252] Updated weights for policy 0, policy_version 52070 (0.0007) [2023-10-10 06:49:22,397][53252] Updated weights for policy 0, policy_version 52080 (0.0009) [2023-10-10 06:49:22,773][53252] Updated weights for policy 0, policy_version 52090 (0.0010) [2023-10-10 06:49:25,036][53268] Updated weights for policy 1, policy_version 52040 (0.0009) [2023-10-10 06:49:25,404][53268] Updated weights for policy 1, policy_version 52050 (0.0008) [2023-10-10 06:49:25,768][53268] Updated weights for policy 1, policy_version 52060 (0.0009) [2023-10-10 06:49:26,643][53252] Updated weights for policy 0, policy_version 52100 (0.0008) [2023-10-10 06:49:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106659840. Throughput: 0: 1685.3, 1: 1662.6. Samples: 26676358. 
Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:26,784][52050] Avg episode reward: [(0, '21.470'), (1, '20.370')] [2023-10-10 06:49:27,007][53252] Updated weights for policy 0, policy_version 52110 (0.0009) [2023-10-10 06:49:27,381][53252] Updated weights for policy 0, policy_version 52120 (0.0010) [2023-10-10 06:49:29,911][53268] Updated weights for policy 1, policy_version 52070 (0.0010) [2023-10-10 06:49:30,286][53268] Updated weights for policy 1, policy_version 52080 (0.0009) [2023-10-10 06:49:30,655][53268] Updated weights for policy 1, policy_version 52090 (0.0008) [2023-10-10 06:49:31,585][53252] Updated weights for policy 0, policy_version 52130 (0.0009) [2023-10-10 06:49:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 106725376. Throughput: 0: 1683.4, 1: 1677.1. Samples: 26686694. Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:31,784][52050] Avg episode reward: [(0, '22.260'), (1, '21.270')] [2023-10-10 06:49:31,982][53252] Updated weights for policy 0, policy_version 52140 (0.0007) [2023-10-10 06:49:32,342][53252] Updated weights for policy 0, policy_version 52150 (0.0010) [2023-10-10 06:49:32,710][53252] Updated weights for policy 0, policy_version 52160 (0.0010) [2023-10-10 06:49:34,607][53268] Updated weights for policy 1, policy_version 52100 (0.0009) [2023-10-10 06:49:34,974][53268] Updated weights for policy 1, policy_version 52110 (0.0010) [2023-10-10 06:49:35,342][53268] Updated weights for policy 1, policy_version 52120 (0.0007) [2023-10-10 06:49:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106790912. Throughput: 0: 1680.2, 1: 1665.5. Samples: 26706650. 
Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:36,784][52050] Avg episode reward: [(0, '21.660'), (1, '19.040')] [2023-10-10 06:49:36,810][53252] Updated weights for policy 0, policy_version 52170 (0.0010) [2023-10-10 06:49:37,193][53252] Updated weights for policy 0, policy_version 52180 (0.0010) [2023-10-10 06:49:37,564][53252] Updated weights for policy 0, policy_version 52190 (0.0011) [2023-10-10 06:49:39,378][53268] Updated weights for policy 1, policy_version 52130 (0.0009) [2023-10-10 06:49:39,751][53268] Updated weights for policy 1, policy_version 52140 (0.0008) [2023-10-10 06:49:40,115][53268] Updated weights for policy 1, policy_version 52150 (0.0009) [2023-10-10 06:49:40,489][53268] Updated weights for policy 1, policy_version 52160 (0.0008) [2023-10-10 06:49:41,686][53252] Updated weights for policy 0, policy_version 52200 (0.0008) [2023-10-10 06:49:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106856448. Throughput: 0: 1676.4, 1: 1681.7. Samples: 26727054. Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:41,784][52050] Avg episode reward: [(0, '20.370'), (1, '19.190')] [2023-10-10 06:49:42,063][53252] Updated weights for policy 0, policy_version 52210 (0.0008) [2023-10-10 06:49:42,440][53252] Updated weights for policy 0, policy_version 52220 (0.0007) [2023-10-10 06:49:44,492][53268] Updated weights for policy 1, policy_version 52170 (0.0008) [2023-10-10 06:49:44,857][53268] Updated weights for policy 1, policy_version 52180 (0.0011) [2023-10-10 06:49:45,234][53268] Updated weights for policy 1, policy_version 52190 (0.0011) [2023-10-10 06:49:46,422][53252] Updated weights for policy 0, policy_version 52230 (0.0007) [2023-10-10 06:49:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106921984. Throughput: 0: 1681.1, 1: 1686.5. Samples: 26737584. 
Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:46,784][52050] Avg episode reward: [(0, '20.040'), (1, '19.700')] [2023-10-10 06:49:46,789][53252] Updated weights for policy 0, policy_version 52240 (0.0010) [2023-10-10 06:49:47,157][53252] Updated weights for policy 0, policy_version 52250 (0.0009) [2023-10-10 06:49:49,255][53268] Updated weights for policy 1, policy_version 52200 (0.0008) [2023-10-10 06:49:49,618][53268] Updated weights for policy 1, policy_version 52210 (0.0008) [2023-10-10 06:49:49,984][53268] Updated weights for policy 1, policy_version 52220 (0.0008) [2023-10-10 06:49:51,281][53252] Updated weights for policy 0, policy_version 52260 (0.0008) [2023-10-10 06:49:51,648][53252] Updated weights for policy 0, policy_version 52270 (0.0009) [2023-10-10 06:49:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106987520. Throughput: 0: 1682.9, 1: 1667.7. Samples: 26757084. Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:51,784][52050] Avg episode reward: [(0, '19.450'), (1, '19.340')] [2023-10-10 06:49:52,015][53252] Updated weights for policy 0, policy_version 52280 (0.0009) [2023-10-10 06:49:54,210][53268] Updated weights for policy 1, policy_version 52230 (0.0009) [2023-10-10 06:49:54,595][53268] Updated weights for policy 1, policy_version 52240 (0.0010) [2023-10-10 06:49:54,971][53268] Updated weights for policy 1, policy_version 52250 (0.0011) [2023-10-10 06:49:56,074][53252] Updated weights for policy 0, policy_version 52290 (0.0010) [2023-10-10 06:49:56,452][53252] Updated weights for policy 0, policy_version 52300 (0.0007) [2023-10-10 06:49:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107053056. Throughput: 0: 1667.6, 1: 1684.5. Samples: 26777050. 
Policy #0 lag: (min: 16.0, avg: 41.2, max: 48.0) [2023-10-10 06:49:56,784][52050] Avg episode reward: [(0, '19.060'), (1, '19.230')] [2023-10-10 06:49:56,816][53252] Updated weights for policy 0, policy_version 52310 (0.0007) [2023-10-10 06:49:57,191][53252] Updated weights for policy 0, policy_version 52320 (0.0008) [2023-10-10 06:49:58,947][53268] Updated weights for policy 1, policy_version 52260 (0.0008) [2023-10-10 06:49:59,314][53268] Updated weights for policy 1, policy_version 52270 (0.0007) [2023-10-10 06:49:59,690][53268] Updated weights for policy 1, policy_version 52280 (0.0009) [2023-10-10 06:50:01,234][53252] Updated weights for policy 0, policy_version 52330 (0.0009) [2023-10-10 06:50:01,611][53252] Updated weights for policy 0, policy_version 52340 (0.0008) [2023-10-10 06:50:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 107118592. Throughput: 0: 1677.1, 1: 1677.5. Samples: 26787474. Policy #0 lag: (min: 22.0, avg: 22.7, max: 41.0) [2023-10-10 06:50:01,784][52050] Avg episode reward: [(0, '19.090'), (1, '19.300')] [2023-10-10 06:50:01,983][53252] Updated weights for policy 0, policy_version 52350 (0.0009) [2023-10-10 06:50:03,800][53268] Updated weights for policy 1, policy_version 52290 (0.0009) [2023-10-10 06:50:04,179][53268] Updated weights for policy 1, policy_version 52300 (0.0009) [2023-10-10 06:50:04,545][53268] Updated weights for policy 1, policy_version 52310 (0.0008) [2023-10-10 06:50:04,912][53268] Updated weights for policy 1, policy_version 52320 (0.0008) [2023-10-10 06:50:06,109][53252] Updated weights for policy 0, policy_version 52360 (0.0010) [2023-10-10 06:50:06,488][53252] Updated weights for policy 0, policy_version 52370 (0.0007) [2023-10-10 06:50:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107184128. Throughput: 0: 1677.9, 1: 1669.1. Samples: 26807164. 
Policy #0 lag: (min: 22.0, avg: 22.7, max: 41.0) [2023-10-10 06:50:06,784][52050] Avg episode reward: [(0, '19.720'), (1, '19.760')] [2023-10-10 06:50:06,855][53252] Updated weights for policy 0, policy_version 52380 (0.0007) [2023-10-10 06:50:08,745][53268] Updated weights for policy 1, policy_version 52330 (0.0009) [2023-10-10 06:50:09,119][53268] Updated weights for policy 1, policy_version 52340 (0.0010) [2023-10-10 06:50:09,494][53268] Updated weights for policy 1, policy_version 52350 (0.0009) [2023-10-10 06:50:10,945][53252] Updated weights for policy 0, policy_version 52390 (0.0009) [2023-10-10 06:50:11,320][53252] Updated weights for policy 0, policy_version 52400 (0.0009) [2023-10-10 06:50:11,695][53252] Updated weights for policy 0, policy_version 52410 (0.0008) [2023-10-10 06:50:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107249664. Throughput: 0: 1661.2, 1: 1691.9. Samples: 26827248. Policy #0 lag: (min: 22.0, avg: 22.7, max: 41.0) [2023-10-10 06:50:11,784][52050] Avg episode reward: [(0, '20.190'), (1, '20.610')] [2023-10-10 06:50:13,612][53268] Updated weights for policy 1, policy_version 52360 (0.0009) [2023-10-10 06:50:13,986][53268] Updated weights for policy 1, policy_version 52370 (0.0009) [2023-10-10 06:50:14,344][53268] Updated weights for policy 1, policy_version 52380 (0.0011) [2023-10-10 06:50:15,729][53252] Updated weights for policy 0, policy_version 52420 (0.0008) [2023-10-10 06:50:16,099][53252] Updated weights for policy 0, policy_version 52430 (0.0008) [2023-10-10 06:50:16,469][53252] Updated weights for policy 0, policy_version 52440 (0.0010) [2023-10-10 06:50:16,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107347968. Throughput: 0: 1683.2, 1: 1668.5. Samples: 26837522. 
Policy #0 lag: (min: 22.0, avg: 22.7, max: 41.0) [2023-10-10 06:50:16,784][52050] Avg episode reward: [(0, '21.250'), (1, '21.410')] [2023-10-10 06:50:18,622][53268] Updated weights for policy 1, policy_version 52390 (0.0010) [2023-10-10 06:50:18,988][53268] Updated weights for policy 1, policy_version 52400 (0.0009) [2023-10-10 06:50:19,361][53268] Updated weights for policy 1, policy_version 52410 (0.0007) [2023-10-10 06:50:20,470][53252] Updated weights for policy 0, policy_version 52450 (0.0010) [2023-10-10 06:50:20,869][53252] Updated weights for policy 0, policy_version 52460 (0.0007) [2023-10-10 06:50:21,242][53252] Updated weights for policy 0, policy_version 52470 (0.0007) [2023-10-10 06:50:21,614][53252] Updated weights for policy 0, policy_version 52480 (0.0009) [2023-10-10 06:50:21,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107413504. Throughput: 0: 1686.5, 1: 1668.1. Samples: 26857608. Policy #0 lag: (min: 22.0, avg: 22.7, max: 41.0) [2023-10-10 06:50:21,784][52050] Avg episode reward: [(0, '22.120'), (1, '20.790')] [2023-10-10 06:50:23,457][53268] Updated weights for policy 1, policy_version 52420 (0.0009) [2023-10-10 06:50:23,824][53268] Updated weights for policy 1, policy_version 52430 (0.0007) [2023-10-10 06:50:24,181][53268] Updated weights for policy 1, policy_version 52440 (0.0008) [2023-10-10 06:50:25,542][53252] Updated weights for policy 0, policy_version 52490 (0.0009) [2023-10-10 06:50:25,901][53252] Updated weights for policy 0, policy_version 52500 (0.0009) [2023-10-10 06:50:26,275][53252] Updated weights for policy 0, policy_version 52510 (0.0008) [2023-10-10 06:50:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107479040. Throughput: 0: 1657.7, 1: 1673.9. Samples: 26876974. 
Policy #0 lag: (min: 22.0, avg: 22.7, max: 41.0) [2023-10-10 06:50:26,785][52050] Avg episode reward: [(0, '21.000'), (1, '21.580')] [2023-10-10 06:50:28,253][53268] Updated weights for policy 1, policy_version 52450 (0.0011) [2023-10-10 06:50:28,616][53268] Updated weights for policy 1, policy_version 52460 (0.0011) [2023-10-10 06:50:28,989][53268] Updated weights for policy 1, policy_version 52470 (0.0011) [2023-10-10 06:50:29,360][53268] Updated weights for policy 1, policy_version 52480 (0.0007) [2023-10-10 06:50:30,324][53252] Updated weights for policy 0, policy_version 52520 (0.0010) [2023-10-10 06:50:30,700][53252] Updated weights for policy 0, policy_version 52530 (0.0009) [2023-10-10 06:50:31,073][53252] Updated weights for policy 0, policy_version 52540 (0.0007) [2023-10-10 06:50:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107544576. Throughput: 0: 1688.3, 1: 1649.3. Samples: 26887774. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:50:31,784][52050] Avg episode reward: [(0, '19.730'), (1, '21.860')] [2023-10-10 06:50:33,603][53268] Updated weights for policy 1, policy_version 52490 (0.0008) [2023-10-10 06:50:33,979][53268] Updated weights for policy 1, policy_version 52500 (0.0010) [2023-10-10 06:50:34,341][53268] Updated weights for policy 1, policy_version 52510 (0.0007) [2023-10-10 06:50:35,160][53252] Updated weights for policy 0, policy_version 52550 (0.0007) [2023-10-10 06:50:35,541][53252] Updated weights for policy 0, policy_version 52560 (0.0009) [2023-10-10 06:50:35,895][53252] Updated weights for policy 0, policy_version 52570 (0.0008) [2023-10-10 06:50:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107610112. Throughput: 0: 1677.3, 1: 1662.7. Samples: 26907384. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:50:36,784][52050] Avg episode reward: [(0, '20.350'), (1, '22.030')] [2023-10-10 06:50:38,312][53268] Updated weights for policy 1, policy_version 52520 (0.0010) [2023-10-10 06:50:38,680][53268] Updated weights for policy 1, policy_version 52530 (0.0008) [2023-10-10 06:50:39,050][53268] Updated weights for policy 1, policy_version 52540 (0.0008) [2023-10-10 06:50:39,928][53252] Updated weights for policy 0, policy_version 52580 (0.0008) [2023-10-10 06:50:40,295][53252] Updated weights for policy 0, policy_version 52590 (0.0008) [2023-10-10 06:50:40,666][53252] Updated weights for policy 0, policy_version 52600 (0.0009) [2023-10-10 06:50:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107675648. Throughput: 0: 1669.3, 1: 1672.1. Samples: 26927416. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:50:41,784][52050] Avg episode reward: [(0, '19.010'), (1, '21.180')] [2023-10-10 06:50:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000052608_53870592.pth... [2023-10-10 06:50:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000052544_53805056.pth... 
[2023-10-10 06:50:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000050976_52199424.pth [2023-10-10 06:50:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000051040_52264960.pth [2023-10-10 06:50:43,190][53268] Updated weights for policy 1, policy_version 52550 (0.0007) [2023-10-10 06:50:43,572][53268] Updated weights for policy 1, policy_version 52560 (0.0007) [2023-10-10 06:50:43,939][53268] Updated weights for policy 1, policy_version 52570 (0.0011) [2023-10-10 06:50:44,632][53252] Updated weights for policy 0, policy_version 52610 (0.0009) [2023-10-10 06:50:44,995][53252] Updated weights for policy 0, policy_version 52620 (0.0008) [2023-10-10 06:50:45,364][53252] Updated weights for policy 0, policy_version 52630 (0.0007) [2023-10-10 06:50:45,727][53252] Updated weights for policy 0, policy_version 52640 (0.0007) [2023-10-10 06:50:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107741184. Throughput: 0: 1689.5, 1: 1653.2. Samples: 26937894. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:50:46,784][52050] Avg episode reward: [(0, '20.790'), (1, '20.810')] [2023-10-10 06:50:48,240][53268] Updated weights for policy 1, policy_version 52580 (0.0011) [2023-10-10 06:50:48,606][53268] Updated weights for policy 1, policy_version 52590 (0.0008) [2023-10-10 06:50:48,967][53268] Updated weights for policy 1, policy_version 52600 (0.0008) [2023-10-10 06:50:49,896][53252] Updated weights for policy 0, policy_version 52650 (0.0009) [2023-10-10 06:50:50,277][53252] Updated weights for policy 0, policy_version 52660 (0.0009) [2023-10-10 06:50:50,640][53252] Updated weights for policy 0, policy_version 52670 (0.0009) [2023-10-10 06:50:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107806720. Throughput: 0: 1671.2, 1: 1667.4. Samples: 26957402. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:50:51,784][52050] Avg episode reward: [(0, '21.810'), (1, '20.450')] [2023-10-10 06:50:52,952][53268] Updated weights for policy 1, policy_version 52610 (0.0008) [2023-10-10 06:50:53,316][53268] Updated weights for policy 1, policy_version 52620 (0.0007) [2023-10-10 06:50:53,686][53268] Updated weights for policy 1, policy_version 52630 (0.0010) [2023-10-10 06:50:54,047][53268] Updated weights for policy 1, policy_version 52640 (0.0008) [2023-10-10 06:50:54,609][53252] Updated weights for policy 0, policy_version 52680 (0.0008) [2023-10-10 06:50:54,979][53252] Updated weights for policy 0, policy_version 52690 (0.0009) [2023-10-10 06:50:55,357][53252] Updated weights for policy 0, policy_version 52700 (0.0008) [2023-10-10 06:50:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107872256. Throughput: 0: 1686.2, 1: 1670.1. Samples: 26978282. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:50:56,784][52050] Avg episode reward: [(0, '21.730'), (1, '20.210')] [2023-10-10 06:50:58,170][53268] Updated weights for policy 1, policy_version 52650 (0.0009) [2023-10-10 06:50:58,538][53268] Updated weights for policy 1, policy_version 52660 (0.0008) [2023-10-10 06:50:58,903][53268] Updated weights for policy 1, policy_version 52670 (0.0007) [2023-10-10 06:50:59,399][53252] Updated weights for policy 0, policy_version 52710 (0.0008) [2023-10-10 06:50:59,772][53252] Updated weights for policy 0, policy_version 52720 (0.0009) [2023-10-10 06:51:00,140][53252] Updated weights for policy 0, policy_version 52730 (0.0008) [2023-10-10 06:51:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 107937792. Throughput: 0: 1688.7, 1: 1663.3. Samples: 26988362. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-10 06:51:01,784][52050] Avg episode reward: [(0, '23.660'), (1, '20.080')] [2023-10-10 06:51:02,955][53268] Updated weights for policy 1, policy_version 52680 (0.0008) [2023-10-10 06:51:03,326][53268] Updated weights for policy 1, policy_version 52690 (0.0008) [2023-10-10 06:51:03,696][53268] Updated weights for policy 1, policy_version 52700 (0.0008) [2023-10-10 06:51:04,114][53252] Updated weights for policy 0, policy_version 52740 (0.0009) [2023-10-10 06:51:04,475][53252] Updated weights for policy 0, policy_version 52750 (0.0010) [2023-10-10 06:51:04,852][53252] Updated weights for policy 0, policy_version 52760 (0.0007) [2023-10-10 06:51:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108003328. Throughput: 0: 1667.2, 1: 1678.4. Samples: 27008160. Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:06,784][52050] Avg episode reward: [(0, '22.980'), (1, '19.240')] [2023-10-10 06:51:07,828][53268] Updated weights for policy 1, policy_version 52710 (0.0009) [2023-10-10 06:51:08,186][53268] Updated weights for policy 1, policy_version 52720 (0.0009) [2023-10-10 06:51:08,556][53268] Updated weights for policy 1, policy_version 52730 (0.0008) [2023-10-10 06:51:09,048][53252] Updated weights for policy 0, policy_version 52770 (0.0007) [2023-10-10 06:51:09,437][53252] Updated weights for policy 0, policy_version 52780 (0.0008) [2023-10-10 06:51:09,805][53252] Updated weights for policy 0, policy_version 52790 (0.0009) [2023-10-10 06:51:10,172][53252] Updated weights for policy 0, policy_version 52800 (0.0009) [2023-10-10 06:51:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108068864. Throughput: 0: 1693.0, 1: 1682.0. Samples: 27028850. 
Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:11,784][52050] Avg episode reward: [(0, '21.250'), (1, '20.210')] [2023-10-10 06:51:12,510][53268] Updated weights for policy 1, policy_version 52740 (0.0008) [2023-10-10 06:51:12,879][53268] Updated weights for policy 1, policy_version 52750 (0.0007) [2023-10-10 06:51:13,250][53268] Updated weights for policy 1, policy_version 52760 (0.0008) [2023-10-10 06:51:14,179][53252] Updated weights for policy 0, policy_version 52810 (0.0008) [2023-10-10 06:51:14,554][53252] Updated weights for policy 0, policy_version 52820 (0.0007) [2023-10-10 06:51:14,918][53252] Updated weights for policy 0, policy_version 52830 (0.0008) [2023-10-10 06:51:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108134400. Throughput: 0: 1677.4, 1: 1679.9. Samples: 27038852. Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:16,784][52050] Avg episode reward: [(0, '21.230'), (1, '20.540')] [2023-10-10 06:51:17,280][53268] Updated weights for policy 1, policy_version 52770 (0.0010) [2023-10-10 06:51:17,648][53268] Updated weights for policy 1, policy_version 52780 (0.0008) [2023-10-10 06:51:18,023][53268] Updated weights for policy 1, policy_version 52790 (0.0009) [2023-10-10 06:51:18,395][53268] Updated weights for policy 1, policy_version 52800 (0.0008) [2023-10-10 06:51:18,901][53252] Updated weights for policy 0, policy_version 52840 (0.0007) [2023-10-10 06:51:19,268][53252] Updated weights for policy 0, policy_version 52850 (0.0007) [2023-10-10 06:51:19,644][53252] Updated weights for policy 0, policy_version 52860 (0.0008) [2023-10-10 06:51:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108199936. Throughput: 0: 1676.8, 1: 1690.0. Samples: 27058890. 
Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:21,784][52050] Avg episode reward: [(0, '18.680'), (1, '19.230')] [2023-10-10 06:51:22,365][53268] Updated weights for policy 1, policy_version 52810 (0.0010) [2023-10-10 06:51:22,734][53268] Updated weights for policy 1, policy_version 52820 (0.0007) [2023-10-10 06:51:23,100][53268] Updated weights for policy 1, policy_version 52830 (0.0009) [2023-10-10 06:51:23,617][53252] Updated weights for policy 0, policy_version 52870 (0.0007) [2023-10-10 06:51:23,985][53252] Updated weights for policy 0, policy_version 52880 (0.0007) [2023-10-10 06:51:24,359][53252] Updated weights for policy 0, policy_version 52890 (0.0010) [2023-10-10 06:51:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 108265472. Throughput: 0: 1698.8, 1: 1685.2. Samples: 27079694. Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:26,784][52050] Avg episode reward: [(0, '19.340'), (1, '19.750')] [2023-10-10 06:51:27,148][53268] Updated weights for policy 1, policy_version 52840 (0.0009) [2023-10-10 06:51:27,513][53268] Updated weights for policy 1, policy_version 52850 (0.0008) [2023-10-10 06:51:27,884][53268] Updated weights for policy 1, policy_version 52860 (0.0009) [2023-10-10 06:51:28,382][53252] Updated weights for policy 0, policy_version 52900 (0.0009) [2023-10-10 06:51:28,754][53252] Updated weights for policy 0, policy_version 52910 (0.0008) [2023-10-10 06:51:29,129][53252] Updated weights for policy 0, policy_version 52920 (0.0008) [2023-10-10 06:51:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 108331008. Throughput: 0: 1670.1, 1: 1688.4. Samples: 27089028. 
Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:31,785][52050] Avg episode reward: [(0, '20.560'), (1, '20.270')] [2023-10-10 06:51:32,172][53268] Updated weights for policy 1, policy_version 52870 (0.0009) [2023-10-10 06:51:32,561][53268] Updated weights for policy 1, policy_version 52880 (0.0009) [2023-10-10 06:51:32,939][53268] Updated weights for policy 1, policy_version 52890 (0.0007) [2023-10-10 06:51:33,286][53252] Updated weights for policy 0, policy_version 52930 (0.0008) [2023-10-10 06:51:33,670][53252] Updated weights for policy 0, policy_version 52940 (0.0009) [2023-10-10 06:51:34,043][53252] Updated weights for policy 0, policy_version 52950 (0.0008) [2023-10-10 06:51:34,411][53252] Updated weights for policy 0, policy_version 52960 (0.0008) [2023-10-10 06:51:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108396544. Throughput: 0: 1685.9, 1: 1688.8. Samples: 27109262. Policy #0 lag: (min: 1.0, avg: 2.7, max: 28.0) [2023-10-10 06:51:36,784][52050] Avg episode reward: [(0, '21.300'), (1, '20.360')] [2023-10-10 06:51:37,018][53268] Updated weights for policy 1, policy_version 52900 (0.0008) [2023-10-10 06:51:37,377][53268] Updated weights for policy 1, policy_version 52910 (0.0008) [2023-10-10 06:51:37,738][53268] Updated weights for policy 1, policy_version 52920 (0.0008) [2023-10-10 06:51:38,535][53252] Updated weights for policy 0, policy_version 52970 (0.0007) [2023-10-10 06:51:38,899][53252] Updated weights for policy 0, policy_version 52980 (0.0008) [2023-10-10 06:51:39,270][53252] Updated weights for policy 0, policy_version 52990 (0.0008) [2023-10-10 06:51:41,654][53268] Updated weights for policy 1, policy_version 52930 (0.0010) [2023-10-10 06:51:41,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 108462080. Throughput: 0: 1690.1, 1: 1684.1. Samples: 27130120. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:51:41,784][52050] Avg episode reward: [(0, '22.150'), (1, '19.060')] [2023-10-10 06:51:42,029][53268] Updated weights for policy 1, policy_version 52940 (0.0008) [2023-10-10 06:51:42,393][53268] Updated weights for policy 1, policy_version 52950 (0.0008) [2023-10-10 06:51:42,759][53268] Updated weights for policy 1, policy_version 52960 (0.0007) [2023-10-10 06:51:43,129][53252] Updated weights for policy 0, policy_version 53000 (0.0008) [2023-10-10 06:51:43,495][53252] Updated weights for policy 0, policy_version 53010 (0.0008) [2023-10-10 06:51:43,870][53252] Updated weights for policy 0, policy_version 53020 (0.0007) [2023-10-10 06:51:46,714][53268] Updated weights for policy 1, policy_version 52970 (0.0008) [2023-10-10 06:51:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108527616. Throughput: 0: 1674.8, 1: 1686.0. Samples: 27139596. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:51:46,784][52050] Avg episode reward: [(0, '23.230'), (1, '19.440')] [2023-10-10 06:51:47,084][53268] Updated weights for policy 1, policy_version 52980 (0.0008) [2023-10-10 06:51:47,458][53268] Updated weights for policy 1, policy_version 52990 (0.0008) [2023-10-10 06:51:47,965][53252] Updated weights for policy 0, policy_version 53030 (0.0008) [2023-10-10 06:51:48,337][53252] Updated weights for policy 0, policy_version 53040 (0.0008) [2023-10-10 06:51:48,707][53252] Updated weights for policy 0, policy_version 53050 (0.0009) [2023-10-10 06:51:51,463][53268] Updated weights for policy 1, policy_version 53000 (0.0009) [2023-10-10 06:51:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108593152. Throughput: 0: 1698.7, 1: 1686.5. Samples: 27160494. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:51:51,784][52050] Avg episode reward: [(0, '21.150'), (1, '19.680')] [2023-10-10 06:51:51,836][53268] Updated weights for policy 1, policy_version 53010 (0.0010) [2023-10-10 06:51:52,209][53268] Updated weights for policy 1, policy_version 53020 (0.0011) [2023-10-10 06:51:52,626][53252] Updated weights for policy 0, policy_version 53060 (0.0008) [2023-10-10 06:51:52,998][53252] Updated weights for policy 0, policy_version 53070 (0.0007) [2023-10-10 06:51:53,364][53252] Updated weights for policy 0, policy_version 53080 (0.0008) [2023-10-10 06:51:56,243][53268] Updated weights for policy 1, policy_version 53030 (0.0009) [2023-10-10 06:51:56,609][53268] Updated weights for policy 1, policy_version 53040 (0.0008) [2023-10-10 06:51:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108658688. Throughput: 0: 1703.5, 1: 1682.5. Samples: 27181220. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:51:56,784][52050] Avg episode reward: [(0, '20.690'), (1, '19.370')] [2023-10-10 06:51:56,972][53268] Updated weights for policy 1, policy_version 53050 (0.0008) [2023-10-10 06:51:57,447][53252] Updated weights for policy 0, policy_version 53090 (0.0007) [2023-10-10 06:51:57,852][53252] Updated weights for policy 0, policy_version 53100 (0.0010) [2023-10-10 06:51:58,233][53252] Updated weights for policy 0, policy_version 53110 (0.0009) [2023-10-10 06:51:58,602][53252] Updated weights for policy 0, policy_version 53120 (0.0008) [2023-10-10 06:52:00,949][53268] Updated weights for policy 1, policy_version 53060 (0.0009) [2023-10-10 06:52:01,317][53268] Updated weights for policy 1, policy_version 53070 (0.0010) [2023-10-10 06:52:01,676][53268] Updated weights for policy 1, policy_version 53080 (0.0007) [2023-10-10 06:52:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108724224. Throughput: 0: 1680.3, 1: 1687.2. 
Samples: 27190386. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:52:01,784][52050] Avg episode reward: [(0, '20.550'), (1, '19.320')] [2023-10-10 06:52:02,654][53252] Updated weights for policy 0, policy_version 53130 (0.0007) [2023-10-10 06:52:03,034][53252] Updated weights for policy 0, policy_version 53140 (0.0007) [2023-10-10 06:52:03,394][53252] Updated weights for policy 0, policy_version 53150 (0.0009) [2023-10-10 06:52:05,892][53268] Updated weights for policy 1, policy_version 53090 (0.0007) [2023-10-10 06:52:06,252][53268] Updated weights for policy 1, policy_version 53100 (0.0008) [2023-10-10 06:52:06,626][53268] Updated weights for policy 1, policy_version 53110 (0.0009) [2023-10-10 06:52:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108789760. Throughput: 0: 1695.3, 1: 1687.2. Samples: 27211102. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:52:06,784][52050] Avg episode reward: [(0, '19.510'), (1, '21.530')] [2023-10-10 06:52:06,985][53268] Updated weights for policy 1, policy_version 53120 (0.0010) [2023-10-10 06:52:07,453][53252] Updated weights for policy 0, policy_version 53160 (0.0007) [2023-10-10 06:52:07,824][53252] Updated weights for policy 0, policy_version 53170 (0.0007) [2023-10-10 06:52:08,194][53252] Updated weights for policy 0, policy_version 53180 (0.0007) [2023-10-10 06:52:11,091][53268] Updated weights for policy 1, policy_version 53130 (0.0008) [2023-10-10 06:52:11,453][53268] Updated weights for policy 1, policy_version 53140 (0.0009) [2023-10-10 06:52:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108855296. Throughput: 0: 1695.0, 1: 1677.9. Samples: 27231476. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 06:52:11,784][52050] Avg episode reward: [(0, '19.700'), (1, '21.480')] [2023-10-10 06:52:11,814][53268] Updated weights for policy 1, policy_version 53150 (0.0011) [2023-10-10 06:52:12,132][53252] Updated weights for policy 0, policy_version 53190 (0.0007) [2023-10-10 06:52:12,510][53252] Updated weights for policy 0, policy_version 53200 (0.0010) [2023-10-10 06:52:12,877][53252] Updated weights for policy 0, policy_version 53210 (0.0007) [2023-10-10 06:52:16,057][53268] Updated weights for policy 1, policy_version 53160 (0.0011) [2023-10-10 06:52:16,429][53268] Updated weights for policy 1, policy_version 53170 (0.0009) [2023-10-10 06:52:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108920832. Throughput: 0: 1695.2, 1: 1682.7. Samples: 27241036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:16,784][52050] Avg episode reward: [(0, '19.600'), (1, '21.710')] [2023-10-10 06:52:16,788][53268] Updated weights for policy 1, policy_version 53180 (0.0008) [2023-10-10 06:52:16,904][53252] Updated weights for policy 0, policy_version 53220 (0.0007) [2023-10-10 06:52:17,274][53252] Updated weights for policy 0, policy_version 53230 (0.0008) [2023-10-10 06:52:17,646][53252] Updated weights for policy 0, policy_version 53240 (0.0007) [2023-10-10 06:52:20,958][53268] Updated weights for policy 1, policy_version 53190 (0.0007) [2023-10-10 06:52:21,326][53268] Updated weights for policy 1, policy_version 53200 (0.0007) [2023-10-10 06:52:21,696][53268] Updated weights for policy 1, policy_version 53210 (0.0008) [2023-10-10 06:52:21,748][53252] Updated weights for policy 0, policy_version 53250 (0.0008) [2023-10-10 06:52:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 108986368. Throughput: 0: 1700.2, 1: 1690.3. Samples: 27261834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:21,784][52050] Avg episode reward: [(0, '19.040'), (1, '22.280')] [2023-10-10 06:52:22,122][53252] Updated weights for policy 0, policy_version 53260 (0.0007) [2023-10-10 06:52:22,479][53252] Updated weights for policy 0, policy_version 53270 (0.0009) [2023-10-10 06:52:22,843][53252] Updated weights for policy 0, policy_version 53280 (0.0008) [2023-10-10 06:52:25,607][53268] Updated weights for policy 1, policy_version 53220 (0.0008) [2023-10-10 06:52:25,979][53268] Updated weights for policy 1, policy_version 53230 (0.0008) [2023-10-10 06:52:26,340][53268] Updated weights for policy 1, policy_version 53240 (0.0008) [2023-10-10 06:52:26,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109084672. Throughput: 0: 1699.6, 1: 1675.7. Samples: 27282008. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:26,784][52050] Avg episode reward: [(0, '19.730'), (1, '21.400')] [2023-10-10 06:52:26,971][53252] Updated weights for policy 0, policy_version 53290 (0.0007) [2023-10-10 06:52:27,344][53252] Updated weights for policy 0, policy_version 53300 (0.0008) [2023-10-10 06:52:27,715][53252] Updated weights for policy 0, policy_version 53310 (0.0009) [2023-10-10 06:52:30,333][53268] Updated weights for policy 1, policy_version 53250 (0.0008) [2023-10-10 06:52:30,702][53268] Updated weights for policy 1, policy_version 53260 (0.0009) [2023-10-10 06:52:31,070][53268] Updated weights for policy 1, policy_version 53270 (0.0007) [2023-10-10 06:52:31,431][53268] Updated weights for policy 1, policy_version 53280 (0.0009) [2023-10-10 06:52:31,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109150208. Throughput: 0: 1692.7, 1: 1692.8. Samples: 27291944. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:31,784][52050] Avg episode reward: [(0, '22.380'), (1, '19.840')] [2023-10-10 06:52:31,858][53252] Updated weights for policy 0, policy_version 53320 (0.0010) [2023-10-10 06:52:32,223][53252] Updated weights for policy 0, policy_version 53330 (0.0007) [2023-10-10 06:52:32,583][53252] Updated weights for policy 0, policy_version 53340 (0.0008) [2023-10-10 06:52:35,329][53268] Updated weights for policy 1, policy_version 53290 (0.0010) [2023-10-10 06:52:35,697][53268] Updated weights for policy 1, policy_version 53300 (0.0008) [2023-10-10 06:52:36,067][53268] Updated weights for policy 1, policy_version 53310 (0.0008) [2023-10-10 06:52:36,665][53252] Updated weights for policy 0, policy_version 53350 (0.0008) [2023-10-10 06:52:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109215744. Throughput: 0: 1688.4, 1: 1687.2. Samples: 27312396. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:36,784][52050] Avg episode reward: [(0, '22.460'), (1, '20.660')] [2023-10-10 06:52:37,054][53252] Updated weights for policy 0, policy_version 53360 (0.0009) [2023-10-10 06:52:37,430][53252] Updated weights for policy 0, policy_version 53370 (0.0008) [2023-10-10 06:52:40,175][53268] Updated weights for policy 1, policy_version 53320 (0.0010) [2023-10-10 06:52:40,541][53268] Updated weights for policy 1, policy_version 53330 (0.0010) [2023-10-10 06:52:40,909][53268] Updated weights for policy 1, policy_version 53340 (0.0011) [2023-10-10 06:52:41,355][53252] Updated weights for policy 0, policy_version 53380 (0.0007) [2023-10-10 06:52:41,731][53252] Updated weights for policy 0, policy_version 53390 (0.0008) [2023-10-10 06:52:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109281280. Throughput: 0: 1682.5, 1: 1668.6. Samples: 27332018. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:41,784][52050] Avg episode reward: [(0, '21.780'), (1, '19.490')] [2023-10-10 06:52:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000053344_54624256.pth... [2023-10-10 06:52:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000051776_53018624.pth [2023-10-10 06:52:42,104][53252] Updated weights for policy 0, policy_version 53400 (0.0008) [2023-10-10 06:52:42,394][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000053408_54689792.pth... [2023-10-10 06:52:42,434][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000051808_53051392.pth [2023-10-10 06:52:44,925][53268] Updated weights for policy 1, policy_version 53350 (0.0009) [2023-10-10 06:52:45,299][53268] Updated weights for policy 1, policy_version 53360 (0.0008) [2023-10-10 06:52:45,681][53268] Updated weights for policy 1, policy_version 53370 (0.0009) [2023-10-10 06:52:46,261][53252] Updated weights for policy 0, policy_version 53410 (0.0009) [2023-10-10 06:52:46,628][53252] Updated weights for policy 0, policy_version 53420 (0.0007) [2023-10-10 06:52:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109346816. Throughput: 0: 1692.1, 1: 1693.8. Samples: 27342750. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:52:46,784][52050] Avg episode reward: [(0, '21.790'), (1, '20.180')] [2023-10-10 06:52:46,997][53252] Updated weights for policy 0, policy_version 53430 (0.0007) [2023-10-10 06:52:47,372][53252] Updated weights for policy 0, policy_version 53440 (0.0010) [2023-10-10 06:52:49,780][53268] Updated weights for policy 1, policy_version 53380 (0.0010) [2023-10-10 06:52:50,148][53268] Updated weights for policy 1, policy_version 53390 (0.0009) [2023-10-10 06:52:50,509][53268] Updated weights for policy 1, policy_version 53400 (0.0009) [2023-10-10 06:52:51,307][53252] Updated weights for policy 0, policy_version 53450 (0.0009) [2023-10-10 06:52:51,674][53252] Updated weights for policy 0, policy_version 53460 (0.0008) [2023-10-10 06:52:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109412352. Throughput: 0: 1689.9, 1: 1685.0. Samples: 27362972. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:52:51,784][52050] Avg episode reward: [(0, '21.040'), (1, '21.960')] [2023-10-10 06:52:52,051][53252] Updated weights for policy 0, policy_version 53470 (0.0007) [2023-10-10 06:52:54,462][53268] Updated weights for policy 1, policy_version 53410 (0.0009) [2023-10-10 06:52:54,834][53268] Updated weights for policy 1, policy_version 53420 (0.0010) [2023-10-10 06:52:55,200][53268] Updated weights for policy 1, policy_version 53430 (0.0008) [2023-10-10 06:52:55,560][53268] Updated weights for policy 1, policy_version 53440 (0.0010) [2023-10-10 06:52:56,095][53252] Updated weights for policy 0, policy_version 53480 (0.0008) [2023-10-10 06:52:56,473][53252] Updated weights for policy 0, policy_version 53490 (0.0007) [2023-10-10 06:52:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109477888. Throughput: 0: 1678.1, 1: 1682.1. Samples: 27382686. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:52:56,784][52050] Avg episode reward: [(0, '19.970'), (1, '20.180')] [2023-10-10 06:52:56,843][53252] Updated weights for policy 0, policy_version 53500 (0.0008) [2023-10-10 06:52:59,440][53268] Updated weights for policy 1, policy_version 53450 (0.0009) [2023-10-10 06:52:59,811][53268] Updated weights for policy 1, policy_version 53460 (0.0010) [2023-10-10 06:53:00,183][53268] Updated weights for policy 1, policy_version 53470 (0.0008) [2023-10-10 06:53:00,688][53252] Updated weights for policy 0, policy_version 53510 (0.0009) [2023-10-10 06:53:01,072][53252] Updated weights for policy 0, policy_version 53520 (0.0011) [2023-10-10 06:53:01,447][53252] Updated weights for policy 0, policy_version 53530 (0.0010) [2023-10-10 06:53:01,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 109576192. Throughput: 0: 1691.0, 1: 1699.2. Samples: 27393596. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:53:01,784][52050] Avg episode reward: [(0, '21.960'), (1, '20.030')] [2023-10-10 06:53:04,170][53268] Updated weights for policy 1, policy_version 53480 (0.0008) [2023-10-10 06:53:04,533][53268] Updated weights for policy 1, policy_version 53490 (0.0008) [2023-10-10 06:53:04,914][53268] Updated weights for policy 1, policy_version 53500 (0.0011) [2023-10-10 06:53:05,607][53252] Updated weights for policy 0, policy_version 53540 (0.0009) [2023-10-10 06:53:05,977][53252] Updated weights for policy 0, policy_version 53550 (0.0007) [2023-10-10 06:53:06,361][53252] Updated weights for policy 0, policy_version 53560 (0.0010) [2023-10-10 06:53:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 109641728. Throughput: 0: 1690.9, 1: 1672.8. Samples: 27413204. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:53:06,784][52050] Avg episode reward: [(0, '21.900'), (1, '20.420')] [2023-10-10 06:53:09,220][53268] Updated weights for policy 1, policy_version 53510 (0.0011) [2023-10-10 06:53:09,586][53268] Updated weights for policy 1, policy_version 53520 (0.0008) [2023-10-10 06:53:09,951][53268] Updated weights for policy 1, policy_version 53530 (0.0008) [2023-10-10 06:53:10,299][53252] Updated weights for policy 0, policy_version 53570 (0.0008) [2023-10-10 06:53:10,668][53252] Updated weights for policy 0, policy_version 53580 (0.0008) [2023-10-10 06:53:11,047][53252] Updated weights for policy 0, policy_version 53590 (0.0008) [2023-10-10 06:53:11,415][53252] Updated weights for policy 0, policy_version 53600 (0.0009) [2023-10-10 06:53:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 109707264. Throughput: 0: 1663.5, 1: 1683.6. Samples: 27432630. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:53:11,784][52050] Avg episode reward: [(0, '19.830'), (1, '20.220')] [2023-10-10 06:53:14,066][53268] Updated weights for policy 1, policy_version 53540 (0.0008) [2023-10-10 06:53:14,432][53268] Updated weights for policy 1, policy_version 53550 (0.0008) [2023-10-10 06:53:14,800][53268] Updated weights for policy 1, policy_version 53560 (0.0009) [2023-10-10 06:53:15,429][53252] Updated weights for policy 0, policy_version 53610 (0.0008) [2023-10-10 06:53:15,794][53252] Updated weights for policy 0, policy_version 53620 (0.0007) [2023-10-10 06:53:16,167][53252] Updated weights for policy 0, policy_version 53630 (0.0009) [2023-10-10 06:53:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 109772800. Throughput: 0: 1694.3, 1: 1684.8. Samples: 27444002. 
Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:53:16,784][52050] Avg episode reward: [(0, '21.030'), (1, '20.440')] [2023-10-10 06:53:18,843][53268] Updated weights for policy 1, policy_version 53570 (0.0009) [2023-10-10 06:53:19,204][53268] Updated weights for policy 1, policy_version 53580 (0.0009) [2023-10-10 06:53:19,573][53268] Updated weights for policy 1, policy_version 53590 (0.0009) [2023-10-10 06:53:19,944][53268] Updated weights for policy 1, policy_version 53600 (0.0008) [2023-10-10 06:53:20,302][53252] Updated weights for policy 0, policy_version 53640 (0.0009) [2023-10-10 06:53:20,685][53252] Updated weights for policy 0, policy_version 53650 (0.0011) [2023-10-10 06:53:21,057][53252] Updated weights for policy 0, policy_version 53660 (0.0010) [2023-10-10 06:53:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 109838336. Throughput: 0: 1689.7, 1: 1666.1. Samples: 27463408. Policy #0 lag: (min: 31.0, avg: 35.0, max: 63.0) [2023-10-10 06:53:21,784][52050] Avg episode reward: [(0, '18.440'), (1, '22.290')] [2023-10-10 06:53:24,138][53268] Updated weights for policy 1, policy_version 53610 (0.0009) [2023-10-10 06:53:24,506][53268] Updated weights for policy 1, policy_version 53620 (0.0009) [2023-10-10 06:53:24,873][53268] Updated weights for policy 1, policy_version 53630 (0.0008) [2023-10-10 06:53:25,022][53252] Updated weights for policy 0, policy_version 53670 (0.0011) [2023-10-10 06:53:25,388][53252] Updated weights for policy 0, policy_version 53680 (0.0009) [2023-10-10 06:53:25,754][53252] Updated weights for policy 0, policy_version 53690 (0.0007) [2023-10-10 06:53:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109903872. Throughput: 0: 1676.5, 1: 1689.2. Samples: 27483474. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:26,784][52050] Avg episode reward: [(0, '19.410'), (1, '21.060')] [2023-10-10 06:53:28,730][53268] Updated weights for policy 1, policy_version 53640 (0.0009) [2023-10-10 06:53:29,093][53268] Updated weights for policy 1, policy_version 53650 (0.0007) [2023-10-10 06:53:29,466][53268] Updated weights for policy 1, policy_version 53660 (0.0007) [2023-10-10 06:53:29,726][53252] Updated weights for policy 0, policy_version 53700 (0.0008) [2023-10-10 06:53:30,098][53252] Updated weights for policy 0, policy_version 53710 (0.0009) [2023-10-10 06:53:30,462][53252] Updated weights for policy 0, policy_version 53720 (0.0010) [2023-10-10 06:53:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109969408. Throughput: 0: 1704.5, 1: 1673.3. Samples: 27494754. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:31,784][52050] Avg episode reward: [(0, '21.570'), (1, '21.370')] [2023-10-10 06:53:33,714][53268] Updated weights for policy 1, policy_version 53670 (0.0007) [2023-10-10 06:53:34,087][53268] Updated weights for policy 1, policy_version 53680 (0.0007) [2023-10-10 06:53:34,458][53268] Updated weights for policy 1, policy_version 53690 (0.0008) [2023-10-10 06:53:34,575][53252] Updated weights for policy 0, policy_version 53730 (0.0011) [2023-10-10 06:53:34,951][53252] Updated weights for policy 0, policy_version 53740 (0.0008) [2023-10-10 06:53:35,320][53252] Updated weights for policy 0, policy_version 53750 (0.0008) [2023-10-10 06:53:35,694][53252] Updated weights for policy 0, policy_version 53760 (0.0009) [2023-10-10 06:53:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 110034944. Throughput: 0: 1681.1, 1: 1673.8. Samples: 27513944. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:36,784][52050] Avg episode reward: [(0, '22.150'), (1, '20.040')] [2023-10-10 06:53:38,316][53268] Updated weights for policy 1, policy_version 53700 (0.0008) [2023-10-10 06:53:38,698][53268] Updated weights for policy 1, policy_version 53710 (0.0011) [2023-10-10 06:53:39,053][53268] Updated weights for policy 1, policy_version 53720 (0.0010) [2023-10-10 06:53:39,883][53252] Updated weights for policy 0, policy_version 53770 (0.0009) [2023-10-10 06:53:40,253][53252] Updated weights for policy 0, policy_version 53780 (0.0009) [2023-10-10 06:53:40,634][53252] Updated weights for policy 0, policy_version 53790 (0.0009) [2023-10-10 06:53:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110100480. Throughput: 0: 1674.9, 1: 1691.7. Samples: 27534186. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:41,784][52050] Avg episode reward: [(0, '22.400'), (1, '19.870')] [2023-10-10 06:53:42,897][53268] Updated weights for policy 1, policy_version 53730 (0.0008) [2023-10-10 06:53:43,262][53268] Updated weights for policy 1, policy_version 53740 (0.0008) [2023-10-10 06:53:43,623][53268] Updated weights for policy 1, policy_version 53750 (0.0008) [2023-10-10 06:53:43,999][53268] Updated weights for policy 1, policy_version 53760 (0.0010) [2023-10-10 06:53:44,624][53252] Updated weights for policy 0, policy_version 53800 (0.0008) [2023-10-10 06:53:44,992][53252] Updated weights for policy 0, policy_version 53810 (0.0007) [2023-10-10 06:53:45,360][53252] Updated weights for policy 0, policy_version 53820 (0.0007) [2023-10-10 06:53:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110166016. Throughput: 0: 1692.0, 1: 1666.9. Samples: 27544748. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:46,784][52050] Avg episode reward: [(0, '22.100'), (1, '20.390')] [2023-10-10 06:53:48,091][53268] Updated weights for policy 1, policy_version 53770 (0.0008) [2023-10-10 06:53:48,453][53268] Updated weights for policy 1, policy_version 53780 (0.0009) [2023-10-10 06:53:48,811][53268] Updated weights for policy 1, policy_version 53790 (0.0009) [2023-10-10 06:53:49,400][53252] Updated weights for policy 0, policy_version 53830 (0.0008) [2023-10-10 06:53:49,768][53252] Updated weights for policy 0, policy_version 53840 (0.0008) [2023-10-10 06:53:50,143][53252] Updated weights for policy 0, policy_version 53850 (0.0009) [2023-10-10 06:53:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110231552. Throughput: 0: 1666.2, 1: 1692.4. Samples: 27564338. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:51,784][52050] Avg episode reward: [(0, '20.040'), (1, '20.890')] [2023-10-10 06:53:52,768][53268] Updated weights for policy 1, policy_version 53800 (0.0008) [2023-10-10 06:53:53,137][53268] Updated weights for policy 1, policy_version 53810 (0.0008) [2023-10-10 06:53:53,498][53268] Updated weights for policy 1, policy_version 53820 (0.0007) [2023-10-10 06:53:54,186][53252] Updated weights for policy 0, policy_version 53860 (0.0009) [2023-10-10 06:53:54,554][53252] Updated weights for policy 0, policy_version 53870 (0.0008) [2023-10-10 06:53:54,926][53252] Updated weights for policy 0, policy_version 53880 (0.0008) [2023-10-10 06:53:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110297088. Throughput: 0: 1692.2, 1: 1701.7. Samples: 27585356. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:53:56,784][52050] Avg episode reward: [(0, '20.830'), (1, '20.650')] [2023-10-10 06:53:57,589][53268] Updated weights for policy 1, policy_version 53830 (0.0008) [2023-10-10 06:53:57,976][53268] Updated weights for policy 1, policy_version 53840 (0.0010) [2023-10-10 06:53:58,348][53268] Updated weights for policy 1, policy_version 53850 (0.0009) [2023-10-10 06:53:58,861][53252] Updated weights for policy 0, policy_version 53890 (0.0007) [2023-10-10 06:53:59,227][53252] Updated weights for policy 0, policy_version 53900 (0.0010) [2023-10-10 06:53:59,595][53252] Updated weights for policy 0, policy_version 53910 (0.0010) [2023-10-10 06:53:59,969][53252] Updated weights for policy 0, policy_version 53920 (0.0009) [2023-10-10 06:54:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110362624. Throughput: 0: 1681.4, 1: 1679.1. Samples: 27595222. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 06:54:01,784][52050] Avg episode reward: [(0, '21.130'), (1, '20.600')] [2023-10-10 06:54:02,234][53268] Updated weights for policy 1, policy_version 53860 (0.0009) [2023-10-10 06:54:02,617][53268] Updated weights for policy 1, policy_version 53870 (0.0008) [2023-10-10 06:54:02,979][53268] Updated weights for policy 1, policy_version 53880 (0.0007) [2023-10-10 06:54:03,851][53252] Updated weights for policy 0, policy_version 53930 (0.0010) [2023-10-10 06:54:04,223][53252] Updated weights for policy 0, policy_version 53940 (0.0007) [2023-10-10 06:54:04,593][53252] Updated weights for policy 0, policy_version 53950 (0.0007) [2023-10-10 06:54:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110428160. Throughput: 0: 1674.3, 1: 1705.2. Samples: 27615484. 
Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:06,784][52050] Avg episode reward: [(0, '21.200'), (1, '20.540')] [2023-10-10 06:54:07,093][53268] Updated weights for policy 1, policy_version 53890 (0.0007) [2023-10-10 06:54:07,446][53268] Updated weights for policy 1, policy_version 53900 (0.0010) [2023-10-10 06:54:07,818][53268] Updated weights for policy 1, policy_version 53910 (0.0008) [2023-10-10 06:54:08,177][53268] Updated weights for policy 1, policy_version 53920 (0.0007) [2023-10-10 06:54:08,695][53252] Updated weights for policy 0, policy_version 53960 (0.0008) [2023-10-10 06:54:09,067][53252] Updated weights for policy 0, policy_version 53970 (0.0008) [2023-10-10 06:54:09,437][53252] Updated weights for policy 0, policy_version 53980 (0.0008) [2023-10-10 06:54:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110493696. Throughput: 0: 1693.8, 1: 1701.6. Samples: 27636270. Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:11,784][52050] Avg episode reward: [(0, '20.210'), (1, '18.930')] [2023-10-10 06:54:12,329][53268] Updated weights for policy 1, policy_version 53930 (0.0008) [2023-10-10 06:54:12,693][53268] Updated weights for policy 1, policy_version 53940 (0.0009) [2023-10-10 06:54:13,057][53268] Updated weights for policy 1, policy_version 53950 (0.0008) [2023-10-10 06:54:13,507][53252] Updated weights for policy 0, policy_version 53990 (0.0008) [2023-10-10 06:54:13,870][53252] Updated weights for policy 0, policy_version 54000 (0.0007) [2023-10-10 06:54:14,237][53252] Updated weights for policy 0, policy_version 54010 (0.0007) [2023-10-10 06:54:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110559232. Throughput: 0: 1668.5, 1: 1682.8. Samples: 27645564. 
Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:16,784][52050] Avg episode reward: [(0, '20.200'), (1, '20.490')] [2023-10-10 06:54:17,299][53268] Updated weights for policy 1, policy_version 53960 (0.0011) [2023-10-10 06:54:17,672][53268] Updated weights for policy 1, policy_version 53970 (0.0010) [2023-10-10 06:54:18,039][53268] Updated weights for policy 1, policy_version 53980 (0.0010) [2023-10-10 06:54:18,268][53252] Updated weights for policy 0, policy_version 54020 (0.0007) [2023-10-10 06:54:18,632][53252] Updated weights for policy 0, policy_version 54030 (0.0007) [2023-10-10 06:54:19,005][53252] Updated weights for policy 0, policy_version 54040 (0.0007) [2023-10-10 06:54:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110624768. Throughput: 0: 1686.9, 1: 1693.1. Samples: 27666044. Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:21,784][52050] Avg episode reward: [(0, '20.510'), (1, '20.070')] [2023-10-10 06:54:22,383][53268] Updated weights for policy 1, policy_version 53990 (0.0010) [2023-10-10 06:54:22,742][53268] Updated weights for policy 1, policy_version 54000 (0.0011) [2023-10-10 06:54:23,063][53252] Updated weights for policy 0, policy_version 54050 (0.0007) [2023-10-10 06:54:23,113][53268] Updated weights for policy 1, policy_version 54010 (0.0009) [2023-10-10 06:54:23,467][53252] Updated weights for policy 0, policy_version 54060 (0.0009) [2023-10-10 06:54:23,845][53252] Updated weights for policy 0, policy_version 54070 (0.0011) [2023-10-10 06:54:24,210][53252] Updated weights for policy 0, policy_version 54080 (0.0009) [2023-10-10 06:54:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110690304. Throughput: 0: 1700.0, 1: 1686.0. Samples: 27686556. 
Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:26,784][52050] Avg episode reward: [(0, '21.830'), (1, '18.850')] [2023-10-10 06:54:27,197][53268] Updated weights for policy 1, policy_version 54020 (0.0008) [2023-10-10 06:54:27,564][53268] Updated weights for policy 1, policy_version 54030 (0.0008) [2023-10-10 06:54:27,943][53268] Updated weights for policy 1, policy_version 54040 (0.0007) [2023-10-10 06:54:28,274][53252] Updated weights for policy 0, policy_version 54090 (0.0008) [2023-10-10 06:54:28,634][53252] Updated weights for policy 0, policy_version 54100 (0.0010) [2023-10-10 06:54:29,015][53252] Updated weights for policy 0, policy_version 54110 (0.0008) [2023-10-10 06:54:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110755840. Throughput: 0: 1668.5, 1: 1683.8. Samples: 27695604. Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:31,784][52050] Avg episode reward: [(0, '20.430'), (1, '19.180')] [2023-10-10 06:54:31,972][53268] Updated weights for policy 1, policy_version 54050 (0.0008) [2023-10-10 06:54:32,344][53268] Updated weights for policy 1, policy_version 54060 (0.0009) [2023-10-10 06:54:32,717][53268] Updated weights for policy 1, policy_version 54070 (0.0007) [2023-10-10 06:54:33,035][53252] Updated weights for policy 0, policy_version 54120 (0.0007) [2023-10-10 06:54:33,087][53268] Updated weights for policy 1, policy_version 54080 (0.0009) [2023-10-10 06:54:33,409][53252] Updated weights for policy 0, policy_version 54130 (0.0010) [2023-10-10 06:54:33,776][53252] Updated weights for policy 0, policy_version 54140 (0.0009) [2023-10-10 06:54:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110821376. Throughput: 0: 1694.2, 1: 1682.8. Samples: 27716302. 
Policy #0 lag: (min: 8.0, avg: 31.8, max: 40.0) [2023-10-10 06:54:36,784][52050] Avg episode reward: [(0, '21.250'), (1, '19.650')] [2023-10-10 06:54:37,015][53268] Updated weights for policy 1, policy_version 54090 (0.0010) [2023-10-10 06:54:37,379][53268] Updated weights for policy 1, policy_version 54100 (0.0007) [2023-10-10 06:54:37,750][53268] Updated weights for policy 1, policy_version 54110 (0.0010) [2023-10-10 06:54:37,888][53252] Updated weights for policy 0, policy_version 54150 (0.0007) [2023-10-10 06:54:38,270][53252] Updated weights for policy 0, policy_version 54160 (0.0009) [2023-10-10 06:54:38,641][53252] Updated weights for policy 0, policy_version 54170 (0.0010) [2023-10-10 06:54:41,705][53268] Updated weights for policy 1, policy_version 54120 (0.0008) [2023-10-10 06:54:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110886912. Throughput: 0: 1694.1, 1: 1676.3. Samples: 27737026. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:54:41,784][52050] Avg episode reward: [(0, '20.940'), (1, '19.380')] [2023-10-10 06:54:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000054176_55476224.pth... [2023-10-10 06:54:41,828][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000052608_53870592.pth [2023-10-10 06:54:42,081][53268] Updated weights for policy 1, policy_version 54130 (0.0010) [2023-10-10 06:54:42,443][53268] Updated weights for policy 1, policy_version 54140 (0.0008) [2023-10-10 06:54:42,593][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000054144_55443456.pth... 
[2023-10-10 06:54:42,624][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000052544_53805056.pth [2023-10-10 06:54:42,804][53252] Updated weights for policy 0, policy_version 54180 (0.0008) [2023-10-10 06:54:43,182][53252] Updated weights for policy 0, policy_version 54190 (0.0008) [2023-10-10 06:54:43,544][53252] Updated weights for policy 0, policy_version 54200 (0.0007) [2023-10-10 06:54:46,611][53268] Updated weights for policy 1, policy_version 54150 (0.0008) [2023-10-10 06:54:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110952448. Throughput: 0: 1674.9, 1: 1681.8. Samples: 27746274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:54:46,784][52050] Avg episode reward: [(0, '20.310'), (1, '19.920')] [2023-10-10 06:54:46,998][53268] Updated weights for policy 1, policy_version 54160 (0.0007) [2023-10-10 06:54:47,370][53268] Updated weights for policy 1, policy_version 54170 (0.0008) [2023-10-10 06:54:47,570][53252] Updated weights for policy 0, policy_version 54210 (0.0008) [2023-10-10 06:54:47,941][53252] Updated weights for policy 0, policy_version 54220 (0.0008) [2023-10-10 06:54:48,317][53252] Updated weights for policy 0, policy_version 54230 (0.0007) [2023-10-10 06:54:48,682][53252] Updated weights for policy 0, policy_version 54240 (0.0010) [2023-10-10 06:54:51,318][53268] Updated weights for policy 1, policy_version 54180 (0.0008) [2023-10-10 06:54:51,682][53268] Updated weights for policy 1, policy_version 54190 (0.0010) [2023-10-10 06:54:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111017984. Throughput: 0: 1689.1, 1: 1679.9. Samples: 27767090. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:54:51,784][52050] Avg episode reward: [(0, '20.040'), (1, '19.150')] [2023-10-10 06:54:52,054][53268] Updated weights for policy 1, policy_version 54200 (0.0010) [2023-10-10 06:54:52,741][53252] Updated weights for policy 0, policy_version 54250 (0.0010) [2023-10-10 06:54:53,118][53252] Updated weights for policy 0, policy_version 54260 (0.0010) [2023-10-10 06:54:53,490][53252] Updated weights for policy 0, policy_version 54270 (0.0011) [2023-10-10 06:54:56,086][53268] Updated weights for policy 1, policy_version 54210 (0.0009) [2023-10-10 06:54:56,456][53268] Updated weights for policy 1, policy_version 54220 (0.0009) [2023-10-10 06:54:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111083520. Throughput: 0: 1690.1, 1: 1679.0. Samples: 27787882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:54:56,784][52050] Avg episode reward: [(0, '18.870'), (1, '19.560')] [2023-10-10 06:54:56,832][53268] Updated weights for policy 1, policy_version 54230 (0.0009) [2023-10-10 06:54:57,193][53268] Updated weights for policy 1, policy_version 54240 (0.0008) [2023-10-10 06:54:57,467][53252] Updated weights for policy 0, policy_version 54280 (0.0010) [2023-10-10 06:54:57,826][53252] Updated weights for policy 0, policy_version 54290 (0.0008) [2023-10-10 06:54:58,201][53252] Updated weights for policy 0, policy_version 54300 (0.0010) [2023-10-10 06:55:01,146][53268] Updated weights for policy 1, policy_version 54250 (0.0007) [2023-10-10 06:55:01,504][53268] Updated weights for policy 1, policy_version 54260 (0.0010) [2023-10-10 06:55:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111149056. Throughput: 0: 1684.3, 1: 1684.8. Samples: 27797172. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:55:01,784][52050] Avg episode reward: [(0, '19.370'), (1, '18.950')] [2023-10-10 06:55:01,874][53268] Updated weights for policy 1, policy_version 54270 (0.0009) [2023-10-10 06:55:02,024][53252] Updated weights for policy 0, policy_version 54310 (0.0010) [2023-10-10 06:55:02,390][53252] Updated weights for policy 0, policy_version 54320 (0.0008) [2023-10-10 06:55:02,754][53252] Updated weights for policy 0, policy_version 54330 (0.0009) [2023-10-10 06:55:05,960][53268] Updated weights for policy 1, policy_version 54280 (0.0009) [2023-10-10 06:55:06,316][53268] Updated weights for policy 1, policy_version 54290 (0.0011) [2023-10-10 06:55:06,677][53268] Updated weights for policy 1, policy_version 54300 (0.0010) [2023-10-10 06:55:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111214592. Throughput: 0: 1688.9, 1: 1684.8. Samples: 27817860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:55:06,784][52050] Avg episode reward: [(0, '18.180'), (1, '18.740')] [2023-10-10 06:55:06,938][53252] Updated weights for policy 0, policy_version 54340 (0.0008) [2023-10-10 06:55:07,313][53252] Updated weights for policy 0, policy_version 54350 (0.0009) [2023-10-10 06:55:07,680][53252] Updated weights for policy 0, policy_version 54360 (0.0009) [2023-10-10 06:55:10,892][53268] Updated weights for policy 1, policy_version 54310 (0.0008) [2023-10-10 06:55:11,260][53268] Updated weights for policy 1, policy_version 54320 (0.0009) [2023-10-10 06:55:11,632][53268] Updated weights for policy 1, policy_version 54330 (0.0009) [2023-10-10 06:55:11,772][53252] Updated weights for policy 0, policy_version 54370 (0.0008) [2023-10-10 06:55:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111280128. Throughput: 0: 1693.7, 1: 1674.1. Samples: 27838108. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:55:11,784][52050] Avg episode reward: [(0, '22.480'), (1, '17.550')] [2023-10-10 06:55:12,177][53252] Updated weights for policy 0, policy_version 54380 (0.0010) [2023-10-10 06:55:12,560][53252] Updated weights for policy 0, policy_version 54390 (0.0007) [2023-10-10 06:55:12,928][53252] Updated weights for policy 0, policy_version 54400 (0.0008) [2023-10-10 06:55:15,795][53268] Updated weights for policy 1, policy_version 54340 (0.0009) [2023-10-10 06:55:16,159][53268] Updated weights for policy 1, policy_version 54350 (0.0009) [2023-10-10 06:55:16,528][53268] Updated weights for policy 1, policy_version 54360 (0.0008) [2023-10-10 06:55:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111345664. Throughput: 0: 1690.6, 1: 1688.0. Samples: 27847644. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:55:16,784][52050] Avg episode reward: [(0, '22.530'), (1, '17.270')] [2023-10-10 06:55:16,889][53252] Updated weights for policy 0, policy_version 54410 (0.0007) [2023-10-10 06:55:17,254][53252] Updated weights for policy 0, policy_version 54420 (0.0008) [2023-10-10 06:55:17,630][53252] Updated weights for policy 0, policy_version 54430 (0.0007) [2023-10-10 06:55:20,703][53268] Updated weights for policy 1, policy_version 54370 (0.0009) [2023-10-10 06:55:21,068][53268] Updated weights for policy 1, policy_version 54380 (0.0007) [2023-10-10 06:55:21,429][53268] Updated weights for policy 1, policy_version 54390 (0.0009) [2023-10-10 06:55:21,731][53252] Updated weights for policy 0, policy_version 54440 (0.0007) [2023-10-10 06:55:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111411200. Throughput: 0: 1691.6, 1: 1687.0. Samples: 27868340. 
Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:21,784][52050] Avg episode reward: [(0, '22.010'), (1, '18.780')] [2023-10-10 06:55:21,798][53268] Updated weights for policy 1, policy_version 54400 (0.0009) [2023-10-10 06:55:22,102][53252] Updated weights for policy 0, policy_version 54450 (0.0010) [2023-10-10 06:55:22,473][53252] Updated weights for policy 0, policy_version 54460 (0.0007) [2023-10-10 06:55:25,845][53268] Updated weights for policy 1, policy_version 54410 (0.0011) [2023-10-10 06:55:26,212][53268] Updated weights for policy 1, policy_version 54420 (0.0010) [2023-10-10 06:55:26,488][53252] Updated weights for policy 0, policy_version 54470 (0.0007) [2023-10-10 06:55:26,583][53268] Updated weights for policy 1, policy_version 54430 (0.0008) [2023-10-10 06:55:26,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 111509504. Throughput: 0: 1686.8, 1: 1674.3. Samples: 27888274. Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:26,784][52050] Avg episode reward: [(0, '20.460'), (1, '18.650')] [2023-10-10 06:55:26,859][53252] Updated weights for policy 0, policy_version 54480 (0.0009) [2023-10-10 06:55:27,236][53252] Updated weights for policy 0, policy_version 54490 (0.0011) [2023-10-10 06:55:30,594][53268] Updated weights for policy 1, policy_version 54440 (0.0008) [2023-10-10 06:55:30,971][53268] Updated weights for policy 1, policy_version 54450 (0.0011) [2023-10-10 06:55:31,337][53268] Updated weights for policy 1, policy_version 54460 (0.0009) [2023-10-10 06:55:31,426][53252] Updated weights for policy 0, policy_version 54500 (0.0010) [2023-10-10 06:55:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111575040. Throughput: 0: 1685.9, 1: 1689.3. Samples: 27898158. 
Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:31,784][52050] Avg episode reward: [(0, '20.770'), (1, '19.890')] [2023-10-10 06:55:31,796][53252] Updated weights for policy 0, policy_version 54510 (0.0009) [2023-10-10 06:55:32,177][53252] Updated weights for policy 0, policy_version 54520 (0.0008) [2023-10-10 06:55:35,473][53268] Updated weights for policy 1, policy_version 54470 (0.0008) [2023-10-10 06:55:35,867][53268] Updated weights for policy 1, policy_version 54480 (0.0011) [2023-10-10 06:55:36,205][53252] Updated weights for policy 0, policy_version 54530 (0.0008) [2023-10-10 06:55:36,238][53268] Updated weights for policy 1, policy_version 54490 (0.0009) [2023-10-10 06:55:36,573][53252] Updated weights for policy 0, policy_version 54540 (0.0008) [2023-10-10 06:55:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111640576. Throughput: 0: 1686.1, 1: 1689.4. Samples: 27918988. Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:36,784][52050] Avg episode reward: [(0, '19.120'), (1, '20.720')] [2023-10-10 06:55:36,952][53252] Updated weights for policy 0, policy_version 54550 (0.0009) [2023-10-10 06:55:37,318][53252] Updated weights for policy 0, policy_version 54560 (0.0008) [2023-10-10 06:55:40,279][53268] Updated weights for policy 1, policy_version 54500 (0.0008) [2023-10-10 06:55:40,647][53268] Updated weights for policy 1, policy_version 54510 (0.0008) [2023-10-10 06:55:41,018][53268] Updated weights for policy 1, policy_version 54520 (0.0007) [2023-10-10 06:55:41,423][53252] Updated weights for policy 0, policy_version 54570 (0.0008) [2023-10-10 06:55:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 111706112. Throughput: 0: 1672.3, 1: 1663.6. Samples: 27937998. 
Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:41,784][52050] Avg episode reward: [(0, '20.100'), (1, '21.060')] [2023-10-10 06:55:41,793][53252] Updated weights for policy 0, policy_version 54580 (0.0008) [2023-10-10 06:55:42,168][53252] Updated weights for policy 0, policy_version 54590 (0.0008) [2023-10-10 06:55:45,255][53268] Updated weights for policy 1, policy_version 54530 (0.0010) [2023-10-10 06:55:45,626][53268] Updated weights for policy 1, policy_version 54540 (0.0007) [2023-10-10 06:55:45,992][53268] Updated weights for policy 1, policy_version 54550 (0.0007) [2023-10-10 06:55:46,252][53252] Updated weights for policy 0, policy_version 54600 (0.0008) [2023-10-10 06:55:46,361][53268] Updated weights for policy 1, policy_version 54560 (0.0009) [2023-10-10 06:55:46,627][53252] Updated weights for policy 0, policy_version 54610 (0.0009) [2023-10-10 06:55:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111771648. Throughput: 0: 1679.9, 1: 1679.3. Samples: 27948334. Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:46,784][52050] Avg episode reward: [(0, '20.260'), (1, '19.490')] [2023-10-10 06:55:47,007][53252] Updated weights for policy 0, policy_version 54620 (0.0010) [2023-10-10 06:55:50,279][53268] Updated weights for policy 1, policy_version 54570 (0.0010) [2023-10-10 06:55:50,643][53268] Updated weights for policy 1, policy_version 54580 (0.0010) [2023-10-10 06:55:51,014][53268] Updated weights for policy 1, policy_version 54590 (0.0009) [2023-10-10 06:55:51,184][53252] Updated weights for policy 0, policy_version 54630 (0.0008) [2023-10-10 06:55:51,562][53252] Updated weights for policy 0, policy_version 54640 (0.0008) [2023-10-10 06:55:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111837184. Throughput: 0: 1675.9, 1: 1673.2. Samples: 27968568. 
Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:51,784][52050] Avg episode reward: [(0, '20.310'), (1, '20.040')] [2023-10-10 06:55:51,927][53252] Updated weights for policy 0, policy_version 54650 (0.0008) [2023-10-10 06:55:55,069][53268] Updated weights for policy 1, policy_version 54600 (0.0010) [2023-10-10 06:55:55,442][53268] Updated weights for policy 1, policy_version 54610 (0.0009) [2023-10-10 06:55:55,808][53268] Updated weights for policy 1, policy_version 54620 (0.0008) [2023-10-10 06:55:56,165][53252] Updated weights for policy 0, policy_version 54660 (0.0007) [2023-10-10 06:55:56,554][53252] Updated weights for policy 0, policy_version 54670 (0.0007) [2023-10-10 06:55:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 111902720. Throughput: 0: 1664.5, 1: 1663.8. Samples: 27987884. Policy #0 lag: (min: 25.0, avg: 38.0, max: 57.0) [2023-10-10 06:55:56,784][52050] Avg episode reward: [(0, '21.080'), (1, '19.930')] [2023-10-10 06:55:56,932][53252] Updated weights for policy 0, policy_version 54680 (0.0007) [2023-10-10 06:55:59,811][53268] Updated weights for policy 1, policy_version 54630 (0.0009) [2023-10-10 06:56:00,188][53268] Updated weights for policy 1, policy_version 54640 (0.0009) [2023-10-10 06:56:00,542][53268] Updated weights for policy 1, policy_version 54650 (0.0010) [2023-10-10 06:56:00,738][53252] Updated weights for policy 0, policy_version 54690 (0.0007) [2023-10-10 06:56:01,104][53252] Updated weights for policy 0, policy_version 54700 (0.0007) [2023-10-10 06:56:01,469][53252] Updated weights for policy 0, policy_version 54710 (0.0011) [2023-10-10 06:56:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111968256. Throughput: 0: 1677.3, 1: 1680.7. Samples: 27998752. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:01,784][52050] Avg episode reward: [(0, '19.860'), (1, '20.790')] [2023-10-10 06:56:01,840][53252] Updated weights for policy 0, policy_version 54720 (0.0011) [2023-10-10 06:56:04,737][53268] Updated weights for policy 1, policy_version 54660 (0.0008) [2023-10-10 06:56:05,111][53268] Updated weights for policy 1, policy_version 54670 (0.0007) [2023-10-10 06:56:05,467][53268] Updated weights for policy 1, policy_version 54680 (0.0007) [2023-10-10 06:56:05,858][53252] Updated weights for policy 0, policy_version 54730 (0.0008) [2023-10-10 06:56:06,236][53252] Updated weights for policy 0, policy_version 54740 (0.0007) [2023-10-10 06:56:06,605][53252] Updated weights for policy 0, policy_version 54750 (0.0008) [2023-10-10 06:56:06,783][52050] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 112066560. Throughput: 0: 1684.9, 1: 1664.5. Samples: 28019064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:06,784][52050] Avg episode reward: [(0, '19.480'), (1, '19.560')] [2023-10-10 06:56:09,681][53268] Updated weights for policy 1, policy_version 54690 (0.0008) [2023-10-10 06:56:10,056][53268] Updated weights for policy 1, policy_version 54700 (0.0009) [2023-10-10 06:56:10,431][53268] Updated weights for policy 1, policy_version 54710 (0.0008) [2023-10-10 06:56:10,631][53252] Updated weights for policy 0, policy_version 54760 (0.0007) [2023-10-10 06:56:10,792][53268] Updated weights for policy 1, policy_version 54720 (0.0010) [2023-10-10 06:56:10,998][53252] Updated weights for policy 0, policy_version 54770 (0.0008) [2023-10-10 06:56:11,381][53252] Updated weights for policy 0, policy_version 54780 (0.0008) [2023-10-10 06:56:11,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 112132096. Throughput: 0: 1665.5, 1: 1657.1. Samples: 28037794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:11,784][52050] Avg episode reward: [(0, '21.450'), (1, '18.580')] [2023-10-10 06:56:14,889][53268] Updated weights for policy 1, policy_version 54730 (0.0009) [2023-10-10 06:56:15,256][53268] Updated weights for policy 1, policy_version 54740 (0.0008) [2023-10-10 06:56:15,363][53252] Updated weights for policy 0, policy_version 54790 (0.0008) [2023-10-10 06:56:15,620][53268] Updated weights for policy 1, policy_version 54750 (0.0007) [2023-10-10 06:56:15,728][53252] Updated weights for policy 0, policy_version 54800 (0.0008) [2023-10-10 06:56:16,093][53252] Updated weights for policy 0, policy_version 54810 (0.0008) [2023-10-10 06:56:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 112197632. Throughput: 0: 1692.5, 1: 1670.8. Samples: 28049508. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:16,784][52050] Avg episode reward: [(0, '20.660'), (1, '18.570')] [2023-10-10 06:56:19,763][53268] Updated weights for policy 1, policy_version 54760 (0.0007) [2023-10-10 06:56:20,146][53268] Updated weights for policy 1, policy_version 54770 (0.0010) [2023-10-10 06:56:20,255][53252] Updated weights for policy 0, policy_version 54820 (0.0007) [2023-10-10 06:56:20,506][53268] Updated weights for policy 1, policy_version 54780 (0.0009) [2023-10-10 06:56:20,620][53252] Updated weights for policy 0, policy_version 54830 (0.0009) [2023-10-10 06:56:21,002][53252] Updated weights for policy 0, policy_version 54840 (0.0009) [2023-10-10 06:56:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 112263168. Throughput: 0: 1678.8, 1: 1650.8. Samples: 28068822. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:21,784][52050] Avg episode reward: [(0, '21.740'), (1, '18.420')] [2023-10-10 06:56:24,542][53268] Updated weights for policy 1, policy_version 54790 (0.0010) [2023-10-10 06:56:24,914][53268] Updated weights for policy 1, policy_version 54800 (0.0008) [2023-10-10 06:56:25,189][53252] Updated weights for policy 0, policy_version 54850 (0.0009) [2023-10-10 06:56:25,275][53268] Updated weights for policy 1, policy_version 54810 (0.0008) [2023-10-10 06:56:25,559][53252] Updated weights for policy 0, policy_version 54860 (0.0008) [2023-10-10 06:56:25,917][53252] Updated weights for policy 0, policy_version 54870 (0.0009) [2023-10-10 06:56:26,297][53252] Updated weights for policy 0, policy_version 54880 (0.0010) [2023-10-10 06:56:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112328704. Throughput: 0: 1665.8, 1: 1669.5. Samples: 28088090. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:26,784][52050] Avg episode reward: [(0, '23.330'), (1, '19.090')] [2023-10-10 06:56:29,288][53268] Updated weights for policy 1, policy_version 54820 (0.0008) [2023-10-10 06:56:29,651][53268] Updated weights for policy 1, policy_version 54830 (0.0009) [2023-10-10 06:56:30,023][53268] Updated weights for policy 1, policy_version 54840 (0.0009) [2023-10-10 06:56:30,255][53252] Updated weights for policy 0, policy_version 54890 (0.0010) [2023-10-10 06:56:30,636][53252] Updated weights for policy 0, policy_version 54900 (0.0007) [2023-10-10 06:56:31,001][53252] Updated weights for policy 0, policy_version 54910 (0.0009) [2023-10-10 06:56:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112394240. Throughput: 0: 1687.9, 1: 1674.8. Samples: 28099654. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:56:31,784][52050] Avg episode reward: [(0, '21.950'), (1, '18.920')] [2023-10-10 06:56:34,235][53268] Updated weights for policy 1, policy_version 54850 (0.0009) [2023-10-10 06:56:34,599][53268] Updated weights for policy 1, policy_version 54860 (0.0007) [2023-10-10 06:56:34,871][53252] Updated weights for policy 0, policy_version 54920 (0.0009) [2023-10-10 06:56:34,968][53268] Updated weights for policy 1, policy_version 54870 (0.0009) [2023-10-10 06:56:35,232][53252] Updated weights for policy 0, policy_version 54930 (0.0008) [2023-10-10 06:56:35,337][53268] Updated weights for policy 1, policy_version 54880 (0.0009) [2023-10-10 06:56:35,609][53252] Updated weights for policy 0, policy_version 54940 (0.0010) [2023-10-10 06:56:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 112459776. Throughput: 0: 1678.6, 1: 1658.5. Samples: 28118740. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:56:36,784][52050] Avg episode reward: [(0, '20.750'), (1, '20.090')] [2023-10-10 06:56:39,275][53268] Updated weights for policy 1, policy_version 54890 (0.0008) [2023-10-10 06:56:39,642][53268] Updated weights for policy 1, policy_version 54900 (0.0008) [2023-10-10 06:56:39,834][53252] Updated weights for policy 0, policy_version 54950 (0.0008) [2023-10-10 06:56:40,016][53268] Updated weights for policy 1, policy_version 54910 (0.0008) [2023-10-10 06:56:40,207][53252] Updated weights for policy 0, policy_version 54960 (0.0009) [2023-10-10 06:56:40,578][53252] Updated weights for policy 0, policy_version 54970 (0.0009) [2023-10-10 06:56:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112525312. Throughput: 0: 1674.2, 1: 1678.5. Samples: 28138758. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:56:41,784][52050] Avg episode reward: [(0, '20.800'), (1, '19.730')] [2023-10-10 06:56:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000054912_56229888.pth... [2023-10-10 06:56:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000054976_56295424.pth... [2023-10-10 06:56:41,825][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000053344_54624256.pth [2023-10-10 06:56:41,829][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000054912_56229888.pth [2023-10-10 06:56:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000053408_54689792.pth [2023-10-10 06:56:41,837][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000054976_56295424.pth [2023-10-10 06:56:43,984][53268] Updated weights for policy 1, policy_version 54920 (0.0008) [2023-10-10 06:56:44,358][53268] Updated weights for policy 1, policy_version 54930 (0.0007) [2023-10-10 06:56:44,696][53252] Updated weights for policy 0, policy_version 54980 (0.0009) [2023-10-10 06:56:44,730][53268] Updated weights for policy 1, policy_version 54940 (0.0007) [2023-10-10 06:56:45,082][53252] Updated weights for policy 0, policy_version 54990 (0.0009) [2023-10-10 06:56:45,460][53252] Updated weights for policy 0, policy_version 55000 (0.0008) [2023-10-10 06:56:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112590848. Throughput: 0: 1693.8, 1: 1670.9. Samples: 28150166. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:56:46,784][52050] Avg episode reward: [(0, '20.830'), (1, '19.060')] [2023-10-10 06:56:48,789][53268] Updated weights for policy 1, policy_version 54950 (0.0008) [2023-10-10 06:56:49,161][53268] Updated weights for policy 1, policy_version 54960 (0.0008) [2023-10-10 06:56:49,435][53252] Updated weights for policy 0, policy_version 55010 (0.0009) [2023-10-10 06:56:49,526][53268] Updated weights for policy 1, policy_version 54970 (0.0007) [2023-10-10 06:56:49,820][53252] Updated weights for policy 0, policy_version 55020 (0.0007) [2023-10-10 06:56:50,189][53252] Updated weights for policy 0, policy_version 55030 (0.0009) [2023-10-10 06:56:50,559][53252] Updated weights for policy 0, policy_version 55040 (0.0010) [2023-10-10 06:56:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112656384. Throughput: 0: 1661.3, 1: 1671.9. Samples: 28169058. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:56:51,784][52050] Avg episode reward: [(0, '20.420'), (1, '19.510')] [2023-10-10 06:56:53,641][53268] Updated weights for policy 1, policy_version 54980 (0.0008) [2023-10-10 06:56:53,997][53268] Updated weights for policy 1, policy_version 54990 (0.0008) [2023-10-10 06:56:54,371][53268] Updated weights for policy 1, policy_version 55000 (0.0008) [2023-10-10 06:56:54,658][53252] Updated weights for policy 0, policy_version 55050 (0.0008) [2023-10-10 06:56:55,035][53252] Updated weights for policy 0, policy_version 55060 (0.0007) [2023-10-10 06:56:55,405][53252] Updated weights for policy 0, policy_version 55070 (0.0009) [2023-10-10 06:56:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112721920. Throughput: 0: 1679.0, 1: 1692.8. Samples: 28189522. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:56:56,785][52050] Avg episode reward: [(0, '21.670'), (1, '21.360')] [2023-10-10 06:56:58,368][53268] Updated weights for policy 1, policy_version 55010 (0.0008) [2023-10-10 06:56:58,730][53268] Updated weights for policy 1, policy_version 55020 (0.0009) [2023-10-10 06:56:59,099][53268] Updated weights for policy 1, policy_version 55030 (0.0010) [2023-10-10 06:56:59,429][53252] Updated weights for policy 0, policy_version 55080 (0.0009) [2023-10-10 06:56:59,462][53268] Updated weights for policy 1, policy_version 55040 (0.0009) [2023-10-10 06:56:59,793][53252] Updated weights for policy 0, policy_version 55090 (0.0009) [2023-10-10 06:57:00,165][53252] Updated weights for policy 0, policy_version 55100 (0.0008) [2023-10-10 06:57:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112787456. Throughput: 0: 1678.7, 1: 1672.0. Samples: 28200288. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:57:01,784][52050] Avg episode reward: [(0, '22.770'), (1, '20.100')] [2023-10-10 06:57:03,487][53268] Updated weights for policy 1, policy_version 55050 (0.0010) [2023-10-10 06:57:03,854][53268] Updated weights for policy 1, policy_version 55060 (0.0010) [2023-10-10 06:57:04,215][53268] Updated weights for policy 1, policy_version 55070 (0.0007) [2023-10-10 06:57:04,269][53252] Updated weights for policy 0, policy_version 55110 (0.0009) [2023-10-10 06:57:04,644][53252] Updated weights for policy 0, policy_version 55120 (0.0010) [2023-10-10 06:57:05,006][53252] Updated weights for policy 0, policy_version 55130 (0.0009) [2023-10-10 06:57:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 112852992. Throughput: 0: 1669.1, 1: 1683.0. Samples: 28219670. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:57:06,784][52050] Avg episode reward: [(0, '22.730'), (1, '19.310')] [2023-10-10 06:57:08,370][53268] Updated weights for policy 1, policy_version 55080 (0.0010) [2023-10-10 06:57:08,744][53268] Updated weights for policy 1, policy_version 55090 (0.0011) [2023-10-10 06:57:09,029][53252] Updated weights for policy 0, policy_version 55140 (0.0007) [2023-10-10 06:57:09,122][53268] Updated weights for policy 1, policy_version 55100 (0.0009) [2023-10-10 06:57:09,411][53252] Updated weights for policy 0, policy_version 55150 (0.0009) [2023-10-10 06:57:09,775][53252] Updated weights for policy 0, policy_version 55160 (0.0008) [2023-10-10 06:57:11,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 112918528. Throughput: 0: 1695.1, 1: 1690.0. Samples: 28240420. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) [2023-10-10 06:57:11,785][52050] Avg episode reward: [(0, '21.950'), (1, '20.510')] [2023-10-10 06:57:13,154][53268] Updated weights for policy 1, policy_version 55110 (0.0008) [2023-10-10 06:57:13,524][53268] Updated weights for policy 1, policy_version 55120 (0.0010) [2023-10-10 06:57:13,580][53252] Updated weights for policy 0, policy_version 55170 (0.0009) [2023-10-10 06:57:13,887][53268] Updated weights for policy 1, policy_version 55130 (0.0007) [2023-10-10 06:57:13,956][53252] Updated weights for policy 0, policy_version 55180 (0.0009) [2023-10-10 06:57:14,331][53252] Updated weights for policy 0, policy_version 55190 (0.0009) [2023-10-10 06:57:14,704][53252] Updated weights for policy 0, policy_version 55200 (0.0008) [2023-10-10 06:57:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 112984064. Throughput: 0: 1676.0, 1: 1668.2. Samples: 28250146. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:16,784][52050] Avg episode reward: [(0, '23.520'), (1, '19.480')] [2023-10-10 06:57:17,829][53268] Updated weights for policy 1, policy_version 55140 (0.0008) [2023-10-10 06:57:18,194][53268] Updated weights for policy 1, policy_version 55150 (0.0009) [2023-10-10 06:57:18,559][53268] Updated weights for policy 1, policy_version 55160 (0.0008) [2023-10-10 06:57:18,694][53252] Updated weights for policy 0, policy_version 55210 (0.0009) [2023-10-10 06:57:19,061][53252] Updated weights for policy 0, policy_version 55220 (0.0007) [2023-10-10 06:57:19,434][53252] Updated weights for policy 0, policy_version 55230 (0.0007) [2023-10-10 06:57:21,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113049600. Throughput: 0: 1683.9, 1: 1688.2. Samples: 28270482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:21,784][52050] Avg episode reward: [(0, '22.340'), (1, '19.420')] [2023-10-10 06:57:22,654][53268] Updated weights for policy 1, policy_version 55170 (0.0009) [2023-10-10 06:57:23,026][53268] Updated weights for policy 1, policy_version 55180 (0.0010) [2023-10-10 06:57:23,401][53268] Updated weights for policy 1, policy_version 55190 (0.0008) [2023-10-10 06:57:23,551][53252] Updated weights for policy 0, policy_version 55240 (0.0007) [2023-10-10 06:57:23,763][53268] Updated weights for policy 1, policy_version 55200 (0.0009) [2023-10-10 06:57:23,923][53252] Updated weights for policy 0, policy_version 55250 (0.0007) [2023-10-10 06:57:24,291][53252] Updated weights for policy 0, policy_version 55260 (0.0007) [2023-10-10 06:57:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113115136. Throughput: 0: 1696.4, 1: 1686.4. Samples: 28290982. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:26,784][52050] Avg episode reward: [(0, '22.480'), (1, '20.700')] [2023-10-10 06:57:27,948][53268] Updated weights for policy 1, policy_version 55210 (0.0009) [2023-10-10 06:57:28,262][53252] Updated weights for policy 0, policy_version 55270 (0.0008) [2023-10-10 06:57:28,318][53268] Updated weights for policy 1, policy_version 55220 (0.0008) [2023-10-10 06:57:28,632][53252] Updated weights for policy 0, policy_version 55280 (0.0008) [2023-10-10 06:57:28,681][53268] Updated weights for policy 1, policy_version 55230 (0.0009) [2023-10-10 06:57:29,018][53252] Updated weights for policy 0, policy_version 55290 (0.0010) [2023-10-10 06:57:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113180672. Throughput: 0: 1670.0, 1: 1664.4. Samples: 28300214. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:31,785][52050] Avg episode reward: [(0, '21.540'), (1, '20.640')] [2023-10-10 06:57:32,747][53268] Updated weights for policy 1, policy_version 55240 (0.0009) [2023-10-10 06:57:33,100][53252] Updated weights for policy 0, policy_version 55300 (0.0008) [2023-10-10 06:57:33,115][53268] Updated weights for policy 1, policy_version 55250 (0.0007) [2023-10-10 06:57:33,470][53268] Updated weights for policy 1, policy_version 55260 (0.0009) [2023-10-10 06:57:33,480][53252] Updated weights for policy 0, policy_version 55310 (0.0007) [2023-10-10 06:57:33,846][53252] Updated weights for policy 0, policy_version 55320 (0.0008) [2023-10-10 06:57:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113246208. Throughput: 0: 1689.6, 1: 1681.1. Samples: 28320740. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:36,784][52050] Avg episode reward: [(0, '22.360'), (1, '20.430')] [2023-10-10 06:57:37,525][53268] Updated weights for policy 1, policy_version 55270 (0.0010) [2023-10-10 06:57:37,892][53268] Updated weights for policy 1, policy_version 55280 (0.0010) [2023-10-10 06:57:37,925][53252] Updated weights for policy 0, policy_version 55330 (0.0007) [2023-10-10 06:57:38,254][53268] Updated weights for policy 1, policy_version 55290 (0.0008) [2023-10-10 06:57:38,294][53252] Updated weights for policy 0, policy_version 55340 (0.0009) [2023-10-10 06:57:38,673][53252] Updated weights for policy 0, policy_version 55350 (0.0009) [2023-10-10 06:57:39,041][53252] Updated weights for policy 0, policy_version 55360 (0.0010) [2023-10-10 06:57:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113311744. Throughput: 0: 1696.7, 1: 1685.3. Samples: 28341712. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:41,784][52050] Avg episode reward: [(0, '19.400'), (1, '21.900')] [2023-10-10 06:57:42,118][53268] Updated weights for policy 1, policy_version 55300 (0.0008) [2023-10-10 06:57:42,484][53268] Updated weights for policy 1, policy_version 55310 (0.0009) [2023-10-10 06:57:42,860][53268] Updated weights for policy 1, policy_version 55320 (0.0010) [2023-10-10 06:57:43,232][53252] Updated weights for policy 0, policy_version 55370 (0.0009) [2023-10-10 06:57:43,607][53252] Updated weights for policy 0, policy_version 55380 (0.0010) [2023-10-10 06:57:43,976][53252] Updated weights for policy 0, policy_version 55390 (0.0010) [2023-10-10 06:57:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113377280. Throughput: 0: 1667.2, 1: 1674.8. Samples: 28350678. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:46,784][52050] Avg episode reward: [(0, '20.620'), (1, '22.780')] [2023-10-10 06:57:46,786][53061] Saving new best policy, reward=22.780! [2023-10-10 06:57:47,040][53268] Updated weights for policy 1, policy_version 55330 (0.0007) [2023-10-10 06:57:47,419][53268] Updated weights for policy 1, policy_version 55340 (0.0008) [2023-10-10 06:57:47,785][53268] Updated weights for policy 1, policy_version 55350 (0.0009) [2023-10-10 06:57:48,137][53268] Updated weights for policy 1, policy_version 55360 (0.0009) [2023-10-10 06:57:48,169][53252] Updated weights for policy 0, policy_version 55400 (0.0010) [2023-10-10 06:57:48,539][53252] Updated weights for policy 0, policy_version 55410 (0.0010) [2023-10-10 06:57:48,916][53252] Updated weights for policy 0, policy_version 55420 (0.0009) [2023-10-10 06:57:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 113442816. Throughput: 0: 1684.0, 1: 1684.6. Samples: 28371258. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:51,784][52050] Avg episode reward: [(0, '20.660'), (1, '21.260')] [2023-10-10 06:57:52,130][53268] Updated weights for policy 1, policy_version 55370 (0.0009) [2023-10-10 06:57:52,493][53268] Updated weights for policy 1, policy_version 55380 (0.0008) [2023-10-10 06:57:52,857][53268] Updated weights for policy 1, policy_version 55390 (0.0008) [2023-10-10 06:57:53,047][53252] Updated weights for policy 0, policy_version 55430 (0.0007) [2023-10-10 06:57:53,415][53252] Updated weights for policy 0, policy_version 55440 (0.0007) [2023-10-10 06:57:53,783][53252] Updated weights for policy 0, policy_version 55450 (0.0008) [2023-10-10 06:57:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113508352. Throughput: 0: 1687.3, 1: 1685.8. Samples: 28392206. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:57:56,784][52050] Avg episode reward: [(0, '20.930'), (1, '21.670')] [2023-10-10 06:57:57,111][53268] Updated weights for policy 1, policy_version 55400 (0.0008) [2023-10-10 06:57:57,486][53268] Updated weights for policy 1, policy_version 55410 (0.0010) [2023-10-10 06:57:57,759][53252] Updated weights for policy 0, policy_version 55460 (0.0007) [2023-10-10 06:57:57,857][53268] Updated weights for policy 1, policy_version 55420 (0.0010) [2023-10-10 06:57:58,144][53252] Updated weights for policy 0, policy_version 55470 (0.0007) [2023-10-10 06:57:58,511][53252] Updated weights for policy 0, policy_version 55480 (0.0007) [2023-10-10 06:58:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 113573888. Throughput: 0: 1676.5, 1: 1678.2. Samples: 28401106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:01,784][52050] Avg episode reward: [(0, '20.820'), (1, '19.600')] [2023-10-10 06:58:01,979][53268] Updated weights for policy 1, policy_version 55430 (0.0009) [2023-10-10 06:58:02,341][53268] Updated weights for policy 1, policy_version 55440 (0.0008) [2023-10-10 06:58:02,489][53252] Updated weights for policy 0, policy_version 55490 (0.0007) [2023-10-10 06:58:02,704][53268] Updated weights for policy 1, policy_version 55450 (0.0008) [2023-10-10 06:58:02,855][53252] Updated weights for policy 0, policy_version 55500 (0.0007) [2023-10-10 06:58:03,233][53252] Updated weights for policy 0, policy_version 55510 (0.0009) [2023-10-10 06:58:03,606][53252] Updated weights for policy 0, policy_version 55520 (0.0010) [2023-10-10 06:58:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113639424. Throughput: 0: 1684.5, 1: 1676.9. Samples: 28421746. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:06,784][52050] Avg episode reward: [(0, '22.350'), (1, '19.130')] [2023-10-10 06:58:06,937][53268] Updated weights for policy 1, policy_version 55460 (0.0008) [2023-10-10 06:58:07,309][53268] Updated weights for policy 1, policy_version 55470 (0.0007) [2023-10-10 06:58:07,542][53252] Updated weights for policy 0, policy_version 55530 (0.0009) [2023-10-10 06:58:07,669][53268] Updated weights for policy 1, policy_version 55480 (0.0010) [2023-10-10 06:58:07,911][53252] Updated weights for policy 0, policy_version 55540 (0.0009) [2023-10-10 06:58:08,285][53252] Updated weights for policy 0, policy_version 55550 (0.0010) [2023-10-10 06:58:11,670][53268] Updated weights for policy 1, policy_version 55490 (0.0010) [2023-10-10 06:58:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 113704960. Throughput: 0: 1686.0, 1: 1683.6. Samples: 28442614. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:11,784][52050] Avg episode reward: [(0, '23.750'), (1, '19.790')] [2023-10-10 06:58:12,036][53268] Updated weights for policy 1, policy_version 55500 (0.0009) [2023-10-10 06:58:12,255][53252] Updated weights for policy 0, policy_version 55560 (0.0007) [2023-10-10 06:58:12,401][53268] Updated weights for policy 1, policy_version 55510 (0.0007) [2023-10-10 06:58:12,621][53252] Updated weights for policy 0, policy_version 55570 (0.0007) [2023-10-10 06:58:12,767][53268] Updated weights for policy 1, policy_version 55520 (0.0007) [2023-10-10 06:58:12,979][53252] Updated weights for policy 0, policy_version 55580 (0.0007) [2023-10-10 06:58:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113770496. Throughput: 0: 1683.7, 1: 1681.6. Samples: 28451650. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:16,784][52050] Avg episode reward: [(0, '21.660'), (1, '19.930')] [2023-10-10 06:58:16,826][53268] Updated weights for policy 1, policy_version 55530 (0.0007) [2023-10-10 06:58:17,062][53252] Updated weights for policy 0, policy_version 55590 (0.0008) [2023-10-10 06:58:17,182][53268] Updated weights for policy 1, policy_version 55540 (0.0008) [2023-10-10 06:58:17,438][53252] Updated weights for policy 0, policy_version 55600 (0.0008) [2023-10-10 06:58:17,549][53268] Updated weights for policy 1, policy_version 55550 (0.0009) [2023-10-10 06:58:17,807][53252] Updated weights for policy 0, policy_version 55610 (0.0008) [2023-10-10 06:58:21,732][53268] Updated weights for policy 1, policy_version 55560 (0.0009) [2023-10-10 06:58:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113836032. Throughput: 0: 1688.1, 1: 1676.5. Samples: 28472150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:21,784][52050] Avg episode reward: [(0, '21.120'), (1, '21.070')] [2023-10-10 06:58:21,912][53252] Updated weights for policy 0, policy_version 55620 (0.0008) [2023-10-10 06:58:22,102][53268] Updated weights for policy 1, policy_version 55570 (0.0007) [2023-10-10 06:58:22,300][53252] Updated weights for policy 0, policy_version 55630 (0.0008) [2023-10-10 06:58:22,459][53268] Updated weights for policy 1, policy_version 55580 (0.0007) [2023-10-10 06:58:22,664][53252] Updated weights for policy 0, policy_version 55640 (0.0007) [2023-10-10 06:58:26,507][53268] Updated weights for policy 1, policy_version 55590 (0.0007) [2023-10-10 06:58:26,752][53252] Updated weights for policy 0, policy_version 55650 (0.0008) [2023-10-10 06:58:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113901568. Throughput: 0: 1686.9, 1: 1672.0. Samples: 28492862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:26,784][52050] Avg episode reward: [(0, '21.440'), (1, '21.580')] [2023-10-10 06:58:26,878][53268] Updated weights for policy 1, policy_version 55600 (0.0009) [2023-10-10 06:58:27,118][53252] Updated weights for policy 0, policy_version 55660 (0.0008) [2023-10-10 06:58:27,238][53268] Updated weights for policy 1, policy_version 55610 (0.0009) [2023-10-10 06:58:27,488][53252] Updated weights for policy 0, policy_version 55670 (0.0010) [2023-10-10 06:58:27,852][53252] Updated weights for policy 0, policy_version 55680 (0.0009) [2023-10-10 06:58:31,272][53268] Updated weights for policy 1, policy_version 55620 (0.0008) [2023-10-10 06:58:31,644][53268] Updated weights for policy 1, policy_version 55630 (0.0009) [2023-10-10 06:58:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 113967104. Throughput: 0: 1687.0, 1: 1673.3. Samples: 28501892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:58:31,784][52050] Avg episode reward: [(0, '19.000'), (1, '20.900')] [2023-10-10 06:58:32,014][53268] Updated weights for policy 1, policy_version 55640 (0.0009) [2023-10-10 06:58:32,093][53252] Updated weights for policy 0, policy_version 55690 (0.0008) [2023-10-10 06:58:32,469][53252] Updated weights for policy 0, policy_version 55700 (0.0007) [2023-10-10 06:58:32,838][53252] Updated weights for policy 0, policy_version 55710 (0.0010) [2023-10-10 06:58:36,059][53268] Updated weights for policy 1, policy_version 55650 (0.0010) [2023-10-10 06:58:36,429][53268] Updated weights for policy 1, policy_version 55660 (0.0011) [2023-10-10 06:58:36,777][53252] Updated weights for policy 0, policy_version 55720 (0.0008) [2023-10-10 06:58:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114032640. Throughput: 0: 1691.1, 1: 1674.5. Samples: 28522708. 
Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:58:36,784][52050] Avg episode reward: [(0, '18.530'), (1, '20.660')] [2023-10-10 06:58:36,796][53268] Updated weights for policy 1, policy_version 55670 (0.0010) [2023-10-10 06:58:37,148][53252] Updated weights for policy 0, policy_version 55730 (0.0007) [2023-10-10 06:58:37,167][53268] Updated weights for policy 1, policy_version 55680 (0.0008) [2023-10-10 06:58:37,526][53252] Updated weights for policy 0, policy_version 55740 (0.0009) [2023-10-10 06:58:41,228][53268] Updated weights for policy 1, policy_version 55690 (0.0009) [2023-10-10 06:58:41,607][53268] Updated weights for policy 1, policy_version 55700 (0.0011) [2023-10-10 06:58:41,709][53252] Updated weights for policy 0, policy_version 55750 (0.0008) [2023-10-10 06:58:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114098176. Throughput: 0: 1682.5, 1: 1669.6. Samples: 28543048. Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:58:41,784][52050] Avg episode reward: [(0, '20.940'), (1, '20.300')] [2023-10-10 06:58:41,967][53268] Updated weights for policy 1, policy_version 55710 (0.0007) [2023-10-10 06:58:42,040][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000055712_57049088.pth... [2023-10-10 06:58:42,070][53252] Updated weights for policy 0, policy_version 55760 (0.0009) [2023-10-10 06:58:42,073][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000054144_55443456.pth [2023-10-10 06:58:42,442][53252] Updated weights for policy 0, policy_version 55770 (0.0008) [2023-10-10 06:58:42,656][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000055776_57114624.pth... 
[2023-10-10 06:58:42,694][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000054176_55476224.pth [2023-10-10 06:58:46,364][53268] Updated weights for policy 1, policy_version 55720 (0.0008) [2023-10-10 06:58:46,560][53252] Updated weights for policy 0, policy_version 55780 (0.0008) [2023-10-10 06:58:46,740][53268] Updated weights for policy 1, policy_version 55730 (0.0009) [2023-10-10 06:58:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114163712. Throughput: 0: 1683.7, 1: 1677.0. Samples: 28552338. Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:58:46,784][52050] Avg episode reward: [(0, '19.430'), (1, '19.780')] [2023-10-10 06:58:46,931][53252] Updated weights for policy 0, policy_version 55790 (0.0009) [2023-10-10 06:58:47,097][53268] Updated weights for policy 1, policy_version 55740 (0.0007) [2023-10-10 06:58:47,307][53252] Updated weights for policy 0, policy_version 55800 (0.0008) [2023-10-10 06:58:51,226][53268] Updated weights for policy 1, policy_version 55750 (0.0009) [2023-10-10 06:58:51,367][53252] Updated weights for policy 0, policy_version 55810 (0.0009) [2023-10-10 06:58:51,579][53268] Updated weights for policy 1, policy_version 55760 (0.0009) [2023-10-10 06:58:51,738][53252] Updated weights for policy 0, policy_version 55820 (0.0007) [2023-10-10 06:58:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 114229248. Throughput: 0: 1678.0, 1: 1675.2. Samples: 28572642. 
Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:58:51,784][52050] Avg episode reward: [(0, '20.560'), (1, '19.710')] [2023-10-10 06:58:51,944][53268] Updated weights for policy 1, policy_version 55770 (0.0007) [2023-10-10 06:58:52,114][53252] Updated weights for policy 0, policy_version 55830 (0.0007) [2023-10-10 06:58:52,499][53252] Updated weights for policy 0, policy_version 55840 (0.0008) [2023-10-10 06:58:55,881][53268] Updated weights for policy 1, policy_version 55780 (0.0008) [2023-10-10 06:58:56,251][53268] Updated weights for policy 1, policy_version 55790 (0.0008) [2023-10-10 06:58:56,608][53268] Updated weights for policy 1, policy_version 55800 (0.0008) [2023-10-10 06:58:56,648][53252] Updated weights for policy 0, policy_version 55850 (0.0007) [2023-10-10 06:58:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 114294784. Throughput: 0: 1671.2, 1: 1664.6. Samples: 28592726. Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:58:56,784][52050] Avg episode reward: [(0, '22.400'), (1, '21.280')] [2023-10-10 06:58:57,024][53252] Updated weights for policy 0, policy_version 55860 (0.0007) [2023-10-10 06:58:57,394][53252] Updated weights for policy 0, policy_version 55870 (0.0008) [2023-10-10 06:59:00,686][53268] Updated weights for policy 1, policy_version 55810 (0.0009) [2023-10-10 06:59:01,062][53268] Updated weights for policy 1, policy_version 55820 (0.0008) [2023-10-10 06:59:01,420][53268] Updated weights for policy 1, policy_version 55830 (0.0009) [2023-10-10 06:59:01,426][53252] Updated weights for policy 0, policy_version 55880 (0.0009) [2023-10-10 06:59:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 114360320. Throughput: 0: 1672.6, 1: 1677.4. Samples: 28602402. 
Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:59:01,784][52050] Avg episode reward: [(0, '22.540'), (1, '20.160')] [2023-10-10 06:59:01,791][53268] Updated weights for policy 1, policy_version 55840 (0.0008) [2023-10-10 06:59:01,791][53252] Updated weights for policy 0, policy_version 55890 (0.0008) [2023-10-10 06:59:02,158][53252] Updated weights for policy 0, policy_version 55900 (0.0010) [2023-10-10 06:59:05,921][53268] Updated weights for policy 1, policy_version 55850 (0.0008) [2023-10-10 06:59:06,237][53252] Updated weights for policy 0, policy_version 55910 (0.0009) [2023-10-10 06:59:06,295][53268] Updated weights for policy 1, policy_version 55860 (0.0008) [2023-10-10 06:59:06,607][53252] Updated weights for policy 0, policy_version 55920 (0.0009) [2023-10-10 06:59:06,661][53268] Updated weights for policy 1, policy_version 55870 (0.0008) [2023-10-10 06:59:06,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114458624. Throughput: 0: 1674.1, 1: 1684.8. Samples: 28623300. Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:59:06,784][52050] Avg episode reward: [(0, '22.480'), (1, '20.760')] [2023-10-10 06:59:06,976][53252] Updated weights for policy 0, policy_version 55930 (0.0009) [2023-10-10 06:59:10,734][53268] Updated weights for policy 1, policy_version 55880 (0.0008) [2023-10-10 06:59:11,102][53268] Updated weights for policy 1, policy_version 55890 (0.0007) [2023-10-10 06:59:11,144][53252] Updated weights for policy 0, policy_version 55940 (0.0010) [2023-10-10 06:59:11,469][53268] Updated weights for policy 1, policy_version 55900 (0.0007) [2023-10-10 06:59:11,515][53252] Updated weights for policy 0, policy_version 55950 (0.0010) [2023-10-10 06:59:11,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 114524160. Throughput: 0: 1661.1, 1: 1667.4. Samples: 28642644. 
Policy #0 lag: (min: 31.0, avg: 41.3, max: 63.0) [2023-10-10 06:59:11,784][52050] Avg episode reward: [(0, '23.120'), (1, '20.880')] [2023-10-10 06:59:11,881][53252] Updated weights for policy 0, policy_version 55960 (0.0009) [2023-10-10 06:59:15,396][53268] Updated weights for policy 1, policy_version 55910 (0.0008) [2023-10-10 06:59:15,763][53268] Updated weights for policy 1, policy_version 55920 (0.0009) [2023-10-10 06:59:15,880][53252] Updated weights for policy 0, policy_version 55970 (0.0009) [2023-10-10 06:59:16,127][53268] Updated weights for policy 1, policy_version 55930 (0.0008) [2023-10-10 06:59:16,239][53252] Updated weights for policy 0, policy_version 55980 (0.0008) [2023-10-10 06:59:16,610][53252] Updated weights for policy 0, policy_version 55990 (0.0009) [2023-10-10 06:59:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114589696. Throughput: 0: 1673.6, 1: 1682.1. Samples: 28652898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:16,784][52050] Avg episode reward: [(0, '21.090'), (1, '20.230')] [2023-10-10 06:59:16,977][53252] Updated weights for policy 0, policy_version 56000 (0.0008) [2023-10-10 06:59:20,363][53268] Updated weights for policy 1, policy_version 55940 (0.0009) [2023-10-10 06:59:20,738][53268] Updated weights for policy 1, policy_version 55950 (0.0007) [2023-10-10 06:59:21,102][53268] Updated weights for policy 1, policy_version 55960 (0.0008) [2023-10-10 06:59:21,167][53252] Updated weights for policy 0, policy_version 56010 (0.0008) [2023-10-10 06:59:21,537][53252] Updated weights for policy 0, policy_version 56020 (0.0007) [2023-10-10 06:59:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114655232. Throughput: 0: 1669.5, 1: 1677.2. Samples: 28673310. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:21,784][52050] Avg episode reward: [(0, '20.540'), (1, '20.100')] [2023-10-10 06:59:21,902][53252] Updated weights for policy 0, policy_version 56030 (0.0007) [2023-10-10 06:59:24,949][53268] Updated weights for policy 1, policy_version 55970 (0.0010) [2023-10-10 06:59:25,314][53268] Updated weights for policy 1, policy_version 55980 (0.0009) [2023-10-10 06:59:25,675][53268] Updated weights for policy 1, policy_version 55990 (0.0009) [2023-10-10 06:59:25,932][53252] Updated weights for policy 0, policy_version 56040 (0.0008) [2023-10-10 06:59:26,043][53268] Updated weights for policy 1, policy_version 56000 (0.0008) [2023-10-10 06:59:26,304][53252] Updated weights for policy 0, policy_version 56050 (0.0007) [2023-10-10 06:59:26,673][53252] Updated weights for policy 0, policy_version 56060 (0.0007) [2023-10-10 06:59:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114720768. Throughput: 0: 1657.0, 1: 1658.8. Samples: 28692262. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:26,784][52050] Avg episode reward: [(0, '20.090'), (1, '21.020')] [2023-10-10 06:59:29,968][53268] Updated weights for policy 1, policy_version 56010 (0.0012) [2023-10-10 06:59:30,330][53268] Updated weights for policy 1, policy_version 56020 (0.0010) [2023-10-10 06:59:30,634][53252] Updated weights for policy 0, policy_version 56070 (0.0008) [2023-10-10 06:59:30,707][53268] Updated weights for policy 1, policy_version 56030 (0.0007) [2023-10-10 06:59:31,015][53252] Updated weights for policy 0, policy_version 56080 (0.0010) [2023-10-10 06:59:31,387][53252] Updated weights for policy 0, policy_version 56090 (0.0007) [2023-10-10 06:59:31,783][52050] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 114819072. Throughput: 0: 1673.6, 1: 1684.7. Samples: 28703464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:31,784][52050] Avg episode reward: [(0, '22.020'), (1, '22.560')] [2023-10-10 06:59:34,930][53268] Updated weights for policy 1, policy_version 56040 (0.0009) [2023-10-10 06:59:35,298][53268] Updated weights for policy 1, policy_version 56050 (0.0008) [2023-10-10 06:59:35,310][53252] Updated weights for policy 0, policy_version 56100 (0.0008) [2023-10-10 06:59:35,676][53268] Updated weights for policy 1, policy_version 56060 (0.0008) [2023-10-10 06:59:35,680][53252] Updated weights for policy 0, policy_version 56110 (0.0009) [2023-10-10 06:59:36,052][53252] Updated weights for policy 0, policy_version 56120 (0.0009) [2023-10-10 06:59:36,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 114884608. Throughput: 0: 1676.4, 1: 1675.7. Samples: 28723486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:36,784][52050] Avg episode reward: [(0, '20.740'), (1, '23.830')] [2023-10-10 06:59:36,786][53061] Saving new best policy, reward=23.830! [2023-10-10 06:59:39,691][53268] Updated weights for policy 1, policy_version 56070 (0.0010) [2023-10-10 06:59:40,060][53268] Updated weights for policy 1, policy_version 56080 (0.0007) [2023-10-10 06:59:40,180][53252] Updated weights for policy 0, policy_version 56130 (0.0009) [2023-10-10 06:59:40,419][53268] Updated weights for policy 1, policy_version 56090 (0.0008) [2023-10-10 06:59:40,552][53252] Updated weights for policy 0, policy_version 56140 (0.0010) [2023-10-10 06:59:40,925][53252] Updated weights for policy 0, policy_version 56150 (0.0007) [2023-10-10 06:59:41,300][53252] Updated weights for policy 0, policy_version 56160 (0.0008) [2023-10-10 06:59:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 114950144. Throughput: 0: 1654.8, 1: 1669.7. Samples: 28742330. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:41,785][52050] Avg episode reward: [(0, '21.340'), (1, '24.690')] [2023-10-10 06:59:41,795][53061] Saving new best policy, reward=24.690! [2023-10-10 06:59:44,527][53268] Updated weights for policy 1, policy_version 56100 (0.0009) [2023-10-10 06:59:44,901][53268] Updated weights for policy 1, policy_version 56110 (0.0008) [2023-10-10 06:59:45,263][53268] Updated weights for policy 1, policy_version 56120 (0.0007) [2023-10-10 06:59:45,454][53252] Updated weights for policy 0, policy_version 56170 (0.0007) [2023-10-10 06:59:45,828][53252] Updated weights for policy 0, policy_version 56180 (0.0009) [2023-10-10 06:59:46,199][53252] Updated weights for policy 0, policy_version 56190 (0.0009) [2023-10-10 06:59:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 115015680. Throughput: 0: 1682.4, 1: 1687.5. Samples: 28754050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:46,784][52050] Avg episode reward: [(0, '22.150'), (1, '23.550')] [2023-10-10 06:59:49,160][53268] Updated weights for policy 1, policy_version 56130 (0.0008) [2023-10-10 06:59:49,532][53268] Updated weights for policy 1, policy_version 56140 (0.0010) [2023-10-10 06:59:49,903][53268] Updated weights for policy 1, policy_version 56150 (0.0009) [2023-10-10 06:59:50,265][53268] Updated weights for policy 1, policy_version 56160 (0.0008) [2023-10-10 06:59:50,403][53252] Updated weights for policy 0, policy_version 56200 (0.0008) [2023-10-10 06:59:50,778][53252] Updated weights for policy 0, policy_version 56210 (0.0007) [2023-10-10 06:59:51,152][53252] Updated weights for policy 0, policy_version 56220 (0.0007) [2023-10-10 06:59:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 115081216. Throughput: 0: 1668.4, 1: 1664.8. Samples: 28773298. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:51,784][52050] Avg episode reward: [(0, '20.710'), (1, '22.000')] [2023-10-10 06:59:54,259][53268] Updated weights for policy 1, policy_version 56170 (0.0008) [2023-10-10 06:59:54,617][53268] Updated weights for policy 1, policy_version 56180 (0.0010) [2023-10-10 06:59:54,980][53268] Updated weights for policy 1, policy_version 56190 (0.0010) [2023-10-10 06:59:55,255][53252] Updated weights for policy 0, policy_version 56230 (0.0008) [2023-10-10 06:59:55,639][53252] Updated weights for policy 0, policy_version 56240 (0.0007) [2023-10-10 06:59:55,997][53252] Updated weights for policy 0, policy_version 56250 (0.0007) [2023-10-10 06:59:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 115146752. Throughput: 0: 1656.3, 1: 1679.8. Samples: 28792768. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 06:59:56,784][52050] Avg episode reward: [(0, '21.150'), (1, '19.830')] [2023-10-10 06:59:59,072][53268] Updated weights for policy 1, policy_version 56200 (0.0008) [2023-10-10 06:59:59,443][53268] Updated weights for policy 1, policy_version 56210 (0.0008) [2023-10-10 06:59:59,805][53268] Updated weights for policy 1, policy_version 56220 (0.0007) [2023-10-10 07:00:00,085][53252] Updated weights for policy 0, policy_version 56260 (0.0009) [2023-10-10 07:00:00,442][53252] Updated weights for policy 0, policy_version 56270 (0.0009) [2023-10-10 07:00:00,811][53252] Updated weights for policy 0, policy_version 56280 (0.0009) [2023-10-10 07:00:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 115212288. Throughput: 0: 1675.5, 1: 1684.9. Samples: 28804116. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:01,784][52050] Avg episode reward: [(0, '22.130'), (1, '20.940')] [2023-10-10 07:00:04,026][53268] Updated weights for policy 1, policy_version 56230 (0.0007) [2023-10-10 07:00:04,389][53268] Updated weights for policy 1, policy_version 56240 (0.0007) [2023-10-10 07:00:04,733][53252] Updated weights for policy 0, policy_version 56290 (0.0008) [2023-10-10 07:00:04,758][53268] Updated weights for policy 1, policy_version 56250 (0.0009) [2023-10-10 07:00:05,102][53252] Updated weights for policy 0, policy_version 56300 (0.0009) [2023-10-10 07:00:05,461][53252] Updated weights for policy 0, policy_version 56310 (0.0009) [2023-10-10 07:00:05,833][53252] Updated weights for policy 0, policy_version 56320 (0.0008) [2023-10-10 07:00:06,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115277824. Throughput: 0: 1667.5, 1: 1662.5. Samples: 28823162. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:06,785][52050] Avg episode reward: [(0, '21.450'), (1, '19.850')] [2023-10-10 07:00:08,981][53268] Updated weights for policy 1, policy_version 56260 (0.0009) [2023-10-10 07:00:09,346][53268] Updated weights for policy 1, policy_version 56270 (0.0009) [2023-10-10 07:00:09,715][53268] Updated weights for policy 1, policy_version 56280 (0.0008) [2023-10-10 07:00:10,043][53252] Updated weights for policy 0, policy_version 56330 (0.0009) [2023-10-10 07:00:10,410][53252] Updated weights for policy 0, policy_version 56340 (0.0009) [2023-10-10 07:00:10,779][53252] Updated weights for policy 0, policy_version 56350 (0.0007) [2023-10-10 07:00:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115343360. Throughput: 0: 1668.6, 1: 1684.5. Samples: 28843152. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:11,784][52050] Avg episode reward: [(0, '20.590'), (1, '21.800')] [2023-10-10 07:00:13,927][53268] Updated weights for policy 1, policy_version 56290 (0.0009) [2023-10-10 07:00:14,297][53268] Updated weights for policy 1, policy_version 56300 (0.0007) [2023-10-10 07:00:14,663][53268] Updated weights for policy 1, policy_version 56310 (0.0007) [2023-10-10 07:00:14,864][53252] Updated weights for policy 0, policy_version 56360 (0.0007) [2023-10-10 07:00:15,029][53268] Updated weights for policy 1, policy_version 56320 (0.0008) [2023-10-10 07:00:15,232][53252] Updated weights for policy 0, policy_version 56370 (0.0008) [2023-10-10 07:00:15,606][53252] Updated weights for policy 0, policy_version 56380 (0.0010) [2023-10-10 07:00:16,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115408896. Throughput: 0: 1681.0, 1: 1674.5. Samples: 28854460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:16,784][52050] Avg episode reward: [(0, '21.220'), (1, '22.260')] [2023-10-10 07:00:19,254][53268] Updated weights for policy 1, policy_version 56330 (0.0009) [2023-10-10 07:00:19,618][53252] Updated weights for policy 0, policy_version 56390 (0.0008) [2023-10-10 07:00:19,627][53268] Updated weights for policy 1, policy_version 56340 (0.0008) [2023-10-10 07:00:19,989][53252] Updated weights for policy 0, policy_version 56400 (0.0008) [2023-10-10 07:00:19,997][53268] Updated weights for policy 1, policy_version 56350 (0.0009) [2023-10-10 07:00:20,359][53252] Updated weights for policy 0, policy_version 56410 (0.0010) [2023-10-10 07:00:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 115474432. Throughput: 0: 1656.8, 1: 1664.6. Samples: 28872952. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:21,784][52050] Avg episode reward: [(0, '20.570'), (1, '21.630')] [2023-10-10 07:00:24,224][53268] Updated weights for policy 1, policy_version 56360 (0.0009) [2023-10-10 07:00:24,525][53252] Updated weights for policy 0, policy_version 56420 (0.0008) [2023-10-10 07:00:24,598][53268] Updated weights for policy 1, policy_version 56370 (0.0010) [2023-10-10 07:00:24,884][53252] Updated weights for policy 0, policy_version 56430 (0.0008) [2023-10-10 07:00:24,962][53268] Updated weights for policy 1, policy_version 56380 (0.0008) [2023-10-10 07:00:25,251][53252] Updated weights for policy 0, policy_version 56440 (0.0010) [2023-10-10 07:00:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 115539968. Throughput: 0: 1675.9, 1: 1672.4. Samples: 28893000. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:26,784][52050] Avg episode reward: [(0, '21.650'), (1, '19.370')] [2023-10-10 07:00:29,019][53268] Updated weights for policy 1, policy_version 56390 (0.0009) [2023-10-10 07:00:29,256][53252] Updated weights for policy 0, policy_version 56450 (0.0010) [2023-10-10 07:00:29,394][53268] Updated weights for policy 1, policy_version 56400 (0.0009) [2023-10-10 07:00:29,631][53252] Updated weights for policy 0, policy_version 56460 (0.0008) [2023-10-10 07:00:29,767][53268] Updated weights for policy 1, policy_version 56410 (0.0007) [2023-10-10 07:00:30,000][53252] Updated weights for policy 0, policy_version 56470 (0.0007) [2023-10-10 07:00:30,367][53252] Updated weights for policy 0, policy_version 56480 (0.0007) [2023-10-10 07:00:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115605504. Throughput: 0: 1669.6, 1: 1661.5. Samples: 28903950. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:31,784][52050] Avg episode reward: [(0, '23.020'), (1, '20.020')] [2023-10-10 07:00:33,924][53268] Updated weights for policy 1, policy_version 56420 (0.0008) [2023-10-10 07:00:34,283][53268] Updated weights for policy 1, policy_version 56430 (0.0008) [2023-10-10 07:00:34,393][53252] Updated weights for policy 0, policy_version 56490 (0.0009) [2023-10-10 07:00:34,654][53268] Updated weights for policy 1, policy_version 56440 (0.0008) [2023-10-10 07:00:34,764][53252] Updated weights for policy 0, policy_version 56500 (0.0007) [2023-10-10 07:00:35,144][53252] Updated weights for policy 0, policy_version 56510 (0.0008) [2023-10-10 07:00:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115671040. Throughput: 0: 1659.6, 1: 1659.9. Samples: 28922676. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:36,784][52050] Avg episode reward: [(0, '23.740'), (1, '18.550')] [2023-10-10 07:00:38,898][53268] Updated weights for policy 1, policy_version 56450 (0.0010) [2023-10-10 07:00:39,268][53268] Updated weights for policy 1, policy_version 56460 (0.0008) [2023-10-10 07:00:39,315][53252] Updated weights for policy 0, policy_version 56520 (0.0008) [2023-10-10 07:00:39,634][53268] Updated weights for policy 1, policy_version 56470 (0.0009) [2023-10-10 07:00:39,677][53252] Updated weights for policy 0, policy_version 56530 (0.0007) [2023-10-10 07:00:39,994][53268] Updated weights for policy 1, policy_version 56480 (0.0009) [2023-10-10 07:00:40,044][53252] Updated weights for policy 0, policy_version 56540 (0.0008) [2023-10-10 07:00:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115736576. Throughput: 0: 1681.9, 1: 1657.8. Samples: 28943054. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:41,784][52050] Avg episode reward: [(0, '22.080'), (1, '19.160')] [2023-10-10 07:00:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000056480_57835520.pth... [2023-10-10 07:00:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000056544_57901056.pth... [2023-10-10 07:00:41,825][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000054912_56229888.pth [2023-10-10 07:00:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000054976_56295424.pth [2023-10-10 07:00:43,977][53268] Updated weights for policy 1, policy_version 56490 (0.0009) [2023-10-10 07:00:44,307][53252] Updated weights for policy 0, policy_version 56550 (0.0009) [2023-10-10 07:00:44,351][53268] Updated weights for policy 1, policy_version 56500 (0.0008) [2023-10-10 07:00:44,694][53252] Updated weights for policy 0, policy_version 56560 (0.0008) [2023-10-10 07:00:44,715][53268] Updated weights for policy 1, policy_version 56510 (0.0009) [2023-10-10 07:00:45,068][53252] Updated weights for policy 0, policy_version 56570 (0.0009) [2023-10-10 07:00:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 115802112. Throughput: 0: 1672.7, 1: 1653.2. Samples: 28953778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:46,784][52050] Avg episode reward: [(0, '21.430'), (1, '20.940')] [2023-10-10 07:00:48,736][53268] Updated weights for policy 1, policy_version 56520 (0.0007) [2023-10-10 07:00:49,109][53252] Updated weights for policy 0, policy_version 56580 (0.0009) [2023-10-10 07:00:49,112][53268] Updated weights for policy 1, policy_version 56530 (0.0008) [2023-10-10 07:00:49,473][53252] Updated weights for policy 0, policy_version 56590 (0.0008) [2023-10-10 07:00:49,490][53268] Updated weights for policy 1, policy_version 56540 (0.0007) [2023-10-10 07:00:49,840][53252] Updated weights for policy 0, policy_version 56600 (0.0007) [2023-10-10 07:00:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115867648. Throughput: 0: 1662.3, 1: 1661.1. Samples: 28972712. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:51,784][52050] Avg episode reward: [(0, '21.350'), (1, '20.680')] [2023-10-10 07:00:53,679][53268] Updated weights for policy 1, policy_version 56550 (0.0009) [2023-10-10 07:00:53,891][53252] Updated weights for policy 0, policy_version 56610 (0.0008) [2023-10-10 07:00:54,037][53268] Updated weights for policy 1, policy_version 56560 (0.0008) [2023-10-10 07:00:54,258][53252] Updated weights for policy 0, policy_version 56620 (0.0007) [2023-10-10 07:00:54,410][53268] Updated weights for policy 1, policy_version 56570 (0.0008) [2023-10-10 07:00:54,626][53252] Updated weights for policy 0, policy_version 56630 (0.0009) [2023-10-10 07:00:54,994][53252] Updated weights for policy 0, policy_version 56640 (0.0007) [2023-10-10 07:00:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115933184. Throughput: 0: 1676.9, 1: 1659.5. Samples: 28993292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:00:56,784][52050] Avg episode reward: [(0, '21.030'), (1, '20.640')] [2023-10-10 07:00:58,502][53268] Updated weights for policy 1, policy_version 56580 (0.0009) [2023-10-10 07:00:58,876][53268] Updated weights for policy 1, policy_version 56590 (0.0009) [2023-10-10 07:00:59,179][53252] Updated weights for policy 0, policy_version 56650 (0.0008) [2023-10-10 07:00:59,240][53268] Updated weights for policy 1, policy_version 56600 (0.0008) [2023-10-10 07:00:59,554][53252] Updated weights for policy 0, policy_version 56660 (0.0009) [2023-10-10 07:00:59,925][53252] Updated weights for policy 0, policy_version 56670 (0.0010) [2023-10-10 07:01:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 115998720. Throughput: 0: 1657.0, 1: 1650.4. Samples: 29003292. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:01:01,784][52050] Avg episode reward: [(0, '21.150'), (1, '20.810')] [2023-10-10 07:01:03,323][53268] Updated weights for policy 1, policy_version 56610 (0.0008) [2023-10-10 07:01:03,688][53268] Updated weights for policy 1, policy_version 56620 (0.0010) [2023-10-10 07:01:03,804][53252] Updated weights for policy 0, policy_version 56680 (0.0008) [2023-10-10 07:01:04,062][53268] Updated weights for policy 1, policy_version 56630 (0.0007) [2023-10-10 07:01:04,185][53252] Updated weights for policy 0, policy_version 56690 (0.0009) [2023-10-10 07:01:04,428][53268] Updated weights for policy 1, policy_version 56640 (0.0007) [2023-10-10 07:01:04,557][53252] Updated weights for policy 0, policy_version 56700 (0.0010) [2023-10-10 07:01:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116064256. Throughput: 0: 1670.0, 1: 1667.6. Samples: 29023144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:01:06,784][52050] Avg episode reward: [(0, '23.420'), (1, '21.460')] [2023-10-10 07:01:08,242][53268] Updated weights for policy 1, policy_version 56650 (0.0008) [2023-10-10 07:01:08,601][53268] Updated weights for policy 1, policy_version 56660 (0.0010) [2023-10-10 07:01:08,601][53252] Updated weights for policy 0, policy_version 56710 (0.0009) [2023-10-10 07:01:08,970][53268] Updated weights for policy 1, policy_version 56670 (0.0009) [2023-10-10 07:01:08,975][53252] Updated weights for policy 0, policy_version 56720 (0.0007) [2023-10-10 07:01:09,348][53252] Updated weights for policy 0, policy_version 56730 (0.0008) [2023-10-10 07:01:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116129792. Throughput: 0: 1679.0, 1: 1676.3. Samples: 29043988. Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:11,784][52050] Avg episode reward: [(0, '23.270'), (1, '20.300')] [2023-10-10 07:01:12,994][53268] Updated weights for policy 1, policy_version 56680 (0.0009) [2023-10-10 07:01:13,361][53268] Updated weights for policy 1, policy_version 56690 (0.0010) [2023-10-10 07:01:13,451][53252] Updated weights for policy 0, policy_version 56740 (0.0008) [2023-10-10 07:01:13,734][53268] Updated weights for policy 1, policy_version 56700 (0.0008) [2023-10-10 07:01:13,809][53252] Updated weights for policy 0, policy_version 56750 (0.0008) [2023-10-10 07:01:14,185][53252] Updated weights for policy 0, policy_version 56760 (0.0009) [2023-10-10 07:01:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116195328. Throughput: 0: 1658.3, 1: 1658.1. Samples: 29053186. 
Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:16,784][52050] Avg episode reward: [(0, '23.070'), (1, '20.450')] [2023-10-10 07:01:17,836][53268] Updated weights for policy 1, policy_version 56710 (0.0009) [2023-10-10 07:01:18,212][53268] Updated weights for policy 1, policy_version 56720 (0.0009) [2023-10-10 07:01:18,262][53252] Updated weights for policy 0, policy_version 56770 (0.0009) [2023-10-10 07:01:18,582][53268] Updated weights for policy 1, policy_version 56730 (0.0010) [2023-10-10 07:01:18,626][53252] Updated weights for policy 0, policy_version 56780 (0.0008) [2023-10-10 07:01:18,999][53252] Updated weights for policy 0, policy_version 56790 (0.0009) [2023-10-10 07:01:19,366][53252] Updated weights for policy 0, policy_version 56800 (0.0009) [2023-10-10 07:01:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116260864. Throughput: 0: 1678.5, 1: 1680.2. Samples: 29073818. Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:21,784][52050] Avg episode reward: [(0, '22.080'), (1, '21.090')] [2023-10-10 07:01:22,758][53268] Updated weights for policy 1, policy_version 56740 (0.0009) [2023-10-10 07:01:23,121][53268] Updated weights for policy 1, policy_version 56750 (0.0008) [2023-10-10 07:01:23,480][53252] Updated weights for policy 0, policy_version 56810 (0.0008) [2023-10-10 07:01:23,489][53268] Updated weights for policy 1, policy_version 56760 (0.0010) [2023-10-10 07:01:23,850][53252] Updated weights for policy 0, policy_version 56820 (0.0008) [2023-10-10 07:01:24,227][53252] Updated weights for policy 0, policy_version 56830 (0.0008) [2023-10-10 07:01:26,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 116326400. Throughput: 0: 1678.7, 1: 1680.6. Samples: 29094222. 
Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:26,785][52050] Avg episode reward: [(0, '20.830'), (1, '22.280')] [2023-10-10 07:01:27,679][53268] Updated weights for policy 1, policy_version 56770 (0.0009) [2023-10-10 07:01:28,055][53268] Updated weights for policy 1, policy_version 56780 (0.0010) [2023-10-10 07:01:28,387][53252] Updated weights for policy 0, policy_version 56840 (0.0007) [2023-10-10 07:01:28,419][53268] Updated weights for policy 1, policy_version 56790 (0.0008) [2023-10-10 07:01:28,755][53252] Updated weights for policy 0, policy_version 56850 (0.0007) [2023-10-10 07:01:28,784][53268] Updated weights for policy 1, policy_version 56800 (0.0007) [2023-10-10 07:01:29,130][53252] Updated weights for policy 0, policy_version 56860 (0.0007) [2023-10-10 07:01:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116391936. Throughput: 0: 1659.6, 1: 1662.4. Samples: 29103268. Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:31,784][52050] Avg episode reward: [(0, '20.650'), (1, '22.180')] [2023-10-10 07:01:32,855][53268] Updated weights for policy 1, policy_version 56810 (0.0010) [2023-10-10 07:01:33,126][53252] Updated weights for policy 0, policy_version 56870 (0.0007) [2023-10-10 07:01:33,225][53268] Updated weights for policy 1, policy_version 56820 (0.0008) [2023-10-10 07:01:33,506][53252] Updated weights for policy 0, policy_version 56880 (0.0007) [2023-10-10 07:01:33,595][53268] Updated weights for policy 1, policy_version 56830 (0.0010) [2023-10-10 07:01:33,883][53252] Updated weights for policy 0, policy_version 56890 (0.0010) [2023-10-10 07:01:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116457472. Throughput: 0: 1679.0, 1: 1679.2. Samples: 29123830. 
Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:36,784][52050] Avg episode reward: [(0, '21.420'), (1, '22.270')] [2023-10-10 07:01:37,597][53268] Updated weights for policy 1, policy_version 56840 (0.0010) [2023-10-10 07:01:37,962][53268] Updated weights for policy 1, policy_version 56850 (0.0010) [2023-10-10 07:01:38,094][53252] Updated weights for policy 0, policy_version 56900 (0.0010) [2023-10-10 07:01:38,334][53268] Updated weights for policy 1, policy_version 56860 (0.0007) [2023-10-10 07:01:38,466][53252] Updated weights for policy 0, policy_version 56910 (0.0007) [2023-10-10 07:01:38,846][53252] Updated weights for policy 0, policy_version 56920 (0.0009) [2023-10-10 07:01:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116523008. Throughput: 0: 1677.7, 1: 1687.2. Samples: 29144710. Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:41,784][52050] Avg episode reward: [(0, '19.750'), (1, '22.130')] [2023-10-10 07:01:42,481][53268] Updated weights for policy 1, policy_version 56870 (0.0009) [2023-10-10 07:01:42,753][53252] Updated weights for policy 0, policy_version 56930 (0.0008) [2023-10-10 07:01:42,836][53268] Updated weights for policy 1, policy_version 56880 (0.0009) [2023-10-10 07:01:43,119][53252] Updated weights for policy 0, policy_version 56940 (0.0007) [2023-10-10 07:01:43,200][53268] Updated weights for policy 1, policy_version 56890 (0.0009) [2023-10-10 07:01:43,490][53252] Updated weights for policy 0, policy_version 56950 (0.0008) [2023-10-10 07:01:43,863][53252] Updated weights for policy 0, policy_version 56960 (0.0008) [2023-10-10 07:01:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116588544. Throughput: 0: 1670.0, 1: 1677.9. Samples: 29153948. 
Policy #0 lag: (min: 15.0, avg: 18.4, max: 47.0) [2023-10-10 07:01:46,784][52050] Avg episode reward: [(0, '21.210'), (1, '23.910')] [2023-10-10 07:01:47,243][53268] Updated weights for policy 1, policy_version 56900 (0.0009) [2023-10-10 07:01:47,608][53268] Updated weights for policy 1, policy_version 56910 (0.0008) [2023-10-10 07:01:47,867][53252] Updated weights for policy 0, policy_version 56970 (0.0009) [2023-10-10 07:01:47,978][53268] Updated weights for policy 1, policy_version 56920 (0.0008) [2023-10-10 07:01:48,229][53252] Updated weights for policy 0, policy_version 56980 (0.0007) [2023-10-10 07:01:48,601][53252] Updated weights for policy 0, policy_version 56990 (0.0009) [2023-10-10 07:01:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116654080. Throughput: 0: 1682.2, 1: 1682.7. Samples: 29174564. Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:01:51,785][52050] Avg episode reward: [(0, '21.350'), (1, '21.380')] [2023-10-10 07:01:52,129][53268] Updated weights for policy 1, policy_version 56930 (0.0008) [2023-10-10 07:01:52,499][53268] Updated weights for policy 1, policy_version 56940 (0.0009) [2023-10-10 07:01:52,763][53252] Updated weights for policy 0, policy_version 57000 (0.0008) [2023-10-10 07:01:52,865][53268] Updated weights for policy 1, policy_version 56950 (0.0009) [2023-10-10 07:01:53,131][53252] Updated weights for policy 0, policy_version 57010 (0.0007) [2023-10-10 07:01:53,224][53268] Updated weights for policy 1, policy_version 56960 (0.0008) [2023-10-10 07:01:53,505][53252] Updated weights for policy 0, policy_version 57020 (0.0008) [2023-10-10 07:01:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116719616. Throughput: 0: 1682.0, 1: 1678.6. Samples: 29195214. 
Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:01:56,784][52050] Avg episode reward: [(0, '19.430'), (1, '20.520')] [2023-10-10 07:01:57,326][53268] Updated weights for policy 1, policy_version 56970 (0.0008) [2023-10-10 07:01:57,442][53252] Updated weights for policy 0, policy_version 57030 (0.0008) [2023-10-10 07:01:57,696][53268] Updated weights for policy 1, policy_version 56980 (0.0009) [2023-10-10 07:01:57,811][53252] Updated weights for policy 0, policy_version 57040 (0.0008) [2023-10-10 07:01:58,062][53268] Updated weights for policy 1, policy_version 56990 (0.0008) [2023-10-10 07:01:58,178][53252] Updated weights for policy 0, policy_version 57050 (0.0008) [2023-10-10 07:02:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116785152. Throughput: 0: 1680.1, 1: 1680.9. Samples: 29204432. Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:02:01,784][52050] Avg episode reward: [(0, '20.930'), (1, '18.450')] [2023-10-10 07:02:01,997][53268] Updated weights for policy 1, policy_version 57000 (0.0009) [2023-10-10 07:02:02,297][53252] Updated weights for policy 0, policy_version 57060 (0.0008) [2023-10-10 07:02:02,382][53268] Updated weights for policy 1, policy_version 57010 (0.0007) [2023-10-10 07:02:02,670][53252] Updated weights for policy 0, policy_version 57070 (0.0007) [2023-10-10 07:02:02,746][53268] Updated weights for policy 1, policy_version 57020 (0.0007) [2023-10-10 07:02:03,043][53252] Updated weights for policy 0, policy_version 57080 (0.0010) [2023-10-10 07:02:06,615][53268] Updated weights for policy 1, policy_version 57030 (0.0007) [2023-10-10 07:02:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116850688. Throughput: 0: 1680.1, 1: 1682.2. Samples: 29225122. 
Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:02:06,784][52050] Avg episode reward: [(0, '20.260'), (1, '17.700')] [2023-10-10 07:02:06,982][53268] Updated weights for policy 1, policy_version 57040 (0.0007) [2023-10-10 07:02:07,057][53252] Updated weights for policy 0, policy_version 57090 (0.0011) [2023-10-10 07:02:07,353][53268] Updated weights for policy 1, policy_version 57050 (0.0008) [2023-10-10 07:02:07,427][53252] Updated weights for policy 0, policy_version 57100 (0.0008) [2023-10-10 07:02:07,804][53252] Updated weights for policy 0, policy_version 57110 (0.0008) [2023-10-10 07:02:08,166][53252] Updated weights for policy 0, policy_version 57120 (0.0007) [2023-10-10 07:02:11,489][53268] Updated weights for policy 1, policy_version 57060 (0.0010) [2023-10-10 07:02:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116916224. Throughput: 0: 1683.8, 1: 1693.9. Samples: 29246220. Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:02:11,784][52050] Avg episode reward: [(0, '19.380'), (1, '18.630')] [2023-10-10 07:02:11,866][53268] Updated weights for policy 1, policy_version 57070 (0.0009) [2023-10-10 07:02:12,228][53268] Updated weights for policy 1, policy_version 57080 (0.0009) [2023-10-10 07:02:12,307][53252] Updated weights for policy 0, policy_version 57130 (0.0009) [2023-10-10 07:02:12,673][53252] Updated weights for policy 0, policy_version 57140 (0.0009) [2023-10-10 07:02:13,040][53252] Updated weights for policy 0, policy_version 57150 (0.0011) [2023-10-10 07:02:16,406][53268] Updated weights for policy 1, policy_version 57090 (0.0010) [2023-10-10 07:02:16,780][53268] Updated weights for policy 1, policy_version 57100 (0.0008) [2023-10-10 07:02:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 116981760. Throughput: 0: 1682.6, 1: 1695.6. Samples: 29255288. 
Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:02:16,784][52050] Avg episode reward: [(0, '20.130'), (1, '19.890')] [2023-10-10 07:02:17,130][53252] Updated weights for policy 0, policy_version 57160 (0.0008) [2023-10-10 07:02:17,136][53268] Updated weights for policy 1, policy_version 57110 (0.0008) [2023-10-10 07:02:17,502][53268] Updated weights for policy 1, policy_version 57120 (0.0009) [2023-10-10 07:02:17,503][53252] Updated weights for policy 0, policy_version 57170 (0.0008) [2023-10-10 07:02:17,877][53252] Updated weights for policy 0, policy_version 57180 (0.0008) [2023-10-10 07:02:21,582][53268] Updated weights for policy 1, policy_version 57130 (0.0009) [2023-10-10 07:02:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 117047296. Throughput: 0: 1688.0, 1: 1688.0. Samples: 29275748. Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:02:21,784][52050] Avg episode reward: [(0, '19.480'), (1, '19.740')] [2023-10-10 07:02:21,947][53268] Updated weights for policy 1, policy_version 57140 (0.0009) [2023-10-10 07:02:22,028][53252] Updated weights for policy 0, policy_version 57190 (0.0008) [2023-10-10 07:02:22,313][53268] Updated weights for policy 1, policy_version 57150 (0.0009) [2023-10-10 07:02:22,412][53252] Updated weights for policy 0, policy_version 57200 (0.0007) [2023-10-10 07:02:22,788][53252] Updated weights for policy 0, policy_version 57210 (0.0009) [2023-10-10 07:02:26,258][53268] Updated weights for policy 1, policy_version 57160 (0.0009) [2023-10-10 07:02:26,607][53252] Updated weights for policy 0, policy_version 57220 (0.0008) [2023-10-10 07:02:26,628][53268] Updated weights for policy 1, policy_version 57170 (0.0008) [2023-10-10 07:02:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 117112832. Throughput: 0: 1686.8, 1: 1679.2. Samples: 29296182. 
Policy #0 lag: (min: 5.0, avg: 5.1, max: 12.0) [2023-10-10 07:02:26,784][52050] Avg episode reward: [(0, '20.110'), (1, '22.010')] [2023-10-10 07:02:26,985][53252] Updated weights for policy 0, policy_version 57230 (0.0008) [2023-10-10 07:02:26,992][53268] Updated weights for policy 1, policy_version 57180 (0.0008) [2023-10-10 07:02:27,359][53252] Updated weights for policy 0, policy_version 57240 (0.0007) [2023-10-10 07:02:31,064][53268] Updated weights for policy 1, policy_version 57190 (0.0009) [2023-10-10 07:02:31,433][53268] Updated weights for policy 1, policy_version 57200 (0.0010) [2023-10-10 07:02:31,475][53252] Updated weights for policy 0, policy_version 57250 (0.0007) [2023-10-10 07:02:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 117178368. Throughput: 0: 1686.8, 1: 1684.3. Samples: 29305644. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:02:31,784][52050] Avg episode reward: [(0, '19.560'), (1, '21.890')] [2023-10-10 07:02:31,799][53268] Updated weights for policy 1, policy_version 57210 (0.0008) [2023-10-10 07:02:31,850][53252] Updated weights for policy 0, policy_version 57260 (0.0007) [2023-10-10 07:02:32,227][53252] Updated weights for policy 0, policy_version 57270 (0.0007) [2023-10-10 07:02:32,601][53252] Updated weights for policy 0, policy_version 57280 (0.0007) [2023-10-10 07:02:35,987][53268] Updated weights for policy 1, policy_version 57220 (0.0008) [2023-10-10 07:02:36,358][53268] Updated weights for policy 1, policy_version 57230 (0.0007) [2023-10-10 07:02:36,563][53252] Updated weights for policy 0, policy_version 57290 (0.0009) [2023-10-10 07:02:36,719][53268] Updated weights for policy 1, policy_version 57240 (0.0007) [2023-10-10 07:02:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 117243904. Throughput: 0: 1689.1, 1: 1680.5. Samples: 29326194. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:02:36,784][52050] Avg episode reward: [(0, '20.200'), (1, '20.010')] [2023-10-10 07:02:36,927][53252] Updated weights for policy 0, policy_version 57300 (0.0007) [2023-10-10 07:02:37,297][53252] Updated weights for policy 0, policy_version 57310 (0.0008) [2023-10-10 07:02:40,887][53268] Updated weights for policy 1, policy_version 57250 (0.0008) [2023-10-10 07:02:41,256][53268] Updated weights for policy 1, policy_version 57260 (0.0009) [2023-10-10 07:02:41,563][53252] Updated weights for policy 0, policy_version 57320 (0.0009) [2023-10-10 07:02:41,619][53268] Updated weights for policy 1, policy_version 57270 (0.0009) [2023-10-10 07:02:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 117309440. Throughput: 0: 1683.5, 1: 1675.7. Samples: 29346378. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:02:41,784][52050] Avg episode reward: [(0, '21.230'), (1, '18.910')] [2023-10-10 07:02:41,929][53252] Updated weights for policy 0, policy_version 57330 (0.0008) [2023-10-10 07:02:41,986][53268] Updated weights for policy 1, policy_version 57280 (0.0009) [2023-10-10 07:02:41,986][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000057280_58654720.pth... [2023-10-10 07:02:42,015][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000055712_57049088.pth [2023-10-10 07:02:42,303][53252] Updated weights for policy 0, policy_version 57340 (0.0010) [2023-10-10 07:02:42,450][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000057344_58720256.pth... 
[2023-10-10 07:02:42,479][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000055776_57114624.pth [2023-10-10 07:02:45,839][53268] Updated weights for policy 1, policy_version 57290 (0.0007) [2023-10-10 07:02:46,201][53268] Updated weights for policy 1, policy_version 57300 (0.0007) [2023-10-10 07:02:46,326][53252] Updated weights for policy 0, policy_version 57350 (0.0007) [2023-10-10 07:02:46,566][53268] Updated weights for policy 1, policy_version 57310 (0.0007) [2023-10-10 07:02:46,690][53252] Updated weights for policy 0, policy_version 57360 (0.0007) [2023-10-10 07:02:46,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117407744. Throughput: 0: 1687.6, 1: 1683.7. Samples: 29356140. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:02:46,784][52050] Avg episode reward: [(0, '21.000'), (1, '19.250')] [2023-10-10 07:02:47,070][53252] Updated weights for policy 0, policy_version 57370 (0.0008) [2023-10-10 07:02:50,698][53268] Updated weights for policy 1, policy_version 57320 (0.0010) [2023-10-10 07:02:51,084][53268] Updated weights for policy 1, policy_version 57330 (0.0010) [2023-10-10 07:02:51,163][53252] Updated weights for policy 0, policy_version 57380 (0.0009) [2023-10-10 07:02:51,440][53268] Updated weights for policy 1, policy_version 57340 (0.0008) [2023-10-10 07:02:51,536][53252] Updated weights for policy 0, policy_version 57390 (0.0008) [2023-10-10 07:02:51,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117473280. Throughput: 0: 1688.7, 1: 1684.7. Samples: 29376924. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:02:51,784][52050] Avg episode reward: [(0, '20.580'), (1, '18.620')] [2023-10-10 07:02:51,897][53252] Updated weights for policy 0, policy_version 57400 (0.0008) [2023-10-10 07:02:55,538][53268] Updated weights for policy 1, policy_version 57350 (0.0008) [2023-10-10 07:02:55,850][53252] Updated weights for policy 0, policy_version 57410 (0.0009) [2023-10-10 07:02:55,895][53268] Updated weights for policy 1, policy_version 57360 (0.0007) [2023-10-10 07:02:56,214][53252] Updated weights for policy 0, policy_version 57420 (0.0008) [2023-10-10 07:02:56,267][53268] Updated weights for policy 1, policy_version 57370 (0.0007) [2023-10-10 07:02:56,583][53252] Updated weights for policy 0, policy_version 57430 (0.0008) [2023-10-10 07:02:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 117538816. Throughput: 0: 1675.9, 1: 1653.9. Samples: 29396058. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:02:56,784][52050] Avg episode reward: [(0, '21.330'), (1, '19.780')] [2023-10-10 07:02:56,959][53252] Updated weights for policy 0, policy_version 57440 (0.0008) [2023-10-10 07:03:00,445][53268] Updated weights for policy 1, policy_version 57380 (0.0008) [2023-10-10 07:03:00,802][53268] Updated weights for policy 1, policy_version 57390 (0.0007) [2023-10-10 07:03:01,025][53252] Updated weights for policy 0, policy_version 57450 (0.0007) [2023-10-10 07:03:01,170][53268] Updated weights for policy 1, policy_version 57400 (0.0009) [2023-10-10 07:03:01,398][53252] Updated weights for policy 0, policy_version 57460 (0.0008) [2023-10-10 07:03:01,774][53252] Updated weights for policy 0, policy_version 57470 (0.0007) [2023-10-10 07:03:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117604352. Throughput: 0: 1691.8, 1: 1670.1. Samples: 29406574. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:03:01,784][52050] Avg episode reward: [(0, '21.450'), (1, '20.220')] [2023-10-10 07:03:05,408][53268] Updated weights for policy 1, policy_version 57410 (0.0009) [2023-10-10 07:03:05,754][53252] Updated weights for policy 0, policy_version 57480 (0.0008) [2023-10-10 07:03:05,775][53268] Updated weights for policy 1, policy_version 57420 (0.0007) [2023-10-10 07:03:06,111][53252] Updated weights for policy 0, policy_version 57490 (0.0008) [2023-10-10 07:03:06,134][53268] Updated weights for policy 1, policy_version 57430 (0.0008) [2023-10-10 07:03:06,480][53252] Updated weights for policy 0, policy_version 57500 (0.0007) [2023-10-10 07:03:06,495][53268] Updated weights for policy 1, policy_version 57440 (0.0009) [2023-10-10 07:03:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 117702656. Throughput: 0: 1698.7, 1: 1677.8. Samples: 29427692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:06,784][52050] Avg episode reward: [(0, '21.950'), (1, '20.470')] [2023-10-10 07:03:10,328][53268] Updated weights for policy 1, policy_version 57450 (0.0011) [2023-10-10 07:03:10,691][53268] Updated weights for policy 1, policy_version 57460 (0.0008) [2023-10-10 07:03:10,845][53252] Updated weights for policy 0, policy_version 57510 (0.0010) [2023-10-10 07:03:11,055][53268] Updated weights for policy 1, policy_version 57470 (0.0008) [2023-10-10 07:03:11,229][53252] Updated weights for policy 0, policy_version 57520 (0.0009) [2023-10-10 07:03:11,593][53252] Updated weights for policy 0, policy_version 57530 (0.0009) [2023-10-10 07:03:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117735424. Throughput: 0: 1673.9, 1: 1657.6. Samples: 29446098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:11,784][52050] Avg episode reward: [(0, '21.450'), (1, '19.290')] [2023-10-10 07:03:15,079][53268] Updated weights for policy 1, policy_version 57480 (0.0010) [2023-10-10 07:03:15,444][53268] Updated weights for policy 1, policy_version 57490 (0.0010) [2023-10-10 07:03:15,587][53252] Updated weights for policy 0, policy_version 57540 (0.0008) [2023-10-10 07:03:15,814][53268] Updated weights for policy 1, policy_version 57500 (0.0008) [2023-10-10 07:03:15,962][53252] Updated weights for policy 0, policy_version 57550 (0.0008) [2023-10-10 07:03:16,329][53252] Updated weights for policy 0, policy_version 57560 (0.0008) [2023-10-10 07:03:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 117833728. Throughput: 0: 1687.8, 1: 1679.1. Samples: 29457152. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:16,784][52050] Avg episode reward: [(0, '20.930'), (1, '18.610')] [2023-10-10 07:03:19,999][53268] Updated weights for policy 1, policy_version 57510 (0.0008) [2023-10-10 07:03:20,219][53252] Updated weights for policy 0, policy_version 57570 (0.0007) [2023-10-10 07:03:20,365][53268] Updated weights for policy 1, policy_version 57520 (0.0008) [2023-10-10 07:03:20,587][53252] Updated weights for policy 0, policy_version 57580 (0.0009) [2023-10-10 07:03:20,727][53268] Updated weights for policy 1, policy_version 57530 (0.0008) [2023-10-10 07:03:20,952][53252] Updated weights for policy 0, policy_version 57590 (0.0008) [2023-10-10 07:03:21,322][53252] Updated weights for policy 0, policy_version 57600 (0.0008) [2023-10-10 07:03:21,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 117899264. Throughput: 0: 1681.1, 1: 1675.6. Samples: 29477246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:21,785][52050] Avg episode reward: [(0, '20.610'), (1, '19.870')] [2023-10-10 07:03:24,681][53268] Updated weights for policy 1, policy_version 57540 (0.0008) [2023-10-10 07:03:25,055][53268] Updated weights for policy 1, policy_version 57550 (0.0011) [2023-10-10 07:03:25,418][53268] Updated weights for policy 1, policy_version 57560 (0.0009) [2023-10-10 07:03:25,496][53252] Updated weights for policy 0, policy_version 57610 (0.0009) [2023-10-10 07:03:25,862][53252] Updated weights for policy 0, policy_version 57620 (0.0007) [2023-10-10 07:03:26,227][53252] Updated weights for policy 0, policy_version 57630 (0.0007) [2023-10-10 07:03:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 117964800. Throughput: 0: 1661.4, 1: 1669.4. Samples: 29496264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:26,784][52050] Avg episode reward: [(0, '20.180'), (1, '19.730')] [2023-10-10 07:03:29,434][53268] Updated weights for policy 1, policy_version 57570 (0.0010) [2023-10-10 07:03:29,801][53268] Updated weights for policy 1, policy_version 57580 (0.0010) [2023-10-10 07:03:30,169][53268] Updated weights for policy 1, policy_version 57590 (0.0008) [2023-10-10 07:03:30,317][53252] Updated weights for policy 0, policy_version 57640 (0.0010) [2023-10-10 07:03:30,539][53268] Updated weights for policy 1, policy_version 57600 (0.0009) [2023-10-10 07:03:30,680][53252] Updated weights for policy 0, policy_version 57650 (0.0007) [2023-10-10 07:03:31,056][53252] Updated weights for policy 0, policy_version 57660 (0.0009) [2023-10-10 07:03:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 118030336. Throughput: 0: 1684.5, 1: 1686.3. Samples: 29507828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:31,784][52050] Avg episode reward: [(0, '20.410'), (1, '18.660')] [2023-10-10 07:03:34,584][53268] Updated weights for policy 1, policy_version 57610 (0.0010) [2023-10-10 07:03:34,954][53268] Updated weights for policy 1, policy_version 57620 (0.0008) [2023-10-10 07:03:35,296][53252] Updated weights for policy 0, policy_version 57670 (0.0007) [2023-10-10 07:03:35,308][53268] Updated weights for policy 1, policy_version 57630 (0.0007) [2023-10-10 07:03:35,675][53252] Updated weights for policy 0, policy_version 57680 (0.0007) [2023-10-10 07:03:36,043][53252] Updated weights for policy 0, policy_version 57690 (0.0010) [2023-10-10 07:03:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 118095872. Throughput: 0: 1670.0, 1: 1666.0. Samples: 29527048. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:36,784][52050] Avg episode reward: [(0, '21.590'), (1, '19.820')] [2023-10-10 07:03:39,479][53268] Updated weights for policy 1, policy_version 57640 (0.0009) [2023-10-10 07:03:39,857][53268] Updated weights for policy 1, policy_version 57650 (0.0009) [2023-10-10 07:03:40,036][53252] Updated weights for policy 0, policy_version 57700 (0.0009) [2023-10-10 07:03:40,222][53268] Updated weights for policy 1, policy_version 57660 (0.0009) [2023-10-10 07:03:40,400][53252] Updated weights for policy 0, policy_version 57710 (0.0009) [2023-10-10 07:03:40,785][53252] Updated weights for policy 0, policy_version 57720 (0.0007) [2023-10-10 07:03:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 118161408. Throughput: 0: 1659.2, 1: 1684.5. Samples: 29546524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:41,784][52050] Avg episode reward: [(0, '20.800'), (1, '22.060')] [2023-10-10 07:03:44,296][53268] Updated weights for policy 1, policy_version 57670 (0.0008) [2023-10-10 07:03:44,664][53268] Updated weights for policy 1, policy_version 57680 (0.0010) [2023-10-10 07:03:44,770][53252] Updated weights for policy 0, policy_version 57730 (0.0008) [2023-10-10 07:03:45,026][53268] Updated weights for policy 1, policy_version 57690 (0.0010) [2023-10-10 07:03:45,141][53252] Updated weights for policy 0, policy_version 57740 (0.0009) [2023-10-10 07:03:45,501][53252] Updated weights for policy 0, policy_version 57750 (0.0009) [2023-10-10 07:03:45,881][53252] Updated weights for policy 0, policy_version 57760 (0.0010) [2023-10-10 07:03:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 118226944. Throughput: 0: 1673.4, 1: 1696.8. Samples: 29558230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:46,784][52050] Avg episode reward: [(0, '21.180'), (1, '20.240')] [2023-10-10 07:03:49,140][53268] Updated weights for policy 1, policy_version 57700 (0.0008) [2023-10-10 07:03:49,516][53268] Updated weights for policy 1, policy_version 57710 (0.0009) [2023-10-10 07:03:49,886][53268] Updated weights for policy 1, policy_version 57720 (0.0008) [2023-10-10 07:03:50,186][53252] Updated weights for policy 0, policy_version 57770 (0.0008) [2023-10-10 07:03:50,555][53252] Updated weights for policy 0, policy_version 57780 (0.0007) [2023-10-10 07:03:50,923][53252] Updated weights for policy 0, policy_version 57790 (0.0009) [2023-10-10 07:03:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 118292480. Throughput: 0: 1647.3, 1: 1667.3. Samples: 29576852. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:51,784][52050] Avg episode reward: [(0, '22.100'), (1, '21.490')] [2023-10-10 07:03:53,755][53268] Updated weights for policy 1, policy_version 57730 (0.0009) [2023-10-10 07:03:54,127][53268] Updated weights for policy 1, policy_version 57740 (0.0010) [2023-10-10 07:03:54,491][53268] Updated weights for policy 1, policy_version 57750 (0.0010) [2023-10-10 07:03:54,855][53268] Updated weights for policy 1, policy_version 57760 (0.0010) [2023-10-10 07:03:55,041][53252] Updated weights for policy 0, policy_version 57800 (0.0009) [2023-10-10 07:03:55,407][53252] Updated weights for policy 0, policy_version 57810 (0.0011) [2023-10-10 07:03:55,776][53252] Updated weights for policy 0, policy_version 57820 (0.0007) [2023-10-10 07:03:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 118358016. Throughput: 0: 1656.0, 1: 1692.0. Samples: 29596756. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:03:56,784][52050] Avg episode reward: [(0, '19.920'), (1, '22.020')] [2023-10-10 07:03:58,901][53268] Updated weights for policy 1, policy_version 57770 (0.0011) [2023-10-10 07:03:59,269][53268] Updated weights for policy 1, policy_version 57780 (0.0011) [2023-10-10 07:03:59,644][53268] Updated weights for policy 1, policy_version 57790 (0.0009) [2023-10-10 07:03:59,813][53252] Updated weights for policy 0, policy_version 57830 (0.0008) [2023-10-10 07:04:00,182][53252] Updated weights for policy 0, policy_version 57840 (0.0009) [2023-10-10 07:04:00,559][53252] Updated weights for policy 0, policy_version 57850 (0.0008) [2023-10-10 07:04:01,784][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 118423552. Throughput: 0: 1667.7, 1: 1682.8. Samples: 29607926. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:04:01,785][52050] Avg episode reward: [(0, '19.750'), (1, '22.590')] [2023-10-10 07:04:03,708][53268] Updated weights for policy 1, policy_version 57800 (0.0007) [2023-10-10 07:04:04,080][53268] Updated weights for policy 1, policy_version 57810 (0.0009) [2023-10-10 07:04:04,449][53268] Updated weights for policy 1, policy_version 57820 (0.0010) [2023-10-10 07:04:04,639][53252] Updated weights for policy 0, policy_version 57860 (0.0007) [2023-10-10 07:04:04,999][53252] Updated weights for policy 0, policy_version 57870 (0.0007) [2023-10-10 07:04:05,372][53252] Updated weights for policy 0, policy_version 57880 (0.0009) [2023-10-10 07:04:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118489088. Throughput: 0: 1650.9, 1: 1682.0. Samples: 29627224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:04:06,784][52050] Avg episode reward: [(0, '20.120'), (1, '22.320')] [2023-10-10 07:04:08,443][53268] Updated weights for policy 1, policy_version 57830 (0.0009) [2023-10-10 07:04:08,812][53268] Updated weights for policy 1, policy_version 57840 (0.0008) [2023-10-10 07:04:09,178][53268] Updated weights for policy 1, policy_version 57850 (0.0008) [2023-10-10 07:04:09,563][53252] Updated weights for policy 0, policy_version 57890 (0.0011) [2023-10-10 07:04:09,923][53252] Updated weights for policy 0, policy_version 57900 (0.0008) [2023-10-10 07:04:10,302][53252] Updated weights for policy 0, policy_version 57910 (0.0009) [2023-10-10 07:04:10,672][53252] Updated weights for policy 0, policy_version 57920 (0.0008) [2023-10-10 07:04:11,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 118554624. Throughput: 0: 1664.0, 1: 1700.9. Samples: 29647684. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:04:11,784][52050] Avg episode reward: [(0, '19.950'), (1, '21.670')] [2023-10-10 07:04:13,189][53268] Updated weights for policy 1, policy_version 57860 (0.0010) [2023-10-10 07:04:13,555][53268] Updated weights for policy 1, policy_version 57870 (0.0008) [2023-10-10 07:04:13,914][53268] Updated weights for policy 1, policy_version 57880 (0.0009) [2023-10-10 07:04:14,643][53252] Updated weights for policy 0, policy_version 57930 (0.0008) [2023-10-10 07:04:15,011][53252] Updated weights for policy 0, policy_version 57940 (0.0009) [2023-10-10 07:04:15,391][53252] Updated weights for policy 0, policy_version 57950 (0.0007) [2023-10-10 07:04:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118620160. Throughput: 0: 1668.8, 1: 1676.5. Samples: 29658368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:04:16,784][52050] Avg episode reward: [(0, '22.000'), (1, '22.160')] [2023-10-10 07:04:18,016][53268] Updated weights for policy 1, policy_version 57890 (0.0009) [2023-10-10 07:04:18,393][53268] Updated weights for policy 1, policy_version 57900 (0.0009) [2023-10-10 07:04:18,749][53268] Updated weights for policy 1, policy_version 57910 (0.0008) [2023-10-10 07:04:19,124][53268] Updated weights for policy 1, policy_version 57920 (0.0007) [2023-10-10 07:04:19,437][53252] Updated weights for policy 0, policy_version 57960 (0.0009) [2023-10-10 07:04:19,814][53252] Updated weights for policy 0, policy_version 57970 (0.0011) [2023-10-10 07:04:20,194][53252] Updated weights for policy 0, policy_version 57980 (0.0009) [2023-10-10 07:04:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118685696. Throughput: 0: 1665.6, 1: 1692.9. Samples: 29678182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:04:21,784][52050] Avg episode reward: [(0, '22.820'), (1, '21.160')] [2023-10-10 07:04:23,138][53268] Updated weights for policy 1, policy_version 57930 (0.0009) [2023-10-10 07:04:23,503][53268] Updated weights for policy 1, policy_version 57940 (0.0009) [2023-10-10 07:04:23,861][53268] Updated weights for policy 1, policy_version 57950 (0.0010) [2023-10-10 07:04:24,221][53252] Updated weights for policy 0, policy_version 57990 (0.0008) [2023-10-10 07:04:24,591][53252] Updated weights for policy 0, policy_version 58000 (0.0010) [2023-10-10 07:04:24,961][53252] Updated weights for policy 0, policy_version 58010 (0.0008) [2023-10-10 07:04:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 118751232. Throughput: 0: 1688.6, 1: 1697.2. Samples: 29698886. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:26,784][52050] Avg episode reward: [(0, '22.390'), (1, '21.900')] [2023-10-10 07:04:28,037][53268] Updated weights for policy 1, policy_version 57960 (0.0009) [2023-10-10 07:04:28,410][53268] Updated weights for policy 1, policy_version 57970 (0.0007) [2023-10-10 07:04:28,773][53268] Updated weights for policy 1, policy_version 57980 (0.0007) [2023-10-10 07:04:28,990][53252] Updated weights for policy 0, policy_version 58020 (0.0010) [2023-10-10 07:04:29,362][53252] Updated weights for policy 0, policy_version 58030 (0.0007) [2023-10-10 07:04:29,733][53252] Updated weights for policy 0, policy_version 58040 (0.0009) [2023-10-10 07:04:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118816768. Throughput: 0: 1677.3, 1: 1669.4. Samples: 29708832. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:31,784][52050] Avg episode reward: [(0, '21.570'), (1, '20.270')] [2023-10-10 07:04:32,807][53268] Updated weights for policy 1, policy_version 57990 (0.0010) [2023-10-10 07:04:33,166][53268] Updated weights for policy 1, policy_version 58000 (0.0009) [2023-10-10 07:04:33,530][53268] Updated weights for policy 1, policy_version 58010 (0.0009) [2023-10-10 07:04:33,707][53252] Updated weights for policy 0, policy_version 58050 (0.0008) [2023-10-10 07:04:34,082][53252] Updated weights for policy 0, policy_version 58060 (0.0007) [2023-10-10 07:04:34,454][53252] Updated weights for policy 0, policy_version 58070 (0.0007) [2023-10-10 07:04:34,826][53252] Updated weights for policy 0, policy_version 58080 (0.0007) [2023-10-10 07:04:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 118882304. Throughput: 0: 1681.7, 1: 1697.1. Samples: 29728898. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:36,784][52050] Avg episode reward: [(0, '22.330'), (1, '22.880')] [2023-10-10 07:04:37,549][53268] Updated weights for policy 1, policy_version 58020 (0.0009) [2023-10-10 07:04:37,918][53268] Updated weights for policy 1, policy_version 58030 (0.0008) [2023-10-10 07:04:38,283][53268] Updated weights for policy 1, policy_version 58040 (0.0008) [2023-10-10 07:04:38,601][53252] Updated weights for policy 0, policy_version 58090 (0.0010) [2023-10-10 07:04:38,970][53252] Updated weights for policy 0, policy_version 58100 (0.0010) [2023-10-10 07:04:39,350][53252] Updated weights for policy 0, policy_version 58110 (0.0008) [2023-10-10 07:04:41,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 118947840. Throughput: 0: 1702.5, 1: 1698.7. Samples: 29749810. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:41,785][52050] Avg episode reward: [(0, '22.080'), (1, '21.970')] [2023-10-10 07:04:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000058048_59441152.pth... [2023-10-10 07:04:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000058112_59506688.pth... [2023-10-10 07:04:41,833][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000056544_57901056.pth [2023-10-10 07:04:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000056480_57835520.pth [2023-10-10 07:04:42,387][53268] Updated weights for policy 1, policy_version 58050 (0.0008) [2023-10-10 07:04:42,753][53268] Updated weights for policy 1, policy_version 58060 (0.0009) [2023-10-10 07:04:43,121][53268] Updated weights for policy 1, policy_version 58070 (0.0009) [2023-10-10 07:04:43,438][53252] Updated weights for policy 0, policy_version 58120 (0.0010) [2023-10-10 07:04:43,493][53268] Updated weights for policy 1, policy_version 58080 (0.0008) [2023-10-10 07:04:43,811][53252] Updated weights for policy 0, policy_version 58130 (0.0009) [2023-10-10 07:04:44,180][53252] Updated weights for policy 0, policy_version 58140 (0.0009) [2023-10-10 07:04:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 119013376. Throughput: 0: 1675.2, 1: 1684.0. Samples: 29759090. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:46,784][52050] Avg episode reward: [(0, '22.370'), (1, '21.180')] [2023-10-10 07:04:47,413][53268] Updated weights for policy 1, policy_version 58090 (0.0010) [2023-10-10 07:04:47,779][53268] Updated weights for policy 1, policy_version 58100 (0.0008) [2023-10-10 07:04:48,139][53268] Updated weights for policy 1, policy_version 58110 (0.0009) [2023-10-10 07:04:48,457][53252] Updated weights for policy 0, policy_version 58150 (0.0007) [2023-10-10 07:04:48,819][53252] Updated weights for policy 0, policy_version 58160 (0.0007) [2023-10-10 07:04:49,189][53252] Updated weights for policy 0, policy_version 58170 (0.0008) [2023-10-10 07:04:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119078912. Throughput: 0: 1694.0, 1: 1695.6. Samples: 29779756. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:51,784][52050] Avg episode reward: [(0, '22.300'), (1, '21.540')] [2023-10-10 07:04:52,159][53268] Updated weights for policy 1, policy_version 58120 (0.0008) [2023-10-10 07:04:52,519][53268] Updated weights for policy 1, policy_version 58130 (0.0007) [2023-10-10 07:04:52,885][53268] Updated weights for policy 1, policy_version 58140 (0.0008) [2023-10-10 07:04:53,257][53252] Updated weights for policy 0, policy_version 58180 (0.0009) [2023-10-10 07:04:53,630][53252] Updated weights for policy 0, policy_version 58190 (0.0007) [2023-10-10 07:04:54,006][53252] Updated weights for policy 0, policy_version 58200 (0.0008) [2023-10-10 07:04:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119144448. Throughput: 0: 1705.5, 1: 1689.9. Samples: 29800478. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:04:56,784][52050] Avg episode reward: [(0, '20.810'), (1, '21.040')] [2023-10-10 07:04:57,051][53268] Updated weights for policy 1, policy_version 58150 (0.0010) [2023-10-10 07:04:57,414][53268] Updated weights for policy 1, policy_version 58160 (0.0010) [2023-10-10 07:04:57,791][53268] Updated weights for policy 1, policy_version 58170 (0.0012) [2023-10-10 07:04:57,950][53252] Updated weights for policy 0, policy_version 58210 (0.0008) [2023-10-10 07:04:58,332][53252] Updated weights for policy 0, policy_version 58220 (0.0008) [2023-10-10 07:04:58,702][53252] Updated weights for policy 0, policy_version 58230 (0.0008) [2023-10-10 07:04:59,085][53252] Updated weights for policy 0, policy_version 58240 (0.0009) [2023-10-10 07:05:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119209984. Throughput: 0: 1672.5, 1: 1684.4. Samples: 29809432. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-10 07:05:01,784][52050] Avg episode reward: [(0, '19.950'), (1, '19.750')] [2023-10-10 07:05:01,994][53268] Updated weights for policy 1, policy_version 58180 (0.0007) [2023-10-10 07:05:02,363][53268] Updated weights for policy 1, policy_version 58190 (0.0007) [2023-10-10 07:05:02,732][53268] Updated weights for policy 1, policy_version 58200 (0.0010) [2023-10-10 07:05:03,185][53252] Updated weights for policy 0, policy_version 58250 (0.0007) [2023-10-10 07:05:03,548][53252] Updated weights for policy 0, policy_version 58260 (0.0007) [2023-10-10 07:05:03,915][53252] Updated weights for policy 0, policy_version 58270 (0.0009) [2023-10-10 07:05:06,645][53268] Updated weights for policy 1, policy_version 58210 (0.0008) [2023-10-10 07:05:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 119275520. Throughput: 0: 1686.5, 1: 1683.1. Samples: 29829814. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:06,784][52050] Avg episode reward: [(0, '20.950'), (1, '22.350')] [2023-10-10 07:05:07,028][53268] Updated weights for policy 1, policy_version 58220 (0.0008) [2023-10-10 07:05:07,398][53268] Updated weights for policy 1, policy_version 58230 (0.0009) [2023-10-10 07:05:07,755][53268] Updated weights for policy 1, policy_version 58240 (0.0011) [2023-10-10 07:05:08,121][53252] Updated weights for policy 0, policy_version 58280 (0.0009) [2023-10-10 07:05:08,491][53252] Updated weights for policy 0, policy_version 58290 (0.0007) [2023-10-10 07:05:08,858][53252] Updated weights for policy 0, policy_version 58300 (0.0007) [2023-10-10 07:05:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 119341056. Throughput: 0: 1685.5, 1: 1684.0. Samples: 29850516. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:11,784][52050] Avg episode reward: [(0, '22.120'), (1, '21.600')] [2023-10-10 07:05:11,941][53268] Updated weights for policy 1, policy_version 58250 (0.0008) [2023-10-10 07:05:12,301][53268] Updated weights for policy 1, policy_version 58260 (0.0007) [2023-10-10 07:05:12,670][53268] Updated weights for policy 1, policy_version 58270 (0.0007) [2023-10-10 07:05:12,988][53252] Updated weights for policy 0, policy_version 58310 (0.0007) [2023-10-10 07:05:13,361][53252] Updated weights for policy 0, policy_version 58320 (0.0009) [2023-10-10 07:05:13,741][53252] Updated weights for policy 0, policy_version 58330 (0.0010) [2023-10-10 07:05:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119406592. Throughput: 0: 1669.0, 1: 1683.7. Samples: 29859704. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:16,784][52050] Avg episode reward: [(0, '22.220'), (1, '20.420')] [2023-10-10 07:05:16,958][53268] Updated weights for policy 1, policy_version 58280 (0.0011) [2023-10-10 07:05:17,322][53268] Updated weights for policy 1, policy_version 58290 (0.0009) [2023-10-10 07:05:17,696][53268] Updated weights for policy 1, policy_version 58300 (0.0009) [2023-10-10 07:05:17,708][53252] Updated weights for policy 0, policy_version 58340 (0.0009) [2023-10-10 07:05:18,074][53252] Updated weights for policy 0, policy_version 58350 (0.0008) [2023-10-10 07:05:18,442][53252] Updated weights for policy 0, policy_version 58360 (0.0007) [2023-10-10 07:05:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119472128. Throughput: 0: 1683.9, 1: 1677.9. Samples: 29880178. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:21,784][52050] Avg episode reward: [(0, '23.240'), (1, '19.580')] [2023-10-10 07:05:21,927][53268] Updated weights for policy 1, policy_version 58310 (0.0011) [2023-10-10 07:05:22,287][53268] Updated weights for policy 1, policy_version 58320 (0.0011) [2023-10-10 07:05:22,592][53252] Updated weights for policy 0, policy_version 58370 (0.0008) [2023-10-10 07:05:22,651][53268] Updated weights for policy 1, policy_version 58330 (0.0007) [2023-10-10 07:05:22,960][53252] Updated weights for policy 0, policy_version 58380 (0.0009) [2023-10-10 07:05:23,330][53252] Updated weights for policy 0, policy_version 58390 (0.0009) [2023-10-10 07:05:23,693][53252] Updated weights for policy 0, policy_version 58400 (0.0008) [2023-10-10 07:05:26,568][53268] Updated weights for policy 1, policy_version 58340 (0.0008) [2023-10-10 07:05:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119537664. Throughput: 0: 1681.0, 1: 1678.1. Samples: 29900968. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:26,784][52050] Avg episode reward: [(0, '22.500'), (1, '21.200')] [2023-10-10 07:05:26,943][53268] Updated weights for policy 1, policy_version 58350 (0.0009) [2023-10-10 07:05:27,303][53268] Updated weights for policy 1, policy_version 58360 (0.0008) [2023-10-10 07:05:27,711][53252] Updated weights for policy 0, policy_version 58410 (0.0009) [2023-10-10 07:05:28,086][53252] Updated weights for policy 0, policy_version 58420 (0.0009) [2023-10-10 07:05:28,455][53252] Updated weights for policy 0, policy_version 58430 (0.0010) [2023-10-10 07:05:31,404][53268] Updated weights for policy 1, policy_version 58370 (0.0008) [2023-10-10 07:05:31,767][53268] Updated weights for policy 1, policy_version 58380 (0.0007) [2023-10-10 07:05:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119603200. Throughput: 0: 1679.6, 1: 1676.0. Samples: 29910090. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:31,784][52050] Avg episode reward: [(0, '21.160'), (1, '20.130')] [2023-10-10 07:05:32,137][53268] Updated weights for policy 1, policy_version 58390 (0.0008) [2023-10-10 07:05:32,464][53252] Updated weights for policy 0, policy_version 58440 (0.0008) [2023-10-10 07:05:32,509][53268] Updated weights for policy 1, policy_version 58400 (0.0008) [2023-10-10 07:05:32,841][53252] Updated weights for policy 0, policy_version 58450 (0.0010) [2023-10-10 07:05:33,219][53252] Updated weights for policy 0, policy_version 58460 (0.0011) [2023-10-10 07:05:36,452][53268] Updated weights for policy 1, policy_version 58410 (0.0010) [2023-10-10 07:05:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119668736. Throughput: 0: 1679.3, 1: 1677.2. Samples: 29930802. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:36,784][52050] Avg episode reward: [(0, '21.600'), (1, '21.220')] [2023-10-10 07:05:36,820][53268] Updated weights for policy 1, policy_version 58420 (0.0011) [2023-10-10 07:05:37,183][53268] Updated weights for policy 1, policy_version 58430 (0.0009) [2023-10-10 07:05:37,307][53252] Updated weights for policy 0, policy_version 58470 (0.0009) [2023-10-10 07:05:37,683][53252] Updated weights for policy 0, policy_version 58480 (0.0008) [2023-10-10 07:05:38,053][53252] Updated weights for policy 0, policy_version 58490 (0.0008) [2023-10-10 07:05:41,317][53268] Updated weights for policy 1, policy_version 58440 (0.0008) [2023-10-10 07:05:41,675][53268] Updated weights for policy 1, policy_version 58450 (0.0007) [2023-10-10 07:05:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 119734272. Throughput: 0: 1674.4, 1: 1673.6. Samples: 29951134. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:05:41,784][52050] Avg episode reward: [(0, '21.560'), (1, '21.750')] [2023-10-10 07:05:42,043][53268] Updated weights for policy 1, policy_version 58460 (0.0008) [2023-10-10 07:05:42,116][53252] Updated weights for policy 0, policy_version 58500 (0.0009) [2023-10-10 07:05:42,493][53252] Updated weights for policy 0, policy_version 58510 (0.0009) [2023-10-10 07:05:42,866][53252] Updated weights for policy 0, policy_version 58520 (0.0008) [2023-10-10 07:05:45,885][53268] Updated weights for policy 1, policy_version 58470 (0.0011) [2023-10-10 07:05:46,246][53268] Updated weights for policy 1, policy_version 58480 (0.0011) [2023-10-10 07:05:46,619][53268] Updated weights for policy 1, policy_version 58490 (0.0011) [2023-10-10 07:05:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119799808. Throughput: 0: 1671.7, 1: 1684.3. Samples: 29960452. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:05:46,784][52050] Avg episode reward: [(0, '19.110'), (1, '22.690')] [2023-10-10 07:05:46,929][53252] Updated weights for policy 0, policy_version 58530 (0.0007) [2023-10-10 07:05:47,305][53252] Updated weights for policy 0, policy_version 58540 (0.0007) [2023-10-10 07:05:47,679][53252] Updated weights for policy 0, policy_version 58550 (0.0008) [2023-10-10 07:05:48,057][53252] Updated weights for policy 0, policy_version 58560 (0.0008) [2023-10-10 07:05:50,879][53268] Updated weights for policy 1, policy_version 58500 (0.0009) [2023-10-10 07:05:51,250][53268] Updated weights for policy 1, policy_version 58510 (0.0008) [2023-10-10 07:05:51,612][53268] Updated weights for policy 1, policy_version 58520 (0.0008) [2023-10-10 07:05:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 119865344. Throughput: 0: 1676.4, 1: 1684.9. Samples: 29981074. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:05:51,784][52050] Avg episode reward: [(0, '20.990'), (1, '19.710')] [2023-10-10 07:05:52,181][53252] Updated weights for policy 0, policy_version 58570 (0.0007) [2023-10-10 07:05:52,544][53252] Updated weights for policy 0, policy_version 58580 (0.0007) [2023-10-10 07:05:52,924][53252] Updated weights for policy 0, policy_version 58590 (0.0008) [2023-10-10 07:05:55,640][53268] Updated weights for policy 1, policy_version 58530 (0.0007) [2023-10-10 07:05:56,010][53268] Updated weights for policy 1, policy_version 58540 (0.0007) [2023-10-10 07:05:56,374][53268] Updated weights for policy 1, policy_version 58550 (0.0007) [2023-10-10 07:05:56,742][53268] Updated weights for policy 1, policy_version 58560 (0.0008) [2023-10-10 07:05:56,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 119963648. Throughput: 0: 1682.7, 1: 1674.0. Samples: 30001564. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:05:56,784][52050] Avg episode reward: [(0, '21.400'), (1, '20.310')] [2023-10-10 07:05:56,846][53252] Updated weights for policy 0, policy_version 58600 (0.0008) [2023-10-10 07:05:57,225][53252] Updated weights for policy 0, policy_version 58610 (0.0007) [2023-10-10 07:05:57,589][53252] Updated weights for policy 0, policy_version 58620 (0.0008) [2023-10-10 07:06:00,563][53268] Updated weights for policy 1, policy_version 58570 (0.0007) [2023-10-10 07:06:00,933][53268] Updated weights for policy 1, policy_version 58580 (0.0009) [2023-10-10 07:06:01,308][53268] Updated weights for policy 1, policy_version 58590 (0.0011) [2023-10-10 07:06:01,631][53252] Updated weights for policy 0, policy_version 58630 (0.0010) [2023-10-10 07:06:01,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 120029184. Throughput: 0: 1677.0, 1: 1689.9. Samples: 30011214. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:06:01,784][52050] Avg episode reward: [(0, '21.000'), (1, '20.120')] [2023-10-10 07:06:02,000][53252] Updated weights for policy 0, policy_version 58640 (0.0007) [2023-10-10 07:06:02,372][53252] Updated weights for policy 0, policy_version 58650 (0.0007) [2023-10-10 07:06:05,464][53268] Updated weights for policy 1, policy_version 58600 (0.0007) [2023-10-10 07:06:05,842][53268] Updated weights for policy 1, policy_version 58610 (0.0008) [2023-10-10 07:06:06,212][53268] Updated weights for policy 1, policy_version 58620 (0.0007) [2023-10-10 07:06:06,473][53252] Updated weights for policy 0, policy_version 58660 (0.0007) [2023-10-10 07:06:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 120094720. Throughput: 0: 1676.7, 1: 1697.2. Samples: 30032006. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:06:06,785][52050] Avg episode reward: [(0, '22.310'), (1, '20.620')] [2023-10-10 07:06:06,847][53252] Updated weights for policy 0, policy_version 58670 (0.0008) [2023-10-10 07:06:07,216][53252] Updated weights for policy 0, policy_version 58680 (0.0009) [2023-10-10 07:06:10,188][53268] Updated weights for policy 1, policy_version 58630 (0.0008) [2023-10-10 07:06:10,556][53268] Updated weights for policy 1, policy_version 58640 (0.0007) [2023-10-10 07:06:10,925][53268] Updated weights for policy 1, policy_version 58650 (0.0009) [2023-10-10 07:06:11,261][53252] Updated weights for policy 0, policy_version 58690 (0.0009) [2023-10-10 07:06:11,623][53252] Updated weights for policy 0, policy_version 58700 (0.0010) [2023-10-10 07:06:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 120160256. Throughput: 0: 1672.7, 1: 1668.2. Samples: 30051306. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:06:11,784][52050] Avg episode reward: [(0, '20.820'), (1, '19.790')] [2023-10-10 07:06:12,005][53252] Updated weights for policy 0, policy_version 58710 (0.0010) [2023-10-10 07:06:12,371][53252] Updated weights for policy 0, policy_version 58720 (0.0009) [2023-10-10 07:06:14,888][53268] Updated weights for policy 1, policy_version 58660 (0.0008) [2023-10-10 07:06:15,258][53268] Updated weights for policy 1, policy_version 58670 (0.0010) [2023-10-10 07:06:15,623][53268] Updated weights for policy 1, policy_version 58680 (0.0008) [2023-10-10 07:06:16,449][53252] Updated weights for policy 0, policy_version 58730 (0.0007) [2023-10-10 07:06:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 120225792. Throughput: 0: 1679.8, 1: 1695.7. Samples: 30061990. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:06:16,784][52050] Avg episode reward: [(0, '20.180'), (1, '20.750')] [2023-10-10 07:06:16,827][53252] Updated weights for policy 0, policy_version 58740 (0.0008) [2023-10-10 07:06:17,206][53252] Updated weights for policy 0, policy_version 58750 (0.0007) [2023-10-10 07:06:19,721][53268] Updated weights for policy 1, policy_version 58690 (0.0008) [2023-10-10 07:06:20,092][53268] Updated weights for policy 1, policy_version 58700 (0.0009) [2023-10-10 07:06:20,466][53268] Updated weights for policy 1, policy_version 58710 (0.0009) [2023-10-10 07:06:20,827][53268] Updated weights for policy 1, policy_version 58720 (0.0007) [2023-10-10 07:06:21,235][53252] Updated weights for policy 0, policy_version 58760 (0.0008) [2023-10-10 07:06:21,604][53252] Updated weights for policy 0, policy_version 58770 (0.0010) [2023-10-10 07:06:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 120291328. Throughput: 0: 1679.2, 1: 1680.7. Samples: 30081994. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:06:21,784][52050] Avg episode reward: [(0, '20.160'), (1, '20.950')] [2023-10-10 07:06:21,974][53252] Updated weights for policy 0, policy_version 58780 (0.0008) [2023-10-10 07:06:24,943][53268] Updated weights for policy 1, policy_version 58730 (0.0009) [2023-10-10 07:06:25,315][53268] Updated weights for policy 1, policy_version 58740 (0.0010) [2023-10-10 07:06:25,680][53268] Updated weights for policy 1, policy_version 58750 (0.0008) [2023-10-10 07:06:26,101][53252] Updated weights for policy 0, policy_version 58790 (0.0008) [2023-10-10 07:06:26,470][53252] Updated weights for policy 0, policy_version 58800 (0.0009) [2023-10-10 07:06:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 120356864. Throughput: 0: 1674.0, 1: 1662.4. Samples: 30101274. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:26,784][52050] Avg episode reward: [(0, '19.470'), (1, '19.740')] [2023-10-10 07:06:26,840][53252] Updated weights for policy 0, policy_version 58810 (0.0008) [2023-10-10 07:06:29,693][53268] Updated weights for policy 1, policy_version 58760 (0.0009) [2023-10-10 07:06:30,056][53268] Updated weights for policy 1, policy_version 58770 (0.0009) [2023-10-10 07:06:30,421][53268] Updated weights for policy 1, policy_version 58780 (0.0008) [2023-10-10 07:06:30,912][53252] Updated weights for policy 0, policy_version 58820 (0.0010) [2023-10-10 07:06:31,288][53252] Updated weights for policy 0, policy_version 58830 (0.0009) [2023-10-10 07:06:31,660][53252] Updated weights for policy 0, policy_version 58840 (0.0007) [2023-10-10 07:06:31,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 120422400. Throughput: 0: 1685.4, 1: 1688.5. Samples: 30112278. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:31,784][52050] Avg episode reward: [(0, '21.180'), (1, '21.010')] [2023-10-10 07:06:34,667][53268] Updated weights for policy 1, policy_version 58790 (0.0007) [2023-10-10 07:06:35,023][53268] Updated weights for policy 1, policy_version 58800 (0.0009) [2023-10-10 07:06:35,386][53268] Updated weights for policy 1, policy_version 58810 (0.0010) [2023-10-10 07:06:35,707][53252] Updated weights for policy 0, policy_version 58850 (0.0007) [2023-10-10 07:06:36,080][53252] Updated weights for policy 0, policy_version 58860 (0.0008) [2023-10-10 07:06:36,443][53252] Updated weights for policy 0, policy_version 58870 (0.0008) [2023-10-10 07:06:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 120487936. Throughput: 0: 1686.5, 1: 1670.4. Samples: 30132130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:36,784][52050] Avg episode reward: [(0, '21.820'), (1, '20.380')] [2023-10-10 07:06:36,821][53252] Updated weights for policy 0, policy_version 58880 (0.0007) [2023-10-10 07:06:39,465][53268] Updated weights for policy 1, policy_version 58820 (0.0008) [2023-10-10 07:06:39,832][53268] Updated weights for policy 1, policy_version 58830 (0.0008) [2023-10-10 07:06:40,196][53268] Updated weights for policy 1, policy_version 58840 (0.0009) [2023-10-10 07:06:40,923][53252] Updated weights for policy 0, policy_version 58890 (0.0008) [2023-10-10 07:06:41,284][53252] Updated weights for policy 0, policy_version 58900 (0.0010) [2023-10-10 07:06:41,647][53252] Updated weights for policy 0, policy_version 58910 (0.0008) [2023-10-10 07:06:41,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 120586240. Throughput: 0: 1661.8, 1: 1674.3. Samples: 30151686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:41,784][52050] Avg episode reward: [(0, '21.460'), (1, '20.170')] [2023-10-10 07:06:41,797][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000058848_60260352.pth... [2023-10-10 07:06:41,797][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000058912_60325888.pth... 
[2023-10-10 07:06:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000057344_58720256.pth [2023-10-10 07:06:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000057280_58654720.pth [2023-10-10 07:06:44,411][53268] Updated weights for policy 1, policy_version 58850 (0.0008) [2023-10-10 07:06:44,779][53268] Updated weights for policy 1, policy_version 58860 (0.0007) [2023-10-10 07:06:45,146][53268] Updated weights for policy 1, policy_version 58870 (0.0010) [2023-10-10 07:06:45,512][53268] Updated weights for policy 1, policy_version 58880 (0.0007) [2023-10-10 07:06:45,877][53252] Updated weights for policy 0, policy_version 58920 (0.0009) [2023-10-10 07:06:46,248][53252] Updated weights for policy 0, policy_version 58930 (0.0009) [2023-10-10 07:06:46,617][53252] Updated weights for policy 0, policy_version 58940 (0.0008) [2023-10-10 07:06:46,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120651776. Throughput: 0: 1681.5, 1: 1686.3. Samples: 30162764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:46,784][52050] Avg episode reward: [(0, '21.900'), (1, '19.350')] [2023-10-10 07:06:49,351][53268] Updated weights for policy 1, policy_version 58890 (0.0008) [2023-10-10 07:06:49,724][53268] Updated weights for policy 1, policy_version 58900 (0.0009) [2023-10-10 07:06:50,091][53268] Updated weights for policy 1, policy_version 58910 (0.0009) [2023-10-10 07:06:50,700][53252] Updated weights for policy 0, policy_version 58950 (0.0008) [2023-10-10 07:06:51,077][53252] Updated weights for policy 0, policy_version 58960 (0.0008) [2023-10-10 07:06:51,441][53252] Updated weights for policy 0, policy_version 58970 (0.0011) [2023-10-10 07:06:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 120717312. Throughput: 0: 1683.0, 1: 1663.8. Samples: 30182612. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:51,784][52050] Avg episode reward: [(0, '21.200'), (1, '20.160')] [2023-10-10 07:06:54,172][53268] Updated weights for policy 1, policy_version 58920 (0.0009) [2023-10-10 07:06:54,558][53268] Updated weights for policy 1, policy_version 58930 (0.0009) [2023-10-10 07:06:54,934][53268] Updated weights for policy 1, policy_version 58940 (0.0009) [2023-10-10 07:06:55,362][53252] Updated weights for policy 0, policy_version 58980 (0.0007) [2023-10-10 07:06:55,731][53252] Updated weights for policy 0, policy_version 58990 (0.0007) [2023-10-10 07:06:56,110][53252] Updated weights for policy 0, policy_version 59000 (0.0007) [2023-10-10 07:06:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 120782848. Throughput: 0: 1663.9, 1: 1694.1. Samples: 30202418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:06:56,784][52050] Avg episode reward: [(0, '20.250'), (1, '19.040')] [2023-10-10 07:06:58,903][53268] Updated weights for policy 1, policy_version 58950 (0.0007) [2023-10-10 07:06:59,270][53268] Updated weights for policy 1, policy_version 58960 (0.0009) [2023-10-10 07:06:59,642][53268] Updated weights for policy 1, policy_version 58970 (0.0008) [2023-10-10 07:07:00,353][53252] Updated weights for policy 0, policy_version 59010 (0.0007) [2023-10-10 07:07:00,717][53252] Updated weights for policy 0, policy_version 59020 (0.0008) [2023-10-10 07:07:01,093][53252] Updated weights for policy 0, policy_version 59030 (0.0007) [2023-10-10 07:07:01,463][53252] Updated weights for policy 0, policy_version 59040 (0.0008) [2023-10-10 07:07:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 120848384. Throughput: 0: 1683.7, 1: 1683.3. Samples: 30213506. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:01,784][52050] Avg episode reward: [(0, '20.860'), (1, '19.080')] [2023-10-10 07:07:03,562][53268] Updated weights for policy 1, policy_version 58980 (0.0008) [2023-10-10 07:07:03,942][53268] Updated weights for policy 1, policy_version 58990 (0.0007) [2023-10-10 07:07:04,301][53268] Updated weights for policy 1, policy_version 59000 (0.0008) [2023-10-10 07:07:05,643][53252] Updated weights for policy 0, policy_version 59050 (0.0009) [2023-10-10 07:07:06,012][53252] Updated weights for policy 0, policy_version 59060 (0.0010) [2023-10-10 07:07:06,384][53252] Updated weights for policy 0, policy_version 59070 (0.0008) [2023-10-10 07:07:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 120913920. Throughput: 0: 1678.8, 1: 1685.4. Samples: 30233384. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:06,784][52050] Avg episode reward: [(0, '21.870'), (1, '20.170')] [2023-10-10 07:07:08,432][53268] Updated weights for policy 1, policy_version 59010 (0.0010) [2023-10-10 07:07:08,796][53268] Updated weights for policy 1, policy_version 59020 (0.0007) [2023-10-10 07:07:09,173][53268] Updated weights for policy 1, policy_version 59030 (0.0008) [2023-10-10 07:07:09,541][53268] Updated weights for policy 1, policy_version 59040 (0.0008) [2023-10-10 07:07:10,473][53252] Updated weights for policy 0, policy_version 59080 (0.0008) [2023-10-10 07:07:10,847][53252] Updated weights for policy 0, policy_version 59090 (0.0009) [2023-10-10 07:07:11,222][53252] Updated weights for policy 0, policy_version 59100 (0.0009) [2023-10-10 07:07:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 120979456. Throughput: 0: 1662.3, 1: 1711.8. Samples: 30253108. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:11,785][52050] Avg episode reward: [(0, '19.940'), (1, '20.000')] [2023-10-10 07:07:13,445][53268] Updated weights for policy 1, policy_version 59050 (0.0010) [2023-10-10 07:07:13,813][53268] Updated weights for policy 1, policy_version 59060 (0.0009) [2023-10-10 07:07:14,192][53268] Updated weights for policy 1, policy_version 59070 (0.0008) [2023-10-10 07:07:15,367][53252] Updated weights for policy 0, policy_version 59110 (0.0007) [2023-10-10 07:07:15,752][53252] Updated weights for policy 0, policy_version 59120 (0.0007) [2023-10-10 07:07:16,117][53252] Updated weights for policy 0, policy_version 59130 (0.0010) [2023-10-10 07:07:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 121044992. Throughput: 0: 1683.2, 1: 1680.0. Samples: 30263620. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:16,785][52050] Avg episode reward: [(0, '21.650'), (1, '21.110')] [2023-10-10 07:07:18,213][53268] Updated weights for policy 1, policy_version 59080 (0.0008) [2023-10-10 07:07:18,576][53268] Updated weights for policy 1, policy_version 59090 (0.0010) [2023-10-10 07:07:18,940][53268] Updated weights for policy 1, policy_version 59100 (0.0008) [2023-10-10 07:07:19,971][53252] Updated weights for policy 0, policy_version 59140 (0.0009) [2023-10-10 07:07:20,341][53252] Updated weights for policy 0, policy_version 59150 (0.0009) [2023-10-10 07:07:20,706][53252] Updated weights for policy 0, policy_version 59160 (0.0009) [2023-10-10 07:07:21,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 121110528. Throughput: 0: 1667.4, 1: 1698.2. Samples: 30283580. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:21,784][52050] Avg episode reward: [(0, '21.730'), (1, '21.270')] [2023-10-10 07:07:22,956][53268] Updated weights for policy 1, policy_version 59110 (0.0008) [2023-10-10 07:07:23,312][53268] Updated weights for policy 1, policy_version 59120 (0.0010) [2023-10-10 07:07:23,691][53268] Updated weights for policy 1, policy_version 59130 (0.0011) [2023-10-10 07:07:24,681][53252] Updated weights for policy 0, policy_version 59170 (0.0008) [2023-10-10 07:07:25,053][53252] Updated weights for policy 0, policy_version 59180 (0.0010) [2023-10-10 07:07:25,422][53252] Updated weights for policy 0, policy_version 59190 (0.0010) [2023-10-10 07:07:25,796][53252] Updated weights for policy 0, policy_version 59200 (0.0011) [2023-10-10 07:07:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 121176064. Throughput: 0: 1672.7, 1: 1709.0. Samples: 30303864. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:26,784][52050] Avg episode reward: [(0, '22.900'), (1, '22.540')] [2023-10-10 07:07:27,670][53268] Updated weights for policy 1, policy_version 59140 (0.0010) [2023-10-10 07:07:28,039][53268] Updated weights for policy 1, policy_version 59150 (0.0007) [2023-10-10 07:07:28,407][53268] Updated weights for policy 1, policy_version 59160 (0.0010) [2023-10-10 07:07:29,822][53252] Updated weights for policy 0, policy_version 59210 (0.0009) [2023-10-10 07:07:30,193][53252] Updated weights for policy 0, policy_version 59220 (0.0008) [2023-10-10 07:07:30,565][53252] Updated weights for policy 0, policy_version 59230 (0.0010) [2023-10-10 07:07:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 121241600. Throughput: 0: 1687.5, 1: 1682.5. Samples: 30314414. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:31,784][52050] Avg episode reward: [(0, '22.250'), (1, '21.070')] [2023-10-10 07:07:32,361][53268] Updated weights for policy 1, policy_version 59170 (0.0009) [2023-10-10 07:07:32,733][53268] Updated weights for policy 1, policy_version 59180 (0.0007) [2023-10-10 07:07:33,091][53268] Updated weights for policy 1, policy_version 59190 (0.0009) [2023-10-10 07:07:33,459][53268] Updated weights for policy 1, policy_version 59200 (0.0009) [2023-10-10 07:07:34,609][53252] Updated weights for policy 0, policy_version 59240 (0.0010) [2023-10-10 07:07:34,979][53252] Updated weights for policy 0, policy_version 59250 (0.0007) [2023-10-10 07:07:35,349][53252] Updated weights for policy 0, policy_version 59260 (0.0009) [2023-10-10 07:07:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 121307136. Throughput: 0: 1662.8, 1: 1708.8. Samples: 30334332. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 07:07:36,784][52050] Avg episode reward: [(0, '23.070'), (1, '22.120')] [2023-10-10 07:07:37,386][53268] Updated weights for policy 1, policy_version 59210 (0.0008) [2023-10-10 07:07:37,743][53268] Updated weights for policy 1, policy_version 59220 (0.0010) [2023-10-10 07:07:38,115][53268] Updated weights for policy 1, policy_version 59230 (0.0010) [2023-10-10 07:07:39,452][53252] Updated weights for policy 0, policy_version 59270 (0.0009) [2023-10-10 07:07:39,828][53252] Updated weights for policy 0, policy_version 59280 (0.0008) [2023-10-10 07:07:40,199][53252] Updated weights for policy 0, policy_version 59290 (0.0009) [2023-10-10 07:07:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 121372672. Throughput: 0: 1680.9, 1: 1709.9. Samples: 30355006. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:07:41,784][52050] Avg episode reward: [(0, '21.280'), (1, '21.010')] [2023-10-10 07:07:42,251][53268] Updated weights for policy 1, policy_version 59240 (0.0009) [2023-10-10 07:07:42,633][53268] Updated weights for policy 1, policy_version 59250 (0.0007) [2023-10-10 07:07:43,007][53268] Updated weights for policy 1, policy_version 59260 (0.0008) [2023-10-10 07:07:44,096][53252] Updated weights for policy 0, policy_version 59300 (0.0008) [2023-10-10 07:07:44,473][53252] Updated weights for policy 0, policy_version 59310 (0.0007) [2023-10-10 07:07:44,843][53252] Updated weights for policy 0, policy_version 59320 (0.0007) [2023-10-10 07:07:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121438208. Throughput: 0: 1675.6, 1: 1687.5. Samples: 30364842. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:07:46,784][52050] Avg episode reward: [(0, '20.350'), (1, '19.900')] [2023-10-10 07:07:47,110][53268] Updated weights for policy 1, policy_version 59270 (0.0008) [2023-10-10 07:07:47,476][53268] Updated weights for policy 1, policy_version 59280 (0.0011) [2023-10-10 07:07:47,840][53268] Updated weights for policy 1, policy_version 59290 (0.0010) [2023-10-10 07:07:49,008][53252] Updated weights for policy 0, policy_version 59330 (0.0009) [2023-10-10 07:07:49,383][53252] Updated weights for policy 0, policy_version 59340 (0.0007) [2023-10-10 07:07:49,747][53252] Updated weights for policy 0, policy_version 59350 (0.0007) [2023-10-10 07:07:50,126][53252] Updated weights for policy 0, policy_version 59360 (0.0008) [2023-10-10 07:07:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121503744. Throughput: 0: 1663.7, 1: 1698.9. Samples: 30384702. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:07:51,784][52050] Avg episode reward: [(0, '20.150'), (1, '19.650')] [2023-10-10 07:07:52,027][53268] Updated weights for policy 1, policy_version 59300 (0.0010) [2023-10-10 07:07:52,389][53268] Updated weights for policy 1, policy_version 59310 (0.0008) [2023-10-10 07:07:52,763][53268] Updated weights for policy 1, policy_version 59320 (0.0009) [2023-10-10 07:07:54,015][53252] Updated weights for policy 0, policy_version 59370 (0.0008) [2023-10-10 07:07:54,391][53252] Updated weights for policy 0, policy_version 59380 (0.0007) [2023-10-10 07:07:54,765][53252] Updated weights for policy 0, policy_version 59390 (0.0009) [2023-10-10 07:07:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 121569280. Throughput: 0: 1697.9, 1: 1696.1. Samples: 30405840. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:07:56,785][52050] Avg episode reward: [(0, '22.430'), (1, '20.800')] [2023-10-10 07:07:56,899][53268] Updated weights for policy 1, policy_version 59330 (0.0010) [2023-10-10 07:07:57,276][53268] Updated weights for policy 1, policy_version 59340 (0.0010) [2023-10-10 07:07:57,641][53268] Updated weights for policy 1, policy_version 59350 (0.0008) [2023-10-10 07:07:58,013][53268] Updated weights for policy 1, policy_version 59360 (0.0007) [2023-10-10 07:07:58,806][53252] Updated weights for policy 0, policy_version 59400 (0.0007) [2023-10-10 07:07:59,174][53252] Updated weights for policy 0, policy_version 59410 (0.0007) [2023-10-10 07:07:59,542][53252] Updated weights for policy 0, policy_version 59420 (0.0007) [2023-10-10 07:08:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 121634816. Throughput: 0: 1677.8, 1: 1693.5. Samples: 30415328. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:08:01,784][52050] Avg episode reward: [(0, '22.700'), (1, '20.120')] [2023-10-10 07:08:02,101][53268] Updated weights for policy 1, policy_version 59370 (0.0009) [2023-10-10 07:08:02,469][53268] Updated weights for policy 1, policy_version 59380 (0.0010) [2023-10-10 07:08:02,844][53268] Updated weights for policy 1, policy_version 59390 (0.0011) [2023-10-10 07:08:03,699][53252] Updated weights for policy 0, policy_version 59430 (0.0008) [2023-10-10 07:08:04,082][53252] Updated weights for policy 0, policy_version 59440 (0.0010) [2023-10-10 07:08:04,454][53252] Updated weights for policy 0, policy_version 59450 (0.0009) [2023-10-10 07:08:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 121700352. Throughput: 0: 1679.5, 1: 1697.4. Samples: 30435538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:08:06,784][52050] Avg episode reward: [(0, '24.080'), (1, '20.020')] [2023-10-10 07:08:06,805][53268] Updated weights for policy 1, policy_version 59400 (0.0008) [2023-10-10 07:08:07,169][53268] Updated weights for policy 1, policy_version 59410 (0.0008) [2023-10-10 07:08:07,533][53268] Updated weights for policy 1, policy_version 59420 (0.0008) [2023-10-10 07:08:08,568][53252] Updated weights for policy 0, policy_version 59460 (0.0008) [2023-10-10 07:08:08,931][53252] Updated weights for policy 0, policy_version 59470 (0.0008) [2023-10-10 07:08:09,307][53252] Updated weights for policy 0, policy_version 59480 (0.0009) [2023-10-10 07:08:11,523][53268] Updated weights for policy 1, policy_version 59430 (0.0008) [2023-10-10 07:08:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121765888. Throughput: 0: 1693.6, 1: 1700.1. Samples: 30456582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:08:11,784][52050] Avg episode reward: [(0, '23.290'), (1, '20.520')] [2023-10-10 07:08:11,890][53268] Updated weights for policy 1, policy_version 59440 (0.0008) [2023-10-10 07:08:12,255][53268] Updated weights for policy 1, policy_version 59450 (0.0010) [2023-10-10 07:08:13,105][53252] Updated weights for policy 0, policy_version 59490 (0.0008) [2023-10-10 07:08:13,479][53252] Updated weights for policy 0, policy_version 59500 (0.0008) [2023-10-10 07:08:13,848][53252] Updated weights for policy 0, policy_version 59510 (0.0009) [2023-10-10 07:08:14,223][53252] Updated weights for policy 0, policy_version 59520 (0.0011) [2023-10-10 07:08:16,170][53268] Updated weights for policy 1, policy_version 59460 (0.0008) [2023-10-10 07:08:16,532][53268] Updated weights for policy 1, policy_version 59470 (0.0007) [2023-10-10 07:08:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 121831424. Throughput: 0: 1662.8, 1: 1697.5. Samples: 30465628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:08:16,784][52050] Avg episode reward: [(0, '21.290'), (1, '20.280')] [2023-10-10 07:08:16,903][53268] Updated weights for policy 1, policy_version 59480 (0.0007) [2023-10-10 07:08:18,274][53252] Updated weights for policy 0, policy_version 59530 (0.0009) [2023-10-10 07:08:18,649][53252] Updated weights for policy 0, policy_version 59540 (0.0010) [2023-10-10 07:08:19,032][53252] Updated weights for policy 0, policy_version 59550 (0.0009) [2023-10-10 07:08:21,000][53268] Updated weights for policy 1, policy_version 59490 (0.0007) [2023-10-10 07:08:21,370][53268] Updated weights for policy 1, policy_version 59500 (0.0008) [2023-10-10 07:08:21,732][53268] Updated weights for policy 1, policy_version 59510 (0.0009) [2023-10-10 07:08:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121896960. Throughput: 0: 1685.3, 1: 1693.9. 
Samples: 30486400. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:21,784][52050] Avg episode reward: [(0, '21.040'), (1, '19.810')] [2023-10-10 07:08:22,100][53268] Updated weights for policy 1, policy_version 59520 (0.0009) [2023-10-10 07:08:23,102][53252] Updated weights for policy 0, policy_version 59560 (0.0009) [2023-10-10 07:08:23,483][53252] Updated weights for policy 0, policy_version 59570 (0.0010) [2023-10-10 07:08:23,852][53252] Updated weights for policy 0, policy_version 59580 (0.0010) [2023-10-10 07:08:26,127][53268] Updated weights for policy 1, policy_version 59530 (0.0008) [2023-10-10 07:08:26,491][53268] Updated weights for policy 1, policy_version 59540 (0.0008) [2023-10-10 07:08:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121962496. Throughput: 0: 1693.5, 1: 1683.6. Samples: 30506980. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:26,784][52050] Avg episode reward: [(0, '20.910'), (1, '19.460')] [2023-10-10 07:08:26,871][53268] Updated weights for policy 1, policy_version 59550 (0.0009) [2023-10-10 07:08:27,837][53252] Updated weights for policy 0, policy_version 59590 (0.0009) [2023-10-10 07:08:28,212][53252] Updated weights for policy 0, policy_version 59600 (0.0009) [2023-10-10 07:08:28,591][53252] Updated weights for policy 0, policy_version 59610 (0.0009) [2023-10-10 07:08:30,968][53268] Updated weights for policy 1, policy_version 59560 (0.0008) [2023-10-10 07:08:31,348][53268] Updated weights for policy 1, policy_version 59570 (0.0007) [2023-10-10 07:08:31,710][53268] Updated weights for policy 1, policy_version 59580 (0.0010) [2023-10-10 07:08:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 122028032. Throughput: 0: 1671.5, 1: 1696.2. Samples: 30516388. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:31,784][52050] Avg episode reward: [(0, '21.390'), (1, '20.220')] [2023-10-10 07:08:32,859][53252] Updated weights for policy 0, policy_version 59620 (0.0008) [2023-10-10 07:08:33,237][53252] Updated weights for policy 0, policy_version 59630 (0.0008) [2023-10-10 07:08:33,611][53252] Updated weights for policy 0, policy_version 59640 (0.0007) [2023-10-10 07:08:35,682][53268] Updated weights for policy 1, policy_version 59590 (0.0009) [2023-10-10 07:08:36,045][53268] Updated weights for policy 1, policy_version 59600 (0.0008) [2023-10-10 07:08:36,411][53268] Updated weights for policy 1, policy_version 59610 (0.0008) [2023-10-10 07:08:36,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122126336. Throughput: 0: 1690.4, 1: 1692.7. Samples: 30536940. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:36,784][52050] Avg episode reward: [(0, '22.490'), (1, '21.630')] [2023-10-10 07:08:37,518][53252] Updated weights for policy 0, policy_version 59650 (0.0010) [2023-10-10 07:08:37,892][53252] Updated weights for policy 0, policy_version 59660 (0.0007) [2023-10-10 07:08:38,268][53252] Updated weights for policy 0, policy_version 59670 (0.0009) [2023-10-10 07:08:38,631][53252] Updated weights for policy 0, policy_version 59680 (0.0008) [2023-10-10 07:08:40,456][53268] Updated weights for policy 1, policy_version 59620 (0.0009) [2023-10-10 07:08:40,818][53268] Updated weights for policy 1, policy_version 59630 (0.0007) [2023-10-10 07:08:41,193][53268] Updated weights for policy 1, policy_version 59640 (0.0009) [2023-10-10 07:08:41,784][52050] Fps is (10 sec: 16383.2, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 122191872. Throughput: 0: 1687.4, 1: 1673.6. Samples: 30557088. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:41,785][52050] Avg episode reward: [(0, '21.630'), (1, '20.780')] [2023-10-10 07:08:41,799][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000059648_61079552.pth... [2023-10-10 07:08:41,800][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000059680_61112320.pth... [2023-10-10 07:08:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000058112_59506688.pth [2023-10-10 07:08:41,840][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000058048_59441152.pth [2023-10-10 07:08:42,606][53252] Updated weights for policy 0, policy_version 59690 (0.0009) [2023-10-10 07:08:42,981][53252] Updated weights for policy 0, policy_version 59700 (0.0009) [2023-10-10 07:08:43,357][53252] Updated weights for policy 0, policy_version 59710 (0.0009) [2023-10-10 07:08:45,306][53268] Updated weights for policy 1, policy_version 59650 (0.0011) [2023-10-10 07:08:45,681][53268] Updated weights for policy 1, policy_version 59660 (0.0007) [2023-10-10 07:08:46,043][53268] Updated weights for policy 1, policy_version 59670 (0.0009) [2023-10-10 07:08:46,413][53268] Updated weights for policy 1, policy_version 59680 (0.0007) [2023-10-10 07:08:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122257408. Throughput: 0: 1676.7, 1: 1694.6. Samples: 30567038. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:46,784][52050] Avg episode reward: [(0, '21.920'), (1, '22.250')] [2023-10-10 07:08:47,360][53252] Updated weights for policy 0, policy_version 59720 (0.0008) [2023-10-10 07:08:47,734][53252] Updated weights for policy 0, policy_version 59730 (0.0010) [2023-10-10 07:08:48,103][53252] Updated weights for policy 0, policy_version 59740 (0.0008) [2023-10-10 07:08:50,433][53268] Updated weights for policy 1, policy_version 59690 (0.0008) [2023-10-10 07:08:50,792][53268] Updated weights for policy 1, policy_version 59700 (0.0007) [2023-10-10 07:08:51,160][53268] Updated weights for policy 1, policy_version 59710 (0.0007) [2023-10-10 07:08:51,783][52050] Fps is (10 sec: 13107.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122322944. Throughput: 0: 1694.8, 1: 1691.9. Samples: 30587940. Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:51,784][52050] Avg episode reward: [(0, '20.990'), (1, '21.640')] [2023-10-10 07:08:52,167][53252] Updated weights for policy 0, policy_version 59750 (0.0008) [2023-10-10 07:08:52,538][53252] Updated weights for policy 0, policy_version 59760 (0.0009) [2023-10-10 07:08:52,911][53252] Updated weights for policy 0, policy_version 59770 (0.0008) [2023-10-10 07:08:55,476][53268] Updated weights for policy 1, policy_version 59720 (0.0008) [2023-10-10 07:08:55,856][53268] Updated weights for policy 1, policy_version 59730 (0.0009) [2023-10-10 07:08:56,223][53268] Updated weights for policy 1, policy_version 59740 (0.0008) [2023-10-10 07:08:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122388480. Throughput: 0: 1690.6, 1: 1660.7. Samples: 30607390. 
Policy #0 lag: (min: 18.0, avg: 20.9, max: 50.0) [2023-10-10 07:08:56,784][52050] Avg episode reward: [(0, '20.020'), (1, '22.940')] [2023-10-10 07:08:56,851][53252] Updated weights for policy 0, policy_version 59780 (0.0008) [2023-10-10 07:08:57,219][53252] Updated weights for policy 0, policy_version 59790 (0.0008) [2023-10-10 07:08:57,600][53252] Updated weights for policy 0, policy_version 59800 (0.0008) [2023-10-10 07:09:00,258][53268] Updated weights for policy 1, policy_version 59750 (0.0009) [2023-10-10 07:09:00,631][53268] Updated weights for policy 1, policy_version 59760 (0.0008) [2023-10-10 07:09:00,994][53268] Updated weights for policy 1, policy_version 59770 (0.0008) [2023-10-10 07:09:01,691][53252] Updated weights for policy 0, policy_version 59810 (0.0008) [2023-10-10 07:09:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122454016. Throughput: 0: 1693.8, 1: 1682.4. Samples: 30617558. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:01,784][52050] Avg episode reward: [(0, '19.330'), (1, '20.620')] [2023-10-10 07:09:02,059][53252] Updated weights for policy 0, policy_version 59820 (0.0011) [2023-10-10 07:09:02,428][53252] Updated weights for policy 0, policy_version 59830 (0.0011) [2023-10-10 07:09:02,799][53252] Updated weights for policy 0, policy_version 59840 (0.0009) [2023-10-10 07:09:04,985][53268] Updated weights for policy 1, policy_version 59780 (0.0010) [2023-10-10 07:09:05,360][53268] Updated weights for policy 1, policy_version 59790 (0.0008) [2023-10-10 07:09:05,728][53268] Updated weights for policy 1, policy_version 59800 (0.0009) [2023-10-10 07:09:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122519552. Throughput: 0: 1694.9, 1: 1678.5. Samples: 30638198. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:06,784][52050] Avg episode reward: [(0, '21.630'), (1, '20.910')] [2023-10-10 07:09:07,017][53252] Updated weights for policy 0, policy_version 59850 (0.0009) [2023-10-10 07:09:07,395][53252] Updated weights for policy 0, policy_version 59860 (0.0011) [2023-10-10 07:09:07,756][53252] Updated weights for policy 0, policy_version 59870 (0.0007) [2023-10-10 07:09:09,832][53268] Updated weights for policy 1, policy_version 59810 (0.0008) [2023-10-10 07:09:10,194][53268] Updated weights for policy 1, policy_version 59820 (0.0009) [2023-10-10 07:09:10,565][53268] Updated weights for policy 1, policy_version 59830 (0.0008) [2023-10-10 07:09:10,932][53268] Updated weights for policy 1, policy_version 59840 (0.0007) [2023-10-10 07:09:11,728][53252] Updated weights for policy 0, policy_version 59880 (0.0009) [2023-10-10 07:09:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122585088. Throughput: 0: 1694.8, 1: 1661.7. Samples: 30658024. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:11,784][52050] Avg episode reward: [(0, '20.700'), (1, '19.760')] [2023-10-10 07:09:12,098][53252] Updated weights for policy 0, policy_version 59890 (0.0007) [2023-10-10 07:09:12,473][53252] Updated weights for policy 0, policy_version 59900 (0.0007) [2023-10-10 07:09:15,094][53268] Updated weights for policy 1, policy_version 59850 (0.0008) [2023-10-10 07:09:15,465][53268] Updated weights for policy 1, policy_version 59860 (0.0009) [2023-10-10 07:09:15,838][53268] Updated weights for policy 1, policy_version 59870 (0.0007) [2023-10-10 07:09:16,425][53252] Updated weights for policy 0, policy_version 59910 (0.0009) [2023-10-10 07:09:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122650624. Throughput: 0: 1695.8, 1: 1677.9. Samples: 30668206. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:16,784][52050] Avg episode reward: [(0, '21.150'), (1, '21.390')] [2023-10-10 07:09:16,803][53252] Updated weights for policy 0, policy_version 59920 (0.0010) [2023-10-10 07:09:17,160][53252] Updated weights for policy 0, policy_version 59930 (0.0011) [2023-10-10 07:09:19,879][53268] Updated weights for policy 1, policy_version 59880 (0.0009) [2023-10-10 07:09:20,260][53268] Updated weights for policy 1, policy_version 59890 (0.0008) [2023-10-10 07:09:20,628][53268] Updated weights for policy 1, policy_version 59900 (0.0008) [2023-10-10 07:09:21,283][53252] Updated weights for policy 0, policy_version 59940 (0.0008) [2023-10-10 07:09:21,655][53252] Updated weights for policy 0, policy_version 59950 (0.0009) [2023-10-10 07:09:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122716160. Throughput: 0: 1695.6, 1: 1664.8. Samples: 30688160. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:21,784][52050] Avg episode reward: [(0, '22.120'), (1, '19.600')] [2023-10-10 07:09:22,029][53252] Updated weights for policy 0, policy_version 59960 (0.0008) [2023-10-10 07:09:24,459][53268] Updated weights for policy 1, policy_version 59910 (0.0009) [2023-10-10 07:09:24,822][53268] Updated weights for policy 1, policy_version 59920 (0.0008) [2023-10-10 07:09:25,191][53268] Updated weights for policy 1, policy_version 59930 (0.0009) [2023-10-10 07:09:26,041][53252] Updated weights for policy 0, policy_version 59970 (0.0008) [2023-10-10 07:09:26,415][53252] Updated weights for policy 0, policy_version 59980 (0.0007) [2023-10-10 07:09:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 122781696. Throughput: 0: 1686.1, 1: 1672.1. Samples: 30708206. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:26,784][52050] Avg episode reward: [(0, '18.480'), (1, '19.130')] [2023-10-10 07:09:26,792][53252] Updated weights for policy 0, policy_version 59990 (0.0007) [2023-10-10 07:09:27,156][53252] Updated weights for policy 0, policy_version 60000 (0.0008) [2023-10-10 07:09:29,409][53268] Updated weights for policy 1, policy_version 59940 (0.0008) [2023-10-10 07:09:29,772][53268] Updated weights for policy 1, policy_version 59950 (0.0010) [2023-10-10 07:09:30,138][53268] Updated weights for policy 1, policy_version 59960 (0.0010) [2023-10-10 07:09:31,292][53252] Updated weights for policy 0, policy_version 60010 (0.0009) [2023-10-10 07:09:31,674][53252] Updated weights for policy 0, policy_version 60020 (0.0010) [2023-10-10 07:09:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122847232. Throughput: 0: 1698.9, 1: 1679.2. Samples: 30719056. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:31,784][52050] Avg episode reward: [(0, '18.080'), (1, '19.770')] [2023-10-10 07:09:32,042][53252] Updated weights for policy 0, policy_version 60030 (0.0010) [2023-10-10 07:09:34,297][53268] Updated weights for policy 1, policy_version 59970 (0.0009) [2023-10-10 07:09:34,659][53268] Updated weights for policy 1, policy_version 59980 (0.0007) [2023-10-10 07:09:35,033][53268] Updated weights for policy 1, policy_version 59990 (0.0009) [2023-10-10 07:09:35,393][53268] Updated weights for policy 1, policy_version 60000 (0.0008) [2023-10-10 07:09:36,162][53252] Updated weights for policy 0, policy_version 60040 (0.0011) [2023-10-10 07:09:36,542][53252] Updated weights for policy 0, policy_version 60050 (0.0011) [2023-10-10 07:09:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 122912768. Throughput: 0: 1696.7, 1: 1657.4. Samples: 30738876. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-10 07:09:36,784][52050] Avg episode reward: [(0, '19.480'), (1, '20.370')] [2023-10-10 07:09:36,908][53252] Updated weights for policy 0, policy_version 60060 (0.0009) [2023-10-10 07:09:39,446][53268] Updated weights for policy 1, policy_version 60010 (0.0008) [2023-10-10 07:09:39,807][53268] Updated weights for policy 1, policy_version 60020 (0.0008) [2023-10-10 07:09:40,169][53268] Updated weights for policy 1, policy_version 60030 (0.0010) [2023-10-10 07:09:41,013][53252] Updated weights for policy 0, policy_version 60070 (0.0009) [2023-10-10 07:09:41,398][53252] Updated weights for policy 0, policy_version 60080 (0.0007) [2023-10-10 07:09:41,767][53252] Updated weights for policy 0, policy_version 60090 (0.0008) [2023-10-10 07:09:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 122978304. Throughput: 0: 1681.5, 1: 1679.6. Samples: 30758642. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:09:41,784][52050] Avg episode reward: [(0, '20.950'), (1, '19.170')] [2023-10-10 07:09:44,091][53268] Updated weights for policy 1, policy_version 60040 (0.0009) [2023-10-10 07:09:44,464][53268] Updated weights for policy 1, policy_version 60050 (0.0009) [2023-10-10 07:09:44,830][53268] Updated weights for policy 1, policy_version 60060 (0.0009) [2023-10-10 07:09:45,831][53252] Updated weights for policy 0, policy_version 60100 (0.0009) [2023-10-10 07:09:46,193][53252] Updated weights for policy 0, policy_version 60110 (0.0009) [2023-10-10 07:09:46,568][53252] Updated weights for policy 0, policy_version 60120 (0.0009) [2023-10-10 07:09:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 123043840. Throughput: 0: 1693.0, 1: 1678.8. Samples: 30769292. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:09:46,784][52050] Avg episode reward: [(0, '21.640'), (1, '20.340')] [2023-10-10 07:09:48,876][53268] Updated weights for policy 1, policy_version 60070 (0.0009) [2023-10-10 07:09:49,247][53268] Updated weights for policy 1, policy_version 60080 (0.0009) [2023-10-10 07:09:49,617][53268] Updated weights for policy 1, policy_version 60090 (0.0009) [2023-10-10 07:09:50,748][53252] Updated weights for policy 0, policy_version 60130 (0.0008) [2023-10-10 07:09:51,123][53252] Updated weights for policy 0, policy_version 60140 (0.0010) [2023-10-10 07:09:51,498][53252] Updated weights for policy 0, policy_version 60150 (0.0011) [2023-10-10 07:09:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 123109376. Throughput: 0: 1684.7, 1: 1663.2. Samples: 30788854. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:09:51,784][52050] Avg episode reward: [(0, '21.840'), (1, '22.240')] [2023-10-10 07:09:51,871][53252] Updated weights for policy 0, policy_version 60160 (0.0011) [2023-10-10 07:09:53,649][53268] Updated weights for policy 1, policy_version 60100 (0.0009) [2023-10-10 07:09:54,006][53268] Updated weights for policy 1, policy_version 60110 (0.0009) [2023-10-10 07:09:54,376][53268] Updated weights for policy 1, policy_version 60120 (0.0008) [2023-10-10 07:09:55,854][53252] Updated weights for policy 0, policy_version 60170 (0.0008) [2023-10-10 07:09:56,233][53252] Updated weights for policy 0, policy_version 60180 (0.0008) [2023-10-10 07:09:56,608][53252] Updated weights for policy 0, policy_version 60190 (0.0007) [2023-10-10 07:09:56,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123207680. Throughput: 0: 1662.6, 1: 1688.1. Samples: 30808806. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:09:56,784][52050] Avg episode reward: [(0, '20.390'), (1, '21.590')] [2023-10-10 07:09:58,283][53268] Updated weights for policy 1, policy_version 60130 (0.0009) [2023-10-10 07:09:58,659][53268] Updated weights for policy 1, policy_version 60140 (0.0010) [2023-10-10 07:09:59,031][53268] Updated weights for policy 1, policy_version 60150 (0.0009) [2023-10-10 07:09:59,389][53268] Updated weights for policy 1, policy_version 60160 (0.0008) [2023-10-10 07:10:00,666][53252] Updated weights for policy 0, policy_version 60200 (0.0010) [2023-10-10 07:10:01,048][53252] Updated weights for policy 0, policy_version 60210 (0.0009) [2023-10-10 07:10:01,432][53252] Updated weights for policy 0, policy_version 60220 (0.0008) [2023-10-10 07:10:01,783][52050] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123273216. Throughput: 0: 1685.8, 1: 1672.8. Samples: 30819346. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:10:01,784][52050] Avg episode reward: [(0, '19.000'), (1, '20.840')] [2023-10-10 07:10:03,520][53268] Updated weights for policy 1, policy_version 60170 (0.0009) [2023-10-10 07:10:03,899][53268] Updated weights for policy 1, policy_version 60180 (0.0009) [2023-10-10 07:10:04,265][53268] Updated weights for policy 1, policy_version 60190 (0.0007) [2023-10-10 07:10:05,488][53252] Updated weights for policy 0, policy_version 60230 (0.0009) [2023-10-10 07:10:05,859][53252] Updated weights for policy 0, policy_version 60240 (0.0007) [2023-10-10 07:10:06,224][53252] Updated weights for policy 0, policy_version 60250 (0.0007) [2023-10-10 07:10:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123338752. Throughput: 0: 1679.8, 1: 1677.8. Samples: 30839252. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:10:06,784][52050] Avg episode reward: [(0, '17.160'), (1, '20.320')] [2023-10-10 07:10:08,588][53268] Updated weights for policy 1, policy_version 60200 (0.0010) [2023-10-10 07:10:08,963][53268] Updated weights for policy 1, policy_version 60210 (0.0009) [2023-10-10 07:10:09,325][53268] Updated weights for policy 1, policy_version 60220 (0.0007) [2023-10-10 07:10:10,398][53252] Updated weights for policy 0, policy_version 60260 (0.0008) [2023-10-10 07:10:10,766][53252] Updated weights for policy 0, policy_version 60270 (0.0009) [2023-10-10 07:10:11,130][53252] Updated weights for policy 0, policy_version 60280 (0.0008) [2023-10-10 07:10:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123404288. Throughput: 0: 1658.3, 1: 1688.2. Samples: 30858796. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:10:11,784][52050] Avg episode reward: [(0, '19.410'), (1, '20.230')] [2023-10-10 07:10:13,206][53268] Updated weights for policy 1, policy_version 60230 (0.0007) [2023-10-10 07:10:13,565][53268] Updated weights for policy 1, policy_version 60240 (0.0008) [2023-10-10 07:10:13,932][53268] Updated weights for policy 1, policy_version 60250 (0.0009) [2023-10-10 07:10:14,991][53252] Updated weights for policy 0, policy_version 60290 (0.0010) [2023-10-10 07:10:15,356][53252] Updated weights for policy 0, policy_version 60300 (0.0010) [2023-10-10 07:10:15,732][53252] Updated weights for policy 0, policy_version 60310 (0.0009) [2023-10-10 07:10:16,103][53252] Updated weights for policy 0, policy_version 60320 (0.0009) [2023-10-10 07:10:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123469824. Throughput: 0: 1674.4, 1: 1661.5. Samples: 30869168. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:16,784][52050] Avg episode reward: [(0, '20.580'), (1, '20.150')] [2023-10-10 07:10:18,079][53268] Updated weights for policy 1, policy_version 60260 (0.0009) [2023-10-10 07:10:18,444][53268] Updated weights for policy 1, policy_version 60270 (0.0008) [2023-10-10 07:10:18,815][53268] Updated weights for policy 1, policy_version 60280 (0.0011) [2023-10-10 07:10:20,159][53252] Updated weights for policy 0, policy_version 60330 (0.0011) [2023-10-10 07:10:20,534][53252] Updated weights for policy 0, policy_version 60340 (0.0009) [2023-10-10 07:10:20,908][53252] Updated weights for policy 0, policy_version 60350 (0.0011) [2023-10-10 07:10:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123535360. Throughput: 0: 1663.4, 1: 1678.4. Samples: 30889256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:21,784][52050] Avg episode reward: [(0, '20.250'), (1, '19.510')] [2023-10-10 07:10:22,953][53268] Updated weights for policy 1, policy_version 60290 (0.0010) [2023-10-10 07:10:23,334][53268] Updated weights for policy 1, policy_version 60300 (0.0008) [2023-10-10 07:10:23,693][53268] Updated weights for policy 1, policy_version 60310 (0.0007) [2023-10-10 07:10:24,067][53268] Updated weights for policy 1, policy_version 60320 (0.0008) [2023-10-10 07:10:24,879][53252] Updated weights for policy 0, policy_version 60360 (0.0007) [2023-10-10 07:10:25,252][53252] Updated weights for policy 0, policy_version 60370 (0.0008) [2023-10-10 07:10:25,633][53252] Updated weights for policy 0, policy_version 60380 (0.0009) [2023-10-10 07:10:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123600896. Throughput: 0: 1667.6, 1: 1677.8. Samples: 30909182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:26,784][52050] Avg episode reward: [(0, '20.570'), (1, '20.890')] [2023-10-10 07:10:28,110][53268] Updated weights for policy 1, policy_version 60330 (0.0008) [2023-10-10 07:10:28,471][53268] Updated weights for policy 1, policy_version 60340 (0.0009) [2023-10-10 07:10:28,841][53268] Updated weights for policy 1, policy_version 60350 (0.0008) [2023-10-10 07:10:29,542][53252] Updated weights for policy 0, policy_version 60390 (0.0009) [2023-10-10 07:10:29,929][53252] Updated weights for policy 0, policy_version 60400 (0.0008) [2023-10-10 07:10:30,298][53252] Updated weights for policy 0, policy_version 60410 (0.0008) [2023-10-10 07:10:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123666432. Throughput: 0: 1681.9, 1: 1657.4. Samples: 30919562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:31,784][52050] Avg episode reward: [(0, '21.840'), (1, '22.430')] [2023-10-10 07:10:32,936][53268] Updated weights for policy 1, policy_version 60360 (0.0010) [2023-10-10 07:10:33,304][53268] Updated weights for policy 1, policy_version 60370 (0.0009) [2023-10-10 07:10:33,681][53268] Updated weights for policy 1, policy_version 60380 (0.0008) [2023-10-10 07:10:34,441][53252] Updated weights for policy 0, policy_version 60420 (0.0008) [2023-10-10 07:10:34,818][53252] Updated weights for policy 0, policy_version 60430 (0.0007) [2023-10-10 07:10:35,181][53252] Updated weights for policy 0, policy_version 60440 (0.0007) [2023-10-10 07:10:36,784][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123731968. Throughput: 0: 1664.5, 1: 1679.2. Samples: 30939322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:36,785][52050] Avg episode reward: [(0, '20.720'), (1, '21.610')] [2023-10-10 07:10:37,786][53268] Updated weights for policy 1, policy_version 60390 (0.0009) [2023-10-10 07:10:38,150][53268] Updated weights for policy 1, policy_version 60400 (0.0008) [2023-10-10 07:10:38,521][53268] Updated weights for policy 1, policy_version 60410 (0.0010) [2023-10-10 07:10:39,119][53252] Updated weights for policy 0, policy_version 60450 (0.0007) [2023-10-10 07:10:39,489][53252] Updated weights for policy 0, policy_version 60460 (0.0009) [2023-10-10 07:10:39,861][53252] Updated weights for policy 0, policy_version 60470 (0.0007) [2023-10-10 07:10:40,227][53252] Updated weights for policy 0, policy_version 60480 (0.0008) [2023-10-10 07:10:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 123797504. Throughput: 0: 1685.5, 1: 1677.5. Samples: 30960138. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:41,784][52050] Avg episode reward: [(0, '21.600'), (1, '21.510')] [2023-10-10 07:10:41,791][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000060416_61865984.pth... [2023-10-10 07:10:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000060480_61931520.pth... 
[2023-10-10 07:10:41,821][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000058848_60260352.pth [2023-10-10 07:10:41,829][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000058912_60325888.pth [2023-10-10 07:10:42,777][53268] Updated weights for policy 1, policy_version 60420 (0.0010) [2023-10-10 07:10:43,137][53268] Updated weights for policy 1, policy_version 60430 (0.0007) [2023-10-10 07:10:43,507][53268] Updated weights for policy 1, policy_version 60440 (0.0009) [2023-10-10 07:10:44,386][53252] Updated weights for policy 0, policy_version 60490 (0.0010) [2023-10-10 07:10:44,749][53252] Updated weights for policy 0, policy_version 60500 (0.0010) [2023-10-10 07:10:45,109][53252] Updated weights for policy 0, policy_version 60510 (0.0008) [2023-10-10 07:10:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 123863040. Throughput: 0: 1685.5, 1: 1671.7. Samples: 30970424. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:46,785][52050] Avg episode reward: [(0, '22.520'), (1, '21.090')] [2023-10-10 07:10:47,562][53268] Updated weights for policy 1, policy_version 60450 (0.0008) [2023-10-10 07:10:47,925][53268] Updated weights for policy 1, policy_version 60460 (0.0009) [2023-10-10 07:10:48,291][53268] Updated weights for policy 1, policy_version 60470 (0.0008) [2023-10-10 07:10:48,658][53268] Updated weights for policy 1, policy_version 60480 (0.0009) [2023-10-10 07:10:49,135][53252] Updated weights for policy 0, policy_version 60520 (0.0007) [2023-10-10 07:10:49,511][53252] Updated weights for policy 0, policy_version 60530 (0.0007) [2023-10-10 07:10:49,883][53252] Updated weights for policy 0, policy_version 60540 (0.0007) [2023-10-10 07:10:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 123928576. Throughput: 0: 1672.8, 1: 1682.4. Samples: 30990240. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:10:51,784][52050] Avg episode reward: [(0, '21.570'), (1, '20.310')] [2023-10-10 07:10:52,615][53268] Updated weights for policy 1, policy_version 60490 (0.0008) [2023-10-10 07:10:52,977][53268] Updated weights for policy 1, policy_version 60500 (0.0007) [2023-10-10 07:10:53,343][53268] Updated weights for policy 1, policy_version 60510 (0.0008) [2023-10-10 07:10:54,043][53252] Updated weights for policy 0, policy_version 60550 (0.0008) [2023-10-10 07:10:54,416][53252] Updated weights for policy 0, policy_version 60560 (0.0009) [2023-10-10 07:10:54,792][53252] Updated weights for policy 0, policy_version 60570 (0.0007) [2023-10-10 07:10:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 123994112. Throughput: 0: 1702.2, 1: 1684.5. Samples: 31011196. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:10:56,784][52050] Avg episode reward: [(0, '22.130'), (1, '20.930')] [2023-10-10 07:10:57,648][53268] Updated weights for policy 1, policy_version 60520 (0.0008) [2023-10-10 07:10:58,029][53268] Updated weights for policy 1, policy_version 60530 (0.0008) [2023-10-10 07:10:58,384][53268] Updated weights for policy 1, policy_version 60540 (0.0011) [2023-10-10 07:10:58,866][53252] Updated weights for policy 0, policy_version 60580 (0.0009) [2023-10-10 07:10:59,241][53252] Updated weights for policy 0, policy_version 60590 (0.0010) [2023-10-10 07:10:59,610][53252] Updated weights for policy 0, policy_version 60600 (0.0008) [2023-10-10 07:11:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 124059648. Throughput: 0: 1690.0, 1: 1682.4. Samples: 31020926. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:01,784][52050] Avg episode reward: [(0, '22.690'), (1, '19.650')] [2023-10-10 07:11:02,317][53268] Updated weights for policy 1, policy_version 60550 (0.0009) [2023-10-10 07:11:02,691][53268] Updated weights for policy 1, policy_version 60560 (0.0009) [2023-10-10 07:11:03,058][53268] Updated weights for policy 1, policy_version 60570 (0.0010) [2023-10-10 07:11:03,555][53252] Updated weights for policy 0, policy_version 60610 (0.0007) [2023-10-10 07:11:03,934][53252] Updated weights for policy 0, policy_version 60620 (0.0008) [2023-10-10 07:11:04,298][53252] Updated weights for policy 0, policy_version 60630 (0.0009) [2023-10-10 07:11:04,673][53252] Updated weights for policy 0, policy_version 60640 (0.0010) [2023-10-10 07:11:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124125184. Throughput: 0: 1686.5, 1: 1694.0. Samples: 31041382. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:06,784][52050] Avg episode reward: [(0, '22.420'), (1, '19.280')] [2023-10-10 07:11:06,970][53268] Updated weights for policy 1, policy_version 60580 (0.0009) [2023-10-10 07:11:07,345][53268] Updated weights for policy 1, policy_version 60590 (0.0007) [2023-10-10 07:11:07,705][53268] Updated weights for policy 1, policy_version 60600 (0.0010) [2023-10-10 07:11:08,473][53252] Updated weights for policy 0, policy_version 60650 (0.0009) [2023-10-10 07:11:08,844][53252] Updated weights for policy 0, policy_version 60660 (0.0009) [2023-10-10 07:11:09,223][53252] Updated weights for policy 0, policy_version 60670 (0.0007) [2023-10-10 07:11:11,657][53268] Updated weights for policy 1, policy_version 60610 (0.0010) [2023-10-10 07:11:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124190720. Throughput: 0: 1702.9, 1: 1701.9. Samples: 31062396. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:11,784][52050] Avg episode reward: [(0, '22.670'), (1, '20.920')] [2023-10-10 07:11:12,026][53268] Updated weights for policy 1, policy_version 60620 (0.0010) [2023-10-10 07:11:12,396][53268] Updated weights for policy 1, policy_version 60630 (0.0008) [2023-10-10 07:11:12,769][53268] Updated weights for policy 1, policy_version 60640 (0.0009) [2023-10-10 07:11:13,052][53252] Updated weights for policy 0, policy_version 60680 (0.0008) [2023-10-10 07:11:13,424][53252] Updated weights for policy 0, policy_version 60690 (0.0008) [2023-10-10 07:11:13,793][53252] Updated weights for policy 0, policy_version 60700 (0.0009) [2023-10-10 07:11:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124256256. Throughput: 0: 1678.8, 1: 1701.0. Samples: 31071654. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:16,784][52050] Avg episode reward: [(0, '21.730'), (1, '20.050')] [2023-10-10 07:11:16,832][53268] Updated weights for policy 1, policy_version 60650 (0.0009) [2023-10-10 07:11:17,197][53268] Updated weights for policy 1, policy_version 60660 (0.0010) [2023-10-10 07:11:17,566][53268] Updated weights for policy 1, policy_version 60670 (0.0009) [2023-10-10 07:11:17,859][53252] Updated weights for policy 0, policy_version 60710 (0.0007) [2023-10-10 07:11:18,228][53252] Updated weights for policy 0, policy_version 60720 (0.0008) [2023-10-10 07:11:18,597][53252] Updated weights for policy 0, policy_version 60730 (0.0010) [2023-10-10 07:11:21,519][53268] Updated weights for policy 1, policy_version 60680 (0.0007) [2023-10-10 07:11:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124321792. Throughput: 0: 1704.4, 1: 1700.0. Samples: 31092518. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:21,784][52050] Avg episode reward: [(0, '19.860'), (1, '20.180')] [2023-10-10 07:11:21,886][53268] Updated weights for policy 1, policy_version 60690 (0.0007) [2023-10-10 07:11:22,265][53268] Updated weights for policy 1, policy_version 60700 (0.0007) [2023-10-10 07:11:22,754][53252] Updated weights for policy 0, policy_version 60740 (0.0009) [2023-10-10 07:11:23,132][53252] Updated weights for policy 0, policy_version 60750 (0.0007) [2023-10-10 07:11:23,505][53252] Updated weights for policy 0, policy_version 60760 (0.0007) [2023-10-10 07:11:26,360][53268] Updated weights for policy 1, policy_version 60710 (0.0009) [2023-10-10 07:11:26,724][53268] Updated weights for policy 1, policy_version 60720 (0.0011) [2023-10-10 07:11:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124387328. Throughput: 0: 1701.0, 1: 1699.6. Samples: 31113166. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:26,784][52050] Avg episode reward: [(0, '20.420'), (1, '21.450')] [2023-10-10 07:11:27,086][53268] Updated weights for policy 1, policy_version 60730 (0.0009) [2023-10-10 07:11:27,538][53252] Updated weights for policy 0, policy_version 60770 (0.0008) [2023-10-10 07:11:27,898][53252] Updated weights for policy 0, policy_version 60780 (0.0009) [2023-10-10 07:11:28,273][53252] Updated weights for policy 0, policy_version 60790 (0.0010) [2023-10-10 07:11:28,645][53252] Updated weights for policy 0, policy_version 60800 (0.0007) [2023-10-10 07:11:31,144][53268] Updated weights for policy 1, policy_version 60740 (0.0011) [2023-10-10 07:11:31,516][53268] Updated weights for policy 1, policy_version 60750 (0.0009) [2023-10-10 07:11:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124452864. Throughput: 0: 1683.0, 1: 1695.1. Samples: 31122436. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:31,784][52050] Avg episode reward: [(0, '20.780'), (1, '20.680')] [2023-10-10 07:11:31,877][53268] Updated weights for policy 1, policy_version 60760 (0.0010) [2023-10-10 07:11:32,609][53252] Updated weights for policy 0, policy_version 60810 (0.0009) [2023-10-10 07:11:32,980][53252] Updated weights for policy 0, policy_version 60820 (0.0008) [2023-10-10 07:11:33,347][53252] Updated weights for policy 0, policy_version 60830 (0.0009) [2023-10-10 07:11:35,912][53268] Updated weights for policy 1, policy_version 60770 (0.0008) [2023-10-10 07:11:36,287][53268] Updated weights for policy 1, policy_version 60780 (0.0009) [2023-10-10 07:11:36,644][53268] Updated weights for policy 1, policy_version 60790 (0.0007) [2023-10-10 07:11:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 124518400. Throughput: 0: 1705.5, 1: 1698.1. Samples: 31143402. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 07:11:36,784][52050] Avg episode reward: [(0, '20.580'), (1, '19.390')] [2023-10-10 07:11:37,015][53268] Updated weights for policy 1, policy_version 60800 (0.0008) [2023-10-10 07:11:37,439][53252] Updated weights for policy 0, policy_version 60840 (0.0008) [2023-10-10 07:11:37,807][53252] Updated weights for policy 0, policy_version 60850 (0.0007) [2023-10-10 07:11:38,179][53252] Updated weights for policy 0, policy_version 60860 (0.0008) [2023-10-10 07:11:41,144][53268] Updated weights for policy 1, policy_version 60810 (0.0007) [2023-10-10 07:11:41,507][53268] Updated weights for policy 1, policy_version 60820 (0.0011) [2023-10-10 07:11:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 124583936. Throughput: 0: 1701.1, 1: 1687.6. Samples: 31163690. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:11:41,784][52050] Avg episode reward: [(0, '21.450'), (1, '21.380')] [2023-10-10 07:11:41,875][53268] Updated weights for policy 1, policy_version 60830 (0.0009) [2023-10-10 07:11:42,047][53252] Updated weights for policy 0, policy_version 60870 (0.0007) [2023-10-10 07:11:42,417][53252] Updated weights for policy 0, policy_version 60880 (0.0007) [2023-10-10 07:11:42,797][53252] Updated weights for policy 0, policy_version 60890 (0.0009) [2023-10-10 07:11:45,873][53268] Updated weights for policy 1, policy_version 60840 (0.0009) [2023-10-10 07:11:46,250][53268] Updated weights for policy 1, policy_version 60850 (0.0008) [2023-10-10 07:11:46,621][53268] Updated weights for policy 1, policy_version 60860 (0.0008) [2023-10-10 07:11:46,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 124682240. Throughput: 0: 1686.0, 1: 1699.7. Samples: 31173280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:11:46,784][52050] Avg episode reward: [(0, '21.160'), (1, '18.370')] [2023-10-10 07:11:46,800][53252] Updated weights for policy 0, policy_version 60900 (0.0008) [2023-10-10 07:11:47,178][53252] Updated weights for policy 0, policy_version 60910 (0.0009) [2023-10-10 07:11:47,540][53252] Updated weights for policy 0, policy_version 60920 (0.0008) [2023-10-10 07:11:50,760][53268] Updated weights for policy 1, policy_version 60870 (0.0009) [2023-10-10 07:11:51,125][53268] Updated weights for policy 1, policy_version 60880 (0.0010) [2023-10-10 07:11:51,489][53268] Updated weights for policy 1, policy_version 60890 (0.0009) [2023-10-10 07:11:51,695][53252] Updated weights for policy 0, policy_version 60930 (0.0009) [2023-10-10 07:11:51,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124747776. Throughput: 0: 1698.3, 1: 1690.0. Samples: 31193856. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:11:51,784][52050] Avg episode reward: [(0, '21.420'), (1, '20.330')] [2023-10-10 07:11:52,067][53252] Updated weights for policy 0, policy_version 60940 (0.0009) [2023-10-10 07:11:52,439][53252] Updated weights for policy 0, policy_version 60950 (0.0009) [2023-10-10 07:11:52,808][53252] Updated weights for policy 0, policy_version 60960 (0.0007) [2023-10-10 07:11:55,528][53268] Updated weights for policy 1, policy_version 60900 (0.0009) [2023-10-10 07:11:55,901][53268] Updated weights for policy 1, policy_version 60910 (0.0008) [2023-10-10 07:11:56,270][53268] Updated weights for policy 1, policy_version 60920 (0.0008) [2023-10-10 07:11:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124813312. Throughput: 0: 1697.5, 1: 1669.7. Samples: 31213922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:11:56,784][52050] Avg episode reward: [(0, '22.110'), (1, '20.230')] [2023-10-10 07:11:56,885][53252] Updated weights for policy 0, policy_version 60970 (0.0007) [2023-10-10 07:11:57,249][53252] Updated weights for policy 0, policy_version 60980 (0.0007) [2023-10-10 07:11:57,621][53252] Updated weights for policy 0, policy_version 60990 (0.0007) [2023-10-10 07:12:00,389][53268] Updated weights for policy 1, policy_version 60930 (0.0007) [2023-10-10 07:12:00,750][53268] Updated weights for policy 1, policy_version 60940 (0.0009) [2023-10-10 07:12:01,126][53268] Updated weights for policy 1, policy_version 60950 (0.0010) [2023-10-10 07:12:01,486][53268] Updated weights for policy 1, policy_version 60960 (0.0008) [2023-10-10 07:12:01,693][53252] Updated weights for policy 0, policy_version 61000 (0.0008) [2023-10-10 07:12:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124878848. Throughput: 0: 1692.6, 1: 1689.2. Samples: 31223836. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:01,784][52050] Avg episode reward: [(0, '21.310'), (1, '22.070')] [2023-10-10 07:12:02,055][53252] Updated weights for policy 0, policy_version 61010 (0.0009) [2023-10-10 07:12:02,429][53252] Updated weights for policy 0, policy_version 61020 (0.0007) [2023-10-10 07:12:05,521][53268] Updated weights for policy 1, policy_version 60970 (0.0009) [2023-10-10 07:12:05,884][53268] Updated weights for policy 1, policy_version 60980 (0.0010) [2023-10-10 07:12:06,246][53268] Updated weights for policy 1, policy_version 60990 (0.0008) [2023-10-10 07:12:06,424][53252] Updated weights for policy 0, policy_version 61030 (0.0008) [2023-10-10 07:12:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 124944384. Throughput: 0: 1689.2, 1: 1685.1. Samples: 31244360. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:06,784][52050] Avg episode reward: [(0, '20.930'), (1, '20.970')] [2023-10-10 07:12:06,786][53252] Updated weights for policy 0, policy_version 61040 (0.0007) [2023-10-10 07:12:07,162][53252] Updated weights for policy 0, policy_version 61050 (0.0007) [2023-10-10 07:12:10,127][53268] Updated weights for policy 1, policy_version 61000 (0.0008) [2023-10-10 07:12:10,493][53268] Updated weights for policy 1, policy_version 61010 (0.0009) [2023-10-10 07:12:10,854][53268] Updated weights for policy 1, policy_version 61020 (0.0009) [2023-10-10 07:12:11,242][53252] Updated weights for policy 0, policy_version 61060 (0.0008) [2023-10-10 07:12:11,629][53252] Updated weights for policy 0, policy_version 61070 (0.0008) [2023-10-10 07:12:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125009920. Throughput: 0: 1683.0, 1: 1659.1. Samples: 31263560. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:11,784][52050] Avg episode reward: [(0, '21.600'), (1, '22.130')] [2023-10-10 07:12:12,001][53252] Updated weights for policy 0, policy_version 61080 (0.0009) [2023-10-10 07:12:14,792][53268] Updated weights for policy 1, policy_version 61030 (0.0008) [2023-10-10 07:12:15,158][53268] Updated weights for policy 1, policy_version 61040 (0.0009) [2023-10-10 07:12:15,526][53268] Updated weights for policy 1, policy_version 61050 (0.0008) [2023-10-10 07:12:16,189][53252] Updated weights for policy 0, policy_version 61090 (0.0008) [2023-10-10 07:12:16,564][53252] Updated weights for policy 0, policy_version 61100 (0.0009) [2023-10-10 07:12:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125075456. Throughput: 0: 1685.2, 1: 1690.8. Samples: 31274352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:16,784][52050] Avg episode reward: [(0, '22.080'), (1, '21.390')] [2023-10-10 07:12:16,939][53252] Updated weights for policy 0, policy_version 61110 (0.0010) [2023-10-10 07:12:17,309][53252] Updated weights for policy 0, policy_version 61120 (0.0011) [2023-10-10 07:12:19,457][53268] Updated weights for policy 1, policy_version 61060 (0.0009) [2023-10-10 07:12:19,827][53268] Updated weights for policy 1, policy_version 61070 (0.0008) [2023-10-10 07:12:20,199][53268] Updated weights for policy 1, policy_version 61080 (0.0009) [2023-10-10 07:12:21,328][53252] Updated weights for policy 0, policy_version 61130 (0.0009) [2023-10-10 07:12:21,687][53252] Updated weights for policy 0, policy_version 61140 (0.0009) [2023-10-10 07:12:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125140992. Throughput: 0: 1679.3, 1: 1672.9. Samples: 31294252. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:21,784][52050] Avg episode reward: [(0, '21.330'), (1, '22.230')] [2023-10-10 07:12:22,060][53252] Updated weights for policy 0, policy_version 61150 (0.0008) [2023-10-10 07:12:24,415][53268] Updated weights for policy 1, policy_version 61090 (0.0009) [2023-10-10 07:12:24,790][53268] Updated weights for policy 1, policy_version 61100 (0.0009) [2023-10-10 07:12:25,151][53268] Updated weights for policy 1, policy_version 61110 (0.0011) [2023-10-10 07:12:25,522][53268] Updated weights for policy 1, policy_version 61120 (0.0009) [2023-10-10 07:12:26,179][53252] Updated weights for policy 0, policy_version 61160 (0.0007) [2023-10-10 07:12:26,552][53252] Updated weights for policy 0, policy_version 61170 (0.0007) [2023-10-10 07:12:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125206528. Throughput: 0: 1671.5, 1: 1671.2. Samples: 31314114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:26,784][52050] Avg episode reward: [(0, '22.780'), (1, '21.530')] [2023-10-10 07:12:26,926][53252] Updated weights for policy 0, policy_version 61180 (0.0009) [2023-10-10 07:12:29,797][53268] Updated weights for policy 1, policy_version 61130 (0.0011) [2023-10-10 07:12:30,161][53268] Updated weights for policy 1, policy_version 61140 (0.0009) [2023-10-10 07:12:30,526][53268] Updated weights for policy 1, policy_version 61150 (0.0010) [2023-10-10 07:12:30,874][53252] Updated weights for policy 0, policy_version 61190 (0.0010) [2023-10-10 07:12:31,248][53252] Updated weights for policy 0, policy_version 61200 (0.0008) [2023-10-10 07:12:31,614][53252] Updated weights for policy 0, policy_version 61210 (0.0009) [2023-10-10 07:12:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125272064. Throughput: 0: 1685.1, 1: 1689.1. Samples: 31325122. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:31,784][52050] Avg episode reward: [(0, '20.030'), (1, '20.480')] [2023-10-10 07:12:34,740][53268] Updated weights for policy 1, policy_version 61160 (0.0009) [2023-10-10 07:12:35,115][53268] Updated weights for policy 1, policy_version 61170 (0.0008) [2023-10-10 07:12:35,491][53268] Updated weights for policy 1, policy_version 61180 (0.0007) [2023-10-10 07:12:35,706][53252] Updated weights for policy 0, policy_version 61220 (0.0009) [2023-10-10 07:12:36,071][53252] Updated weights for policy 0, policy_version 61230 (0.0008) [2023-10-10 07:12:36,441][53252] Updated weights for policy 0, policy_version 61240 (0.0008) [2023-10-10 07:12:36,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 125370368. Throughput: 0: 1689.7, 1: 1673.8. Samples: 31345212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:36,784][52050] Avg episode reward: [(0, '19.650'), (1, '19.980')] [2023-10-10 07:12:39,436][53268] Updated weights for policy 1, policy_version 61190 (0.0007) [2023-10-10 07:12:39,798][53268] Updated weights for policy 1, policy_version 61200 (0.0009) [2023-10-10 07:12:40,166][53268] Updated weights for policy 1, policy_version 61210 (0.0011) [2023-10-10 07:12:40,493][53252] Updated weights for policy 0, policy_version 61250 (0.0009) [2023-10-10 07:12:40,866][53252] Updated weights for policy 0, policy_version 61260 (0.0008) [2023-10-10 07:12:41,238][53252] Updated weights for policy 0, policy_version 61270 (0.0008) [2023-10-10 07:12:41,605][53252] Updated weights for policy 0, policy_version 61280 (0.0008) [2023-10-10 07:12:41,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 125435904. Throughput: 0: 1668.7, 1: 1678.2. Samples: 31364532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:41,784][52050] Avg episode reward: [(0, '18.790'), (1, '18.350')] [2023-10-10 07:12:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000061216_62685184.pth... [2023-10-10 07:12:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000061280_62750720.pth... [2023-10-10 07:12:41,825][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000059680_61112320.pth [2023-10-10 07:12:41,836][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000059648_61079552.pth [2023-10-10 07:12:44,077][53268] Updated weights for policy 1, policy_version 61220 (0.0010) [2023-10-10 07:12:44,445][53268] Updated weights for policy 1, policy_version 61230 (0.0008) [2023-10-10 07:12:44,809][53268] Updated weights for policy 1, policy_version 61240 (0.0008) [2023-10-10 07:12:45,710][53252] Updated weights for policy 0, policy_version 61290 (0.0010) [2023-10-10 07:12:46,087][53252] Updated weights for policy 0, policy_version 61300 (0.0010) [2023-10-10 07:12:46,457][53252] Updated weights for policy 0, policy_version 61310 (0.0010) [2023-10-10 07:12:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125501440. Throughput: 0: 1690.1, 1: 1687.4. Samples: 31375826. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:46,784][52050] Avg episode reward: [(0, '19.610'), (1, '19.580')] [2023-10-10 07:12:48,788][53268] Updated weights for policy 1, policy_version 61250 (0.0009) [2023-10-10 07:12:49,163][53268] Updated weights for policy 1, policy_version 61260 (0.0009) [2023-10-10 07:12:49,529][53268] Updated weights for policy 1, policy_version 61270 (0.0010) [2023-10-10 07:12:49,905][53268] Updated weights for policy 1, policy_version 61280 (0.0009) [2023-10-10 07:12:50,361][53252] Updated weights for policy 0, policy_version 61320 (0.0009) [2023-10-10 07:12:50,735][53252] Updated weights for policy 0, policy_version 61330 (0.0011) [2023-10-10 07:12:51,097][53252] Updated weights for policy 0, policy_version 61340 (0.0009) [2023-10-10 07:12:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125566976. Throughput: 0: 1691.3, 1: 1669.2. Samples: 31395584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:12:51,784][52050] Avg episode reward: [(0, '21.190'), (1, '18.870')] [2023-10-10 07:12:53,860][53268] Updated weights for policy 1, policy_version 61290 (0.0009) [2023-10-10 07:12:54,235][53268] Updated weights for policy 1, policy_version 61300 (0.0010) [2023-10-10 07:12:54,612][53268] Updated weights for policy 1, policy_version 61310 (0.0009) [2023-10-10 07:12:55,217][53252] Updated weights for policy 0, policy_version 61350 (0.0011) [2023-10-10 07:12:55,590][53252] Updated weights for policy 0, policy_version 61360 (0.0008) [2023-10-10 07:12:55,969][53252] Updated weights for policy 0, policy_version 61370 (0.0008) [2023-10-10 07:12:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125632512. Throughput: 0: 1675.2, 1: 1702.1. Samples: 31415538. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:12:56,784][52050] Avg episode reward: [(0, '22.910'), (1, '19.900')] [2023-10-10 07:12:58,445][53268] Updated weights for policy 1, policy_version 61320 (0.0008) [2023-10-10 07:12:58,822][53268] Updated weights for policy 1, policy_version 61330 (0.0010) [2023-10-10 07:12:59,192][53268] Updated weights for policy 1, policy_version 61340 (0.0009) [2023-10-10 07:13:00,127][53252] Updated weights for policy 0, policy_version 61380 (0.0011) [2023-10-10 07:13:00,513][53252] Updated weights for policy 0, policy_version 61390 (0.0010) [2023-10-10 07:13:00,883][53252] Updated weights for policy 0, policy_version 61400 (0.0008) [2023-10-10 07:13:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125698048. Throughput: 0: 1694.4, 1: 1680.2. Samples: 31426210. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:01,784][52050] Avg episode reward: [(0, '22.480'), (1, '21.360')] [2023-10-10 07:13:03,307][53268] Updated weights for policy 1, policy_version 61350 (0.0009) [2023-10-10 07:13:03,681][53268] Updated weights for policy 1, policy_version 61360 (0.0008) [2023-10-10 07:13:04,040][53268] Updated weights for policy 1, policy_version 61370 (0.0010) [2023-10-10 07:13:04,943][53252] Updated weights for policy 0, policy_version 61410 (0.0010) [2023-10-10 07:13:05,322][53252] Updated weights for policy 0, policy_version 61420 (0.0007) [2023-10-10 07:13:05,687][53252] Updated weights for policy 0, policy_version 61430 (0.0007) [2023-10-10 07:13:06,055][53252] Updated weights for policy 0, policy_version 61440 (0.0007) [2023-10-10 07:13:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125763584. Throughput: 0: 1680.1, 1: 1692.1. Samples: 31446000. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:06,784][52050] Avg episode reward: [(0, '22.530'), (1, '20.740')] [2023-10-10 07:13:08,138][53268] Updated weights for policy 1, policy_version 61380 (0.0008) [2023-10-10 07:13:08,496][53268] Updated weights for policy 1, policy_version 61390 (0.0010) [2023-10-10 07:13:08,869][53268] Updated weights for policy 1, policy_version 61400 (0.0008) [2023-10-10 07:13:10,213][53252] Updated weights for policy 0, policy_version 61450 (0.0007) [2023-10-10 07:13:10,571][53252] Updated weights for policy 0, policy_version 61460 (0.0012) [2023-10-10 07:13:10,943][53252] Updated weights for policy 0, policy_version 61470 (0.0010) [2023-10-10 07:13:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125829120. Throughput: 0: 1669.2, 1: 1701.2. Samples: 31465780. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:11,784][52050] Avg episode reward: [(0, '20.550'), (1, '20.530')] [2023-10-10 07:13:13,067][53268] Updated weights for policy 1, policy_version 61410 (0.0010) [2023-10-10 07:13:13,438][53268] Updated weights for policy 1, policy_version 61420 (0.0009) [2023-10-10 07:13:13,806][53268] Updated weights for policy 1, policy_version 61430 (0.0008) [2023-10-10 07:13:14,176][53268] Updated weights for policy 1, policy_version 61440 (0.0007) [2023-10-10 07:13:14,954][53252] Updated weights for policy 0, policy_version 61480 (0.0009) [2023-10-10 07:13:15,324][53252] Updated weights for policy 0, policy_version 61490 (0.0009) [2023-10-10 07:13:15,702][53252] Updated weights for policy 0, policy_version 61500 (0.0008) [2023-10-10 07:13:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125894656. Throughput: 0: 1686.7, 1: 1671.2. Samples: 31476224. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:16,784][52050] Avg episode reward: [(0, '20.530'), (1, '20.300')] [2023-10-10 07:13:18,342][53268] Updated weights for policy 1, policy_version 61450 (0.0011) [2023-10-10 07:13:18,709][53268] Updated weights for policy 1, policy_version 61460 (0.0012) [2023-10-10 07:13:19,081][53268] Updated weights for policy 1, policy_version 61470 (0.0011) [2023-10-10 07:13:19,875][53252] Updated weights for policy 0, policy_version 61510 (0.0008) [2023-10-10 07:13:20,234][53252] Updated weights for policy 0, policy_version 61520 (0.0008) [2023-10-10 07:13:20,613][53252] Updated weights for policy 0, policy_version 61530 (0.0009) [2023-10-10 07:13:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 125960192. Throughput: 0: 1662.8, 1: 1685.5. Samples: 31495884. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:21,784][52050] Avg episode reward: [(0, '20.060'), (1, '19.620')] [2023-10-10 07:13:23,306][53268] Updated weights for policy 1, policy_version 61480 (0.0008) [2023-10-10 07:13:23,689][53268] Updated weights for policy 1, policy_version 61490 (0.0008) [2023-10-10 07:13:24,061][53268] Updated weights for policy 1, policy_version 61500 (0.0009) [2023-10-10 07:13:24,563][53252] Updated weights for policy 0, policy_version 61540 (0.0007) [2023-10-10 07:13:24,928][53252] Updated weights for policy 0, policy_version 61550 (0.0008) [2023-10-10 07:13:25,311][53252] Updated weights for policy 0, policy_version 61560 (0.0009) [2023-10-10 07:13:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 126025728. Throughput: 0: 1676.2, 1: 1694.1. Samples: 31516196. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:26,784][52050] Avg episode reward: [(0, '21.300'), (1, '20.030')] [2023-10-10 07:13:27,961][53268] Updated weights for policy 1, policy_version 61510 (0.0009) [2023-10-10 07:13:28,341][53268] Updated weights for policy 1, policy_version 61520 (0.0008) [2023-10-10 07:13:28,710][53268] Updated weights for policy 1, policy_version 61530 (0.0009) [2023-10-10 07:13:29,324][53252] Updated weights for policy 0, policy_version 61570 (0.0007) [2023-10-10 07:13:29,701][53252] Updated weights for policy 0, policy_version 61580 (0.0009) [2023-10-10 07:13:30,070][53252] Updated weights for policy 0, policy_version 61590 (0.0007) [2023-10-10 07:13:30,433][53252] Updated weights for policy 0, policy_version 61600 (0.0009) [2023-10-10 07:13:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 126091264. Throughput: 0: 1681.8, 1: 1669.0. Samples: 31526614. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-10 07:13:31,784][52050] Avg episode reward: [(0, '22.770'), (1, '21.000')] [2023-10-10 07:13:32,720][53268] Updated weights for policy 1, policy_version 61540 (0.0008) [2023-10-10 07:13:33,090][53268] Updated weights for policy 1, policy_version 61550 (0.0008) [2023-10-10 07:13:33,460][53268] Updated weights for policy 1, policy_version 61560 (0.0007) [2023-10-10 07:13:34,417][53252] Updated weights for policy 0, policy_version 61610 (0.0008) [2023-10-10 07:13:34,779][53252] Updated weights for policy 0, policy_version 61620 (0.0007) [2023-10-10 07:13:35,153][53252] Updated weights for policy 0, policy_version 61630 (0.0008) [2023-10-10 07:13:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126156800. Throughput: 0: 1657.7, 1: 1693.1. Samples: 31546370. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:13:36,784][52050] Avg episode reward: [(0, '23.310'), (1, '21.300')] [2023-10-10 07:13:37,379][53268] Updated weights for policy 1, policy_version 61570 (0.0007) [2023-10-10 07:13:37,753][53268] Updated weights for policy 1, policy_version 61580 (0.0010) [2023-10-10 07:13:38,124][53268] Updated weights for policy 1, policy_version 61590 (0.0008) [2023-10-10 07:13:38,493][53268] Updated weights for policy 1, policy_version 61600 (0.0011) [2023-10-10 07:13:39,101][53252] Updated weights for policy 0, policy_version 61640 (0.0009) [2023-10-10 07:13:39,473][53252] Updated weights for policy 0, policy_version 61650 (0.0011) [2023-10-10 07:13:39,840][53252] Updated weights for policy 0, policy_version 61660 (0.0011) [2023-10-10 07:13:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126222336. Throughput: 0: 1684.0, 1: 1689.5. Samples: 31567346. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:13:41,784][52050] Avg episode reward: [(0, '23.300'), (1, '20.400')] [2023-10-10 07:13:42,377][53268] Updated weights for policy 1, policy_version 61610 (0.0008) [2023-10-10 07:13:42,746][53268] Updated weights for policy 1, policy_version 61620 (0.0010) [2023-10-10 07:13:43,110][53268] Updated weights for policy 1, policy_version 61630 (0.0007) [2023-10-10 07:13:43,876][53252] Updated weights for policy 0, policy_version 61670 (0.0009) [2023-10-10 07:13:44,248][53252] Updated weights for policy 0, policy_version 61680 (0.0007) [2023-10-10 07:13:44,624][53252] Updated weights for policy 0, policy_version 61690 (0.0007) [2023-10-10 07:13:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126287872. Throughput: 0: 1673.1, 1: 1682.2. Samples: 31577200. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:13:46,784][52050] Avg episode reward: [(0, '21.960'), (1, '19.530')] [2023-10-10 07:13:47,131][53268] Updated weights for policy 1, policy_version 61640 (0.0007) [2023-10-10 07:13:47,495][53268] Updated weights for policy 1, policy_version 61650 (0.0007) [2023-10-10 07:13:47,868][53268] Updated weights for policy 1, policy_version 61660 (0.0009) [2023-10-10 07:13:48,560][53252] Updated weights for policy 0, policy_version 61700 (0.0008) [2023-10-10 07:13:48,935][53252] Updated weights for policy 0, policy_version 61710 (0.0008) [2023-10-10 07:13:49,305][53252] Updated weights for policy 0, policy_version 61720 (0.0007) [2023-10-10 07:13:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126353408. Throughput: 0: 1680.1, 1: 1688.7. Samples: 31597594. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:13:51,784][52050] Avg episode reward: [(0, '20.970'), (1, '21.550')] [2023-10-10 07:13:51,927][53268] Updated weights for policy 1, policy_version 61670 (0.0010) [2023-10-10 07:13:52,291][53268] Updated weights for policy 1, policy_version 61680 (0.0008) [2023-10-10 07:13:52,657][53268] Updated weights for policy 1, policy_version 61690 (0.0009) [2023-10-10 07:13:53,529][53252] Updated weights for policy 0, policy_version 61730 (0.0008) [2023-10-10 07:13:53,919][53252] Updated weights for policy 0, policy_version 61740 (0.0007) [2023-10-10 07:13:54,293][53252] Updated weights for policy 0, policy_version 61750 (0.0007) [2023-10-10 07:13:54,650][53252] Updated weights for policy 0, policy_version 61760 (0.0009) [2023-10-10 07:13:56,747][53268] Updated weights for policy 1, policy_version 61700 (0.0008) [2023-10-10 07:13:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126418944. Throughput: 0: 1698.8, 1: 1694.4. Samples: 31618474. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:13:56,784][52050] Avg episode reward: [(0, '20.490'), (1, '20.590')] [2023-10-10 07:13:57,112][53268] Updated weights for policy 1, policy_version 61710 (0.0008) [2023-10-10 07:13:57,480][53268] Updated weights for policy 1, policy_version 61720 (0.0008) [2023-10-10 07:13:58,803][53252] Updated weights for policy 0, policy_version 61770 (0.0007) [2023-10-10 07:13:59,173][53252] Updated weights for policy 0, policy_version 61780 (0.0009) [2023-10-10 07:13:59,544][53252] Updated weights for policy 0, policy_version 61790 (0.0008) [2023-10-10 07:14:01,609][53268] Updated weights for policy 1, policy_version 61730 (0.0008) [2023-10-10 07:14:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126484480. Throughput: 0: 1676.5, 1: 1696.5. Samples: 31628008. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:14:01,784][52050] Avg episode reward: [(0, '21.270'), (1, '19.770')] [2023-10-10 07:14:01,979][53268] Updated weights for policy 1, policy_version 61740 (0.0010) [2023-10-10 07:14:02,338][53268] Updated weights for policy 1, policy_version 61750 (0.0009) [2023-10-10 07:14:02,697][53268] Updated weights for policy 1, policy_version 61760 (0.0010) [2023-10-10 07:14:03,580][53252] Updated weights for policy 0, policy_version 61800 (0.0010) [2023-10-10 07:14:03,955][53252] Updated weights for policy 0, policy_version 61810 (0.0007) [2023-10-10 07:14:04,327][53252] Updated weights for policy 0, policy_version 61820 (0.0007) [2023-10-10 07:14:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126550016. Throughput: 0: 1688.3, 1: 1698.7. Samples: 31648300. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:14:06,784][52050] Avg episode reward: [(0, '21.900'), (1, '20.370')] [2023-10-10 07:14:06,839][53268] Updated weights for policy 1, policy_version 61770 (0.0007) [2023-10-10 07:14:07,203][53268] Updated weights for policy 1, policy_version 61780 (0.0007) [2023-10-10 07:14:07,575][53268] Updated weights for policy 1, policy_version 61790 (0.0010) [2023-10-10 07:14:08,232][53252] Updated weights for policy 0, policy_version 61830 (0.0009) [2023-10-10 07:14:08,604][53252] Updated weights for policy 0, policy_version 61840 (0.0010) [2023-10-10 07:14:08,965][53252] Updated weights for policy 0, policy_version 61850 (0.0010) [2023-10-10 07:14:11,410][53268] Updated weights for policy 1, policy_version 61800 (0.0009) [2023-10-10 07:14:11,778][53268] Updated weights for policy 1, policy_version 61810 (0.0008) [2023-10-10 07:14:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126615552. Throughput: 0: 1693.3, 1: 1704.9. Samples: 31669116. Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:14:11,784][52050] Avg episode reward: [(0, '22.360'), (1, '20.790')] [2023-10-10 07:14:12,156][53268] Updated weights for policy 1, policy_version 61820 (0.0007) [2023-10-10 07:14:13,077][53252] Updated weights for policy 0, policy_version 61860 (0.0009) [2023-10-10 07:14:13,451][53252] Updated weights for policy 0, policy_version 61870 (0.0009) [2023-10-10 07:14:13,824][53252] Updated weights for policy 0, policy_version 61880 (0.0007) [2023-10-10 07:14:16,233][53268] Updated weights for policy 1, policy_version 61830 (0.0009) [2023-10-10 07:14:16,604][53268] Updated weights for policy 1, policy_version 61840 (0.0008) [2023-10-10 07:14:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126681088. Throughput: 0: 1668.4, 1: 1703.6. Samples: 31678356. 
Policy #0 lag: (min: 31.0, avg: 40.3, max: 63.0) [2023-10-10 07:14:16,784][52050] Avg episode reward: [(0, '20.170'), (1, '21.400')] [2023-10-10 07:14:16,980][53268] Updated weights for policy 1, policy_version 61850 (0.0009) [2023-10-10 07:14:17,642][53252] Updated weights for policy 0, policy_version 61890 (0.0008) [2023-10-10 07:14:18,019][53252] Updated weights for policy 0, policy_version 61900 (0.0011) [2023-10-10 07:14:18,401][53252] Updated weights for policy 0, policy_version 61910 (0.0008) [2023-10-10 07:14:18,770][53252] Updated weights for policy 0, policy_version 61920 (0.0010) [2023-10-10 07:14:21,093][53268] Updated weights for policy 1, policy_version 61860 (0.0008) [2023-10-10 07:14:21,459][53268] Updated weights for policy 1, policy_version 61870 (0.0008) [2023-10-10 07:14:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126746624. Throughput: 0: 1701.1, 1: 1698.7. Samples: 31699360. Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:21,784][52050] Avg episode reward: [(0, '22.820'), (1, '21.670')] [2023-10-10 07:14:21,826][53268] Updated weights for policy 1, policy_version 61880 (0.0008) [2023-10-10 07:14:22,755][53252] Updated weights for policy 0, policy_version 61930 (0.0008) [2023-10-10 07:14:23,121][53252] Updated weights for policy 0, policy_version 61940 (0.0009) [2023-10-10 07:14:23,490][53252] Updated weights for policy 0, policy_version 61950 (0.0010) [2023-10-10 07:14:25,917][53268] Updated weights for policy 1, policy_version 61890 (0.0007) [2023-10-10 07:14:26,285][53268] Updated weights for policy 1, policy_version 61900 (0.0008) [2023-10-10 07:14:26,649][53268] Updated weights for policy 1, policy_version 61910 (0.0008) [2023-10-10 07:14:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126812160. Throughput: 0: 1694.5, 1: 1691.7. Samples: 31719724. 
Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:26,784][52050] Avg episode reward: [(0, '20.550'), (1, '21.910')] [2023-10-10 07:14:27,019][53268] Updated weights for policy 1, policy_version 61920 (0.0009) [2023-10-10 07:14:27,635][53252] Updated weights for policy 0, policy_version 61960 (0.0009) [2023-10-10 07:14:28,014][53252] Updated weights for policy 0, policy_version 61970 (0.0009) [2023-10-10 07:14:28,391][53252] Updated weights for policy 0, policy_version 61980 (0.0010) [2023-10-10 07:14:31,115][53268] Updated weights for policy 1, policy_version 61930 (0.0009) [2023-10-10 07:14:31,487][53268] Updated weights for policy 1, policy_version 61940 (0.0009) [2023-10-10 07:14:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126877696. Throughput: 0: 1678.7, 1: 1700.3. Samples: 31729254. Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:31,784][52050] Avg episode reward: [(0, '20.140'), (1, '22.790')] [2023-10-10 07:14:31,856][53268] Updated weights for policy 1, policy_version 61950 (0.0008) [2023-10-10 07:14:32,367][53252] Updated weights for policy 0, policy_version 61990 (0.0010) [2023-10-10 07:14:32,743][53252] Updated weights for policy 0, policy_version 62000 (0.0008) [2023-10-10 07:14:33,114][53252] Updated weights for policy 0, policy_version 62010 (0.0008) [2023-10-10 07:14:35,752][53268] Updated weights for policy 1, policy_version 61960 (0.0008) [2023-10-10 07:14:36,123][53268] Updated weights for policy 1, policy_version 61970 (0.0008) [2023-10-10 07:14:36,490][53268] Updated weights for policy 1, policy_version 61980 (0.0008) [2023-10-10 07:14:36,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 126976000. Throughput: 0: 1687.5, 1: 1699.5. Samples: 31750010. 
Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:36,784][52050] Avg episode reward: [(0, '21.220'), (1, '21.760')] [2023-10-10 07:14:37,272][53252] Updated weights for policy 0, policy_version 62020 (0.0010) [2023-10-10 07:14:37,645][53252] Updated weights for policy 0, policy_version 62030 (0.0008) [2023-10-10 07:14:38,026][53252] Updated weights for policy 0, policy_version 62040 (0.0007) [2023-10-10 07:14:40,640][53268] Updated weights for policy 1, policy_version 61990 (0.0007) [2023-10-10 07:14:41,006][53268] Updated weights for policy 1, policy_version 62000 (0.0008) [2023-10-10 07:14:41,369][53268] Updated weights for policy 1, policy_version 62010 (0.0009) [2023-10-10 07:14:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 127041536. Throughput: 0: 1688.8, 1: 1679.7. Samples: 31770056. Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:41,784][52050] Avg episode reward: [(0, '20.330'), (1, '20.330')] [2023-10-10 07:14:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000062016_63504384.pth... [2023-10-10 07:14:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000062048_63537152.pth... 
[2023-10-10 07:14:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000060416_61865984.pth [2023-10-10 07:14:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000060480_61931520.pth [2023-10-10 07:14:42,077][53252] Updated weights for policy 0, policy_version 62050 (0.0008) [2023-10-10 07:14:42,448][53252] Updated weights for policy 0, policy_version 62060 (0.0009) [2023-10-10 07:14:42,826][53252] Updated weights for policy 0, policy_version 62070 (0.0011) [2023-10-10 07:14:43,192][53252] Updated weights for policy 0, policy_version 62080 (0.0009) [2023-10-10 07:14:45,158][53268] Updated weights for policy 1, policy_version 62020 (0.0009) [2023-10-10 07:14:45,524][53268] Updated weights for policy 1, policy_version 62030 (0.0009) [2023-10-10 07:14:45,892][53268] Updated weights for policy 1, policy_version 62040 (0.0009) [2023-10-10 07:14:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 127107072. Throughput: 0: 1679.1, 1: 1698.1. Samples: 31779980. Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:46,784][52050] Avg episode reward: [(0, '20.770'), (1, '20.490')] [2023-10-10 07:14:47,233][53252] Updated weights for policy 0, policy_version 62090 (0.0007) [2023-10-10 07:14:47,605][53252] Updated weights for policy 0, policy_version 62100 (0.0008) [2023-10-10 07:14:47,971][53252] Updated weights for policy 0, policy_version 62110 (0.0009) [2023-10-10 07:14:49,974][53268] Updated weights for policy 1, policy_version 62050 (0.0008) [2023-10-10 07:14:50,333][53268] Updated weights for policy 1, policy_version 62060 (0.0009) [2023-10-10 07:14:50,693][53268] Updated weights for policy 1, policy_version 62070 (0.0012) [2023-10-10 07:14:51,063][53268] Updated weights for policy 1, policy_version 62080 (0.0008) [2023-10-10 07:14:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127172608. 
Throughput: 0: 1687.5, 1: 1698.4. Samples: 31800666. Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:51,784][52050] Avg episode reward: [(0, '23.490'), (1, '20.080')] [2023-10-10 07:14:52,051][53252] Updated weights for policy 0, policy_version 62120 (0.0008) [2023-10-10 07:14:52,416][53252] Updated weights for policy 0, policy_version 62130 (0.0009) [2023-10-10 07:14:52,782][53252] Updated weights for policy 0, policy_version 62140 (0.0008) [2023-10-10 07:14:55,180][53268] Updated weights for policy 1, policy_version 62090 (0.0010) [2023-10-10 07:14:55,548][53268] Updated weights for policy 1, policy_version 62100 (0.0008) [2023-10-10 07:14:55,920][53268] Updated weights for policy 1, policy_version 62110 (0.0008) [2023-10-10 07:14:56,764][53252] Updated weights for policy 0, policy_version 62150 (0.0008) [2023-10-10 07:14:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127238144. Throughput: 0: 1692.2, 1: 1672.9. Samples: 31820544. Policy #0 lag: (min: 17.0, avg: 27.8, max: 49.0) [2023-10-10 07:14:56,784][52050] Avg episode reward: [(0, '22.110'), (1, '19.410')] [2023-10-10 07:14:57,138][53252] Updated weights for policy 0, policy_version 62160 (0.0009) [2023-10-10 07:14:57,495][53252] Updated weights for policy 0, policy_version 62170 (0.0008) [2023-10-10 07:15:00,057][53268] Updated weights for policy 1, policy_version 62120 (0.0009) [2023-10-10 07:15:00,444][53268] Updated weights for policy 1, policy_version 62130 (0.0009) [2023-10-10 07:15:00,818][53268] Updated weights for policy 1, policy_version 62140 (0.0009) [2023-10-10 07:15:01,707][53252] Updated weights for policy 0, policy_version 62180 (0.0010) [2023-10-10 07:15:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127303680. Throughput: 0: 1691.1, 1: 1700.0. Samples: 31830956. 
Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:01,784][52050] Avg episode reward: [(0, '22.250'), (1, '19.520')] [2023-10-10 07:15:02,078][53252] Updated weights for policy 0, policy_version 62190 (0.0010) [2023-10-10 07:15:02,456][53252] Updated weights for policy 0, policy_version 62200 (0.0009) [2023-10-10 07:15:04,713][53268] Updated weights for policy 1, policy_version 62150 (0.0009) [2023-10-10 07:15:05,081][53268] Updated weights for policy 1, policy_version 62160 (0.0009) [2023-10-10 07:15:05,453][53268] Updated weights for policy 1, policy_version 62170 (0.0007) [2023-10-10 07:15:06,576][53252] Updated weights for policy 0, policy_version 62210 (0.0010) [2023-10-10 07:15:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127369216. Throughput: 0: 1684.6, 1: 1685.4. Samples: 31851010. Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:06,784][52050] Avg episode reward: [(0, '22.690'), (1, '21.520')] [2023-10-10 07:15:06,941][53252] Updated weights for policy 0, policy_version 62220 (0.0007) [2023-10-10 07:15:07,307][53252] Updated weights for policy 0, policy_version 62230 (0.0007) [2023-10-10 07:15:07,684][53252] Updated weights for policy 0, policy_version 62240 (0.0010) [2023-10-10 07:15:09,347][53268] Updated weights for policy 1, policy_version 62180 (0.0008) [2023-10-10 07:15:09,712][53268] Updated weights for policy 1, policy_version 62190 (0.0008) [2023-10-10 07:15:10,077][53268] Updated weights for policy 1, policy_version 62200 (0.0008) [2023-10-10 07:15:11,669][53252] Updated weights for policy 0, policy_version 62250 (0.0010) [2023-10-10 07:15:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127434752. Throughput: 0: 1688.4, 1: 1677.8. Samples: 31871202. 
Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:11,784][52050] Avg episode reward: [(0, '20.420'), (1, '20.830')] [2023-10-10 07:15:12,039][53252] Updated weights for policy 0, policy_version 62260 (0.0012) [2023-10-10 07:15:12,419][53252] Updated weights for policy 0, policy_version 62270 (0.0009) [2023-10-10 07:15:14,118][53268] Updated weights for policy 1, policy_version 62210 (0.0011) [2023-10-10 07:15:14,489][53268] Updated weights for policy 1, policy_version 62220 (0.0009) [2023-10-10 07:15:14,859][53268] Updated weights for policy 1, policy_version 62230 (0.0009) [2023-10-10 07:15:15,227][53268] Updated weights for policy 1, policy_version 62240 (0.0008) [2023-10-10 07:15:16,295][53252] Updated weights for policy 0, policy_version 62280 (0.0008) [2023-10-10 07:15:16,660][53252] Updated weights for policy 0, policy_version 62290 (0.0009) [2023-10-10 07:15:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 127500288. Throughput: 0: 1691.8, 1: 1695.6. Samples: 31881686. Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:16,784][52050] Avg episode reward: [(0, '19.360'), (1, '20.670')] [2023-10-10 07:15:17,028][53252] Updated weights for policy 0, policy_version 62300 (0.0008) [2023-10-10 07:15:19,373][53268] Updated weights for policy 1, policy_version 62250 (0.0008) [2023-10-10 07:15:19,730][53268] Updated weights for policy 1, policy_version 62260 (0.0008) [2023-10-10 07:15:20,099][53268] Updated weights for policy 1, policy_version 62270 (0.0010) [2023-10-10 07:15:21,245][53252] Updated weights for policy 0, policy_version 62310 (0.0007) [2023-10-10 07:15:21,613][53252] Updated weights for policy 0, policy_version 62320 (0.0010) [2023-10-10 07:15:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127565824. Throughput: 0: 1693.4, 1: 1668.0. Samples: 31901272. 
Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:21,784][52050] Avg episode reward: [(0, '20.480'), (1, '20.210')] [2023-10-10 07:15:21,973][53252] Updated weights for policy 0, policy_version 62330 (0.0008) [2023-10-10 07:15:24,184][53268] Updated weights for policy 1, policy_version 62280 (0.0009) [2023-10-10 07:15:24,557][53268] Updated weights for policy 1, policy_version 62290 (0.0009) [2023-10-10 07:15:24,918][53268] Updated weights for policy 1, policy_version 62300 (0.0009) [2023-10-10 07:15:26,099][53252] Updated weights for policy 0, policy_version 62340 (0.0008) [2023-10-10 07:15:26,466][53252] Updated weights for policy 0, policy_version 62350 (0.0007) [2023-10-10 07:15:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127631360. Throughput: 0: 1680.7, 1: 1682.0. Samples: 31921378. Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:26,784][52050] Avg episode reward: [(0, '21.000'), (1, '20.280')] [2023-10-10 07:15:26,837][53252] Updated weights for policy 0, policy_version 62360 (0.0007) [2023-10-10 07:15:29,082][53268] Updated weights for policy 1, policy_version 62310 (0.0007) [2023-10-10 07:15:29,447][53268] Updated weights for policy 1, policy_version 62320 (0.0007) [2023-10-10 07:15:29,811][53268] Updated weights for policy 1, policy_version 62330 (0.0009) [2023-10-10 07:15:30,986][53252] Updated weights for policy 0, policy_version 62370 (0.0009) [2023-10-10 07:15:31,381][53252] Updated weights for policy 0, policy_version 62380 (0.0008) [2023-10-10 07:15:31,747][53252] Updated weights for policy 0, policy_version 62390 (0.0007) [2023-10-10 07:15:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127696896. Throughput: 0: 1689.8, 1: 1684.4. Samples: 31931820. 
Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:31,784][52050] Avg episode reward: [(0, '20.730'), (1, '20.210')] [2023-10-10 07:15:32,115][53252] Updated weights for policy 0, policy_version 62400 (0.0008) [2023-10-10 07:15:33,791][53268] Updated weights for policy 1, policy_version 62340 (0.0009) [2023-10-10 07:15:34,155][53268] Updated weights for policy 1, policy_version 62350 (0.0010) [2023-10-10 07:15:34,530][53268] Updated weights for policy 1, policy_version 62360 (0.0008) [2023-10-10 07:15:36,280][53252] Updated weights for policy 0, policy_version 62410 (0.0008) [2023-10-10 07:15:36,653][53252] Updated weights for policy 0, policy_version 62420 (0.0009) [2023-10-10 07:15:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127762432. Throughput: 0: 1686.1, 1: 1664.6. Samples: 31951446. Policy #0 lag: (min: 23.0, avg: 30.8, max: 55.0) [2023-10-10 07:15:36,784][52050] Avg episode reward: [(0, '21.350'), (1, '18.980')] [2023-10-10 07:15:37,020][53252] Updated weights for policy 0, policy_version 62430 (0.0009) [2023-10-10 07:15:38,657][53268] Updated weights for policy 1, policy_version 62370 (0.0009) [2023-10-10 07:15:39,027][53268] Updated weights for policy 1, policy_version 62380 (0.0009) [2023-10-10 07:15:39,389][53268] Updated weights for policy 1, policy_version 62390 (0.0007) [2023-10-10 07:15:39,761][53268] Updated weights for policy 1, policy_version 62400 (0.0010) [2023-10-10 07:15:41,058][53252] Updated weights for policy 0, policy_version 62440 (0.0008) [2023-10-10 07:15:41,435][53252] Updated weights for policy 0, policy_version 62450 (0.0009) [2023-10-10 07:15:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127827968. Throughput: 0: 1670.5, 1: 1685.0. Samples: 31971544. 
Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:15:41,784][52050] Avg episode reward: [(0, '21.620'), (1, '19.860')] [2023-10-10 07:15:41,810][53252] Updated weights for policy 0, policy_version 62460 (0.0009) [2023-10-10 07:15:43,764][53268] Updated weights for policy 1, policy_version 62410 (0.0009) [2023-10-10 07:15:44,137][53268] Updated weights for policy 1, policy_version 62420 (0.0007) [2023-10-10 07:15:44,496][53268] Updated weights for policy 1, policy_version 62430 (0.0008) [2023-10-10 07:15:45,850][53252] Updated weights for policy 0, policy_version 62470 (0.0008) [2023-10-10 07:15:46,216][53252] Updated weights for policy 0, policy_version 62480 (0.0007) [2023-10-10 07:15:46,588][53252] Updated weights for policy 0, policy_version 62490 (0.0007) [2023-10-10 07:15:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127893504. Throughput: 0: 1685.2, 1: 1672.5. Samples: 31982052. Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:15:46,784][52050] Avg episode reward: [(0, '20.160'), (1, '19.720')] [2023-10-10 07:15:48,665][53268] Updated weights for policy 1, policy_version 62440 (0.0010) [2023-10-10 07:15:49,027][53268] Updated weights for policy 1, policy_version 62450 (0.0011) [2023-10-10 07:15:49,401][53268] Updated weights for policy 1, policy_version 62460 (0.0010) [2023-10-10 07:15:50,743][53252] Updated weights for policy 0, policy_version 62500 (0.0007) [2023-10-10 07:15:51,124][53252] Updated weights for policy 0, policy_version 62510 (0.0008) [2023-10-10 07:15:51,490][53252] Updated weights for policy 0, policy_version 62520 (0.0009) [2023-10-10 07:15:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127959040. Throughput: 0: 1683.2, 1: 1676.2. Samples: 32002186. 
Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:15:51,784][52050] Avg episode reward: [(0, '22.160'), (1, '19.150')] [2023-10-10 07:15:53,424][53268] Updated weights for policy 1, policy_version 62470 (0.0011) [2023-10-10 07:15:53,790][53268] Updated weights for policy 1, policy_version 62480 (0.0011) [2023-10-10 07:15:54,166][53268] Updated weights for policy 1, policy_version 62490 (0.0008) [2023-10-10 07:15:55,306][53252] Updated weights for policy 0, policy_version 62530 (0.0010) [2023-10-10 07:15:55,675][53252] Updated weights for policy 0, policy_version 62540 (0.0010) [2023-10-10 07:15:56,050][53252] Updated weights for policy 0, policy_version 62550 (0.0007) [2023-10-10 07:15:56,418][53252] Updated weights for policy 0, policy_version 62560 (0.0008) [2023-10-10 07:15:56,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128057344. Throughput: 0: 1657.3, 1: 1689.5. Samples: 32021808. Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:15:56,784][52050] Avg episode reward: [(0, '21.630'), (1, '20.030')] [2023-10-10 07:15:58,225][53268] Updated weights for policy 1, policy_version 62500 (0.0008) [2023-10-10 07:15:58,594][53268] Updated weights for policy 1, policy_version 62510 (0.0009) [2023-10-10 07:15:58,964][53268] Updated weights for policy 1, policy_version 62520 (0.0009) [2023-10-10 07:16:00,466][53252] Updated weights for policy 0, policy_version 62570 (0.0009) [2023-10-10 07:16:00,837][53252] Updated weights for policy 0, policy_version 62580 (0.0007) [2023-10-10 07:16:01,203][53252] Updated weights for policy 0, policy_version 62590 (0.0007) [2023-10-10 07:16:01,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 128122880. Throughput: 0: 1682.5, 1: 1668.0. Samples: 32032458. 
Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:16:01,784][52050] Avg episode reward: [(0, '21.770'), (1, '20.720')] [2023-10-10 07:16:02,961][53268] Updated weights for policy 1, policy_version 62530 (0.0008) [2023-10-10 07:16:03,330][53268] Updated weights for policy 1, policy_version 62540 (0.0010) [2023-10-10 07:16:03,702][53268] Updated weights for policy 1, policy_version 62550 (0.0007) [2023-10-10 07:16:04,074][53268] Updated weights for policy 1, policy_version 62560 (0.0007) [2023-10-10 07:16:05,266][53252] Updated weights for policy 0, policy_version 62600 (0.0009) [2023-10-10 07:16:05,649][53252] Updated weights for policy 0, policy_version 62610 (0.0009) [2023-10-10 07:16:06,020][53252] Updated weights for policy 0, policy_version 62620 (0.0008) [2023-10-10 07:16:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 128188416. Throughput: 0: 1671.1, 1: 1687.1. Samples: 32052390. Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:16:06,784][52050] Avg episode reward: [(0, '22.030'), (1, '20.010')] [2023-10-10 07:16:08,202][53268] Updated weights for policy 1, policy_version 62570 (0.0010) [2023-10-10 07:16:08,577][53268] Updated weights for policy 1, policy_version 62580 (0.0012) [2023-10-10 07:16:08,940][53268] Updated weights for policy 1, policy_version 62590 (0.0010) [2023-10-10 07:16:10,096][53252] Updated weights for policy 0, policy_version 62630 (0.0010) [2023-10-10 07:16:10,477][53252] Updated weights for policy 0, policy_version 62640 (0.0010) [2023-10-10 07:16:10,845][53252] Updated weights for policy 0, policy_version 62650 (0.0007) [2023-10-10 07:16:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128253952. Throughput: 0: 1658.6, 1: 1690.1. Samples: 32072070. 
Policy #0 lag: (min: 8.0, avg: 35.2, max: 40.0) [2023-10-10 07:16:11,784][52050] Avg episode reward: [(0, '20.560'), (1, '19.800')] [2023-10-10 07:16:12,921][53268] Updated weights for policy 1, policy_version 62600 (0.0010) [2023-10-10 07:16:13,290][53268] Updated weights for policy 1, policy_version 62610 (0.0009) [2023-10-10 07:16:13,667][53268] Updated weights for policy 1, policy_version 62620 (0.0011) [2023-10-10 07:16:14,831][53252] Updated weights for policy 0, policy_version 62660 (0.0010) [2023-10-10 07:16:15,198][53252] Updated weights for policy 0, policy_version 62670 (0.0009) [2023-10-10 07:16:15,574][53252] Updated weights for policy 0, policy_version 62680 (0.0010) [2023-10-10 07:16:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 128319488. Throughput: 0: 1680.8, 1: 1666.6. Samples: 32082456. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:16,784][52050] Avg episode reward: [(0, '21.370'), (1, '21.190')] [2023-10-10 07:16:17,805][53268] Updated weights for policy 1, policy_version 62630 (0.0010) [2023-10-10 07:16:18,172][53268] Updated weights for policy 1, policy_version 62640 (0.0009) [2023-10-10 07:16:18,547][53268] Updated weights for policy 1, policy_version 62650 (0.0008) [2023-10-10 07:16:19,623][53252] Updated weights for policy 0, policy_version 62690 (0.0007) [2023-10-10 07:16:20,028][53252] Updated weights for policy 0, policy_version 62700 (0.0007) [2023-10-10 07:16:20,389][53252] Updated weights for policy 0, policy_version 62710 (0.0009) [2023-10-10 07:16:20,761][53252] Updated weights for policy 0, policy_version 62720 (0.0009) [2023-10-10 07:16:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128385024. Throughput: 0: 1666.8, 1: 1689.2. Samples: 32102470. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:21,784][52050] Avg episode reward: [(0, '21.890'), (1, '20.790')] [2023-10-10 07:16:22,575][53268] Updated weights for policy 1, policy_version 62660 (0.0007) [2023-10-10 07:16:22,954][53268] Updated weights for policy 1, policy_version 62670 (0.0008) [2023-10-10 07:16:23,311][53268] Updated weights for policy 1, policy_version 62680 (0.0009) [2023-10-10 07:16:24,868][53252] Updated weights for policy 0, policy_version 62730 (0.0010) [2023-10-10 07:16:25,238][53252] Updated weights for policy 0, policy_version 62740 (0.0010) [2023-10-10 07:16:25,603][53252] Updated weights for policy 0, policy_version 62750 (0.0009) [2023-10-10 07:16:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 128450560. Throughput: 0: 1673.5, 1: 1695.2. Samples: 32123138. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:26,784][52050] Avg episode reward: [(0, '22.870'), (1, '21.020')] [2023-10-10 07:16:27,275][53268] Updated weights for policy 1, policy_version 62690 (0.0009) [2023-10-10 07:16:27,643][53268] Updated weights for policy 1, policy_version 62700 (0.0008) [2023-10-10 07:16:28,019][53268] Updated weights for policy 1, policy_version 62710 (0.0008) [2023-10-10 07:16:28,374][53268] Updated weights for policy 1, policy_version 62720 (0.0010) [2023-10-10 07:16:29,644][53252] Updated weights for policy 0, policy_version 62760 (0.0008) [2023-10-10 07:16:30,012][53252] Updated weights for policy 0, policy_version 62770 (0.0009) [2023-10-10 07:16:30,388][53252] Updated weights for policy 0, policy_version 62780 (0.0010) [2023-10-10 07:16:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 128516096. Throughput: 0: 1686.8, 1: 1680.4. Samples: 32133580. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:31,784][52050] Avg episode reward: [(0, '21.410'), (1, '21.970')] [2023-10-10 07:16:32,421][53268] Updated weights for policy 1, policy_version 62730 (0.0008) [2023-10-10 07:16:32,790][53268] Updated weights for policy 1, policy_version 62740 (0.0008) [2023-10-10 07:16:33,158][53268] Updated weights for policy 1, policy_version 62750 (0.0007) [2023-10-10 07:16:34,413][53252] Updated weights for policy 0, policy_version 62790 (0.0010) [2023-10-10 07:16:34,785][53252] Updated weights for policy 0, policy_version 62800 (0.0009) [2023-10-10 07:16:35,154][53252] Updated weights for policy 0, policy_version 62810 (0.0008) [2023-10-10 07:16:36,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.2, 300 sec: 13551.5). Total num frames: 128581632. Throughput: 0: 1663.3, 1: 1688.9. Samples: 32153034. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:36,784][52050] Avg episode reward: [(0, '19.300'), (1, '21.970')] [2023-10-10 07:16:37,290][53268] Updated weights for policy 1, policy_version 62760 (0.0010) [2023-10-10 07:16:37,658][53268] Updated weights for policy 1, policy_version 62770 (0.0009) [2023-10-10 07:16:38,025][53268] Updated weights for policy 1, policy_version 62780 (0.0008) [2023-10-10 07:16:39,215][53252] Updated weights for policy 0, policy_version 62820 (0.0007) [2023-10-10 07:16:39,587][53252] Updated weights for policy 0, policy_version 62830 (0.0007) [2023-10-10 07:16:39,952][53252] Updated weights for policy 0, policy_version 62840 (0.0008) [2023-10-10 07:16:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 128647168. Throughput: 0: 1692.6, 1: 1692.9. Samples: 32174154. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:41,784][52050] Avg episode reward: [(0, '20.380'), (1, '20.200')] [2023-10-10 07:16:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000062848_64356352.pth... 
[2023-10-10 07:16:41,822][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000061280_62750720.pth [2023-10-10 07:16:41,826][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000062848_64356352.pth [2023-10-10 07:16:41,966][53268] Updated weights for policy 1, policy_version 62790 (0.0007) [2023-10-10 07:16:42,339][53268] Updated weights for policy 1, policy_version 62800 (0.0008) [2023-10-10 07:16:42,709][53268] Updated weights for policy 1, policy_version 62810 (0.0009) [2023-10-10 07:16:42,932][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000062816_64323584.pth... [2023-10-10 07:16:42,968][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000061216_62685184.pth [2023-10-10 07:16:42,972][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000062816_64323584.pth [2023-10-10 07:16:43,941][53252] Updated weights for policy 0, policy_version 62850 (0.0009) [2023-10-10 07:16:44,304][53252] Updated weights for policy 0, policy_version 62860 (0.0010) [2023-10-10 07:16:44,675][53252] Updated weights for policy 0, policy_version 62870 (0.0008) [2023-10-10 07:16:45,049][53252] Updated weights for policy 0, policy_version 62880 (0.0010) [2023-10-10 07:16:46,772][53268] Updated weights for policy 1, policy_version 62820 (0.0008) [2023-10-10 07:16:46,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 128712704. Throughput: 0: 1681.0, 1: 1685.9. Samples: 32183966. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:46,784][52050] Avg episode reward: [(0, '20.340'), (1, '21.960')] [2023-10-10 07:16:47,140][53268] Updated weights for policy 1, policy_version 62830 (0.0008) [2023-10-10 07:16:47,514][53268] Updated weights for policy 1, policy_version 62840 (0.0009) [2023-10-10 07:16:49,147][53252] Updated weights for policy 0, policy_version 62890 (0.0008) [2023-10-10 07:16:49,520][53252] Updated weights for policy 0, policy_version 62900 (0.0007) [2023-10-10 07:16:49,880][53252] Updated weights for policy 0, policy_version 62910 (0.0008) [2023-10-10 07:16:51,517][53268] Updated weights for policy 1, policy_version 62850 (0.0010) [2023-10-10 07:16:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 128778240. Throughput: 0: 1680.4, 1: 1695.4. Samples: 32204300. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) [2023-10-10 07:16:51,784][52050] Avg episode reward: [(0, '20.910'), (1, '19.930')] [2023-10-10 07:16:51,882][53268] Updated weights for policy 1, policy_version 62860 (0.0008) [2023-10-10 07:16:52,248][53268] Updated weights for policy 1, policy_version 62870 (0.0009) [2023-10-10 07:16:52,606][53268] Updated weights for policy 1, policy_version 62880 (0.0009) [2023-10-10 07:16:53,820][53252] Updated weights for policy 0, policy_version 62920 (0.0010) [2023-10-10 07:16:54,197][53252] Updated weights for policy 0, policy_version 62930 (0.0008) [2023-10-10 07:16:54,582][53252] Updated weights for policy 0, policy_version 62940 (0.0008) [2023-10-10 07:16:56,706][53268] Updated weights for policy 1, policy_version 62890 (0.0011) [2023-10-10 07:16:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128843776. Throughput: 0: 1704.9, 1: 1691.1. Samples: 32224890. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:16:56,784][52050] Avg episode reward: [(0, '21.720'), (1, '18.800')] [2023-10-10 07:16:57,077][53268] Updated weights for policy 1, policy_version 62900 (0.0008) [2023-10-10 07:16:57,449][53268] Updated weights for policy 1, policy_version 62910 (0.0008) [2023-10-10 07:16:58,609][53252] Updated weights for policy 0, policy_version 62950 (0.0010) [2023-10-10 07:16:58,976][53252] Updated weights for policy 0, policy_version 62960 (0.0007) [2023-10-10 07:16:59,357][53252] Updated weights for policy 0, policy_version 62970 (0.0009) [2023-10-10 07:17:01,484][53268] Updated weights for policy 1, policy_version 62920 (0.0011) [2023-10-10 07:17:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128909312. Throughput: 0: 1684.4, 1: 1693.0. Samples: 32234438. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:01,784][52050] Avg episode reward: [(0, '22.320'), (1, '20.260')] [2023-10-10 07:17:01,853][53268] Updated weights for policy 1, policy_version 62930 (0.0011) [2023-10-10 07:17:02,221][53268] Updated weights for policy 1, policy_version 62940 (0.0009) [2023-10-10 07:17:03,395][53252] Updated weights for policy 0, policy_version 62980 (0.0008) [2023-10-10 07:17:03,772][53252] Updated weights for policy 0, policy_version 62990 (0.0009) [2023-10-10 07:17:04,147][53252] Updated weights for policy 0, policy_version 63000 (0.0010) [2023-10-10 07:17:06,260][53268] Updated weights for policy 1, policy_version 62950 (0.0008) [2023-10-10 07:17:06,625][53268] Updated weights for policy 1, policy_version 62960 (0.0009) [2023-10-10 07:17:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128974848. Throughput: 0: 1695.1, 1: 1699.1. Samples: 32255208. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:06,785][52050] Avg episode reward: [(0, '22.190'), (1, '22.370')] [2023-10-10 07:17:06,998][53268] Updated weights for policy 1, policy_version 62970 (0.0010) [2023-10-10 07:17:08,130][53252] Updated weights for policy 0, policy_version 63010 (0.0008) [2023-10-10 07:17:08,507][53252] Updated weights for policy 0, policy_version 63020 (0.0007) [2023-10-10 07:17:08,887][53252] Updated weights for policy 0, policy_version 63030 (0.0009) [2023-10-10 07:17:09,249][53252] Updated weights for policy 0, policy_version 63040 (0.0009) [2023-10-10 07:17:10,917][53268] Updated weights for policy 1, policy_version 62980 (0.0008) [2023-10-10 07:17:11,278][53268] Updated weights for policy 1, policy_version 62990 (0.0009) [2023-10-10 07:17:11,651][53268] Updated weights for policy 1, policy_version 63000 (0.0008) [2023-10-10 07:17:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129040384. Throughput: 0: 1706.0, 1: 1686.1. Samples: 32275784. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:11,784][52050] Avg episode reward: [(0, '22.200'), (1, '19.900')] [2023-10-10 07:17:13,218][53252] Updated weights for policy 0, policy_version 63050 (0.0007) [2023-10-10 07:17:13,596][53252] Updated weights for policy 0, policy_version 63060 (0.0007) [2023-10-10 07:17:13,973][53252] Updated weights for policy 0, policy_version 63070 (0.0007) [2023-10-10 07:17:15,828][53268] Updated weights for policy 1, policy_version 63010 (0.0008) [2023-10-10 07:17:16,193][53268] Updated weights for policy 1, policy_version 63020 (0.0010) [2023-10-10 07:17:16,558][53268] Updated weights for policy 1, policy_version 63030 (0.0009) [2023-10-10 07:17:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129105920. Throughput: 0: 1681.8, 1: 1695.3. Samples: 32285548. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:16,785][52050] Avg episode reward: [(0, '21.590'), (1, '20.900')] [2023-10-10 07:17:16,933][53268] Updated weights for policy 1, policy_version 63040 (0.0007) [2023-10-10 07:17:17,888][53252] Updated weights for policy 0, policy_version 63080 (0.0007) [2023-10-10 07:17:18,263][53252] Updated weights for policy 0, policy_version 63090 (0.0007) [2023-10-10 07:17:18,626][53252] Updated weights for policy 0, policy_version 63100 (0.0009) [2023-10-10 07:17:21,052][53268] Updated weights for policy 1, policy_version 63050 (0.0007) [2023-10-10 07:17:21,415][53268] Updated weights for policy 1, policy_version 63060 (0.0007) [2023-10-10 07:17:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129171456. Throughput: 0: 1705.9, 1: 1696.6. Samples: 32306146. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:21,784][52050] Avg episode reward: [(0, '22.070'), (1, '20.830')] [2023-10-10 07:17:21,789][53268] Updated weights for policy 1, policy_version 63070 (0.0010) [2023-10-10 07:17:22,678][53252] Updated weights for policy 0, policy_version 63110 (0.0008) [2023-10-10 07:17:23,053][53252] Updated weights for policy 0, policy_version 63120 (0.0009) [2023-10-10 07:17:23,424][53252] Updated weights for policy 0, policy_version 63130 (0.0007) [2023-10-10 07:17:25,821][53268] Updated weights for policy 1, policy_version 63080 (0.0008) [2023-10-10 07:17:26,198][53268] Updated weights for policy 1, policy_version 63090 (0.0009) [2023-10-10 07:17:26,576][53268] Updated weights for policy 1, policy_version 63100 (0.0010) [2023-10-10 07:17:26,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 129269760. Throughput: 0: 1704.2, 1: 1678.7. Samples: 32326384. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:26,784][52050] Avg episode reward: [(0, '22.980'), (1, '18.730')] [2023-10-10 07:17:27,565][53252] Updated weights for policy 0, policy_version 63140 (0.0009) [2023-10-10 07:17:27,929][53252] Updated weights for policy 0, policy_version 63150 (0.0007) [2023-10-10 07:17:28,310][53252] Updated weights for policy 0, policy_version 63160 (0.0010) [2023-10-10 07:17:30,667][53268] Updated weights for policy 1, policy_version 63110 (0.0010) [2023-10-10 07:17:31,030][53268] Updated weights for policy 1, policy_version 63120 (0.0010) [2023-10-10 07:17:31,395][53268] Updated weights for policy 1, policy_version 63130 (0.0011) [2023-10-10 07:17:31,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 129335296. Throughput: 0: 1690.2, 1: 1694.7. Samples: 32336284. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:31,784][52050] Avg episode reward: [(0, '23.030'), (1, '18.980')] [2023-10-10 07:17:32,224][53252] Updated weights for policy 0, policy_version 63170 (0.0008) [2023-10-10 07:17:32,594][53252] Updated weights for policy 0, policy_version 63180 (0.0009) [2023-10-10 07:17:32,959][53252] Updated weights for policy 0, policy_version 63190 (0.0007) [2023-10-10 07:17:33,327][53252] Updated weights for policy 0, policy_version 63200 (0.0008) [2023-10-10 07:17:35,610][53268] Updated weights for policy 1, policy_version 63140 (0.0010) [2023-10-10 07:17:35,968][53268] Updated weights for policy 1, policy_version 63150 (0.0008) [2023-10-10 07:17:36,341][53268] Updated weights for policy 1, policy_version 63160 (0.0008) [2023-10-10 07:17:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 129400832. Throughput: 0: 1706.1, 1: 1686.6. Samples: 32356972. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-10 07:17:36,784][52050] Avg episode reward: [(0, '23.590'), (1, '20.110')] [2023-10-10 07:17:37,307][53252] Updated weights for policy 0, policy_version 63210 (0.0007) [2023-10-10 07:17:37,677][53252] Updated weights for policy 0, policy_version 63220 (0.0008) [2023-10-10 07:17:38,048][53252] Updated weights for policy 0, policy_version 63230 (0.0008) [2023-10-10 07:17:40,431][53268] Updated weights for policy 1, policy_version 63170 (0.0010) [2023-10-10 07:17:40,802][53268] Updated weights for policy 1, policy_version 63180 (0.0008) [2023-10-10 07:17:41,165][53268] Updated weights for policy 1, policy_version 63190 (0.0007) [2023-10-10 07:17:41,529][53268] Updated weights for policy 1, policy_version 63200 (0.0008) [2023-10-10 07:17:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 129466368. Throughput: 0: 1706.6, 1: 1672.7. Samples: 32376956. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:17:41,785][52050] Avg episode reward: [(0, '23.740'), (1, '18.950')] [2023-10-10 07:17:42,127][53252] Updated weights for policy 0, policy_version 63240 (0.0009) [2023-10-10 07:17:42,512][53252] Updated weights for policy 0, policy_version 63250 (0.0009) [2023-10-10 07:17:42,876][53252] Updated weights for policy 0, policy_version 63260 (0.0009) [2023-10-10 07:17:45,526][53268] Updated weights for policy 1, policy_version 63210 (0.0008) [2023-10-10 07:17:45,892][53268] Updated weights for policy 1, policy_version 63220 (0.0008) [2023-10-10 07:17:46,254][53268] Updated weights for policy 1, policy_version 63230 (0.0009) [2023-10-10 07:17:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 129531904. Throughput: 0: 1699.4, 1: 1689.1. Samples: 32386922. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:17:46,785][52050] Avg episode reward: [(0, '23.920'), (1, '18.970')] [2023-10-10 07:17:46,964][53252] Updated weights for policy 0, policy_version 63270 (0.0008) [2023-10-10 07:17:47,334][53252] Updated weights for policy 0, policy_version 63280 (0.0009) [2023-10-10 07:17:47,709][53252] Updated weights for policy 0, policy_version 63290 (0.0009) [2023-10-10 07:17:50,385][53268] Updated weights for policy 1, policy_version 63240 (0.0009) [2023-10-10 07:17:50,748][53268] Updated weights for policy 1, policy_version 63250 (0.0011) [2023-10-10 07:17:51,125][53268] Updated weights for policy 1, policy_version 63260 (0.0008) [2023-10-10 07:17:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 129597440. Throughput: 0: 1699.8, 1: 1682.7. Samples: 32407420. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:17:51,784][52050] Avg episode reward: [(0, '24.050'), (1, '20.840')] [2023-10-10 07:17:51,873][53252] Updated weights for policy 0, policy_version 63300 (0.0010) [2023-10-10 07:17:52,232][53252] Updated weights for policy 0, policy_version 63310 (0.0011) [2023-10-10 07:17:52,609][53252] Updated weights for policy 0, policy_version 63320 (0.0008) [2023-10-10 07:17:55,192][53268] Updated weights for policy 1, policy_version 63270 (0.0009) [2023-10-10 07:17:55,564][53268] Updated weights for policy 1, policy_version 63280 (0.0009) [2023-10-10 07:17:55,936][53268] Updated weights for policy 1, policy_version 63290 (0.0010) [2023-10-10 07:17:56,580][53252] Updated weights for policy 0, policy_version 63330 (0.0007) [2023-10-10 07:17:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 129662976. Throughput: 0: 1698.0, 1: 1664.8. Samples: 32427112. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:17:56,784][52050] Avg episode reward: [(0, '22.440'), (1, '19.790')] [2023-10-10 07:17:56,980][53252] Updated weights for policy 0, policy_version 63340 (0.0008) [2023-10-10 07:17:57,351][53252] Updated weights for policy 0, policy_version 63350 (0.0007) [2023-10-10 07:17:57,718][53252] Updated weights for policy 0, policy_version 63360 (0.0009) [2023-10-10 07:17:59,923][53268] Updated weights for policy 1, policy_version 63300 (0.0008) [2023-10-10 07:18:00,287][53268] Updated weights for policy 1, policy_version 63310 (0.0008) [2023-10-10 07:18:00,648][53268] Updated weights for policy 1, policy_version 63320 (0.0010) [2023-10-10 07:18:01,709][53252] Updated weights for policy 0, policy_version 63370 (0.0007) [2023-10-10 07:18:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 129728512. Throughput: 0: 1691.4, 1: 1682.3. Samples: 32437364. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:18:01,784][52050] Avg episode reward: [(0, '22.400'), (1, '20.010')] [2023-10-10 07:18:02,076][53252] Updated weights for policy 0, policy_version 63380 (0.0007) [2023-10-10 07:18:02,444][53252] Updated weights for policy 0, policy_version 63390 (0.0008) [2023-10-10 07:18:04,753][53268] Updated weights for policy 1, policy_version 63330 (0.0011) [2023-10-10 07:18:05,122][53268] Updated weights for policy 1, policy_version 63340 (0.0010) [2023-10-10 07:18:05,482][53268] Updated weights for policy 1, policy_version 63350 (0.0009) [2023-10-10 07:18:05,853][53268] Updated weights for policy 1, policy_version 63360 (0.0009) [2023-10-10 07:18:06,715][53252] Updated weights for policy 0, policy_version 63400 (0.0007) [2023-10-10 07:18:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 129794048. Throughput: 0: 1690.6, 1: 1671.2. Samples: 32457428. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:18:06,784][52050] Avg episode reward: [(0, '21.930'), (1, '20.020')] [2023-10-10 07:18:07,082][53252] Updated weights for policy 0, policy_version 63410 (0.0007) [2023-10-10 07:18:07,453][53252] Updated weights for policy 0, policy_version 63420 (0.0008) [2023-10-10 07:18:10,012][53268] Updated weights for policy 1, policy_version 63370 (0.0011) [2023-10-10 07:18:10,373][53268] Updated weights for policy 1, policy_version 63380 (0.0011) [2023-10-10 07:18:10,757][53268] Updated weights for policy 1, policy_version 63390 (0.0012) [2023-10-10 07:18:11,427][53252] Updated weights for policy 0, policy_version 63430 (0.0008) [2023-10-10 07:18:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 129859584. Throughput: 0: 1684.0, 1: 1664.8. Samples: 32477082. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:18:11,784][52050] Avg episode reward: [(0, '21.130'), (1, '19.720')] [2023-10-10 07:18:11,797][53252] Updated weights for policy 0, policy_version 63440 (0.0007) [2023-10-10 07:18:12,172][53252] Updated weights for policy 0, policy_version 63450 (0.0010) [2023-10-10 07:18:14,725][53268] Updated weights for policy 1, policy_version 63400 (0.0010) [2023-10-10 07:18:15,104][53268] Updated weights for policy 1, policy_version 63410 (0.0009) [2023-10-10 07:18:15,467][53268] Updated weights for policy 1, policy_version 63420 (0.0009) [2023-10-10 07:18:16,082][53252] Updated weights for policy 0, policy_version 63460 (0.0009) [2023-10-10 07:18:16,448][53252] Updated weights for policy 0, policy_version 63470 (0.0011) [2023-10-10 07:18:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 129925120. Throughput: 0: 1685.6, 1: 1681.4. Samples: 32487798. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:18:16,784][52050] Avg episode reward: [(0, '21.220'), (1, '19.860')] [2023-10-10 07:18:16,822][53252] Updated weights for policy 0, policy_version 63480 (0.0008) [2023-10-10 07:18:19,541][53268] Updated weights for policy 1, policy_version 63430 (0.0009) [2023-10-10 07:18:19,908][53268] Updated weights for policy 1, policy_version 63440 (0.0011) [2023-10-10 07:18:20,282][53268] Updated weights for policy 1, policy_version 63450 (0.0009) [2023-10-10 07:18:20,862][53252] Updated weights for policy 0, policy_version 63490 (0.0009) [2023-10-10 07:18:21,234][53252] Updated weights for policy 0, policy_version 63500 (0.0010) [2023-10-10 07:18:21,608][53252] Updated weights for policy 0, policy_version 63510 (0.0010) [2023-10-10 07:18:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 129990656. Throughput: 0: 1684.3, 1: 1664.5. Samples: 32507664. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-10 07:18:21,784][52050] Avg episode reward: [(0, '22.910'), (1, '20.200')] [2023-10-10 07:18:21,979][53252] Updated weights for policy 0, policy_version 63520 (0.0009) [2023-10-10 07:18:24,325][53268] Updated weights for policy 1, policy_version 63460 (0.0008) [2023-10-10 07:18:24,686][53268] Updated weights for policy 1, policy_version 63470 (0.0010) [2023-10-10 07:18:25,054][53268] Updated weights for policy 1, policy_version 63480 (0.0009) [2023-10-10 07:18:26,088][53252] Updated weights for policy 0, policy_version 63530 (0.0009) [2023-10-10 07:18:26,459][53252] Updated weights for policy 0, policy_version 63540 (0.0007) [2023-10-10 07:18:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130056192. Throughput: 0: 1667.5, 1: 1674.5. Samples: 32527342. 
Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:26,784][52050] Avg episode reward: [(0, '22.000'), (1, '20.110')] [2023-10-10 07:18:26,830][53252] Updated weights for policy 0, policy_version 63550 (0.0008) [2023-10-10 07:18:28,882][53268] Updated weights for policy 1, policy_version 63490 (0.0008) [2023-10-10 07:18:29,258][53268] Updated weights for policy 1, policy_version 63500 (0.0009) [2023-10-10 07:18:29,633][53268] Updated weights for policy 1, policy_version 63510 (0.0010) [2023-10-10 07:18:29,992][53268] Updated weights for policy 1, policy_version 63520 (0.0009) [2023-10-10 07:18:30,771][53252] Updated weights for policy 0, policy_version 63560 (0.0007) [2023-10-10 07:18:31,142][53252] Updated weights for policy 0, policy_version 63570 (0.0008) [2023-10-10 07:18:31,513][53252] Updated weights for policy 0, policy_version 63580 (0.0007) [2023-10-10 07:18:31,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130154496. Throughput: 0: 1682.9, 1: 1677.1. Samples: 32538124. Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:31,784][52050] Avg episode reward: [(0, '23.130'), (1, '20.710')] [2023-10-10 07:18:34,139][53268] Updated weights for policy 1, policy_version 63530 (0.0011) [2023-10-10 07:18:34,505][53268] Updated weights for policy 1, policy_version 63540 (0.0009) [2023-10-10 07:18:34,886][53268] Updated weights for policy 1, policy_version 63550 (0.0008) [2023-10-10 07:18:35,685][53252] Updated weights for policy 0, policy_version 63590 (0.0008) [2023-10-10 07:18:36,061][53252] Updated weights for policy 0, policy_version 63600 (0.0009) [2023-10-10 07:18:36,429][53252] Updated weights for policy 0, policy_version 63610 (0.0007) [2023-10-10 07:18:36,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130220032. Throughput: 0: 1686.2, 1: 1655.4. Samples: 32557792. 
Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:36,785][52050] Avg episode reward: [(0, '21.450'), (1, '20.350')] [2023-10-10 07:18:39,115][53268] Updated weights for policy 1, policy_version 63560 (0.0012) [2023-10-10 07:18:39,479][53268] Updated weights for policy 1, policy_version 63570 (0.0008) [2023-10-10 07:18:39,841][53268] Updated weights for policy 1, policy_version 63580 (0.0011) [2023-10-10 07:18:40,460][53252] Updated weights for policy 0, policy_version 63620 (0.0007) [2023-10-10 07:18:40,829][53252] Updated weights for policy 0, policy_version 63630 (0.0007) [2023-10-10 07:18:41,206][53252] Updated weights for policy 0, policy_version 63640 (0.0007) [2023-10-10 07:18:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 130285568. Throughput: 0: 1658.3, 1: 1679.3. Samples: 32577304. Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:41,784][52050] Avg episode reward: [(0, '21.010'), (1, '19.850')] [2023-10-10 07:18:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000063584_65110016.pth... [2023-10-10 07:18:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000063648_65175552.pth... 
[2023-10-10 07:18:41,828][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000062016_63504384.pth [2023-10-10 07:18:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000062048_63537152.pth [2023-10-10 07:18:44,014][53268] Updated weights for policy 1, policy_version 63590 (0.0009) [2023-10-10 07:18:44,371][53268] Updated weights for policy 1, policy_version 63600 (0.0009) [2023-10-10 07:18:44,739][53268] Updated weights for policy 1, policy_version 63610 (0.0009) [2023-10-10 07:18:45,311][53252] Updated weights for policy 0, policy_version 63650 (0.0008) [2023-10-10 07:18:45,717][53252] Updated weights for policy 0, policy_version 63660 (0.0008) [2023-10-10 07:18:46,087][53252] Updated weights for policy 0, policy_version 63670 (0.0007) [2023-10-10 07:18:46,466][53252] Updated weights for policy 0, policy_version 63680 (0.0007) [2023-10-10 07:18:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 130351104. Throughput: 0: 1687.1, 1: 1671.3. Samples: 32588494. Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:46,784][52050] Avg episode reward: [(0, '20.770'), (1, '21.240')] [2023-10-10 07:18:48,884][53268] Updated weights for policy 1, policy_version 63620 (0.0008) [2023-10-10 07:18:49,237][53268] Updated weights for policy 1, policy_version 63630 (0.0009) [2023-10-10 07:18:49,607][53268] Updated weights for policy 1, policy_version 63640 (0.0010) [2023-10-10 07:18:50,707][53252] Updated weights for policy 0, policy_version 63690 (0.0008) [2023-10-10 07:18:51,085][53252] Updated weights for policy 0, policy_version 63700 (0.0008) [2023-10-10 07:18:51,461][53252] Updated weights for policy 0, policy_version 63710 (0.0007) [2023-10-10 07:18:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130416640. Throughput: 0: 1679.8, 1: 1664.4. Samples: 32607916. 
Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:51,784][52050] Avg episode reward: [(0, '21.710'), (1, '23.010')] [2023-10-10 07:18:53,807][53268] Updated weights for policy 1, policy_version 63650 (0.0008) [2023-10-10 07:18:54,178][53268] Updated weights for policy 1, policy_version 63660 (0.0010) [2023-10-10 07:18:54,556][53268] Updated weights for policy 1, policy_version 63670 (0.0008) [2023-10-10 07:18:54,918][53268] Updated weights for policy 1, policy_version 63680 (0.0007) [2023-10-10 07:18:55,495][53252] Updated weights for policy 0, policy_version 63720 (0.0010) [2023-10-10 07:18:55,865][53252] Updated weights for policy 0, policy_version 63730 (0.0010) [2023-10-10 07:18:56,240][53252] Updated weights for policy 0, policy_version 63740 (0.0010) [2023-10-10 07:18:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 130482176. Throughput: 0: 1655.6, 1: 1683.7. Samples: 32627348. Policy #0 lag: (min: 0.0, avg: 25.4, max: 32.0) [2023-10-10 07:18:56,784][52050] Avg episode reward: [(0, '21.000'), (1, '21.780')] [2023-10-10 07:18:58,679][53268] Updated weights for policy 1, policy_version 63690 (0.0009) [2023-10-10 07:18:59,035][53268] Updated weights for policy 1, policy_version 63700 (0.0009) [2023-10-10 07:18:59,397][53268] Updated weights for policy 1, policy_version 63710 (0.0007) [2023-10-10 07:19:00,186][53252] Updated weights for policy 0, policy_version 63750 (0.0007) [2023-10-10 07:19:00,561][53252] Updated weights for policy 0, policy_version 63760 (0.0011) [2023-10-10 07:19:00,927][53252] Updated weights for policy 0, policy_version 63770 (0.0007) [2023-10-10 07:19:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130547712. Throughput: 0: 1685.1, 1: 1660.9. Samples: 32638368. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:01,784][52050] Avg episode reward: [(0, '22.050'), (1, '21.510')] [2023-10-10 07:19:03,447][53268] Updated weights for policy 1, policy_version 63720 (0.0008) [2023-10-10 07:19:03,802][53268] Updated weights for policy 1, policy_version 63730 (0.0009) [2023-10-10 07:19:04,171][53268] Updated weights for policy 1, policy_version 63740 (0.0007) [2023-10-10 07:19:04,986][53252] Updated weights for policy 0, policy_version 63780 (0.0008) [2023-10-10 07:19:05,362][53252] Updated weights for policy 0, policy_version 63790 (0.0007) [2023-10-10 07:19:05,727][53252] Updated weights for policy 0, policy_version 63800 (0.0007) [2023-10-10 07:19:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130613248. Throughput: 0: 1672.7, 1: 1675.8. Samples: 32658346. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:06,784][52050] Avg episode reward: [(0, '21.280'), (1, '22.110')] [2023-10-10 07:19:08,502][53268] Updated weights for policy 1, policy_version 63750 (0.0008) [2023-10-10 07:19:08,862][53268] Updated weights for policy 1, policy_version 63760 (0.0009) [2023-10-10 07:19:09,238][53268] Updated weights for policy 1, policy_version 63770 (0.0009) [2023-10-10 07:19:09,531][53252] Updated weights for policy 0, policy_version 63810 (0.0008) [2023-10-10 07:19:09,902][53252] Updated weights for policy 0, policy_version 63820 (0.0007) [2023-10-10 07:19:10,266][53252] Updated weights for policy 0, policy_version 63830 (0.0010) [2023-10-10 07:19:10,632][53252] Updated weights for policy 0, policy_version 63840 (0.0010) [2023-10-10 07:19:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130678784. Throughput: 0: 1681.0, 1: 1680.7. Samples: 32678618. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:11,784][52050] Avg episode reward: [(0, '21.040'), (1, '20.240')] [2023-10-10 07:19:13,295][53268] Updated weights for policy 1, policy_version 63780 (0.0008) [2023-10-10 07:19:13,656][53268] Updated weights for policy 1, policy_version 63790 (0.0011) [2023-10-10 07:19:14,029][53268] Updated weights for policy 1, policy_version 63800 (0.0008) [2023-10-10 07:19:14,662][53252] Updated weights for policy 0, policy_version 63850 (0.0009) [2023-10-10 07:19:15,038][53252] Updated weights for policy 0, policy_version 63860 (0.0008) [2023-10-10 07:19:15,419][53252] Updated weights for policy 0, policy_version 63870 (0.0009) [2023-10-10 07:19:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130744320. Throughput: 0: 1691.3, 1: 1665.9. Samples: 32689200. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:16,784][52050] Avg episode reward: [(0, '20.370'), (1, '19.750')] [2023-10-10 07:19:18,208][53268] Updated weights for policy 1, policy_version 63810 (0.0009) [2023-10-10 07:19:18,583][53268] Updated weights for policy 1, policy_version 63820 (0.0009) [2023-10-10 07:19:18,948][53268] Updated weights for policy 1, policy_version 63830 (0.0008) [2023-10-10 07:19:19,302][53268] Updated weights for policy 1, policy_version 63840 (0.0007) [2023-10-10 07:19:19,494][53252] Updated weights for policy 0, policy_version 63880 (0.0008) [2023-10-10 07:19:19,859][53252] Updated weights for policy 0, policy_version 63890 (0.0009) [2023-10-10 07:19:20,226][53252] Updated weights for policy 0, policy_version 63900 (0.0009) [2023-10-10 07:19:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130809856. Throughput: 0: 1670.4, 1: 1684.5. Samples: 32708762. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:21,784][52050] Avg episode reward: [(0, '21.590'), (1, '21.550')] [2023-10-10 07:19:23,376][53268] Updated weights for policy 1, policy_version 63850 (0.0010) [2023-10-10 07:19:23,744][53268] Updated weights for policy 1, policy_version 63860 (0.0011) [2023-10-10 07:19:24,105][53252] Updated weights for policy 0, policy_version 63910 (0.0009) [2023-10-10 07:19:24,115][53268] Updated weights for policy 1, policy_version 63870 (0.0010) [2023-10-10 07:19:24,476][53252] Updated weights for policy 0, policy_version 63920 (0.0008) [2023-10-10 07:19:24,848][53252] Updated weights for policy 0, policy_version 63930 (0.0008) [2023-10-10 07:19:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 130875392. Throughput: 0: 1692.8, 1: 1688.8. Samples: 32729476. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:26,784][52050] Avg episode reward: [(0, '19.770'), (1, '20.780')] [2023-10-10 07:19:27,981][53268] Updated weights for policy 1, policy_version 63880 (0.0008) [2023-10-10 07:19:28,350][53268] Updated weights for policy 1, policy_version 63890 (0.0009) [2023-10-10 07:19:28,719][53268] Updated weights for policy 1, policy_version 63900 (0.0009) [2023-10-10 07:19:28,978][53252] Updated weights for policy 0, policy_version 63940 (0.0010) [2023-10-10 07:19:29,352][53252] Updated weights for policy 0, policy_version 63950 (0.0009) [2023-10-10 07:19:29,719][53252] Updated weights for policy 0, policy_version 63960 (0.0008) [2023-10-10 07:19:31,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 130940928. Throughput: 0: 1685.6, 1: 1670.0. Samples: 32739498. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:31,785][52050] Avg episode reward: [(0, '21.410'), (1, '20.580')] [2023-10-10 07:19:32,579][53268] Updated weights for policy 1, policy_version 63910 (0.0008) [2023-10-10 07:19:32,943][53268] Updated weights for policy 1, policy_version 63920 (0.0007) [2023-10-10 07:19:33,311][53268] Updated weights for policy 1, policy_version 63930 (0.0008) [2023-10-10 07:19:33,883][53252] Updated weights for policy 0, policy_version 63970 (0.0007) [2023-10-10 07:19:34,285][53252] Updated weights for policy 0, policy_version 63980 (0.0008) [2023-10-10 07:19:34,645][53252] Updated weights for policy 0, policy_version 63990 (0.0011) [2023-10-10 07:19:35,017][53252] Updated weights for policy 0, policy_version 64000 (0.0009) [2023-10-10 07:19:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131006464. Throughput: 0: 1674.8, 1: 1694.4. Samples: 32759534. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:36,784][52050] Avg episode reward: [(0, '19.550'), (1, '21.980')] [2023-10-10 07:19:37,279][53268] Updated weights for policy 1, policy_version 63940 (0.0008) [2023-10-10 07:19:37,641][53268] Updated weights for policy 1, policy_version 63950 (0.0007) [2023-10-10 07:19:38,019][53268] Updated weights for policy 1, policy_version 63960 (0.0011) [2023-10-10 07:19:39,022][53252] Updated weights for policy 0, policy_version 64010 (0.0009) [2023-10-10 07:19:39,398][53252] Updated weights for policy 0, policy_version 64020 (0.0008) [2023-10-10 07:19:39,770][53252] Updated weights for policy 0, policy_version 64030 (0.0007) [2023-10-10 07:19:41,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131072000. Throughput: 0: 1708.5, 1: 1695.7. Samples: 32780536. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-10 07:19:41,784][52050] Avg episode reward: [(0, '20.000'), (1, '20.480')] [2023-10-10 07:19:42,121][53268] Updated weights for policy 1, policy_version 63970 (0.0009) [2023-10-10 07:19:42,485][53268] Updated weights for policy 1, policy_version 63980 (0.0010) [2023-10-10 07:19:42,852][53268] Updated weights for policy 1, policy_version 63990 (0.0007) [2023-10-10 07:19:43,227][53268] Updated weights for policy 1, policy_version 64000 (0.0009) [2023-10-10 07:19:43,814][53252] Updated weights for policy 0, policy_version 64040 (0.0008) [2023-10-10 07:19:44,193][53252] Updated weights for policy 0, policy_version 64050 (0.0007) [2023-10-10 07:19:44,564][53252] Updated weights for policy 0, policy_version 64060 (0.0007) [2023-10-10 07:19:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 131137536. Throughput: 0: 1684.8, 1: 1685.0. Samples: 32790008. Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:19:46,784][52050] Avg episode reward: [(0, '22.170'), (1, '19.790')] [2023-10-10 07:19:47,387][53268] Updated weights for policy 1, policy_version 64010 (0.0008) [2023-10-10 07:19:47,760][53268] Updated weights for policy 1, policy_version 64020 (0.0009) [2023-10-10 07:19:48,122][53268] Updated weights for policy 1, policy_version 64030 (0.0010) [2023-10-10 07:19:48,564][53252] Updated weights for policy 0, policy_version 64070 (0.0007) [2023-10-10 07:19:48,934][53252] Updated weights for policy 0, policy_version 64080 (0.0007) [2023-10-10 07:19:49,294][53252] Updated weights for policy 0, policy_version 64090 (0.0008) [2023-10-10 07:19:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131203072. Throughput: 0: 1685.0, 1: 1689.3. Samples: 32810188. 
Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:19:51,784][52050] Avg episode reward: [(0, '22.630'), (1, '19.670')] [2023-10-10 07:19:52,286][53268] Updated weights for policy 1, policy_version 64040 (0.0009) [2023-10-10 07:19:52,661][53268] Updated weights for policy 1, policy_version 64050 (0.0010) [2023-10-10 07:19:53,026][53268] Updated weights for policy 1, policy_version 64060 (0.0008) [2023-10-10 07:19:53,302][53252] Updated weights for policy 0, policy_version 64100 (0.0008) [2023-10-10 07:19:53,670][53252] Updated weights for policy 0, policy_version 64110 (0.0007) [2023-10-10 07:19:54,043][53252] Updated weights for policy 0, policy_version 64120 (0.0008) [2023-10-10 07:19:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131268608. Throughput: 0: 1699.2, 1: 1686.0. Samples: 32830954. Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:19:56,784][52050] Avg episode reward: [(0, '21.210'), (1, '19.960')] [2023-10-10 07:19:57,227][53268] Updated weights for policy 1, policy_version 64070 (0.0007) [2023-10-10 07:19:57,595][53268] Updated weights for policy 1, policy_version 64080 (0.0008) [2023-10-10 07:19:57,942][53252] Updated weights for policy 0, policy_version 64130 (0.0007) [2023-10-10 07:19:57,960][53268] Updated weights for policy 1, policy_version 64090 (0.0009) [2023-10-10 07:19:58,318][53252] Updated weights for policy 0, policy_version 64140 (0.0008) [2023-10-10 07:19:58,688][53252] Updated weights for policy 0, policy_version 64150 (0.0010) [2023-10-10 07:19:59,065][53252] Updated weights for policy 0, policy_version 64160 (0.0010) [2023-10-10 07:20:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131334144. Throughput: 0: 1670.8, 1: 1683.0. Samples: 32840124. 
Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:20:01,784][52050] Avg episode reward: [(0, '20.870'), (1, '19.460')] [2023-10-10 07:20:01,945][53268] Updated weights for policy 1, policy_version 64100 (0.0010) [2023-10-10 07:20:02,308][53268] Updated weights for policy 1, policy_version 64110 (0.0010) [2023-10-10 07:20:02,684][53268] Updated weights for policy 1, policy_version 64120 (0.0011) [2023-10-10 07:20:03,203][53252] Updated weights for policy 0, policy_version 64170 (0.0009) [2023-10-10 07:20:03,566][53252] Updated weights for policy 0, policy_version 64180 (0.0010) [2023-10-10 07:20:03,944][53252] Updated weights for policy 0, policy_version 64190 (0.0010) [2023-10-10 07:20:06,751][53268] Updated weights for policy 1, policy_version 64130 (0.0010) [2023-10-10 07:20:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131399680. Throughput: 0: 1694.1, 1: 1685.0. Samples: 32860822. Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:20:06,784][52050] Avg episode reward: [(0, '20.780'), (1, '19.860')] [2023-10-10 07:20:07,123][53268] Updated weights for policy 1, policy_version 64140 (0.0010) [2023-10-10 07:20:07,493][53268] Updated weights for policy 1, policy_version 64150 (0.0009) [2023-10-10 07:20:07,853][53268] Updated weights for policy 1, policy_version 64160 (0.0010) [2023-10-10 07:20:07,999][53252] Updated weights for policy 0, policy_version 64200 (0.0007) [2023-10-10 07:20:08,377][53252] Updated weights for policy 0, policy_version 64210 (0.0007) [2023-10-10 07:20:08,753][53252] Updated weights for policy 0, policy_version 64220 (0.0011) [2023-10-10 07:20:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131465216. Throughput: 0: 1697.8, 1: 1680.7. Samples: 32881510. 
Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:20:11,784][52050] Avg episode reward: [(0, '20.330'), (1, '19.540')] [2023-10-10 07:20:12,065][53268] Updated weights for policy 1, policy_version 64170 (0.0010) [2023-10-10 07:20:12,432][53268] Updated weights for policy 1, policy_version 64180 (0.0009) [2023-10-10 07:20:12,705][53252] Updated weights for policy 0, policy_version 64230 (0.0007) [2023-10-10 07:20:12,791][53268] Updated weights for policy 1, policy_version 64190 (0.0008) [2023-10-10 07:20:13,079][53252] Updated weights for policy 0, policy_version 64240 (0.0009) [2023-10-10 07:20:13,462][53252] Updated weights for policy 0, policy_version 64250 (0.0009) [2023-10-10 07:20:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131530752. Throughput: 0: 1681.7, 1: 1679.8. Samples: 32890766. Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:20:16,784][52050] Avg episode reward: [(0, '21.200'), (1, '20.930')] [2023-10-10 07:20:16,897][53268] Updated weights for policy 1, policy_version 64200 (0.0009) [2023-10-10 07:20:17,257][53268] Updated weights for policy 1, policy_version 64210 (0.0007) [2023-10-10 07:20:17,499][53252] Updated weights for policy 0, policy_version 64260 (0.0008) [2023-10-10 07:20:17,629][53268] Updated weights for policy 1, policy_version 64220 (0.0008) [2023-10-10 07:20:17,859][53252] Updated weights for policy 0, policy_version 64270 (0.0009) [2023-10-10 07:20:18,237][53252] Updated weights for policy 0, policy_version 64280 (0.0009) [2023-10-10 07:20:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131596288. Throughput: 0: 1700.1, 1: 1672.2. Samples: 32911286. 
Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:20:21,784][52050] Avg episode reward: [(0, '22.270'), (1, '20.930')] [2023-10-10 07:20:21,903][53268] Updated weights for policy 1, policy_version 64230 (0.0008) [2023-10-10 07:20:22,274][53268] Updated weights for policy 1, policy_version 64240 (0.0008) [2023-10-10 07:20:22,595][53252] Updated weights for policy 0, policy_version 64290 (0.0010) [2023-10-10 07:20:22,644][53268] Updated weights for policy 1, policy_version 64250 (0.0007) [2023-10-10 07:20:23,005][53252] Updated weights for policy 0, policy_version 64300 (0.0009) [2023-10-10 07:20:23,368][53252] Updated weights for policy 0, policy_version 64310 (0.0010) [2023-10-10 07:20:23,736][53252] Updated weights for policy 0, policy_version 64320 (0.0007) [2023-10-10 07:20:26,751][53268] Updated weights for policy 1, policy_version 64260 (0.0008) [2023-10-10 07:20:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 131661824. Throughput: 0: 1694.4, 1: 1668.1. Samples: 32931848. Policy #0 lag: (min: 29.0, avg: 46.8, max: 48.0) [2023-10-10 07:20:26,784][52050] Avg episode reward: [(0, '22.900'), (1, '20.670')] [2023-10-10 07:20:27,113][53268] Updated weights for policy 1, policy_version 64270 (0.0007) [2023-10-10 07:20:27,486][53268] Updated weights for policy 1, policy_version 64280 (0.0008) [2023-10-10 07:20:27,597][53252] Updated weights for policy 0, policy_version 64330 (0.0008) [2023-10-10 07:20:27,962][53252] Updated weights for policy 0, policy_version 64340 (0.0007) [2023-10-10 07:20:28,335][53252] Updated weights for policy 0, policy_version 64350 (0.0009) [2023-10-10 07:20:31,640][53268] Updated weights for policy 1, policy_version 64290 (0.0008) [2023-10-10 07:20:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 131727360. Throughput: 0: 1682.0, 1: 1669.3. Samples: 32940816. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:20:31,784][52050] Avg episode reward: [(0, '23.530'), (1, '21.260')] [2023-10-10 07:20:32,005][53268] Updated weights for policy 1, policy_version 64300 (0.0007) [2023-10-10 07:20:32,379][53268] Updated weights for policy 1, policy_version 64310 (0.0007) [2023-10-10 07:20:32,507][53252] Updated weights for policy 0, policy_version 64360 (0.0009) [2023-10-10 07:20:32,740][53268] Updated weights for policy 1, policy_version 64320 (0.0007) [2023-10-10 07:20:32,881][53252] Updated weights for policy 0, policy_version 64370 (0.0008) [2023-10-10 07:20:33,238][53252] Updated weights for policy 0, policy_version 64380 (0.0007) [2023-10-10 07:20:36,742][53268] Updated weights for policy 1, policy_version 64330 (0.0009) [2023-10-10 07:20:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131792896. Throughput: 0: 1686.7, 1: 1671.3. Samples: 32961298. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:20:36,784][52050] Avg episode reward: [(0, '20.700'), (1, '22.530')] [2023-10-10 07:20:37,106][53268] Updated weights for policy 1, policy_version 64340 (0.0010) [2023-10-10 07:20:37,373][53252] Updated weights for policy 0, policy_version 64390 (0.0009) [2023-10-10 07:20:37,471][53268] Updated weights for policy 1, policy_version 64350 (0.0008) [2023-10-10 07:20:37,745][53252] Updated weights for policy 0, policy_version 64400 (0.0008) [2023-10-10 07:20:38,106][53252] Updated weights for policy 0, policy_version 64410 (0.0008) [2023-10-10 07:20:41,335][53268] Updated weights for policy 1, policy_version 64360 (0.0009) [2023-10-10 07:20:41,706][53268] Updated weights for policy 1, policy_version 64370 (0.0011) [2023-10-10 07:20:41,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 131858432. Throughput: 0: 1681.5, 1: 1674.4. Samples: 32981966. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:20:41,785][52050] Avg episode reward: [(0, '21.580'), (1, '20.270')] [2023-10-10 07:20:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000064416_65961984.pth... [2023-10-10 07:20:41,835][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000062848_64356352.pth [2023-10-10 07:20:42,072][53268] Updated weights for policy 1, policy_version 64380 (0.0009) [2023-10-10 07:20:42,119][53252] Updated weights for policy 0, policy_version 64420 (0.0008) [2023-10-10 07:20:42,217][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000064384_65929216.pth... [2023-10-10 07:20:42,246][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000062816_64323584.pth [2023-10-10 07:20:42,487][53252] Updated weights for policy 0, policy_version 64430 (0.0010) [2023-10-10 07:20:42,864][53252] Updated weights for policy 0, policy_version 64440 (0.0010) [2023-10-10 07:20:46,136][53268] Updated weights for policy 1, policy_version 64390 (0.0008) [2023-10-10 07:20:46,510][53268] Updated weights for policy 1, policy_version 64400 (0.0007) [2023-10-10 07:20:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131923968. Throughput: 0: 1679.3, 1: 1678.4. Samples: 32991220. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:20:46,784][52050] Avg episode reward: [(0, '22.120'), (1, '20.410')] [2023-10-10 07:20:46,871][53268] Updated weights for policy 1, policy_version 64410 (0.0007) [2023-10-10 07:20:47,047][53252] Updated weights for policy 0, policy_version 64450 (0.0010) [2023-10-10 07:20:47,419][53252] Updated weights for policy 0, policy_version 64460 (0.0009) [2023-10-10 07:20:47,791][53252] Updated weights for policy 0, policy_version 64470 (0.0010) [2023-10-10 07:20:48,158][53252] Updated weights for policy 0, policy_version 64480 (0.0009) [2023-10-10 07:20:50,979][53268] Updated weights for policy 1, policy_version 64420 (0.0008) [2023-10-10 07:20:51,344][53268] Updated weights for policy 1, policy_version 64430 (0.0010) [2023-10-10 07:20:51,713][53268] Updated weights for policy 1, policy_version 64440 (0.0010) [2023-10-10 07:20:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 131989504. Throughput: 0: 1676.0, 1: 1680.8. Samples: 33011878. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:20:51,784][52050] Avg episode reward: [(0, '20.620'), (1, '20.380')] [2023-10-10 07:20:52,151][53252] Updated weights for policy 0, policy_version 64490 (0.0009) [2023-10-10 07:20:52,527][53252] Updated weights for policy 0, policy_version 64500 (0.0008) [2023-10-10 07:20:52,893][53252] Updated weights for policy 0, policy_version 64510 (0.0008) [2023-10-10 07:20:55,803][53268] Updated weights for policy 1, policy_version 64450 (0.0009) [2023-10-10 07:20:56,168][53268] Updated weights for policy 1, policy_version 64460 (0.0008) [2023-10-10 07:20:56,539][53268] Updated weights for policy 1, policy_version 64470 (0.0009) [2023-10-10 07:20:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 132055040. Throughput: 0: 1676.4, 1: 1673.3. Samples: 33032250. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:20:56,784][52050] Avg episode reward: [(0, '21.620'), (1, '19.190')] [2023-10-10 07:20:56,907][53268] Updated weights for policy 1, policy_version 64480 (0.0008) [2023-10-10 07:20:57,020][53252] Updated weights for policy 0, policy_version 64520 (0.0008) [2023-10-10 07:20:57,398][53252] Updated weights for policy 0, policy_version 64530 (0.0010) [2023-10-10 07:20:57,766][53252] Updated weights for policy 0, policy_version 64540 (0.0009) [2023-10-10 07:21:00,922][53268] Updated weights for policy 1, policy_version 64490 (0.0007) [2023-10-10 07:21:01,280][53268] Updated weights for policy 1, policy_version 64500 (0.0010) [2023-10-10 07:21:01,654][53268] Updated weights for policy 1, policy_version 64510 (0.0010) [2023-10-10 07:21:01,773][53252] Updated weights for policy 0, policy_version 64550 (0.0007) [2023-10-10 07:21:01,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132153344. Throughput: 0: 1672.0, 1: 1684.8. Samples: 33041824. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:21:01,784][52050] Avg episode reward: [(0, '22.360'), (1, '20.340')] [2023-10-10 07:21:02,154][53252] Updated weights for policy 0, policy_version 64560 (0.0009) [2023-10-10 07:21:02,509][53252] Updated weights for policy 0, policy_version 64570 (0.0009) [2023-10-10 07:21:05,660][53268] Updated weights for policy 1, policy_version 64520 (0.0010) [2023-10-10 07:21:06,032][53268] Updated weights for policy 1, policy_version 64530 (0.0010) [2023-10-10 07:21:06,394][53268] Updated weights for policy 1, policy_version 64540 (0.0008) [2023-10-10 07:21:06,645][53252] Updated weights for policy 0, policy_version 64580 (0.0009) [2023-10-10 07:21:06,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132218880. Throughput: 0: 1676.5, 1: 1685.9. Samples: 33062592. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:21:06,784][52050] Avg episode reward: [(0, '20.440'), (1, '19.670')] [2023-10-10 07:21:07,023][53252] Updated weights for policy 0, policy_version 64590 (0.0007) [2023-10-10 07:21:07,400][53252] Updated weights for policy 0, policy_version 64600 (0.0010) [2023-10-10 07:21:10,651][53268] Updated weights for policy 1, policy_version 64550 (0.0009) [2023-10-10 07:21:11,026][53268] Updated weights for policy 1, policy_version 64560 (0.0007) [2023-10-10 07:21:11,369][53252] Updated weights for policy 0, policy_version 64610 (0.0010) [2023-10-10 07:21:11,383][53268] Updated weights for policy 1, policy_version 64570 (0.0008) [2023-10-10 07:21:11,743][53252] Updated weights for policy 0, policy_version 64620 (0.0009) [2023-10-10 07:21:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132284416. Throughput: 0: 1671.4, 1: 1669.2. Samples: 33082172. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-10 07:21:11,785][52050] Avg episode reward: [(0, '20.700'), (1, '19.340')] [2023-10-10 07:21:12,110][53252] Updated weights for policy 0, policy_version 64630 (0.0007) [2023-10-10 07:21:12,482][53252] Updated weights for policy 0, policy_version 64640 (0.0009) [2023-10-10 07:21:15,496][53268] Updated weights for policy 1, policy_version 64580 (0.0007) [2023-10-10 07:21:15,877][53268] Updated weights for policy 1, policy_version 64590 (0.0009) [2023-10-10 07:21:16,242][53268] Updated weights for policy 1, policy_version 64600 (0.0009) [2023-10-10 07:21:16,465][53252] Updated weights for policy 0, policy_version 64650 (0.0007) [2023-10-10 07:21:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132349952. Throughput: 0: 1678.0, 1: 1684.4. Samples: 33092124. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:16,784][52050] Avg episode reward: [(0, '20.630'), (1, '19.730')] [2023-10-10 07:21:16,845][53252] Updated weights for policy 0, policy_version 64660 (0.0007) [2023-10-10 07:21:17,226][53252] Updated weights for policy 0, policy_version 64670 (0.0007) [2023-10-10 07:21:20,402][53268] Updated weights for policy 1, policy_version 64610 (0.0008) [2023-10-10 07:21:20,771][53268] Updated weights for policy 1, policy_version 64620 (0.0008) [2023-10-10 07:21:21,138][53268] Updated weights for policy 1, policy_version 64630 (0.0009) [2023-10-10 07:21:21,403][53252] Updated weights for policy 0, policy_version 64680 (0.0007) [2023-10-10 07:21:21,498][53268] Updated weights for policy 1, policy_version 64640 (0.0009) [2023-10-10 07:21:21,776][53252] Updated weights for policy 0, policy_version 64690 (0.0007) [2023-10-10 07:21:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132415488. Throughput: 0: 1682.3, 1: 1680.9. Samples: 33112644. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:21,784][52050] Avg episode reward: [(0, '21.620'), (1, '20.530')] [2023-10-10 07:21:22,156][53252] Updated weights for policy 0, policy_version 64700 (0.0007) [2023-10-10 07:21:25,547][53268] Updated weights for policy 1, policy_version 64650 (0.0011) [2023-10-10 07:21:25,916][53268] Updated weights for policy 1, policy_version 64660 (0.0008) [2023-10-10 07:21:26,287][53268] Updated weights for policy 1, policy_version 64670 (0.0008) [2023-10-10 07:21:26,322][53252] Updated weights for policy 0, policy_version 64710 (0.0007) [2023-10-10 07:21:26,688][53252] Updated weights for policy 0, policy_version 64720 (0.0008) [2023-10-10 07:21:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132481024. Throughput: 0: 1675.0, 1: 1660.1. Samples: 33132046. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:26,784][52050] Avg episode reward: [(0, '21.230'), (1, '19.680')] [2023-10-10 07:21:27,055][53252] Updated weights for policy 0, policy_version 64730 (0.0007) [2023-10-10 07:21:30,455][53268] Updated weights for policy 1, policy_version 64680 (0.0010) [2023-10-10 07:21:30,832][53268] Updated weights for policy 1, policy_version 64690 (0.0009) [2023-10-10 07:21:31,015][53252] Updated weights for policy 0, policy_version 64740 (0.0009) [2023-10-10 07:21:31,200][53268] Updated weights for policy 1, policy_version 64700 (0.0008) [2023-10-10 07:21:31,382][53252] Updated weights for policy 0, policy_version 64750 (0.0009) [2023-10-10 07:21:31,749][53252] Updated weights for policy 0, policy_version 64760 (0.0010) [2023-10-10 07:21:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 132546560. Throughput: 0: 1684.4, 1: 1675.3. Samples: 33142406. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:31,784][52050] Avg episode reward: [(0, '21.420'), (1, '19.770')] [2023-10-10 07:21:35,160][53268] Updated weights for policy 1, policy_version 64710 (0.0009) [2023-10-10 07:21:35,521][53268] Updated weights for policy 1, policy_version 64720 (0.0008) [2023-10-10 07:21:35,802][53252] Updated weights for policy 0, policy_version 64770 (0.0010) [2023-10-10 07:21:35,894][53268] Updated weights for policy 1, policy_version 64730 (0.0009) [2023-10-10 07:21:36,159][53252] Updated weights for policy 0, policy_version 64780 (0.0011) [2023-10-10 07:21:36,529][53252] Updated weights for policy 0, policy_version 64790 (0.0009) [2023-10-10 07:21:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132612096. Throughput: 0: 1690.8, 1: 1665.5. Samples: 33162912. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:36,784][52050] Avg episode reward: [(0, '23.350'), (1, '19.920')] [2023-10-10 07:21:36,902][53252] Updated weights for policy 0, policy_version 64800 (0.0008) [2023-10-10 07:21:39,745][53268] Updated weights for policy 1, policy_version 64740 (0.0008) [2023-10-10 07:21:40,103][53268] Updated weights for policy 1, policy_version 64750 (0.0008) [2023-10-10 07:21:40,475][53268] Updated weights for policy 1, policy_version 64760 (0.0009) [2023-10-10 07:21:40,998][53252] Updated weights for policy 0, policy_version 64810 (0.0008) [2023-10-10 07:21:41,375][53252] Updated weights for policy 0, policy_version 64820 (0.0008) [2023-10-10 07:21:41,740][53252] Updated weights for policy 0, policy_version 64830 (0.0010) [2023-10-10 07:21:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132677632. Throughput: 0: 1667.1, 1: 1652.8. Samples: 33181646. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:41,784][52050] Avg episode reward: [(0, '22.200'), (1, '20.440')] [2023-10-10 07:21:44,480][53268] Updated weights for policy 1, policy_version 64770 (0.0008) [2023-10-10 07:21:44,838][53268] Updated weights for policy 1, policy_version 64780 (0.0008) [2023-10-10 07:21:45,213][53268] Updated weights for policy 1, policy_version 64790 (0.0009) [2023-10-10 07:21:45,579][53268] Updated weights for policy 1, policy_version 64800 (0.0010) [2023-10-10 07:21:45,838][53252] Updated weights for policy 0, policy_version 64840 (0.0008) [2023-10-10 07:21:46,219][53252] Updated weights for policy 0, policy_version 64850 (0.0007) [2023-10-10 07:21:46,590][53252] Updated weights for policy 0, policy_version 64860 (0.0007) [2023-10-10 07:21:46,783][52050] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 132775936. Throughput: 0: 1684.2, 1: 1672.5. Samples: 33192878. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:46,784][52050] Avg episode reward: [(0, '21.590'), (1, '20.590')] [2023-10-10 07:21:49,657][53268] Updated weights for policy 1, policy_version 64810 (0.0008) [2023-10-10 07:21:50,016][53268] Updated weights for policy 1, policy_version 64820 (0.0008) [2023-10-10 07:21:50,390][53268] Updated weights for policy 1, policy_version 64830 (0.0011) [2023-10-10 07:21:50,694][53252] Updated weights for policy 0, policy_version 64870 (0.0008) [2023-10-10 07:21:51,067][53252] Updated weights for policy 0, policy_version 64880 (0.0007) [2023-10-10 07:21:51,439][53252] Updated weights for policy 0, policy_version 64890 (0.0008) [2023-10-10 07:21:51,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 132841472. Throughput: 0: 1678.4, 1: 1654.7. Samples: 33212578. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-10 07:21:51,784][52050] Avg episode reward: [(0, '21.770'), (1, '19.560')] [2023-10-10 07:21:54,514][53268] Updated weights for policy 1, policy_version 64840 (0.0009) [2023-10-10 07:21:54,885][53268] Updated weights for policy 1, policy_version 64850 (0.0010) [2023-10-10 07:21:55,242][53268] Updated weights for policy 1, policy_version 64860 (0.0011) [2023-10-10 07:21:55,636][53252] Updated weights for policy 0, policy_version 64900 (0.0008) [2023-10-10 07:21:56,006][53252] Updated weights for policy 0, policy_version 64910 (0.0010) [2023-10-10 07:21:56,377][53252] Updated weights for policy 0, policy_version 64920 (0.0008) [2023-10-10 07:21:56,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 132907008. Throughput: 0: 1661.4, 1: 1665.8. Samples: 33231894. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:21:56,784][52050] Avg episode reward: [(0, '21.030'), (1, '19.270')] [2023-10-10 07:21:59,141][53268] Updated weights for policy 1, policy_version 64870 (0.0010) [2023-10-10 07:21:59,504][53268] Updated weights for policy 1, policy_version 64880 (0.0009) [2023-10-10 07:21:59,878][53268] Updated weights for policy 1, policy_version 64890 (0.0009) [2023-10-10 07:22:00,361][53252] Updated weights for policy 0, policy_version 64930 (0.0010) [2023-10-10 07:22:00,770][53252] Updated weights for policy 0, policy_version 64940 (0.0010) [2023-10-10 07:22:01,141][53252] Updated weights for policy 0, policy_version 64950 (0.0009) [2023-10-10 07:22:01,503][53252] Updated weights for policy 0, policy_version 64960 (0.0007) [2023-10-10 07:22:01,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 132972544. Throughput: 0: 1676.4, 1: 1676.4. Samples: 33242998. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:01,784][52050] Avg episode reward: [(0, '20.120'), (1, '19.240')] [2023-10-10 07:22:04,050][53268] Updated weights for policy 1, policy_version 64900 (0.0007) [2023-10-10 07:22:04,423][53268] Updated weights for policy 1, policy_version 64910 (0.0009) [2023-10-10 07:22:04,787][53268] Updated weights for policy 1, policy_version 64920 (0.0009) [2023-10-10 07:22:05,538][53252] Updated weights for policy 0, policy_version 64970 (0.0009) [2023-10-10 07:22:05,917][53252] Updated weights for policy 0, policy_version 64980 (0.0010) [2023-10-10 07:22:06,285][53252] Updated weights for policy 0, policy_version 64990 (0.0009) [2023-10-10 07:22:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133038080. Throughput: 0: 1670.4, 1: 1653.7. Samples: 33262232. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:06,784][52050] Avg episode reward: [(0, '22.600'), (1, '18.330')] [2023-10-10 07:22:09,011][53268] Updated weights for policy 1, policy_version 64930 (0.0009) [2023-10-10 07:22:09,378][53268] Updated weights for policy 1, policy_version 64940 (0.0008) [2023-10-10 07:22:09,735][53268] Updated weights for policy 1, policy_version 64950 (0.0010) [2023-10-10 07:22:10,102][53268] Updated weights for policy 1, policy_version 64960 (0.0008) [2023-10-10 07:22:10,370][53252] Updated weights for policy 0, policy_version 65000 (0.0008) [2023-10-10 07:22:10,743][53252] Updated weights for policy 0, policy_version 65010 (0.0009) [2023-10-10 07:22:11,113][53252] Updated weights for policy 0, policy_version 65020 (0.0010) [2023-10-10 07:22:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 133103616. Throughput: 0: 1651.4, 1: 1677.2. Samples: 33281832. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:11,784][52050] Avg episode reward: [(0, '22.670'), (1, '18.400')] [2023-10-10 07:22:14,399][53268] Updated weights for policy 1, policy_version 64970 (0.0008) [2023-10-10 07:22:14,759][53268] Updated weights for policy 1, policy_version 64980 (0.0009) [2023-10-10 07:22:15,047][53252] Updated weights for policy 0, policy_version 65030 (0.0010) [2023-10-10 07:22:15,131][53268] Updated weights for policy 1, policy_version 64990 (0.0010) [2023-10-10 07:22:15,417][53252] Updated weights for policy 0, policy_version 65040 (0.0007) [2023-10-10 07:22:15,791][53252] Updated weights for policy 0, policy_version 65050 (0.0007) [2023-10-10 07:22:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 133169152. Throughput: 0: 1675.9, 1: 1679.9. Samples: 33293414. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:16,784][52050] Avg episode reward: [(0, '22.390'), (1, '19.400')] [2023-10-10 07:22:19,297][53268] Updated weights for policy 1, policy_version 65000 (0.0009) [2023-10-10 07:22:19,664][53268] Updated weights for policy 1, policy_version 65010 (0.0010) [2023-10-10 07:22:19,935][53252] Updated weights for policy 0, policy_version 65060 (0.0008) [2023-10-10 07:22:20,034][53268] Updated weights for policy 1, policy_version 65020 (0.0008) [2023-10-10 07:22:20,306][53252] Updated weights for policy 0, policy_version 65070 (0.0009) [2023-10-10 07:22:20,674][53252] Updated weights for policy 0, policy_version 65080 (0.0007) [2023-10-10 07:22:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133234688. Throughput: 0: 1658.5, 1: 1662.7. Samples: 33312368. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:21,784][52050] Avg episode reward: [(0, '24.900'), (1, '20.140')] [2023-10-10 07:22:21,786][52846] Saving new best policy, reward=24.900! [2023-10-10 07:22:24,089][53268] Updated weights for policy 1, policy_version 65030 (0.0009) [2023-10-10 07:22:24,446][53268] Updated weights for policy 1, policy_version 65040 (0.0007) [2023-10-10 07:22:24,790][53252] Updated weights for policy 0, policy_version 65090 (0.0007) [2023-10-10 07:22:24,810][53268] Updated weights for policy 1, policy_version 65050 (0.0007) [2023-10-10 07:22:25,153][53252] Updated weights for policy 0, policy_version 65100 (0.0009) [2023-10-10 07:22:25,524][53252] Updated weights for policy 0, policy_version 65110 (0.0010) [2023-10-10 07:22:25,907][53252] Updated weights for policy 0, policy_version 65120 (0.0010) [2023-10-10 07:22:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133300224. Throughput: 0: 1664.3, 1: 1683.6. Samples: 33332300. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:26,784][52050] Avg episode reward: [(0, '24.980'), (1, '19.980')] [2023-10-10 07:22:26,796][52846] Saving new best policy, reward=24.980! [2023-10-10 07:22:28,865][53268] Updated weights for policy 1, policy_version 65060 (0.0010) [2023-10-10 07:22:29,246][53268] Updated weights for policy 1, policy_version 65070 (0.0009) [2023-10-10 07:22:29,617][53268] Updated weights for policy 1, policy_version 65080 (0.0008) [2023-10-10 07:22:29,922][53252] Updated weights for policy 0, policy_version 65130 (0.0009) [2023-10-10 07:22:30,296][53252] Updated weights for policy 0, policy_version 65140 (0.0009) [2023-10-10 07:22:30,675][53252] Updated weights for policy 0, policy_version 65150 (0.0010) [2023-10-10 07:22:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133365760. Throughput: 0: 1678.4, 1: 1667.5. Samples: 33343444. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:31,784][52050] Avg episode reward: [(0, '21.040'), (1, '21.950')] [2023-10-10 07:22:33,805][53268] Updated weights for policy 1, policy_version 65090 (0.0009) [2023-10-10 07:22:34,186][53268] Updated weights for policy 1, policy_version 65100 (0.0009) [2023-10-10 07:22:34,546][53268] Updated weights for policy 1, policy_version 65110 (0.0008) [2023-10-10 07:22:34,658][53252] Updated weights for policy 0, policy_version 65160 (0.0008) [2023-10-10 07:22:34,906][53268] Updated weights for policy 1, policy_version 65120 (0.0009) [2023-10-10 07:22:35,027][53252] Updated weights for policy 0, policy_version 65170 (0.0009) [2023-10-10 07:22:35,407][53252] Updated weights for policy 0, policy_version 65180 (0.0009) [2023-10-10 07:22:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133431296. Throughput: 0: 1660.1, 1: 1666.4. Samples: 33362272. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-10 07:22:36,784][52050] Avg episode reward: [(0, '22.690'), (1, '22.640')] [2023-10-10 07:22:39,108][53268] Updated weights for policy 1, policy_version 65130 (0.0009) [2023-10-10 07:22:39,477][53268] Updated weights for policy 1, policy_version 65140 (0.0010) [2023-10-10 07:22:39,579][53252] Updated weights for policy 0, policy_version 65190 (0.0009) [2023-10-10 07:22:39,848][53268] Updated weights for policy 1, policy_version 65150 (0.0010) [2023-10-10 07:22:39,946][53252] Updated weights for policy 0, policy_version 65200 (0.0009) [2023-10-10 07:22:40,328][53252] Updated weights for policy 0, policy_version 65210 (0.0010) [2023-10-10 07:22:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133496832. Throughput: 0: 1671.6, 1: 1666.6. Samples: 33382114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:22:41,785][52050] Avg episode reward: [(0, '20.110'), (1, '22.100')] [2023-10-10 07:22:41,798][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000065216_66781184.pth... [2023-10-10 07:22:41,798][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000065152_66715648.pth... 
[2023-10-10 07:22:41,827][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000063648_65175552.pth [2023-10-10 07:22:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000063584_65110016.pth [2023-10-10 07:22:43,880][53268] Updated weights for policy 1, policy_version 65160 (0.0009) [2023-10-10 07:22:44,251][53268] Updated weights for policy 1, policy_version 65170 (0.0007) [2023-10-10 07:22:44,414][53252] Updated weights for policy 0, policy_version 65220 (0.0008) [2023-10-10 07:22:44,613][53268] Updated weights for policy 1, policy_version 65180 (0.0007) [2023-10-10 07:22:44,794][53252] Updated weights for policy 0, policy_version 65230 (0.0008) [2023-10-10 07:22:45,168][53252] Updated weights for policy 0, policy_version 65240 (0.0010) [2023-10-10 07:22:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 133562368. Throughput: 0: 1677.8, 1: 1657.9. Samples: 33393106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:22:46,784][52050] Avg episode reward: [(0, '20.310'), (1, '21.520')] [2023-10-10 07:22:48,572][53268] Updated weights for policy 1, policy_version 65190 (0.0009) [2023-10-10 07:22:48,940][53268] Updated weights for policy 1, policy_version 65200 (0.0008) [2023-10-10 07:22:49,303][53268] Updated weights for policy 1, policy_version 65210 (0.0008) [2023-10-10 07:22:49,310][53252] Updated weights for policy 0, policy_version 65250 (0.0009) [2023-10-10 07:22:49,707][53252] Updated weights for policy 0, policy_version 65260 (0.0009) [2023-10-10 07:22:50,080][53252] Updated weights for policy 0, policy_version 65270 (0.0008) [2023-10-10 07:22:50,442][53252] Updated weights for policy 0, policy_version 65280 (0.0008) [2023-10-10 07:22:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133627904. Throughput: 0: 1658.2, 1: 1669.5. Samples: 33411978. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:22:51,784][52050] Avg episode reward: [(0, '20.500'), (1, '23.360')] [2023-10-10 07:22:53,469][53268] Updated weights for policy 1, policy_version 65220 (0.0009) [2023-10-10 07:22:53,847][53268] Updated weights for policy 1, policy_version 65230 (0.0010) [2023-10-10 07:22:54,215][53268] Updated weights for policy 1, policy_version 65240 (0.0009) [2023-10-10 07:22:54,520][53252] Updated weights for policy 0, policy_version 65290 (0.0009) [2023-10-10 07:22:54,880][53252] Updated weights for policy 0, policy_version 65300 (0.0010) [2023-10-10 07:22:55,254][53252] Updated weights for policy 0, policy_version 65310 (0.0009) [2023-10-10 07:22:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 133693440. Throughput: 0: 1682.0, 1: 1666.1. Samples: 33432498. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:22:56,784][52050] Avg episode reward: [(0, '22.710'), (1, '21.050')] [2023-10-10 07:22:58,322][53268] Updated weights for policy 1, policy_version 65250 (0.0008) [2023-10-10 07:22:58,700][53268] Updated weights for policy 1, policy_version 65260 (0.0010) [2023-10-10 07:22:59,058][53268] Updated weights for policy 1, policy_version 65270 (0.0008) [2023-10-10 07:22:59,257][53252] Updated weights for policy 0, policy_version 65320 (0.0008) [2023-10-10 07:22:59,421][53268] Updated weights for policy 1, policy_version 65280 (0.0007) [2023-10-10 07:22:59,632][53252] Updated weights for policy 0, policy_version 65330 (0.0009) [2023-10-10 07:22:59,996][53252] Updated weights for policy 0, policy_version 65340 (0.0010) [2023-10-10 07:23:01,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133758976. Throughput: 0: 1669.6, 1: 1649.5. Samples: 33442778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:23:01,785][52050] Avg episode reward: [(0, '22.050'), (1, '21.130')] [2023-10-10 07:23:03,538][53268] Updated weights for policy 1, policy_version 65290 (0.0009) [2023-10-10 07:23:03,910][53268] Updated weights for policy 1, policy_version 65300 (0.0009) [2023-10-10 07:23:04,185][53252] Updated weights for policy 0, policy_version 65350 (0.0008) [2023-10-10 07:23:04,270][53268] Updated weights for policy 1, policy_version 65310 (0.0008) [2023-10-10 07:23:04,555][53252] Updated weights for policy 0, policy_version 65360 (0.0009) [2023-10-10 07:23:04,930][53252] Updated weights for policy 0, policy_version 65370 (0.0008) [2023-10-10 07:23:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133824512. Throughput: 0: 1661.2, 1: 1668.6. Samples: 33462210. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:23:06,784][52050] Avg episode reward: [(0, '23.200'), (1, '21.970')] [2023-10-10 07:23:08,438][53268] Updated weights for policy 1, policy_version 65320 (0.0007) [2023-10-10 07:23:08,815][53268] Updated weights for policy 1, policy_version 65330 (0.0008) [2023-10-10 07:23:09,039][53252] Updated weights for policy 0, policy_version 65380 (0.0009) [2023-10-10 07:23:09,177][53268] Updated weights for policy 1, policy_version 65340 (0.0008) [2023-10-10 07:23:09,407][53252] Updated weights for policy 0, policy_version 65390 (0.0009) [2023-10-10 07:23:09,781][53252] Updated weights for policy 0, policy_version 65400 (0.0007) [2023-10-10 07:23:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 133890048. Throughput: 0: 1672.9, 1: 1665.6. Samples: 33482532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:23:11,784][52050] Avg episode reward: [(0, '22.460'), (1, '20.340')] [2023-10-10 07:23:13,195][53268] Updated weights for policy 1, policy_version 65350 (0.0009) [2023-10-10 07:23:13,562][53268] Updated weights for policy 1, policy_version 65360 (0.0011) [2023-10-10 07:23:13,897][53252] Updated weights for policy 0, policy_version 65410 (0.0009) [2023-10-10 07:23:13,940][53268] Updated weights for policy 1, policy_version 65370 (0.0009) [2023-10-10 07:23:14,263][53252] Updated weights for policy 0, policy_version 65420 (0.0009) [2023-10-10 07:23:14,645][53252] Updated weights for policy 0, policy_version 65430 (0.0008) [2023-10-10 07:23:15,018][53252] Updated weights for policy 0, policy_version 65440 (0.0007) [2023-10-10 07:23:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133955584. Throughput: 0: 1659.1, 1: 1649.8. Samples: 33492342. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:23:16,784][52050] Avg episode reward: [(0, '23.010'), (1, '19.310')] [2023-10-10 07:23:18,031][53268] Updated weights for policy 1, policy_version 65380 (0.0009) [2023-10-10 07:23:18,403][53268] Updated weights for policy 1, policy_version 65390 (0.0008) [2023-10-10 07:23:18,771][53268] Updated weights for policy 1, policy_version 65400 (0.0010) [2023-10-10 07:23:19,114][53252] Updated weights for policy 0, policy_version 65450 (0.0008) [2023-10-10 07:23:19,491][53252] Updated weights for policy 0, policy_version 65460 (0.0008) [2023-10-10 07:23:19,857][53252] Updated weights for policy 0, policy_version 65470 (0.0010) [2023-10-10 07:23:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134021120. Throughput: 0: 1665.9, 1: 1668.6. Samples: 33512322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:23:21,784][52050] Avg episode reward: [(0, '21.760'), (1, '20.510')] [2023-10-10 07:23:22,879][53268] Updated weights for policy 1, policy_version 65410 (0.0010) [2023-10-10 07:23:23,253][53268] Updated weights for policy 1, policy_version 65420 (0.0009) [2023-10-10 07:23:23,620][53268] Updated weights for policy 1, policy_version 65430 (0.0009) [2023-10-10 07:23:23,928][53252] Updated weights for policy 0, policy_version 65480 (0.0007) [2023-10-10 07:23:23,981][53268] Updated weights for policy 1, policy_version 65440 (0.0008) [2023-10-10 07:23:24,298][53252] Updated weights for policy 0, policy_version 65490 (0.0009) [2023-10-10 07:23:24,669][53252] Updated weights for policy 0, policy_version 65500 (0.0010) [2023-10-10 07:23:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134086656. Throughput: 0: 1678.5, 1: 1675.2. Samples: 33533026. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:26,784][52050] Avg episode reward: [(0, '22.660'), (1, '19.580')] [2023-10-10 07:23:28,112][53268] Updated weights for policy 1, policy_version 65450 (0.0011) [2023-10-10 07:23:28,483][53268] Updated weights for policy 1, policy_version 65460 (0.0008) [2023-10-10 07:23:28,686][53252] Updated weights for policy 0, policy_version 65510 (0.0009) [2023-10-10 07:23:28,839][53268] Updated weights for policy 1, policy_version 65470 (0.0010) [2023-10-10 07:23:29,061][53252] Updated weights for policy 0, policy_version 65520 (0.0009) [2023-10-10 07:23:29,427][53252] Updated weights for policy 0, policy_version 65530 (0.0008) [2023-10-10 07:23:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134152192. Throughput: 0: 1661.6, 1: 1656.2. Samples: 33542408. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:31,784][52050] Avg episode reward: [(0, '20.700'), (1, '19.520')] [2023-10-10 07:23:33,003][53268] Updated weights for policy 1, policy_version 65480 (0.0009) [2023-10-10 07:23:33,369][53268] Updated weights for policy 1, policy_version 65490 (0.0009) [2023-10-10 07:23:33,514][53252] Updated weights for policy 0, policy_version 65540 (0.0008) [2023-10-10 07:23:33,732][53268] Updated weights for policy 1, policy_version 65500 (0.0008) [2023-10-10 07:23:33,890][53252] Updated weights for policy 0, policy_version 65550 (0.0009) [2023-10-10 07:23:34,272][53252] Updated weights for policy 0, policy_version 65560 (0.0008) [2023-10-10 07:23:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134217728. Throughput: 0: 1680.3, 1: 1669.6. Samples: 33562728. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:36,784][52050] Avg episode reward: [(0, '21.460'), (1, '19.700')] [2023-10-10 07:23:37,820][53268] Updated weights for policy 1, policy_version 65510 (0.0008) [2023-10-10 07:23:38,181][53268] Updated weights for policy 1, policy_version 65520 (0.0009) [2023-10-10 07:23:38,373][53252] Updated weights for policy 0, policy_version 65570 (0.0010) [2023-10-10 07:23:38,551][53268] Updated weights for policy 1, policy_version 65530 (0.0008) [2023-10-10 07:23:38,766][53252] Updated weights for policy 0, policy_version 65580 (0.0010) [2023-10-10 07:23:39,135][53252] Updated weights for policy 0, policy_version 65590 (0.0009) [2023-10-10 07:23:39,508][53252] Updated weights for policy 0, policy_version 65600 (0.0008) [2023-10-10 07:23:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 134283264. Throughput: 0: 1678.7, 1: 1670.7. Samples: 33583218. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:41,784][52050] Avg episode reward: [(0, '20.750'), (1, '20.010')] [2023-10-10 07:23:42,648][53268] Updated weights for policy 1, policy_version 65540 (0.0008) [2023-10-10 07:23:43,009][53268] Updated weights for policy 1, policy_version 65550 (0.0008) [2023-10-10 07:23:43,374][53268] Updated weights for policy 1, policy_version 65560 (0.0009) [2023-10-10 07:23:43,569][53252] Updated weights for policy 0, policy_version 65610 (0.0007) [2023-10-10 07:23:43,929][53252] Updated weights for policy 0, policy_version 65620 (0.0009) [2023-10-10 07:23:44,295][53252] Updated weights for policy 0, policy_version 65630 (0.0011) [2023-10-10 07:23:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134348800. Throughput: 0: 1661.4, 1: 1662.0. Samples: 33592332. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:46,784][52050] Avg episode reward: [(0, '21.220'), (1, '21.050')] [2023-10-10 07:23:47,431][53268] Updated weights for policy 1, policy_version 65570 (0.0009) [2023-10-10 07:23:47,803][53268] Updated weights for policy 1, policy_version 65580 (0.0009) [2023-10-10 07:23:48,171][53268] Updated weights for policy 1, policy_version 65590 (0.0008) [2023-10-10 07:23:48,518][53252] Updated weights for policy 0, policy_version 65640 (0.0009) [2023-10-10 07:23:48,538][53268] Updated weights for policy 1, policy_version 65600 (0.0008) [2023-10-10 07:23:48,884][53252] Updated weights for policy 0, policy_version 65650 (0.0011) [2023-10-10 07:23:49,251][53252] Updated weights for policy 0, policy_version 65660 (0.0008) [2023-10-10 07:23:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134414336. Throughput: 0: 1677.0, 1: 1673.0. Samples: 33612960. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:51,784][52050] Avg episode reward: [(0, '20.950'), (1, '20.330')] [2023-10-10 07:23:52,411][53268] Updated weights for policy 1, policy_version 65610 (0.0010) [2023-10-10 07:23:52,782][53268] Updated weights for policy 1, policy_version 65620 (0.0010) [2023-10-10 07:23:53,133][53268] Updated weights for policy 1, policy_version 65630 (0.0009) [2023-10-10 07:23:53,348][53252] Updated weights for policy 0, policy_version 65670 (0.0009) [2023-10-10 07:23:53,715][53252] Updated weights for policy 0, policy_version 65680 (0.0008) [2023-10-10 07:23:54,088][53252] Updated weights for policy 0, policy_version 65690 (0.0008) [2023-10-10 07:23:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134479872. Throughput: 0: 1682.1, 1: 1674.6. Samples: 33633586. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:23:56,784][52050] Avg episode reward: [(0, '22.610'), (1, '21.620')] [2023-10-10 07:23:57,494][53268] Updated weights for policy 1, policy_version 65640 (0.0009) [2023-10-10 07:23:57,858][53268] Updated weights for policy 1, policy_version 65650 (0.0010) [2023-10-10 07:23:58,088][53252] Updated weights for policy 0, policy_version 65700 (0.0008) [2023-10-10 07:23:58,224][53268] Updated weights for policy 1, policy_version 65660 (0.0008) [2023-10-10 07:23:58,456][53252] Updated weights for policy 0, policy_version 65710 (0.0008) [2023-10-10 07:23:58,828][53252] Updated weights for policy 0, policy_version 65720 (0.0007) [2023-10-10 07:24:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 134545408. Throughput: 0: 1666.5, 1: 1670.9. Samples: 33642524. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:24:01,784][52050] Avg episode reward: [(0, '23.130'), (1, '21.180')] [2023-10-10 07:24:02,320][53268] Updated weights for policy 1, policy_version 65670 (0.0008) [2023-10-10 07:24:02,697][53268] Updated weights for policy 1, policy_version 65680 (0.0008) [2023-10-10 07:24:02,942][53252] Updated weights for policy 0, policy_version 65730 (0.0007) [2023-10-10 07:24:03,058][53268] Updated weights for policy 1, policy_version 65690 (0.0010) [2023-10-10 07:24:03,314][53252] Updated weights for policy 0, policy_version 65740 (0.0008) [2023-10-10 07:24:03,695][53252] Updated weights for policy 0, policy_version 65750 (0.0007) [2023-10-10 07:24:04,066][53252] Updated weights for policy 0, policy_version 65760 (0.0008) [2023-10-10 07:24:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134610944. Throughput: 0: 1680.7, 1: 1672.3. Samples: 33663204. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-10 07:24:06,785][52050] Avg episode reward: [(0, '21.390'), (1, '20.540')] [2023-10-10 07:24:07,081][53268] Updated weights for policy 1, policy_version 65700 (0.0009) [2023-10-10 07:24:07,453][53268] Updated weights for policy 1, policy_version 65710 (0.0008) [2023-10-10 07:24:07,812][53268] Updated weights for policy 1, policy_version 65720 (0.0008) [2023-10-10 07:24:08,127][53252] Updated weights for policy 0, policy_version 65770 (0.0007) [2023-10-10 07:24:08,492][53252] Updated weights for policy 0, policy_version 65780 (0.0009) [2023-10-10 07:24:08,861][53252] Updated weights for policy 0, policy_version 65790 (0.0007) [2023-10-10 07:24:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134676480. Throughput: 0: 1679.1, 1: 1676.9. Samples: 33684046. 
Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:11,784][52050] Avg episode reward: [(0, '21.900'), (1, '20.280')] [2023-10-10 07:24:11,859][53268] Updated weights for policy 1, policy_version 65730 (0.0009) [2023-10-10 07:24:12,239][53268] Updated weights for policy 1, policy_version 65740 (0.0009) [2023-10-10 07:24:12,606][53268] Updated weights for policy 1, policy_version 65750 (0.0009) [2023-10-10 07:24:12,732][53252] Updated weights for policy 0, policy_version 65800 (0.0007) [2023-10-10 07:24:12,965][53268] Updated weights for policy 1, policy_version 65760 (0.0009) [2023-10-10 07:24:13,114][53252] Updated weights for policy 0, policy_version 65810 (0.0009) [2023-10-10 07:24:13,481][53252] Updated weights for policy 0, policy_version 65820 (0.0011) [2023-10-10 07:24:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134742016. Throughput: 0: 1671.7, 1: 1676.5. Samples: 33693076. Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:16,784][52050] Avg episode reward: [(0, '21.770'), (1, '21.810')] [2023-10-10 07:24:17,080][53268] Updated weights for policy 1, policy_version 65770 (0.0009) [2023-10-10 07:24:17,446][53268] Updated weights for policy 1, policy_version 65780 (0.0008) [2023-10-10 07:24:17,540][53252] Updated weights for policy 0, policy_version 65830 (0.0010) [2023-10-10 07:24:17,803][53268] Updated weights for policy 1, policy_version 65790 (0.0009) [2023-10-10 07:24:17,908][53252] Updated weights for policy 0, policy_version 65840 (0.0008) [2023-10-10 07:24:18,288][53252] Updated weights for policy 0, policy_version 65850 (0.0010) [2023-10-10 07:24:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 134807552. Throughput: 0: 1678.0, 1: 1674.8. Samples: 33713600. 
Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:21,784][52050] Avg episode reward: [(0, '19.370'), (1, '20.620')] [2023-10-10 07:24:22,106][53268] Updated weights for policy 1, policy_version 65800 (0.0009) [2023-10-10 07:24:22,304][53252] Updated weights for policy 0, policy_version 65860 (0.0009) [2023-10-10 07:24:22,466][53268] Updated weights for policy 1, policy_version 65810 (0.0007) [2023-10-10 07:24:22,679][53252] Updated weights for policy 0, policy_version 65870 (0.0007) [2023-10-10 07:24:22,833][53268] Updated weights for policy 1, policy_version 65820 (0.0009) [2023-10-10 07:24:23,038][53252] Updated weights for policy 0, policy_version 65880 (0.0007) [2023-10-10 07:24:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134873088. Throughput: 0: 1679.1, 1: 1674.0. Samples: 33734106. Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:26,784][52050] Avg episode reward: [(0, '19.580'), (1, '22.040')] [2023-10-10 07:24:26,992][53268] Updated weights for policy 1, policy_version 65830 (0.0008) [2023-10-10 07:24:27,216][53252] Updated weights for policy 0, policy_version 65890 (0.0009) [2023-10-10 07:24:27,357][53268] Updated weights for policy 1, policy_version 65840 (0.0009) [2023-10-10 07:24:27,614][53252] Updated weights for policy 0, policy_version 65900 (0.0009) [2023-10-10 07:24:27,725][53268] Updated weights for policy 1, policy_version 65850 (0.0008) [2023-10-10 07:24:27,987][53252] Updated weights for policy 0, policy_version 65910 (0.0007) [2023-10-10 07:24:28,351][53252] Updated weights for policy 0, policy_version 65920 (0.0007) [2023-10-10 07:24:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 134938624. Throughput: 0: 1675.8, 1: 1675.7. Samples: 33743152. 
Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:31,784][52050] Avg episode reward: [(0, '21.250'), (1, '20.540')] [2023-10-10 07:24:31,885][53268] Updated weights for policy 1, policy_version 65860 (0.0009) [2023-10-10 07:24:32,253][53268] Updated weights for policy 1, policy_version 65870 (0.0009) [2023-10-10 07:24:32,448][53252] Updated weights for policy 0, policy_version 65930 (0.0007) [2023-10-10 07:24:32,623][53268] Updated weights for policy 1, policy_version 65880 (0.0007) [2023-10-10 07:24:32,816][53252] Updated weights for policy 0, policy_version 65940 (0.0009) [2023-10-10 07:24:33,196][53252] Updated weights for policy 0, policy_version 65950 (0.0009) [2023-10-10 07:24:36,745][53268] Updated weights for policy 1, policy_version 65890 (0.0007) [2023-10-10 07:24:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135004160. Throughput: 0: 1682.3, 1: 1670.4. Samples: 33763828. Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:36,784][52050] Avg episode reward: [(0, '20.480'), (1, '20.490')] [2023-10-10 07:24:37,119][53268] Updated weights for policy 1, policy_version 65900 (0.0008) [2023-10-10 07:24:37,308][53252] Updated weights for policy 0, policy_version 65960 (0.0008) [2023-10-10 07:24:37,487][53268] Updated weights for policy 1, policy_version 65910 (0.0008) [2023-10-10 07:24:37,683][53252] Updated weights for policy 0, policy_version 65970 (0.0008) [2023-10-10 07:24:37,845][53268] Updated weights for policy 1, policy_version 65920 (0.0009) [2023-10-10 07:24:38,048][53252] Updated weights for policy 0, policy_version 65980 (0.0009) [2023-10-10 07:24:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135069696. Throughput: 0: 1684.4, 1: 1667.2. Samples: 33784406. 
Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:41,784][52050] Avg episode reward: [(0, '21.580'), (1, '20.840')] [2023-10-10 07:24:42,024][53268] Updated weights for policy 1, policy_version 65930 (0.0008) [2023-10-10 07:24:42,087][53252] Updated weights for policy 0, policy_version 65990 (0.0008) [2023-10-10 07:24:42,394][53268] Updated weights for policy 1, policy_version 65940 (0.0008) [2023-10-10 07:24:42,468][53252] Updated weights for policy 0, policy_version 66000 (0.0008) [2023-10-10 07:24:42,751][53268] Updated weights for policy 1, policy_version 65950 (0.0008) [2023-10-10 07:24:42,825][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000065952_67534848.pth... [2023-10-10 07:24:42,839][53252] Updated weights for policy 0, policy_version 66010 (0.0007) [2023-10-10 07:24:42,855][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000064384_65929216.pth [2023-10-10 07:24:43,054][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000066016_67600384.pth... [2023-10-10 07:24:43,089][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000064416_65961984.pth [2023-10-10 07:24:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135135232. Throughput: 0: 1681.6, 1: 1671.3. Samples: 33793408. 
Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:46,784][52050] Avg episode reward: [(0, '22.110'), (1, '21.030')] [2023-10-10 07:24:46,826][53268] Updated weights for policy 1, policy_version 65960 (0.0007) [2023-10-10 07:24:46,896][53252] Updated weights for policy 0, policy_version 66020 (0.0008) [2023-10-10 07:24:47,195][53268] Updated weights for policy 1, policy_version 65970 (0.0007) [2023-10-10 07:24:47,276][53252] Updated weights for policy 0, policy_version 66030 (0.0007) [2023-10-10 07:24:47,559][53268] Updated weights for policy 1, policy_version 65980 (0.0008) [2023-10-10 07:24:47,644][53252] Updated weights for policy 0, policy_version 66040 (0.0009) [2023-10-10 07:24:51,534][53268] Updated weights for policy 1, policy_version 65990 (0.0008) [2023-10-10 07:24:51,757][53252] Updated weights for policy 0, policy_version 66050 (0.0010) [2023-10-10 07:24:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135200768. Throughput: 0: 1682.8, 1: 1670.4. Samples: 33814098. 
Policy #0 lag: (min: 2.0, avg: 7.7, max: 34.0) [2023-10-10 07:24:51,785][52050] Avg episode reward: [(0, '21.160'), (1, '21.690')] [2023-10-10 07:24:51,905][53268] Updated weights for policy 1, policy_version 66000 (0.0008) [2023-10-10 07:24:52,120][53252] Updated weights for policy 0, policy_version 66060 (0.0008) [2023-10-10 07:24:52,273][53268] Updated weights for policy 1, policy_version 66010 (0.0007) [2023-10-10 07:24:52,498][53252] Updated weights for policy 0, policy_version 66070 (0.0007) [2023-10-10 07:24:52,873][53252] Updated weights for policy 0, policy_version 66080 (0.0007) [2023-10-10 07:24:56,333][53268] Updated weights for policy 1, policy_version 66020 (0.0009) [2023-10-10 07:24:56,704][53268] Updated weights for policy 1, policy_version 66030 (0.0008) [2023-10-10 07:24:56,723][53252] Updated weights for policy 0, policy_version 66090 (0.0009) [2023-10-10 07:24:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135266304. Throughput: 0: 1682.6, 1: 1665.1. Samples: 33834694. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:24:56,784][52050] Avg episode reward: [(0, '20.660'), (1, '20.750')] [2023-10-10 07:24:57,071][53268] Updated weights for policy 1, policy_version 66040 (0.0008) [2023-10-10 07:24:57,087][53252] Updated weights for policy 0, policy_version 66100 (0.0008) [2023-10-10 07:24:57,458][53252] Updated weights for policy 0, policy_version 66110 (0.0007) [2023-10-10 07:25:01,247][53268] Updated weights for policy 1, policy_version 66050 (0.0008) [2023-10-10 07:25:01,553][53252] Updated weights for policy 0, policy_version 66120 (0.0008) [2023-10-10 07:25:01,614][53268] Updated weights for policy 1, policy_version 66060 (0.0008) [2023-10-10 07:25:01,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135331840. Throughput: 0: 1681.5, 1: 1668.5. Samples: 33843828. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:01,784][52050] Avg episode reward: [(0, '21.500'), (1, '19.620')] [2023-10-10 07:25:01,930][53252] Updated weights for policy 0, policy_version 66130 (0.0007) [2023-10-10 07:25:01,973][53268] Updated weights for policy 1, policy_version 66070 (0.0009) [2023-10-10 07:25:02,301][53252] Updated weights for policy 0, policy_version 66140 (0.0007) [2023-10-10 07:25:02,344][53268] Updated weights for policy 1, policy_version 66080 (0.0008) [2023-10-10 07:25:06,171][53252] Updated weights for policy 0, policy_version 66150 (0.0008) [2023-10-10 07:25:06,405][53268] Updated weights for policy 1, policy_version 66090 (0.0007) [2023-10-10 07:25:06,529][53252] Updated weights for policy 0, policy_version 66160 (0.0009) [2023-10-10 07:25:06,767][53268] Updated weights for policy 1, policy_version 66100 (0.0010) [2023-10-10 07:25:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 135397376. Throughput: 0: 1683.9, 1: 1669.1. Samples: 33864484. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:06,784][52050] Avg episode reward: [(0, '22.190'), (1, '21.190')] [2023-10-10 07:25:06,902][53252] Updated weights for policy 0, policy_version 66170 (0.0009) [2023-10-10 07:25:07,131][53268] Updated weights for policy 1, policy_version 66110 (0.0007) [2023-10-10 07:25:10,915][53252] Updated weights for policy 0, policy_version 66180 (0.0008) [2023-10-10 07:25:11,168][53268] Updated weights for policy 1, policy_version 66120 (0.0009) [2023-10-10 07:25:11,272][53252] Updated weights for policy 0, policy_version 66190 (0.0009) [2023-10-10 07:25:11,541][53268] Updated weights for policy 1, policy_version 66130 (0.0008) [2023-10-10 07:25:11,649][53252] Updated weights for policy 0, policy_version 66200 (0.0009) [2023-10-10 07:25:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 135462912. Throughput: 0: 1675.0, 1: 1664.8. 
Samples: 33884398. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:11,784][52050] Avg episode reward: [(0, '21.230'), (1, '21.060')] [2023-10-10 07:25:11,906][53268] Updated weights for policy 1, policy_version 66140 (0.0007) [2023-10-10 07:25:15,840][53252] Updated weights for policy 0, policy_version 66210 (0.0007) [2023-10-10 07:25:15,917][53268] Updated weights for policy 1, policy_version 66150 (0.0009) [2023-10-10 07:25:16,233][53252] Updated weights for policy 0, policy_version 66220 (0.0008) [2023-10-10 07:25:16,293][53268] Updated weights for policy 1, policy_version 66160 (0.0007) [2023-10-10 07:25:16,605][53252] Updated weights for policy 0, policy_version 66230 (0.0007) [2023-10-10 07:25:16,662][53268] Updated weights for policy 1, policy_version 66170 (0.0008) [2023-10-10 07:25:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 135528448. Throughput: 0: 1686.6, 1: 1670.0. Samples: 33894202. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:16,784][52050] Avg episode reward: [(0, '20.440'), (1, '19.970')] [2023-10-10 07:25:16,963][53252] Updated weights for policy 0, policy_version 66240 (0.0008) [2023-10-10 07:25:20,746][53268] Updated weights for policy 1, policy_version 66180 (0.0007) [2023-10-10 07:25:21,114][53268] Updated weights for policy 1, policy_version 66190 (0.0007) [2023-10-10 07:25:21,213][53252] Updated weights for policy 0, policy_version 66250 (0.0007) [2023-10-10 07:25:21,471][53268] Updated weights for policy 1, policy_version 66200 (0.0009) [2023-10-10 07:25:21,578][53252] Updated weights for policy 0, policy_version 66260 (0.0008) [2023-10-10 07:25:21,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135626752. Throughput: 0: 1682.2, 1: 1674.3. Samples: 33914870. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:21,784][52050] Avg episode reward: [(0, '22.720'), (1, '21.050')] [2023-10-10 07:25:21,946][53252] Updated weights for policy 0, policy_version 66270 (0.0008) [2023-10-10 07:25:25,639][53268] Updated weights for policy 1, policy_version 66210 (0.0007) [2023-10-10 07:25:25,869][53252] Updated weights for policy 0, policy_version 66280 (0.0008) [2023-10-10 07:25:26,013][53268] Updated weights for policy 1, policy_version 66220 (0.0007) [2023-10-10 07:25:26,247][53252] Updated weights for policy 0, policy_version 66290 (0.0007) [2023-10-10 07:25:26,374][53268] Updated weights for policy 1, policy_version 66230 (0.0007) [2023-10-10 07:25:26,622][53252] Updated weights for policy 0, policy_version 66300 (0.0009) [2023-10-10 07:25:26,748][53268] Updated weights for policy 1, policy_version 66240 (0.0007) [2023-10-10 07:25:26,783][52050] Fps is (10 sec: 19660.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 135725056. Throughput: 0: 1664.5, 1: 1664.4. Samples: 33934210. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:26,784][52050] Avg episode reward: [(0, '21.780'), (1, '20.810')] [2023-10-10 07:25:30,670][53252] Updated weights for policy 0, policy_version 66310 (0.0009) [2023-10-10 07:25:31,034][53268] Updated weights for policy 1, policy_version 66250 (0.0010) [2023-10-10 07:25:31,037][53252] Updated weights for policy 0, policy_version 66320 (0.0008) [2023-10-10 07:25:31,403][53268] Updated weights for policy 1, policy_version 66260 (0.0010) [2023-10-10 07:25:31,411][53252] Updated weights for policy 0, policy_version 66330 (0.0008) [2023-10-10 07:25:31,770][53268] Updated weights for policy 1, policy_version 66270 (0.0007) [2023-10-10 07:25:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135757824. Throughput: 0: 1687.5, 1: 1675.6. Samples: 33944748. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:25:31,784][52050] Avg episode reward: [(0, '20.900'), (1, '20.620')] [2023-10-10 07:25:35,652][53252] Updated weights for policy 0, policy_version 66340 (0.0008) [2023-10-10 07:25:36,030][53268] Updated weights for policy 1, policy_version 66280 (0.0009) [2023-10-10 07:25:36,031][53252] Updated weights for policy 0, policy_version 66350 (0.0008) [2023-10-10 07:25:36,406][53252] Updated weights for policy 0, policy_version 66360 (0.0008) [2023-10-10 07:25:36,414][53268] Updated weights for policy 1, policy_version 66290 (0.0007) [2023-10-10 07:25:36,783][52050] Fps is (10 sec: 9830.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135823360. Throughput: 0: 1686.3, 1: 1675.7. Samples: 33965388. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:25:36,784][52050] Avg episode reward: [(0, '21.610'), (1, '19.770')] [2023-10-10 07:25:36,790][53268] Updated weights for policy 1, policy_version 66300 (0.0007) [2023-10-10 07:25:40,583][53252] Updated weights for policy 0, policy_version 66370 (0.0009) [2023-10-10 07:25:40,910][53268] Updated weights for policy 1, policy_version 66310 (0.0008) [2023-10-10 07:25:40,960][53252] Updated weights for policy 0, policy_version 66380 (0.0009) [2023-10-10 07:25:41,263][53268] Updated weights for policy 1, policy_version 66320 (0.0010) [2023-10-10 07:25:41,317][53252] Updated weights for policy 0, policy_version 66390 (0.0008) [2023-10-10 07:25:41,637][53268] Updated weights for policy 1, policy_version 66330 (0.0010) [2023-10-10 07:25:41,697][53252] Updated weights for policy 0, policy_version 66400 (0.0008) [2023-10-10 07:25:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135888896. Throughput: 0: 1666.7, 1: 1659.8. Samples: 33984388. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:25:41,784][52050] Avg episode reward: [(0, '22.200'), (1, '22.830')] [2023-10-10 07:25:45,633][53268] Updated weights for policy 1, policy_version 66340 (0.0011) [2023-10-10 07:25:45,789][53252] Updated weights for policy 0, policy_version 66410 (0.0009) [2023-10-10 07:25:45,999][53268] Updated weights for policy 1, policy_version 66350 (0.0008) [2023-10-10 07:25:46,160][53252] Updated weights for policy 0, policy_version 66420 (0.0009) [2023-10-10 07:25:46,365][53268] Updated weights for policy 1, policy_version 66360 (0.0007) [2023-10-10 07:25:46,537][53252] Updated weights for policy 0, policy_version 66430 (0.0009) [2023-10-10 07:25:46,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 135987200. Throughput: 0: 1685.2, 1: 1672.9. Samples: 33994940. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:25:46,784][52050] Avg episode reward: [(0, '20.380'), (1, '20.840')] [2023-10-10 07:25:50,335][53268] Updated weights for policy 1, policy_version 66370 (0.0008) [2023-10-10 07:25:50,457][53252] Updated weights for policy 0, policy_version 66440 (0.0008) [2023-10-10 07:25:50,701][53268] Updated weights for policy 1, policy_version 66380 (0.0008) [2023-10-10 07:25:50,831][53252] Updated weights for policy 0, policy_version 66450 (0.0007) [2023-10-10 07:25:51,068][53268] Updated weights for policy 1, policy_version 66390 (0.0008) [2023-10-10 07:25:51,194][53252] Updated weights for policy 0, policy_version 66460 (0.0008) [2023-10-10 07:25:51,435][53268] Updated weights for policy 1, policy_version 66400 (0.0009) [2023-10-10 07:25:51,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 136052736. Throughput: 0: 1679.7, 1: 1673.2. Samples: 34015364. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:25:51,784][52050] Avg episode reward: [(0, '20.590'), (1, '19.600')] [2023-10-10 07:25:55,424][53252] Updated weights for policy 0, policy_version 66470 (0.0009) [2023-10-10 07:25:55,525][53268] Updated weights for policy 1, policy_version 66410 (0.0008) [2023-10-10 07:25:55,799][53252] Updated weights for policy 0, policy_version 66480 (0.0009) [2023-10-10 07:25:55,893][53268] Updated weights for policy 1, policy_version 66420 (0.0008) [2023-10-10 07:25:56,160][53252] Updated weights for policy 0, policy_version 66490 (0.0007) [2023-10-10 07:25:56,251][53268] Updated weights for policy 1, policy_version 66430 (0.0009) [2023-10-10 07:25:56,783][52050] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 136118272. Throughput: 0: 1665.4, 1: 1658.6. Samples: 34033976. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:25:56,785][52050] Avg episode reward: [(0, '22.310'), (1, '20.460')] [2023-10-10 07:26:00,068][53252] Updated weights for policy 0, policy_version 66500 (0.0008) [2023-10-10 07:26:00,336][53268] Updated weights for policy 1, policy_version 66440 (0.0009) [2023-10-10 07:26:00,432][53252] Updated weights for policy 0, policy_version 66510 (0.0009) [2023-10-10 07:26:00,703][53268] Updated weights for policy 1, policy_version 66450 (0.0008) [2023-10-10 07:26:00,804][53252] Updated weights for policy 0, policy_version 66520 (0.0008) [2023-10-10 07:26:01,066][53268] Updated weights for policy 1, policy_version 66460 (0.0008) [2023-10-10 07:26:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 136183808. Throughput: 0: 1686.3, 1: 1676.4. Samples: 34045526. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:26:01,785][52050] Avg episode reward: [(0, '21.700'), (1, '22.380')] [2023-10-10 07:26:04,855][53252] Updated weights for policy 0, policy_version 66530 (0.0008) [2023-10-10 07:26:04,988][53268] Updated weights for policy 1, policy_version 66470 (0.0009) [2023-10-10 07:26:05,243][53252] Updated weights for policy 0, policy_version 66540 (0.0007) [2023-10-10 07:26:05,347][53268] Updated weights for policy 1, policy_version 66480 (0.0009) [2023-10-10 07:26:05,619][53252] Updated weights for policy 0, policy_version 66550 (0.0009) [2023-10-10 07:26:05,714][53268] Updated weights for policy 1, policy_version 66490 (0.0008) [2023-10-10 07:26:05,984][53252] Updated weights for policy 0, policy_version 66560 (0.0007) [2023-10-10 07:26:06,783][52050] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 136249344. Throughput: 0: 1674.1, 1: 1666.0. Samples: 34065174. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:26:06,784][52050] Avg episode reward: [(0, '22.190'), (1, '20.480')] [2023-10-10 07:26:09,816][53268] Updated weights for policy 1, policy_version 66500 (0.0007) [2023-10-10 07:26:10,071][53252] Updated weights for policy 0, policy_version 66570 (0.0007) [2023-10-10 07:26:10,180][53268] Updated weights for policy 1, policy_version 66510 (0.0008) [2023-10-10 07:26:10,433][53252] Updated weights for policy 0, policy_version 66580 (0.0007) [2023-10-10 07:26:10,537][53268] Updated weights for policy 1, policy_version 66520 (0.0009) [2023-10-10 07:26:10,805][53252] Updated weights for policy 0, policy_version 66590 (0.0008) [2023-10-10 07:26:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 136314880. Throughput: 0: 1672.3, 1: 1664.1. Samples: 34084346. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:26:11,784][52050] Avg episode reward: [(0, '22.190'), (1, '21.050')] [2023-10-10 07:26:14,468][53268] Updated weights for policy 1, policy_version 66530 (0.0009) [2023-10-10 07:26:14,833][53268] Updated weights for policy 1, policy_version 66540 (0.0008) [2023-10-10 07:26:14,978][53252] Updated weights for policy 0, policy_version 66600 (0.0010) [2023-10-10 07:26:15,198][53268] Updated weights for policy 1, policy_version 66550 (0.0009) [2023-10-10 07:26:15,343][53252] Updated weights for policy 0, policy_version 66610 (0.0009) [2023-10-10 07:26:15,571][53268] Updated weights for policy 1, policy_version 66560 (0.0008) [2023-10-10 07:26:15,723][53252] Updated weights for policy 0, policy_version 66620 (0.0009) [2023-10-10 07:26:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 136380416. Throughput: 0: 1682.5, 1: 1683.2. Samples: 34096204. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) [2023-10-10 07:26:16,784][52050] Avg episode reward: [(0, '21.810'), (1, '22.280')] [2023-10-10 07:26:19,707][53252] Updated weights for policy 0, policy_version 66630 (0.0009) [2023-10-10 07:26:19,819][53268] Updated weights for policy 1, policy_version 66570 (0.0007) [2023-10-10 07:26:20,079][53252] Updated weights for policy 0, policy_version 66640 (0.0008) [2023-10-10 07:26:20,195][53268] Updated weights for policy 1, policy_version 66580 (0.0009) [2023-10-10 07:26:20,455][53252] Updated weights for policy 0, policy_version 66650 (0.0010) [2023-10-10 07:26:20,551][53268] Updated weights for policy 1, policy_version 66590 (0.0010) [2023-10-10 07:26:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136445952. Throughput: 0: 1661.2, 1: 1666.9. Samples: 34115150. 
Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:21,784][52050] Avg episode reward: [(0, '20.800'), (1, '21.750')] [2023-10-10 07:26:24,545][53252] Updated weights for policy 0, policy_version 66660 (0.0008) [2023-10-10 07:26:24,681][53268] Updated weights for policy 1, policy_version 66600 (0.0008) [2023-10-10 07:26:24,907][53252] Updated weights for policy 0, policy_version 66670 (0.0009) [2023-10-10 07:26:25,054][53268] Updated weights for policy 1, policy_version 66610 (0.0009) [2023-10-10 07:26:25,273][53252] Updated weights for policy 0, policy_version 66680 (0.0008) [2023-10-10 07:26:25,412][53268] Updated weights for policy 1, policy_version 66620 (0.0008) [2023-10-10 07:26:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136511488. Throughput: 0: 1670.0, 1: 1676.7. Samples: 34134992. Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:26,784][52050] Avg episode reward: [(0, '20.980'), (1, '20.520')] [2023-10-10 07:26:29,428][53268] Updated weights for policy 1, policy_version 66630 (0.0009) [2023-10-10 07:26:29,457][53252] Updated weights for policy 0, policy_version 66690 (0.0009) [2023-10-10 07:26:29,786][53268] Updated weights for policy 1, policy_version 66640 (0.0010) [2023-10-10 07:26:29,822][53252] Updated weights for policy 0, policy_version 66700 (0.0007) [2023-10-10 07:26:30,157][53268] Updated weights for policy 1, policy_version 66650 (0.0009) [2023-10-10 07:26:30,198][53252] Updated weights for policy 0, policy_version 66710 (0.0008) [2023-10-10 07:26:30,562][53252] Updated weights for policy 0, policy_version 66720 (0.0008) [2023-10-10 07:26:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136577024. Throughput: 0: 1679.0, 1: 1692.4. Samples: 34146654. 
Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:31,784][52050] Avg episode reward: [(0, '20.740'), (1, '20.300')] [2023-10-10 07:26:34,020][53268] Updated weights for policy 1, policy_version 66660 (0.0008) [2023-10-10 07:26:34,382][53268] Updated weights for policy 1, policy_version 66670 (0.0007) [2023-10-10 07:26:34,524][53252] Updated weights for policy 0, policy_version 66730 (0.0008) [2023-10-10 07:26:34,752][53268] Updated weights for policy 1, policy_version 66680 (0.0007) [2023-10-10 07:26:34,891][53252] Updated weights for policy 0, policy_version 66740 (0.0009) [2023-10-10 07:26:35,259][53252] Updated weights for policy 0, policy_version 66750 (0.0009) [2023-10-10 07:26:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136642560. Throughput: 0: 1657.3, 1: 1669.0. Samples: 34165048. Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:36,784][52050] Avg episode reward: [(0, '22.650'), (1, '20.590')] [2023-10-10 07:26:38,692][53268] Updated weights for policy 1, policy_version 66690 (0.0009) [2023-10-10 07:26:39,059][53268] Updated weights for policy 1, policy_version 66700 (0.0009) [2023-10-10 07:26:39,415][53268] Updated weights for policy 1, policy_version 66710 (0.0007) [2023-10-10 07:26:39,435][53252] Updated weights for policy 0, policy_version 66760 (0.0008) [2023-10-10 07:26:39,778][53268] Updated weights for policy 1, policy_version 66720 (0.0008) [2023-10-10 07:26:39,806][53252] Updated weights for policy 0, policy_version 66770 (0.0008) [2023-10-10 07:26:40,170][53252] Updated weights for policy 0, policy_version 66780 (0.0007) [2023-10-10 07:26:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 136708096. Throughput: 0: 1678.5, 1: 1689.0. Samples: 34185512. 
Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:41,784][52050] Avg episode reward: [(0, '22.540'), (1, '20.270')] [2023-10-10 07:26:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000066784_68386816.pth... [2023-10-10 07:26:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000066720_68321280.pth... [2023-10-10 07:26:41,830][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000065152_66715648.pth [2023-10-10 07:26:41,836][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000065216_66781184.pth [2023-10-10 07:26:43,939][53268] Updated weights for policy 1, policy_version 66730 (0.0007) [2023-10-10 07:26:44,159][53252] Updated weights for policy 0, policy_version 66790 (0.0008) [2023-10-10 07:26:44,307][53268] Updated weights for policy 1, policy_version 66740 (0.0007) [2023-10-10 07:26:44,523][53252] Updated weights for policy 0, policy_version 66800 (0.0008) [2023-10-10 07:26:44,663][53268] Updated weights for policy 1, policy_version 66750 (0.0007) [2023-10-10 07:26:44,898][53252] Updated weights for policy 0, policy_version 66810 (0.0007) [2023-10-10 07:26:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 136773632. Throughput: 0: 1667.4, 1: 1679.4. Samples: 34196130. 
Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:46,784][52050] Avg episode reward: [(0, '24.580'), (1, '20.650')] [2023-10-10 07:26:48,674][53268] Updated weights for policy 1, policy_version 66760 (0.0007) [2023-10-10 07:26:49,023][53252] Updated weights for policy 0, policy_version 66820 (0.0008) [2023-10-10 07:26:49,037][53268] Updated weights for policy 1, policy_version 66770 (0.0009) [2023-10-10 07:26:49,387][53252] Updated weights for policy 0, policy_version 66830 (0.0007) [2023-10-10 07:26:49,406][53268] Updated weights for policy 1, policy_version 66780 (0.0008) [2023-10-10 07:26:49,757][53252] Updated weights for policy 0, policy_version 66840 (0.0009) [2023-10-10 07:26:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 136839168. Throughput: 0: 1664.2, 1: 1673.2. Samples: 34215354. Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:51,784][52050] Avg episode reward: [(0, '24.640'), (1, '21.350')] [2023-10-10 07:26:53,541][53268] Updated weights for policy 1, policy_version 66790 (0.0009) [2023-10-10 07:26:53,763][53252] Updated weights for policy 0, policy_version 66850 (0.0008) [2023-10-10 07:26:53,906][53268] Updated weights for policy 1, policy_version 66800 (0.0010) [2023-10-10 07:26:54,144][53252] Updated weights for policy 0, policy_version 66860 (0.0009) [2023-10-10 07:26:54,265][53268] Updated weights for policy 1, policy_version 66810 (0.0007) [2023-10-10 07:26:54,516][53252] Updated weights for policy 0, policy_version 66870 (0.0009) [2023-10-10 07:26:54,885][53252] Updated weights for policy 0, policy_version 66880 (0.0008) [2023-10-10 07:26:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 136904704. Throughput: 0: 1679.2, 1: 1690.9. Samples: 34235998. 
Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:26:56,784][52050] Avg episode reward: [(0, '22.570'), (1, '21.230')] [2023-10-10 07:26:58,481][53268] Updated weights for policy 1, policy_version 66820 (0.0009) [2023-10-10 07:26:58,857][53268] Updated weights for policy 1, policy_version 66830 (0.0008) [2023-10-10 07:26:58,918][53252] Updated weights for policy 0, policy_version 66890 (0.0010) [2023-10-10 07:26:59,224][53268] Updated weights for policy 1, policy_version 66840 (0.0008) [2023-10-10 07:26:59,286][53252] Updated weights for policy 0, policy_version 66900 (0.0007) [2023-10-10 07:26:59,657][53252] Updated weights for policy 0, policy_version 66910 (0.0009) [2023-10-10 07:27:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 136970240. Throughput: 0: 1657.1, 1: 1670.0. Samples: 34245924. Policy #0 lag: (min: 17.0, avg: 27.9, max: 49.0) [2023-10-10 07:27:01,784][52050] Avg episode reward: [(0, '22.510'), (1, '21.240')] [2023-10-10 07:27:03,376][53268] Updated weights for policy 1, policy_version 66850 (0.0008) [2023-10-10 07:27:03,744][53268] Updated weights for policy 1, policy_version 66860 (0.0009) [2023-10-10 07:27:03,763][53252] Updated weights for policy 0, policy_version 66920 (0.0009) [2023-10-10 07:27:04,104][53268] Updated weights for policy 1, policy_version 66870 (0.0008) [2023-10-10 07:27:04,141][53252] Updated weights for policy 0, policy_version 66930 (0.0008) [2023-10-10 07:27:04,473][53268] Updated weights for policy 1, policy_version 66880 (0.0010) [2023-10-10 07:27:04,513][53252] Updated weights for policy 0, policy_version 66940 (0.0007) [2023-10-10 07:27:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137035776. Throughput: 0: 1667.1, 1: 1673.9. Samples: 34265494. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:06,784][52050] Avg episode reward: [(0, '21.300'), (1, '20.120')] [2023-10-10 07:27:08,556][53252] Updated weights for policy 0, policy_version 66950 (0.0008) [2023-10-10 07:27:08,649][53268] Updated weights for policy 1, policy_version 66890 (0.0010) [2023-10-10 07:27:08,926][53252] Updated weights for policy 0, policy_version 66960 (0.0009) [2023-10-10 07:27:09,028][53268] Updated weights for policy 1, policy_version 66900 (0.0009) [2023-10-10 07:27:09,293][53252] Updated weights for policy 0, policy_version 66970 (0.0009) [2023-10-10 07:27:09,395][53268] Updated weights for policy 1, policy_version 66910 (0.0008) [2023-10-10 07:27:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 137101312. Throughput: 0: 1676.5, 1: 1679.6. Samples: 34286018. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:11,784][52050] Avg episode reward: [(0, '20.430'), (1, '20.270')] [2023-10-10 07:27:13,325][53252] Updated weights for policy 0, policy_version 66980 (0.0008) [2023-10-10 07:27:13,481][53268] Updated weights for policy 1, policy_version 66920 (0.0008) [2023-10-10 07:27:13,697][53252] Updated weights for policy 0, policy_version 66990 (0.0007) [2023-10-10 07:27:13,851][53268] Updated weights for policy 1, policy_version 66930 (0.0007) [2023-10-10 07:27:14,074][53252] Updated weights for policy 0, policy_version 67000 (0.0007) [2023-10-10 07:27:14,224][53268] Updated weights for policy 1, policy_version 66940 (0.0009) [2023-10-10 07:27:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137166848. Throughput: 0: 1653.4, 1: 1654.1. Samples: 34295494. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:16,784][52050] Avg episode reward: [(0, '21.650'), (1, '21.610')] [2023-10-10 07:27:18,226][53252] Updated weights for policy 0, policy_version 67010 (0.0009) [2023-10-10 07:27:18,301][53268] Updated weights for policy 1, policy_version 66950 (0.0009) [2023-10-10 07:27:18,588][53252] Updated weights for policy 0, policy_version 67020 (0.0008) [2023-10-10 07:27:18,667][53268] Updated weights for policy 1, policy_version 66960 (0.0009) [2023-10-10 07:27:18,955][53252] Updated weights for policy 0, policy_version 67030 (0.0008) [2023-10-10 07:27:19,040][53268] Updated weights for policy 1, policy_version 66970 (0.0008) [2023-10-10 07:27:19,326][53252] Updated weights for policy 0, policy_version 67040 (0.0009) [2023-10-10 07:27:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137232384. Throughput: 0: 1675.7, 1: 1671.2. Samples: 34315662. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:21,784][52050] Avg episode reward: [(0, '22.030'), (1, '20.050')] [2023-10-10 07:27:23,216][53268] Updated weights for policy 1, policy_version 66980 (0.0008) [2023-10-10 07:27:23,346][53252] Updated weights for policy 0, policy_version 67050 (0.0009) [2023-10-10 07:27:23,577][53268] Updated weights for policy 1, policy_version 66990 (0.0008) [2023-10-10 07:27:23,710][53252] Updated weights for policy 0, policy_version 67060 (0.0007) [2023-10-10 07:27:23,935][53268] Updated weights for policy 1, policy_version 67000 (0.0009) [2023-10-10 07:27:24,071][53252] Updated weights for policy 0, policy_version 67070 (0.0008) [2023-10-10 07:27:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137297920. Throughput: 0: 1682.1, 1: 1673.8. Samples: 34336526. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:26,784][52050] Avg episode reward: [(0, '21.640'), (1, '20.830')] [2023-10-10 07:27:27,935][53268] Updated weights for policy 1, policy_version 67010 (0.0008) [2023-10-10 07:27:28,290][53252] Updated weights for policy 0, policy_version 67080 (0.0008) [2023-10-10 07:27:28,300][53268] Updated weights for policy 1, policy_version 67020 (0.0007) [2023-10-10 07:27:28,660][53252] Updated weights for policy 0, policy_version 67090 (0.0009) [2023-10-10 07:27:28,671][53268] Updated weights for policy 1, policy_version 67030 (0.0009) [2023-10-10 07:27:29,035][53268] Updated weights for policy 1, policy_version 67040 (0.0008) [2023-10-10 07:27:29,036][53252] Updated weights for policy 0, policy_version 67100 (0.0008) [2023-10-10 07:27:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 137363456. Throughput: 0: 1661.8, 1: 1658.0. Samples: 34345522. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:31,784][52050] Avg episode reward: [(0, '19.490'), (1, '21.090')] [2023-10-10 07:27:33,073][53252] Updated weights for policy 0, policy_version 67110 (0.0008) [2023-10-10 07:27:33,119][53268] Updated weights for policy 1, policy_version 67050 (0.0008) [2023-10-10 07:27:33,433][53252] Updated weights for policy 0, policy_version 67120 (0.0008) [2023-10-10 07:27:33,478][53268] Updated weights for policy 1, policy_version 67060 (0.0009) [2023-10-10 07:27:33,809][53252] Updated weights for policy 0, policy_version 67130 (0.0010) [2023-10-10 07:27:33,842][53268] Updated weights for policy 1, policy_version 67070 (0.0007) [2023-10-10 07:27:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137428992. Throughput: 0: 1686.7, 1: 1672.1. Samples: 34366502. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:36,785][52050] Avg episode reward: [(0, '20.180'), (1, '20.060')] [2023-10-10 07:27:37,760][53252] Updated weights for policy 0, policy_version 67140 (0.0009) [2023-10-10 07:27:38,112][53268] Updated weights for policy 1, policy_version 67080 (0.0008) [2023-10-10 07:27:38,129][53252] Updated weights for policy 0, policy_version 67150 (0.0008) [2023-10-10 07:27:38,473][53268] Updated weights for policy 1, policy_version 67090 (0.0009) [2023-10-10 07:27:38,492][53252] Updated weights for policy 0, policy_version 67160 (0.0010) [2023-10-10 07:27:38,837][53268] Updated weights for policy 1, policy_version 67100 (0.0007) [2023-10-10 07:27:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 137494528. Throughput: 0: 1688.8, 1: 1672.2. Samples: 34387240. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:41,784][52050] Avg episode reward: [(0, '20.830'), (1, '22.000')] [2023-10-10 07:27:42,682][53252] Updated weights for policy 0, policy_version 67170 (0.0008) [2023-10-10 07:27:42,882][53268] Updated weights for policy 1, policy_version 67110 (0.0009) [2023-10-10 07:27:43,089][53252] Updated weights for policy 0, policy_version 67180 (0.0007) [2023-10-10 07:27:43,250][53268] Updated weights for policy 1, policy_version 67120 (0.0008) [2023-10-10 07:27:43,462][53252] Updated weights for policy 0, policy_version 67190 (0.0008) [2023-10-10 07:27:43,609][53268] Updated weights for policy 1, policy_version 67130 (0.0009) [2023-10-10 07:27:43,836][53252] Updated weights for policy 0, policy_version 67200 (0.0007) [2023-10-10 07:27:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137560064. Throughput: 0: 1679.0, 1: 1663.0. Samples: 34396312. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-10 07:27:46,784][52050] Avg episode reward: [(0, '21.450'), (1, '21.450')] [2023-10-10 07:27:47,730][53268] Updated weights for policy 1, policy_version 67140 (0.0009) [2023-10-10 07:27:47,805][53252] Updated weights for policy 0, policy_version 67210 (0.0009) [2023-10-10 07:27:48,098][53268] Updated weights for policy 1, policy_version 67150 (0.0008) [2023-10-10 07:27:48,184][53252] Updated weights for policy 0, policy_version 67220 (0.0008) [2023-10-10 07:27:48,455][53268] Updated weights for policy 1, policy_version 67160 (0.0009) [2023-10-10 07:27:48,548][53252] Updated weights for policy 0, policy_version 67230 (0.0009) [2023-10-10 07:27:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137625600. Throughput: 0: 1688.7, 1: 1675.6. Samples: 34416888. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:27:51,784][52050] Avg episode reward: [(0, '20.940'), (1, '21.560')] [2023-10-10 07:27:52,592][53268] Updated weights for policy 1, policy_version 67170 (0.0008) [2023-10-10 07:27:52,763][53252] Updated weights for policy 0, policy_version 67240 (0.0007) [2023-10-10 07:27:52,954][53268] Updated weights for policy 1, policy_version 67180 (0.0007) [2023-10-10 07:27:53,128][53252] Updated weights for policy 0, policy_version 67250 (0.0007) [2023-10-10 07:27:53,317][53268] Updated weights for policy 1, policy_version 67190 (0.0008) [2023-10-10 07:27:53,501][53252] Updated weights for policy 0, policy_version 67260 (0.0008) [2023-10-10 07:27:53,682][53268] Updated weights for policy 1, policy_version 67200 (0.0007) [2023-10-10 07:27:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137691136. Throughput: 0: 1690.6, 1: 1677.6. Samples: 34437588. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:27:56,784][52050] Avg episode reward: [(0, '22.950'), (1, '22.050')] [2023-10-10 07:27:57,476][53252] Updated weights for policy 0, policy_version 67270 (0.0009) [2023-10-10 07:27:57,846][53252] Updated weights for policy 0, policy_version 67280 (0.0009) [2023-10-10 07:27:57,862][53268] Updated weights for policy 1, policy_version 67210 (0.0009) [2023-10-10 07:27:58,223][53252] Updated weights for policy 0, policy_version 67290 (0.0008) [2023-10-10 07:27:58,230][53268] Updated weights for policy 1, policy_version 67220 (0.0008) [2023-10-10 07:27:58,599][53268] Updated weights for policy 1, policy_version 67230 (0.0007) [2023-10-10 07:28:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137756672. Throughput: 0: 1686.7, 1: 1670.3. Samples: 34446558. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:01,784][52050] Avg episode reward: [(0, '23.410'), (1, '19.990')] [2023-10-10 07:28:02,335][53252] Updated weights for policy 0, policy_version 67300 (0.0007) [2023-10-10 07:28:02,415][53268] Updated weights for policy 1, policy_version 67240 (0.0008) [2023-10-10 07:28:02,721][53252] Updated weights for policy 0, policy_version 67310 (0.0008) [2023-10-10 07:28:02,786][53268] Updated weights for policy 1, policy_version 67250 (0.0008) [2023-10-10 07:28:03,089][53252] Updated weights for policy 0, policy_version 67320 (0.0007) [2023-10-10 07:28:03,144][53268] Updated weights for policy 1, policy_version 67260 (0.0007) [2023-10-10 07:28:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137822208. Throughput: 0: 1690.5, 1: 1682.3. Samples: 34467440. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:06,784][52050] Avg episode reward: [(0, '22.880'), (1, '19.220')] [2023-10-10 07:28:07,027][53268] Updated weights for policy 1, policy_version 67270 (0.0008) [2023-10-10 07:28:07,072][53252] Updated weights for policy 0, policy_version 67330 (0.0007) [2023-10-10 07:28:07,401][53268] Updated weights for policy 1, policy_version 67280 (0.0008) [2023-10-10 07:28:07,447][53252] Updated weights for policy 0, policy_version 67340 (0.0008) [2023-10-10 07:28:07,765][53268] Updated weights for policy 1, policy_version 67290 (0.0008) [2023-10-10 07:28:07,814][53252] Updated weights for policy 0, policy_version 67350 (0.0009) [2023-10-10 07:28:08,189][53252] Updated weights for policy 0, policy_version 67360 (0.0009) [2023-10-10 07:28:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137887744. Throughput: 0: 1689.7, 1: 1679.6. Samples: 34488144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:11,784][52050] Avg episode reward: [(0, '22.850'), (1, '19.030')] [2023-10-10 07:28:11,951][53268] Updated weights for policy 1, policy_version 67300 (0.0008) [2023-10-10 07:28:12,223][53252] Updated weights for policy 0, policy_version 67370 (0.0009) [2023-10-10 07:28:12,310][53268] Updated weights for policy 1, policy_version 67310 (0.0008) [2023-10-10 07:28:12,597][53252] Updated weights for policy 0, policy_version 67380 (0.0008) [2023-10-10 07:28:12,678][53268] Updated weights for policy 1, policy_version 67320 (0.0008) [2023-10-10 07:28:12,976][53252] Updated weights for policy 0, policy_version 67390 (0.0008) [2023-10-10 07:28:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 137953280. Throughput: 0: 1690.0, 1: 1680.8. Samples: 34497208. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:16,784][52050] Avg episode reward: [(0, '22.690'), (1, '20.210')] [2023-10-10 07:28:16,798][53252] Updated weights for policy 0, policy_version 67400 (0.0008) [2023-10-10 07:28:16,852][53268] Updated weights for policy 1, policy_version 67330 (0.0008) [2023-10-10 07:28:17,171][53252] Updated weights for policy 0, policy_version 67410 (0.0009) [2023-10-10 07:28:17,223][53268] Updated weights for policy 1, policy_version 67340 (0.0008) [2023-10-10 07:28:17,536][53252] Updated weights for policy 0, policy_version 67420 (0.0008) [2023-10-10 07:28:17,579][53268] Updated weights for policy 1, policy_version 67350 (0.0009) [2023-10-10 07:28:17,946][53268] Updated weights for policy 1, policy_version 67360 (0.0009) [2023-10-10 07:28:21,609][53252] Updated weights for policy 0, policy_version 67430 (0.0008) [2023-10-10 07:28:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138018816. Throughput: 0: 1685.2, 1: 1678.3. Samples: 34517858. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:21,784][52050] Avg episode reward: [(0, '21.910'), (1, '18.860')] [2023-10-10 07:28:21,976][53252] Updated weights for policy 0, policy_version 67440 (0.0007) [2023-10-10 07:28:22,020][53268] Updated weights for policy 1, policy_version 67370 (0.0009) [2023-10-10 07:28:22,344][53252] Updated weights for policy 0, policy_version 67450 (0.0008) [2023-10-10 07:28:22,387][53268] Updated weights for policy 1, policy_version 67380 (0.0007) [2023-10-10 07:28:22,744][53268] Updated weights for policy 1, policy_version 67390 (0.0008) [2023-10-10 07:28:26,353][53252] Updated weights for policy 0, policy_version 67460 (0.0009) [2023-10-10 07:28:26,730][53252] Updated weights for policy 0, policy_version 67470 (0.0007) [2023-10-10 07:28:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 138084352. Throughput: 0: 1680.7, 1: 1680.5. 
Samples: 34538490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:26,784][52050] Avg episode reward: [(0, '21.170'), (1, '21.460')] [2023-10-10 07:28:26,895][53268] Updated weights for policy 1, policy_version 67400 (0.0007) [2023-10-10 07:28:27,107][53252] Updated weights for policy 0, policy_version 67480 (0.0009) [2023-10-10 07:28:27,266][53268] Updated weights for policy 1, policy_version 67410 (0.0009) [2023-10-10 07:28:27,631][53268] Updated weights for policy 1, policy_version 67420 (0.0009) [2023-10-10 07:28:31,227][53252] Updated weights for policy 0, policy_version 67490 (0.0009) [2023-10-10 07:28:31,637][53252] Updated weights for policy 0, policy_version 67500 (0.0007) [2023-10-10 07:28:31,715][53268] Updated weights for policy 1, policy_version 67430 (0.0008) [2023-10-10 07:28:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 138149888. Throughput: 0: 1687.7, 1: 1680.7. Samples: 34547890. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:28:31,784][52050] Avg episode reward: [(0, '20.690'), (1, '20.870')] [2023-10-10 07:28:32,009][53252] Updated weights for policy 0, policy_version 67510 (0.0007) [2023-10-10 07:28:32,078][53268] Updated weights for policy 1, policy_version 67440 (0.0008) [2023-10-10 07:28:32,380][53252] Updated weights for policy 0, policy_version 67520 (0.0009) [2023-10-10 07:28:32,443][53268] Updated weights for policy 1, policy_version 67450 (0.0009) [2023-10-10 07:28:36,538][53252] Updated weights for policy 0, policy_version 67530 (0.0009) [2023-10-10 07:28:36,631][53268] Updated weights for policy 1, policy_version 67460 (0.0010) [2023-10-10 07:28:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138215424. Throughput: 0: 1684.0, 1: 1678.6. Samples: 34568206. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:28:36,784][52050] Avg episode reward: [(0, '20.680'), (1, '21.630')] [2023-10-10 07:28:36,914][53252] Updated weights for policy 0, policy_version 67540 (0.0008) [2023-10-10 07:28:36,990][53268] Updated weights for policy 1, policy_version 67470 (0.0007) [2023-10-10 07:28:37,287][53252] Updated weights for policy 0, policy_version 67550 (0.0008) [2023-10-10 07:28:37,354][53268] Updated weights for policy 1, policy_version 67480 (0.0008) [2023-10-10 07:28:41,459][53268] Updated weights for policy 1, policy_version 67490 (0.0010) [2023-10-10 07:28:41,502][53252] Updated weights for policy 0, policy_version 67560 (0.0008) [2023-10-10 07:28:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 138280960. Throughput: 0: 1674.5, 1: 1680.0. Samples: 34588542. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:28:41,784][52050] Avg episode reward: [(0, '21.850'), (1, '21.240')] [2023-10-10 07:28:41,829][53268] Updated weights for policy 1, policy_version 67500 (0.0008) [2023-10-10 07:28:41,873][53252] Updated weights for policy 0, policy_version 67570 (0.0007) [2023-10-10 07:28:42,201][53268] Updated weights for policy 1, policy_version 67510 (0.0008) [2023-10-10 07:28:42,236][53252] Updated weights for policy 0, policy_version 67580 (0.0007) [2023-10-10 07:28:42,382][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000067584_69206016.pth... [2023-10-10 07:28:42,410][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000066016_67600384.pth [2023-10-10 07:28:42,562][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000067520_69140480.pth... 
[2023-10-10 07:28:42,566][53268] Updated weights for policy 1, policy_version 67520 (0.0010) [2023-10-10 07:28:42,601][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000065952_67534848.pth [2023-10-10 07:28:46,311][53252] Updated weights for policy 0, policy_version 67590 (0.0008) [2023-10-10 07:28:46,673][53252] Updated weights for policy 0, policy_version 67600 (0.0008) [2023-10-10 07:28:46,739][53268] Updated weights for policy 1, policy_version 67530 (0.0008) [2023-10-10 07:28:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138346496. Throughput: 0: 1681.4, 1: 1683.1. Samples: 34597958. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:28:46,784][52050] Avg episode reward: [(0, '19.770'), (1, '21.910')] [2023-10-10 07:28:47,050][53252] Updated weights for policy 0, policy_version 67610 (0.0007) [2023-10-10 07:28:47,117][53268] Updated weights for policy 1, policy_version 67540 (0.0008) [2023-10-10 07:28:47,490][53268] Updated weights for policy 1, policy_version 67550 (0.0008) [2023-10-10 07:28:51,209][53252] Updated weights for policy 0, policy_version 67620 (0.0007) [2023-10-10 07:28:51,504][53268] Updated weights for policy 1, policy_version 67560 (0.0009) [2023-10-10 07:28:51,592][53252] Updated weights for policy 0, policy_version 67630 (0.0007) [2023-10-10 07:28:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138412032. Throughput: 0: 1679.7, 1: 1677.6. Samples: 34618518. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:28:51,784][52050] Avg episode reward: [(0, '20.980'), (1, '19.550')] [2023-10-10 07:28:51,870][53268] Updated weights for policy 1, policy_version 67570 (0.0008) [2023-10-10 07:28:51,960][53252] Updated weights for policy 0, policy_version 67640 (0.0008) [2023-10-10 07:28:52,239][53268] Updated weights for policy 1, policy_version 67580 (0.0009) [2023-10-10 07:28:55,907][53252] Updated weights for policy 0, policy_version 67650 (0.0007) [2023-10-10 07:28:56,279][53252] Updated weights for policy 0, policy_version 67660 (0.0007) [2023-10-10 07:28:56,369][53268] Updated weights for policy 1, policy_version 67590 (0.0008) [2023-10-10 07:28:56,651][53252] Updated weights for policy 0, policy_version 67670 (0.0008) [2023-10-10 07:28:56,733][53268] Updated weights for policy 1, policy_version 67600 (0.0008) [2023-10-10 07:28:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138477568. Throughput: 0: 1669.3, 1: 1677.8. Samples: 34638764. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:28:56,784][52050] Avg episode reward: [(0, '21.560'), (1, '20.160')] [2023-10-10 07:28:57,035][53252] Updated weights for policy 0, policy_version 67680 (0.0008) [2023-10-10 07:28:57,112][53268] Updated weights for policy 1, policy_version 67610 (0.0009) [2023-10-10 07:29:01,003][53252] Updated weights for policy 0, policy_version 67690 (0.0009) [2023-10-10 07:29:01,016][53268] Updated weights for policy 1, policy_version 67620 (0.0008) [2023-10-10 07:29:01,376][53268] Updated weights for policy 1, policy_version 67630 (0.0009) [2023-10-10 07:29:01,379][53252] Updated weights for policy 0, policy_version 67700 (0.0009) [2023-10-10 07:29:01,747][53268] Updated weights for policy 1, policy_version 67640 (0.0008) [2023-10-10 07:29:01,751][53252] Updated weights for policy 0, policy_version 67710 (0.0009) [2023-10-10 07:29:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 138543104. Throughput: 0: 1683.7, 1: 1676.1. Samples: 34648400. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:29:01,784][52050] Avg episode reward: [(0, '21.430'), (1, '20.100')] [2023-10-10 07:29:05,665][53268] Updated weights for policy 1, policy_version 67650 (0.0009) [2023-10-10 07:29:05,694][53252] Updated weights for policy 0, policy_version 67720 (0.0007) [2023-10-10 07:29:06,029][53268] Updated weights for policy 1, policy_version 67660 (0.0008) [2023-10-10 07:29:06,070][53252] Updated weights for policy 0, policy_version 67730 (0.0007) [2023-10-10 07:29:06,395][53268] Updated weights for policy 1, policy_version 67670 (0.0008) [2023-10-10 07:29:06,447][53252] Updated weights for policy 0, policy_version 67740 (0.0008) [2023-10-10 07:29:06,759][53268] Updated weights for policy 1, policy_version 67680 (0.0008) [2023-10-10 07:29:06,783][52050] Fps is (10 sec: 19661.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 138674176. Throughput: 0: 1684.1, 1: 1680.6. 
Samples: 34669268. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:29:06,784][52050] Avg episode reward: [(0, '21.810'), (1, '20.550')] [2023-10-10 07:29:10,639][53252] Updated weights for policy 0, policy_version 67750 (0.0009) [2023-10-10 07:29:10,932][53268] Updated weights for policy 1, policy_version 67690 (0.0009) [2023-10-10 07:29:11,008][53252] Updated weights for policy 0, policy_version 67760 (0.0007) [2023-10-10 07:29:11,304][53268] Updated weights for policy 1, policy_version 67700 (0.0008) [2023-10-10 07:29:11,372][53252] Updated weights for policy 0, policy_version 67770 (0.0009) [2023-10-10 07:29:11,665][53268] Updated weights for policy 1, policy_version 67710 (0.0007) [2023-10-10 07:29:11,783][52050] Fps is (10 sec: 19660.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 138739712. Throughput: 0: 1661.4, 1: 1658.8. Samples: 34687902. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-10 07:29:11,784][52050] Avg episode reward: [(0, '21.940'), (1, '19.980')] [2023-10-10 07:29:15,399][53252] Updated weights for policy 0, policy_version 67780 (0.0009) [2023-10-10 07:29:15,767][53252] Updated weights for policy 0, policy_version 67790 (0.0009) [2023-10-10 07:29:15,974][53268] Updated weights for policy 1, policy_version 67720 (0.0008) [2023-10-10 07:29:16,136][53252] Updated weights for policy 0, policy_version 67800 (0.0009) [2023-10-10 07:29:16,344][53268] Updated weights for policy 1, policy_version 67730 (0.0008) [2023-10-10 07:29:16,708][53268] Updated weights for policy 1, policy_version 67740 (0.0007) [2023-10-10 07:29:16,783][52050] Fps is (10 sec: 9830.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 138772480. Throughput: 0: 1680.4, 1: 1668.4. Samples: 34698582. 
Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:16,784][52050] Avg episode reward: [(0, '20.700'), (1, '21.600')] [2023-10-10 07:29:20,275][53252] Updated weights for policy 0, policy_version 67810 (0.0008) [2023-10-10 07:29:20,681][53252] Updated weights for policy 0, policy_version 67820 (0.0008) [2023-10-10 07:29:20,972][53268] Updated weights for policy 1, policy_version 67750 (0.0009) [2023-10-10 07:29:21,049][53252] Updated weights for policy 0, policy_version 67830 (0.0009) [2023-10-10 07:29:21,343][53268] Updated weights for policy 1, policy_version 67760 (0.0008) [2023-10-10 07:29:21,413][53252] Updated weights for policy 0, policy_version 67840 (0.0009) [2023-10-10 07:29:21,707][53268] Updated weights for policy 1, policy_version 67770 (0.0011) [2023-10-10 07:29:21,783][52050] Fps is (10 sec: 9830.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138838016. Throughput: 0: 1677.7, 1: 1667.3. Samples: 34718734. Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:21,784][52050] Avg episode reward: [(0, '20.990'), (1, '21.430')] [2023-10-10 07:29:25,375][53252] Updated weights for policy 0, policy_version 67850 (0.0008) [2023-10-10 07:29:25,749][53252] Updated weights for policy 0, policy_version 67860 (0.0009) [2023-10-10 07:29:25,875][53268] Updated weights for policy 1, policy_version 67780 (0.0010) [2023-10-10 07:29:26,111][53252] Updated weights for policy 0, policy_version 67870 (0.0009) [2023-10-10 07:29:26,240][53268] Updated weights for policy 1, policy_version 67790 (0.0009) [2023-10-10 07:29:26,605][53268] Updated weights for policy 1, policy_version 67800 (0.0009) [2023-10-10 07:29:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138903552. Throughput: 0: 1664.8, 1: 1652.1. Samples: 34737806. 
Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:26,784][52050] Avg episode reward: [(0, '21.810'), (1, '22.590')] [2023-10-10 07:29:30,533][53252] Updated weights for policy 0, policy_version 67880 (0.0011) [2023-10-10 07:29:30,905][53252] Updated weights for policy 0, policy_version 67890 (0.0010) [2023-10-10 07:29:31,041][53268] Updated weights for policy 1, policy_version 67810 (0.0010) [2023-10-10 07:29:31,282][53252] Updated weights for policy 0, policy_version 67900 (0.0009) [2023-10-10 07:29:31,408][53268] Updated weights for policy 1, policy_version 67820 (0.0009) [2023-10-10 07:29:31,783][53268] Updated weights for policy 1, policy_version 67830 (0.0009) [2023-10-10 07:29:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138969088. Throughput: 0: 1672.7, 1: 1651.2. Samples: 34747532. Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:31,784][52050] Avg episode reward: [(0, '21.480'), (1, '22.730')] [2023-10-10 07:29:32,141][53268] Updated weights for policy 1, policy_version 67840 (0.0008) [2023-10-10 07:29:35,454][53252] Updated weights for policy 0, policy_version 67910 (0.0008) [2023-10-10 07:29:35,821][53252] Updated weights for policy 0, policy_version 67920 (0.0010) [2023-10-10 07:29:36,192][53252] Updated weights for policy 0, policy_version 67930 (0.0010) [2023-10-10 07:29:36,566][53268] Updated weights for policy 1, policy_version 67850 (0.0010) [2023-10-10 07:29:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139034624. Throughput: 0: 1661.7, 1: 1642.9. Samples: 34767228. 
Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:36,784][52050] Avg episode reward: [(0, '22.200'), (1, '22.650')] [2023-10-10 07:29:36,930][53268] Updated weights for policy 1, policy_version 67860 (0.0008) [2023-10-10 07:29:37,298][53268] Updated weights for policy 1, policy_version 67870 (0.0011) [2023-10-10 07:29:40,691][53252] Updated weights for policy 0, policy_version 67940 (0.0009) [2023-10-10 07:29:41,060][53252] Updated weights for policy 0, policy_version 67950 (0.0010) [2023-10-10 07:29:41,433][53252] Updated weights for policy 0, policy_version 67960 (0.0008) [2023-10-10 07:29:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139100160. Throughput: 0: 1636.2, 1: 1619.8. Samples: 34785284. Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:41,784][52050] Avg episode reward: [(0, '20.110'), (1, '22.390')] [2023-10-10 07:29:41,831][53268] Updated weights for policy 1, policy_version 67880 (0.0010) [2023-10-10 07:29:42,193][53268] Updated weights for policy 1, policy_version 67890 (0.0011) [2023-10-10 07:29:42,563][53268] Updated weights for policy 1, policy_version 67900 (0.0010) [2023-10-10 07:29:46,061][53252] Updated weights for policy 0, policy_version 67970 (0.0009) [2023-10-10 07:29:46,433][53252] Updated weights for policy 0, policy_version 67980 (0.0009) [2023-10-10 07:29:46,783][52050] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 139132928. Throughput: 0: 1627.8, 1: 1613.8. Samples: 34794274. 
Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:46,784][52050] Avg episode reward: [(0, '20.370'), (1, '23.110')] [2023-10-10 07:29:46,803][53252] Updated weights for policy 0, policy_version 67990 (0.0011) [2023-10-10 07:29:47,154][53268] Updated weights for policy 1, policy_version 67910 (0.0010) [2023-10-10 07:29:47,174][53252] Updated weights for policy 0, policy_version 68000 (0.0010) [2023-10-10 07:29:47,518][53268] Updated weights for policy 1, policy_version 67920 (0.0010) [2023-10-10 07:29:47,892][53268] Updated weights for policy 1, policy_version 67930 (0.0010) [2023-10-10 07:29:51,584][53252] Updated weights for policy 0, policy_version 68010 (0.0007) [2023-10-10 07:29:51,783][52050] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 139198464. Throughput: 0: 1600.9, 1: 1587.7. Samples: 34812758. Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:51,784][52050] Avg episode reward: [(0, '20.220'), (1, '23.700')] [2023-10-10 07:29:51,957][53252] Updated weights for policy 0, policy_version 68020 (0.0007) [2023-10-10 07:29:52,257][53268] Updated weights for policy 1, policy_version 67940 (0.0008) [2023-10-10 07:29:52,325][53252] Updated weights for policy 0, policy_version 68030 (0.0008) [2023-10-10 07:29:52,627][53268] Updated weights for policy 1, policy_version 67950 (0.0008) [2023-10-10 07:29:52,986][53268] Updated weights for policy 1, policy_version 67960 (0.0009) [2023-10-10 07:29:56,768][53252] Updated weights for policy 0, policy_version 68040 (0.0008) [2023-10-10 07:29:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 139264000. Throughput: 0: 1614.2, 1: 1594.0. Samples: 34832268. 
Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:29:56,784][52050] Avg episode reward: [(0, '19.160'), (1, '21.340')] [2023-10-10 07:29:57,135][53252] Updated weights for policy 0, policy_version 68050 (0.0007) [2023-10-10 07:29:57,217][53268] Updated weights for policy 1, policy_version 67970 (0.0010) [2023-10-10 07:29:57,503][53252] Updated weights for policy 0, policy_version 68060 (0.0008) [2023-10-10 07:29:57,583][53268] Updated weights for policy 1, policy_version 67980 (0.0011) [2023-10-10 07:29:57,955][53268] Updated weights for policy 1, policy_version 67990 (0.0010) [2023-10-10 07:29:58,315][53268] Updated weights for policy 1, policy_version 68000 (0.0011) [2023-10-10 07:30:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 139329536. Throughput: 0: 1582.6, 1: 1579.7. Samples: 34840884. Policy #0 lag: (min: 8.0, avg: 19.1, max: 40.0) [2023-10-10 07:30:01,784][52050] Avg episode reward: [(0, '19.620'), (1, '20.810')] [2023-10-10 07:30:01,937][53252] Updated weights for policy 0, policy_version 68070 (0.0011) [2023-10-10 07:30:02,305][53252] Updated weights for policy 0, policy_version 68080 (0.0008) [2023-10-10 07:30:02,584][53268] Updated weights for policy 1, policy_version 68010 (0.0009) [2023-10-10 07:30:02,683][53252] Updated weights for policy 0, policy_version 68090 (0.0009) [2023-10-10 07:30:02,941][53268] Updated weights for policy 1, policy_version 68020 (0.0009) [2023-10-10 07:30:03,310][53268] Updated weights for policy 1, policy_version 68030 (0.0011) [2023-10-10 07:30:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 12014.9, 300 sec: 13329.4). Total num frames: 139395072. Throughput: 0: 1573.7, 1: 1568.1. Samples: 34860114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:06,784][52050] Avg episode reward: [(0, '21.070'), (1, '21.730')] [2023-10-10 07:30:06,952][53252] Updated weights for policy 0, policy_version 68100 (0.0009) [2023-10-10 07:30:07,335][53252] Updated weights for policy 0, policy_version 68110 (0.0008) [2023-10-10 07:30:07,493][53268] Updated weights for policy 1, policy_version 68040 (0.0008) [2023-10-10 07:30:07,702][53252] Updated weights for policy 0, policy_version 68120 (0.0009) [2023-10-10 07:30:07,859][53268] Updated weights for policy 1, policy_version 68050 (0.0009) [2023-10-10 07:30:08,218][53268] Updated weights for policy 1, policy_version 68060 (0.0009) [2023-10-10 07:30:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 12014.9, 300 sec: 13329.3). Total num frames: 139460608. Throughput: 0: 1592.8, 1: 1586.3. Samples: 34880866. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:11,784][52050] Avg episode reward: [(0, '20.620'), (1, '19.970')] [2023-10-10 07:30:11,803][53252] Updated weights for policy 0, policy_version 68130 (0.0009) [2023-10-10 07:30:12,151][53268] Updated weights for policy 1, policy_version 68070 (0.0008) [2023-10-10 07:30:12,173][53252] Updated weights for policy 0, policy_version 68140 (0.0009) [2023-10-10 07:30:12,515][53268] Updated weights for policy 1, policy_version 68080 (0.0008) [2023-10-10 07:30:12,537][53252] Updated weights for policy 0, policy_version 68150 (0.0007) [2023-10-10 07:30:12,874][53268] Updated weights for policy 1, policy_version 68090 (0.0007) [2023-10-10 07:30:12,907][53252] Updated weights for policy 0, policy_version 68160 (0.0007) [2023-10-10 07:30:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 12561.0, 300 sec: 13218.3). Total num frames: 139526144. Throughput: 0: 1576.0, 1: 1588.9. Samples: 34889952. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:16,784][52050] Avg episode reward: [(0, '22.730'), (1, '20.700')] [2023-10-10 07:30:16,910][53268] Updated weights for policy 1, policy_version 68100 (0.0009) [2023-10-10 07:30:17,052][53252] Updated weights for policy 0, policy_version 68170 (0.0009) [2023-10-10 07:30:17,268][53268] Updated weights for policy 1, policy_version 68110 (0.0009) [2023-10-10 07:30:17,413][53252] Updated weights for policy 0, policy_version 68180 (0.0008) [2023-10-10 07:30:17,642][53268] Updated weights for policy 1, policy_version 68120 (0.0008) [2023-10-10 07:30:17,784][53252] Updated weights for policy 0, policy_version 68190 (0.0009) [2023-10-10 07:30:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 12561.1, 300 sec: 13107.2). Total num frames: 139591680. Throughput: 0: 1586.6, 1: 1600.2. Samples: 34910634. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:21,784][52050] Avg episode reward: [(0, '21.800'), (1, '21.990')] [2023-10-10 07:30:21,854][53268] Updated weights for policy 1, policy_version 68130 (0.0008) [2023-10-10 07:30:21,988][53252] Updated weights for policy 0, policy_version 68200 (0.0009) [2023-10-10 07:30:22,266][53268] Updated weights for policy 1, policy_version 68140 (0.0007) [2023-10-10 07:30:22,346][53252] Updated weights for policy 0, policy_version 68210 (0.0008) [2023-10-10 07:30:22,629][53268] Updated weights for policy 1, policy_version 68150 (0.0009) [2023-10-10 07:30:22,722][53252] Updated weights for policy 0, policy_version 68220 (0.0007) [2023-10-10 07:30:22,994][53268] Updated weights for policy 1, policy_version 68160 (0.0007) [2023-10-10 07:30:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 12561.1, 300 sec: 13218.3). Total num frames: 139657216. Throughput: 0: 1619.9, 1: 1625.4. Samples: 34931320. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:26,784][52050] Avg episode reward: [(0, '22.120'), (1, '21.110')] [2023-10-10 07:30:26,964][53252] Updated weights for policy 0, policy_version 68230 (0.0008) [2023-10-10 07:30:27,006][53268] Updated weights for policy 1, policy_version 68170 (0.0007) [2023-10-10 07:30:27,324][53252] Updated weights for policy 0, policy_version 68240 (0.0009) [2023-10-10 07:30:27,367][53268] Updated weights for policy 1, policy_version 68180 (0.0007) [2023-10-10 07:30:27,690][53252] Updated weights for policy 0, policy_version 68250 (0.0009) [2023-10-10 07:30:27,742][53268] Updated weights for policy 1, policy_version 68190 (0.0008) [2023-10-10 07:30:31,673][53252] Updated weights for policy 0, policy_version 68260 (0.0008) [2023-10-10 07:30:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 12561.1, 300 sec: 13218.3). Total num frames: 139722752. Throughput: 0: 1616.2, 1: 1632.7. Samples: 34940474. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:31,784][52050] Avg episode reward: [(0, '22.370'), (1, '20.120')] [2023-10-10 07:30:32,007][53268] Updated weights for policy 1, policy_version 68200 (0.0008) [2023-10-10 07:30:32,039][53252] Updated weights for policy 0, policy_version 68270 (0.0008) [2023-10-10 07:30:32,371][53268] Updated weights for policy 1, policy_version 68210 (0.0008) [2023-10-10 07:30:32,404][53252] Updated weights for policy 0, policy_version 68280 (0.0009) [2023-10-10 07:30:32,730][53268] Updated weights for policy 1, policy_version 68220 (0.0010) [2023-10-10 07:30:36,498][53252] Updated weights for policy 0, policy_version 68290 (0.0011) [2023-10-10 07:30:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 12561.1, 300 sec: 13218.3). Total num frames: 139788288. Throughput: 0: 1645.8, 1: 1650.8. Samples: 34961104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:36,784][52050] Avg episode reward: [(0, '21.670'), (1, '21.390')] [2023-10-10 07:30:36,862][53252] Updated weights for policy 0, policy_version 68300 (0.0007) [2023-10-10 07:30:37,053][53268] Updated weights for policy 1, policy_version 68230 (0.0009) [2023-10-10 07:30:37,228][53252] Updated weights for policy 0, policy_version 68310 (0.0009) [2023-10-10 07:30:37,418][53268] Updated weights for policy 1, policy_version 68240 (0.0008) [2023-10-10 07:30:37,601][53252] Updated weights for policy 0, policy_version 68320 (0.0010) [2023-10-10 07:30:37,787][53268] Updated weights for policy 1, policy_version 68250 (0.0008) [2023-10-10 07:30:41,713][53252] Updated weights for policy 0, policy_version 68330 (0.0009) [2023-10-10 07:30:41,755][53268] Updated weights for policy 1, policy_version 68260 (0.0008) [2023-10-10 07:30:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 12561.1, 300 sec: 13107.2). Total num frames: 139853824. Throughput: 0: 1654.2, 1: 1662.2. Samples: 34981504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:41,784][52050] Avg episode reward: [(0, '22.130'), (1, '21.060')] [2023-10-10 07:30:42,082][53252] Updated weights for policy 0, policy_version 68340 (0.0008) [2023-10-10 07:30:42,119][53268] Updated weights for policy 1, policy_version 68270 (0.0008) [2023-10-10 07:30:42,447][53252] Updated weights for policy 0, policy_version 68350 (0.0009) [2023-10-10 07:30:42,490][53268] Updated weights for policy 1, policy_version 68280 (0.0008) [2023-10-10 07:30:42,520][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000068352_69992448.pth... [2023-10-10 07:30:42,554][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000066784_68386816.pth [2023-10-10 07:30:42,778][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000068288_69926912.pth... 
[2023-10-10 07:30:42,816][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000066720_68321280.pth [2023-10-10 07:30:46,526][53252] Updated weights for policy 0, policy_version 68360 (0.0009) [2023-10-10 07:30:46,576][53268] Updated weights for policy 1, policy_version 68290 (0.0007) [2023-10-10 07:30:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 139919360. Throughput: 0: 1662.4, 1: 1664.8. Samples: 34990610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:30:46,784][52050] Avg episode reward: [(0, '22.230'), (1, '20.770')] [2023-10-10 07:30:46,896][53252] Updated weights for policy 0, policy_version 68370 (0.0007) [2023-10-10 07:30:46,948][53268] Updated weights for policy 1, policy_version 68300 (0.0007) [2023-10-10 07:30:47,264][53252] Updated weights for policy 0, policy_version 68380 (0.0009) [2023-10-10 07:30:47,311][53268] Updated weights for policy 1, policy_version 68310 (0.0008) [2023-10-10 07:30:47,673][53268] Updated weights for policy 1, policy_version 68320 (0.0009) [2023-10-10 07:30:51,332][53252] Updated weights for policy 0, policy_version 68390 (0.0009) [2023-10-10 07:30:51,711][53252] Updated weights for policy 0, policy_version 68400 (0.0007) [2023-10-10 07:30:51,755][53268] Updated weights for policy 1, policy_version 68330 (0.0007) [2023-10-10 07:30:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 139984896. Throughput: 0: 1674.0, 1: 1680.3. Samples: 35011056. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:30:51,784][52050] Avg episode reward: [(0, '23.390'), (1, '21.610')] [2023-10-10 07:30:52,088][53252] Updated weights for policy 0, policy_version 68410 (0.0008) [2023-10-10 07:30:52,122][53268] Updated weights for policy 1, policy_version 68340 (0.0007) [2023-10-10 07:30:52,487][53268] Updated weights for policy 1, policy_version 68350 (0.0010) [2023-10-10 07:30:55,947][53252] Updated weights for policy 0, policy_version 68420 (0.0009) [2023-10-10 07:30:56,309][53252] Updated weights for policy 0, policy_version 68430 (0.0007) [2023-10-10 07:30:56,667][53268] Updated weights for policy 1, policy_version 68360 (0.0008) [2023-10-10 07:30:56,682][53252] Updated weights for policy 0, policy_version 68440 (0.0007) [2023-10-10 07:30:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 140050432. Throughput: 0: 1663.8, 1: 1671.5. Samples: 35030952. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:30:56,786][52050] Avg episode reward: [(0, '23.830'), (1, '20.540')] [2023-10-10 07:30:57,025][53268] Updated weights for policy 1, policy_version 68370 (0.0008) [2023-10-10 07:30:57,385][53268] Updated weights for policy 1, policy_version 68380 (0.0007) [2023-10-10 07:31:00,779][53252] Updated weights for policy 0, policy_version 68450 (0.0007) [2023-10-10 07:31:01,151][53252] Updated weights for policy 0, policy_version 68460 (0.0008) [2023-10-10 07:31:01,349][53268] Updated weights for policy 1, policy_version 68390 (0.0009) [2023-10-10 07:31:01,523][53252] Updated weights for policy 0, policy_version 68470 (0.0008) [2023-10-10 07:31:01,723][53268] Updated weights for policy 1, policy_version 68400 (0.0008) [2023-10-10 07:31:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 140115968. Throughput: 0: 1678.8, 1: 1671.5. Samples: 35040714. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:31:01,784][52050] Avg episode reward: [(0, '22.790'), (1, '19.520')] [2023-10-10 07:31:01,900][53252] Updated weights for policy 0, policy_version 68480 (0.0009) [2023-10-10 07:31:02,083][53268] Updated weights for policy 1, policy_version 68410 (0.0008) [2023-10-10 07:31:05,988][53252] Updated weights for policy 0, policy_version 68490 (0.0009) [2023-10-10 07:31:06,156][53268] Updated weights for policy 1, policy_version 68420 (0.0008) [2023-10-10 07:31:06,348][53252] Updated weights for policy 0, policy_version 68500 (0.0009) [2023-10-10 07:31:06,533][53268] Updated weights for policy 1, policy_version 68430 (0.0008) [2023-10-10 07:31:06,728][53252] Updated weights for policy 0, policy_version 68510 (0.0008) [2023-10-10 07:31:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 140181504. Throughput: 0: 1682.5, 1: 1670.2. Samples: 35061504. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:31:06,784][52050] Avg episode reward: [(0, '22.910'), (1, '20.400')] [2023-10-10 07:31:06,902][53268] Updated weights for policy 1, policy_version 68440 (0.0008) [2023-10-10 07:31:10,803][53252] Updated weights for policy 0, policy_version 68520 (0.0009) [2023-10-10 07:31:10,858][53268] Updated weights for policy 1, policy_version 68450 (0.0010) [2023-10-10 07:31:11,170][53252] Updated weights for policy 0, policy_version 68530 (0.0008) [2023-10-10 07:31:11,227][53268] Updated weights for policy 1, policy_version 68460 (0.0010) [2023-10-10 07:31:11,541][53252] Updated weights for policy 0, policy_version 68540 (0.0008) [2023-10-10 07:31:11,595][53268] Updated weights for policy 1, policy_version 68470 (0.0009) [2023-10-10 07:31:11,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13218.3). Total num frames: 140279808. Throughput: 0: 1661.9, 1: 1665.4. Samples: 35081046. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:31:11,784][52050] Avg episode reward: [(0, '21.420'), (1, '21.170')] [2023-10-10 07:31:11,964][53268] Updated weights for policy 1, policy_version 68480 (0.0007) [2023-10-10 07:31:15,683][53252] Updated weights for policy 0, policy_version 68550 (0.0008) [2023-10-10 07:31:16,052][53252] Updated weights for policy 0, policy_version 68560 (0.0010) [2023-10-10 07:31:16,092][53268] Updated weights for policy 1, policy_version 68490 (0.0008) [2023-10-10 07:31:16,424][53252] Updated weights for policy 0, policy_version 68570 (0.0008) [2023-10-10 07:31:16,456][53268] Updated weights for policy 1, policy_version 68500 (0.0008) [2023-10-10 07:31:16,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 140345344. Throughput: 0: 1681.1, 1: 1674.7. Samples: 35091484. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:31:16,785][52050] Avg episode reward: [(0, '21.160'), (1, '21.040')] [2023-10-10 07:31:16,827][53268] Updated weights for policy 1, policy_version 68510 (0.0010) [2023-10-10 07:31:20,302][53252] Updated weights for policy 0, policy_version 68580 (0.0010) [2023-10-10 07:31:20,682][53252] Updated weights for policy 0, policy_version 68590 (0.0008) [2023-10-10 07:31:20,894][53268] Updated weights for policy 1, policy_version 68520 (0.0009) [2023-10-10 07:31:21,044][53252] Updated weights for policy 0, policy_version 68600 (0.0008) [2023-10-10 07:31:21,260][53268] Updated weights for policy 1, policy_version 68530 (0.0007) [2023-10-10 07:31:21,621][53268] Updated weights for policy 1, policy_version 68540 (0.0007) [2023-10-10 07:31:21,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 140443648. Throughput: 0: 1676.6, 1: 1680.6. Samples: 35112178. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:31:21,784][52050] Avg episode reward: [(0, '21.900'), (1, '21.730')] [2023-10-10 07:31:24,953][53252] Updated weights for policy 0, policy_version 68610 (0.0008) [2023-10-10 07:31:25,323][53252] Updated weights for policy 0, policy_version 68620 (0.0009) [2023-10-10 07:31:25,700][53252] Updated weights for policy 0, policy_version 68630 (0.0007) [2023-10-10 07:31:25,703][53268] Updated weights for policy 1, policy_version 68550 (0.0008) [2023-10-10 07:31:26,068][53268] Updated weights for policy 1, policy_version 68560 (0.0010) [2023-10-10 07:31:26,075][53252] Updated weights for policy 0, policy_version 68640 (0.0009) [2023-10-10 07:31:26,427][53268] Updated weights for policy 1, policy_version 68570 (0.0009) [2023-10-10 07:31:26,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13329.3). Total num frames: 140509184. Throughput: 0: 1664.8, 1: 1667.5. Samples: 35131458. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-10 07:31:26,785][52050] Avg episode reward: [(0, '21.330'), (1, '22.630')] [2023-10-10 07:31:30,072][53252] Updated weights for policy 0, policy_version 68650 (0.0008) [2023-10-10 07:31:30,437][53252] Updated weights for policy 0, policy_version 68660 (0.0011) [2023-10-10 07:31:30,686][53268] Updated weights for policy 1, policy_version 68580 (0.0008) [2023-10-10 07:31:30,818][53252] Updated weights for policy 0, policy_version 68670 (0.0009) [2023-10-10 07:31:31,047][53268] Updated weights for policy 1, policy_version 68590 (0.0009) [2023-10-10 07:31:31,419][53268] Updated weights for policy 1, policy_version 68600 (0.0008) [2023-10-10 07:31:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 140574720. Throughput: 0: 1690.2, 1: 1679.5. Samples: 35142244. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:31:31,784][52050] Avg episode reward: [(0, '22.070'), (1, '21.670')] [2023-10-10 07:31:34,634][53252] Updated weights for policy 0, policy_version 68680 (0.0007) [2023-10-10 07:31:35,001][53252] Updated weights for policy 0, policy_version 68690 (0.0010) [2023-10-10 07:31:35,372][53252] Updated weights for policy 0, policy_version 68700 (0.0009) [2023-10-10 07:31:35,451][53268] Updated weights for policy 1, policy_version 68610 (0.0008) [2023-10-10 07:31:35,817][53268] Updated weights for policy 1, policy_version 68620 (0.0009) [2023-10-10 07:31:36,185][53268] Updated weights for policy 1, policy_version 68630 (0.0010) [2023-10-10 07:31:36,546][53268] Updated weights for policy 1, policy_version 68640 (0.0011) [2023-10-10 07:31:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 140640256. Throughput: 0: 1676.1, 1: 1679.9. Samples: 35162078. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:31:36,784][52050] Avg episode reward: [(0, '22.900'), (1, '22.140')] [2023-10-10 07:31:39,772][53252] Updated weights for policy 0, policy_version 68710 (0.0008) [2023-10-10 07:31:40,142][53252] Updated weights for policy 0, policy_version 68720 (0.0008) [2023-10-10 07:31:40,508][53252] Updated weights for policy 0, policy_version 68730 (0.0010) [2023-10-10 07:31:40,854][53268] Updated weights for policy 1, policy_version 68650 (0.0008) [2023-10-10 07:31:41,220][53268] Updated weights for policy 1, policy_version 68660 (0.0009) [2023-10-10 07:31:41,580][53268] Updated weights for policy 1, policy_version 68670 (0.0011) [2023-10-10 07:31:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 140705792. Throughput: 0: 1681.2, 1: 1665.8. Samples: 35181566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:31:41,784][52050] Avg episode reward: [(0, '22.870'), (1, '23.480')] [2023-10-10 07:31:44,658][53252] Updated weights for policy 0, policy_version 68740 (0.0007) [2023-10-10 07:31:45,025][53252] Updated weights for policy 0, policy_version 68750 (0.0008) [2023-10-10 07:31:45,400][53252] Updated weights for policy 0, policy_version 68760 (0.0007) [2023-10-10 07:31:45,491][53268] Updated weights for policy 1, policy_version 68680 (0.0009) [2023-10-10 07:31:45,858][53268] Updated weights for policy 1, policy_version 68690 (0.0008) [2023-10-10 07:31:46,231][53268] Updated weights for policy 1, policy_version 68700 (0.0008) [2023-10-10 07:31:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 140771328. Throughput: 0: 1692.6, 1: 1684.1. Samples: 35192664. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:31:46,784][52050] Avg episode reward: [(0, '21.940'), (1, '22.570')] [2023-10-10 07:31:49,499][53252] Updated weights for policy 0, policy_version 68770 (0.0007) [2023-10-10 07:31:49,862][53252] Updated weights for policy 0, policy_version 68780 (0.0010) [2023-10-10 07:31:50,240][53252] Updated weights for policy 0, policy_version 68790 (0.0008) [2023-10-10 07:31:50,335][53268] Updated weights for policy 1, policy_version 68710 (0.0009) [2023-10-10 07:31:50,595][53252] Updated weights for policy 0, policy_version 68800 (0.0008) [2023-10-10 07:31:50,697][53268] Updated weights for policy 1, policy_version 68720 (0.0009) [2023-10-10 07:31:51,060][53268] Updated weights for policy 1, policy_version 68730 (0.0011) [2023-10-10 07:31:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 140836864. Throughput: 0: 1671.1, 1: 1682.8. Samples: 35212430. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:31:51,784][52050] Avg episode reward: [(0, '22.370'), (1, '21.030')] [2023-10-10 07:31:54,407][53252] Updated weights for policy 0, policy_version 68810 (0.0008) [2023-10-10 07:31:54,782][53252] Updated weights for policy 0, policy_version 68820 (0.0007) [2023-10-10 07:31:55,147][53252] Updated weights for policy 0, policy_version 68830 (0.0008) [2023-10-10 07:31:55,254][53268] Updated weights for policy 1, policy_version 68740 (0.0010) [2023-10-10 07:31:55,658][53268] Updated weights for policy 1, policy_version 68750 (0.0011) [2023-10-10 07:31:56,025][53268] Updated weights for policy 1, policy_version 68760 (0.0010) [2023-10-10 07:31:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 140902400. Throughput: 0: 1690.4, 1: 1660.6. Samples: 35231842. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:31:56,784][52050] Avg episode reward: [(0, '22.100'), (1, '21.640')] [2023-10-10 07:31:59,319][53252] Updated weights for policy 0, policy_version 68840 (0.0009) [2023-10-10 07:31:59,693][53252] Updated weights for policy 0, policy_version 68850 (0.0010) [2023-10-10 07:32:00,024][53268] Updated weights for policy 1, policy_version 68770 (0.0009) [2023-10-10 07:32:00,061][53252] Updated weights for policy 0, policy_version 68860 (0.0008) [2023-10-10 07:32:00,387][53268] Updated weights for policy 1, policy_version 68780 (0.0009) [2023-10-10 07:32:00,746][53268] Updated weights for policy 1, policy_version 68790 (0.0010) [2023-10-10 07:32:01,110][53268] Updated weights for policy 1, policy_version 68800 (0.0008) [2023-10-10 07:32:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13329.4). Total num frames: 140967936. Throughput: 0: 1690.2, 1: 1675.6. Samples: 35242944. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:01,785][52050] Avg episode reward: [(0, '21.230'), (1, '21.700')] [2023-10-10 07:32:03,948][53252] Updated weights for policy 0, policy_version 68870 (0.0009) [2023-10-10 07:32:04,312][53252] Updated weights for policy 0, policy_version 68880 (0.0009) [2023-10-10 07:32:04,689][53252] Updated weights for policy 0, policy_version 68890 (0.0008) [2023-10-10 07:32:04,982][53268] Updated weights for policy 1, policy_version 68810 (0.0009) [2023-10-10 07:32:05,350][53268] Updated weights for policy 1, policy_version 68820 (0.0010) [2023-10-10 07:32:05,715][53268] Updated weights for policy 1, policy_version 68830 (0.0010) [2023-10-10 07:32:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13329.4). Total num frames: 141033472. Throughput: 0: 1676.0, 1: 1670.1. Samples: 35262750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:06,784][52050] Avg episode reward: [(0, '21.920'), (1, '19.550')] [2023-10-10 07:32:08,750][53252] Updated weights for policy 0, policy_version 68900 (0.0009) [2023-10-10 07:32:09,115][53252] Updated weights for policy 0, policy_version 68910 (0.0007) [2023-10-10 07:32:09,483][53252] Updated weights for policy 0, policy_version 68920 (0.0007) [2023-10-10 07:32:09,695][53268] Updated weights for policy 1, policy_version 68840 (0.0009) [2023-10-10 07:32:10,059][53268] Updated weights for policy 1, policy_version 68850 (0.0010) [2023-10-10 07:32:10,428][53268] Updated weights for policy 1, policy_version 68860 (0.0008) [2023-10-10 07:32:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 141099008. Throughput: 0: 1695.3, 1: 1669.1. Samples: 35282858. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:11,785][52050] Avg episode reward: [(0, '20.470'), (1, '21.230')] [2023-10-10 07:32:13,707][53252] Updated weights for policy 0, policy_version 68930 (0.0008) [2023-10-10 07:32:14,074][53252] Updated weights for policy 0, policy_version 68940 (0.0007) [2023-10-10 07:32:14,453][53252] Updated weights for policy 0, policy_version 68950 (0.0008) [2023-10-10 07:32:14,468][53268] Updated weights for policy 1, policy_version 68870 (0.0008) [2023-10-10 07:32:14,817][53252] Updated weights for policy 0, policy_version 68960 (0.0008) [2023-10-10 07:32:14,846][53268] Updated weights for policy 1, policy_version 68880 (0.0008) [2023-10-10 07:32:15,209][53268] Updated weights for policy 1, policy_version 68890 (0.0007) [2023-10-10 07:32:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 141164544. Throughput: 0: 1677.8, 1: 1687.9. Samples: 35293702. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:16,784][52050] Avg episode reward: [(0, '21.850'), (1, '20.840')] [2023-10-10 07:32:18,689][53252] Updated weights for policy 0, policy_version 68970 (0.0010) [2023-10-10 07:32:19,067][53252] Updated weights for policy 0, policy_version 68980 (0.0009) [2023-10-10 07:32:19,247][53268] Updated weights for policy 1, policy_version 68900 (0.0007) [2023-10-10 07:32:19,424][53252] Updated weights for policy 0, policy_version 68990 (0.0009) [2023-10-10 07:32:19,617][53268] Updated weights for policy 1, policy_version 68910 (0.0010) [2023-10-10 07:32:19,990][53268] Updated weights for policy 1, policy_version 68920 (0.0009) [2023-10-10 07:32:21,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 141230080. Throughput: 0: 1688.2, 1: 1664.7. Samples: 35312958. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:21,785][52050] Avg episode reward: [(0, '23.210'), (1, '23.030')] [2023-10-10 07:32:23,452][53252] Updated weights for policy 0, policy_version 69000 (0.0007) [2023-10-10 07:32:23,823][53252] Updated weights for policy 0, policy_version 69010 (0.0007) [2023-10-10 07:32:24,139][53268] Updated weights for policy 1, policy_version 68930 (0.0009) [2023-10-10 07:32:24,199][53252] Updated weights for policy 0, policy_version 69020 (0.0008) [2023-10-10 07:32:24,499][53268] Updated weights for policy 1, policy_version 68940 (0.0008) [2023-10-10 07:32:24,867][53268] Updated weights for policy 1, policy_version 68950 (0.0009) [2023-10-10 07:32:25,238][53268] Updated weights for policy 1, policy_version 68960 (0.0010) [2023-10-10 07:32:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141295616. Throughput: 0: 1700.4, 1: 1681.3. Samples: 35333746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:26,784][52050] Avg episode reward: [(0, '22.160'), (1, '23.120')] [2023-10-10 07:32:28,280][53252] Updated weights for policy 0, policy_version 69030 (0.0008) [2023-10-10 07:32:28,645][53252] Updated weights for policy 0, policy_version 69040 (0.0007) [2023-10-10 07:32:29,020][53252] Updated weights for policy 0, policy_version 69050 (0.0010) [2023-10-10 07:32:29,457][53268] Updated weights for policy 1, policy_version 68970 (0.0007) [2023-10-10 07:32:29,818][53268] Updated weights for policy 1, policy_version 68980 (0.0011) [2023-10-10 07:32:30,185][53268] Updated weights for policy 1, policy_version 68990 (0.0008) [2023-10-10 07:32:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141361152. Throughput: 0: 1673.9, 1: 1691.1. Samples: 35344086. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:31,784][52050] Avg episode reward: [(0, '21.490'), (1, '22.070')] [2023-10-10 07:32:33,022][53252] Updated weights for policy 0, policy_version 69060 (0.0008) [2023-10-10 07:32:33,402][53252] Updated weights for policy 0, policy_version 69070 (0.0008) [2023-10-10 07:32:33,765][53252] Updated weights for policy 0, policy_version 69080 (0.0008) [2023-10-10 07:32:34,242][53268] Updated weights for policy 1, policy_version 69000 (0.0008) [2023-10-10 07:32:34,612][53268] Updated weights for policy 1, policy_version 69010 (0.0009) [2023-10-10 07:32:34,969][53268] Updated weights for policy 1, policy_version 69020 (0.0008) [2023-10-10 07:32:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141426688. Throughput: 0: 1695.8, 1: 1668.7. Samples: 35363830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:36,784][52050] Avg episode reward: [(0, '20.080'), (1, '21.430')] [2023-10-10 07:32:37,783][53252] Updated weights for policy 0, policy_version 69090 (0.0008) [2023-10-10 07:32:38,158][53252] Updated weights for policy 0, policy_version 69100 (0.0007) [2023-10-10 07:32:38,530][53252] Updated weights for policy 0, policy_version 69110 (0.0007) [2023-10-10 07:32:38,904][53252] Updated weights for policy 0, policy_version 69120 (0.0008) [2023-10-10 07:32:38,946][53268] Updated weights for policy 1, policy_version 69030 (0.0009) [2023-10-10 07:32:39,308][53268] Updated weights for policy 1, policy_version 69040 (0.0008) [2023-10-10 07:32:39,677][53268] Updated weights for policy 1, policy_version 69050 (0.0010) [2023-10-10 07:32:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 141492224. Throughput: 0: 1697.2, 1: 1696.6. Samples: 35384562. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:41,785][52050] Avg episode reward: [(0, '21.500'), (1, '21.910')] [2023-10-10 07:32:41,797][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000069120_70778880.pth... [2023-10-10 07:32:41,797][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000069056_70713344.pth... [2023-10-10 07:32:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000067520_69140480.pth [2023-10-10 07:32:41,834][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000067584_69206016.pth [2023-10-10 07:32:42,909][53252] Updated weights for policy 0, policy_version 69130 (0.0009) [2023-10-10 07:32:43,281][53252] Updated weights for policy 0, policy_version 69140 (0.0008) [2023-10-10 07:32:43,647][53252] Updated weights for policy 0, policy_version 69150 (0.0008) [2023-10-10 07:32:43,795][53268] Updated weights for policy 1, policy_version 69060 (0.0010) [2023-10-10 07:32:44,207][53268] Updated weights for policy 1, policy_version 69070 (0.0009) [2023-10-10 07:32:44,563][53268] Updated weights for policy 1, policy_version 69080 (0.0007) [2023-10-10 07:32:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 141557760. Throughput: 0: 1679.0, 1: 1688.3. Samples: 35394474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:46,784][52050] Avg episode reward: [(0, '19.910'), (1, '20.880')] [2023-10-10 07:32:47,698][53252] Updated weights for policy 0, policy_version 69160 (0.0010) [2023-10-10 07:32:48,065][53252] Updated weights for policy 0, policy_version 69170 (0.0009) [2023-10-10 07:32:48,443][53252] Updated weights for policy 0, policy_version 69180 (0.0009) [2023-10-10 07:32:48,599][53268] Updated weights for policy 1, policy_version 69090 (0.0010) [2023-10-10 07:32:48,975][53268] Updated weights for policy 1, policy_version 69100 (0.0009) [2023-10-10 07:32:49,341][53268] Updated weights for policy 1, policy_version 69110 (0.0007) [2023-10-10 07:32:49,706][53268] Updated weights for policy 1, policy_version 69120 (0.0009) [2023-10-10 07:32:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141623296. Throughput: 0: 1695.5, 1: 1674.0. Samples: 35414378. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:51,784][52050] Avg episode reward: [(0, '20.060'), (1, '19.820')] [2023-10-10 07:32:52,434][53252] Updated weights for policy 0, policy_version 69190 (0.0008) [2023-10-10 07:32:52,808][53252] Updated weights for policy 0, policy_version 69200 (0.0008) [2023-10-10 07:32:53,172][53252] Updated weights for policy 0, policy_version 69210 (0.0008) [2023-10-10 07:32:53,800][53268] Updated weights for policy 1, policy_version 69130 (0.0008) [2023-10-10 07:32:54,165][53268] Updated weights for policy 1, policy_version 69140 (0.0011) [2023-10-10 07:32:54,536][53268] Updated weights for policy 1, policy_version 69150 (0.0011) [2023-10-10 07:32:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 141688832. Throughput: 0: 1696.1, 1: 1687.4. Samples: 35435116. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:32:56,784][52050] Avg episode reward: [(0, '21.000'), (1, '22.300')] [2023-10-10 07:32:57,185][53252] Updated weights for policy 0, policy_version 69220 (0.0007) [2023-10-10 07:32:57,545][53252] Updated weights for policy 0, policy_version 69230 (0.0007) [2023-10-10 07:32:57,917][53252] Updated weights for policy 0, policy_version 69240 (0.0007) [2023-10-10 07:32:58,471][53268] Updated weights for policy 1, policy_version 69160 (0.0009) [2023-10-10 07:32:58,832][53268] Updated weights for policy 1, policy_version 69170 (0.0008) [2023-10-10 07:32:59,200][53268] Updated weights for policy 1, policy_version 69180 (0.0008) [2023-10-10 07:33:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141754368. Throughput: 0: 1686.5, 1: 1668.4. Samples: 35444670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:01,784][52050] Avg episode reward: [(0, '20.350'), (1, '22.450')] [2023-10-10 07:33:02,071][53252] Updated weights for policy 0, policy_version 69250 (0.0008) [2023-10-10 07:33:02,446][53252] Updated weights for policy 0, policy_version 69260 (0.0009) [2023-10-10 07:33:02,821][53252] Updated weights for policy 0, policy_version 69270 (0.0009) [2023-10-10 07:33:03,193][53252] Updated weights for policy 0, policy_version 69280 (0.0008) [2023-10-10 07:33:03,200][53268] Updated weights for policy 1, policy_version 69190 (0.0009) [2023-10-10 07:33:03,560][53268] Updated weights for policy 1, policy_version 69200 (0.0009) [2023-10-10 07:33:03,933][53268] Updated weights for policy 1, policy_version 69210 (0.0008) [2023-10-10 07:33:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 141819904. Throughput: 0: 1696.7, 1: 1685.3. Samples: 35465148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:06,785][52050] Avg episode reward: [(0, '21.470'), (1, '23.040')] [2023-10-10 07:33:07,277][53252] Updated weights for policy 0, policy_version 69290 (0.0008) [2023-10-10 07:33:07,646][53252] Updated weights for policy 0, policy_version 69300 (0.0009) [2023-10-10 07:33:08,002][53252] Updated weights for policy 0, policy_version 69310 (0.0008) [2023-10-10 07:33:08,034][53268] Updated weights for policy 1, policy_version 69220 (0.0008) [2023-10-10 07:33:08,395][53268] Updated weights for policy 1, policy_version 69230 (0.0008) [2023-10-10 07:33:08,756][53268] Updated weights for policy 1, policy_version 69240 (0.0011) [2023-10-10 07:33:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141885440. Throughput: 0: 1691.6, 1: 1689.8. Samples: 35485908. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:11,784][52050] Avg episode reward: [(0, '21.780'), (1, '23.240')] [2023-10-10 07:33:12,090][53252] Updated weights for policy 0, policy_version 69320 (0.0008) [2023-10-10 07:33:12,463][53252] Updated weights for policy 0, policy_version 69330 (0.0008) [2023-10-10 07:33:12,839][53252] Updated weights for policy 0, policy_version 69340 (0.0007) [2023-10-10 07:33:13,029][53268] Updated weights for policy 1, policy_version 69250 (0.0010) [2023-10-10 07:33:13,395][53268] Updated weights for policy 1, policy_version 69260 (0.0007) [2023-10-10 07:33:13,756][53268] Updated weights for policy 1, policy_version 69270 (0.0009) [2023-10-10 07:33:14,121][53268] Updated weights for policy 1, policy_version 69280 (0.0009) [2023-10-10 07:33:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 141950976. Throughput: 0: 1694.7, 1: 1661.4. Samples: 35495108. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:16,784][52050] Avg episode reward: [(0, '23.190'), (1, '22.900')] [2023-10-10 07:33:16,948][53252] Updated weights for policy 0, policy_version 69350 (0.0008) [2023-10-10 07:33:17,315][53252] Updated weights for policy 0, policy_version 69360 (0.0007) [2023-10-10 07:33:17,686][53252] Updated weights for policy 0, policy_version 69370 (0.0007) [2023-10-10 07:33:18,095][53268] Updated weights for policy 1, policy_version 69290 (0.0007) [2023-10-10 07:33:18,459][53268] Updated weights for policy 1, policy_version 69300 (0.0009) [2023-10-10 07:33:18,826][53268] Updated weights for policy 1, policy_version 69310 (0.0010) [2023-10-10 07:33:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 142016512. Throughput: 0: 1688.6, 1: 1684.4. Samples: 35515616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:21,784][52050] Avg episode reward: [(0, '21.500'), (1, '21.590')] [2023-10-10 07:33:21,926][53252] Updated weights for policy 0, policy_version 69380 (0.0008) [2023-10-10 07:33:22,303][53252] Updated weights for policy 0, policy_version 69390 (0.0010) [2023-10-10 07:33:22,676][53252] Updated weights for policy 0, policy_version 69400 (0.0007) [2023-10-10 07:33:22,816][53268] Updated weights for policy 1, policy_version 69320 (0.0007) [2023-10-10 07:33:23,184][53268] Updated weights for policy 1, policy_version 69330 (0.0010) [2023-10-10 07:33:23,556][53268] Updated weights for policy 1, policy_version 69340 (0.0011) [2023-10-10 07:33:26,703][53252] Updated weights for policy 0, policy_version 69410 (0.0007) [2023-10-10 07:33:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 142082048. Throughput: 0: 1686.7, 1: 1681.5. Samples: 35536130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:26,784][52050] Avg episode reward: [(0, '20.150'), (1, '22.250')] [2023-10-10 07:33:27,068][53252] Updated weights for policy 0, policy_version 69420 (0.0009) [2023-10-10 07:33:27,442][53252] Updated weights for policy 0, policy_version 69430 (0.0007) [2023-10-10 07:33:27,605][53268] Updated weights for policy 1, policy_version 69350 (0.0010) [2023-10-10 07:33:27,806][53252] Updated weights for policy 0, policy_version 69440 (0.0008) [2023-10-10 07:33:27,973][53268] Updated weights for policy 1, policy_version 69360 (0.0010) [2023-10-10 07:33:28,341][53268] Updated weights for policy 1, policy_version 69370 (0.0008) [2023-10-10 07:33:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 142147584. Throughput: 0: 1684.7, 1: 1665.3. Samples: 35545224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:31,784][52050] Avg episode reward: [(0, '21.190'), (1, '19.960')] [2023-10-10 07:33:31,939][53252] Updated weights for policy 0, policy_version 69450 (0.0007) [2023-10-10 07:33:32,310][53252] Updated weights for policy 0, policy_version 69460 (0.0007) [2023-10-10 07:33:32,489][53268] Updated weights for policy 1, policy_version 69380 (0.0009) [2023-10-10 07:33:32,690][53252] Updated weights for policy 0, policy_version 69470 (0.0007) [2023-10-10 07:33:32,854][53268] Updated weights for policy 1, policy_version 69390 (0.0008) [2023-10-10 07:33:33,225][53268] Updated weights for policy 1, policy_version 69400 (0.0009) [2023-10-10 07:33:36,782][53252] Updated weights for policy 0, policy_version 69480 (0.0008) [2023-10-10 07:33:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 142213120. Throughput: 0: 1683.3, 1: 1683.3. Samples: 35565874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:36,784][52050] Avg episode reward: [(0, '20.280'), (1, '19.880')] [2023-10-10 07:33:37,154][53252] Updated weights for policy 0, policy_version 69490 (0.0008) [2023-10-10 07:33:37,426][53268] Updated weights for policy 1, policy_version 69410 (0.0009) [2023-10-10 07:33:37,523][53252] Updated weights for policy 0, policy_version 69500 (0.0009) [2023-10-10 07:33:37,788][53268] Updated weights for policy 1, policy_version 69420 (0.0009) [2023-10-10 07:33:38,151][53268] Updated weights for policy 1, policy_version 69430 (0.0009) [2023-10-10 07:33:38,515][53268] Updated weights for policy 1, policy_version 69440 (0.0008) [2023-10-10 07:33:41,644][53252] Updated weights for policy 0, policy_version 69510 (0.0009) [2023-10-10 07:33:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 142278656. Throughput: 0: 1674.6, 1: 1683.9. Samples: 35586246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:33:41,784][52050] Avg episode reward: [(0, '19.760'), (1, '20.970')] [2023-10-10 07:33:42,007][53252] Updated weights for policy 0, policy_version 69520 (0.0008) [2023-10-10 07:33:42,382][53252] Updated weights for policy 0, policy_version 69530 (0.0007) [2023-10-10 07:33:42,657][53268] Updated weights for policy 1, policy_version 69450 (0.0007) [2023-10-10 07:33:43,025][53268] Updated weights for policy 1, policy_version 69460 (0.0009) [2023-10-10 07:33:43,399][53268] Updated weights for policy 1, policy_version 69470 (0.0009) [2023-10-10 07:33:46,568][53252] Updated weights for policy 0, policy_version 69540 (0.0010) [2023-10-10 07:33:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 142344192. Throughput: 0: 1676.8, 1: 1672.6. Samples: 35595392. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:33:46,784][52050] Avg episode reward: [(0, '20.610'), (1, '21.260')] [2023-10-10 07:33:46,936][53252] Updated weights for policy 0, policy_version 69550 (0.0009) [2023-10-10 07:33:47,308][53252] Updated weights for policy 0, policy_version 69560 (0.0007) [2023-10-10 07:33:47,365][53268] Updated weights for policy 1, policy_version 69480 (0.0008) [2023-10-10 07:33:47,732][53268] Updated weights for policy 1, policy_version 69490 (0.0009) [2023-10-10 07:33:48,103][53268] Updated weights for policy 1, policy_version 69500 (0.0008) [2023-10-10 07:33:51,087][53252] Updated weights for policy 0, policy_version 69570 (0.0009) [2023-10-10 07:33:51,462][53252] Updated weights for policy 0, policy_version 69580 (0.0009) [2023-10-10 07:33:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 142409728. Throughput: 0: 1674.6, 1: 1684.6. Samples: 35616312. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:33:51,784][52050] Avg episode reward: [(0, '20.390'), (1, '19.890')] [2023-10-10 07:33:51,828][53252] Updated weights for policy 0, policy_version 69590 (0.0009) [2023-10-10 07:33:52,205][53252] Updated weights for policy 0, policy_version 69600 (0.0008) [2023-10-10 07:33:52,262][53268] Updated weights for policy 1, policy_version 69510 (0.0009) [2023-10-10 07:33:52,618][53268] Updated weights for policy 1, policy_version 69520 (0.0011) [2023-10-10 07:33:52,991][53268] Updated weights for policy 1, policy_version 69530 (0.0012) [2023-10-10 07:33:56,279][53252] Updated weights for policy 0, policy_version 69610 (0.0008) [2023-10-10 07:33:56,657][53252] Updated weights for policy 0, policy_version 69620 (0.0007) [2023-10-10 07:33:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 142475264. Throughput: 0: 1673.8, 1: 1682.7. Samples: 35636952. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:33:56,784][52050] Avg episode reward: [(0, '20.310'), (1, '21.620')] [2023-10-10 07:33:56,982][53268] Updated weights for policy 1, policy_version 69540 (0.0009) [2023-10-10 07:33:57,028][53252] Updated weights for policy 0, policy_version 69630 (0.0007) [2023-10-10 07:33:57,350][53268] Updated weights for policy 1, policy_version 69550 (0.0009) [2023-10-10 07:33:57,717][53268] Updated weights for policy 1, policy_version 69560 (0.0010) [2023-10-10 07:34:01,031][53252] Updated weights for policy 0, policy_version 69640 (0.0008) [2023-10-10 07:34:01,401][53252] Updated weights for policy 0, policy_version 69650 (0.0007) [2023-10-10 07:34:01,623][53268] Updated weights for policy 1, policy_version 69570 (0.0009) [2023-10-10 07:34:01,767][53252] Updated weights for policy 0, policy_version 69660 (0.0007) [2023-10-10 07:34:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 142540800. Throughput: 0: 1680.1, 1: 1680.6. Samples: 35646342. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:34:01,784][52050] Avg episode reward: [(0, '20.040'), (1, '20.610')] [2023-10-10 07:34:01,992][53268] Updated weights for policy 1, policy_version 69580 (0.0007) [2023-10-10 07:34:02,346][53268] Updated weights for policy 1, policy_version 69590 (0.0008) [2023-10-10 07:34:02,720][53268] Updated weights for policy 1, policy_version 69600 (0.0010) [2023-10-10 07:34:05,898][53252] Updated weights for policy 0, policy_version 69670 (0.0008) [2023-10-10 07:34:06,278][53252] Updated weights for policy 0, policy_version 69680 (0.0009) [2023-10-10 07:34:06,652][53252] Updated weights for policy 0, policy_version 69690 (0.0007) [2023-10-10 07:34:06,677][53268] Updated weights for policy 1, policy_version 69610 (0.0007) [2023-10-10 07:34:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 142606336. Throughput: 0: 1687.2, 1: 1683.6. 
Samples: 35667300. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:34:06,784][52050] Avg episode reward: [(0, '21.280'), (1, '19.550')] [2023-10-10 07:34:07,037][53268] Updated weights for policy 1, policy_version 69620 (0.0008) [2023-10-10 07:34:07,407][53268] Updated weights for policy 1, policy_version 69630 (0.0008) [2023-10-10 07:34:10,684][53252] Updated weights for policy 0, policy_version 69700 (0.0009) [2023-10-10 07:34:11,054][53252] Updated weights for policy 0, policy_version 69710 (0.0008) [2023-10-10 07:34:11,432][53252] Updated weights for policy 0, policy_version 69720 (0.0007) [2023-10-10 07:34:11,467][53268] Updated weights for policy 1, policy_version 69640 (0.0008) [2023-10-10 07:34:11,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 142704640. Throughput: 0: 1669.2, 1: 1690.5. Samples: 35687320. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:34:11,784][52050] Avg episode reward: [(0, '19.730'), (1, '20.470')] [2023-10-10 07:34:11,834][53268] Updated weights for policy 1, policy_version 69650 (0.0008) [2023-10-10 07:34:12,196][53268] Updated weights for policy 1, policy_version 69660 (0.0009) [2023-10-10 07:34:15,310][53252] Updated weights for policy 0, policy_version 69730 (0.0008) [2023-10-10 07:34:15,677][53252] Updated weights for policy 0, policy_version 69740 (0.0009) [2023-10-10 07:34:16,042][53252] Updated weights for policy 0, policy_version 69750 (0.0007) [2023-10-10 07:34:16,283][53268] Updated weights for policy 1, policy_version 69670 (0.0010) [2023-10-10 07:34:16,409][53252] Updated weights for policy 0, policy_version 69760 (0.0008) [2023-10-10 07:34:16,650][53268] Updated weights for policy 1, policy_version 69680 (0.0009) [2023-10-10 07:34:16,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 142770176. Throughput: 0: 1693.1, 1: 1690.5. Samples: 35697486. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:34:16,784][52050] Avg episode reward: [(0, '21.390'), (1, '21.340')] [2023-10-10 07:34:17,022][53268] Updated weights for policy 1, policy_version 69690 (0.0009) [2023-10-10 07:34:20,446][53252] Updated weights for policy 0, policy_version 69770 (0.0007) [2023-10-10 07:34:20,823][53252] Updated weights for policy 0, policy_version 69780 (0.0009) [2023-10-10 07:34:21,087][53268] Updated weights for policy 1, policy_version 69700 (0.0007) [2023-10-10 07:34:21,193][53252] Updated weights for policy 0, policy_version 69790 (0.0007) [2023-10-10 07:34:21,487][53268] Updated weights for policy 1, policy_version 69710 (0.0009) [2023-10-10 07:34:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 142835712. Throughput: 0: 1686.6, 1: 1694.0. Samples: 35718000. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:34:21,784][52050] Avg episode reward: [(0, '20.810'), (1, '22.720')] [2023-10-10 07:34:21,858][53268] Updated weights for policy 1, policy_version 69720 (0.0010) [2023-10-10 07:34:25,295][53252] Updated weights for policy 0, policy_version 69800 (0.0007) [2023-10-10 07:34:25,667][53252] Updated weights for policy 0, policy_version 69810 (0.0007) [2023-10-10 07:34:26,042][53252] Updated weights for policy 0, policy_version 69820 (0.0009) [2023-10-10 07:34:26,091][53268] Updated weights for policy 1, policy_version 69730 (0.0010) [2023-10-10 07:34:26,455][53268] Updated weights for policy 1, policy_version 69740 (0.0008) [2023-10-10 07:34:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 142901248. Throughput: 0: 1671.8, 1: 1689.7. Samples: 35737512. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-10 07:34:26,784][52050] Avg episode reward: [(0, '21.150'), (1, '23.010')] [2023-10-10 07:34:26,811][53268] Updated weights for policy 1, policy_version 69750 (0.0011) [2023-10-10 07:34:27,182][53268] Updated weights for policy 1, policy_version 69760 (0.0008) [2023-10-10 07:34:29,996][53252] Updated weights for policy 0, policy_version 69830 (0.0008) [2023-10-10 07:34:30,358][53252] Updated weights for policy 0, policy_version 69840 (0.0010) [2023-10-10 07:34:30,733][53252] Updated weights for policy 0, policy_version 69850 (0.0010) [2023-10-10 07:34:31,176][53268] Updated weights for policy 1, policy_version 69770 (0.0008) [2023-10-10 07:34:31,541][53268] Updated weights for policy 1, policy_version 69780 (0.0010) [2023-10-10 07:34:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 142966784. Throughput: 0: 1695.3, 1: 1696.2. Samples: 35748010. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:34:31,784][52050] Avg episode reward: [(0, '21.680'), (1, '21.860')] [2023-10-10 07:34:31,903][53268] Updated weights for policy 1, policy_version 69790 (0.0008) [2023-10-10 07:34:34,684][53252] Updated weights for policy 0, policy_version 69860 (0.0011) [2023-10-10 07:34:35,060][53252] Updated weights for policy 0, policy_version 69870 (0.0009) [2023-10-10 07:34:35,425][53252] Updated weights for policy 0, policy_version 69880 (0.0008) [2023-10-10 07:34:36,066][53268] Updated weights for policy 1, policy_version 69800 (0.0008) [2023-10-10 07:34:36,434][53268] Updated weights for policy 1, policy_version 69810 (0.0011) [2023-10-10 07:34:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 143032320. Throughput: 0: 1676.3, 1: 1690.6. Samples: 35767826. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:34:36,784][52050] Avg episode reward: [(0, '19.690'), (1, '20.020')] [2023-10-10 07:34:36,796][53268] Updated weights for policy 1, policy_version 69820 (0.0011) [2023-10-10 07:34:39,541][53252] Updated weights for policy 0, policy_version 69890 (0.0008) [2023-10-10 07:34:39,912][53252] Updated weights for policy 0, policy_version 69900 (0.0009) [2023-10-10 07:34:40,286][53252] Updated weights for policy 0, policy_version 69910 (0.0009) [2023-10-10 07:34:40,660][53252] Updated weights for policy 0, policy_version 69920 (0.0009) [2023-10-10 07:34:40,997][53268] Updated weights for policy 1, policy_version 69830 (0.0009) [2023-10-10 07:34:41,364][53268] Updated weights for policy 1, policy_version 69840 (0.0011) [2023-10-10 07:34:41,744][53268] Updated weights for policy 1, policy_version 69850 (0.0011) [2023-10-10 07:34:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 143097856. Throughput: 0: 1671.5, 1: 1681.8. Samples: 35787850. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:34:41,784][52050] Avg episode reward: [(0, '20.450'), (1, '19.750')] [2023-10-10 07:34:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000069920_71598080.pth... [2023-10-10 07:34:41,824][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000068352_69992448.pth [2023-10-10 07:34:41,953][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000069856_71532544.pth... 
[2023-10-10 07:34:41,981][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000068288_69926912.pth [2023-10-10 07:34:44,628][53252] Updated weights for policy 0, policy_version 69930 (0.0008) [2023-10-10 07:34:44,990][53252] Updated weights for policy 0, policy_version 69940 (0.0008) [2023-10-10 07:34:45,366][53252] Updated weights for policy 0, policy_version 69950 (0.0010) [2023-10-10 07:34:45,792][53268] Updated weights for policy 1, policy_version 69860 (0.0009) [2023-10-10 07:34:46,152][53268] Updated weights for policy 1, policy_version 69870 (0.0008) [2023-10-10 07:34:46,520][53268] Updated weights for policy 1, policy_version 69880 (0.0011) [2023-10-10 07:34:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 143163392. Throughput: 0: 1688.7, 1: 1694.2. Samples: 35798570. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:34:46,784][52050] Avg episode reward: [(0, '20.990'), (1, '18.850')] [2023-10-10 07:34:49,658][53252] Updated weights for policy 0, policy_version 69960 (0.0010) [2023-10-10 07:34:50,039][53252] Updated weights for policy 0, policy_version 69970 (0.0009) [2023-10-10 07:34:50,416][53252] Updated weights for policy 0, policy_version 69980 (0.0011) [2023-10-10 07:34:50,652][53268] Updated weights for policy 1, policy_version 69890 (0.0011) [2023-10-10 07:34:51,018][53268] Updated weights for policy 1, policy_version 69900 (0.0011) [2023-10-10 07:34:51,378][53268] Updated weights for policy 1, policy_version 69910 (0.0012) [2023-10-10 07:34:51,742][53268] Updated weights for policy 1, policy_version 69920 (0.0011) [2023-10-10 07:34:51,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 143261696. Throughput: 0: 1661.6, 1: 1692.9. Samples: 35818252. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:34:51,784][52050] Avg episode reward: [(0, '21.390'), (1, '20.510')] [2023-10-10 07:34:54,481][53252] Updated weights for policy 0, policy_version 69990 (0.0010) [2023-10-10 07:34:54,854][53252] Updated weights for policy 0, policy_version 70000 (0.0008) [2023-10-10 07:34:55,220][53252] Updated weights for policy 0, policy_version 70010 (0.0010) [2023-10-10 07:34:55,616][53268] Updated weights for policy 1, policy_version 69930 (0.0009) [2023-10-10 07:34:55,983][53268] Updated weights for policy 1, policy_version 69940 (0.0007) [2023-10-10 07:34:56,341][53268] Updated weights for policy 1, policy_version 69950 (0.0007) [2023-10-10 07:34:56,783][52050] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 143327232. Throughput: 0: 1681.3, 1: 1668.1. Samples: 35838044. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:34:56,784][52050] Avg episode reward: [(0, '21.820'), (1, '20.870')] [2023-10-10 07:34:59,120][53252] Updated weights for policy 0, policy_version 70020 (0.0007) [2023-10-10 07:34:59,494][53252] Updated weights for policy 0, policy_version 70030 (0.0008) [2023-10-10 07:34:59,866][53252] Updated weights for policy 0, policy_version 70040 (0.0010) [2023-10-10 07:35:00,351][53268] Updated weights for policy 1, policy_version 69960 (0.0007) [2023-10-10 07:35:00,719][53268] Updated weights for policy 1, policy_version 69970 (0.0007) [2023-10-10 07:35:01,077][53268] Updated weights for policy 1, policy_version 69980 (0.0008) [2023-10-10 07:35:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 143392768. Throughput: 0: 1674.7, 1: 1688.5. Samples: 35848834. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:35:01,784][52050] Avg episode reward: [(0, '22.660'), (1, '18.650')] [2023-10-10 07:35:03,848][53252] Updated weights for policy 0, policy_version 70050 (0.0010) [2023-10-10 07:35:04,232][53252] Updated weights for policy 0, policy_version 70060 (0.0010) [2023-10-10 07:35:04,599][53252] Updated weights for policy 0, policy_version 70070 (0.0010) [2023-10-10 07:35:04,951][53268] Updated weights for policy 1, policy_version 69990 (0.0010) [2023-10-10 07:35:04,970][53252] Updated weights for policy 0, policy_version 70080 (0.0009) [2023-10-10 07:35:05,325][53268] Updated weights for policy 1, policy_version 70000 (0.0010) [2023-10-10 07:35:05,693][53268] Updated weights for policy 1, policy_version 70010 (0.0010) [2023-10-10 07:35:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 143458304. Throughput: 0: 1664.3, 1: 1679.1. Samples: 35868456. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:35:06,784][52050] Avg episode reward: [(0, '22.020'), (1, '19.840')] [2023-10-10 07:35:08,895][53252] Updated weights for policy 0, policy_version 70090 (0.0007) [2023-10-10 07:35:09,262][53252] Updated weights for policy 0, policy_version 70100 (0.0007) [2023-10-10 07:35:09,632][53252] Updated weights for policy 0, policy_version 70110 (0.0008) [2023-10-10 07:35:09,842][53268] Updated weights for policy 1, policy_version 70020 (0.0011) [2023-10-10 07:35:10,222][53268] Updated weights for policy 1, policy_version 70030 (0.0011) [2023-10-10 07:35:10,597][53268] Updated weights for policy 1, policy_version 70040 (0.0012) [2023-10-10 07:35:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 143523840. Throughput: 0: 1688.9, 1: 1662.6. Samples: 35888328. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:35:11,784][52050] Avg episode reward: [(0, '19.630'), (1, '18.920')] [2023-10-10 07:35:13,792][53252] Updated weights for policy 0, policy_version 70120 (0.0010) [2023-10-10 07:35:14,168][53252] Updated weights for policy 0, policy_version 70130 (0.0009) [2023-10-10 07:35:14,537][53252] Updated weights for policy 0, policy_version 70140 (0.0008) [2023-10-10 07:35:14,610][53268] Updated weights for policy 1, policy_version 70050 (0.0008) [2023-10-10 07:35:14,981][53268] Updated weights for policy 1, policy_version 70060 (0.0009) [2023-10-10 07:35:15,347][53268] Updated weights for policy 1, policy_version 70070 (0.0009) [2023-10-10 07:35:15,711][53268] Updated weights for policy 1, policy_version 70080 (0.0008) [2023-10-10 07:35:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 143589376. Throughput: 0: 1671.7, 1: 1685.3. Samples: 35899078. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) [2023-10-10 07:35:16,784][52050] Avg episode reward: [(0, '20.330'), (1, '20.020')] [2023-10-10 07:35:18,572][53252] Updated weights for policy 0, policy_version 70150 (0.0008) [2023-10-10 07:35:18,946][53252] Updated weights for policy 0, policy_version 70160 (0.0007) [2023-10-10 07:35:19,310][53252] Updated weights for policy 0, policy_version 70170 (0.0007) [2023-10-10 07:35:19,658][53268] Updated weights for policy 1, policy_version 70090 (0.0010) [2023-10-10 07:35:20,025][53268] Updated weights for policy 1, policy_version 70100 (0.0007) [2023-10-10 07:35:20,397][53268] Updated weights for policy 1, policy_version 70110 (0.0009) [2023-10-10 07:35:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 143654912. Throughput: 0: 1686.1, 1: 1670.7. Samples: 35918882. 
Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:21,784][52050] Avg episode reward: [(0, '21.330'), (1, '21.810')] [2023-10-10 07:35:23,417][53252] Updated weights for policy 0, policy_version 70180 (0.0008) [2023-10-10 07:35:23,778][53252] Updated weights for policy 0, policy_version 70190 (0.0008) [2023-10-10 07:35:24,147][53252] Updated weights for policy 0, policy_version 70200 (0.0008) [2023-10-10 07:35:24,511][53268] Updated weights for policy 1, policy_version 70120 (0.0009) [2023-10-10 07:35:24,874][53268] Updated weights for policy 1, policy_version 70130 (0.0007) [2023-10-10 07:35:25,231][53268] Updated weights for policy 1, policy_version 70140 (0.0010) [2023-10-10 07:35:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 143720448. Throughput: 0: 1690.7, 1: 1674.9. Samples: 35939302. Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:26,784][52050] Avg episode reward: [(0, '21.620'), (1, '22.680')] [2023-10-10 07:35:28,166][53252] Updated weights for policy 0, policy_version 70210 (0.0009) [2023-10-10 07:35:28,534][53252] Updated weights for policy 0, policy_version 70220 (0.0008) [2023-10-10 07:35:28,904][53252] Updated weights for policy 0, policy_version 70230 (0.0008) [2023-10-10 07:35:29,281][53252] Updated weights for policy 0, policy_version 70240 (0.0007) [2023-10-10 07:35:29,383][53268] Updated weights for policy 1, policy_version 70150 (0.0010) [2023-10-10 07:35:29,747][53268] Updated weights for policy 1, policy_version 70160 (0.0011) [2023-10-10 07:35:30,114][53268] Updated weights for policy 1, policy_version 70170 (0.0010) [2023-10-10 07:35:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 143785984. Throughput: 0: 1663.7, 1: 1692.2. Samples: 35949586. 
Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:31,784][52050] Avg episode reward: [(0, '23.570'), (1, '22.250')] [2023-10-10 07:35:33,321][53252] Updated weights for policy 0, policy_version 70250 (0.0007) [2023-10-10 07:35:33,708][53252] Updated weights for policy 0, policy_version 70260 (0.0010) [2023-10-10 07:35:34,076][53252] Updated weights for policy 0, policy_version 70270 (0.0007) [2023-10-10 07:35:34,362][53268] Updated weights for policy 1, policy_version 70180 (0.0008) [2023-10-10 07:35:34,738][53268] Updated weights for policy 1, policy_version 70190 (0.0007) [2023-10-10 07:35:35,102][53268] Updated weights for policy 1, policy_version 70200 (0.0009) [2023-10-10 07:35:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 143851520. Throughput: 0: 1687.4, 1: 1672.8. Samples: 35969464. Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:36,784][52050] Avg episode reward: [(0, '23.850'), (1, '21.730')] [2023-10-10 07:35:38,032][53252] Updated weights for policy 0, policy_version 70280 (0.0007) [2023-10-10 07:35:38,391][53252] Updated weights for policy 0, policy_version 70290 (0.0008) [2023-10-10 07:35:38,766][53252] Updated weights for policy 0, policy_version 70300 (0.0009) [2023-10-10 07:35:38,995][53268] Updated weights for policy 1, policy_version 70210 (0.0009) [2023-10-10 07:35:39,363][53268] Updated weights for policy 1, policy_version 70220 (0.0009) [2023-10-10 07:35:39,721][53268] Updated weights for policy 1, policy_version 70230 (0.0009) [2023-10-10 07:35:40,089][53268] Updated weights for policy 1, policy_version 70240 (0.0008) [2023-10-10 07:35:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 143917056. Throughput: 0: 1691.5, 1: 1686.6. Samples: 35990056. 
Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:41,785][52050] Avg episode reward: [(0, '24.590'), (1, '21.590')] [2023-10-10 07:35:42,836][53252] Updated weights for policy 0, policy_version 70310 (0.0007) [2023-10-10 07:35:43,201][53252] Updated weights for policy 0, policy_version 70320 (0.0008) [2023-10-10 07:35:43,568][53252] Updated weights for policy 0, policy_version 70330 (0.0008) [2023-10-10 07:35:44,335][53268] Updated weights for policy 1, policy_version 70250 (0.0008) [2023-10-10 07:35:44,689][53268] Updated weights for policy 1, policy_version 70260 (0.0008) [2023-10-10 07:35:45,063][53268] Updated weights for policy 1, policy_version 70270 (0.0008) [2023-10-10 07:35:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 143982592. Throughput: 0: 1675.3, 1: 1687.8. Samples: 36000174. Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:46,784][52050] Avg episode reward: [(0, '23.600'), (1, '21.790')] [2023-10-10 07:35:47,656][53252] Updated weights for policy 0, policy_version 70340 (0.0008) [2023-10-10 07:35:48,024][53252] Updated weights for policy 0, policy_version 70350 (0.0009) [2023-10-10 07:35:48,396][53252] Updated weights for policy 0, policy_version 70360 (0.0007) [2023-10-10 07:35:48,828][53268] Updated weights for policy 1, policy_version 70280 (0.0011) [2023-10-10 07:35:49,203][53268] Updated weights for policy 1, policy_version 70290 (0.0010) [2023-10-10 07:35:49,572][53268] Updated weights for policy 1, policy_version 70300 (0.0010) [2023-10-10 07:35:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 144048128. Throughput: 0: 1698.1, 1: 1671.7. Samples: 36020100. 
Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:51,785][52050] Avg episode reward: [(0, '21.390'), (1, '21.520')] [2023-10-10 07:35:52,296][53252] Updated weights for policy 0, policy_version 70370 (0.0007) [2023-10-10 07:35:52,664][53252] Updated weights for policy 0, policy_version 70380 (0.0007) [2023-10-10 07:35:53,043][53252] Updated weights for policy 0, policy_version 70390 (0.0010) [2023-10-10 07:35:53,407][53252] Updated weights for policy 0, policy_version 70400 (0.0007) [2023-10-10 07:35:53,764][53268] Updated weights for policy 1, policy_version 70310 (0.0009) [2023-10-10 07:35:54,132][53268] Updated weights for policy 1, policy_version 70320 (0.0007) [2023-10-10 07:35:54,496][53268] Updated weights for policy 1, policy_version 70330 (0.0008) [2023-10-10 07:35:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 144113664. Throughput: 0: 1696.3, 1: 1691.7. Samples: 36040790. Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:35:56,784][52050] Avg episode reward: [(0, '21.670'), (1, '22.380')] [2023-10-10 07:35:57,509][53252] Updated weights for policy 0, policy_version 70410 (0.0009) [2023-10-10 07:35:57,884][53252] Updated weights for policy 0, policy_version 70420 (0.0007) [2023-10-10 07:35:58,259][53252] Updated weights for policy 0, policy_version 70430 (0.0007) [2023-10-10 07:35:58,834][53268] Updated weights for policy 1, policy_version 70340 (0.0009) [2023-10-10 07:35:59,224][53268] Updated weights for policy 1, policy_version 70350 (0.0009) [2023-10-10 07:35:59,596][53268] Updated weights for policy 1, policy_version 70360 (0.0007) [2023-10-10 07:36:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 144179200. Throughput: 0: 1686.9, 1: 1675.4. Samples: 36050382. 
Policy #0 lag: (min: 21.0, avg: 44.6, max: 48.0) [2023-10-10 07:36:01,785][52050] Avg episode reward: [(0, '20.920'), (1, '22.570')] [2023-10-10 07:36:02,442][53252] Updated weights for policy 0, policy_version 70440 (0.0007) [2023-10-10 07:36:02,820][53252] Updated weights for policy 0, policy_version 70450 (0.0007) [2023-10-10 07:36:03,183][53252] Updated weights for policy 0, policy_version 70460 (0.0007) [2023-10-10 07:36:03,564][53268] Updated weights for policy 1, policy_version 70370 (0.0010) [2023-10-10 07:36:03,929][53268] Updated weights for policy 1, policy_version 70380 (0.0008) [2023-10-10 07:36:04,294][53268] Updated weights for policy 1, policy_version 70390 (0.0007) [2023-10-10 07:36:04,662][53268] Updated weights for policy 1, policy_version 70400 (0.0009) [2023-10-10 07:36:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 144244736. Throughput: 0: 1693.3, 1: 1673.3. Samples: 36070378. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:06,784][52050] Avg episode reward: [(0, '19.870'), (1, '23.680')] [2023-10-10 07:36:07,081][53252] Updated weights for policy 0, policy_version 70470 (0.0009) [2023-10-10 07:36:07,456][53252] Updated weights for policy 0, policy_version 70480 (0.0009) [2023-10-10 07:36:07,825][53252] Updated weights for policy 0, policy_version 70490 (0.0008) [2023-10-10 07:36:08,700][53268] Updated weights for policy 1, policy_version 70410 (0.0010) [2023-10-10 07:36:09,067][53268] Updated weights for policy 1, policy_version 70420 (0.0008) [2023-10-10 07:36:09,441][53268] Updated weights for policy 1, policy_version 70430 (0.0008) [2023-10-10 07:36:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 144310272. Throughput: 0: 1697.4, 1: 1679.9. Samples: 36091280. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:11,784][52050] Avg episode reward: [(0, '19.100'), (1, '22.930')] [2023-10-10 07:36:11,885][53252] Updated weights for policy 0, policy_version 70500 (0.0007) [2023-10-10 07:36:12,260][53252] Updated weights for policy 0, policy_version 70510 (0.0007) [2023-10-10 07:36:12,632][53252] Updated weights for policy 0, policy_version 70520 (0.0008) [2023-10-10 07:36:13,492][53268] Updated weights for policy 1, policy_version 70440 (0.0011) [2023-10-10 07:36:13,865][53268] Updated weights for policy 1, policy_version 70450 (0.0010) [2023-10-10 07:36:14,238][53268] Updated weights for policy 1, policy_version 70460 (0.0008) [2023-10-10 07:36:16,574][53252] Updated weights for policy 0, policy_version 70530 (0.0008) [2023-10-10 07:36:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 144375808. Throughput: 0: 1698.5, 1: 1661.0. Samples: 36100764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:16,784][52050] Avg episode reward: [(0, '20.640'), (1, '22.330')] [2023-10-10 07:36:16,945][53252] Updated weights for policy 0, policy_version 70540 (0.0007) [2023-10-10 07:36:17,314][53252] Updated weights for policy 0, policy_version 70550 (0.0007) [2023-10-10 07:36:17,680][53252] Updated weights for policy 0, policy_version 70560 (0.0009) [2023-10-10 07:36:18,224][53268] Updated weights for policy 1, policy_version 70470 (0.0008) [2023-10-10 07:36:18,590][53268] Updated weights for policy 1, policy_version 70480 (0.0010) [2023-10-10 07:36:18,951][53268] Updated weights for policy 1, policy_version 70490 (0.0009) [2023-10-10 07:36:21,714][53252] Updated weights for policy 0, policy_version 70570 (0.0011) [2023-10-10 07:36:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 144441344. Throughput: 0: 1702.2, 1: 1675.3. Samples: 36121452. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:21,784][52050] Avg episode reward: [(0, '18.830'), (1, '21.860')] [2023-10-10 07:36:22,085][53252] Updated weights for policy 0, policy_version 70580 (0.0008) [2023-10-10 07:36:22,444][53252] Updated weights for policy 0, policy_version 70590 (0.0007) [2023-10-10 07:36:22,925][53268] Updated weights for policy 1, policy_version 70500 (0.0007) [2023-10-10 07:36:23,294][53268] Updated weights for policy 1, policy_version 70510 (0.0009) [2023-10-10 07:36:23,658][53268] Updated weights for policy 1, policy_version 70520 (0.0008) [2023-10-10 07:36:26,607][53252] Updated weights for policy 0, policy_version 70600 (0.0007) [2023-10-10 07:36:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 144506880. Throughput: 0: 1694.0, 1: 1683.1. Samples: 36142026. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:26,784][52050] Avg episode reward: [(0, '20.090'), (1, '23.060')] [2023-10-10 07:36:26,971][53252] Updated weights for policy 0, policy_version 70610 (0.0007) [2023-10-10 07:36:27,346][53252] Updated weights for policy 0, policy_version 70620 (0.0008) [2023-10-10 07:36:27,571][53268] Updated weights for policy 1, policy_version 70530 (0.0010) [2023-10-10 07:36:27,945][53268] Updated weights for policy 1, policy_version 70540 (0.0010) [2023-10-10 07:36:28,301][53268] Updated weights for policy 1, policy_version 70550 (0.0008) [2023-10-10 07:36:28,667][53268] Updated weights for policy 1, policy_version 70560 (0.0011) [2023-10-10 07:36:31,407][53252] Updated weights for policy 0, policy_version 70630 (0.0009) [2023-10-10 07:36:31,782][53252] Updated weights for policy 0, policy_version 70640 (0.0010) [2023-10-10 07:36:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 144572416. Throughput: 0: 1693.0, 1: 1667.6. Samples: 36151402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:31,784][52050] Avg episode reward: [(0, '23.410'), (1, '23.030')] [2023-10-10 07:36:32,151][53252] Updated weights for policy 0, policy_version 70650 (0.0008) [2023-10-10 07:36:32,615][53268] Updated weights for policy 1, policy_version 70570 (0.0009) [2023-10-10 07:36:32,977][53268] Updated weights for policy 1, policy_version 70580 (0.0008) [2023-10-10 07:36:33,349][53268] Updated weights for policy 1, policy_version 70590 (0.0009) [2023-10-10 07:36:36,223][53252] Updated weights for policy 0, policy_version 70660 (0.0009) [2023-10-10 07:36:36,594][53252] Updated weights for policy 0, policy_version 70670 (0.0009) [2023-10-10 07:36:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 144637952. Throughput: 0: 1685.6, 1: 1695.3. Samples: 36172244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:36,784][52050] Avg episode reward: [(0, '21.680'), (1, '22.930')] [2023-10-10 07:36:36,957][53252] Updated weights for policy 0, policy_version 70680 (0.0008) [2023-10-10 07:36:37,296][53268] Updated weights for policy 1, policy_version 70600 (0.0009) [2023-10-10 07:36:37,657][53268] Updated weights for policy 1, policy_version 70610 (0.0009) [2023-10-10 07:36:38,029][53268] Updated weights for policy 1, policy_version 70620 (0.0011) [2023-10-10 07:36:41,041][53252] Updated weights for policy 0, policy_version 70690 (0.0007) [2023-10-10 07:36:41,404][53252] Updated weights for policy 0, policy_version 70700 (0.0008) [2023-10-10 07:36:41,772][53252] Updated weights for policy 0, policy_version 70710 (0.0009) [2023-10-10 07:36:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 144703488. Throughput: 0: 1675.9, 1: 1697.5. Samples: 36192590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:41,784][52050] Avg episode reward: [(0, '21.890'), (1, '21.690')] [2023-10-10 07:36:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000070624_72318976.pth... [2023-10-10 07:36:41,829][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000069056_70713344.pth [2023-10-10 07:36:41,833][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000070624_72318976.pth [2023-10-10 07:36:42,140][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000070720_72417280.pth... [2023-10-10 07:36:42,141][53252] Updated weights for policy 0, policy_version 70720 (0.0007) [2023-10-10 07:36:42,168][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000069120_70778880.pth [2023-10-10 07:36:42,172][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000070720_72417280.pth [2023-10-10 07:36:42,251][53268] Updated weights for policy 1, policy_version 70630 (0.0009) [2023-10-10 07:36:42,617][53268] Updated weights for policy 1, policy_version 70640 (0.0008) [2023-10-10 07:36:42,971][53268] Updated weights for policy 1, policy_version 70650 (0.0007) [2023-10-10 07:36:46,312][53252] Updated weights for policy 0, policy_version 70730 (0.0011) [2023-10-10 07:36:46,678][53252] Updated weights for policy 0, policy_version 70740 (0.0009) [2023-10-10 07:36:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 144769024. Throughput: 0: 1690.9, 1: 1687.3. Samples: 36202402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:46,784][52050] Avg episode reward: [(0, '23.820'), (1, '22.920')] [2023-10-10 07:36:47,062][53252] Updated weights for policy 0, policy_version 70750 (0.0009) [2023-10-10 07:36:47,199][53268] Updated weights for policy 1, policy_version 70660 (0.0008) [2023-10-10 07:36:47,580][53268] Updated weights for policy 1, policy_version 70670 (0.0009) [2023-10-10 07:36:47,949][53268] Updated weights for policy 1, policy_version 70680 (0.0009) [2023-10-10 07:36:51,103][53252] Updated weights for policy 0, policy_version 70760 (0.0007) [2023-10-10 07:36:51,475][53252] Updated weights for policy 0, policy_version 70770 (0.0008) [2023-10-10 07:36:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 144834560. Throughput: 0: 1687.3, 1: 1702.7. Samples: 36222930. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:51,784][52050] Avg episode reward: [(0, '23.700'), (1, '23.450')] [2023-10-10 07:36:51,853][53252] Updated weights for policy 0, policy_version 70780 (0.0011) [2023-10-10 07:36:51,995][53268] Updated weights for policy 1, policy_version 70690 (0.0009) [2023-10-10 07:36:52,361][53268] Updated weights for policy 1, policy_version 70700 (0.0011) [2023-10-10 07:36:52,725][53268] Updated weights for policy 1, policy_version 70710 (0.0011) [2023-10-10 07:36:53,088][53268] Updated weights for policy 1, policy_version 70720 (0.0010) [2023-10-10 07:36:55,823][53252] Updated weights for policy 0, policy_version 70790 (0.0007) [2023-10-10 07:36:56,197][53252] Updated weights for policy 0, policy_version 70800 (0.0009) [2023-10-10 07:36:56,575][53252] Updated weights for policy 0, policy_version 70810 (0.0009) [2023-10-10 07:36:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 144900096. Throughput: 0: 1675.9, 1: 1698.9. Samples: 36243148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:36:56,784][52050] Avg episode reward: [(0, '21.420'), (1, '21.730')] [2023-10-10 07:36:57,062][53268] Updated weights for policy 1, policy_version 70730 (0.0009) [2023-10-10 07:36:57,424][53268] Updated weights for policy 1, policy_version 70740 (0.0009) [2023-10-10 07:36:57,786][53268] Updated weights for policy 1, policy_version 70750 (0.0010) [2023-10-10 07:37:00,481][53252] Updated weights for policy 0, policy_version 70820 (0.0008) [2023-10-10 07:37:00,855][53252] Updated weights for policy 0, policy_version 70830 (0.0008) [2023-10-10 07:37:01,218][53252] Updated weights for policy 0, policy_version 70840 (0.0008) [2023-10-10 07:37:01,750][53268] Updated weights for policy 1, policy_version 70760 (0.0008) [2023-10-10 07:37:01,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144998400. Throughput: 0: 1693.1, 1: 1688.6. Samples: 36252942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:01,784][52050] Avg episode reward: [(0, '23.010'), (1, '20.940')] [2023-10-10 07:37:02,115][53268] Updated weights for policy 1, policy_version 70770 (0.0008) [2023-10-10 07:37:02,482][53268] Updated weights for policy 1, policy_version 70780 (0.0008) [2023-10-10 07:37:05,353][53252] Updated weights for policy 0, policy_version 70850 (0.0008) [2023-10-10 07:37:05,732][53252] Updated weights for policy 0, policy_version 70860 (0.0010) [2023-10-10 07:37:06,100][53252] Updated weights for policy 0, policy_version 70870 (0.0008) [2023-10-10 07:37:06,468][53252] Updated weights for policy 0, policy_version 70880 (0.0008) [2023-10-10 07:37:06,634][53268] Updated weights for policy 1, policy_version 70790 (0.0008) [2023-10-10 07:37:06,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145063936. Throughput: 0: 1685.0, 1: 1694.8. Samples: 36273544. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:06,784][52050] Avg episode reward: [(0, '22.300'), (1, '22.480')] [2023-10-10 07:37:06,999][53268] Updated weights for policy 1, policy_version 70800 (0.0008) [2023-10-10 07:37:07,363][53268] Updated weights for policy 1, policy_version 70810 (0.0009) [2023-10-10 07:37:10,527][53252] Updated weights for policy 0, policy_version 70890 (0.0009) [2023-10-10 07:37:10,911][53252] Updated weights for policy 0, policy_version 70900 (0.0007) [2023-10-10 07:37:11,266][53252] Updated weights for policy 0, policy_version 70910 (0.0008) [2023-10-10 07:37:11,379][53268] Updated weights for policy 1, policy_version 70820 (0.0010) [2023-10-10 07:37:11,742][53268] Updated weights for policy 1, policy_version 70830 (0.0010) [2023-10-10 07:37:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145129472. Throughput: 0: 1666.4, 1: 1698.7. Samples: 36293452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:11,784][52050] Avg episode reward: [(0, '22.560'), (1, '20.250')] [2023-10-10 07:37:12,116][53268] Updated weights for policy 1, policy_version 70840 (0.0008) [2023-10-10 07:37:15,337][53252] Updated weights for policy 0, policy_version 70920 (0.0007) [2023-10-10 07:37:15,708][53252] Updated weights for policy 0, policy_version 70930 (0.0009) [2023-10-10 07:37:16,064][53252] Updated weights for policy 0, policy_version 70940 (0.0009) [2023-10-10 07:37:16,176][53268] Updated weights for policy 1, policy_version 70850 (0.0010) [2023-10-10 07:37:16,547][53268] Updated weights for policy 1, policy_version 70860 (0.0007) [2023-10-10 07:37:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145195008. Throughput: 0: 1696.8, 1: 1693.2. Samples: 36303954. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:16,784][52050] Avg episode reward: [(0, '22.260'), (1, '21.380')] [2023-10-10 07:37:16,911][53268] Updated weights for policy 1, policy_version 70870 (0.0007) [2023-10-10 07:37:17,281][53268] Updated weights for policy 1, policy_version 70880 (0.0010) [2023-10-10 07:37:20,193][53252] Updated weights for policy 0, policy_version 70950 (0.0009) [2023-10-10 07:37:20,560][53252] Updated weights for policy 0, policy_version 70960 (0.0010) [2023-10-10 07:37:20,939][53252] Updated weights for policy 0, policy_version 70970 (0.0010) [2023-10-10 07:37:21,382][53268] Updated weights for policy 1, policy_version 70890 (0.0010) [2023-10-10 07:37:21,758][53268] Updated weights for policy 1, policy_version 70900 (0.0011) [2023-10-10 07:37:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145260544. Throughput: 0: 1689.1, 1: 1688.1. Samples: 36324218. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:21,784][52050] Avg episode reward: [(0, '21.740'), (1, '22.220')] [2023-10-10 07:37:22,127][53268] Updated weights for policy 1, policy_version 70910 (0.0011) [2023-10-10 07:37:24,934][53252] Updated weights for policy 0, policy_version 70980 (0.0008) [2023-10-10 07:37:25,300][53252] Updated weights for policy 0, policy_version 70990 (0.0008) [2023-10-10 07:37:25,685][53252] Updated weights for policy 0, policy_version 71000 (0.0008) [2023-10-10 07:37:26,200][53268] Updated weights for policy 1, policy_version 70920 (0.0008) [2023-10-10 07:37:26,567][53268] Updated weights for policy 1, policy_version 70930 (0.0009) [2023-10-10 07:37:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145326080. Throughput: 0: 1677.1, 1: 1687.1. Samples: 36343984. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:26,784][52050] Avg episode reward: [(0, '22.750'), (1, '23.000')] [2023-10-10 07:37:26,932][53268] Updated weights for policy 1, policy_version 70940 (0.0009) [2023-10-10 07:37:29,702][53252] Updated weights for policy 0, policy_version 71010 (0.0008) [2023-10-10 07:37:30,080][53252] Updated weights for policy 0, policy_version 71020 (0.0009) [2023-10-10 07:37:30,444][53252] Updated weights for policy 0, policy_version 71030 (0.0009) [2023-10-10 07:37:30,817][53252] Updated weights for policy 0, policy_version 71040 (0.0007) [2023-10-10 07:37:31,057][53268] Updated weights for policy 1, policy_version 70950 (0.0008) [2023-10-10 07:37:31,414][53268] Updated weights for policy 1, policy_version 70960 (0.0008) [2023-10-10 07:37:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 145391616. Throughput: 0: 1693.4, 1: 1692.3. Samples: 36354756. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:31,784][52050] Avg episode reward: [(0, '21.430'), (1, '22.130')] [2023-10-10 07:37:31,788][53268] Updated weights for policy 1, policy_version 70970 (0.0008) [2023-10-10 07:37:34,813][53252] Updated weights for policy 0, policy_version 71050 (0.0008) [2023-10-10 07:37:35,193][53252] Updated weights for policy 0, policy_version 71060 (0.0009) [2023-10-10 07:37:35,555][53252] Updated weights for policy 0, policy_version 71070 (0.0009) [2023-10-10 07:37:35,747][53268] Updated weights for policy 1, policy_version 70980 (0.0009) [2023-10-10 07:37:36,151][53268] Updated weights for policy 1, policy_version 70990 (0.0009) [2023-10-10 07:37:36,527][53268] Updated weights for policy 1, policy_version 71000 (0.0010) [2023-10-10 07:37:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 145457152. Throughput: 0: 1678.4, 1: 1699.8. Samples: 36374952. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:36,784][52050] Avg episode reward: [(0, '21.660'), (1, '22.250')] [2023-10-10 07:37:39,568][53252] Updated weights for policy 0, policy_version 71080 (0.0007) [2023-10-10 07:37:39,945][53252] Updated weights for policy 0, policy_version 71090 (0.0008) [2023-10-10 07:37:40,321][53252] Updated weights for policy 0, policy_version 71100 (0.0008) [2023-10-10 07:37:40,527][53268] Updated weights for policy 1, policy_version 71010 (0.0007) [2023-10-10 07:37:40,907][53268] Updated weights for policy 1, policy_version 71020 (0.0009) [2023-10-10 07:37:41,271][53268] Updated weights for policy 1, policy_version 71030 (0.0010) [2023-10-10 07:37:41,642][53268] Updated weights for policy 1, policy_version 71040 (0.0009) [2023-10-10 07:37:41,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 145555456. Throughput: 0: 1679.2, 1: 1683.2. Samples: 36394452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:41,784][52050] Avg episode reward: [(0, '21.390'), (1, '21.560')] [2023-10-10 07:37:44,428][53252] Updated weights for policy 0, policy_version 71110 (0.0008) [2023-10-10 07:37:44,803][53252] Updated weights for policy 0, policy_version 71120 (0.0009) [2023-10-10 07:37:45,169][53252] Updated weights for policy 0, policy_version 71130 (0.0008) [2023-10-10 07:37:45,564][53268] Updated weights for policy 1, policy_version 71050 (0.0008) [2023-10-10 07:37:45,930][53268] Updated weights for policy 1, policy_version 71060 (0.0010) [2023-10-10 07:37:46,305][53268] Updated weights for policy 1, policy_version 71070 (0.0010) [2023-10-10 07:37:46,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 145620992. Throughput: 0: 1688.4, 1: 1702.2. Samples: 36405520. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:46,784][52050] Avg episode reward: [(0, '21.170'), (1, '21.680')] [2023-10-10 07:37:49,395][53252] Updated weights for policy 0, policy_version 71140 (0.0007) [2023-10-10 07:37:49,762][53252] Updated weights for policy 0, policy_version 71150 (0.0008) [2023-10-10 07:37:50,122][53252] Updated weights for policy 0, policy_version 71160 (0.0008) [2023-10-10 07:37:50,475][53268] Updated weights for policy 1, policy_version 71080 (0.0011) [2023-10-10 07:37:50,844][53268] Updated weights for policy 1, policy_version 71090 (0.0010) [2023-10-10 07:37:51,221][53268] Updated weights for policy 1, policy_version 71100 (0.0009) [2023-10-10 07:37:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 145686528. Throughput: 0: 1667.6, 1: 1697.3. Samples: 36424968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:51,784][52050] Avg episode reward: [(0, '20.840'), (1, '21.420')] [2023-10-10 07:37:54,070][53252] Updated weights for policy 0, policy_version 71170 (0.0008) [2023-10-10 07:37:54,450][53252] Updated weights for policy 0, policy_version 71180 (0.0007) [2023-10-10 07:37:54,816][53252] Updated weights for policy 0, policy_version 71190 (0.0009) [2023-10-10 07:37:55,100][53268] Updated weights for policy 1, policy_version 71110 (0.0009) [2023-10-10 07:37:55,187][53252] Updated weights for policy 0, policy_version 71200 (0.0008) [2023-10-10 07:37:55,471][53268] Updated weights for policy 1, policy_version 71120 (0.0009) [2023-10-10 07:37:55,827][53268] Updated weights for policy 1, policy_version 71130 (0.0009) [2023-10-10 07:37:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 145752064. Throughput: 0: 1691.2, 1: 1667.6. Samples: 36444596. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:37:56,784][52050] Avg episode reward: [(0, '21.790'), (1, '22.480')] [2023-10-10 07:37:59,330][53252] Updated weights for policy 0, policy_version 71210 (0.0007) [2023-10-10 07:37:59,707][53252] Updated weights for policy 0, policy_version 71220 (0.0008) [2023-10-10 07:37:59,771][53268] Updated weights for policy 1, policy_version 71140 (0.0009) [2023-10-10 07:38:00,082][53252] Updated weights for policy 0, policy_version 71230 (0.0008) [2023-10-10 07:38:00,146][53268] Updated weights for policy 1, policy_version 71150 (0.0008) [2023-10-10 07:38:00,512][53268] Updated weights for policy 1, policy_version 71160 (0.0009) [2023-10-10 07:38:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145817600. Throughput: 0: 1674.3, 1: 1698.7. Samples: 36455738. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:38:01,784][52050] Avg episode reward: [(0, '20.300'), (1, '20.790')] [2023-10-10 07:38:04,328][53252] Updated weights for policy 0, policy_version 71240 (0.0009) [2023-10-10 07:38:04,618][53268] Updated weights for policy 1, policy_version 71170 (0.0009) [2023-10-10 07:38:04,702][53252] Updated weights for policy 0, policy_version 71250 (0.0008) [2023-10-10 07:38:04,988][53268] Updated weights for policy 1, policy_version 71180 (0.0007) [2023-10-10 07:38:05,074][53252] Updated weights for policy 0, policy_version 71260 (0.0008) [2023-10-10 07:38:05,350][53268] Updated weights for policy 1, policy_version 71190 (0.0007) [2023-10-10 07:38:05,723][53268] Updated weights for policy 1, policy_version 71200 (0.0008) [2023-10-10 07:38:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145883136. Throughput: 0: 1659.6, 1: 1682.9. Samples: 36474634. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:38:06,784][52050] Avg episode reward: [(0, '20.720'), (1, '21.500')] [2023-10-10 07:38:09,170][53252] Updated weights for policy 0, policy_version 71270 (0.0007) [2023-10-10 07:38:09,532][53252] Updated weights for policy 0, policy_version 71280 (0.0007) [2023-10-10 07:38:09,676][53268] Updated weights for policy 1, policy_version 71210 (0.0009) [2023-10-10 07:38:09,907][53252] Updated weights for policy 0, policy_version 71290 (0.0008) [2023-10-10 07:38:10,035][53268] Updated weights for policy 1, policy_version 71220 (0.0009) [2023-10-10 07:38:10,409][53268] Updated weights for policy 1, policy_version 71230 (0.0011) [2023-10-10 07:38:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145948672. Throughput: 0: 1673.5, 1: 1677.5. Samples: 36494778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:38:11,784][52050] Avg episode reward: [(0, '21.460'), (1, '20.530')] [2023-10-10 07:38:13,984][53252] Updated weights for policy 0, policy_version 71300 (0.0008) [2023-10-10 07:38:14,323][53268] Updated weights for policy 1, policy_version 71240 (0.0008) [2023-10-10 07:38:14,346][53252] Updated weights for policy 0, policy_version 71310 (0.0007) [2023-10-10 07:38:14,703][53268] Updated weights for policy 1, policy_version 71250 (0.0008) [2023-10-10 07:38:14,722][53252] Updated weights for policy 0, policy_version 71320 (0.0008) [2023-10-10 07:38:15,069][53268] Updated weights for policy 1, policy_version 71260 (0.0008) [2023-10-10 07:38:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 146014208. Throughput: 0: 1658.0, 1: 1701.0. Samples: 36505912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:38:16,784][52050] Avg episode reward: [(0, '21.530'), (1, '20.940')] [2023-10-10 07:38:18,898][53252] Updated weights for policy 0, policy_version 71330 (0.0009) [2023-10-10 07:38:19,226][53268] Updated weights for policy 1, policy_version 71270 (0.0007) [2023-10-10 07:38:19,273][53252] Updated weights for policy 0, policy_version 71340 (0.0008) [2023-10-10 07:38:19,598][53268] Updated weights for policy 1, policy_version 71280 (0.0009) [2023-10-10 07:38:19,629][53252] Updated weights for policy 0, policy_version 71350 (0.0008) [2023-10-10 07:38:19,969][53268] Updated weights for policy 1, policy_version 71290 (0.0008) [2023-10-10 07:38:20,001][53252] Updated weights for policy 0, policy_version 71360 (0.0008) [2023-10-10 07:38:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 146079744. Throughput: 0: 1656.0, 1: 1673.0. Samples: 36524756. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:38:21,784][52050] Avg episode reward: [(0, '22.080'), (1, '20.060')] [2023-10-10 07:38:24,059][53268] Updated weights for policy 1, policy_version 71300 (0.0010) [2023-10-10 07:38:24,236][53252] Updated weights for policy 0, policy_version 71370 (0.0008) [2023-10-10 07:38:24,441][53268] Updated weights for policy 1, policy_version 71310 (0.0009) [2023-10-10 07:38:24,603][53252] Updated weights for policy 0, policy_version 71380 (0.0007) [2023-10-10 07:38:24,804][53268] Updated weights for policy 1, policy_version 71320 (0.0010) [2023-10-10 07:38:24,976][53252] Updated weights for policy 0, policy_version 71390 (0.0009) [2023-10-10 07:38:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 146145280. Throughput: 0: 1661.8, 1: 1689.4. Samples: 36545258. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:26,784][52050] Avg episode reward: [(0, '21.570'), (1, '21.340')] [2023-10-10 07:38:28,869][53268] Updated weights for policy 1, policy_version 71330 (0.0009) [2023-10-10 07:38:29,018][53252] Updated weights for policy 0, policy_version 71400 (0.0007) [2023-10-10 07:38:29,231][53268] Updated weights for policy 1, policy_version 71340 (0.0010) [2023-10-10 07:38:29,389][53252] Updated weights for policy 0, policy_version 71410 (0.0007) [2023-10-10 07:38:29,597][53268] Updated weights for policy 1, policy_version 71350 (0.0010) [2023-10-10 07:38:29,755][53252] Updated weights for policy 0, policy_version 71420 (0.0007) [2023-10-10 07:38:29,963][53268] Updated weights for policy 1, policy_version 71360 (0.0009) [2023-10-10 07:38:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 146210816. Throughput: 0: 1648.1, 1: 1690.9. Samples: 36555774. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:31,784][52050] Avg episode reward: [(0, '21.820'), (1, '20.250')] [2023-10-10 07:38:33,950][53252] Updated weights for policy 0, policy_version 71430 (0.0008) [2023-10-10 07:38:34,163][53268] Updated weights for policy 1, policy_version 71370 (0.0010) [2023-10-10 07:38:34,314][53252] Updated weights for policy 0, policy_version 71440 (0.0007) [2023-10-10 07:38:34,531][53268] Updated weights for policy 1, policy_version 71380 (0.0008) [2023-10-10 07:38:34,676][53252] Updated weights for policy 0, policy_version 71450 (0.0008) [2023-10-10 07:38:34,897][53268] Updated weights for policy 1, policy_version 71390 (0.0007) [2023-10-10 07:38:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 146276352. Throughput: 0: 1656.6, 1: 1670.3. Samples: 36574676. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:36,784][52050] Avg episode reward: [(0, '22.740'), (1, '20.940')] [2023-10-10 07:38:38,638][53252] Updated weights for policy 0, policy_version 71460 (0.0008) [2023-10-10 07:38:38,965][53268] Updated weights for policy 1, policy_version 71400 (0.0010) [2023-10-10 07:38:39,017][53252] Updated weights for policy 0, policy_version 71470 (0.0009) [2023-10-10 07:38:39,338][53268] Updated weights for policy 1, policy_version 71410 (0.0008) [2023-10-10 07:38:39,382][53252] Updated weights for policy 0, policy_version 71480 (0.0008) [2023-10-10 07:38:39,694][53268] Updated weights for policy 1, policy_version 71420 (0.0009) [2023-10-10 07:38:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 146341888. Throughput: 0: 1658.7, 1: 1694.3. Samples: 36595480. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:41,785][52050] Avg episode reward: [(0, '20.980'), (1, '20.790')] [2023-10-10 07:38:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000071488_73203712.pth... [2023-10-10 07:38:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000071424_73138176.pth... 
[2023-10-10 07:38:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000069856_71532544.pth [2023-10-10 07:38:41,827][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000069920_71598080.pth [2023-10-10 07:38:43,471][53252] Updated weights for policy 0, policy_version 71490 (0.0010) [2023-10-10 07:38:43,697][53268] Updated weights for policy 1, policy_version 71430 (0.0009) [2023-10-10 07:38:43,845][53252] Updated weights for policy 0, policy_version 71500 (0.0008) [2023-10-10 07:38:44,069][53268] Updated weights for policy 1, policy_version 71440 (0.0008) [2023-10-10 07:38:44,210][53252] Updated weights for policy 0, policy_version 71510 (0.0007) [2023-10-10 07:38:44,440][53268] Updated weights for policy 1, policy_version 71450 (0.0009) [2023-10-10 07:38:44,576][53252] Updated weights for policy 0, policy_version 71520 (0.0007) [2023-10-10 07:38:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 146407424. Throughput: 0: 1649.3, 1: 1678.0. Samples: 36605466. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:46,784][52050] Avg episode reward: [(0, '22.290'), (1, '19.930')] [2023-10-10 07:38:48,503][53268] Updated weights for policy 1, policy_version 71460 (0.0008) [2023-10-10 07:38:48,794][53252] Updated weights for policy 0, policy_version 71530 (0.0009) [2023-10-10 07:38:48,869][53268] Updated weights for policy 1, policy_version 71470 (0.0008) [2023-10-10 07:38:49,157][53252] Updated weights for policy 0, policy_version 71540 (0.0010) [2023-10-10 07:38:49,234][53268] Updated weights for policy 1, policy_version 71480 (0.0007) [2023-10-10 07:38:49,521][53252] Updated weights for policy 0, policy_version 71550 (0.0008) [2023-10-10 07:38:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 146472960. Throughput: 0: 1664.5, 1: 1679.6. Samples: 36625120. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:51,784][52050] Avg episode reward: [(0, '22.880'), (1, '21.390')] [2023-10-10 07:38:53,284][53268] Updated weights for policy 1, policy_version 71490 (0.0008) [2023-10-10 07:38:53,654][53268] Updated weights for policy 1, policy_version 71500 (0.0008) [2023-10-10 07:38:53,792][53252] Updated weights for policy 0, policy_version 71560 (0.0009) [2023-10-10 07:38:54,022][53268] Updated weights for policy 1, policy_version 71510 (0.0008) [2023-10-10 07:38:54,169][53252] Updated weights for policy 0, policy_version 71570 (0.0008) [2023-10-10 07:38:54,388][53268] Updated weights for policy 1, policy_version 71520 (0.0008) [2023-10-10 07:38:54,537][53252] Updated weights for policy 0, policy_version 71580 (0.0007) [2023-10-10 07:38:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 146538496. Throughput: 0: 1664.9, 1: 1691.6. Samples: 36645820. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:38:56,784][52050] Avg episode reward: [(0, '22.800'), (1, '21.230')] [2023-10-10 07:38:58,377][53268] Updated weights for policy 1, policy_version 71530 (0.0009) [2023-10-10 07:38:58,586][53252] Updated weights for policy 0, policy_version 71590 (0.0008) [2023-10-10 07:38:58,742][53268] Updated weights for policy 1, policy_version 71540 (0.0008) [2023-10-10 07:38:58,959][53252] Updated weights for policy 0, policy_version 71600 (0.0009) [2023-10-10 07:38:59,112][53268] Updated weights for policy 1, policy_version 71550 (0.0009) [2023-10-10 07:38:59,326][53252] Updated weights for policy 0, policy_version 71610 (0.0007) [2023-10-10 07:39:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 146604032. Throughput: 0: 1654.4, 1: 1666.6. Samples: 36655360. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:39:01,784][52050] Avg episode reward: [(0, '19.990'), (1, '21.680')] [2023-10-10 07:39:03,275][53268] Updated weights for policy 1, policy_version 71560 (0.0008) [2023-10-10 07:39:03,394][53252] Updated weights for policy 0, policy_version 71620 (0.0009) [2023-10-10 07:39:03,634][53268] Updated weights for policy 1, policy_version 71570 (0.0008) [2023-10-10 07:39:03,769][53252] Updated weights for policy 0, policy_version 71630 (0.0007) [2023-10-10 07:39:04,004][53268] Updated weights for policy 1, policy_version 71580 (0.0007) [2023-10-10 07:39:04,141][53252] Updated weights for policy 0, policy_version 71640 (0.0007) [2023-10-10 07:39:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146669568. Throughput: 0: 1668.5, 1: 1685.4. Samples: 36675684. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:39:06,784][52050] Avg episode reward: [(0, '21.350'), (1, '23.090')] [2023-10-10 07:39:07,973][53268] Updated weights for policy 1, policy_version 71590 (0.0009) [2023-10-10 07:39:08,240][53252] Updated weights for policy 0, policy_version 71650 (0.0007) [2023-10-10 07:39:08,328][53268] Updated weights for policy 1, policy_version 71600 (0.0009) [2023-10-10 07:39:08,601][53252] Updated weights for policy 0, policy_version 71660 (0.0010) [2023-10-10 07:39:08,700][53268] Updated weights for policy 1, policy_version 71610 (0.0007) [2023-10-10 07:39:08,971][53252] Updated weights for policy 0, policy_version 71670 (0.0008) [2023-10-10 07:39:09,342][53252] Updated weights for policy 0, policy_version 71680 (0.0008) [2023-10-10 07:39:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 146735104. Throughput: 0: 1670.2, 1: 1692.6. Samples: 36696582. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-10 07:39:11,784][52050] Avg episode reward: [(0, '21.550'), (1, '24.030')] [2023-10-10 07:39:12,886][53268] Updated weights for policy 1, policy_version 71620 (0.0007) [2023-10-10 07:39:13,288][53268] Updated weights for policy 1, policy_version 71630 (0.0007) [2023-10-10 07:39:13,535][53252] Updated weights for policy 0, policy_version 71690 (0.0007) [2023-10-10 07:39:13,652][53268] Updated weights for policy 1, policy_version 71640 (0.0009) [2023-10-10 07:39:13,915][53252] Updated weights for policy 0, policy_version 71700 (0.0007) [2023-10-10 07:39:14,278][53252] Updated weights for policy 0, policy_version 71710 (0.0009) [2023-10-10 07:39:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146800640. Throughput: 0: 1659.9, 1: 1670.0. Samples: 36705622. Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:16,784][52050] Avg episode reward: [(0, '21.300'), (1, '23.030')] [2023-10-10 07:39:17,567][53268] Updated weights for policy 1, policy_version 71650 (0.0009) [2023-10-10 07:39:17,938][53268] Updated weights for policy 1, policy_version 71660 (0.0007) [2023-10-10 07:39:18,306][53268] Updated weights for policy 1, policy_version 71670 (0.0007) [2023-10-10 07:39:18,428][53252] Updated weights for policy 0, policy_version 71720 (0.0008) [2023-10-10 07:39:18,665][53268] Updated weights for policy 1, policy_version 71680 (0.0008) [2023-10-10 07:39:18,798][53252] Updated weights for policy 0, policy_version 71730 (0.0009) [2023-10-10 07:39:19,177][53252] Updated weights for policy 0, policy_version 71740 (0.0009) [2023-10-10 07:39:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146866176. Throughput: 0: 1669.5, 1: 1697.0. Samples: 36726170. 
Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:21,784][52050] Avg episode reward: [(0, '21.460'), (1, '22.730')] [2023-10-10 07:39:22,749][53268] Updated weights for policy 1, policy_version 71690 (0.0007) [2023-10-10 07:39:23,102][53252] Updated weights for policy 0, policy_version 71750 (0.0009) [2023-10-10 07:39:23,109][53268] Updated weights for policy 1, policy_version 71700 (0.0008) [2023-10-10 07:39:23,475][53252] Updated weights for policy 0, policy_version 71760 (0.0007) [2023-10-10 07:39:23,478][53268] Updated weights for policy 1, policy_version 71710 (0.0009) [2023-10-10 07:39:23,847][53252] Updated weights for policy 0, policy_version 71770 (0.0007) [2023-10-10 07:39:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146931712. Throughput: 0: 1671.6, 1: 1697.7. Samples: 36747100. Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:26,784][52050] Avg episode reward: [(0, '21.920'), (1, '22.040')] [2023-10-10 07:39:27,449][53268] Updated weights for policy 1, policy_version 71720 (0.0007) [2023-10-10 07:39:27,791][53252] Updated weights for policy 0, policy_version 71780 (0.0009) [2023-10-10 07:39:27,817][53268] Updated weights for policy 1, policy_version 71730 (0.0007) [2023-10-10 07:39:28,160][53252] Updated weights for policy 0, policy_version 71790 (0.0008) [2023-10-10 07:39:28,182][53268] Updated weights for policy 1, policy_version 71740 (0.0008) [2023-10-10 07:39:28,536][53252] Updated weights for policy 0, policy_version 71800 (0.0007) [2023-10-10 07:39:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146997248. Throughput: 0: 1666.8, 1: 1683.4. Samples: 36756228. 
Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:31,784][52050] Avg episode reward: [(0, '20.900'), (1, '21.990')] [2023-10-10 07:39:32,261][53268] Updated weights for policy 1, policy_version 71750 (0.0009) [2023-10-10 07:39:32,622][53252] Updated weights for policy 0, policy_version 71810 (0.0007) [2023-10-10 07:39:32,630][53268] Updated weights for policy 1, policy_version 71760 (0.0009) [2023-10-10 07:39:32,992][53252] Updated weights for policy 0, policy_version 71820 (0.0008) [2023-10-10 07:39:33,008][53268] Updated weights for policy 1, policy_version 71770 (0.0009) [2023-10-10 07:39:33,364][53252] Updated weights for policy 0, policy_version 71830 (0.0007) [2023-10-10 07:39:33,728][53252] Updated weights for policy 0, policy_version 71840 (0.0007) [2023-10-10 07:39:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147062784. Throughput: 0: 1673.9, 1: 1698.4. Samples: 36776874. Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:36,784][52050] Avg episode reward: [(0, '20.630'), (1, '22.560')] [2023-10-10 07:39:36,840][53268] Updated weights for policy 1, policy_version 71780 (0.0009) [2023-10-10 07:39:37,208][53268] Updated weights for policy 1, policy_version 71790 (0.0009) [2023-10-10 07:39:37,571][53268] Updated weights for policy 1, policy_version 71800 (0.0009) [2023-10-10 07:39:37,665][53252] Updated weights for policy 0, policy_version 71850 (0.0008) [2023-10-10 07:39:38,032][53252] Updated weights for policy 0, policy_version 71860 (0.0009) [2023-10-10 07:39:38,397][53252] Updated weights for policy 0, policy_version 71870 (0.0010) [2023-10-10 07:39:41,551][53268] Updated weights for policy 1, policy_version 71810 (0.0008) [2023-10-10 07:39:41,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147128320. Throughput: 0: 1677.0, 1: 1700.4. Samples: 36797804. 
Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:41,785][52050] Avg episode reward: [(0, '20.620'), (1, '21.730')] [2023-10-10 07:39:41,929][53268] Updated weights for policy 1, policy_version 71820 (0.0007) [2023-10-10 07:39:42,297][53268] Updated weights for policy 1, policy_version 71830 (0.0009) [2023-10-10 07:39:42,598][53252] Updated weights for policy 0, policy_version 71880 (0.0010) [2023-10-10 07:39:42,656][53268] Updated weights for policy 1, policy_version 71840 (0.0007) [2023-10-10 07:39:42,977][53252] Updated weights for policy 0, policy_version 71890 (0.0008) [2023-10-10 07:39:43,339][53252] Updated weights for policy 0, policy_version 71900 (0.0008) [2023-10-10 07:39:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147193856. Throughput: 0: 1675.1, 1: 1694.5. Samples: 36806992. Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:46,784][52050] Avg episode reward: [(0, '22.840'), (1, '22.310')] [2023-10-10 07:39:47,039][53268] Updated weights for policy 1, policy_version 71850 (0.0009) [2023-10-10 07:39:47,402][53268] Updated weights for policy 1, policy_version 71860 (0.0010) [2023-10-10 07:39:47,425][53252] Updated weights for policy 0, policy_version 71910 (0.0009) [2023-10-10 07:39:47,776][53268] Updated weights for policy 1, policy_version 71870 (0.0010) [2023-10-10 07:39:47,794][53252] Updated weights for policy 0, policy_version 71920 (0.0009) [2023-10-10 07:39:48,167][53252] Updated weights for policy 0, policy_version 71930 (0.0010) [2023-10-10 07:39:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147259392. Throughput: 0: 1677.3, 1: 1695.6. Samples: 36827466. 
Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:51,784][52050] Avg episode reward: [(0, '22.150'), (1, '20.780')] [2023-10-10 07:39:51,786][53268] Updated weights for policy 1, policy_version 71880 (0.0010) [2023-10-10 07:39:52,153][53268] Updated weights for policy 1, policy_version 71890 (0.0008) [2023-10-10 07:39:52,293][53252] Updated weights for policy 0, policy_version 71940 (0.0009) [2023-10-10 07:39:52,528][53268] Updated weights for policy 1, policy_version 71900 (0.0007) [2023-10-10 07:39:52,666][53252] Updated weights for policy 0, policy_version 71950 (0.0009) [2023-10-10 07:39:53,034][53252] Updated weights for policy 0, policy_version 71960 (0.0010) [2023-10-10 07:39:56,373][53268] Updated weights for policy 1, policy_version 71910 (0.0008) [2023-10-10 07:39:56,731][53268] Updated weights for policy 1, policy_version 71920 (0.0008) [2023-10-10 07:39:56,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147324928. Throughput: 0: 1678.0, 1: 1695.1. Samples: 36848372. Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:39:56,784][52050] Avg episode reward: [(0, '22.890'), (1, '20.850')] [2023-10-10 07:39:57,039][53252] Updated weights for policy 0, policy_version 71970 (0.0010) [2023-10-10 07:39:57,093][53268] Updated weights for policy 1, policy_version 71930 (0.0009) [2023-10-10 07:39:57,413][53252] Updated weights for policy 0, policy_version 71980 (0.0007) [2023-10-10 07:39:57,787][53252] Updated weights for policy 0, policy_version 71990 (0.0007) [2023-10-10 07:39:58,149][53252] Updated weights for policy 0, policy_version 72000 (0.0008) [2023-10-10 07:40:01,324][53268] Updated weights for policy 1, policy_version 71940 (0.0009) [2023-10-10 07:40:01,720][53268] Updated weights for policy 1, policy_version 71950 (0.0008) [2023-10-10 07:40:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147390464. Throughput: 0: 1677.3, 1: 1698.1. 
Samples: 36857518. Policy #0 lag: (min: 7.0, avg: 14.3, max: 39.0) [2023-10-10 07:40:01,784][52050] Avg episode reward: [(0, '21.590'), (1, '20.830')] [2023-10-10 07:40:02,095][53268] Updated weights for policy 1, policy_version 71960 (0.0008) [2023-10-10 07:40:02,267][53252] Updated weights for policy 0, policy_version 72010 (0.0008) [2023-10-10 07:40:02,630][53252] Updated weights for policy 0, policy_version 72020 (0.0010) [2023-10-10 07:40:03,005][53252] Updated weights for policy 0, policy_version 72030 (0.0010) [2023-10-10 07:40:06,080][53268] Updated weights for policy 1, policy_version 71970 (0.0007) [2023-10-10 07:40:06,452][53268] Updated weights for policy 1, policy_version 71980 (0.0008) [2023-10-10 07:40:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147456000. Throughput: 0: 1681.9, 1: 1690.7. Samples: 36877936. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:06,784][52050] Avg episode reward: [(0, '20.350'), (1, '21.590')] [2023-10-10 07:40:06,821][53268] Updated weights for policy 1, policy_version 71990 (0.0008) [2023-10-10 07:40:07,190][53252] Updated weights for policy 0, policy_version 72040 (0.0009) [2023-10-10 07:40:07,191][53268] Updated weights for policy 1, policy_version 72000 (0.0008) [2023-10-10 07:40:07,557][53252] Updated weights for policy 0, policy_version 72050 (0.0009) [2023-10-10 07:40:07,927][53252] Updated weights for policy 0, policy_version 72060 (0.0008) [2023-10-10 07:40:11,283][53268] Updated weights for policy 1, policy_version 72010 (0.0009) [2023-10-10 07:40:11,648][53268] Updated weights for policy 1, policy_version 72020 (0.0007) [2023-10-10 07:40:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147521536. Throughput: 0: 1676.5, 1: 1684.8. Samples: 36898360. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:11,784][52050] Avg episode reward: [(0, '19.820'), (1, '20.810')] [2023-10-10 07:40:11,893][53252] Updated weights for policy 0, policy_version 72070 (0.0008) [2023-10-10 07:40:12,021][53268] Updated weights for policy 1, policy_version 72030 (0.0008) [2023-10-10 07:40:12,262][53252] Updated weights for policy 0, policy_version 72080 (0.0008) [2023-10-10 07:40:12,641][53252] Updated weights for policy 0, policy_version 72090 (0.0008) [2023-10-10 07:40:16,203][53268] Updated weights for policy 1, policy_version 72040 (0.0011) [2023-10-10 07:40:16,576][53268] Updated weights for policy 1, policy_version 72050 (0.0008) [2023-10-10 07:40:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147587072. Throughput: 0: 1676.2, 1: 1689.8. Samples: 36907698. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:16,784][52050] Avg episode reward: [(0, '19.620'), (1, '20.970')] [2023-10-10 07:40:16,817][53252] Updated weights for policy 0, policy_version 72100 (0.0009) [2023-10-10 07:40:16,943][53268] Updated weights for policy 1, policy_version 72060 (0.0009) [2023-10-10 07:40:17,188][53252] Updated weights for policy 0, policy_version 72110 (0.0008) [2023-10-10 07:40:17,566][53252] Updated weights for policy 0, policy_version 72120 (0.0008) [2023-10-10 07:40:21,126][53268] Updated weights for policy 1, policy_version 72070 (0.0009) [2023-10-10 07:40:21,492][53268] Updated weights for policy 1, policy_version 72080 (0.0009) [2023-10-10 07:40:21,778][53252] Updated weights for policy 0, policy_version 72130 (0.0009) [2023-10-10 07:40:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 147652608. Throughput: 0: 1673.6, 1: 1682.9. Samples: 36927914. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:21,784][52050] Avg episode reward: [(0, '21.170'), (1, '21.710')] [2023-10-10 07:40:21,852][53268] Updated weights for policy 1, policy_version 72090 (0.0009) [2023-10-10 07:40:22,144][53252] Updated weights for policy 0, policy_version 72140 (0.0007) [2023-10-10 07:40:22,514][53252] Updated weights for policy 0, policy_version 72150 (0.0008) [2023-10-10 07:40:22,881][53252] Updated weights for policy 0, policy_version 72160 (0.0008) [2023-10-10 07:40:25,803][53268] Updated weights for policy 1, policy_version 72100 (0.0007) [2023-10-10 07:40:26,165][53268] Updated weights for policy 1, policy_version 72110 (0.0009) [2023-10-10 07:40:26,537][53268] Updated weights for policy 1, policy_version 72120 (0.0007) [2023-10-10 07:40:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 147718144. Throughput: 0: 1676.9, 1: 1666.1. Samples: 36948234. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:26,784][52050] Avg episode reward: [(0, '22.840'), (1, '21.120')] [2023-10-10 07:40:27,064][53252] Updated weights for policy 0, policy_version 72170 (0.0009) [2023-10-10 07:40:27,430][53252] Updated weights for policy 0, policy_version 72180 (0.0009) [2023-10-10 07:40:27,797][53252] Updated weights for policy 0, policy_version 72190 (0.0009) [2023-10-10 07:40:30,610][53268] Updated weights for policy 1, policy_version 72130 (0.0008) [2023-10-10 07:40:30,971][53268] Updated weights for policy 1, policy_version 72140 (0.0007) [2023-10-10 07:40:31,343][53268] Updated weights for policy 1, policy_version 72150 (0.0009) [2023-10-10 07:40:31,706][53268] Updated weights for policy 1, policy_version 72160 (0.0008) [2023-10-10 07:40:31,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 147816448. Throughput: 0: 1676.8, 1: 1679.2. Samples: 36958008. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:31,784][52050] Avg episode reward: [(0, '22.310'), (1, '20.880')] [2023-10-10 07:40:31,873][53252] Updated weights for policy 0, policy_version 72200 (0.0009) [2023-10-10 07:40:32,252][53252] Updated weights for policy 0, policy_version 72210 (0.0008) [2023-10-10 07:40:32,624][53252] Updated weights for policy 0, policy_version 72220 (0.0007) [2023-10-10 07:40:35,836][53268] Updated weights for policy 1, policy_version 72170 (0.0008) [2023-10-10 07:40:36,193][53268] Updated weights for policy 1, policy_version 72180 (0.0008) [2023-10-10 07:40:36,565][53268] Updated weights for policy 1, policy_version 72190 (0.0008) [2023-10-10 07:40:36,640][53252] Updated weights for policy 0, policy_version 72230 (0.0007) [2023-10-10 07:40:36,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147881984. Throughput: 0: 1675.5, 1: 1682.8. Samples: 36978588. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:36,784][52050] Avg episode reward: [(0, '23.460'), (1, '21.230')] [2023-10-10 07:40:37,015][53252] Updated weights for policy 0, policy_version 72240 (0.0010) [2023-10-10 07:40:37,388][53252] Updated weights for policy 0, policy_version 72250 (0.0009) [2023-10-10 07:40:40,592][53268] Updated weights for policy 1, policy_version 72200 (0.0010) [2023-10-10 07:40:40,966][53268] Updated weights for policy 1, policy_version 72210 (0.0008) [2023-10-10 07:40:41,327][53268] Updated weights for policy 1, policy_version 72220 (0.0011) [2023-10-10 07:40:41,400][53252] Updated weights for policy 0, policy_version 72260 (0.0010) [2023-10-10 07:40:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 147947520. Throughput: 0: 1677.4, 1: 1653.6. Samples: 36998270. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:41,784][53252] Updated weights for policy 0, policy_version 72270 (0.0010) [2023-10-10 07:40:41,785][52050] Avg episode reward: [(0, '23.670'), (1, '21.740')] [2023-10-10 07:40:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000072224_73957376.pth... [2023-10-10 07:40:41,824][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000070624_72318976.pth [2023-10-10 07:40:42,147][53252] Updated weights for policy 0, policy_version 72280 (0.0010) [2023-10-10 07:40:42,443][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000072288_74022912.pth... [2023-10-10 07:40:42,480][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000070720_72417280.pth [2023-10-10 07:40:45,501][53268] Updated weights for policy 1, policy_version 72230 (0.0010) [2023-10-10 07:40:45,874][53268] Updated weights for policy 1, policy_version 72240 (0.0008) [2023-10-10 07:40:46,180][53252] Updated weights for policy 0, policy_version 72290 (0.0009) [2023-10-10 07:40:46,244][53268] Updated weights for policy 1, policy_version 72250 (0.0008) [2023-10-10 07:40:46,560][53252] Updated weights for policy 0, policy_version 72300 (0.0008) [2023-10-10 07:40:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 148013056. Throughput: 0: 1678.0, 1: 1675.2. Samples: 37008416. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) [2023-10-10 07:40:46,784][52050] Avg episode reward: [(0, '22.230'), (1, '22.920')] [2023-10-10 07:40:46,923][53252] Updated weights for policy 0, policy_version 72310 (0.0008) [2023-10-10 07:40:47,297][53252] Updated weights for policy 0, policy_version 72320 (0.0009) [2023-10-10 07:40:50,289][53268] Updated weights for policy 1, policy_version 72260 (0.0008) [2023-10-10 07:40:50,703][53268] Updated weights for policy 1, policy_version 72270 (0.0009) [2023-10-10 07:40:51,064][53268] Updated weights for policy 1, policy_version 72280 (0.0008) [2023-10-10 07:40:51,403][53252] Updated weights for policy 0, policy_version 72330 (0.0009) [2023-10-10 07:40:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 148078592. Throughput: 0: 1677.1, 1: 1680.7. Samples: 37029036. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:40:51,784][52050] Avg episode reward: [(0, '22.530'), (1, '22.310')] [2023-10-10 07:40:51,787][53252] Updated weights for policy 0, policy_version 72340 (0.0010) [2023-10-10 07:40:52,152][53252] Updated weights for policy 0, policy_version 72350 (0.0008) [2023-10-10 07:40:55,020][53268] Updated weights for policy 1, policy_version 72290 (0.0010) [2023-10-10 07:40:55,392][53268] Updated weights for policy 1, policy_version 72300 (0.0009) [2023-10-10 07:40:55,756][53268] Updated weights for policy 1, policy_version 72310 (0.0008) [2023-10-10 07:40:56,130][53268] Updated weights for policy 1, policy_version 72320 (0.0008) [2023-10-10 07:40:56,169][53252] Updated weights for policy 0, policy_version 72360 (0.0007) [2023-10-10 07:40:56,543][53252] Updated weights for policy 0, policy_version 72370 (0.0009) [2023-10-10 07:40:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 148144128. Throughput: 0: 1668.9, 1: 1661.2. Samples: 37048216. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:40:56,784][52050] Avg episode reward: [(0, '22.990'), (1, '22.390')] [2023-10-10 07:40:56,913][53252] Updated weights for policy 0, policy_version 72380 (0.0008) [2023-10-10 07:41:00,017][53268] Updated weights for policy 1, policy_version 72330 (0.0009) [2023-10-10 07:41:00,389][53268] Updated weights for policy 1, policy_version 72340 (0.0007) [2023-10-10 07:41:00,755][53268] Updated weights for policy 1, policy_version 72350 (0.0007) [2023-10-10 07:41:00,785][53252] Updated weights for policy 0, policy_version 72390 (0.0010) [2023-10-10 07:41:01,155][53252] Updated weights for policy 0, policy_version 72400 (0.0008) [2023-10-10 07:41:01,514][53252] Updated weights for policy 0, policy_version 72410 (0.0009) [2023-10-10 07:41:01,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148242432. Throughput: 0: 1682.3, 1: 1686.1. Samples: 37059274. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:01,784][52050] Avg episode reward: [(0, '22.850'), (1, '22.240')] [2023-10-10 07:41:04,696][53268] Updated weights for policy 1, policy_version 72360 (0.0008) [2023-10-10 07:41:05,066][53268] Updated weights for policy 1, policy_version 72370 (0.0008) [2023-10-10 07:41:05,432][53268] Updated weights for policy 1, policy_version 72380 (0.0009) [2023-10-10 07:41:05,554][53252] Updated weights for policy 0, policy_version 72420 (0.0008) [2023-10-10 07:41:05,932][53252] Updated weights for policy 0, policy_version 72430 (0.0009) [2023-10-10 07:41:06,304][53252] Updated weights for policy 0, policy_version 72440 (0.0009) [2023-10-10 07:41:06,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148307968. Throughput: 0: 1690.8, 1: 1682.2. Samples: 37079698. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:06,784][52050] Avg episode reward: [(0, '20.430'), (1, '21.160')] [2023-10-10 07:41:09,418][53268] Updated weights for policy 1, policy_version 72390 (0.0008) [2023-10-10 07:41:09,774][53268] Updated weights for policy 1, policy_version 72400 (0.0010) [2023-10-10 07:41:10,145][53268] Updated weights for policy 1, policy_version 72410 (0.0008) [2023-10-10 07:41:10,390][53252] Updated weights for policy 0, policy_version 72450 (0.0009) [2023-10-10 07:41:10,750][53252] Updated weights for policy 0, policy_version 72460 (0.0010) [2023-10-10 07:41:11,129][53252] Updated weights for policy 0, policy_version 72470 (0.0008) [2023-10-10 07:41:11,500][53252] Updated weights for policy 0, policy_version 72480 (0.0008) [2023-10-10 07:41:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148373504. Throughput: 0: 1668.6, 1: 1686.7. Samples: 37099220. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:11,784][52050] Avg episode reward: [(0, '21.510'), (1, '21.030')] [2023-10-10 07:41:14,140][53268] Updated weights for policy 1, policy_version 72420 (0.0009) [2023-10-10 07:41:14,510][53268] Updated weights for policy 1, policy_version 72430 (0.0008) [2023-10-10 07:41:14,881][53268] Updated weights for policy 1, policy_version 72440 (0.0009) [2023-10-10 07:41:15,667][53252] Updated weights for policy 0, policy_version 72490 (0.0009) [2023-10-10 07:41:16,051][53252] Updated weights for policy 0, policy_version 72500 (0.0009) [2023-10-10 07:41:16,420][53252] Updated weights for policy 0, policy_version 72510 (0.0007) [2023-10-10 07:41:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148439040. Throughput: 0: 1688.5, 1: 1703.1. Samples: 37110628. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:16,784][52050] Avg episode reward: [(0, '22.940'), (1, '20.880')] [2023-10-10 07:41:18,898][53268] Updated weights for policy 1, policy_version 72450 (0.0008) [2023-10-10 07:41:19,260][53268] Updated weights for policy 1, policy_version 72460 (0.0007) [2023-10-10 07:41:19,626][53268] Updated weights for policy 1, policy_version 72470 (0.0008) [2023-10-10 07:41:19,997][53268] Updated weights for policy 1, policy_version 72480 (0.0007) [2023-10-10 07:41:20,380][53252] Updated weights for policy 0, policy_version 72520 (0.0008) [2023-10-10 07:41:20,762][53252] Updated weights for policy 0, policy_version 72530 (0.0009) [2023-10-10 07:41:21,127][53252] Updated weights for policy 0, policy_version 72540 (0.0009) [2023-10-10 07:41:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148504576. Throughput: 0: 1686.3, 1: 1676.7. Samples: 37129922. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:21,784][52050] Avg episode reward: [(0, '20.970'), (1, '21.430')] [2023-10-10 07:41:24,019][53268] Updated weights for policy 1, policy_version 72490 (0.0008) [2023-10-10 07:41:24,391][53268] Updated weights for policy 1, policy_version 72500 (0.0007) [2023-10-10 07:41:24,753][53268] Updated weights for policy 1, policy_version 72510 (0.0008) [2023-10-10 07:41:25,103][53252] Updated weights for policy 0, policy_version 72550 (0.0008) [2023-10-10 07:41:25,464][53252] Updated weights for policy 0, policy_version 72560 (0.0007) [2023-10-10 07:41:25,836][53252] Updated weights for policy 0, policy_version 72570 (0.0008) [2023-10-10 07:41:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 148570112. Throughput: 0: 1663.6, 1: 1703.0. Samples: 37149766. 
Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:26,785][52050] Avg episode reward: [(0, '20.990'), (1, '22.020')] [2023-10-10 07:41:28,819][53268] Updated weights for policy 1, policy_version 72520 (0.0009) [2023-10-10 07:41:29,187][53268] Updated weights for policy 1, policy_version 72530 (0.0009) [2023-10-10 07:41:29,554][53268] Updated weights for policy 1, policy_version 72540 (0.0009) [2023-10-10 07:41:29,838][53252] Updated weights for policy 0, policy_version 72580 (0.0009) [2023-10-10 07:41:30,204][53252] Updated weights for policy 0, policy_version 72590 (0.0008) [2023-10-10 07:41:30,581][53252] Updated weights for policy 0, policy_version 72600 (0.0007) [2023-10-10 07:41:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 148635648. Throughput: 0: 1690.5, 1: 1696.0. Samples: 37160808. Policy #0 lag: (min: 31.0, avg: 45.3, max: 63.0) [2023-10-10 07:41:31,784][52050] Avg episode reward: [(0, '21.460'), (1, '21.920')] [2023-10-10 07:41:33,561][53268] Updated weights for policy 1, policy_version 72550 (0.0009) [2023-10-10 07:41:33,922][53268] Updated weights for policy 1, policy_version 72560 (0.0007) [2023-10-10 07:41:34,283][53268] Updated weights for policy 1, policy_version 72570 (0.0007) [2023-10-10 07:41:34,701][53252] Updated weights for policy 0, policy_version 72610 (0.0008) [2023-10-10 07:41:35,072][53252] Updated weights for policy 0, policy_version 72620 (0.0011) [2023-10-10 07:41:35,454][53252] Updated weights for policy 0, policy_version 72630 (0.0011) [2023-10-10 07:41:35,820][53252] Updated weights for policy 0, policy_version 72640 (0.0009) [2023-10-10 07:41:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 148701184. Throughput: 0: 1678.6, 1: 1679.0. Samples: 37180130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:41:36,784][52050] Avg episode reward: [(0, '21.590'), (1, '21.650')] [2023-10-10 07:41:38,453][53268] Updated weights for policy 1, policy_version 72580 (0.0009) [2023-10-10 07:41:38,831][53268] Updated weights for policy 1, policy_version 72590 (0.0008) [2023-10-10 07:41:39,203][53268] Updated weights for policy 1, policy_version 72600 (0.0008) [2023-10-10 07:41:39,852][53252] Updated weights for policy 0, policy_version 72650 (0.0007) [2023-10-10 07:41:40,221][53252] Updated weights for policy 0, policy_version 72660 (0.0007) [2023-10-10 07:41:40,597][53252] Updated weights for policy 0, policy_version 72670 (0.0007) [2023-10-10 07:41:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 148766720. Throughput: 0: 1677.1, 1: 1707.0. Samples: 37200498. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:41:41,784][52050] Avg episode reward: [(0, '21.750'), (1, '22.150')] [2023-10-10 07:41:43,128][53268] Updated weights for policy 1, policy_version 72610 (0.0008) [2023-10-10 07:41:43,501][53268] Updated weights for policy 1, policy_version 72620 (0.0008) [2023-10-10 07:41:43,877][53268] Updated weights for policy 1, policy_version 72630 (0.0008) [2023-10-10 07:41:44,244][53268] Updated weights for policy 1, policy_version 72640 (0.0007) [2023-10-10 07:41:44,727][53252] Updated weights for policy 0, policy_version 72680 (0.0009) [2023-10-10 07:41:45,103][53252] Updated weights for policy 0, policy_version 72690 (0.0010) [2023-10-10 07:41:45,468][53252] Updated weights for policy 0, policy_version 72700 (0.0009) [2023-10-10 07:41:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 148832256. Throughput: 0: 1689.5, 1: 1682.1. Samples: 37210998. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:41:46,784][52050] Avg episode reward: [(0, '20.630'), (1, '20.360')] [2023-10-10 07:41:48,220][53268] Updated weights for policy 1, policy_version 72650 (0.0009) [2023-10-10 07:41:48,583][53268] Updated weights for policy 1, policy_version 72660 (0.0008) [2023-10-10 07:41:48,956][53268] Updated weights for policy 1, policy_version 72670 (0.0007) [2023-10-10 07:41:49,649][53252] Updated weights for policy 0, policy_version 72710 (0.0009) [2023-10-10 07:41:50,019][53252] Updated weights for policy 0, policy_version 72720 (0.0008) [2023-10-10 07:41:50,395][53252] Updated weights for policy 0, policy_version 72730 (0.0009) [2023-10-10 07:41:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 148897792. Throughput: 0: 1661.4, 1: 1692.2. Samples: 37230610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:41:51,784][52050] Avg episode reward: [(0, '20.860'), (1, '20.930')] [2023-10-10 07:41:52,948][53268] Updated weights for policy 1, policy_version 72680 (0.0009) [2023-10-10 07:41:53,324][53268] Updated weights for policy 1, policy_version 72690 (0.0008) [2023-10-10 07:41:53,692][53268] Updated weights for policy 1, policy_version 72700 (0.0009) [2023-10-10 07:41:54,330][53252] Updated weights for policy 0, policy_version 72740 (0.0007) [2023-10-10 07:41:54,709][53252] Updated weights for policy 0, policy_version 72750 (0.0007) [2023-10-10 07:41:55,081][53252] Updated weights for policy 0, policy_version 72760 (0.0008) [2023-10-10 07:41:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 148963328. Throughput: 0: 1677.1, 1: 1702.0. Samples: 37251278. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:41:56,784][52050] Avg episode reward: [(0, '23.200'), (1, '21.560')] [2023-10-10 07:41:57,618][53268] Updated weights for policy 1, policy_version 72710 (0.0010) [2023-10-10 07:41:57,981][53268] Updated weights for policy 1, policy_version 72720 (0.0010) [2023-10-10 07:41:58,349][53268] Updated weights for policy 1, policy_version 72730 (0.0008) [2023-10-10 07:41:59,280][53252] Updated weights for policy 0, policy_version 72770 (0.0007) [2023-10-10 07:41:59,645][53252] Updated weights for policy 0, policy_version 72780 (0.0010) [2023-10-10 07:42:00,012][53252] Updated weights for policy 0, policy_version 72790 (0.0009) [2023-10-10 07:42:00,382][53252] Updated weights for policy 0, policy_version 72800 (0.0011) [2023-10-10 07:42:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149028864. Throughput: 0: 1679.5, 1: 1672.6. Samples: 37261470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:42:01,784][52050] Avg episode reward: [(0, '22.980'), (1, '21.970')] [2023-10-10 07:42:02,475][53268] Updated weights for policy 1, policy_version 72740 (0.0009) [2023-10-10 07:42:02,842][53268] Updated weights for policy 1, policy_version 72750 (0.0007) [2023-10-10 07:42:03,202][53268] Updated weights for policy 1, policy_version 72760 (0.0007) [2023-10-10 07:42:04,509][53252] Updated weights for policy 0, policy_version 72810 (0.0007) [2023-10-10 07:42:04,877][53252] Updated weights for policy 0, policy_version 72820 (0.0008) [2023-10-10 07:42:05,255][53252] Updated weights for policy 0, policy_version 72830 (0.0008) [2023-10-10 07:42:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149094400. Throughput: 0: 1660.6, 1: 1701.1. Samples: 37281198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:42:06,784][52050] Avg episode reward: [(0, '21.340'), (1, '22.070')] [2023-10-10 07:42:07,185][53268] Updated weights for policy 1, policy_version 72770 (0.0010) [2023-10-10 07:42:07,539][53268] Updated weights for policy 1, policy_version 72780 (0.0009) [2023-10-10 07:42:07,918][53268] Updated weights for policy 1, policy_version 72790 (0.0008) [2023-10-10 07:42:08,293][53268] Updated weights for policy 1, policy_version 72800 (0.0010) [2023-10-10 07:42:09,267][53252] Updated weights for policy 0, policy_version 72840 (0.0009) [2023-10-10 07:42:09,638][53252] Updated weights for policy 0, policy_version 72850 (0.0009) [2023-10-10 07:42:10,014][53252] Updated weights for policy 0, policy_version 72860 (0.0009) [2023-10-10 07:42:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149159936. Throughput: 0: 1680.1, 1: 1703.3. Samples: 37302022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:42:11,784][52050] Avg episode reward: [(0, '23.840'), (1, '23.090')] [2023-10-10 07:42:12,242][53268] Updated weights for policy 1, policy_version 72810 (0.0008) [2023-10-10 07:42:12,601][53268] Updated weights for policy 1, policy_version 72820 (0.0010) [2023-10-10 07:42:12,980][53268] Updated weights for policy 1, policy_version 72830 (0.0010) [2023-10-10 07:42:13,964][53252] Updated weights for policy 0, policy_version 72870 (0.0007) [2023-10-10 07:42:14,340][53252] Updated weights for policy 0, policy_version 72880 (0.0008) [2023-10-10 07:42:14,719][53252] Updated weights for policy 0, policy_version 72890 (0.0007) [2023-10-10 07:42:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149225472. Throughput: 0: 1666.1, 1: 1688.0. Samples: 37311744. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:42:16,784][52050] Avg episode reward: [(0, '21.590'), (1, '23.310')] [2023-10-10 07:42:17,051][53268] Updated weights for policy 1, policy_version 72840 (0.0011) [2023-10-10 07:42:17,424][53268] Updated weights for policy 1, policy_version 72850 (0.0008) [2023-10-10 07:42:17,786][53268] Updated weights for policy 1, policy_version 72860 (0.0009) [2023-10-10 07:42:18,891][53252] Updated weights for policy 0, policy_version 72900 (0.0009) [2023-10-10 07:42:19,267][53252] Updated weights for policy 0, policy_version 72910 (0.0007) [2023-10-10 07:42:19,645][53252] Updated weights for policy 0, policy_version 72920 (0.0007) [2023-10-10 07:42:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149291008. Throughput: 0: 1666.9, 1: 1701.7. Samples: 37331718. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:42:21,784][52050] Avg episode reward: [(0, '22.090'), (1, '22.880')] [2023-10-10 07:42:21,987][53268] Updated weights for policy 1, policy_version 72870 (0.0011) [2023-10-10 07:42:22,363][53268] Updated weights for policy 1, policy_version 72880 (0.0010) [2023-10-10 07:42:22,736][53268] Updated weights for policy 1, policy_version 72890 (0.0009) [2023-10-10 07:42:23,812][53252] Updated weights for policy 0, policy_version 72930 (0.0008) [2023-10-10 07:42:24,177][53252] Updated weights for policy 0, policy_version 72940 (0.0010) [2023-10-10 07:42:24,552][53252] Updated weights for policy 0, policy_version 72950 (0.0007) [2023-10-10 07:42:24,913][53252] Updated weights for policy 0, policy_version 72960 (0.0008) [2023-10-10 07:42:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149356544. Throughput: 0: 1677.5, 1: 1698.9. Samples: 37352436. 
Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:26,784][52050] Avg episode reward: [(0, '21.310'), (1, '21.750')] [2023-10-10 07:42:26,818][53268] Updated weights for policy 1, policy_version 72900 (0.0008) [2023-10-10 07:42:27,213][53268] Updated weights for policy 1, policy_version 72910 (0.0008) [2023-10-10 07:42:27,589][53268] Updated weights for policy 1, policy_version 72920 (0.0008) [2023-10-10 07:42:28,893][53252] Updated weights for policy 0, policy_version 72970 (0.0008) [2023-10-10 07:42:29,266][53252] Updated weights for policy 0, policy_version 72980 (0.0007) [2023-10-10 07:42:29,642][53252] Updated weights for policy 0, policy_version 72990 (0.0007) [2023-10-10 07:42:31,645][53268] Updated weights for policy 1, policy_version 72930 (0.0010) [2023-10-10 07:42:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149422080. Throughput: 0: 1659.6, 1: 1690.9. Samples: 37361774. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:31,784][52050] Avg episode reward: [(0, '22.310'), (1, '22.610')] [2023-10-10 07:42:32,008][53268] Updated weights for policy 1, policy_version 72940 (0.0009) [2023-10-10 07:42:32,370][53268] Updated weights for policy 1, policy_version 72950 (0.0009) [2023-10-10 07:42:32,734][53268] Updated weights for policy 1, policy_version 72960 (0.0007) [2023-10-10 07:42:33,705][53252] Updated weights for policy 0, policy_version 73000 (0.0008) [2023-10-10 07:42:34,082][53252] Updated weights for policy 0, policy_version 73010 (0.0009) [2023-10-10 07:42:34,447][53252] Updated weights for policy 0, policy_version 73020 (0.0008) [2023-10-10 07:42:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 149487616. Throughput: 0: 1678.2, 1: 1693.7. Samples: 37382344. 
Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:36,785][52050] Avg episode reward: [(0, '20.850'), (1, '22.810')] [2023-10-10 07:42:36,844][53268] Updated weights for policy 1, policy_version 72970 (0.0008) [2023-10-10 07:42:37,215][53268] Updated weights for policy 1, policy_version 72980 (0.0008) [2023-10-10 07:42:37,583][53268] Updated weights for policy 1, policy_version 72990 (0.0009) [2023-10-10 07:42:38,505][53252] Updated weights for policy 0, policy_version 73030 (0.0009) [2023-10-10 07:42:38,872][53252] Updated weights for policy 0, policy_version 73040 (0.0007) [2023-10-10 07:42:39,242][53252] Updated weights for policy 0, policy_version 73050 (0.0009) [2023-10-10 07:42:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 149553152. Throughput: 0: 1681.6, 1: 1687.7. Samples: 37402896. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:41,784][52050] Avg episode reward: [(0, '20.160'), (1, '21.150')] [2023-10-10 07:42:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000073056_74809344.pth... [2023-10-10 07:42:41,799][53268] Updated weights for policy 1, policy_version 73000 (0.0009) [2023-10-10 07:42:41,834][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000071488_73203712.pth [2023-10-10 07:42:42,160][53268] Updated weights for policy 1, policy_version 73010 (0.0009) [2023-10-10 07:42:42,523][53268] Updated weights for policy 1, policy_version 73020 (0.0010) [2023-10-10 07:42:42,668][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000073024_74776576.pth... 
[2023-10-10 07:42:42,697][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000071424_73138176.pth [2023-10-10 07:42:43,265][53252] Updated weights for policy 0, policy_version 73060 (0.0009) [2023-10-10 07:42:43,637][53252] Updated weights for policy 0, policy_version 73070 (0.0009) [2023-10-10 07:42:44,011][53252] Updated weights for policy 0, policy_version 73080 (0.0008) [2023-10-10 07:42:46,513][53268] Updated weights for policy 1, policy_version 73030 (0.0008) [2023-10-10 07:42:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149618688. Throughput: 0: 1657.5, 1: 1689.6. Samples: 37412092. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:46,784][52050] Avg episode reward: [(0, '21.350'), (1, '22.160')] [2023-10-10 07:42:46,881][53268] Updated weights for policy 1, policy_version 73040 (0.0009) [2023-10-10 07:42:47,242][53268] Updated weights for policy 1, policy_version 73050 (0.0009) [2023-10-10 07:42:48,162][53252] Updated weights for policy 0, policy_version 73090 (0.0007) [2023-10-10 07:42:48,534][53252] Updated weights for policy 0, policy_version 73100 (0.0008) [2023-10-10 07:42:48,902][53252] Updated weights for policy 0, policy_version 73110 (0.0008) [2023-10-10 07:42:49,276][53252] Updated weights for policy 0, policy_version 73120 (0.0008) [2023-10-10 07:42:51,402][53268] Updated weights for policy 1, policy_version 73060 (0.0011) [2023-10-10 07:42:51,763][53268] Updated weights for policy 1, policy_version 73070 (0.0010) [2023-10-10 07:42:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149684224. Throughput: 0: 1684.3, 1: 1686.5. Samples: 37432886. 
Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:51,784][52050] Avg episode reward: [(0, '21.530'), (1, '23.100')] [2023-10-10 07:42:52,131][53268] Updated weights for policy 1, policy_version 73080 (0.0008) [2023-10-10 07:42:53,254][53252] Updated weights for policy 0, policy_version 73130 (0.0007) [2023-10-10 07:42:53,631][53252] Updated weights for policy 0, policy_version 73140 (0.0008) [2023-10-10 07:42:53,996][53252] Updated weights for policy 0, policy_version 73150 (0.0008) [2023-10-10 07:42:56,088][53268] Updated weights for policy 1, policy_version 73090 (0.0009) [2023-10-10 07:42:56,455][53268] Updated weights for policy 1, policy_version 73100 (0.0010) [2023-10-10 07:42:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149749760. Throughput: 0: 1691.3, 1: 1676.6. Samples: 37453576. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:42:56,784][52050] Avg episode reward: [(0, '22.030'), (1, '22.360')] [2023-10-10 07:42:56,821][53268] Updated weights for policy 1, policy_version 73110 (0.0009) [2023-10-10 07:42:57,191][53268] Updated weights for policy 1, policy_version 73120 (0.0011) [2023-10-10 07:42:57,904][53252] Updated weights for policy 0, policy_version 73160 (0.0008) [2023-10-10 07:42:58,277][53252] Updated weights for policy 0, policy_version 73170 (0.0009) [2023-10-10 07:42:58,648][53252] Updated weights for policy 0, policy_version 73180 (0.0009) [2023-10-10 07:43:01,338][53268] Updated weights for policy 1, policy_version 73130 (0.0008) [2023-10-10 07:43:01,711][53268] Updated weights for policy 1, policy_version 73140 (0.0007) [2023-10-10 07:43:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149815296. Throughput: 0: 1678.1, 1: 1679.7. Samples: 37462848. 
Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:43:01,784][52050] Avg episode reward: [(0, '23.340'), (1, '21.770')] [2023-10-10 07:43:02,071][53268] Updated weights for policy 1, policy_version 73150 (0.0007) [2023-10-10 07:43:02,766][53252] Updated weights for policy 0, policy_version 73190 (0.0009) [2023-10-10 07:43:03,140][53252] Updated weights for policy 0, policy_version 73200 (0.0007) [2023-10-10 07:43:03,509][53252] Updated weights for policy 0, policy_version 73210 (0.0007) [2023-10-10 07:43:06,129][53268] Updated weights for policy 1, policy_version 73160 (0.0008) [2023-10-10 07:43:06,494][53268] Updated weights for policy 1, policy_version 73170 (0.0007) [2023-10-10 07:43:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 149880832. Throughput: 0: 1692.0, 1: 1680.5. Samples: 37483482. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:43:06,784][52050] Avg episode reward: [(0, '23.620'), (1, '22.430')] [2023-10-10 07:43:06,854][53268] Updated weights for policy 1, policy_version 73180 (0.0007) [2023-10-10 07:43:07,515][53252] Updated weights for policy 0, policy_version 73220 (0.0008) [2023-10-10 07:43:07,888][53252] Updated weights for policy 0, policy_version 73230 (0.0007) [2023-10-10 07:43:08,265][53252] Updated weights for policy 0, policy_version 73240 (0.0008) [2023-10-10 07:43:10,917][53268] Updated weights for policy 1, policy_version 73190 (0.0007) [2023-10-10 07:43:11,284][53268] Updated weights for policy 1, policy_version 73200 (0.0007) [2023-10-10 07:43:11,655][53268] Updated weights for policy 1, policy_version 73210 (0.0007) [2023-10-10 07:43:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 149946368. Throughput: 0: 1693.8, 1: 1671.7. Samples: 37503882. 
Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-10 07:43:11,784][52050] Avg episode reward: [(0, '22.280'), (1, '19.840')] [2023-10-10 07:43:12,255][53252] Updated weights for policy 0, policy_version 73250 (0.0008) [2023-10-10 07:43:12,620][53252] Updated weights for policy 0, policy_version 73260 (0.0009) [2023-10-10 07:43:12,988][53252] Updated weights for policy 0, policy_version 73270 (0.0009) [2023-10-10 07:43:13,353][53252] Updated weights for policy 0, policy_version 73280 (0.0009) [2023-10-10 07:43:15,531][53268] Updated weights for policy 1, policy_version 73220 (0.0008) [2023-10-10 07:43:15,902][53268] Updated weights for policy 1, policy_version 73230 (0.0009) [2023-10-10 07:43:16,265][53268] Updated weights for policy 1, policy_version 73240 (0.0008) [2023-10-10 07:43:16,783][52050] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150044672. Throughput: 0: 1684.8, 1: 1691.4. Samples: 37513706. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:16,785][52050] Avg episode reward: [(0, '22.130'), (1, '20.530')] [2023-10-10 07:43:17,377][53252] Updated weights for policy 0, policy_version 73290 (0.0008) [2023-10-10 07:43:17,751][53252] Updated weights for policy 0, policy_version 73300 (0.0009) [2023-10-10 07:43:18,130][53252] Updated weights for policy 0, policy_version 73310 (0.0008) [2023-10-10 07:43:20,481][53268] Updated weights for policy 1, policy_version 73250 (0.0010) [2023-10-10 07:43:20,853][53268] Updated weights for policy 1, policy_version 73260 (0.0009) [2023-10-10 07:43:21,220][53268] Updated weights for policy 1, policy_version 73270 (0.0010) [2023-10-10 07:43:21,573][53268] Updated weights for policy 1, policy_version 73280 (0.0010) [2023-10-10 07:43:21,783][52050] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150110208. Throughput: 0: 1689.9, 1: 1686.2. Samples: 37534268. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:21,784][52050] Avg episode reward: [(0, '20.440'), (1, '20.630')] [2023-10-10 07:43:22,188][53252] Updated weights for policy 0, policy_version 73320 (0.0008) [2023-10-10 07:43:22,564][53252] Updated weights for policy 0, policy_version 73330 (0.0007) [2023-10-10 07:43:22,938][53252] Updated weights for policy 0, policy_version 73340 (0.0009) [2023-10-10 07:43:25,737][53268] Updated weights for policy 1, policy_version 73290 (0.0009) [2023-10-10 07:43:26,103][53268] Updated weights for policy 1, policy_version 73300 (0.0010) [2023-10-10 07:43:26,474][53268] Updated weights for policy 1, policy_version 73310 (0.0009) [2023-10-10 07:43:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150175744. Throughput: 0: 1698.9, 1: 1667.8. Samples: 37554398. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:26,784][52050] Avg episode reward: [(0, '21.700'), (1, '20.260')] [2023-10-10 07:43:26,898][53252] Updated weights for policy 0, policy_version 73350 (0.0009) [2023-10-10 07:43:27,282][53252] Updated weights for policy 0, policy_version 73360 (0.0010) [2023-10-10 07:43:27,649][53252] Updated weights for policy 0, policy_version 73370 (0.0010) [2023-10-10 07:43:30,471][53268] Updated weights for policy 1, policy_version 73320 (0.0010) [2023-10-10 07:43:30,847][53268] Updated weights for policy 1, policy_version 73330 (0.0008) [2023-10-10 07:43:31,213][53268] Updated weights for policy 1, policy_version 73340 (0.0008) [2023-10-10 07:43:31,782][53252] Updated weights for policy 0, policy_version 73380 (0.0010) [2023-10-10 07:43:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 150241280. Throughput: 0: 1695.2, 1: 1685.2. Samples: 37564210. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:31,784][52050] Avg episode reward: [(0, '21.730'), (1, '20.060')] [2023-10-10 07:43:32,148][53252] Updated weights for policy 0, policy_version 73390 (0.0008) [2023-10-10 07:43:32,523][53252] Updated weights for policy 0, policy_version 73400 (0.0009) [2023-10-10 07:43:35,249][53268] Updated weights for policy 1, policy_version 73350 (0.0009) [2023-10-10 07:43:35,617][53268] Updated weights for policy 1, policy_version 73360 (0.0010) [2023-10-10 07:43:35,990][53268] Updated weights for policy 1, policy_version 73370 (0.0008) [2023-10-10 07:43:36,627][53252] Updated weights for policy 0, policy_version 73410 (0.0009) [2023-10-10 07:43:36,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150306816. Throughput: 0: 1694.3, 1: 1680.3. Samples: 37584744. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:36,785][52050] Avg episode reward: [(0, '21.740'), (1, '22.780')] [2023-10-10 07:43:36,987][53252] Updated weights for policy 0, policy_version 73420 (0.0007) [2023-10-10 07:43:37,355][53252] Updated weights for policy 0, policy_version 73430 (0.0007) [2023-10-10 07:43:37,721][53252] Updated weights for policy 0, policy_version 73440 (0.0007) [2023-10-10 07:43:40,068][53268] Updated weights for policy 1, policy_version 73380 (0.0008) [2023-10-10 07:43:40,434][53268] Updated weights for policy 1, policy_version 73390 (0.0010) [2023-10-10 07:43:40,792][53268] Updated weights for policy 1, policy_version 73400 (0.0009) [2023-10-10 07:43:41,708][53252] Updated weights for policy 0, policy_version 73450 (0.0008) [2023-10-10 07:43:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150372352. Throughput: 0: 1692.0, 1: 1660.0. Samples: 37604416. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:41,784][52050] Avg episode reward: [(0, '21.660'), (1, '21.390')] [2023-10-10 07:43:42,077][53252] Updated weights for policy 0, policy_version 73460 (0.0007) [2023-10-10 07:43:42,452][53252] Updated weights for policy 0, policy_version 73470 (0.0008) [2023-10-10 07:43:44,786][53268] Updated weights for policy 1, policy_version 73410 (0.0010) [2023-10-10 07:43:45,153][53268] Updated weights for policy 1, policy_version 73420 (0.0009) [2023-10-10 07:43:45,527][53268] Updated weights for policy 1, policy_version 73430 (0.0009) [2023-10-10 07:43:45,896][53268] Updated weights for policy 1, policy_version 73440 (0.0009) [2023-10-10 07:43:46,539][53252] Updated weights for policy 0, policy_version 73480 (0.0008) [2023-10-10 07:43:46,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150437888. Throughput: 0: 1689.8, 1: 1688.5. Samples: 37614874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:46,784][52050] Avg episode reward: [(0, '22.110'), (1, '22.100')] [2023-10-10 07:43:46,919][53252] Updated weights for policy 0, policy_version 73490 (0.0007) [2023-10-10 07:43:47,290][53252] Updated weights for policy 0, policy_version 73500 (0.0007) [2023-10-10 07:43:50,073][53268] Updated weights for policy 1, policy_version 73450 (0.0008) [2023-10-10 07:43:50,432][53268] Updated weights for policy 1, policy_version 73460 (0.0008) [2023-10-10 07:43:50,803][53268] Updated weights for policy 1, policy_version 73470 (0.0008) [2023-10-10 07:43:51,211][53252] Updated weights for policy 0, policy_version 73510 (0.0008) [2023-10-10 07:43:51,592][53252] Updated weights for policy 0, policy_version 73520 (0.0009) [2023-10-10 07:43:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150503424. Throughput: 0: 1693.4, 1: 1681.6. Samples: 37635358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:51,784][52050] Avg episode reward: [(0, '22.540'), (1, '22.830')] [2023-10-10 07:43:51,950][53252] Updated weights for policy 0, policy_version 73530 (0.0008) [2023-10-10 07:43:54,817][53268] Updated weights for policy 1, policy_version 73480 (0.0009) [2023-10-10 07:43:55,182][53268] Updated weights for policy 1, policy_version 73490 (0.0008) [2023-10-10 07:43:55,552][53268] Updated weights for policy 1, policy_version 73500 (0.0008) [2023-10-10 07:43:55,971][53252] Updated weights for policy 0, policy_version 73540 (0.0008) [2023-10-10 07:43:56,332][53252] Updated weights for policy 0, policy_version 73550 (0.0011) [2023-10-10 07:43:56,711][53252] Updated weights for policy 0, policy_version 73560 (0.0010) [2023-10-10 07:43:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150568960. Throughput: 0: 1682.9, 1: 1674.7. Samples: 37654970. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:43:56,784][52050] Avg episode reward: [(0, '21.860'), (1, '19.990')] [2023-10-10 07:43:59,601][53268] Updated weights for policy 1, policy_version 73510 (0.0008) [2023-10-10 07:43:59,964][53268] Updated weights for policy 1, policy_version 73520 (0.0010) [2023-10-10 07:44:00,333][53268] Updated weights for policy 1, policy_version 73530 (0.0011) [2023-10-10 07:44:00,775][53252] Updated weights for policy 0, policy_version 73570 (0.0008) [2023-10-10 07:44:01,143][53252] Updated weights for policy 0, policy_version 73580 (0.0008) [2023-10-10 07:44:01,508][53252] Updated weights for policy 0, policy_version 73590 (0.0007) [2023-10-10 07:44:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150634496. Throughput: 0: 1697.1, 1: 1688.2. Samples: 37666044. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:44:01,784][52050] Avg episode reward: [(0, '19.990'), (1, '20.360')] [2023-10-10 07:44:01,876][53252] Updated weights for policy 0, policy_version 73600 (0.0008) [2023-10-10 07:44:04,593][53268] Updated weights for policy 1, policy_version 73540 (0.0010) [2023-10-10 07:44:04,970][53268] Updated weights for policy 1, policy_version 73550 (0.0011) [2023-10-10 07:44:05,340][53268] Updated weights for policy 1, policy_version 73560 (0.0009) [2023-10-10 07:44:05,913][53252] Updated weights for policy 0, policy_version 73610 (0.0007) [2023-10-10 07:44:06,295][53252] Updated weights for policy 0, policy_version 73620 (0.0010) [2023-10-10 07:44:06,667][53252] Updated weights for policy 0, policy_version 73630 (0.0008) [2023-10-10 07:44:06,783][52050] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 150732800. Throughput: 0: 1701.9, 1: 1671.1. Samples: 37686050. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:06,784][52050] Avg episode reward: [(0, '21.400'), (1, '20.900')] [2023-10-10 07:44:09,316][53268] Updated weights for policy 1, policy_version 73570 (0.0009) [2023-10-10 07:44:09,672][53268] Updated weights for policy 1, policy_version 73580 (0.0008) [2023-10-10 07:44:10,042][53268] Updated weights for policy 1, policy_version 73590 (0.0008) [2023-10-10 07:44:10,404][53268] Updated weights for policy 1, policy_version 73600 (0.0009) [2023-10-10 07:44:10,727][53252] Updated weights for policy 0, policy_version 73640 (0.0007) [2023-10-10 07:44:11,104][53252] Updated weights for policy 0, policy_version 73650 (0.0007) [2023-10-10 07:44:11,478][53252] Updated weights for policy 0, policy_version 73660 (0.0011) [2023-10-10 07:44:11,783][52050] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 150798336. Throughput: 0: 1670.6, 1: 1683.6. Samples: 37705338. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:11,784][52050] Avg episode reward: [(0, '21.920'), (1, '20.500')] [2023-10-10 07:44:14,406][53268] Updated weights for policy 1, policy_version 73610 (0.0009) [2023-10-10 07:44:14,774][53268] Updated weights for policy 1, policy_version 73620 (0.0009) [2023-10-10 07:44:15,137][53268] Updated weights for policy 1, policy_version 73630 (0.0009) [2023-10-10 07:44:15,520][53252] Updated weights for policy 0, policy_version 73670 (0.0010) [2023-10-10 07:44:15,897][53252] Updated weights for policy 0, policy_version 73680 (0.0007) [2023-10-10 07:44:16,260][53252] Updated weights for policy 0, policy_version 73690 (0.0007) [2023-10-10 07:44:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 150863872. Throughput: 0: 1694.8, 1: 1690.0. Samples: 37716530. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:16,784][52050] Avg episode reward: [(0, '22.540'), (1, '20.160')] [2023-10-10 07:44:19,322][53268] Updated weights for policy 1, policy_version 73640 (0.0008) [2023-10-10 07:44:19,685][53268] Updated weights for policy 1, policy_version 73650 (0.0011) [2023-10-10 07:44:20,050][53268] Updated weights for policy 1, policy_version 73660 (0.0008) [2023-10-10 07:44:20,426][53252] Updated weights for policy 0, policy_version 73700 (0.0008) [2023-10-10 07:44:20,794][53252] Updated weights for policy 0, policy_version 73710 (0.0008) [2023-10-10 07:44:21,174][53252] Updated weights for policy 0, policy_version 73720 (0.0008) [2023-10-10 07:44:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150929408. Throughput: 0: 1693.1, 1: 1666.1. Samples: 37735908. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:21,784][52050] Avg episode reward: [(0, '23.750'), (1, '23.500')] [2023-10-10 07:44:24,197][53268] Updated weights for policy 1, policy_version 73670 (0.0008) [2023-10-10 07:44:24,562][53268] Updated weights for policy 1, policy_version 73680 (0.0010) [2023-10-10 07:44:24,929][53268] Updated weights for policy 1, policy_version 73690 (0.0009) [2023-10-10 07:44:25,371][53252] Updated weights for policy 0, policy_version 73730 (0.0009) [2023-10-10 07:44:25,744][53252] Updated weights for policy 0, policy_version 73740 (0.0007) [2023-10-10 07:44:26,111][53252] Updated weights for policy 0, policy_version 73750 (0.0008) [2023-10-10 07:44:26,487][53252] Updated weights for policy 0, policy_version 73760 (0.0007) [2023-10-10 07:44:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 150994944. Throughput: 0: 1664.0, 1: 1693.9. Samples: 37755522. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:26,784][52050] Avg episode reward: [(0, '22.340'), (1, '22.690')] [2023-10-10 07:44:29,008][53268] Updated weights for policy 1, policy_version 73700 (0.0010) [2023-10-10 07:44:29,374][53268] Updated weights for policy 1, policy_version 73710 (0.0008) [2023-10-10 07:44:29,732][53268] Updated weights for policy 1, policy_version 73720 (0.0010) [2023-10-10 07:44:30,497][53252] Updated weights for policy 0, policy_version 73770 (0.0008) [2023-10-10 07:44:30,880][53252] Updated weights for policy 0, policy_version 73780 (0.0007) [2023-10-10 07:44:31,257][53252] Updated weights for policy 0, policy_version 73790 (0.0007) [2023-10-10 07:44:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151060480. Throughput: 0: 1693.0, 1: 1682.6. Samples: 37766774. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:31,784][52050] Avg episode reward: [(0, '21.110'), (1, '22.950')] [2023-10-10 07:44:33,660][53268] Updated weights for policy 1, policy_version 73730 (0.0010) [2023-10-10 07:44:34,020][53268] Updated weights for policy 1, policy_version 73740 (0.0009) [2023-10-10 07:44:34,389][53268] Updated weights for policy 1, policy_version 73750 (0.0010) [2023-10-10 07:44:34,752][53268] Updated weights for policy 1, policy_version 73760 (0.0008) [2023-10-10 07:44:35,180][53252] Updated weights for policy 0, policy_version 73800 (0.0008) [2023-10-10 07:44:35,559][53252] Updated weights for policy 0, policy_version 73810 (0.0007) [2023-10-10 07:44:35,937][53252] Updated weights for policy 0, policy_version 73820 (0.0009) [2023-10-10 07:44:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 151126016. Throughput: 0: 1677.8, 1: 1668.5. Samples: 37785944. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:36,784][52050] Avg episode reward: [(0, '20.120'), (1, '21.820')] [2023-10-10 07:44:38,932][53268] Updated weights for policy 1, policy_version 73770 (0.0009) [2023-10-10 07:44:39,297][53268] Updated weights for policy 1, policy_version 73780 (0.0009) [2023-10-10 07:44:39,672][53268] Updated weights for policy 1, policy_version 73790 (0.0008) [2023-10-10 07:44:39,893][53252] Updated weights for policy 0, policy_version 73830 (0.0008) [2023-10-10 07:44:40,267][53252] Updated weights for policy 0, policy_version 73840 (0.0008) [2023-10-10 07:44:40,636][53252] Updated weights for policy 0, policy_version 73850 (0.0009) [2023-10-10 07:44:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 151191552. Throughput: 0: 1671.4, 1: 1680.7. Samples: 37805816. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:41,784][52050] Avg episode reward: [(0, '20.270'), (1, '22.750')] [2023-10-10 07:44:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000073792_75563008.pth... [2023-10-10 07:44:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000073856_75628544.pth... [2023-10-10 07:44:41,828][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000072288_74022912.pth [2023-10-10 07:44:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000072224_73957376.pth [2023-10-10 07:44:43,711][53268] Updated weights for policy 1, policy_version 73800 (0.0008) [2023-10-10 07:44:44,070][53268] Updated weights for policy 1, policy_version 73810 (0.0010) [2023-10-10 07:44:44,448][53268] Updated weights for policy 1, policy_version 73820 (0.0007) [2023-10-10 07:44:44,606][53252] Updated weights for policy 0, policy_version 73860 (0.0011) [2023-10-10 07:44:44,983][53252] Updated weights for policy 0, policy_version 73870 (0.0009) [2023-10-10 07:44:45,350][53252] Updated weights for policy 0, policy_version 73880 (0.0008) [2023-10-10 07:44:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151257088. Throughput: 0: 1692.1, 1: 1660.8. Samples: 37816922. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-10 07:44:46,784][52050] Avg episode reward: [(0, '20.910'), (1, '21.040')] [2023-10-10 07:44:48,392][53268] Updated weights for policy 1, policy_version 73830 (0.0007) [2023-10-10 07:44:48,755][53268] Updated weights for policy 1, policy_version 73840 (0.0008) [2023-10-10 07:44:49,127][53268] Updated weights for policy 1, policy_version 73850 (0.0009) [2023-10-10 07:44:49,360][53252] Updated weights for policy 0, policy_version 73890 (0.0008) [2023-10-10 07:44:49,736][53252] Updated weights for policy 0, policy_version 73900 (0.0009) [2023-10-10 07:44:50,099][53252] Updated weights for policy 0, policy_version 73910 (0.0007) [2023-10-10 07:44:50,471][53252] Updated weights for policy 0, policy_version 73920 (0.0008) [2023-10-10 07:44:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151322624. Throughput: 0: 1666.4, 1: 1674.0. Samples: 37836368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:44:51,784][52050] Avg episode reward: [(0, '20.710'), (1, '21.650')] [2023-10-10 07:44:53,351][53268] Updated weights for policy 1, policy_version 73860 (0.0009) [2023-10-10 07:44:53,761][53268] Updated weights for policy 1, policy_version 73870 (0.0008) [2023-10-10 07:44:54,130][53268] Updated weights for policy 1, policy_version 73880 (0.0010) [2023-10-10 07:44:54,434][53252] Updated weights for policy 0, policy_version 73930 (0.0008) [2023-10-10 07:44:54,796][53252] Updated weights for policy 0, policy_version 73940 (0.0008) [2023-10-10 07:44:55,176][53252] Updated weights for policy 0, policy_version 73950 (0.0010) [2023-10-10 07:44:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151388160. Throughput: 0: 1694.8, 1: 1677.5. Samples: 37857092. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:44:56,785][52050] Avg episode reward: [(0, '21.320'), (1, '21.790')] [2023-10-10 07:44:58,129][53268] Updated weights for policy 1, policy_version 73890 (0.0009) [2023-10-10 07:44:58,496][53268] Updated weights for policy 1, policy_version 73900 (0.0008) [2023-10-10 07:44:58,863][53268] Updated weights for policy 1, policy_version 73910 (0.0007) [2023-10-10 07:44:59,231][53268] Updated weights for policy 1, policy_version 73920 (0.0007) [2023-10-10 07:44:59,255][53252] Updated weights for policy 0, policy_version 73960 (0.0010) [2023-10-10 07:44:59,634][53252] Updated weights for policy 0, policy_version 73970 (0.0009) [2023-10-10 07:45:00,008][53252] Updated weights for policy 0, policy_version 73980 (0.0008) [2023-10-10 07:45:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 151453696. Throughput: 0: 1693.3, 1: 1654.7. Samples: 37867192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:01,784][52050] Avg episode reward: [(0, '23.610'), (1, '21.700')] [2023-10-10 07:45:03,344][53268] Updated weights for policy 1, policy_version 73930 (0.0011) [2023-10-10 07:45:03,714][53268] Updated weights for policy 1, policy_version 73940 (0.0010) [2023-10-10 07:45:03,862][53252] Updated weights for policy 0, policy_version 73990 (0.0009) [2023-10-10 07:45:04,076][53268] Updated weights for policy 1, policy_version 73950 (0.0008) [2023-10-10 07:45:04,224][53252] Updated weights for policy 0, policy_version 74000 (0.0007) [2023-10-10 07:45:04,602][53252] Updated weights for policy 0, policy_version 74010 (0.0007) [2023-10-10 07:45:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 151519232. Throughput: 0: 1684.4, 1: 1677.3. Samples: 37887186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:06,784][52050] Avg episode reward: [(0, '22.880'), (1, '22.350')] [2023-10-10 07:45:07,990][53268] Updated weights for policy 1, policy_version 73960 (0.0010) [2023-10-10 07:45:08,346][53268] Updated weights for policy 1, policy_version 73970 (0.0008) [2023-10-10 07:45:08,711][53268] Updated weights for policy 1, policy_version 73980 (0.0011) [2023-10-10 07:45:08,814][53252] Updated weights for policy 0, policy_version 74020 (0.0009) [2023-10-10 07:45:09,183][53252] Updated weights for policy 0, policy_version 74030 (0.0008) [2023-10-10 07:45:09,553][53252] Updated weights for policy 0, policy_version 74040 (0.0009) [2023-10-10 07:45:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 151584768. Throughput: 0: 1703.7, 1: 1680.1. Samples: 37907794. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:11,785][52050] Avg episode reward: [(0, '21.670'), (1, '21.690')] [2023-10-10 07:45:12,850][53268] Updated weights for policy 1, policy_version 73990 (0.0010) [2023-10-10 07:45:13,224][53268] Updated weights for policy 1, policy_version 74000 (0.0008) [2023-10-10 07:45:13,582][53268] Updated weights for policy 1, policy_version 74010 (0.0011) [2023-10-10 07:45:13,608][53252] Updated weights for policy 0, policy_version 74050 (0.0008) [2023-10-10 07:45:13,975][53252] Updated weights for policy 0, policy_version 74060 (0.0009) [2023-10-10 07:45:14,351][53252] Updated weights for policy 0, policy_version 74070 (0.0010) [2023-10-10 07:45:14,722][53252] Updated weights for policy 0, policy_version 74080 (0.0010) [2023-10-10 07:45:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 151650304. Throughput: 0: 1686.0, 1: 1660.9. Samples: 37917384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:16,784][52050] Avg episode reward: [(0, '23.770'), (1, '21.610')] [2023-10-10 07:45:17,624][53268] Updated weights for policy 1, policy_version 74020 (0.0009) [2023-10-10 07:45:17,994][53268] Updated weights for policy 1, policy_version 74030 (0.0010) [2023-10-10 07:45:18,364][53268] Updated weights for policy 1, policy_version 74040 (0.0009) [2023-10-10 07:45:18,729][53252] Updated weights for policy 0, policy_version 74090 (0.0009) [2023-10-10 07:45:19,095][53252] Updated weights for policy 0, policy_version 74100 (0.0008) [2023-10-10 07:45:19,462][53252] Updated weights for policy 0, policy_version 74110 (0.0008) [2023-10-10 07:45:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 151715840. Throughput: 0: 1689.0, 1: 1681.0. Samples: 37937592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:21,784][52050] Avg episode reward: [(0, '22.200'), (1, '21.560')] [2023-10-10 07:45:22,496][53268] Updated weights for policy 1, policy_version 74050 (0.0008) [2023-10-10 07:45:22,867][53268] Updated weights for policy 1, policy_version 74060 (0.0008) [2023-10-10 07:45:23,229][53268] Updated weights for policy 1, policy_version 74070 (0.0009) [2023-10-10 07:45:23,527][53252] Updated weights for policy 0, policy_version 74120 (0.0007) [2023-10-10 07:45:23,596][53268] Updated weights for policy 1, policy_version 74080 (0.0008) [2023-10-10 07:45:23,907][53252] Updated weights for policy 0, policy_version 74130 (0.0007) [2023-10-10 07:45:24,268][53252] Updated weights for policy 0, policy_version 74140 (0.0007) [2023-10-10 07:45:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151781376. Throughput: 0: 1705.1, 1: 1682.3. Samples: 37958250. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:26,784][52050] Avg episode reward: [(0, '20.850'), (1, '22.540')] [2023-10-10 07:45:27,771][53268] Updated weights for policy 1, policy_version 74090 (0.0009) [2023-10-10 07:45:28,140][53268] Updated weights for policy 1, policy_version 74100 (0.0008) [2023-10-10 07:45:28,295][53252] Updated weights for policy 0, policy_version 74150 (0.0009) [2023-10-10 07:45:28,513][53268] Updated weights for policy 1, policy_version 74110 (0.0007) [2023-10-10 07:45:28,663][53252] Updated weights for policy 0, policy_version 74160 (0.0009) [2023-10-10 07:45:29,037][53252] Updated weights for policy 0, policy_version 74170 (0.0008) [2023-10-10 07:45:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151846912. Throughput: 0: 1670.1, 1: 1670.8. Samples: 37967260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:31,784][52050] Avg episode reward: [(0, '23.300'), (1, '21.060')] [2023-10-10 07:45:32,533][53268] Updated weights for policy 1, policy_version 74120 (0.0011) [2023-10-10 07:45:32,898][53268] Updated weights for policy 1, policy_version 74130 (0.0011) [2023-10-10 07:45:33,160][53252] Updated weights for policy 0, policy_version 74180 (0.0008) [2023-10-10 07:45:33,275][53268] Updated weights for policy 1, policy_version 74140 (0.0010) [2023-10-10 07:45:33,535][53252] Updated weights for policy 0, policy_version 74190 (0.0009) [2023-10-10 07:45:33,912][53252] Updated weights for policy 0, policy_version 74200 (0.0008) [2023-10-10 07:45:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151912448. Throughput: 0: 1689.9, 1: 1676.5. Samples: 37987858. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:36,784][52050] Avg episode reward: [(0, '21.920'), (1, '21.360')] [2023-10-10 07:45:37,271][53268] Updated weights for policy 1, policy_version 74150 (0.0009) [2023-10-10 07:45:37,638][53268] Updated weights for policy 1, policy_version 74160 (0.0009) [2023-10-10 07:45:37,931][53252] Updated weights for policy 0, policy_version 74210 (0.0008) [2023-10-10 07:45:38,004][53268] Updated weights for policy 1, policy_version 74170 (0.0009) [2023-10-10 07:45:38,296][53252] Updated weights for policy 0, policy_version 74220 (0.0008) [2023-10-10 07:45:38,666][53252] Updated weights for policy 0, policy_version 74230 (0.0011) [2023-10-10 07:45:39,038][53252] Updated weights for policy 0, policy_version 74240 (0.0007) [2023-10-10 07:45:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151977984. Throughput: 0: 1685.9, 1: 1683.1. Samples: 38008694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:41,784][52050] Avg episode reward: [(0, '21.250'), (1, '20.510')] [2023-10-10 07:45:42,295][53268] Updated weights for policy 1, policy_version 74180 (0.0009) [2023-10-10 07:45:42,694][53268] Updated weights for policy 1, policy_version 74190 (0.0008) [2023-10-10 07:45:43,054][53268] Updated weights for policy 1, policy_version 74200 (0.0007) [2023-10-10 07:45:43,054][53252] Updated weights for policy 0, policy_version 74250 (0.0009) [2023-10-10 07:45:43,426][53252] Updated weights for policy 0, policy_version 74260 (0.0007) [2023-10-10 07:45:43,789][53252] Updated weights for policy 0, policy_version 74270 (0.0007) [2023-10-10 07:45:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 152043520. Throughput: 0: 1667.5, 1: 1675.5. Samples: 38017626. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:46,785][52050] Avg episode reward: [(0, '22.640'), (1, '21.120')] [2023-10-10 07:45:47,075][53268] Updated weights for policy 1, policy_version 74210 (0.0008) [2023-10-10 07:45:47,438][53268] Updated weights for policy 1, policy_version 74220 (0.0011) [2023-10-10 07:45:47,688][53252] Updated weights for policy 0, policy_version 74280 (0.0008) [2023-10-10 07:45:47,794][53268] Updated weights for policy 1, policy_version 74230 (0.0009) [2023-10-10 07:45:48,059][53252] Updated weights for policy 0, policy_version 74290 (0.0007) [2023-10-10 07:45:48,169][53268] Updated weights for policy 1, policy_version 74240 (0.0011) [2023-10-10 07:45:48,428][53252] Updated weights for policy 0, policy_version 74300 (0.0009) [2023-10-10 07:45:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 152109056. Throughput: 0: 1676.4, 1: 1681.8. Samples: 38038302. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:51,784][52050] Avg episode reward: [(0, '20.210'), (1, '20.320')] [2023-10-10 07:45:52,212][53268] Updated weights for policy 1, policy_version 74250 (0.0007) [2023-10-10 07:45:52,590][53268] Updated weights for policy 1, policy_version 74260 (0.0009) [2023-10-10 07:45:52,607][53252] Updated weights for policy 0, policy_version 74310 (0.0009) [2023-10-10 07:45:52,955][53268] Updated weights for policy 1, policy_version 74270 (0.0007) [2023-10-10 07:45:52,969][53252] Updated weights for policy 0, policy_version 74320 (0.0007) [2023-10-10 07:45:53,342][53252] Updated weights for policy 0, policy_version 74330 (0.0007) [2023-10-10 07:45:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 152174592. Throughput: 0: 1687.6, 1: 1675.4. Samples: 38059126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:45:56,784][52050] Avg episode reward: [(0, '20.340'), (1, '21.430')] [2023-10-10 07:45:57,130][53268] Updated weights for policy 1, policy_version 74280 (0.0008) [2023-10-10 07:45:57,346][53252] Updated weights for policy 0, policy_version 74340 (0.0008) [2023-10-10 07:45:57,499][53268] Updated weights for policy 1, policy_version 74290 (0.0008) [2023-10-10 07:45:57,712][53252] Updated weights for policy 0, policy_version 74350 (0.0010) [2023-10-10 07:45:57,867][53268] Updated weights for policy 1, policy_version 74300 (0.0007) [2023-10-10 07:45:58,082][53252] Updated weights for policy 0, policy_version 74360 (0.0008) [2023-10-10 07:46:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 152240128. Throughput: 0: 1676.9, 1: 1672.0. Samples: 38068082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:01,784][52050] Avg episode reward: [(0, '19.910'), (1, '22.880')] [2023-10-10 07:46:01,911][53268] Updated weights for policy 1, policy_version 74310 (0.0007) [2023-10-10 07:46:02,273][53268] Updated weights for policy 1, policy_version 74320 (0.0008) [2023-10-10 07:46:02,298][53252] Updated weights for policy 0, policy_version 74370 (0.0007) [2023-10-10 07:46:02,637][53268] Updated weights for policy 1, policy_version 74330 (0.0007) [2023-10-10 07:46:02,666][53252] Updated weights for policy 0, policy_version 74380 (0.0009) [2023-10-10 07:46:03,038][53252] Updated weights for policy 0, policy_version 74390 (0.0008) [2023-10-10 07:46:03,414][53252] Updated weights for policy 0, policy_version 74400 (0.0009) [2023-10-10 07:46:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152305664. Throughput: 0: 1689.6, 1: 1671.0. Samples: 38088818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:06,784][52050] Avg episode reward: [(0, '21.390'), (1, '22.020')] [2023-10-10 07:46:06,798][53268] Updated weights for policy 1, policy_version 74340 (0.0008) [2023-10-10 07:46:07,159][53268] Updated weights for policy 1, policy_version 74350 (0.0007) [2023-10-10 07:46:07,519][53252] Updated weights for policy 0, policy_version 74410 (0.0007) [2023-10-10 07:46:07,538][53268] Updated weights for policy 1, policy_version 74360 (0.0008) [2023-10-10 07:46:07,884][53252] Updated weights for policy 0, policy_version 74420 (0.0008) [2023-10-10 07:46:08,258][53252] Updated weights for policy 0, policy_version 74430 (0.0007) [2023-10-10 07:46:11,700][53268] Updated weights for policy 1, policy_version 74370 (0.0008) [2023-10-10 07:46:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152371200. Throughput: 0: 1688.9, 1: 1672.7. Samples: 38109524. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:11,784][52050] Avg episode reward: [(0, '22.950'), (1, '21.440')] [2023-10-10 07:46:12,061][53268] Updated weights for policy 1, policy_version 74380 (0.0007) [2023-10-10 07:46:12,423][53252] Updated weights for policy 0, policy_version 74440 (0.0008) [2023-10-10 07:46:12,433][53268] Updated weights for policy 1, policy_version 74390 (0.0009) [2023-10-10 07:46:12,789][53252] Updated weights for policy 0, policy_version 74450 (0.0008) [2023-10-10 07:46:12,789][53268] Updated weights for policy 1, policy_version 74400 (0.0008) [2023-10-10 07:46:13,163][53252] Updated weights for policy 0, policy_version 74460 (0.0008) [2023-10-10 07:46:16,707][53268] Updated weights for policy 1, policy_version 74410 (0.0007) [2023-10-10 07:46:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152436736. Throughput: 0: 1688.9, 1: 1675.2. Samples: 38118644. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:16,784][52050] Avg episode reward: [(0, '21.850'), (1, '20.340')] [2023-10-10 07:46:17,076][53268] Updated weights for policy 1, policy_version 74420 (0.0007) [2023-10-10 07:46:17,190][53252] Updated weights for policy 0, policy_version 74470 (0.0007) [2023-10-10 07:46:17,445][53268] Updated weights for policy 1, policy_version 74430 (0.0008) [2023-10-10 07:46:17,564][53252] Updated weights for policy 0, policy_version 74480 (0.0007) [2023-10-10 07:46:17,935][53252] Updated weights for policy 0, policy_version 74490 (0.0008) [2023-10-10 07:46:21,499][53268] Updated weights for policy 1, policy_version 74440 (0.0009) [2023-10-10 07:46:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152502272. Throughput: 0: 1685.6, 1: 1680.1. Samples: 38139314. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:21,784][52050] Avg episode reward: [(0, '23.520'), (1, '20.600')] [2023-10-10 07:46:21,871][53268] Updated weights for policy 1, policy_version 74450 (0.0009) [2023-10-10 07:46:21,979][53252] Updated weights for policy 0, policy_version 74500 (0.0009) [2023-10-10 07:46:22,250][53268] Updated weights for policy 1, policy_version 74460 (0.0009) [2023-10-10 07:46:22,350][53252] Updated weights for policy 0, policy_version 74510 (0.0008) [2023-10-10 07:46:22,722][53252] Updated weights for policy 0, policy_version 74520 (0.0010) [2023-10-10 07:46:26,319][53268] Updated weights for policy 1, policy_version 74470 (0.0010) [2023-10-10 07:46:26,691][53268] Updated weights for policy 1, policy_version 74480 (0.0011) [2023-10-10 07:46:26,767][53252] Updated weights for policy 0, policy_version 74530 (0.0008) [2023-10-10 07:46:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 152567808. Throughput: 0: 1686.5, 1: 1675.7. Samples: 38159992. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:26,784][52050] Avg episode reward: [(0, '22.270'), (1, '18.830')] [2023-10-10 07:46:27,050][53268] Updated weights for policy 1, policy_version 74490 (0.0008) [2023-10-10 07:46:27,134][53252] Updated weights for policy 0, policy_version 74540 (0.0007) [2023-10-10 07:46:27,509][53252] Updated weights for policy 0, policy_version 74550 (0.0008) [2023-10-10 07:46:27,877][53252] Updated weights for policy 0, policy_version 74560 (0.0009) [2023-10-10 07:46:31,232][53268] Updated weights for policy 1, policy_version 74500 (0.0010) [2023-10-10 07:46:31,626][53268] Updated weights for policy 1, policy_version 74510 (0.0008) [2023-10-10 07:46:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152633344. Throughput: 0: 1687.5, 1: 1680.9. Samples: 38169202. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:31,784][52050] Avg episode reward: [(0, '21.380'), (1, '20.860')] [2023-10-10 07:46:31,902][53252] Updated weights for policy 0, policy_version 74570 (0.0008) [2023-10-10 07:46:31,999][53268] Updated weights for policy 1, policy_version 74520 (0.0007) [2023-10-10 07:46:32,269][53252] Updated weights for policy 0, policy_version 74580 (0.0009) [2023-10-10 07:46:32,651][53252] Updated weights for policy 0, policy_version 74590 (0.0008) [2023-10-10 07:46:36,153][53268] Updated weights for policy 1, policy_version 74530 (0.0009) [2023-10-10 07:46:36,519][53268] Updated weights for policy 1, policy_version 74540 (0.0009) [2023-10-10 07:46:36,662][53252] Updated weights for policy 0, policy_version 74600 (0.0007) [2023-10-10 07:46:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152698880. Throughput: 0: 1684.8, 1: 1675.7. Samples: 38189526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:36,784][52050] Avg episode reward: [(0, '20.650'), (1, '21.060')] [2023-10-10 07:46:36,880][53268] Updated weights for policy 1, policy_version 74550 (0.0008) [2023-10-10 07:46:37,031][53252] Updated weights for policy 0, policy_version 74610 (0.0007) [2023-10-10 07:46:37,245][53268] Updated weights for policy 1, policy_version 74560 (0.0009) [2023-10-10 07:46:37,407][53252] Updated weights for policy 0, policy_version 74620 (0.0007) [2023-10-10 07:46:41,210][53268] Updated weights for policy 1, policy_version 74570 (0.0008) [2023-10-10 07:46:41,439][53252] Updated weights for policy 0, policy_version 74630 (0.0010) [2023-10-10 07:46:41,578][53268] Updated weights for policy 1, policy_version 74580 (0.0008) [2023-10-10 07:46:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 152764416. Throughput: 0: 1678.3, 1: 1668.4. Samples: 38209730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:41,784][52050] Avg episode reward: [(0, '22.640'), (1, '21.460')] [2023-10-10 07:46:41,810][53252] Updated weights for policy 0, policy_version 74640 (0.0008) [2023-10-10 07:46:41,941][53268] Updated weights for policy 1, policy_version 74590 (0.0008) [2023-10-10 07:46:42,008][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000074592_76382208.pth... [2023-10-10 07:46:42,040][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000073024_74776576.pth [2023-10-10 07:46:42,184][53252] Updated weights for policy 0, policy_version 74650 (0.0008) [2023-10-10 07:46:42,403][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000074656_76447744.pth... 
[2023-10-10 07:46:42,431][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000073056_74809344.pth [2023-10-10 07:46:46,010][53268] Updated weights for policy 1, policy_version 74600 (0.0009) [2023-10-10 07:46:46,365][53252] Updated weights for policy 0, policy_version 74660 (0.0007) [2023-10-10 07:46:46,370][53268] Updated weights for policy 1, policy_version 74610 (0.0009) [2023-10-10 07:46:46,740][53268] Updated weights for policy 1, policy_version 74620 (0.0009) [2023-10-10 07:46:46,741][53252] Updated weights for policy 0, policy_version 74670 (0.0008) [2023-10-10 07:46:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 152829952. Throughput: 0: 1680.9, 1: 1680.0. Samples: 38219326. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:46,784][52050] Avg episode reward: [(0, '23.300'), (1, '20.420')] [2023-10-10 07:46:47,107][53252] Updated weights for policy 0, policy_version 74680 (0.0009) [2023-10-10 07:46:50,794][53268] Updated weights for policy 1, policy_version 74630 (0.0009) [2023-10-10 07:46:51,156][53268] Updated weights for policy 1, policy_version 74640 (0.0010) [2023-10-10 07:46:51,339][53252] Updated weights for policy 0, policy_version 74690 (0.0009) [2023-10-10 07:46:51,538][53268] Updated weights for policy 1, policy_version 74650 (0.0009) [2023-10-10 07:46:51,710][53252] Updated weights for policy 0, policy_version 74700 (0.0007) [2023-10-10 07:46:51,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152928256. Throughput: 0: 1676.2, 1: 1680.9. Samples: 38239888. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:51,784][52050] Avg episode reward: [(0, '23.010'), (1, '23.220')] [2023-10-10 07:46:52,078][53252] Updated weights for policy 0, policy_version 74710 (0.0007) [2023-10-10 07:46:52,454][53252] Updated weights for policy 0, policy_version 74720 (0.0009) [2023-10-10 07:46:55,678][53268] Updated weights for policy 1, policy_version 74660 (0.0008) [2023-10-10 07:46:56,046][53268] Updated weights for policy 1, policy_version 74670 (0.0008) [2023-10-10 07:46:56,408][53268] Updated weights for policy 1, policy_version 74680 (0.0007) [2023-10-10 07:46:56,452][53252] Updated weights for policy 0, policy_version 74730 (0.0009) [2023-10-10 07:46:56,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152993792. Throughput: 0: 1670.7, 1: 1665.5. Samples: 38259658. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:46:56,784][52050] Avg episode reward: [(0, '23.130'), (1, '21.440')] [2023-10-10 07:46:56,824][53252] Updated weights for policy 0, policy_version 74740 (0.0007) [2023-10-10 07:46:57,194][53252] Updated weights for policy 0, policy_version 74750 (0.0007) [2023-10-10 07:47:00,397][53268] Updated weights for policy 1, policy_version 74690 (0.0008) [2023-10-10 07:47:00,757][53268] Updated weights for policy 1, policy_version 74700 (0.0010) [2023-10-10 07:47:01,125][53268] Updated weights for policy 1, policy_version 74710 (0.0009) [2023-10-10 07:47:01,435][53252] Updated weights for policy 0, policy_version 74760 (0.0007) [2023-10-10 07:47:01,483][53268] Updated weights for policy 1, policy_version 74720 (0.0009) [2023-10-10 07:47:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 153059328. Throughput: 0: 1675.0, 1: 1679.7. Samples: 38269604. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:01,784][52050] Avg episode reward: [(0, '23.520'), (1, '21.730')] [2023-10-10 07:47:01,807][53252] Updated weights for policy 0, policy_version 74770 (0.0010) [2023-10-10 07:47:02,176][53252] Updated weights for policy 0, policy_version 74780 (0.0007) [2023-10-10 07:47:05,774][53268] Updated weights for policy 1, policy_version 74730 (0.0008) [2023-10-10 07:47:06,145][53268] Updated weights for policy 1, policy_version 74740 (0.0009) [2023-10-10 07:47:06,237][53252] Updated weights for policy 0, policy_version 74790 (0.0008) [2023-10-10 07:47:06,511][53268] Updated weights for policy 1, policy_version 74750 (0.0009) [2023-10-10 07:47:06,603][53252] Updated weights for policy 0, policy_version 74800 (0.0009) [2023-10-10 07:47:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153124864. Throughput: 0: 1673.6, 1: 1673.9. Samples: 38289952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:06,784][52050] Avg episode reward: [(0, '22.520'), (1, '21.990')] [2023-10-10 07:47:06,976][53252] Updated weights for policy 0, policy_version 74810 (0.0009) [2023-10-10 07:47:10,590][53268] Updated weights for policy 1, policy_version 74760 (0.0007) [2023-10-10 07:47:10,967][53268] Updated weights for policy 1, policy_version 74770 (0.0009) [2023-10-10 07:47:11,095][53252] Updated weights for policy 0, policy_version 74820 (0.0008) [2023-10-10 07:47:11,330][53268] Updated weights for policy 1, policy_version 74780 (0.0009) [2023-10-10 07:47:11,464][53252] Updated weights for policy 0, policy_version 74830 (0.0008) [2023-10-10 07:47:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153190400. Throughput: 0: 1658.4, 1: 1657.5. Samples: 38309204. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:11,784][52050] Avg episode reward: [(0, '21.870'), (1, '21.390')] [2023-10-10 07:47:11,841][53252] Updated weights for policy 0, policy_version 74840 (0.0011) [2023-10-10 07:47:15,350][53268] Updated weights for policy 1, policy_version 74790 (0.0008) [2023-10-10 07:47:15,731][53268] Updated weights for policy 1, policy_version 74800 (0.0008) [2023-10-10 07:47:16,032][53252] Updated weights for policy 0, policy_version 74850 (0.0011) [2023-10-10 07:47:16,095][53268] Updated weights for policy 1, policy_version 74810 (0.0010) [2023-10-10 07:47:16,401][53252] Updated weights for policy 0, policy_version 74860 (0.0007) [2023-10-10 07:47:16,782][53252] Updated weights for policy 0, policy_version 74870 (0.0008) [2023-10-10 07:47:16,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153255936. Throughput: 0: 1665.9, 1: 1674.3. Samples: 38319510. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:16,784][52050] Avg episode reward: [(0, '20.450'), (1, '20.910')] [2023-10-10 07:47:17,150][53252] Updated weights for policy 0, policy_version 74880 (0.0007) [2023-10-10 07:47:20,237][53268] Updated weights for policy 1, policy_version 74820 (0.0008) [2023-10-10 07:47:20,637][53268] Updated weights for policy 1, policy_version 74830 (0.0007) [2023-10-10 07:47:21,010][53268] Updated weights for policy 1, policy_version 74840 (0.0007) [2023-10-10 07:47:21,255][53252] Updated weights for policy 0, policy_version 74890 (0.0008) [2023-10-10 07:47:21,620][53252] Updated weights for policy 0, policy_version 74900 (0.0009) [2023-10-10 07:47:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153321472. Throughput: 0: 1665.6, 1: 1673.0. Samples: 38339762. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:21,784][52050] Avg episode reward: [(0, '21.070'), (1, '20.730')] [2023-10-10 07:47:21,987][53252] Updated weights for policy 0, policy_version 74910 (0.0009) [2023-10-10 07:47:25,125][53268] Updated weights for policy 1, policy_version 74850 (0.0010) [2023-10-10 07:47:25,493][53268] Updated weights for policy 1, policy_version 74860 (0.0008) [2023-10-10 07:47:25,858][53268] Updated weights for policy 1, policy_version 74870 (0.0008) [2023-10-10 07:47:26,049][53252] Updated weights for policy 0, policy_version 74920 (0.0008) [2023-10-10 07:47:26,224][53268] Updated weights for policy 1, policy_version 74880 (0.0007) [2023-10-10 07:47:26,415][53252] Updated weights for policy 0, policy_version 74930 (0.0010) [2023-10-10 07:47:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153387008. Throughput: 0: 1655.1, 1: 1659.6. Samples: 38358890. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:26,784][52050] Avg episode reward: [(0, '19.810'), (1, '22.550')] [2023-10-10 07:47:26,787][53252] Updated weights for policy 0, policy_version 74940 (0.0007) [2023-10-10 07:47:30,238][53268] Updated weights for policy 1, policy_version 74890 (0.0011) [2023-10-10 07:47:30,620][53268] Updated weights for policy 1, policy_version 74900 (0.0010) [2023-10-10 07:47:30,861][53252] Updated weights for policy 0, policy_version 74950 (0.0007) [2023-10-10 07:47:30,982][53268] Updated weights for policy 1, policy_version 74910 (0.0010) [2023-10-10 07:47:31,241][53252] Updated weights for policy 0, policy_version 74960 (0.0009) [2023-10-10 07:47:31,614][53252] Updated weights for policy 0, policy_version 74970 (0.0009) [2023-10-10 07:47:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 153452544. Throughput: 0: 1666.1, 1: 1679.5. Samples: 38369878. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:31,784][52050] Avg episode reward: [(0, '20.780'), (1, '22.450')] [2023-10-10 07:47:35,037][53268] Updated weights for policy 1, policy_version 74920 (0.0008) [2023-10-10 07:47:35,403][53268] Updated weights for policy 1, policy_version 74930 (0.0009) [2023-10-10 07:47:35,604][53252] Updated weights for policy 0, policy_version 74980 (0.0009) [2023-10-10 07:47:35,771][53268] Updated weights for policy 1, policy_version 74940 (0.0007) [2023-10-10 07:47:35,977][53252] Updated weights for policy 0, policy_version 74990 (0.0009) [2023-10-10 07:47:36,346][53252] Updated weights for policy 0, policy_version 75000 (0.0007) [2023-10-10 07:47:36,783][52050] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 153550848. Throughput: 0: 1668.6, 1: 1670.4. Samples: 38390142. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:36,784][52050] Avg episode reward: [(0, '20.090'), (1, '21.090')] [2023-10-10 07:47:39,650][53268] Updated weights for policy 1, policy_version 74950 (0.0007) [2023-10-10 07:47:40,012][53268] Updated weights for policy 1, policy_version 74960 (0.0010) [2023-10-10 07:47:40,380][53268] Updated weights for policy 1, policy_version 74970 (0.0010) [2023-10-10 07:47:40,530][53252] Updated weights for policy 0, policy_version 75010 (0.0008) [2023-10-10 07:47:40,905][53252] Updated weights for policy 0, policy_version 75020 (0.0007) [2023-10-10 07:47:41,269][53252] Updated weights for policy 0, policy_version 75030 (0.0009) [2023-10-10 07:47:41,634][53252] Updated weights for policy 0, policy_version 75040 (0.0011) [2023-10-10 07:47:41,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 153616384. Throughput: 0: 1652.1, 1: 1670.9. Samples: 38409194. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:41,784][52050] Avg episode reward: [(0, '20.600'), (1, '23.090')] [2023-10-10 07:47:44,587][53268] Updated weights for policy 1, policy_version 74980 (0.0008) [2023-10-10 07:47:44,955][53268] Updated weights for policy 1, policy_version 74990 (0.0007) [2023-10-10 07:47:45,323][53268] Updated weights for policy 1, policy_version 75000 (0.0009) [2023-10-10 07:47:45,755][53252] Updated weights for policy 0, policy_version 75050 (0.0009) [2023-10-10 07:47:46,136][53252] Updated weights for policy 0, policy_version 75060 (0.0009) [2023-10-10 07:47:46,504][53252] Updated weights for policy 0, policy_version 75070 (0.0008) [2023-10-10 07:47:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 153681920. Throughput: 0: 1672.1, 1: 1683.1. Samples: 38420588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:46,784][52050] Avg episode reward: [(0, '21.040'), (1, '22.350')] [2023-10-10 07:47:49,377][53268] Updated weights for policy 1, policy_version 75010 (0.0008) [2023-10-10 07:47:49,746][53268] Updated weights for policy 1, policy_version 75020 (0.0008) [2023-10-10 07:47:50,111][53268] Updated weights for policy 1, policy_version 75030 (0.0008) [2023-10-10 07:47:50,350][53252] Updated weights for policy 0, policy_version 75080 (0.0007) [2023-10-10 07:47:50,485][53268] Updated weights for policy 1, policy_version 75040 (0.0008) [2023-10-10 07:47:50,706][53252] Updated weights for policy 0, policy_version 75090 (0.0008) [2023-10-10 07:47:51,091][53252] Updated weights for policy 0, policy_version 75100 (0.0009) [2023-10-10 07:47:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153747456. Throughput: 0: 1670.5, 1: 1666.9. Samples: 38440136. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:51,784][52050] Avg episode reward: [(0, '23.110'), (1, '21.450')] [2023-10-10 07:47:54,609][53268] Updated weights for policy 1, policy_version 75050 (0.0010) [2023-10-10 07:47:54,963][53268] Updated weights for policy 1, policy_version 75060 (0.0008) [2023-10-10 07:47:55,246][53252] Updated weights for policy 0, policy_version 75110 (0.0010) [2023-10-10 07:47:55,330][53268] Updated weights for policy 1, policy_version 75070 (0.0009) [2023-10-10 07:47:55,613][53252] Updated weights for policy 0, policy_version 75120 (0.0009) [2023-10-10 07:47:55,984][53252] Updated weights for policy 0, policy_version 75130 (0.0008) [2023-10-10 07:47:56,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 153812992. Throughput: 0: 1667.4, 1: 1679.1. Samples: 38459794. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:47:56,784][52050] Avg episode reward: [(0, '21.960'), (1, '20.840')] [2023-10-10 07:47:59,378][53268] Updated weights for policy 1, policy_version 75080 (0.0010) [2023-10-10 07:47:59,746][53268] Updated weights for policy 1, policy_version 75090 (0.0009) [2023-10-10 07:47:59,825][53252] Updated weights for policy 0, policy_version 75140 (0.0008) [2023-10-10 07:48:00,121][53268] Updated weights for policy 1, policy_version 75100 (0.0009) [2023-10-10 07:48:00,189][53252] Updated weights for policy 0, policy_version 75150 (0.0009) [2023-10-10 07:48:00,557][53252] Updated weights for policy 0, policy_version 75160 (0.0009) [2023-10-10 07:48:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153878528. Throughput: 0: 1689.5, 1: 1689.0. Samples: 38471542. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:48:01,784][52050] Avg episode reward: [(0, '24.830'), (1, '21.210')] [2023-10-10 07:48:04,043][53268] Updated weights for policy 1, policy_version 75110 (0.0009) [2023-10-10 07:48:04,405][53268] Updated weights for policy 1, policy_version 75120 (0.0009) [2023-10-10 07:48:04,423][53252] Updated weights for policy 0, policy_version 75170 (0.0007) [2023-10-10 07:48:04,765][53268] Updated weights for policy 1, policy_version 75130 (0.0009) [2023-10-10 07:48:04,802][53252] Updated weights for policy 0, policy_version 75180 (0.0009) [2023-10-10 07:48:05,183][53252] Updated weights for policy 0, policy_version 75190 (0.0009) [2023-10-10 07:48:05,553][53252] Updated weights for policy 0, policy_version 75200 (0.0009) [2023-10-10 07:48:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 153944064. Throughput: 0: 1672.0, 1: 1670.4. Samples: 38490170. Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:06,784][52050] Avg episode reward: [(0, '23.080'), (1, '19.700')] [2023-10-10 07:48:09,002][53268] Updated weights for policy 1, policy_version 75140 (0.0009) [2023-10-10 07:48:09,394][53268] Updated weights for policy 1, policy_version 75150 (0.0007) [2023-10-10 07:48:09,754][53252] Updated weights for policy 0, policy_version 75210 (0.0007) [2023-10-10 07:48:09,755][53268] Updated weights for policy 1, policy_version 75160 (0.0008) [2023-10-10 07:48:10,118][53252] Updated weights for policy 0, policy_version 75220 (0.0007) [2023-10-10 07:48:10,489][53252] Updated weights for policy 0, policy_version 75230 (0.0008) [2023-10-10 07:48:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154009600. Throughput: 0: 1678.3, 1: 1688.4. Samples: 38510390. 
Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:11,784][52050] Avg episode reward: [(0, '21.400'), (1, '20.720')] [2023-10-10 07:48:13,769][53268] Updated weights for policy 1, policy_version 75170 (0.0008) [2023-10-10 07:48:14,136][53268] Updated weights for policy 1, policy_version 75180 (0.0009) [2023-10-10 07:48:14,501][53268] Updated weights for policy 1, policy_version 75190 (0.0009) [2023-10-10 07:48:14,539][53252] Updated weights for policy 0, policy_version 75240 (0.0008) [2023-10-10 07:48:14,866][53268] Updated weights for policy 1, policy_version 75200 (0.0008) [2023-10-10 07:48:14,914][53252] Updated weights for policy 0, policy_version 75250 (0.0008) [2023-10-10 07:48:15,274][53252] Updated weights for policy 0, policy_version 75260 (0.0009) [2023-10-10 07:48:16,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154075136. Throughput: 0: 1688.6, 1: 1677.9. Samples: 38521372. Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:16,785][52050] Avg episode reward: [(0, '20.880'), (1, '21.720')] [2023-10-10 07:48:19,076][53268] Updated weights for policy 1, policy_version 75210 (0.0010) [2023-10-10 07:48:19,438][53268] Updated weights for policy 1, policy_version 75220 (0.0008) [2023-10-10 07:48:19,476][53252] Updated weights for policy 0, policy_version 75270 (0.0009) [2023-10-10 07:48:19,805][53268] Updated weights for policy 1, policy_version 75230 (0.0008) [2023-10-10 07:48:19,846][53252] Updated weights for policy 0, policy_version 75280 (0.0008) [2023-10-10 07:48:20,208][53252] Updated weights for policy 0, policy_version 75290 (0.0010) [2023-10-10 07:48:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154140672. Throughput: 0: 1659.7, 1: 1670.0. Samples: 38539980. 
Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:21,784][52050] Avg episode reward: [(0, '21.370'), (1, '20.510')] [2023-10-10 07:48:23,775][53268] Updated weights for policy 1, policy_version 75240 (0.0010) [2023-10-10 07:48:24,146][53268] Updated weights for policy 1, policy_version 75250 (0.0007) [2023-10-10 07:48:24,247][53252] Updated weights for policy 0, policy_version 75300 (0.0009) [2023-10-10 07:48:24,513][53268] Updated weights for policy 1, policy_version 75260 (0.0008) [2023-10-10 07:48:24,623][53252] Updated weights for policy 0, policy_version 75310 (0.0007) [2023-10-10 07:48:24,995][53252] Updated weights for policy 0, policy_version 75320 (0.0008) [2023-10-10 07:48:26,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154206208. Throughput: 0: 1682.8, 1: 1685.2. Samples: 38560750. Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:26,784][52050] Avg episode reward: [(0, '21.160'), (1, '20.780')] [2023-10-10 07:48:28,460][53268] Updated weights for policy 1, policy_version 75270 (0.0009) [2023-10-10 07:48:28,823][53268] Updated weights for policy 1, policy_version 75280 (0.0008) [2023-10-10 07:48:29,155][53252] Updated weights for policy 0, policy_version 75330 (0.0009) [2023-10-10 07:48:29,187][53268] Updated weights for policy 1, policy_version 75290 (0.0007) [2023-10-10 07:48:29,523][53252] Updated weights for policy 0, policy_version 75340 (0.0008) [2023-10-10 07:48:29,894][53252] Updated weights for policy 0, policy_version 75350 (0.0008) [2023-10-10 07:48:30,262][53252] Updated weights for policy 0, policy_version 75360 (0.0008) [2023-10-10 07:48:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154271744. Throughput: 0: 1679.4, 1: 1666.5. Samples: 38571154. 
Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:31,785][52050] Avg episode reward: [(0, '21.700'), (1, '22.160')] [2023-10-10 07:48:33,057][53268] Updated weights for policy 1, policy_version 75300 (0.0008) [2023-10-10 07:48:33,431][53268] Updated weights for policy 1, policy_version 75310 (0.0008) [2023-10-10 07:48:33,791][53268] Updated weights for policy 1, policy_version 75320 (0.0008) [2023-10-10 07:48:34,614][53252] Updated weights for policy 0, policy_version 75370 (0.0008) [2023-10-10 07:48:34,988][53252] Updated weights for policy 0, policy_version 75380 (0.0010) [2023-10-10 07:48:35,357][53252] Updated weights for policy 0, policy_version 75390 (0.0009) [2023-10-10 07:48:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 154337280. Throughput: 0: 1661.1, 1: 1683.1. Samples: 38590624. Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:36,784][52050] Avg episode reward: [(0, '20.720'), (1, '21.950')] [2023-10-10 07:48:37,936][53268] Updated weights for policy 1, policy_version 75330 (0.0009) [2023-10-10 07:48:38,312][53268] Updated weights for policy 1, policy_version 75340 (0.0009) [2023-10-10 07:48:38,676][53268] Updated weights for policy 1, policy_version 75350 (0.0010) [2023-10-10 07:48:39,033][53268] Updated weights for policy 1, policy_version 75360 (0.0008) [2023-10-10 07:48:39,354][53252] Updated weights for policy 0, policy_version 75400 (0.0009) [2023-10-10 07:48:39,725][53252] Updated weights for policy 0, policy_version 75410 (0.0011) [2023-10-10 07:48:40,090][53252] Updated weights for policy 0, policy_version 75420 (0.0010) [2023-10-10 07:48:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154402816. Throughput: 0: 1675.4, 1: 1695.8. Samples: 38611498. 
Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:41,785][52050] Avg episode reward: [(0, '21.150'), (1, '22.500')] [2023-10-10 07:48:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000075424_77234176.pth... [2023-10-10 07:48:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000075360_77168640.pth... [2023-10-10 07:48:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000073792_75563008.pth [2023-10-10 07:48:41,834][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000073856_75628544.pth [2023-10-10 07:48:42,935][53268] Updated weights for policy 1, policy_version 75370 (0.0010) [2023-10-10 07:48:43,305][53268] Updated weights for policy 1, policy_version 75380 (0.0010) [2023-10-10 07:48:43,676][53268] Updated weights for policy 1, policy_version 75390 (0.0009) [2023-10-10 07:48:44,138][53252] Updated weights for policy 0, policy_version 75430 (0.0008) [2023-10-10 07:48:44,511][53252] Updated weights for policy 0, policy_version 75440 (0.0009) [2023-10-10 07:48:44,882][53252] Updated weights for policy 0, policy_version 75450 (0.0007) [2023-10-10 07:48:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154468352. Throughput: 0: 1665.0, 1: 1672.7. Samples: 38621740. 
Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:46,784][52050] Avg episode reward: [(0, '22.360'), (1, '21.370')] [2023-10-10 07:48:47,626][53268] Updated weights for policy 1, policy_version 75400 (0.0009) [2023-10-10 07:48:47,980][53268] Updated weights for policy 1, policy_version 75410 (0.0008) [2023-10-10 07:48:48,352][53268] Updated weights for policy 1, policy_version 75420 (0.0009) [2023-10-10 07:48:48,883][53252] Updated weights for policy 0, policy_version 75460 (0.0008) [2023-10-10 07:48:49,253][53252] Updated weights for policy 0, policy_version 75470 (0.0009) [2023-10-10 07:48:49,616][53252] Updated weights for policy 0, policy_version 75480 (0.0008) [2023-10-10 07:48:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 154533888. Throughput: 0: 1666.9, 1: 1702.3. Samples: 38641782. Policy #0 lag: (min: 7.0, avg: 15.6, max: 39.0) [2023-10-10 07:48:51,784][52050] Avg episode reward: [(0, '22.870'), (1, '22.150')] [2023-10-10 07:48:52,362][53268] Updated weights for policy 1, policy_version 75430 (0.0009) [2023-10-10 07:48:52,741][53268] Updated weights for policy 1, policy_version 75440 (0.0007) [2023-10-10 07:48:53,104][53268] Updated weights for policy 1, policy_version 75450 (0.0010) [2023-10-10 07:48:53,767][53252] Updated weights for policy 0, policy_version 75490 (0.0010) [2023-10-10 07:48:54,131][53252] Updated weights for policy 0, policy_version 75500 (0.0011) [2023-10-10 07:48:54,509][53252] Updated weights for policy 0, policy_version 75510 (0.0010) [2023-10-10 07:48:54,884][53252] Updated weights for policy 0, policy_version 75520 (0.0008) [2023-10-10 07:48:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154599424. Throughput: 0: 1670.1, 1: 1711.1. Samples: 38662544. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:48:56,784][52050] Avg episode reward: [(0, '22.750'), (1, '19.740')] [2023-10-10 07:48:57,198][53268] Updated weights for policy 1, policy_version 75460 (0.0009) [2023-10-10 07:48:57,561][53268] Updated weights for policy 1, policy_version 75470 (0.0010) [2023-10-10 07:48:57,940][53268] Updated weights for policy 1, policy_version 75480 (0.0009) [2023-10-10 07:48:58,899][53252] Updated weights for policy 0, policy_version 75530 (0.0007) [2023-10-10 07:48:59,281][53252] Updated weights for policy 0, policy_version 75540 (0.0007) [2023-10-10 07:48:59,657][53252] Updated weights for policy 0, policy_version 75550 (0.0008) [2023-10-10 07:49:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 154664960. Throughput: 0: 1657.0, 1: 1691.9. Samples: 38672074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:01,784][52050] Avg episode reward: [(0, '24.690'), (1, '19.570')] [2023-10-10 07:49:01,840][53268] Updated weights for policy 1, policy_version 75490 (0.0008) [2023-10-10 07:49:02,205][53268] Updated weights for policy 1, policy_version 75500 (0.0008) [2023-10-10 07:49:02,558][53268] Updated weights for policy 1, policy_version 75510 (0.0009) [2023-10-10 07:49:02,930][53268] Updated weights for policy 1, policy_version 75520 (0.0008) [2023-10-10 07:49:03,669][53252] Updated weights for policy 0, policy_version 75560 (0.0009) [2023-10-10 07:49:04,041][53252] Updated weights for policy 0, policy_version 75570 (0.0009) [2023-10-10 07:49:04,421][53252] Updated weights for policy 0, policy_version 75580 (0.0009) [2023-10-10 07:49:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154730496. Throughput: 0: 1676.8, 1: 1711.4. Samples: 38692450. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:06,784][52050] Avg episode reward: [(0, '24.150'), (1, '18.850')] [2023-10-10 07:49:06,864][53268] Updated weights for policy 1, policy_version 75530 (0.0009) [2023-10-10 07:49:07,240][53268] Updated weights for policy 1, policy_version 75540 (0.0008) [2023-10-10 07:49:07,615][53268] Updated weights for policy 1, policy_version 75550 (0.0008) [2023-10-10 07:49:08,441][53252] Updated weights for policy 0, policy_version 75590 (0.0009) [2023-10-10 07:49:08,810][53252] Updated weights for policy 0, policy_version 75600 (0.0009) [2023-10-10 07:49:09,192][53252] Updated weights for policy 0, policy_version 75610 (0.0008) [2023-10-10 07:49:11,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154796032. Throughput: 0: 1679.0, 1: 1705.0. Samples: 38713030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:11,784][52050] Avg episode reward: [(0, '21.450'), (1, '21.200')] [2023-10-10 07:49:11,803][53268] Updated weights for policy 1, policy_version 75560 (0.0007) [2023-10-10 07:49:12,172][53268] Updated weights for policy 1, policy_version 75570 (0.0008) [2023-10-10 07:49:12,536][53268] Updated weights for policy 1, policy_version 75580 (0.0007) [2023-10-10 07:49:13,081][53252] Updated weights for policy 0, policy_version 75620 (0.0008) [2023-10-10 07:49:13,454][53252] Updated weights for policy 0, policy_version 75630 (0.0007) [2023-10-10 07:49:13,819][53252] Updated weights for policy 0, policy_version 75640 (0.0007) [2023-10-10 07:49:16,721][53268] Updated weights for policy 1, policy_version 75590 (0.0009) [2023-10-10 07:49:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154861568. Throughput: 0: 1662.5, 1: 1694.8. Samples: 38722232. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:16,784][52050] Avg episode reward: [(0, '21.020'), (1, '22.220')] [2023-10-10 07:49:17,083][53268] Updated weights for policy 1, policy_version 75600 (0.0007) [2023-10-10 07:49:17,447][53268] Updated weights for policy 1, policy_version 75610 (0.0008) [2023-10-10 07:49:17,842][53252] Updated weights for policy 0, policy_version 75650 (0.0007) [2023-10-10 07:49:18,210][53252] Updated weights for policy 0, policy_version 75660 (0.0008) [2023-10-10 07:49:18,582][53252] Updated weights for policy 0, policy_version 75670 (0.0008) [2023-10-10 07:49:18,951][53252] Updated weights for policy 0, policy_version 75680 (0.0007) [2023-10-10 07:49:21,584][53268] Updated weights for policy 1, policy_version 75620 (0.0008) [2023-10-10 07:49:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154927104. Throughput: 0: 1686.4, 1: 1701.5. Samples: 38743082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:21,784][52050] Avg episode reward: [(0, '21.990'), (1, '20.900')] [2023-10-10 07:49:21,950][53268] Updated weights for policy 1, policy_version 75630 (0.0009) [2023-10-10 07:49:22,313][53268] Updated weights for policy 1, policy_version 75640 (0.0009) [2023-10-10 07:49:23,083][53252] Updated weights for policy 0, policy_version 75690 (0.0007) [2023-10-10 07:49:23,456][53252] Updated weights for policy 0, policy_version 75700 (0.0007) [2023-10-10 07:49:23,822][53252] Updated weights for policy 0, policy_version 75710 (0.0009) [2023-10-10 07:49:26,298][53268] Updated weights for policy 1, policy_version 75650 (0.0008) [2023-10-10 07:49:26,661][53268] Updated weights for policy 1, policy_version 75660 (0.0010) [2023-10-10 07:49:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 154992640. Throughput: 0: 1690.2, 1: 1699.5. Samples: 38764032. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:26,784][52050] Avg episode reward: [(0, '20.820'), (1, '21.800')] [2023-10-10 07:49:27,021][53268] Updated weights for policy 1, policy_version 75670 (0.0011) [2023-10-10 07:49:27,395][53268] Updated weights for policy 1, policy_version 75680 (0.0009) [2023-10-10 07:49:27,886][53252] Updated weights for policy 0, policy_version 75720 (0.0010) [2023-10-10 07:49:28,260][53252] Updated weights for policy 0, policy_version 75730 (0.0008) [2023-10-10 07:49:28,622][53252] Updated weights for policy 0, policy_version 75740 (0.0007) [2023-10-10 07:49:31,521][53268] Updated weights for policy 1, policy_version 75690 (0.0010) [2023-10-10 07:49:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 155058176. Throughput: 0: 1670.9, 1: 1696.5. Samples: 38773276. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:31,784][52050] Avg episode reward: [(0, '22.210'), (1, '21.470')] [2023-10-10 07:49:31,893][53268] Updated weights for policy 1, policy_version 75700 (0.0008) [2023-10-10 07:49:32,265][53268] Updated weights for policy 1, policy_version 75710 (0.0008) [2023-10-10 07:49:32,754][53252] Updated weights for policy 0, policy_version 75750 (0.0008) [2023-10-10 07:49:33,123][53252] Updated weights for policy 0, policy_version 75760 (0.0007) [2023-10-10 07:49:33,500][53252] Updated weights for policy 0, policy_version 75770 (0.0008) [2023-10-10 07:49:36,373][53268] Updated weights for policy 1, policy_version 75720 (0.0010) [2023-10-10 07:49:36,733][53268] Updated weights for policy 1, policy_version 75730 (0.0009) [2023-10-10 07:49:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 155123712. Throughput: 0: 1688.8, 1: 1688.7. Samples: 38793770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:36,784][52050] Avg episode reward: [(0, '22.640'), (1, '21.070')] [2023-10-10 07:49:37,099][53268] Updated weights for policy 1, policy_version 75740 (0.0007) [2023-10-10 07:49:37,582][53252] Updated weights for policy 0, policy_version 75780 (0.0008) [2023-10-10 07:49:37,957][53252] Updated weights for policy 0, policy_version 75790 (0.0009) [2023-10-10 07:49:38,321][53252] Updated weights for policy 0, policy_version 75800 (0.0008) [2023-10-10 07:49:41,182][53268] Updated weights for policy 1, policy_version 75750 (0.0009) [2023-10-10 07:49:41,548][53268] Updated weights for policy 1, policy_version 75760 (0.0010) [2023-10-10 07:49:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 155189248. Throughput: 0: 1696.5, 1: 1677.4. Samples: 38814370. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:41,784][52050] Avg episode reward: [(0, '21.700'), (1, '20.530')] [2023-10-10 07:49:41,911][53268] Updated weights for policy 1, policy_version 75770 (0.0009) [2023-10-10 07:49:42,261][53252] Updated weights for policy 0, policy_version 75810 (0.0009) [2023-10-10 07:49:42,629][53252] Updated weights for policy 0, policy_version 75820 (0.0007) [2023-10-10 07:49:43,008][53252] Updated weights for policy 0, policy_version 75830 (0.0007) [2023-10-10 07:49:43,373][53252] Updated weights for policy 0, policy_version 75840 (0.0008) [2023-10-10 07:49:46,052][53268] Updated weights for policy 1, policy_version 75780 (0.0009) [2023-10-10 07:49:46,448][53268] Updated weights for policy 1, policy_version 75790 (0.0010) [2023-10-10 07:49:46,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 155254784. Throughput: 0: 1689.9, 1: 1688.4. Samples: 38824100. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:46,784][52050] Avg episode reward: [(0, '21.180'), (1, '22.020')] [2023-10-10 07:49:46,812][53268] Updated weights for policy 1, policy_version 75800 (0.0011) [2023-10-10 07:49:47,312][53252] Updated weights for policy 0, policy_version 75850 (0.0008) [2023-10-10 07:49:47,690][53252] Updated weights for policy 0, policy_version 75860 (0.0010) [2023-10-10 07:49:48,046][53252] Updated weights for policy 0, policy_version 75870 (0.0009) [2023-10-10 07:49:50,862][53268] Updated weights for policy 1, policy_version 75810 (0.0009) [2023-10-10 07:49:51,240][53268] Updated weights for policy 1, policy_version 75820 (0.0010) [2023-10-10 07:49:51,603][53268] Updated weights for policy 1, policy_version 75830 (0.0007) [2023-10-10 07:49:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 155320320. Throughput: 0: 1693.8, 1: 1681.6. Samples: 38844344. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:51,784][52050] Avg episode reward: [(0, '21.810'), (1, '22.710')] [2023-10-10 07:49:51,966][53268] Updated weights for policy 1, policy_version 75840 (0.0009) [2023-10-10 07:49:52,208][53252] Updated weights for policy 0, policy_version 75880 (0.0009) [2023-10-10 07:49:52,579][53252] Updated weights for policy 0, policy_version 75890 (0.0008) [2023-10-10 07:49:52,954][53252] Updated weights for policy 0, policy_version 75900 (0.0008) [2023-10-10 07:49:56,007][53268] Updated weights for policy 1, policy_version 75850 (0.0009) [2023-10-10 07:49:56,384][53268] Updated weights for policy 1, policy_version 75860 (0.0007) [2023-10-10 07:49:56,752][53268] Updated weights for policy 1, policy_version 75870 (0.0010) [2023-10-10 07:49:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 155385856. Throughput: 0: 1689.4, 1: 1677.0. Samples: 38864518. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:49:56,784][52050] Avg episode reward: [(0, '20.230'), (1, '23.110')] [2023-10-10 07:49:56,983][53252] Updated weights for policy 0, policy_version 75910 (0.0009) [2023-10-10 07:49:57,368][53252] Updated weights for policy 0, policy_version 75920 (0.0010) [2023-10-10 07:49:57,724][53252] Updated weights for policy 0, policy_version 75930 (0.0008) [2023-10-10 07:50:00,828][53268] Updated weights for policy 1, policy_version 75880 (0.0007) [2023-10-10 07:50:01,193][53268] Updated weights for policy 1, policy_version 75890 (0.0010) [2023-10-10 07:50:01,561][53268] Updated weights for policy 1, policy_version 75900 (0.0011) [2023-10-10 07:50:01,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155484160. Throughput: 0: 1684.4, 1: 1688.3. Samples: 38874004. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:01,784][52050] Avg episode reward: [(0, '22.460'), (1, '22.580')] [2023-10-10 07:50:01,798][53252] Updated weights for policy 0, policy_version 75940 (0.0010) [2023-10-10 07:50:02,174][53252] Updated weights for policy 0, policy_version 75950 (0.0008) [2023-10-10 07:50:02,543][53252] Updated weights for policy 0, policy_version 75960 (0.0009) [2023-10-10 07:50:05,752][53268] Updated weights for policy 1, policy_version 75910 (0.0008) [2023-10-10 07:50:06,117][53268] Updated weights for policy 1, policy_version 75920 (0.0008) [2023-10-10 07:50:06,494][53268] Updated weights for policy 1, policy_version 75930 (0.0007) [2023-10-10 07:50:06,654][53252] Updated weights for policy 0, policy_version 75970 (0.0008) [2023-10-10 07:50:06,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155549696. Throughput: 0: 1688.1, 1: 1686.3. Samples: 38894930. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:06,784][52050] Avg episode reward: [(0, '22.980'), (1, '22.450')] [2023-10-10 07:50:07,027][53252] Updated weights for policy 0, policy_version 75980 (0.0007) [2023-10-10 07:50:07,410][53252] Updated weights for policy 0, policy_version 75990 (0.0009) [2023-10-10 07:50:07,784][53252] Updated weights for policy 0, policy_version 76000 (0.0009) [2023-10-10 07:50:10,461][53268] Updated weights for policy 1, policy_version 75940 (0.0008) [2023-10-10 07:50:10,824][53268] Updated weights for policy 1, policy_version 75950 (0.0009) [2023-10-10 07:50:11,200][53268] Updated weights for policy 1, policy_version 75960 (0.0010) [2023-10-10 07:50:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155615232. Throughput: 0: 1690.1, 1: 1662.0. Samples: 38914876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:11,784][52050] Avg episode reward: [(0, '22.860'), (1, '22.690')] [2023-10-10 07:50:11,930][53252] Updated weights for policy 0, policy_version 76010 (0.0009) [2023-10-10 07:50:12,303][53252] Updated weights for policy 0, policy_version 76020 (0.0009) [2023-10-10 07:50:12,663][53252] Updated weights for policy 0, policy_version 76030 (0.0009) [2023-10-10 07:50:15,051][53268] Updated weights for policy 1, policy_version 75970 (0.0009) [2023-10-10 07:50:15,424][53268] Updated weights for policy 1, policy_version 75980 (0.0008) [2023-10-10 07:50:15,788][53268] Updated weights for policy 1, policy_version 75990 (0.0009) [2023-10-10 07:50:16,156][53268] Updated weights for policy 1, policy_version 76000 (0.0008) [2023-10-10 07:50:16,613][53252] Updated weights for policy 0, policy_version 76040 (0.0007) [2023-10-10 07:50:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155680768. Throughput: 0: 1685.5, 1: 1684.5. Samples: 38924930. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:16,784][52050] Avg episode reward: [(0, '22.510'), (1, '20.860')] [2023-10-10 07:50:16,988][53252] Updated weights for policy 0, policy_version 76050 (0.0009) [2023-10-10 07:50:17,369][53252] Updated weights for policy 0, policy_version 76060 (0.0007) [2023-10-10 07:50:20,225][53268] Updated weights for policy 1, policy_version 76010 (0.0008) [2023-10-10 07:50:20,588][53268] Updated weights for policy 1, policy_version 76020 (0.0011) [2023-10-10 07:50:20,953][53268] Updated weights for policy 1, policy_version 76030 (0.0007) [2023-10-10 07:50:21,429][53252] Updated weights for policy 0, policy_version 76070 (0.0009) [2023-10-10 07:50:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155746304. Throughput: 0: 1687.6, 1: 1684.3. Samples: 38945504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:21,784][52050] Avg episode reward: [(0, '22.300'), (1, '19.230')] [2023-10-10 07:50:21,805][53252] Updated weights for policy 0, policy_version 76080 (0.0008) [2023-10-10 07:50:22,165][53252] Updated weights for policy 0, policy_version 76090 (0.0009) [2023-10-10 07:50:25,044][53268] Updated weights for policy 1, policy_version 76040 (0.0008) [2023-10-10 07:50:25,407][53268] Updated weights for policy 1, policy_version 76050 (0.0009) [2023-10-10 07:50:25,772][53268] Updated weights for policy 1, policy_version 76060 (0.0007) [2023-10-10 07:50:26,244][53252] Updated weights for policy 0, policy_version 76100 (0.0011) [2023-10-10 07:50:26,612][53252] Updated weights for policy 0, policy_version 76110 (0.0009) [2023-10-10 07:50:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155811840. Throughput: 0: 1676.4, 1: 1671.5. Samples: 38965024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:26,784][52050] Avg episode reward: [(0, '20.960'), (1, '21.090')] [2023-10-10 07:50:26,986][53252] Updated weights for policy 0, policy_version 76120 (0.0009) [2023-10-10 07:50:29,882][53268] Updated weights for policy 1, policy_version 76070 (0.0010) [2023-10-10 07:50:30,250][53268] Updated weights for policy 1, policy_version 76080 (0.0010) [2023-10-10 07:50:30,610][53268] Updated weights for policy 1, policy_version 76090 (0.0009) [2023-10-10 07:50:31,122][53252] Updated weights for policy 0, policy_version 76130 (0.0009) [2023-10-10 07:50:31,504][53252] Updated weights for policy 0, policy_version 76140 (0.0008) [2023-10-10 07:50:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155877376. Throughput: 0: 1675.5, 1: 1693.7. Samples: 38975714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:50:31,784][52050] Avg episode reward: [(0, '21.770'), (1, '20.150')] [2023-10-10 07:50:31,885][53252] Updated weights for policy 0, policy_version 76150 (0.0010) [2023-10-10 07:50:32,262][53252] Updated weights for policy 0, policy_version 76160 (0.0008) [2023-10-10 07:50:34,754][53268] Updated weights for policy 1, policy_version 76100 (0.0008) [2023-10-10 07:50:35,158][53268] Updated weights for policy 1, policy_version 76110 (0.0009) [2023-10-10 07:50:35,522][53268] Updated weights for policy 1, policy_version 76120 (0.0010) [2023-10-10 07:50:36,407][53252] Updated weights for policy 0, policy_version 76170 (0.0009) [2023-10-10 07:50:36,772][53252] Updated weights for policy 0, policy_version 76180 (0.0008) [2023-10-10 07:50:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155942912. Throughput: 0: 1682.9, 1: 1681.6. Samples: 38995748. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:50:36,784][52050] Avg episode reward: [(0, '21.980'), (1, '21.030')] [2023-10-10 07:50:37,146][53252] Updated weights for policy 0, policy_version 76190 (0.0008) [2023-10-10 07:50:39,465][53268] Updated weights for policy 1, policy_version 76130 (0.0008) [2023-10-10 07:50:39,829][53268] Updated weights for policy 1, policy_version 76140 (0.0009) [2023-10-10 07:50:40,198][53268] Updated weights for policy 1, policy_version 76150 (0.0009) [2023-10-10 07:50:40,558][53268] Updated weights for policy 1, policy_version 76160 (0.0009) [2023-10-10 07:50:41,155][53252] Updated weights for policy 0, policy_version 76200 (0.0008) [2023-10-10 07:50:41,529][53252] Updated weights for policy 0, policy_version 76210 (0.0007) [2023-10-10 07:50:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156008448. Throughput: 0: 1672.3, 1: 1680.2. Samples: 39015380. Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:50:41,784][52050] Avg episode reward: [(0, '22.790'), (1, '20.410')] [2023-10-10 07:50:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000076160_77987840.pth... [2023-10-10 07:50:41,835][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000074592_76382208.pth [2023-10-10 07:50:41,906][53252] Updated weights for policy 0, policy_version 76220 (0.0008) [2023-10-10 07:50:42,044][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000076224_78053376.pth... 
[2023-10-10 07:50:42,082][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000074656_76447744.pth [2023-10-10 07:50:44,494][53268] Updated weights for policy 1, policy_version 76170 (0.0007) [2023-10-10 07:50:44,857][53268] Updated weights for policy 1, policy_version 76180 (0.0007) [2023-10-10 07:50:45,213][53268] Updated weights for policy 1, policy_version 76190 (0.0008) [2023-10-10 07:50:46,013][53252] Updated weights for policy 0, policy_version 76230 (0.0008) [2023-10-10 07:50:46,385][53252] Updated weights for policy 0, policy_version 76240 (0.0009) [2023-10-10 07:50:46,757][53252] Updated weights for policy 0, policy_version 76250 (0.0008) [2023-10-10 07:50:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156073984. Throughput: 0: 1681.8, 1: 1701.5. Samples: 39026252. Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:50:46,784][52050] Avg episode reward: [(0, '21.580'), (1, '23.940')] [2023-10-10 07:50:49,199][53268] Updated weights for policy 1, policy_version 76200 (0.0007) [2023-10-10 07:50:49,576][53268] Updated weights for policy 1, policy_version 76210 (0.0007) [2023-10-10 07:50:49,951][53268] Updated weights for policy 1, policy_version 76220 (0.0008) [2023-10-10 07:50:50,825][53252] Updated weights for policy 0, policy_version 76260 (0.0008) [2023-10-10 07:50:51,200][53252] Updated weights for policy 0, policy_version 76270 (0.0008) [2023-10-10 07:50:51,570][53252] Updated weights for policy 0, policy_version 76280 (0.0009) [2023-10-10 07:50:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156139520. Throughput: 0: 1678.6, 1: 1674.6. Samples: 39045822. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:50:51,785][52050] Avg episode reward: [(0, '21.430'), (1, '22.780')] [2023-10-10 07:50:54,083][53268] Updated weights for policy 1, policy_version 76230 (0.0008) [2023-10-10 07:50:54,447][53268] Updated weights for policy 1, policy_version 76240 (0.0008) [2023-10-10 07:50:54,806][53268] Updated weights for policy 1, policy_version 76250 (0.0010) [2023-10-10 07:50:55,566][53252] Updated weights for policy 0, policy_version 76290 (0.0009) [2023-10-10 07:50:55,943][53252] Updated weights for policy 0, policy_version 76300 (0.0009) [2023-10-10 07:50:56,318][53252] Updated weights for policy 0, policy_version 76310 (0.0009) [2023-10-10 07:50:56,694][53252] Updated weights for policy 0, policy_version 76320 (0.0007) [2023-10-10 07:50:56,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 156237824. Throughput: 0: 1655.6, 1: 1696.4. Samples: 39065718. Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:50:56,784][52050] Avg episode reward: [(0, '22.380'), (1, '23.620')] [2023-10-10 07:50:58,832][53268] Updated weights for policy 1, policy_version 76260 (0.0009) [2023-10-10 07:50:59,195][53268] Updated weights for policy 1, policy_version 76270 (0.0010) [2023-10-10 07:50:59,562][53268] Updated weights for policy 1, policy_version 76280 (0.0011) [2023-10-10 07:51:00,787][53252] Updated weights for policy 0, policy_version 76330 (0.0007) [2023-10-10 07:51:01,155][53252] Updated weights for policy 0, policy_version 76340 (0.0008) [2023-10-10 07:51:01,536][53252] Updated weights for policy 0, policy_version 76350 (0.0009) [2023-10-10 07:51:01,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156303360. Throughput: 0: 1679.2, 1: 1688.8. Samples: 39076490. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:51:01,785][52050] Avg episode reward: [(0, '21.590'), (1, '21.830')] [2023-10-10 07:51:03,791][53268] Updated weights for policy 1, policy_version 76290 (0.0010) [2023-10-10 07:51:04,160][53268] Updated weights for policy 1, policy_version 76300 (0.0010) [2023-10-10 07:51:04,523][53268] Updated weights for policy 1, policy_version 76310 (0.0009) [2023-10-10 07:51:04,898][53268] Updated weights for policy 1, policy_version 76320 (0.0009) [2023-10-10 07:51:05,456][53252] Updated weights for policy 0, policy_version 76360 (0.0008) [2023-10-10 07:51:05,830][53252] Updated weights for policy 0, policy_version 76370 (0.0008) [2023-10-10 07:51:06,204][53252] Updated weights for policy 0, policy_version 76380 (0.0008) [2023-10-10 07:51:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156368896. Throughput: 0: 1677.2, 1: 1669.1. Samples: 39096092. Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:51:06,785][52050] Avg episode reward: [(0, '21.840'), (1, '23.020')] [2023-10-10 07:51:08,925][53268] Updated weights for policy 1, policy_version 76330 (0.0011) [2023-10-10 07:51:09,295][53268] Updated weights for policy 1, policy_version 76340 (0.0010) [2023-10-10 07:51:09,662][53268] Updated weights for policy 1, policy_version 76350 (0.0011) [2023-10-10 07:51:10,180][53252] Updated weights for policy 0, policy_version 76390 (0.0009) [2023-10-10 07:51:10,562][53252] Updated weights for policy 0, policy_version 76400 (0.0011) [2023-10-10 07:51:10,932][53252] Updated weights for policy 0, policy_version 76410 (0.0012) [2023-10-10 07:51:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156434432. Throughput: 0: 1658.9, 1: 1689.9. Samples: 39115718. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:51:11,785][52050] Avg episode reward: [(0, '21.860'), (1, '21.150')] [2023-10-10 07:51:13,665][53268] Updated weights for policy 1, policy_version 76360 (0.0011) [2023-10-10 07:51:14,030][53268] Updated weights for policy 1, policy_version 76370 (0.0009) [2023-10-10 07:51:14,396][53268] Updated weights for policy 1, policy_version 76380 (0.0007) [2023-10-10 07:51:15,009][53252] Updated weights for policy 0, policy_version 76420 (0.0008) [2023-10-10 07:51:15,386][53252] Updated weights for policy 0, policy_version 76430 (0.0007) [2023-10-10 07:51:15,752][53252] Updated weights for policy 0, policy_version 76440 (0.0007) [2023-10-10 07:51:16,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 156499968. Throughput: 0: 1684.0, 1: 1668.7. Samples: 39126584. Policy #0 lag: (min: 1.0, avg: 8.6, max: 33.0) [2023-10-10 07:51:16,784][52050] Avg episode reward: [(0, '22.660'), (1, '21.840')] [2023-10-10 07:51:18,586][53268] Updated weights for policy 1, policy_version 76390 (0.0010) [2023-10-10 07:51:18,957][53268] Updated weights for policy 1, policy_version 76400 (0.0009) [2023-10-10 07:51:19,334][53268] Updated weights for policy 1, policy_version 76410 (0.0007) [2023-10-10 07:51:19,765][53252] Updated weights for policy 0, policy_version 76450 (0.0008) [2023-10-10 07:51:20,137][53252] Updated weights for policy 0, policy_version 76460 (0.0011) [2023-10-10 07:51:20,505][53252] Updated weights for policy 0, policy_version 76470 (0.0010) [2023-10-10 07:51:20,884][53252] Updated weights for policy 0, policy_version 76480 (0.0010) [2023-10-10 07:51:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156565504. Throughput: 0: 1669.0, 1: 1673.8. Samples: 39146174. 
Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:21,784][52050] Avg episode reward: [(0, '21.080'), (1, '20.840')] [2023-10-10 07:51:23,286][53268] Updated weights for policy 1, policy_version 76420 (0.0008) [2023-10-10 07:51:23,692][53268] Updated weights for policy 1, policy_version 76430 (0.0007) [2023-10-10 07:51:24,060][53268] Updated weights for policy 1, policy_version 76440 (0.0008) [2023-10-10 07:51:25,060][53252] Updated weights for policy 0, policy_version 76490 (0.0010) [2023-10-10 07:51:25,421][53252] Updated weights for policy 0, policy_version 76500 (0.0010) [2023-10-10 07:51:25,781][53252] Updated weights for policy 0, policy_version 76510 (0.0009) [2023-10-10 07:51:26,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156631040. Throughput: 0: 1671.8, 1: 1686.2. Samples: 39166490. Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:26,784][52050] Avg episode reward: [(0, '23.270'), (1, '23.250')] [2023-10-10 07:51:28,106][53268] Updated weights for policy 1, policy_version 76450 (0.0010) [2023-10-10 07:51:28,479][53268] Updated weights for policy 1, policy_version 76460 (0.0009) [2023-10-10 07:51:28,845][53268] Updated weights for policy 1, policy_version 76470 (0.0008) [2023-10-10 07:51:29,212][53268] Updated weights for policy 1, policy_version 76480 (0.0010) [2023-10-10 07:51:29,734][53252] Updated weights for policy 0, policy_version 76520 (0.0009) [2023-10-10 07:51:30,106][53252] Updated weights for policy 0, policy_version 76530 (0.0010) [2023-10-10 07:51:30,485][53252] Updated weights for policy 0, policy_version 76540 (0.0009) [2023-10-10 07:51:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 156696576. Throughput: 0: 1695.2, 1: 1656.8. Samples: 39177096. 
Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:31,784][52050] Avg episode reward: [(0, '23.760'), (1, '20.610')] [2023-10-10 07:51:33,210][53268] Updated weights for policy 1, policy_version 76490 (0.0008) [2023-10-10 07:51:33,570][53268] Updated weights for policy 1, policy_version 76500 (0.0007) [2023-10-10 07:51:33,939][53268] Updated weights for policy 1, policy_version 76510 (0.0007) [2023-10-10 07:51:34,545][53252] Updated weights for policy 0, policy_version 76550 (0.0007) [2023-10-10 07:51:34,920][53252] Updated weights for policy 0, policy_version 76560 (0.0007) [2023-10-10 07:51:35,295][53252] Updated weights for policy 0, policy_version 76570 (0.0008) [2023-10-10 07:51:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156762112. Throughput: 0: 1678.0, 1: 1679.4. Samples: 39196908. Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:36,784][52050] Avg episode reward: [(0, '24.160'), (1, '20.970')] [2023-10-10 07:51:37,940][53268] Updated weights for policy 1, policy_version 76520 (0.0008) [2023-10-10 07:51:38,305][53268] Updated weights for policy 1, policy_version 76530 (0.0009) [2023-10-10 07:51:38,658][53268] Updated weights for policy 1, policy_version 76540 (0.0009) [2023-10-10 07:51:39,421][53252] Updated weights for policy 0, policy_version 76580 (0.0009) [2023-10-10 07:51:39,787][53252] Updated weights for policy 0, policy_version 76590 (0.0009) [2023-10-10 07:51:40,162][53252] Updated weights for policy 0, policy_version 76600 (0.0007) [2023-10-10 07:51:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156827648. Throughput: 0: 1693.6, 1: 1682.1. Samples: 39217626. 
Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:41,784][52050] Avg episode reward: [(0, '22.620'), (1, '21.440')] [2023-10-10 07:51:42,671][53268] Updated weights for policy 1, policy_version 76550 (0.0008) [2023-10-10 07:51:43,043][53268] Updated weights for policy 1, policy_version 76560 (0.0007) [2023-10-10 07:51:43,408][53268] Updated weights for policy 1, policy_version 76570 (0.0009) [2023-10-10 07:51:44,050][53252] Updated weights for policy 0, policy_version 76610 (0.0007) [2023-10-10 07:51:44,429][53252] Updated weights for policy 0, policy_version 76620 (0.0008) [2023-10-10 07:51:44,796][53252] Updated weights for policy 0, policy_version 76630 (0.0011) [2023-10-10 07:51:45,162][53252] Updated weights for policy 0, policy_version 76640 (0.0010) [2023-10-10 07:51:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156893184. Throughput: 0: 1692.5, 1: 1666.6. Samples: 39227646. Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:46,784][52050] Avg episode reward: [(0, '21.940'), (1, '20.670')] [2023-10-10 07:51:47,240][53268] Updated weights for policy 1, policy_version 76580 (0.0010) [2023-10-10 07:51:47,604][53268] Updated weights for policy 1, policy_version 76590 (0.0009) [2023-10-10 07:51:47,965][53268] Updated weights for policy 1, policy_version 76600 (0.0008) [2023-10-10 07:51:49,432][53252] Updated weights for policy 0, policy_version 76650 (0.0009) [2023-10-10 07:51:49,804][53252] Updated weights for policy 0, policy_version 76660 (0.0007) [2023-10-10 07:51:50,183][53252] Updated weights for policy 0, policy_version 76670 (0.0009) [2023-10-10 07:51:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156958720. Throughput: 0: 1669.4, 1: 1695.2. Samples: 39247496. 
Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:51,784][52050] Avg episode reward: [(0, '21.600'), (1, '20.630')] [2023-10-10 07:51:51,964][53268] Updated weights for policy 1, policy_version 76610 (0.0008) [2023-10-10 07:51:52,326][53268] Updated weights for policy 1, policy_version 76620 (0.0008) [2023-10-10 07:51:52,702][53268] Updated weights for policy 1, policy_version 76630 (0.0007) [2023-10-10 07:51:53,058][53268] Updated weights for policy 1, policy_version 76640 (0.0009) [2023-10-10 07:51:54,408][53252] Updated weights for policy 0, policy_version 76680 (0.0009) [2023-10-10 07:51:54,784][53252] Updated weights for policy 0, policy_version 76690 (0.0008) [2023-10-10 07:51:55,151][53252] Updated weights for policy 0, policy_version 76700 (0.0009) [2023-10-10 07:51:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157024256. Throughput: 0: 1691.3, 1: 1700.6. Samples: 39268354. Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:51:56,784][52050] Avg episode reward: [(0, '20.930'), (1, '20.060')] [2023-10-10 07:51:57,103][53268] Updated weights for policy 1, policy_version 76650 (0.0008) [2023-10-10 07:51:57,464][53268] Updated weights for policy 1, policy_version 76660 (0.0010) [2023-10-10 07:51:57,827][53268] Updated weights for policy 1, policy_version 76670 (0.0011) [2023-10-10 07:51:59,196][53252] Updated weights for policy 0, policy_version 76710 (0.0008) [2023-10-10 07:51:59,569][53252] Updated weights for policy 0, policy_version 76720 (0.0009) [2023-10-10 07:51:59,937][53252] Updated weights for policy 0, policy_version 76730 (0.0007) [2023-10-10 07:52:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157089792. Throughput: 0: 1683.9, 1: 1690.5. Samples: 39278432. 
Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:52:01,784][52050] Avg episode reward: [(0, '20.080'), (1, '21.470')] [2023-10-10 07:52:01,963][53268] Updated weights for policy 1, policy_version 76680 (0.0010) [2023-10-10 07:52:02,335][53268] Updated weights for policy 1, policy_version 76690 (0.0009) [2023-10-10 07:52:02,698][53268] Updated weights for policy 1, policy_version 76700 (0.0011) [2023-10-10 07:52:03,925][53252] Updated weights for policy 0, policy_version 76740 (0.0009) [2023-10-10 07:52:04,306][53252] Updated weights for policy 0, policy_version 76750 (0.0011) [2023-10-10 07:52:04,674][53252] Updated weights for policy 0, policy_version 76760 (0.0009) [2023-10-10 07:52:06,704][53268] Updated weights for policy 1, policy_version 76710 (0.0009) [2023-10-10 07:52:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157155328. Throughput: 0: 1677.7, 1: 1706.9. Samples: 39298484. Policy #0 lag: (min: 10.0, avg: 21.9, max: 42.0) [2023-10-10 07:52:06,784][52050] Avg episode reward: [(0, '20.940'), (1, '21.430')] [2023-10-10 07:52:07,071][53268] Updated weights for policy 1, policy_version 76720 (0.0008) [2023-10-10 07:52:07,432][53268] Updated weights for policy 1, policy_version 76730 (0.0008) [2023-10-10 07:52:08,549][53252] Updated weights for policy 0, policy_version 76770 (0.0009) [2023-10-10 07:52:08,915][53252] Updated weights for policy 0, policy_version 76780 (0.0007) [2023-10-10 07:52:09,281][53252] Updated weights for policy 0, policy_version 76790 (0.0008) [2023-10-10 07:52:09,658][53252] Updated weights for policy 0, policy_version 76800 (0.0007) [2023-10-10 07:52:11,684][53268] Updated weights for policy 1, policy_version 76740 (0.0009) [2023-10-10 07:52:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157220864. Throughput: 0: 1688.5, 1: 1704.8. Samples: 39319186. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:11,784][52050] Avg episode reward: [(0, '20.420'), (1, '20.460')] [2023-10-10 07:52:12,090][53268] Updated weights for policy 1, policy_version 76750 (0.0010) [2023-10-10 07:52:12,451][53268] Updated weights for policy 1, policy_version 76760 (0.0010) [2023-10-10 07:52:13,621][53252] Updated weights for policy 0, policy_version 76810 (0.0010) [2023-10-10 07:52:13,999][53252] Updated weights for policy 0, policy_version 76820 (0.0008) [2023-10-10 07:52:14,363][53252] Updated weights for policy 0, policy_version 76830 (0.0008) [2023-10-10 07:52:16,597][53268] Updated weights for policy 1, policy_version 76770 (0.0009) [2023-10-10 07:52:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 157286400. Throughput: 0: 1662.7, 1: 1698.5. Samples: 39328352. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:16,784][52050] Avg episode reward: [(0, '22.360'), (1, '19.430')] [2023-10-10 07:52:16,966][53268] Updated weights for policy 1, policy_version 76780 (0.0007) [2023-10-10 07:52:17,328][53268] Updated weights for policy 1, policy_version 76790 (0.0007) [2023-10-10 07:52:17,694][53268] Updated weights for policy 1, policy_version 76800 (0.0009) [2023-10-10 07:52:18,399][53252] Updated weights for policy 0, policy_version 76840 (0.0009) [2023-10-10 07:52:18,771][53252] Updated weights for policy 0, policy_version 76850 (0.0007) [2023-10-10 07:52:19,139][53252] Updated weights for policy 0, policy_version 76860 (0.0007) [2023-10-10 07:52:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157351936. Throughput: 0: 1678.0, 1: 1700.7. Samples: 39348948. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:21,784][52050] Avg episode reward: [(0, '21.480'), (1, '22.060')] [2023-10-10 07:52:21,825][53268] Updated weights for policy 1, policy_version 76810 (0.0008) [2023-10-10 07:52:22,199][53268] Updated weights for policy 1, policy_version 76820 (0.0008) [2023-10-10 07:52:22,565][53268] Updated weights for policy 1, policy_version 76830 (0.0008) [2023-10-10 07:52:23,190][53252] Updated weights for policy 0, policy_version 76870 (0.0007) [2023-10-10 07:52:23,566][53252] Updated weights for policy 0, policy_version 76880 (0.0008) [2023-10-10 07:52:23,946][53252] Updated weights for policy 0, policy_version 76890 (0.0010) [2023-10-10 07:52:26,547][53268] Updated weights for policy 1, policy_version 76840 (0.0007) [2023-10-10 07:52:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157417472. Throughput: 0: 1685.8, 1: 1695.4. Samples: 39369782. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:26,784][52050] Avg episode reward: [(0, '22.950'), (1, '21.070')] [2023-10-10 07:52:26,908][53268] Updated weights for policy 1, policy_version 76850 (0.0007) [2023-10-10 07:52:27,278][53268] Updated weights for policy 1, policy_version 76860 (0.0007) [2023-10-10 07:52:27,919][53252] Updated weights for policy 0, policy_version 76900 (0.0008) [2023-10-10 07:52:28,285][53252] Updated weights for policy 0, policy_version 76910 (0.0007) [2023-10-10 07:52:28,654][53252] Updated weights for policy 0, policy_version 76920 (0.0009) [2023-10-10 07:52:31,441][53268] Updated weights for policy 1, policy_version 76870 (0.0009) [2023-10-10 07:52:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 157483008. Throughput: 0: 1670.7, 1: 1695.4. Samples: 39379120. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:31,784][52050] Avg episode reward: [(0, '21.350'), (1, '20.460')] [2023-10-10 07:52:31,803][53268] Updated weights for policy 1, policy_version 76880 (0.0010) [2023-10-10 07:52:32,167][53268] Updated weights for policy 1, policy_version 76890 (0.0009) [2023-10-10 07:52:32,693][53252] Updated weights for policy 0, policy_version 76930 (0.0009) [2023-10-10 07:52:33,064][53252] Updated weights for policy 0, policy_version 76940 (0.0010) [2023-10-10 07:52:33,446][53252] Updated weights for policy 0, policy_version 76950 (0.0008) [2023-10-10 07:52:33,810][53252] Updated weights for policy 0, policy_version 76960 (0.0009) [2023-10-10 07:52:36,388][53268] Updated weights for policy 1, policy_version 76900 (0.0009) [2023-10-10 07:52:36,757][53268] Updated weights for policy 1, policy_version 76910 (0.0010) [2023-10-10 07:52:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 157548544. Throughput: 0: 1698.1, 1: 1688.0. Samples: 39399872. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:36,784][52050] Avg episode reward: [(0, '20.880'), (1, '21.400')] [2023-10-10 07:52:37,128][53268] Updated weights for policy 1, policy_version 76920 (0.0009) [2023-10-10 07:52:37,814][53252] Updated weights for policy 0, policy_version 76970 (0.0008) [2023-10-10 07:52:38,187][53252] Updated weights for policy 0, policy_version 76980 (0.0008) [2023-10-10 07:52:38,560][53252] Updated weights for policy 0, policy_version 76990 (0.0008) [2023-10-10 07:52:41,017][53268] Updated weights for policy 1, policy_version 76930 (0.0008) [2023-10-10 07:52:41,376][53268] Updated weights for policy 1, policy_version 76940 (0.0009) [2023-10-10 07:52:41,734][53268] Updated weights for policy 1, policy_version 76950 (0.0010) [2023-10-10 07:52:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 157614080. Throughput: 0: 1705.5, 1: 1680.0. 
Samples: 39420700. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:41,784][52050] Avg episode reward: [(0, '22.190'), (1, '23.350')] [2023-10-10 07:52:41,790][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000076992_78839808.pth... [2023-10-10 07:52:41,821][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000075424_77234176.pth [2023-10-10 07:52:42,100][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000076960_78807040.pth... [2023-10-10 07:52:42,104][53268] Updated weights for policy 1, policy_version 76960 (0.0010) [2023-10-10 07:52:42,138][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000075360_77168640.pth [2023-10-10 07:52:42,617][53252] Updated weights for policy 0, policy_version 77000 (0.0008) [2023-10-10 07:52:42,982][53252] Updated weights for policy 0, policy_version 77010 (0.0008) [2023-10-10 07:52:43,360][53252] Updated weights for policy 0, policy_version 77020 (0.0010) [2023-10-10 07:52:46,020][53268] Updated weights for policy 1, policy_version 76970 (0.0009) [2023-10-10 07:52:46,393][53268] Updated weights for policy 1, policy_version 76980 (0.0009) [2023-10-10 07:52:46,751][53268] Updated weights for policy 1, policy_version 76990 (0.0009) [2023-10-10 07:52:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 157679616. Throughput: 0: 1683.6, 1: 1687.6. Samples: 39430140. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:46,784][52050] Avg episode reward: [(0, '23.140'), (1, '22.020')] [2023-10-10 07:52:47,378][53252] Updated weights for policy 0, policy_version 77030 (0.0008) [2023-10-10 07:52:47,750][53252] Updated weights for policy 0, policy_version 77040 (0.0008) [2023-10-10 07:52:48,121][53252] Updated weights for policy 0, policy_version 77050 (0.0008) [2023-10-10 07:52:50,835][53268] Updated weights for policy 1, policy_version 77000 (0.0010) [2023-10-10 07:52:51,192][53268] Updated weights for policy 1, policy_version 77010 (0.0011) [2023-10-10 07:52:51,559][53268] Updated weights for policy 1, policy_version 77020 (0.0010) [2023-10-10 07:52:51,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157777920. Throughput: 0: 1703.6, 1: 1685.0. Samples: 39450968. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:51,784][52050] Avg episode reward: [(0, '22.680'), (1, '21.110')] [2023-10-10 07:52:52,096][53252] Updated weights for policy 0, policy_version 77060 (0.0009) [2023-10-10 07:52:52,469][53252] Updated weights for policy 0, policy_version 77070 (0.0008) [2023-10-10 07:52:52,832][53252] Updated weights for policy 0, policy_version 77080 (0.0009) [2023-10-10 07:52:55,544][53268] Updated weights for policy 1, policy_version 77030 (0.0010) [2023-10-10 07:52:55,910][53268] Updated weights for policy 1, policy_version 77040 (0.0008) [2023-10-10 07:52:56,283][53268] Updated weights for policy 1, policy_version 77050 (0.0011) [2023-10-10 07:52:56,761][53252] Updated weights for policy 0, policy_version 77090 (0.0007) [2023-10-10 07:52:56,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 157843456. Throughput: 0: 1704.8, 1: 1669.1. Samples: 39471012. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-10 07:52:56,784][52050] Avg episode reward: [(0, '22.510'), (1, '21.340')] [2023-10-10 07:52:57,138][53252] Updated weights for policy 0, policy_version 77100 (0.0009) [2023-10-10 07:52:57,499][53252] Updated weights for policy 0, policy_version 77110 (0.0009) [2023-10-10 07:52:57,876][53252] Updated weights for policy 0, policy_version 77120 (0.0010) [2023-10-10 07:53:00,410][53268] Updated weights for policy 1, policy_version 77060 (0.0011) [2023-10-10 07:53:00,806][53268] Updated weights for policy 1, policy_version 77070 (0.0008) [2023-10-10 07:53:01,178][53268] Updated weights for policy 1, policy_version 77080 (0.0010) [2023-10-10 07:53:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 157908992. Throughput: 0: 1699.5, 1: 1694.8. Samples: 39481094. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:01,784][52050] Avg episode reward: [(0, '22.120'), (1, '22.880')] [2023-10-10 07:53:01,860][53252] Updated weights for policy 0, policy_version 77130 (0.0008) [2023-10-10 07:53:02,240][53252] Updated weights for policy 0, policy_version 77140 (0.0009) [2023-10-10 07:53:02,597][53252] Updated weights for policy 0, policy_version 77150 (0.0007) [2023-10-10 07:53:05,220][53268] Updated weights for policy 1, policy_version 77090 (0.0010) [2023-10-10 07:53:05,596][53268] Updated weights for policy 1, policy_version 77100 (0.0009) [2023-10-10 07:53:05,962][53268] Updated weights for policy 1, policy_version 77110 (0.0009) [2023-10-10 07:53:06,333][53268] Updated weights for policy 1, policy_version 77120 (0.0011) [2023-10-10 07:53:06,753][53252] Updated weights for policy 0, policy_version 77160 (0.0008) [2023-10-10 07:53:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 157974528. Throughput: 0: 1706.5, 1: 1691.9. Samples: 39501876. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:06,784][52050] Avg episode reward: [(0, '22.010'), (1, '22.280')] [2023-10-10 07:53:07,126][53252] Updated weights for policy 0, policy_version 77170 (0.0008) [2023-10-10 07:53:07,492][53252] Updated weights for policy 0, policy_version 77180 (0.0008) [2023-10-10 07:53:10,302][53268] Updated weights for policy 1, policy_version 77130 (0.0008) [2023-10-10 07:53:10,663][53268] Updated weights for policy 1, policy_version 77140 (0.0009) [2023-10-10 07:53:11,031][53268] Updated weights for policy 1, policy_version 77150 (0.0007) [2023-10-10 07:53:11,704][53252] Updated weights for policy 0, policy_version 77190 (0.0008) [2023-10-10 07:53:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158040064. Throughput: 0: 1701.3, 1: 1668.0. Samples: 39521400. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:11,784][52050] Avg episode reward: [(0, '21.460'), (1, '22.530')] [2023-10-10 07:53:12,078][53252] Updated weights for policy 0, policy_version 77200 (0.0008) [2023-10-10 07:53:12,458][53252] Updated weights for policy 0, policy_version 77210 (0.0009) [2023-10-10 07:53:15,225][53268] Updated weights for policy 1, policy_version 77160 (0.0008) [2023-10-10 07:53:15,587][53268] Updated weights for policy 1, policy_version 77170 (0.0010) [2023-10-10 07:53:15,946][53268] Updated weights for policy 1, policy_version 77180 (0.0008) [2023-10-10 07:53:16,572][53252] Updated weights for policy 0, policy_version 77220 (0.0009) [2023-10-10 07:53:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158105600. Throughput: 0: 1697.7, 1: 1695.1. Samples: 39531794. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:16,784][52050] Avg episode reward: [(0, '22.650'), (1, '25.070')] [2023-10-10 07:53:16,786][53061] Saving new best policy, reward=25.070! 
[2023-10-10 07:53:16,944][53252] Updated weights for policy 0, policy_version 77230 (0.0009) [2023-10-10 07:53:17,314][53252] Updated weights for policy 0, policy_version 77240 (0.0009) [2023-10-10 07:53:19,905][53268] Updated weights for policy 1, policy_version 77190 (0.0010) [2023-10-10 07:53:20,273][53268] Updated weights for policy 1, policy_version 77200 (0.0009) [2023-10-10 07:53:20,636][53268] Updated weights for policy 1, policy_version 77210 (0.0010) [2023-10-10 07:53:21,529][53252] Updated weights for policy 0, policy_version 77250 (0.0009) [2023-10-10 07:53:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158171136. Throughput: 0: 1695.8, 1: 1689.1. Samples: 39552192. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:21,784][52050] Avg episode reward: [(0, '24.080'), (1, '24.280')] [2023-10-10 07:53:21,909][53252] Updated weights for policy 0, policy_version 77260 (0.0008) [2023-10-10 07:53:22,287][53252] Updated weights for policy 0, policy_version 77270 (0.0008) [2023-10-10 07:53:22,666][53252] Updated weights for policy 0, policy_version 77280 (0.0007) [2023-10-10 07:53:24,590][53268] Updated weights for policy 1, policy_version 77220 (0.0007) [2023-10-10 07:53:24,958][53268] Updated weights for policy 1, policy_version 77230 (0.0008) [2023-10-10 07:53:25,316][53268] Updated weights for policy 1, policy_version 77240 (0.0010) [2023-10-10 07:53:26,516][53252] Updated weights for policy 0, policy_version 77290 (0.0008) [2023-10-10 07:53:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158236672. Throughput: 0: 1687.4, 1: 1677.6. Samples: 39572126. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:26,784][52050] Avg episode reward: [(0, '21.820'), (1, '22.270')] [2023-10-10 07:53:26,889][53252] Updated weights for policy 0, policy_version 77300 (0.0007) [2023-10-10 07:53:27,267][53252] Updated weights for policy 0, policy_version 77310 (0.0007) [2023-10-10 07:53:29,263][53268] Updated weights for policy 1, policy_version 77250 (0.0010) [2023-10-10 07:53:29,630][53268] Updated weights for policy 1, policy_version 77260 (0.0009) [2023-10-10 07:53:30,007][53268] Updated weights for policy 1, policy_version 77270 (0.0011) [2023-10-10 07:53:30,365][53268] Updated weights for policy 1, policy_version 77280 (0.0008) [2023-10-10 07:53:31,482][53252] Updated weights for policy 0, policy_version 77320 (0.0007) [2023-10-10 07:53:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158302208. Throughput: 0: 1692.1, 1: 1698.4. Samples: 39582716. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:31,784][52050] Avg episode reward: [(0, '21.690'), (1, '21.900')] [2023-10-10 07:53:31,863][53252] Updated weights for policy 0, policy_version 77330 (0.0007) [2023-10-10 07:53:32,220][53252] Updated weights for policy 0, policy_version 77340 (0.0007) [2023-10-10 07:53:34,401][53268] Updated weights for policy 1, policy_version 77290 (0.0010) [2023-10-10 07:53:34,760][53268] Updated weights for policy 1, policy_version 77300 (0.0008) [2023-10-10 07:53:35,130][53268] Updated weights for policy 1, policy_version 77310 (0.0010) [2023-10-10 07:53:36,271][53252] Updated weights for policy 0, policy_version 77350 (0.0011) [2023-10-10 07:53:36,644][53252] Updated weights for policy 0, policy_version 77360 (0.0010) [2023-10-10 07:53:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158367744. Throughput: 0: 1691.6, 1: 1673.5. Samples: 39602394. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:36,784][52050] Avg episode reward: [(0, '21.850'), (1, '24.260')] [2023-10-10 07:53:37,026][53252] Updated weights for policy 0, policy_version 77370 (0.0011) [2023-10-10 07:53:39,145][53268] Updated weights for policy 1, policy_version 77320 (0.0009) [2023-10-10 07:53:39,509][53268] Updated weights for policy 1, policy_version 77330 (0.0008) [2023-10-10 07:53:39,890][53268] Updated weights for policy 1, policy_version 77340 (0.0009) [2023-10-10 07:53:40,946][53252] Updated weights for policy 0, policy_version 77380 (0.0011) [2023-10-10 07:53:41,328][53252] Updated weights for policy 0, policy_version 77390 (0.0010) [2023-10-10 07:53:41,687][53252] Updated weights for policy 0, policy_version 77400 (0.0009) [2023-10-10 07:53:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158433280. Throughput: 0: 1678.4, 1: 1688.1. Samples: 39622508. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:41,784][52050] Avg episode reward: [(0, '21.550'), (1, '21.380')] [2023-10-10 07:53:43,982][53268] Updated weights for policy 1, policy_version 77350 (0.0008) [2023-10-10 07:53:44,364][53268] Updated weights for policy 1, policy_version 77360 (0.0008) [2023-10-10 07:53:44,730][53268] Updated weights for policy 1, policy_version 77370 (0.0010) [2023-10-10 07:53:45,733][53252] Updated weights for policy 0, policy_version 77410 (0.0009) [2023-10-10 07:53:46,098][53252] Updated weights for policy 0, policy_version 77420 (0.0008) [2023-10-10 07:53:46,455][53252] Updated weights for policy 0, policy_version 77430 (0.0008) [2023-10-10 07:53:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 158498816. Throughput: 0: 1688.6, 1: 1689.1. Samples: 39633090. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) [2023-10-10 07:53:46,784][52050] Avg episode reward: [(0, '20.590'), (1, '21.470')] [2023-10-10 07:53:46,827][53252] Updated weights for policy 0, policy_version 77440 (0.0007) [2023-10-10 07:53:48,932][53268] Updated weights for policy 1, policy_version 77380 (0.0010) [2023-10-10 07:53:49,306][53268] Updated weights for policy 1, policy_version 77390 (0.0009) [2023-10-10 07:53:49,667][53268] Updated weights for policy 1, policy_version 77400 (0.0007) [2023-10-10 07:53:50,909][53252] Updated weights for policy 0, policy_version 77450 (0.0009) [2023-10-10 07:53:51,285][53252] Updated weights for policy 0, policy_version 77460 (0.0009) [2023-10-10 07:53:51,641][53252] Updated weights for policy 0, policy_version 77470 (0.0008) [2023-10-10 07:53:51,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158597120. Throughput: 0: 1684.6, 1: 1668.9. Samples: 39652784. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:53:51,784][52050] Avg episode reward: [(0, '21.880'), (1, '21.280')] [2023-10-10 07:53:53,632][53268] Updated weights for policy 1, policy_version 77410 (0.0009) [2023-10-10 07:53:54,014][53268] Updated weights for policy 1, policy_version 77420 (0.0008) [2023-10-10 07:53:54,390][53268] Updated weights for policy 1, policy_version 77430 (0.0007) [2023-10-10 07:53:54,760][53268] Updated weights for policy 1, policy_version 77440 (0.0009) [2023-10-10 07:53:55,652][53252] Updated weights for policy 0, policy_version 77480 (0.0008) [2023-10-10 07:53:56,012][53252] Updated weights for policy 0, policy_version 77490 (0.0008) [2023-10-10 07:53:56,392][53252] Updated weights for policy 0, policy_version 77500 (0.0007) [2023-10-10 07:53:56,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158662656. Throughput: 0: 1666.1, 1: 1692.3. Samples: 39672528. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:53:56,784][52050] Avg episode reward: [(0, '23.960'), (1, '20.740')] [2023-10-10 07:53:58,997][53268] Updated weights for policy 1, policy_version 77450 (0.0008) [2023-10-10 07:53:59,366][53268] Updated weights for policy 1, policy_version 77460 (0.0008) [2023-10-10 07:53:59,733][53268] Updated weights for policy 1, policy_version 77470 (0.0009) [2023-10-10 07:54:00,398][53252] Updated weights for policy 0, policy_version 77510 (0.0008) [2023-10-10 07:54:00,779][53252] Updated weights for policy 0, policy_version 77520 (0.0010) [2023-10-10 07:54:01,153][53252] Updated weights for policy 0, policy_version 77530 (0.0008) [2023-10-10 07:54:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158728192. Throughput: 0: 1688.9, 1: 1676.7. Samples: 39683244. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:01,784][52050] Avg episode reward: [(0, '23.220'), (1, '20.400')] [2023-10-10 07:54:03,910][53268] Updated weights for policy 1, policy_version 77480 (0.0009) [2023-10-10 07:54:04,277][53268] Updated weights for policy 1, policy_version 77490 (0.0010) [2023-10-10 07:54:04,642][53268] Updated weights for policy 1, policy_version 77500 (0.0008) [2023-10-10 07:54:05,109][53252] Updated weights for policy 0, policy_version 77540 (0.0008) [2023-10-10 07:54:05,478][53252] Updated weights for policy 0, policy_version 77550 (0.0010) [2023-10-10 07:54:05,850][53252] Updated weights for policy 0, policy_version 77560 (0.0007) [2023-10-10 07:54:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 158793728. Throughput: 0: 1682.8, 1: 1667.3. Samples: 39702950. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:06,784][52050] Avg episode reward: [(0, '22.500'), (1, '21.750')] [2023-10-10 07:54:08,542][53268] Updated weights for policy 1, policy_version 77510 (0.0008) [2023-10-10 07:54:08,906][53268] Updated weights for policy 1, policy_version 77520 (0.0008) [2023-10-10 07:54:09,274][53268] Updated weights for policy 1, policy_version 77530 (0.0010) [2023-10-10 07:54:09,809][53252] Updated weights for policy 0, policy_version 77570 (0.0010) [2023-10-10 07:54:10,172][53252] Updated weights for policy 0, policy_version 77580 (0.0010) [2023-10-10 07:54:10,540][53252] Updated weights for policy 0, policy_version 77590 (0.0009) [2023-10-10 07:54:10,910][53252] Updated weights for policy 0, policy_version 77600 (0.0011) [2023-10-10 07:54:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158859264. Throughput: 0: 1667.8, 1: 1682.9. Samples: 39722910. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:11,785][52050] Avg episode reward: [(0, '23.990'), (1, '22.140')] [2023-10-10 07:54:13,219][53268] Updated weights for policy 1, policy_version 77540 (0.0008) [2023-10-10 07:54:13,588][53268] Updated weights for policy 1, policy_version 77550 (0.0007) [2023-10-10 07:54:13,956][53268] Updated weights for policy 1, policy_version 77560 (0.0007) [2023-10-10 07:54:14,961][53252] Updated weights for policy 0, policy_version 77610 (0.0008) [2023-10-10 07:54:15,333][53252] Updated weights for policy 0, policy_version 77620 (0.0009) [2023-10-10 07:54:15,707][53252] Updated weights for policy 0, policy_version 77630 (0.0010) [2023-10-10 07:54:16,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158924800. Throughput: 0: 1690.3, 1: 1661.1. Samples: 39733530. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:16,784][52050] Avg episode reward: [(0, '21.610'), (1, '21.510')] [2023-10-10 07:54:18,048][53268] Updated weights for policy 1, policy_version 77570 (0.0009) [2023-10-10 07:54:18,421][53268] Updated weights for policy 1, policy_version 77580 (0.0007) [2023-10-10 07:54:18,788][53268] Updated weights for policy 1, policy_version 77590 (0.0010) [2023-10-10 07:54:19,152][53268] Updated weights for policy 1, policy_version 77600 (0.0008) [2023-10-10 07:54:19,745][53252] Updated weights for policy 0, policy_version 77640 (0.0010) [2023-10-10 07:54:20,124][53252] Updated weights for policy 0, policy_version 77650 (0.0009) [2023-10-10 07:54:20,489][53252] Updated weights for policy 0, policy_version 77660 (0.0010) [2023-10-10 07:54:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158990336. Throughput: 0: 1673.1, 1: 1681.3. Samples: 39753338. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:21,784][52050] Avg episode reward: [(0, '21.240'), (1, '20.850')] [2023-10-10 07:54:23,349][53268] Updated weights for policy 1, policy_version 77610 (0.0008) [2023-10-10 07:54:23,721][53268] Updated weights for policy 1, policy_version 77620 (0.0008) [2023-10-10 07:54:24,090][53268] Updated weights for policy 1, policy_version 77630 (0.0009) [2023-10-10 07:54:24,466][53252] Updated weights for policy 0, policy_version 77670 (0.0008) [2023-10-10 07:54:24,839][53252] Updated weights for policy 0, policy_version 77680 (0.0007) [2023-10-10 07:54:25,207][53252] Updated weights for policy 0, policy_version 77690 (0.0007) [2023-10-10 07:54:26,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 159055872. Throughput: 0: 1680.5, 1: 1685.3. Samples: 39773970. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:26,784][52050] Avg episode reward: [(0, '22.960'), (1, '21.090')] [2023-10-10 07:54:28,211][53268] Updated weights for policy 1, policy_version 77640 (0.0007) [2023-10-10 07:54:28,583][53268] Updated weights for policy 1, policy_version 77650 (0.0007) [2023-10-10 07:54:28,959][53268] Updated weights for policy 1, policy_version 77660 (0.0009) [2023-10-10 07:54:29,316][53252] Updated weights for policy 0, policy_version 77700 (0.0008) [2023-10-10 07:54:29,685][53252] Updated weights for policy 0, policy_version 77710 (0.0008) [2023-10-10 07:54:30,057][53252] Updated weights for policy 0, policy_version 77720 (0.0008) [2023-10-10 07:54:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159121408. Throughput: 0: 1693.1, 1: 1661.9. Samples: 39784066. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-10 07:54:31,784][52050] Avg episode reward: [(0, '22.500'), (1, '19.680')] [2023-10-10 07:54:32,975][53268] Updated weights for policy 1, policy_version 77670 (0.0009) [2023-10-10 07:54:33,342][53268] Updated weights for policy 1, policy_version 77680 (0.0010) [2023-10-10 07:54:33,714][53268] Updated weights for policy 1, policy_version 77690 (0.0010) [2023-10-10 07:54:34,035][53252] Updated weights for policy 0, policy_version 77730 (0.0008) [2023-10-10 07:54:34,403][53252] Updated weights for policy 0, policy_version 77740 (0.0007) [2023-10-10 07:54:34,778][53252] Updated weights for policy 0, policy_version 77750 (0.0007) [2023-10-10 07:54:35,148][53252] Updated weights for policy 0, policy_version 77760 (0.0008) [2023-10-10 07:54:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159186944. Throughput: 0: 1673.4, 1: 1683.0. Samples: 39803824. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:54:36,784][52050] Avg episode reward: [(0, '22.490'), (1, '19.950')] [2023-10-10 07:54:37,780][53268] Updated weights for policy 1, policy_version 77700 (0.0008) [2023-10-10 07:54:38,150][53268] Updated weights for policy 1, policy_version 77710 (0.0007) [2023-10-10 07:54:38,514][53268] Updated weights for policy 1, policy_version 77720 (0.0008) [2023-10-10 07:54:39,132][53252] Updated weights for policy 0, policy_version 77770 (0.0008) [2023-10-10 07:54:39,503][53252] Updated weights for policy 0, policy_version 77780 (0.0009) [2023-10-10 07:54:39,883][53252] Updated weights for policy 0, policy_version 77790 (0.0009) [2023-10-10 07:54:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159252480. Throughput: 0: 1695.7, 1: 1685.6. Samples: 39824686. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:54:41,784][52050] Avg episode reward: [(0, '22.000'), (1, '21.560')] [2023-10-10 07:54:41,798][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000077728_79593472.pth... [2023-10-10 07:54:41,798][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000077792_79659008.pth... 
[2023-10-10 07:54:41,843][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000076160_77987840.pth [2023-10-10 07:54:41,843][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000076224_78053376.pth [2023-10-10 07:54:42,595][53268] Updated weights for policy 1, policy_version 77730 (0.0008) [2023-10-10 07:54:43,016][53268] Updated weights for policy 1, policy_version 77740 (0.0007) [2023-10-10 07:54:43,372][53268] Updated weights for policy 1, policy_version 77750 (0.0010) [2023-10-10 07:54:43,737][53268] Updated weights for policy 1, policy_version 77760 (0.0009) [2023-10-10 07:54:43,977][53252] Updated weights for policy 0, policy_version 77800 (0.0009) [2023-10-10 07:54:44,339][53252] Updated weights for policy 0, policy_version 77810 (0.0009) [2023-10-10 07:54:44,707][53252] Updated weights for policy 0, policy_version 77820 (0.0009) [2023-10-10 07:54:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159318016. Throughput: 0: 1684.4, 1: 1668.6. Samples: 39834128. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:54:46,785][52050] Avg episode reward: [(0, '22.280'), (1, '20.130')] [2023-10-10 07:54:47,803][53268] Updated weights for policy 1, policy_version 77770 (0.0008) [2023-10-10 07:54:48,185][53268] Updated weights for policy 1, policy_version 77780 (0.0008) [2023-10-10 07:54:48,553][53268] Updated weights for policy 1, policy_version 77790 (0.0008) [2023-10-10 07:54:48,579][53252] Updated weights for policy 0, policy_version 77830 (0.0008) [2023-10-10 07:54:48,952][53252] Updated weights for policy 0, policy_version 77840 (0.0009) [2023-10-10 07:54:49,313][53252] Updated weights for policy 0, policy_version 77850 (0.0007) [2023-10-10 07:54:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 159383552. Throughput: 0: 1678.5, 1: 1683.3. Samples: 39854230. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:54:51,784][52050] Avg episode reward: [(0, '22.620'), (1, '21.450')] [2023-10-10 07:54:52,530][53268] Updated weights for policy 1, policy_version 77800 (0.0009) [2023-10-10 07:54:52,899][53268] Updated weights for policy 1, policy_version 77810 (0.0011) [2023-10-10 07:54:53,262][53268] Updated weights for policy 1, policy_version 77820 (0.0007) [2023-10-10 07:54:53,386][53252] Updated weights for policy 0, policy_version 77860 (0.0008) [2023-10-10 07:54:53,752][53252] Updated weights for policy 0, policy_version 77870 (0.0010) [2023-10-10 07:54:54,125][53252] Updated weights for policy 0, policy_version 77880 (0.0009) [2023-10-10 07:54:56,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 159449088. Throughput: 0: 1701.3, 1: 1682.7. Samples: 39875188. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:54:56,784][52050] Avg episode reward: [(0, '22.430'), (1, '22.410')] [2023-10-10 07:54:57,455][53268] Updated weights for policy 1, policy_version 77830 (0.0009) [2023-10-10 07:54:57,829][53268] Updated weights for policy 1, policy_version 77840 (0.0010) [2023-10-10 07:54:58,148][53252] Updated weights for policy 0, policy_version 77890 (0.0010) [2023-10-10 07:54:58,185][53268] Updated weights for policy 1, policy_version 77850 (0.0008) [2023-10-10 07:54:58,513][53252] Updated weights for policy 0, policy_version 77900 (0.0008) [2023-10-10 07:54:58,882][53252] Updated weights for policy 0, policy_version 77910 (0.0008) [2023-10-10 07:54:59,252][53252] Updated weights for policy 0, policy_version 77920 (0.0008) [2023-10-10 07:55:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159514624. Throughput: 0: 1675.3, 1: 1675.3. Samples: 39884304. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:55:01,784][52050] Avg episode reward: [(0, '20.670'), (1, '20.720')] [2023-10-10 07:55:02,191][53268] Updated weights for policy 1, policy_version 77860 (0.0008) [2023-10-10 07:55:02,560][53268] Updated weights for policy 1, policy_version 77870 (0.0008) [2023-10-10 07:55:02,935][53268] Updated weights for policy 1, policy_version 77880 (0.0009) [2023-10-10 07:55:03,292][53252] Updated weights for policy 0, policy_version 77930 (0.0010) [2023-10-10 07:55:03,661][53252] Updated weights for policy 0, policy_version 77940 (0.0007) [2023-10-10 07:55:04,028][53252] Updated weights for policy 0, policy_version 77950 (0.0007) [2023-10-10 07:55:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 159580160. Throughput: 0: 1691.6, 1: 1675.3. Samples: 39904848. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:55:06,784][52050] Avg episode reward: [(0, '22.690'), (1, '20.340')] [2023-10-10 07:55:06,944][53268] Updated weights for policy 1, policy_version 77890 (0.0008) [2023-10-10 07:55:07,311][53268] Updated weights for policy 1, policy_version 77900 (0.0010) [2023-10-10 07:55:07,681][53268] Updated weights for policy 1, policy_version 77910 (0.0008) [2023-10-10 07:55:08,042][53268] Updated weights for policy 1, policy_version 77920 (0.0007) [2023-10-10 07:55:08,186][53252] Updated weights for policy 0, policy_version 77960 (0.0009) [2023-10-10 07:55:08,565][53252] Updated weights for policy 0, policy_version 77970 (0.0009) [2023-10-10 07:55:08,932][53252] Updated weights for policy 0, policy_version 77980 (0.0008) [2023-10-10 07:55:11,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159645696. Throughput: 0: 1694.6, 1: 1677.5. Samples: 39925716. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:55:11,785][52050] Avg episode reward: [(0, '22.300'), (1, '22.850')] [2023-10-10 07:55:12,151][53268] Updated weights for policy 1, policy_version 77930 (0.0009) [2023-10-10 07:55:12,506][53268] Updated weights for policy 1, policy_version 77940 (0.0010) [2023-10-10 07:55:12,868][53268] Updated weights for policy 1, policy_version 77950 (0.0009) [2023-10-10 07:55:12,946][53252] Updated weights for policy 0, policy_version 77990 (0.0007) [2023-10-10 07:55:13,318][53252] Updated weights for policy 0, policy_version 78000 (0.0008) [2023-10-10 07:55:13,698][53252] Updated weights for policy 0, policy_version 78010 (0.0007) [2023-10-10 07:55:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 159711232. Throughput: 0: 1671.3, 1: 1680.0. Samples: 39934874. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:55:16,784][52050] Avg episode reward: [(0, '21.940'), (1, '23.050')] [2023-10-10 07:55:17,020][53268] Updated weights for policy 1, policy_version 77960 (0.0009) [2023-10-10 07:55:17,388][53268] Updated weights for policy 1, policy_version 77970 (0.0010) [2023-10-10 07:55:17,664][53252] Updated weights for policy 0, policy_version 78020 (0.0007) [2023-10-10 07:55:17,749][53268] Updated weights for policy 1, policy_version 77980 (0.0007) [2023-10-10 07:55:18,022][53252] Updated weights for policy 0, policy_version 78030 (0.0010) [2023-10-10 07:55:18,394][53252] Updated weights for policy 0, policy_version 78040 (0.0007) [2023-10-10 07:55:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159776768. Throughput: 0: 1694.2, 1: 1680.9. Samples: 39955704. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-10 07:55:21,785][52050] Avg episode reward: [(0, '20.940'), (1, '22.450')] [2023-10-10 07:55:21,929][53268] Updated weights for policy 1, policy_version 77990 (0.0009) [2023-10-10 07:55:22,299][53268] Updated weights for policy 1, policy_version 78000 (0.0010) [2023-10-10 07:55:22,665][53268] Updated weights for policy 1, policy_version 78010 (0.0008) [2023-10-10 07:55:22,682][53252] Updated weights for policy 0, policy_version 78050 (0.0009) [2023-10-10 07:55:23,045][53252] Updated weights for policy 0, policy_version 78060 (0.0010) [2023-10-10 07:55:23,413][53252] Updated weights for policy 0, policy_version 78070 (0.0009) [2023-10-10 07:55:23,774][53252] Updated weights for policy 0, policy_version 78080 (0.0009) [2023-10-10 07:55:26,571][53268] Updated weights for policy 1, policy_version 78020 (0.0007) [2023-10-10 07:55:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159842304. Throughput: 0: 1692.5, 1: 1685.8. Samples: 39976708. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:26,784][52050] Avg episode reward: [(0, '24.200'), (1, '23.540')] [2023-10-10 07:55:26,945][53268] Updated weights for policy 1, policy_version 78030 (0.0008) [2023-10-10 07:55:27,303][53268] Updated weights for policy 1, policy_version 78040 (0.0008) [2023-10-10 07:55:27,638][53252] Updated weights for policy 0, policy_version 78090 (0.0008) [2023-10-10 07:55:28,011][53252] Updated weights for policy 0, policy_version 78100 (0.0007) [2023-10-10 07:55:28,382][53252] Updated weights for policy 0, policy_version 78110 (0.0010) [2023-10-10 07:55:31,330][53268] Updated weights for policy 1, policy_version 78050 (0.0008) [2023-10-10 07:55:31,700][53268] Updated weights for policy 1, policy_version 78060 (0.0008) [2023-10-10 07:55:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159907840. Throughput: 0: 1680.7, 1: 1693.5. 
Samples: 39985966. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:31,784][52050] Avg episode reward: [(0, '23.680'), (1, '22.070')] [2023-10-10 07:55:32,066][53268] Updated weights for policy 1, policy_version 78070 (0.0007) [2023-10-10 07:55:32,428][53268] Updated weights for policy 1, policy_version 78080 (0.0008) [2023-10-10 07:55:32,559][53252] Updated weights for policy 0, policy_version 78120 (0.0008) [2023-10-10 07:55:32,931][53252] Updated weights for policy 0, policy_version 78130 (0.0007) [2023-10-10 07:55:33,299][53252] Updated weights for policy 0, policy_version 78140 (0.0007) [2023-10-10 07:55:36,603][53268] Updated weights for policy 1, policy_version 78090 (0.0009) [2023-10-10 07:55:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159973376. Throughput: 0: 1690.0, 1: 1695.2. Samples: 40006564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:36,784][52050] Avg episode reward: [(0, '22.640'), (1, '21.310')] [2023-10-10 07:55:36,969][53268] Updated weights for policy 1, policy_version 78100 (0.0008) [2023-10-10 07:55:37,322][53252] Updated weights for policy 0, policy_version 78150 (0.0008) [2023-10-10 07:55:37,338][53268] Updated weights for policy 1, policy_version 78110 (0.0008) [2023-10-10 07:55:37,689][53252] Updated weights for policy 0, policy_version 78160 (0.0008) [2023-10-10 07:55:38,072][53252] Updated weights for policy 0, policy_version 78170 (0.0009) [2023-10-10 07:55:41,436][53268] Updated weights for policy 1, policy_version 78120 (0.0008) [2023-10-10 07:55:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 160038912. Throughput: 0: 1683.9, 1: 1689.3. Samples: 40026982. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:41,784][52050] Avg episode reward: [(0, '22.910'), (1, '22.520')] [2023-10-10 07:55:41,808][53268] Updated weights for policy 1, policy_version 78130 (0.0009) [2023-10-10 07:55:42,108][53252] Updated weights for policy 0, policy_version 78180 (0.0008) [2023-10-10 07:55:42,172][53268] Updated weights for policy 1, policy_version 78140 (0.0007) [2023-10-10 07:55:42,483][53252] Updated weights for policy 0, policy_version 78190 (0.0008) [2023-10-10 07:55:42,850][53252] Updated weights for policy 0, policy_version 78200 (0.0008) [2023-10-10 07:55:46,218][53268] Updated weights for policy 1, policy_version 78150 (0.0010) [2023-10-10 07:55:46,583][53268] Updated weights for policy 1, policy_version 78160 (0.0011) [2023-10-10 07:55:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 160104448. Throughput: 0: 1684.4, 1: 1687.6. Samples: 40036044. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:46,784][52050] Avg episode reward: [(0, '22.240'), (1, '21.530')] [2023-10-10 07:55:46,950][53252] Updated weights for policy 0, policy_version 78210 (0.0009) [2023-10-10 07:55:46,959][53268] Updated weights for policy 1, policy_version 78170 (0.0011) [2023-10-10 07:55:47,317][53252] Updated weights for policy 0, policy_version 78220 (0.0009) [2023-10-10 07:55:47,691][53252] Updated weights for policy 0, policy_version 78230 (0.0009) [2023-10-10 07:55:48,062][53252] Updated weights for policy 0, policy_version 78240 (0.0008) [2023-10-10 07:55:51,041][53268] Updated weights for policy 1, policy_version 78180 (0.0008) [2023-10-10 07:55:51,415][53268] Updated weights for policy 1, policy_version 78190 (0.0009) [2023-10-10 07:55:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160169984. Throughput: 0: 1677.5, 1: 1693.7. Samples: 40056550. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:51,785][52050] Avg episode reward: [(0, '19.870'), (1, '20.880')] [2023-10-10 07:55:51,786][53268] Updated weights for policy 1, policy_version 78200 (0.0010) [2023-10-10 07:55:52,267][53252] Updated weights for policy 0, policy_version 78250 (0.0008) [2023-10-10 07:55:52,643][53252] Updated weights for policy 0, policy_version 78260 (0.0009) [2023-10-10 07:55:53,000][53252] Updated weights for policy 0, policy_version 78270 (0.0009) [2023-10-10 07:55:56,009][53268] Updated weights for policy 1, policy_version 78210 (0.0008) [2023-10-10 07:55:56,372][53268] Updated weights for policy 1, policy_version 78220 (0.0010) [2023-10-10 07:55:56,728][53268] Updated weights for policy 1, policy_version 78230 (0.0010) [2023-10-10 07:55:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160235520. Throughput: 0: 1679.5, 1: 1680.3. Samples: 40076908. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:55:56,784][52050] Avg episode reward: [(0, '21.760'), (1, '20.790')] [2023-10-10 07:55:57,092][53268] Updated weights for policy 1, policy_version 78240 (0.0008) [2023-10-10 07:55:57,122][53252] Updated weights for policy 0, policy_version 78280 (0.0009) [2023-10-10 07:55:57,494][53252] Updated weights for policy 0, policy_version 78290 (0.0008) [2023-10-10 07:55:57,865][53252] Updated weights for policy 0, policy_version 78300 (0.0008) [2023-10-10 07:56:01,093][53268] Updated weights for policy 1, policy_version 78250 (0.0008) [2023-10-10 07:56:01,462][53268] Updated weights for policy 1, policy_version 78260 (0.0009) [2023-10-10 07:56:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 160301056. Throughput: 0: 1679.4, 1: 1685.9. Samples: 40086310. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:56:01,784][52050] Avg episode reward: [(0, '20.930'), (1, '22.160')] [2023-10-10 07:56:01,830][53268] Updated weights for policy 1, policy_version 78270 (0.0008) [2023-10-10 07:56:01,896][53252] Updated weights for policy 0, policy_version 78310 (0.0008) [2023-10-10 07:56:02,262][53252] Updated weights for policy 0, policy_version 78320 (0.0009) [2023-10-10 07:56:02,627][53252] Updated weights for policy 0, policy_version 78330 (0.0008) [2023-10-10 07:56:05,837][53268] Updated weights for policy 1, policy_version 78280 (0.0009) [2023-10-10 07:56:06,204][53268] Updated weights for policy 1, policy_version 78290 (0.0010) [2023-10-10 07:56:06,570][53268] Updated weights for policy 1, policy_version 78300 (0.0008) [2023-10-10 07:56:06,704][53252] Updated weights for policy 0, policy_version 78340 (0.0007) [2023-10-10 07:56:06,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 160399360. Throughput: 0: 1676.4, 1: 1687.3. Samples: 40107068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:56:06,784][52050] Avg episode reward: [(0, '20.490'), (1, '21.290')] [2023-10-10 07:56:07,081][53252] Updated weights for policy 0, policy_version 78350 (0.0008) [2023-10-10 07:56:07,445][53252] Updated weights for policy 0, policy_version 78360 (0.0010) [2023-10-10 07:56:10,504][53268] Updated weights for policy 1, policy_version 78310 (0.0008) [2023-10-10 07:56:10,875][53268] Updated weights for policy 1, policy_version 78320 (0.0008) [2023-10-10 07:56:11,248][53268] Updated weights for policy 1, policy_version 78330 (0.0007) [2023-10-10 07:56:11,649][53252] Updated weights for policy 0, policy_version 78370 (0.0009) [2023-10-10 07:56:11,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 160464896. Throughput: 0: 1677.0, 1: 1663.9. Samples: 40127046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:56:11,784][52050] Avg episode reward: [(0, '21.300'), (1, '21.910')] [2023-10-10 07:56:12,018][53252] Updated weights for policy 0, policy_version 78380 (0.0007) [2023-10-10 07:56:12,393][53252] Updated weights for policy 0, policy_version 78390 (0.0007) [2023-10-10 07:56:12,766][53252] Updated weights for policy 0, policy_version 78400 (0.0008) [2023-10-10 07:56:15,201][53268] Updated weights for policy 1, policy_version 78340 (0.0009) [2023-10-10 07:56:15,573][53268] Updated weights for policy 1, policy_version 78350 (0.0008) [2023-10-10 07:56:15,937][53268] Updated weights for policy 1, policy_version 78360 (0.0007) [2023-10-10 07:56:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160530432. Throughput: 0: 1675.0, 1: 1688.0. Samples: 40137304. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:16,784][52050] Avg episode reward: [(0, '21.410'), (1, '22.570')] [2023-10-10 07:56:16,911][53252] Updated weights for policy 0, policy_version 78410 (0.0009) [2023-10-10 07:56:17,290][53252] Updated weights for policy 0, policy_version 78420 (0.0010) [2023-10-10 07:56:17,658][53252] Updated weights for policy 0, policy_version 78430 (0.0010) [2023-10-10 07:56:20,067][53268] Updated weights for policy 1, policy_version 78370 (0.0008) [2023-10-10 07:56:20,485][53268] Updated weights for policy 1, policy_version 78380 (0.0008) [2023-10-10 07:56:20,845][53268] Updated weights for policy 1, policy_version 78390 (0.0009) [2023-10-10 07:56:21,215][53268] Updated weights for policy 1, policy_version 78400 (0.0009) [2023-10-10 07:56:21,638][53252] Updated weights for policy 0, policy_version 78440 (0.0008) [2023-10-10 07:56:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160595968. Throughput: 0: 1677.8, 1: 1680.6. Samples: 40157694. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:21,784][52050] Avg episode reward: [(0, '20.170'), (1, '22.300')] [2023-10-10 07:56:22,016][53252] Updated weights for policy 0, policy_version 78450 (0.0007) [2023-10-10 07:56:22,394][53252] Updated weights for policy 0, policy_version 78460 (0.0007) [2023-10-10 07:56:25,154][53268] Updated weights for policy 1, policy_version 78410 (0.0008) [2023-10-10 07:56:25,517][53268] Updated weights for policy 1, policy_version 78420 (0.0008) [2023-10-10 07:56:25,889][53268] Updated weights for policy 1, policy_version 78430 (0.0008) [2023-10-10 07:56:26,121][53252] Updated weights for policy 0, policy_version 78470 (0.0007) [2023-10-10 07:56:26,493][53252] Updated weights for policy 0, policy_version 78480 (0.0009) [2023-10-10 07:56:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160661504. Throughput: 0: 1677.1, 1: 1659.9. Samples: 40177148. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:26,784][52050] Avg episode reward: [(0, '21.360'), (1, '21.370')] [2023-10-10 07:56:26,874][53252] Updated weights for policy 0, policy_version 78490 (0.0010) [2023-10-10 07:56:29,893][53268] Updated weights for policy 1, policy_version 78440 (0.0008) [2023-10-10 07:56:30,256][53268] Updated weights for policy 1, policy_version 78450 (0.0008) [2023-10-10 07:56:30,627][53268] Updated weights for policy 1, policy_version 78460 (0.0009) [2023-10-10 07:56:30,844][53252] Updated weights for policy 0, policy_version 78500 (0.0009) [2023-10-10 07:56:31,214][53252] Updated weights for policy 0, policy_version 78510 (0.0008) [2023-10-10 07:56:31,586][53252] Updated weights for policy 0, policy_version 78520 (0.0007) [2023-10-10 07:56:31,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160727040. Throughput: 0: 1687.9, 1: 1687.6. Samples: 40187942. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:31,785][52050] Avg episode reward: [(0, '22.650'), (1, '21.780')] [2023-10-10 07:56:34,650][53268] Updated weights for policy 1, policy_version 78470 (0.0009) [2023-10-10 07:56:35,012][53268] Updated weights for policy 1, policy_version 78480 (0.0011) [2023-10-10 07:56:35,382][53268] Updated weights for policy 1, policy_version 78490 (0.0009) [2023-10-10 07:56:35,616][53252] Updated weights for policy 0, policy_version 78530 (0.0009) [2023-10-10 07:56:35,977][53252] Updated weights for policy 0, policy_version 78540 (0.0007) [2023-10-10 07:56:36,360][53252] Updated weights for policy 0, policy_version 78550 (0.0007) [2023-10-10 07:56:36,721][53252] Updated weights for policy 0, policy_version 78560 (0.0008) [2023-10-10 07:56:36,783][52050] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 160825344. Throughput: 0: 1696.8, 1: 1671.3. Samples: 40208116. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:36,784][52050] Avg episode reward: [(0, '22.060'), (1, '20.270')] [2023-10-10 07:56:39,368][53268] Updated weights for policy 1, policy_version 78500 (0.0008) [2023-10-10 07:56:39,741][53268] Updated weights for policy 1, policy_version 78510 (0.0008) [2023-10-10 07:56:40,102][53268] Updated weights for policy 1, policy_version 78520 (0.0009) [2023-10-10 07:56:40,793][53252] Updated weights for policy 0, policy_version 78570 (0.0010) [2023-10-10 07:56:41,159][53252] Updated weights for policy 0, policy_version 78580 (0.0008) [2023-10-10 07:56:41,517][53252] Updated weights for policy 0, policy_version 78590 (0.0008) [2023-10-10 07:56:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 160890880. Throughput: 0: 1674.9, 1: 1674.5. Samples: 40227632. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:41,784][52050] Avg episode reward: [(0, '22.430'), (1, '21.880')] [2023-10-10 07:56:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000078592_80478208.pth... [2023-10-10 07:56:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000078528_80412672.pth... [2023-10-10 07:56:41,839][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000076960_78807040.pth [2023-10-10 07:56:41,840][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000076992_78839808.pth [2023-10-10 07:56:41,845][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000078528_80412672.pth [2023-10-10 07:56:41,846][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000078592_80478208.pth [2023-10-10 07:56:44,226][53268] Updated weights for policy 1, policy_version 78530 (0.0008) [2023-10-10 07:56:44,592][53268] Updated weights for policy 1, policy_version 78540 (0.0009) [2023-10-10 07:56:44,959][53268] Updated weights for policy 1, policy_version 78550 (0.0008) [2023-10-10 07:56:45,325][53268] Updated weights for policy 1, policy_version 78560 (0.0009) [2023-10-10 07:56:45,764][53252] Updated weights for policy 0, policy_version 78600 (0.0008) [2023-10-10 07:56:46,147][53252] Updated weights for policy 0, policy_version 78610 (0.0007) [2023-10-10 07:56:46,510][53252] Updated weights for policy 0, policy_version 78620 (0.0007) [2023-10-10 07:56:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 160956416. Throughput: 0: 1698.1, 1: 1694.9. Samples: 40238996. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:46,784][52050] Avg episode reward: [(0, '24.390'), (1, '19.610')] [2023-10-10 07:56:49,386][53268] Updated weights for policy 1, policy_version 78570 (0.0007) [2023-10-10 07:56:49,766][53268] Updated weights for policy 1, policy_version 78580 (0.0009) [2023-10-10 07:56:50,138][53268] Updated weights for policy 1, policy_version 78590 (0.0010) [2023-10-10 07:56:50,652][53252] Updated weights for policy 0, policy_version 78630 (0.0009) [2023-10-10 07:56:51,021][53252] Updated weights for policy 0, policy_version 78640 (0.0010) [2023-10-10 07:56:51,400][53252] Updated weights for policy 0, policy_version 78650 (0.0010) [2023-10-10 07:56:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 161021952. Throughput: 0: 1700.5, 1: 1668.6. Samples: 40258678. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:51,784][52050] Avg episode reward: [(0, '21.530'), (1, '19.350')] [2023-10-10 07:56:54,187][53268] Updated weights for policy 1, policy_version 78600 (0.0010) [2023-10-10 07:56:54,558][53268] Updated weights for policy 1, policy_version 78610 (0.0010) [2023-10-10 07:56:54,930][53268] Updated weights for policy 1, policy_version 78620 (0.0010) [2023-10-10 07:56:55,261][53252] Updated weights for policy 0, policy_version 78660 (0.0009) [2023-10-10 07:56:55,634][53252] Updated weights for policy 0, policy_version 78670 (0.0008) [2023-10-10 07:56:56,003][53252] Updated weights for policy 0, policy_version 78680 (0.0010) [2023-10-10 07:56:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 161087488. Throughput: 0: 1674.8, 1: 1684.1. Samples: 40278200. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:56:56,784][52050] Avg episode reward: [(0, '22.890'), (1, '19.550')] [2023-10-10 07:56:58,864][53268] Updated weights for policy 1, policy_version 78630 (0.0008) [2023-10-10 07:56:59,231][53268] Updated weights for policy 1, policy_version 78640 (0.0007) [2023-10-10 07:56:59,598][53268] Updated weights for policy 1, policy_version 78650 (0.0007) [2023-10-10 07:56:59,870][53252] Updated weights for policy 0, policy_version 78690 (0.0011) [2023-10-10 07:57:00,236][53252] Updated weights for policy 0, policy_version 78700 (0.0008) [2023-10-10 07:57:00,615][53252] Updated weights for policy 0, policy_version 78710 (0.0010) [2023-10-10 07:57:00,979][53252] Updated weights for policy 0, policy_version 78720 (0.0007) [2023-10-10 07:57:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 161153024. Throughput: 0: 1705.9, 1: 1675.5. Samples: 40289466. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-10 07:57:01,784][52050] Avg episode reward: [(0, '22.100'), (1, '20.700')] [2023-10-10 07:57:03,724][53268] Updated weights for policy 1, policy_version 78660 (0.0009) [2023-10-10 07:57:04,094][53268] Updated weights for policy 1, policy_version 78670 (0.0007) [2023-10-10 07:57:04,453][53268] Updated weights for policy 1, policy_version 78680 (0.0009) [2023-10-10 07:57:05,190][53252] Updated weights for policy 0, policy_version 78730 (0.0010) [2023-10-10 07:57:05,548][53252] Updated weights for policy 0, policy_version 78740 (0.0010) [2023-10-10 07:57:05,917][53252] Updated weights for policy 0, policy_version 78750 (0.0010) [2023-10-10 07:57:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161218560. Throughput: 0: 1689.6, 1: 1668.0. Samples: 40308784. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:06,784][52050] Avg episode reward: [(0, '22.140'), (1, '21.930')] [2023-10-10 07:57:08,595][53268] Updated weights for policy 1, policy_version 78690 (0.0009) [2023-10-10 07:57:09,013][53268] Updated weights for policy 1, policy_version 78700 (0.0007) [2023-10-10 07:57:09,383][53268] Updated weights for policy 1, policy_version 78710 (0.0008) [2023-10-10 07:57:09,743][53268] Updated weights for policy 1, policy_version 78720 (0.0010) [2023-10-10 07:57:09,972][53252] Updated weights for policy 0, policy_version 78760 (0.0011) [2023-10-10 07:57:10,340][53252] Updated weights for policy 0, policy_version 78770 (0.0011) [2023-10-10 07:57:10,707][53252] Updated weights for policy 0, policy_version 78780 (0.0011) [2023-10-10 07:57:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161284096. Throughput: 0: 1677.1, 1: 1691.2. Samples: 40328724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:11,785][52050] Avg episode reward: [(0, '21.910'), (1, '21.150')] [2023-10-10 07:57:13,887][53268] Updated weights for policy 1, policy_version 78730 (0.0008) [2023-10-10 07:57:14,243][53268] Updated weights for policy 1, policy_version 78740 (0.0007) [2023-10-10 07:57:14,610][53268] Updated weights for policy 1, policy_version 78750 (0.0007) [2023-10-10 07:57:14,954][53252] Updated weights for policy 0, policy_version 78790 (0.0009) [2023-10-10 07:57:15,311][53252] Updated weights for policy 0, policy_version 78800 (0.0010) [2023-10-10 07:57:15,694][53252] Updated weights for policy 0, policy_version 78810 (0.0010) [2023-10-10 07:57:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 161349632. Throughput: 0: 1693.4, 1: 1679.9. Samples: 40339738. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:16,784][52050] Avg episode reward: [(0, '23.090'), (1, '23.250')] [2023-10-10 07:57:18,657][53268] Updated weights for policy 1, policy_version 78760 (0.0008) [2023-10-10 07:57:19,023][53268] Updated weights for policy 1, policy_version 78770 (0.0009) [2023-10-10 07:57:19,389][53268] Updated weights for policy 1, policy_version 78780 (0.0008) [2023-10-10 07:57:19,879][53252] Updated weights for policy 0, policy_version 78820 (0.0008) [2023-10-10 07:57:20,244][53252] Updated weights for policy 0, policy_version 78830 (0.0011) [2023-10-10 07:57:20,622][53252] Updated weights for policy 0, policy_version 78840 (0.0009) [2023-10-10 07:57:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 161415168. Throughput: 0: 1671.9, 1: 1685.0. Samples: 40359176. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:21,784][52050] Avg episode reward: [(0, '23.070'), (1, '23.690')] [2023-10-10 07:57:23,302][53268] Updated weights for policy 1, policy_version 78790 (0.0010) [2023-10-10 07:57:23,676][53268] Updated weights for policy 1, policy_version 78800 (0.0009) [2023-10-10 07:57:24,049][53268] Updated weights for policy 1, policy_version 78810 (0.0009) [2023-10-10 07:57:24,740][53252] Updated weights for policy 0, policy_version 78850 (0.0008) [2023-10-10 07:57:25,105][53252] Updated weights for policy 0, policy_version 78860 (0.0007) [2023-10-10 07:57:25,472][53252] Updated weights for policy 0, policy_version 78870 (0.0007) [2023-10-10 07:57:25,845][53252] Updated weights for policy 0, policy_version 78880 (0.0007) [2023-10-10 07:57:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 161480704. Throughput: 0: 1679.7, 1: 1697.5. Samples: 40379606. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:26,784][52050] Avg episode reward: [(0, '22.080'), (1, '21.760')] [2023-10-10 07:57:27,991][53268] Updated weights for policy 1, policy_version 78820 (0.0008) [2023-10-10 07:57:28,360][53268] Updated weights for policy 1, policy_version 78830 (0.0008) [2023-10-10 07:57:28,725][53268] Updated weights for policy 1, policy_version 78840 (0.0007) [2023-10-10 07:57:29,995][53252] Updated weights for policy 0, policy_version 78890 (0.0008) [2023-10-10 07:57:30,368][53252] Updated weights for policy 0, policy_version 78900 (0.0009) [2023-10-10 07:57:30,735][53252] Updated weights for policy 0, policy_version 78910 (0.0010) [2023-10-10 07:57:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 161546240. Throughput: 0: 1686.0, 1: 1668.3. Samples: 40389938. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:31,784][52050] Avg episode reward: [(0, '23.340'), (1, '21.640')] [2023-10-10 07:57:32,866][53268] Updated weights for policy 1, policy_version 78850 (0.0007) [2023-10-10 07:57:33,231][53268] Updated weights for policy 1, policy_version 78860 (0.0008) [2023-10-10 07:57:33,591][53268] Updated weights for policy 1, policy_version 78870 (0.0009) [2023-10-10 07:57:33,962][53268] Updated weights for policy 1, policy_version 78880 (0.0007) [2023-10-10 07:57:34,676][53252] Updated weights for policy 0, policy_version 78920 (0.0009) [2023-10-10 07:57:35,042][53252] Updated weights for policy 0, policy_version 78930 (0.0009) [2023-10-10 07:57:35,414][53252] Updated weights for policy 0, policy_version 78940 (0.0008) [2023-10-10 07:57:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13551.5). Total num frames: 161611776. Throughput: 0: 1661.3, 1: 1695.2. Samples: 40409718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:36,784][52050] Avg episode reward: [(0, '23.630'), (1, '21.240')] [2023-10-10 07:57:37,957][53268] Updated weights for policy 1, policy_version 78890 (0.0008) [2023-10-10 07:57:38,320][53268] Updated weights for policy 1, policy_version 78900 (0.0007) [2023-10-10 07:57:38,692][53268] Updated weights for policy 1, policy_version 78910 (0.0007) [2023-10-10 07:57:39,240][53252] Updated weights for policy 0, policy_version 78950 (0.0007) [2023-10-10 07:57:39,614][53252] Updated weights for policy 0, policy_version 78960 (0.0008) [2023-10-10 07:57:39,989][53252] Updated weights for policy 0, policy_version 78970 (0.0011) [2023-10-10 07:57:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 161677312. Throughput: 0: 1684.0, 1: 1699.2. Samples: 40430446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:41,784][52050] Avg episode reward: [(0, '23.050'), (1, '18.570')] [2023-10-10 07:57:42,935][53268] Updated weights for policy 1, policy_version 78920 (0.0007) [2023-10-10 07:57:43,300][53268] Updated weights for policy 1, policy_version 78930 (0.0009) [2023-10-10 07:57:43,665][53268] Updated weights for policy 1, policy_version 78940 (0.0010) [2023-10-10 07:57:43,812][53252] Updated weights for policy 0, policy_version 78980 (0.0009) [2023-10-10 07:57:44,186][53252] Updated weights for policy 0, policy_version 78990 (0.0009) [2023-10-10 07:57:44,567][53252] Updated weights for policy 0, policy_version 79000 (0.0011) [2023-10-10 07:57:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161742848. Throughput: 0: 1669.0, 1: 1681.4. Samples: 40440232. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:46,784][52050] Avg episode reward: [(0, '22.640'), (1, '20.140')] [2023-10-10 07:57:47,567][53268] Updated weights for policy 1, policy_version 78950 (0.0007) [2023-10-10 07:57:47,936][53268] Updated weights for policy 1, policy_version 78960 (0.0008) [2023-10-10 07:57:48,296][53268] Updated weights for policy 1, policy_version 78970 (0.0007) [2023-10-10 07:57:48,678][53252] Updated weights for policy 0, policy_version 79010 (0.0007) [2023-10-10 07:57:49,047][53252] Updated weights for policy 0, policy_version 79020 (0.0009) [2023-10-10 07:57:49,418][53252] Updated weights for policy 0, policy_version 79030 (0.0007) [2023-10-10 07:57:49,795][53252] Updated weights for policy 0, policy_version 79040 (0.0010) [2023-10-10 07:57:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161808384. Throughput: 0: 1668.1, 1: 1703.9. Samples: 40460526. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:51,785][52050] Avg episode reward: [(0, '21.770'), (1, '21.160')] [2023-10-10 07:57:52,284][53268] Updated weights for policy 1, policy_version 78980 (0.0008) [2023-10-10 07:57:52,647][53268] Updated weights for policy 1, policy_version 78990 (0.0010) [2023-10-10 07:57:53,025][53268] Updated weights for policy 1, policy_version 79000 (0.0008) [2023-10-10 07:57:53,897][53252] Updated weights for policy 0, policy_version 79050 (0.0008) [2023-10-10 07:57:54,262][53252] Updated weights for policy 0, policy_version 79060 (0.0007) [2023-10-10 07:57:54,632][53252] Updated weights for policy 0, policy_version 79070 (0.0007) [2023-10-10 07:57:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 161873920. Throughput: 0: 1686.3, 1: 1706.5. Samples: 40481398. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) [2023-10-10 07:57:56,784][52050] Avg episode reward: [(0, '20.200'), (1, '20.630')] [2023-10-10 07:57:57,070][53268] Updated weights for policy 1, policy_version 79010 (0.0008) [2023-10-10 07:57:57,463][53268] Updated weights for policy 1, policy_version 79020 (0.0008) [2023-10-10 07:57:57,827][53268] Updated weights for policy 1, policy_version 79030 (0.0007) [2023-10-10 07:57:58,205][53268] Updated weights for policy 1, policy_version 79040 (0.0012) [2023-10-10 07:57:58,711][53252] Updated weights for policy 0, policy_version 79080 (0.0008) [2023-10-10 07:57:59,094][53252] Updated weights for policy 0, policy_version 79090 (0.0008) [2023-10-10 07:57:59,461][53252] Updated weights for policy 0, policy_version 79100 (0.0007) [2023-10-10 07:58:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161939456. Throughput: 0: 1666.6, 1: 1691.6. Samples: 40490854. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:01,784][52050] Avg episode reward: [(0, '19.910'), (1, '21.090')] [2023-10-10 07:58:02,216][53268] Updated weights for policy 1, policy_version 79050 (0.0007) [2023-10-10 07:58:02,586][53268] Updated weights for policy 1, policy_version 79060 (0.0010) [2023-10-10 07:58:02,953][53268] Updated weights for policy 1, policy_version 79070 (0.0008) [2023-10-10 07:58:03,350][53252] Updated weights for policy 0, policy_version 79110 (0.0009) [2023-10-10 07:58:03,720][53252] Updated weights for policy 0, policy_version 79120 (0.0010) [2023-10-10 07:58:04,097][53252] Updated weights for policy 0, policy_version 79130 (0.0010) [2023-10-10 07:58:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162004992. Throughput: 0: 1680.1, 1: 1700.8. Samples: 40511316. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:06,784][52050] Avg episode reward: [(0, '21.030'), (1, '22.600')] [2023-10-10 07:58:07,066][53268] Updated weights for policy 1, policy_version 79080 (0.0010) [2023-10-10 07:58:07,443][53268] Updated weights for policy 1, policy_version 79090 (0.0009) [2023-10-10 07:58:07,798][53268] Updated weights for policy 1, policy_version 79100 (0.0009) [2023-10-10 07:58:08,250][53252] Updated weights for policy 0, policy_version 79140 (0.0008) [2023-10-10 07:58:08,621][53252] Updated weights for policy 0, policy_version 79150 (0.0009) [2023-10-10 07:58:08,990][53252] Updated weights for policy 0, policy_version 79160 (0.0009) [2023-10-10 07:58:11,550][53268] Updated weights for policy 1, policy_version 79110 (0.0011) [2023-10-10 07:58:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 162070528. Throughput: 0: 1694.2, 1: 1697.0. Samples: 40532210. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:11,784][52050] Avg episode reward: [(0, '20.990'), (1, '22.760')] [2023-10-10 07:58:11,913][53268] Updated weights for policy 1, policy_version 79120 (0.0010) [2023-10-10 07:58:12,294][53268] Updated weights for policy 1, policy_version 79130 (0.0011) [2023-10-10 07:58:12,946][53252] Updated weights for policy 0, policy_version 79170 (0.0009) [2023-10-10 07:58:13,317][53252] Updated weights for policy 0, policy_version 79180 (0.0008) [2023-10-10 07:58:13,690][53252] Updated weights for policy 0, policy_version 79190 (0.0007) [2023-10-10 07:58:14,054][53252] Updated weights for policy 0, policy_version 79200 (0.0007) [2023-10-10 07:58:16,402][53268] Updated weights for policy 1, policy_version 79140 (0.0010) [2023-10-10 07:58:16,769][53268] Updated weights for policy 1, policy_version 79150 (0.0011) [2023-10-10 07:58:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162136064. Throughput: 0: 1669.7, 1: 1698.9. 
Samples: 40541528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:16,784][52050] Avg episode reward: [(0, '23.130'), (1, '22.100')] [2023-10-10 07:58:17,146][53268] Updated weights for policy 1, policy_version 79160 (0.0008) [2023-10-10 07:58:18,189][53252] Updated weights for policy 0, policy_version 79210 (0.0009) [2023-10-10 07:58:18,558][53252] Updated weights for policy 0, policy_version 79220 (0.0009) [2023-10-10 07:58:18,925][53252] Updated weights for policy 0, policy_version 79230 (0.0009) [2023-10-10 07:58:21,360][53268] Updated weights for policy 1, policy_version 79170 (0.0007) [2023-10-10 07:58:21,732][53268] Updated weights for policy 1, policy_version 79180 (0.0009) [2023-10-10 07:58:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162201600. Throughput: 0: 1693.7, 1: 1699.3. Samples: 40562406. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:21,784][52050] Avg episode reward: [(0, '23.080'), (1, '21.010')] [2023-10-10 07:58:22,095][53268] Updated weights for policy 1, policy_version 79190 (0.0010) [2023-10-10 07:58:22,466][53268] Updated weights for policy 1, policy_version 79200 (0.0008) [2023-10-10 07:58:22,982][53252] Updated weights for policy 0, policy_version 79240 (0.0008) [2023-10-10 07:58:23,356][53252] Updated weights for policy 0, policy_version 79250 (0.0010) [2023-10-10 07:58:23,728][53252] Updated weights for policy 0, policy_version 79260 (0.0008) [2023-10-10 07:58:26,545][53268] Updated weights for policy 1, policy_version 79210 (0.0009) [2023-10-10 07:58:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162267136. Throughput: 0: 1697.3, 1: 1698.0. Samples: 40583234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:26,784][52050] Avg episode reward: [(0, '22.360'), (1, '21.240')] [2023-10-10 07:58:26,902][53268] Updated weights for policy 1, policy_version 79220 (0.0009) [2023-10-10 07:58:27,265][53268] Updated weights for policy 1, policy_version 79230 (0.0010) [2023-10-10 07:58:27,575][53252] Updated weights for policy 0, policy_version 79270 (0.0008) [2023-10-10 07:58:27,941][53252] Updated weights for policy 0, policy_version 79280 (0.0010) [2023-10-10 07:58:28,322][53252] Updated weights for policy 0, policy_version 79290 (0.0009) [2023-10-10 07:58:31,339][53268] Updated weights for policy 1, policy_version 79240 (0.0008) [2023-10-10 07:58:31,715][53268] Updated weights for policy 1, policy_version 79250 (0.0009) [2023-10-10 07:58:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 162332672. Throughput: 0: 1685.2, 1: 1700.7. Samples: 40592600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:31,784][52050] Avg episode reward: [(0, '21.790'), (1, '22.570')] [2023-10-10 07:58:32,082][53268] Updated weights for policy 1, policy_version 79260 (0.0009) [2023-10-10 07:58:32,464][53252] Updated weights for policy 0, policy_version 79300 (0.0007) [2023-10-10 07:58:32,846][53252] Updated weights for policy 0, policy_version 79310 (0.0008) [2023-10-10 07:58:33,221][53252] Updated weights for policy 0, policy_version 79320 (0.0010) [2023-10-10 07:58:35,994][53268] Updated weights for policy 1, policy_version 79270 (0.0008) [2023-10-10 07:58:36,359][53268] Updated weights for policy 1, policy_version 79280 (0.0008) [2023-10-10 07:58:36,728][53268] Updated weights for policy 1, policy_version 79290 (0.0008) [2023-10-10 07:58:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162398208. Throughput: 0: 1702.6, 1: 1698.8. Samples: 40613590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:36,784][52050] Avg episode reward: [(0, '20.400'), (1, '21.600')] [2023-10-10 07:58:37,097][53252] Updated weights for policy 0, policy_version 79330 (0.0009) [2023-10-10 07:58:37,466][53252] Updated weights for policy 0, policy_version 79340 (0.0008) [2023-10-10 07:58:37,839][53252] Updated weights for policy 0, policy_version 79350 (0.0008) [2023-10-10 07:58:38,210][53252] Updated weights for policy 0, policy_version 79360 (0.0008) [2023-10-10 07:58:40,867][53268] Updated weights for policy 1, policy_version 79300 (0.0009) [2023-10-10 07:58:41,236][53268] Updated weights for policy 1, policy_version 79310 (0.0008) [2023-10-10 07:58:41,599][53268] Updated weights for policy 1, policy_version 79320 (0.0008) [2023-10-10 07:58:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162463744. Throughput: 0: 1702.1, 1: 1688.0. Samples: 40633956. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:41,784][52050] Avg episode reward: [(0, '21.550'), (1, '21.610')] [2023-10-10 07:58:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000079360_81264640.pth... [2023-10-10 07:58:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000077792_79659008.pth [2023-10-10 07:58:41,888][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000079328_81231872.pth... 
[2023-10-10 07:58:41,919][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000077728_79593472.pth [2023-10-10 07:58:42,340][53252] Updated weights for policy 0, policy_version 79370 (0.0010) [2023-10-10 07:58:42,713][53252] Updated weights for policy 0, policy_version 79380 (0.0010) [2023-10-10 07:58:43,089][53252] Updated weights for policy 0, policy_version 79390 (0.0011) [2023-10-10 07:58:45,539][53268] Updated weights for policy 1, policy_version 79330 (0.0008) [2023-10-10 07:58:45,942][53268] Updated weights for policy 1, policy_version 79340 (0.0010) [2023-10-10 07:58:46,310][53268] Updated weights for policy 1, policy_version 79350 (0.0008) [2023-10-10 07:58:46,677][53268] Updated weights for policy 1, policy_version 79360 (0.0007) [2023-10-10 07:58:46,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162562048. Throughput: 0: 1695.7, 1: 1701.4. Samples: 40643722. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:46,784][52050] Avg episode reward: [(0, '20.870'), (1, '21.440')] [2023-10-10 07:58:47,084][53252] Updated weights for policy 0, policy_version 79400 (0.0011) [2023-10-10 07:58:47,447][53252] Updated weights for policy 0, policy_version 79410 (0.0011) [2023-10-10 07:58:47,825][53252] Updated weights for policy 0, policy_version 79420 (0.0010) [2023-10-10 07:58:50,933][53268] Updated weights for policy 1, policy_version 79370 (0.0012) [2023-10-10 07:58:51,305][53268] Updated weights for policy 1, policy_version 79380 (0.0009) [2023-10-10 07:58:51,657][53268] Updated weights for policy 1, policy_version 79390 (0.0010) [2023-10-10 07:58:51,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 162627584. Throughput: 0: 1701.5, 1: 1693.3. Samples: 40664084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 07:58:51,784][52050] Avg episode reward: [(0, '19.410'), (1, '22.730')] [2023-10-10 07:58:51,902][53252] Updated weights for policy 0, policy_version 79430 (0.0009) [2023-10-10 07:58:52,272][53252] Updated weights for policy 0, policy_version 79440 (0.0009) [2023-10-10 07:58:52,653][53252] Updated weights for policy 0, policy_version 79450 (0.0008) [2023-10-10 07:58:55,650][53268] Updated weights for policy 1, policy_version 79400 (0.0009) [2023-10-10 07:58:56,003][53268] Updated weights for policy 1, policy_version 79410 (0.0010) [2023-10-10 07:58:56,374][53268] Updated weights for policy 1, policy_version 79420 (0.0012) [2023-10-10 07:58:56,758][53252] Updated weights for policy 0, policy_version 79460 (0.0007) [2023-10-10 07:58:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162693120. Throughput: 0: 1703.6, 1: 1671.2. Samples: 40684076. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:58:56,784][52050] Avg episode reward: [(0, '20.100'), (1, '22.460')] [2023-10-10 07:58:57,123][53252] Updated weights for policy 0, policy_version 79470 (0.0007) [2023-10-10 07:58:57,493][53252] Updated weights for policy 0, policy_version 79480 (0.0009) [2023-10-10 07:59:00,563][53268] Updated weights for policy 1, policy_version 79430 (0.0008) [2023-10-10 07:59:00,935][53268] Updated weights for policy 1, policy_version 79440 (0.0007) [2023-10-10 07:59:01,295][53268] Updated weights for policy 1, policy_version 79450 (0.0010) [2023-10-10 07:59:01,518][53252] Updated weights for policy 0, policy_version 79490 (0.0008) [2023-10-10 07:59:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 162758656. Throughput: 0: 1698.9, 1: 1688.1. Samples: 40693942. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:01,784][52050] Avg episode reward: [(0, '20.110'), (1, '22.480')] [2023-10-10 07:59:01,899][53252] Updated weights for policy 0, policy_version 79500 (0.0008) [2023-10-10 07:59:02,266][53252] Updated weights for policy 0, policy_version 79510 (0.0007) [2023-10-10 07:59:02,636][53252] Updated weights for policy 0, policy_version 79520 (0.0009) [2023-10-10 07:59:05,299][53268] Updated weights for policy 1, policy_version 79460 (0.0008) [2023-10-10 07:59:05,679][53268] Updated weights for policy 1, policy_version 79470 (0.0008) [2023-10-10 07:59:06,044][53268] Updated weights for policy 1, policy_version 79480 (0.0009) [2023-10-10 07:59:06,613][53252] Updated weights for policy 0, policy_version 79530 (0.0011) [2023-10-10 07:59:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162824192. Throughput: 0: 1701.3, 1: 1687.3. Samples: 40714896. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:06,784][52050] Avg episode reward: [(0, '21.520'), (1, '23.210')] [2023-10-10 07:59:06,988][53252] Updated weights for policy 0, policy_version 79540 (0.0009) [2023-10-10 07:59:07,358][53252] Updated weights for policy 0, policy_version 79550 (0.0010) [2023-10-10 07:59:09,951][53268] Updated weights for policy 1, policy_version 79490 (0.0010) [2023-10-10 07:59:10,319][53268] Updated weights for policy 1, policy_version 79500 (0.0011) [2023-10-10 07:59:10,686][53268] Updated weights for policy 1, policy_version 79510 (0.0009) [2023-10-10 07:59:11,054][53268] Updated weights for policy 1, policy_version 79520 (0.0010) [2023-10-10 07:59:11,451][53252] Updated weights for policy 0, policy_version 79560 (0.0009) [2023-10-10 07:59:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162889728. Throughput: 0: 1695.2, 1: 1662.2. Samples: 40734320. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:11,784][52050] Avg episode reward: [(0, '23.440'), (1, '22.490')] [2023-10-10 07:59:11,820][53252] Updated weights for policy 0, policy_version 79570 (0.0009) [2023-10-10 07:59:12,196][53252] Updated weights for policy 0, policy_version 79580 (0.0007) [2023-10-10 07:59:15,136][53268] Updated weights for policy 1, policy_version 79530 (0.0009) [2023-10-10 07:59:15,505][53268] Updated weights for policy 1, policy_version 79540 (0.0011) [2023-10-10 07:59:15,876][53268] Updated weights for policy 1, policy_version 79550 (0.0010) [2023-10-10 07:59:16,154][53252] Updated weights for policy 0, policy_version 79590 (0.0009) [2023-10-10 07:59:16,513][53252] Updated weights for policy 0, policy_version 79600 (0.0010) [2023-10-10 07:59:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162955264. Throughput: 0: 1697.1, 1: 1690.2. Samples: 40745028. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:16,784][52050] Avg episode reward: [(0, '22.300'), (1, '21.600')] [2023-10-10 07:59:16,881][53252] Updated weights for policy 0, policy_version 79610 (0.0009) [2023-10-10 07:59:20,083][53268] Updated weights for policy 1, policy_version 79560 (0.0009) [2023-10-10 07:59:20,449][53268] Updated weights for policy 1, policy_version 79570 (0.0008) [2023-10-10 07:59:20,817][53268] Updated weights for policy 1, policy_version 79580 (0.0009) [2023-10-10 07:59:21,044][53252] Updated weights for policy 0, policy_version 79620 (0.0008) [2023-10-10 07:59:21,413][53252] Updated weights for policy 0, policy_version 79630 (0.0009) [2023-10-10 07:59:21,782][53252] Updated weights for policy 0, policy_version 79640 (0.0008) [2023-10-10 07:59:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 163020800. Throughput: 0: 1694.6, 1: 1672.4. Samples: 40765102. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:21,784][52050] Avg episode reward: [(0, '22.890'), (1, '22.840')] [2023-10-10 07:59:24,796][53268] Updated weights for policy 1, policy_version 79590 (0.0007) [2023-10-10 07:59:25,161][53268] Updated weights for policy 1, policy_version 79600 (0.0010) [2023-10-10 07:59:25,533][53268] Updated weights for policy 1, policy_version 79610 (0.0009) [2023-10-10 07:59:25,883][53252] Updated weights for policy 0, policy_version 79650 (0.0008) [2023-10-10 07:59:26,262][53252] Updated weights for policy 0, policy_version 79660 (0.0008) [2023-10-10 07:59:26,636][53252] Updated weights for policy 0, policy_version 79670 (0.0007) [2023-10-10 07:59:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163086336. Throughput: 0: 1678.7, 1: 1663.7. Samples: 40784366. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:26,784][52050] Avg episode reward: [(0, '22.730'), (1, '22.960')] [2023-10-10 07:59:27,000][53252] Updated weights for policy 0, policy_version 79680 (0.0007) [2023-10-10 07:59:29,679][53268] Updated weights for policy 1, policy_version 79620 (0.0009) [2023-10-10 07:59:30,056][53268] Updated weights for policy 1, policy_version 79630 (0.0007) [2023-10-10 07:59:30,413][53268] Updated weights for policy 1, policy_version 79640 (0.0008) [2023-10-10 07:59:31,126][53252] Updated weights for policy 0, policy_version 79690 (0.0008) [2023-10-10 07:59:31,490][53252] Updated weights for policy 0, policy_version 79700 (0.0008) [2023-10-10 07:59:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 163151872. Throughput: 0: 1690.1, 1: 1682.9. Samples: 40795508. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:31,784][52050] Avg episode reward: [(0, '22.950'), (1, '23.140')] [2023-10-10 07:59:31,855][53252] Updated weights for policy 0, policy_version 79710 (0.0008) [2023-10-10 07:59:34,478][53268] Updated weights for policy 1, policy_version 79650 (0.0008) [2023-10-10 07:59:34,873][53268] Updated weights for policy 1, policy_version 79660 (0.0009) [2023-10-10 07:59:35,235][53268] Updated weights for policy 1, policy_version 79670 (0.0010) [2023-10-10 07:59:35,606][53268] Updated weights for policy 1, policy_version 79680 (0.0009) [2023-10-10 07:59:35,933][53252] Updated weights for policy 0, policy_version 79720 (0.0009) [2023-10-10 07:59:36,306][53252] Updated weights for policy 0, policy_version 79730 (0.0007) [2023-10-10 07:59:36,672][53252] Updated weights for policy 0, policy_version 79740 (0.0007) [2023-10-10 07:59:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163217408. Throughput: 0: 1689.1, 1: 1672.1. Samples: 40815340. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:36,784][52050] Avg episode reward: [(0, '23.290'), (1, '24.750')] [2023-10-10 07:59:39,581][53268] Updated weights for policy 1, policy_version 79690 (0.0008) [2023-10-10 07:59:39,949][53268] Updated weights for policy 1, policy_version 79700 (0.0010) [2023-10-10 07:59:40,318][53268] Updated weights for policy 1, policy_version 79710 (0.0009) [2023-10-10 07:59:40,678][53252] Updated weights for policy 0, policy_version 79750 (0.0007) [2023-10-10 07:59:41,048][53252] Updated weights for policy 0, policy_version 79760 (0.0008) [2023-10-10 07:59:41,422][53252] Updated weights for policy 0, policy_version 79770 (0.0009) [2023-10-10 07:59:41,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 163315712. Throughput: 0: 1662.4, 1: 1681.9. Samples: 40834572. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-10 07:59:41,784][52050] Avg episode reward: [(0, '21.690'), (1, '23.620')] [2023-10-10 07:59:44,414][53268] Updated weights for policy 1, policy_version 79720 (0.0008) [2023-10-10 07:59:44,777][53268] Updated weights for policy 1, policy_version 79730 (0.0010) [2023-10-10 07:59:45,150][53268] Updated weights for policy 1, policy_version 79740 (0.0008) [2023-10-10 07:59:45,421][53252] Updated weights for policy 0, policy_version 79780 (0.0009) [2023-10-10 07:59:45,790][53252] Updated weights for policy 0, policy_version 79790 (0.0007) [2023-10-10 07:59:46,159][53252] Updated weights for policy 0, policy_version 79800 (0.0008) [2023-10-10 07:59:46,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163381248. Throughput: 0: 1683.5, 1: 1690.9. Samples: 40845790. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 07:59:46,784][52050] Avg episode reward: [(0, '22.130'), (1, '22.660')] [2023-10-10 07:59:49,279][53268] Updated weights for policy 1, policy_version 79750 (0.0007) [2023-10-10 07:59:49,637][53268] Updated weights for policy 1, policy_version 79760 (0.0007) [2023-10-10 07:59:50,010][53268] Updated weights for policy 1, policy_version 79770 (0.0007) [2023-10-10 07:59:50,359][53252] Updated weights for policy 0, policy_version 79810 (0.0010) [2023-10-10 07:59:50,722][53252] Updated weights for policy 0, policy_version 79820 (0.0010) [2023-10-10 07:59:51,097][53252] Updated weights for policy 0, policy_version 79830 (0.0008) [2023-10-10 07:59:51,466][53252] Updated weights for policy 0, policy_version 79840 (0.0008) [2023-10-10 07:59:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163446784. Throughput: 0: 1673.4, 1: 1662.7. Samples: 40865018. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 07:59:51,784][52050] Avg episode reward: [(0, '22.750'), (1, '21.010')] [2023-10-10 07:59:54,152][53268] Updated weights for policy 1, policy_version 79780 (0.0009) [2023-10-10 07:59:54,517][53268] Updated weights for policy 1, policy_version 79790 (0.0009) [2023-10-10 07:59:54,886][53268] Updated weights for policy 1, policy_version 79800 (0.0010) [2023-10-10 07:59:55,535][53252] Updated weights for policy 0, policy_version 79850 (0.0007) [2023-10-10 07:59:55,906][53252] Updated weights for policy 0, policy_version 79860 (0.0009) [2023-10-10 07:59:56,288][53252] Updated weights for policy 0, policy_version 79870 (0.0008) [2023-10-10 07:59:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163512320. Throughput: 0: 1650.0, 1: 1683.7. Samples: 40884336. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 07:59:56,784][52050] Avg episode reward: [(0, '21.660'), (1, '20.450')] [2023-10-10 07:59:58,722][53268] Updated weights for policy 1, policy_version 79810 (0.0011) [2023-10-10 07:59:59,092][53268] Updated weights for policy 1, policy_version 79820 (0.0010) [2023-10-10 07:59:59,464][53268] Updated weights for policy 1, policy_version 79830 (0.0008) [2023-10-10 07:59:59,829][53268] Updated weights for policy 1, policy_version 79840 (0.0007) [2023-10-10 08:00:00,291][53252] Updated weights for policy 0, policy_version 79880 (0.0008) [2023-10-10 08:00:00,662][53252] Updated weights for policy 0, policy_version 79890 (0.0009) [2023-10-10 08:00:01,031][53252] Updated weights for policy 0, policy_version 79900 (0.0010) [2023-10-10 08:00:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163577856. Throughput: 0: 1674.5, 1: 1671.8. Samples: 40895610. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:01,784][52050] Avg episode reward: [(0, '22.100'), (1, '20.830')] [2023-10-10 08:00:03,933][53268] Updated weights for policy 1, policy_version 79850 (0.0010) [2023-10-10 08:00:04,303][53268] Updated weights for policy 1, policy_version 79860 (0.0010) [2023-10-10 08:00:04,667][53268] Updated weights for policy 1, policy_version 79870 (0.0010) [2023-10-10 08:00:05,124][53252] Updated weights for policy 0, policy_version 79910 (0.0008) [2023-10-10 08:00:05,504][53252] Updated weights for policy 0, policy_version 79920 (0.0009) [2023-10-10 08:00:05,872][53252] Updated weights for policy 0, policy_version 79930 (0.0011) [2023-10-10 08:00:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 163643392. Throughput: 0: 1667.6, 1: 1666.8. Samples: 40915152. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:06,784][52050] Avg episode reward: [(0, '22.800'), (1, '20.740')] [2023-10-10 08:00:08,615][53268] Updated weights for policy 1, policy_version 79880 (0.0008) [2023-10-10 08:00:08,982][53268] Updated weights for policy 1, policy_version 79890 (0.0007) [2023-10-10 08:00:09,346][53268] Updated weights for policy 1, policy_version 79900 (0.0009) [2023-10-10 08:00:09,766][53252] Updated weights for policy 0, policy_version 79940 (0.0008) [2023-10-10 08:00:10,138][53252] Updated weights for policy 0, policy_version 79950 (0.0007) [2023-10-10 08:00:10,498][53252] Updated weights for policy 0, policy_version 79960 (0.0010) [2023-10-10 08:00:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163708928. Throughput: 0: 1669.9, 1: 1689.9. Samples: 40935554. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:11,784][52050] Avg episode reward: [(0, '22.420'), (1, '20.970')] [2023-10-10 08:00:13,502][53268] Updated weights for policy 1, policy_version 79910 (0.0009) [2023-10-10 08:00:13,864][53268] Updated weights for policy 1, policy_version 79920 (0.0009) [2023-10-10 08:00:14,232][53268] Updated weights for policy 1, policy_version 79930 (0.0009) [2023-10-10 08:00:14,645][53252] Updated weights for policy 0, policy_version 79970 (0.0009) [2023-10-10 08:00:15,018][53252] Updated weights for policy 0, policy_version 79980 (0.0007) [2023-10-10 08:00:15,381][53252] Updated weights for policy 0, policy_version 79990 (0.0007) [2023-10-10 08:00:15,753][53252] Updated weights for policy 0, policy_version 80000 (0.0007) [2023-10-10 08:00:16,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163774464. Throughput: 0: 1685.9, 1: 1661.3. Samples: 40946132. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:16,785][52050] Avg episode reward: [(0, '22.080'), (1, '23.330')] [2023-10-10 08:00:18,262][53268] Updated weights for policy 1, policy_version 79940 (0.0008) [2023-10-10 08:00:18,629][53268] Updated weights for policy 1, policy_version 79950 (0.0010) [2023-10-10 08:00:18,994][53268] Updated weights for policy 1, policy_version 79960 (0.0011) [2023-10-10 08:00:19,751][53252] Updated weights for policy 0, policy_version 80010 (0.0009) [2023-10-10 08:00:20,115][53252] Updated weights for policy 0, policy_version 80020 (0.0007) [2023-10-10 08:00:20,481][53252] Updated weights for policy 0, policy_version 80030 (0.0011) [2023-10-10 08:00:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163840000. Throughput: 0: 1666.4, 1: 1673.6. Samples: 40965638. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:21,784][52050] Avg episode reward: [(0, '21.550'), (1, '22.340')] [2023-10-10 08:00:23,186][53268] Updated weights for policy 1, policy_version 79970 (0.0008) [2023-10-10 08:00:23,594][53268] Updated weights for policy 1, policy_version 79980 (0.0010) [2023-10-10 08:00:23,964][53268] Updated weights for policy 1, policy_version 79990 (0.0009) [2023-10-10 08:00:24,335][53268] Updated weights for policy 1, policy_version 80000 (0.0008) [2023-10-10 08:00:24,541][53252] Updated weights for policy 0, policy_version 80040 (0.0008) [2023-10-10 08:00:24,919][53252] Updated weights for policy 0, policy_version 80050 (0.0008) [2023-10-10 08:00:25,275][53252] Updated weights for policy 0, policy_version 80060 (0.0008) [2023-10-10 08:00:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163905536. Throughput: 0: 1685.9, 1: 1677.9. Samples: 40985944. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:26,784][52050] Avg episode reward: [(0, '21.090'), (1, '22.640')] [2023-10-10 08:00:28,322][53268] Updated weights for policy 1, policy_version 80010 (0.0008) [2023-10-10 08:00:28,682][53268] Updated weights for policy 1, policy_version 80020 (0.0008) [2023-10-10 08:00:29,050][53268] Updated weights for policy 1, policy_version 80030 (0.0009) [2023-10-10 08:00:29,233][53252] Updated weights for policy 0, policy_version 80070 (0.0008) [2023-10-10 08:00:29,589][53252] Updated weights for policy 0, policy_version 80080 (0.0008) [2023-10-10 08:00:29,961][53252] Updated weights for policy 0, policy_version 80090 (0.0007) [2023-10-10 08:00:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 163971072. Throughput: 0: 1684.4, 1: 1652.0. Samples: 40995928. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:31,784][52050] Avg episode reward: [(0, '21.570'), (1, '22.450')] [2023-10-10 08:00:33,115][53268] Updated weights for policy 1, policy_version 80040 (0.0007) [2023-10-10 08:00:33,480][53268] Updated weights for policy 1, policy_version 80050 (0.0007) [2023-10-10 08:00:33,842][53268] Updated weights for policy 1, policy_version 80060 (0.0009) [2023-10-10 08:00:34,140][53252] Updated weights for policy 0, policy_version 80100 (0.0008) [2023-10-10 08:00:34,509][53252] Updated weights for policy 0, policy_version 80110 (0.0007) [2023-10-10 08:00:34,886][53252] Updated weights for policy 0, policy_version 80120 (0.0009) [2023-10-10 08:00:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 164036608. Throughput: 0: 1667.2, 1: 1683.2. Samples: 41015790. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-10 08:00:36,784][52050] Avg episode reward: [(0, '21.680'), (1, '21.280')] [2023-10-10 08:00:37,954][53268] Updated weights for policy 1, policy_version 80070 (0.0009) [2023-10-10 08:00:38,311][53268] Updated weights for policy 1, policy_version 80080 (0.0010) [2023-10-10 08:00:38,683][53268] Updated weights for policy 1, policy_version 80090 (0.0009) [2023-10-10 08:00:39,129][53252] Updated weights for policy 0, policy_version 80130 (0.0009) [2023-10-10 08:00:39,491][53252] Updated weights for policy 0, policy_version 80140 (0.0008) [2023-10-10 08:00:39,861][53252] Updated weights for policy 0, policy_version 80150 (0.0009) [2023-10-10 08:00:40,234][53252] Updated weights for policy 0, policy_version 80160 (0.0008) [2023-10-10 08:00:41,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 164102144. Throughput: 0: 1693.6, 1: 1688.9. Samples: 41036550. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:00:41,785][52050] Avg episode reward: [(0, '21.180'), (1, '21.460')] [2023-10-10 08:00:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000080160_82083840.pth... [2023-10-10 08:00:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000080096_82018304.pth... [2023-10-10 08:00:41,836][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000078528_80412672.pth [2023-10-10 08:00:41,843][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000078592_80478208.pth [2023-10-10 08:00:42,705][53268] Updated weights for policy 1, policy_version 80100 (0.0008) [2023-10-10 08:00:43,068][53268] Updated weights for policy 1, policy_version 80110 (0.0008) [2023-10-10 08:00:43,433][53268] Updated weights for policy 1, policy_version 80120 (0.0010) [2023-10-10 08:00:44,366][53252] Updated weights for policy 0, policy_version 80170 (0.0009) [2023-10-10 08:00:44,744][53252] Updated weights for policy 0, policy_version 80180 (0.0011) [2023-10-10 08:00:45,119][53252] Updated weights for policy 0, policy_version 80190 (0.0010) [2023-10-10 08:00:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 164167680. Throughput: 0: 1684.2, 1: 1670.7. Samples: 41046582. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:00:46,784][52050] Avg episode reward: [(0, '21.120'), (1, '22.010')] [2023-10-10 08:00:47,588][53268] Updated weights for policy 1, policy_version 80130 (0.0008) [2023-10-10 08:00:47,950][53268] Updated weights for policy 1, policy_version 80140 (0.0010) [2023-10-10 08:00:48,319][53268] Updated weights for policy 1, policy_version 80150 (0.0008) [2023-10-10 08:00:48,677][53268] Updated weights for policy 1, policy_version 80160 (0.0009) [2023-10-10 08:00:48,907][53252] Updated weights for policy 0, policy_version 80200 (0.0010) [2023-10-10 08:00:49,278][53252] Updated weights for policy 0, policy_version 80210 (0.0007) [2023-10-10 08:00:49,652][53252] Updated weights for policy 0, policy_version 80220 (0.0008) [2023-10-10 08:00:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 164233216. Throughput: 0: 1675.1, 1: 1687.1. Samples: 41066450. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:00:51,784][52050] Avg episode reward: [(0, '22.060'), (1, '20.850')] [2023-10-10 08:00:52,984][53268] Updated weights for policy 1, policy_version 80170 (0.0007) [2023-10-10 08:00:53,350][53268] Updated weights for policy 1, policy_version 80180 (0.0007) [2023-10-10 08:00:53,617][53252] Updated weights for policy 0, policy_version 80230 (0.0007) [2023-10-10 08:00:53,713][53268] Updated weights for policy 1, policy_version 80190 (0.0007) [2023-10-10 08:00:53,988][53252] Updated weights for policy 0, policy_version 80240 (0.0008) [2023-10-10 08:00:54,366][53252] Updated weights for policy 0, policy_version 80250 (0.0008) [2023-10-10 08:00:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 164298752. Throughput: 0: 1690.4, 1: 1678.7. Samples: 41087164. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:00:56,784][52050] Avg episode reward: [(0, '22.950'), (1, '22.200')] [2023-10-10 08:00:57,691][53268] Updated weights for policy 1, policy_version 80200 (0.0008) [2023-10-10 08:00:58,056][53268] Updated weights for policy 1, policy_version 80210 (0.0008) [2023-10-10 08:00:58,411][53268] Updated weights for policy 1, policy_version 80220 (0.0009) [2023-10-10 08:00:58,493][53252] Updated weights for policy 0, policy_version 80260 (0.0007) [2023-10-10 08:00:58,871][53252] Updated weights for policy 0, policy_version 80270 (0.0010) [2023-10-10 08:00:59,250][53252] Updated weights for policy 0, policy_version 80280 (0.0007) [2023-10-10 08:01:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164364288. Throughput: 0: 1668.4, 1: 1674.9. Samples: 41096580. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:01,784][52050] Avg episode reward: [(0, '23.380'), (1, '20.190')] [2023-10-10 08:01:02,536][53268] Updated weights for policy 1, policy_version 80230 (0.0008) [2023-10-10 08:01:02,900][53268] Updated weights for policy 1, policy_version 80240 (0.0009) [2023-10-10 08:01:03,266][53252] Updated weights for policy 0, policy_version 80290 (0.0008) [2023-10-10 08:01:03,276][53268] Updated weights for policy 1, policy_version 80250 (0.0008) [2023-10-10 08:01:03,636][53252] Updated weights for policy 0, policy_version 80300 (0.0009) [2023-10-10 08:01:04,006][53252] Updated weights for policy 0, policy_version 80310 (0.0009) [2023-10-10 08:01:04,384][53252] Updated weights for policy 0, policy_version 80320 (0.0007) [2023-10-10 08:01:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164429824. Throughput: 0: 1682.2, 1: 1679.4. Samples: 41116908. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:06,784][52050] Avg episode reward: [(0, '22.740'), (1, '20.900')] [2023-10-10 08:01:07,247][53268] Updated weights for policy 1, policy_version 80260 (0.0009) [2023-10-10 08:01:07,617][53268] Updated weights for policy 1, policy_version 80270 (0.0007) [2023-10-10 08:01:07,989][53268] Updated weights for policy 1, policy_version 80280 (0.0007) [2023-10-10 08:01:08,380][53252] Updated weights for policy 0, policy_version 80330 (0.0007) [2023-10-10 08:01:08,735][53252] Updated weights for policy 0, policy_version 80340 (0.0009) [2023-10-10 08:01:09,116][53252] Updated weights for policy 0, policy_version 80350 (0.0007) [2023-10-10 08:01:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164495360. Throughput: 0: 1686.9, 1: 1687.5. Samples: 41137792. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:11,784][52050] Avg episode reward: [(0, '21.780'), (1, '21.640')] [2023-10-10 08:01:12,185][53268] Updated weights for policy 1, policy_version 80290 (0.0007) [2023-10-10 08:01:12,608][53268] Updated weights for policy 1, policy_version 80300 (0.0011) [2023-10-10 08:01:12,980][53268] Updated weights for policy 1, policy_version 80310 (0.0009) [2023-10-10 08:01:13,221][53252] Updated weights for policy 0, policy_version 80360 (0.0007) [2023-10-10 08:01:13,334][53268] Updated weights for policy 1, policy_version 80320 (0.0009) [2023-10-10 08:01:13,595][53252] Updated weights for policy 0, policy_version 80370 (0.0008) [2023-10-10 08:01:13,973][53252] Updated weights for policy 0, policy_version 80380 (0.0010) [2023-10-10 08:01:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 164560896. Throughput: 0: 1668.4, 1: 1680.9. Samples: 41146650. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:16,784][52050] Avg episode reward: [(0, '21.240'), (1, '21.280')] [2023-10-10 08:01:17,363][53268] Updated weights for policy 1, policy_version 80330 (0.0011) [2023-10-10 08:01:17,724][53268] Updated weights for policy 1, policy_version 80340 (0.0009) [2023-10-10 08:01:18,085][53252] Updated weights for policy 0, policy_version 80390 (0.0010) [2023-10-10 08:01:18,086][53268] Updated weights for policy 1, policy_version 80350 (0.0011) [2023-10-10 08:01:18,449][53252] Updated weights for policy 0, policy_version 80400 (0.0009) [2023-10-10 08:01:18,819][53252] Updated weights for policy 0, policy_version 80410 (0.0010) [2023-10-10 08:01:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164626432. Throughput: 0: 1689.0, 1: 1679.7. Samples: 41167384. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:21,784][52050] Avg episode reward: [(0, '20.950'), (1, '21.890')] [2023-10-10 08:01:22,168][53268] Updated weights for policy 1, policy_version 80360 (0.0011) [2023-10-10 08:01:22,537][53268] Updated weights for policy 1, policy_version 80370 (0.0007) [2023-10-10 08:01:22,874][53252] Updated weights for policy 0, policy_version 80420 (0.0009) [2023-10-10 08:01:22,898][53268] Updated weights for policy 1, policy_version 80380 (0.0007) [2023-10-10 08:01:23,248][53252] Updated weights for policy 0, policy_version 80430 (0.0008) [2023-10-10 08:01:23,615][53252] Updated weights for policy 0, policy_version 80440 (0.0010) [2023-10-10 08:01:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 164691968. Throughput: 0: 1696.1, 1: 1677.9. Samples: 41188380. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:26,784][52050] Avg episode reward: [(0, '20.020'), (1, '21.950')] [2023-10-10 08:01:26,891][53268] Updated weights for policy 1, policy_version 80390 (0.0008) [2023-10-10 08:01:27,253][53268] Updated weights for policy 1, policy_version 80400 (0.0009) [2023-10-10 08:01:27,499][53252] Updated weights for policy 0, policy_version 80450 (0.0008) [2023-10-10 08:01:27,621][53268] Updated weights for policy 1, policy_version 80410 (0.0008) [2023-10-10 08:01:27,863][53252] Updated weights for policy 0, policy_version 80460 (0.0007) [2023-10-10 08:01:28,233][53252] Updated weights for policy 0, policy_version 80470 (0.0009) [2023-10-10 08:01:28,605][53252] Updated weights for policy 0, policy_version 80480 (0.0009) [2023-10-10 08:01:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 164757504. Throughput: 0: 1677.3, 1: 1676.0. Samples: 41197478. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) [2023-10-10 08:01:31,784][52050] Avg episode reward: [(0, '23.170'), (1, '20.800')] [2023-10-10 08:01:31,862][53268] Updated weights for policy 1, policy_version 80420 (0.0010) [2023-10-10 08:01:32,234][53268] Updated weights for policy 1, policy_version 80430 (0.0008) [2023-10-10 08:01:32,593][53268] Updated weights for policy 1, policy_version 80440 (0.0009) [2023-10-10 08:01:32,944][53252] Updated weights for policy 0, policy_version 80490 (0.0008) [2023-10-10 08:01:33,318][53252] Updated weights for policy 0, policy_version 80500 (0.0009) [2023-10-10 08:01:33,698][53252] Updated weights for policy 0, policy_version 80510 (0.0010) [2023-10-10 08:01:36,583][53268] Updated weights for policy 1, policy_version 80450 (0.0008) [2023-10-10 08:01:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 164823040. Throughput: 0: 1688.9, 1: 1674.3. Samples: 41217792. 
Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:01:36,784][52050] Avg episode reward: [(0, '22.980'), (1, '21.120')] [2023-10-10 08:01:36,946][53268] Updated weights for policy 1, policy_version 80460 (0.0007) [2023-10-10 08:01:37,314][53268] Updated weights for policy 1, policy_version 80470 (0.0010) [2023-10-10 08:01:37,680][53268] Updated weights for policy 1, policy_version 80480 (0.0009) [2023-10-10 08:01:37,881][53252] Updated weights for policy 0, policy_version 80520 (0.0009) [2023-10-10 08:01:38,259][53252] Updated weights for policy 0, policy_version 80530 (0.0009) [2023-10-10 08:01:38,629][53252] Updated weights for policy 0, policy_version 80540 (0.0008) [2023-10-10 08:01:41,691][53268] Updated weights for policy 1, policy_version 80490 (0.0008) [2023-10-10 08:01:41,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 164888576. Throughput: 0: 1688.3, 1: 1686.1. Samples: 41239014. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:01:41,785][52050] Avg episode reward: [(0, '22.850'), (1, '20.200')] [2023-10-10 08:01:42,062][53268] Updated weights for policy 1, policy_version 80500 (0.0009) [2023-10-10 08:01:42,428][53268] Updated weights for policy 1, policy_version 80510 (0.0007) [2023-10-10 08:01:42,456][53252] Updated weights for policy 0, policy_version 80550 (0.0008) [2023-10-10 08:01:42,828][53252] Updated weights for policy 0, policy_version 80560 (0.0008) [2023-10-10 08:01:43,199][53252] Updated weights for policy 0, policy_version 80570 (0.0008) [2023-10-10 08:01:46,420][53268] Updated weights for policy 1, policy_version 80520 (0.0009) [2023-10-10 08:01:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 164954112. Throughput: 0: 1684.5, 1: 1686.3. Samples: 41248266. 
Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:01:46,785][52050] Avg episode reward: [(0, '24.260'), (1, '22.590')] [2023-10-10 08:01:46,796][53268] Updated weights for policy 1, policy_version 80530 (0.0009) [2023-10-10 08:01:47,170][53268] Updated weights for policy 1, policy_version 80540 (0.0009) [2023-10-10 08:01:47,185][53252] Updated weights for policy 0, policy_version 80580 (0.0008) [2023-10-10 08:01:47,565][53252] Updated weights for policy 0, policy_version 80590 (0.0009) [2023-10-10 08:01:47,934][53252] Updated weights for policy 0, policy_version 80600 (0.0010) [2023-10-10 08:01:51,201][53268] Updated weights for policy 1, policy_version 80550 (0.0009) [2023-10-10 08:01:51,571][53268] Updated weights for policy 1, policy_version 80560 (0.0010) [2023-10-10 08:01:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165019648. Throughput: 0: 1689.8, 1: 1687.7. Samples: 41268894. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:01:51,784][52050] Avg episode reward: [(0, '24.960'), (1, '23.670')] [2023-10-10 08:01:51,914][53252] Updated weights for policy 0, policy_version 80610 (0.0009) [2023-10-10 08:01:51,925][53268] Updated weights for policy 1, policy_version 80570 (0.0009) [2023-10-10 08:01:52,289][53252] Updated weights for policy 0, policy_version 80620 (0.0008) [2023-10-10 08:01:52,655][53252] Updated weights for policy 0, policy_version 80630 (0.0008) [2023-10-10 08:01:53,021][53252] Updated weights for policy 0, policy_version 80640 (0.0009) [2023-10-10 08:01:55,984][53268] Updated weights for policy 1, policy_version 80580 (0.0010) [2023-10-10 08:01:56,364][53268] Updated weights for policy 1, policy_version 80590 (0.0009) [2023-10-10 08:01:56,736][53268] Updated weights for policy 1, policy_version 80600 (0.0008) [2023-10-10 08:01:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 165085184. Throughput: 0: 1688.2, 1: 1674.1. 
Samples: 41289098. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:01:56,784][52050] Avg episode reward: [(0, '23.570'), (1, '22.470')] [2023-10-10 08:01:57,045][53252] Updated weights for policy 0, policy_version 80650 (0.0008) [2023-10-10 08:01:57,429][53252] Updated weights for policy 0, policy_version 80660 (0.0010) [2023-10-10 08:01:57,803][53252] Updated weights for policy 0, policy_version 80670 (0.0007) [2023-10-10 08:02:00,730][53268] Updated weights for policy 1, policy_version 80610 (0.0008) [2023-10-10 08:02:01,140][53268] Updated weights for policy 1, policy_version 80620 (0.0008) [2023-10-10 08:02:01,507][53268] Updated weights for policy 1, policy_version 80630 (0.0007) [2023-10-10 08:02:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 165150720. Throughput: 0: 1684.7, 1: 1689.3. Samples: 41298482. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:02:01,784][52050] Avg episode reward: [(0, '23.860'), (1, '23.110')] [2023-10-10 08:02:01,868][53252] Updated weights for policy 0, policy_version 80680 (0.0008) [2023-10-10 08:02:01,871][53268] Updated weights for policy 1, policy_version 80640 (0.0007) [2023-10-10 08:02:02,242][53252] Updated weights for policy 0, policy_version 80690 (0.0009) [2023-10-10 08:02:02,618][53252] Updated weights for policy 0, policy_version 80700 (0.0010) [2023-10-10 08:02:05,900][53268] Updated weights for policy 1, policy_version 80650 (0.0010) [2023-10-10 08:02:06,269][53268] Updated weights for policy 1, policy_version 80660 (0.0010) [2023-10-10 08:02:06,639][53268] Updated weights for policy 1, policy_version 80670 (0.0009) [2023-10-10 08:02:06,755][53252] Updated weights for policy 0, policy_version 80710 (0.0008) [2023-10-10 08:02:06,783][52050] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165249024. Throughput: 0: 1688.0, 1: 1686.7. Samples: 41319246. 
Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:02:06,784][52050] Avg episode reward: [(0, '22.630'), (1, '21.890')] [2023-10-10 08:02:07,126][53252] Updated weights for policy 0, policy_version 80720 (0.0008) [2023-10-10 08:02:07,499][53252] Updated weights for policy 0, policy_version 80730 (0.0008) [2023-10-10 08:02:10,755][53268] Updated weights for policy 1, policy_version 80680 (0.0011) [2023-10-10 08:02:11,120][53268] Updated weights for policy 1, policy_version 80690 (0.0009) [2023-10-10 08:02:11,497][53268] Updated weights for policy 1, policy_version 80700 (0.0010) [2023-10-10 08:02:11,657][53252] Updated weights for policy 0, policy_version 80740 (0.0009) [2023-10-10 08:02:11,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 165314560. Throughput: 0: 1682.3, 1: 1669.3. Samples: 41339200. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:02:11,784][52050] Avg episode reward: [(0, '21.680'), (1, '20.820')] [2023-10-10 08:02:12,018][53252] Updated weights for policy 0, policy_version 80750 (0.0007) [2023-10-10 08:02:12,392][53252] Updated weights for policy 0, policy_version 80760 (0.0007) [2023-10-10 08:02:15,512][53268] Updated weights for policy 1, policy_version 80710 (0.0008) [2023-10-10 08:02:15,884][53268] Updated weights for policy 1, policy_version 80720 (0.0009) [2023-10-10 08:02:16,253][53268] Updated weights for policy 1, policy_version 80730 (0.0008) [2023-10-10 08:02:16,507][53252] Updated weights for policy 0, policy_version 80770 (0.0010) [2023-10-10 08:02:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165380096. Throughput: 0: 1681.8, 1: 1690.7. Samples: 41349240. 
Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:02:16,784][52050] Avg episode reward: [(0, '21.780'), (1, '21.750')] [2023-10-10 08:02:16,882][53252] Updated weights for policy 0, policy_version 80780 (0.0008) [2023-10-10 08:02:17,263][53252] Updated weights for policy 0, policy_version 80790 (0.0009) [2023-10-10 08:02:17,629][53252] Updated weights for policy 0, policy_version 80800 (0.0009) [2023-10-10 08:02:20,344][53268] Updated weights for policy 1, policy_version 80740 (0.0009) [2023-10-10 08:02:20,707][53268] Updated weights for policy 1, policy_version 80750 (0.0010) [2023-10-10 08:02:21,074][53268] Updated weights for policy 1, policy_version 80760 (0.0007) [2023-10-10 08:02:21,767][53252] Updated weights for policy 0, policy_version 80810 (0.0009) [2023-10-10 08:02:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165445632. Throughput: 0: 1684.9, 1: 1695.4. Samples: 41369908. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:02:21,784][52050] Avg episode reward: [(0, '21.700'), (1, '23.090')] [2023-10-10 08:02:22,146][53252] Updated weights for policy 0, policy_version 80820 (0.0010) [2023-10-10 08:02:22,527][53252] Updated weights for policy 0, policy_version 80830 (0.0010) [2023-10-10 08:02:25,283][53268] Updated weights for policy 1, policy_version 80770 (0.0008) [2023-10-10 08:02:25,650][53268] Updated weights for policy 1, policy_version 80780 (0.0007) [2023-10-10 08:02:26,014][53268] Updated weights for policy 1, policy_version 80790 (0.0009) [2023-10-10 08:02:26,357][53252] Updated weights for policy 0, policy_version 80840 (0.0010) [2023-10-10 08:02:26,377][53268] Updated weights for policy 1, policy_version 80800 (0.0008) [2023-10-10 08:02:26,726][53252] Updated weights for policy 0, policy_version 80850 (0.0009) [2023-10-10 08:02:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165511168. Throughput: 0: 1675.2, 1: 1662.2. 
Samples: 41389198. Policy #0 lag: (min: 10.0, avg: 11.7, max: 28.0) [2023-10-10 08:02:26,784][52050] Avg episode reward: [(0, '21.870'), (1, '21.180')] [2023-10-10 08:02:27,101][53252] Updated weights for policy 0, policy_version 80860 (0.0008) [2023-10-10 08:02:30,207][53268] Updated weights for policy 1, policy_version 80810 (0.0009) [2023-10-10 08:02:30,576][53268] Updated weights for policy 1, policy_version 80820 (0.0008) [2023-10-10 08:02:30,933][53268] Updated weights for policy 1, policy_version 80830 (0.0009) [2023-10-10 08:02:31,204][53252] Updated weights for policy 0, policy_version 80870 (0.0007) [2023-10-10 08:02:31,569][53252] Updated weights for policy 0, policy_version 80880 (0.0009) [2023-10-10 08:02:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165576704. Throughput: 0: 1681.7, 1: 1687.7. Samples: 41399888. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:02:31,784][52050] Avg episode reward: [(0, '22.960'), (1, '21.820')] [2023-10-10 08:02:31,935][53252] Updated weights for policy 0, policy_version 80890 (0.0009) [2023-10-10 08:02:34,936][53268] Updated weights for policy 1, policy_version 80840 (0.0010) [2023-10-10 08:02:35,308][53268] Updated weights for policy 1, policy_version 80850 (0.0011) [2023-10-10 08:02:35,669][53268] Updated weights for policy 1, policy_version 80860 (0.0009) [2023-10-10 08:02:35,970][53252] Updated weights for policy 0, policy_version 80900 (0.0009) [2023-10-10 08:02:36,340][53252] Updated weights for policy 0, policy_version 80910 (0.0008) [2023-10-10 08:02:36,720][53252] Updated weights for policy 0, policy_version 80920 (0.0007) [2023-10-10 08:02:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165642240. Throughput: 0: 1685.4, 1: 1676.4. Samples: 41420176. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:02:36,784][52050] Avg episode reward: [(0, '24.010'), (1, '20.730')] [2023-10-10 08:02:39,773][53268] Updated weights for policy 1, policy_version 80870 (0.0008) [2023-10-10 08:02:40,139][53268] Updated weights for policy 1, policy_version 80880 (0.0011) [2023-10-10 08:02:40,505][53268] Updated weights for policy 1, policy_version 80890 (0.0009) [2023-10-10 08:02:40,835][53252] Updated weights for policy 0, policy_version 80930 (0.0008) [2023-10-10 08:02:41,203][53252] Updated weights for policy 0, policy_version 80940 (0.0011) [2023-10-10 08:02:41,574][53252] Updated weights for policy 0, policy_version 80950 (0.0010) [2023-10-10 08:02:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 165707776. Throughput: 0: 1673.6, 1: 1672.8. Samples: 41439684. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:02:41,784][52050] Avg episode reward: [(0, '23.180'), (1, '23.400')] [2023-10-10 08:02:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000080896_82837504.pth... [2023-10-10 08:02:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000079328_81231872.pth [2023-10-10 08:02:41,946][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000080960_82903040.pth... 
[2023-10-10 08:02:41,950][53252] Updated weights for policy 0, policy_version 80960 (0.0009) [2023-10-10 08:02:41,983][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000079360_81264640.pth [2023-10-10 08:02:44,661][53268] Updated weights for policy 1, policy_version 80900 (0.0008) [2023-10-10 08:02:45,027][53268] Updated weights for policy 1, policy_version 80910 (0.0011) [2023-10-10 08:02:45,395][53268] Updated weights for policy 1, policy_version 80920 (0.0008) [2023-10-10 08:02:45,905][53252] Updated weights for policy 0, policy_version 80970 (0.0007) [2023-10-10 08:02:46,276][53252] Updated weights for policy 0, policy_version 80980 (0.0007) [2023-10-10 08:02:46,638][53252] Updated weights for policy 0, policy_version 80990 (0.0008) [2023-10-10 08:02:46,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 165806080. Throughput: 0: 1689.4, 1: 1693.8. Samples: 41450728. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:02:46,784][52050] Avg episode reward: [(0, '23.770'), (1, '22.240')] [2023-10-10 08:02:49,429][53268] Updated weights for policy 1, policy_version 80930 (0.0010) [2023-10-10 08:02:49,842][53268] Updated weights for policy 1, policy_version 80940 (0.0010) [2023-10-10 08:02:50,215][53268] Updated weights for policy 1, policy_version 80950 (0.0009) [2023-10-10 08:02:50,575][53268] Updated weights for policy 1, policy_version 80960 (0.0007) [2023-10-10 08:02:50,924][53252] Updated weights for policy 0, policy_version 81000 (0.0009) [2023-10-10 08:02:51,296][53252] Updated weights for policy 0, policy_version 81010 (0.0008) [2023-10-10 08:02:51,668][53252] Updated weights for policy 0, policy_version 81020 (0.0007) [2023-10-10 08:02:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 165838848. Throughput: 0: 1686.1, 1: 1673.6. Samples: 41470432. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:02:51,784][52050] Avg episode reward: [(0, '22.330'), (1, '21.590')] [2023-10-10 08:02:54,659][53268] Updated weights for policy 1, policy_version 80970 (0.0007) [2023-10-10 08:02:55,022][53268] Updated weights for policy 1, policy_version 80980 (0.0008) [2023-10-10 08:02:55,395][53268] Updated weights for policy 1, policy_version 80990 (0.0007) [2023-10-10 08:02:55,731][53252] Updated weights for policy 0, policy_version 81030 (0.0007) [2023-10-10 08:02:56,104][53252] Updated weights for policy 0, policy_version 81040 (0.0007) [2023-10-10 08:02:56,469][53252] Updated weights for policy 0, policy_version 81050 (0.0007) [2023-10-10 08:02:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 165937152. Throughput: 0: 1666.2, 1: 1679.1. Samples: 41489738. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:02:56,784][52050] Avg episode reward: [(0, '21.670'), (1, '23.040')] [2023-10-10 08:02:59,577][53268] Updated weights for policy 1, policy_version 81000 (0.0009) [2023-10-10 08:02:59,939][53268] Updated weights for policy 1, policy_version 81010 (0.0007) [2023-10-10 08:03:00,304][53268] Updated weights for policy 1, policy_version 81020 (0.0010) [2023-10-10 08:03:00,472][53252] Updated weights for policy 0, policy_version 81060 (0.0007) [2023-10-10 08:03:00,843][53252] Updated weights for policy 0, policy_version 81070 (0.0007) [2023-10-10 08:03:01,218][53252] Updated weights for policy 0, policy_version 81080 (0.0009) [2023-10-10 08:03:01,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 166002688. Throughput: 0: 1685.6, 1: 1686.7. Samples: 41500998. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:03:01,784][52050] Avg episode reward: [(0, '20.450'), (1, '22.920')] [2023-10-10 08:03:04,342][53268] Updated weights for policy 1, policy_version 81030 (0.0008) [2023-10-10 08:03:04,704][53268] Updated weights for policy 1, policy_version 81040 (0.0009) [2023-10-10 08:03:05,065][53268] Updated weights for policy 1, policy_version 81050 (0.0009) [2023-10-10 08:03:05,200][53252] Updated weights for policy 0, policy_version 81090 (0.0008) [2023-10-10 08:03:05,574][53252] Updated weights for policy 0, policy_version 81100 (0.0008) [2023-10-10 08:03:05,943][53252] Updated weights for policy 0, policy_version 81110 (0.0008) [2023-10-10 08:03:06,311][53252] Updated weights for policy 0, policy_version 81120 (0.0009) [2023-10-10 08:03:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166068224. Throughput: 0: 1686.9, 1: 1661.2. Samples: 41520576. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:03:06,784][52050] Avg episode reward: [(0, '19.760'), (1, '22.610')] [2023-10-10 08:03:09,070][53268] Updated weights for policy 1, policy_version 81060 (0.0010) [2023-10-10 08:03:09,444][53268] Updated weights for policy 1, policy_version 81070 (0.0010) [2023-10-10 08:03:09,806][53268] Updated weights for policy 1, policy_version 81080 (0.0010) [2023-10-10 08:03:10,530][53252] Updated weights for policy 0, policy_version 81130 (0.0009) [2023-10-10 08:03:10,895][53252] Updated weights for policy 0, policy_version 81140 (0.0008) [2023-10-10 08:03:11,269][53252] Updated weights for policy 0, policy_version 81150 (0.0009) [2023-10-10 08:03:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166133760. Throughput: 0: 1668.2, 1: 1684.4. Samples: 41540066. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:03:11,784][52050] Avg episode reward: [(0, '20.950'), (1, '22.680')] [2023-10-10 08:03:13,830][53268] Updated weights for policy 1, policy_version 81090 (0.0010) [2023-10-10 08:03:14,210][53268] Updated weights for policy 1, policy_version 81100 (0.0009) [2023-10-10 08:03:14,575][53268] Updated weights for policy 1, policy_version 81110 (0.0011) [2023-10-10 08:03:14,944][53268] Updated weights for policy 1, policy_version 81120 (0.0010) [2023-10-10 08:03:15,157][53252] Updated weights for policy 0, policy_version 81160 (0.0008) [2023-10-10 08:03:15,524][53252] Updated weights for policy 0, policy_version 81170 (0.0010) [2023-10-10 08:03:15,900][53252] Updated weights for policy 0, policy_version 81180 (0.0010) [2023-10-10 08:03:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166199296. Throughput: 0: 1688.9, 1: 1678.7. Samples: 41551428. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:03:16,784][52050] Avg episode reward: [(0, '20.790'), (1, '22.120')] [2023-10-10 08:03:19,094][53268] Updated weights for policy 1, policy_version 81130 (0.0010) [2023-10-10 08:03:19,464][53268] Updated weights for policy 1, policy_version 81140 (0.0010) [2023-10-10 08:03:19,836][53268] Updated weights for policy 1, policy_version 81150 (0.0007) [2023-10-10 08:03:19,960][53252] Updated weights for policy 0, policy_version 81190 (0.0009) [2023-10-10 08:03:20,334][53252] Updated weights for policy 0, policy_version 81200 (0.0009) [2023-10-10 08:03:20,690][53252] Updated weights for policy 0, policy_version 81210 (0.0011) [2023-10-10 08:03:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166264832. Throughput: 0: 1674.9, 1: 1670.0. Samples: 41570698. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:21,784][52050] Avg episode reward: [(0, '23.490'), (1, '22.270')] [2023-10-10 08:03:23,716][53268] Updated weights for policy 1, policy_version 81160 (0.0008) [2023-10-10 08:03:24,082][53268] Updated weights for policy 1, policy_version 81170 (0.0008) [2023-10-10 08:03:24,447][53268] Updated weights for policy 1, policy_version 81180 (0.0009) [2023-10-10 08:03:24,781][53252] Updated weights for policy 0, policy_version 81220 (0.0010) [2023-10-10 08:03:25,156][53252] Updated weights for policy 0, policy_version 81230 (0.0008) [2023-10-10 08:03:25,531][53252] Updated weights for policy 0, policy_version 81240 (0.0010) [2023-10-10 08:03:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166330368. Throughput: 0: 1673.2, 1: 1687.7. Samples: 41590924. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:26,784][52050] Avg episode reward: [(0, '22.090'), (1, '19.420')] [2023-10-10 08:03:28,482][53268] Updated weights for policy 1, policy_version 81190 (0.0009) [2023-10-10 08:03:28,843][53268] Updated weights for policy 1, policy_version 81200 (0.0008) [2023-10-10 08:03:29,208][53268] Updated weights for policy 1, policy_version 81210 (0.0010) [2023-10-10 08:03:29,701][53252] Updated weights for policy 0, policy_version 81250 (0.0008) [2023-10-10 08:03:30,063][53252] Updated weights for policy 0, policy_version 81260 (0.0010) [2023-10-10 08:03:30,436][53252] Updated weights for policy 0, policy_version 81270 (0.0009) [2023-10-10 08:03:30,806][53252] Updated weights for policy 0, policy_version 81280 (0.0007) [2023-10-10 08:03:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166395904. Throughput: 0: 1690.6, 1: 1664.7. Samples: 41601718. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:31,784][52050] Avg episode reward: [(0, '21.720'), (1, '20.260')] [2023-10-10 08:03:33,346][53268] Updated weights for policy 1, policy_version 81220 (0.0008) [2023-10-10 08:03:33,706][53268] Updated weights for policy 1, policy_version 81230 (0.0008) [2023-10-10 08:03:34,068][53268] Updated weights for policy 1, policy_version 81240 (0.0011) [2023-10-10 08:03:34,810][53252] Updated weights for policy 0, policy_version 81290 (0.0010) [2023-10-10 08:03:35,173][53252] Updated weights for policy 0, policy_version 81300 (0.0011) [2023-10-10 08:03:35,549][53252] Updated weights for policy 0, policy_version 81310 (0.0010) [2023-10-10 08:03:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166461440. Throughput: 0: 1672.1, 1: 1677.6. Samples: 41621170. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:36,784][52050] Avg episode reward: [(0, '21.310'), (1, '20.450')] [2023-10-10 08:03:38,200][53268] Updated weights for policy 1, policy_version 81250 (0.0011) [2023-10-10 08:03:38,608][53268] Updated weights for policy 1, policy_version 81260 (0.0011) [2023-10-10 08:03:38,975][53268] Updated weights for policy 1, policy_version 81270 (0.0010) [2023-10-10 08:03:39,337][53268] Updated weights for policy 1, policy_version 81280 (0.0009) [2023-10-10 08:03:39,440][53252] Updated weights for policy 0, policy_version 81320 (0.0008) [2023-10-10 08:03:39,810][53252] Updated weights for policy 0, policy_version 81330 (0.0007) [2023-10-10 08:03:40,184][53252] Updated weights for policy 0, policy_version 81340 (0.0007) [2023-10-10 08:03:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166526976. Throughput: 0: 1696.5, 1: 1684.8. Samples: 41641898. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:41,784][52050] Avg episode reward: [(0, '22.000'), (1, '21.500')] [2023-10-10 08:03:43,549][53268] Updated weights for policy 1, policy_version 81290 (0.0009) [2023-10-10 08:03:43,902][53268] Updated weights for policy 1, policy_version 81300 (0.0007) [2023-10-10 08:03:44,183][53252] Updated weights for policy 0, policy_version 81350 (0.0008) [2023-10-10 08:03:44,270][53268] Updated weights for policy 1, policy_version 81310 (0.0008) [2023-10-10 08:03:44,545][53252] Updated weights for policy 0, policy_version 81360 (0.0007) [2023-10-10 08:03:44,917][53252] Updated weights for policy 0, policy_version 81370 (0.0007) [2023-10-10 08:03:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166592512. Throughput: 0: 1694.8, 1: 1659.3. Samples: 41651932. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:46,784][52050] Avg episode reward: [(0, '21.060'), (1, '21.990')] [2023-10-10 08:03:48,410][53268] Updated weights for policy 1, policy_version 81320 (0.0010) [2023-10-10 08:03:48,776][53268] Updated weights for policy 1, policy_version 81330 (0.0011) [2023-10-10 08:03:48,926][53252] Updated weights for policy 0, policy_version 81380 (0.0009) [2023-10-10 08:03:49,138][53268] Updated weights for policy 1, policy_version 81340 (0.0009) [2023-10-10 08:03:49,292][53252] Updated weights for policy 0, policy_version 81390 (0.0007) [2023-10-10 08:03:49,661][53252] Updated weights for policy 0, policy_version 81400 (0.0007) [2023-10-10 08:03:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 166658048. Throughput: 0: 1677.1, 1: 1676.1. Samples: 41671470. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:51,784][52050] Avg episode reward: [(0, '22.190'), (1, '21.900')] [2023-10-10 08:03:53,143][53268] Updated weights for policy 1, policy_version 81350 (0.0008) [2023-10-10 08:03:53,515][53268] Updated weights for policy 1, policy_version 81360 (0.0010) [2023-10-10 08:03:53,651][53252] Updated weights for policy 0, policy_version 81410 (0.0008) [2023-10-10 08:03:53,871][53268] Updated weights for policy 1, policy_version 81370 (0.0008) [2023-10-10 08:03:54,022][53252] Updated weights for policy 0, policy_version 81420 (0.0007) [2023-10-10 08:03:54,386][53252] Updated weights for policy 0, policy_version 81430 (0.0007) [2023-10-10 08:03:54,756][53252] Updated weights for policy 0, policy_version 81440 (0.0008) [2023-10-10 08:03:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166723584. Throughput: 0: 1702.2, 1: 1679.7. Samples: 41692252. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:03:56,784][52050] Avg episode reward: [(0, '22.150'), (1, '21.930')] [2023-10-10 08:03:57,954][53268] Updated weights for policy 1, policy_version 81380 (0.0010) [2023-10-10 08:03:58,318][53268] Updated weights for policy 1, policy_version 81390 (0.0010) [2023-10-10 08:03:58,690][53268] Updated weights for policy 1, policy_version 81400 (0.0009) [2023-10-10 08:03:58,961][53252] Updated weights for policy 0, policy_version 81450 (0.0010) [2023-10-10 08:03:59,338][53252] Updated weights for policy 0, policy_version 81460 (0.0009) [2023-10-10 08:03:59,691][53252] Updated weights for policy 0, policy_version 81470 (0.0009) [2023-10-10 08:04:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166789120. Throughput: 0: 1681.0, 1: 1661.4. Samples: 41701836. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:04:01,784][52050] Avg episode reward: [(0, '21.310'), (1, '20.780')] [2023-10-10 08:04:02,664][53268] Updated weights for policy 1, policy_version 81410 (0.0009) [2023-10-10 08:04:03,041][53268] Updated weights for policy 1, policy_version 81420 (0.0009) [2023-10-10 08:04:03,397][53268] Updated weights for policy 1, policy_version 81430 (0.0009) [2023-10-10 08:04:03,670][53252] Updated weights for policy 0, policy_version 81480 (0.0008) [2023-10-10 08:04:03,763][53268] Updated weights for policy 1, policy_version 81440 (0.0008) [2023-10-10 08:04:04,047][53252] Updated weights for policy 0, policy_version 81490 (0.0009) [2023-10-10 08:04:04,413][53252] Updated weights for policy 0, policy_version 81500 (0.0009) [2023-10-10 08:04:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166854656. Throughput: 0: 1683.1, 1: 1679.4. Samples: 41722012. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:04:06,784][52050] Avg episode reward: [(0, '21.780'), (1, '22.230')] [2023-10-10 08:04:07,845][53268] Updated weights for policy 1, policy_version 81450 (0.0009) [2023-10-10 08:04:08,214][53268] Updated weights for policy 1, policy_version 81460 (0.0008) [2023-10-10 08:04:08,456][53252] Updated weights for policy 0, policy_version 81510 (0.0007) [2023-10-10 08:04:08,583][53268] Updated weights for policy 1, policy_version 81470 (0.0009) [2023-10-10 08:04:08,824][53252] Updated weights for policy 0, policy_version 81520 (0.0008) [2023-10-10 08:04:09,198][53252] Updated weights for policy 0, policy_version 81530 (0.0010) [2023-10-10 08:04:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166920192. Throughput: 0: 1691.7, 1: 1679.3. Samples: 41742622. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:04:11,784][52050] Avg episode reward: [(0, '22.950'), (1, '20.930')] [2023-10-10 08:04:12,759][53268] Updated weights for policy 1, policy_version 81480 (0.0010) [2023-10-10 08:04:13,129][53268] Updated weights for policy 1, policy_version 81490 (0.0008) [2023-10-10 08:04:13,424][53252] Updated weights for policy 0, policy_version 81540 (0.0007) [2023-10-10 08:04:13,508][53268] Updated weights for policy 1, policy_version 81500 (0.0010) [2023-10-10 08:04:13,788][53252] Updated weights for policy 0, policy_version 81550 (0.0008) [2023-10-10 08:04:14,151][53252] Updated weights for policy 0, policy_version 81560 (0.0008) [2023-10-10 08:04:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166985728. Throughput: 0: 1663.2, 1: 1670.3. Samples: 41751728. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:16,784][52050] Avg episode reward: [(0, '22.280'), (1, '22.310')] [2023-10-10 08:04:17,580][53268] Updated weights for policy 1, policy_version 81510 (0.0010) [2023-10-10 08:04:17,944][53268] Updated weights for policy 1, policy_version 81520 (0.0008) [2023-10-10 08:04:18,142][53252] Updated weights for policy 0, policy_version 81570 (0.0008) [2023-10-10 08:04:18,311][53268] Updated weights for policy 1, policy_version 81530 (0.0009) [2023-10-10 08:04:18,510][53252] Updated weights for policy 0, policy_version 81580 (0.0008) [2023-10-10 08:04:18,880][53252] Updated weights for policy 0, policy_version 81590 (0.0007) [2023-10-10 08:04:19,252][53252] Updated weights for policy 0, policy_version 81600 (0.0008) [2023-10-10 08:04:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167051264. Throughput: 0: 1686.8, 1: 1676.7. Samples: 41772524. 
Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:21,784][52050] Avg episode reward: [(0, '22.980'), (1, '22.620')] [2023-10-10 08:04:22,365][53268] Updated weights for policy 1, policy_version 81540 (0.0007) [2023-10-10 08:04:22,726][53268] Updated weights for policy 1, policy_version 81550 (0.0011) [2023-10-10 08:04:23,095][53268] Updated weights for policy 1, policy_version 81560 (0.0008) [2023-10-10 08:04:23,234][53252] Updated weights for policy 0, policy_version 81610 (0.0009) [2023-10-10 08:04:23,608][53252] Updated weights for policy 0, policy_version 81620 (0.0009) [2023-10-10 08:04:23,974][53252] Updated weights for policy 0, policy_version 81630 (0.0007) [2023-10-10 08:04:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167116800. Throughput: 0: 1684.5, 1: 1676.5. Samples: 41793140. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:26,784][52050] Avg episode reward: [(0, '23.250'), (1, '24.370')] [2023-10-10 08:04:27,386][53268] Updated weights for policy 1, policy_version 81570 (0.0009) [2023-10-10 08:04:27,793][53268] Updated weights for policy 1, policy_version 81580 (0.0009) [2023-10-10 08:04:28,042][53252] Updated weights for policy 0, policy_version 81640 (0.0007) [2023-10-10 08:04:28,160][53268] Updated weights for policy 1, policy_version 81590 (0.0008) [2023-10-10 08:04:28,420][53252] Updated weights for policy 0, policy_version 81650 (0.0008) [2023-10-10 08:04:28,525][53268] Updated weights for policy 1, policy_version 81600 (0.0009) [2023-10-10 08:04:28,782][53252] Updated weights for policy 0, policy_version 81660 (0.0009) [2023-10-10 08:04:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167182336. Throughput: 0: 1664.3, 1: 1672.5. Samples: 41802086. 
Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:31,784][52050] Avg episode reward: [(0, '23.190'), (1, '21.910')] [2023-10-10 08:04:32,445][53268] Updated weights for policy 1, policy_version 81610 (0.0007) [2023-10-10 08:04:32,739][53252] Updated weights for policy 0, policy_version 81670 (0.0007) [2023-10-10 08:04:32,814][53268] Updated weights for policy 1, policy_version 81620 (0.0007) [2023-10-10 08:04:33,099][53252] Updated weights for policy 0, policy_version 81680 (0.0007) [2023-10-10 08:04:33,181][53268] Updated weights for policy 1, policy_version 81630 (0.0010) [2023-10-10 08:04:33,471][53252] Updated weights for policy 0, policy_version 81690 (0.0008) [2023-10-10 08:04:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167247872. Throughput: 0: 1681.9, 1: 1681.6. Samples: 41822824. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:36,784][52050] Avg episode reward: [(0, '22.360'), (1, '23.160')] [2023-10-10 08:04:37,399][53268] Updated weights for policy 1, policy_version 81640 (0.0007) [2023-10-10 08:04:37,563][53252] Updated weights for policy 0, policy_version 81700 (0.0008) [2023-10-10 08:04:37,770][53268] Updated weights for policy 1, policy_version 81650 (0.0009) [2023-10-10 08:04:37,930][53252] Updated weights for policy 0, policy_version 81710 (0.0009) [2023-10-10 08:04:38,131][53268] Updated weights for policy 1, policy_version 81660 (0.0008) [2023-10-10 08:04:38,306][53252] Updated weights for policy 0, policy_version 81720 (0.0008) [2023-10-10 08:04:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167313408. Throughput: 0: 1688.0, 1: 1675.4. Samples: 41843608. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:41,784][52050] Avg episode reward: [(0, '20.700'), (1, '20.430')] [2023-10-10 08:04:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000081664_83623936.pth... 
[2023-10-10 08:04:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000081728_83689472.pth... [2023-10-10 08:04:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000080096_82018304.pth [2023-10-10 08:04:41,832][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000080160_82083840.pth [2023-10-10 08:04:42,265][53268] Updated weights for policy 1, policy_version 81670 (0.0007) [2023-10-10 08:04:42,344][53252] Updated weights for policy 0, policy_version 81730 (0.0008) [2023-10-10 08:04:42,627][53268] Updated weights for policy 1, policy_version 81680 (0.0008) [2023-10-10 08:04:42,720][53252] Updated weights for policy 0, policy_version 81740 (0.0007) [2023-10-10 08:04:42,989][53268] Updated weights for policy 1, policy_version 81690 (0.0009) [2023-10-10 08:04:43,085][53252] Updated weights for policy 0, policy_version 81750 (0.0009) [2023-10-10 08:04:43,456][53252] Updated weights for policy 0, policy_version 81760 (0.0008) [2023-10-10 08:04:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167378944. Throughput: 0: 1677.8, 1: 1675.0. Samples: 41852712. 
Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:46,784][52050] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-10 08:04:47,037][53268] Updated weights for policy 1, policy_version 81700 (0.0008) [2023-10-10 08:04:47,408][53268] Updated weights for policy 1, policy_version 81710 (0.0008) [2023-10-10 08:04:47,610][53252] Updated weights for policy 0, policy_version 81770 (0.0008) [2023-10-10 08:04:47,766][53268] Updated weights for policy 1, policy_version 81720 (0.0008) [2023-10-10 08:04:47,979][53252] Updated weights for policy 0, policy_version 81780 (0.0008) [2023-10-10 08:04:48,351][53252] Updated weights for policy 0, policy_version 81790 (0.0010) [2023-10-10 08:04:51,732][53268] Updated weights for policy 1, policy_version 81730 (0.0009) [2023-10-10 08:04:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167444480. Throughput: 0: 1681.5, 1: 1678.4. Samples: 41873208. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:51,784][52050] Avg episode reward: [(0, '21.150'), (1, '21.500')] [2023-10-10 08:04:52,097][53268] Updated weights for policy 1, policy_version 81740 (0.0007) [2023-10-10 08:04:52,352][53252] Updated weights for policy 0, policy_version 81800 (0.0007) [2023-10-10 08:04:52,461][53268] Updated weights for policy 1, policy_version 81750 (0.0008) [2023-10-10 08:04:52,720][53252] Updated weights for policy 0, policy_version 81810 (0.0007) [2023-10-10 08:04:52,827][53268] Updated weights for policy 1, policy_version 81760 (0.0009) [2023-10-10 08:04:53,092][53252] Updated weights for policy 0, policy_version 81820 (0.0009) [2023-10-10 08:04:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 167510016. Throughput: 0: 1688.3, 1: 1673.8. Samples: 41893916. 
Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:04:56,784][52050] Avg episode reward: [(0, '21.000'), (1, '19.950')] [2023-10-10 08:04:56,989][53268] Updated weights for policy 1, policy_version 81770 (0.0011) [2023-10-10 08:04:57,265][53252] Updated weights for policy 0, policy_version 81830 (0.0009) [2023-10-10 08:04:57,359][53268] Updated weights for policy 1, policy_version 81780 (0.0007) [2023-10-10 08:04:57,636][53252] Updated weights for policy 0, policy_version 81840 (0.0007) [2023-10-10 08:04:57,720][53268] Updated weights for policy 1, policy_version 81790 (0.0009) [2023-10-10 08:04:58,004][53252] Updated weights for policy 0, policy_version 81850 (0.0007) [2023-10-10 08:05:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 167575552. Throughput: 0: 1683.5, 1: 1677.7. Samples: 41902982. Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:05:01,784][52050] Avg episode reward: [(0, '20.760'), (1, '20.530')] [2023-10-10 08:05:01,940][53268] Updated weights for policy 1, policy_version 81800 (0.0008) [2023-10-10 08:05:01,995][53252] Updated weights for policy 0, policy_version 81860 (0.0008) [2023-10-10 08:05:02,312][53268] Updated weights for policy 1, policy_version 81810 (0.0007) [2023-10-10 08:05:02,369][53252] Updated weights for policy 0, policy_version 81870 (0.0008) [2023-10-10 08:05:02,678][53268] Updated weights for policy 1, policy_version 81820 (0.0008) [2023-10-10 08:05:02,748][53252] Updated weights for policy 0, policy_version 81880 (0.0007) [2023-10-10 08:05:06,591][53268] Updated weights for policy 1, policy_version 81830 (0.0007) [2023-10-10 08:05:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167641088. Throughput: 0: 1684.5, 1: 1673.5. Samples: 41923634. 
Policy #0 lag: (min: 31.0, avg: 43.0, max: 63.0) [2023-10-10 08:05:06,784][52050] Avg episode reward: [(0, '21.950'), (1, '22.670')] [2023-10-10 08:05:06,817][53252] Updated weights for policy 0, policy_version 81890 (0.0009) [2023-10-10 08:05:06,954][53268] Updated weights for policy 1, policy_version 81840 (0.0007) [2023-10-10 08:05:07,185][53252] Updated weights for policy 0, policy_version 81900 (0.0007) [2023-10-10 08:05:07,328][53268] Updated weights for policy 1, policy_version 81850 (0.0007) [2023-10-10 08:05:07,553][53252] Updated weights for policy 0, policy_version 81910 (0.0007) [2023-10-10 08:05:07,925][53252] Updated weights for policy 0, policy_version 81920 (0.0007) [2023-10-10 08:05:11,466][53268] Updated weights for policy 1, policy_version 81860 (0.0010) [2023-10-10 08:05:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167706624. Throughput: 0: 1682.0, 1: 1674.6. Samples: 41944190. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:11,784][52050] Avg episode reward: [(0, '21.100'), (1, '21.290')] [2023-10-10 08:05:11,846][53268] Updated weights for policy 1, policy_version 81870 (0.0008) [2023-10-10 08:05:11,970][53252] Updated weights for policy 0, policy_version 81930 (0.0008) [2023-10-10 08:05:12,210][53268] Updated weights for policy 1, policy_version 81880 (0.0007) [2023-10-10 08:05:12,333][53252] Updated weights for policy 0, policy_version 81940 (0.0010) [2023-10-10 08:05:12,703][53252] Updated weights for policy 0, policy_version 81950 (0.0008) [2023-10-10 08:05:16,276][53268] Updated weights for policy 1, policy_version 81890 (0.0009) [2023-10-10 08:05:16,692][53268] Updated weights for policy 1, policy_version 81900 (0.0010) [2023-10-10 08:05:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167772160. Throughput: 0: 1681.4, 1: 1674.5. Samples: 41953102. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:16,785][52050] Avg episode reward: [(0, '20.670'), (1, '21.460')] [2023-10-10 08:05:16,917][53252] Updated weights for policy 0, policy_version 81960 (0.0009) [2023-10-10 08:05:17,055][53268] Updated weights for policy 1, policy_version 81910 (0.0008) [2023-10-10 08:05:17,284][53252] Updated weights for policy 0, policy_version 81970 (0.0009) [2023-10-10 08:05:17,413][53268] Updated weights for policy 1, policy_version 81920 (0.0007) [2023-10-10 08:05:17,654][53252] Updated weights for policy 0, policy_version 81980 (0.0009) [2023-10-10 08:05:21,405][53268] Updated weights for policy 1, policy_version 81930 (0.0009) [2023-10-10 08:05:21,704][53252] Updated weights for policy 0, policy_version 81990 (0.0007) [2023-10-10 08:05:21,769][53268] Updated weights for policy 1, policy_version 81940 (0.0009) [2023-10-10 08:05:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167837696. Throughput: 0: 1683.2, 1: 1668.7. Samples: 41973656. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:21,784][52050] Avg episode reward: [(0, '20.890'), (1, '22.210')] [2023-10-10 08:05:22,075][53252] Updated weights for policy 0, policy_version 82000 (0.0008) [2023-10-10 08:05:22,130][53268] Updated weights for policy 1, policy_version 81950 (0.0008) [2023-10-10 08:05:22,458][53252] Updated weights for policy 0, policy_version 82010 (0.0008) [2023-10-10 08:05:26,274][53268] Updated weights for policy 1, policy_version 81960 (0.0008) [2023-10-10 08:05:26,579][53252] Updated weights for policy 0, policy_version 82020 (0.0010) [2023-10-10 08:05:26,646][53268] Updated weights for policy 1, policy_version 81970 (0.0007) [2023-10-10 08:05:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 167903232. Throughput: 0: 1676.2, 1: 1669.0. Samples: 41994142. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:26,784][52050] Avg episode reward: [(0, '22.230'), (1, '22.310')] [2023-10-10 08:05:26,955][53252] Updated weights for policy 0, policy_version 82030 (0.0007) [2023-10-10 08:05:27,011][53268] Updated weights for policy 1, policy_version 81980 (0.0007) [2023-10-10 08:05:27,332][53252] Updated weights for policy 0, policy_version 82040 (0.0009) [2023-10-10 08:05:31,090][53268] Updated weights for policy 1, policy_version 81990 (0.0009) [2023-10-10 08:05:31,453][53268] Updated weights for policy 1, policy_version 82000 (0.0008) [2023-10-10 08:05:31,553][53252] Updated weights for policy 0, policy_version 82050 (0.0009) [2023-10-10 08:05:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167968768. Throughput: 0: 1675.9, 1: 1670.2. Samples: 42003284. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:31,784][52050] Avg episode reward: [(0, '22.200'), (1, '22.370')] [2023-10-10 08:05:31,816][53268] Updated weights for policy 1, policy_version 82010 (0.0007) [2023-10-10 08:05:31,953][53252] Updated weights for policy 0, policy_version 82060 (0.0009) [2023-10-10 08:05:32,329][53252] Updated weights for policy 0, policy_version 82070 (0.0010) [2023-10-10 08:05:32,709][53252] Updated weights for policy 0, policy_version 82080 (0.0010) [2023-10-10 08:05:36,048][53268] Updated weights for policy 1, policy_version 82020 (0.0008) [2023-10-10 08:05:36,410][53268] Updated weights for policy 1, policy_version 82030 (0.0008) [2023-10-10 08:05:36,779][53268] Updated weights for policy 1, policy_version 82040 (0.0008) [2023-10-10 08:05:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168034304. Throughput: 0: 1675.4, 1: 1672.8. Samples: 42023878. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:36,784][52050] Avg episode reward: [(0, '23.890'), (1, '24.830')] [2023-10-10 08:05:36,813][53252] Updated weights for policy 0, policy_version 82090 (0.0007) [2023-10-10 08:05:37,187][53252] Updated weights for policy 0, policy_version 82100 (0.0007) [2023-10-10 08:05:37,558][53252] Updated weights for policy 0, policy_version 82110 (0.0008) [2023-10-10 08:05:40,987][53268] Updated weights for policy 1, policy_version 82050 (0.0008) [2023-10-10 08:05:41,351][53268] Updated weights for policy 1, policy_version 82060 (0.0009) [2023-10-10 08:05:41,577][53252] Updated weights for policy 0, policy_version 82120 (0.0008) [2023-10-10 08:05:41,721][53268] Updated weights for policy 1, policy_version 82070 (0.0009) [2023-10-10 08:05:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168099840. Throughput: 0: 1670.5, 1: 1670.4. Samples: 42044258. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:41,784][52050] Avg episode reward: [(0, '21.680'), (1, '23.310')] [2023-10-10 08:05:41,944][53252] Updated weights for policy 0, policy_version 82130 (0.0007) [2023-10-10 08:05:42,086][53268] Updated weights for policy 1, policy_version 82080 (0.0008) [2023-10-10 08:05:42,319][53252] Updated weights for policy 0, policy_version 82140 (0.0007) [2023-10-10 08:05:46,062][53268] Updated weights for policy 1, policy_version 82090 (0.0009) [2023-10-10 08:05:46,422][53268] Updated weights for policy 1, policy_version 82100 (0.0011) [2023-10-10 08:05:46,481][53252] Updated weights for policy 0, policy_version 82150 (0.0008) [2023-10-10 08:05:46,783][53268] Updated weights for policy 1, policy_version 82110 (0.0009) [2023-10-10 08:05:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168165376. Throughput: 0: 1676.5, 1: 1673.6. Samples: 42053738. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:46,784][52050] Avg episode reward: [(0, '21.740'), (1, '23.390')] [2023-10-10 08:05:46,841][53252] Updated weights for policy 0, policy_version 82160 (0.0007) [2023-10-10 08:05:47,209][53252] Updated weights for policy 0, policy_version 82170 (0.0009) [2023-10-10 08:05:51,022][53268] Updated weights for policy 1, policy_version 82120 (0.0008) [2023-10-10 08:05:51,087][53252] Updated weights for policy 0, policy_version 82180 (0.0008) [2023-10-10 08:05:51,393][53268] Updated weights for policy 1, policy_version 82130 (0.0009) [2023-10-10 08:05:51,457][53252] Updated weights for policy 0, policy_version 82190 (0.0009) [2023-10-10 08:05:51,756][53268] Updated weights for policy 1, policy_version 82140 (0.0008) [2023-10-10 08:05:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 168230912. Throughput: 0: 1675.9, 1: 1678.5. Samples: 42074582. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:51,784][52050] Avg episode reward: [(0, '23.210'), (1, '23.550')] [2023-10-10 08:05:51,824][53252] Updated weights for policy 0, policy_version 82200 (0.0008) [2023-10-10 08:05:55,510][53268] Updated weights for policy 1, policy_version 82150 (0.0007) [2023-10-10 08:05:55,877][53268] Updated weights for policy 1, policy_version 82160 (0.0008) [2023-10-10 08:05:55,896][53252] Updated weights for policy 0, policy_version 82210 (0.0007) [2023-10-10 08:05:56,238][53268] Updated weights for policy 1, policy_version 82170 (0.0008) [2023-10-10 08:05:56,271][53252] Updated weights for policy 0, policy_version 82220 (0.0007) [2023-10-10 08:05:56,636][53252] Updated weights for policy 0, policy_version 82230 (0.0007) [2023-10-10 08:05:56,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168329216. Throughput: 0: 1663.1, 1: 1667.5. Samples: 42094072. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:05:56,784][52050] Avg episode reward: [(0, '21.750'), (1, '21.800')] [2023-10-10 08:05:57,005][53252] Updated weights for policy 0, policy_version 82240 (0.0008) [2023-10-10 08:06:00,263][53268] Updated weights for policy 1, policy_version 82180 (0.0008) [2023-10-10 08:06:00,631][53268] Updated weights for policy 1, policy_version 82190 (0.0008) [2023-10-10 08:06:00,987][53268] Updated weights for policy 1, policy_version 82200 (0.0009) [2023-10-10 08:06:01,187][53252] Updated weights for policy 0, policy_version 82250 (0.0008) [2023-10-10 08:06:01,549][53252] Updated weights for policy 0, policy_version 82260 (0.0009) [2023-10-10 08:06:01,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 168394752. Throughput: 0: 1677.9, 1: 1691.2. Samples: 42104710. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) [2023-10-10 08:06:01,784][52050] Avg episode reward: [(0, '21.940'), (1, '21.960')] [2023-10-10 08:06:01,931][53252] Updated weights for policy 0, policy_version 82270 (0.0007) [2023-10-10 08:06:05,109][53268] Updated weights for policy 1, policy_version 82210 (0.0008) [2023-10-10 08:06:05,525][53268] Updated weights for policy 1, policy_version 82220 (0.0011) [2023-10-10 08:06:05,844][53252] Updated weights for policy 0, policy_version 82280 (0.0008) [2023-10-10 08:06:05,903][53268] Updated weights for policy 1, policy_version 82230 (0.0008) [2023-10-10 08:06:06,220][53252] Updated weights for policy 0, policy_version 82290 (0.0007) [2023-10-10 08:06:06,264][53268] Updated weights for policy 1, policy_version 82240 (0.0008) [2023-10-10 08:06:06,593][53252] Updated weights for policy 0, policy_version 82300 (0.0010) [2023-10-10 08:06:06,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168493056. Throughput: 0: 1676.3, 1: 1691.8. Samples: 42125218. 
Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:06,784][52050] Avg episode reward: [(0, '22.660'), (1, '21.780')] [2023-10-10 08:06:10,274][53268] Updated weights for policy 1, policy_version 82250 (0.0008) [2023-10-10 08:06:10,541][53252] Updated weights for policy 0, policy_version 82310 (0.0008) [2023-10-10 08:06:10,643][53268] Updated weights for policy 1, policy_version 82260 (0.0010) [2023-10-10 08:06:10,925][53252] Updated weights for policy 0, policy_version 82320 (0.0008) [2023-10-10 08:06:11,010][53268] Updated weights for policy 1, policy_version 82270 (0.0009) [2023-10-10 08:06:11,289][53252] Updated weights for policy 0, policy_version 82330 (0.0008) [2023-10-10 08:06:11,783][52050] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 168558592. Throughput: 0: 1659.1, 1: 1666.7. Samples: 42143804. Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:11,784][52050] Avg episode reward: [(0, '22.870'), (1, '21.180')] [2023-10-10 08:06:15,089][53268] Updated weights for policy 1, policy_version 82280 (0.0010) [2023-10-10 08:06:15,358][53252] Updated weights for policy 0, policy_version 82340 (0.0008) [2023-10-10 08:06:15,455][53268] Updated weights for policy 1, policy_version 82290 (0.0008) [2023-10-10 08:06:15,728][53252] Updated weights for policy 0, policy_version 82350 (0.0007) [2023-10-10 08:06:15,826][53268] Updated weights for policy 1, policy_version 82300 (0.0008) [2023-10-10 08:06:16,088][53252] Updated weights for policy 0, policy_version 82360 (0.0008) [2023-10-10 08:06:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168624128. Throughput: 0: 1684.8, 1: 1692.5. Samples: 42155266. 
Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:16,785][52050] Avg episode reward: [(0, '23.400'), (1, '19.770')] [2023-10-10 08:06:19,948][53268] Updated weights for policy 1, policy_version 82310 (0.0008) [2023-10-10 08:06:20,270][53252] Updated weights for policy 0, policy_version 82370 (0.0008) [2023-10-10 08:06:20,306][53268] Updated weights for policy 1, policy_version 82320 (0.0009) [2023-10-10 08:06:20,645][53252] Updated weights for policy 0, policy_version 82380 (0.0008) [2023-10-10 08:06:20,671][53268] Updated weights for policy 1, policy_version 82330 (0.0008) [2023-10-10 08:06:21,017][53252] Updated weights for policy 0, policy_version 82390 (0.0010) [2023-10-10 08:06:21,380][53252] Updated weights for policy 0, policy_version 82400 (0.0009) [2023-10-10 08:06:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168689664. Throughput: 0: 1683.4, 1: 1677.0. Samples: 42175094. Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:21,784][52050] Avg episode reward: [(0, '21.200'), (1, '20.970')] [2023-10-10 08:06:24,769][53268] Updated weights for policy 1, policy_version 82340 (0.0009) [2023-10-10 08:06:25,132][53268] Updated weights for policy 1, policy_version 82350 (0.0008) [2023-10-10 08:06:25,457][53252] Updated weights for policy 0, policy_version 82410 (0.0008) [2023-10-10 08:06:25,494][53268] Updated weights for policy 1, policy_version 82360 (0.0009) [2023-10-10 08:06:25,822][53252] Updated weights for policy 0, policy_version 82420 (0.0009) [2023-10-10 08:06:26,203][53252] Updated weights for policy 0, policy_version 82430 (0.0011) [2023-10-10 08:06:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168755200. Throughput: 0: 1659.7, 1: 1665.7. Samples: 42193904. 
Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:26,784][52050] Avg episode reward: [(0, '23.070'), (1, '23.250')] [2023-10-10 08:06:29,448][53268] Updated weights for policy 1, policy_version 82370 (0.0008) [2023-10-10 08:06:29,813][53268] Updated weights for policy 1, policy_version 82380 (0.0011) [2023-10-10 08:06:30,187][53268] Updated weights for policy 1, policy_version 82390 (0.0008) [2023-10-10 08:06:30,317][53252] Updated weights for policy 0, policy_version 82440 (0.0008) [2023-10-10 08:06:30,548][53268] Updated weights for policy 1, policy_version 82400 (0.0009) [2023-10-10 08:06:30,686][53252] Updated weights for policy 0, policy_version 82450 (0.0009) [2023-10-10 08:06:31,059][53252] Updated weights for policy 0, policy_version 82460 (0.0010) [2023-10-10 08:06:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 168820736. Throughput: 0: 1682.5, 1: 1689.5. Samples: 42205476. Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:31,784][52050] Avg episode reward: [(0, '22.840'), (1, '23.540')] [2023-10-10 08:06:34,691][53268] Updated weights for policy 1, policy_version 82410 (0.0009) [2023-10-10 08:06:34,961][53252] Updated weights for policy 0, policy_version 82470 (0.0009) [2023-10-10 08:06:35,048][53268] Updated weights for policy 1, policy_version 82420 (0.0009) [2023-10-10 08:06:35,322][53252] Updated weights for policy 0, policy_version 82480 (0.0009) [2023-10-10 08:06:35,413][53268] Updated weights for policy 1, policy_version 82430 (0.0009) [2023-10-10 08:06:35,698][53252] Updated weights for policy 0, policy_version 82490 (0.0008) [2023-10-10 08:06:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168886272. Throughput: 0: 1670.3, 1: 1666.1. Samples: 42224716. 
Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:36,784][52050] Avg episode reward: [(0, '20.980'), (1, '23.130')] [2023-10-10 08:06:39,520][53268] Updated weights for policy 1, policy_version 82440 (0.0010) [2023-10-10 08:06:39,876][53252] Updated weights for policy 0, policy_version 82500 (0.0008) [2023-10-10 08:06:39,886][53268] Updated weights for policy 1, policy_version 82450 (0.0009) [2023-10-10 08:06:40,230][53252] Updated weights for policy 0, policy_version 82510 (0.0009) [2023-10-10 08:06:40,246][53268] Updated weights for policy 1, policy_version 82460 (0.0009) [2023-10-10 08:06:40,606][53252] Updated weights for policy 0, policy_version 82520 (0.0008) [2023-10-10 08:06:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168951808. Throughput: 0: 1669.7, 1: 1676.8. Samples: 42244666. Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:41,785][52050] Avg episode reward: [(0, '21.680'), (1, '24.920')] [2023-10-10 08:06:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000082528_84508672.pth... [2023-10-10 08:06:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000082464_84443136.pth... 
[2023-10-10 08:06:41,824][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000080960_82903040.pth [2023-10-10 08:06:41,836][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000080896_82837504.pth [2023-10-10 08:06:44,168][53268] Updated weights for policy 1, policy_version 82470 (0.0009) [2023-10-10 08:06:44,541][53268] Updated weights for policy 1, policy_version 82480 (0.0007) [2023-10-10 08:06:44,571][53252] Updated weights for policy 0, policy_version 82530 (0.0008) [2023-10-10 08:06:44,911][53268] Updated weights for policy 1, policy_version 82490 (0.0008) [2023-10-10 08:06:44,938][53252] Updated weights for policy 0, policy_version 82540 (0.0010) [2023-10-10 08:06:45,312][53252] Updated weights for policy 0, policy_version 82550 (0.0008) [2023-10-10 08:06:45,682][53252] Updated weights for policy 0, policy_version 82560 (0.0010) [2023-10-10 08:06:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 169017344. Throughput: 0: 1688.7, 1: 1683.1. Samples: 42256440. Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:46,784][52050] Avg episode reward: [(0, '22.630'), (1, '21.610')] [2023-10-10 08:06:48,989][53268] Updated weights for policy 1, policy_version 82500 (0.0010) [2023-10-10 08:06:49,363][53268] Updated weights for policy 1, policy_version 82510 (0.0010) [2023-10-10 08:06:49,719][53252] Updated weights for policy 0, policy_version 82570 (0.0007) [2023-10-10 08:06:49,721][53268] Updated weights for policy 1, policy_version 82520 (0.0009) [2023-10-10 08:06:50,094][53252] Updated weights for policy 0, policy_version 82580 (0.0009) [2023-10-10 08:06:50,458][53252] Updated weights for policy 0, policy_version 82590 (0.0008) [2023-10-10 08:06:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 169082880. Throughput: 0: 1668.2, 1: 1660.1. Samples: 42274994. 
Policy #0 lag: (min: 9.0, avg: 24.2, max: 41.0) [2023-10-10 08:06:51,785][52050] Avg episode reward: [(0, '21.860'), (1, '19.730')] [2023-10-10 08:06:53,845][53268] Updated weights for policy 1, policy_version 82530 (0.0009) [2023-10-10 08:06:54,272][53268] Updated weights for policy 1, policy_version 82540 (0.0008) [2023-10-10 08:06:54,518][53252] Updated weights for policy 0, policy_version 82600 (0.0007) [2023-10-10 08:06:54,631][53268] Updated weights for policy 1, policy_version 82550 (0.0008) [2023-10-10 08:06:54,893][53252] Updated weights for policy 0, policy_version 82610 (0.0008) [2023-10-10 08:06:55,001][53268] Updated weights for policy 1, policy_version 82560 (0.0008) [2023-10-10 08:06:55,261][53252] Updated weights for policy 0, policy_version 82620 (0.0010) [2023-10-10 08:06:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 169148416. Throughput: 0: 1679.1, 1: 1688.5. Samples: 42295344. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:06:56,784][52050] Avg episode reward: [(0, '22.420'), (1, '21.660')] [2023-10-10 08:06:59,094][53268] Updated weights for policy 1, policy_version 82570 (0.0008) [2023-10-10 08:06:59,271][53252] Updated weights for policy 0, policy_version 82630 (0.0008) [2023-10-10 08:06:59,472][53268] Updated weights for policy 1, policy_version 82580 (0.0007) [2023-10-10 08:06:59,652][53252] Updated weights for policy 0, policy_version 82640 (0.0007) [2023-10-10 08:06:59,838][53268] Updated weights for policy 1, policy_version 82590 (0.0010) [2023-10-10 08:07:00,017][53252] Updated weights for policy 0, policy_version 82650 (0.0008) [2023-10-10 08:07:01,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 169213952. Throughput: 0: 1674.9, 1: 1680.9. Samples: 42306276. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:01,784][52050] Avg episode reward: [(0, '22.230'), (1, '22.360')] [2023-10-10 08:07:03,866][53268] Updated weights for policy 1, policy_version 82600 (0.0010) [2023-10-10 08:07:04,070][53252] Updated weights for policy 0, policy_version 82660 (0.0008) [2023-10-10 08:07:04,233][53268] Updated weights for policy 1, policy_version 82610 (0.0008) [2023-10-10 08:07:04,443][53252] Updated weights for policy 0, policy_version 82670 (0.0009) [2023-10-10 08:07:04,595][53268] Updated weights for policy 1, policy_version 82620 (0.0008) [2023-10-10 08:07:04,816][53252] Updated weights for policy 0, policy_version 82680 (0.0009) [2023-10-10 08:07:06,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 169279488. Throughput: 0: 1667.5, 1: 1673.1. Samples: 42325422. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:06,785][52050] Avg episode reward: [(0, '23.140'), (1, '22.030')] [2023-10-10 08:07:08,662][53268] Updated weights for policy 1, policy_version 82630 (0.0009) [2023-10-10 08:07:08,805][53252] Updated weights for policy 0, policy_version 82690 (0.0009) [2023-10-10 08:07:09,034][53268] Updated weights for policy 1, policy_version 82640 (0.0008) [2023-10-10 08:07:09,197][53252] Updated weights for policy 0, policy_version 82700 (0.0007) [2023-10-10 08:07:09,409][53268] Updated weights for policy 1, policy_version 82650 (0.0008) [2023-10-10 08:07:09,563][53252] Updated weights for policy 0, policy_version 82710 (0.0008) [2023-10-10 08:07:09,939][53252] Updated weights for policy 0, policy_version 82720 (0.0008) [2023-10-10 08:07:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169345024. Throughput: 0: 1691.3, 1: 1690.3. Samples: 42346074. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:11,784][52050] Avg episode reward: [(0, '22.850'), (1, '23.740')] [2023-10-10 08:07:13,459][53268] Updated weights for policy 1, policy_version 82660 (0.0009) [2023-10-10 08:07:13,824][53268] Updated weights for policy 1, policy_version 82670 (0.0009) [2023-10-10 08:07:14,056][53252] Updated weights for policy 0, policy_version 82730 (0.0007) [2023-10-10 08:07:14,191][53268] Updated weights for policy 1, policy_version 82680 (0.0007) [2023-10-10 08:07:14,426][53252] Updated weights for policy 0, policy_version 82740 (0.0008) [2023-10-10 08:07:14,792][53252] Updated weights for policy 0, policy_version 82750 (0.0009) [2023-10-10 08:07:16,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169410560. Throughput: 0: 1675.2, 1: 1670.4. Samples: 42356024. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:16,784][52050] Avg episode reward: [(0, '20.700'), (1, '23.510')] [2023-10-10 08:07:18,188][53268] Updated weights for policy 1, policy_version 82690 (0.0010) [2023-10-10 08:07:18,557][53268] Updated weights for policy 1, policy_version 82700 (0.0009) [2023-10-10 08:07:18,931][53268] Updated weights for policy 1, policy_version 82710 (0.0009) [2023-10-10 08:07:18,954][53252] Updated weights for policy 0, policy_version 82760 (0.0008) [2023-10-10 08:07:19,295][53268] Updated weights for policy 1, policy_version 82720 (0.0009) [2023-10-10 08:07:19,323][53252] Updated weights for policy 0, policy_version 82770 (0.0008) [2023-10-10 08:07:19,700][53252] Updated weights for policy 0, policy_version 82780 (0.0010) [2023-10-10 08:07:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169476096. Throughput: 0: 1671.5, 1: 1683.7. Samples: 42375698. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:21,784][52050] Avg episode reward: [(0, '21.770'), (1, '23.630')] [2023-10-10 08:07:23,416][53268] Updated weights for policy 1, policy_version 82730 (0.0009) [2023-10-10 08:07:23,773][53268] Updated weights for policy 1, policy_version 82740 (0.0009) [2023-10-10 08:07:23,875][53252] Updated weights for policy 0, policy_version 82790 (0.0009) [2023-10-10 08:07:24,152][53268] Updated weights for policy 1, policy_version 82750 (0.0007) [2023-10-10 08:07:24,237][53252] Updated weights for policy 0, policy_version 82800 (0.0009) [2023-10-10 08:07:24,613][53252] Updated weights for policy 0, policy_version 82810 (0.0011) [2023-10-10 08:07:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169541632. Throughput: 0: 1686.3, 1: 1680.9. Samples: 42396188. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:26,784][52050] Avg episode reward: [(0, '19.360'), (1, '23.400')] [2023-10-10 08:07:28,388][53268] Updated weights for policy 1, policy_version 82760 (0.0008) [2023-10-10 08:07:28,597][53252] Updated weights for policy 0, policy_version 82820 (0.0008) [2023-10-10 08:07:28,752][53268] Updated weights for policy 1, policy_version 82770 (0.0009) [2023-10-10 08:07:28,965][53252] Updated weights for policy 0, policy_version 82830 (0.0008) [2023-10-10 08:07:29,126][53268] Updated weights for policy 1, policy_version 82780 (0.0010) [2023-10-10 08:07:29,338][53252] Updated weights for policy 0, policy_version 82840 (0.0008) [2023-10-10 08:07:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169607168. Throughput: 0: 1664.5, 1: 1656.6. Samples: 42405890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:31,784][52050] Avg episode reward: [(0, '21.840'), (1, '23.080')] [2023-10-10 08:07:33,124][53268] Updated weights for policy 1, policy_version 82790 (0.0009) [2023-10-10 08:07:33,314][53252] Updated weights for policy 0, policy_version 82850 (0.0009) [2023-10-10 08:07:33,494][53268] Updated weights for policy 1, policy_version 82800 (0.0010) [2023-10-10 08:07:33,681][53252] Updated weights for policy 0, policy_version 82860 (0.0008) [2023-10-10 08:07:33,863][53268] Updated weights for policy 1, policy_version 82810 (0.0007) [2023-10-10 08:07:34,047][53252] Updated weights for policy 0, policy_version 82870 (0.0008) [2023-10-10 08:07:34,412][53252] Updated weights for policy 0, policy_version 82880 (0.0009) [2023-10-10 08:07:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 169672704. Throughput: 0: 1678.3, 1: 1681.8. Samples: 42426198. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:36,785][52050] Avg episode reward: [(0, '22.130'), (1, '21.940')] [2023-10-10 08:07:37,966][53268] Updated weights for policy 1, policy_version 82820 (0.0008) [2023-10-10 08:07:38,335][53268] Updated weights for policy 1, policy_version 82830 (0.0008) [2023-10-10 08:07:38,442][53252] Updated weights for policy 0, policy_version 82890 (0.0009) [2023-10-10 08:07:38,689][53268] Updated weights for policy 1, policy_version 82840 (0.0009) [2023-10-10 08:07:38,808][53252] Updated weights for policy 0, policy_version 82900 (0.0008) [2023-10-10 08:07:39,178][53252] Updated weights for policy 0, policy_version 82910 (0.0008) [2023-10-10 08:07:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169738240. Throughput: 0: 1687.8, 1: 1679.8. Samples: 42446886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:41,784][52050] Avg episode reward: [(0, '21.190'), (1, '20.850')] [2023-10-10 08:07:42,708][53268] Updated weights for policy 1, policy_version 82850 (0.0008) [2023-10-10 08:07:43,098][53268] Updated weights for policy 1, policy_version 82860 (0.0007) [2023-10-10 08:07:43,456][53268] Updated weights for policy 1, policy_version 82870 (0.0008) [2023-10-10 08:07:43,456][53252] Updated weights for policy 0, policy_version 82920 (0.0008) [2023-10-10 08:07:43,830][53252] Updated weights for policy 0, policy_version 82930 (0.0009) [2023-10-10 08:07:43,831][53268] Updated weights for policy 1, policy_version 82880 (0.0008) [2023-10-10 08:07:44,208][53252] Updated weights for policy 0, policy_version 82940 (0.0008) [2023-10-10 08:07:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169803776. Throughput: 0: 1666.1, 1: 1659.3. Samples: 42455920. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:07:46,784][52050] Avg episode reward: [(0, '22.750'), (1, '21.440')] [2023-10-10 08:07:47,944][53268] Updated weights for policy 1, policy_version 82890 (0.0009) [2023-10-10 08:07:48,313][53268] Updated weights for policy 1, policy_version 82900 (0.0007) [2023-10-10 08:07:48,381][53252] Updated weights for policy 0, policy_version 82950 (0.0009) [2023-10-10 08:07:48,682][53268] Updated weights for policy 1, policy_version 82910 (0.0009) [2023-10-10 08:07:48,759][53252] Updated weights for policy 0, policy_version 82960 (0.0009) [2023-10-10 08:07:49,134][53252] Updated weights for policy 0, policy_version 82970 (0.0008) [2023-10-10 08:07:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169869312. Throughput: 0: 1677.0, 1: 1680.5. Samples: 42476510. 
Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:07:51,784][52050] Avg episode reward: [(0, '22.770'), (1, '21.530')] [2023-10-10 08:07:52,680][53268] Updated weights for policy 1, policy_version 82920 (0.0009) [2023-10-10 08:07:53,056][53252] Updated weights for policy 0, policy_version 82980 (0.0007) [2023-10-10 08:07:53,059][53268] Updated weights for policy 1, policy_version 82930 (0.0009) [2023-10-10 08:07:53,418][53268] Updated weights for policy 1, policy_version 82940 (0.0010) [2023-10-10 08:07:53,425][53252] Updated weights for policy 0, policy_version 82990 (0.0008) [2023-10-10 08:07:53,795][53252] Updated weights for policy 0, policy_version 83000 (0.0007) [2023-10-10 08:07:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 169934848. Throughput: 0: 1681.9, 1: 1684.7. Samples: 42497572. Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:07:56,784][52050] Avg episode reward: [(0, '22.060'), (1, '21.750')] [2023-10-10 08:07:57,544][53268] Updated weights for policy 1, policy_version 82950 (0.0008) [2023-10-10 08:07:57,896][53252] Updated weights for policy 0, policy_version 83010 (0.0008) [2023-10-10 08:07:57,913][53268] Updated weights for policy 1, policy_version 82960 (0.0009) [2023-10-10 08:07:58,277][53268] Updated weights for policy 1, policy_version 82970 (0.0009) [2023-10-10 08:07:58,281][53252] Updated weights for policy 0, policy_version 83020 (0.0007) [2023-10-10 08:07:58,658][53252] Updated weights for policy 0, policy_version 83030 (0.0009) [2023-10-10 08:07:59,024][53252] Updated weights for policy 0, policy_version 83040 (0.0008) [2023-10-10 08:08:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 170000384. Throughput: 0: 1671.3, 1: 1675.4. Samples: 42506626. 
Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:01,784][52050] Avg episode reward: [(0, '23.280'), (1, '22.680')] [2023-10-10 08:08:02,442][53268] Updated weights for policy 1, policy_version 82980 (0.0007) [2023-10-10 08:08:02,803][53268] Updated weights for policy 1, policy_version 82990 (0.0008) [2023-10-10 08:08:03,124][53252] Updated weights for policy 0, policy_version 83050 (0.0008) [2023-10-10 08:08:03,173][53268] Updated weights for policy 1, policy_version 83000 (0.0008) [2023-10-10 08:08:03,496][53252] Updated weights for policy 0, policy_version 83060 (0.0007) [2023-10-10 08:08:03,863][53252] Updated weights for policy 0, policy_version 83070 (0.0007) [2023-10-10 08:08:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 170065920. Throughput: 0: 1684.7, 1: 1680.4. Samples: 42527126. Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:06,784][52050] Avg episode reward: [(0, '22.700'), (1, '23.580')] [2023-10-10 08:08:07,362][53268] Updated weights for policy 1, policy_version 83010 (0.0008) [2023-10-10 08:08:07,724][53268] Updated weights for policy 1, policy_version 83020 (0.0009) [2023-10-10 08:08:07,858][53252] Updated weights for policy 0, policy_version 83080 (0.0007) [2023-10-10 08:08:08,089][53268] Updated weights for policy 1, policy_version 83030 (0.0007) [2023-10-10 08:08:08,220][53252] Updated weights for policy 0, policy_version 83090 (0.0008) [2023-10-10 08:08:08,457][53268] Updated weights for policy 1, policy_version 83040 (0.0008) [2023-10-10 08:08:08,599][53252] Updated weights for policy 0, policy_version 83100 (0.0009) [2023-10-10 08:08:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170131456. Throughput: 0: 1684.6, 1: 1680.7. Samples: 42547624. 
Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:11,784][52050] Avg episode reward: [(0, '22.610'), (1, '21.500')] [2023-10-10 08:08:12,631][53252] Updated weights for policy 0, policy_version 83110 (0.0008) [2023-10-10 08:08:12,640][53268] Updated weights for policy 1, policy_version 83050 (0.0007) [2023-10-10 08:08:12,996][53252] Updated weights for policy 0, policy_version 83120 (0.0008) [2023-10-10 08:08:13,012][53268] Updated weights for policy 1, policy_version 83060 (0.0010) [2023-10-10 08:08:13,371][53252] Updated weights for policy 0, policy_version 83130 (0.0009) [2023-10-10 08:08:13,382][53268] Updated weights for policy 1, policy_version 83070 (0.0007) [2023-10-10 08:08:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170196992. Throughput: 0: 1676.5, 1: 1674.3. Samples: 42556676. Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:16,784][52050] Avg episode reward: [(0, '20.750'), (1, '21.180')] [2023-10-10 08:08:17,384][53268] Updated weights for policy 1, policy_version 83080 (0.0008) [2023-10-10 08:08:17,445][53252] Updated weights for policy 0, policy_version 83140 (0.0009) [2023-10-10 08:08:17,752][53268] Updated weights for policy 1, policy_version 83090 (0.0008) [2023-10-10 08:08:17,806][53252] Updated weights for policy 0, policy_version 83150 (0.0008) [2023-10-10 08:08:18,110][53268] Updated weights for policy 1, policy_version 83100 (0.0008) [2023-10-10 08:08:18,176][53252] Updated weights for policy 0, policy_version 83160 (0.0008) [2023-10-10 08:08:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170262528. Throughput: 0: 1685.5, 1: 1672.3. Samples: 42577298. 
Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:21,784][52050] Avg episode reward: [(0, '21.210'), (1, '21.790')] [2023-10-10 08:08:22,152][53252] Updated weights for policy 0, policy_version 83170 (0.0008) [2023-10-10 08:08:22,297][53268] Updated weights for policy 1, policy_version 83110 (0.0009) [2023-10-10 08:08:22,529][53252] Updated weights for policy 0, policy_version 83180 (0.0010) [2023-10-10 08:08:22,666][53268] Updated weights for policy 1, policy_version 83120 (0.0009) [2023-10-10 08:08:22,898][53252] Updated weights for policy 0, policy_version 83190 (0.0010) [2023-10-10 08:08:23,034][53268] Updated weights for policy 1, policy_version 83130 (0.0008) [2023-10-10 08:08:23,272][53252] Updated weights for policy 0, policy_version 83200 (0.0008) [2023-10-10 08:08:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 170328064. Throughput: 0: 1688.7, 1: 1676.8. Samples: 42598336. Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:26,785][52050] Avg episode reward: [(0, '21.240'), (1, '21.270')] [2023-10-10 08:08:27,171][53268] Updated weights for policy 1, policy_version 83140 (0.0007) [2023-10-10 08:08:27,292][53252] Updated weights for policy 0, policy_version 83210 (0.0009) [2023-10-10 08:08:27,535][53268] Updated weights for policy 1, policy_version 83150 (0.0008) [2023-10-10 08:08:27,664][53252] Updated weights for policy 0, policy_version 83220 (0.0009) [2023-10-10 08:08:27,909][53268] Updated weights for policy 1, policy_version 83160 (0.0008) [2023-10-10 08:08:28,038][53252] Updated weights for policy 0, policy_version 83230 (0.0010) [2023-10-10 08:08:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170393600. Throughput: 0: 1688.0, 1: 1674.8. Samples: 42607248. 
Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:31,784][52050] Avg episode reward: [(0, '24.260'), (1, '19.830')] [2023-10-10 08:08:32,007][53268] Updated weights for policy 1, policy_version 83170 (0.0009) [2023-10-10 08:08:32,016][53252] Updated weights for policy 0, policy_version 83240 (0.0008) [2023-10-10 08:08:32,381][53268] Updated weights for policy 1, policy_version 83180 (0.0009) [2023-10-10 08:08:32,384][53252] Updated weights for policy 0, policy_version 83250 (0.0007) [2023-10-10 08:08:32,748][53268] Updated weights for policy 1, policy_version 83190 (0.0009) [2023-10-10 08:08:32,753][53252] Updated weights for policy 0, policy_version 83260 (0.0008) [2023-10-10 08:08:33,105][53268] Updated weights for policy 1, policy_version 83200 (0.0009) [2023-10-10 08:08:36,745][53252] Updated weights for policy 0, policy_version 83270 (0.0008) [2023-10-10 08:08:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170459136. Throughput: 0: 1694.1, 1: 1672.9. Samples: 42628028. Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:36,784][52050] Avg episode reward: [(0, '22.320'), (1, '20.150')] [2023-10-10 08:08:37,082][53268] Updated weights for policy 1, policy_version 83210 (0.0007) [2023-10-10 08:08:37,112][53252] Updated weights for policy 0, policy_version 83280 (0.0007) [2023-10-10 08:08:37,444][53268] Updated weights for policy 1, policy_version 83220 (0.0009) [2023-10-10 08:08:37,477][53252] Updated weights for policy 0, policy_version 83290 (0.0009) [2023-10-10 08:08:37,813][53268] Updated weights for policy 1, policy_version 83230 (0.0011) [2023-10-10 08:08:41,555][53252] Updated weights for policy 0, policy_version 83300 (0.0009) [2023-10-10 08:08:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170524672. Throughput: 0: 1694.2, 1: 1670.4. Samples: 42648980. 
Policy #0 lag: (min: 9.0, avg: 16.3, max: 41.0) [2023-10-10 08:08:41,784][52050] Avg episode reward: [(0, '23.450'), (1, '22.250')] [2023-10-10 08:08:41,832][53268] Updated weights for policy 1, policy_version 83240 (0.0009) [2023-10-10 08:08:41,912][53252] Updated weights for policy 0, policy_version 83310 (0.0007) [2023-10-10 08:08:42,210][53268] Updated weights for policy 1, policy_version 83250 (0.0009) [2023-10-10 08:08:42,294][53252] Updated weights for policy 0, policy_version 83320 (0.0007) [2023-10-10 08:08:42,577][53268] Updated weights for policy 1, policy_version 83260 (0.0008) [2023-10-10 08:08:42,586][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000083328_85327872.pth... [2023-10-10 08:08:42,615][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000081728_83689472.pth [2023-10-10 08:08:42,712][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000083264_85262336.pth... [2023-10-10 08:08:42,752][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000081664_83623936.pth [2023-10-10 08:08:46,352][53252] Updated weights for policy 0, policy_version 83330 (0.0008) [2023-10-10 08:08:46,512][53268] Updated weights for policy 1, policy_version 83270 (0.0008) [2023-10-10 08:08:46,752][53252] Updated weights for policy 0, policy_version 83340 (0.0008) [2023-10-10 08:08:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170590208. Throughput: 0: 1695.2, 1: 1668.2. Samples: 42657980. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:08:46,784][52050] Avg episode reward: [(0, '24.480'), (1, '19.540')] [2023-10-10 08:08:46,881][53268] Updated weights for policy 1, policy_version 83280 (0.0008) [2023-10-10 08:08:47,130][53252] Updated weights for policy 0, policy_version 83350 (0.0009) [2023-10-10 08:08:47,240][53268] Updated weights for policy 1, policy_version 83290 (0.0007) [2023-10-10 08:08:47,492][53252] Updated weights for policy 0, policy_version 83360 (0.0008) [2023-10-10 08:08:51,242][53268] Updated weights for policy 1, policy_version 83300 (0.0008) [2023-10-10 08:08:51,543][53252] Updated weights for policy 0, policy_version 83370 (0.0008) [2023-10-10 08:08:51,604][53268] Updated weights for policy 1, policy_version 83310 (0.0007) [2023-10-10 08:08:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170655744. Throughput: 0: 1693.8, 1: 1678.5. Samples: 42678882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:08:51,784][52050] Avg episode reward: [(0, '25.480'), (1, '21.630')] [2023-10-10 08:08:51,912][53252] Updated weights for policy 0, policy_version 83380 (0.0010) [2023-10-10 08:08:51,971][53268] Updated weights for policy 1, policy_version 83320 (0.0007) [2023-10-10 08:08:52,287][53252] Updated weights for policy 0, policy_version 83390 (0.0009) [2023-10-10 08:08:52,359][52846] Saving new best policy, reward=25.480! [2023-10-10 08:08:56,005][53268] Updated weights for policy 1, policy_version 83330 (0.0007) [2023-10-10 08:08:56,371][53268] Updated weights for policy 1, policy_version 83340 (0.0008) [2023-10-10 08:08:56,499][53252] Updated weights for policy 0, policy_version 83400 (0.0009) [2023-10-10 08:08:56,742][53268] Updated weights for policy 1, policy_version 83350 (0.0007) [2023-10-10 08:08:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170721280. Throughput: 0: 1688.7, 1: 1677.5. Samples: 42699100. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:08:56,784][52050] Avg episode reward: [(0, '23.170'), (1, '22.690')] [2023-10-10 08:08:56,866][53252] Updated weights for policy 0, policy_version 83410 (0.0008) [2023-10-10 08:08:57,105][53268] Updated weights for policy 1, policy_version 83360 (0.0008) [2023-10-10 08:08:57,240][53252] Updated weights for policy 0, policy_version 83420 (0.0009) [2023-10-10 08:09:01,254][53268] Updated weights for policy 1, policy_version 83370 (0.0009) [2023-10-10 08:09:01,367][53252] Updated weights for policy 0, policy_version 83430 (0.0009) [2023-10-10 08:09:01,613][53268] Updated weights for policy 1, policy_version 83380 (0.0008) [2023-10-10 08:09:01,741][53252] Updated weights for policy 0, policy_version 83440 (0.0007) [2023-10-10 08:09:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 170786816. Throughput: 0: 1688.0, 1: 1685.6. Samples: 42708488. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:01,784][52050] Avg episode reward: [(0, '26.160'), (1, '20.930')] [2023-10-10 08:09:01,976][53268] Updated weights for policy 1, policy_version 83390 (0.0009) [2023-10-10 08:09:02,117][53252] Updated weights for policy 0, policy_version 83450 (0.0008) [2023-10-10 08:09:02,339][52846] Saving new best policy, reward=26.160! [2023-10-10 08:09:06,062][53268] Updated weights for policy 1, policy_version 83400 (0.0009) [2023-10-10 08:09:06,217][53252] Updated weights for policy 0, policy_version 83460 (0.0010) [2023-10-10 08:09:06,430][53268] Updated weights for policy 1, policy_version 83410 (0.0007) [2023-10-10 08:09:06,587][53252] Updated weights for policy 0, policy_version 83470 (0.0009) [2023-10-10 08:09:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170852352. Throughput: 0: 1684.5, 1: 1684.8. Samples: 42728920. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:06,785][52050] Avg episode reward: [(0, '24.420'), (1, '21.430')] [2023-10-10 08:09:06,798][53268] Updated weights for policy 1, policy_version 83420 (0.0007) [2023-10-10 08:09:06,953][53252] Updated weights for policy 0, policy_version 83480 (0.0008) [2023-10-10 08:09:10,909][53268] Updated weights for policy 1, policy_version 83430 (0.0009) [2023-10-10 08:09:11,053][53252] Updated weights for policy 0, policy_version 83490 (0.0009) [2023-10-10 08:09:11,268][53268] Updated weights for policy 1, policy_version 83440 (0.0009) [2023-10-10 08:09:11,420][53252] Updated weights for policy 0, policy_version 83500 (0.0007) [2023-10-10 08:09:11,633][53268] Updated weights for policy 1, policy_version 83450 (0.0010) [2023-10-10 08:09:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 170917888. Throughput: 0: 1667.1, 1: 1675.6. Samples: 42748756. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:11,784][52050] Avg episode reward: [(0, '21.670'), (1, '21.930')] [2023-10-10 08:09:11,792][53252] Updated weights for policy 0, policy_version 83510 (0.0007) [2023-10-10 08:09:12,156][53252] Updated weights for policy 0, policy_version 83520 (0.0008) [2023-10-10 08:09:15,663][53268] Updated weights for policy 1, policy_version 83460 (0.0007) [2023-10-10 08:09:16,024][53268] Updated weights for policy 1, policy_version 83470 (0.0008) [2023-10-10 08:09:16,177][53252] Updated weights for policy 0, policy_version 83530 (0.0009) [2023-10-10 08:09:16,400][53268] Updated weights for policy 1, policy_version 83480 (0.0009) [2023-10-10 08:09:16,554][53252] Updated weights for policy 0, policy_version 83540 (0.0008) [2023-10-10 08:09:16,783][52050] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 171016192. Throughput: 0: 1679.9, 1: 1687.4. Samples: 42758778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:16,784][52050] Avg episode reward: [(0, '21.960'), (1, '21.150')] [2023-10-10 08:09:16,923][53252] Updated weights for policy 0, policy_version 83550 (0.0009) [2023-10-10 08:09:20,702][53268] Updated weights for policy 1, policy_version 83490 (0.0008) [2023-10-10 08:09:20,901][53252] Updated weights for policy 0, policy_version 83560 (0.0008) [2023-10-10 08:09:21,120][53268] Updated weights for policy 1, policy_version 83500 (0.0009) [2023-10-10 08:09:21,273][53252] Updated weights for policy 0, policy_version 83570 (0.0008) [2023-10-10 08:09:21,478][53268] Updated weights for policy 1, policy_version 83510 (0.0007) [2023-10-10 08:09:21,642][53252] Updated weights for policy 0, policy_version 83580 (0.0010) [2023-10-10 08:09:21,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171081728. Throughput: 0: 1681.0, 1: 1685.6. Samples: 42779526. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:21,784][52050] Avg episode reward: [(0, '20.560'), (1, '21.830')] [2023-10-10 08:09:21,854][53268] Updated weights for policy 1, policy_version 83520 (0.0007) [2023-10-10 08:09:25,686][53252] Updated weights for policy 0, policy_version 83590 (0.0009) [2023-10-10 08:09:25,865][53268] Updated weights for policy 1, policy_version 83530 (0.0009) [2023-10-10 08:09:26,054][53252] Updated weights for policy 0, policy_version 83600 (0.0007) [2023-10-10 08:09:26,233][53268] Updated weights for policy 1, policy_version 83540 (0.0009) [2023-10-10 08:09:26,426][53252] Updated weights for policy 0, policy_version 83610 (0.0007) [2023-10-10 08:09:26,594][53268] Updated weights for policy 1, policy_version 83550 (0.0008) [2023-10-10 08:09:26,783][52050] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171180032. Throughput: 0: 1660.9, 1: 1663.2. Samples: 42798566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:26,784][52050] Avg episode reward: [(0, '21.010'), (1, '23.300')] [2023-10-10 08:09:30,517][53252] Updated weights for policy 0, policy_version 83620 (0.0007) [2023-10-10 08:09:30,628][53268] Updated weights for policy 1, policy_version 83560 (0.0009) [2023-10-10 08:09:30,896][53252] Updated weights for policy 0, policy_version 83630 (0.0008) [2023-10-10 08:09:30,995][53268] Updated weights for policy 1, policy_version 83570 (0.0008) [2023-10-10 08:09:31,260][53252] Updated weights for policy 0, policy_version 83640 (0.0010) [2023-10-10 08:09:31,360][53268] Updated weights for policy 1, policy_version 83580 (0.0008) [2023-10-10 08:09:31,783][52050] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171245568. Throughput: 0: 1683.5, 1: 1678.2. Samples: 42809256. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:09:31,785][52050] Avg episode reward: [(0, '21.630'), (1, '22.740')] [2023-10-10 08:09:35,412][53252] Updated weights for policy 0, policy_version 83650 (0.0009) [2023-10-10 08:09:35,452][53268] Updated weights for policy 1, policy_version 83590 (0.0008) [2023-10-10 08:09:35,776][53252] Updated weights for policy 0, policy_version 83660 (0.0007) [2023-10-10 08:09:35,815][53268] Updated weights for policy 1, policy_version 83600 (0.0010) [2023-10-10 08:09:36,158][53252] Updated weights for policy 0, policy_version 83670 (0.0008) [2023-10-10 08:09:36,183][53268] Updated weights for policy 1, policy_version 83610 (0.0010) [2023-10-10 08:09:36,528][53252] Updated weights for policy 0, policy_version 83680 (0.0011) [2023-10-10 08:09:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171311104. Throughput: 0: 1678.1, 1: 1668.7. Samples: 42829490. 
Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:09:36,784][52050] Avg episode reward: [(0, '22.330'), (1, '21.440')] [2023-10-10 08:09:40,301][53268] Updated weights for policy 1, policy_version 83620 (0.0010) [2023-10-10 08:09:40,502][53252] Updated weights for policy 0, policy_version 83690 (0.0008) [2023-10-10 08:09:40,668][53268] Updated weights for policy 1, policy_version 83630 (0.0009) [2023-10-10 08:09:40,883][53252] Updated weights for policy 0, policy_version 83700 (0.0007) [2023-10-10 08:09:41,046][53268] Updated weights for policy 1, policy_version 83640 (0.0008) [2023-10-10 08:09:41,245][53252] Updated weights for policy 0, policy_version 83710 (0.0008) [2023-10-10 08:09:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171376640. Throughput: 0: 1653.3, 1: 1653.6. Samples: 42847910. Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:09:41,784][52050] Avg episode reward: [(0, '21.900'), (1, '21.210')] [2023-10-10 08:09:45,068][53252] Updated weights for policy 0, policy_version 83720 (0.0007) [2023-10-10 08:09:45,078][53268] Updated weights for policy 1, policy_version 83650 (0.0008) [2023-10-10 08:09:45,442][53268] Updated weights for policy 1, policy_version 83660 (0.0009) [2023-10-10 08:09:45,444][53252] Updated weights for policy 0, policy_version 83730 (0.0007) [2023-10-10 08:09:45,803][53268] Updated weights for policy 1, policy_version 83670 (0.0009) [2023-10-10 08:09:45,808][53252] Updated weights for policy 0, policy_version 83740 (0.0009) [2023-10-10 08:09:46,173][53268] Updated weights for policy 1, policy_version 83680 (0.0009) [2023-10-10 08:09:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 171442176. Throughput: 0: 1683.4, 1: 1674.3. Samples: 42859588. 
Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:09:46,784][52050] Avg episode reward: [(0, '22.870'), (1, '19.870')] [2023-10-10 08:09:50,017][53252] Updated weights for policy 0, policy_version 83750 (0.0008) [2023-10-10 08:09:50,347][53268] Updated weights for policy 1, policy_version 83690 (0.0009) [2023-10-10 08:09:50,382][53252] Updated weights for policy 0, policy_version 83760 (0.0009) [2023-10-10 08:09:50,716][53268] Updated weights for policy 1, policy_version 83700 (0.0010) [2023-10-10 08:09:50,750][53252] Updated weights for policy 0, policy_version 83770 (0.0008) [2023-10-10 08:09:51,078][53268] Updated weights for policy 1, policy_version 83710 (0.0008) [2023-10-10 08:09:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 171507712. Throughput: 0: 1675.3, 1: 1668.1. Samples: 42879374. Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:09:51,784][52050] Avg episode reward: [(0, '24.660'), (1, '19.650')] [2023-10-10 08:09:54,787][53252] Updated weights for policy 0, policy_version 83780 (0.0009) [2023-10-10 08:09:55,158][53252] Updated weights for policy 0, policy_version 83790 (0.0010) [2023-10-10 08:09:55,165][53268] Updated weights for policy 1, policy_version 83720 (0.0008) [2023-10-10 08:09:55,519][53252] Updated weights for policy 0, policy_version 83800 (0.0009) [2023-10-10 08:09:55,535][53268] Updated weights for policy 1, policy_version 83730 (0.0010) [2023-10-10 08:09:55,890][53268] Updated weights for policy 1, policy_version 83740 (0.0008) [2023-10-10 08:09:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171573248. Throughput: 0: 1671.2, 1: 1654.5. Samples: 42898412. 
Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:09:56,784][52050] Avg episode reward: [(0, '23.480'), (1, '22.370')] [2023-10-10 08:09:59,537][53252] Updated weights for policy 0, policy_version 83810 (0.0008) [2023-10-10 08:09:59,901][53252] Updated weights for policy 0, policy_version 83820 (0.0008) [2023-10-10 08:09:59,994][53268] Updated weights for policy 1, policy_version 83750 (0.0008) [2023-10-10 08:10:00,279][53252] Updated weights for policy 0, policy_version 83830 (0.0008) [2023-10-10 08:10:00,356][53268] Updated weights for policy 1, policy_version 83760 (0.0008) [2023-10-10 08:10:00,649][53252] Updated weights for policy 0, policy_version 83840 (0.0010) [2023-10-10 08:10:00,720][53268] Updated weights for policy 1, policy_version 83770 (0.0008) [2023-10-10 08:10:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 171638784. Throughput: 0: 1694.2, 1: 1673.6. Samples: 42910328. Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:10:01,784][52050] Avg episode reward: [(0, '22.820'), (1, '21.680')] [2023-10-10 08:10:04,693][53252] Updated weights for policy 0, policy_version 83850 (0.0007) [2023-10-10 08:10:04,876][53268] Updated weights for policy 1, policy_version 83780 (0.0009) [2023-10-10 08:10:05,055][53252] Updated weights for policy 0, policy_version 83860 (0.0010) [2023-10-10 08:10:05,230][53268] Updated weights for policy 1, policy_version 83790 (0.0010) [2023-10-10 08:10:05,424][53252] Updated weights for policy 0, policy_version 83870 (0.0009) [2023-10-10 08:10:05,595][53268] Updated weights for policy 1, policy_version 83800 (0.0010) [2023-10-10 08:10:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171704320. Throughput: 0: 1667.4, 1: 1662.8. Samples: 42929384. 
Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:10:06,784][52050] Avg episode reward: [(0, '21.530'), (1, '21.160')] [2023-10-10 08:10:09,588][53252] Updated weights for policy 0, policy_version 83880 (0.0008) [2023-10-10 08:10:09,634][53268] Updated weights for policy 1, policy_version 83810 (0.0009) [2023-10-10 08:10:09,956][53252] Updated weights for policy 0, policy_version 83890 (0.0007) [2023-10-10 08:10:10,039][53268] Updated weights for policy 1, policy_version 83820 (0.0008) [2023-10-10 08:10:10,334][53252] Updated weights for policy 0, policy_version 83900 (0.0009) [2023-10-10 08:10:10,404][53268] Updated weights for policy 1, policy_version 83830 (0.0009) [2023-10-10 08:10:10,773][53268] Updated weights for policy 1, policy_version 83840 (0.0008) [2023-10-10 08:10:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 171769856. Throughput: 0: 1679.7, 1: 1665.2. Samples: 42949088. Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:10:11,785][52050] Avg episode reward: [(0, '20.070'), (1, '22.890')] [2023-10-10 08:10:14,438][53252] Updated weights for policy 0, policy_version 83910 (0.0007) [2023-10-10 08:10:14,812][53252] Updated weights for policy 0, policy_version 83920 (0.0009) [2023-10-10 08:10:14,851][53268] Updated weights for policy 1, policy_version 83850 (0.0008) [2023-10-10 08:10:15,180][53252] Updated weights for policy 0, policy_version 83930 (0.0011) [2023-10-10 08:10:15,217][53268] Updated weights for policy 1, policy_version 83860 (0.0009) [2023-10-10 08:10:15,588][53268] Updated weights for policy 1, policy_version 83870 (0.0009) [2023-10-10 08:10:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171835392. Throughput: 0: 1681.0, 1: 1683.7. Samples: 42960666. 
Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:10:16,785][52050] Avg episode reward: [(0, '20.750'), (1, '20.720')] [2023-10-10 08:10:19,201][53252] Updated weights for policy 0, policy_version 83940 (0.0010) [2023-10-10 08:10:19,577][53252] Updated weights for policy 0, policy_version 83950 (0.0008) [2023-10-10 08:10:19,690][53268] Updated weights for policy 1, policy_version 83880 (0.0009) [2023-10-10 08:10:19,943][53252] Updated weights for policy 0, policy_version 83960 (0.0008) [2023-10-10 08:10:20,055][53268] Updated weights for policy 1, policy_version 83890 (0.0009) [2023-10-10 08:10:20,422][53268] Updated weights for policy 1, policy_version 83900 (0.0009) [2023-10-10 08:10:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171900928. Throughput: 0: 1665.5, 1: 1668.7. Samples: 42979528. Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:10:21,784][52050] Avg episode reward: [(0, '22.620'), (1, '21.510')] [2023-10-10 08:10:24,009][53252] Updated weights for policy 0, policy_version 83970 (0.0009) [2023-10-10 08:10:24,401][53252] Updated weights for policy 0, policy_version 83980 (0.0008) [2023-10-10 08:10:24,411][53268] Updated weights for policy 1, policy_version 83910 (0.0009) [2023-10-10 08:10:24,776][53268] Updated weights for policy 1, policy_version 83920 (0.0009) [2023-10-10 08:10:24,778][53252] Updated weights for policy 0, policy_version 83990 (0.0009) [2023-10-10 08:10:25,133][53268] Updated weights for policy 1, policy_version 83930 (0.0011) [2023-10-10 08:10:25,138][53252] Updated weights for policy 0, policy_version 84000 (0.0010) [2023-10-10 08:10:26,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 171966464. Throughput: 0: 1693.9, 1: 1679.7. Samples: 42999722. 
Policy #0 lag: (min: 15.0, avg: 15.2, max: 24.0) [2023-10-10 08:10:26,785][52050] Avg episode reward: [(0, '23.660'), (1, '22.160')] [2023-10-10 08:10:29,130][53268] Updated weights for policy 1, policy_version 83940 (0.0010) [2023-10-10 08:10:29,267][53252] Updated weights for policy 0, policy_version 84010 (0.0008) [2023-10-10 08:10:29,484][53268] Updated weights for policy 1, policy_version 83950 (0.0008) [2023-10-10 08:10:29,634][53252] Updated weights for policy 0, policy_version 84020 (0.0009) [2023-10-10 08:10:29,846][53268] Updated weights for policy 1, policy_version 83960 (0.0008) [2023-10-10 08:10:30,002][53252] Updated weights for policy 0, policy_version 84030 (0.0009) [2023-10-10 08:10:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 172032000. Throughput: 0: 1676.8, 1: 1676.5. Samples: 43010484. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:10:31,785][52050] Avg episode reward: [(0, '23.020'), (1, '21.610')] [2023-10-10 08:10:33,948][53268] Updated weights for policy 1, policy_version 83970 (0.0007) [2023-10-10 08:10:34,206][53252] Updated weights for policy 0, policy_version 84040 (0.0007) [2023-10-10 08:10:34,315][53268] Updated weights for policy 1, policy_version 83980 (0.0007) [2023-10-10 08:10:34,581][53252] Updated weights for policy 0, policy_version 84050 (0.0007) [2023-10-10 08:10:34,675][53268] Updated weights for policy 1, policy_version 83990 (0.0009) [2023-10-10 08:10:34,941][53252] Updated weights for policy 0, policy_version 84060 (0.0010) [2023-10-10 08:10:35,033][53268] Updated weights for policy 1, policy_version 84000 (0.0009) [2023-10-10 08:10:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 172097536. Throughput: 0: 1667.8, 1: 1667.9. Samples: 43029480. 
Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:10:36,784][52050] Avg episode reward: [(0, '23.240'), (1, '22.220')] [2023-10-10 08:10:38,924][53252] Updated weights for policy 0, policy_version 84070 (0.0007) [2023-10-10 08:10:39,162][53268] Updated weights for policy 1, policy_version 84010 (0.0008) [2023-10-10 08:10:39,287][53252] Updated weights for policy 0, policy_version 84080 (0.0009) [2023-10-10 08:10:39,541][53268] Updated weights for policy 1, policy_version 84020 (0.0010) [2023-10-10 08:10:39,659][53252] Updated weights for policy 0, policy_version 84090 (0.0008) [2023-10-10 08:10:39,914][53268] Updated weights for policy 1, policy_version 84030 (0.0008) [2023-10-10 08:10:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 172163072. Throughput: 0: 1684.2, 1: 1684.8. Samples: 43050020. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:10:41,784][52050] Avg episode reward: [(0, '20.790'), (1, '22.250')] [2023-10-10 08:10:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000084032_86048768.pth... [2023-10-10 08:10:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000084096_86114304.pth... 
[2023-10-10 08:10:41,827][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000082528_84508672.pth [2023-10-10 08:10:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000082464_84443136.pth [2023-10-10 08:10:43,684][53252] Updated weights for policy 0, policy_version 84100 (0.0008) [2023-10-10 08:10:43,972][53268] Updated weights for policy 1, policy_version 84040 (0.0008) [2023-10-10 08:10:44,047][53252] Updated weights for policy 0, policy_version 84110 (0.0009) [2023-10-10 08:10:44,329][53268] Updated weights for policy 1, policy_version 84050 (0.0007) [2023-10-10 08:10:44,418][53252] Updated weights for policy 0, policy_version 84120 (0.0009) [2023-10-10 08:10:44,694][53268] Updated weights for policy 1, policy_version 84060 (0.0008) [2023-10-10 08:10:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 172228608. Throughput: 0: 1661.6, 1: 1675.2. Samples: 43060484. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:10:46,784][52050] Avg episode reward: [(0, '22.300'), (1, '21.220')] [2023-10-10 08:10:48,630][53252] Updated weights for policy 0, policy_version 84130 (0.0007) [2023-10-10 08:10:48,772][53268] Updated weights for policy 1, policy_version 84070 (0.0009) [2023-10-10 08:10:49,004][53252] Updated weights for policy 0, policy_version 84140 (0.0007) [2023-10-10 08:10:49,133][53268] Updated weights for policy 1, policy_version 84080 (0.0007) [2023-10-10 08:10:49,366][53252] Updated weights for policy 0, policy_version 84150 (0.0011) [2023-10-10 08:10:49,504][53268] Updated weights for policy 1, policy_version 84090 (0.0007) [2023-10-10 08:10:49,734][53252] Updated weights for policy 0, policy_version 84160 (0.0009) [2023-10-10 08:10:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172294144. Throughput: 0: 1672.5, 1: 1673.4. Samples: 43079950. 
Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:10:51,784][52050] Avg episode reward: [(0, '22.210'), (1, '23.470')] [2023-10-10 08:10:53,711][53268] Updated weights for policy 1, policy_version 84100 (0.0008) [2023-10-10 08:10:53,792][53252] Updated weights for policy 0, policy_version 84170 (0.0007) [2023-10-10 08:10:54,081][53268] Updated weights for policy 1, policy_version 84110 (0.0009) [2023-10-10 08:10:54,171][53252] Updated weights for policy 0, policy_version 84180 (0.0007) [2023-10-10 08:10:54,451][53268] Updated weights for policy 1, policy_version 84120 (0.0007) [2023-10-10 08:10:54,547][53252] Updated weights for policy 0, policy_version 84190 (0.0007) [2023-10-10 08:10:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172359680. Throughput: 0: 1678.8, 1: 1688.2. Samples: 43100602. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:10:56,784][52050] Avg episode reward: [(0, '22.570'), (1, '24.970')] [2023-10-10 08:10:58,295][53268] Updated weights for policy 1, policy_version 84130 (0.0009) [2023-10-10 08:10:58,624][53252] Updated weights for policy 0, policy_version 84200 (0.0008) [2023-10-10 08:10:58,705][53268] Updated weights for policy 1, policy_version 84140 (0.0009) [2023-10-10 08:10:58,990][53252] Updated weights for policy 0, policy_version 84210 (0.0007) [2023-10-10 08:10:59,074][53268] Updated weights for policy 1, policy_version 84150 (0.0007) [2023-10-10 08:10:59,366][53252] Updated weights for policy 0, policy_version 84220 (0.0007) [2023-10-10 08:10:59,448][53268] Updated weights for policy 1, policy_version 84160 (0.0007) [2023-10-10 08:11:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 172425216. Throughput: 0: 1661.2, 1: 1661.4. Samples: 43110182. 
Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:11:01,784][52050] Avg episode reward: [(0, '23.140'), (1, '22.800')] [2023-10-10 08:11:03,265][53252] Updated weights for policy 0, policy_version 84230 (0.0008) [2023-10-10 08:11:03,369][53268] Updated weights for policy 1, policy_version 84170 (0.0009) [2023-10-10 08:11:03,643][53252] Updated weights for policy 0, policy_version 84240 (0.0009) [2023-10-10 08:11:03,732][53268] Updated weights for policy 1, policy_version 84180 (0.0008) [2023-10-10 08:11:04,023][53252] Updated weights for policy 0, policy_version 84250 (0.0009) [2023-10-10 08:11:04,094][53268] Updated weights for policy 1, policy_version 84190 (0.0008) [2023-10-10 08:11:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172490752. Throughput: 0: 1675.7, 1: 1679.3. Samples: 43130504. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:11:06,784][52050] Avg episode reward: [(0, '23.240'), (1, '23.520')] [2023-10-10 08:11:08,161][53252] Updated weights for policy 0, policy_version 84260 (0.0008) [2023-10-10 08:11:08,213][53268] Updated weights for policy 1, policy_version 84200 (0.0009) [2023-10-10 08:11:08,530][53252] Updated weights for policy 0, policy_version 84270 (0.0007) [2023-10-10 08:11:08,584][53268] Updated weights for policy 1, policy_version 84210 (0.0008) [2023-10-10 08:11:08,894][53252] Updated weights for policy 0, policy_version 84280 (0.0008) [2023-10-10 08:11:08,943][53268] Updated weights for policy 1, policy_version 84220 (0.0008) [2023-10-10 08:11:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172556288. Throughput: 0: 1678.5, 1: 1688.0. Samples: 43151212. 
Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:11:11,785][52050] Avg episode reward: [(0, '23.500'), (1, '21.850')] [2023-10-10 08:11:12,910][53252] Updated weights for policy 0, policy_version 84290 (0.0007) [2023-10-10 08:11:13,062][53268] Updated weights for policy 1, policy_version 84230 (0.0009) [2023-10-10 08:11:13,309][53252] Updated weights for policy 0, policy_version 84300 (0.0008) [2023-10-10 08:11:13,428][53268] Updated weights for policy 1, policy_version 84240 (0.0009) [2023-10-10 08:11:13,685][53252] Updated weights for policy 0, policy_version 84310 (0.0008) [2023-10-10 08:11:13,795][53268] Updated weights for policy 1, policy_version 84250 (0.0009) [2023-10-10 08:11:14,061][53252] Updated weights for policy 0, policy_version 84320 (0.0010) [2023-10-10 08:11:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172621824. Throughput: 0: 1662.2, 1: 1662.8. Samples: 43160108. Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:11:16,784][52050] Avg episode reward: [(0, '24.050'), (1, '20.600')] [2023-10-10 08:11:17,933][53268] Updated weights for policy 1, policy_version 84260 (0.0009) [2023-10-10 08:11:18,285][53252] Updated weights for policy 0, policy_version 84330 (0.0007) [2023-10-10 08:11:18,288][53268] Updated weights for policy 1, policy_version 84270 (0.0009) [2023-10-10 08:11:18,652][53252] Updated weights for policy 0, policy_version 84340 (0.0008) [2023-10-10 08:11:18,656][53268] Updated weights for policy 1, policy_version 84280 (0.0009) [2023-10-10 08:11:19,015][53252] Updated weights for policy 0, policy_version 84350 (0.0007) [2023-10-10 08:11:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172687360. Throughput: 0: 1681.7, 1: 1682.8. Samples: 43180884. 
Policy #0 lag: (min: 22.0, avg: 23.5, max: 48.0) [2023-10-10 08:11:21,784][52050] Avg episode reward: [(0, '23.440'), (1, '20.800')] [2023-10-10 08:11:22,589][53268] Updated weights for policy 1, policy_version 84290 (0.0008) [2023-10-10 08:11:22,959][53268] Updated weights for policy 1, policy_version 84300 (0.0010) [2023-10-10 08:11:23,097][53252] Updated weights for policy 0, policy_version 84360 (0.0007) [2023-10-10 08:11:23,320][53268] Updated weights for policy 1, policy_version 84310 (0.0009) [2023-10-10 08:11:23,453][53252] Updated weights for policy 0, policy_version 84370 (0.0007) [2023-10-10 08:11:23,689][53268] Updated weights for policy 1, policy_version 84320 (0.0009) [2023-10-10 08:11:23,825][53252] Updated weights for policy 0, policy_version 84380 (0.0008) [2023-10-10 08:11:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 172752896. Throughput: 0: 1679.2, 1: 1689.3. Samples: 43201600. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:26,784][52050] Avg episode reward: [(0, '23.360'), (1, '22.140')] [2023-10-10 08:11:27,712][53268] Updated weights for policy 1, policy_version 84330 (0.0008) [2023-10-10 08:11:27,901][53252] Updated weights for policy 0, policy_version 84390 (0.0008) [2023-10-10 08:11:28,087][53268] Updated weights for policy 1, policy_version 84340 (0.0008) [2023-10-10 08:11:28,272][53252] Updated weights for policy 0, policy_version 84400 (0.0009) [2023-10-10 08:11:28,457][53268] Updated weights for policy 1, policy_version 84350 (0.0009) [2023-10-10 08:11:28,629][53252] Updated weights for policy 0, policy_version 84410 (0.0008) [2023-10-10 08:11:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172818432. Throughput: 0: 1669.1, 1: 1671.5. Samples: 43210808. 
Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:31,784][52050] Avg episode reward: [(0, '26.420'), (1, '20.620')] [2023-10-10 08:11:31,785][52846] Saving new best policy, reward=26.420! [2023-10-10 08:11:32,440][53268] Updated weights for policy 1, policy_version 84360 (0.0008) [2023-10-10 08:11:32,806][53252] Updated weights for policy 0, policy_version 84420 (0.0007) [2023-10-10 08:11:32,810][53268] Updated weights for policy 1, policy_version 84370 (0.0008) [2023-10-10 08:11:33,173][53268] Updated weights for policy 1, policy_version 84380 (0.0009) [2023-10-10 08:11:33,181][53252] Updated weights for policy 0, policy_version 84430 (0.0007) [2023-10-10 08:11:33,561][53252] Updated weights for policy 0, policy_version 84440 (0.0009) [2023-10-10 08:11:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 172883968. Throughput: 0: 1678.6, 1: 1689.2. Samples: 43231502. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:36,784][52050] Avg episode reward: [(0, '24.190'), (1, '22.230')] [2023-10-10 08:11:37,406][53268] Updated weights for policy 1, policy_version 84390 (0.0007) [2023-10-10 08:11:37,527][53252] Updated weights for policy 0, policy_version 84450 (0.0008) [2023-10-10 08:11:37,765][53268] Updated weights for policy 1, policy_version 84400 (0.0007) [2023-10-10 08:11:37,893][53252] Updated weights for policy 0, policy_version 84460 (0.0007) [2023-10-10 08:11:38,116][53268] Updated weights for policy 1, policy_version 84410 (0.0008) [2023-10-10 08:11:38,267][53252] Updated weights for policy 0, policy_version 84470 (0.0007) [2023-10-10 08:11:38,631][53252] Updated weights for policy 0, policy_version 84480 (0.0008) [2023-10-10 08:11:41,784][52050] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 172949504. Throughput: 0: 1681.6, 1: 1693.5. Samples: 43252484. 
Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:41,785][52050] Avg episode reward: [(0, '23.070'), (1, '24.530')] [2023-10-10 08:11:42,234][53268] Updated weights for policy 1, policy_version 84420 (0.0010) [2023-10-10 08:11:42,599][53268] Updated weights for policy 1, policy_version 84430 (0.0009) [2023-10-10 08:11:42,686][53252] Updated weights for policy 0, policy_version 84490 (0.0007) [2023-10-10 08:11:42,958][53268] Updated weights for policy 1, policy_version 84440 (0.0008) [2023-10-10 08:11:43,068][53252] Updated weights for policy 0, policy_version 84500 (0.0008) [2023-10-10 08:11:43,444][53252] Updated weights for policy 0, policy_version 84510 (0.0007) [2023-10-10 08:11:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173015040. Throughput: 0: 1677.3, 1: 1689.4. Samples: 43261680. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:46,784][52050] Avg episode reward: [(0, '21.990'), (1, '23.580')] [2023-10-10 08:11:47,072][53268] Updated weights for policy 1, policy_version 84450 (0.0008) [2023-10-10 08:11:47,450][53268] Updated weights for policy 1, policy_version 84460 (0.0010) [2023-10-10 08:11:47,562][53252] Updated weights for policy 0, policy_version 84520 (0.0008) [2023-10-10 08:11:47,807][53268] Updated weights for policy 1, policy_version 84470 (0.0009) [2023-10-10 08:11:47,934][53252] Updated weights for policy 0, policy_version 84530 (0.0009) [2023-10-10 08:11:48,179][53268] Updated weights for policy 1, policy_version 84480 (0.0008) [2023-10-10 08:11:48,302][53252] Updated weights for policy 0, policy_version 84540 (0.0008) [2023-10-10 08:11:51,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173080576. Throughput: 0: 1687.5, 1: 1689.6. Samples: 43282470. 
Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:51,784][52050] Avg episode reward: [(0, '19.420'), (1, '22.840')] [2023-10-10 08:11:52,245][53252] Updated weights for policy 0, policy_version 84550 (0.0008) [2023-10-10 08:11:52,324][53268] Updated weights for policy 1, policy_version 84490 (0.0008) [2023-10-10 08:11:52,626][53252] Updated weights for policy 0, policy_version 84560 (0.0010) [2023-10-10 08:11:52,689][53268] Updated weights for policy 1, policy_version 84500 (0.0008) [2023-10-10 08:11:52,998][53252] Updated weights for policy 0, policy_version 84570 (0.0008) [2023-10-10 08:11:53,045][53268] Updated weights for policy 1, policy_version 84510 (0.0008) [2023-10-10 08:11:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173146112. Throughput: 0: 1684.0, 1: 1694.0. Samples: 43303220. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:11:56,784][52050] Avg episode reward: [(0, '20.480'), (1, '22.340')] [2023-10-10 08:11:57,114][53268] Updated weights for policy 1, policy_version 84520 (0.0007) [2023-10-10 08:11:57,143][53252] Updated weights for policy 0, policy_version 84580 (0.0007) [2023-10-10 08:11:57,481][53268] Updated weights for policy 1, policy_version 84530 (0.0010) [2023-10-10 08:11:57,502][53252] Updated weights for policy 0, policy_version 84590 (0.0007) [2023-10-10 08:11:57,844][53268] Updated weights for policy 1, policy_version 84540 (0.0010) [2023-10-10 08:11:57,870][53252] Updated weights for policy 0, policy_version 84600 (0.0009) [2023-10-10 08:12:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173211648. Throughput: 0: 1686.2, 1: 1696.3. Samples: 43312320. 
Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:12:01,784][52050] Avg episode reward: [(0, '20.700'), (1, '22.480')] [2023-10-10 08:12:01,884][53268] Updated weights for policy 1, policy_version 84550 (0.0009) [2023-10-10 08:12:02,061][53252] Updated weights for policy 0, policy_version 84610 (0.0010) [2023-10-10 08:12:02,246][53268] Updated weights for policy 1, policy_version 84560 (0.0007) [2023-10-10 08:12:02,438][53252] Updated weights for policy 0, policy_version 84620 (0.0008) [2023-10-10 08:12:02,611][53268] Updated weights for policy 1, policy_version 84570 (0.0007) [2023-10-10 08:12:02,812][53252] Updated weights for policy 0, policy_version 84630 (0.0007) [2023-10-10 08:12:03,187][53252] Updated weights for policy 0, policy_version 84640 (0.0008) [2023-10-10 08:12:06,712][53268] Updated weights for policy 1, policy_version 84580 (0.0008) [2023-10-10 08:12:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173277184. Throughput: 0: 1682.7, 1: 1691.3. Samples: 43332714. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:12:06,784][52050] Avg episode reward: [(0, '21.530'), (1, '21.070')] [2023-10-10 08:12:07,076][53268] Updated weights for policy 1, policy_version 84590 (0.0008) [2023-10-10 08:12:07,161][53252] Updated weights for policy 0, policy_version 84650 (0.0008) [2023-10-10 08:12:07,438][53268] Updated weights for policy 1, policy_version 84600 (0.0008) [2023-10-10 08:12:07,535][53252] Updated weights for policy 0, policy_version 84660 (0.0009) [2023-10-10 08:12:07,897][53252] Updated weights for policy 0, policy_version 84670 (0.0008) [2023-10-10 08:12:11,364][53268] Updated weights for policy 1, policy_version 84610 (0.0009) [2023-10-10 08:12:11,729][53268] Updated weights for policy 1, policy_version 84620 (0.0010) [2023-10-10 08:12:11,784][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 173342720. Throughput: 0: 1685.9, 1: 1690.2. 
Samples: 43353524. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:12:11,785][52050] Avg episode reward: [(0, '22.420'), (1, '22.130')] [2023-10-10 08:12:11,989][53252] Updated weights for policy 0, policy_version 84680 (0.0009) [2023-10-10 08:12:12,095][53268] Updated weights for policy 1, policy_version 84630 (0.0007) [2023-10-10 08:12:12,360][53252] Updated weights for policy 0, policy_version 84690 (0.0009) [2023-10-10 08:12:12,467][53268] Updated weights for policy 1, policy_version 84640 (0.0009) [2023-10-10 08:12:12,732][53252] Updated weights for policy 0, policy_version 84700 (0.0009) [2023-10-10 08:12:16,565][53268] Updated weights for policy 1, policy_version 84650 (0.0007) [2023-10-10 08:12:16,688][53252] Updated weights for policy 0, policy_version 84710 (0.0008) [2023-10-10 08:12:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173408256. Throughput: 0: 1684.9, 1: 1688.1. Samples: 43362594. Policy #0 lag: (min: 24.0, avg: 51.0, max: 56.0) [2023-10-10 08:12:16,784][52050] Avg episode reward: [(0, '23.180'), (1, '23.200')] [2023-10-10 08:12:16,925][53268] Updated weights for policy 1, policy_version 84660 (0.0009) [2023-10-10 08:12:17,051][53252] Updated weights for policy 0, policy_version 84720 (0.0008) [2023-10-10 08:12:17,299][53268] Updated weights for policy 1, policy_version 84670 (0.0007) [2023-10-10 08:12:17,424][53252] Updated weights for policy 0, policy_version 84730 (0.0009) [2023-10-10 08:12:21,326][53268] Updated weights for policy 1, policy_version 84680 (0.0011) [2023-10-10 08:12:21,512][53252] Updated weights for policy 0, policy_version 84740 (0.0007) [2023-10-10 08:12:21,691][53268] Updated weights for policy 1, policy_version 84690 (0.0007) [2023-10-10 08:12:21,783][52050] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 173473792. Throughput: 0: 1688.4, 1: 1688.2. Samples: 43383448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:21,784][52050] Avg episode reward: [(0, '23.250'), (1, '22.640')] [2023-10-10 08:12:21,879][53252] Updated weights for policy 0, policy_version 84750 (0.0007) [2023-10-10 08:12:22,050][53268] Updated weights for policy 1, policy_version 84700 (0.0008) [2023-10-10 08:12:22,249][53252] Updated weights for policy 0, policy_version 84760 (0.0007) [2023-10-10 08:12:26,085][53268] Updated weights for policy 1, policy_version 84710 (0.0010) [2023-10-10 08:12:26,284][53252] Updated weights for policy 0, policy_version 84770 (0.0008) [2023-10-10 08:12:26,454][53268] Updated weights for policy 1, policy_version 84720 (0.0008) [2023-10-10 08:12:26,644][53252] Updated weights for policy 0, policy_version 84780 (0.0009) [2023-10-10 08:12:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 173539328. Throughput: 0: 1678.4, 1: 1679.5. Samples: 43403590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:26,785][52050] Avg episode reward: [(0, '23.530'), (1, '21.690')] [2023-10-10 08:12:26,817][53268] Updated weights for policy 1, policy_version 84730 (0.0007) [2023-10-10 08:12:27,012][53252] Updated weights for policy 0, policy_version 84790 (0.0008) [2023-10-10 08:12:27,379][53252] Updated weights for policy 0, policy_version 84800 (0.0008) [2023-10-10 08:12:30,678][53268] Updated weights for policy 1, policy_version 84740 (0.0007) [2023-10-10 08:12:31,047][53268] Updated weights for policy 1, policy_version 84750 (0.0009) [2023-10-10 08:12:31,383][53252] Updated weights for policy 0, policy_version 84810 (0.0008) [2023-10-10 08:12:31,411][53268] Updated weights for policy 1, policy_version 84760 (0.0008) [2023-10-10 08:12:31,746][53252] Updated weights for policy 0, policy_version 84820 (0.0007) [2023-10-10 08:12:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 173637632. Throughput: 0: 1682.7, 1: 1690.1. 
Samples: 43413458. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:31,784][52050] Avg episode reward: [(0, '23.500'), (1, '23.010')] [2023-10-10 08:12:32,116][53252] Updated weights for policy 0, policy_version 84830 (0.0009) [2023-10-10 08:12:35,634][53268] Updated weights for policy 1, policy_version 84770 (0.0008) [2023-10-10 08:12:36,014][53268] Updated weights for policy 1, policy_version 84780 (0.0009) [2023-10-10 08:12:36,169][53252] Updated weights for policy 0, policy_version 84840 (0.0008) [2023-10-10 08:12:36,380][53268] Updated weights for policy 1, policy_version 84790 (0.0009) [2023-10-10 08:12:36,543][53252] Updated weights for policy 0, policy_version 84850 (0.0008) [2023-10-10 08:12:36,737][53268] Updated weights for policy 1, policy_version 84800 (0.0007) [2023-10-10 08:12:36,783][52050] Fps is (10 sec: 16384.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173703168. Throughput: 0: 1679.4, 1: 1690.4. Samples: 43434112. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:36,784][52050] Avg episode reward: [(0, '23.370'), (1, '22.120')] [2023-10-10 08:12:36,908][53252] Updated weights for policy 0, policy_version 84860 (0.0008) [2023-10-10 08:12:40,845][53268] Updated weights for policy 1, policy_version 84810 (0.0009) [2023-10-10 08:12:40,881][53252] Updated weights for policy 0, policy_version 84870 (0.0009) [2023-10-10 08:12:41,222][53268] Updated weights for policy 1, policy_version 84820 (0.0008) [2023-10-10 08:12:41,253][53252] Updated weights for policy 0, policy_version 84880 (0.0009) [2023-10-10 08:12:41,592][53268] Updated weights for policy 1, policy_version 84830 (0.0007) [2023-10-10 08:12:41,619][53252] Updated weights for policy 0, policy_version 84890 (0.0007) [2023-10-10 08:12:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.5, 300 sec: 13440.4). Total num frames: 173768704. Throughput: 0: 1665.2, 1: 1668.6. Samples: 43453244. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:41,784][52050] Avg episode reward: [(0, '24.200'), (1, '22.020')] [2023-10-10 08:12:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000084832_86867968.pth... [2023-10-10 08:12:41,834][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000084896_86933504.pth... [2023-10-10 08:12:41,834][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000083264_85262336.pth [2023-10-10 08:12:41,872][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000083328_85327872.pth [2023-10-10 08:12:45,586][53268] Updated weights for policy 1, policy_version 84840 (0.0010) [2023-10-10 08:12:45,754][53252] Updated weights for policy 0, policy_version 84900 (0.0008) [2023-10-10 08:12:45,953][53268] Updated weights for policy 1, policy_version 84850 (0.0007) [2023-10-10 08:12:46,117][53252] Updated weights for policy 0, policy_version 84910 (0.0009) [2023-10-10 08:12:46,325][53268] Updated weights for policy 1, policy_version 84860 (0.0008) [2023-10-10 08:12:46,487][53252] Updated weights for policy 0, policy_version 84920 (0.0008) [2023-10-10 08:12:46,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173834240. Throughput: 0: 1683.6, 1: 1682.4. Samples: 43463792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:46,785][52050] Avg episode reward: [(0, '23.300'), (1, '22.730')] [2023-10-10 08:12:50,185][53268] Updated weights for policy 1, policy_version 84870 (0.0009) [2023-10-10 08:12:50,558][53268] Updated weights for policy 1, policy_version 84880 (0.0011) [2023-10-10 08:12:50,665][53252] Updated weights for policy 0, policy_version 84930 (0.0008) [2023-10-10 08:12:50,922][53268] Updated weights for policy 1, policy_version 84890 (0.0009) [2023-10-10 08:12:51,062][53252] Updated weights for policy 0, policy_version 84940 (0.0007) [2023-10-10 08:12:51,428][53252] Updated weights for policy 0, policy_version 84950 (0.0010) [2023-10-10 08:12:51,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173899776. Throughput: 0: 1683.9, 1: 1690.7. Samples: 43484570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:51,785][52050] Avg episode reward: [(0, '23.670'), (1, '21.160')] [2023-10-10 08:12:51,801][53252] Updated weights for policy 0, policy_version 84960 (0.0009) [2023-10-10 08:12:54,941][53268] Updated weights for policy 1, policy_version 84900 (0.0008) [2023-10-10 08:12:55,307][53268] Updated weights for policy 1, policy_version 84910 (0.0010) [2023-10-10 08:12:55,672][53252] Updated weights for policy 0, policy_version 84970 (0.0008) [2023-10-10 08:12:55,679][53268] Updated weights for policy 1, policy_version 84920 (0.0008) [2023-10-10 08:12:56,045][53252] Updated weights for policy 0, policy_version 84980 (0.0009) [2023-10-10 08:12:56,411][53252] Updated weights for policy 0, policy_version 84990 (0.0008) [2023-10-10 08:12:56,783][52050] Fps is (10 sec: 16384.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 173998080. Throughput: 0: 1658.4, 1: 1669.9. Samples: 43503296. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:12:56,784][52050] Avg episode reward: [(0, '23.670'), (1, '23.080')] [2023-10-10 08:12:59,904][53268] Updated weights for policy 1, policy_version 84930 (0.0008) [2023-10-10 08:13:00,282][53268] Updated weights for policy 1, policy_version 84940 (0.0008) [2023-10-10 08:13:00,419][53252] Updated weights for policy 0, policy_version 85000 (0.0008) [2023-10-10 08:13:00,644][53268] Updated weights for policy 1, policy_version 84950 (0.0009) [2023-10-10 08:13:00,784][53252] Updated weights for policy 0, policy_version 85010 (0.0007) [2023-10-10 08:13:01,015][53268] Updated weights for policy 1, policy_version 84960 (0.0008) [2023-10-10 08:13:01,152][53252] Updated weights for policy 0, policy_version 85020 (0.0009) [2023-10-10 08:13:01,783][52050] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 174063616. Throughput: 0: 1687.2, 1: 1701.8. Samples: 43515098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:01,784][52050] Avg episode reward: [(0, '22.430'), (1, '25.010')] [2023-10-10 08:13:05,042][53268] Updated weights for policy 1, policy_version 84970 (0.0008) [2023-10-10 08:13:05,171][53252] Updated weights for policy 0, policy_version 85030 (0.0007) [2023-10-10 08:13:05,409][53268] Updated weights for policy 1, policy_version 84980 (0.0008) [2023-10-10 08:13:05,538][53252] Updated weights for policy 0, policy_version 85040 (0.0009) [2023-10-10 08:13:05,780][53268] Updated weights for policy 1, policy_version 84990 (0.0007) [2023-10-10 08:13:05,912][53252] Updated weights for policy 0, policy_version 85050 (0.0007) [2023-10-10 08:13:06,784][52050] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 174129152. Throughput: 0: 1677.9, 1: 1686.1. Samples: 43534828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:06,785][52050] Avg episode reward: [(0, '21.450'), (1, '21.900')] [2023-10-10 08:13:09,925][53268] Updated weights for policy 1, policy_version 85000 (0.0008) [2023-10-10 08:13:09,956][53252] Updated weights for policy 0, policy_version 85060 (0.0008) [2023-10-10 08:13:10,301][53268] Updated weights for policy 1, policy_version 85010 (0.0010) [2023-10-10 08:13:10,325][53252] Updated weights for policy 0, policy_version 85070 (0.0010) [2023-10-10 08:13:10,663][53268] Updated weights for policy 1, policy_version 85020 (0.0008) [2023-10-10 08:13:10,695][53252] Updated weights for policy 0, policy_version 85080 (0.0008) [2023-10-10 08:13:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 13551.5). Total num frames: 174194688. Throughput: 0: 1664.1, 1: 1673.9. Samples: 43553796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:11,784][52050] Avg episode reward: [(0, '20.630'), (1, '21.380')] [2023-10-10 08:13:14,763][53252] Updated weights for policy 0, policy_version 85090 (0.0008) [2023-10-10 08:13:14,831][53268] Updated weights for policy 1, policy_version 85030 (0.0009) [2023-10-10 08:13:15,128][53252] Updated weights for policy 0, policy_version 85100 (0.0010) [2023-10-10 08:13:15,187][53268] Updated weights for policy 1, policy_version 85040 (0.0009) [2023-10-10 08:13:15,502][53252] Updated weights for policy 0, policy_version 85110 (0.0008) [2023-10-10 08:13:15,549][53268] Updated weights for policy 1, policy_version 85050 (0.0009) [2023-10-10 08:13:15,874][53252] Updated weights for policy 0, policy_version 85120 (0.0007) [2023-10-10 08:13:16,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 174260224. Throughput: 0: 1686.8, 1: 1690.2. Samples: 43565424. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:16,784][52050] Avg episode reward: [(0, '21.430'), (1, '21.910')] [2023-10-10 08:13:19,649][53268] Updated weights for policy 1, policy_version 85060 (0.0008) [2023-10-10 08:13:20,003][53252] Updated weights for policy 0, policy_version 85130 (0.0007) [2023-10-10 08:13:20,016][53268] Updated weights for policy 1, policy_version 85070 (0.0008) [2023-10-10 08:13:20,375][53252] Updated weights for policy 0, policy_version 85140 (0.0008) [2023-10-10 08:13:20,375][53268] Updated weights for policy 1, policy_version 85080 (0.0009) [2023-10-10 08:13:20,741][53252] Updated weights for policy 0, policy_version 85150 (0.0007) [2023-10-10 08:13:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 174325760. Throughput: 0: 1672.0, 1: 1672.0. Samples: 43584592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:21,784][52050] Avg episode reward: [(0, '23.400'), (1, '20.520')] [2023-10-10 08:13:24,413][53268] Updated weights for policy 1, policy_version 85090 (0.0009) [2023-10-10 08:13:24,836][53268] Updated weights for policy 1, policy_version 85100 (0.0009) [2023-10-10 08:13:25,056][53252] Updated weights for policy 0, policy_version 85160 (0.0007) [2023-10-10 08:13:25,204][53268] Updated weights for policy 1, policy_version 85110 (0.0008) [2023-10-10 08:13:25,427][53252] Updated weights for policy 0, policy_version 85170 (0.0007) [2023-10-10 08:13:25,572][53268] Updated weights for policy 1, policy_version 85120 (0.0009) [2023-10-10 08:13:25,801][53252] Updated weights for policy 0, policy_version 85180 (0.0007) [2023-10-10 08:13:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.6, 300 sec: 13551.5). Total num frames: 174391296. Throughput: 0: 1672.5, 1: 1679.0. Samples: 43604064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:26,784][52050] Avg episode reward: [(0, '21.790'), (1, '21.110')] [2023-10-10 08:13:29,500][53268] Updated weights for policy 1, policy_version 85130 (0.0009) [2023-10-10 08:13:29,746][53252] Updated weights for policy 0, policy_version 85190 (0.0008) [2023-10-10 08:13:29,858][53268] Updated weights for policy 1, policy_version 85140 (0.0008) [2023-10-10 08:13:30,115][53252] Updated weights for policy 0, policy_version 85200 (0.0007) [2023-10-10 08:13:30,224][53268] Updated weights for policy 1, policy_version 85150 (0.0009) [2023-10-10 08:13:30,491][53252] Updated weights for policy 0, policy_version 85210 (0.0009) [2023-10-10 08:13:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 174456832. Throughput: 0: 1683.6, 1: 1693.5. Samples: 43615758. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:31,784][52050] Avg episode reward: [(0, '23.190'), (1, '21.310')] [2023-10-10 08:13:34,175][53268] Updated weights for policy 1, policy_version 85160 (0.0008) [2023-10-10 08:13:34,543][53268] Updated weights for policy 1, policy_version 85170 (0.0009) [2023-10-10 08:13:34,668][53252] Updated weights for policy 0, policy_version 85220 (0.0007) [2023-10-10 08:13:34,914][53268] Updated weights for policy 1, policy_version 85180 (0.0008) [2023-10-10 08:13:35,030][53252] Updated weights for policy 0, policy_version 85230 (0.0009) [2023-10-10 08:13:35,398][53252] Updated weights for policy 0, policy_version 85240 (0.0008) [2023-10-10 08:13:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 174522368. Throughput: 0: 1664.4, 1: 1663.7. Samples: 43634332. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:36,784][52050] Avg episode reward: [(0, '23.920'), (1, '21.180')] [2023-10-10 08:13:39,035][53268] Updated weights for policy 1, policy_version 85190 (0.0009) [2023-10-10 08:13:39,403][53268] Updated weights for policy 1, policy_version 85200 (0.0008) [2023-10-10 08:13:39,593][53252] Updated weights for policy 0, policy_version 85250 (0.0007) [2023-10-10 08:13:39,768][53268] Updated weights for policy 1, policy_version 85210 (0.0009) [2023-10-10 08:13:39,971][53252] Updated weights for policy 0, policy_version 85260 (0.0008) [2023-10-10 08:13:40,349][53252] Updated weights for policy 0, policy_version 85270 (0.0010) [2023-10-10 08:13:40,713][53252] Updated weights for policy 0, policy_version 85280 (0.0009) [2023-10-10 08:13:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 174587904. Throughput: 0: 1674.6, 1: 1684.0. Samples: 43654430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:41,784][52050] Avg episode reward: [(0, '23.520'), (1, '22.200')] [2023-10-10 08:13:43,903][53268] Updated weights for policy 1, policy_version 85220 (0.0009) [2023-10-10 08:13:44,264][53268] Updated weights for policy 1, policy_version 85230 (0.0008) [2023-10-10 08:13:44,628][53268] Updated weights for policy 1, policy_version 85240 (0.0008) [2023-10-10 08:13:44,733][53252] Updated weights for policy 0, policy_version 85290 (0.0008) [2023-10-10 08:13:45,100][53252] Updated weights for policy 0, policy_version 85300 (0.0009) [2023-10-10 08:13:45,470][53252] Updated weights for policy 0, policy_version 85310 (0.0009) [2023-10-10 08:13:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 174653440. Throughput: 0: 1682.0, 1: 1671.9. Samples: 43666024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:46,784][52050] Avg episode reward: [(0, '21.040'), (1, '21.870')] [2023-10-10 08:13:48,833][53268] Updated weights for policy 1, policy_version 85250 (0.0008) [2023-10-10 08:13:49,198][53268] Updated weights for policy 1, policy_version 85260 (0.0007) [2023-10-10 08:13:49,565][53268] Updated weights for policy 1, policy_version 85270 (0.0007) [2023-10-10 08:13:49,596][53252] Updated weights for policy 0, policy_version 85320 (0.0009) [2023-10-10 08:13:49,918][53268] Updated weights for policy 1, policy_version 85280 (0.0009) [2023-10-10 08:13:49,961][53252] Updated weights for policy 0, policy_version 85330 (0.0008) [2023-10-10 08:13:50,325][53252] Updated weights for policy 0, policy_version 85340 (0.0009) [2023-10-10 08:13:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 174718976. Throughput: 0: 1663.9, 1: 1665.0. Samples: 43684628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:51,784][52050] Avg episode reward: [(0, '23.950'), (1, '22.480')] [2023-10-10 08:13:53,876][53268] Updated weights for policy 1, policy_version 85290 (0.0010) [2023-10-10 08:13:54,243][53268] Updated weights for policy 1, policy_version 85300 (0.0009) [2023-10-10 08:13:54,382][53252] Updated weights for policy 0, policy_version 85350 (0.0008) [2023-10-10 08:13:54,616][53268] Updated weights for policy 1, policy_version 85310 (0.0008) [2023-10-10 08:13:54,761][53252] Updated weights for policy 0, policy_version 85360 (0.0008) [2023-10-10 08:13:55,137][53252] Updated weights for policy 0, policy_version 85370 (0.0009) [2023-10-10 08:13:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 174784512. Throughput: 0: 1677.2, 1: 1683.1. Samples: 43705010. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:13:56,784][52050] Avg episode reward: [(0, '23.390'), (1, '23.560')] [2023-10-10 08:13:58,798][53268] Updated weights for policy 1, policy_version 85320 (0.0009) [2023-10-10 08:13:59,094][53252] Updated weights for policy 0, policy_version 85380 (0.0007) [2023-10-10 08:13:59,169][53268] Updated weights for policy 1, policy_version 85330 (0.0009) [2023-10-10 08:13:59,478][53252] Updated weights for policy 0, policy_version 85390 (0.0007) [2023-10-10 08:13:59,534][53268] Updated weights for policy 1, policy_version 85340 (0.0009) [2023-10-10 08:13:59,844][53252] Updated weights for policy 0, policy_version 85400 (0.0008) [2023-10-10 08:14:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 174850048. Throughput: 0: 1668.4, 1: 1668.3. Samples: 43715576. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:14:01,784][52050] Avg episode reward: [(0, '22.760'), (1, '23.910')] [2023-10-10 08:14:03,512][53268] Updated weights for policy 1, policy_version 85350 (0.0010) [2023-10-10 08:14:03,884][53268] Updated weights for policy 1, policy_version 85360 (0.0009) [2023-10-10 08:14:03,958][53252] Updated weights for policy 0, policy_version 85410 (0.0009) [2023-10-10 08:14:04,251][53268] Updated weights for policy 1, policy_version 85370 (0.0009) [2023-10-10 08:14:04,335][53252] Updated weights for policy 0, policy_version 85420 (0.0009) [2023-10-10 08:14:04,723][53252] Updated weights for policy 0, policy_version 85430 (0.0007) [2023-10-10 08:14:05,086][53252] Updated weights for policy 0, policy_version 85440 (0.0008) [2023-10-10 08:14:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13551.5). Total num frames: 174915584. Throughput: 0: 1662.3, 1: 1673.3. Samples: 43734694. 
Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:06,784][52050] Avg episode reward: [(0, '22.670'), (1, '23.360')] [2023-10-10 08:14:08,344][53268] Updated weights for policy 1, policy_version 85380 (0.0008) [2023-10-10 08:14:08,713][53268] Updated weights for policy 1, policy_version 85390 (0.0009) [2023-10-10 08:14:09,056][53252] Updated weights for policy 0, policy_version 85450 (0.0007) [2023-10-10 08:14:09,091][53268] Updated weights for policy 1, policy_version 85400 (0.0009) [2023-10-10 08:14:09,423][53252] Updated weights for policy 0, policy_version 85460 (0.0008) [2023-10-10 08:14:09,798][53252] Updated weights for policy 0, policy_version 85470 (0.0007) [2023-10-10 08:14:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 174981120. Throughput: 0: 1678.6, 1: 1682.4. Samples: 43755310. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:11,784][52050] Avg episode reward: [(0, '23.240'), (1, '23.720')] [2023-10-10 08:14:13,409][53268] Updated weights for policy 1, policy_version 85410 (0.0010) [2023-10-10 08:14:13,800][53268] Updated weights for policy 1, policy_version 85420 (0.0009) [2023-10-10 08:14:13,955][53252] Updated weights for policy 0, policy_version 85480 (0.0009) [2023-10-10 08:14:14,172][53268] Updated weights for policy 1, policy_version 85430 (0.0008) [2023-10-10 08:14:14,327][53252] Updated weights for policy 0, policy_version 85490 (0.0008) [2023-10-10 08:14:14,540][53268] Updated weights for policy 1, policy_version 85440 (0.0007) [2023-10-10 08:14:14,697][53252] Updated weights for policy 0, policy_version 85500 (0.0009) [2023-10-10 08:14:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175046656. Throughput: 0: 1663.0, 1: 1658.9. Samples: 43765246. 
Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:16,784][52050] Avg episode reward: [(0, '22.450'), (1, '24.310')] [2023-10-10 08:14:18,553][53252] Updated weights for policy 0, policy_version 85510 (0.0009) [2023-10-10 08:14:18,615][53268] Updated weights for policy 1, policy_version 85450 (0.0009) [2023-10-10 08:14:18,930][53252] Updated weights for policy 0, policy_version 85520 (0.0008) [2023-10-10 08:14:18,981][53268] Updated weights for policy 1, policy_version 85460 (0.0007) [2023-10-10 08:14:19,300][53252] Updated weights for policy 0, policy_version 85530 (0.0008) [2023-10-10 08:14:19,349][53268] Updated weights for policy 1, policy_version 85470 (0.0009) [2023-10-10 08:14:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175112192. Throughput: 0: 1677.0, 1: 1674.9. Samples: 43785168. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:21,784][52050] Avg episode reward: [(0, '23.190'), (1, '22.400')] [2023-10-10 08:14:23,368][53268] Updated weights for policy 1, policy_version 85480 (0.0009) [2023-10-10 08:14:23,458][53252] Updated weights for policy 0, policy_version 85540 (0.0009) [2023-10-10 08:14:23,725][53268] Updated weights for policy 1, policy_version 85490 (0.0008) [2023-10-10 08:14:23,813][53252] Updated weights for policy 0, policy_version 85550 (0.0010) [2023-10-10 08:14:24,099][53268] Updated weights for policy 1, policy_version 85500 (0.0008) [2023-10-10 08:14:24,190][53252] Updated weights for policy 0, policy_version 85560 (0.0010) [2023-10-10 08:14:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 175177728. Throughput: 0: 1692.9, 1: 1674.7. Samples: 43805972. 
Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:26,784][52050] Avg episode reward: [(0, '24.630'), (1, '22.060')] [2023-10-10 08:14:28,235][53252] Updated weights for policy 0, policy_version 85570 (0.0007) [2023-10-10 08:14:28,323][53268] Updated weights for policy 1, policy_version 85510 (0.0009) [2023-10-10 08:14:28,639][53252] Updated weights for policy 0, policy_version 85580 (0.0008) [2023-10-10 08:14:28,683][53268] Updated weights for policy 1, policy_version 85520 (0.0010) [2023-10-10 08:14:29,008][53252] Updated weights for policy 0, policy_version 85590 (0.0009) [2023-10-10 08:14:29,059][53268] Updated weights for policy 1, policy_version 85530 (0.0009) [2023-10-10 08:14:29,382][53252] Updated weights for policy 0, policy_version 85600 (0.0008) [2023-10-10 08:14:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175243264. Throughput: 0: 1656.8, 1: 1659.1. Samples: 43815238. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:31,784][52050] Avg episode reward: [(0, '22.440'), (1, '22.540')] [2023-10-10 08:14:33,242][53268] Updated weights for policy 1, policy_version 85540 (0.0008) [2023-10-10 08:14:33,551][53252] Updated weights for policy 0, policy_version 85610 (0.0008) [2023-10-10 08:14:33,602][53268] Updated weights for policy 1, policy_version 85550 (0.0008) [2023-10-10 08:14:33,914][53252] Updated weights for policy 0, policy_version 85620 (0.0007) [2023-10-10 08:14:33,966][53268] Updated weights for policy 1, policy_version 85560 (0.0010) [2023-10-10 08:14:34,276][53252] Updated weights for policy 0, policy_version 85630 (0.0008) [2023-10-10 08:14:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175308800. Throughput: 0: 1679.9, 1: 1674.6. Samples: 43835580. 
Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:36,784][52050] Avg episode reward: [(0, '22.320'), (1, '22.100')] [2023-10-10 08:14:37,926][53268] Updated weights for policy 1, policy_version 85570 (0.0009) [2023-10-10 08:14:38,283][53268] Updated weights for policy 1, policy_version 85580 (0.0009) [2023-10-10 08:14:38,554][53252] Updated weights for policy 0, policy_version 85640 (0.0008) [2023-10-10 08:14:38,652][53268] Updated weights for policy 1, policy_version 85590 (0.0009) [2023-10-10 08:14:38,922][53252] Updated weights for policy 0, policy_version 85650 (0.0009) [2023-10-10 08:14:39,012][53268] Updated weights for policy 1, policy_version 85600 (0.0008) [2023-10-10 08:14:39,283][53252] Updated weights for policy 0, policy_version 85660 (0.0009) [2023-10-10 08:14:41,784][52050] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 175374336. Throughput: 0: 1683.0, 1: 1677.7. Samples: 43856242. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:41,785][52050] Avg episode reward: [(0, '21.700'), (1, '21.310')] [2023-10-10 08:14:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000085600_87654400.pth... [2023-10-10 08:14:41,796][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000085664_87719936.pth... 
[2023-10-10 08:14:41,832][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000084032_86048768.pth [2023-10-10 08:14:41,839][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000084096_86114304.pth [2023-10-10 08:14:42,987][53268] Updated weights for policy 1, policy_version 85610 (0.0008) [2023-10-10 08:14:43,272][53252] Updated weights for policy 0, policy_version 85670 (0.0009) [2023-10-10 08:14:43,351][53268] Updated weights for policy 1, policy_version 85620 (0.0008) [2023-10-10 08:14:43,645][53252] Updated weights for policy 0, policy_version 85680 (0.0007) [2023-10-10 08:14:43,718][53268] Updated weights for policy 1, policy_version 85630 (0.0008) [2023-10-10 08:14:44,016][53252] Updated weights for policy 0, policy_version 85690 (0.0007) [2023-10-10 08:14:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 175439872. Throughput: 0: 1662.6, 1: 1663.7. Samples: 43865262. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:46,785][52050] Avg episode reward: [(0, '20.870'), (1, '22.360')] [2023-10-10 08:14:47,969][53268] Updated weights for policy 1, policy_version 85640 (0.0009) [2023-10-10 08:14:48,108][53252] Updated weights for policy 0, policy_version 85700 (0.0009) [2023-10-10 08:14:48,337][53268] Updated weights for policy 1, policy_version 85650 (0.0009) [2023-10-10 08:14:48,478][53252] Updated weights for policy 0, policy_version 85710 (0.0009) [2023-10-10 08:14:48,691][53268] Updated weights for policy 1, policy_version 85660 (0.0009) [2023-10-10 08:14:48,858][53252] Updated weights for policy 0, policy_version 85720 (0.0009) [2023-10-10 08:14:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 175505408. Throughput: 0: 1684.8, 1: 1671.5. Samples: 43885726. 
Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:51,784][52050] Avg episode reward: [(0, '21.360'), (1, '22.990')] [2023-10-10 08:14:52,559][53268] Updated weights for policy 1, policy_version 85670 (0.0008) [2023-10-10 08:14:52,920][53268] Updated weights for policy 1, policy_version 85680 (0.0008) [2023-10-10 08:14:52,980][53252] Updated weights for policy 0, policy_version 85730 (0.0009) [2023-10-10 08:14:53,287][53268] Updated weights for policy 1, policy_version 85690 (0.0007) [2023-10-10 08:14:53,351][53252] Updated weights for policy 0, policy_version 85740 (0.0008) [2023-10-10 08:14:53,728][53252] Updated weights for policy 0, policy_version 85750 (0.0008) [2023-10-10 08:14:54,099][53252] Updated weights for policy 0, policy_version 85760 (0.0007) [2023-10-10 08:14:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175570944. Throughput: 0: 1685.7, 1: 1676.4. Samples: 43906604. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-10 08:14:56,784][52050] Avg episode reward: [(0, '20.760'), (1, '22.100')] [2023-10-10 08:14:57,490][53268] Updated weights for policy 1, policy_version 85700 (0.0009) [2023-10-10 08:14:57,860][53268] Updated weights for policy 1, policy_version 85710 (0.0010) [2023-10-10 08:14:58,052][53252] Updated weights for policy 0, policy_version 85770 (0.0010) [2023-10-10 08:14:58,223][53268] Updated weights for policy 1, policy_version 85720 (0.0009) [2023-10-10 08:14:58,430][53252] Updated weights for policy 0, policy_version 85780 (0.0007) [2023-10-10 08:14:58,796][53252] Updated weights for policy 0, policy_version 85790 (0.0010) [2023-10-10 08:15:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175636480. Throughput: 0: 1670.6, 1: 1671.0. Samples: 43915618. 
Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:01,784][52050] Avg episode reward: [(0, '21.590'), (1, '22.540')] [2023-10-10 08:15:02,415][53268] Updated weights for policy 1, policy_version 85730 (0.0008) [2023-10-10 08:15:02,819][53268] Updated weights for policy 1, policy_version 85740 (0.0009) [2023-10-10 08:15:03,033][53252] Updated weights for policy 0, policy_version 85800 (0.0009) [2023-10-10 08:15:03,180][53268] Updated weights for policy 1, policy_version 85750 (0.0009) [2023-10-10 08:15:03,402][53252] Updated weights for policy 0, policy_version 85810 (0.0008) [2023-10-10 08:15:03,545][53268] Updated weights for policy 1, policy_version 85760 (0.0009) [2023-10-10 08:15:03,763][53252] Updated weights for policy 0, policy_version 85820 (0.0007) [2023-10-10 08:15:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175702016. Throughput: 0: 1676.6, 1: 1668.7. Samples: 43935706. Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:06,784][52050] Avg episode reward: [(0, '21.750'), (1, '21.630')] [2023-10-10 08:15:07,660][53268] Updated weights for policy 1, policy_version 85770 (0.0010) [2023-10-10 08:15:07,748][53252] Updated weights for policy 0, policy_version 85830 (0.0007) [2023-10-10 08:15:08,020][53268] Updated weights for policy 1, policy_version 85780 (0.0009) [2023-10-10 08:15:08,112][53252] Updated weights for policy 0, policy_version 85840 (0.0008) [2023-10-10 08:15:08,384][53268] Updated weights for policy 1, policy_version 85790 (0.0009) [2023-10-10 08:15:08,490][53252] Updated weights for policy 0, policy_version 85850 (0.0008) [2023-10-10 08:15:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 175767552. Throughput: 0: 1676.9, 1: 1670.3. Samples: 43956598. 
Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:11,784][52050] Avg episode reward: [(0, '21.710'), (1, '20.430')] [2023-10-10 08:15:12,410][53268] Updated weights for policy 1, policy_version 85800 (0.0009) [2023-10-10 08:15:12,593][53252] Updated weights for policy 0, policy_version 85860 (0.0009) [2023-10-10 08:15:12,764][53268] Updated weights for policy 1, policy_version 85810 (0.0008) [2023-10-10 08:15:12,975][53252] Updated weights for policy 0, policy_version 85870 (0.0008) [2023-10-10 08:15:13,127][53268] Updated weights for policy 1, policy_version 85820 (0.0008) [2023-10-10 08:15:13,352][53252] Updated weights for policy 0, policy_version 85880 (0.0008) [2023-10-10 08:15:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175833088. Throughput: 0: 1674.7, 1: 1666.7. Samples: 43965600. Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:16,784][52050] Avg episode reward: [(0, '22.560'), (1, '20.830')] [2023-10-10 08:15:17,198][53268] Updated weights for policy 1, policy_version 85830 (0.0008) [2023-10-10 08:15:17,328][53252] Updated weights for policy 0, policy_version 85890 (0.0007) [2023-10-10 08:15:17,560][53268] Updated weights for policy 1, policy_version 85840 (0.0008) [2023-10-10 08:15:17,695][53252] Updated weights for policy 0, policy_version 85900 (0.0007) [2023-10-10 08:15:17,929][53268] Updated weights for policy 1, policy_version 85850 (0.0007) [2023-10-10 08:15:18,067][53252] Updated weights for policy 0, policy_version 85910 (0.0008) [2023-10-10 08:15:18,437][53252] Updated weights for policy 0, policy_version 85920 (0.0008) [2023-10-10 08:15:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175898624. Throughput: 0: 1677.2, 1: 1676.7. Samples: 43986506. 
Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:21,784][52050] Avg episode reward: [(0, '24.380'), (1, '21.970')] [2023-10-10 08:15:21,853][53268] Updated weights for policy 1, policy_version 85860 (0.0007) [2023-10-10 08:15:22,225][53268] Updated weights for policy 1, policy_version 85870 (0.0007) [2023-10-10 08:15:22,328][53252] Updated weights for policy 0, policy_version 85930 (0.0008) [2023-10-10 08:15:22,590][53268] Updated weights for policy 1, policy_version 85880 (0.0007) [2023-10-10 08:15:22,693][53252] Updated weights for policy 0, policy_version 85940 (0.0007) [2023-10-10 08:15:23,068][53252] Updated weights for policy 0, policy_version 85950 (0.0007) [2023-10-10 08:15:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 175964160. Throughput: 0: 1682.9, 1: 1671.5. Samples: 44007190. Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:26,784][52050] Avg episode reward: [(0, '23.830'), (1, '23.860')] [2023-10-10 08:15:26,937][53268] Updated weights for policy 1, policy_version 85890 (0.0007) [2023-10-10 08:15:27,191][53252] Updated weights for policy 0, policy_version 85960 (0.0008) [2023-10-10 08:15:27,292][53268] Updated weights for policy 1, policy_version 85900 (0.0007) [2023-10-10 08:15:27,566][53252] Updated weights for policy 0, policy_version 85970 (0.0009) [2023-10-10 08:15:27,652][53268] Updated weights for policy 1, policy_version 85910 (0.0008) [2023-10-10 08:15:27,927][53252] Updated weights for policy 0, policy_version 85980 (0.0011) [2023-10-10 08:15:28,023][53268] Updated weights for policy 1, policy_version 85920 (0.0009) [2023-10-10 08:15:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176029696. Throughput: 0: 1684.9, 1: 1671.6. Samples: 44016302. 
Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:31,784][52050] Avg episode reward: [(0, '25.100'), (1, '22.760')] [2023-10-10 08:15:31,999][53252] Updated weights for policy 0, policy_version 85990 (0.0007) [2023-10-10 08:15:32,120][53268] Updated weights for policy 1, policy_version 85930 (0.0008) [2023-10-10 08:15:32,360][53252] Updated weights for policy 0, policy_version 86000 (0.0008) [2023-10-10 08:15:32,491][53268] Updated weights for policy 1, policy_version 85940 (0.0008) [2023-10-10 08:15:32,735][53252] Updated weights for policy 0, policy_version 86010 (0.0008) [2023-10-10 08:15:32,854][53268] Updated weights for policy 1, policy_version 85950 (0.0010) [2023-10-10 08:15:36,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176095232. Throughput: 0: 1679.5, 1: 1674.5. Samples: 44036656. Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:36,785][52050] Avg episode reward: [(0, '23.740'), (1, '22.790')] [2023-10-10 08:15:36,959][53252] Updated weights for policy 0, policy_version 86020 (0.0008) [2023-10-10 08:15:37,028][53268] Updated weights for policy 1, policy_version 85960 (0.0008) [2023-10-10 08:15:37,321][53252] Updated weights for policy 0, policy_version 86030 (0.0008) [2023-10-10 08:15:37,394][53268] Updated weights for policy 1, policy_version 85970 (0.0009) [2023-10-10 08:15:37,690][53252] Updated weights for policy 0, policy_version 86040 (0.0007) [2023-10-10 08:15:37,751][53268] Updated weights for policy 1, policy_version 85980 (0.0009) [2023-10-10 08:15:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176160768. Throughput: 0: 1673.0, 1: 1671.1. Samples: 44057090. 
Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:41,784][52050] Avg episode reward: [(0, '23.550'), (1, '23.100')] [2023-10-10 08:15:41,835][53268] Updated weights for policy 1, policy_version 85990 (0.0009) [2023-10-10 08:15:41,868][53252] Updated weights for policy 0, policy_version 86050 (0.0007) [2023-10-10 08:15:42,197][53268] Updated weights for policy 1, policy_version 86000 (0.0008) [2023-10-10 08:15:42,239][53252] Updated weights for policy 0, policy_version 86060 (0.0008) [2023-10-10 08:15:42,552][53268] Updated weights for policy 1, policy_version 86010 (0.0007) [2023-10-10 08:15:42,608][53252] Updated weights for policy 0, policy_version 86070 (0.0009) [2023-10-10 08:15:42,969][53252] Updated weights for policy 0, policy_version 86080 (0.0008) [2023-10-10 08:15:46,621][53268] Updated weights for policy 1, policy_version 86020 (0.0008) [2023-10-10 08:15:46,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 176226304. Throughput: 0: 1672.4, 1: 1671.3. Samples: 44066086. Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:46,784][52050] Avg episode reward: [(0, '22.350'), (1, '22.590')] [2023-10-10 08:15:46,975][53268] Updated weights for policy 1, policy_version 86030 (0.0009) [2023-10-10 08:15:47,066][53252] Updated weights for policy 0, policy_version 86090 (0.0007) [2023-10-10 08:15:47,347][53268] Updated weights for policy 1, policy_version 86040 (0.0008) [2023-10-10 08:15:47,433][53252] Updated weights for policy 0, policy_version 86100 (0.0009) [2023-10-10 08:15:47,807][53252] Updated weights for policy 0, policy_version 86110 (0.0008) [2023-10-10 08:15:51,600][53268] Updated weights for policy 1, policy_version 86050 (0.0009) [2023-10-10 08:15:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176291840. Throughput: 0: 1672.8, 1: 1678.3. Samples: 44086502. 
Policy #0 lag: (min: 46.0, avg: 55.7, max: 56.0) [2023-10-10 08:15:51,784][52050] Avg episode reward: [(0, '21.090'), (1, '22.190')] [2023-10-10 08:15:51,919][53252] Updated weights for policy 0, policy_version 86120 (0.0009) [2023-10-10 08:15:52,019][53268] Updated weights for policy 1, policy_version 86060 (0.0008) [2023-10-10 08:15:52,291][53252] Updated weights for policy 0, policy_version 86130 (0.0008) [2023-10-10 08:15:52,384][53268] Updated weights for policy 1, policy_version 86070 (0.0007) [2023-10-10 08:15:52,650][53252] Updated weights for policy 0, policy_version 86140 (0.0008) [2023-10-10 08:15:52,739][53268] Updated weights for policy 1, policy_version 86080 (0.0008) [2023-10-10 08:15:56,728][53252] Updated weights for policy 0, policy_version 86150 (0.0009) [2023-10-10 08:15:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176357376. Throughput: 0: 1669.5, 1: 1672.5. Samples: 44106988. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:15:56,784][52050] Avg episode reward: [(0, '21.710'), (1, '21.470')] [2023-10-10 08:15:56,888][53268] Updated weights for policy 1, policy_version 86090 (0.0008) [2023-10-10 08:15:57,097][53252] Updated weights for policy 0, policy_version 86160 (0.0009) [2023-10-10 08:15:57,257][53268] Updated weights for policy 1, policy_version 86100 (0.0008) [2023-10-10 08:15:57,467][53252] Updated weights for policy 0, policy_version 86170 (0.0008) [2023-10-10 08:15:57,620][53268] Updated weights for policy 1, policy_version 86110 (0.0008) [2023-10-10 08:16:01,606][53252] Updated weights for policy 0, policy_version 86180 (0.0009) [2023-10-10 08:16:01,639][53268] Updated weights for policy 1, policy_version 86120 (0.0009) [2023-10-10 08:16:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176422912. Throughput: 0: 1670.2, 1: 1672.5. Samples: 44116020. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:01,784][52050] Avg episode reward: [(0, '21.060'), (1, '20.960')] [2023-10-10 08:16:01,986][53252] Updated weights for policy 0, policy_version 86190 (0.0008) [2023-10-10 08:16:02,009][53268] Updated weights for policy 1, policy_version 86130 (0.0008) [2023-10-10 08:16:02,351][53252] Updated weights for policy 0, policy_version 86200 (0.0007) [2023-10-10 08:16:02,379][53268] Updated weights for policy 1, policy_version 86140 (0.0007) [2023-10-10 08:16:06,323][53268] Updated weights for policy 1, policy_version 86150 (0.0009) [2023-10-10 08:16:06,543][53252] Updated weights for policy 0, policy_version 86210 (0.0009) [2023-10-10 08:16:06,695][53268] Updated weights for policy 1, policy_version 86160 (0.0010) [2023-10-10 08:16:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 176488448. Throughput: 0: 1667.3, 1: 1670.3. Samples: 44136700. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:06,784][52050] Avg episode reward: [(0, '21.440'), (1, '22.160')] [2023-10-10 08:16:06,920][53252] Updated weights for policy 0, policy_version 86220 (0.0009) [2023-10-10 08:16:07,059][53268] Updated weights for policy 1, policy_version 86170 (0.0008) [2023-10-10 08:16:07,294][53252] Updated weights for policy 0, policy_version 86230 (0.0008) [2023-10-10 08:16:07,658][53252] Updated weights for policy 0, policy_version 86240 (0.0008) [2023-10-10 08:16:10,460][53268] Updated weights for policy 1, policy_version 86180 (0.0009) [2023-10-10 08:16:10,826][53268] Updated weights for policy 1, policy_version 86190 (0.0008) [2023-10-10 08:16:11,016][53252] Updated weights for policy 0, policy_version 86250 (0.0007) [2023-10-10 08:16:11,200][53268] Updated weights for policy 1, policy_version 86200 (0.0008) [2023-10-10 08:16:11,391][53252] Updated weights for policy 0, policy_version 86260 (0.0007) [2023-10-10 08:16:11,773][53252] Updated weights for policy 0, 
policy_version 86270 (0.0007) [2023-10-10 08:16:11,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176586752. Throughput: 0: 1671.3, 1: 1678.9. Samples: 44157950. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:11,784][52050] Avg episode reward: [(0, '20.520'), (1, '20.830')] [2023-10-10 08:16:15,319][53268] Updated weights for policy 1, policy_version 86210 (0.0008) [2023-10-10 08:16:15,682][53268] Updated weights for policy 1, policy_version 86220 (0.0009) [2023-10-10 08:16:15,795][53252] Updated weights for policy 0, policy_version 86280 (0.0008) [2023-10-10 08:16:16,055][53268] Updated weights for policy 1, policy_version 86230 (0.0009) [2023-10-10 08:16:16,165][53252] Updated weights for policy 0, policy_version 86290 (0.0007) [2023-10-10 08:16:16,417][53268] Updated weights for policy 1, policy_version 86240 (0.0009) [2023-10-10 08:16:16,539][53252] Updated weights for policy 0, policy_version 86300 (0.0008) [2023-10-10 08:16:16,783][52050] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176685056. Throughput: 0: 1692.8, 1: 1698.0. Samples: 44168888. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:16,784][52050] Avg episode reward: [(0, '20.280'), (1, '21.110')] [2023-10-10 08:16:20,435][53268] Updated weights for policy 1, policy_version 86250 (0.0007) [2023-10-10 08:16:20,719][53252] Updated weights for policy 0, policy_version 86310 (0.0008) [2023-10-10 08:16:20,806][53268] Updated weights for policy 1, policy_version 86260 (0.0008) [2023-10-10 08:16:21,087][53252] Updated weights for policy 0, policy_version 86320 (0.0007) [2023-10-10 08:16:21,181][53268] Updated weights for policy 1, policy_version 86270 (0.0009) [2023-10-10 08:16:21,458][53252] Updated weights for policy 0, policy_version 86330 (0.0010) [2023-10-10 08:16:21,783][52050] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176750592. 
Throughput: 0: 1692.9, 1: 1698.6. Samples: 44189272. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:21,784][52050] Avg episode reward: [(0, '23.280'), (1, '22.850')] [2023-10-10 08:16:25,335][53268] Updated weights for policy 1, policy_version 86280 (0.0010) [2023-10-10 08:16:25,503][53252] Updated weights for policy 0, policy_version 86340 (0.0009) [2023-10-10 08:16:25,706][53268] Updated weights for policy 1, policy_version 86290 (0.0009) [2023-10-10 08:16:25,863][53252] Updated weights for policy 0, policy_version 86350 (0.0009) [2023-10-10 08:16:26,077][53268] Updated weights for policy 1, policy_version 86300 (0.0008) [2023-10-10 08:16:26,237][53252] Updated weights for policy 0, policy_version 86360 (0.0008) [2023-10-10 08:16:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176816128. Throughput: 0: 1674.8, 1: 1676.8. Samples: 44207912. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:26,784][52050] Avg episode reward: [(0, '21.400'), (1, '23.640')] [2023-10-10 08:16:30,181][53268] Updated weights for policy 1, policy_version 86310 (0.0008) [2023-10-10 08:16:30,396][53252] Updated weights for policy 0, policy_version 86370 (0.0007) [2023-10-10 08:16:30,549][53268] Updated weights for policy 1, policy_version 86320 (0.0010) [2023-10-10 08:16:30,771][53252] Updated weights for policy 0, policy_version 86380 (0.0008) [2023-10-10 08:16:30,917][53268] Updated weights for policy 1, policy_version 86330 (0.0008) [2023-10-10 08:16:31,150][53252] Updated weights for policy 0, policy_version 86390 (0.0009) [2023-10-10 08:16:31,514][53252] Updated weights for policy 0, policy_version 86400 (0.0008) [2023-10-10 08:16:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 176881664. Throughput: 0: 1697.0, 1: 1707.1. Samples: 44219274. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:31,784][52050] Avg episode reward: [(0, '22.060'), (1, '22.850')] [2023-10-10 08:16:34,909][53268] Updated weights for policy 1, policy_version 86340 (0.0008) [2023-10-10 08:16:35,270][53268] Updated weights for policy 1, policy_version 86350 (0.0007) [2023-10-10 08:16:35,615][53252] Updated weights for policy 0, policy_version 86410 (0.0007) [2023-10-10 08:16:35,632][53268] Updated weights for policy 1, policy_version 86360 (0.0007) [2023-10-10 08:16:35,992][53252] Updated weights for policy 0, policy_version 86420 (0.0008) [2023-10-10 08:16:36,373][53252] Updated weights for policy 0, policy_version 86430 (0.0010) [2023-10-10 08:16:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 176947200. Throughput: 0: 1698.4, 1: 1700.8. Samples: 44239468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:36,784][52050] Avg episode reward: [(0, '23.830'), (1, '22.850')] [2023-10-10 08:16:39,837][53268] Updated weights for policy 1, policy_version 86370 (0.0009) [2023-10-10 08:16:40,240][53268] Updated weights for policy 1, policy_version 86380 (0.0009) [2023-10-10 08:16:40,370][53252] Updated weights for policy 0, policy_version 86440 (0.0008) [2023-10-10 08:16:40,603][53268] Updated weights for policy 1, policy_version 86390 (0.0009) [2023-10-10 08:16:40,737][53252] Updated weights for policy 0, policy_version 86450 (0.0008) [2023-10-10 08:16:40,966][53268] Updated weights for policy 1, policy_version 86400 (0.0009) [2023-10-10 08:16:41,104][53252] Updated weights for policy 0, policy_version 86460 (0.0007) [2023-10-10 08:16:41,784][52050] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 177012736. Throughput: 0: 1673.9, 1: 1682.0. Samples: 44258004. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:16:41,785][52050] Avg episode reward: [(0, '23.660'), (1, '23.630')] [2023-10-10 08:16:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000086400_88473600.pth... [2023-10-10 08:16:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000086464_88539136.pth... [2023-10-10 08:16:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000084832_86867968.pth [2023-10-10 08:16:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000084896_86933504.pth [2023-10-10 08:16:41,832][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000086400_88473600.pth [2023-10-10 08:16:41,834][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000086464_88539136.pth [2023-10-10 08:16:44,994][53268] Updated weights for policy 1, policy_version 86410 (0.0010) [2023-10-10 08:16:45,215][53252] Updated weights for policy 0, policy_version 86470 (0.0007) [2023-10-10 08:16:45,364][53268] Updated weights for policy 1, policy_version 86420 (0.0010) [2023-10-10 08:16:45,584][53252] Updated weights for policy 0, policy_version 86480 (0.0008) [2023-10-10 08:16:45,727][53268] Updated weights for policy 1, policy_version 86430 (0.0008) [2023-10-10 08:16:45,967][53252] Updated weights for policy 0, policy_version 86490 (0.0010) [2023-10-10 08:16:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 177078272. Throughput: 0: 1701.6, 1: 1710.7. Samples: 44269576. 
Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:16:46,784][52050] Avg episode reward: [(0, '21.660'), (1, '24.070')] [2023-10-10 08:16:49,760][53268] Updated weights for policy 1, policy_version 86440 (0.0009) [2023-10-10 08:16:50,120][53268] Updated weights for policy 1, policy_version 86450 (0.0007) [2023-10-10 08:16:50,208][53252] Updated weights for policy 0, policy_version 86500 (0.0009) [2023-10-10 08:16:50,502][53268] Updated weights for policy 1, policy_version 86460 (0.0008) [2023-10-10 08:16:50,577][53252] Updated weights for policy 0, policy_version 86510 (0.0007) [2023-10-10 08:16:50,948][53252] Updated weights for policy 0, policy_version 86520 (0.0007) [2023-10-10 08:16:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 177143808. Throughput: 0: 1696.8, 1: 1686.4. Samples: 44288944. Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:16:51,784][52050] Avg episode reward: [(0, '23.350'), (1, '22.430')] [2023-10-10 08:16:54,502][53268] Updated weights for policy 1, policy_version 86470 (0.0010) [2023-10-10 08:16:54,838][53252] Updated weights for policy 0, policy_version 86530 (0.0008) [2023-10-10 08:16:54,872][53268] Updated weights for policy 1, policy_version 86480 (0.0011) [2023-10-10 08:16:55,204][53252] Updated weights for policy 0, policy_version 86540 (0.0007) [2023-10-10 08:16:55,240][53268] Updated weights for policy 1, policy_version 86490 (0.0009) [2023-10-10 08:16:55,571][53252] Updated weights for policy 0, policy_version 86550 (0.0008) [2023-10-10 08:16:55,951][53252] Updated weights for policy 0, policy_version 86560 (0.0008) [2023-10-10 08:16:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 177209344. Throughput: 0: 1672.4, 1: 1666.8. Samples: 44308214. 
Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:16:56,784][52050] Avg episode reward: [(0, '23.280'), (1, '23.160')] [2023-10-10 08:16:59,495][53268] Updated weights for policy 1, policy_version 86500 (0.0010) [2023-10-10 08:16:59,852][53268] Updated weights for policy 1, policy_version 86510 (0.0010) [2023-10-10 08:17:00,077][53252] Updated weights for policy 0, policy_version 86570 (0.0009) [2023-10-10 08:17:00,212][53268] Updated weights for policy 1, policy_version 86520 (0.0008) [2023-10-10 08:17:00,443][53252] Updated weights for policy 0, policy_version 86580 (0.0007) [2023-10-10 08:17:00,821][53252] Updated weights for policy 0, policy_version 86590 (0.0010) [2023-10-10 08:17:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 177274880. Throughput: 0: 1678.7, 1: 1674.4. Samples: 44319780. Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:01,784][52050] Avg episode reward: [(0, '21.530'), (1, '22.710')] [2023-10-10 08:17:04,280][53268] Updated weights for policy 1, policy_version 86530 (0.0008) [2023-10-10 08:17:04,647][53268] Updated weights for policy 1, policy_version 86540 (0.0010) [2023-10-10 08:17:04,805][53252] Updated weights for policy 0, policy_version 86600 (0.0008) [2023-10-10 08:17:05,021][53268] Updated weights for policy 1, policy_version 86550 (0.0008) [2023-10-10 08:17:05,176][53252] Updated weights for policy 0, policy_version 86610 (0.0008) [2023-10-10 08:17:05,384][53268] Updated weights for policy 1, policy_version 86560 (0.0008) [2023-10-10 08:17:05,552][53252] Updated weights for policy 0, policy_version 86620 (0.0008) [2023-10-10 08:17:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 177340416. Throughput: 0: 1663.6, 1: 1653.1. Samples: 44338524. 
Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:06,784][52050] Avg episode reward: [(0, '21.190'), (1, '23.300')] [2023-10-10 08:17:09,374][53268] Updated weights for policy 1, policy_version 86570 (0.0010) [2023-10-10 08:17:09,601][53252] Updated weights for policy 0, policy_version 86630 (0.0007) [2023-10-10 08:17:09,732][53268] Updated weights for policy 1, policy_version 86580 (0.0009) [2023-10-10 08:17:09,959][53252] Updated weights for policy 0, policy_version 86640 (0.0007) [2023-10-10 08:17:10,096][53268] Updated weights for policy 1, policy_version 86590 (0.0007) [2023-10-10 08:17:10,334][53252] Updated weights for policy 0, policy_version 86650 (0.0007) [2023-10-10 08:17:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 177405952. Throughput: 0: 1675.6, 1: 1668.6. Samples: 44358402. Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:11,784][52050] Avg episode reward: [(0, '22.840'), (1, '22.650')] [2023-10-10 08:17:14,256][53268] Updated weights for policy 1, policy_version 86600 (0.0008) [2023-10-10 08:17:14,399][53252] Updated weights for policy 0, policy_version 86660 (0.0009) [2023-10-10 08:17:14,616][53268] Updated weights for policy 1, policy_version 86610 (0.0010) [2023-10-10 08:17:14,761][53252] Updated weights for policy 0, policy_version 86670 (0.0008) [2023-10-10 08:17:14,983][53268] Updated weights for policy 1, policy_version 86620 (0.0011) [2023-10-10 08:17:15,136][53252] Updated weights for policy 0, policy_version 86680 (0.0008) [2023-10-10 08:17:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 177471488. Throughput: 0: 1683.0, 1: 1666.0. Samples: 44369980. 
Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:16,784][52050] Avg episode reward: [(0, '23.170'), (1, '22.380')] [2023-10-10 08:17:19,093][53252] Updated weights for policy 0, policy_version 86690 (0.0008) [2023-10-10 08:17:19,331][53268] Updated weights for policy 1, policy_version 86630 (0.0008) [2023-10-10 08:17:19,462][53252] Updated weights for policy 0, policy_version 86700 (0.0007) [2023-10-10 08:17:19,696][53268] Updated weights for policy 1, policy_version 86640 (0.0009) [2023-10-10 08:17:19,834][53252] Updated weights for policy 0, policy_version 86710 (0.0008) [2023-10-10 08:17:20,067][53268] Updated weights for policy 1, policy_version 86650 (0.0008) [2023-10-10 08:17:20,205][53252] Updated weights for policy 0, policy_version 86720 (0.0008) [2023-10-10 08:17:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 177537024. Throughput: 0: 1663.3, 1: 1652.1. Samples: 44388660. Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:21,784][52050] Avg episode reward: [(0, '23.720'), (1, '22.410')] [2023-10-10 08:17:24,244][53268] Updated weights for policy 1, policy_version 86660 (0.0010) [2023-10-10 08:17:24,252][53252] Updated weights for policy 0, policy_version 86730 (0.0007) [2023-10-10 08:17:24,605][53252] Updated weights for policy 0, policy_version 86740 (0.0008) [2023-10-10 08:17:24,606][53268] Updated weights for policy 1, policy_version 86670 (0.0008) [2023-10-10 08:17:24,980][53268] Updated weights for policy 1, policy_version 86680 (0.0009) [2023-10-10 08:17:24,982][53252] Updated weights for policy 0, policy_version 86750 (0.0007) [2023-10-10 08:17:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177602560. Throughput: 0: 1694.0, 1: 1666.8. Samples: 44409238. 
Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:26,784][52050] Avg episode reward: [(0, '22.100'), (1, '22.230')] [2023-10-10 08:17:28,943][53268] Updated weights for policy 1, policy_version 86690 (0.0010) [2023-10-10 08:17:28,983][53252] Updated weights for policy 0, policy_version 86760 (0.0008) [2023-10-10 08:17:29,323][53268] Updated weights for policy 1, policy_version 86700 (0.0008) [2023-10-10 08:17:29,353][53252] Updated weights for policy 0, policy_version 86770 (0.0007) [2023-10-10 08:17:29,683][53268] Updated weights for policy 1, policy_version 86710 (0.0009) [2023-10-10 08:17:29,725][53252] Updated weights for policy 0, policy_version 86780 (0.0007) [2023-10-10 08:17:30,043][53268] Updated weights for policy 1, policy_version 86720 (0.0009) [2023-10-10 08:17:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177668096. Throughput: 0: 1677.9, 1: 1661.7. Samples: 44419860. Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:31,784][52050] Avg episode reward: [(0, '20.840'), (1, '22.260')] [2023-10-10 08:17:33,732][53252] Updated weights for policy 0, policy_version 86790 (0.0007) [2023-10-10 08:17:33,968][53268] Updated weights for policy 1, policy_version 86730 (0.0008) [2023-10-10 08:17:34,103][53252] Updated weights for policy 0, policy_version 86800 (0.0007) [2023-10-10 08:17:34,338][53268] Updated weights for policy 1, policy_version 86740 (0.0008) [2023-10-10 08:17:34,477][53252] Updated weights for policy 0, policy_version 86810 (0.0009) [2023-10-10 08:17:34,694][53268] Updated weights for policy 1, policy_version 86750 (0.0008) [2023-10-10 08:17:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177733632. Throughput: 0: 1673.9, 1: 1659.5. Samples: 44438950. 
Policy #0 lag: (min: 7.0, avg: 10.8, max: 39.0) [2023-10-10 08:17:36,784][52050] Avg episode reward: [(0, '19.960'), (1, '20.950')] [2023-10-10 08:17:38,588][53252] Updated weights for policy 0, policy_version 86820 (0.0007) [2023-10-10 08:17:38,772][53268] Updated weights for policy 1, policy_version 86760 (0.0010) [2023-10-10 08:17:38,959][53252] Updated weights for policy 0, policy_version 86830 (0.0009) [2023-10-10 08:17:39,140][53268] Updated weights for policy 1, policy_version 86770 (0.0008) [2023-10-10 08:17:39,336][53252] Updated weights for policy 0, policy_version 86840 (0.0008) [2023-10-10 08:17:39,493][53268] Updated weights for policy 1, policy_version 86780 (0.0008) [2023-10-10 08:17:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 177799168. Throughput: 0: 1687.8, 1: 1670.8. Samples: 44459350. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:17:41,784][52050] Avg episode reward: [(0, '19.590'), (1, '21.670')] [2023-10-10 08:17:43,569][53252] Updated weights for policy 0, policy_version 86850 (0.0008) [2023-10-10 08:17:43,656][53268] Updated weights for policy 1, policy_version 86790 (0.0010) [2023-10-10 08:17:43,940][53252] Updated weights for policy 0, policy_version 86860 (0.0009) [2023-10-10 08:17:44,014][53268] Updated weights for policy 1, policy_version 86800 (0.0007) [2023-10-10 08:17:44,317][53252] Updated weights for policy 0, policy_version 86870 (0.0009) [2023-10-10 08:17:44,383][53268] Updated weights for policy 1, policy_version 86810 (0.0009) [2023-10-10 08:17:44,686][53252] Updated weights for policy 0, policy_version 86880 (0.0008) [2023-10-10 08:17:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.5). Total num frames: 177864704. Throughput: 0: 1664.7, 1: 1655.4. Samples: 44469184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:17:46,784][52050] Avg episode reward: [(0, '20.510'), (1, '21.330')] [2023-10-10 08:17:48,607][53268] Updated weights for policy 1, policy_version 86820 (0.0008) [2023-10-10 08:17:48,807][53252] Updated weights for policy 0, policy_version 86890 (0.0008) [2023-10-10 08:17:48,976][53268] Updated weights for policy 1, policy_version 86830 (0.0008) [2023-10-10 08:17:49,177][53252] Updated weights for policy 0, policy_version 86900 (0.0010) [2023-10-10 08:17:49,343][53268] Updated weights for policy 1, policy_version 86840 (0.0009) [2023-10-10 08:17:49,549][53252] Updated weights for policy 0, policy_version 86910 (0.0008) [2023-10-10 08:17:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 177930240. Throughput: 0: 1675.5, 1: 1665.5. Samples: 44488870. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:17:51,784][52050] Avg episode reward: [(0, '22.380'), (1, '21.940')] [2023-10-10 08:17:53,316][53268] Updated weights for policy 1, policy_version 86850 (0.0009) [2023-10-10 08:17:53,554][53252] Updated weights for policy 0, policy_version 86920 (0.0007) [2023-10-10 08:17:53,681][53268] Updated weights for policy 1, policy_version 86860 (0.0008) [2023-10-10 08:17:53,914][53252] Updated weights for policy 0, policy_version 86930 (0.0008) [2023-10-10 08:17:54,053][53268] Updated weights for policy 1, policy_version 86870 (0.0007) [2023-10-10 08:17:54,290][53252] Updated weights for policy 0, policy_version 86940 (0.0007) [2023-10-10 08:17:54,426][53268] Updated weights for policy 1, policy_version 86880 (0.0009) [2023-10-10 08:17:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 177995776. Throughput: 0: 1690.2, 1: 1676.9. Samples: 44509922. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:17:56,784][52050] Avg episode reward: [(0, '21.720'), (1, '21.810')] [2023-10-10 08:17:58,312][53252] Updated weights for policy 0, policy_version 86950 (0.0008) [2023-10-10 08:17:58,517][53268] Updated weights for policy 1, policy_version 86890 (0.0010) [2023-10-10 08:17:58,682][53252] Updated weights for policy 0, policy_version 86960 (0.0009) [2023-10-10 08:17:58,887][53268] Updated weights for policy 1, policy_version 86900 (0.0008) [2023-10-10 08:17:59,038][53252] Updated weights for policy 0, policy_version 86970 (0.0007) [2023-10-10 08:17:59,246][53268] Updated weights for policy 1, policy_version 86910 (0.0008) [2023-10-10 08:18:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178061312. Throughput: 0: 1664.5, 1: 1652.6. Samples: 44519248. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:01,784][52050] Avg episode reward: [(0, '21.570'), (1, '22.240')] [2023-10-10 08:18:03,259][53252] Updated weights for policy 0, policy_version 86980 (0.0008) [2023-10-10 08:18:03,400][53268] Updated weights for policy 1, policy_version 86920 (0.0009) [2023-10-10 08:18:03,623][53252] Updated weights for policy 0, policy_version 86990 (0.0007) [2023-10-10 08:18:03,763][53268] Updated weights for policy 1, policy_version 86930 (0.0007) [2023-10-10 08:18:04,000][53252] Updated weights for policy 0, policy_version 87000 (0.0008) [2023-10-10 08:18:04,136][53268] Updated weights for policy 1, policy_version 86940 (0.0008) [2023-10-10 08:18:06,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178126848. Throughput: 0: 1682.7, 1: 1674.3. Samples: 44539724. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:06,784][52050] Avg episode reward: [(0, '23.140'), (1, '23.700')] [2023-10-10 08:18:07,982][53252] Updated weights for policy 0, policy_version 87010 (0.0009) [2023-10-10 08:18:08,352][53252] Updated weights for policy 0, policy_version 87020 (0.0010) [2023-10-10 08:18:08,412][53268] Updated weights for policy 1, policy_version 86950 (0.0008) [2023-10-10 08:18:08,726][53252] Updated weights for policy 0, policy_version 87030 (0.0008) [2023-10-10 08:18:08,777][53268] Updated weights for policy 1, policy_version 86960 (0.0009) [2023-10-10 08:18:09,098][53252] Updated weights for policy 0, policy_version 87040 (0.0007) [2023-10-10 08:18:09,146][53268] Updated weights for policy 1, policy_version 86970 (0.0008) [2023-10-10 08:18:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 178192384. Throughput: 0: 1681.2, 1: 1685.1. Samples: 44560724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:11,784][52050] Avg episode reward: [(0, '22.350'), (1, '22.890')] [2023-10-10 08:18:13,190][53252] Updated weights for policy 0, policy_version 87050 (0.0010) [2023-10-10 08:18:13,224][53268] Updated weights for policy 1, policy_version 86980 (0.0007) [2023-10-10 08:18:13,554][53252] Updated weights for policy 0, policy_version 87060 (0.0008) [2023-10-10 08:18:13,587][53268] Updated weights for policy 1, policy_version 86990 (0.0007) [2023-10-10 08:18:13,929][53252] Updated weights for policy 0, policy_version 87070 (0.0008) [2023-10-10 08:18:13,964][53268] Updated weights for policy 1, policy_version 87000 (0.0008) [2023-10-10 08:18:16,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 178257920. Throughput: 0: 1669.5, 1: 1664.1. Samples: 44569872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:16,784][52050] Avg episode reward: [(0, '22.550'), (1, '22.680')] [2023-10-10 08:18:18,079][53268] Updated weights for policy 1, policy_version 87010 (0.0009) [2023-10-10 08:18:18,089][53252] Updated weights for policy 0, policy_version 87080 (0.0008) [2023-10-10 08:18:18,443][53268] Updated weights for policy 1, policy_version 87020 (0.0007) [2023-10-10 08:18:18,458][53252] Updated weights for policy 0, policy_version 87090 (0.0009) [2023-10-10 08:18:18,800][53268] Updated weights for policy 1, policy_version 87030 (0.0007) [2023-10-10 08:18:18,828][53252] Updated weights for policy 0, policy_version 87100 (0.0008) [2023-10-10 08:18:19,167][53268] Updated weights for policy 1, policy_version 87040 (0.0008) [2023-10-10 08:18:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178323456. Throughput: 0: 1680.4, 1: 1677.1. Samples: 44590036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:21,784][52050] Avg episode reward: [(0, '22.980'), (1, '25.390')] [2023-10-10 08:18:21,784][53061] Saving new best policy, reward=25.390! [2023-10-10 08:18:22,823][53252] Updated weights for policy 0, policy_version 87110 (0.0007) [2023-10-10 08:18:23,201][53252] Updated weights for policy 0, policy_version 87120 (0.0007) [2023-10-10 08:18:23,316][53268] Updated weights for policy 1, policy_version 87050 (0.0009) [2023-10-10 08:18:23,573][53252] Updated weights for policy 0, policy_version 87130 (0.0009) [2023-10-10 08:18:23,684][53268] Updated weights for policy 1, policy_version 87060 (0.0008) [2023-10-10 08:18:24,051][53268] Updated weights for policy 1, policy_version 87070 (0.0009) [2023-10-10 08:18:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178388992. Throughput: 0: 1687.7, 1: 1678.0. Samples: 44610806. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:26,784][52050] Avg episode reward: [(0, '21.450'), (1, '25.210')] [2023-10-10 08:18:27,468][53252] Updated weights for policy 0, policy_version 87140 (0.0007) [2023-10-10 08:18:27,679][53268] Updated weights for policy 1, policy_version 87080 (0.0009) [2023-10-10 08:18:27,849][53252] Updated weights for policy 0, policy_version 87150 (0.0008) [2023-10-10 08:18:28,043][53268] Updated weights for policy 1, policy_version 87090 (0.0008) [2023-10-10 08:18:28,217][53252] Updated weights for policy 0, policy_version 87160 (0.0007) [2023-10-10 08:18:28,404][53268] Updated weights for policy 1, policy_version 87100 (0.0007) [2023-10-10 08:18:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178454528. Throughput: 0: 1680.2, 1: 1671.2. Samples: 44619998. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:31,784][52050] Avg episode reward: [(0, '22.040'), (1, '22.840')] [2023-10-10 08:18:32,185][53252] Updated weights for policy 0, policy_version 87170 (0.0007) [2023-10-10 08:18:32,515][53268] Updated weights for policy 1, policy_version 87110 (0.0010) [2023-10-10 08:18:32,559][53252] Updated weights for policy 0, policy_version 87180 (0.0009) [2023-10-10 08:18:32,879][53268] Updated weights for policy 1, policy_version 87120 (0.0008) [2023-10-10 08:18:32,922][53252] Updated weights for policy 0, policy_version 87190 (0.0008) [2023-10-10 08:18:33,254][53268] Updated weights for policy 1, policy_version 87130 (0.0009) [2023-10-10 08:18:33,295][53252] Updated weights for policy 0, policy_version 87200 (0.0008) [2023-10-10 08:18:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178520064. Throughput: 0: 1693.0, 1: 1682.6. Samples: 44640774. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:36,784][52050] Avg episode reward: [(0, '21.760'), (1, '21.980')] [2023-10-10 08:18:37,374][53268] Updated weights for policy 1, policy_version 87140 (0.0009) [2023-10-10 08:18:37,436][53252] Updated weights for policy 0, policy_version 87210 (0.0007) [2023-10-10 08:18:37,740][53268] Updated weights for policy 1, policy_version 87150 (0.0007) [2023-10-10 08:18:37,805][53252] Updated weights for policy 0, policy_version 87220 (0.0007) [2023-10-10 08:18:38,107][53268] Updated weights for policy 1, policy_version 87160 (0.0007) [2023-10-10 08:18:38,172][53252] Updated weights for policy 0, policy_version 87230 (0.0010) [2023-10-10 08:18:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178585600. Throughput: 0: 1688.2, 1: 1677.7. Samples: 44661386. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:41,784][52050] Avg episode reward: [(0, '21.830'), (1, '25.310')] [2023-10-10 08:18:41,794][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000087232_89325568.pth... [2023-10-10 08:18:41,794][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000087168_89260032.pth... 
[2023-10-10 08:18:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000085664_87719936.pth [2023-10-10 08:18:41,830][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000085600_87654400.pth [2023-10-10 08:18:42,280][53252] Updated weights for policy 0, policy_version 87240 (0.0010) [2023-10-10 08:18:42,403][53268] Updated weights for policy 1, policy_version 87170 (0.0008) [2023-10-10 08:18:42,651][53252] Updated weights for policy 0, policy_version 87250 (0.0008) [2023-10-10 08:18:42,769][53268] Updated weights for policy 1, policy_version 87180 (0.0008) [2023-10-10 08:18:43,020][53252] Updated weights for policy 0, policy_version 87260 (0.0009) [2023-10-10 08:18:43,143][53268] Updated weights for policy 1, policy_version 87190 (0.0007) [2023-10-10 08:18:43,503][53268] Updated weights for policy 1, policy_version 87200 (0.0008) [2023-10-10 08:18:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 178651136. Throughput: 0: 1685.3, 1: 1678.1. Samples: 44670602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:46,784][52050] Avg episode reward: [(0, '23.450'), (1, '22.720')] [2023-10-10 08:18:47,131][53252] Updated weights for policy 0, policy_version 87270 (0.0008) [2023-10-10 08:18:47,489][53252] Updated weights for policy 0, policy_version 87280 (0.0008) [2023-10-10 08:18:47,818][53268] Updated weights for policy 1, policy_version 87210 (0.0011) [2023-10-10 08:18:47,858][53252] Updated weights for policy 0, policy_version 87290 (0.0008) [2023-10-10 08:18:48,183][53268] Updated weights for policy 1, policy_version 87220 (0.0009) [2023-10-10 08:18:48,555][53268] Updated weights for policy 1, policy_version 87230 (0.0009) [2023-10-10 08:18:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178716672. Throughput: 0: 1688.1, 1: 1678.3. Samples: 44691214. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:51,784][52050] Avg episode reward: [(0, '21.970'), (1, '22.580')] [2023-10-10 08:18:51,974][53252] Updated weights for policy 0, policy_version 87300 (0.0008) [2023-10-10 08:18:52,341][53252] Updated weights for policy 0, policy_version 87310 (0.0008) [2023-10-10 08:18:52,567][53268] Updated weights for policy 1, policy_version 87240 (0.0008) [2023-10-10 08:18:52,710][53252] Updated weights for policy 0, policy_version 87320 (0.0009) [2023-10-10 08:18:52,951][53268] Updated weights for policy 1, policy_version 87250 (0.0009) [2023-10-10 08:18:53,318][53268] Updated weights for policy 1, policy_version 87260 (0.0009) [2023-10-10 08:18:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 178782208. Throughput: 0: 1683.4, 1: 1674.4. Samples: 44711828. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:18:56,784][52050] Avg episode reward: [(0, '21.030'), (1, '24.130')] [2023-10-10 08:18:56,810][53252] Updated weights for policy 0, policy_version 87330 (0.0010) [2023-10-10 08:18:57,168][53252] Updated weights for policy 0, policy_version 87340 (0.0008) [2023-10-10 08:18:57,364][53268] Updated weights for policy 1, policy_version 87270 (0.0009) [2023-10-10 08:18:57,553][53252] Updated weights for policy 0, policy_version 87350 (0.0009) [2023-10-10 08:18:57,718][53268] Updated weights for policy 1, policy_version 87280 (0.0008) [2023-10-10 08:18:57,915][53252] Updated weights for policy 0, policy_version 87360 (0.0010) [2023-10-10 08:18:58,087][53268] Updated weights for policy 1, policy_version 87290 (0.0007) [2023-10-10 08:19:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178847744. Throughput: 0: 1686.2, 1: 1672.2. Samples: 44720998. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:01,784][52050] Avg episode reward: [(0, '20.800'), (1, '23.490')] [2023-10-10 08:19:01,841][53252] Updated weights for policy 0, policy_version 87370 (0.0008) [2023-10-10 08:19:02,180][53268] Updated weights for policy 1, policy_version 87300 (0.0008) [2023-10-10 08:19:02,205][53252] Updated weights for policy 0, policy_version 87380 (0.0009) [2023-10-10 08:19:02,555][53268] Updated weights for policy 1, policy_version 87310 (0.0009) [2023-10-10 08:19:02,586][53252] Updated weights for policy 0, policy_version 87390 (0.0010) [2023-10-10 08:19:02,916][53268] Updated weights for policy 1, policy_version 87320 (0.0009) [2023-10-10 08:19:06,576][53252] Updated weights for policy 0, policy_version 87400 (0.0007) [2023-10-10 08:19:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178913280. Throughput: 0: 1691.8, 1: 1685.1. Samples: 44741998. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:06,784][52050] Avg episode reward: [(0, '21.880'), (1, '21.390')] [2023-10-10 08:19:06,848][53268] Updated weights for policy 1, policy_version 87330 (0.0011) [2023-10-10 08:19:06,947][53252] Updated weights for policy 0, policy_version 87410 (0.0007) [2023-10-10 08:19:07,207][53268] Updated weights for policy 1, policy_version 87340 (0.0009) [2023-10-10 08:19:07,319][53252] Updated weights for policy 0, policy_version 87420 (0.0009) [2023-10-10 08:19:07,575][53268] Updated weights for policy 1, policy_version 87350 (0.0009) [2023-10-10 08:19:07,942][53268] Updated weights for policy 1, policy_version 87360 (0.0008) [2023-10-10 08:19:11,407][53252] Updated weights for policy 0, policy_version 87430 (0.0007) [2023-10-10 08:19:11,775][53252] Updated weights for policy 0, policy_version 87440 (0.0007) [2023-10-10 08:19:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 178978816. Throughput: 0: 1684.5, 1: 1688.1. 
Samples: 44762574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:11,784][52050] Avg episode reward: [(0, '23.140'), (1, '20.800')] [2023-10-10 08:19:12,003][53268] Updated weights for policy 1, policy_version 87370 (0.0007) [2023-10-10 08:19:12,147][53252] Updated weights for policy 0, policy_version 87450 (0.0009) [2023-10-10 08:19:12,367][53268] Updated weights for policy 1, policy_version 87380 (0.0007) [2023-10-10 08:19:12,740][53268] Updated weights for policy 1, policy_version 87390 (0.0007) [2023-10-10 08:19:16,187][53252] Updated weights for policy 0, policy_version 87460 (0.0008) [2023-10-10 08:19:16,569][53252] Updated weights for policy 0, policy_version 87470 (0.0007) [2023-10-10 08:19:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 179044352. Throughput: 0: 1690.8, 1: 1686.8. Samples: 44771990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:16,784][52050] Avg episode reward: [(0, '22.350'), (1, '19.670')] [2023-10-10 08:19:16,788][53268] Updated weights for policy 1, policy_version 87400 (0.0008) [2023-10-10 08:19:16,943][53252] Updated weights for policy 0, policy_version 87480 (0.0007) [2023-10-10 08:19:17,156][53268] Updated weights for policy 1, policy_version 87410 (0.0008) [2023-10-10 08:19:17,522][53268] Updated weights for policy 1, policy_version 87420 (0.0008) [2023-10-10 08:19:21,015][53252] Updated weights for policy 0, policy_version 87490 (0.0008) [2023-10-10 08:19:21,383][53252] Updated weights for policy 0, policy_version 87500 (0.0007) [2023-10-10 08:19:21,692][53268] Updated weights for policy 1, policy_version 87430 (0.0009) [2023-10-10 08:19:21,753][53252] Updated weights for policy 0, policy_version 87510 (0.0008) [2023-10-10 08:19:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 179109888. Throughput: 0: 1687.5, 1: 1685.3. Samples: 44792550. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:21,784][52050] Avg episode reward: [(0, '23.430'), (1, '18.970')] [2023-10-10 08:19:22,055][53268] Updated weights for policy 1, policy_version 87440 (0.0008) [2023-10-10 08:19:22,123][53252] Updated weights for policy 0, policy_version 87520 (0.0008) [2023-10-10 08:19:22,432][53268] Updated weights for policy 1, policy_version 87450 (0.0007) [2023-10-10 08:19:26,282][53252] Updated weights for policy 0, policy_version 87530 (0.0007) [2023-10-10 08:19:26,375][53268] Updated weights for policy 1, policy_version 87460 (0.0008) [2023-10-10 08:19:26,659][53252] Updated weights for policy 0, policy_version 87540 (0.0007) [2023-10-10 08:19:26,747][53268] Updated weights for policy 1, policy_version 87470 (0.0007) [2023-10-10 08:19:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 179175424. Throughput: 0: 1678.6, 1: 1689.4. Samples: 44812946. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:26,784][52050] Avg episode reward: [(0, '22.330'), (1, '22.000')] [2023-10-10 08:19:27,039][53252] Updated weights for policy 0, policy_version 87550 (0.0007) [2023-10-10 08:19:27,107][53268] Updated weights for policy 1, policy_version 87480 (0.0008) [2023-10-10 08:19:31,123][53268] Updated weights for policy 1, policy_version 87490 (0.0008) [2023-10-10 08:19:31,137][53252] Updated weights for policy 0, policy_version 87560 (0.0009) [2023-10-10 08:19:31,490][53268] Updated weights for policy 1, policy_version 87500 (0.0007) [2023-10-10 08:19:31,512][53252] Updated weights for policy 0, policy_version 87570 (0.0009) [2023-10-10 08:19:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 179240960. Throughput: 0: 1690.8, 1: 1684.7. Samples: 44822498. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:31,784][52050] Avg episode reward: [(0, '22.400'), (1, '21.630')] [2023-10-10 08:19:31,857][53268] Updated weights for policy 1, policy_version 87510 (0.0007) [2023-10-10 08:19:31,870][53252] Updated weights for policy 0, policy_version 87580 (0.0009) [2023-10-10 08:19:32,223][53268] Updated weights for policy 1, policy_version 87520 (0.0007) [2023-10-10 08:19:36,023][53252] Updated weights for policy 0, policy_version 87590 (0.0011) [2023-10-10 08:19:36,388][53252] Updated weights for policy 0, policy_version 87600 (0.0008) [2023-10-10 08:19:36,389][53268] Updated weights for policy 1, policy_version 87530 (0.0010) [2023-10-10 08:19:36,762][53268] Updated weights for policy 1, policy_version 87540 (0.0008) [2023-10-10 08:19:36,775][53252] Updated weights for policy 0, policy_version 87610 (0.0008) [2023-10-10 08:19:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 179306496. Throughput: 0: 1691.2, 1: 1689.1. Samples: 44843326. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:19:36,784][52050] Avg episode reward: [(0, '23.420'), (1, '21.550')] [2023-10-10 08:19:37,124][53268] Updated weights for policy 1, policy_version 87550 (0.0007) [2023-10-10 08:19:40,773][53252] Updated weights for policy 0, policy_version 87620 (0.0009) [2023-10-10 08:19:41,099][53268] Updated weights for policy 1, policy_version 87560 (0.0009) [2023-10-10 08:19:41,142][53252] Updated weights for policy 0, policy_version 87630 (0.0009) [2023-10-10 08:19:41,476][53268] Updated weights for policy 1, policy_version 87570 (0.0009) [2023-10-10 08:19:41,512][53252] Updated weights for policy 0, policy_version 87640 (0.0007) [2023-10-10 08:19:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 179372032. Throughput: 0: 1671.5, 1: 1682.4. Samples: 44862754. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:19:41,784][52050] Avg episode reward: [(0, '23.660'), (1, '22.610')] [2023-10-10 08:19:41,839][53268] Updated weights for policy 1, policy_version 87580 (0.0009) [2023-10-10 08:19:45,507][53252] Updated weights for policy 0, policy_version 87650 (0.0008) [2023-10-10 08:19:45,671][53268] Updated weights for policy 1, policy_version 87590 (0.0010) [2023-10-10 08:19:45,873][53252] Updated weights for policy 0, policy_version 87660 (0.0007) [2023-10-10 08:19:46,040][53268] Updated weights for policy 1, policy_version 87600 (0.0008) [2023-10-10 08:19:46,247][53252] Updated weights for policy 0, policy_version 87670 (0.0008) [2023-10-10 08:19:46,400][53268] Updated weights for policy 1, policy_version 87610 (0.0008) [2023-10-10 08:19:46,616][53252] Updated weights for policy 0, policy_version 87680 (0.0008) [2023-10-10 08:19:46,783][52050] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179503104. Throughput: 0: 1688.1, 1: 1692.4. Samples: 44873122. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:19:46,784][52050] Avg episode reward: [(0, '25.080'), (1, '23.240')] [2023-10-10 08:19:50,627][53268] Updated weights for policy 1, policy_version 87620 (0.0009) [2023-10-10 08:19:50,774][53252] Updated weights for policy 0, policy_version 87690 (0.0009) [2023-10-10 08:19:50,989][53268] Updated weights for policy 1, policy_version 87630 (0.0008) [2023-10-10 08:19:51,129][53252] Updated weights for policy 0, policy_version 87700 (0.0009) [2023-10-10 08:19:51,353][53268] Updated weights for policy 1, policy_version 87640 (0.0007) [2023-10-10 08:19:51,510][53252] Updated weights for policy 0, policy_version 87710 (0.0008) [2023-10-10 08:19:51,783][52050] Fps is (10 sec: 19661.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179568640. Throughput: 0: 1680.9, 1: 1687.2. Samples: 44893562. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:19:51,784][52050] Avg episode reward: [(0, '25.270'), (1, '20.870')] [2023-10-10 08:19:55,480][53268] Updated weights for policy 1, policy_version 87650 (0.0007) [2023-10-10 08:19:55,499][53252] Updated weights for policy 0, policy_version 87720 (0.0007) [2023-10-10 08:19:55,851][53268] Updated weights for policy 1, policy_version 87660 (0.0008) [2023-10-10 08:19:55,859][53252] Updated weights for policy 0, policy_version 87730 (0.0007) [2023-10-10 08:19:56,210][53268] Updated weights for policy 1, policy_version 87670 (0.0007) [2023-10-10 08:19:56,227][53252] Updated weights for policy 0, policy_version 87740 (0.0007) [2023-10-10 08:19:56,573][53268] Updated weights for policy 1, policy_version 87680 (0.0008) [2023-10-10 08:19:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179634176. Throughput: 0: 1658.6, 1: 1669.2. Samples: 44912324. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:19:56,784][52050] Avg episode reward: [(0, '21.580'), (1, '22.500')] [2023-10-10 08:20:00,157][53252] Updated weights for policy 0, policy_version 87750 (0.0009) [2023-10-10 08:20:00,517][53252] Updated weights for policy 0, policy_version 87760 (0.0008) [2023-10-10 08:20:00,788][53268] Updated weights for policy 1, policy_version 87690 (0.0010) [2023-10-10 08:20:00,893][53252] Updated weights for policy 0, policy_version 87770 (0.0008) [2023-10-10 08:20:01,157][53268] Updated weights for policy 1, policy_version 87700 (0.0009) [2023-10-10 08:20:01,526][53268] Updated weights for policy 1, policy_version 87710 (0.0009) [2023-10-10 08:20:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179699712. Throughput: 0: 1683.5, 1: 1684.5. Samples: 44923550. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:01,784][52050] Avg episode reward: [(0, '20.790'), (1, '23.330')] [2023-10-10 08:20:04,970][53252] Updated weights for policy 0, policy_version 87780 (0.0009) [2023-10-10 08:20:05,362][53252] Updated weights for policy 0, policy_version 87790 (0.0008) [2023-10-10 08:20:05,507][53268] Updated weights for policy 1, policy_version 87720 (0.0009) [2023-10-10 08:20:05,730][53252] Updated weights for policy 0, policy_version 87800 (0.0008) [2023-10-10 08:20:05,882][53268] Updated weights for policy 1, policy_version 87730 (0.0007) [2023-10-10 08:20:06,256][53268] Updated weights for policy 1, policy_version 87740 (0.0010) [2023-10-10 08:20:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179765248. Throughput: 0: 1668.8, 1: 1690.1. Samples: 44943702. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:06,784][52050] Avg episode reward: [(0, '22.090'), (1, '23.480')] [2023-10-10 08:20:09,741][53252] Updated weights for policy 0, policy_version 87810 (0.0007) [2023-10-10 08:20:10,114][53252] Updated weights for policy 0, policy_version 87820 (0.0008) [2023-10-10 08:20:10,347][53268] Updated weights for policy 1, policy_version 87750 (0.0008) [2023-10-10 08:20:10,491][53252] Updated weights for policy 0, policy_version 87830 (0.0009) [2023-10-10 08:20:10,711][53268] Updated weights for policy 1, policy_version 87760 (0.0008) [2023-10-10 08:20:10,859][53252] Updated weights for policy 0, policy_version 87840 (0.0012) [2023-10-10 08:20:11,077][53268] Updated weights for policy 1, policy_version 87770 (0.0008) [2023-10-10 08:20:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 179830784. Throughput: 0: 1663.6, 1: 1662.0. Samples: 44962600. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:11,784][52050] Avg episode reward: [(0, '19.460'), (1, '22.760')] [2023-10-10 08:20:15,001][53252] Updated weights for policy 0, policy_version 87850 (0.0007) [2023-10-10 08:20:15,130][53268] Updated weights for policy 1, policy_version 87780 (0.0008) [2023-10-10 08:20:15,374][53252] Updated weights for policy 0, policy_version 87860 (0.0008) [2023-10-10 08:20:15,502][53268] Updated weights for policy 1, policy_version 87790 (0.0009) [2023-10-10 08:20:15,750][53252] Updated weights for policy 0, policy_version 87870 (0.0008) [2023-10-10 08:20:15,857][53268] Updated weights for policy 1, policy_version 87800 (0.0009) [2023-10-10 08:20:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 179896320. Throughput: 0: 1686.7, 1: 1686.8. Samples: 44974306. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:16,784][52050] Avg episode reward: [(0, '21.140'), (1, '21.500')] [2023-10-10 08:20:19,961][53252] Updated weights for policy 0, policy_version 87880 (0.0008) [2023-10-10 08:20:20,111][53268] Updated weights for policy 1, policy_version 87810 (0.0008) [2023-10-10 08:20:20,329][53252] Updated weights for policy 0, policy_version 87890 (0.0007) [2023-10-10 08:20:20,485][53268] Updated weights for policy 1, policy_version 87820 (0.0010) [2023-10-10 08:20:20,700][53252] Updated weights for policy 0, policy_version 87900 (0.0008) [2023-10-10 08:20:20,854][53268] Updated weights for policy 1, policy_version 87830 (0.0009) [2023-10-10 08:20:21,216][53268] Updated weights for policy 1, policy_version 87840 (0.0007) [2023-10-10 08:20:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 179961856. Throughput: 0: 1667.0, 1: 1678.6. Samples: 44993876. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:21,784][52050] Avg episode reward: [(0, '22.500'), (1, '22.180')] [2023-10-10 08:20:24,593][53252] Updated weights for policy 0, policy_version 87910 (0.0009) [2023-10-10 08:20:24,965][53252] Updated weights for policy 0, policy_version 87920 (0.0008) [2023-10-10 08:20:25,084][53268] Updated weights for policy 1, policy_version 87850 (0.0007) [2023-10-10 08:20:25,325][53252] Updated weights for policy 0, policy_version 87930 (0.0007) [2023-10-10 08:20:25,446][53268] Updated weights for policy 1, policy_version 87860 (0.0008) [2023-10-10 08:20:25,813][53268] Updated weights for policy 1, policy_version 87870 (0.0009) [2023-10-10 08:20:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 180027392. Throughput: 0: 1679.0, 1: 1664.9. Samples: 45013230. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:26,784][52050] Avg episode reward: [(0, '21.880'), (1, '21.780')] [2023-10-10 08:20:29,230][53252] Updated weights for policy 0, policy_version 87940 (0.0007) [2023-10-10 08:20:29,597][53252] Updated weights for policy 0, policy_version 87950 (0.0008) [2023-10-10 08:20:29,865][53268] Updated weights for policy 1, policy_version 87880 (0.0009) [2023-10-10 08:20:29,966][53252] Updated weights for policy 0, policy_version 87960 (0.0007) [2023-10-10 08:20:30,239][53268] Updated weights for policy 1, policy_version 87890 (0.0010) [2023-10-10 08:20:30,616][53268] Updated weights for policy 1, policy_version 87900 (0.0011) [2023-10-10 08:20:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 180092928. Throughput: 0: 1686.4, 1: 1678.6. Samples: 45024544. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) [2023-10-10 08:20:31,784][52050] Avg episode reward: [(0, '22.190'), (1, '21.890')] [2023-10-10 08:20:34,140][53252] Updated weights for policy 0, policy_version 87970 (0.0007) [2023-10-10 08:20:34,512][53252] Updated weights for policy 0, policy_version 87980 (0.0009) [2023-10-10 08:20:34,826][53268] Updated weights for policy 1, policy_version 87910 (0.0010) [2023-10-10 08:20:34,871][53252] Updated weights for policy 0, policy_version 87990 (0.0009) [2023-10-10 08:20:35,205][53268] Updated weights for policy 1, policy_version 87920 (0.0009) [2023-10-10 08:20:35,241][53252] Updated weights for policy 0, policy_version 88000 (0.0009) [2023-10-10 08:20:35,575][53268] Updated weights for policy 1, policy_version 87930 (0.0009) [2023-10-10 08:20:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 180158464. Throughput: 0: 1671.7, 1: 1668.3. Samples: 45043862. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:20:36,784][52050] Avg episode reward: [(0, '22.290'), (1, '22.510')] [2023-10-10 08:20:39,093][53252] Updated weights for policy 0, policy_version 88010 (0.0008) [2023-10-10 08:20:39,460][53252] Updated weights for policy 0, policy_version 88020 (0.0007) [2023-10-10 08:20:39,571][53268] Updated weights for policy 1, policy_version 87940 (0.0009) [2023-10-10 08:20:39,824][53252] Updated weights for policy 0, policy_version 88030 (0.0008) [2023-10-10 08:20:39,926][53268] Updated weights for policy 1, policy_version 87950 (0.0008) [2023-10-10 08:20:40,290][53268] Updated weights for policy 1, policy_version 87960 (0.0011) [2023-10-10 08:20:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 180224000. Throughput: 0: 1703.1, 1: 1668.0. Samples: 45064024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:20:41,784][52050] Avg episode reward: [(0, '21.180'), (1, '23.750')] [2023-10-10 08:20:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000087968_90079232.pth... [2023-10-10 08:20:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000088032_90144768.pth... [2023-10-10 08:20:41,821][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000086464_88539136.pth [2023-10-10 08:20:41,825][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000086400_88473600.pth [2023-10-10 08:20:43,803][53252] Updated weights for policy 0, policy_version 88040 (0.0007) [2023-10-10 08:20:44,179][53252] Updated weights for policy 0, policy_version 88050 (0.0009) [2023-10-10 08:20:44,370][53268] Updated weights for policy 1, policy_version 87970 (0.0010) [2023-10-10 08:20:44,539][53252] Updated weights for policy 0, policy_version 88060 (0.0009) [2023-10-10 08:20:44,732][53268] Updated weights for policy 1, policy_version 87980 (0.0009) [2023-10-10 08:20:45,102][53268] Updated weights for policy 1, policy_version 87990 (0.0007) [2023-10-10 08:20:45,462][53268] Updated weights for policy 1, policy_version 88000 (0.0007) [2023-10-10 08:20:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 180289536. Throughput: 0: 1683.4, 1: 1685.6. Samples: 45075154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:20:46,784][52050] Avg episode reward: [(0, '22.250'), (1, '22.490')] [2023-10-10 08:20:48,602][53252] Updated weights for policy 0, policy_version 88070 (0.0008) [2023-10-10 08:20:48,967][53252] Updated weights for policy 0, policy_version 88080 (0.0009) [2023-10-10 08:20:49,331][53252] Updated weights for policy 0, policy_version 88090 (0.0009) [2023-10-10 08:20:49,386][53268] Updated weights for policy 1, policy_version 88010 (0.0008) [2023-10-10 08:20:49,751][53268] Updated weights for policy 1, policy_version 88020 (0.0009) [2023-10-10 08:20:50,123][53268] Updated weights for policy 1, policy_version 88030 (0.0010) [2023-10-10 08:20:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 180355072. Throughput: 0: 1688.9, 1: 1660.4. Samples: 45094424. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:20:51,784][52050] Avg episode reward: [(0, '21.800'), (1, '22.070')] [2023-10-10 08:20:53,477][53252] Updated weights for policy 0, policy_version 88100 (0.0009) [2023-10-10 08:20:53,858][53252] Updated weights for policy 0, policy_version 88110 (0.0007) [2023-10-10 08:20:54,233][53252] Updated weights for policy 0, policy_version 88120 (0.0008) [2023-10-10 08:20:54,337][53268] Updated weights for policy 1, policy_version 88040 (0.0009) [2023-10-10 08:20:54,698][53268] Updated weights for policy 1, policy_version 88050 (0.0009) [2023-10-10 08:20:55,063][53268] Updated weights for policy 1, policy_version 88060 (0.0008) [2023-10-10 08:20:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 180420608. Throughput: 0: 1702.5, 1: 1681.8. Samples: 45114894. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:20:56,784][52050] Avg episode reward: [(0, '22.650'), (1, '22.630')] [2023-10-10 08:20:58,272][53252] Updated weights for policy 0, policy_version 88130 (0.0010) [2023-10-10 08:20:58,642][53252] Updated weights for policy 0, policy_version 88140 (0.0008) [2023-10-10 08:20:59,006][53252] Updated weights for policy 0, policy_version 88150 (0.0009) [2023-10-10 08:20:59,208][53268] Updated weights for policy 1, policy_version 88070 (0.0007) [2023-10-10 08:20:59,372][53252] Updated weights for policy 0, policy_version 88160 (0.0010) [2023-10-10 08:20:59,575][53268] Updated weights for policy 1, policy_version 88080 (0.0008) [2023-10-10 08:20:59,935][53268] Updated weights for policy 1, policy_version 88090 (0.0010) [2023-10-10 08:21:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 180486144. Throughput: 0: 1668.5, 1: 1678.0. Samples: 45124898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:01,784][52050] Avg episode reward: [(0, '21.830'), (1, '23.270')] [2023-10-10 08:21:03,462][53252] Updated weights for policy 0, policy_version 88170 (0.0010) [2023-10-10 08:21:03,837][53252] Updated weights for policy 0, policy_version 88180 (0.0011) [2023-10-10 08:21:03,894][53268] Updated weights for policy 1, policy_version 88100 (0.0007) [2023-10-10 08:21:04,211][53252] Updated weights for policy 0, policy_version 88190 (0.0008) [2023-10-10 08:21:04,255][53268] Updated weights for policy 1, policy_version 88110 (0.0008) [2023-10-10 08:21:04,626][53268] Updated weights for policy 1, policy_version 88120 (0.0009) [2023-10-10 08:21:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180551680. Throughput: 0: 1684.8, 1: 1662.1. Samples: 45144488. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:06,784][52050] Avg episode reward: [(0, '22.750'), (1, '22.560')] [2023-10-10 08:21:08,362][53252] Updated weights for policy 0, policy_version 88200 (0.0009) [2023-10-10 08:21:08,726][53252] Updated weights for policy 0, policy_version 88210 (0.0008) [2023-10-10 08:21:08,733][53268] Updated weights for policy 1, policy_version 88130 (0.0010) [2023-10-10 08:21:09,082][53268] Updated weights for policy 1, policy_version 88140 (0.0009) [2023-10-10 08:21:09,100][53252] Updated weights for policy 0, policy_version 88220 (0.0008) [2023-10-10 08:21:09,459][53268] Updated weights for policy 1, policy_version 88150 (0.0009) [2023-10-10 08:21:09,815][53268] Updated weights for policy 1, policy_version 88160 (0.0009) [2023-10-10 08:21:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 180617216. Throughput: 0: 1692.0, 1: 1687.4. Samples: 45165302. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:11,784][52050] Avg episode reward: [(0, '23.530'), (1, '21.520')] [2023-10-10 08:21:13,191][53252] Updated weights for policy 0, policy_version 88230 (0.0009) [2023-10-10 08:21:13,560][53252] Updated weights for policy 0, policy_version 88240 (0.0008) [2023-10-10 08:21:13,870][53268] Updated weights for policy 1, policy_version 88170 (0.0009) [2023-10-10 08:21:13,939][53252] Updated weights for policy 0, policy_version 88250 (0.0007) [2023-10-10 08:21:14,238][53268] Updated weights for policy 1, policy_version 88180 (0.0008) [2023-10-10 08:21:14,605][53268] Updated weights for policy 1, policy_version 88190 (0.0009) [2023-10-10 08:21:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 180682752. Throughput: 0: 1669.2, 1: 1676.3. Samples: 45175090. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:16,784][52050] Avg episode reward: [(0, '24.050'), (1, '22.610')] [2023-10-10 08:21:18,049][53252] Updated weights for policy 0, policy_version 88260 (0.0009) [2023-10-10 08:21:18,425][53252] Updated weights for policy 0, policy_version 88270 (0.0009) [2023-10-10 08:21:18,797][53268] Updated weights for policy 1, policy_version 88200 (0.0008) [2023-10-10 08:21:18,806][53252] Updated weights for policy 0, policy_version 88280 (0.0007) [2023-10-10 08:21:19,171][53268] Updated weights for policy 1, policy_version 88210 (0.0010) [2023-10-10 08:21:19,538][53268] Updated weights for policy 1, policy_version 88220 (0.0009) [2023-10-10 08:21:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180748288. Throughput: 0: 1684.3, 1: 1669.4. Samples: 45194776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:21,784][52050] Avg episode reward: [(0, '23.920'), (1, '21.580')] [2023-10-10 08:21:22,981][53252] Updated weights for policy 0, policy_version 88290 (0.0008) [2023-10-10 08:21:23,355][53252] Updated weights for policy 0, policy_version 88300 (0.0007) [2023-10-10 08:21:23,672][53268] Updated weights for policy 1, policy_version 88230 (0.0009) [2023-10-10 08:21:23,720][53252] Updated weights for policy 0, policy_version 88310 (0.0009) [2023-10-10 08:21:24,041][53268] Updated weights for policy 1, policy_version 88240 (0.0007) [2023-10-10 08:21:24,077][53252] Updated weights for policy 0, policy_version 88320 (0.0008) [2023-10-10 08:21:24,402][53268] Updated weights for policy 1, policy_version 88250 (0.0011) [2023-10-10 08:21:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180813824. Throughput: 0: 1681.1, 1: 1685.2. Samples: 45215508. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:26,784][52050] Avg episode reward: [(0, '25.290'), (1, '21.990')] [2023-10-10 08:21:28,113][53252] Updated weights for policy 0, policy_version 88330 (0.0008) [2023-10-10 08:21:28,437][53268] Updated weights for policy 1, policy_version 88260 (0.0008) [2023-10-10 08:21:28,484][53252] Updated weights for policy 0, policy_version 88340 (0.0008) [2023-10-10 08:21:28,815][53268] Updated weights for policy 1, policy_version 88270 (0.0009) [2023-10-10 08:21:28,853][53252] Updated weights for policy 0, policy_version 88350 (0.0008) [2023-10-10 08:21:29,178][53268] Updated weights for policy 1, policy_version 88280 (0.0011) [2023-10-10 08:21:31,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180879360. Throughput: 0: 1672.7, 1: 1660.0. Samples: 45225124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:21:31,784][52050] Avg episode reward: [(0, '23.230'), (1, '22.590')] [2023-10-10 08:21:32,991][53252] Updated weights for policy 0, policy_version 88360 (0.0008) [2023-10-10 08:21:33,140][53268] Updated weights for policy 1, policy_version 88290 (0.0009) [2023-10-10 08:21:33,365][53252] Updated weights for policy 0, policy_version 88370 (0.0007) [2023-10-10 08:21:33,502][53268] Updated weights for policy 1, policy_version 88300 (0.0009) [2023-10-10 08:21:33,737][53252] Updated weights for policy 0, policy_version 88380 (0.0008) [2023-10-10 08:21:33,871][53268] Updated weights for policy 1, policy_version 88310 (0.0008) [2023-10-10 08:21:34,239][53268] Updated weights for policy 1, policy_version 88320 (0.0008) [2023-10-10 08:21:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 180944896. Throughput: 0: 1677.7, 1: 1676.1. Samples: 45245346. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:21:36,784][52050] Avg episode reward: [(0, '25.140'), (1, '22.950')] [2023-10-10 08:21:37,775][53252] Updated weights for policy 0, policy_version 88390 (0.0007) [2023-10-10 08:21:38,137][53252] Updated weights for policy 0, policy_version 88400 (0.0008) [2023-10-10 08:21:38,412][53268] Updated weights for policy 1, policy_version 88330 (0.0010) [2023-10-10 08:21:38,517][53252] Updated weights for policy 0, policy_version 88410 (0.0007) [2023-10-10 08:21:38,776][53268] Updated weights for policy 1, policy_version 88340 (0.0009) [2023-10-10 08:21:39,143][53268] Updated weights for policy 1, policy_version 88350 (0.0008) [2023-10-10 08:21:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181010432. Throughput: 0: 1674.2, 1: 1679.3. Samples: 45265802. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:21:41,784][52050] Avg episode reward: [(0, '24.180'), (1, '22.530')] [2023-10-10 08:21:42,720][53252] Updated weights for policy 0, policy_version 88420 (0.0008) [2023-10-10 08:21:43,096][53252] Updated weights for policy 0, policy_version 88430 (0.0009) [2023-10-10 08:21:43,361][53268] Updated weights for policy 1, policy_version 88360 (0.0009) [2023-10-10 08:21:43,473][53252] Updated weights for policy 0, policy_version 88440 (0.0009) [2023-10-10 08:21:43,724][53268] Updated weights for policy 1, policy_version 88370 (0.0009) [2023-10-10 08:21:44,092][53268] Updated weights for policy 1, policy_version 88380 (0.0008) [2023-10-10 08:21:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 181075968. Throughput: 0: 1671.2, 1: 1660.8. Samples: 45274842. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:21:46,784][52050] Avg episode reward: [(0, '23.680'), (1, '22.910')] [2023-10-10 08:21:47,401][53252] Updated weights for policy 0, policy_version 88450 (0.0007) [2023-10-10 08:21:47,772][53252] Updated weights for policy 0, policy_version 88460 (0.0008) [2023-10-10 08:21:48,011][53268] Updated weights for policy 1, policy_version 88390 (0.0009) [2023-10-10 08:21:48,146][53252] Updated weights for policy 0, policy_version 88470 (0.0008) [2023-10-10 08:21:48,376][53268] Updated weights for policy 1, policy_version 88400 (0.0009) [2023-10-10 08:21:48,515][53252] Updated weights for policy 0, policy_version 88480 (0.0008) [2023-10-10 08:21:48,747][53268] Updated weights for policy 1, policy_version 88410 (0.0009) [2023-10-10 08:21:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181141504. Throughput: 0: 1675.0, 1: 1676.0. Samples: 45295282. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:21:51,784][52050] Avg episode reward: [(0, '23.070'), (1, '22.070')] [2023-10-10 08:21:52,616][53252] Updated weights for policy 0, policy_version 88490 (0.0010) [2023-10-10 08:21:52,963][53268] Updated weights for policy 1, policy_version 88420 (0.0009) [2023-10-10 08:21:52,986][53252] Updated weights for policy 0, policy_version 88500 (0.0010) [2023-10-10 08:21:53,334][53268] Updated weights for policy 1, policy_version 88430 (0.0011) [2023-10-10 08:21:53,371][53252] Updated weights for policy 0, policy_version 88510 (0.0010) [2023-10-10 08:21:53,708][53268] Updated weights for policy 1, policy_version 88440 (0.0010) [2023-10-10 08:21:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181207040. Throughput: 0: 1681.5, 1: 1669.4. Samples: 45316090. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:21:56,784][52050] Avg episode reward: [(0, '23.260'), (1, '22.830')] [2023-10-10 08:21:57,346][53252] Updated weights for policy 0, policy_version 88520 (0.0010) [2023-10-10 08:21:57,686][53268] Updated weights for policy 1, policy_version 88450 (0.0010) [2023-10-10 08:21:57,719][53252] Updated weights for policy 0, policy_version 88530 (0.0008) [2023-10-10 08:21:58,054][53268] Updated weights for policy 1, policy_version 88460 (0.0008) [2023-10-10 08:21:58,091][53252] Updated weights for policy 0, policy_version 88540 (0.0008) [2023-10-10 08:21:58,418][53268] Updated weights for policy 1, policy_version 88470 (0.0009) [2023-10-10 08:21:58,794][53268] Updated weights for policy 1, policy_version 88480 (0.0008) [2023-10-10 08:22:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 181272576. Throughput: 0: 1680.4, 1: 1657.5. Samples: 45325294. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:01,784][52050] Avg episode reward: [(0, '22.070'), (1, '22.430')] [2023-10-10 08:22:01,933][53252] Updated weights for policy 0, policy_version 88550 (0.0009) [2023-10-10 08:22:02,309][53252] Updated weights for policy 0, policy_version 88560 (0.0009) [2023-10-10 08:22:02,673][53252] Updated weights for policy 0, policy_version 88570 (0.0011) [2023-10-10 08:22:02,849][53268] Updated weights for policy 1, policy_version 88490 (0.0008) [2023-10-10 08:22:03,207][53268] Updated weights for policy 1, policy_version 88500 (0.0010) [2023-10-10 08:22:03,578][53268] Updated weights for policy 1, policy_version 88510 (0.0009) [2023-10-10 08:22:06,703][53252] Updated weights for policy 0, policy_version 88580 (0.0009) [2023-10-10 08:22:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181338112. Throughput: 0: 1683.6, 1: 1678.9. Samples: 45346088. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:06,784][52050] Avg episode reward: [(0, '20.930'), (1, '23.360')] [2023-10-10 08:22:07,085][53252] Updated weights for policy 0, policy_version 88590 (0.0010) [2023-10-10 08:22:07,457][53252] Updated weights for policy 0, policy_version 88600 (0.0007) [2023-10-10 08:22:07,730][53268] Updated weights for policy 1, policy_version 88520 (0.0008) [2023-10-10 08:22:08,103][53268] Updated weights for policy 1, policy_version 88530 (0.0011) [2023-10-10 08:22:08,473][53268] Updated weights for policy 1, policy_version 88540 (0.0011) [2023-10-10 08:22:11,455][53252] Updated weights for policy 0, policy_version 88610 (0.0008) [2023-10-10 08:22:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181403648. Throughput: 0: 1684.5, 1: 1680.8. Samples: 45366944. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:11,784][52050] Avg episode reward: [(0, '22.500'), (1, '21.480')] [2023-10-10 08:22:11,829][53252] Updated weights for policy 0, policy_version 88620 (0.0007) [2023-10-10 08:22:12,187][53252] Updated weights for policy 0, policy_version 88630 (0.0008) [2023-10-10 08:22:12,533][53268] Updated weights for policy 1, policy_version 88550 (0.0008) [2023-10-10 08:22:12,561][53252] Updated weights for policy 0, policy_version 88640 (0.0008) [2023-10-10 08:22:12,892][53268] Updated weights for policy 1, policy_version 88560 (0.0008) [2023-10-10 08:22:13,259][53268] Updated weights for policy 1, policy_version 88570 (0.0007) [2023-10-10 08:22:16,431][53252] Updated weights for policy 0, policy_version 88650 (0.0009) [2023-10-10 08:22:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181469184. Throughput: 0: 1685.9, 1: 1671.7. Samples: 45376218. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:16,784][52050] Avg episode reward: [(0, '22.930'), (1, '22.200')] [2023-10-10 08:22:16,809][53252] Updated weights for policy 0, policy_version 88660 (0.0008) [2023-10-10 08:22:17,163][53252] Updated weights for policy 0, policy_version 88670 (0.0008) [2023-10-10 08:22:17,386][53268] Updated weights for policy 1, policy_version 88580 (0.0008) [2023-10-10 08:22:17,758][53268] Updated weights for policy 1, policy_version 88590 (0.0008) [2023-10-10 08:22:18,116][53268] Updated weights for policy 1, policy_version 88600 (0.0009) [2023-10-10 08:22:21,327][53252] Updated weights for policy 0, policy_version 88680 (0.0009) [2023-10-10 08:22:21,701][53252] Updated weights for policy 0, policy_version 88690 (0.0010) [2023-10-10 08:22:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 181534720. Throughput: 0: 1690.2, 1: 1674.2. Samples: 45396746. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:21,785][52050] Avg episode reward: [(0, '19.950'), (1, '22.740')] [2023-10-10 08:22:22,054][53268] Updated weights for policy 1, policy_version 88610 (0.0009) [2023-10-10 08:22:22,078][53252] Updated weights for policy 0, policy_version 88700 (0.0008) [2023-10-10 08:22:22,416][53268] Updated weights for policy 1, policy_version 88620 (0.0008) [2023-10-10 08:22:22,782][53268] Updated weights for policy 1, policy_version 88630 (0.0008) [2023-10-10 08:22:23,148][53268] Updated weights for policy 1, policy_version 88640 (0.0008) [2023-10-10 08:22:26,113][53252] Updated weights for policy 0, policy_version 88710 (0.0008) [2023-10-10 08:22:26,489][53252] Updated weights for policy 0, policy_version 88720 (0.0008) [2023-10-10 08:22:26,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181600256. Throughput: 0: 1683.3, 1: 1677.4. Samples: 45417034. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:26,784][52050] Avg episode reward: [(0, '21.350'), (1, '22.950')] [2023-10-10 08:22:26,872][53252] Updated weights for policy 0, policy_version 88730 (0.0008) [2023-10-10 08:22:27,267][53268] Updated weights for policy 1, policy_version 88650 (0.0009) [2023-10-10 08:22:27,642][53268] Updated weights for policy 1, policy_version 88660 (0.0011) [2023-10-10 08:22:28,006][53268] Updated weights for policy 1, policy_version 88670 (0.0009) [2023-10-10 08:22:31,091][53252] Updated weights for policy 0, policy_version 88740 (0.0009) [2023-10-10 08:22:31,472][53252] Updated weights for policy 0, policy_version 88750 (0.0010) [2023-10-10 08:22:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181665792. Throughput: 0: 1698.0, 1: 1676.8. Samples: 45426706. Policy #0 lag: (min: 31.0, avg: 33.1, max: 61.0) [2023-10-10 08:22:31,784][52050] Avg episode reward: [(0, '24.110'), (1, '23.520')] [2023-10-10 08:22:31,831][53252] Updated weights for policy 0, policy_version 88760 (0.0007) [2023-10-10 08:22:32,111][53268] Updated weights for policy 1, policy_version 88680 (0.0007) [2023-10-10 08:22:32,487][53268] Updated weights for policy 1, policy_version 88690 (0.0007) [2023-10-10 08:22:32,853][53268] Updated weights for policy 1, policy_version 88700 (0.0008) [2023-10-10 08:22:35,981][53252] Updated weights for policy 0, policy_version 88770 (0.0007) [2023-10-10 08:22:36,343][53252] Updated weights for policy 0, policy_version 88780 (0.0010) [2023-10-10 08:22:36,718][53252] Updated weights for policy 0, policy_version 88790 (0.0009) [2023-10-10 08:22:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 181731328. Throughput: 0: 1693.6, 1: 1681.0. Samples: 45447138. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:22:36,784][52050] Avg episode reward: [(0, '22.530'), (1, '22.330')] [2023-10-10 08:22:36,889][53268] Updated weights for policy 1, policy_version 88710 (0.0008) [2023-10-10 08:22:37,090][53252] Updated weights for policy 0, policy_version 88800 (0.0007) [2023-10-10 08:22:37,252][53268] Updated weights for policy 1, policy_version 88720 (0.0009) [2023-10-10 08:22:37,619][53268] Updated weights for policy 1, policy_version 88730 (0.0011) [2023-10-10 08:22:41,167][53252] Updated weights for policy 0, policy_version 88810 (0.0008) [2023-10-10 08:22:41,536][53252] Updated weights for policy 0, policy_version 88820 (0.0007) [2023-10-10 08:22:41,784][52050] Fps is (10 sec: 13106.5, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 181796864. Throughput: 0: 1673.9, 1: 1677.5. Samples: 45466906. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:22:41,785][52050] Avg episode reward: [(0, '23.710'), (1, '21.390')] [2023-10-10 08:22:41,860][53268] Updated weights for policy 1, policy_version 88740 (0.0010) [2023-10-10 08:22:41,914][53252] Updated weights for policy 0, policy_version 88830 (0.0008) [2023-10-10 08:22:41,982][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000088832_90963968.pth... [2023-10-10 08:22:42,010][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000087232_89325568.pth [2023-10-10 08:22:42,237][53268] Updated weights for policy 1, policy_version 88750 (0.0008) [2023-10-10 08:22:42,591][53268] Updated weights for policy 1, policy_version 88760 (0.0009) [2023-10-10 08:22:42,886][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000088768_90898432.pth... 
[2023-10-10 08:22:42,923][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000087168_89260032.pth [2023-10-10 08:22:45,778][53252] Updated weights for policy 0, policy_version 88840 (0.0008) [2023-10-10 08:22:46,153][53252] Updated weights for policy 0, policy_version 88850 (0.0010) [2023-10-10 08:22:46,530][53252] Updated weights for policy 0, policy_version 88860 (0.0008) [2023-10-10 08:22:46,706][53268] Updated weights for policy 1, policy_version 88770 (0.0009) [2023-10-10 08:22:46,783][52050] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181895168. Throughput: 0: 1690.1, 1: 1677.4. Samples: 45476832. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:22:46,784][52050] Avg episode reward: [(0, '24.930'), (1, '21.000')] [2023-10-10 08:22:47,075][53268] Updated weights for policy 1, policy_version 88780 (0.0011) [2023-10-10 08:22:47,439][53268] Updated weights for policy 1, policy_version 88790 (0.0011) [2023-10-10 08:22:47,807][53268] Updated weights for policy 1, policy_version 88800 (0.0010) [2023-10-10 08:22:50,657][53252] Updated weights for policy 0, policy_version 88870 (0.0007) [2023-10-10 08:22:51,022][53252] Updated weights for policy 0, policy_version 88880 (0.0008) [2023-10-10 08:22:51,386][53252] Updated weights for policy 0, policy_version 88890 (0.0007) [2023-10-10 08:22:51,783][52050] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 181960704. Throughput: 0: 1691.6, 1: 1671.3. Samples: 45497418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:22:51,785][52050] Avg episode reward: [(0, '23.740'), (1, '21.450')] [2023-10-10 08:22:51,830][53268] Updated weights for policy 1, policy_version 88810 (0.0009) [2023-10-10 08:22:52,205][53268] Updated weights for policy 1, policy_version 88820 (0.0008) [2023-10-10 08:22:52,573][53268] Updated weights for policy 1, policy_version 88830 (0.0008) [2023-10-10 08:22:55,353][53252] Updated weights for policy 0, policy_version 88900 (0.0007) [2023-10-10 08:22:55,716][53252] Updated weights for policy 0, policy_version 88910 (0.0009) [2023-10-10 08:22:56,089][53252] Updated weights for policy 0, policy_version 88920 (0.0008) [2023-10-10 08:22:56,599][53268] Updated weights for policy 1, policy_version 88840 (0.0008) [2023-10-10 08:22:56,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 182026240. Throughput: 0: 1661.9, 1: 1671.5. Samples: 45516952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:22:56,785][52050] Avg episode reward: [(0, '22.000'), (1, '20.500')] [2023-10-10 08:22:56,976][53268] Updated weights for policy 1, policy_version 88850 (0.0008) [2023-10-10 08:22:57,341][53268] Updated weights for policy 1, policy_version 88860 (0.0009) [2023-10-10 08:23:00,269][53252] Updated weights for policy 0, policy_version 88930 (0.0009) [2023-10-10 08:23:00,643][53252] Updated weights for policy 0, policy_version 88940 (0.0011) [2023-10-10 08:23:01,004][53252] Updated weights for policy 0, policy_version 88950 (0.0009) [2023-10-10 08:23:01,380][53252] Updated weights for policy 0, policy_version 88960 (0.0009) [2023-10-10 08:23:01,591][53268] Updated weights for policy 1, policy_version 88870 (0.0007) [2023-10-10 08:23:01,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 182091776. Throughput: 0: 1683.5, 1: 1669.6. Samples: 45527106. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:01,784][52050] Avg episode reward: [(0, '21.380'), (1, '21.740')] [2023-10-10 08:23:01,960][53268] Updated weights for policy 1, policy_version 88880 (0.0008) [2023-10-10 08:23:02,323][53268] Updated weights for policy 1, policy_version 88890 (0.0010) [2023-10-10 08:23:05,402][53252] Updated weights for policy 0, policy_version 88970 (0.0008) [2023-10-10 08:23:05,772][53252] Updated weights for policy 0, policy_version 88980 (0.0007) [2023-10-10 08:23:06,143][53252] Updated weights for policy 0, policy_version 88990 (0.0008) [2023-10-10 08:23:06,612][53268] Updated weights for policy 1, policy_version 88900 (0.0008) [2023-10-10 08:23:06,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182157312. Throughput: 0: 1682.1, 1: 1669.4. Samples: 45547564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:06,784][52050] Avg episode reward: [(0, '19.680'), (1, '21.230')] [2023-10-10 08:23:06,975][53268] Updated weights for policy 1, policy_version 88910 (0.0007) [2023-10-10 08:23:07,346][53268] Updated weights for policy 1, policy_version 88920 (0.0007) [2023-10-10 08:23:10,134][53252] Updated weights for policy 0, policy_version 89000 (0.0008) [2023-10-10 08:23:10,492][53252] Updated weights for policy 0, policy_version 89010 (0.0010) [2023-10-10 08:23:10,873][53252] Updated weights for policy 0, policy_version 89020 (0.0010) [2023-10-10 08:23:11,413][53268] Updated weights for policy 1, policy_version 88930 (0.0008) [2023-10-10 08:23:11,776][53268] Updated weights for policy 1, policy_version 88940 (0.0007) [2023-10-10 08:23:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182222848. Throughput: 0: 1672.2, 1: 1668.1. Samples: 45567348. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:11,784][52050] Avg episode reward: [(0, '19.640'), (1, '21.150')] [2023-10-10 08:23:12,140][53268] Updated weights for policy 1, policy_version 88950 (0.0007) [2023-10-10 08:23:12,511][53268] Updated weights for policy 1, policy_version 88960 (0.0007) [2023-10-10 08:23:14,894][53252] Updated weights for policy 0, policy_version 89030 (0.0008) [2023-10-10 08:23:15,271][53252] Updated weights for policy 0, policy_version 89040 (0.0007) [2023-10-10 08:23:15,639][53252] Updated weights for policy 0, policy_version 89050 (0.0007) [2023-10-10 08:23:16,500][53268] Updated weights for policy 1, policy_version 88970 (0.0009) [2023-10-10 08:23:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182288384. Throughput: 0: 1691.4, 1: 1667.9. Samples: 45577878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:16,784][52050] Avg episode reward: [(0, '22.380'), (1, '20.980')] [2023-10-10 08:23:16,872][53268] Updated weights for policy 1, policy_version 88980 (0.0009) [2023-10-10 08:23:17,232][53268] Updated weights for policy 1, policy_version 88990 (0.0008) [2023-10-10 08:23:19,598][53252] Updated weights for policy 0, policy_version 89060 (0.0007) [2023-10-10 08:23:19,993][53252] Updated weights for policy 0, policy_version 89070 (0.0008) [2023-10-10 08:23:20,352][53252] Updated weights for policy 0, policy_version 89080 (0.0009) [2023-10-10 08:23:21,430][53268] Updated weights for policy 1, policy_version 89000 (0.0011) [2023-10-10 08:23:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 182353920. Throughput: 0: 1675.9, 1: 1666.8. Samples: 45597560. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:21,784][52050] Avg episode reward: [(0, '23.500'), (1, '23.340')] [2023-10-10 08:23:21,802][53268] Updated weights for policy 1, policy_version 89010 (0.0008) [2023-10-10 08:23:22,165][53268] Updated weights for policy 1, policy_version 89020 (0.0009) [2023-10-10 08:23:24,503][53252] Updated weights for policy 0, policy_version 89090 (0.0008) [2023-10-10 08:23:24,869][53252] Updated weights for policy 0, policy_version 89100 (0.0010) [2023-10-10 08:23:25,237][53252] Updated weights for policy 0, policy_version 89110 (0.0010) [2023-10-10 08:23:25,604][53252] Updated weights for policy 0, policy_version 89120 (0.0010) [2023-10-10 08:23:25,964][53268] Updated weights for policy 1, policy_version 89030 (0.0010) [2023-10-10 08:23:26,338][53268] Updated weights for policy 1, policy_version 89040 (0.0010) [2023-10-10 08:23:26,691][53268] Updated weights for policy 1, policy_version 89050 (0.0011) [2023-10-10 08:23:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182419456. Throughput: 0: 1683.4, 1: 1661.8. Samples: 45617440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:26,784][52050] Avg episode reward: [(0, '23.950'), (1, '22.280')] [2023-10-10 08:23:29,698][53252] Updated weights for policy 0, policy_version 89130 (0.0008) [2023-10-10 08:23:30,056][53252] Updated weights for policy 0, policy_version 89140 (0.0009) [2023-10-10 08:23:30,431][53252] Updated weights for policy 0, policy_version 89150 (0.0008) [2023-10-10 08:23:30,969][53268] Updated weights for policy 1, policy_version 89060 (0.0011) [2023-10-10 08:23:31,334][53268] Updated weights for policy 1, policy_version 89070 (0.0008) [2023-10-10 08:23:31,707][53268] Updated weights for policy 1, policy_version 89080 (0.0007) [2023-10-10 08:23:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182484992. Throughput: 0: 1691.5, 1: 1666.8. 
Samples: 45627954. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:31,784][52050] Avg episode reward: [(0, '25.050'), (1, '23.000')] [2023-10-10 08:23:34,588][53252] Updated weights for policy 0, policy_version 89160 (0.0008) [2023-10-10 08:23:34,956][53252] Updated weights for policy 0, policy_version 89170 (0.0010) [2023-10-10 08:23:35,324][53252] Updated weights for policy 0, policy_version 89180 (0.0010) [2023-10-10 08:23:35,786][53268] Updated weights for policy 1, policy_version 89090 (0.0007) [2023-10-10 08:23:36,151][53268] Updated weights for policy 1, policy_version 89100 (0.0008) [2023-10-10 08:23:36,520][53268] Updated weights for policy 1, policy_version 89110 (0.0009) [2023-10-10 08:23:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182550528. Throughput: 0: 1665.2, 1: 1670.0. Samples: 45647502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:36,784][52050] Avg episode reward: [(0, '22.200'), (1, '22.650')] [2023-10-10 08:23:36,885][53268] Updated weights for policy 1, policy_version 89120 (0.0009) [2023-10-10 08:23:39,327][53252] Updated weights for policy 0, policy_version 89190 (0.0010) [2023-10-10 08:23:39,691][53252] Updated weights for policy 0, policy_version 89200 (0.0009) [2023-10-10 08:23:40,061][53252] Updated weights for policy 0, policy_version 89210 (0.0008) [2023-10-10 08:23:40,960][53268] Updated weights for policy 1, policy_version 89130 (0.0011) [2023-10-10 08:23:41,320][53268] Updated weights for policy 1, policy_version 89140 (0.0010) [2023-10-10 08:23:41,690][53268] Updated weights for policy 1, policy_version 89150 (0.0011) [2023-10-10 08:23:41,783][52050] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 182648832. Throughput: 0: 1688.9, 1: 1657.4. Samples: 45667536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:41,785][52050] Avg episode reward: [(0, '22.320'), (1, '22.730')] [2023-10-10 08:23:44,079][53252] Updated weights for policy 0, policy_version 89220 (0.0007) [2023-10-10 08:23:44,457][53252] Updated weights for policy 0, policy_version 89230 (0.0008) [2023-10-10 08:23:44,818][53252] Updated weights for policy 0, policy_version 89240 (0.0011) [2023-10-10 08:23:45,656][53268] Updated weights for policy 1, policy_version 89160 (0.0011) [2023-10-10 08:23:46,024][53268] Updated weights for policy 1, policy_version 89170 (0.0007) [2023-10-10 08:23:46,390][53268] Updated weights for policy 1, policy_version 89180 (0.0009) [2023-10-10 08:23:46,783][52050] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182714368. Throughput: 0: 1681.1, 1: 1671.8. Samples: 45677990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:46,784][52050] Avg episode reward: [(0, '23.080'), (1, '21.480')] [2023-10-10 08:23:48,911][53252] Updated weights for policy 0, policy_version 89250 (0.0009) [2023-10-10 08:23:49,284][53252] Updated weights for policy 0, policy_version 89260 (0.0007) [2023-10-10 08:23:49,662][53252] Updated weights for policy 0, policy_version 89270 (0.0007) [2023-10-10 08:23:50,029][53252] Updated weights for policy 0, policy_version 89280 (0.0008) [2023-10-10 08:23:50,639][53268] Updated weights for policy 1, policy_version 89190 (0.0008) [2023-10-10 08:23:50,995][53268] Updated weights for policy 1, policy_version 89200 (0.0009) [2023-10-10 08:23:51,367][53268] Updated weights for policy 1, policy_version 89210 (0.0009) [2023-10-10 08:23:51,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 182779904. Throughput: 0: 1662.8, 1: 1679.0. Samples: 45697942. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:51,784][52050] Avg episode reward: [(0, '22.150'), (1, '22.310')] [2023-10-10 08:23:54,146][53252] Updated weights for policy 0, policy_version 89290 (0.0011) [2023-10-10 08:23:54,516][53252] Updated weights for policy 0, policy_version 89300 (0.0010) [2023-10-10 08:23:54,894][53252] Updated weights for policy 0, policy_version 89310 (0.0011) [2023-10-10 08:23:55,608][53268] Updated weights for policy 1, policy_version 89220 (0.0007) [2023-10-10 08:23:55,973][53268] Updated weights for policy 1, policy_version 89230 (0.0007) [2023-10-10 08:23:56,350][53268] Updated weights for policy 1, policy_version 89240 (0.0007) [2023-10-10 08:23:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 182845440. Throughput: 0: 1683.6, 1: 1660.1. Samples: 45717814. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:23:56,784][52050] Avg episode reward: [(0, '21.830'), (1, '21.710')] [2023-10-10 08:23:58,983][53252] Updated weights for policy 0, policy_version 89320 (0.0007) [2023-10-10 08:23:59,352][53252] Updated weights for policy 0, policy_version 89330 (0.0009) [2023-10-10 08:23:59,724][53252] Updated weights for policy 0, policy_version 89340 (0.0008) [2023-10-10 08:24:00,372][53268] Updated weights for policy 1, policy_version 89250 (0.0008) [2023-10-10 08:24:00,737][53268] Updated weights for policy 1, policy_version 89260 (0.0010) [2023-10-10 08:24:01,108][53268] Updated weights for policy 1, policy_version 89270 (0.0007) [2023-10-10 08:24:01,464][53268] Updated weights for policy 1, policy_version 89280 (0.0007) [2023-10-10 08:24:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182910976. Throughput: 0: 1666.1, 1: 1675.9. Samples: 45728264. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:24:01,784][52050] Avg episode reward: [(0, '23.600'), (1, '22.570')] [2023-10-10 08:24:03,658][53252] Updated weights for policy 0, policy_version 89350 (0.0008) [2023-10-10 08:24:04,023][53252] Updated weights for policy 0, policy_version 89360 (0.0010) [2023-10-10 08:24:04,400][53252] Updated weights for policy 0, policy_version 89370 (0.0007) [2023-10-10 08:24:05,477][53268] Updated weights for policy 1, policy_version 89290 (0.0009) [2023-10-10 08:24:05,838][53268] Updated weights for policy 1, policy_version 89300 (0.0009) [2023-10-10 08:24:06,209][53268] Updated weights for policy 1, policy_version 89310 (0.0009) [2023-10-10 08:24:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182976512. Throughput: 0: 1675.8, 1: 1678.5. Samples: 45748504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:24:06,784][52050] Avg episode reward: [(0, '22.760'), (1, '23.920')] [2023-10-10 08:24:08,538][53252] Updated weights for policy 0, policy_version 89380 (0.0009) [2023-10-10 08:24:08,924][53252] Updated weights for policy 0, policy_version 89390 (0.0007) [2023-10-10 08:24:09,294][53252] Updated weights for policy 0, policy_version 89400 (0.0009) [2023-10-10 08:24:10,493][53268] Updated weights for policy 1, policy_version 89320 (0.0008) [2023-10-10 08:24:10,870][53268] Updated weights for policy 1, policy_version 89330 (0.0009) [2023-10-10 08:24:11,232][53268] Updated weights for policy 1, policy_version 89340 (0.0007) [2023-10-10 08:24:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 183042048. Throughput: 0: 1685.3, 1: 1666.9. Samples: 45768288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:24:11,784][52050] Avg episode reward: [(0, '20.970'), (1, '22.770')] [2023-10-10 08:24:13,083][53252] Updated weights for policy 0, policy_version 89410 (0.0008) [2023-10-10 08:24:13,460][53252] Updated weights for policy 0, policy_version 89420 (0.0009) [2023-10-10 08:24:13,838][53252] Updated weights for policy 0, policy_version 89430 (0.0009) [2023-10-10 08:24:14,207][53252] Updated weights for policy 0, policy_version 89440 (0.0009) [2023-10-10 08:24:15,148][53268] Updated weights for policy 1, policy_version 89350 (0.0009) [2023-10-10 08:24:15,517][53268] Updated weights for policy 1, policy_version 89360 (0.0010) [2023-10-10 08:24:15,872][53268] Updated weights for policy 1, policy_version 89370 (0.0008) [2023-10-10 08:24:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 183107584. Throughput: 0: 1660.9, 1: 1686.0. Samples: 45778562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:24:16,784][52050] Avg episode reward: [(0, '21.960'), (1, '23.750')] [2023-10-10 08:24:18,259][53252] Updated weights for policy 0, policy_version 89450 (0.0007) [2023-10-10 08:24:18,632][53252] Updated weights for policy 0, policy_version 89460 (0.0009) [2023-10-10 08:24:19,001][53252] Updated weights for policy 0, policy_version 89470 (0.0010) [2023-10-10 08:24:20,023][53268] Updated weights for policy 1, policy_version 89380 (0.0010) [2023-10-10 08:24:20,385][53268] Updated weights for policy 1, policy_version 89390 (0.0011) [2023-10-10 08:24:20,752][53268] Updated weights for policy 1, policy_version 89400 (0.0010) [2023-10-10 08:24:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 183173120. Throughput: 0: 1687.1, 1: 1676.1. Samples: 45798846. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:24:21,784][52050] Avg episode reward: [(0, '23.300'), (1, '23.610')] [2023-10-10 08:24:22,978][53252] Updated weights for policy 0, policy_version 89480 (0.0008) [2023-10-10 08:24:23,354][53252] Updated weights for policy 0, policy_version 89490 (0.0007) [2023-10-10 08:24:23,725][53252] Updated weights for policy 0, policy_version 89500 (0.0008) [2023-10-10 08:24:24,874][53268] Updated weights for policy 1, policy_version 89410 (0.0008) [2023-10-10 08:24:25,250][53268] Updated weights for policy 1, policy_version 89420 (0.0008) [2023-10-10 08:24:25,621][53268] Updated weights for policy 1, policy_version 89430 (0.0009) [2023-10-10 08:24:25,985][53268] Updated weights for policy 1, policy_version 89440 (0.0011) [2023-10-10 08:24:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 183238656. Throughput: 0: 1695.3, 1: 1666.0. Samples: 45818792. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:24:26,784][52050] Avg episode reward: [(0, '22.790'), (1, '22.080')] [2023-10-10 08:24:27,710][53252] Updated weights for policy 0, policy_version 89510 (0.0008) [2023-10-10 08:24:28,084][53252] Updated weights for policy 0, policy_version 89520 (0.0007) [2023-10-10 08:24:28,456][53252] Updated weights for policy 0, policy_version 89530 (0.0008) [2023-10-10 08:24:30,018][53268] Updated weights for policy 1, policy_version 89450 (0.0009) [2023-10-10 08:24:30,387][53268] Updated weights for policy 1, policy_version 89460 (0.0009) [2023-10-10 08:24:30,765][53268] Updated weights for policy 1, policy_version 89470 (0.0008) [2023-10-10 08:24:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 183304192. Throughput: 0: 1677.7, 1: 1684.7. Samples: 45829294. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:24:31,784][52050] Avg episode reward: [(0, '22.590'), (1, '22.680')] [2023-10-10 08:24:32,480][53252] Updated weights for policy 0, policy_version 89540 (0.0010) [2023-10-10 08:24:32,850][53252] Updated weights for policy 0, policy_version 89550 (0.0009) [2023-10-10 08:24:33,219][53252] Updated weights for policy 0, policy_version 89560 (0.0008) [2023-10-10 08:24:34,803][53268] Updated weights for policy 1, policy_version 89480 (0.0009) [2023-10-10 08:24:35,178][53268] Updated weights for policy 1, policy_version 89490 (0.0009) [2023-10-10 08:24:35,544][53268] Updated weights for policy 1, policy_version 89500 (0.0008) [2023-10-10 08:24:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 183369728. Throughput: 0: 1700.6, 1: 1666.3. Samples: 45849450. Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:24:36,784][52050] Avg episode reward: [(0, '23.360'), (1, '23.440')] [2023-10-10 08:24:37,458][53252] Updated weights for policy 0, policy_version 89570 (0.0009) [2023-10-10 08:24:37,835][53252] Updated weights for policy 0, policy_version 89580 (0.0008) [2023-10-10 08:24:38,198][53252] Updated weights for policy 0, policy_version 89590 (0.0007) [2023-10-10 08:24:38,563][53252] Updated weights for policy 0, policy_version 89600 (0.0007) [2023-10-10 08:24:39,588][53268] Updated weights for policy 1, policy_version 89510 (0.0008) [2023-10-10 08:24:39,958][53268] Updated weights for policy 1, policy_version 89520 (0.0008) [2023-10-10 08:24:40,331][53268] Updated weights for policy 1, policy_version 89530 (0.0009) [2023-10-10 08:24:41,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183435264. Throughput: 0: 1705.0, 1: 1671.6. Samples: 45869764. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:24:41,784][52050] Avg episode reward: [(0, '21.810'), (1, '21.940')] [2023-10-10 08:24:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000089536_91684864.pth... [2023-10-10 08:24:41,797][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000089600_91750400.pth... [2023-10-10 08:24:41,826][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000087968_90079232.pth [2023-10-10 08:24:41,831][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000088032_90144768.pth [2023-10-10 08:24:42,554][53252] Updated weights for policy 0, policy_version 89610 (0.0008) [2023-10-10 08:24:42,924][53252] Updated weights for policy 0, policy_version 89620 (0.0008) [2023-10-10 08:24:43,295][53252] Updated weights for policy 0, policy_version 89630 (0.0008) [2023-10-10 08:24:44,501][53268] Updated weights for policy 1, policy_version 89540 (0.0009) [2023-10-10 08:24:44,871][53268] Updated weights for policy 1, policy_version 89550 (0.0007) [2023-10-10 08:24:45,237][53268] Updated weights for policy 1, policy_version 89560 (0.0008) [2023-10-10 08:24:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 183500800. Throughput: 0: 1691.8, 1: 1685.9. Samples: 45880260. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:24:46,784][52050] Avg episode reward: [(0, '21.220'), (1, '22.330')] [2023-10-10 08:24:47,205][53252] Updated weights for policy 0, policy_version 89640 (0.0011) [2023-10-10 08:24:47,574][53252] Updated weights for policy 0, policy_version 89650 (0.0009) [2023-10-10 08:24:47,940][53252] Updated weights for policy 0, policy_version 89660 (0.0011) [2023-10-10 08:24:49,357][53268] Updated weights for policy 1, policy_version 89570 (0.0008) [2023-10-10 08:24:49,728][53268] Updated weights for policy 1, policy_version 89580 (0.0010) [2023-10-10 08:24:50,095][53268] Updated weights for policy 1, policy_version 89590 (0.0010) [2023-10-10 08:24:50,459][53268] Updated weights for policy 1, policy_version 89600 (0.0007) [2023-10-10 08:24:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183566336. Throughput: 0: 1707.3, 1: 1663.9. Samples: 45900208. Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:24:51,784][52050] Avg episode reward: [(0, '22.360'), (1, '23.880')] [2023-10-10 08:24:52,069][53252] Updated weights for policy 0, policy_version 89670 (0.0009) [2023-10-10 08:24:52,448][53252] Updated weights for policy 0, policy_version 89680 (0.0009) [2023-10-10 08:24:52,829][53252] Updated weights for policy 0, policy_version 89690 (0.0008) [2023-10-10 08:24:54,597][53268] Updated weights for policy 1, policy_version 89610 (0.0008) [2023-10-10 08:24:54,967][53268] Updated weights for policy 1, policy_version 89620 (0.0008) [2023-10-10 08:24:55,332][53268] Updated weights for policy 1, policy_version 89630 (0.0007) [2023-10-10 08:24:56,776][53252] Updated weights for policy 0, policy_version 89700 (0.0007) [2023-10-10 08:24:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 183631872. Throughput: 0: 1706.7, 1: 1676.6. Samples: 45920536. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:24:56,784][52050] Avg episode reward: [(0, '23.570'), (1, '22.990')] [2023-10-10 08:24:57,162][53252] Updated weights for policy 0, policy_version 89710 (0.0008) [2023-10-10 08:24:57,530][53252] Updated weights for policy 0, policy_version 89720 (0.0009) [2023-10-10 08:24:59,495][53268] Updated weights for policy 1, policy_version 89640 (0.0007) [2023-10-10 08:24:59,871][53268] Updated weights for policy 1, policy_version 89650 (0.0007) [2023-10-10 08:25:00,233][53268] Updated weights for policy 1, policy_version 89660 (0.0009) [2023-10-10 08:25:01,489][53252] Updated weights for policy 0, policy_version 89730 (0.0009) [2023-10-10 08:25:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183697408. Throughput: 0: 1707.2, 1: 1678.0. Samples: 45930894. Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:25:01,784][52050] Avg episode reward: [(0, '21.940'), (1, '20.350')] [2023-10-10 08:25:01,862][53252] Updated weights for policy 0, policy_version 89740 (0.0010) [2023-10-10 08:25:02,239][53252] Updated weights for policy 0, policy_version 89750 (0.0008) [2023-10-10 08:25:02,614][53252] Updated weights for policy 0, policy_version 89760 (0.0010) [2023-10-10 08:25:04,235][53268] Updated weights for policy 1, policy_version 89670 (0.0008) [2023-10-10 08:25:04,609][53268] Updated weights for policy 1, policy_version 89680 (0.0009) [2023-10-10 08:25:04,975][53268] Updated weights for policy 1, policy_version 89690 (0.0010) [2023-10-10 08:25:06,687][53252] Updated weights for policy 0, policy_version 89770 (0.0009) [2023-10-10 08:25:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183762944. Throughput: 0: 1704.7, 1: 1662.0. Samples: 45950344. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:25:06,784][52050] Avg episode reward: [(0, '22.060'), (1, '23.410')] [2023-10-10 08:25:07,064][53252] Updated weights for policy 0, policy_version 89780 (0.0008) [2023-10-10 08:25:07,439][53252] Updated weights for policy 0, policy_version 89790 (0.0010) [2023-10-10 08:25:08,768][53268] Updated weights for policy 1, policy_version 89700 (0.0007) [2023-10-10 08:25:09,126][53268] Updated weights for policy 1, policy_version 89710 (0.0009) [2023-10-10 08:25:09,499][53268] Updated weights for policy 1, policy_version 89720 (0.0009) [2023-10-10 08:25:11,483][53252] Updated weights for policy 0, policy_version 89800 (0.0008) [2023-10-10 08:25:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183828480. Throughput: 0: 1694.6, 1: 1681.6. Samples: 45970724. Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:25:11,784][52050] Avg episode reward: [(0, '22.470'), (1, '21.570')] [2023-10-10 08:25:11,856][53252] Updated weights for policy 0, policy_version 89810 (0.0008) [2023-10-10 08:25:12,218][53252] Updated weights for policy 0, policy_version 89820 (0.0008) [2023-10-10 08:25:13,544][53268] Updated weights for policy 1, policy_version 89730 (0.0009) [2023-10-10 08:25:13,914][53268] Updated weights for policy 1, policy_version 89740 (0.0009) [2023-10-10 08:25:14,277][53268] Updated weights for policy 1, policy_version 89750 (0.0007) [2023-10-10 08:25:14,644][53268] Updated weights for policy 1, policy_version 89760 (0.0007) [2023-10-10 08:25:16,378][53252] Updated weights for policy 0, policy_version 89830 (0.0008) [2023-10-10 08:25:16,756][53252] Updated weights for policy 0, policy_version 89840 (0.0007) [2023-10-10 08:25:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 183894016. Throughput: 0: 1698.1, 1: 1664.1. Samples: 45980596. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:25:16,784][52050] Avg episode reward: [(0, '22.230'), (1, '21.610')] [2023-10-10 08:25:17,121][53252] Updated weights for policy 0, policy_version 89850 (0.0008) [2023-10-10 08:25:18,661][53268] Updated weights for policy 1, policy_version 89770 (0.0007) [2023-10-10 08:25:19,020][53268] Updated weights for policy 1, policy_version 89780 (0.0008) [2023-10-10 08:25:19,383][53268] Updated weights for policy 1, policy_version 89790 (0.0008) [2023-10-10 08:25:21,350][53252] Updated weights for policy 0, policy_version 89860 (0.0009) [2023-10-10 08:25:21,724][53252] Updated weights for policy 0, policy_version 89870 (0.0009) [2023-10-10 08:25:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 183959552. Throughput: 0: 1689.8, 1: 1673.3. Samples: 46000788. Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:25:21,784][52050] Avg episode reward: [(0, '21.450'), (1, '22.790')] [2023-10-10 08:25:22,086][53252] Updated weights for policy 0, policy_version 89880 (0.0009) [2023-10-10 08:25:23,490][53268] Updated weights for policy 1, policy_version 89800 (0.0007) [2023-10-10 08:25:23,861][53268] Updated weights for policy 1, policy_version 89810 (0.0008) [2023-10-10 08:25:24,230][53268] Updated weights for policy 1, policy_version 89820 (0.0008) [2023-10-10 08:25:25,972][53252] Updated weights for policy 0, policy_version 89890 (0.0009) [2023-10-10 08:25:26,345][53252] Updated weights for policy 0, policy_version 89900 (0.0008) [2023-10-10 08:25:26,724][53252] Updated weights for policy 0, policy_version 89910 (0.0010) [2023-10-10 08:25:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 184025088. Throughput: 0: 1677.8, 1: 1688.8. Samples: 46021260. 
Policy #0 lag: (min: 30.0, avg: 39.9, max: 62.0) [2023-10-10 08:25:26,785][52050] Avg episode reward: [(0, '23.970'), (1, '20.830')] [2023-10-10 08:25:27,104][53252] Updated weights for policy 0, policy_version 89920 (0.0007) [2023-10-10 08:25:28,328][53268] Updated weights for policy 1, policy_version 89830 (0.0008) [2023-10-10 08:25:28,696][53268] Updated weights for policy 1, policy_version 89840 (0.0009) [2023-10-10 08:25:29,070][53268] Updated weights for policy 1, policy_version 89850 (0.0009) [2023-10-10 08:25:31,191][53252] Updated weights for policy 0, policy_version 89930 (0.0009) [2023-10-10 08:25:31,562][53252] Updated weights for policy 0, policy_version 89940 (0.0008) [2023-10-10 08:25:31,784][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 184090624. Throughput: 0: 1689.4, 1: 1660.0. Samples: 46030982. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:25:31,785][52050] Avg episode reward: [(0, '23.100'), (1, '20.500')] [2023-10-10 08:25:31,936][53252] Updated weights for policy 0, policy_version 89950 (0.0007) [2023-10-10 08:25:32,983][53268] Updated weights for policy 1, policy_version 89860 (0.0009) [2023-10-10 08:25:33,353][53268] Updated weights for policy 1, policy_version 89870 (0.0009) [2023-10-10 08:25:33,712][53268] Updated weights for policy 1, policy_version 89880 (0.0008) [2023-10-10 08:25:35,885][53252] Updated weights for policy 0, policy_version 89960 (0.0009) [2023-10-10 08:25:36,258][53252] Updated weights for policy 0, policy_version 89970 (0.0009) [2023-10-10 08:25:36,623][53252] Updated weights for policy 0, policy_version 89980 (0.0011) [2023-10-10 08:25:36,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184188928. Throughput: 0: 1682.5, 1: 1681.4. Samples: 46051582. 
Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:25:36,784][52050] Avg episode reward: [(0, '23.340'), (1, '22.290')] [2023-10-10 08:25:37,686][53268] Updated weights for policy 1, policy_version 89890 (0.0009) [2023-10-10 08:25:38,053][53268] Updated weights for policy 1, policy_version 89900 (0.0008) [2023-10-10 08:25:38,418][53268] Updated weights for policy 1, policy_version 89910 (0.0009) [2023-10-10 08:25:38,781][53268] Updated weights for policy 1, policy_version 89920 (0.0008) [2023-10-10 08:25:40,781][53252] Updated weights for policy 0, policy_version 89990 (0.0009) [2023-10-10 08:25:41,164][53252] Updated weights for policy 0, policy_version 90000 (0.0009) [2023-10-10 08:25:41,538][53252] Updated weights for policy 0, policy_version 90010 (0.0008) [2023-10-10 08:25:41,783][52050] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 184254464. Throughput: 0: 1661.4, 1: 1691.4. Samples: 46071414. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:25:41,784][52050] Avg episode reward: [(0, '22.230'), (1, '20.480')] [2023-10-10 08:25:42,891][53268] Updated weights for policy 1, policy_version 89930 (0.0010) [2023-10-10 08:25:43,266][53268] Updated weights for policy 1, policy_version 89940 (0.0011) [2023-10-10 08:25:43,639][53268] Updated weights for policy 1, policy_version 89950 (0.0008) [2023-10-10 08:25:45,671][53252] Updated weights for policy 0, policy_version 90020 (0.0008) [2023-10-10 08:25:46,059][53252] Updated weights for policy 0, policy_version 90030 (0.0007) [2023-10-10 08:25:46,422][53252] Updated weights for policy 0, policy_version 90040 (0.0008) [2023-10-10 08:25:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184320000. Throughput: 0: 1681.2, 1: 1666.3. Samples: 46081534. 
Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:25:46,784][52050] Avg episode reward: [(0, '23.190'), (1, '21.010')] [2023-10-10 08:25:47,668][53268] Updated weights for policy 1, policy_version 89960 (0.0008) [2023-10-10 08:25:48,035][53268] Updated weights for policy 1, policy_version 89970 (0.0010) [2023-10-10 08:25:48,398][53268] Updated weights for policy 1, policy_version 89980 (0.0010) [2023-10-10 08:25:50,470][53252] Updated weights for policy 0, policy_version 90050 (0.0010) [2023-10-10 08:25:50,840][53252] Updated weights for policy 0, policy_version 90060 (0.0008) [2023-10-10 08:25:51,213][53252] Updated weights for policy 0, policy_version 90070 (0.0010) [2023-10-10 08:25:51,580][53252] Updated weights for policy 0, policy_version 90080 (0.0008) [2023-10-10 08:25:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184385536. Throughput: 0: 1680.3, 1: 1695.8. Samples: 46102268. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:25:51,784][52050] Avg episode reward: [(0, '23.200'), (1, '24.060')] [2023-10-10 08:25:52,691][53268] Updated weights for policy 1, policy_version 89990 (0.0010) [2023-10-10 08:25:53,086][53268] Updated weights for policy 1, policy_version 90000 (0.0010) [2023-10-10 08:25:53,443][53268] Updated weights for policy 1, policy_version 90010 (0.0009) [2023-10-10 08:25:55,665][53252] Updated weights for policy 0, policy_version 90090 (0.0008) [2023-10-10 08:25:56,048][53252] Updated weights for policy 0, policy_version 90100 (0.0011) [2023-10-10 08:25:56,419][53252] Updated weights for policy 0, policy_version 90110 (0.0010) [2023-10-10 08:25:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184451072. Throughput: 0: 1665.3, 1: 1695.5. Samples: 46121960. 
Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:25:56,784][52050] Avg episode reward: [(0, '23.220'), (1, '24.290')] [2023-10-10 08:25:57,389][53268] Updated weights for policy 1, policy_version 90020 (0.0008) [2023-10-10 08:25:57,749][53268] Updated weights for policy 1, policy_version 90030 (0.0009) [2023-10-10 08:25:58,123][53268] Updated weights for policy 1, policy_version 90040 (0.0009) [2023-10-10 08:26:00,711][53252] Updated weights for policy 0, policy_version 90120 (0.0011) [2023-10-10 08:26:01,078][53252] Updated weights for policy 0, policy_version 90130 (0.0010) [2023-10-10 08:26:01,453][53252] Updated weights for policy 0, policy_version 90140 (0.0011) [2023-10-10 08:26:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184516608. Throughput: 0: 1680.3, 1: 1681.6. Samples: 46131878. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:26:01,784][52050] Avg episode reward: [(0, '22.860'), (1, '23.010')] [2023-10-10 08:26:02,227][53268] Updated weights for policy 1, policy_version 90050 (0.0008) [2023-10-10 08:26:02,605][53268] Updated weights for policy 1, policy_version 90060 (0.0007) [2023-10-10 08:26:02,973][53268] Updated weights for policy 1, policy_version 90070 (0.0007) [2023-10-10 08:26:03,338][53268] Updated weights for policy 1, policy_version 90080 (0.0007) [2023-10-10 08:26:05,404][53252] Updated weights for policy 0, policy_version 90150 (0.0010) [2023-10-10 08:26:05,772][53252] Updated weights for policy 0, policy_version 90160 (0.0010) [2023-10-10 08:26:06,148][53252] Updated weights for policy 0, policy_version 90170 (0.0010) [2023-10-10 08:26:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184582144. Throughput: 0: 1677.4, 1: 1690.0. Samples: 46152324. 
Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:26:06,784][52050] Avg episode reward: [(0, '23.480'), (1, '24.310')] [2023-10-10 08:26:07,449][53268] Updated weights for policy 1, policy_version 90090 (0.0010) [2023-10-10 08:26:07,821][53268] Updated weights for policy 1, policy_version 90100 (0.0008) [2023-10-10 08:26:08,189][53268] Updated weights for policy 1, policy_version 90110 (0.0009) [2023-10-10 08:26:10,311][53252] Updated weights for policy 0, policy_version 90180 (0.0009) [2023-10-10 08:26:10,678][53252] Updated weights for policy 0, policy_version 90190 (0.0007) [2023-10-10 08:26:11,037][53252] Updated weights for policy 0, policy_version 90200 (0.0008) [2023-10-10 08:26:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184647680. Throughput: 0: 1662.7, 1: 1690.2. Samples: 46172138. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:26:11,784][52050] Avg episode reward: [(0, '22.920'), (1, '22.370')] [2023-10-10 08:26:12,254][53268] Updated weights for policy 1, policy_version 90120 (0.0008) [2023-10-10 08:26:12,627][53268] Updated weights for policy 1, policy_version 90130 (0.0007) [2023-10-10 08:26:12,988][53268] Updated weights for policy 1, policy_version 90140 (0.0009) [2023-10-10 08:26:15,162][53252] Updated weights for policy 0, policy_version 90210 (0.0008) [2023-10-10 08:26:15,537][53252] Updated weights for policy 0, policy_version 90220 (0.0008) [2023-10-10 08:26:15,902][53252] Updated weights for policy 0, policy_version 90230 (0.0008) [2023-10-10 08:26:16,271][53252] Updated weights for policy 0, policy_version 90240 (0.0007) [2023-10-10 08:26:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184713216. Throughput: 0: 1678.2, 1: 1687.9. Samples: 46182456. 
Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:26:16,784][52050] Avg episode reward: [(0, '22.450'), (1, '21.880')] [2023-10-10 08:26:17,041][53268] Updated weights for policy 1, policy_version 90150 (0.0009) [2023-10-10 08:26:17,404][53268] Updated weights for policy 1, policy_version 90160 (0.0011) [2023-10-10 08:26:17,770][53268] Updated weights for policy 1, policy_version 90170 (0.0008) [2023-10-10 08:26:20,279][53252] Updated weights for policy 0, policy_version 90250 (0.0008) [2023-10-10 08:26:20,652][53252] Updated weights for policy 0, policy_version 90260 (0.0007) [2023-10-10 08:26:21,017][53252] Updated weights for policy 0, policy_version 90270 (0.0007) [2023-10-10 08:26:21,746][53268] Updated weights for policy 1, policy_version 90180 (0.0008) [2023-10-10 08:26:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 184778752. Throughput: 0: 1671.9, 1: 1690.8. Samples: 46202900. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) [2023-10-10 08:26:21,784][52050] Avg episode reward: [(0, '23.030'), (1, '22.180')] [2023-10-10 08:26:22,114][53268] Updated weights for policy 1, policy_version 90190 (0.0009) [2023-10-10 08:26:22,475][53268] Updated weights for policy 1, policy_version 90200 (0.0008) [2023-10-10 08:26:25,131][53252] Updated weights for policy 0, policy_version 90280 (0.0007) [2023-10-10 08:26:25,501][53252] Updated weights for policy 0, policy_version 90290 (0.0008) [2023-10-10 08:26:25,875][53252] Updated weights for policy 0, policy_version 90300 (0.0008) [2023-10-10 08:26:26,424][53268] Updated weights for policy 1, policy_version 90210 (0.0008) [2023-10-10 08:26:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 184844288. Throughput: 0: 1672.3, 1: 1695.5. Samples: 46222962. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:26,784][52050] Avg episode reward: [(0, '23.190'), (1, '22.270')] [2023-10-10 08:26:26,791][53268] Updated weights for policy 1, policy_version 90220 (0.0007) [2023-10-10 08:26:27,155][53268] Updated weights for policy 1, policy_version 90230 (0.0008) [2023-10-10 08:26:27,517][53268] Updated weights for policy 1, policy_version 90240 (0.0009) [2023-10-10 08:26:29,872][53252] Updated weights for policy 0, policy_version 90310 (0.0008) [2023-10-10 08:26:30,236][53252] Updated weights for policy 0, policy_version 90320 (0.0007) [2023-10-10 08:26:30,609][53252] Updated weights for policy 0, policy_version 90330 (0.0008) [2023-10-10 08:26:31,622][53268] Updated weights for policy 1, policy_version 90250 (0.0011) [2023-10-10 08:26:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 184909824. Throughput: 0: 1679.3, 1: 1690.8. Samples: 46233190. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:31,784][52050] Avg episode reward: [(0, '23.040'), (1, '20.670')] [2023-10-10 08:26:31,992][53268] Updated weights for policy 1, policy_version 90260 (0.0009) [2023-10-10 08:26:32,366][53268] Updated weights for policy 1, policy_version 90270 (0.0008) [2023-10-10 08:26:34,657][53252] Updated weights for policy 0, policy_version 90340 (0.0010) [2023-10-10 08:26:35,060][53252] Updated weights for policy 0, policy_version 90350 (0.0008) [2023-10-10 08:26:35,435][53252] Updated weights for policy 0, policy_version 90360 (0.0009) [2023-10-10 08:26:36,512][53268] Updated weights for policy 1, policy_version 90280 (0.0008) [2023-10-10 08:26:36,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 184975360. Throughput: 0: 1658.4, 1: 1684.1. Samples: 46252680. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:36,784][52050] Avg episode reward: [(0, '22.580'), (1, '22.060')] [2023-10-10 08:26:36,879][53268] Updated weights for policy 1, policy_version 90290 (0.0008) [2023-10-10 08:26:37,249][53268] Updated weights for policy 1, policy_version 90300 (0.0009) [2023-10-10 08:26:39,520][53252] Updated weights for policy 0, policy_version 90370 (0.0008) [2023-10-10 08:26:39,896][53252] Updated weights for policy 0, policy_version 90380 (0.0007) [2023-10-10 08:26:40,264][53252] Updated weights for policy 0, policy_version 90390 (0.0008) [2023-10-10 08:26:40,629][53252] Updated weights for policy 0, policy_version 90400 (0.0010) [2023-10-10 08:26:41,468][53268] Updated weights for policy 1, policy_version 90310 (0.0008) [2023-10-10 08:26:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185040896. Throughput: 0: 1674.5, 1: 1684.3. Samples: 46273106. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:41,784][52050] Avg episode reward: [(0, '22.290'), (1, '22.830')] [2023-10-10 08:26:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000090400_92569600.pth... [2023-10-10 08:26:41,826][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000088832_90963968.pth [2023-10-10 08:26:41,854][53268] Updated weights for policy 1, policy_version 90320 (0.0008) [2023-10-10 08:26:42,223][53268] Updated weights for policy 1, policy_version 90330 (0.0010) [2023-10-10 08:26:42,437][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000090336_92504064.pth... 
[2023-10-10 08:26:42,466][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000088768_90898432.pth [2023-10-10 08:26:44,590][53252] Updated weights for policy 0, policy_version 90410 (0.0010) [2023-10-10 08:26:44,958][53252] Updated weights for policy 0, policy_version 90420 (0.0009) [2023-10-10 08:26:45,323][53252] Updated weights for policy 0, policy_version 90430 (0.0008) [2023-10-10 08:26:46,219][53268] Updated weights for policy 1, policy_version 90340 (0.0009) [2023-10-10 08:26:46,586][53268] Updated weights for policy 1, policy_version 90350 (0.0009) [2023-10-10 08:26:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185106432. Throughput: 0: 1683.2, 1: 1681.6. Samples: 46283294. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:46,784][52050] Avg episode reward: [(0, '21.860'), (1, '22.060')] [2023-10-10 08:26:46,947][53268] Updated weights for policy 1, policy_version 90360 (0.0009) [2023-10-10 08:26:49,136][53252] Updated weights for policy 0, policy_version 90440 (0.0009) [2023-10-10 08:26:49,502][53252] Updated weights for policy 0, policy_version 90450 (0.0009) [2023-10-10 08:26:49,871][53252] Updated weights for policy 0, policy_version 90460 (0.0009) [2023-10-10 08:26:51,176][53268] Updated weights for policy 1, policy_version 90370 (0.0009) [2023-10-10 08:26:51,552][53268] Updated weights for policy 1, policy_version 90380 (0.0009) [2023-10-10 08:26:51,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 185171968. Throughput: 0: 1670.1, 1: 1681.6. Samples: 46303154. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:51,784][52050] Avg episode reward: [(0, '22.000'), (1, '22.610')] [2023-10-10 08:26:51,930][53268] Updated weights for policy 1, policy_version 90390 (0.0008) [2023-10-10 08:26:52,296][53268] Updated weights for policy 1, policy_version 90400 (0.0007) [2023-10-10 08:26:53,817][53252] Updated weights for policy 0, policy_version 90470 (0.0009) [2023-10-10 08:26:54,194][53252] Updated weights for policy 0, policy_version 90480 (0.0007) [2023-10-10 08:26:54,556][53252] Updated weights for policy 0, policy_version 90490 (0.0008) [2023-10-10 08:26:56,401][53268] Updated weights for policy 1, policy_version 90410 (0.0011) [2023-10-10 08:26:56,777][53268] Updated weights for policy 1, policy_version 90420 (0.0012) [2023-10-10 08:26:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185237504. Throughput: 0: 1694.9, 1: 1675.0. Samples: 46323780. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:26:56,784][52050] Avg episode reward: [(0, '21.090'), (1, '22.810')] [2023-10-10 08:26:57,132][53268] Updated weights for policy 1, policy_version 90430 (0.0009) [2023-10-10 08:26:58,729][53252] Updated weights for policy 0, policy_version 90500 (0.0008) [2023-10-10 08:26:59,103][53252] Updated weights for policy 0, policy_version 90510 (0.0008) [2023-10-10 08:26:59,473][53252] Updated weights for policy 0, policy_version 90520 (0.0011) [2023-10-10 08:27:01,091][53268] Updated weights for policy 1, policy_version 90440 (0.0008) [2023-10-10 08:27:01,466][53268] Updated weights for policy 1, policy_version 90450 (0.0007) [2023-10-10 08:27:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185303040. Throughput: 0: 1677.2, 1: 1682.6. Samples: 46333646. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:27:01,784][52050] Avg episode reward: [(0, '21.750'), (1, '22.330')] [2023-10-10 08:27:01,830][53268] Updated weights for policy 1, policy_version 90460 (0.0007) [2023-10-10 08:27:03,513][53252] Updated weights for policy 0, policy_version 90530 (0.0008) [2023-10-10 08:27:03,891][53252] Updated weights for policy 0, policy_version 90540 (0.0009) [2023-10-10 08:27:04,256][53252] Updated weights for policy 0, policy_version 90550 (0.0011) [2023-10-10 08:27:04,628][53252] Updated weights for policy 0, policy_version 90560 (0.0011) [2023-10-10 08:27:05,742][53268] Updated weights for policy 1, policy_version 90470 (0.0007) [2023-10-10 08:27:06,100][53268] Updated weights for policy 1, policy_version 90480 (0.0009) [2023-10-10 08:27:06,466][53268] Updated weights for policy 1, policy_version 90490 (0.0007) [2023-10-10 08:27:06,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185401344. Throughput: 0: 1675.6, 1: 1684.9. Samples: 46354122. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:27:06,784][52050] Avg episode reward: [(0, '21.380'), (1, '23.880')] [2023-10-10 08:27:08,835][53252] Updated weights for policy 0, policy_version 90570 (0.0007) [2023-10-10 08:27:09,205][53252] Updated weights for policy 0, policy_version 90580 (0.0010) [2023-10-10 08:27:09,578][53252] Updated weights for policy 0, policy_version 90590 (0.0010) [2023-10-10 08:27:10,559][53268] Updated weights for policy 1, policy_version 90500 (0.0008) [2023-10-10 08:27:10,925][53268] Updated weights for policy 1, policy_version 90510 (0.0010) [2023-10-10 08:27:11,295][53268] Updated weights for policy 1, policy_version 90520 (0.0009) [2023-10-10 08:27:11,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185466880. Throughput: 0: 1696.5, 1: 1665.0. Samples: 46374228. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:27:11,784][52050] Avg episode reward: [(0, '23.660'), (1, '22.930')] [2023-10-10 08:27:13,588][53252] Updated weights for policy 0, policy_version 90600 (0.0007) [2023-10-10 08:27:13,955][53252] Updated weights for policy 0, policy_version 90610 (0.0007) [2023-10-10 08:27:14,332][53252] Updated weights for policy 0, policy_version 90620 (0.0007) [2023-10-10 08:27:15,282][53268] Updated weights for policy 1, policy_version 90530 (0.0010) [2023-10-10 08:27:15,651][53268] Updated weights for policy 1, policy_version 90540 (0.0010) [2023-10-10 08:27:16,028][53268] Updated weights for policy 1, policy_version 90550 (0.0007) [2023-10-10 08:27:16,395][53268] Updated weights for policy 1, policy_version 90560 (0.0009) [2023-10-10 08:27:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185532416. Throughput: 0: 1673.6, 1: 1685.2. Samples: 46384336. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:27:16,784][52050] Avg episode reward: [(0, '24.670'), (1, '22.460')] [2023-10-10 08:27:18,336][53252] Updated weights for policy 0, policy_version 90630 (0.0007) [2023-10-10 08:27:18,702][53252] Updated weights for policy 0, policy_version 90640 (0.0007) [2023-10-10 08:27:19,081][53252] Updated weights for policy 0, policy_version 90650 (0.0008) [2023-10-10 08:27:20,440][53268] Updated weights for policy 1, policy_version 90570 (0.0008) [2023-10-10 08:27:20,808][53268] Updated weights for policy 1, policy_version 90580 (0.0007) [2023-10-10 08:27:21,178][53268] Updated weights for policy 1, policy_version 90590 (0.0009) [2023-10-10 08:27:21,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185597952. Throughput: 0: 1693.9, 1: 1685.3. Samples: 46404746. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:27:21,784][52050] Avg episode reward: [(0, '24.090'), (1, '23.030')] [2023-10-10 08:27:22,948][53252] Updated weights for policy 0, policy_version 90660 (0.0009) [2023-10-10 08:27:23,339][53252] Updated weights for policy 0, policy_version 90670 (0.0008) [2023-10-10 08:27:23,706][53252] Updated weights for policy 0, policy_version 90680 (0.0009) [2023-10-10 08:27:25,120][53268] Updated weights for policy 1, policy_version 90600 (0.0009) [2023-10-10 08:27:25,497][53268] Updated weights for policy 1, policy_version 90610 (0.0010) [2023-10-10 08:27:25,858][53268] Updated weights for policy 1, policy_version 90620 (0.0008) [2023-10-10 08:27:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185663488. Throughput: 0: 1702.4, 1: 1663.2. Samples: 46424562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:26,784][52050] Avg episode reward: [(0, '24.160'), (1, '21.950')] [2023-10-10 08:27:27,783][53252] Updated weights for policy 0, policy_version 90690 (0.0008) [2023-10-10 08:27:28,143][53252] Updated weights for policy 0, policy_version 90700 (0.0010) [2023-10-10 08:27:28,526][53252] Updated weights for policy 0, policy_version 90710 (0.0011) [2023-10-10 08:27:28,897][53252] Updated weights for policy 0, policy_version 90720 (0.0010) [2023-10-10 08:27:29,929][53268] Updated weights for policy 1, policy_version 90630 (0.0009) [2023-10-10 08:27:30,309][53268] Updated weights for policy 1, policy_version 90640 (0.0009) [2023-10-10 08:27:30,685][53268] Updated weights for policy 1, policy_version 90650 (0.0009) [2023-10-10 08:27:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 185729024. Throughput: 0: 1673.1, 1: 1699.1. Samples: 46435044. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:31,784][52050] Avg episode reward: [(0, '23.990'), (1, '20.930')] [2023-10-10 08:27:32,745][53252] Updated weights for policy 0, policy_version 90730 (0.0007) [2023-10-10 08:27:33,119][53252] Updated weights for policy 0, policy_version 90740 (0.0009) [2023-10-10 08:27:33,484][53252] Updated weights for policy 0, policy_version 90750 (0.0007) [2023-10-10 08:27:34,666][53268] Updated weights for policy 1, policy_version 90660 (0.0010) [2023-10-10 08:27:35,045][53268] Updated weights for policy 1, policy_version 90670 (0.0008) [2023-10-10 08:27:35,414][53268] Updated weights for policy 1, policy_version 90680 (0.0008) [2023-10-10 08:27:36,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 185794560. Throughput: 0: 1696.7, 1: 1681.9. Samples: 46455192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:36,784][52050] Avg episode reward: [(0, '23.320'), (1, '21.320')] [2023-10-10 08:27:37,545][53252] Updated weights for policy 0, policy_version 90760 (0.0009) [2023-10-10 08:27:37,921][53252] Updated weights for policy 0, policy_version 90770 (0.0011) [2023-10-10 08:27:38,291][53252] Updated weights for policy 0, policy_version 90780 (0.0008) [2023-10-10 08:27:39,607][53268] Updated weights for policy 1, policy_version 90690 (0.0009) [2023-10-10 08:27:39,974][53268] Updated weights for policy 1, policy_version 90700 (0.0008) [2023-10-10 08:27:40,345][53268] Updated weights for policy 1, policy_version 90710 (0.0009) [2023-10-10 08:27:40,706][53268] Updated weights for policy 1, policy_version 90720 (0.0008) [2023-10-10 08:27:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185860096. Throughput: 0: 1699.4, 1: 1669.5. Samples: 46475380. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:41,784][52050] Avg episode reward: [(0, '20.620'), (1, '22.060')] [2023-10-10 08:27:42,249][53252] Updated weights for policy 0, policy_version 90790 (0.0009) [2023-10-10 08:27:42,628][53252] Updated weights for policy 0, policy_version 90800 (0.0009) [2023-10-10 08:27:43,001][53252] Updated weights for policy 0, policy_version 90810 (0.0007) [2023-10-10 08:27:44,931][53268] Updated weights for policy 1, policy_version 90730 (0.0009) [2023-10-10 08:27:45,297][53268] Updated weights for policy 1, policy_version 90740 (0.0008) [2023-10-10 08:27:45,664][53268] Updated weights for policy 1, policy_version 90750 (0.0008) [2023-10-10 08:27:46,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185925632. Throughput: 0: 1687.0, 1: 1688.9. Samples: 46485562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:46,784][52050] Avg episode reward: [(0, '21.160'), (1, '22.550')] [2023-10-10 08:27:47,086][53252] Updated weights for policy 0, policy_version 90820 (0.0008) [2023-10-10 08:27:47,455][53252] Updated weights for policy 0, policy_version 90830 (0.0008) [2023-10-10 08:27:47,819][53252] Updated weights for policy 0, policy_version 90840 (0.0008) [2023-10-10 08:27:49,788][53268] Updated weights for policy 1, policy_version 90760 (0.0009) [2023-10-10 08:27:50,158][53268] Updated weights for policy 1, policy_version 90770 (0.0010) [2023-10-10 08:27:50,533][53268] Updated weights for policy 1, policy_version 90780 (0.0009) [2023-10-10 08:27:51,746][53252] Updated weights for policy 0, policy_version 90850 (0.0010) [2023-10-10 08:27:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 185991168. Throughput: 0: 1699.6, 1: 1667.8. Samples: 46505656. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:51,784][52050] Avg episode reward: [(0, '20.040'), (1, '21.690')] [2023-10-10 08:27:52,111][53252] Updated weights for policy 0, policy_version 90860 (0.0007) [2023-10-10 08:27:52,473][53252] Updated weights for policy 0, policy_version 90870 (0.0007) [2023-10-10 08:27:52,840][53252] Updated weights for policy 0, policy_version 90880 (0.0008) [2023-10-10 08:27:54,372][53268] Updated weights for policy 1, policy_version 90790 (0.0010) [2023-10-10 08:27:54,737][53268] Updated weights for policy 1, policy_version 90800 (0.0010) [2023-10-10 08:27:55,101][53268] Updated weights for policy 1, policy_version 90810 (0.0009) [2023-10-10 08:27:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186056704. Throughput: 0: 1698.1, 1: 1670.2. Samples: 46525802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:27:56,784][52050] Avg episode reward: [(0, '21.890'), (1, '23.040')] [2023-10-10 08:27:56,997][53252] Updated weights for policy 0, policy_version 90890 (0.0011) [2023-10-10 08:27:57,367][53252] Updated weights for policy 0, policy_version 90900 (0.0009) [2023-10-10 08:27:57,742][53252] Updated weights for policy 0, policy_version 90910 (0.0007) [2023-10-10 08:27:59,166][53268] Updated weights for policy 1, policy_version 90820 (0.0010) [2023-10-10 08:27:59,530][53268] Updated weights for policy 1, policy_version 90830 (0.0011) [2023-10-10 08:27:59,903][53268] Updated weights for policy 1, policy_version 90840 (0.0008) [2023-10-10 08:28:01,712][53252] Updated weights for policy 0, policy_version 90920 (0.0008) [2023-10-10 08:28:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186122240. Throughput: 0: 1695.7, 1: 1678.6. Samples: 46536178. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:28:01,784][52050] Avg episode reward: [(0, '20.090'), (1, '24.090')] [2023-10-10 08:28:02,098][53252] Updated weights for policy 0, policy_version 90930 (0.0008) [2023-10-10 08:28:02,456][53252] Updated weights for policy 0, policy_version 90940 (0.0011) [2023-10-10 08:28:04,023][53268] Updated weights for policy 1, policy_version 90850 (0.0009) [2023-10-10 08:28:04,384][53268] Updated weights for policy 1, policy_version 90860 (0.0009) [2023-10-10 08:28:04,753][53268] Updated weights for policy 1, policy_version 90870 (0.0010) [2023-10-10 08:28:05,120][53268] Updated weights for policy 1, policy_version 90880 (0.0011) [2023-10-10 08:28:06,630][53252] Updated weights for policy 0, policy_version 90950 (0.0008) [2023-10-10 08:28:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186187776. Throughput: 0: 1696.5, 1: 1659.3. Samples: 46555760. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:28:06,784][52050] Avg episode reward: [(0, '21.920'), (1, '23.420')] [2023-10-10 08:28:06,995][53252] Updated weights for policy 0, policy_version 90960 (0.0008) [2023-10-10 08:28:07,364][53252] Updated weights for policy 0, policy_version 90970 (0.0008) [2023-10-10 08:28:09,276][53268] Updated weights for policy 1, policy_version 90890 (0.0010) [2023-10-10 08:28:09,642][53268] Updated weights for policy 1, policy_version 90900 (0.0011) [2023-10-10 08:28:10,010][53268] Updated weights for policy 1, policy_version 90910 (0.0009) [2023-10-10 08:28:11,556][53252] Updated weights for policy 0, policy_version 90980 (0.0008) [2023-10-10 08:28:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186253312. Throughput: 0: 1689.6, 1: 1681.4. Samples: 46576254. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:28:11,784][52050] Avg episode reward: [(0, '21.770'), (1, '23.470')] [2023-10-10 08:28:11,933][53252] Updated weights for policy 0, policy_version 90990 (0.0010) [2023-10-10 08:28:12,322][53252] Updated weights for policy 0, policy_version 91000 (0.0011) [2023-10-10 08:28:13,983][53268] Updated weights for policy 1, policy_version 90920 (0.0009) [2023-10-10 08:28:14,364][53268] Updated weights for policy 1, policy_version 90930 (0.0008) [2023-10-10 08:28:14,733][53268] Updated weights for policy 1, policy_version 90940 (0.0010) [2023-10-10 08:28:16,562][53252] Updated weights for policy 0, policy_version 91010 (0.0009) [2023-10-10 08:28:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186318848. Throughput: 0: 1689.1, 1: 1666.4. Samples: 46586042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:28:16,784][52050] Avg episode reward: [(0, '23.500'), (1, '24.170')] [2023-10-10 08:28:16,929][53252] Updated weights for policy 0, policy_version 91020 (0.0007) [2023-10-10 08:28:17,304][53252] Updated weights for policy 0, policy_version 91030 (0.0009) [2023-10-10 08:28:17,668][53252] Updated weights for policy 0, policy_version 91040 (0.0009) [2023-10-10 08:28:18,668][53268] Updated weights for policy 1, policy_version 90950 (0.0011) [2023-10-10 08:28:19,034][53268] Updated weights for policy 1, policy_version 90960 (0.0010) [2023-10-10 08:28:19,408][53268] Updated weights for policy 1, policy_version 90970 (0.0008) [2023-10-10 08:28:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186384384. Throughput: 0: 1682.5, 1: 1668.1. Samples: 46605968. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:28:21,784][52050] Avg episode reward: [(0, '23.610'), (1, '22.960')] [2023-10-10 08:28:21,791][53252] Updated weights for policy 0, policy_version 91050 (0.0009) [2023-10-10 08:28:22,161][53252] Updated weights for policy 0, policy_version 91060 (0.0008) [2023-10-10 08:28:22,550][53252] Updated weights for policy 0, policy_version 91070 (0.0009) [2023-10-10 08:28:23,533][53268] Updated weights for policy 1, policy_version 90980 (0.0009) [2023-10-10 08:28:23,904][53268] Updated weights for policy 1, policy_version 90990 (0.0009) [2023-10-10 08:28:24,268][53268] Updated weights for policy 1, policy_version 91000 (0.0008) [2023-10-10 08:28:26,667][53252] Updated weights for policy 0, policy_version 91080 (0.0008) [2023-10-10 08:28:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186449920. Throughput: 0: 1678.1, 1: 1688.2. Samples: 46626864. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:26,784][52050] Avg episode reward: [(0, '24.920'), (1, '22.620')] [2023-10-10 08:28:27,044][53252] Updated weights for policy 0, policy_version 91090 (0.0007) [2023-10-10 08:28:27,406][53252] Updated weights for policy 0, policy_version 91100 (0.0007) [2023-10-10 08:28:28,481][53268] Updated weights for policy 1, policy_version 91010 (0.0008) [2023-10-10 08:28:28,842][53268] Updated weights for policy 1, policy_version 91020 (0.0007) [2023-10-10 08:28:29,213][53268] Updated weights for policy 1, policy_version 91030 (0.0008) [2023-10-10 08:28:29,575][53268] Updated weights for policy 1, policy_version 91040 (0.0009) [2023-10-10 08:28:31,386][53252] Updated weights for policy 0, policy_version 91110 (0.0008) [2023-10-10 08:28:31,755][53252] Updated weights for policy 0, policy_version 91120 (0.0007) [2023-10-10 08:28:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186515456. Throughput: 0: 1682.3, 1: 1673.7. 
Samples: 46636582. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:31,784][52050] Avg episode reward: [(0, '22.800'), (1, '23.480')] [2023-10-10 08:28:32,126][53252] Updated weights for policy 0, policy_version 91130 (0.0010) [2023-10-10 08:28:33,581][53268] Updated weights for policy 1, policy_version 91050 (0.0011) [2023-10-10 08:28:33,955][53268] Updated weights for policy 1, policy_version 91060 (0.0011) [2023-10-10 08:28:34,332][53268] Updated weights for policy 1, policy_version 91070 (0.0010) [2023-10-10 08:28:36,056][53252] Updated weights for policy 0, policy_version 91140 (0.0009) [2023-10-10 08:28:36,431][53252] Updated weights for policy 0, policy_version 91150 (0.0008) [2023-10-10 08:28:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 186580992. Throughput: 0: 1677.4, 1: 1681.1. Samples: 46656790. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:36,784][52050] Avg episode reward: [(0, '21.850'), (1, '21.750')] [2023-10-10 08:28:36,795][53252] Updated weights for policy 0, policy_version 91160 (0.0008) [2023-10-10 08:28:38,335][53268] Updated weights for policy 1, policy_version 91080 (0.0010) [2023-10-10 08:28:38,715][53268] Updated weights for policy 1, policy_version 91090 (0.0008) [2023-10-10 08:28:39,079][53268] Updated weights for policy 1, policy_version 91100 (0.0010) [2023-10-10 08:28:40,946][53252] Updated weights for policy 0, policy_version 91170 (0.0008) [2023-10-10 08:28:41,321][53252] Updated weights for policy 0, policy_version 91180 (0.0010) [2023-10-10 08:28:41,693][53252] Updated weights for policy 0, policy_version 91190 (0.0010) [2023-10-10 08:28:41,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 186646528. Throughput: 0: 1665.0, 1: 1694.1. Samples: 46676964. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:41,784][52050] Avg episode reward: [(0, '19.110'), (1, '23.120')] [2023-10-10 08:28:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000091104_93290496.pth... [2023-10-10 08:28:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000089536_91684864.pth [2023-10-10 08:28:42,069][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000091200_93388800.pth... [2023-10-10 08:28:42,075][53252] Updated weights for policy 0, policy_version 91200 (0.0008) [2023-10-10 08:28:42,107][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000089600_91750400.pth [2023-10-10 08:28:43,051][53268] Updated weights for policy 1, policy_version 91110 (0.0010) [2023-10-10 08:28:43,414][53268] Updated weights for policy 1, policy_version 91120 (0.0009) [2023-10-10 08:28:43,777][53268] Updated weights for policy 1, policy_version 91130 (0.0007) [2023-10-10 08:28:46,209][53252] Updated weights for policy 0, policy_version 91210 (0.0007) [2023-10-10 08:28:46,588][53252] Updated weights for policy 0, policy_version 91220 (0.0007) [2023-10-10 08:28:46,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 186712064. Throughput: 0: 1670.6, 1: 1672.1. Samples: 46686600. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:46,784][52050] Avg episode reward: [(0, '19.910'), (1, '21.080')] [2023-10-10 08:28:46,961][53252] Updated weights for policy 0, policy_version 91230 (0.0007) [2023-10-10 08:28:47,888][53268] Updated weights for policy 1, policy_version 91140 (0.0009) [2023-10-10 08:28:48,253][53268] Updated weights for policy 1, policy_version 91150 (0.0008) [2023-10-10 08:28:48,628][53268] Updated weights for policy 1, policy_version 91160 (0.0009) [2023-10-10 08:28:51,107][53252] Updated weights for policy 0, policy_version 91240 (0.0009) [2023-10-10 08:28:51,469][53252] Updated weights for policy 0, policy_version 91250 (0.0009) [2023-10-10 08:28:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 186777600. Throughput: 0: 1670.9, 1: 1693.5. Samples: 46707158. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:51,784][52050] Avg episode reward: [(0, '21.460'), (1, '21.620')] [2023-10-10 08:28:51,846][53252] Updated weights for policy 0, policy_version 91260 (0.0007) [2023-10-10 08:28:52,719][53268] Updated weights for policy 1, policy_version 91170 (0.0009) [2023-10-10 08:28:53,081][53268] Updated weights for policy 1, policy_version 91180 (0.0010) [2023-10-10 08:28:53,445][53268] Updated weights for policy 1, policy_version 91190 (0.0008) [2023-10-10 08:28:53,816][53268] Updated weights for policy 1, policy_version 91200 (0.0008) [2023-10-10 08:28:55,888][53252] Updated weights for policy 0, policy_version 91270 (0.0008) [2023-10-10 08:28:56,259][53252] Updated weights for policy 0, policy_version 91280 (0.0008) [2023-10-10 08:28:56,632][53252] Updated weights for policy 0, policy_version 91290 (0.0010) [2023-10-10 08:28:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 186843136. Throughput: 0: 1658.8, 1: 1696.1. Samples: 46727224. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:28:56,784][52050] Avg episode reward: [(0, '21.460'), (1, '21.670')] [2023-10-10 08:28:57,971][53268] Updated weights for policy 1, policy_version 91210 (0.0010) [2023-10-10 08:28:58,332][53268] Updated weights for policy 1, policy_version 91220 (0.0009) [2023-10-10 08:28:58,705][53268] Updated weights for policy 1, policy_version 91230 (0.0009) [2023-10-10 08:29:00,610][53252] Updated weights for policy 0, policy_version 91300 (0.0009) [2023-10-10 08:29:00,993][53252] Updated weights for policy 0, policy_version 91310 (0.0008) [2023-10-10 08:29:01,370][53252] Updated weights for policy 0, policy_version 91320 (0.0008) [2023-10-10 08:29:01,783][52050] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186941440. Throughput: 0: 1677.9, 1: 1679.6. Samples: 46737130. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:29:01,784][52050] Avg episode reward: [(0, '21.170'), (1, '24.400')] [2023-10-10 08:29:02,914][53268] Updated weights for policy 1, policy_version 91240 (0.0009) [2023-10-10 08:29:03,281][53268] Updated weights for policy 1, policy_version 91250 (0.0009) [2023-10-10 08:29:03,649][53268] Updated weights for policy 1, policy_version 91260 (0.0009) [2023-10-10 08:29:05,301][53252] Updated weights for policy 0, policy_version 91330 (0.0009) [2023-10-10 08:29:05,674][53252] Updated weights for policy 0, policy_version 91340 (0.0008) [2023-10-10 08:29:06,045][53252] Updated weights for policy 0, policy_version 91350 (0.0007) [2023-10-10 08:29:06,412][53252] Updated weights for policy 0, policy_version 91360 (0.0010) [2023-10-10 08:29:06,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187006976. Throughput: 0: 1680.3, 1: 1690.7. Samples: 46757664. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:29:06,784][52050] Avg episode reward: [(0, '23.610'), (1, '23.460')] [2023-10-10 08:29:07,867][53268] Updated weights for policy 1, policy_version 91270 (0.0008) [2023-10-10 08:29:08,225][53268] Updated weights for policy 1, policy_version 91280 (0.0008) [2023-10-10 08:29:08,598][53268] Updated weights for policy 1, policy_version 91290 (0.0007) [2023-10-10 08:29:10,382][53252] Updated weights for policy 0, policy_version 91370 (0.0007) [2023-10-10 08:29:10,755][53252] Updated weights for policy 0, policy_version 91380 (0.0007) [2023-10-10 08:29:11,123][53252] Updated weights for policy 0, policy_version 91390 (0.0010) [2023-10-10 08:29:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187072512. Throughput: 0: 1660.1, 1: 1688.2. Samples: 46777536. Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:29:11,785][52050] Avg episode reward: [(0, '22.390'), (1, '22.220')] [2023-10-10 08:29:12,463][53268] Updated weights for policy 1, policy_version 91300 (0.0008) [2023-10-10 08:29:12,827][53268] Updated weights for policy 1, policy_version 91310 (0.0009) [2023-10-10 08:29:13,199][53268] Updated weights for policy 1, policy_version 91320 (0.0011) [2023-10-10 08:29:15,016][53252] Updated weights for policy 0, policy_version 91400 (0.0009) [2023-10-10 08:29:15,389][53252] Updated weights for policy 0, policy_version 91410 (0.0009) [2023-10-10 08:29:15,762][53252] Updated weights for policy 0, policy_version 91420 (0.0007) [2023-10-10 08:29:16,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 187138048. Throughput: 0: 1688.1, 1: 1673.4. Samples: 46787852. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-10 08:29:16,784][52050] Avg episode reward: [(0, '22.230'), (1, '22.520')] [2023-10-10 08:29:17,401][53268] Updated weights for policy 1, policy_version 91330 (0.0009) [2023-10-10 08:29:17,758][53268] Updated weights for policy 1, policy_version 91340 (0.0008) [2023-10-10 08:29:18,121][53268] Updated weights for policy 1, policy_version 91350 (0.0010) [2023-10-10 08:29:18,485][53268] Updated weights for policy 1, policy_version 91360 (0.0008) [2023-10-10 08:29:19,782][53252] Updated weights for policy 0, policy_version 91430 (0.0008) [2023-10-10 08:29:20,153][53252] Updated weights for policy 0, policy_version 91440 (0.0008) [2023-10-10 08:29:20,517][53252] Updated weights for policy 0, policy_version 91450 (0.0007) [2023-10-10 08:29:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187203584. Throughput: 0: 1672.0, 1: 1682.9. Samples: 46807760. Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:21,784][52050] Avg episode reward: [(0, '21.700'), (1, '20.340')] [2023-10-10 08:29:22,465][53268] Updated weights for policy 1, policy_version 91370 (0.0009) [2023-10-10 08:29:22,827][53268] Updated weights for policy 1, policy_version 91380 (0.0009) [2023-10-10 08:29:23,198][53268] Updated weights for policy 1, policy_version 91390 (0.0008) [2023-10-10 08:29:24,610][53252] Updated weights for policy 0, policy_version 91460 (0.0009) [2023-10-10 08:29:24,996][53252] Updated weights for policy 0, policy_version 91470 (0.0010) [2023-10-10 08:29:25,365][53252] Updated weights for policy 0, policy_version 91480 (0.0007) [2023-10-10 08:29:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187269120. Throughput: 0: 1679.2, 1: 1684.4. Samples: 46828326. 
Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:26,784][52050] Avg episode reward: [(0, '24.560'), (1, '20.460')] [2023-10-10 08:29:27,324][53268] Updated weights for policy 1, policy_version 91400 (0.0007) [2023-10-10 08:29:27,683][53268] Updated weights for policy 1, policy_version 91410 (0.0007) [2023-10-10 08:29:28,058][53268] Updated weights for policy 1, policy_version 91420 (0.0009) [2023-10-10 08:29:29,495][53252] Updated weights for policy 0, policy_version 91490 (0.0007) [2023-10-10 08:29:29,853][53252] Updated weights for policy 0, policy_version 91500 (0.0009) [2023-10-10 08:29:30,220][53252] Updated weights for policy 0, policy_version 91510 (0.0010) [2023-10-10 08:29:30,589][53252] Updated weights for policy 0, policy_version 91520 (0.0007) [2023-10-10 08:29:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187334656. Throughput: 0: 1697.3, 1: 1682.2. Samples: 46838678. Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:31,784][52050] Avg episode reward: [(0, '22.660'), (1, '22.030')] [2023-10-10 08:29:32,060][53268] Updated weights for policy 1, policy_version 91430 (0.0009) [2023-10-10 08:29:32,436][53268] Updated weights for policy 1, policy_version 91440 (0.0009) [2023-10-10 08:29:32,796][53268] Updated weights for policy 1, policy_version 91450 (0.0008) [2023-10-10 08:29:34,650][53252] Updated weights for policy 0, policy_version 91530 (0.0010) [2023-10-10 08:29:35,031][53252] Updated weights for policy 0, policy_version 91540 (0.0009) [2023-10-10 08:29:35,395][53252] Updated weights for policy 0, policy_version 91550 (0.0010) [2023-10-10 08:29:36,757][53268] Updated weights for policy 1, policy_version 91460 (0.0009) [2023-10-10 08:29:36,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187400192. Throughput: 0: 1671.6, 1: 1684.1. Samples: 46858164. 
Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:36,784][52050] Avg episode reward: [(0, '23.630'), (1, '22.930')] [2023-10-10 08:29:37,123][53268] Updated weights for policy 1, policy_version 91470 (0.0010) [2023-10-10 08:29:37,505][53268] Updated weights for policy 1, policy_version 91480 (0.0009) [2023-10-10 08:29:39,371][53252] Updated weights for policy 0, policy_version 91560 (0.0008) [2023-10-10 08:29:39,747][53252] Updated weights for policy 0, policy_version 91570 (0.0008) [2023-10-10 08:29:40,129][53252] Updated weights for policy 0, policy_version 91580 (0.0009) [2023-10-10 08:29:41,646][53268] Updated weights for policy 1, policy_version 91490 (0.0009) [2023-10-10 08:29:41,783][52050] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187465728. Throughput: 0: 1687.7, 1: 1681.8. Samples: 46878850. Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:41,784][52050] Avg episode reward: [(0, '22.390'), (1, '21.770')] [2023-10-10 08:29:42,004][53268] Updated weights for policy 1, policy_version 91500 (0.0009) [2023-10-10 08:29:42,374][53268] Updated weights for policy 1, policy_version 91510 (0.0009) [2023-10-10 08:29:42,728][53268] Updated weights for policy 1, policy_version 91520 (0.0010) [2023-10-10 08:29:44,236][53252] Updated weights for policy 0, policy_version 91590 (0.0008) [2023-10-10 08:29:44,605][53252] Updated weights for policy 0, policy_version 91600 (0.0008) [2023-10-10 08:29:44,976][53252] Updated weights for policy 0, policy_version 91610 (0.0009) [2023-10-10 08:29:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187531264. Throughput: 0: 1692.4, 1: 1680.0. Samples: 46888888. 
Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:46,784][52050] Avg episode reward: [(0, '22.810'), (1, '22.910')] [2023-10-10 08:29:46,887][53268] Updated weights for policy 1, policy_version 91530 (0.0011) [2023-10-10 08:29:47,244][53268] Updated weights for policy 1, policy_version 91540 (0.0010) [2023-10-10 08:29:47,603][53268] Updated weights for policy 1, policy_version 91550 (0.0010) [2023-10-10 08:29:48,950][53252] Updated weights for policy 0, policy_version 91620 (0.0008) [2023-10-10 08:29:49,317][53252] Updated weights for policy 0, policy_version 91630 (0.0007) [2023-10-10 08:29:49,697][53252] Updated weights for policy 0, policy_version 91640 (0.0007) [2023-10-10 08:29:51,593][53268] Updated weights for policy 1, policy_version 91560 (0.0008) [2023-10-10 08:29:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187596800. Throughput: 0: 1675.6, 1: 1683.1. Samples: 46908806. Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:51,784][52050] Avg episode reward: [(0, '23.910'), (1, '22.820')] [2023-10-10 08:29:51,954][53268] Updated weights for policy 1, policy_version 91570 (0.0010) [2023-10-10 08:29:52,328][53268] Updated weights for policy 1, policy_version 91580 (0.0008) [2023-10-10 08:29:53,733][53252] Updated weights for policy 0, policy_version 91650 (0.0009) [2023-10-10 08:29:54,130][53252] Updated weights for policy 0, policy_version 91660 (0.0007) [2023-10-10 08:29:54,492][53252] Updated weights for policy 0, policy_version 91670 (0.0008) [2023-10-10 08:29:54,862][53252] Updated weights for policy 0, policy_version 91680 (0.0007) [2023-10-10 08:29:56,519][53268] Updated weights for policy 1, policy_version 91590 (0.0008) [2023-10-10 08:29:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187662336. Throughput: 0: 1704.8, 1: 1680.5. Samples: 46929874. 
Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:29:56,784][52050] Avg episode reward: [(0, '23.150'), (1, '20.730')] [2023-10-10 08:29:56,905][53268] Updated weights for policy 1, policy_version 91600 (0.0008) [2023-10-10 08:29:57,272][53268] Updated weights for policy 1, policy_version 91610 (0.0009) [2023-10-10 08:29:58,871][53252] Updated weights for policy 0, policy_version 91690 (0.0009) [2023-10-10 08:29:59,236][53252] Updated weights for policy 0, policy_version 91700 (0.0009) [2023-10-10 08:29:59,613][53252] Updated weights for policy 0, policy_version 91710 (0.0007) [2023-10-10 08:30:01,283][53268] Updated weights for policy 1, policy_version 91620 (0.0008) [2023-10-10 08:30:01,656][53268] Updated weights for policy 1, policy_version 91630 (0.0007) [2023-10-10 08:30:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 187727872. Throughput: 0: 1683.3, 1: 1682.5. Samples: 46939314. Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:30:01,784][52050] Avg episode reward: [(0, '21.650'), (1, '20.170')] [2023-10-10 08:30:02,017][53268] Updated weights for policy 1, policy_version 91640 (0.0008) [2023-10-10 08:30:03,596][53252] Updated weights for policy 0, policy_version 91720 (0.0009) [2023-10-10 08:30:03,962][53252] Updated weights for policy 0, policy_version 91730 (0.0009) [2023-10-10 08:30:04,335][53252] Updated weights for policy 0, policy_version 91740 (0.0008) [2023-10-10 08:30:06,245][53268] Updated weights for policy 1, policy_version 91650 (0.0009) [2023-10-10 08:30:06,616][53268] Updated weights for policy 1, policy_version 91660 (0.0007) [2023-10-10 08:30:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 187793408. Throughput: 0: 1696.6, 1: 1680.5. Samples: 46959730. 
Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:30:06,784][52050] Avg episode reward: [(0, '22.400'), (1, '20.190')] [2023-10-10 08:30:06,993][53268] Updated weights for policy 1, policy_version 91670 (0.0009) [2023-10-10 08:30:07,361][53268] Updated weights for policy 1, policy_version 91680 (0.0009) [2023-10-10 08:30:08,376][53252] Updated weights for policy 0, policy_version 91750 (0.0007) [2023-10-10 08:30:08,751][53252] Updated weights for policy 0, policy_version 91760 (0.0008) [2023-10-10 08:30:09,135][53252] Updated weights for policy 0, policy_version 91770 (0.0007) [2023-10-10 08:30:11,544][53268] Updated weights for policy 1, policy_version 91690 (0.0008) [2023-10-10 08:30:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 187858944. Throughput: 0: 1701.3, 1: 1677.1. Samples: 46980356. Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:30:11,784][52050] Avg episode reward: [(0, '20.770'), (1, '21.700')] [2023-10-10 08:30:11,915][53268] Updated weights for policy 1, policy_version 91700 (0.0008) [2023-10-10 08:30:12,276][53268] Updated weights for policy 1, policy_version 91710 (0.0008) [2023-10-10 08:30:13,193][53252] Updated weights for policy 0, policy_version 91780 (0.0009) [2023-10-10 08:30:13,559][53252] Updated weights for policy 0, policy_version 91790 (0.0009) [2023-10-10 08:30:13,933][53252] Updated weights for policy 0, policy_version 91800 (0.0010) [2023-10-10 08:30:16,294][53268] Updated weights for policy 1, policy_version 91720 (0.0007) [2023-10-10 08:30:16,666][53268] Updated weights for policy 1, policy_version 91730 (0.0009) [2023-10-10 08:30:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187924480. Throughput: 0: 1674.5, 1: 1677.9. Samples: 46989536. 
Policy #0 lag: (min: 10.0, avg: 17.4, max: 42.0) [2023-10-10 08:30:16,784][52050] Avg episode reward: [(0, '22.190'), (1, '22.120')] [2023-10-10 08:30:17,024][53268] Updated weights for policy 1, policy_version 91740 (0.0009) [2023-10-10 08:30:17,954][53252] Updated weights for policy 0, policy_version 91810 (0.0009) [2023-10-10 08:30:18,327][53252] Updated weights for policy 0, policy_version 91820 (0.0007) [2023-10-10 08:30:18,698][53252] Updated weights for policy 0, policy_version 91830 (0.0007) [2023-10-10 08:30:19,060][53252] Updated weights for policy 0, policy_version 91840 (0.0008) [2023-10-10 08:30:20,922][53268] Updated weights for policy 1, policy_version 91750 (0.0008) [2023-10-10 08:30:21,290][53268] Updated weights for policy 1, policy_version 91760 (0.0008) [2023-10-10 08:30:21,662][53268] Updated weights for policy 1, policy_version 91770 (0.0009) [2023-10-10 08:30:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187990016. Throughput: 0: 1704.8, 1: 1682.6. Samples: 47010600. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:21,784][52050] Avg episode reward: [(0, '24.130'), (1, '22.320')] [2023-10-10 08:30:23,068][53252] Updated weights for policy 0, policy_version 91850 (0.0007) [2023-10-10 08:30:23,431][53252] Updated weights for policy 0, policy_version 91860 (0.0008) [2023-10-10 08:30:23,798][53252] Updated weights for policy 0, policy_version 91870 (0.0007) [2023-10-10 08:30:25,794][53268] Updated weights for policy 1, policy_version 91780 (0.0012) [2023-10-10 08:30:26,156][53268] Updated weights for policy 1, policy_version 91790 (0.0009) [2023-10-10 08:30:26,522][53268] Updated weights for policy 1, policy_version 91800 (0.0008) [2023-10-10 08:30:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 188055552. Throughput: 0: 1701.0, 1: 1672.8. Samples: 47030670. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:26,784][52050] Avg episode reward: [(0, '22.540'), (1, '22.170')] [2023-10-10 08:30:27,923][53252] Updated weights for policy 0, policy_version 91880 (0.0010) [2023-10-10 08:30:28,292][53252] Updated weights for policy 0, policy_version 91890 (0.0009) [2023-10-10 08:30:28,659][53252] Updated weights for policy 0, policy_version 91900 (0.0008) [2023-10-10 08:30:30,502][53268] Updated weights for policy 1, policy_version 91810 (0.0008) [2023-10-10 08:30:30,865][53268] Updated weights for policy 1, policy_version 91820 (0.0011) [2023-10-10 08:30:31,225][53268] Updated weights for policy 1, policy_version 91830 (0.0009) [2023-10-10 08:30:31,592][53268] Updated weights for policy 1, policy_version 91840 (0.0008) [2023-10-10 08:30:31,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188153856. Throughput: 0: 1679.3, 1: 1687.4. Samples: 47040392. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:31,784][52050] Avg episode reward: [(0, '22.300'), (1, '22.920')] [2023-10-10 08:30:32,824][53252] Updated weights for policy 0, policy_version 91910 (0.0007) [2023-10-10 08:30:33,195][53252] Updated weights for policy 0, policy_version 91920 (0.0007) [2023-10-10 08:30:33,566][53252] Updated weights for policy 0, policy_version 91930 (0.0007) [2023-10-10 08:30:35,545][53268] Updated weights for policy 1, policy_version 91850 (0.0010) [2023-10-10 08:30:35,917][53268] Updated weights for policy 1, policy_version 91860 (0.0008) [2023-10-10 08:30:36,278][53268] Updated weights for policy 1, policy_version 91870 (0.0007) [2023-10-10 08:30:36,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188219392. Throughput: 0: 1695.8, 1: 1688.1. Samples: 47061082. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:36,784][52050] Avg episode reward: [(0, '23.530'), (1, '22.440')] [2023-10-10 08:30:37,728][53252] Updated weights for policy 0, policy_version 91940 (0.0007) [2023-10-10 08:30:38,084][53252] Updated weights for policy 0, policy_version 91950 (0.0009) [2023-10-10 08:30:38,457][53252] Updated weights for policy 0, policy_version 91960 (0.0011) [2023-10-10 08:30:40,505][53268] Updated weights for policy 1, policy_version 91880 (0.0010) [2023-10-10 08:30:40,880][53268] Updated weights for policy 1, policy_version 91890 (0.0011) [2023-10-10 08:30:41,248][53268] Updated weights for policy 1, policy_version 91900 (0.0011) [2023-10-10 08:30:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 188284928. Throughput: 0: 1686.0, 1: 1664.2. Samples: 47080632. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:41,784][52050] Avg episode reward: [(0, '24.150'), (1, '22.050')] [2023-10-10 08:30:41,795][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000091904_94109696.pth... [2023-10-10 08:30:41,795][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000091968_94175232.pth... 
[2023-10-10 08:30:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000090400_92569600.pth [2023-10-10 08:30:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000090336_92504064.pth [2023-10-10 08:30:42,489][53252] Updated weights for policy 0, policy_version 91970 (0.0011) [2023-10-10 08:30:42,867][53252] Updated weights for policy 0, policy_version 91980 (0.0007) [2023-10-10 08:30:43,232][53252] Updated weights for policy 0, policy_version 91990 (0.0009) [2023-10-10 08:30:43,605][53252] Updated weights for policy 0, policy_version 92000 (0.0009) [2023-10-10 08:30:45,342][53268] Updated weights for policy 1, policy_version 91910 (0.0008) [2023-10-10 08:30:45,724][53268] Updated weights for policy 1, policy_version 91920 (0.0008) [2023-10-10 08:30:46,099][53268] Updated weights for policy 1, policy_version 91930 (0.0009) [2023-10-10 08:30:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 188350464. Throughput: 0: 1674.7, 1: 1688.1. Samples: 47090642. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:46,784][52050] Avg episode reward: [(0, '22.670'), (1, '22.430')] [2023-10-10 08:30:47,748][53252] Updated weights for policy 0, policy_version 92010 (0.0008) [2023-10-10 08:30:48,109][53252] Updated weights for policy 0, policy_version 92020 (0.0009) [2023-10-10 08:30:48,483][53252] Updated weights for policy 0, policy_version 92030 (0.0009) [2023-10-10 08:30:50,050][53268] Updated weights for policy 1, policy_version 91940 (0.0009) [2023-10-10 08:30:50,422][53268] Updated weights for policy 1, policy_version 91950 (0.0008) [2023-10-10 08:30:50,782][53268] Updated weights for policy 1, policy_version 91960 (0.0007) [2023-10-10 08:30:51,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188416000. Throughput: 0: 1677.4, 1: 1684.4. Samples: 47111010. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:51,784][52050] Avg episode reward: [(0, '23.620'), (1, '24.530')] [2023-10-10 08:30:52,457][53252] Updated weights for policy 0, policy_version 92040 (0.0007) [2023-10-10 08:30:52,825][53252] Updated weights for policy 0, policy_version 92050 (0.0009) [2023-10-10 08:30:53,201][53252] Updated weights for policy 0, policy_version 92060 (0.0011) [2023-10-10 08:30:54,697][53268] Updated weights for policy 1, policy_version 91970 (0.0009) [2023-10-10 08:30:55,068][53268] Updated weights for policy 1, policy_version 91980 (0.0010) [2023-10-10 08:30:55,436][53268] Updated weights for policy 1, policy_version 91990 (0.0009) [2023-10-10 08:30:55,799][53268] Updated weights for policy 1, policy_version 92000 (0.0010) [2023-10-10 08:30:56,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188481536. Throughput: 0: 1683.2, 1: 1664.4. Samples: 47130998. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:30:56,784][52050] Avg episode reward: [(0, '22.700'), (1, '22.600')] [2023-10-10 08:30:57,113][53252] Updated weights for policy 0, policy_version 92070 (0.0010) [2023-10-10 08:30:57,485][53252] Updated weights for policy 0, policy_version 92080 (0.0009) [2023-10-10 08:30:57,861][53252] Updated weights for policy 0, policy_version 92090 (0.0009) [2023-10-10 08:30:59,812][53268] Updated weights for policy 1, policy_version 92010 (0.0008) [2023-10-10 08:31:00,168][53268] Updated weights for policy 1, policy_version 92020 (0.0007) [2023-10-10 08:31:00,530][53268] Updated weights for policy 1, policy_version 92030 (0.0008) [2023-10-10 08:31:01,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188547072. Throughput: 0: 1681.2, 1: 1691.8. Samples: 47141320. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:31:01,784][52050] Avg episode reward: [(0, '22.910'), (1, '22.620')] [2023-10-10 08:31:01,942][53252] Updated weights for policy 0, policy_version 92100 (0.0008) [2023-10-10 08:31:02,313][53252] Updated weights for policy 0, policy_version 92110 (0.0007) [2023-10-10 08:31:02,681][53252] Updated weights for policy 0, policy_version 92120 (0.0007) [2023-10-10 08:31:04,497][53268] Updated weights for policy 1, policy_version 92040 (0.0010) [2023-10-10 08:31:04,869][53268] Updated weights for policy 1, policy_version 92050 (0.0010) [2023-10-10 08:31:05,242][53268] Updated weights for policy 1, policy_version 92060 (0.0009) [2023-10-10 08:31:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188612608. Throughput: 0: 1676.6, 1: 1673.5. Samples: 47161352. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:31:06,784][52050] Avg episode reward: [(0, '21.480'), (1, '22.350')] [2023-10-10 08:31:06,825][53252] Updated weights for policy 0, policy_version 92130 (0.0007) [2023-10-10 08:31:07,203][53252] Updated weights for policy 0, policy_version 92140 (0.0008) [2023-10-10 08:31:07,572][53252] Updated weights for policy 0, policy_version 92150 (0.0007) [2023-10-10 08:31:07,938][53252] Updated weights for policy 0, policy_version 92160 (0.0008) [2023-10-10 08:31:09,462][53268] Updated weights for policy 1, policy_version 92070 (0.0010) [2023-10-10 08:31:09,820][53268] Updated weights for policy 1, policy_version 92080 (0.0008) [2023-10-10 08:31:10,183][53268] Updated weights for policy 1, policy_version 92090 (0.0010) [2023-10-10 08:31:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188678144. Throughput: 0: 1681.6, 1: 1677.9. Samples: 47181848. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:31:11,784][52050] Avg episode reward: [(0, '21.880'), (1, '21.560')] [2023-10-10 08:31:11,849][53252] Updated weights for policy 0, policy_version 92170 (0.0009) [2023-10-10 08:31:12,218][53252] Updated weights for policy 0, policy_version 92180 (0.0008) [2023-10-10 08:31:12,595][53252] Updated weights for policy 0, policy_version 92190 (0.0007) [2023-10-10 08:31:14,092][53268] Updated weights for policy 1, policy_version 92100 (0.0010) [2023-10-10 08:31:14,463][53268] Updated weights for policy 1, policy_version 92110 (0.0009) [2023-10-10 08:31:14,825][53268] Updated weights for policy 1, policy_version 92120 (0.0010) [2023-10-10 08:31:16,605][53252] Updated weights for policy 0, policy_version 92200 (0.0009) [2023-10-10 08:31:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188743680. Throughput: 0: 1683.4, 1: 1690.8. Samples: 47192230. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-10 08:31:16,784][52050] Avg episode reward: [(0, '21.560'), (1, '19.960')] [2023-10-10 08:31:16,976][53252] Updated weights for policy 0, policy_version 92210 (0.0010) [2023-10-10 08:31:17,345][53252] Updated weights for policy 0, policy_version 92220 (0.0010) [2023-10-10 08:31:18,922][53268] Updated weights for policy 1, policy_version 92130 (0.0010) [2023-10-10 08:31:19,285][53268] Updated weights for policy 1, policy_version 92140 (0.0010) [2023-10-10 08:31:19,652][53268] Updated weights for policy 1, policy_version 92150 (0.0011) [2023-10-10 08:31:20,017][53268] Updated weights for policy 1, policy_version 92160 (0.0008) [2023-10-10 08:31:21,504][53252] Updated weights for policy 0, policy_version 92230 (0.0011) [2023-10-10 08:31:21,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 188809216. Throughput: 0: 1684.4, 1: 1664.9. Samples: 47211798. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:21,784][52050] Avg episode reward: [(0, '23.920'), (1, '20.080')] [2023-10-10 08:31:21,868][53252] Updated weights for policy 0, policy_version 92240 (0.0009) [2023-10-10 08:31:22,239][53252] Updated weights for policy 0, policy_version 92250 (0.0007) [2023-10-10 08:31:24,190][53268] Updated weights for policy 1, policy_version 92170 (0.0008) [2023-10-10 08:31:24,561][53268] Updated weights for policy 1, policy_version 92180 (0.0009) [2023-10-10 08:31:24,920][53268] Updated weights for policy 1, policy_version 92190 (0.0011) [2023-10-10 08:31:26,296][53252] Updated weights for policy 0, policy_version 92260 (0.0008) [2023-10-10 08:31:26,669][53252] Updated weights for policy 0, policy_version 92270 (0.0009) [2023-10-10 08:31:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188874752. Throughput: 0: 1683.2, 1: 1688.3. Samples: 47232346. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:26,784][52050] Avg episode reward: [(0, '22.200'), (1, '20.780')] [2023-10-10 08:31:27,040][53252] Updated weights for policy 0, policy_version 92280 (0.0008) [2023-10-10 08:31:29,019][53268] Updated weights for policy 1, policy_version 92200 (0.0009) [2023-10-10 08:31:29,387][53268] Updated weights for policy 1, policy_version 92210 (0.0009) [2023-10-10 08:31:29,759][53268] Updated weights for policy 1, policy_version 92220 (0.0011) [2023-10-10 08:31:31,227][53252] Updated weights for policy 0, policy_version 92290 (0.0007) [2023-10-10 08:31:31,617][53252] Updated weights for policy 0, policy_version 92300 (0.0009) [2023-10-10 08:31:31,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 188940288. Throughput: 0: 1689.9, 1: 1681.8. Samples: 47242372. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:31,784][52050] Avg episode reward: [(0, '22.930'), (1, '21.540')] [2023-10-10 08:31:31,989][53252] Updated weights for policy 0, policy_version 92310 (0.0008) [2023-10-10 08:31:32,374][53252] Updated weights for policy 0, policy_version 92320 (0.0007) [2023-10-10 08:31:33,651][53268] Updated weights for policy 1, policy_version 92230 (0.0009) [2023-10-10 08:31:34,032][53268] Updated weights for policy 1, policy_version 92240 (0.0009) [2023-10-10 08:31:34,397][53268] Updated weights for policy 1, policy_version 92250 (0.0008) [2023-10-10 08:31:36,263][53252] Updated weights for policy 0, policy_version 92330 (0.0007) [2023-10-10 08:31:36,629][53252] Updated weights for policy 0, policy_version 92340 (0.0007) [2023-10-10 08:31:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189005824. Throughput: 0: 1687.0, 1: 1674.5. Samples: 47262276. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:36,784][52050] Avg episode reward: [(0, '22.430'), (1, '22.440')] [2023-10-10 08:31:37,001][53252] Updated weights for policy 0, policy_version 92350 (0.0009) [2023-10-10 08:31:38,625][53268] Updated weights for policy 1, policy_version 92260 (0.0009) [2023-10-10 08:31:38,991][53268] Updated weights for policy 1, policy_version 92270 (0.0009) [2023-10-10 08:31:39,360][53268] Updated weights for policy 1, policy_version 92280 (0.0010) [2023-10-10 08:31:41,364][53252] Updated weights for policy 0, policy_version 92360 (0.0007) [2023-10-10 08:31:41,729][53252] Updated weights for policy 0, policy_version 92370 (0.0008) [2023-10-10 08:31:41,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189071360. Throughput: 0: 1667.2, 1: 1697.4. Samples: 47282402. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:41,784][52050] Avg episode reward: [(0, '21.290'), (1, '24.070')] [2023-10-10 08:31:42,105][53252] Updated weights for policy 0, policy_version 92380 (0.0009) [2023-10-10 08:31:43,472][53268] Updated weights for policy 1, policy_version 92290 (0.0008) [2023-10-10 08:31:43,840][53268] Updated weights for policy 1, policy_version 92300 (0.0007) [2023-10-10 08:31:44,204][53268] Updated weights for policy 1, policy_version 92310 (0.0008) [2023-10-10 08:31:44,569][53268] Updated weights for policy 1, policy_version 92320 (0.0011) [2023-10-10 08:31:46,273][53252] Updated weights for policy 0, policy_version 92390 (0.0010) [2023-10-10 08:31:46,645][53252] Updated weights for policy 0, policy_version 92400 (0.0007) [2023-10-10 08:31:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189136896. Throughput: 0: 1676.6, 1: 1676.6. Samples: 47292216. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:46,784][52050] Avg episode reward: [(0, '21.320'), (1, '23.500')] [2023-10-10 08:31:47,027][53252] Updated weights for policy 0, policy_version 92410 (0.0007) [2023-10-10 08:31:48,719][53268] Updated weights for policy 1, policy_version 92330 (0.0008) [2023-10-10 08:31:49,093][53268] Updated weights for policy 1, policy_version 92340 (0.0007) [2023-10-10 08:31:49,462][53268] Updated weights for policy 1, policy_version 92350 (0.0010) [2023-10-10 08:31:51,130][53252] Updated weights for policy 0, policy_version 92420 (0.0008) [2023-10-10 08:31:51,495][53252] Updated weights for policy 0, policy_version 92430 (0.0007) [2023-10-10 08:31:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189202432. Throughput: 0: 1676.1, 1: 1673.2. Samples: 47312068. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:51,784][52050] Avg episode reward: [(0, '23.560'), (1, '22.820')] [2023-10-10 08:31:51,872][53252] Updated weights for policy 0, policy_version 92440 (0.0007) [2023-10-10 08:31:53,384][53268] Updated weights for policy 1, policy_version 92360 (0.0010) [2023-10-10 08:31:53,749][53268] Updated weights for policy 1, policy_version 92370 (0.0008) [2023-10-10 08:31:54,121][53268] Updated weights for policy 1, policy_version 92380 (0.0009) [2023-10-10 08:31:55,835][53252] Updated weights for policy 0, policy_version 92450 (0.0008) [2023-10-10 08:31:56,205][53252] Updated weights for policy 0, policy_version 92460 (0.0009) [2023-10-10 08:31:56,576][53252] Updated weights for policy 0, policy_version 92470 (0.0010) [2023-10-10 08:31:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189267968. Throughput: 0: 1659.9, 1: 1683.4. Samples: 47332294. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:31:56,784][52050] Avg episode reward: [(0, '23.390'), (1, '22.150')] [2023-10-10 08:31:56,951][53252] Updated weights for policy 0, policy_version 92480 (0.0009) [2023-10-10 08:31:58,156][53268] Updated weights for policy 1, policy_version 92390 (0.0008) [2023-10-10 08:31:58,524][53268] Updated weights for policy 1, policy_version 92400 (0.0007) [2023-10-10 08:31:58,897][53268] Updated weights for policy 1, policy_version 92410 (0.0009) [2023-10-10 08:32:01,033][53252] Updated weights for policy 0, policy_version 92490 (0.0010) [2023-10-10 08:32:01,407][53252] Updated weights for policy 0, policy_version 92500 (0.0008) [2023-10-10 08:32:01,783][53252] Updated weights for policy 0, policy_version 92510 (0.0008) [2023-10-10 08:32:01,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 189333504. Throughput: 0: 1671.0, 1: 1656.9. Samples: 47341986. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:32:01,784][52050] Avg episode reward: [(0, '21.610'), (1, '21.790')] [2023-10-10 08:32:02,986][53268] Updated weights for policy 1, policy_version 92420 (0.0007) [2023-10-10 08:32:03,356][53268] Updated weights for policy 1, policy_version 92430 (0.0007) [2023-10-10 08:32:03,722][53268] Updated weights for policy 1, policy_version 92440 (0.0007) [2023-10-10 08:32:05,884][53252] Updated weights for policy 0, policy_version 92520 (0.0007) [2023-10-10 08:32:06,266][53252] Updated weights for policy 0, policy_version 92530 (0.0007) [2023-10-10 08:32:06,638][53252] Updated weights for policy 0, policy_version 92540 (0.0008) [2023-10-10 08:32:06,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 189431808. Throughput: 0: 1673.5, 1: 1684.5. Samples: 47362906. Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:32:06,784][52050] Avg episode reward: [(0, '22.930'), (1, '19.960')] [2023-10-10 08:32:07,777][53268] Updated weights for policy 1, policy_version 92450 (0.0007) [2023-10-10 08:32:08,140][53268] Updated weights for policy 1, policy_version 92460 (0.0009) [2023-10-10 08:32:08,505][53268] Updated weights for policy 1, policy_version 92470 (0.0008) [2023-10-10 08:32:08,865][53268] Updated weights for policy 1, policy_version 92480 (0.0009) [2023-10-10 08:32:10,612][53252] Updated weights for policy 0, policy_version 92550 (0.0008) [2023-10-10 08:32:10,972][53252] Updated weights for policy 0, policy_version 92560 (0.0008) [2023-10-10 08:32:11,352][53252] Updated weights for policy 0, policy_version 92570 (0.0009) [2023-10-10 08:32:11,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189497344. Throughput: 0: 1656.3, 1: 1690.3. Samples: 47382944. 
Policy #0 lag: (min: 31.0, avg: 45.7, max: 63.0) [2023-10-10 08:32:11,784][52050] Avg episode reward: [(0, '22.170'), (1, '21.150')] [2023-10-10 08:32:12,898][53268] Updated weights for policy 1, policy_version 92490 (0.0009) [2023-10-10 08:32:13,274][53268] Updated weights for policy 1, policy_version 92500 (0.0009) [2023-10-10 08:32:13,643][53268] Updated weights for policy 1, policy_version 92510 (0.0008) [2023-10-10 08:32:15,347][53252] Updated weights for policy 0, policy_version 92580 (0.0009) [2023-10-10 08:32:15,727][53252] Updated weights for policy 0, policy_version 92590 (0.0007) [2023-10-10 08:32:16,094][53252] Updated weights for policy 0, policy_version 92600 (0.0008) [2023-10-10 08:32:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189562880. Throughput: 0: 1680.4, 1: 1673.2. Samples: 47393284. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:16,784][52050] Avg episode reward: [(0, '21.170'), (1, '21.000')] [2023-10-10 08:32:17,702][53268] Updated weights for policy 1, policy_version 92520 (0.0008) [2023-10-10 08:32:18,074][53268] Updated weights for policy 1, policy_version 92530 (0.0010) [2023-10-10 08:32:18,441][53268] Updated weights for policy 1, policy_version 92540 (0.0009) [2023-10-10 08:32:19,951][53252] Updated weights for policy 0, policy_version 92610 (0.0009) [2023-10-10 08:32:20,323][53252] Updated weights for policy 0, policy_version 92620 (0.0009) [2023-10-10 08:32:20,702][53252] Updated weights for policy 0, policy_version 92630 (0.0008) [2023-10-10 08:32:21,071][53252] Updated weights for policy 0, policy_version 92640 (0.0009) [2023-10-10 08:32:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189628416. Throughput: 0: 1677.0, 1: 1684.5. Samples: 47413544. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:21,784][52050] Avg episode reward: [(0, '20.450'), (1, '20.900')] [2023-10-10 08:32:22,601][53268] Updated weights for policy 1, policy_version 92550 (0.0008) [2023-10-10 08:32:22,984][53268] Updated weights for policy 1, policy_version 92560 (0.0011) [2023-10-10 08:32:23,343][53268] Updated weights for policy 1, policy_version 92570 (0.0009) [2023-10-10 08:32:24,954][53252] Updated weights for policy 0, policy_version 92650 (0.0007) [2023-10-10 08:32:25,329][53252] Updated weights for policy 0, policy_version 92660 (0.0009) [2023-10-10 08:32:25,702][53252] Updated weights for policy 0, policy_version 92670 (0.0008) [2023-10-10 08:32:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189693952. Throughput: 0: 1681.7, 1: 1684.3. Samples: 47433874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:26,784][52050] Avg episode reward: [(0, '21.500'), (1, '20.490')] [2023-10-10 08:32:27,310][53268] Updated weights for policy 1, policy_version 92580 (0.0008) [2023-10-10 08:32:27,674][53268] Updated weights for policy 1, policy_version 92590 (0.0008) [2023-10-10 08:32:28,044][53268] Updated weights for policy 1, policy_version 92600 (0.0010) [2023-10-10 08:32:29,606][53252] Updated weights for policy 0, policy_version 92680 (0.0009) [2023-10-10 08:32:29,985][53252] Updated weights for policy 0, policy_version 92690 (0.0007) [2023-10-10 08:32:30,354][53252] Updated weights for policy 0, policy_version 92700 (0.0007) [2023-10-10 08:32:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189759488. Throughput: 0: 1707.1, 1: 1675.9. Samples: 47444452. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:31,784][52050] Avg episode reward: [(0, '21.210'), (1, '22.850')] [2023-10-10 08:32:32,091][53268] Updated weights for policy 1, policy_version 92610 (0.0009) [2023-10-10 08:32:32,448][53268] Updated weights for policy 1, policy_version 92620 (0.0009) [2023-10-10 08:32:32,820][53268] Updated weights for policy 1, policy_version 92630 (0.0008) [2023-10-10 08:32:33,179][53268] Updated weights for policy 1, policy_version 92640 (0.0007) [2023-10-10 08:32:34,437][53252] Updated weights for policy 0, policy_version 92710 (0.0008) [2023-10-10 08:32:34,801][53252] Updated weights for policy 0, policy_version 92720 (0.0009) [2023-10-10 08:32:35,177][53252] Updated weights for policy 0, policy_version 92730 (0.0008) [2023-10-10 08:32:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189825024. Throughput: 0: 1683.9, 1: 1695.9. Samples: 47464160. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:36,784][52050] Avg episode reward: [(0, '20.280'), (1, '21.760')] [2023-10-10 08:32:37,311][53268] Updated weights for policy 1, policy_version 92650 (0.0009) [2023-10-10 08:32:37,677][53268] Updated weights for policy 1, policy_version 92660 (0.0008) [2023-10-10 08:32:38,044][53268] Updated weights for policy 1, policy_version 92670 (0.0008) [2023-10-10 08:32:39,256][53252] Updated weights for policy 0, policy_version 92740 (0.0008) [2023-10-10 08:32:39,632][53252] Updated weights for policy 0, policy_version 92750 (0.0010) [2023-10-10 08:32:40,002][53252] Updated weights for policy 0, policy_version 92760 (0.0011) [2023-10-10 08:32:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189890560. Throughput: 0: 1696.2, 1: 1698.0. Samples: 47485032. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:41,784][52050] Avg episode reward: [(0, '21.580'), (1, '21.680')] [2023-10-10 08:32:41,792][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000092768_94994432.pth... [2023-10-10 08:32:41,824][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000091200_93388800.pth [2023-10-10 08:32:41,924][53268] Updated weights for policy 1, policy_version 92680 (0.0008) [2023-10-10 08:32:42,295][53268] Updated weights for policy 1, policy_version 92690 (0.0011) [2023-10-10 08:32:42,676][53268] Updated weights for policy 1, policy_version 92700 (0.0008) [2023-10-10 08:32:42,811][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000092704_94928896.pth... [2023-10-10 08:32:42,851][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000091104_93290496.pth [2023-10-10 08:32:43,965][53252] Updated weights for policy 0, policy_version 92770 (0.0010) [2023-10-10 08:32:44,336][53252] Updated weights for policy 0, policy_version 92780 (0.0009) [2023-10-10 08:32:44,705][53252] Updated weights for policy 0, policy_version 92790 (0.0008) [2023-10-10 08:32:45,083][53252] Updated weights for policy 0, policy_version 92800 (0.0007) [2023-10-10 08:32:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189956096. Throughput: 0: 1702.1, 1: 1698.3. Samples: 47495004. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:46,784][52050] Avg episode reward: [(0, '23.660'), (1, '22.190')] [2023-10-10 08:32:46,881][53268] Updated weights for policy 1, policy_version 92710 (0.0009) [2023-10-10 08:32:47,247][53268] Updated weights for policy 1, policy_version 92720 (0.0009) [2023-10-10 08:32:47,607][53268] Updated weights for policy 1, policy_version 92730 (0.0009) [2023-10-10 08:32:49,218][53252] Updated weights for policy 0, policy_version 92810 (0.0009) [2023-10-10 08:32:49,575][53252] Updated weights for policy 0, policy_version 92820 (0.0009) [2023-10-10 08:32:49,953][53252] Updated weights for policy 0, policy_version 92830 (0.0008) [2023-10-10 08:32:51,665][53268] Updated weights for policy 1, policy_version 92740 (0.0009) [2023-10-10 08:32:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190021632. Throughput: 0: 1682.1, 1: 1694.6. Samples: 47514858. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:51,784][52050] Avg episode reward: [(0, '22.160'), (1, '22.490')] [2023-10-10 08:32:52,030][53268] Updated weights for policy 1, policy_version 92750 (0.0008) [2023-10-10 08:32:52,392][53268] Updated weights for policy 1, policy_version 92760 (0.0009) [2023-10-10 08:32:54,028][53252] Updated weights for policy 0, policy_version 92840 (0.0008) [2023-10-10 08:32:54,406][53252] Updated weights for policy 0, policy_version 92850 (0.0008) [2023-10-10 08:32:54,785][53252] Updated weights for policy 0, policy_version 92860 (0.0008) [2023-10-10 08:32:56,517][53268] Updated weights for policy 1, policy_version 92770 (0.0009) [2023-10-10 08:32:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190087168. Throughput: 0: 1712.5, 1: 1689.0. Samples: 47536014. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:32:56,784][52050] Avg episode reward: [(0, '21.370'), (1, '20.960')] [2023-10-10 08:32:56,878][53268] Updated weights for policy 1, policy_version 92780 (0.0008) [2023-10-10 08:32:57,249][53268] Updated weights for policy 1, policy_version 92790 (0.0010) [2023-10-10 08:32:57,619][53268] Updated weights for policy 1, policy_version 92800 (0.0007) [2023-10-10 08:32:58,645][53252] Updated weights for policy 0, policy_version 92870 (0.0008) [2023-10-10 08:32:59,017][53252] Updated weights for policy 0, policy_version 92880 (0.0007) [2023-10-10 08:32:59,374][53252] Updated weights for policy 0, policy_version 92890 (0.0009) [2023-10-10 08:33:01,638][53268] Updated weights for policy 1, policy_version 92810 (0.0011) [2023-10-10 08:33:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190152704. Throughput: 0: 1691.5, 1: 1689.1. Samples: 47545408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:33:01,784][52050] Avg episode reward: [(0, '22.610'), (1, '22.060')] [2023-10-10 08:33:02,010][53268] Updated weights for policy 1, policy_version 92820 (0.0008) [2023-10-10 08:33:02,371][53268] Updated weights for policy 1, policy_version 92830 (0.0007) [2023-10-10 08:33:03,384][53252] Updated weights for policy 0, policy_version 92900 (0.0008) [2023-10-10 08:33:03,781][53252] Updated weights for policy 0, policy_version 92910 (0.0009) [2023-10-10 08:33:04,148][53252] Updated weights for policy 0, policy_version 92920 (0.0009) [2023-10-10 08:33:06,360][53268] Updated weights for policy 1, policy_version 92840 (0.0009) [2023-10-10 08:33:06,726][53268] Updated weights for policy 1, policy_version 92850 (0.0009) [2023-10-10 08:33:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190218240. Throughput: 0: 1693.0, 1: 1693.9. Samples: 47565956. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:33:06,784][52050] Avg episode reward: [(0, '21.210'), (1, '21.730')] [2023-10-10 08:33:07,090][53268] Updated weights for policy 1, policy_version 92860 (0.0009) [2023-10-10 08:33:08,239][53252] Updated weights for policy 0, policy_version 92930 (0.0008) [2023-10-10 08:33:08,602][53252] Updated weights for policy 0, policy_version 92940 (0.0009) [2023-10-10 08:33:08,970][53252] Updated weights for policy 0, policy_version 92950 (0.0009) [2023-10-10 08:33:09,347][53252] Updated weights for policy 0, policy_version 92960 (0.0010) [2023-10-10 08:33:11,278][53268] Updated weights for policy 1, policy_version 92870 (0.0009) [2023-10-10 08:33:11,673][53268] Updated weights for policy 1, policy_version 92880 (0.0009) [2023-10-10 08:33:11,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190283776. Throughput: 0: 1702.4, 1: 1685.5. Samples: 47586326. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:33:11,784][52050] Avg episode reward: [(0, '21.590'), (1, '21.050')] [2023-10-10 08:33:12,041][53268] Updated weights for policy 1, policy_version 92890 (0.0007) [2023-10-10 08:33:13,409][53252] Updated weights for policy 0, policy_version 92970 (0.0011) [2023-10-10 08:33:13,769][53252] Updated weights for policy 0, policy_version 92980 (0.0010) [2023-10-10 08:33:14,139][53252] Updated weights for policy 0, policy_version 92990 (0.0009) [2023-10-10 08:33:16,061][53268] Updated weights for policy 1, policy_version 92900 (0.0007) [2023-10-10 08:33:16,426][53268] Updated weights for policy 1, policy_version 92910 (0.0008) [2023-10-10 08:33:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190349312. Throughput: 0: 1668.5, 1: 1687.2. Samples: 47595460. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:16,784][52050] Avg episode reward: [(0, '23.200'), (1, '21.730')] [2023-10-10 08:33:16,793][53268] Updated weights for policy 1, policy_version 92920 (0.0008) [2023-10-10 08:33:18,167][53252] Updated weights for policy 0, policy_version 93000 (0.0009) [2023-10-10 08:33:18,530][53252] Updated weights for policy 0, policy_version 93010 (0.0008) [2023-10-10 08:33:18,893][53252] Updated weights for policy 0, policy_version 93020 (0.0009) [2023-10-10 08:33:20,841][53268] Updated weights for policy 1, policy_version 92930 (0.0008) [2023-10-10 08:33:21,212][53268] Updated weights for policy 1, policy_version 92940 (0.0009) [2023-10-10 08:33:21,579][53268] Updated weights for policy 1, policy_version 92950 (0.0009) [2023-10-10 08:33:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190414848. Throughput: 0: 1694.3, 1: 1685.8. Samples: 47616266. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:21,784][52050] Avg episode reward: [(0, '23.650'), (1, '21.710')] [2023-10-10 08:33:21,947][53268] Updated weights for policy 1, policy_version 92960 (0.0008) [2023-10-10 08:33:22,751][53252] Updated weights for policy 0, policy_version 93030 (0.0010) [2023-10-10 08:33:23,128][53252] Updated weights for policy 0, policy_version 93040 (0.0007) [2023-10-10 08:33:23,499][53252] Updated weights for policy 0, policy_version 93050 (0.0007) [2023-10-10 08:33:25,959][53268] Updated weights for policy 1, policy_version 92970 (0.0007) [2023-10-10 08:33:26,322][53268] Updated weights for policy 1, policy_version 92980 (0.0012) [2023-10-10 08:33:26,696][53268] Updated weights for policy 1, policy_version 92990 (0.0009) [2023-10-10 08:33:26,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 190513152. Throughput: 0: 1697.3, 1: 1667.5. Samples: 47636448. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:26,784][52050] Avg episode reward: [(0, '23.260'), (1, '21.760')] [2023-10-10 08:33:27,650][53252] Updated weights for policy 0, policy_version 93060 (0.0007) [2023-10-10 08:33:28,016][53252] Updated weights for policy 0, policy_version 93070 (0.0008) [2023-10-10 08:33:28,397][53252] Updated weights for policy 0, policy_version 93080 (0.0011) [2023-10-10 08:33:30,811][53268] Updated weights for policy 1, policy_version 93000 (0.0009) [2023-10-10 08:33:31,176][53268] Updated weights for policy 1, policy_version 93010 (0.0009) [2023-10-10 08:33:31,550][53268] Updated weights for policy 1, policy_version 93020 (0.0007) [2023-10-10 08:33:31,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 190578688. Throughput: 0: 1675.6, 1: 1681.3. Samples: 47646066. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:31,784][52050] Avg episode reward: [(0, '21.540'), (1, '20.920')] [2023-10-10 08:33:32,771][53252] Updated weights for policy 0, policy_version 93090 (0.0009) [2023-10-10 08:33:33,142][53252] Updated weights for policy 0, policy_version 93100 (0.0007) [2023-10-10 08:33:33,513][53252] Updated weights for policy 0, policy_version 93110 (0.0007) [2023-10-10 08:33:33,881][53252] Updated weights for policy 0, policy_version 93120 (0.0009) [2023-10-10 08:33:35,563][53268] Updated weights for policy 1, policy_version 93030 (0.0010) [2023-10-10 08:33:35,928][53268] Updated weights for policy 1, policy_version 93040 (0.0008) [2023-10-10 08:33:36,293][53268] Updated weights for policy 1, policy_version 93050 (0.0009) [2023-10-10 08:33:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 190644224. Throughput: 0: 1691.5, 1: 1689.7. Samples: 47667012. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:36,784][52050] Avg episode reward: [(0, '21.310'), (1, '21.270')] [2023-10-10 08:33:37,872][53252] Updated weights for policy 0, policy_version 93130 (0.0008) [2023-10-10 08:33:38,238][53252] Updated weights for policy 0, policy_version 93140 (0.0008) [2023-10-10 08:33:38,614][53252] Updated weights for policy 0, policy_version 93150 (0.0009) [2023-10-10 08:33:40,227][53268] Updated weights for policy 1, policy_version 93060 (0.0010) [2023-10-10 08:33:40,601][53268] Updated weights for policy 1, policy_version 93070 (0.0009) [2023-10-10 08:33:40,964][53268] Updated weights for policy 1, policy_version 93080 (0.0009) [2023-10-10 08:33:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 190709760. Throughput: 0: 1684.9, 1: 1665.3. Samples: 47686776. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:41,784][52050] Avg episode reward: [(0, '21.240'), (1, '22.190')] [2023-10-10 08:33:42,558][53252] Updated weights for policy 0, policy_version 93160 (0.0008) [2023-10-10 08:33:42,927][53252] Updated weights for policy 0, policy_version 93170 (0.0009) [2023-10-10 08:33:43,298][53252] Updated weights for policy 0, policy_version 93180 (0.0008) [2023-10-10 08:33:45,023][53268] Updated weights for policy 1, policy_version 93090 (0.0009) [2023-10-10 08:33:45,399][53268] Updated weights for policy 1, policy_version 93100 (0.0007) [2023-10-10 08:33:45,777][53268] Updated weights for policy 1, policy_version 93110 (0.0007) [2023-10-10 08:33:46,145][53268] Updated weights for policy 1, policy_version 93120 (0.0007) [2023-10-10 08:33:46,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 190775296. Throughput: 0: 1676.0, 1: 1695.7. Samples: 47697138. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:46,784][52050] Avg episode reward: [(0, '20.620'), (1, '21.400')] [2023-10-10 08:33:47,321][53252] Updated weights for policy 0, policy_version 93190 (0.0010) [2023-10-10 08:33:47,696][53252] Updated weights for policy 0, policy_version 93200 (0.0007) [2023-10-10 08:33:48,062][53252] Updated weights for policy 0, policy_version 93210 (0.0008) [2023-10-10 08:33:50,236][53268] Updated weights for policy 1, policy_version 93130 (0.0010) [2023-10-10 08:33:50,598][53268] Updated weights for policy 1, policy_version 93140 (0.0008) [2023-10-10 08:33:50,964][53268] Updated weights for policy 1, policy_version 93150 (0.0010) [2023-10-10 08:33:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 190840832. Throughput: 0: 1687.6, 1: 1686.9. Samples: 47717810. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:51,784][52050] Avg episode reward: [(0, '21.950'), (1, '21.180')] [2023-10-10 08:33:52,078][53252] Updated weights for policy 0, policy_version 93220 (0.0008) [2023-10-10 08:33:52,470][53252] Updated weights for policy 0, policy_version 93230 (0.0008) [2023-10-10 08:33:52,831][53252] Updated weights for policy 0, policy_version 93240 (0.0009) [2023-10-10 08:33:55,005][53268] Updated weights for policy 1, policy_version 93160 (0.0009) [2023-10-10 08:33:55,383][53268] Updated weights for policy 1, policy_version 93170 (0.0009) [2023-10-10 08:33:55,751][53268] Updated weights for policy 1, policy_version 93180 (0.0009) [2023-10-10 08:33:56,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190906368. Throughput: 0: 1691.4, 1: 1675.3. Samples: 47737826. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:33:56,784][52050] Avg episode reward: [(0, '24.740'), (1, '21.690')] [2023-10-10 08:33:56,803][53252] Updated weights for policy 0, policy_version 93250 (0.0009) [2023-10-10 08:33:57,159][53252] Updated weights for policy 0, policy_version 93260 (0.0007) [2023-10-10 08:33:57,526][53252] Updated weights for policy 0, policy_version 93270 (0.0008) [2023-10-10 08:33:57,893][53252] Updated weights for policy 0, policy_version 93280 (0.0008) [2023-10-10 08:33:59,959][53268] Updated weights for policy 1, policy_version 93190 (0.0010) [2023-10-10 08:34:00,354][53268] Updated weights for policy 1, policy_version 93200 (0.0011) [2023-10-10 08:34:00,728][53268] Updated weights for policy 1, policy_version 93210 (0.0010) [2023-10-10 08:34:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 190971904. Throughput: 0: 1691.6, 1: 1701.0. Samples: 47748126. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:34:01,784][52050] Avg episode reward: [(0, '23.690'), (1, '21.040')] [2023-10-10 08:34:01,971][53252] Updated weights for policy 0, policy_version 93290 (0.0008) [2023-10-10 08:34:02,340][53252] Updated weights for policy 0, policy_version 93300 (0.0008) [2023-10-10 08:34:02,710][53252] Updated weights for policy 0, policy_version 93310 (0.0007) [2023-10-10 08:34:04,788][53268] Updated weights for policy 1, policy_version 93220 (0.0009) [2023-10-10 08:34:05,141][53268] Updated weights for policy 1, policy_version 93230 (0.0009) [2023-10-10 08:34:05,509][53268] Updated weights for policy 1, policy_version 93240 (0.0011) [2023-10-10 08:34:06,779][53252] Updated weights for policy 0, policy_version 93320 (0.0008) [2023-10-10 08:34:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191037440. Throughput: 0: 1693.7, 1: 1679.3. Samples: 47768054. 
Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:34:06,784][52050] Avg episode reward: [(0, '23.650'), (1, '21.260')] [2023-10-10 08:34:07,145][53252] Updated weights for policy 0, policy_version 93330 (0.0008) [2023-10-10 08:34:07,515][53252] Updated weights for policy 0, policy_version 93340 (0.0008) [2023-10-10 08:34:09,558][53268] Updated weights for policy 1, policy_version 93250 (0.0008) [2023-10-10 08:34:09,925][53268] Updated weights for policy 1, policy_version 93260 (0.0010) [2023-10-10 08:34:10,294][53268] Updated weights for policy 1, policy_version 93270 (0.0009) [2023-10-10 08:34:10,655][53268] Updated weights for policy 1, policy_version 93280 (0.0008) [2023-10-10 08:34:11,669][53252] Updated weights for policy 0, policy_version 93350 (0.0007) [2023-10-10 08:34:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 191102976. Throughput: 0: 1688.6, 1: 1677.2. Samples: 47787912. Policy #0 lag: (min: 31.0, avg: 31.8, max: 51.0) [2023-10-10 08:34:11,784][52050] Avg episode reward: [(0, '24.100'), (1, '21.590')] [2023-10-10 08:34:12,041][53252] Updated weights for policy 0, policy_version 93360 (0.0007) [2023-10-10 08:34:12,412][53252] Updated weights for policy 0, policy_version 93370 (0.0008) [2023-10-10 08:34:14,631][53268] Updated weights for policy 1, policy_version 93290 (0.0011) [2023-10-10 08:34:15,012][53268] Updated weights for policy 1, policy_version 93300 (0.0011) [2023-10-10 08:34:15,369][53268] Updated weights for policy 1, policy_version 93310 (0.0011) [2023-10-10 08:34:16,369][53252] Updated weights for policy 0, policy_version 93380 (0.0008) [2023-10-10 08:34:16,736][53252] Updated weights for policy 0, policy_version 93390 (0.0007) [2023-10-10 08:34:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191168512. Throughput: 0: 1693.1, 1: 1692.2. Samples: 47798402. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:16,784][52050] Avg episode reward: [(0, '23.420'), (1, '21.450')] [2023-10-10 08:34:17,104][53252] Updated weights for policy 0, policy_version 93400 (0.0009) [2023-10-10 08:34:19,373][53268] Updated weights for policy 1, policy_version 93320 (0.0010) [2023-10-10 08:34:19,746][53268] Updated weights for policy 1, policy_version 93330 (0.0010) [2023-10-10 08:34:20,112][53268] Updated weights for policy 1, policy_version 93340 (0.0009) [2023-10-10 08:34:21,191][53252] Updated weights for policy 0, policy_version 93410 (0.0010) [2023-10-10 08:34:21,575][53252] Updated weights for policy 0, policy_version 93420 (0.0008) [2023-10-10 08:34:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191234048. Throughput: 0: 1701.2, 1: 1662.5. Samples: 47818378. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:21,784][52050] Avg episode reward: [(0, '21.810'), (1, '22.200')] [2023-10-10 08:34:21,940][53252] Updated weights for policy 0, policy_version 93430 (0.0007) [2023-10-10 08:34:22,310][53252] Updated weights for policy 0, policy_version 93440 (0.0008) [2023-10-10 08:34:24,127][53268] Updated weights for policy 1, policy_version 93350 (0.0011) [2023-10-10 08:34:24,491][53268] Updated weights for policy 1, policy_version 93360 (0.0011) [2023-10-10 08:34:24,866][53268] Updated weights for policy 1, policy_version 93370 (0.0011) [2023-10-10 08:34:26,288][53252] Updated weights for policy 0, policy_version 93450 (0.0010) [2023-10-10 08:34:26,662][53252] Updated weights for policy 0, policy_version 93460 (0.0011) [2023-10-10 08:34:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191299584. Throughput: 0: 1684.4, 1: 1689.6. Samples: 47838608. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:26,784][52050] Avg episode reward: [(0, '22.170'), (1, '21.710')] [2023-10-10 08:34:27,037][53252] Updated weights for policy 0, policy_version 93470 (0.0007) [2023-10-10 08:34:28,805][53268] Updated weights for policy 1, policy_version 93380 (0.0010) [2023-10-10 08:34:29,180][53268] Updated weights for policy 1, policy_version 93390 (0.0010) [2023-10-10 08:34:29,550][53268] Updated weights for policy 1, policy_version 93400 (0.0010) [2023-10-10 08:34:31,089][53252] Updated weights for policy 0, policy_version 93480 (0.0007) [2023-10-10 08:34:31,460][53252] Updated weights for policy 0, policy_version 93490 (0.0009) [2023-10-10 08:34:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191365120. Throughput: 0: 1696.7, 1: 1678.5. Samples: 47849022. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:31,784][52050] Avg episode reward: [(0, '21.860'), (1, '23.140')] [2023-10-10 08:34:31,824][53252] Updated weights for policy 0, policy_version 93500 (0.0008) [2023-10-10 08:34:33,663][53268] Updated weights for policy 1, policy_version 93410 (0.0010) [2023-10-10 08:34:34,028][53268] Updated weights for policy 1, policy_version 93420 (0.0011) [2023-10-10 08:34:34,391][53268] Updated weights for policy 1, policy_version 93430 (0.0011) [2023-10-10 08:34:34,761][53268] Updated weights for policy 1, policy_version 93440 (0.0011) [2023-10-10 08:34:35,799][53252] Updated weights for policy 0, policy_version 93510 (0.0008) [2023-10-10 08:34:36,161][53252] Updated weights for policy 0, policy_version 93520 (0.0007) [2023-10-10 08:34:36,536][53252] Updated weights for policy 0, policy_version 93530 (0.0007) [2023-10-10 08:34:36,783][52050] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191463424. Throughput: 0: 1691.0, 1: 1665.8. Samples: 47868868. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:36,784][52050] Avg episode reward: [(0, '22.230'), (1, '23.420')] [2023-10-10 08:34:38,816][53268] Updated weights for policy 1, policy_version 93450 (0.0007) [2023-10-10 08:34:39,181][53268] Updated weights for policy 1, policy_version 93460 (0.0007) [2023-10-10 08:34:39,542][53268] Updated weights for policy 1, policy_version 93470 (0.0008) [2023-10-10 08:34:40,801][53252] Updated weights for policy 0, policy_version 93540 (0.0008) [2023-10-10 08:34:41,184][53252] Updated weights for policy 0, policy_version 93550 (0.0007) [2023-10-10 08:34:41,560][53252] Updated weights for policy 0, policy_version 93560 (0.0010) [2023-10-10 08:34:41,784][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191496192. Throughput: 0: 1670.3, 1: 1686.1. Samples: 47888862. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:41,785][52050] Avg episode reward: [(0, '24.270'), (1, '21.090')] [2023-10-10 08:34:41,796][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000093472_95715328.pth... [2023-10-10 08:34:41,827][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000091904_94109696.pth [2023-10-10 08:34:41,859][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000093568_95813632.pth... 
[2023-10-10 08:34:41,888][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000091968_94175232.pth [2023-10-10 08:34:43,690][53268] Updated weights for policy 1, policy_version 93480 (0.0008) [2023-10-10 08:34:44,063][53268] Updated weights for policy 1, policy_version 93490 (0.0008) [2023-10-10 08:34:44,434][53268] Updated weights for policy 1, policy_version 93500 (0.0008) [2023-10-10 08:34:45,681][53252] Updated weights for policy 0, policy_version 93570 (0.0007) [2023-10-10 08:34:46,050][53252] Updated weights for policy 0, policy_version 93580 (0.0007) [2023-10-10 08:34:46,428][53252] Updated weights for policy 0, policy_version 93590 (0.0009) [2023-10-10 08:34:46,783][52050] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191561728. Throughput: 0: 1685.9, 1: 1667.8. Samples: 47899044. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:46,784][52050] Avg episode reward: [(0, '23.630'), (1, '20.410')] [2023-10-10 08:34:46,795][53252] Updated weights for policy 0, policy_version 93600 (0.0011) [2023-10-10 08:34:48,637][53268] Updated weights for policy 1, policy_version 93510 (0.0008) [2023-10-10 08:34:48,997][53268] Updated weights for policy 1, policy_version 93520 (0.0007) [2023-10-10 08:34:49,375][53268] Updated weights for policy 1, policy_version 93530 (0.0009) [2023-10-10 08:34:50,985][53252] Updated weights for policy 0, policy_version 93610 (0.0009) [2023-10-10 08:34:51,350][53252] Updated weights for policy 0, policy_version 93620 (0.0011) [2023-10-10 08:34:51,716][53252] Updated weights for policy 0, policy_version 93630 (0.0009) [2023-10-10 08:34:51,783][52050] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191627264. Throughput: 0: 1678.2, 1: 1674.6. Samples: 47918930. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:51,784][52050] Avg episode reward: [(0, '22.290'), (1, '19.720')] [2023-10-10 08:34:53,519][53268] Updated weights for policy 1, policy_version 93540 (0.0008) [2023-10-10 08:34:53,917][53268] Updated weights for policy 1, policy_version 93550 (0.0010) [2023-10-10 08:34:54,277][53268] Updated weights for policy 1, policy_version 93560 (0.0011) [2023-10-10 08:34:55,744][53252] Updated weights for policy 0, policy_version 93640 (0.0009) [2023-10-10 08:34:56,113][53252] Updated weights for policy 0, policy_version 93650 (0.0008) [2023-10-10 08:34:56,487][53252] Updated weights for policy 0, policy_version 93660 (0.0007) [2023-10-10 08:34:56,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191725568. Throughput: 0: 1661.5, 1: 1688.1. Samples: 47938648. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:34:56,784][52050] Avg episode reward: [(0, '22.330'), (1, '20.400')] [2023-10-10 08:34:58,224][53268] Updated weights for policy 1, policy_version 93570 (0.0010) [2023-10-10 08:34:58,581][53268] Updated weights for policy 1, policy_version 93580 (0.0010) [2023-10-10 08:34:58,954][53268] Updated weights for policy 1, policy_version 93590 (0.0007) [2023-10-10 08:34:59,313][53268] Updated weights for policy 1, policy_version 93600 (0.0010) [2023-10-10 08:35:00,716][53252] Updated weights for policy 0, policy_version 93670 (0.0008) [2023-10-10 08:35:01,090][53252] Updated weights for policy 0, policy_version 93680 (0.0008) [2023-10-10 08:35:01,456][53252] Updated weights for policy 0, policy_version 93690 (0.0007) [2023-10-10 08:35:01,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191791104. Throughput: 0: 1680.0, 1: 1665.5. Samples: 47948950. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:35:01,784][52050] Avg episode reward: [(0, '21.100'), (1, '22.410')] [2023-10-10 08:35:03,370][53268] Updated weights for policy 1, policy_version 93610 (0.0010) [2023-10-10 08:35:03,738][53268] Updated weights for policy 1, policy_version 93620 (0.0011) [2023-10-10 08:35:04,117][53268] Updated weights for policy 1, policy_version 93630 (0.0010) [2023-10-10 08:35:05,362][53252] Updated weights for policy 0, policy_version 93700 (0.0007) [2023-10-10 08:35:05,733][53252] Updated weights for policy 0, policy_version 93710 (0.0008) [2023-10-10 08:35:06,110][53252] Updated weights for policy 0, policy_version 93720 (0.0011) [2023-10-10 08:35:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191856640. Throughput: 0: 1674.7, 1: 1682.4. Samples: 47969448. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-10 08:35:06,784][52050] Avg episode reward: [(0, '21.450'), (1, '21.440')] [2023-10-10 08:35:08,209][53268] Updated weights for policy 1, policy_version 93640 (0.0010) [2023-10-10 08:35:08,579][53268] Updated weights for policy 1, policy_version 93650 (0.0011) [2023-10-10 08:35:08,937][53268] Updated weights for policy 1, policy_version 93660 (0.0008) [2023-10-10 08:35:10,053][53252] Updated weights for policy 0, policy_version 93730 (0.0011) [2023-10-10 08:35:10,425][53252] Updated weights for policy 0, policy_version 93740 (0.0008) [2023-10-10 08:35:10,798][53252] Updated weights for policy 0, policy_version 93750 (0.0009) [2023-10-10 08:35:11,177][53252] Updated weights for policy 0, policy_version 93760 (0.0011) [2023-10-10 08:35:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191922176. Throughput: 0: 1666.2, 1: 1680.5. Samples: 47989212. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:11,784][52050] Avg episode reward: [(0, '20.890'), (1, '24.040')] [2023-10-10 08:35:13,020][53268] Updated weights for policy 1, policy_version 93670 (0.0009) [2023-10-10 08:35:13,391][53268] Updated weights for policy 1, policy_version 93680 (0.0008) [2023-10-10 08:35:13,755][53268] Updated weights for policy 1, policy_version 93690 (0.0007) [2023-10-10 08:35:15,330][53252] Updated weights for policy 0, policy_version 93770 (0.0010) [2023-10-10 08:35:15,693][53252] Updated weights for policy 0, policy_version 93780 (0.0008) [2023-10-10 08:35:16,071][53252] Updated weights for policy 0, policy_version 93790 (0.0010) [2023-10-10 08:35:16,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191987712. Throughput: 0: 1686.5, 1: 1658.8. Samples: 47999560. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:16,784][52050] Avg episode reward: [(0, '22.790'), (1, '22.430')] [2023-10-10 08:35:17,888][53268] Updated weights for policy 1, policy_version 93700 (0.0009) [2023-10-10 08:35:18,250][53268] Updated weights for policy 1, policy_version 93710 (0.0008) [2023-10-10 08:35:18,608][53268] Updated weights for policy 1, policy_version 93720 (0.0008) [2023-10-10 08:35:20,033][53252] Updated weights for policy 0, policy_version 93800 (0.0010) [2023-10-10 08:35:20,405][53252] Updated weights for policy 0, policy_version 93810 (0.0010) [2023-10-10 08:35:20,782][53252] Updated weights for policy 0, policy_version 93820 (0.0007) [2023-10-10 08:35:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 192053248. Throughput: 0: 1672.8, 1: 1676.7. Samples: 48019592. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:21,784][52050] Avg episode reward: [(0, '23.170'), (1, '21.730')] [2023-10-10 08:35:22,629][53268] Updated weights for policy 1, policy_version 93730 (0.0010) [2023-10-10 08:35:22,989][53268] Updated weights for policy 1, policy_version 93740 (0.0008) [2023-10-10 08:35:23,354][53268] Updated weights for policy 1, policy_version 93750 (0.0007) [2023-10-10 08:35:23,720][53268] Updated weights for policy 1, policy_version 93760 (0.0007) [2023-10-10 08:35:24,965][53252] Updated weights for policy 0, policy_version 93830 (0.0009) [2023-10-10 08:35:25,334][53252] Updated weights for policy 0, policy_version 93840 (0.0010) [2023-10-10 08:35:25,703][53252] Updated weights for policy 0, policy_version 93850 (0.0009) [2023-10-10 08:35:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192118784. Throughput: 0: 1674.1, 1: 1677.7. Samples: 48039692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:26,784][52050] Avg episode reward: [(0, '22.920'), (1, '20.580')] [2023-10-10 08:35:27,622][53268] Updated weights for policy 1, policy_version 93770 (0.0007) [2023-10-10 08:35:27,987][53268] Updated weights for policy 1, policy_version 93780 (0.0009) [2023-10-10 08:35:28,356][53268] Updated weights for policy 1, policy_version 93790 (0.0010) [2023-10-10 08:35:29,726][53252] Updated weights for policy 0, policy_version 93860 (0.0008) [2023-10-10 08:35:30,112][53252] Updated weights for policy 0, policy_version 93870 (0.0007) [2023-10-10 08:35:30,490][53252] Updated weights for policy 0, policy_version 93880 (0.0008) [2023-10-10 08:35:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192184320. Throughput: 0: 1688.7, 1: 1668.2. Samples: 48050104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:31,784][52050] Avg episode reward: [(0, '22.410'), (1, '23.930')] [2023-10-10 08:35:32,544][53268] Updated weights for policy 1, policy_version 93800 (0.0008) [2023-10-10 08:35:32,915][53268] Updated weights for policy 1, policy_version 93810 (0.0008) [2023-10-10 08:35:33,272][53268] Updated weights for policy 1, policy_version 93820 (0.0009) [2023-10-10 08:35:34,415][53252] Updated weights for policy 0, policy_version 93890 (0.0007) [2023-10-10 08:35:34,786][53252] Updated weights for policy 0, policy_version 93900 (0.0007) [2023-10-10 08:35:35,161][53252] Updated weights for policy 0, policy_version 93910 (0.0008) [2023-10-10 08:35:35,534][53252] Updated weights for policy 0, policy_version 93920 (0.0007) [2023-10-10 08:35:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192249856. Throughput: 0: 1668.7, 1: 1686.8. Samples: 48069926. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:36,784][52050] Avg episode reward: [(0, '21.430'), (1, '22.420')] [2023-10-10 08:35:37,478][53268] Updated weights for policy 1, policy_version 93830 (0.0009) [2023-10-10 08:35:37,850][53268] Updated weights for policy 1, policy_version 93840 (0.0009) [2023-10-10 08:35:38,214][53268] Updated weights for policy 1, policy_version 93850 (0.0007) [2023-10-10 08:35:39,484][53252] Updated weights for policy 0, policy_version 93930 (0.0011) [2023-10-10 08:35:39,855][53252] Updated weights for policy 0, policy_version 93940 (0.0009) [2023-10-10 08:35:40,224][53252] Updated weights for policy 0, policy_version 93950 (0.0011) [2023-10-10 08:35:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192315392. Throughput: 0: 1685.8, 1: 1690.5. Samples: 48090582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:41,784][52050] Avg episode reward: [(0, '24.000'), (1, '22.830')] [2023-10-10 08:35:42,272][53268] Updated weights for policy 1, policy_version 93860 (0.0009) [2023-10-10 08:35:42,656][53268] Updated weights for policy 1, policy_version 93870 (0.0009) [2023-10-10 08:35:43,025][53268] Updated weights for policy 1, policy_version 93880 (0.0008) [2023-10-10 08:35:43,988][53252] Updated weights for policy 0, policy_version 93960 (0.0007) [2023-10-10 08:35:44,345][53252] Updated weights for policy 0, policy_version 93970 (0.0007) [2023-10-10 08:35:44,729][53252] Updated weights for policy 0, policy_version 93980 (0.0009) [2023-10-10 08:35:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192380928. Throughput: 0: 1685.2, 1: 1677.7. Samples: 48100284. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:46,784][52050] Avg episode reward: [(0, '21.570'), (1, '22.700')] [2023-10-10 08:35:47,217][53268] Updated weights for policy 1, policy_version 93890 (0.0007) [2023-10-10 08:35:47,594][53268] Updated weights for policy 1, policy_version 93900 (0.0007) [2023-10-10 08:35:47,957][53268] Updated weights for policy 1, policy_version 93910 (0.0008) [2023-10-10 08:35:48,328][53268] Updated weights for policy 1, policy_version 93920 (0.0008) [2023-10-10 08:35:48,833][53252] Updated weights for policy 0, policy_version 93990 (0.0008) [2023-10-10 08:35:49,213][53252] Updated weights for policy 0, policy_version 94000 (0.0007) [2023-10-10 08:35:49,594][53252] Updated weights for policy 0, policy_version 94010 (0.0007) [2023-10-10 08:35:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192446464. Throughput: 0: 1671.1, 1: 1683.8. Samples: 48120418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:51,784][52050] Avg episode reward: [(0, '22.600'), (1, '24.100')] [2023-10-10 08:35:52,383][53268] Updated weights for policy 1, policy_version 93930 (0.0011) [2023-10-10 08:35:52,737][53268] Updated weights for policy 1, policy_version 93940 (0.0009) [2023-10-10 08:35:53,100][53268] Updated weights for policy 1, policy_version 93950 (0.0011) [2023-10-10 08:35:53,562][53252] Updated weights for policy 0, policy_version 94020 (0.0008) [2023-10-10 08:35:53,933][53252] Updated weights for policy 0, policy_version 94030 (0.0007) [2023-10-10 08:35:54,307][53252] Updated weights for policy 0, policy_version 94040 (0.0008) [2023-10-10 08:35:56,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192512000. Throughput: 0: 1696.7, 1: 1683.6. Samples: 48141328. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:35:56,784][52050] Avg episode reward: [(0, '23.480'), (1, '23.030')] [2023-10-10 08:35:57,141][53268] Updated weights for policy 1, policy_version 93960 (0.0009) [2023-10-10 08:35:57,514][53268] Updated weights for policy 1, policy_version 93970 (0.0012) [2023-10-10 08:35:57,874][53268] Updated weights for policy 1, policy_version 93980 (0.0007) [2023-10-10 08:35:58,404][53252] Updated weights for policy 0, policy_version 94050 (0.0010) [2023-10-10 08:35:58,778][53252] Updated weights for policy 0, policy_version 94060 (0.0008) [2023-10-10 08:35:59,154][53252] Updated weights for policy 0, policy_version 94070 (0.0008) [2023-10-10 08:35:59,526][53252] Updated weights for policy 0, policy_version 94080 (0.0010) [2023-10-10 08:36:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192577536. Throughput: 0: 1671.3, 1: 1685.8. Samples: 48150630. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:36:01,784][52050] Avg episode reward: [(0, '20.740'), (1, '23.380')] [2023-10-10 08:36:01,941][53268] Updated weights for policy 1, policy_version 93990 (0.0009) [2023-10-10 08:36:02,301][53268] Updated weights for policy 1, policy_version 94000 (0.0011) [2023-10-10 08:36:02,670][53268] Updated weights for policy 1, policy_version 94010 (0.0011) [2023-10-10 08:36:03,664][53252] Updated weights for policy 0, policy_version 94090 (0.0008) [2023-10-10 08:36:04,044][53252] Updated weights for policy 0, policy_version 94100 (0.0007) [2023-10-10 08:36:04,413][53252] Updated weights for policy 0, policy_version 94110 (0.0009) [2023-10-10 08:36:06,626][53268] Updated weights for policy 1, policy_version 94020 (0.0009) [2023-10-10 08:36:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192643072. Throughput: 0: 1676.4, 1: 1686.4. Samples: 48170920. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:36:06,784][52050] Avg episode reward: [(0, '22.410'), (1, '23.530')] [2023-10-10 08:36:06,998][53268] Updated weights for policy 1, policy_version 94030 (0.0008) [2023-10-10 08:36:07,361][53268] Updated weights for policy 1, policy_version 94040 (0.0008) [2023-10-10 08:36:08,508][53252] Updated weights for policy 0, policy_version 94120 (0.0008) [2023-10-10 08:36:08,880][53252] Updated weights for policy 0, policy_version 94130 (0.0010) [2023-10-10 08:36:09,253][53252] Updated weights for policy 0, policy_version 94140 (0.0010) [2023-10-10 08:36:11,481][53268] Updated weights for policy 1, policy_version 94050 (0.0008) [2023-10-10 08:36:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192708608. Throughput: 0: 1695.8, 1: 1688.9. Samples: 48192004. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:11,784][52050] Avg episode reward: [(0, '22.180'), (1, '23.810')] [2023-10-10 08:36:11,847][53268] Updated weights for policy 1, policy_version 94060 (0.0007) [2023-10-10 08:36:12,226][53268] Updated weights for policy 1, policy_version 94070 (0.0009) [2023-10-10 08:36:12,581][53268] Updated weights for policy 1, policy_version 94080 (0.0009) [2023-10-10 08:36:13,264][53252] Updated weights for policy 0, policy_version 94150 (0.0008) [2023-10-10 08:36:13,643][53252] Updated weights for policy 0, policy_version 94160 (0.0008) [2023-10-10 08:36:14,003][53252] Updated weights for policy 0, policy_version 94170 (0.0010) [2023-10-10 08:36:16,555][53268] Updated weights for policy 1, policy_version 94090 (0.0009) [2023-10-10 08:36:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192774144. Throughput: 0: 1663.5, 1: 1689.2. Samples: 48200974. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:16,784][52050] Avg episode reward: [(0, '20.100'), (1, '21.800')] [2023-10-10 08:36:16,927][53268] Updated weights for policy 1, policy_version 94100 (0.0009) [2023-10-10 08:36:17,287][53268] Updated weights for policy 1, policy_version 94110 (0.0009) [2023-10-10 08:36:18,041][53252] Updated weights for policy 0, policy_version 94180 (0.0009) [2023-10-10 08:36:18,414][53252] Updated weights for policy 0, policy_version 94190 (0.0007) [2023-10-10 08:36:18,776][53252] Updated weights for policy 0, policy_version 94200 (0.0008) [2023-10-10 08:36:21,353][53268] Updated weights for policy 1, policy_version 94120 (0.0007) [2023-10-10 08:36:21,713][53268] Updated weights for policy 1, policy_version 94130 (0.0009) [2023-10-10 08:36:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192839680. Throughput: 0: 1691.8, 1: 1689.3. Samples: 48222076. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:21,784][52050] Avg episode reward: [(0, '19.880'), (1, '20.690')] [2023-10-10 08:36:22,080][53268] Updated weights for policy 1, policy_version 94140 (0.0008) [2023-10-10 08:36:22,845][53252] Updated weights for policy 0, policy_version 94210 (0.0008) [2023-10-10 08:36:23,251][53252] Updated weights for policy 0, policy_version 94220 (0.0009) [2023-10-10 08:36:23,631][53252] Updated weights for policy 0, policy_version 94230 (0.0008) [2023-10-10 08:36:23,990][53252] Updated weights for policy 0, policy_version 94240 (0.0009) [2023-10-10 08:36:25,965][53268] Updated weights for policy 1, policy_version 94150 (0.0008) [2023-10-10 08:36:26,339][53268] Updated weights for policy 1, policy_version 94160 (0.0010) [2023-10-10 08:36:26,702][53268] Updated weights for policy 1, policy_version 94170 (0.0010) [2023-10-10 08:36:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192905216. Throughput: 0: 1690.7, 1: 1682.6. Samples: 48242380. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:26,784][52050] Avg episode reward: [(0, '22.020'), (1, '21.030')] [2023-10-10 08:36:28,148][53252] Updated weights for policy 0, policy_version 94250 (0.0007) [2023-10-10 08:36:28,513][53252] Updated weights for policy 0, policy_version 94260 (0.0007) [2023-10-10 08:36:28,885][53252] Updated weights for policy 0, policy_version 94270 (0.0007) [2023-10-10 08:36:30,804][53268] Updated weights for policy 1, policy_version 94180 (0.0009) [2023-10-10 08:36:31,202][53268] Updated weights for policy 1, policy_version 94190 (0.0008) [2023-10-10 08:36:31,578][53268] Updated weights for policy 1, policy_version 94200 (0.0009) [2023-10-10 08:36:31,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192970752. Throughput: 0: 1672.0, 1: 1699.1. Samples: 48251986. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:31,784][52050] Avg episode reward: [(0, '21.070'), (1, '23.130')] [2023-10-10 08:36:32,897][53252] Updated weights for policy 0, policy_version 94280 (0.0008) [2023-10-10 08:36:33,268][53252] Updated weights for policy 0, policy_version 94290 (0.0010) [2023-10-10 08:36:33,649][53252] Updated weights for policy 0, policy_version 94300 (0.0009) [2023-10-10 08:36:35,736][53268] Updated weights for policy 1, policy_version 94210 (0.0008) [2023-10-10 08:36:36,098][53268] Updated weights for policy 1, policy_version 94220 (0.0007) [2023-10-10 08:36:36,472][53268] Updated weights for policy 1, policy_version 94230 (0.0007) [2023-10-10 08:36:36,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193036288. Throughput: 0: 1685.2, 1: 1694.5. Samples: 48272504. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:36,784][52050] Avg episode reward: [(0, '21.660'), (1, '21.220')] [2023-10-10 08:36:36,829][53268] Updated weights for policy 1, policy_version 94240 (0.0007) [2023-10-10 08:36:37,677][53252] Updated weights for policy 0, policy_version 94310 (0.0009) [2023-10-10 08:36:38,047][53252] Updated weights for policy 0, policy_version 94320 (0.0009) [2023-10-10 08:36:38,423][53252] Updated weights for policy 0, policy_version 94330 (0.0010) [2023-10-10 08:36:40,961][53268] Updated weights for policy 1, policy_version 94250 (0.0011) [2023-10-10 08:36:41,340][53268] Updated weights for policy 1, policy_version 94260 (0.0011) [2023-10-10 08:36:41,709][53268] Updated weights for policy 1, policy_version 94270 (0.0008) [2023-10-10 08:36:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193101824. Throughput: 0: 1684.0, 1: 1679.2. Samples: 48292674. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:41,784][52050] Avg episode reward: [(0, '23.550'), (1, '20.600')] [2023-10-10 08:36:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000094272_96534528.pth... [2023-10-10 08:36:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000094336_96600064.pth... [2023-10-10 08:36:41,830][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000092768_94994432.pth [2023-10-10 08:36:41,833][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000092704_94928896.pth [2023-10-10 08:36:41,833][52846] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p0/milestones/checkpoint_000094336_96600064.pth [2023-10-10 08:36:41,838][53061] Saving a milestone ./train_atari/atari_choppercommand_APPO/checkpoint_p1/milestones/checkpoint_000094272_96534528.pth [2023-10-10 08:36:42,473][53252] Updated weights for policy 0, policy_version 94340 (0.0009) [2023-10-10 08:36:42,843][53252] Updated weights for policy 0, policy_version 94350 (0.0008) [2023-10-10 08:36:43,206][53252] Updated weights for policy 0, policy_version 94360 (0.0008) [2023-10-10 08:36:45,807][53268] Updated weights for policy 1, policy_version 94280 (0.0009) [2023-10-10 08:36:46,175][53268] Updated weights for policy 1, policy_version 94290 (0.0007) [2023-10-10 08:36:46,535][53268] Updated weights for policy 1, policy_version 94300 (0.0008) [2023-10-10 08:36:46,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 193200128. Throughput: 0: 1680.1, 1: 1691.5. Samples: 48302356. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:46,784][52050] Avg episode reward: [(0, '24.480'), (1, '23.580')] [2023-10-10 08:36:47,230][53252] Updated weights for policy 0, policy_version 94370 (0.0007) [2023-10-10 08:36:47,606][53252] Updated weights for policy 0, policy_version 94380 (0.0010) [2023-10-10 08:36:47,980][53252] Updated weights for policy 0, policy_version 94390 (0.0010) [2023-10-10 08:36:48,343][53252] Updated weights for policy 0, policy_version 94400 (0.0009) [2023-10-10 08:36:50,797][53268] Updated weights for policy 1, policy_version 94310 (0.0010) [2023-10-10 08:36:51,153][53268] Updated weights for policy 1, policy_version 94320 (0.0010) [2023-10-10 08:36:51,523][53268] Updated weights for policy 1, policy_version 94330 (0.0010) [2023-10-10 08:36:51,783][52050] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 193265664. Throughput: 0: 1687.9, 1: 1688.7. Samples: 48322864. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:51,784][52050] Avg episode reward: [(0, '22.890'), (1, '24.550')] [2023-10-10 08:36:52,488][53252] Updated weights for policy 0, policy_version 94410 (0.0010) [2023-10-10 08:36:52,857][53252] Updated weights for policy 0, policy_version 94420 (0.0009) [2023-10-10 08:36:53,226][53252] Updated weights for policy 0, policy_version 94430 (0.0011) [2023-10-10 08:36:55,482][53268] Updated weights for policy 1, policy_version 94340 (0.0008) [2023-10-10 08:36:55,858][53268] Updated weights for policy 1, policy_version 94350 (0.0008) [2023-10-10 08:36:56,216][53268] Updated weights for policy 1, policy_version 94360 (0.0009) [2023-10-10 08:36:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 193331200. Throughput: 0: 1690.4, 1: 1665.8. Samples: 48343032. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:36:56,784][52050] Avg episode reward: [(0, '23.460'), (1, '22.800')] [2023-10-10 08:36:57,160][53252] Updated weights for policy 0, policy_version 94440 (0.0009) [2023-10-10 08:36:57,525][53252] Updated weights for policy 0, policy_version 94450 (0.0009) [2023-10-10 08:36:57,894][53252] Updated weights for policy 0, policy_version 94460 (0.0009) [2023-10-10 08:37:00,256][53268] Updated weights for policy 1, policy_version 94370 (0.0009) [2023-10-10 08:37:00,618][53268] Updated weights for policy 1, policy_version 94380 (0.0008) [2023-10-10 08:37:01,001][53268] Updated weights for policy 1, policy_version 94390 (0.0008) [2023-10-10 08:37:01,370][53268] Updated weights for policy 1, policy_version 94400 (0.0010) [2023-10-10 08:37:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193396736. Throughput: 0: 1690.9, 1: 1687.4. Samples: 48352998. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:37:01,784][52050] Avg episode reward: [(0, '23.400'), (1, '23.620')] [2023-10-10 08:37:01,978][53252] Updated weights for policy 0, policy_version 94470 (0.0010) [2023-10-10 08:37:02,343][53252] Updated weights for policy 0, policy_version 94480 (0.0009) [2023-10-10 08:37:02,719][53252] Updated weights for policy 0, policy_version 94490 (0.0010) [2023-10-10 08:37:05,551][53268] Updated weights for policy 1, policy_version 94410 (0.0009) [2023-10-10 08:37:05,918][53268] Updated weights for policy 1, policy_version 94420 (0.0008) [2023-10-10 08:37:06,293][53268] Updated weights for policy 1, policy_version 94430 (0.0007) [2023-10-10 08:37:06,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193462272. Throughput: 0: 1686.5, 1: 1678.7. Samples: 48373510. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-10 08:37:06,784][52050] Avg episode reward: [(0, '24.250'), (1, '23.960')] [2023-10-10 08:37:06,833][53252] Updated weights for policy 0, policy_version 94500 (0.0009) [2023-10-10 08:37:07,209][53252] Updated weights for policy 0, policy_version 94510 (0.0009) [2023-10-10 08:37:07,583][53252] Updated weights for policy 0, policy_version 94520 (0.0007) [2023-10-10 08:37:10,417][53268] Updated weights for policy 1, policy_version 94440 (0.0010) [2023-10-10 08:37:10,788][53268] Updated weights for policy 1, policy_version 94450 (0.0010) [2023-10-10 08:37:11,141][53268] Updated weights for policy 1, policy_version 94460 (0.0009) [2023-10-10 08:37:11,638][53252] Updated weights for policy 0, policy_version 94530 (0.0007) [2023-10-10 08:37:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193527808. Throughput: 0: 1697.9, 1: 1657.1. Samples: 48393354. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:11,784][52050] Avg episode reward: [(0, '22.220'), (1, '21.270')] [2023-10-10 08:37:12,021][53252] Updated weights for policy 0, policy_version 94540 (0.0009) [2023-10-10 08:37:12,404][53252] Updated weights for policy 0, policy_version 94550 (0.0007) [2023-10-10 08:37:12,765][53252] Updated weights for policy 0, policy_version 94560 (0.0008) [2023-10-10 08:37:15,187][53268] Updated weights for policy 1, policy_version 94470 (0.0010) [2023-10-10 08:37:15,576][53268] Updated weights for policy 1, policy_version 94480 (0.0009) [2023-10-10 08:37:15,944][53268] Updated weights for policy 1, policy_version 94490 (0.0008) [2023-10-10 08:37:16,651][53252] Updated weights for policy 0, policy_version 94570 (0.0007) [2023-10-10 08:37:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193593344. Throughput: 0: 1694.2, 1: 1669.2. Samples: 48403338. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:16,784][52050] Avg episode reward: [(0, '21.640'), (1, '21.760')] [2023-10-10 08:37:17,023][53252] Updated weights for policy 0, policy_version 94580 (0.0007) [2023-10-10 08:37:17,393][53252] Updated weights for policy 0, policy_version 94590 (0.0007) [2023-10-10 08:37:20,041][53268] Updated weights for policy 1, policy_version 94500 (0.0010) [2023-10-10 08:37:20,402][53268] Updated weights for policy 1, policy_version 94510 (0.0007) [2023-10-10 08:37:20,769][53268] Updated weights for policy 1, policy_version 94520 (0.0009) [2023-10-10 08:37:21,537][53252] Updated weights for policy 0, policy_version 94600 (0.0008) [2023-10-10 08:37:21,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193658880. Throughput: 0: 1698.0, 1: 1660.9. Samples: 48423656. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:21,784][52050] Avg episode reward: [(0, '20.840'), (1, '21.600')] [2023-10-10 08:37:21,905][53252] Updated weights for policy 0, policy_version 94610 (0.0009) [2023-10-10 08:37:22,276][53252] Updated weights for policy 0, policy_version 94620 (0.0007) [2023-10-10 08:37:24,624][53268] Updated weights for policy 1, policy_version 94530 (0.0007) [2023-10-10 08:37:24,992][53268] Updated weights for policy 1, policy_version 94540 (0.0008) [2023-10-10 08:37:25,348][53268] Updated weights for policy 1, policy_version 94550 (0.0008) [2023-10-10 08:37:25,722][53268] Updated weights for policy 1, policy_version 94560 (0.0010) [2023-10-10 08:37:26,186][53252] Updated weights for policy 0, policy_version 94630 (0.0007) [2023-10-10 08:37:26,557][53252] Updated weights for policy 0, policy_version 94640 (0.0007) [2023-10-10 08:37:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193724416. Throughput: 0: 1691.0, 1: 1656.8. Samples: 48443324. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:26,784][52050] Avg episode reward: [(0, '21.980'), (1, '22.070')] [2023-10-10 08:37:26,928][53252] Updated weights for policy 0, policy_version 94650 (0.0008) [2023-10-10 08:37:30,062][53268] Updated weights for policy 1, policy_version 94570 (0.0007) [2023-10-10 08:37:30,427][53268] Updated weights for policy 1, policy_version 94580 (0.0007) [2023-10-10 08:37:30,799][53268] Updated weights for policy 1, policy_version 94590 (0.0009) [2023-10-10 08:37:31,107][53252] Updated weights for policy 0, policy_version 94660 (0.0008) [2023-10-10 08:37:31,482][53252] Updated weights for policy 0, policy_version 94670 (0.0009) [2023-10-10 08:37:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 193789952. Throughput: 0: 1699.3, 1: 1673.0. Samples: 48454110. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:31,784][52050] Avg episode reward: [(0, '22.350'), (1, '22.400')] [2023-10-10 08:37:31,849][53252] Updated weights for policy 0, policy_version 94680 (0.0007) [2023-10-10 08:37:34,952][53268] Updated weights for policy 1, policy_version 94600 (0.0008) [2023-10-10 08:37:35,323][53268] Updated weights for policy 1, policy_version 94610 (0.0008) [2023-10-10 08:37:35,684][53268] Updated weights for policy 1, policy_version 94620 (0.0007) [2023-10-10 08:37:35,836][53252] Updated weights for policy 0, policy_version 94690 (0.0008) [2023-10-10 08:37:36,209][53252] Updated weights for policy 0, policy_version 94700 (0.0009) [2023-10-10 08:37:36,572][53252] Updated weights for policy 0, policy_version 94710 (0.0008) [2023-10-10 08:37:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 193855488. Throughput: 0: 1703.7, 1: 1664.4. Samples: 48474432. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:36,784][52050] Avg episode reward: [(0, '24.010'), (1, '23.750')] [2023-10-10 08:37:36,936][53252] Updated weights for policy 0, policy_version 94720 (0.0009) [2023-10-10 08:37:39,765][53268] Updated weights for policy 1, policy_version 94630 (0.0010) [2023-10-10 08:37:40,134][53268] Updated weights for policy 1, policy_version 94640 (0.0008) [2023-10-10 08:37:40,497][53268] Updated weights for policy 1, policy_version 94650 (0.0008) [2023-10-10 08:37:40,909][53252] Updated weights for policy 0, policy_version 94730 (0.0009) [2023-10-10 08:37:41,270][53252] Updated weights for policy 0, policy_version 94740 (0.0009) [2023-10-10 08:37:41,648][53252] Updated weights for policy 0, policy_version 94750 (0.0010) [2023-10-10 08:37:41,783][52050] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 193953792. Throughput: 0: 1682.2, 1: 1668.2. Samples: 48493800. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:41,784][52050] Avg episode reward: [(0, '23.870'), (1, '23.960')] [2023-10-10 08:37:44,472][53268] Updated weights for policy 1, policy_version 94660 (0.0007) [2023-10-10 08:37:44,855][53268] Updated weights for policy 1, policy_version 94670 (0.0010) [2023-10-10 08:37:45,226][53268] Updated weights for policy 1, policy_version 94680 (0.0008) [2023-10-10 08:37:45,675][53252] Updated weights for policy 0, policy_version 94760 (0.0008) [2023-10-10 08:37:46,045][53252] Updated weights for policy 0, policy_version 94770 (0.0008) [2023-10-10 08:37:46,420][53252] Updated weights for policy 0, policy_version 94780 (0.0009) [2023-10-10 08:37:46,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194019328. Throughput: 0: 1704.4, 1: 1675.2. Samples: 48505084. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:46,784][52050] Avg episode reward: [(0, '23.850'), (1, '22.730')] [2023-10-10 08:37:49,258][53268] Updated weights for policy 1, policy_version 94690 (0.0010) [2023-10-10 08:37:49,628][53268] Updated weights for policy 1, policy_version 94700 (0.0009) [2023-10-10 08:37:49,996][53268] Updated weights for policy 1, policy_version 94710 (0.0010) [2023-10-10 08:37:50,356][53268] Updated weights for policy 1, policy_version 94720 (0.0009) [2023-10-10 08:37:50,456][53252] Updated weights for policy 0, policy_version 94790 (0.0007) [2023-10-10 08:37:50,817][53252] Updated weights for policy 0, policy_version 94800 (0.0008) [2023-10-10 08:37:51,200][53252] Updated weights for policy 0, policy_version 94810 (0.0009) [2023-10-10 08:37:51,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194084864. Throughput: 0: 1701.8, 1: 1653.7. Samples: 48524508. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:51,784][52050] Avg episode reward: [(0, '21.330'), (1, '22.930')] [2023-10-10 08:37:54,245][53268] Updated weights for policy 1, policy_version 94730 (0.0008) [2023-10-10 08:37:54,610][53268] Updated weights for policy 1, policy_version 94740 (0.0008) [2023-10-10 08:37:54,975][53268] Updated weights for policy 1, policy_version 94750 (0.0010) [2023-10-10 08:37:55,342][53252] Updated weights for policy 0, policy_version 94820 (0.0009) [2023-10-10 08:37:55,709][53252] Updated weights for policy 0, policy_version 94830 (0.0007) [2023-10-10 08:37:56,080][53252] Updated weights for policy 0, policy_version 94840 (0.0008) [2023-10-10 08:37:56,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194150400. Throughput: 0: 1671.4, 1: 1680.0. Samples: 48544166. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:37:56,784][52050] Avg episode reward: [(0, '20.620'), (1, '20.930')] [2023-10-10 08:37:58,980][53268] Updated weights for policy 1, policy_version 94760 (0.0008) [2023-10-10 08:37:59,349][53268] Updated weights for policy 1, policy_version 94770 (0.0010) [2023-10-10 08:37:59,716][53268] Updated weights for policy 1, policy_version 94780 (0.0008) [2023-10-10 08:38:00,138][53252] Updated weights for policy 0, policy_version 94850 (0.0007) [2023-10-10 08:38:00,535][53252] Updated weights for policy 0, policy_version 94860 (0.0007) [2023-10-10 08:38:00,906][53252] Updated weights for policy 0, policy_version 94870 (0.0008) [2023-10-10 08:38:01,282][53252] Updated weights for policy 0, policy_version 94880 (0.0008) [2023-10-10 08:38:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 194215936. Throughput: 0: 1702.8, 1: 1673.6. Samples: 48555274. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-10 08:38:01,784][52050] Avg episode reward: [(0, '20.510'), (1, '20.650')] [2023-10-10 08:38:03,773][53268] Updated weights for policy 1, policy_version 94790 (0.0008) [2023-10-10 08:38:04,159][53268] Updated weights for policy 1, policy_version 94800 (0.0009) [2023-10-10 08:38:04,528][53268] Updated weights for policy 1, policy_version 94810 (0.0010) [2023-10-10 08:38:05,283][53252] Updated weights for policy 0, policy_version 94890 (0.0007) [2023-10-10 08:38:05,661][53252] Updated weights for policy 0, policy_version 94900 (0.0007) [2023-10-10 08:38:06,023][53252] Updated weights for policy 0, policy_version 94910 (0.0007) [2023-10-10 08:38:06,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 194281472. Throughput: 0: 1687.7, 1: 1670.5. Samples: 48574778. 
Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:06,784][52050] Avg episode reward: [(0, '21.890'), (1, '20.810')] [2023-10-10 08:38:08,495][53268] Updated weights for policy 1, policy_version 94820 (0.0008) [2023-10-10 08:38:08,865][53268] Updated weights for policy 1, policy_version 94830 (0.0010) [2023-10-10 08:38:09,232][53268] Updated weights for policy 1, policy_version 94840 (0.0009) [2023-10-10 08:38:09,997][53252] Updated weights for policy 0, policy_version 94920 (0.0008) [2023-10-10 08:38:10,377][53252] Updated weights for policy 0, policy_version 94930 (0.0008) [2023-10-10 08:38:10,746][53252] Updated weights for policy 0, policy_version 94940 (0.0007) [2023-10-10 08:38:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194347008. Throughput: 0: 1673.8, 1: 1695.6. Samples: 48594948. Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:11,785][52050] Avg episode reward: [(0, '23.600'), (1, '21.560')] [2023-10-10 08:38:13,343][53268] Updated weights for policy 1, policy_version 94850 (0.0008) [2023-10-10 08:38:13,715][53268] Updated weights for policy 1, policy_version 94860 (0.0010) [2023-10-10 08:38:14,083][53268] Updated weights for policy 1, policy_version 94870 (0.0010) [2023-10-10 08:38:14,440][53268] Updated weights for policy 1, policy_version 94880 (0.0009) [2023-10-10 08:38:14,732][53252] Updated weights for policy 0, policy_version 94950 (0.0009) [2023-10-10 08:38:15,106][53252] Updated weights for policy 0, policy_version 94960 (0.0008) [2023-10-10 08:38:15,486][53252] Updated weights for policy 0, policy_version 94970 (0.0009) [2023-10-10 08:38:16,784][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 194412544. Throughput: 0: 1695.0, 1: 1675.1. Samples: 48605766. 
Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:16,785][52050] Avg episode reward: [(0, '24.760'), (1, '20.460')] [2023-10-10 08:38:18,441][53268] Updated weights for policy 1, policy_version 94890 (0.0009) [2023-10-10 08:38:18,806][53268] Updated weights for policy 1, policy_version 94900 (0.0011) [2023-10-10 08:38:19,174][53268] Updated weights for policy 1, policy_version 94910 (0.0008) [2023-10-10 08:38:19,577][53252] Updated weights for policy 0, policy_version 94980 (0.0009) [2023-10-10 08:38:19,937][53252] Updated weights for policy 0, policy_version 94990 (0.0009) [2023-10-10 08:38:20,302][53252] Updated weights for policy 0, policy_version 95000 (0.0007) [2023-10-10 08:38:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 194478080. Throughput: 0: 1666.7, 1: 1676.4. Samples: 48624872. Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:21,784][52050] Avg episode reward: [(0, '24.390'), (1, '23.140')] [2023-10-10 08:38:23,352][53268] Updated weights for policy 1, policy_version 94920 (0.0007) [2023-10-10 08:38:23,720][53268] Updated weights for policy 1, policy_version 94930 (0.0007) [2023-10-10 08:38:24,100][53268] Updated weights for policy 1, policy_version 94940 (0.0008) [2023-10-10 08:38:24,386][53252] Updated weights for policy 0, policy_version 95010 (0.0009) [2023-10-10 08:38:24,759][53252] Updated weights for policy 0, policy_version 95020 (0.0008) [2023-10-10 08:38:25,129][53252] Updated weights for policy 0, policy_version 95030 (0.0009) [2023-10-10 08:38:25,501][53252] Updated weights for policy 0, policy_version 95040 (0.0008) [2023-10-10 08:38:26,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194543616. Throughput: 0: 1674.5, 1: 1690.4. Samples: 48645218. 
Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:26,784][52050] Avg episode reward: [(0, '24.720'), (1, '21.180')] [2023-10-10 08:38:28,103][53268] Updated weights for policy 1, policy_version 94950 (0.0009) [2023-10-10 08:38:28,466][53268] Updated weights for policy 1, policy_version 94960 (0.0007) [2023-10-10 08:38:28,829][53268] Updated weights for policy 1, policy_version 94970 (0.0011) [2023-10-10 08:38:29,556][53252] Updated weights for policy 0, policy_version 95050 (0.0007) [2023-10-10 08:38:29,917][53252] Updated weights for policy 0, policy_version 95060 (0.0009) [2023-10-10 08:38:30,290][53252] Updated weights for policy 0, policy_version 95070 (0.0009) [2023-10-10 08:38:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194609152. Throughput: 0: 1677.1, 1: 1662.7. Samples: 48655374. Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:31,784][52050] Avg episode reward: [(0, '26.410'), (1, '21.460')] [2023-10-10 08:38:32,963][53268] Updated weights for policy 1, policy_version 94980 (0.0008) [2023-10-10 08:38:33,333][53268] Updated weights for policy 1, policy_version 94990 (0.0007) [2023-10-10 08:38:33,685][53268] Updated weights for policy 1, policy_version 95000 (0.0007) [2023-10-10 08:38:34,510][53252] Updated weights for policy 0, policy_version 95080 (0.0010) [2023-10-10 08:38:34,879][53252] Updated weights for policy 0, policy_version 95090 (0.0008) [2023-10-10 08:38:35,244][53252] Updated weights for policy 0, policy_version 95100 (0.0010) [2023-10-10 08:38:36,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194674688. Throughput: 0: 1658.4, 1: 1689.0. Samples: 48675140. 
Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:36,784][52050] Avg episode reward: [(0, '24.940'), (1, '22.450')] [2023-10-10 08:38:37,779][53268] Updated weights for policy 1, policy_version 95010 (0.0008) [2023-10-10 08:38:38,142][53268] Updated weights for policy 1, policy_version 95020 (0.0010) [2023-10-10 08:38:38,505][53268] Updated weights for policy 1, policy_version 95030 (0.0010) [2023-10-10 08:38:38,876][53268] Updated weights for policy 1, policy_version 95040 (0.0009) [2023-10-10 08:38:39,029][53252] Updated weights for policy 0, policy_version 95110 (0.0010) [2023-10-10 08:38:39,394][53252] Updated weights for policy 0, policy_version 95120 (0.0009) [2023-10-10 08:38:39,767][53252] Updated weights for policy 0, policy_version 95130 (0.0009) [2023-10-10 08:38:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194740224. Throughput: 0: 1687.8, 1: 1686.5. Samples: 48696008. Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:41,784][52050] Avg episode reward: [(0, '22.920'), (1, '22.150')] [2023-10-10 08:38:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000095040_97320960.pth... [2023-10-10 08:38:41,793][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000095136_97419264.pth... 
[2023-10-10 08:38:41,822][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000093568_95813632.pth [2023-10-10 08:38:41,831][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000093472_95715328.pth [2023-10-10 08:38:43,043][53268] Updated weights for policy 1, policy_version 95050 (0.0010) [2023-10-10 08:38:43,410][53268] Updated weights for policy 1, policy_version 95060 (0.0008) [2023-10-10 08:38:43,774][53252] Updated weights for policy 0, policy_version 95140 (0.0009) [2023-10-10 08:38:43,787][53268] Updated weights for policy 1, policy_version 95070 (0.0007) [2023-10-10 08:38:44,146][53252] Updated weights for policy 0, policy_version 95150 (0.0009) [2023-10-10 08:38:44,512][53252] Updated weights for policy 0, policy_version 95160 (0.0008) [2023-10-10 08:38:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194805760. Throughput: 0: 1672.5, 1: 1673.1. Samples: 48705824. Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:46,784][52050] Avg episode reward: [(0, '23.840'), (1, '21.260')] [2023-10-10 08:38:47,830][53268] Updated weights for policy 1, policy_version 95080 (0.0008) [2023-10-10 08:38:48,201][53268] Updated weights for policy 1, policy_version 95090 (0.0010) [2023-10-10 08:38:48,584][53268] Updated weights for policy 1, policy_version 95100 (0.0011) [2023-10-10 08:38:48,665][53252] Updated weights for policy 0, policy_version 95170 (0.0010) [2023-10-10 08:38:49,034][53252] Updated weights for policy 0, policy_version 95180 (0.0008) [2023-10-10 08:38:49,410][53252] Updated weights for policy 0, policy_version 95190 (0.0009) [2023-10-10 08:38:49,783][53252] Updated weights for policy 0, policy_version 95200 (0.0009) [2023-10-10 08:38:51,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194871296. Throughput: 0: 1674.0, 1: 1687.0. Samples: 48726026. 
Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:51,784][52050] Avg episode reward: [(0, '23.210'), (1, '21.770')] [2023-10-10 08:38:52,826][53268] Updated weights for policy 1, policy_version 95110 (0.0010) [2023-10-10 08:38:53,219][53268] Updated weights for policy 1, policy_version 95120 (0.0009) [2023-10-10 08:38:53,588][53268] Updated weights for policy 1, policy_version 95130 (0.0008) [2023-10-10 08:38:53,865][53252] Updated weights for policy 0, policy_version 95210 (0.0008) [2023-10-10 08:38:54,239][53252] Updated weights for policy 0, policy_version 95220 (0.0008) [2023-10-10 08:38:54,608][53252] Updated weights for policy 0, policy_version 95230 (0.0010) [2023-10-10 08:38:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194936832. Throughput: 0: 1691.2, 1: 1679.8. Samples: 48746644. Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:38:56,784][52050] Avg episode reward: [(0, '21.200'), (1, '22.560')] [2023-10-10 08:38:57,410][53268] Updated weights for policy 1, policy_version 95140 (0.0009) [2023-10-10 08:38:57,782][53268] Updated weights for policy 1, policy_version 95150 (0.0007) [2023-10-10 08:38:58,140][53268] Updated weights for policy 1, policy_version 95160 (0.0007) [2023-10-10 08:38:58,697][53252] Updated weights for policy 0, policy_version 95240 (0.0008) [2023-10-10 08:38:59,064][53252] Updated weights for policy 0, policy_version 95250 (0.0009) [2023-10-10 08:38:59,434][53252] Updated weights for policy 0, policy_version 95260 (0.0008) [2023-10-10 08:39:01,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195002368. Throughput: 0: 1666.7, 1: 1672.8. Samples: 48756040. 
Policy #0 lag: (min: 13.0, avg: 29.0, max: 45.0) [2023-10-10 08:39:01,784][52050] Avg episode reward: [(0, '21.210'), (1, '20.710')] [2023-10-10 08:39:02,036][53268] Updated weights for policy 1, policy_version 95170 (0.0008) [2023-10-10 08:39:02,407][53268] Updated weights for policy 1, policy_version 95180 (0.0007) [2023-10-10 08:39:02,779][53268] Updated weights for policy 1, policy_version 95190 (0.0009) [2023-10-10 08:39:03,147][53268] Updated weights for policy 1, policy_version 95200 (0.0008) [2023-10-10 08:39:03,521][53252] Updated weights for policy 0, policy_version 95270 (0.0007) [2023-10-10 08:39:03,880][53252] Updated weights for policy 0, policy_version 95280 (0.0008) [2023-10-10 08:39:04,256][53252] Updated weights for policy 0, policy_version 95290 (0.0009) [2023-10-10 08:39:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195067904. Throughput: 0: 1687.2, 1: 1684.3. Samples: 48776592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:06,784][52050] Avg episode reward: [(0, '23.900'), (1, '22.070')] [2023-10-10 08:39:07,241][53268] Updated weights for policy 1, policy_version 95210 (0.0008) [2023-10-10 08:39:07,618][53268] Updated weights for policy 1, policy_version 95220 (0.0008) [2023-10-10 08:39:07,978][53268] Updated weights for policy 1, policy_version 95230 (0.0009) [2023-10-10 08:39:08,306][53252] Updated weights for policy 0, policy_version 95300 (0.0010) [2023-10-10 08:39:08,676][53252] Updated weights for policy 0, policy_version 95310 (0.0009) [2023-10-10 08:39:09,050][53252] Updated weights for policy 0, policy_version 95320 (0.0007) [2023-10-10 08:39:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195133440. Throughput: 0: 1692.2, 1: 1688.3. Samples: 48797344. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:11,784][52050] Avg episode reward: [(0, '21.720'), (1, '22.020')] [2023-10-10 08:39:12,222][53268] Updated weights for policy 1, policy_version 95240 (0.0011) [2023-10-10 08:39:12,590][53268] Updated weights for policy 1, policy_version 95250 (0.0011) [2023-10-10 08:39:12,962][53268] Updated weights for policy 1, policy_version 95260 (0.0010) [2023-10-10 08:39:13,147][53252] Updated weights for policy 0, policy_version 95330 (0.0009) [2023-10-10 08:39:13,525][53252] Updated weights for policy 0, policy_version 95340 (0.0009) [2023-10-10 08:39:13,900][53252] Updated weights for policy 0, policy_version 95350 (0.0008) [2023-10-10 08:39:14,268][53252] Updated weights for policy 0, policy_version 95360 (0.0010) [2023-10-10 08:39:16,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 195198976. Throughput: 0: 1670.2, 1: 1690.1. Samples: 48806588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:16,784][52050] Avg episode reward: [(0, '21.930'), (1, '21.890')] [2023-10-10 08:39:16,996][53268] Updated weights for policy 1, policy_version 95270 (0.0009) [2023-10-10 08:39:17,364][53268] Updated weights for policy 1, policy_version 95280 (0.0009) [2023-10-10 08:39:17,743][53268] Updated weights for policy 1, policy_version 95290 (0.0007) [2023-10-10 08:39:18,331][53252] Updated weights for policy 0, policy_version 95370 (0.0007) [2023-10-10 08:39:18,698][53252] Updated weights for policy 0, policy_version 95380 (0.0010) [2023-10-10 08:39:19,070][53252] Updated weights for policy 0, policy_version 95390 (0.0007) [2023-10-10 08:39:21,726][53268] Updated weights for policy 1, policy_version 95300 (0.0008) [2023-10-10 08:39:21,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195264512. Throughput: 0: 1693.4, 1: 1690.2. Samples: 48827402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:21,784][52050] Avg episode reward: [(0, '23.100'), (1, '22.360')] [2023-10-10 08:39:22,095][53268] Updated weights for policy 1, policy_version 95310 (0.0008) [2023-10-10 08:39:22,465][53268] Updated weights for policy 1, policy_version 95320 (0.0008) [2023-10-10 08:39:23,268][53252] Updated weights for policy 0, policy_version 95400 (0.0007) [2023-10-10 08:39:23,636][53252] Updated weights for policy 0, policy_version 95410 (0.0011) [2023-10-10 08:39:24,008][53252] Updated weights for policy 0, policy_version 95420 (0.0009) [2023-10-10 08:39:26,285][53268] Updated weights for policy 1, policy_version 95330 (0.0008) [2023-10-10 08:39:26,644][53268] Updated weights for policy 1, policy_version 95340 (0.0007) [2023-10-10 08:39:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195330048. Throughput: 0: 1681.1, 1: 1699.7. Samples: 48848142. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:26,784][52050] Avg episode reward: [(0, '21.160'), (1, '23.950')] [2023-10-10 08:39:27,017][53268] Updated weights for policy 1, policy_version 95350 (0.0007) [2023-10-10 08:39:27,383][53268] Updated weights for policy 1, policy_version 95360 (0.0007) [2023-10-10 08:39:28,049][53252] Updated weights for policy 0, policy_version 95430 (0.0009) [2023-10-10 08:39:28,414][53252] Updated weights for policy 0, policy_version 95440 (0.0007) [2023-10-10 08:39:28,786][53252] Updated weights for policy 0, policy_version 95450 (0.0010) [2023-10-10 08:39:31,514][53268] Updated weights for policy 1, policy_version 95370 (0.0008) [2023-10-10 08:39:31,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 195395584. Throughput: 0: 1670.6, 1: 1698.8. Samples: 48857450. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:31,784][52050] Avg episode reward: [(0, '20.640'), (1, '24.490')] [2023-10-10 08:39:31,876][53268] Updated weights for policy 1, policy_version 95380 (0.0007) [2023-10-10 08:39:32,241][53268] Updated weights for policy 1, policy_version 95390 (0.0007) [2023-10-10 08:39:32,894][53252] Updated weights for policy 0, policy_version 95460 (0.0007) [2023-10-10 08:39:33,274][53252] Updated weights for policy 0, policy_version 95470 (0.0007) [2023-10-10 08:39:33,635][53252] Updated weights for policy 0, policy_version 95480 (0.0009) [2023-10-10 08:39:36,227][53268] Updated weights for policy 1, policy_version 95400 (0.0009) [2023-10-10 08:39:36,582][53268] Updated weights for policy 1, policy_version 95410 (0.0008) [2023-10-10 08:39:36,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195461120. Throughput: 0: 1679.3, 1: 1699.6. Samples: 48878078. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:36,784][52050] Avg episode reward: [(0, '23.230'), (1, '24.730')] [2023-10-10 08:39:36,945][53268] Updated weights for policy 1, policy_version 95420 (0.0008) [2023-10-10 08:39:37,584][53252] Updated weights for policy 0, policy_version 95490 (0.0009) [2023-10-10 08:39:37,960][53252] Updated weights for policy 0, policy_version 95500 (0.0009) [2023-10-10 08:39:38,328][53252] Updated weights for policy 0, policy_version 95510 (0.0008) [2023-10-10 08:39:38,693][53252] Updated weights for policy 0, policy_version 95520 (0.0009) [2023-10-10 08:39:41,139][53268] Updated weights for policy 1, policy_version 95430 (0.0009) [2023-10-10 08:39:41,509][53268] Updated weights for policy 1, policy_version 95440 (0.0007) [2023-10-10 08:39:41,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195526656. Throughput: 0: 1683.3, 1: 1695.1. Samples: 48898670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:41,784][52050] Avg episode reward: [(0, '22.500'), (1, '24.830')] [2023-10-10 08:39:41,883][53268] Updated weights for policy 1, policy_version 95450 (0.0008) [2023-10-10 08:39:42,785][53252] Updated weights for policy 0, policy_version 95530 (0.0009) [2023-10-10 08:39:43,145][53252] Updated weights for policy 0, policy_version 95540 (0.0008) [2023-10-10 08:39:43,519][53252] Updated weights for policy 0, policy_version 95550 (0.0007) [2023-10-10 08:39:45,804][53268] Updated weights for policy 1, policy_version 95460 (0.0008) [2023-10-10 08:39:46,172][53268] Updated weights for policy 1, policy_version 95470 (0.0010) [2023-10-10 08:39:46,535][53268] Updated weights for policy 1, policy_version 95480 (0.0007) [2023-10-10 08:39:46,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195592192. Throughput: 0: 1676.1, 1: 1703.6. Samples: 48908126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:46,784][52050] Avg episode reward: [(0, '21.600'), (1, '23.340')] [2023-10-10 08:39:47,578][53252] Updated weights for policy 0, policy_version 95560 (0.0007) [2023-10-10 08:39:47,947][53252] Updated weights for policy 0, policy_version 95570 (0.0008) [2023-10-10 08:39:48,326][53252] Updated weights for policy 0, policy_version 95580 (0.0008) [2023-10-10 08:39:50,675][53268] Updated weights for policy 1, policy_version 95490 (0.0007) [2023-10-10 08:39:51,047][53268] Updated weights for policy 1, policy_version 95500 (0.0007) [2023-10-10 08:39:51,406][53268] Updated weights for policy 1, policy_version 95510 (0.0007) [2023-10-10 08:39:51,773][53268] Updated weights for policy 1, policy_version 95520 (0.0009) [2023-10-10 08:39:51,783][52050] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195690496. Throughput: 0: 1679.2, 1: 1702.7. Samples: 48928776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:51,784][52050] Avg episode reward: [(0, '21.990'), (1, '23.260')] [2023-10-10 08:39:52,382][53252] Updated weights for policy 0, policy_version 95590 (0.0008) [2023-10-10 08:39:52,751][53252] Updated weights for policy 0, policy_version 95600 (0.0010) [2023-10-10 08:39:53,131][53252] Updated weights for policy 0, policy_version 95610 (0.0010) [2023-10-10 08:39:55,744][53268] Updated weights for policy 1, policy_version 95530 (0.0008) [2023-10-10 08:39:56,105][53268] Updated weights for policy 1, policy_version 95540 (0.0008) [2023-10-10 08:39:56,463][53268] Updated weights for policy 1, policy_version 95550 (0.0008) [2023-10-10 08:39:56,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 195756032. Throughput: 0: 1682.3, 1: 1684.0. Samples: 48948824. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:39:56,784][52050] Avg episode reward: [(0, '20.270'), (1, '21.900')] [2023-10-10 08:39:57,210][53252] Updated weights for policy 0, policy_version 95620 (0.0010) [2023-10-10 08:39:57,579][53252] Updated weights for policy 0, policy_version 95630 (0.0009) [2023-10-10 08:39:57,943][53252] Updated weights for policy 0, policy_version 95640 (0.0009) [2023-10-10 08:40:00,529][53268] Updated weights for policy 1, policy_version 95560 (0.0008) [2023-10-10 08:40:00,901][53268] Updated weights for policy 1, policy_version 95570 (0.0008) [2023-10-10 08:40:01,257][53268] Updated weights for policy 1, policy_version 95580 (0.0011) [2023-10-10 08:40:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195821568. Throughput: 0: 1680.4, 1: 1701.2. Samples: 48958758. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:40:01,784][52050] Avg episode reward: [(0, '21.300'), (1, '21.060')] [2023-10-10 08:40:02,051][53252] Updated weights for policy 0, policy_version 95650 (0.0010) [2023-10-10 08:40:02,432][53252] Updated weights for policy 0, policy_version 95660 (0.0007) [2023-10-10 08:40:02,806][53252] Updated weights for policy 0, policy_version 95670 (0.0009) [2023-10-10 08:40:03,171][53252] Updated weights for policy 0, policy_version 95680 (0.0008) [2023-10-10 08:40:05,406][53268] Updated weights for policy 1, policy_version 95590 (0.0010) [2023-10-10 08:40:05,769][53268] Updated weights for policy 1, policy_version 95600 (0.0007) [2023-10-10 08:40:06,136][53268] Updated weights for policy 1, policy_version 95610 (0.0007) [2023-10-10 08:40:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195887104. Throughput: 0: 1687.2, 1: 1694.8. Samples: 48979594. Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:06,784][52050] Avg episode reward: [(0, '22.380'), (1, '19.900')] [2023-10-10 08:40:07,029][53252] Updated weights for policy 0, policy_version 95690 (0.0008) [2023-10-10 08:40:07,399][53252] Updated weights for policy 0, policy_version 95700 (0.0008) [2023-10-10 08:40:07,774][53252] Updated weights for policy 0, policy_version 95710 (0.0011) [2023-10-10 08:40:10,243][53268] Updated weights for policy 1, policy_version 95620 (0.0007) [2023-10-10 08:40:10,601][53268] Updated weights for policy 1, policy_version 95630 (0.0009) [2023-10-10 08:40:10,974][53268] Updated weights for policy 1, policy_version 95640 (0.0010) [2023-10-10 08:40:11,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 195952640. Throughput: 0: 1693.5, 1: 1662.8. Samples: 48999176. 
Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:11,784][52050] Avg episode reward: [(0, '21.880'), (1, '19.510')] [2023-10-10 08:40:11,892][53252] Updated weights for policy 0, policy_version 95720 (0.0007) [2023-10-10 08:40:12,262][53252] Updated weights for policy 0, policy_version 95730 (0.0007) [2023-10-10 08:40:12,626][53252] Updated weights for policy 0, policy_version 95740 (0.0008) [2023-10-10 08:40:15,032][53268] Updated weights for policy 1, policy_version 95650 (0.0009) [2023-10-10 08:40:15,389][53268] Updated weights for policy 1, policy_version 95660 (0.0008) [2023-10-10 08:40:15,757][53268] Updated weights for policy 1, policy_version 95670 (0.0011) [2023-10-10 08:40:16,130][53268] Updated weights for policy 1, policy_version 95680 (0.0009) [2023-10-10 08:40:16,747][53252] Updated weights for policy 0, policy_version 95750 (0.0010) [2023-10-10 08:40:16,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196018176. Throughput: 0: 1689.5, 1: 1689.4. Samples: 49009502. Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:16,784][52050] Avg episode reward: [(0, '22.670'), (1, '21.940')] [2023-10-10 08:40:17,119][53252] Updated weights for policy 0, policy_version 95760 (0.0010) [2023-10-10 08:40:17,491][53252] Updated weights for policy 0, policy_version 95770 (0.0007) [2023-10-10 08:40:20,323][53268] Updated weights for policy 1, policy_version 95690 (0.0009) [2023-10-10 08:40:20,683][53268] Updated weights for policy 1, policy_version 95700 (0.0010) [2023-10-10 08:40:21,037][53268] Updated weights for policy 1, policy_version 95710 (0.0011) [2023-10-10 08:40:21,483][53252] Updated weights for policy 0, policy_version 95780 (0.0007) [2023-10-10 08:40:21,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196083712. Throughput: 0: 1690.4, 1: 1679.4. Samples: 49029722. 
Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:21,784][52050] Avg episode reward: [(0, '21.570'), (1, '22.650')] [2023-10-10 08:40:21,859][53252] Updated weights for policy 0, policy_version 95790 (0.0007) [2023-10-10 08:40:22,215][53252] Updated weights for policy 0, policy_version 95800 (0.0007) [2023-10-10 08:40:25,109][53268] Updated weights for policy 1, policy_version 95720 (0.0008) [2023-10-10 08:40:25,468][53268] Updated weights for policy 1, policy_version 95730 (0.0011) [2023-10-10 08:40:25,833][53268] Updated weights for policy 1, policy_version 95740 (0.0009) [2023-10-10 08:40:26,151][53252] Updated weights for policy 0, policy_version 95810 (0.0011) [2023-10-10 08:40:26,522][53252] Updated weights for policy 0, policy_version 95820 (0.0010) [2023-10-10 08:40:26,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196149248. Throughput: 0: 1682.1, 1: 1662.1. Samples: 49049160. Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:26,784][52050] Avg episode reward: [(0, '21.570'), (1, '22.870')] [2023-10-10 08:40:26,881][53252] Updated weights for policy 0, policy_version 95830 (0.0007) [2023-10-10 08:40:27,247][53252] Updated weights for policy 0, policy_version 95840 (0.0010) [2023-10-10 08:40:29,948][53268] Updated weights for policy 1, policy_version 95750 (0.0009) [2023-10-10 08:40:30,340][53268] Updated weights for policy 1, policy_version 95760 (0.0008) [2023-10-10 08:40:30,699][53268] Updated weights for policy 1, policy_version 95770 (0.0008) [2023-10-10 08:40:31,332][53252] Updated weights for policy 0, policy_version 95850 (0.0008) [2023-10-10 08:40:31,700][53252] Updated weights for policy 0, policy_version 95860 (0.0009) [2023-10-10 08:40:31,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196214784. Throughput: 0: 1688.3, 1: 1685.6. Samples: 49059950. 
Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:31,784][52050] Avg episode reward: [(0, '23.760'), (1, '23.050')] [2023-10-10 08:40:32,079][53252] Updated weights for policy 0, policy_version 95870 (0.0010) [2023-10-10 08:40:34,796][53268] Updated weights for policy 1, policy_version 95780 (0.0008) [2023-10-10 08:40:35,168][53268] Updated weights for policy 1, policy_version 95790 (0.0008) [2023-10-10 08:40:35,541][53268] Updated weights for policy 1, policy_version 95800 (0.0007) [2023-10-10 08:40:36,056][53252] Updated weights for policy 0, policy_version 95880 (0.0008) [2023-10-10 08:40:36,428][53252] Updated weights for policy 0, policy_version 95890 (0.0008) [2023-10-10 08:40:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196280320. Throughput: 0: 1693.4, 1: 1669.1. Samples: 49080088. Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:36,784][52050] Avg episode reward: [(0, '25.250'), (1, '23.100')] [2023-10-10 08:40:36,803][53252] Updated weights for policy 0, policy_version 95900 (0.0008) [2023-10-10 08:40:39,514][53268] Updated weights for policy 1, policy_version 95810 (0.0010) [2023-10-10 08:40:39,884][53268] Updated weights for policy 1, policy_version 95820 (0.0011) [2023-10-10 08:40:40,250][53268] Updated weights for policy 1, policy_version 95830 (0.0010) [2023-10-10 08:40:40,612][53268] Updated weights for policy 1, policy_version 95840 (0.0008) [2023-10-10 08:40:40,951][53252] Updated weights for policy 0, policy_version 95910 (0.0009) [2023-10-10 08:40:41,329][53252] Updated weights for policy 0, policy_version 95920 (0.0007) [2023-10-10 08:40:41,700][53252] Updated weights for policy 0, policy_version 95930 (0.0008) [2023-10-10 08:40:41,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196345856. Throughput: 0: 1679.6, 1: 1664.5. Samples: 49099308. 
Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:41,784][52050] Avg episode reward: [(0, '23.280'), (1, '20.140')] [2023-10-10 08:40:41,793][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000095840_98140160.pth... [2023-10-10 08:40:41,826][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000094272_96534528.pth [2023-10-10 08:40:41,924][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000095936_98238464.pth... [2023-10-10 08:40:41,960][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000094336_96600064.pth [2023-10-10 08:40:44,726][53268] Updated weights for policy 1, policy_version 95850 (0.0008) [2023-10-10 08:40:45,096][53268] Updated weights for policy 1, policy_version 95860 (0.0009) [2023-10-10 08:40:45,446][53268] Updated weights for policy 1, policy_version 95870 (0.0009) [2023-10-10 08:40:45,706][53252] Updated weights for policy 0, policy_version 95940 (0.0008) [2023-10-10 08:40:46,078][53252] Updated weights for policy 0, policy_version 95950 (0.0007) [2023-10-10 08:40:46,452][53252] Updated weights for policy 0, policy_version 95960 (0.0008) [2023-10-10 08:40:46,784][52050] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 196444160. Throughput: 0: 1695.7, 1: 1674.3. Samples: 49110408. 
Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:46,785][52050] Avg episode reward: [(0, '24.310'), (1, '22.060')] [2023-10-10 08:40:49,464][53268] Updated weights for policy 1, policy_version 95880 (0.0009) [2023-10-10 08:40:49,831][53268] Updated weights for policy 1, policy_version 95890 (0.0009) [2023-10-10 08:40:50,194][53268] Updated weights for policy 1, policy_version 95900 (0.0009) [2023-10-10 08:40:50,436][53252] Updated weights for policy 0, policy_version 95970 (0.0010) [2023-10-10 08:40:50,803][53252] Updated weights for policy 0, policy_version 95980 (0.0009) [2023-10-10 08:40:51,177][53252] Updated weights for policy 0, policy_version 95990 (0.0008) [2023-10-10 08:40:51,539][53252] Updated weights for policy 0, policy_version 96000 (0.0009) [2023-10-10 08:40:51,783][52050] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 196509696. Throughput: 0: 1692.7, 1: 1659.5. Samples: 49130444. Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:51,784][52050] Avg episode reward: [(0, '23.190'), (1, '20.970')] [2023-10-10 08:40:54,286][53268] Updated weights for policy 1, policy_version 95910 (0.0009) [2023-10-10 08:40:54,649][53268] Updated weights for policy 1, policy_version 95920 (0.0010) [2023-10-10 08:40:55,012][53268] Updated weights for policy 1, policy_version 95930 (0.0010) [2023-10-10 08:40:55,634][53252] Updated weights for policy 0, policy_version 96010 (0.0007) [2023-10-10 08:40:56,004][53252] Updated weights for policy 0, policy_version 96020 (0.0008) [2023-10-10 08:40:56,376][53252] Updated weights for policy 0, policy_version 96030 (0.0010) [2023-10-10 08:40:56,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 196575232. Throughput: 0: 1665.5, 1: 1682.0. Samples: 49149810. 
Policy #0 lag: (min: 7.0, avg: 12.1, max: 39.0) [2023-10-10 08:40:56,784][52050] Avg episode reward: [(0, '23.260'), (1, '21.640')] [2023-10-10 08:40:59,238][53268] Updated weights for policy 1, policy_version 95940 (0.0010) [2023-10-10 08:40:59,606][53268] Updated weights for policy 1, policy_version 95950 (0.0008) [2023-10-10 08:40:59,968][53268] Updated weights for policy 1, policy_version 95960 (0.0009) [2023-10-10 08:41:00,461][53252] Updated weights for policy 0, policy_version 96040 (0.0009) [2023-10-10 08:41:00,832][53252] Updated weights for policy 0, policy_version 96050 (0.0009) [2023-10-10 08:41:01,210][53252] Updated weights for policy 0, policy_version 96060 (0.0009) [2023-10-10 08:41:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 196640768. Throughput: 0: 1690.0, 1: 1675.1. Samples: 49160934. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:01,784][52050] Avg episode reward: [(0, '21.500'), (1, '22.050')] [2023-10-10 08:41:03,843][53268] Updated weights for policy 1, policy_version 95970 (0.0008) [2023-10-10 08:41:04,223][53268] Updated weights for policy 1, policy_version 95980 (0.0008) [2023-10-10 08:41:04,582][53268] Updated weights for policy 1, policy_version 95990 (0.0007) [2023-10-10 08:41:04,940][53268] Updated weights for policy 1, policy_version 96000 (0.0009) [2023-10-10 08:41:05,338][53252] Updated weights for policy 0, policy_version 96070 (0.0008) [2023-10-10 08:41:05,714][53252] Updated weights for policy 0, policy_version 96080 (0.0007) [2023-10-10 08:41:06,079][53252] Updated weights for policy 0, policy_version 96090 (0.0007) [2023-10-10 08:41:06,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 196706304. Throughput: 0: 1684.4, 1: 1664.1. Samples: 49180404. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:06,784][52050] Avg episode reward: [(0, '22.090'), (1, '23.200')] [2023-10-10 08:41:08,931][53268] Updated weights for policy 1, policy_version 96010 (0.0007) [2023-10-10 08:41:09,298][53268] Updated weights for policy 1, policy_version 96020 (0.0008) [2023-10-10 08:41:09,681][53268] Updated weights for policy 1, policy_version 96030 (0.0011) [2023-10-10 08:41:10,218][53252] Updated weights for policy 0, policy_version 96100 (0.0007) [2023-10-10 08:41:10,591][53252] Updated weights for policy 0, policy_version 96110 (0.0009) [2023-10-10 08:41:10,960][53252] Updated weights for policy 0, policy_version 96120 (0.0009) [2023-10-10 08:41:11,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 196771840. Throughput: 0: 1664.9, 1: 1693.1. Samples: 49200270. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:11,784][52050] Avg episode reward: [(0, '22.830'), (1, '23.770')] [2023-10-10 08:41:13,452][53268] Updated weights for policy 1, policy_version 96040 (0.0008) [2023-10-10 08:41:13,822][53268] Updated weights for policy 1, policy_version 96050 (0.0008) [2023-10-10 08:41:14,193][53268] Updated weights for policy 1, policy_version 96060 (0.0007) [2023-10-10 08:41:15,099][53252] Updated weights for policy 0, policy_version 96130 (0.0008) [2023-10-10 08:41:15,462][53252] Updated weights for policy 0, policy_version 96140 (0.0009) [2023-10-10 08:41:15,840][53252] Updated weights for policy 0, policy_version 96150 (0.0007) [2023-10-10 08:41:16,210][53252] Updated weights for policy 0, policy_version 96160 (0.0008) [2023-10-10 08:41:16,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 196837376. Throughput: 0: 1685.7, 1: 1669.0. Samples: 49210912. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:16,784][52050] Avg episode reward: [(0, '23.110'), (1, '22.640')] [2023-10-10 08:41:18,223][53268] Updated weights for policy 1, policy_version 96070 (0.0008) [2023-10-10 08:41:18,591][53268] Updated weights for policy 1, policy_version 96080 (0.0009) [2023-10-10 08:41:18,956][53268] Updated weights for policy 1, policy_version 96090 (0.0011) [2023-10-10 08:41:20,320][53252] Updated weights for policy 0, policy_version 96170 (0.0007) [2023-10-10 08:41:20,690][53252] Updated weights for policy 0, policy_version 96180 (0.0008) [2023-10-10 08:41:21,067][53252] Updated weights for policy 0, policy_version 96190 (0.0009) [2023-10-10 08:41:21,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 196902912. Throughput: 0: 1665.4, 1: 1680.4. Samples: 49230652. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:21,784][52050] Avg episode reward: [(0, '22.280'), (1, '22.450')] [2023-10-10 08:41:23,135][53268] Updated weights for policy 1, policy_version 96100 (0.0008) [2023-10-10 08:41:23,551][53268] Updated weights for policy 1, policy_version 96110 (0.0011) [2023-10-10 08:41:23,920][53268] Updated weights for policy 1, policy_version 96120 (0.0008) [2023-10-10 08:41:25,135][53252] Updated weights for policy 0, policy_version 96200 (0.0008) [2023-10-10 08:41:25,515][53252] Updated weights for policy 0, policy_version 96210 (0.0009) [2023-10-10 08:41:25,893][53252] Updated weights for policy 0, policy_version 96220 (0.0009) [2023-10-10 08:41:26,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 196968448. Throughput: 0: 1657.4, 1: 1700.2. Samples: 49250398. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:26,784][52050] Avg episode reward: [(0, '21.780'), (1, '21.870')] [2023-10-10 08:41:28,006][53268] Updated weights for policy 1, policy_version 96130 (0.0008) [2023-10-10 08:41:28,371][53268] Updated weights for policy 1, policy_version 96140 (0.0009) [2023-10-10 08:41:28,743][53268] Updated weights for policy 1, policy_version 96150 (0.0009) [2023-10-10 08:41:29,115][53268] Updated weights for policy 1, policy_version 96160 (0.0009) [2023-10-10 08:41:29,946][53252] Updated weights for policy 0, policy_version 96230 (0.0009) [2023-10-10 08:41:30,317][53252] Updated weights for policy 0, policy_version 96240 (0.0009) [2023-10-10 08:41:30,694][53252] Updated weights for policy 0, policy_version 96250 (0.0009) [2023-10-10 08:41:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197033984. Throughput: 0: 1673.1, 1: 1670.1. Samples: 49260852. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:31,784][52050] Avg episode reward: [(0, '22.780'), (1, '22.060')] [2023-10-10 08:41:33,231][53268] Updated weights for policy 1, policy_version 96170 (0.0010) [2023-10-10 08:41:33,608][53268] Updated weights for policy 1, policy_version 96180 (0.0009) [2023-10-10 08:41:33,975][53268] Updated weights for policy 1, policy_version 96190 (0.0009) [2023-10-10 08:41:34,810][53252] Updated weights for policy 0, policy_version 96260 (0.0009) [2023-10-10 08:41:35,172][53252] Updated weights for policy 0, policy_version 96270 (0.0010) [2023-10-10 08:41:35,547][53252] Updated weights for policy 0, policy_version 96280 (0.0007) [2023-10-10 08:41:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197099520. Throughput: 0: 1647.0, 1: 1691.5. Samples: 49280678. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:36,784][52050] Avg episode reward: [(0, '22.950'), (1, '21.600')] [2023-10-10 08:41:38,096][53268] Updated weights for policy 1, policy_version 96200 (0.0009) [2023-10-10 08:41:38,460][53268] Updated weights for policy 1, policy_version 96210 (0.0007) [2023-10-10 08:41:38,822][53268] Updated weights for policy 1, policy_version 96220 (0.0009) [2023-10-10 08:41:39,614][53252] Updated weights for policy 0, policy_version 96290 (0.0009) [2023-10-10 08:41:39,991][53252] Updated weights for policy 0, policy_version 96300 (0.0010) [2023-10-10 08:41:40,357][53252] Updated weights for policy 0, policy_version 96310 (0.0010) [2023-10-10 08:41:40,726][53252] Updated weights for policy 0, policy_version 96320 (0.0008) [2023-10-10 08:41:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 197165056. Throughput: 0: 1663.6, 1: 1698.0. Samples: 49301084. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:41,784][52050] Avg episode reward: [(0, '22.120'), (1, '22.780')] [2023-10-10 08:41:42,850][53268] Updated weights for policy 1, policy_version 96230 (0.0008) [2023-10-10 08:41:43,215][53268] Updated weights for policy 1, policy_version 96240 (0.0007) [2023-10-10 08:41:43,579][53268] Updated weights for policy 1, policy_version 96250 (0.0008) [2023-10-10 08:41:44,800][53252] Updated weights for policy 0, policy_version 96330 (0.0010) [2023-10-10 08:41:45,172][53252] Updated weights for policy 0, policy_version 96340 (0.0008) [2023-10-10 08:41:45,540][53252] Updated weights for policy 0, policy_version 96350 (0.0011) [2023-10-10 08:41:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197230592. Throughput: 0: 1670.9, 1: 1676.1. Samples: 49311548. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:46,784][52050] Avg episode reward: [(0, '22.400'), (1, '22.210')] [2023-10-10 08:41:47,564][53268] Updated weights for policy 1, policy_version 96260 (0.0008) [2023-10-10 08:41:47,942][53268] Updated weights for policy 1, policy_version 96270 (0.0009) [2023-10-10 08:41:48,302][53268] Updated weights for policy 1, policy_version 96280 (0.0009) [2023-10-10 08:41:49,497][53252] Updated weights for policy 0, policy_version 96360 (0.0010) [2023-10-10 08:41:49,866][53252] Updated weights for policy 0, policy_version 96370 (0.0011) [2023-10-10 08:41:50,243][53252] Updated weights for policy 0, policy_version 96380 (0.0010) [2023-10-10 08:41:51,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197296128. Throughput: 0: 1654.3, 1: 1698.8. Samples: 49331296. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:51,784][52050] Avg episode reward: [(0, '23.730'), (1, '22.450')] [2023-10-10 08:41:52,484][53268] Updated weights for policy 1, policy_version 96290 (0.0008) [2023-10-10 08:41:52,850][53268] Updated weights for policy 1, policy_version 96300 (0.0008) [2023-10-10 08:41:53,223][53268] Updated weights for policy 1, policy_version 96310 (0.0009) [2023-10-10 08:41:53,589][53268] Updated weights for policy 1, policy_version 96320 (0.0008) [2023-10-10 08:41:54,282][53252] Updated weights for policy 0, policy_version 96390 (0.0009) [2023-10-10 08:41:54,652][53252] Updated weights for policy 0, policy_version 96400 (0.0010) [2023-10-10 08:41:55,022][53252] Updated weights for policy 0, policy_version 96410 (0.0009) [2023-10-10 08:41:56,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 197361664. Throughput: 0: 1674.3, 1: 1691.0. Samples: 49351710. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-10 08:41:56,784][52050] Avg episode reward: [(0, '23.670'), (1, '22.660')] [2023-10-10 08:41:57,502][53268] Updated weights for policy 1, policy_version 96330 (0.0007) [2023-10-10 08:41:57,871][53268] Updated weights for policy 1, policy_version 96340 (0.0007) [2023-10-10 08:41:58,228][53268] Updated weights for policy 1, policy_version 96350 (0.0007) [2023-10-10 08:41:59,232][53252] Updated weights for policy 0, policy_version 96420 (0.0009) [2023-10-10 08:41:59,603][53252] Updated weights for policy 0, policy_version 96430 (0.0007) [2023-10-10 08:41:59,972][53252] Updated weights for policy 0, policy_version 96440 (0.0010) [2023-10-10 08:42:01,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197427200. Throughput: 0: 1667.2, 1: 1684.6. Samples: 49361742. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:01,784][52050] Avg episode reward: [(0, '23.890'), (1, '21.860')] [2023-10-10 08:42:02,530][53268] Updated weights for policy 1, policy_version 96360 (0.0007) [2023-10-10 08:42:02,898][53268] Updated weights for policy 1, policy_version 96370 (0.0010) [2023-10-10 08:42:03,262][53268] Updated weights for policy 1, policy_version 96380 (0.0011) [2023-10-10 08:42:03,926][53252] Updated weights for policy 0, policy_version 96450 (0.0007) [2023-10-10 08:42:04,296][53252] Updated weights for policy 0, policy_version 96460 (0.0008) [2023-10-10 08:42:04,670][53252] Updated weights for policy 0, policy_version 96470 (0.0007) [2023-10-10 08:42:05,039][53252] Updated weights for policy 0, policy_version 96480 (0.0007) [2023-10-10 08:42:06,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197492736. Throughput: 0: 1663.6, 1: 1689.8. Samples: 49381554. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:06,784][52050] Avg episode reward: [(0, '22.620'), (1, '22.090')] [2023-10-10 08:42:07,241][53268] Updated weights for policy 1, policy_version 96390 (0.0010) [2023-10-10 08:42:07,613][53268] Updated weights for policy 1, policy_version 96400 (0.0009) [2023-10-10 08:42:07,971][53268] Updated weights for policy 1, policy_version 96410 (0.0009) [2023-10-10 08:42:09,162][53252] Updated weights for policy 0, policy_version 96490 (0.0010) [2023-10-10 08:42:09,539][53252] Updated weights for policy 0, policy_version 96500 (0.0010) [2023-10-10 08:42:09,907][53252] Updated weights for policy 0, policy_version 96510 (0.0011) [2023-10-10 08:42:11,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197558272. Throughput: 0: 1689.4, 1: 1687.8. Samples: 49402374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:11,784][52050] Avg episode reward: [(0, '20.830'), (1, '22.230')] [2023-10-10 08:42:12,168][53268] Updated weights for policy 1, policy_version 96420 (0.0008) [2023-10-10 08:42:12,554][53268] Updated weights for policy 1, policy_version 96430 (0.0007) [2023-10-10 08:42:12,924][53268] Updated weights for policy 1, policy_version 96440 (0.0007) [2023-10-10 08:42:13,898][53252] Updated weights for policy 0, policy_version 96520 (0.0011) [2023-10-10 08:42:14,271][53252] Updated weights for policy 0, policy_version 96530 (0.0009) [2023-10-10 08:42:14,644][53252] Updated weights for policy 0, policy_version 96540 (0.0009) [2023-10-10 08:42:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197623808. Throughput: 0: 1671.6, 1: 1686.7. Samples: 49411976. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:16,784][52050] Avg episode reward: [(0, '22.570'), (1, '21.870')] [2023-10-10 08:42:16,874][53268] Updated weights for policy 1, policy_version 96450 (0.0007) [2023-10-10 08:42:17,236][53268] Updated weights for policy 1, policy_version 96460 (0.0009) [2023-10-10 08:42:17,605][53268] Updated weights for policy 1, policy_version 96470 (0.0007) [2023-10-10 08:42:17,967][53268] Updated weights for policy 1, policy_version 96480 (0.0008) [2023-10-10 08:42:18,828][53252] Updated weights for policy 0, policy_version 96550 (0.0010) [2023-10-10 08:42:19,210][53252] Updated weights for policy 0, policy_version 96560 (0.0008) [2023-10-10 08:42:19,574][53252] Updated weights for policy 0, policy_version 96570 (0.0008) [2023-10-10 08:42:21,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197689344. Throughput: 0: 1682.2, 1: 1683.3. Samples: 49432124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:21,784][52050] Avg episode reward: [(0, '23.950'), (1, '22.560')] [2023-10-10 08:42:21,949][53268] Updated weights for policy 1, policy_version 96490 (0.0009) [2023-10-10 08:42:22,316][53268] Updated weights for policy 1, policy_version 96500 (0.0010) [2023-10-10 08:42:22,675][53268] Updated weights for policy 1, policy_version 96510 (0.0011) [2023-10-10 08:42:23,675][53252] Updated weights for policy 0, policy_version 96580 (0.0009) [2023-10-10 08:42:24,038][53252] Updated weights for policy 0, policy_version 96590 (0.0009) [2023-10-10 08:42:24,408][53252] Updated weights for policy 0, policy_version 96600 (0.0009) [2023-10-10 08:42:26,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197754880. Throughput: 0: 1690.8, 1: 1679.9. Samples: 49452764. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:26,784][52050] Avg episode reward: [(0, '21.830'), (1, '23.210')] [2023-10-10 08:42:26,802][53268] Updated weights for policy 1, policy_version 96520 (0.0009) [2023-10-10 08:42:27,165][53268] Updated weights for policy 1, policy_version 96530 (0.0011) [2023-10-10 08:42:27,539][53268] Updated weights for policy 1, policy_version 96540 (0.0012) [2023-10-10 08:42:28,452][53252] Updated weights for policy 0, policy_version 96610 (0.0009) [2023-10-10 08:42:28,817][53252] Updated weights for policy 0, policy_version 96620 (0.0009) [2023-10-10 08:42:29,187][53252] Updated weights for policy 0, policy_version 96630 (0.0008) [2023-10-10 08:42:29,560][53252] Updated weights for policy 0, policy_version 96640 (0.0009) [2023-10-10 08:42:31,766][53268] Updated weights for policy 1, policy_version 96550 (0.0009) [2023-10-10 08:42:31,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197820416. Throughput: 0: 1665.3, 1: 1677.1. Samples: 49461954. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:31,784][52050] Avg episode reward: [(0, '23.500'), (1, '23.860')] [2023-10-10 08:42:32,128][53268] Updated weights for policy 1, policy_version 96560 (0.0009) [2023-10-10 08:42:32,494][53268] Updated weights for policy 1, policy_version 96570 (0.0010) [2023-10-10 08:42:33,532][53252] Updated weights for policy 0, policy_version 96650 (0.0010) [2023-10-10 08:42:33,896][53252] Updated weights for policy 0, policy_version 96660 (0.0011) [2023-10-10 08:42:34,274][53252] Updated weights for policy 0, policy_version 96670 (0.0011) [2023-10-10 08:42:36,513][53268] Updated weights for policy 1, policy_version 96580 (0.0009) [2023-10-10 08:42:36,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197885952. Throughput: 0: 1685.2, 1: 1676.8. Samples: 49482586. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:36,784][52050] Avg episode reward: [(0, '21.850'), (1, '21.310')] [2023-10-10 08:42:36,873][53268] Updated weights for policy 1, policy_version 96590 (0.0007) [2023-10-10 08:42:37,239][53268] Updated weights for policy 1, policy_version 96600 (0.0008) [2023-10-10 08:42:38,353][53252] Updated weights for policy 0, policy_version 96680 (0.0008) [2023-10-10 08:42:38,720][53252] Updated weights for policy 0, policy_version 96690 (0.0007) [2023-10-10 08:42:39,097][53252] Updated weights for policy 0, policy_version 96700 (0.0009) [2023-10-10 08:42:41,345][53268] Updated weights for policy 1, policy_version 96610 (0.0007) [2023-10-10 08:42:41,717][53268] Updated weights for policy 1, policy_version 96620 (0.0009) [2023-10-10 08:42:41,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 197951488. Throughput: 0: 1685.6, 1: 1682.3. Samples: 49503264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:41,784][52050] Avg episode reward: [(0, '21.620'), (1, '20.950')] [2023-10-10 08:42:41,791][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000096704_99024896.pth... [2023-10-10 08:42:41,823][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000095136_97419264.pth [2023-10-10 08:42:42,095][53268] Updated weights for policy 1, policy_version 96630 (0.0009) [2023-10-10 08:42:42,463][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000096640_98959360.pth... 
[2023-10-10 08:42:42,467][53268] Updated weights for policy 1, policy_version 96640 (0.0008) [2023-10-10 08:42:42,493][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000095040_97320960.pth [2023-10-10 08:42:43,135][53252] Updated weights for policy 0, policy_version 96710 (0.0008) [2023-10-10 08:42:43,514][53252] Updated weights for policy 0, policy_version 96720 (0.0012) [2023-10-10 08:42:43,882][53252] Updated weights for policy 0, policy_version 96730 (0.0008) [2023-10-10 08:42:46,629][53268] Updated weights for policy 1, policy_version 96650 (0.0009) [2023-10-10 08:42:46,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198017024. Throughput: 0: 1669.5, 1: 1681.0. Samples: 49512516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:46,784][52050] Avg episode reward: [(0, '23.510'), (1, '22.280')] [2023-10-10 08:42:46,992][53268] Updated weights for policy 1, policy_version 96660 (0.0009) [2023-10-10 08:42:47,367][53268] Updated weights for policy 1, policy_version 96670 (0.0009) [2023-10-10 08:42:47,925][53252] Updated weights for policy 0, policy_version 96740 (0.0008) [2023-10-10 08:42:48,303][53252] Updated weights for policy 0, policy_version 96750 (0.0009) [2023-10-10 08:42:48,675][53252] Updated weights for policy 0, policy_version 96760 (0.0007) [2023-10-10 08:42:51,483][53268] Updated weights for policy 1, policy_version 96680 (0.0010) [2023-10-10 08:42:51,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198082560. Throughput: 0: 1688.3, 1: 1676.6. Samples: 49532978. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:51,784][52050] Avg episode reward: [(0, '21.860'), (1, '20.920')] [2023-10-10 08:42:51,857][53268] Updated weights for policy 1, policy_version 96690 (0.0009) [2023-10-10 08:42:52,236][53268] Updated weights for policy 1, policy_version 96700 (0.0008) [2023-10-10 08:42:52,771][53252] Updated weights for policy 0, policy_version 96770 (0.0009) [2023-10-10 08:42:53,138][53252] Updated weights for policy 0, policy_version 96780 (0.0007) [2023-10-10 08:42:53,507][53252] Updated weights for policy 0, policy_version 96790 (0.0009) [2023-10-10 08:42:53,880][53252] Updated weights for policy 0, policy_version 96800 (0.0008) [2023-10-10 08:42:56,313][53268] Updated weights for policy 1, policy_version 96710 (0.0008) [2023-10-10 08:42:56,706][53268] Updated weights for policy 1, policy_version 96720 (0.0008) [2023-10-10 08:42:56,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 198148096. Throughput: 0: 1684.9, 1: 1677.2. Samples: 49553670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:42:56,784][52050] Avg episode reward: [(0, '21.280'), (1, '20.920')] [2023-10-10 08:42:57,069][53268] Updated weights for policy 1, policy_version 96730 (0.0010) [2023-10-10 08:42:57,929][53252] Updated weights for policy 0, policy_version 96810 (0.0009) [2023-10-10 08:42:58,295][53252] Updated weights for policy 0, policy_version 96820 (0.0010) [2023-10-10 08:42:58,676][53252] Updated weights for policy 0, policy_version 96830 (0.0009) [2023-10-10 08:43:01,120][53268] Updated weights for policy 1, policy_version 96740 (0.0009) [2023-10-10 08:43:01,486][53268] Updated weights for policy 1, policy_version 96750 (0.0010) [2023-10-10 08:43:01,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198213632. Throughput: 0: 1667.6, 1: 1680.0. Samples: 49562616. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:01,784][52050] Avg episode reward: [(0, '21.890'), (1, '20.280')] [2023-10-10 08:43:01,844][53268] Updated weights for policy 1, policy_version 96760 (0.0009) [2023-10-10 08:43:02,750][53252] Updated weights for policy 0, policy_version 96840 (0.0007) [2023-10-10 08:43:03,114][53252] Updated weights for policy 0, policy_version 96850 (0.0009) [2023-10-10 08:43:03,490][53252] Updated weights for policy 0, policy_version 96860 (0.0010) [2023-10-10 08:43:05,905][53268] Updated weights for policy 1, policy_version 96770 (0.0010) [2023-10-10 08:43:06,276][53268] Updated weights for policy 1, policy_version 96780 (0.0008) [2023-10-10 08:43:06,646][53268] Updated weights for policy 1, policy_version 96790 (0.0007) [2023-10-10 08:43:06,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198279168. Throughput: 0: 1680.6, 1: 1682.7. Samples: 49583472. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:06,784][52050] Avg episode reward: [(0, '21.880'), (1, '21.880')] [2023-10-10 08:43:07,003][53268] Updated weights for policy 1, policy_version 96800 (0.0008) [2023-10-10 08:43:07,553][53252] Updated weights for policy 0, policy_version 96870 (0.0010) [2023-10-10 08:43:07,924][53252] Updated weights for policy 0, policy_version 96880 (0.0007) [2023-10-10 08:43:08,294][53252] Updated weights for policy 0, policy_version 96890 (0.0008) [2023-10-10 08:43:10,990][53268] Updated weights for policy 1, policy_version 96810 (0.0008) [2023-10-10 08:43:11,367][53268] Updated weights for policy 1, policy_version 96820 (0.0009) [2023-10-10 08:43:11,736][53268] Updated weights for policy 1, policy_version 96830 (0.0009) [2023-10-10 08:43:11,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 198344704. Throughput: 0: 1682.3, 1: 1669.4. Samples: 49603592. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:11,784][52050] Avg episode reward: [(0, '21.980'), (1, '19.760')] [2023-10-10 08:43:12,560][53252] Updated weights for policy 0, policy_version 96900 (0.0008) [2023-10-10 08:43:12,934][53252] Updated weights for policy 0, policy_version 96910 (0.0008) [2023-10-10 08:43:13,318][53252] Updated weights for policy 0, policy_version 96920 (0.0008) [2023-10-10 08:43:15,804][53268] Updated weights for policy 1, policy_version 96840 (0.0008) [2023-10-10 08:43:16,177][53268] Updated weights for policy 1, policy_version 96850 (0.0008) [2023-10-10 08:43:16,540][53268] Updated weights for policy 1, policy_version 96860 (0.0008) [2023-10-10 08:43:16,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198443008. Throughput: 0: 1673.9, 1: 1685.7. Samples: 49613134. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:16,784][52050] Avg episode reward: [(0, '22.700'), (1, '21.460')] [2023-10-10 08:43:17,447][53252] Updated weights for policy 0, policy_version 96930 (0.0008) [2023-10-10 08:43:17,824][53252] Updated weights for policy 0, policy_version 96940 (0.0009) [2023-10-10 08:43:18,188][53252] Updated weights for policy 0, policy_version 96950 (0.0008) [2023-10-10 08:43:18,562][53252] Updated weights for policy 0, policy_version 96960 (0.0008) [2023-10-10 08:43:20,563][53268] Updated weights for policy 1, policy_version 96870 (0.0009) [2023-10-10 08:43:20,929][53268] Updated weights for policy 1, policy_version 96880 (0.0008) [2023-10-10 08:43:21,299][53268] Updated weights for policy 1, policy_version 96890 (0.0010) [2023-10-10 08:43:21,783][52050] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198508544. Throughput: 0: 1673.6, 1: 1685.8. Samples: 49633762. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:21,784][52050] Avg episode reward: [(0, '23.970'), (1, '22.020')] [2023-10-10 08:43:22,638][53252] Updated weights for policy 0, policy_version 96970 (0.0009) [2023-10-10 08:43:23,008][53252] Updated weights for policy 0, policy_version 96980 (0.0008) [2023-10-10 08:43:23,383][53252] Updated weights for policy 0, policy_version 96990 (0.0008) [2023-10-10 08:43:25,282][53268] Updated weights for policy 1, policy_version 96900 (0.0008) [2023-10-10 08:43:25,655][53268] Updated weights for policy 1, policy_version 96910 (0.0009) [2023-10-10 08:43:26,019][53268] Updated weights for policy 1, policy_version 96920 (0.0008) [2023-10-10 08:43:26,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198574080. Throughput: 0: 1677.7, 1: 1664.2. Samples: 49653648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:26,784][52050] Avg episode reward: [(0, '25.520'), (1, '20.900')] [2023-10-10 08:43:27,371][53252] Updated weights for policy 0, policy_version 97000 (0.0009) [2023-10-10 08:43:27,745][53252] Updated weights for policy 0, policy_version 97010 (0.0009) [2023-10-10 08:43:28,118][53252] Updated weights for policy 0, policy_version 97020 (0.0009) [2023-10-10 08:43:30,077][53268] Updated weights for policy 1, policy_version 96930 (0.0009) [2023-10-10 08:43:30,458][53268] Updated weights for policy 1, policy_version 96940 (0.0010) [2023-10-10 08:43:30,823][53268] Updated weights for policy 1, policy_version 96950 (0.0007) [2023-10-10 08:43:31,185][53268] Updated weights for policy 1, policy_version 96960 (0.0008) [2023-10-10 08:43:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198639616. Throughput: 0: 1676.9, 1: 1687.0. Samples: 49663894. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:31,784][52050] Avg episode reward: [(0, '24.070'), (1, '21.230')] [2023-10-10 08:43:32,100][53252] Updated weights for policy 0, policy_version 97030 (0.0009) [2023-10-10 08:43:32,464][53252] Updated weights for policy 0, policy_version 97040 (0.0010) [2023-10-10 08:43:32,831][53252] Updated weights for policy 0, policy_version 97050 (0.0007) [2023-10-10 08:43:35,303][53268] Updated weights for policy 1, policy_version 96970 (0.0008) [2023-10-10 08:43:35,676][53268] Updated weights for policy 1, policy_version 96980 (0.0008) [2023-10-10 08:43:36,045][53268] Updated weights for policy 1, policy_version 96990 (0.0012) [2023-10-10 08:43:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198705152. Throughput: 0: 1677.9, 1: 1687.4. Samples: 49684416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:36,784][52050] Avg episode reward: [(0, '24.490'), (1, '23.400')] [2023-10-10 08:43:36,989][53252] Updated weights for policy 0, policy_version 97060 (0.0009) [2023-10-10 08:43:37,366][53252] Updated weights for policy 0, policy_version 97070 (0.0007) [2023-10-10 08:43:37,737][53252] Updated weights for policy 0, policy_version 97080 (0.0008) [2023-10-10 08:43:39,936][53268] Updated weights for policy 1, policy_version 97000 (0.0011) [2023-10-10 08:43:40,305][53268] Updated weights for policy 1, policy_version 97010 (0.0007) [2023-10-10 08:43:40,667][53268] Updated weights for policy 1, policy_version 97020 (0.0011) [2023-10-10 08:43:41,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 198770688. Throughput: 0: 1680.1, 1: 1664.8. Samples: 49704186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:41,784][52050] Avg episode reward: [(0, '22.970'), (1, '22.110')] [2023-10-10 08:43:41,904][53252] Updated weights for policy 0, policy_version 97090 (0.0010) [2023-10-10 08:43:42,275][53252] Updated weights for policy 0, policy_version 97100 (0.0009) [2023-10-10 08:43:42,647][53252] Updated weights for policy 0, policy_version 97110 (0.0009) [2023-10-10 08:43:43,014][53252] Updated weights for policy 0, policy_version 97120 (0.0008) [2023-10-10 08:43:44,729][53268] Updated weights for policy 1, policy_version 97030 (0.0010) [2023-10-10 08:43:45,116][53268] Updated weights for policy 1, policy_version 97040 (0.0009) [2023-10-10 08:43:45,473][53268] Updated weights for policy 1, policy_version 97050 (0.0009) [2023-10-10 08:43:46,783][52050] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 198836224. Throughput: 0: 1682.0, 1: 1693.2. Samples: 49714500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:46,784][52050] Avg episode reward: [(0, '23.330'), (1, '21.100')] [2023-10-10 08:43:47,247][53252] Updated weights for policy 0, policy_version 97130 (0.0008) [2023-10-10 08:43:47,623][53252] Updated weights for policy 0, policy_version 97140 (0.0008) [2023-10-10 08:43:47,989][53252] Updated weights for policy 0, policy_version 97150 (0.0008) [2023-10-10 08:43:49,626][53268] Updated weights for policy 1, policy_version 97060 (0.0009) [2023-10-10 08:43:49,991][53268] Updated weights for policy 1, policy_version 97070 (0.0010) [2023-10-10 08:43:50,354][53268] Updated weights for policy 1, policy_version 97080 (0.0010) [2023-10-10 08:43:51,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 198901760. Throughput: 0: 1677.8, 1: 1673.8. Samples: 49734294. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:51,784][52050] Avg episode reward: [(0, '23.680'), (1, '21.690')] [2023-10-10 08:43:51,891][53252] Updated weights for policy 0, policy_version 97160 (0.0009) [2023-10-10 08:43:52,261][53252] Updated weights for policy 0, policy_version 97170 (0.0010) [2023-10-10 08:43:52,642][53252] Updated weights for policy 0, policy_version 97180 (0.0010) [2023-10-10 08:43:54,336][53268] Updated weights for policy 1, policy_version 97090 (0.0009) [2023-10-10 08:43:54,702][53268] Updated weights for policy 1, policy_version 97100 (0.0009) [2023-10-10 08:43:55,070][53268] Updated weights for policy 1, policy_version 97110 (0.0007) [2023-10-10 08:43:55,434][53268] Updated weights for policy 1, policy_version 97120 (0.0008) [2023-10-10 08:43:56,417][53252] Updated weights for policy 0, policy_version 97190 (0.0009) [2023-10-10 08:43:56,783][52050] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 198967296. Throughput: 0: 1676.0, 1: 1677.6. Samples: 49754502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:43:56,784][52050] Avg episode reward: [(0, '21.890'), (1, '20.520')] [2023-10-10 08:43:56,790][53252] Updated weights for policy 0, policy_version 97200 (0.0008) [2023-10-10 08:43:57,155][53252] Updated weights for policy 0, policy_version 97210 (0.0008) [2023-10-10 08:43:59,509][53268] Updated weights for policy 1, policy_version 97130 (0.0009) [2023-10-10 08:43:59,871][53268] Updated weights for policy 1, policy_version 97140 (0.0010) [2023-10-10 08:44:00,240][53268] Updated weights for policy 1, policy_version 97150 (0.0011) [2023-10-10 08:44:01,192][53252] Updated weights for policy 0, policy_version 97220 (0.0008) [2023-10-10 08:44:01,559][53252] Updated weights for policy 0, policy_version 97230 (0.0009) [2023-10-10 08:44:01,783][52050] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199032832. Throughput: 0: 1681.9, 1: 1693.5. 
Samples: 49765028. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:01,784][52050] Avg episode reward: [(0, '21.100'), (1, '19.920')] [2023-10-10 08:44:01,928][53252] Updated weights for policy 0, policy_version 97240 (0.0008) [2023-10-10 08:44:04,322][53268] Updated weights for policy 1, policy_version 97160 (0.0009) [2023-10-10 08:44:04,692][53268] Updated weights for policy 1, policy_version 97170 (0.0010) [2023-10-10 08:44:05,062][53268] Updated weights for policy 1, policy_version 97180 (0.0008) [2023-10-10 08:44:06,013][53252] Updated weights for policy 0, policy_version 97250 (0.0007) [2023-10-10 08:44:06,386][53252] Updated weights for policy 0, policy_version 97260 (0.0008) [2023-10-10 08:44:06,756][53252] Updated weights for policy 0, policy_version 97270 (0.0008) [2023-10-10 08:44:06,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199098368. Throughput: 0: 1685.6, 1: 1668.5. Samples: 49784694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:06,784][52050] Avg episode reward: [(0, '22.870'), (1, '21.380')] [2023-10-10 08:44:07,126][53252] Updated weights for policy 0, policy_version 97280 (0.0009) [2023-10-10 08:44:09,048][53268] Updated weights for policy 1, policy_version 97190 (0.0009) [2023-10-10 08:44:09,415][53268] Updated weights for policy 1, policy_version 97200 (0.0008) [2023-10-10 08:44:09,782][53268] Updated weights for policy 1, policy_version 97210 (0.0009) [2023-10-10 08:44:11,364][53252] Updated weights for policy 0, policy_version 97290 (0.0007) [2023-10-10 08:44:11,738][53252] Updated weights for policy 0, policy_version 97300 (0.0007) [2023-10-10 08:44:11,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199163904. Throughput: 0: 1672.5, 1: 1686.5. Samples: 49804802. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:11,784][52050] Avg episode reward: [(0, '21.850'), (1, '20.110')] [2023-10-10 08:44:12,112][53252] Updated weights for policy 0, policy_version 97310 (0.0009) [2023-10-10 08:44:13,867][53268] Updated weights for policy 1, policy_version 97220 (0.0007) [2023-10-10 08:44:14,230][53268] Updated weights for policy 1, policy_version 97230 (0.0009) [2023-10-10 08:44:14,588][53268] Updated weights for policy 1, policy_version 97240 (0.0008) [2023-10-10 08:44:16,098][53252] Updated weights for policy 0, policy_version 97320 (0.0009) [2023-10-10 08:44:16,473][53252] Updated weights for policy 0, policy_version 97330 (0.0008) [2023-10-10 08:44:16,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199229440. Throughput: 0: 1681.6, 1: 1681.3. Samples: 49815224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:16,784][52050] Avg episode reward: [(0, '21.300'), (1, '20.220')] [2023-10-10 08:44:16,837][53252] Updated weights for policy 0, policy_version 97340 (0.0008) [2023-10-10 08:44:18,722][53268] Updated weights for policy 1, policy_version 97250 (0.0008) [2023-10-10 08:44:19,088][53268] Updated weights for policy 1, policy_version 97260 (0.0008) [2023-10-10 08:44:19,457][53268] Updated weights for policy 1, policy_version 97270 (0.0009) [2023-10-10 08:44:19,824][53268] Updated weights for policy 1, policy_version 97280 (0.0009) [2023-10-10 08:44:21,022][53252] Updated weights for policy 0, policy_version 97350 (0.0007) [2023-10-10 08:44:21,394][53252] Updated weights for policy 0, policy_version 97360 (0.0009) [2023-10-10 08:44:21,758][53252] Updated weights for policy 0, policy_version 97370 (0.0010) [2023-10-10 08:44:21,783][52050] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199294976. Throughput: 0: 1677.3, 1: 1667.3. Samples: 49834922. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:21,784][52050] Avg episode reward: [(0, '20.310'), (1, '22.660')] [2023-10-10 08:44:23,808][53268] Updated weights for policy 1, policy_version 97290 (0.0009) [2023-10-10 08:44:24,183][53268] Updated weights for policy 1, policy_version 97300 (0.0009) [2023-10-10 08:44:24,539][53268] Updated weights for policy 1, policy_version 97310 (0.0009) [2023-10-10 08:44:25,917][53252] Updated weights for policy 0, policy_version 97380 (0.0009) [2023-10-10 08:44:26,289][53252] Updated weights for policy 0, policy_version 97390 (0.0010) [2023-10-10 08:44:26,661][53252] Updated weights for policy 0, policy_version 97400 (0.0009) [2023-10-10 08:44:26,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199360512. Throughput: 0: 1655.0, 1: 1694.1. Samples: 49854894. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:26,784][52050] Avg episode reward: [(0, '22.760'), (1, '22.340')] [2023-10-10 08:44:28,483][53268] Updated weights for policy 1, policy_version 97320 (0.0011) [2023-10-10 08:44:28,852][53268] Updated weights for policy 1, policy_version 97330 (0.0009) [2023-10-10 08:44:29,210][53268] Updated weights for policy 1, policy_version 97340 (0.0007) [2023-10-10 08:44:30,724][53252] Updated weights for policy 0, policy_version 97410 (0.0008) [2023-10-10 08:44:31,097][53252] Updated weights for policy 0, policy_version 97420 (0.0008) [2023-10-10 08:44:31,477][53252] Updated weights for policy 0, policy_version 97430 (0.0007) [2023-10-10 08:44:31,783][52050] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199426048. Throughput: 0: 1671.5, 1: 1670.8. Samples: 49864902. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:31,784][52050] Avg episode reward: [(0, '22.120'), (1, '22.590')] [2023-10-10 08:44:31,843][53252] Updated weights for policy 0, policy_version 97440 (0.0010) [2023-10-10 08:44:33,558][53268] Updated weights for policy 1, policy_version 97350 (0.0009) [2023-10-10 08:44:33,966][53268] Updated weights for policy 1, policy_version 97360 (0.0010) [2023-10-10 08:44:34,326][53268] Updated weights for policy 1, policy_version 97370 (0.0009) [2023-10-10 08:44:36,150][53252] Updated weights for policy 0, policy_version 97450 (0.0009) [2023-10-10 08:44:36,517][53252] Updated weights for policy 0, policy_version 97460 (0.0012) [2023-10-10 08:44:36,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199491584. Throughput: 0: 1673.6, 1: 1674.6. Samples: 49884962. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:36,784][52050] Avg episode reward: [(0, '23.160'), (1, '23.310')] [2023-10-10 08:44:36,888][53252] Updated weights for policy 0, policy_version 97470 (0.0008) [2023-10-10 08:44:38,224][53268] Updated weights for policy 1, policy_version 97380 (0.0010) [2023-10-10 08:44:38,592][53268] Updated weights for policy 1, policy_version 97390 (0.0009) [2023-10-10 08:44:38,974][53268] Updated weights for policy 1, policy_version 97400 (0.0009) [2023-10-10 08:44:40,999][53252] Updated weights for policy 0, policy_version 97480 (0.0008) [2023-10-10 08:44:41,371][53252] Updated weights for policy 0, policy_version 97490 (0.0007) [2023-10-10 08:44:41,735][53252] Updated weights for policy 0, policy_version 97500 (0.0007) [2023-10-10 08:44:41,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199557120. Throughput: 0: 1651.6, 1: 1681.6. Samples: 49904496. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:41,784][52050] Avg episode reward: [(0, '23.850'), (1, '20.520')] [2023-10-10 08:44:41,792][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000097408_99745792.pth... [2023-10-10 08:44:41,822][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000095840_98140160.pth [2023-10-10 08:44:41,880][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000097504_99844096.pth... [2023-10-10 08:44:41,916][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000095936_98238464.pth [2023-10-10 08:44:43,171][53268] Updated weights for policy 1, policy_version 97410 (0.0008) [2023-10-10 08:44:43,543][53268] Updated weights for policy 1, policy_version 97420 (0.0010) [2023-10-10 08:44:43,906][53268] Updated weights for policy 1, policy_version 97430 (0.0010) [2023-10-10 08:44:44,272][53268] Updated weights for policy 1, policy_version 97440 (0.0009) [2023-10-10 08:44:45,782][53252] Updated weights for policy 0, policy_version 97510 (0.0009) [2023-10-10 08:44:46,144][53252] Updated weights for policy 0, policy_version 97520 (0.0010) [2023-10-10 08:44:46,508][53252] Updated weights for policy 0, policy_version 97530 (0.0010) [2023-10-10 08:44:46,783][52050] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199655424. Throughput: 0: 1664.8, 1: 1659.1. Samples: 49914604. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:46,785][52050] Avg episode reward: [(0, '25.830'), (1, '21.280')] [2023-10-10 08:44:48,422][53268] Updated weights for policy 1, policy_version 97450 (0.0011) [2023-10-10 08:44:48,790][53268] Updated weights for policy 1, policy_version 97460 (0.0008) [2023-10-10 08:44:49,149][53268] Updated weights for policy 1, policy_version 97470 (0.0007) [2023-10-10 08:44:50,593][53252] Updated weights for policy 0, policy_version 97540 (0.0007) [2023-10-10 08:44:50,959][53252] Updated weights for policy 0, policy_version 97550 (0.0009) [2023-10-10 08:44:51,343][53252] Updated weights for policy 0, policy_version 97560 (0.0009) [2023-10-10 08:44:51,783][52050] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199720960. Throughput: 0: 1664.1, 1: 1674.6. Samples: 49934936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:51,784][52050] Avg episode reward: [(0, '22.760'), (1, '22.500')] [2023-10-10 08:44:53,130][53268] Updated weights for policy 1, policy_version 97480 (0.0009) [2023-10-10 08:44:53,505][53268] Updated weights for policy 1, policy_version 97490 (0.0009) [2023-10-10 08:44:53,883][53268] Updated weights for policy 1, policy_version 97500 (0.0007) [2023-10-10 08:44:55,385][53252] Updated weights for policy 0, policy_version 97570 (0.0011) [2023-10-10 08:44:55,757][53252] Updated weights for policy 0, policy_version 97580 (0.0009) [2023-10-10 08:44:56,119][53252] Updated weights for policy 0, policy_version 97590 (0.0010) [2023-10-10 08:44:56,491][53252] Updated weights for policy 0, policy_version 97600 (0.0009) [2023-10-10 08:44:56,783][52050] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 199786496. Throughput: 0: 1654.4, 1: 1682.1. Samples: 49954942. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:44:56,784][52050] Avg episode reward: [(0, '23.740'), (1, '21.610')] [2023-10-10 08:44:57,878][53268] Updated weights for policy 1, policy_version 97510 (0.0007) [2023-10-10 08:44:58,257][53268] Updated weights for policy 1, policy_version 97520 (0.0008) [2023-10-10 08:44:58,620][53268] Updated weights for policy 1, policy_version 97530 (0.0009) [2023-10-10 08:45:00,567][53252] Updated weights for policy 0, policy_version 97610 (0.0008) [2023-10-10 08:45:00,951][53252] Updated weights for policy 0, policy_version 97620 (0.0008) [2023-10-10 08:45:01,327][53252] Updated weights for policy 0, policy_version 97630 (0.0009) [2023-10-10 08:45:01,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 199852032. Throughput: 0: 1668.6, 1: 1661.0. Samples: 49965054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:45:01,785][52050] Avg episode reward: [(0, '24.910'), (1, '21.540')] [2023-10-10 08:45:02,894][53268] Updated weights for policy 1, policy_version 97540 (0.0009) [2023-10-10 08:45:03,260][53268] Updated weights for policy 1, policy_version 97550 (0.0008) [2023-10-10 08:45:03,635][53268] Updated weights for policy 1, policy_version 97560 (0.0008) [2023-10-10 08:45:05,191][53252] Updated weights for policy 0, policy_version 97640 (0.0007) [2023-10-10 08:45:05,558][53252] Updated weights for policy 0, policy_version 97650 (0.0007) [2023-10-10 08:45:05,933][53252] Updated weights for policy 0, policy_version 97660 (0.0008) [2023-10-10 08:45:06,783][52050] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199917568. Throughput: 0: 1662.5, 1: 1674.5. Samples: 49985086. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-10 08:45:06,784][52050] Avg episode reward: [(0, '24.210'), (1, '22.230')] [2023-10-10 08:45:07,669][53268] Updated weights for policy 1, policy_version 97570 (0.0011) [2023-10-10 08:45:08,038][53268] Updated weights for policy 1, policy_version 97580 (0.0009) [2023-10-10 08:45:08,408][53268] Updated weights for policy 1, policy_version 97590 (0.0008) [2023-10-10 08:45:08,778][53268] Updated weights for policy 1, policy_version 97600 (0.0009) [2023-10-10 08:45:09,963][53252] Updated weights for policy 0, policy_version 97670 (0.0009) [2023-10-10 08:45:10,334][53252] Updated weights for policy 0, policy_version 97680 (0.0009) [2023-10-10 08:45:10,707][53252] Updated weights for policy 0, policy_version 97690 (0.0010) [2023-10-10 08:45:11,783][52050] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199983104. Throughput: 0: 1664.3, 1: 1674.7. Samples: 50005150. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:45:11,785][52050] Avg episode reward: [(0, '22.710'), (1, '20.130')] [2023-10-10 08:45:12,927][53268] Updated weights for policy 1, policy_version 97610 (0.0009) [2023-10-10 08:45:13,297][53268] Updated weights for policy 1, policy_version 97620 (0.0010) [2023-10-10 08:45:13,653][53268] Updated weights for policy 1, policy_version 97630 (0.0009) [2023-10-10 08:45:14,723][53252] Updated weights for policy 0, policy_version 97700 (0.0008) [2023-10-10 08:45:15,103][53252] Updated weights for policy 0, policy_version 97710 (0.0007) [2023-10-10 08:45:15,470][53252] Updated weights for policy 0, policy_version 97720 (0.0007) [2023-10-10 08:45:16,783][52050] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 200048640. Throughput: 0: 1683.8, 1: 1667.3. Samples: 50015700. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-10 08:45:16,784][52050] Avg episode reward: [(0, '23.320'), (1, '20.920')] [2023-10-10 08:45:17,713][53268] Updated weights for policy 1, policy_version 97640 (0.0009) [2023-10-10 08:45:18,087][53268] Updated weights for policy 1, policy_version 97650 (0.0009) [2023-10-10 08:45:18,446][53268] Updated weights for policy 1, policy_version 97660 (0.0008) [2023-10-10 08:45:19,520][53252] Updated weights for policy 0, policy_version 97730 (0.0008) [2023-10-10 08:45:19,890][53252] Updated weights for policy 0, policy_version 97740 (0.0009) [2023-10-10 08:45:20,272][53252] Updated weights for policy 0, policy_version 97750 (0.0009) [2023-10-10 08:45:20,640][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000097664_100007936.pth... [2023-10-10 08:45:20,641][53299] Stopping RolloutWorker_w12... [2023-10-10 08:45:20,641][53298] Stopping RolloutWorker_w11... [2023-10-10 08:45:20,641][53289] Stopping RolloutWorker_w3... [2023-10-10 08:45:20,641][53299] Loop rollout_proc12_evt_loop terminating... [2023-10-10 08:45:20,641][53294] Stopping RolloutWorker_w9... [2023-10-10 08:45:20,640][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000097760_100106240.pth... [2023-10-10 08:45:20,641][53293] Stopping RolloutWorker_w7... [2023-10-10 08:45:20,641][53298] Loop rollout_proc11_evt_loop terminating... [2023-10-10 08:45:20,641][53289] Loop rollout_proc3_evt_loop terminating... [2023-10-10 08:45:20,641][53291] Stopping RolloutWorker_w5... [2023-10-10 08:45:20,641][53295] Stopping RolloutWorker_w10... [2023-10-10 08:45:20,641][53294] Loop rollout_proc9_evt_loop terminating... [2023-10-10 08:45:20,641][54020] Stopping RolloutWorker_w15... [2023-10-10 08:45:20,641][53293] Loop rollout_proc7_evt_loop terminating... [2023-10-10 08:45:20,641][53291] Loop rollout_proc5_evt_loop terminating... [2023-10-10 08:45:20,641][54020] Loop rollout_proc15_evt_loop terminating... 
[2023-10-10 08:45:20,641][52050] Component RolloutWorker_w12 stopped! [2023-10-10 08:45:20,642][53295] Loop rollout_proc10_evt_loop terminating... [2023-10-10 08:45:20,642][52050] Component RolloutWorker_w11 stopped! [2023-10-10 08:45:20,642][52050] Component RolloutWorker_w10 stopped! [2023-10-10 08:45:20,643][52050] Component RolloutWorker_w3 stopped! [2023-10-10 08:45:20,643][53287] Stopping RolloutWorker_w2... [2023-10-10 08:45:20,643][52050] Component RolloutWorker_w9 stopped! [2023-10-10 08:45:20,644][53287] Loop rollout_proc2_evt_loop terminating... [2023-10-10 08:45:20,644][53954] Stopping RolloutWorker_w14... [2023-10-10 08:45:20,644][52050] Component RolloutWorker_w7 stopped! [2023-10-10 08:45:20,644][53288] Stopping RolloutWorker_w1... [2023-10-10 08:45:20,644][53954] Loop rollout_proc14_evt_loop terminating... [2023-10-10 08:45:20,644][52050] Component Batcher_0 stopped! [2023-10-10 08:45:20,645][53288] Loop rollout_proc1_evt_loop terminating... [2023-10-10 08:45:20,645][52050] Component RolloutWorker_w5 stopped! [2023-10-10 08:45:20,645][53252] Updated weights for policy 0, policy_version 97760 (0.0009) [2023-10-10 08:45:20,645][52050] Component RolloutWorker_w15 stopped! [2023-10-10 08:45:20,645][53285] Stopping RolloutWorker_w0... [2023-10-10 08:45:20,645][53292] Stopping RolloutWorker_w6... [2023-10-10 08:45:20,645][53290] Stopping RolloutWorker_w4... [2023-10-10 08:45:20,646][52050] Component RolloutWorker_w2 stopped! [2023-10-10 08:45:20,646][53285] Loop rollout_proc0_evt_loop terminating... [2023-10-10 08:45:20,646][53292] Loop rollout_proc6_evt_loop terminating... [2023-10-10 08:45:20,646][53290] Loop rollout_proc4_evt_loop terminating... [2023-10-10 08:45:20,646][52050] Component RolloutWorker_w14 stopped! [2023-10-10 08:45:20,646][53300] Stopping RolloutWorker_w13... [2023-10-10 08:45:20,641][52846] Stopping Batcher_0... [2023-10-10 08:45:20,646][52050] Component RolloutWorker_w1 stopped! 
[2023-10-10 08:45:20,647][53300] Loop rollout_proc13_evt_loop terminating... [2023-10-10 08:45:20,646][53061] Stopping Batcher_1... [2023-10-10 08:45:20,647][53296] Stopping RolloutWorker_w8... [2023-10-10 08:45:20,647][52050] Component RolloutWorker_w0 stopped! [2023-10-10 08:45:20,647][53296] Loop rollout_proc8_evt_loop terminating... [2023-10-10 08:45:20,647][52050] Component RolloutWorker_w6 stopped! [2023-10-10 08:45:20,647][52050] Component RolloutWorker_w4 stopped! [2023-10-10 08:45:20,648][52050] Component RolloutWorker_w13 stopped! [2023-10-10 08:45:20,648][52050] Component Batcher_1 stopped! [2023-10-10 08:45:20,649][52050] Component RolloutWorker_w8 stopped! [2023-10-10 08:45:20,669][53268] Weights refcount: 2 0 [2023-10-10 08:45:20,670][53268] Stopping InferenceWorker_p1-w0... [2023-10-10 08:45:20,671][53268] Loop inference_proc1-0_evt_loop terminating... [2023-10-10 08:45:20,671][52050] Component InferenceWorker_p1-w0 stopped! [2023-10-10 08:45:20,677][53252] Weights refcount: 2 0 [2023-10-10 08:45:20,663][52846] Loop batcher_evt_loop terminating... [2023-10-10 08:45:20,664][53061] Loop batcher_evt_loop terminating... [2023-10-10 08:45:20,679][53252] Stopping InferenceWorker_p0-w0... [2023-10-10 08:45:20,679][53252] Loop inference_proc0-0_evt_loop terminating... [2023-10-10 08:45:20,679][52050] Component InferenceWorker_p0-w0 stopped! [2023-10-10 08:45:20,692][53061] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000096640_98959360.pth [2023-10-10 08:45:20,692][52846] Removing ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000096704_99024896.pth [2023-10-10 08:45:20,698][53061] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p1/checkpoint_000097664_100007936.pth... [2023-10-10 08:45:20,698][52846] Saving ./train_atari/atari_choppercommand_APPO/checkpoint_p0/checkpoint_000097760_100106240.pth... [2023-10-10 08:45:20,758][53061] Stopping LearnerWorker_p1... 
[2023-10-10 08:45:20,758][53061] Loop learner_proc1_evt_loop terminating... [2023-10-10 08:45:20,758][52050] Component LearnerWorker_p1 stopped! [2023-10-10 08:45:20,759][52846] Stopping LearnerWorker_p0... [2023-10-10 08:45:20,759][52050] Component LearnerWorker_p0 stopped! [2023-10-10 08:45:20,759][52846] Loop learner_proc0_evt_loop terminating... [2023-10-10 08:45:20,760][52050] Waiting for process learner_proc0 to stop... [2023-10-10 08:45:21,677][52050] Waiting for process learner_proc1 to stop... [2023-10-10 08:45:21,678][52050] Waiting for process inference_proc0-0 to join... [2023-10-10 08:45:21,679][52050] Waiting for process inference_proc1-0 to join... [2023-10-10 08:45:21,679][52050] Waiting for process rollout_proc0 to join... [2023-10-10 08:45:21,680][52050] Waiting for process rollout_proc1 to join... [2023-10-10 08:45:21,680][52050] Waiting for process rollout_proc2 to join... [2023-10-10 08:45:21,681][52050] Waiting for process rollout_proc3 to join... [2023-10-10 08:45:21,682][52050] Waiting for process rollout_proc4 to join... [2023-10-10 08:45:21,682][52050] Waiting for process rollout_proc5 to join... [2023-10-10 08:45:21,683][52050] Waiting for process rollout_proc6 to join... [2023-10-10 08:45:21,684][52050] Waiting for process rollout_proc7 to join... [2023-10-10 08:45:21,684][52050] Waiting for process rollout_proc8 to join... [2023-10-10 08:45:21,685][52050] Waiting for process rollout_proc9 to join... [2023-10-10 08:45:21,686][52050] Waiting for process rollout_proc10 to join... [2023-10-10 08:45:21,686][52050] Waiting for process rollout_proc11 to join... [2023-10-10 08:45:21,687][52050] Waiting for process rollout_proc12 to join... [2023-10-10 08:45:21,688][52050] Waiting for process rollout_proc13 to join... [2023-10-10 08:45:21,688][52050] Waiting for process rollout_proc14 to join... [2023-10-10 08:45:21,689][52050] Waiting for process rollout_proc15 to join... 
[2023-10-10 08:45:21,689][52050] Batcher 0 profile tree view:
batching: 168.8626, releasing_batches: 0.0911
[2023-10-10 08:45:21,689][52050] Batcher 1 profile tree view:
batching: 168.6736, releasing_batches: 0.0891
[2023-10-10 08:45:21,689][52050] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0001
  wait_policy_total: 2653.3521
update_model: 201.4035
  weight_update: 0.0009
one_step: 0.0031
  handle_policy_step: 11360.4197
    deserialize: 64.3578, stack: 195.1830, obs_to_device_normalize: 2528.4828, forward: 5156.3755, prepare_outputs: 2454.3746, send_messages: 465.1368
[2023-10-10 08:45:21,690][52050] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0001
  wait_policy_total: 2616.4841
update_model: 206.8132
  weight_update: 0.0008
one_step: 0.0025
  handle_policy_step: 11375.1715
    deserialize: 64.8472, stack: 192.2697, obs_to_device_normalize: 2549.0558, forward: 5131.5520, prepare_outputs: 2463.2395, send_messages: 460.9053
[2023-10-10 08:45:21,690][52050] Learner 0 profile tree view:
misc: 0.0195, prepare_batch: 269.7383
train: 3642.0859
  epoch_init: 0.1854, minibatch_init: 13.3470, losses_postprocess: 892.5024, kl_divergence: 32.5024, update: 391.8727, after_optimizer: 2125.8720
  calculate_losses: 169.3671
    losses_init: 0.4085, forward_head: 57.0394, bptt_initial: 1.4434, bptt: 2.0325, tail: 38.8662, advantages_returns: 11.2920, losses: 44.5334
[2023-10-10 08:45:21,690][52050] Learner 1 profile tree view:
misc: 0.0191, prepare_batch: 269.5004
train: 3604.7525
  epoch_init: 0.1865, minibatch_init: 13.2189, losses_postprocess: 888.1377, kl_divergence: 31.5163, update: 384.0842, after_optimizer: 2103.7754
  calculate_losses: 167.0014
    losses_init: 0.3918, forward_head: 56.1487, bptt_initial: 1.4706, bptt: 1.8137, tail: 38.4002, advantages_returns: 11.2401, losses: 43.8681
[2023-10-10 08:45:21,690][52050] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 1.2529, enqueue_policy_requests: 413.9617, process_policy_outputs: 189.6808, env_step: 7655.0497, finalize_trajectories: 3.6485, complete_rollouts: 2.9872
post_env_step: 380.2122
  process_env_step: 85.0602
[2023-10-10 08:45:21,690][52050] RolloutWorker_w15 profile tree view:
wait_for_trajectories: 1.2521, enqueue_policy_requests: 410.1226, process_policy_outputs: 194.1878, env_step: 7575.2981, finalize_trajectories: 3.5410, complete_rollouts: 2.9936
post_env_step: 387.3809
  process_env_step: 86.1866
[2023-10-10 08:45:21,691][52050] Loop Runner_EvtLoop terminating...
[2023-10-10 08:45:21,691][52050] Runner profile tree view:
main_loop: 14915.3646
[2023-10-10 08:45:21,691][52050] Collected {0: 100106240, 1: 100007936}, FPS: 13416.6