diff --git "a/sf_log.txt" "b/sf_log.txt" new file mode 100644--- /dev/null +++ "b/sf_log.txt" @@ -0,0 +1,2185 @@ +[2023-09-25 20:18:34,219][108279] Saving configuration to ./train_atari/atari_bowling/config.json... +[2023-09-25 20:18:34,554][108279] Rollout worker 0 uses device cpu +[2023-09-25 20:18:34,555][108279] Rollout worker 1 uses device cpu +[2023-09-25 20:18:34,555][108279] Rollout worker 2 uses device cpu +[2023-09-25 20:18:34,556][108279] Rollout worker 3 uses device cpu +[2023-09-25 20:18:34,556][108279] Rollout worker 4 uses device cpu +[2023-09-25 20:18:34,556][108279] Rollout worker 5 uses device cpu +[2023-09-25 20:18:34,557][108279] Rollout worker 6 uses device cpu +[2023-09-25 20:18:34,557][108279] Rollout worker 7 uses device cpu +[2023-09-25 20:18:34,558][108279] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 +[2023-09-25 20:18:34,605][108279] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-09-25 20:18:34,606][108279] InferenceWorker_p0-w0: min num requests: 1 +[2023-09-25 20:18:34,609][108279] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-09-25 20:18:34,609][108279] InferenceWorker_p1-w0: min num requests: 1 +[2023-09-25 20:18:34,631][108279] Starting all processes... 
+[2023-09-25 20:18:34,632][108279] Starting process learner_proc0 +[2023-09-25 20:18:36,225][108279] Starting process learner_proc1 +[2023-09-25 20:18:36,230][108926] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-09-25 20:18:36,230][108926] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2023-09-25 20:18:36,248][108926] Num visible devices: 1 +[2023-09-25 20:18:36,267][108926] Starting seed is not provided +[2023-09-25 20:18:36,267][108926] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-09-25 20:18:36,267][108926] Initializing actor-critic model on device cuda:0 +[2023-09-25 20:18:36,267][108926] RunningMeanStd input shape: (4, 84, 84) +[2023-09-25 20:18:36,268][108926] RunningMeanStd input shape: (1,) +[2023-09-25 20:18:36,280][108926] ConvEncoder: input_channels=4 +[2023-09-25 20:18:36,441][108926] Conv encoder output size: 512 +[2023-09-25 20:18:36,443][108926] Created Actor Critic model with architecture: +[2023-09-25 20:18:36,443][108926] ActorCriticSharedWeights( + (obs_normalizer): ObservationNormalizer( + (running_mean_std): RunningMeanStdDictInPlace( + (running_mean_std): ModuleDict( + (obs): RunningMeanStdInPlace() + ) + ) + ) + (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) + (encoder): MultiInputEncoder( + (encoders): ModuleDict( + (obs): ConvEncoder( + (enc): RecursiveScriptModule( + original_name=ConvEncoderImpl + (conv_head): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Conv2d) + (1): RecursiveScriptModule(original_name=ReLU) + (2): RecursiveScriptModule(original_name=Conv2d) + (3): RecursiveScriptModule(original_name=ReLU) + (4): RecursiveScriptModule(original_name=Conv2d) + (5): RecursiveScriptModule(original_name=ReLU) + ) + (mlp_layers): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Linear) + (1): RecursiveScriptModule(original_name=ReLU) + ) 
+ ) + ) + ) + ) + (core): ModelCoreIdentity() + (decoder): MlpDecoder( + (mlp): Identity() + ) + (critic_linear): Linear(in_features=512, out_features=1, bias=True) + (action_parameterization): ActionParameterizationDefault( + (distribution_linear): Linear(in_features=512, out_features=6, bias=True) + ) +) +[2023-09-25 20:18:37,021][108926] Using optimizer +[2023-09-25 20:18:37,021][108926] No checkpoints found +[2023-09-25 20:18:37,021][108926] Did not load from checkpoint, starting from scratch! +[2023-09-25 20:18:37,022][108926] Initialized policy 0 weights for model version 0 +[2023-09-25 20:18:37,023][108926] LearnerWorker_p0 finished initialization! +[2023-09-25 20:18:37,024][108926] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-09-25 20:18:37,819][108279] Starting all processes... +[2023-09-25 20:18:37,823][109025] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-09-25 20:18:37,823][109025] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 +[2023-09-25 20:18:37,826][108279] Starting process inference_proc0-0 +[2023-09-25 20:18:37,826][108279] Starting process inference_proc1-0 +[2023-09-25 20:18:37,827][108279] Starting process rollout_proc0 +[2023-09-25 20:18:37,827][108279] Starting process rollout_proc1 +[2023-09-25 20:18:37,841][109025] Num visible devices: 1 +[2023-09-25 20:18:37,827][108279] Starting process rollout_proc2 +[2023-09-25 20:18:37,828][108279] Starting process rollout_proc3 +[2023-09-25 20:18:37,866][109025] Starting seed is not provided +[2023-09-25 20:18:37,866][109025] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-09-25 20:18:37,867][109025] Initializing actor-critic model on device cuda:0 +[2023-09-25 20:18:37,867][109025] RunningMeanStd input shape: (4, 84, 84) +[2023-09-25 20:18:37,868][109025] RunningMeanStd input shape: (1,) +[2023-09-25 20:18:37,835][108279] Starting process rollout_proc4 +[2023-09-25 20:18:37,835][108279] Starting process 
rollout_proc5 +[2023-09-25 20:18:37,837][108279] Starting process rollout_proc6 +[2023-09-25 20:18:37,838][108279] Starting process rollout_proc7 +[2023-09-25 20:18:37,881][109025] ConvEncoder: input_channels=4 +[2023-09-25 20:18:38,232][109025] Conv encoder output size: 512 +[2023-09-25 20:18:38,234][109025] Created Actor Critic model with architecture: +[2023-09-25 20:18:38,234][109025] ActorCriticSharedWeights( + (obs_normalizer): ObservationNormalizer( + (running_mean_std): RunningMeanStdDictInPlace( + (running_mean_std): ModuleDict( + (obs): RunningMeanStdInPlace() + ) + ) + ) + (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) + (encoder): MultiInputEncoder( + (encoders): ModuleDict( + (obs): ConvEncoder( + (enc): RecursiveScriptModule( + original_name=ConvEncoderImpl + (conv_head): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Conv2d) + (1): RecursiveScriptModule(original_name=ReLU) + (2): RecursiveScriptModule(original_name=Conv2d) + (3): RecursiveScriptModule(original_name=ReLU) + (4): RecursiveScriptModule(original_name=Conv2d) + (5): RecursiveScriptModule(original_name=ReLU) + ) + (mlp_layers): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Linear) + (1): RecursiveScriptModule(original_name=ReLU) + ) + ) + ) + ) + ) + (core): ModelCoreIdentity() + (decoder): MlpDecoder( + (mlp): Identity() + ) + (critic_linear): Linear(in_features=512, out_features=1, bias=True) + (action_parameterization): ActionParameterizationDefault( + (distribution_linear): Linear(in_features=512, out_features=6, bias=True) + ) +) +[2023-09-25 20:18:38,816][109025] Using optimizer +[2023-09-25 20:18:38,817][109025] No checkpoints found +[2023-09-25 20:18:38,817][109025] Did not load from checkpoint, starting from scratch! 
+[2023-09-25 20:18:38,817][109025] Initialized policy 1 weights for model version 0 +[2023-09-25 20:18:38,819][109025] LearnerWorker_p1 finished initialization! +[2023-09-25 20:18:38,819][109025] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-09-25 20:18:39,798][109261] Worker 3 uses CPU cores [12, 13, 14, 15] +[2023-09-25 20:18:39,816][109259] Worker 1 uses CPU cores [4, 5, 6, 7] +[2023-09-25 20:18:39,820][109225] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-09-25 20:18:39,821][109225] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2023-09-25 20:18:39,830][109264] Worker 5 uses CPU cores [20, 21, 22, 23] +[2023-09-25 20:18:39,839][109225] Num visible devices: 1 +[2023-09-25 20:18:39,906][109262] Worker 4 uses CPU cores [16, 17, 18, 19] +[2023-09-25 20:18:39,906][109265] Worker 6 uses CPU cores [24, 25, 26, 27] +[2023-09-25 20:18:39,941][109227] Worker 0 uses CPU cores [0, 1, 2, 3] +[2023-09-25 20:18:39,952][109224] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-09-25 20:18:39,952][109224] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 +[2023-09-25 20:18:39,971][109224] Num visible devices: 1 +[2023-09-25 20:18:39,975][109263] Worker 2 uses CPU cores [8, 9, 10, 11] +[2023-09-25 20:18:40,009][109266] Worker 7 uses CPU cores [28, 29, 30, 31] +[2023-09-25 20:18:40,432][109225] RunningMeanStd input shape: (4, 84, 84) +[2023-09-25 20:18:40,433][109225] RunningMeanStd input shape: (1,) +[2023-09-25 20:18:40,443][109225] ConvEncoder: input_channels=4 +[2023-09-25 20:18:40,470][108279] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-09-25 20:18:40,542][109225] Conv encoder output size: 512 +[2023-09-25 20:18:40,546][109224] RunningMeanStd input shape: (4, 84, 84) +[2023-09-25 20:18:40,546][109224] RunningMeanStd input shape: (1,) +[2023-09-25 20:18:40,547][108279] Inference worker 0-0 is ready! +[2023-09-25 20:18:40,557][109224] ConvEncoder: input_channels=4 +[2023-09-25 20:18:40,655][109224] Conv encoder output size: 512 +[2023-09-25 20:18:40,660][108279] Inference worker 1-0 is ready! +[2023-09-25 20:18:40,661][108279] All inference workers are ready! Signal rollout workers to start! +[2023-09-25 20:18:41,106][109264] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,108][109262] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,111][109227] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,111][109261] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,120][109266] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,155][109259] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,189][109265] Decorrelating experience for 0 frames... +[2023-09-25 20:18:41,196][109263] Decorrelating experience for 0 frames... +[2023-09-25 20:18:45,470][108279] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:18:50,470][108279] Fps is (10 sec: 3276.9, 60 sec: 3276.9, 300 sec: 3276.9). Total num frames: 32768. Throughput: 0: 409.6, 1: 409.6. Samples: 8192. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:18:54,593][108279] Heartbeat connected on Batcher_0 +[2023-09-25 20:18:54,599][108279] Heartbeat connected on Batcher_1 +[2023-09-25 20:18:54,613][108279] Heartbeat connected on RolloutWorker_w0 +[2023-09-25 20:18:54,615][108279] Heartbeat connected on RolloutWorker_w1 +[2023-09-25 20:18:54,617][108279] Heartbeat connected on RolloutWorker_w2 +[2023-09-25 20:18:54,620][108279] Heartbeat connected on RolloutWorker_w3 +[2023-09-25 20:18:54,623][108279] Heartbeat connected on RolloutWorker_w4 +[2023-09-25 20:18:54,626][108279] Heartbeat connected on RolloutWorker_w5 +[2023-09-25 20:18:54,628][108279] Heartbeat connected on RolloutWorker_w6 +[2023-09-25 20:18:54,631][108279] Heartbeat connected on RolloutWorker_w7 +[2023-09-25 20:18:54,645][108279] Heartbeat connected on InferenceWorker_p1-w0 +[2023-09-25 20:18:54,653][108279] Heartbeat connected on InferenceWorker_p0-w0 +[2023-09-25 20:18:54,660][108279] Heartbeat connected on LearnerWorker_p0 +[2023-09-25 20:18:54,701][108279] Heartbeat connected on LearnerWorker_p1 +[2023-09-25 20:18:55,470][108279] Fps is (10 sec: 5734.5, 60 sec: 4369.1, 300 sec: 4369.1). Total num frames: 65536. Throughput: 0: 431.0, 1: 439.1. Samples: 13051. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:18:55,470][108279] Avg episode reward: [(0, '7.500'), (1, '6.250')] +[2023-09-25 20:18:57,192][109225] Updated weights for policy 0, policy_version 160 (0.0015) +[2023-09-25 20:18:57,192][109224] Updated weights for policy 1, policy_version 160 (0.0019) +[2023-09-25 20:19:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 98304. Throughput: 0: 569.4, 1: 575.3. Samples: 22893. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:00,470][108279] Avg episode reward: [(0, '7.500'), (1, '6.250')] +[2023-09-25 20:19:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 5242.9, 300 sec: 5242.9). Total num frames: 131072. 
Throughput: 0: 655.5, 1: 658.7. Samples: 32856. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:05,471][108279] Avg episode reward: [(0, '7.375'), (1, '6.500')] +[2023-09-25 20:19:09,581][109225] Updated weights for policy 0, policy_version 320 (0.0015) +[2023-09-25 20:19:09,582][109224] Updated weights for policy 1, policy_version 320 (0.0016) +[2023-09-25 20:19:10,470][108279] Fps is (10 sec: 6553.4, 60 sec: 5461.3, 300 sec: 5461.3). Total num frames: 163840. Throughput: 0: 629.0, 1: 632.9. Samples: 37858. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:10,471][108279] Avg episode reward: [(0, '7.375'), (1, '6.500')] +[2023-09-25 20:19:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 5617.4, 300 sec: 5617.4). Total num frames: 196608. Throughput: 0: 679.5, 1: 682.3. Samples: 47663. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:15,471][108279] Avg episode reward: [(0, '7.917'), (1, '7.000')] +[2023-09-25 20:19:20,470][108279] Fps is (10 sec: 6553.8, 60 sec: 5734.4, 300 sec: 5734.4). Total num frames: 229376. Throughput: 0: 716.8, 1: 718.3. Samples: 57404. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:19:20,470][108279] Avg episode reward: [(0, '7.917'), (1, '7.000')] +[2023-09-25 20:19:20,471][108926] Saving new best policy, reward=7.917! +[2023-09-25 20:19:20,471][109025] Saving new best policy, reward=7.000! +[2023-09-25 20:19:22,072][109224] Updated weights for policy 1, policy_version 480 (0.0017) +[2023-09-25 20:19:22,073][109225] Updated weights for policy 0, policy_version 480 (0.0016) +[2023-09-25 20:19:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 5825.5, 300 sec: 5825.5). Total num frames: 262144. Throughput: 0: 693.3, 1: 695.7. Samples: 62505. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:25,470][108279] Avg episode reward: [(0, '7.867'), (1, '7.400')] +[2023-09-25 20:19:25,471][109025] Saving new best policy, reward=7.400! 
+[2023-09-25 20:19:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 5898.2, 300 sec: 5898.2). Total num frames: 294912. Throughput: 0: 778.2, 1: 780.4. Samples: 72188. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:19:30,471][108279] Avg episode reward: [(0, '7.938'), (1, '7.562')] +[2023-09-25 20:19:30,475][108926] Saving new best policy, reward=7.938! +[2023-09-25 20:19:30,475][109025] Saving new best policy, reward=7.562! +[2023-09-25 20:19:34,655][109225] Updated weights for policy 0, policy_version 640 (0.0014) +[2023-09-25 20:19:34,659][109224] Updated weights for policy 1, policy_version 640 (0.0018) +[2023-09-25 20:19:35,470][108279] Fps is (10 sec: 6553.3, 60 sec: 5957.8, 300 sec: 5957.8). Total num frames: 327680. Throughput: 0: 819.2, 1: 819.3. Samples: 81925. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:35,471][108279] Avg episode reward: [(0, '7.938'), (1, '7.562')] +[2023-09-25 20:19:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6007.5). Total num frames: 360448. Throughput: 0: 819.6, 1: 819.5. Samples: 86810. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:40,471][108279] Avg episode reward: [(0, '8.350'), (1, '8.000')] +[2023-09-25 20:19:40,472][108926] Saving new best policy, reward=8.350! +[2023-09-25 20:19:40,472][109025] Saving new best policy, reward=8.000! +[2023-09-25 20:19:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6049.5). Total num frames: 393216. Throughput: 0: 817.6, 1: 817.6. Samples: 96478. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:45,471][108279] Avg episode reward: [(0, '8.350'), (1, '8.000')] +[2023-09-25 20:19:47,239][109224] Updated weights for policy 1, policy_version 800 (0.0016) +[2023-09-25 20:19:47,239][109225] Updated weights for policy 0, policy_version 800 (0.0017) +[2023-09-25 20:19:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6085.5). Total num frames: 425984. 
Throughput: 0: 819.1, 1: 817.4. Samples: 106497. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:19:50,470][108279] Avg episode reward: [(0, '8.625'), (1, '8.250')] +[2023-09-25 20:19:50,471][108926] Saving new best policy, reward=8.625! +[2023-09-25 20:19:50,471][109025] Saving new best policy, reward=8.250! +[2023-09-25 20:19:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6116.7). Total num frames: 458752. Throughput: 0: 817.4, 1: 817.5. Samples: 111426. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:19:55,471][108279] Avg episode reward: [(0, '8.625'), (1, '8.250')] +[2023-09-25 20:20:00,051][109225] Updated weights for policy 0, policy_version 960 (0.0018) +[2023-09-25 20:20:00,051][109224] Updated weights for policy 1, policy_version 960 (0.0017) +[2023-09-25 20:20:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6144.0). Total num frames: 491520. Throughput: 0: 814.1, 1: 811.9. Samples: 120832. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:00,471][108279] Avg episode reward: [(0, '8.786'), (1, '8.464')] +[2023-09-25 20:20:00,475][108926] Saving new best policy, reward=8.786! +[2023-09-25 20:20:00,475][109025] Saving new best policy, reward=8.464! +[2023-09-25 20:20:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6168.1). Total num frames: 524288. Throughput: 0: 811.7, 1: 812.9. Samples: 130513. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:05,470][108279] Avg episode reward: [(0, '8.786'), (1, '8.464')] +[2023-09-25 20:20:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6189.5). Total num frames: 557056. Throughput: 0: 808.6, 1: 806.6. Samples: 135187. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-25 20:20:10,470][108279] Avg episode reward: [(0, '8.875'), (1, '8.656')] +[2023-09-25 20:20:10,471][109025] Saving new best policy, reward=8.656! 
+[2023-09-25 20:20:10,471][108926] Saving new best policy, reward=8.875! +[2023-09-25 20:20:12,698][109225] Updated weights for policy 0, policy_version 1120 (0.0018) +[2023-09-25 20:20:12,698][109224] Updated weights for policy 1, policy_version 1120 (0.0015) +[2023-09-25 20:20:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6208.7). Total num frames: 589824. Throughput: 0: 810.9, 1: 810.9. Samples: 145169. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:15,471][108279] Avg episode reward: [(0, '8.875'), (1, '8.656')] +[2023-09-25 20:20:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6225.9). Total num frames: 622592. Throughput: 0: 808.1, 1: 810.1. Samples: 154743. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:20,470][108279] Avg episode reward: [(0, '9.000'), (1, '8.806')] +[2023-09-25 20:20:20,471][108926] Saving new best policy, reward=9.000! +[2023-09-25 20:20:20,471][109025] Saving new best policy, reward=8.806! +[2023-09-25 20:20:25,316][109224] Updated weights for policy 1, policy_version 1280 (0.0018) +[2023-09-25 20:20:25,316][109225] Updated weights for policy 0, policy_version 1280 (0.0017) +[2023-09-25 20:20:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6241.5). Total num frames: 655360. Throughput: 0: 811.7, 1: 809.0. Samples: 159744. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:25,470][108279] Avg episode reward: [(0, '9.000'), (1, '8.806')] +[2023-09-25 20:20:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6255.7). Total num frames: 688128. Throughput: 0: 810.8, 1: 810.3. Samples: 169427. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:20:30,470][108279] Avg episode reward: [(0, '9.100'), (1, '8.925')] +[2023-09-25 20:20:30,474][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000001344_344064.pth... 
+[2023-09-25 20:20:30,474][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000001344_344064.pth... +[2023-09-25 20:20:30,510][109025] Saving new best policy, reward=8.925! +[2023-09-25 20:20:30,512][108926] Saving new best policy, reward=9.100! +[2023-09-25 20:20:35,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6268.7). Total num frames: 720896. Throughput: 0: 806.3, 1: 808.6. Samples: 179167. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:35,471][108279] Avg episode reward: [(0, '9.100'), (1, '8.925')] +[2023-09-25 20:20:37,822][109224] Updated weights for policy 1, policy_version 1440 (0.0019) +[2023-09-25 20:20:37,822][109225] Updated weights for policy 0, policy_version 1440 (0.0018) +[2023-09-25 20:20:40,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6280.5). Total num frames: 753664. Throughput: 0: 810.4, 1: 808.6. Samples: 184282. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:20:40,471][108279] Avg episode reward: [(0, '9.182'), (1, '9.023')] +[2023-09-25 20:20:40,473][108926] Saving new best policy, reward=9.182! +[2023-09-25 20:20:40,473][109025] Saving new best policy, reward=9.023! +[2023-09-25 20:20:45,470][108279] Fps is (10 sec: 6144.1, 60 sec: 6485.3, 300 sec: 6258.7). Total num frames: 782336. Throughput: 0: 811.7, 1: 812.4. Samples: 193917. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:20:45,471][108279] Avg episode reward: [(0, '9.182'), (1, '9.023')] +[2023-09-25 20:20:50,470][108279] Fps is (10 sec: 5734.6, 60 sec: 6417.1, 300 sec: 6238.5). Total num frames: 811008. Throughput: 0: 810.7, 1: 810.8. Samples: 203482. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:20:50,470][108279] Avg episode reward: [(0, '9.250'), (1, '9.065')] +[2023-09-25 20:20:50,480][109025] Saving new best policy, reward=9.065! +[2023-09-25 20:20:50,490][108926] Saving new best policy, reward=9.250! 
+[2023-09-25 20:20:50,493][109224] Updated weights for policy 1, policy_version 1600 (0.0017) +[2023-09-25 20:20:50,493][109225] Updated weights for policy 0, policy_version 1600 (0.0019) +[2023-09-25 20:20:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6485.3, 300 sec: 6280.5). Total num frames: 847872. Throughput: 0: 814.6, 1: 816.1. Samples: 208571. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:20:55,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.104')] +[2023-09-25 20:20:55,472][109025] Saving new best policy, reward=9.104! +[2023-09-25 20:21:00,470][108279] Fps is (10 sec: 7372.7, 60 sec: 6553.6, 300 sec: 6319.5). Total num frames: 884736. Throughput: 0: 813.8, 1: 813.8. Samples: 218410. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:00,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.104')] +[2023-09-25 20:21:02,949][109224] Updated weights for policy 1, policy_version 1760 (0.0016) +[2023-09-25 20:21:02,951][109225] Updated weights for policy 0, policy_version 1760 (0.0017) +[2023-09-25 20:21:05,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6553.6, 300 sec: 6327.6). Total num frames: 917504. Throughput: 0: 814.2, 1: 815.2. Samples: 228067. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:05,471][108279] Avg episode reward: [(0, '9.308'), (1, '9.135')] +[2023-09-25 20:21:05,472][108926] Saving new best policy, reward=9.308! +[2023-09-25 20:21:05,472][109025] Saving new best policy, reward=9.135! +[2023-09-25 20:21:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6335.2). Total num frames: 950272. Throughput: 0: 815.4, 1: 818.2. Samples: 233258. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:10,470][108279] Avg episode reward: [(0, '9.308'), (1, '9.135')] +[2023-09-25 20:21:15,449][109225] Updated weights for policy 0, policy_version 1920 (0.0017) +[2023-09-25 20:21:15,450][109224] Updated weights for policy 1, policy_version 1920 (0.0016) +[2023-09-25 20:21:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6342.2). Total num frames: 983040. Throughput: 0: 816.1, 1: 816.6. Samples: 242899. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:21:15,471][108279] Avg episode reward: [(0, '9.339'), (1, '9.179')] +[2023-09-25 20:21:15,474][108926] Saving new best policy, reward=9.339! +[2023-09-25 20:21:15,474][109025] Saving new best policy, reward=9.179! +[2023-09-25 20:21:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6348.8). Total num frames: 1015808. Throughput: 0: 816.6, 1: 817.0. Samples: 252678. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:21:20,471][108279] Avg episode reward: [(0, '9.339'), (1, '9.179')] +[2023-09-25 20:21:25,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6355.0). Total num frames: 1048576. Throughput: 0: 816.3, 1: 818.1. Samples: 257828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:25,471][108279] Avg episode reward: [(0, '9.383'), (1, '9.233')] +[2023-09-25 20:21:25,472][108926] Saving new best policy, reward=9.383! +[2023-09-25 20:21:25,472][109025] Saving new best policy, reward=9.233! +[2023-09-25 20:21:27,893][109224] Updated weights for policy 1, policy_version 2080 (0.0016) +[2023-09-25 20:21:27,894][109225] Updated weights for policy 0, policy_version 2080 (0.0017) +[2023-09-25 20:21:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6360.9). Total num frames: 1081344. Throughput: 0: 817.7, 1: 820.3. Samples: 267628. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:21:30,470][108279] Avg episode reward: [(0, '9.383'), (1, '9.233')] +[2023-09-25 20:21:35,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6319.5). Total num frames: 1105920. Throughput: 0: 818.9, 1: 818.6. Samples: 277167. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:21:35,471][108279] Avg episode reward: [(0, '9.406'), (1, '9.281')] +[2023-09-25 20:21:35,472][108926] Saving new best policy, reward=9.406! +[2023-09-25 20:21:35,501][109025] Saving new best policy, reward=9.281! +[2023-09-25 20:21:40,459][109225] Updated weights for policy 0, policy_version 2240 (0.0018) +[2023-09-25 20:21:40,459][109224] Updated weights for policy 1, policy_version 2240 (0.0017) +[2023-09-25 20:21:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6371.6). Total num frames: 1146880. Throughput: 0: 819.1, 1: 819.9. Samples: 282327. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:40,471][108279] Avg episode reward: [(0, '9.406'), (1, '9.281')] +[2023-09-25 20:21:45,470][108279] Fps is (10 sec: 7373.0, 60 sec: 6621.9, 300 sec: 6376.5). Total num frames: 1179648. Throughput: 0: 819.5, 1: 819.9. Samples: 292184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:45,470][108279] Avg episode reward: [(0, '9.441'), (1, '9.324')] +[2023-09-25 20:21:45,474][109025] Saving new best policy, reward=9.324! +[2023-09-25 20:21:45,474][108926] Saving new best policy, reward=9.441! +[2023-09-25 20:21:50,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6553.6, 300 sec: 6338.0). Total num frames: 1204224. Throughput: 0: 818.0, 1: 817.1. Samples: 301647. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:21:50,471][108279] Avg episode reward: [(0, '9.441'), (1, '9.324')] +[2023-09-25 20:21:53,080][109225] Updated weights for policy 0, policy_version 2400 (0.0019) +[2023-09-25 20:21:53,080][109224] Updated weights for policy 1, policy_version 2400 (0.0020) +[2023-09-25 20:21:55,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6485.3, 300 sec: 6343.5). Total num frames: 1236992. Throughput: 0: 816.0, 1: 815.8. Samples: 306687. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:21:55,471][108279] Avg episode reward: [(0, '9.472'), (1, '9.361')] +[2023-09-25 20:21:55,532][108926] Saving new best policy, reward=9.472! +[2023-09-25 20:21:55,543][109025] Saving new best policy, reward=9.361! +[2023-09-25 20:22:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6348.8). Total num frames: 1269760. Throughput: 0: 818.6, 1: 818.8. Samples: 316585. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:22:00,471][108279] Avg episode reward: [(0, '9.472'), (1, '9.361')] +[2023-09-25 20:22:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6353.8). Total num frames: 1302528. Throughput: 0: 817.1, 1: 817.2. Samples: 326222. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-25 20:22:05,471][108279] Avg episode reward: [(0, '9.487'), (1, '9.382')] +[2023-09-25 20:22:05,540][108926] Saving new best policy, reward=9.487! +[2023-09-25 20:22:05,544][109025] Saving new best policy, reward=9.382! +[2023-09-25 20:22:05,547][109224] Updated weights for policy 1, policy_version 2560 (0.0018) +[2023-09-25 20:22:05,547][109225] Updated weights for policy 0, policy_version 2560 (0.0017) +[2023-09-25 20:22:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6358.6). Total num frames: 1335296. Throughput: 0: 816.7, 1: 817.1. Samples: 331350. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:22:10,471][108279] Avg episode reward: [(0, '9.487'), (1, '9.382')] +[2023-09-25 20:22:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6363.1). Total num frames: 1368064. Throughput: 0: 815.4, 1: 815.0. Samples: 340999. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:22:15,471][108279] Avg episode reward: [(0, '9.512'), (1, '9.412')] +[2023-09-25 20:22:15,544][108926] Saving new best policy, reward=9.512! +[2023-09-25 20:22:15,596][109025] Saving new best policy, reward=9.412! +[2023-09-25 20:22:18,138][109224] Updated weights for policy 1, policy_version 2720 (0.0017) +[2023-09-25 20:22:18,139][109225] Updated weights for policy 0, policy_version 2720 (0.0016) +[2023-09-25 20:22:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6367.4). Total num frames: 1400832. Throughput: 0: 815.1, 1: 815.5. Samples: 350541. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:22:20,471][108279] Avg episode reward: [(0, '9.512'), (1, '9.412')] +[2023-09-25 20:22:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6371.6). Total num frames: 1433600. Throughput: 0: 813.3, 1: 812.9. Samples: 355507. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:22:25,470][108279] Avg episode reward: [(0, '9.512'), (1, '9.429')] +[2023-09-25 20:22:25,471][109025] Saving new best policy, reward=9.429! +[2023-09-25 20:22:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6375.5). Total num frames: 1466368. Throughput: 0: 810.2, 1: 810.1. Samples: 365095. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:22:30,470][108279] Avg episode reward: [(0, '9.512'), (1, '9.429')] +[2023-09-25 20:22:30,475][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000002864_733184.pth... +[2023-09-25 20:22:30,475][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000002864_733184.pth... 
+[2023-09-25 20:22:30,802][109225] Updated weights for policy 0, policy_version 2880 (0.0018) +[2023-09-25 20:22:30,802][109224] Updated weights for policy 1, policy_version 2880 (0.0017) +[2023-09-25 20:22:35,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6379.3). Total num frames: 1499136. Throughput: 0: 813.7, 1: 811.7. Samples: 374793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:22:35,471][108279] Avg episode reward: [(0, '9.511'), (1, '9.437')] +[2023-09-25 20:22:35,472][109025] Saving new best policy, reward=9.437! +[2023-09-25 20:22:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6382.9). Total num frames: 1531904. Throughput: 0: 813.1, 1: 813.2. Samples: 379871. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:22:40,470][108279] Avg episode reward: [(0, '9.511'), (1, '9.432')] +[2023-09-25 20:22:43,282][109224] Updated weights for policy 1, policy_version 3040 (0.0019) +[2023-09-25 20:22:43,282][109225] Updated weights for policy 0, policy_version 3040 (0.0019) +[2023-09-25 20:22:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6386.4). Total num frames: 1564672. Throughput: 0: 813.3, 1: 812.8. Samples: 389760. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:22:45,471][108279] Avg episode reward: [(0, '9.516'), (1, '9.432')] +[2023-09-25 20:22:45,476][108926] Saving new best policy, reward=9.516! +[2023-09-25 20:22:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6389.8). Total num frames: 1597440. Throughput: 0: 814.0, 1: 811.4. Samples: 399365. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:22:50,470][108279] Avg episode reward: [(0, '9.522'), (1, '9.457')] +[2023-09-25 20:22:50,471][108926] Saving new best policy, reward=9.522! +[2023-09-25 20:22:50,471][109025] Saving new best policy, reward=9.457! +[2023-09-25 20:22:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6393.0). Total num frames: 1630208. 
Throughput: 0: 809.9, 1: 809.2. Samples: 404211. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:22:55,471][108279] Avg episode reward: [(0, '9.522'), (1, '9.457')] +[2023-09-25 20:22:55,966][109225] Updated weights for policy 0, policy_version 3200 (0.0015) +[2023-09-25 20:22:55,966][109224] Updated weights for policy 1, policy_version 3200 (0.0019) +[2023-09-25 20:23:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6396.1). Total num frames: 1662976. Throughput: 0: 810.6, 1: 809.8. Samples: 413917. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:23:00,470][108279] Avg episode reward: [(0, '9.531'), (1, '9.469')] +[2023-09-25 20:23:00,473][108926] Saving new best policy, reward=9.531! +[2023-09-25 20:23:00,474][109025] Saving new best policy, reward=9.469! +[2023-09-25 20:23:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6399.0). Total num frames: 1695744. Throughput: 0: 816.8, 1: 814.2. Samples: 423937. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:23:05,471][108279] Avg episode reward: [(0, '9.531'), (1, '9.469')] +[2023-09-25 20:23:08,438][109225] Updated weights for policy 0, policy_version 3360 (0.0019) +[2023-09-25 20:23:08,438][109224] Updated weights for policy 1, policy_version 3360 (0.0019) +[2023-09-25 20:23:10,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6401.9). Total num frames: 1728512. Throughput: 0: 814.4, 1: 814.8. Samples: 428821. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:10,471][108279] Avg episode reward: [(0, '9.550'), (1, '9.450')] +[2023-09-25 20:23:10,472][108926] Saving new best policy, reward=9.550! +[2023-09-25 20:23:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6404.7). Total num frames: 1761280. Throughput: 0: 818.2, 1: 818.5. Samples: 438747. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:15,470][108279] Avg episode reward: [(0, '9.550'), (1, '9.450')] +[2023-09-25 20:23:20,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6407.3). Total num frames: 1794048. Throughput: 0: 819.1, 1: 819.0. Samples: 448509. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:20,470][108279] Avg episode reward: [(0, '9.640'), (1, '9.550')] +[2023-09-25 20:23:20,471][109025] Saving new best policy, reward=9.550! +[2023-09-25 20:23:20,471][108926] Saving new best policy, reward=9.640! +[2023-09-25 20:23:21,059][109225] Updated weights for policy 0, policy_version 3520 (0.0018) +[2023-09-25 20:23:21,059][109224] Updated weights for policy 1, policy_version 3520 (0.0018) +[2023-09-25 20:23:25,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6409.9). Total num frames: 1826816. Throughput: 0: 814.9, 1: 815.0. Samples: 453216. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:25,471][108279] Avg episode reward: [(0, '9.640'), (1, '9.550')] +[2023-09-25 20:23:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6412.4). Total num frames: 1859584. Throughput: 0: 814.1, 1: 814.4. Samples: 463040. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:30,470][108279] Avg episode reward: [(0, '9.740'), (1, '9.680')] +[2023-09-25 20:23:30,479][108926] Saving new best policy, reward=9.740! +[2023-09-25 20:23:30,479][109025] Saving new best policy, reward=9.680! +[2023-09-25 20:23:33,436][109224] Updated weights for policy 1, policy_version 3680 (0.0014) +[2023-09-25 20:23:33,437][109225] Updated weights for policy 0, policy_version 3680 (0.0017) +[2023-09-25 20:23:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6414.8). Total num frames: 1892352. Throughput: 0: 819.2, 1: 819.1. Samples: 473089. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:35,471][108279] Avg episode reward: [(0, '9.740'), (1, '9.680')] +[2023-09-25 20:23:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 1925120. Throughput: 0: 819.0, 1: 819.0. Samples: 477920. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:23:40,470][108279] Avg episode reward: [(0, '9.780'), (1, '9.740')] +[2023-09-25 20:23:40,471][109025] Saving new best policy, reward=9.740! +[2023-09-25 20:23:40,471][108926] Saving new best policy, reward=9.780! +[2023-09-25 20:23:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 1957888. Throughput: 0: 819.4, 1: 819.4. Samples: 487662. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:23:45,470][108279] Avg episode reward: [(0, '9.780'), (1, '9.740')] +[2023-09-25 20:23:46,050][109225] Updated weights for policy 0, policy_version 3840 (0.0017) +[2023-09-25 20:23:46,050][109224] Updated weights for policy 1, policy_version 3840 (0.0017) +[2023-09-25 20:23:50,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 1990656. Throughput: 0: 817.9, 1: 819.2. Samples: 497607. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:23:50,471][108279] Avg episode reward: [(0, '9.860'), (1, '9.730')] +[2023-09-25 20:23:50,472][108926] Saving new best policy, reward=9.860! +[2023-09-25 20:23:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2023424. Throughput: 0: 815.3, 1: 815.1. Samples: 502191. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:23:55,471][108279] Avg episode reward: [(0, '9.860'), (1, '9.730')] +[2023-09-25 20:23:58,699][109224] Updated weights for policy 1, policy_version 4000 (0.0015) +[2023-09-25 20:23:58,699][109225] Updated weights for policy 0, policy_version 4000 (0.0016) +[2023-09-25 20:24:00,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2056192. Throughput: 0: 815.3, 1: 812.6. Samples: 512001. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:24:00,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.660')] +[2023-09-25 20:24:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2088960. Throughput: 0: 814.5, 1: 816.5. Samples: 521905. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:24:05,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.660')] +[2023-09-25 20:24:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2121728. Throughput: 0: 815.5, 1: 815.5. Samples: 526607. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:24:10,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.640')] +[2023-09-25 20:24:11,209][109224] Updated weights for policy 1, policy_version 4160 (0.0017) +[2023-09-25 20:24:11,209][109225] Updated weights for policy 0, policy_version 4160 (0.0015) +[2023-09-25 20:24:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2154496. Throughput: 0: 818.4, 1: 815.8. Samples: 536577. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-25 20:24:15,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.640')] +[2023-09-25 20:24:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2187264. Throughput: 0: 812.2, 1: 813.5. Samples: 546248. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:24:20,471][108279] Avg episode reward: [(0, '9.860'), (1, '9.610')] +[2023-09-25 20:24:23,982][109224] Updated weights for policy 1, policy_version 4320 (0.0018) +[2023-09-25 20:24:23,982][109225] Updated weights for policy 0, policy_version 4320 (0.0018) +[2023-09-25 20:24:25,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2220032. Throughput: 0: 812.2, 1: 809.9. Samples: 550912. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:24:25,472][108279] Avg episode reward: [(0, '9.860'), (1, '9.610')] +[2023-09-25 20:24:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2252800. Throughput: 0: 810.2, 1: 810.3. Samples: 560588. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:24:30,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.560')] +[2023-09-25 20:24:30,478][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000004400_1126400.pth... +[2023-09-25 20:24:30,478][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000004400_1126400.pth... +[2023-09-25 20:24:30,506][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000001344_344064.pth +[2023-09-25 20:24:30,513][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000001344_344064.pth +[2023-09-25 20:24:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2285568. Throughput: 0: 805.0, 1: 806.2. Samples: 570107. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:24:35,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.560')] +[2023-09-25 20:24:35,472][108926] Saving new best policy, reward=9.880! 
+[2023-09-25 20:24:36,700][109224] Updated weights for policy 1, policy_version 4480 (0.0018) +[2023-09-25 20:24:36,700][109225] Updated weights for policy 0, policy_version 4480 (0.0018) +[2023-09-25 20:24:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2318336. Throughput: 0: 812.3, 1: 812.0. Samples: 575283. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:24:40,470][108279] Avg episode reward: [(0, '9.880'), (1, '9.560')] +[2023-09-25 20:24:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2351104. Throughput: 0: 810.3, 1: 812.9. Samples: 585046. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:24:45,470][108279] Avg episode reward: [(0, '9.880'), (1, '9.560')] +[2023-09-25 20:24:49,115][109224] Updated weights for policy 1, policy_version 4640 (0.0017) +[2023-09-25 20:24:49,115][109225] Updated weights for policy 0, policy_version 4640 (0.0013) +[2023-09-25 20:24:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2383872. Throughput: 0: 810.6, 1: 811.1. Samples: 594885. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:24:50,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.510')] +[2023-09-25 20:24:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 2416640. Throughput: 0: 815.4, 1: 814.7. Samples: 599963. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:24:55,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.470')] +[2023-09-25 20:25:00,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 2441216. Throughput: 0: 806.2, 1: 808.6. Samples: 609244. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:25:00,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.450')] +[2023-09-25 20:25:01,874][109225] Updated weights for policy 0, policy_version 4800 (0.0017) +[2023-09-25 20:25:01,874][109224] Updated weights for policy 1, policy_version 4800 (0.0016) +[2023-09-25 20:25:05,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2473984. Throughput: 0: 806.7, 1: 808.0. Samples: 618911. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:25:05,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.430')] +[2023-09-25 20:25:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 2506752. Throughput: 0: 811.7, 1: 813.6. Samples: 624053. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:25:10,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.430')] +[2023-09-25 20:25:14,421][109224] Updated weights for policy 1, policy_version 4960 (0.0017) +[2023-09-25 20:25:14,421][109225] Updated weights for policy 0, policy_version 4960 (0.0019) +[2023-09-25 20:25:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2539520. Throughput: 0: 811.8, 1: 812.4. Samples: 633679. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:25:15,470][108279] Avg episode reward: [(0, '9.880'), (1, '9.430')] +[2023-09-25 20:25:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2572288. Throughput: 0: 813.8, 1: 813.8. Samples: 643349. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:25:20,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.430')] +[2023-09-25 20:25:25,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2605056. Throughput: 0: 813.5, 1: 813.7. Samples: 648507. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:25:25,471][108279] Avg episode reward: [(0, '9.870'), (1, '9.400')] +[2023-09-25 20:25:26,992][109224] Updated weights for policy 1, policy_version 5120 (0.0019) +[2023-09-25 20:25:26,992][109225] Updated weights for policy 0, policy_version 5120 (0.0017) +[2023-09-25 20:25:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2637824. Throughput: 0: 812.3, 1: 812.7. Samples: 658171. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:25:30,470][108279] Avg episode reward: [(0, '9.870'), (1, '9.400')] +[2023-09-25 20:25:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2670592. Throughput: 0: 810.3, 1: 810.0. Samples: 667796. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:25:35,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.370')] +[2023-09-25 20:25:39,656][109224] Updated weights for policy 1, policy_version 5280 (0.0017) +[2023-09-25 20:25:39,658][109225] Updated weights for policy 0, policy_version 5280 (0.0020) +[2023-09-25 20:25:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6511.9). Total num frames: 2703360. Throughput: 0: 809.2, 1: 808.8. Samples: 672772. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:25:40,470][108279] Avg episode reward: [(0, '9.880'), (1, '9.370')] +[2023-09-25 20:25:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 2736128. Throughput: 0: 813.4, 1: 813.7. Samples: 682463. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:25:45,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.270')] +[2023-09-25 20:25:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6511.9). Total num frames: 2768896. Throughput: 0: 815.6, 1: 813.3. Samples: 692213. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:25:50,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.270')] +[2023-09-25 20:25:52,431][109225] Updated weights for policy 0, policy_version 5440 (0.0019) +[2023-09-25 20:25:52,431][109224] Updated weights for policy 1, policy_version 5440 (0.0019) +[2023-09-25 20:25:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 2801664. Throughput: 0: 806.5, 1: 806.9. Samples: 696658. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:25:55,471][108279] Avg episode reward: [(0, '9.870'), (1, '9.220')] +[2023-09-25 20:26:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 2834432. Throughput: 0: 811.2, 1: 808.4. Samples: 706561. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:26:00,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.220')] +[2023-09-25 20:26:05,164][109224] Updated weights for policy 1, policy_version 5600 (0.0018) +[2023-09-25 20:26:05,164][109225] Updated weights for policy 0, policy_version 5600 (0.0017) +[2023-09-25 20:26:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 2867200. Throughput: 0: 809.9, 1: 809.1. Samples: 716204. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:26:05,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.150')] +[2023-09-25 20:26:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 2899968. Throughput: 0: 805.5, 1: 803.2. Samples: 720900. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:26:10,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.150')] +[2023-09-25 20:26:15,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 2932736. Throughput: 0: 807.0, 1: 806.6. Samples: 730786. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:26:15,470][108279] Avg episode reward: [(0, '9.880'), (1, '9.140')] +[2023-09-25 20:26:17,776][109224] Updated weights for policy 1, policy_version 5760 (0.0017) +[2023-09-25 20:26:17,776][109225] Updated weights for policy 0, policy_version 5760 (0.0018) +[2023-09-25 20:26:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 2965504. Throughput: 0: 806.2, 1: 806.4. Samples: 740359. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:26:20,471][108279] Avg episode reward: [(0, '9.870'), (1, '9.140')] +[2023-09-25 20:26:25,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 2998272. Throughput: 0: 806.2, 1: 807.0. Samples: 745364. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:26:25,471][108279] Avg episode reward: [(0, '9.870'), (1, '9.120')] +[2023-09-25 20:26:30,378][109224] Updated weights for policy 1, policy_version 5920 (0.0017) +[2023-09-25 20:26:30,379][109225] Updated weights for policy 0, policy_version 5920 (0.0017) +[2023-09-25 20:26:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3031040. Throughput: 0: 806.7, 1: 806.9. Samples: 755078. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:26:30,471][108279] Avg episode reward: [(0, '9.880'), (1, '9.120')] +[2023-09-25 20:26:30,482][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000005920_1515520.pth... +[2023-09-25 20:26:30,482][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000005920_1515520.pth... +[2023-09-25 20:26:30,517][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000002864_733184.pth +[2023-09-25 20:26:30,523][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000002864_733184.pth +[2023-09-25 20:26:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6498.1). 
Total num frames: 3063808. Throughput: 0: 805.6, 1: 808.0. Samples: 764828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:26:35,470][108279] Avg episode reward: [(0, '9.880'), (1, '9.040')] +[2023-09-25 20:26:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 3096576. Throughput: 0: 815.1, 1: 814.3. Samples: 769980. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:26:40,471][108279] Avg episode reward: [(0, '9.870'), (1, '9.040')] +[2023-09-25 20:26:42,821][109225] Updated weights for policy 0, policy_version 6080 (0.0017) +[2023-09-25 20:26:42,821][109224] Updated weights for policy 1, policy_version 6080 (0.0014) +[2023-09-25 20:26:45,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3129344. Throughput: 0: 812.7, 1: 815.1. Samples: 779811. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:26:45,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.940')] +[2023-09-25 20:26:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3162112. Throughput: 0: 814.3, 1: 815.0. Samples: 789522. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:26:50,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.940')] +[2023-09-25 20:26:50,472][108926] Saving new best policy, reward=9.890! +[2023-09-25 20:26:55,293][109225] Updated weights for policy 0, policy_version 6240 (0.0015) +[2023-09-25 20:26:55,293][109224] Updated weights for policy 1, policy_version 6240 (0.0017) +[2023-09-25 20:26:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3194880. Throughput: 0: 818.6, 1: 819.1. Samples: 794599. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:26:55,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.890')] +[2023-09-25 20:27:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3227648. 
Throughput: 0: 818.8, 1: 818.2. Samples: 804453. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-25 20:27:00,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.860')] +[2023-09-25 20:27:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3260416. Throughput: 0: 821.7, 1: 821.5. Samples: 814302. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:27:05,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.800')] +[2023-09-25 20:27:07,778][109225] Updated weights for policy 0, policy_version 6400 (0.0017) +[2023-09-25 20:27:07,779][109224] Updated weights for policy 1, policy_version 6400 (0.0018) +[2023-09-25 20:27:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3293184. Throughput: 0: 821.6, 1: 819.2. Samples: 819200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:27:10,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.780')] +[2023-09-25 20:27:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3325952. Throughput: 0: 822.7, 1: 822.5. Samples: 829113. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:27:15,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.740')] +[2023-09-25 20:27:20,161][109224] Updated weights for policy 1, policy_version 6560 (0.0016) +[2023-09-25 20:27:20,161][109225] Updated weights for policy 0, policy_version 6560 (0.0017) +[2023-09-25 20:27:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3358720. Throughput: 0: 824.1, 1: 824.7. Samples: 839022. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-25 20:27:20,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.740')] +[2023-09-25 20:27:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3391488. Throughput: 0: 820.7, 1: 819.4. Samples: 843787. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:27:25,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.740')] +[2023-09-25 20:27:30,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3424256. Throughput: 0: 824.4, 1: 823.3. Samples: 853954. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:27:30,470][108279] Avg episode reward: [(0, '9.880'), (1, '8.750')] +[2023-09-25 20:27:32,647][109225] Updated weights for policy 0, policy_version 6720 (0.0017) +[2023-09-25 20:27:32,647][109224] Updated weights for policy 1, policy_version 6720 (0.0018) +[2023-09-25 20:27:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3457024. Throughput: 0: 822.7, 1: 822.6. Samples: 863562. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:27:35,471][108279] Avg episode reward: [(0, '9.880'), (1, '8.750')] +[2023-09-25 20:27:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3489792. Throughput: 0: 819.8, 1: 819.3. Samples: 868357. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:27:40,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.780')] +[2023-09-25 20:27:45,276][109225] Updated weights for policy 0, policy_version 6880 (0.0019) +[2023-09-25 20:27:45,277][109224] Updated weights for policy 1, policy_version 6880 (0.0018) +[2023-09-25 20:27:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3522560. Throughput: 0: 819.8, 1: 820.1. Samples: 878246. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:27:45,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.780')] +[2023-09-25 20:27:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3555328. Throughput: 0: 816.8, 1: 817.6. Samples: 887853. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:27:50,471][108279] Avg episode reward: [(0, '9.890'), (1, '8.780')] +[2023-09-25 20:27:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3588096. Throughput: 0: 818.2, 1: 819.2. Samples: 892884. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:27:55,471][108279] Avg episode reward: [(0, '9.880'), (1, '8.780')] +[2023-09-25 20:27:57,963][109224] Updated weights for policy 1, policy_version 7040 (0.0018) +[2023-09-25 20:27:57,964][109225] Updated weights for policy 0, policy_version 7040 (0.0020) +[2023-09-25 20:28:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3620864. Throughput: 0: 814.6, 1: 814.6. Samples: 902426. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:28:00,470][108279] Avg episode reward: [(0, '9.880'), (1, '8.790')] +[2023-09-25 20:28:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3653632. Throughput: 0: 815.6, 1: 815.0. Samples: 912399. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:28:05,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.790')] +[2023-09-25 20:28:10,200][109224] Updated weights for policy 1, policy_version 7200 (0.0017) +[2023-09-25 20:28:10,200][109225] Updated weights for policy 0, policy_version 7200 (0.0017) +[2023-09-25 20:28:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3686400. Throughput: 0: 819.2, 1: 819.0. Samples: 917504. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:28:10,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.780')] +[2023-09-25 20:28:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3719168. Throughput: 0: 818.6, 1: 819.2. Samples: 927654. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:28:15,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.780')] +[2023-09-25 20:28:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3751936. Throughput: 0: 820.8, 1: 820.6. Samples: 937425. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:28:20,470][108279] Avg episode reward: [(0, '9.870'), (1, '8.850')] +[2023-09-25 20:28:22,697][109224] Updated weights for policy 1, policy_version 7360 (0.0016) +[2023-09-25 20:28:22,697][109225] Updated weights for policy 0, policy_version 7360 (0.0014) +[2023-09-25 20:28:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3784704. Throughput: 0: 819.2, 1: 819.1. Samples: 942082. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:28:25,470][108279] Avg episode reward: [(0, '9.860'), (1, '8.850')] +[2023-09-25 20:28:30,470][108279] Fps is (10 sec: 5734.2, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 3809280. Throughput: 0: 814.4, 1: 816.0. Samples: 951616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:28:30,471][108279] Avg episode reward: [(0, '9.860'), (1, '8.890')] +[2023-09-25 20:28:30,481][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000007456_1908736.pth... +[2023-09-25 20:28:30,514][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000004400_1126400.pth +[2023-09-25 20:28:30,545][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000007456_1908736.pth... +[2023-09-25 20:28:30,581][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000004400_1126400.pth +[2023-09-25 20:28:35,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 3842048. Throughput: 0: 814.7, 1: 814.1. Samples: 961151. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:28:35,471][108279] Avg episode reward: [(0, '9.860'), (1, '8.890')] +[2023-09-25 20:28:35,544][109224] Updated weights for policy 1, policy_version 7520 (0.0017) +[2023-09-25 20:28:35,544][109225] Updated weights for policy 0, policy_version 7520 (0.0016) +[2023-09-25 20:28:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 3874816. Throughput: 0: 814.8, 1: 816.2. Samples: 966281. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:28:40,471][108279] Avg episode reward: [(0, '9.860'), (1, '8.920')] +[2023-09-25 20:28:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 3907584. Throughput: 0: 816.0, 1: 816.5. Samples: 975888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:28:45,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.920')] +[2023-09-25 20:28:48,094][109225] Updated weights for policy 0, policy_version 7680 (0.0017) +[2023-09-25 20:28:48,095][109224] Updated weights for policy 1, policy_version 7680 (0.0018) +[2023-09-25 20:28:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 3940352. Throughput: 0: 814.0, 1: 813.9. Samples: 985654. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:28:50,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.970')] +[2023-09-25 20:28:55,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 3981312. Throughput: 0: 814.2, 1: 817.2. Samples: 990919. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:28:55,470][108279] Avg episode reward: [(0, '9.870'), (1, '8.970')] +[2023-09-25 20:29:00,404][109224] Updated weights for policy 1, policy_version 7840 (0.0014) +[2023-09-25 20:29:00,404][109225] Updated weights for policy 0, policy_version 7840 (0.0017) +[2023-09-25 20:29:00,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6525.8). 
Total num frames: 4014080. Throughput: 0: 811.6, 1: 812.6. Samples: 1000740. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:00,470][108279] Avg episode reward: [(0, '9.870'), (1, '8.960')] +[2023-09-25 20:29:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 4046848. Throughput: 0: 812.3, 1: 812.7. Samples: 1010550. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:05,471][108279] Avg episode reward: [(0, '9.870'), (1, '8.960')] +[2023-09-25 20:29:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 4079616. Throughput: 0: 816.0, 1: 819.2. Samples: 1015662. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:10,471][108279] Avg episode reward: [(0, '9.870'), (1, '9.030')] +[2023-09-25 20:29:13,049][109225] Updated weights for policy 0, policy_version 8000 (0.0017) +[2023-09-25 20:29:13,049][109224] Updated weights for policy 1, policy_version 8000 (0.0016) +[2023-09-25 20:29:15,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 4104192. Throughput: 0: 817.0, 1: 815.9. Samples: 1025095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:15,471][108279] Avg episode reward: [(0, '9.860'), (1, '9.050')] +[2023-09-25 20:29:20,470][108279] Fps is (10 sec: 6144.1, 60 sec: 6485.3, 300 sec: 6511.9). Total num frames: 4141056. Throughput: 0: 820.3, 1: 819.9. Samples: 1034958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:20,471][108279] Avg episode reward: [(0, '9.860'), (1, '9.080')] +[2023-09-25 20:29:25,439][109224] Updated weights for policy 1, policy_version 8160 (0.0017) +[2023-09-25 20:29:25,439][109225] Updated weights for policy 0, policy_version 8160 (0.0016) +[2023-09-25 20:29:25,470][108279] Fps is (10 sec: 7373.0, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 4177920. Throughput: 0: 820.0, 1: 820.3. Samples: 1040096. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:25,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.080')] +[2023-09-25 20:29:30,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 4202496. Throughput: 0: 820.9, 1: 819.6. Samples: 1049714. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:30,470][108279] Avg episode reward: [(0, '9.860'), (1, '9.080')] +[2023-09-25 20:29:35,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 4235264. Throughput: 0: 820.3, 1: 820.2. Samples: 1059477. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:35,471][108279] Avg episode reward: [(0, '9.850'), (1, '9.080')] +[2023-09-25 20:29:38,006][109224] Updated weights for policy 1, policy_version 8320 (0.0017) +[2023-09-25 20:29:38,006][109225] Updated weights for policy 0, policy_version 8320 (0.0016) +[2023-09-25 20:29:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 4268032. Throughput: 0: 819.7, 1: 818.9. Samples: 1064656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:40,471][108279] Avg episode reward: [(0, '9.840'), (1, '9.110')] +[2023-09-25 20:29:45,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6621.9, 300 sec: 6511.9). Total num frames: 4304896. Throughput: 0: 818.2, 1: 818.1. Samples: 1074371. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:45,471][108279] Avg episode reward: [(0, '9.830'), (1, '9.110')] +[2023-09-25 20:29:50,421][109224] Updated weights for policy 1, policy_version 8480 (0.0016) +[2023-09-25 20:29:50,422][109225] Updated weights for policy 0, policy_version 8480 (0.0016) +[2023-09-25 20:29:50,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6690.1, 300 sec: 6525.8). Total num frames: 4341760. Throughput: 0: 819.5, 1: 819.3. Samples: 1084298. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:29:50,471][108279] Avg episode reward: [(0, '9.830'), (1, '9.120')] +[2023-09-25 20:29:55,470][108279] Fps is (10 sec: 6144.1, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 4366336. Throughput: 0: 818.2, 1: 819.2. Samples: 1089346. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:29:55,470][108279] Avg episode reward: [(0, '9.820'), (1, '9.150')] +[2023-09-25 20:30:00,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 4403200. Throughput: 0: 820.4, 1: 820.2. Samples: 1098926. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:30:00,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.150')] +[2023-09-25 20:30:02,966][109224] Updated weights for policy 1, policy_version 8640 (0.0013) +[2023-09-25 20:30:02,967][109225] Updated weights for policy 0, policy_version 8640 (0.0017) +[2023-09-25 20:30:05,470][108279] Fps is (10 sec: 7372.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 4440064. Throughput: 0: 819.8, 1: 820.5. Samples: 1108769. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:30:05,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.250')] +[2023-09-25 20:30:10,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 4472832. Throughput: 0: 820.6, 1: 820.7. Samples: 1113956. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-25 20:30:10,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.250')] +[2023-09-25 20:30:15,373][109225] Updated weights for policy 0, policy_version 8800 (0.0016) +[2023-09-25 20:30:15,374][109224] Updated weights for policy 1, policy_version 8800 (0.0017) +[2023-09-25 20:30:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 4505600. Throughput: 0: 821.8, 1: 822.8. Samples: 1123721. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:30:15,471][108279] Avg episode reward: [(0, '9.810'), (1, '9.300')] +[2023-09-25 20:30:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6621.9, 300 sec: 6553.6). Total num frames: 4538368. Throughput: 0: 821.7, 1: 821.8. Samples: 1133434. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:30:20,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.300')] +[2023-09-25 20:30:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 4571136. Throughput: 0: 822.5, 1: 821.5. Samples: 1138638. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:30:25,470][108279] Avg episode reward: [(0, '9.820'), (1, '9.370')] +[2023-09-25 20:30:27,877][109224] Updated weights for policy 1, policy_version 8960 (0.0018) +[2023-09-25 20:30:27,877][109225] Updated weights for policy 0, policy_version 8960 (0.0017) +[2023-09-25 20:30:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 4603904. Throughput: 0: 822.2, 1: 821.3. Samples: 1148329. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:30:30,471][108279] Avg episode reward: [(0, '9.780'), (1, '9.370')] +[2023-09-25 20:30:30,479][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000008992_2301952.pth... +[2023-09-25 20:30:30,479][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000008992_2301952.pth... +[2023-09-25 20:30:30,515][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000005920_1515520.pth +[2023-09-25 20:30:30,518][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000005920_1515520.pth +[2023-09-25 20:30:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 4636672. Throughput: 0: 820.3, 1: 820.2. Samples: 1158119. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:30:35,471][108279] Avg episode reward: [(0, '9.780'), (1, '9.380')] +[2023-09-25 20:30:40,313][109224] Updated weights for policy 1, policy_version 9120 (0.0016) +[2023-09-25 20:30:40,313][109225] Updated weights for policy 0, policy_version 9120 (0.0015) +[2023-09-25 20:30:40,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6690.2, 300 sec: 6553.6). Total num frames: 4669440. Throughput: 0: 821.5, 1: 819.2. Samples: 1163177. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:30:40,470][108279] Avg episode reward: [(0, '9.770'), (1, '9.380')] +[2023-09-25 20:30:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6621.9, 300 sec: 6553.6). Total num frames: 4702208. Throughput: 0: 824.8, 1: 824.7. Samples: 1173152. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:30:45,471][108279] Avg episode reward: [(0, '9.760'), (1, '9.400')] +[2023-09-25 20:30:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 4734976. Throughput: 0: 822.1, 1: 821.7. Samples: 1182736. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:30:50,470][108279] Avg episode reward: [(0, '9.750'), (1, '9.400')] +[2023-09-25 20:30:52,795][109224] Updated weights for policy 1, policy_version 9280 (0.0018) +[2023-09-25 20:30:52,795][109225] Updated weights for policy 0, policy_version 9280 (0.0017) +[2023-09-25 20:30:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 4767744. Throughput: 0: 822.2, 1: 819.6. Samples: 1187835. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:30:55,470][108279] Avg episode reward: [(0, '9.740'), (1, '9.480')] +[2023-09-25 20:31:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6621.9, 300 sec: 6553.6). Total num frames: 4800512. Throughput: 0: 818.8, 1: 819.2. Samples: 1197429. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:31:00,470][108279] Avg episode reward: [(0, '9.720'), (1, '9.480')] +[2023-09-25 20:31:05,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 4825088. Throughput: 0: 816.2, 1: 816.3. Samples: 1206894. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:31:05,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.590')] +[2023-09-25 20:31:05,538][109224] Updated weights for policy 1, policy_version 9440 (0.0017) +[2023-09-25 20:31:05,538][109225] Updated weights for policy 0, policy_version 9440 (0.0017) +[2023-09-25 20:31:10,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 4857856. Throughput: 0: 814.0, 1: 816.4. Samples: 1212006. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:31:10,471][108279] Avg episode reward: [(0, '9.690'), (1, '9.590')] +[2023-09-25 20:31:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 4890624. Throughput: 0: 815.6, 1: 816.2. Samples: 1221762. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:31:15,471][108279] Avg episode reward: [(0, '9.690'), (1, '9.690')] +[2023-09-25 20:31:17,999][109224] Updated weights for policy 1, policy_version 9600 (0.0015) +[2023-09-25 20:31:17,999][109225] Updated weights for policy 0, policy_version 9600 (0.0016) +[2023-09-25 20:31:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 4923392. Throughput: 0: 815.5, 1: 815.8. Samples: 1231525. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:31:20,471][108279] Avg episode reward: [(0, '9.690'), (1, '9.690')] +[2023-09-25 20:31:25,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 4964352. Throughput: 0: 816.0, 1: 817.2. Samples: 1236672. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:31:25,471][108279] Avg episode reward: [(0, '9.690'), (1, '9.760')] +[2023-09-25 20:31:25,473][109025] Saving new best policy, reward=9.760! +[2023-09-25 20:31:30,378][109224] Updated weights for policy 1, policy_version 9760 (0.0016) +[2023-09-25 20:31:30,378][109225] Updated weights for policy 0, policy_version 9760 (0.0017) +[2023-09-25 20:31:30,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 4997120. Throughput: 0: 815.7, 1: 815.9. Samples: 1246576. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:31:30,470][108279] Avg episode reward: [(0, '9.680'), (1, '9.760')] +[2023-09-25 20:31:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5029888. Throughput: 0: 817.0, 1: 817.6. Samples: 1256291. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:31:35,470][108279] Avg episode reward: [(0, '9.680'), (1, '9.810')] +[2023-09-25 20:31:35,471][109025] Saving new best policy, reward=9.810! +[2023-09-25 20:31:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5062656. Throughput: 0: 816.9, 1: 819.2. Samples: 1261458. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:31:40,471][108279] Avg episode reward: [(0, '9.650'), (1, '9.810')] +[2023-09-25 20:31:42,851][109225] Updated weights for policy 0, policy_version 9920 (0.0017) +[2023-09-25 20:31:42,851][109224] Updated weights for policy 1, policy_version 9920 (0.0016) +[2023-09-25 20:31:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5095424. Throughput: 0: 820.4, 1: 820.3. Samples: 1271264. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:31:45,470][108279] Avg episode reward: [(0, '9.650'), (1, '9.840')] +[2023-09-25 20:31:45,477][109025] Saving new best policy, reward=9.840! 
+[2023-09-25 20:31:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5128192. Throughput: 0: 824.2, 1: 823.3. Samples: 1281034. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:31:50,470][108279] Avg episode reward: [(0, '9.620'), (1, '9.840')] +[2023-09-25 20:31:55,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 5156864. Throughput: 0: 822.0, 1: 821.9. Samples: 1285981. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:31:55,471][108279] Avg episode reward: [(0, '9.620'), (1, '9.860')] +[2023-09-25 20:31:55,549][109025] Saving new best policy, reward=9.860! +[2023-09-25 20:31:55,552][109224] Updated weights for policy 1, policy_version 10080 (0.0018) +[2023-09-25 20:31:55,563][109225] Updated weights for policy 0, policy_version 10080 (0.0017) +[2023-09-25 20:32:00,470][108279] Fps is (10 sec: 5734.2, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 5185536. Throughput: 0: 817.4, 1: 817.2. Samples: 1295319. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:00,471][108279] Avg episode reward: [(0, '9.590'), (1, '9.860')] +[2023-09-25 20:32:05,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 5218304. Throughput: 0: 819.4, 1: 819.1. Samples: 1305258. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:32:05,471][108279] Avg episode reward: [(0, '9.590'), (1, '9.860')] +[2023-09-25 20:32:07,981][109224] Updated weights for policy 1, policy_version 10240 (0.0017) +[2023-09-25 20:32:07,982][109225] Updated weights for policy 0, policy_version 10240 (0.0019) +[2023-09-25 20:32:10,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6621.9, 300 sec: 6539.7). Total num frames: 5255168. Throughput: 0: 819.4, 1: 818.8. Samples: 1310389. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:32:10,471][108279] Avg episode reward: [(0, '9.580'), (1, '9.860')] +[2023-09-25 20:32:15,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6621.9, 300 sec: 6539.7). Total num frames: 5287936. Throughput: 0: 818.4, 1: 818.2. Samples: 1320223. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:15,471][108279] Avg episode reward: [(0, '9.580'), (1, '9.860')] +[2023-09-25 20:32:20,439][109224] Updated weights for policy 1, policy_version 10400 (0.0018) +[2023-09-25 20:32:20,439][109225] Updated weights for policy 0, policy_version 10400 (0.0018) +[2023-09-25 20:32:20,470][108279] Fps is (10 sec: 6963.4, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 5324800. Throughput: 0: 818.5, 1: 817.8. Samples: 1329926. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:20,470][108279] Avg episode reward: [(0, '9.550'), (1, '9.870')] +[2023-09-25 20:32:20,471][109025] Saving new best policy, reward=9.870! +[2023-09-25 20:32:25,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5357568. Throughput: 0: 818.4, 1: 818.3. Samples: 1335110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:25,471][108279] Avg episode reward: [(0, '9.550'), (1, '9.870')] +[2023-09-25 20:32:30,470][108279] Fps is (10 sec: 5734.2, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 5382144. Throughput: 0: 816.2, 1: 815.2. Samples: 1344677. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:30,471][108279] Avg episode reward: [(0, '9.530'), (1, '9.910')] +[2023-09-25 20:32:30,484][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000010528_2695168.pth... +[2023-09-25 20:32:30,505][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000010528_2695168.pth... 
+[2023-09-25 20:32:30,511][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000007456_1908736.pth +[2023-09-25 20:32:30,514][109025] Saving new best policy, reward=9.910! +[2023-09-25 20:32:30,533][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000007456_1908736.pth +[2023-09-25 20:32:33,071][109224] Updated weights for policy 1, policy_version 10560 (0.0016) +[2023-09-25 20:32:33,072][109225] Updated weights for policy 0, policy_version 10560 (0.0015) +[2023-09-25 20:32:35,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 5414912. Throughput: 0: 814.1, 1: 815.0. Samples: 1354345. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:35,471][108279] Avg episode reward: [(0, '9.530'), (1, '9.910')] +[2023-09-25 20:32:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 5447680. Throughput: 0: 817.5, 1: 816.7. Samples: 1359519. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:40,471][108279] Avg episode reward: [(0, '9.520'), (1, '9.910')] +[2023-09-25 20:32:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 5480448. Throughput: 0: 820.6, 1: 821.9. Samples: 1369235. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:45,471][108279] Avg episode reward: [(0, '9.520'), (1, '9.910')] +[2023-09-25 20:32:45,581][109225] Updated weights for policy 0, policy_version 10720 (0.0016) +[2023-09-25 20:32:45,582][109224] Updated weights for policy 1, policy_version 10720 (0.0018) +[2023-09-25 20:32:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 5513216. Throughput: 0: 818.5, 1: 819.0. Samples: 1378946. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:32:50,471][108279] Avg episode reward: [(0, '9.500'), (1, '9.910')] +[2023-09-25 20:32:55,470][108279] Fps is (10 sec: 7372.7, 60 sec: 6621.8, 300 sec: 6553.6). 
Total num frames: 5554176. Throughput: 0: 820.4, 1: 820.2. Samples: 1384212. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:32:55,471][108279] Avg episode reward: [(0, '9.500'), (1, '9.910')] +[2023-09-25 20:32:57,886][109225] Updated weights for policy 0, policy_version 10880 (0.0017) +[2023-09-25 20:32:57,886][109224] Updated weights for policy 1, policy_version 10880 (0.0017) +[2023-09-25 20:33:00,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 5586944. Throughput: 0: 820.5, 1: 821.2. Samples: 1394098. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:33:00,471][108279] Avg episode reward: [(0, '9.470'), (1, '9.920')] +[2023-09-25 20:33:00,480][109025] Saving new best policy, reward=9.920! +[2023-09-25 20:33:05,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 5611520. Throughput: 0: 818.0, 1: 818.0. Samples: 1403542. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:33:05,471][108279] Avg episode reward: [(0, '9.470'), (1, '9.920')] +[2023-09-25 20:33:10,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 5648384. Throughput: 0: 818.0, 1: 818.6. Samples: 1408756. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:33:10,471][108279] Avg episode reward: [(0, '9.450'), (1, '9.920')] +[2023-09-25 20:33:10,480][109224] Updated weights for policy 1, policy_version 11040 (0.0018) +[2023-09-25 20:33:10,480][109225] Updated weights for policy 0, policy_version 11040 (0.0016) +[2023-09-25 20:33:15,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6621.9, 300 sec: 6553.6). Total num frames: 5685248. Throughput: 0: 819.5, 1: 820.3. Samples: 1418466. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:33:15,470][108279] Avg episode reward: [(0, '9.450'), (1, '9.920')] +[2023-09-25 20:33:20,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5718016. 
Throughput: 0: 821.9, 1: 822.1. Samples: 1428324. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:33:20,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.910')] +[2023-09-25 20:33:22,854][109225] Updated weights for policy 0, policy_version 11200 (0.0017) +[2023-09-25 20:33:22,854][109224] Updated weights for policy 1, policy_version 11200 (0.0017) +[2023-09-25 20:33:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6581.4). Total num frames: 5750784. Throughput: 0: 822.3, 1: 821.8. Samples: 1433506. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:33:25,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.910')] +[2023-09-25 20:33:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.2, 300 sec: 6581.4). Total num frames: 5783552. Throughput: 0: 825.4, 1: 823.8. Samples: 1443447. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:33:30,470][108279] Avg episode reward: [(0, '9.430'), (1, '9.900')] +[2023-09-25 20:33:35,361][109225] Updated weights for policy 0, policy_version 11360 (0.0018) +[2023-09-25 20:33:35,361][109224] Updated weights for policy 1, policy_version 11360 (0.0016) +[2023-09-25 20:33:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.2, 300 sec: 6581.4). Total num frames: 5816320. Throughput: 0: 822.7, 1: 822.9. Samples: 1452998. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-25 20:33:35,470][108279] Avg episode reward: [(0, '9.430'), (1, '9.900')] +[2023-09-25 20:33:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6690.1, 300 sec: 6581.4). Total num frames: 5849088. Throughput: 0: 821.2, 1: 820.6. Samples: 1458091. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:33:40,471][108279] Avg episode reward: [(0, '9.370'), (1, '9.900')] +[2023-09-25 20:33:45,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6690.1, 300 sec: 6581.4). Total num frames: 5881856. Throughput: 0: 820.6, 1: 820.2. Samples: 1467937. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:33:45,471][108279] Avg episode reward: [(0, '9.370'), (1, '9.900')] +[2023-09-25 20:33:47,790][109225] Updated weights for policy 0, policy_version 11520 (0.0015) +[2023-09-25 20:33:47,791][109224] Updated weights for policy 1, policy_version 11520 (0.0017) +[2023-09-25 20:33:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 5914624. Throughput: 0: 823.8, 1: 824.5. Samples: 1477716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:33:50,471][108279] Avg episode reward: [(0, '9.380'), (1, '9.900')] +[2023-09-25 20:33:55,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5947392. Throughput: 0: 823.6, 1: 820.7. Samples: 1482752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-25 20:33:55,470][108279] Avg episode reward: [(0, '9.380'), (1, '9.900')] +[2023-09-25 20:34:00,345][109224] Updated weights for policy 1, policy_version 11680 (0.0018) +[2023-09-25 20:34:00,345][109225] Updated weights for policy 0, policy_version 11680 (0.0017) +[2023-09-25 20:34:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 5980160. Throughput: 0: 823.8, 1: 823.7. Samples: 1492605. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:00,470][108279] Avg episode reward: [(0, '9.330'), (1, '9.910')] +[2023-09-25 20:34:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 6012928. Throughput: 0: 821.5, 1: 821.2. Samples: 1502247. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:05,471][108279] Avg episode reward: [(0, '9.330'), (1, '9.920')] +[2023-09-25 20:34:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6621.9, 300 sec: 6581.4). Total num frames: 6045696. Throughput: 0: 820.3, 1: 819.2. Samples: 1507285. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:34:10,470][108279] Avg episode reward: [(0, '9.290'), (1, '9.920')] +[2023-09-25 20:34:12,908][109225] Updated weights for policy 0, policy_version 11840 (0.0016) +[2023-09-25 20:34:12,909][109224] Updated weights for policy 1, policy_version 11840 (0.0018) +[2023-09-25 20:34:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6567.5). Total num frames: 6078464. Throughput: 0: 816.2, 1: 816.5. Samples: 1516917. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:34:15,471][108279] Avg episode reward: [(0, '9.270'), (1, '9.920')] +[2023-09-25 20:34:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 6111232. Throughput: 0: 819.9, 1: 819.0. Samples: 1526747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:20,470][108279] Avg episode reward: [(0, '9.240'), (1, '9.920')] +[2023-09-25 20:34:25,430][109225] Updated weights for policy 0, policy_version 12000 (0.0017) +[2023-09-25 20:34:25,430][109224] Updated weights for policy 1, policy_version 12000 (0.0016) +[2023-09-25 20:34:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6581.4). Total num frames: 6144000. Throughput: 0: 818.4, 1: 819.2. Samples: 1531786. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:25,470][108279] Avg episode reward: [(0, '9.240'), (1, '9.920')] +[2023-09-25 20:34:30,470][108279] Fps is (10 sec: 5734.2, 60 sec: 6417.0, 300 sec: 6553.6). Total num frames: 6168576. Throughput: 0: 815.0, 1: 813.8. Samples: 1541231. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:30,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.920')] +[2023-09-25 20:34:30,521][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000012064_3088384.pth... +[2023-09-25 20:34:30,530][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000012064_3088384.pth... 
+[2023-09-25 20:34:30,550][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000008992_2301952.pth +[2023-09-25 20:34:30,558][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000008992_2301952.pth +[2023-09-25 20:34:35,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.0, 300 sec: 6553.6). Total num frames: 6201344. Throughput: 0: 813.6, 1: 813.0. Samples: 1550917. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:35,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.910')] +[2023-09-25 20:34:38,078][109225] Updated weights for policy 0, policy_version 12160 (0.0017) +[2023-09-25 20:34:38,079][109224] Updated weights for policy 1, policy_version 12160 (0.0017) +[2023-09-25 20:34:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6539.7). Total num frames: 6234112. Throughput: 0: 812.2, 1: 815.8. Samples: 1556015. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:40,470][108279] Avg episode reward: [(0, '9.240'), (1, '9.910')] +[2023-09-25 20:34:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 6266880. Throughput: 0: 812.2, 1: 812.5. Samples: 1565716. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:45,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.910')] +[2023-09-25 20:34:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6553.6). Total num frames: 6299648. Throughput: 0: 814.8, 1: 815.0. Samples: 1575588. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:50,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.890')] +[2023-09-25 20:34:50,507][109224] Updated weights for policy 1, policy_version 12320 (0.0016) +[2023-09-25 20:34:50,507][109225] Updated weights for policy 0, policy_version 12320 (0.0015) +[2023-09-25 20:34:55,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6567.5). Total num frames: 6340608. Throughput: 0: 816.4, 1: 817.8. Samples: 1580822. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:34:55,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.890')] +[2023-09-25 20:35:00,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 6369280. Throughput: 0: 818.6, 1: 818.5. Samples: 1590590. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:35:00,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.890')] +[2023-09-25 20:35:03,019][109225] Updated weights for policy 0, policy_version 12480 (0.0018) +[2023-09-25 20:35:03,019][109224] Updated weights for policy 1, policy_version 12480 (0.0015) +[2023-09-25 20:35:05,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 6397952. Throughput: 0: 814.5, 1: 815.0. Samples: 1600073. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:35:05,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.900')] +[2023-09-25 20:35:10,470][108279] Fps is (10 sec: 6144.1, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 6430720. Throughput: 0: 814.3, 1: 814.4. Samples: 1605077. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:35:10,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.900')] +[2023-09-25 20:35:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 6463488. Throughput: 0: 815.2, 1: 817.0. Samples: 1614678. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:35:15,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.900')] +[2023-09-25 20:35:15,670][109225] Updated weights for policy 0, policy_version 12640 (0.0013) +[2023-09-25 20:35:15,670][109224] Updated weights for policy 1, policy_version 12640 (0.0018) +[2023-09-25 20:35:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 6496256. Throughput: 0: 817.2, 1: 817.0. Samples: 1624459. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:35:20,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.900')] +[2023-09-25 20:35:25,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 6529024. Throughput: 0: 818.5, 1: 817.1. Samples: 1629616. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:35:25,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.900')] +[2023-09-25 20:35:28,201][109224] Updated weights for policy 1, policy_version 12800 (0.0017) +[2023-09-25 20:35:28,201][109225] Updated weights for policy 0, policy_version 12800 (0.0019) +[2023-09-25 20:35:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6561792. Throughput: 0: 816.4, 1: 815.6. Samples: 1639159. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:35:30,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.900')] +[2023-09-25 20:35:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6594560. Throughput: 0: 813.0, 1: 812.4. Samples: 1648735. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:35:35,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.890')] +[2023-09-25 20:35:40,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6627328. Throughput: 0: 810.8, 1: 811.5. Samples: 1653828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:35:40,470][108279] Avg episode reward: [(0, '9.220'), (1, '9.890')] +[2023-09-25 20:35:40,868][109224] Updated weights for policy 1, policy_version 12960 (0.0017) +[2023-09-25 20:35:40,869][109225] Updated weights for policy 0, policy_version 12960 (0.0017) +[2023-09-25 20:35:45,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6660096. Throughput: 0: 809.2, 1: 809.4. Samples: 1663430. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-25 20:35:45,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.900')]
+[2023-09-25 20:35:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6692864. Throughput: 0: 814.0, 1: 811.5. Samples: 1673220. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-25 20:35:50,470][108279] Avg episode reward: [(0, '9.250'), (1, '9.900')]
+[2023-09-25 20:35:53,473][109224] Updated weights for policy 1, policy_version 13120 (0.0015)
+[2023-09-25 20:35:53,473][109225] Updated weights for policy 0, policy_version 13120 (0.0017)
+[2023-09-25 20:35:55,470][108279] Fps is (10 sec: 6553.9, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 6725632. Throughput: 0: 810.7, 1: 810.7. Samples: 1678040. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:35:55,470][108279] Avg episode reward: [(0, '9.250'), (1, '9.890')]
+[2023-09-25 20:36:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6485.4, 300 sec: 6553.6). Total num frames: 6758400. Throughput: 0: 812.6, 1: 812.0. Samples: 1687785. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:36:00,470][108279] Avg episode reward: [(0, '9.240'), (1, '9.890')]
+[2023-09-25 20:36:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 6791168. Throughput: 0: 815.9, 1: 813.7. Samples: 1697792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:36:05,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.890')]
+[2023-09-25 20:36:06,027][109224] Updated weights for policy 1, policy_version 13280 (0.0018)
+[2023-09-25 20:36:06,027][109225] Updated weights for policy 0, policy_version 13280 (0.0018)
+[2023-09-25 20:36:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 6823936. Throughput: 0: 809.9, 1: 809.9. Samples: 1702508. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:36:10,470][108279] Avg episode reward: [(0, '9.210'), (1, '9.870')]
+[2023-09-25 20:36:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 6856704. Throughput: 0: 811.8, 1: 809.8. Samples: 1712132. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:36:15,470][108279] Avg episode reward: [(0, '9.170'), (1, '9.870')]
+[2023-09-25 20:36:18,714][109224] Updated weights for policy 1, policy_version 13440 (0.0018)
+[2023-09-25 20:36:18,714][109225] Updated weights for policy 0, policy_version 13440 (0.0017)
+[2023-09-25 20:36:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6889472. Throughput: 0: 815.2, 1: 815.7. Samples: 1722125. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:36:20,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.860')]
+[2023-09-25 20:36:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6922240. Throughput: 0: 811.2, 1: 811.0. Samples: 1726829. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:36:25,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.860')]
+[2023-09-25 20:36:30,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6955008. Throughput: 0: 815.4, 1: 813.0. Samples: 1736709. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:36:30,471][108279] Avg episode reward: [(0, '9.170'), (1, '9.850')]
+[2023-09-25 20:36:30,481][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000013584_3477504.pth...
+[2023-09-25 20:36:30,481][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000013584_3477504.pth...
+[2023-09-25 20:36:30,516][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000010528_2695168.pth
+[2023-09-25 20:36:30,520][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000010528_2695168.pth
+[2023-09-25 20:36:31,117][109224] Updated weights for policy 1, policy_version 13600 (0.0018)
+[2023-09-25 20:36:31,118][109225] Updated weights for policy 0, policy_version 13600 (0.0016)
+[2023-09-25 20:36:35,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 6987776. Throughput: 0: 816.1, 1: 819.1. Samples: 1746807. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 20:36:35,471][108279] Avg episode reward: [(0, '9.170'), (1, '9.850')]
+[2023-09-25 20:36:40,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7020544. Throughput: 0: 812.7, 1: 812.5. Samples: 1751174. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 20:36:40,470][108279] Avg episode reward: [(0, '9.170'), (1, '9.830')]
+[2023-09-25 20:36:43,783][109224] Updated weights for policy 1, policy_version 13760 (0.0015)
+[2023-09-25 20:36:43,783][109225] Updated weights for policy 0, policy_version 13760 (0.0017)
+[2023-09-25 20:36:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7053312. Throughput: 0: 817.9, 1: 815.3. Samples: 1761277. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-25 20:36:45,471][108279] Avg episode reward: [(0, '9.150'), (1, '9.830')]
+[2023-09-25 20:36:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 7086080. Throughput: 0: 814.1, 1: 816.7. Samples: 1771176. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-25 20:36:50,470][108279] Avg episode reward: [(0, '9.130'), (1, '9.800')]
+[2023-09-25 20:36:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7118848. Throughput: 0: 813.9, 1: 814.1. Samples: 1775767. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:36:55,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.800')]
+[2023-09-25 20:36:56,323][109225] Updated weights for policy 0, policy_version 13920 (0.0017)
+[2023-09-25 20:36:56,324][109224] Updated weights for policy 1, policy_version 13920 (0.0018)
+[2023-09-25 20:37:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7151616. Throughput: 0: 817.2, 1: 819.1. Samples: 1785766. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:00,470][108279] Avg episode reward: [(0, '9.110'), (1, '9.800')]
+[2023-09-25 20:37:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 7184384. Throughput: 0: 818.0, 1: 818.0. Samples: 1795743. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:05,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.800')]
+[2023-09-25 20:37:08,897][109225] Updated weights for policy 0, policy_version 14080 (0.0015)
+[2023-09-25 20:37:08,897][109224] Updated weights for policy 1, policy_version 14080 (0.0018)
+[2023-09-25 20:37:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 7217152. Throughput: 0: 816.5, 1: 813.9. Samples: 1800200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:10,470][108279] Avg episode reward: [(0, '9.110'), (1, '9.750')]
+[2023-09-25 20:37:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7249920. Throughput: 0: 816.4, 1: 819.1. Samples: 1810307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:15,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.750')]
+[2023-09-25 20:37:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7282688. Throughput: 0: 810.5, 1: 810.0. Samples: 1819729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:20,470][108279] Avg episode reward: [(0, '9.120'), (1, '9.680')]
+[2023-09-25 20:37:21,516][109225] Updated weights for policy 0, policy_version 14240 (0.0017)
+[2023-09-25 20:37:21,517][109224] Updated weights for policy 1, policy_version 14240 (0.0019)
+[2023-09-25 20:37:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7315456. Throughput: 0: 819.0, 1: 816.5. Samples: 1824768. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:25,470][108279] Avg episode reward: [(0, '9.090'), (1, '9.680')]
+[2023-09-25 20:37:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7348224. Throughput: 0: 815.2, 1: 817.3. Samples: 1834737. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:30,470][108279] Avg episode reward: [(0, '9.080'), (1, '9.630')]
+[2023-09-25 20:37:33,963][109224] Updated weights for policy 1, policy_version 14400 (0.0017)
+[2023-09-25 20:37:33,963][109225] Updated weights for policy 0, policy_version 14400 (0.0018)
+[2023-09-25 20:37:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7380992. Throughput: 0: 814.5, 1: 814.1. Samples: 1844462. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:35,470][108279] Avg episode reward: [(0, '9.070'), (1, '9.630')]
+[2023-09-25 20:37:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7413760. Throughput: 0: 818.4, 1: 816.3. Samples: 1849327. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:40,470][108279] Avg episode reward: [(0, '9.090'), (1, '9.590')]
+[2023-09-25 20:37:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7446528. Throughput: 0: 812.8, 1: 812.9. Samples: 1858920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:45,470][108279] Avg episode reward: [(0, '9.110'), (1, '9.600')]
+[2023-09-25 20:37:46,755][109225] Updated weights for policy 0, policy_version 14560 (0.0018)
+[2023-09-25 20:37:46,755][109224] Updated weights for policy 1, policy_version 14560 (0.0018)
+[2023-09-25 20:37:50,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 7471104. Throughput: 0: 807.7, 1: 808.2. Samples: 1868460. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:50,471][108279] Avg episode reward: [(0, '9.100'), (1, '9.550')]
+[2023-09-25 20:37:55,470][108279] Fps is (10 sec: 6143.9, 60 sec: 6485.3, 300 sec: 6511.9). Total num frames: 7507968. Throughput: 0: 814.0, 1: 817.5. Samples: 1873618. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:37:55,471][108279] Avg episode reward: [(0, '9.090'), (1, '9.560')]
+[2023-09-25 20:37:59,125][109225] Updated weights for policy 0, policy_version 14720 (0.0016)
+[2023-09-25 20:37:59,125][109224] Updated weights for policy 1, policy_version 14720 (0.0016)
+[2023-09-25 20:38:00,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 7544832. Throughput: 0: 813.8, 1: 813.5. Samples: 1883535. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:00,471][108279] Avg episode reward: [(0, '9.080'), (1, '9.560')]
+[2023-09-25 20:38:05,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 7577600. Throughput: 0: 818.3, 1: 818.5. Samples: 1893388. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:38:05,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.560')]
+[2023-09-25 20:38:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7610368. Throughput: 0: 819.2, 1: 819.2. Samples: 1898496. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:38:10,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.550')]
+[2023-09-25 20:38:11,542][109225] Updated weights for policy 0, policy_version 14880 (0.0016)
+[2023-09-25 20:38:11,542][109224] Updated weights for policy 1, policy_version 14880 (0.0015)
+[2023-09-25 20:38:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7643136. Throughput: 0: 817.6, 1: 817.7. Samples: 1908326. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:15,470][108279] Avg episode reward: [(0, '9.160'), (1, '9.550')]
+[2023-09-25 20:38:20,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7675904. Throughput: 0: 818.4, 1: 818.5. Samples: 1918120. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:20,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.550')]
+[2023-09-25 20:38:24,121][109225] Updated weights for policy 0, policy_version 15040 (0.0016)
+[2023-09-25 20:38:24,121][109224] Updated weights for policy 1, policy_version 15040 (0.0019)
+[2023-09-25 20:38:25,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 7708672. Throughput: 0: 818.0, 1: 819.2. Samples: 1923002. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:25,471][108279] Avg episode reward: [(0, '9.190'), (1, '9.550')]
+[2023-09-25 20:38:30,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 7733248. Throughput: 0: 816.2, 1: 817.3. Samples: 1932429. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:30,470][108279] Avg episode reward: [(0, '9.200'), (1, '9.550')]
+[2023-09-25 20:38:30,489][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000015120_3870720.pth...
+[2023-09-25 20:38:30,516][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000012064_3088384.pth
+[2023-09-25 20:38:30,581][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000015120_3870720.pth...
+[2023-09-25 20:38:30,614][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000012064_3088384.pth
+[2023-09-25 20:38:35,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 7766016. Throughput: 0: 817.4, 1: 816.8. Samples: 1941999. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:35,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.550')]
+[2023-09-25 20:38:36,860][109225] Updated weights for policy 0, policy_version 15200 (0.0016)
+[2023-09-25 20:38:36,861][109224] Updated weights for policy 1, policy_version 15200 (0.0016)
+[2023-09-25 20:38:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 7798784. Throughput: 0: 817.1, 1: 816.3. Samples: 1947121. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:40,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.550')]
+[2023-09-25 20:38:45,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 7831552. Throughput: 0: 813.8, 1: 813.6. Samples: 1956770. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:38:45,470][108279] Avg episode reward: [(0, '9.210'), (1, '9.550')]
+[2023-09-25 20:38:49,344][109224] Updated weights for policy 1, policy_version 15360 (0.0017)
+[2023-09-25 20:38:49,344][109225] Updated weights for policy 0, policy_version 15360 (0.0018)
+[2023-09-25 20:38:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 7864320. Throughput: 0: 813.3, 1: 813.0. Samples: 1966570. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:38:50,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.540')]
+[2023-09-25 20:38:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6485.4, 300 sec: 6498.1). Total num frames: 7897088. Throughput: 0: 812.4, 1: 815.1. Samples: 1971730. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:38:55,470][108279] Avg episode reward: [(0, '9.170'), (1, '9.550')]
+[2023-09-25 20:39:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 7929856. Throughput: 0: 814.2, 1: 813.2. Samples: 1981556. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-25 20:39:00,471][108279] Avg episode reward: [(0, '9.190'), (1, '9.550')]
+[2023-09-25 20:39:01,917][109224] Updated weights for policy 1, policy_version 15520 (0.0014)
+[2023-09-25 20:39:01,918][109225] Updated weights for policy 0, policy_version 15520 (0.0018)
+[2023-09-25 20:39:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 7962624. Throughput: 0: 809.2, 1: 809.4. Samples: 1990954. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-25 20:39:05,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.550')]
+[2023-09-25 20:39:10,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 7995392. Throughput: 0: 812.6, 1: 813.6. Samples: 1996180. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:39:10,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.550')]
+[2023-09-25 20:39:14,359][109224] Updated weights for policy 1, policy_version 15680 (0.0018)
+[2023-09-25 20:39:14,359][109225] Updated weights for policy 0, policy_version 15680 (0.0016)
+[2023-09-25 20:39:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 8028160. Throughput: 0: 817.6, 1: 817.0. Samples: 2005987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:39:15,471][108279] Avg episode reward: [(0, '9.150'), (1, '9.550')]
+[2023-09-25 20:39:20,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 8060928. Throughput: 0: 819.7, 1: 819.9. Samples: 2015780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:39:20,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.550')]
+[2023-09-25 20:39:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 8093696. Throughput: 0: 820.7, 1: 820.0. Samples: 2020954. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:39:25,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.550')]
+[2023-09-25 20:39:26,790][109225] Updated weights for policy 0, policy_version 15840 (0.0016)
+[2023-09-25 20:39:26,791][109224] Updated weights for policy 1, policy_version 15840 (0.0017)
+[2023-09-25 20:39:30,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6621.8, 300 sec: 6539.7). Total num frames: 8130560. Throughput: 0: 821.7, 1: 822.1. Samples: 2030742. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:39:30,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.550')]
+[2023-09-25 20:39:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8159232. Throughput: 0: 819.8, 1: 819.6. Samples: 2040343. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:39:35,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.520')]
+[2023-09-25 20:39:39,278][109225] Updated weights for policy 0, policy_version 16000 (0.0016)
+[2023-09-25 20:39:39,278][109224] Updated weights for policy 1, policy_version 16000 (0.0015)
+[2023-09-25 20:39:40,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8192000. Throughput: 0: 819.9, 1: 819.5. Samples: 2045503. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:39:40,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.520')]
+[2023-09-25 20:39:45,470][108279] Fps is (10 sec: 7372.7, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 8232960. Throughput: 0: 819.4, 1: 820.7. Samples: 2055362. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:39:45,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.500')]
+[2023-09-25 20:39:50,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6690.1, 300 sec: 6525.8). Total num frames: 8265728. Throughput: 0: 824.4, 1: 824.6. Samples: 2065159. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-25 20:39:50,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.500')]
+[2023-09-25 20:39:51,693][109225] Updated weights for policy 0, policy_version 16160 (0.0016)
+[2023-09-25 20:39:51,693][109224] Updated weights for policy 1, policy_version 16160 (0.0018)
+[2023-09-25 20:39:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6690.1, 300 sec: 6539.7). Total num frames: 8298496. Throughput: 0: 824.2, 1: 823.2. Samples: 2070315. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-25 20:39:55,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.500')]
+[2023-09-25 20:40:00,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8323072. Throughput: 0: 823.0, 1: 822.9. Samples: 2080050. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-25 20:40:00,471][108279] Avg episode reward: [(0, '9.140'), (1, '9.500')]
+[2023-09-25 20:40:04,432][109225] Updated weights for policy 0, policy_version 16320 (0.0017)
+[2023-09-25 20:40:04,432][109224] Updated weights for policy 1, policy_version 16320 (0.0017)
+[2023-09-25 20:40:05,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8355840. Throughput: 0: 816.7, 1: 816.9. Samples: 2089292. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-25 20:40:05,471][108279] Avg episode reward: [(0, '9.160'), (1, '9.500')]
+[2023-09-25 20:40:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8388608. Throughput: 0: 816.8, 1: 817.2. Samples: 2094484. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-25 20:40:10,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.500')]
+[2023-09-25 20:40:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8421376. Throughput: 0: 816.8, 1: 816.3. Samples: 2104232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:15,471][108279] Avg episode reward: [(0, '9.190'), (1, '9.490')]
+[2023-09-25 20:40:16,835][109224] Updated weights for policy 1, policy_version 16480 (0.0015)
+[2023-09-25 20:40:16,835][109225] Updated weights for policy 0, policy_version 16480 (0.0017)
+[2023-09-25 20:40:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8454144. Throughput: 0: 819.1, 1: 819.2. Samples: 2114069. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:20,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.480')]
+[2023-09-25 20:40:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8486912. Throughput: 0: 819.5, 1: 819.9. Samples: 2119275. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:25,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.470')]
+[2023-09-25 20:40:29,325][109224] Updated weights for policy 1, policy_version 16640 (0.0017)
+[2023-09-25 20:40:29,325][109225] Updated weights for policy 0, policy_version 16640 (0.0017)
+[2023-09-25 20:40:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 8519680. Throughput: 0: 817.4, 1: 817.4. Samples: 2128926. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:30,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.480')]
+[2023-09-25 20:40:30,542][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000016656_4263936.pth...
+[2023-09-25 20:40:30,549][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000016656_4263936.pth...
+[2023-09-25 20:40:30,571][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000013584_3477504.pth
+[2023-09-25 20:40:30,577][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000013584_3477504.pth
+[2023-09-25 20:40:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8552448. Throughput: 0: 815.7, 1: 815.5. Samples: 2138563. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:35,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.470')]
+[2023-09-25 20:40:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8585216. Throughput: 0: 816.2, 1: 816.8. Samples: 2143802. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:40:40,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.440')]
+[2023-09-25 20:40:41,785][109224] Updated weights for policy 1, policy_version 16800 (0.0017)
+[2023-09-25 20:40:41,785][109225] Updated weights for policy 0, policy_version 16800 (0.0017)
+[2023-09-25 20:40:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 8617984. Throughput: 0: 816.6, 1: 817.8. Samples: 2153600. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-25 20:40:45,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.440')]
+[2023-09-25 20:40:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 8650752. Throughput: 0: 822.9, 1: 822.3. Samples: 2163325. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:50,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.440')]
+[2023-09-25 20:40:54,357][109224] Updated weights for policy 1, policy_version 16960 (0.0016)
+[2023-09-25 20:40:54,357][109225] Updated weights for policy 0, policy_version 16960 (0.0018)
+[2023-09-25 20:40:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 8683520. Throughput: 0: 819.9, 1: 820.0. Samples: 2168278. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:40:55,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.450')]
+[2023-09-25 20:41:00,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8716288. Throughput: 0: 819.5, 1: 819.4. Samples: 2177982. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:00,470][108279] Avg episode reward: [(0, '9.220'), (1, '9.460')]
+[2023-09-25 20:41:05,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8749056. Throughput: 0: 816.4, 1: 816.7. Samples: 2187556. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:05,470][108279] Avg episode reward: [(0, '9.230'), (1, '9.470')]
+[2023-09-25 20:41:06,936][109224] Updated weights for policy 1, policy_version 17120 (0.0015)
+[2023-09-25 20:41:06,937][109225] Updated weights for policy 0, policy_version 17120 (0.0018)
+[2023-09-25 20:41:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8781824. Throughput: 0: 816.1, 1: 815.9. Samples: 2192715. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:10,470][108279] Avg episode reward: [(0, '9.220'), (1, '9.500')]
+[2023-09-25 20:41:15,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8814592. Throughput: 0: 817.1, 1: 817.1. Samples: 2202464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:15,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.500')]
+[2023-09-25 20:41:19,549][109225] Updated weights for policy 0, policy_version 17280 (0.0015)
+[2023-09-25 20:41:19,549][109224] Updated weights for policy 1, policy_version 17280 (0.0017)
+[2023-09-25 20:41:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8847360. Throughput: 0: 815.6, 1: 815.1. Samples: 2211943. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:20,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.510')]
+[2023-09-25 20:41:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8880128. Throughput: 0: 811.9, 1: 811.1. Samples: 2216838. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:25,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.510')]
+[2023-09-25 20:41:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8912896. Throughput: 0: 808.3, 1: 805.3. Samples: 2226214. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-25 20:41:30,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.550')]
+[2023-09-25 20:41:32,444][109224] Updated weights for policy 1, policy_version 17440 (0.0016)
+[2023-09-25 20:41:32,445][109225] Updated weights for policy 0, policy_version 17440 (0.0017)
+[2023-09-25 20:41:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8945664. Throughput: 0: 810.2, 1: 810.0. Samples: 2236235. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-25 20:41:35,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.550')]
+[2023-09-25 20:41:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 8978432. Throughput: 0: 805.6, 1: 805.2. Samples: 2240765. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:40,470][108279] Avg episode reward: [(0, '9.280'), (1, '9.600')]
+[2023-09-25 20:41:44,985][109224] Updated weights for policy 1, policy_version 17600 (0.0016)
+[2023-09-25 20:41:44,985][109225] Updated weights for policy 0, policy_version 17600 (0.0016)
+[2023-09-25 20:41:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9011200. Throughput: 0: 809.6, 1: 807.5. Samples: 2250752. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:45,470][108279] Avg episode reward: [(0, '9.280'), (1, '9.600')]
+[2023-09-25 20:41:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9043968. Throughput: 0: 812.0, 1: 810.4. Samples: 2260563. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:41:50,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.640')]
+[2023-09-25 20:41:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9076736. Throughput: 0: 805.6, 1: 805.1. Samples: 2265194. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:41:55,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.640')]
+[2023-09-25 20:41:57,617][109224] Updated weights for policy 1, policy_version 17760 (0.0019)
+[2023-09-25 20:41:57,617][109225] Updated weights for policy 0, policy_version 17760 (0.0019)
+[2023-09-25 20:42:00,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.5, 300 sec: 6525.8). Total num frames: 9109504. Throughput: 0: 809.5, 1: 808.3. Samples: 2275263. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:42:00,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.680')]
+[2023-09-25 20:42:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9142272. Throughput: 0: 808.5, 1: 809.0. Samples: 2284730. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:42:05,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.680')]
+[2023-09-25 20:42:10,117][109224] Updated weights for policy 1, policy_version 17920 (0.0015)
+[2023-09-25 20:42:10,118][109225] Updated weights for policy 0, policy_version 17920 (0.0019)
+[2023-09-25 20:42:10,470][108279] Fps is (10 sec: 6553.9, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9175040. Throughput: 0: 809.9, 1: 808.6. Samples: 2289668. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:42:10,470][108279] Avg episode reward: [(0, '9.250'), (1, '9.690')]
+[2023-09-25 20:42:15,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9207808. Throughput: 0: 818.4, 1: 818.4. Samples: 2299870. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:42:15,470][108279] Avg episode reward: [(0, '9.250'), (1, '9.720')]
+[2023-09-25 20:42:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9240576. Throughput: 0: 814.4, 1: 814.9. Samples: 2309553. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:42:20,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.720')]
+[2023-09-25 20:42:22,557][109224] Updated weights for policy 1, policy_version 18080 (0.0017)
+[2023-09-25 20:42:22,557][109225] Updated weights for policy 0, policy_version 18080 (0.0018)
+[2023-09-25 20:42:25,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9273344. Throughput: 0: 817.8, 1: 817.3. Samples: 2314342. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:42:25,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.720')]
+[2023-09-25 20:42:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9306112. Throughput: 0: 819.2, 1: 819.2. Samples: 2324480. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:42:30,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.730')]
+[2023-09-25 20:42:30,481][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000018176_4653056.pth...
+[2023-09-25 20:42:30,482][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000018176_4653056.pth...
+[2023-09-25 20:42:30,510][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000015120_3870720.pth
+[2023-09-25 20:42:30,523][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000015120_3870720.pth
+[2023-09-25 20:42:35,099][109225] Updated weights for policy 0, policy_version 18240 (0.0016)
+[2023-09-25 20:42:35,100][109224] Updated weights for policy 1, policy_version 18240 (0.0014)
+[2023-09-25 20:42:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9338880. Throughput: 0: 817.2, 1: 818.5. Samples: 2334171. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-25 20:42:35,470][108279] Avg episode reward: [(0, '9.260'), (1, '9.730')]
+[2023-09-25 20:42:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9371648. Throughput: 0: 819.1, 1: 818.3. Samples: 2338879. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-25 20:42:40,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.730')]
+[2023-09-25 20:42:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 9404416. Throughput: 0: 820.0, 1: 819.2. Samples: 2349026. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-25 20:42:45,471][108279] Avg episode reward: [(0, '9.220'), (1, '9.730')]
+[2023-09-25 20:42:47,523][109224] Updated weights for policy 1, policy_version 18400 (0.0017)
+[2023-09-25 20:42:47,523][109225] Updated weights for policy 0, policy_version 18400 (0.0016)
+[2023-09-25 20:42:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 9437184. Throughput: 0: 823.3, 1: 823.9. Samples: 2358856. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:42:50,470][108279] Avg episode reward: [(0, '9.230'), (1, '9.730')]
+[2023-09-25 20:42:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9469952. Throughput: 0: 819.2, 1: 819.2. Samples: 2363397. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:42:55,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.730')]
+[2023-09-25 20:43:00,224][109224] Updated weights for policy 1, policy_version 18560 (0.0015)
+[2023-09-25 20:43:00,224][109225] Updated weights for policy 0, policy_version 18560 (0.0018)
+[2023-09-25 20:43:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9502720. Throughput: 0: 815.9, 1: 818.8. Samples: 2373432. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:00,470][108279] Avg episode reward: [(0, '9.250'), (1, '9.730')]
+[2023-09-25 20:43:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9535488. Throughput: 0: 817.3, 1: 816.5. Samples: 2383074. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:05,470][108279] Avg episode reward: [(0, '9.260'), (1, '9.730')]
+[2023-09-25 20:43:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9568256. Throughput: 0: 818.0, 1: 817.0. Samples: 2387919. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:10,471][108279] Avg episode reward: [(0, '9.290'), (1, '9.740')]
+[2023-09-25 20:43:12,842][109224] Updated weights for policy 1, policy_version 18720 (0.0017)
+[2023-09-25 20:43:12,842][109225] Updated weights for policy 0, policy_version 18720 (0.0017)
+[2023-09-25 20:43:15,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9601024. Throughput: 0: 811.9, 1: 815.1. Samples: 2397695. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:15,471][108279] Avg episode reward: [(0, '9.300'), (1, '9.730')]
+[2023-09-25 20:43:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 9633792. Throughput: 0: 813.1, 1: 814.0. Samples: 2407389. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:20,470][108279] Avg episode reward: [(0, '9.320'), (1, '9.730')]
+[2023-09-25 20:43:25,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9658368. Throughput: 0: 814.9, 1: 817.0. Samples: 2412317. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:25,471][108279] Avg episode reward: [(0, '9.330'), (1, '9.730')]
+[2023-09-25 20:43:25,527][109224] Updated weights for policy 1, policy_version 18880 (0.0017)
+[2023-09-25 20:43:25,527][109225] Updated weights for policy 0, policy_version 18880 (0.0018)
+[2023-09-25 20:43:30,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9691136. Throughput: 0: 809.1, 1: 811.2. Samples: 2421940. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:30,471][108279] Avg episode reward: [(0, '9.390'), (1, '9.740')]
+[2023-09-25 20:43:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 9723904. Throughput: 0: 807.5, 1: 806.6. Samples: 2431488. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:35,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.750')]
+[2023-09-25 20:43:38,042][109224] Updated weights for policy 1, policy_version 19040 (0.0015)
+[2023-09-25 20:43:38,044][109225] Updated weights for policy 0, policy_version 19040 (0.0016)
+[2023-09-25 20:43:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9756672. Throughput: 0: 812.8, 1: 815.4. Samples: 2436669. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:43:40,471][108279] Avg episode reward: [(0, '9.430'), (1, '9.750')]
+[2023-09-25 20:43:45,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 9797632. Throughput: 0: 814.1, 1: 812.0. Samples: 2446606. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:43:45,471][108279] Avg episode reward: [(0, '9.450'), (1, '9.750')]
+[2023-09-25 20:43:50,432][109225] Updated weights for policy 0, policy_version 19200 (0.0018)
+[2023-09-25 20:43:50,432][109224] Updated weights for policy 1, policy_version 19200 (0.0015)
+[2023-09-25 20:43:50,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 9830400. Throughput: 0: 814.1, 1: 814.9. Samples: 2456379. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:43:50,471][108279] Avg episode reward: [(0, '9.460'), (1, '9.750')]
+[2023-09-25 20:43:55,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9854976. Throughput: 0: 816.4, 1: 818.5. Samples: 2461488. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:43:55,471][108279] Avg episode reward: [(0, '9.460'), (1, '9.780')]
+[2023-09-25 20:44:00,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9887744. Throughput: 0: 817.2, 1: 816.1. Samples: 2471196. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:44:00,470][108279] Avg episode reward: [(0, '9.460'), (1, '9.780')]
+[2023-09-25 20:44:02,955][109224] Updated weights for policy 1, policy_version 19360 (0.0018)
+[2023-09-25 20:44:02,956][109225] Updated weights for policy 0, policy_version 19360 (0.0017)
+[2023-09-25 20:44:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 9920512. Throughput: 0: 817.1, 1: 815.3. Samples: 2480846.
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:44:05,471][108279] Avg episode reward: [(0, '9.460'), (1, '9.790')] +[2023-09-25 20:44:10,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9953280. Throughput: 0: 818.8, 1: 817.5. Samples: 2485952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:10,471][108279] Avg episode reward: [(0, '9.460'), (1, '9.790')] +[2023-09-25 20:44:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 9986048. Throughput: 0: 819.3, 1: 818.4. Samples: 2495636. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:15,471][108279] Avg episode reward: [(0, '9.440'), (1, '9.800')] +[2023-09-25 20:44:15,609][109224] Updated weights for policy 1, policy_version 19520 (0.0016) +[2023-09-25 20:44:15,611][109225] Updated weights for policy 0, policy_version 19520 (0.0018) +[2023-09-25 20:44:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 10018816. Throughput: 0: 819.3, 1: 819.7. Samples: 2505244. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:20,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.800')] +[2023-09-25 20:44:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6512.0). Total num frames: 10051584. Throughput: 0: 819.8, 1: 819.6. Samples: 2510438. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:25,470][108279] Avg episode reward: [(0, '9.410'), (1, '9.800')] +[2023-09-25 20:44:28,293][109225] Updated weights for policy 0, policy_version 19680 (0.0018) +[2023-09-25 20:44:28,293][109224] Updated weights for policy 1, policy_version 19680 (0.0017) +[2023-09-25 20:44:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10084352. Throughput: 0: 810.8, 1: 812.1. Samples: 2519635. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:30,471][108279] Avg episode reward: [(0, '9.400'), (1, '9.800')] +[2023-09-25 20:44:30,484][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000019696_5042176.pth... +[2023-09-25 20:44:30,484][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000019696_5042176.pth... +[2023-09-25 20:44:30,519][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000016656_4263936.pth +[2023-09-25 20:44:30,519][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000016656_4263936.pth +[2023-09-25 20:44:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10117120. Throughput: 0: 811.3, 1: 808.8. Samples: 2529285. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:35,471][108279] Avg episode reward: [(0, '9.380'), (1, '9.820')] +[2023-09-25 20:44:40,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 10149888. Throughput: 0: 806.6, 1: 806.4. Samples: 2534076. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:40,471][108279] Avg episode reward: [(0, '9.400'), (1, '9.830')] +[2023-09-25 20:44:40,953][109224] Updated weights for policy 1, policy_version 19840 (0.0015) +[2023-09-25 20:44:40,954][109225] Updated weights for policy 0, policy_version 19840 (0.0016) +[2023-09-25 20:44:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 10182656. Throughput: 0: 807.2, 1: 807.7. Samples: 2543866. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:45,470][108279] Avg episode reward: [(0, '9.450'), (1, '9.840')] +[2023-09-25 20:44:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 10215424. Throughput: 0: 811.4, 1: 810.4. Samples: 2553826. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:50,470][108279] Avg episode reward: [(0, '9.430'), (1, '9.840')] +[2023-09-25 20:44:53,543][109224] Updated weights for policy 1, policy_version 20000 (0.0017) +[2023-09-25 20:44:53,543][109225] Updated weights for policy 0, policy_version 20000 (0.0019) +[2023-09-25 20:44:55,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10248192. Throughput: 0: 806.9, 1: 807.2. Samples: 2558588. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:44:55,471][108279] Avg episode reward: [(0, '9.430'), (1, '9.840')] +[2023-09-25 20:45:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10280960. Throughput: 0: 807.9, 1: 808.5. Samples: 2568376. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:45:00,471][108279] Avg episode reward: [(0, '9.410'), (1, '9.850')] +[2023-09-25 20:45:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10313728. Throughput: 0: 814.5, 1: 811.9. Samples: 2578432. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:45:05,471][108279] Avg episode reward: [(0, '9.410'), (1, '9.850')] +[2023-09-25 20:45:06,166][109224] Updated weights for policy 1, policy_version 20160 (0.0018) +[2023-09-25 20:45:06,166][109225] Updated weights for policy 0, policy_version 20160 (0.0016) +[2023-09-25 20:45:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10346496. Throughput: 0: 804.6, 1: 804.8. Samples: 2582861. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:45:10,470][108279] Avg episode reward: [(0, '9.380'), (1, '9.860')] +[2023-09-25 20:45:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10379264. Throughput: 0: 811.1, 1: 809.4. Samples: 2592556. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:45:15,471][108279] Avg episode reward: [(0, '9.390'), (1, '9.860')] +[2023-09-25 20:45:19,078][109224] Updated weights for policy 1, policy_version 20320 (0.0017) +[2023-09-25 20:45:19,078][109225] Updated weights for policy 0, policy_version 20320 (0.0017) +[2023-09-25 20:45:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10412032. Throughput: 0: 807.0, 1: 809.7. Samples: 2602035. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:45:20,471][108279] Avg episode reward: [(0, '9.390'), (1, '9.860')] +[2023-09-25 20:45:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10444800. Throughput: 0: 810.2, 1: 809.9. Samples: 2606983. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:45:25,471][108279] Avg episode reward: [(0, '9.380'), (1, '9.860')] +[2023-09-25 20:45:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10477568. Throughput: 0: 809.7, 1: 809.1. Samples: 2616712. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:45:30,471][108279] Avg episode reward: [(0, '9.370'), (1, '9.860')] +[2023-09-25 20:45:31,634][109224] Updated weights for policy 1, policy_version 20480 (0.0018) +[2023-09-25 20:45:31,634][109225] Updated weights for policy 0, policy_version 20480 (0.0017) +[2023-09-25 20:45:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10510336. Throughput: 0: 807.3, 1: 808.8. Samples: 2626550. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:45:35,470][108279] Avg episode reward: [(0, '9.350'), (1, '9.860')] +[2023-09-25 20:45:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10543104. Throughput: 0: 811.1, 1: 810.9. Samples: 2631577. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:45:40,470][108279] Avg episode reward: [(0, '9.320'), (1, '9.850')] +[2023-09-25 20:45:44,059][109224] Updated weights for policy 1, policy_version 20640 (0.0016) +[2023-09-25 20:45:44,060][109225] Updated weights for policy 0, policy_version 20640 (0.0017) +[2023-09-25 20:45:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10575872. Throughput: 0: 811.6, 1: 812.2. Samples: 2641444. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-25 20:45:45,470][108279] Avg episode reward: [(0, '9.310'), (1, '9.840')] +[2023-09-25 20:45:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10608640. Throughput: 0: 809.4, 1: 811.3. Samples: 2651364. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:45:50,470][108279] Avg episode reward: [(0, '9.280'), (1, '9.850')] +[2023-09-25 20:45:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10641408. Throughput: 0: 815.8, 1: 814.1. Samples: 2656208. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:45:55,471][108279] Avg episode reward: [(0, '9.270'), (1, '9.860')] +[2023-09-25 20:45:56,531][109225] Updated weights for policy 0, policy_version 20800 (0.0015) +[2023-09-25 20:45:56,531][109224] Updated weights for policy 1, policy_version 20800 (0.0017) +[2023-09-25 20:46:00,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10674176. Throughput: 0: 816.4, 1: 817.1. Samples: 2666063. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:00,471][108279] Avg episode reward: [(0, '9.290'), (1, '9.870')] +[2023-09-25 20:46:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10706944. Throughput: 0: 819.7, 1: 819.4. Samples: 2675791. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:05,471][108279] Avg episode reward: [(0, '9.290'), (1, '9.860')] +[2023-09-25 20:46:09,006][109224] Updated weights for policy 1, policy_version 20960 (0.0017) +[2023-09-25 20:46:09,006][109225] Updated weights for policy 0, policy_version 20960 (0.0017) +[2023-09-25 20:46:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10739712. Throughput: 0: 821.9, 1: 819.2. Samples: 2680832. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:10,471][108279] Avg episode reward: [(0, '9.280'), (1, '9.860')] +[2023-09-25 20:46:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10772480. Throughput: 0: 823.8, 1: 824.6. Samples: 2690892. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:15,471][108279] Avg episode reward: [(0, '9.290'), (1, '9.870')] +[2023-09-25 20:46:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10805248. Throughput: 0: 822.6, 1: 823.1. Samples: 2700607. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:46:20,471][108279] Avg episode reward: [(0, '9.300'), (1, '9.840')] +[2023-09-25 20:46:21,455][109225] Updated weights for policy 0, policy_version 21120 (0.0018) +[2023-09-25 20:46:21,455][109224] Updated weights for policy 1, policy_version 21120 (0.0018) +[2023-09-25 20:46:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10838016. Throughput: 0: 821.5, 1: 819.2. Samples: 2705409. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:46:25,470][108279] Avg episode reward: [(0, '9.280'), (1, '9.850')] +[2023-09-25 20:46:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10870784. Throughput: 0: 822.8, 1: 822.4. Samples: 2715478. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:46:30,470][108279] Avg episode reward: [(0, '9.280'), (1, '9.840')] +[2023-09-25 20:46:30,480][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000021232_5435392.pth... +[2023-09-25 20:46:30,480][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000021232_5435392.pth... +[2023-09-25 20:46:30,516][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000018176_4653056.pth +[2023-09-25 20:46:30,519][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000018176_4653056.pth +[2023-09-25 20:46:33,910][109225] Updated weights for policy 0, policy_version 21280 (0.0018) +[2023-09-25 20:46:33,910][109224] Updated weights for policy 1, policy_version 21280 (0.0017) +[2023-09-25 20:46:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10903552. Throughput: 0: 820.7, 1: 821.3. Samples: 2725255. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:35,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.840')] +[2023-09-25 20:46:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10936320. Throughput: 0: 820.3, 1: 819.3. Samples: 2729988. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:40,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.850')] +[2023-09-25 20:46:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 10969088. Throughput: 0: 821.1, 1: 822.0. Samples: 2740003. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:45,470][108279] Avg episode reward: [(0, '9.240'), (1, '9.850')] +[2023-09-25 20:46:46,407][109224] Updated weights for policy 1, policy_version 21440 (0.0016) +[2023-09-25 20:46:46,408][109225] Updated weights for policy 0, policy_version 21440 (0.0016) +[2023-09-25 20:46:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). 
Total num frames: 11001856. Throughput: 0: 822.7, 1: 821.6. Samples: 2749784. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:50,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.840')] +[2023-09-25 20:46:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11034624. Throughput: 0: 819.2, 1: 819.2. Samples: 2754561. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:46:55,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.840')] +[2023-09-25 20:46:59,067][109225] Updated weights for policy 0, policy_version 21600 (0.0018) +[2023-09-25 20:46:59,067][109224] Updated weights for policy 1, policy_version 21600 (0.0017) +[2023-09-25 20:47:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11067392. Throughput: 0: 815.8, 1: 816.4. Samples: 2764340. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:47:00,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.830')] +[2023-09-25 20:47:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11100160. Throughput: 0: 813.6, 1: 813.6. Samples: 2773831. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:47:05,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.830')] +[2023-09-25 20:47:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11132928. Throughput: 0: 816.8, 1: 819.2. Samples: 2779030. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:47:10,471][108279] Avg episode reward: [(0, '9.170'), (1, '9.830')] +[2023-09-25 20:47:11,621][109225] Updated weights for policy 0, policy_version 21760 (0.0016) +[2023-09-25 20:47:11,621][109224] Updated weights for policy 1, policy_version 21760 (0.0016) +[2023-09-25 20:47:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11165696. Throughput: 0: 813.8, 1: 813.8. Samples: 2788720. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:47:15,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.830')] +[2023-09-25 20:47:20,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6485.3, 300 sec: 6511.9). Total num frames: 11194368. Throughput: 0: 812.7, 1: 812.9. Samples: 2798407. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:47:20,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.830')] +[2023-09-25 20:47:24,399][109225] Updated weights for policy 0, policy_version 21920 (0.0019) +[2023-09-25 20:47:24,399][109224] Updated weights for policy 1, policy_version 21920 (0.0018) +[2023-09-25 20:47:25,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 11223040. Throughput: 0: 811.6, 1: 814.4. Samples: 2803161. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:47:25,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.830')] +[2023-09-25 20:47:30,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11255808. Throughput: 0: 810.8, 1: 810.6. Samples: 2812963. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:47:30,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.820')] +[2023-09-25 20:47:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11288576. Throughput: 0: 810.3, 1: 811.4. Samples: 2822759. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 20:47:35,471][108279] Avg episode reward: [(0, '9.020'), (1, '9.830')] +[2023-09-25 20:47:36,774][109225] Updated weights for policy 0, policy_version 22080 (0.0016) +[2023-09-25 20:47:36,774][109224] Updated weights for policy 1, policy_version 22080 (0.0016) +[2023-09-25 20:47:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11321344. Throughput: 0: 812.6, 1: 816.0. Samples: 2827848. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:47:40,470][108279] Avg episode reward: [(0, '8.980'), (1, '9.830')] +[2023-09-25 20:47:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 11354112. Throughput: 0: 809.4, 1: 809.1. Samples: 2837172. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:47:45,471][108279] Avg episode reward: [(0, '8.970'), (1, '9.820')] +[2023-09-25 20:47:49,600][109224] Updated weights for policy 1, policy_version 22240 (0.0017) +[2023-09-25 20:47:49,600][109225] Updated weights for policy 0, policy_version 22240 (0.0017) +[2023-09-25 20:47:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11386880. Throughput: 0: 811.3, 1: 809.7. Samples: 2846776. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 20:47:50,471][108279] Avg episode reward: [(0, '8.960'), (1, '9.820')] +[2023-09-25 20:47:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11419648. Throughput: 0: 808.4, 1: 808.9. Samples: 2851808. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:47:55,470][108279] Avg episode reward: [(0, '8.950'), (1, '9.810')] +[2023-09-25 20:48:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11452416. Throughput: 0: 808.8, 1: 808.5. Samples: 2861501. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:00,471][108279] Avg episode reward: [(0, '8.940'), (1, '9.810')] +[2023-09-25 20:48:02,063][109225] Updated weights for policy 0, policy_version 22400 (0.0014) +[2023-09-25 20:48:02,065][109224] Updated weights for policy 1, policy_version 22400 (0.0015) +[2023-09-25 20:48:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11485184. Throughput: 0: 811.3, 1: 809.6. Samples: 2871347. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:05,470][108279] Avg episode reward: [(0, '8.920'), (1, '9.800')] +[2023-09-25 20:48:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11517952. Throughput: 0: 814.5, 1: 814.2. Samples: 2876455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:10,471][108279] Avg episode reward: [(0, '8.910'), (1, '9.800')] +[2023-09-25 20:48:14,519][109224] Updated weights for policy 1, policy_version 22560 (0.0017) +[2023-09-25 20:48:14,520][109225] Updated weights for policy 0, policy_version 22560 (0.0018) +[2023-09-25 20:48:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 11550720. Throughput: 0: 814.6, 1: 814.7. Samples: 2886282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:15,470][108279] Avg episode reward: [(0, '8.900'), (1, '9.800')] +[2023-09-25 20:48:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 11583488. Throughput: 0: 813.6, 1: 811.2. Samples: 2895877. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:20,470][108279] Avg episode reward: [(0, '8.900'), (1, '9.800')] +[2023-09-25 20:48:25,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11616256. Throughput: 0: 812.2, 1: 811.2. Samples: 2900903. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:25,471][108279] Avg episode reward: [(0, '8.910'), (1, '9.810')] +[2023-09-25 20:48:27,259][109225] Updated weights for policy 0, policy_version 22720 (0.0019) +[2023-09-25 20:48:27,259][109224] Updated weights for policy 1, policy_version 22720 (0.0018) +[2023-09-25 20:48:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11649024. Throughput: 0: 813.4, 1: 812.6. Samples: 2910346. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:30,470][108279] Avg episode reward: [(0, '8.900'), (1, '9.810')] +[2023-09-25 20:48:30,478][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000022752_5824512.pth... +[2023-09-25 20:48:30,478][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000022752_5824512.pth... +[2023-09-25 20:48:30,514][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000019696_5042176.pth +[2023-09-25 20:48:30,514][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000019696_5042176.pth +[2023-09-25 20:48:35,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11681792. Throughput: 0: 819.2, 1: 818.0. Samples: 2920448. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:35,470][108279] Avg episode reward: [(0, '8.890'), (1, '9.800')] +[2023-09-25 20:48:39,739][109224] Updated weights for policy 1, policy_version 22880 (0.0015) +[2023-09-25 20:48:39,741][109225] Updated weights for policy 0, policy_version 22880 (0.0017) +[2023-09-25 20:48:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 11714560. Throughput: 0: 815.4, 1: 815.2. Samples: 2925184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:40,470][108279] Avg episode reward: [(0, '8.890'), (1, '9.800')] +[2023-09-25 20:48:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 11747328. Throughput: 0: 815.6, 1: 815.2. Samples: 2934887. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:45,471][108279] Avg episode reward: [(0, '8.880'), (1, '9.800')] +[2023-09-25 20:48:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11780096. Throughput: 0: 818.7, 1: 818.1. Samples: 2945002. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:50,471][108279] Avg episode reward: [(0, '8.880'), (1, '9.800')] +[2023-09-25 20:48:52,365][109224] Updated weights for policy 1, policy_version 23040 (0.0016) +[2023-09-25 20:48:52,365][109225] Updated weights for policy 0, policy_version 23040 (0.0017) +[2023-09-25 20:48:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11812864. Throughput: 0: 812.7, 1: 812.3. Samples: 2949579. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:48:55,471][108279] Avg episode reward: [(0, '8.880'), (1, '9.800')] +[2023-09-25 20:49:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11845632. Throughput: 0: 813.3, 1: 810.6. Samples: 2959360. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:49:00,470][108279] Avg episode reward: [(0, '8.890'), (1, '9.790')] +[2023-09-25 20:49:04,964][109225] Updated weights for policy 0, policy_version 23200 (0.0017) +[2023-09-25 20:49:04,964][109224] Updated weights for policy 1, policy_version 23200 (0.0016) +[2023-09-25 20:49:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11878400. Throughput: 0: 814.8, 1: 817.5. Samples: 2969329. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:49:05,471][108279] Avg episode reward: [(0, '8.930'), (1, '9.800')] +[2023-09-25 20:49:10,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11911168. Throughput: 0: 812.0, 1: 812.2. Samples: 2973993. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 20:49:10,471][108279] Avg episode reward: [(0, '8.930'), (1, '9.800')] +[2023-09-25 20:49:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11943936. Throughput: 0: 818.9, 1: 816.5. Samples: 2983938. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:49:15,471][108279] Avg episode reward: [(0, '8.960'), (1, '9.840')] +[2023-09-25 20:49:17,397][109224] Updated weights for policy 1, policy_version 23360 (0.0014) +[2023-09-25 20:49:17,398][109225] Updated weights for policy 0, policy_version 23360 (0.0018) +[2023-09-25 20:49:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 11976704. Throughput: 0: 815.7, 1: 818.9. Samples: 2994005. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:49:20,470][108279] Avg episode reward: [(0, '8.960'), (1, '9.840')] +[2023-09-25 20:49:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12009472. Throughput: 0: 816.6, 1: 816.5. Samples: 2998672. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:49:25,471][108279] Avg episode reward: [(0, '8.990'), (1, '9.830')] +[2023-09-25 20:49:30,073][109224] Updated weights for policy 1, policy_version 23520 (0.0018) +[2023-09-25 20:49:30,073][109225] Updated weights for policy 0, policy_version 23520 (0.0017) +[2023-09-25 20:49:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12042240. Throughput: 0: 819.1, 1: 817.0. Samples: 3008512. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:49:30,471][108279] Avg episode reward: [(0, '8.980'), (1, '9.830')] +[2023-09-25 20:49:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12075008. Throughput: 0: 812.6, 1: 814.9. Samples: 3018238. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:49:35,470][108279] Avg episode reward: [(0, '9.000'), (1, '9.820')] +[2023-09-25 20:49:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12107776. Throughput: 0: 815.2, 1: 813.5. Samples: 3022871. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-25 20:49:40,471][108279] Avg episode reward: [(0, '9.010'), (1, '9.820')] +[2023-09-25 20:49:42,545][109224] Updated weights for policy 1, policy_version 23680 (0.0016) +[2023-09-25 20:49:42,545][109225] Updated weights for policy 0, policy_version 23680 (0.0017) +[2023-09-25 20:49:45,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12140544. Throughput: 0: 817.1, 1: 819.2. Samples: 3032995. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:49:45,471][108279] Avg episode reward: [(0, '9.060'), (1, '9.810')] +[2023-09-25 20:49:50,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12173312. Throughput: 0: 815.4, 1: 815.6. Samples: 3042723. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:49:50,470][108279] Avg episode reward: [(0, '9.080'), (1, '9.810')] +[2023-09-25 20:49:55,129][109224] Updated weights for policy 1, policy_version 23840 (0.0018) +[2023-09-25 20:49:55,129][109225] Updated weights for policy 0, policy_version 23840 (0.0020) +[2023-09-25 20:49:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12206080. Throughput: 0: 817.2, 1: 815.3. Samples: 3047455. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:49:55,471][108279] Avg episode reward: [(0, '9.090'), (1, '9.810')] +[2023-09-25 20:50:00,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12238848. Throughput: 0: 816.9, 1: 819.1. Samples: 3057558. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 20:50:00,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.810')] +[2023-09-25 20:50:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12271616. Throughput: 0: 814.3, 1: 813.7. Samples: 3067268. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:50:05,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.820')] +[2023-09-25 20:50:07,603][109225] Updated weights for policy 0, policy_version 24000 (0.0017) +[2023-09-25 20:50:07,603][109224] Updated weights for policy 1, policy_version 24000 (0.0016) +[2023-09-25 20:50:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12304384. Throughput: 0: 816.0, 1: 814.7. Samples: 3072053. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:50:10,470][108279] Avg episode reward: [(0, '9.060'), (1, '9.820')] +[2023-09-25 20:50:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12337152. Throughput: 0: 819.2, 1: 819.2. Samples: 3082240. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 20:50:15,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.800')] +[2023-09-25 20:50:20,075][109224] Updated weights for policy 1, policy_version 24160 (0.0016) +[2023-09-25 20:50:20,075][109225] Updated weights for policy 0, policy_version 24160 (0.0018) +[2023-09-25 20:50:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12369920. Throughput: 0: 819.6, 1: 819.1. Samples: 3091978. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:20,470][108279] Avg episode reward: [(0, '9.080'), (1, '9.800')] +[2023-09-25 20:50:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12402688. Throughput: 0: 819.2, 1: 818.7. Samples: 3096577. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:25,471][108279] Avg episode reward: [(0, '9.050'), (1, '9.820')] +[2023-09-25 20:50:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 12435456. Throughput: 0: 814.7, 1: 816.1. Samples: 3106378. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:30,471][108279] Avg episode reward: [(0, '9.050'), (1, '9.820')] +[2023-09-25 20:50:30,481][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000024288_6217728.pth... +[2023-09-25 20:50:30,481][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000024288_6217728.pth... +[2023-09-25 20:50:30,517][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000021232_5435392.pth +[2023-09-25 20:50:30,521][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000021232_5435392.pth +[2023-09-25 20:50:33,004][109225] Updated weights for policy 0, policy_version 24320 (0.0017) +[2023-09-25 20:50:33,004][109224] Updated weights for policy 1, policy_version 24320 (0.0019) +[2023-09-25 20:50:35,470][108279] Fps is (10 sec: 5734.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12460032. Throughput: 0: 811.0, 1: 810.7. Samples: 3115702. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:35,470][108279] Avg episode reward: [(0, '9.020'), (1, '9.820')] +[2023-09-25 20:50:40,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12492800. Throughput: 0: 813.2, 1: 816.1. Samples: 3120777. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:40,471][108279] Avg episode reward: [(0, '9.020'), (1, '9.850')] +[2023-09-25 20:50:45,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12525568. Throughput: 0: 809.1, 1: 809.6. Samples: 3130398. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:45,471][108279] Avg episode reward: [(0, '9.030'), (1, '9.850')] +[2023-09-25 20:50:45,616][109225] Updated weights for policy 0, policy_version 24480 (0.0017) +[2023-09-25 20:50:45,616][109224] Updated weights for policy 1, policy_version 24480 (0.0015) +[2023-09-25 20:50:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6498.1). 
Total num frames: 12558336. Throughput: 0: 808.3, 1: 808.3. Samples: 3140012. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:50:50,471][108279] Avg episode reward: [(0, '9.040'), (1, '9.860')] +[2023-09-25 20:50:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12591104. Throughput: 0: 812.0, 1: 813.2. Samples: 3145190. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:50:55,470][108279] Avg episode reward: [(0, '9.050'), (1, '9.850')] +[2023-09-25 20:50:58,129][109225] Updated weights for policy 0, policy_version 24640 (0.0017) +[2023-09-25 20:50:58,129][109224] Updated weights for policy 1, policy_version 24640 (0.0016) +[2023-09-25 20:51:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12623872. Throughput: 0: 805.5, 1: 807.1. Samples: 3154807. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:51:00,471][108279] Avg episode reward: [(0, '9.060'), (1, '9.850')] +[2023-09-25 20:51:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12656640. Throughput: 0: 804.0, 1: 804.1. Samples: 3164341. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 20:51:05,470][108279] Avg episode reward: [(0, '9.090'), (1, '9.850')] +[2023-09-25 20:51:10,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12689408. Throughput: 0: 806.8, 1: 809.1. Samples: 3169294. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:51:10,470][108279] Avg episode reward: [(0, '9.080'), (1, '9.860')] +[2023-09-25 20:51:10,869][109224] Updated weights for policy 1, policy_version 24800 (0.0016) +[2023-09-25 20:51:10,869][109225] Updated weights for policy 0, policy_version 24800 (0.0017) +[2023-09-25 20:51:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12722176. Throughput: 0: 806.8, 1: 806.4. Samples: 3178975. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:51:15,470][108279] Avg episode reward: [(0, '9.080'), (1, '9.870')] +[2023-09-25 20:51:20,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 12754944. Throughput: 0: 812.9, 1: 811.1. Samples: 3188781. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:51:20,471][108279] Avg episode reward: [(0, '9.060'), (1, '9.860')] +[2023-09-25 20:51:23,399][109225] Updated weights for policy 0, policy_version 24960 (0.0018) +[2023-09-25 20:51:23,399][109224] Updated weights for policy 1, policy_version 24960 (0.0016) +[2023-09-25 20:51:25,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12787712. Throughput: 0: 812.5, 1: 811.9. Samples: 3193873. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-25 20:51:25,471][108279] Avg episode reward: [(0, '9.090'), (1, '9.850')] +[2023-09-25 20:51:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 12820480. Throughput: 0: 811.5, 1: 811.4. Samples: 3203430. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:51:30,471][108279] Avg episode reward: [(0, '9.090'), (1, '9.850')] +[2023-09-25 20:51:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 12853248. Throughput: 0: 815.7, 1: 813.2. Samples: 3213312. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:51:35,471][108279] Avg episode reward: [(0, '9.090'), (1, '9.850')] +[2023-09-25 20:51:35,963][109224] Updated weights for policy 1, policy_version 25120 (0.0020) +[2023-09-25 20:51:35,964][109225] Updated weights for policy 0, policy_version 25120 (0.0018) +[2023-09-25 20:51:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 12886016. Throughput: 0: 811.8, 1: 812.0. Samples: 3218258. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:51:40,471][108279] Avg episode reward: [(0, '9.100'), (1, '9.850')] +[2023-09-25 20:51:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 12918784. Throughput: 0: 810.5, 1: 811.1. Samples: 3227777. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:51:45,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.860')] +[2023-09-25 20:51:48,523][109224] Updated weights for policy 1, policy_version 25280 (0.0017) +[2023-09-25 20:51:48,523][109225] Updated weights for policy 0, policy_version 25280 (0.0017) +[2023-09-25 20:51:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 12951552. Throughput: 0: 818.1, 1: 815.9. Samples: 3237871. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:51:50,471][108279] Avg episode reward: [(0, '9.150'), (1, '9.860')] +[2023-09-25 20:51:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 12984320. Throughput: 0: 814.6, 1: 815.0. Samples: 3242625. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:51:55,470][108279] Avg episode reward: [(0, '9.160'), (1, '9.860')] +[2023-09-25 20:52:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 13017088. Throughput: 0: 817.5, 1: 817.1. Samples: 3252534. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:00,471][108279] Avg episode reward: [(0, '9.140'), (1, '9.870')] +[2023-09-25 20:52:00,976][109225] Updated weights for policy 0, policy_version 25440 (0.0018) +[2023-09-25 20:52:00,976][109224] Updated weights for policy 1, policy_version 25440 (0.0018) +[2023-09-25 20:52:05,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 13049856. Throughput: 0: 819.2, 1: 818.3. Samples: 3262468. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:05,471][108279] Avg episode reward: [(0, '9.110'), (1, '9.870')] +[2023-09-25 20:52:10,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 13082624. Throughput: 0: 817.7, 1: 816.0. Samples: 3267386. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:10,470][108279] Avg episode reward: [(0, '9.110'), (1, '9.880')] +[2023-09-25 20:52:13,519][109224] Updated weights for policy 1, policy_version 25600 (0.0016) +[2023-09-25 20:52:13,520][109225] Updated weights for policy 0, policy_version 25600 (0.0018) +[2023-09-25 20:52:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6511.9). Total num frames: 13115392. Throughput: 0: 817.0, 1: 817.2. Samples: 3276967. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:15,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.880')] +[2023-09-25 20:52:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13148160. Throughput: 0: 819.2, 1: 819.2. Samples: 3287041. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:20,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.890')] +[2023-09-25 20:52:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13180928. Throughput: 0: 818.1, 1: 818.1. Samples: 3291888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:25,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.890')] +[2023-09-25 20:52:25,959][109224] Updated weights for policy 1, policy_version 25760 (0.0015) +[2023-09-25 20:52:25,960][109225] Updated weights for policy 0, policy_version 25760 (0.0018) +[2023-09-25 20:52:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13213696. Throughput: 0: 821.8, 1: 822.1. Samples: 3301753. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:30,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.870')] +[2023-09-25 20:52:30,482][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000025808_6606848.pth... +[2023-09-25 20:52:30,482][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000025808_6606848.pth... +[2023-09-25 20:52:30,518][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000022752_5824512.pth +[2023-09-25 20:52:30,520][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000022752_5824512.pth +[2023-09-25 20:52:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13246464. Throughput: 0: 819.6, 1: 819.2. Samples: 3311617. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:35,471][108279] Avg episode reward: [(0, '9.150'), (1, '9.870')] +[2023-09-25 20:52:38,610][109224] Updated weights for policy 1, policy_version 25920 (0.0018) +[2023-09-25 20:52:38,610][109225] Updated weights for policy 0, policy_version 25920 (0.0017) +[2023-09-25 20:52:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13279232. Throughput: 0: 817.0, 1: 817.0. Samples: 3316153. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:52:40,471][108279] Avg episode reward: [(0, '9.150'), (1, '9.870')] +[2023-09-25 20:52:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13312000. Throughput: 0: 817.1, 1: 814.4. Samples: 3325953. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:52:45,471][108279] Avg episode reward: [(0, '9.160'), (1, '9.870')] +[2023-09-25 20:52:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13344768. Throughput: 0: 817.9, 1: 819.1. Samples: 3336134. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-25 20:52:50,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.880')] +[2023-09-25 20:52:51,038][109224] Updated weights for policy 1, policy_version 26080 (0.0017) +[2023-09-25 20:52:51,038][109225] Updated weights for policy 0, policy_version 26080 (0.0017) +[2023-09-25 20:52:55,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13377536. Throughput: 0: 817.4, 1: 818.7. Samples: 3341012. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:52:55,470][108279] Avg episode reward: [(0, '9.110'), (1, '9.880')] +[2023-09-25 20:53:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13410304. Throughput: 0: 821.6, 1: 821.3. Samples: 3350897. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:00,470][108279] Avg episode reward: [(0, '9.120'), (1, '9.880')] +[2023-09-25 20:53:03,469][109224] Updated weights for policy 1, policy_version 26240 (0.0019) +[2023-09-25 20:53:03,469][109225] Updated weights for policy 0, policy_version 26240 (0.0019) +[2023-09-25 20:53:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13443072. Throughput: 0: 819.2, 1: 819.2. Samples: 3360768. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:05,471][108279] Avg episode reward: [(0, '9.130'), (1, '9.870')] +[2023-09-25 20:53:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13475840. Throughput: 0: 820.1, 1: 820.2. Samples: 3365703. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:10,470][108279] Avg episode reward: [(0, '9.090'), (1, '9.870')] +[2023-09-25 20:53:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13508608. Throughput: 0: 819.8, 1: 819.6. Samples: 3375525. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:15,470][108279] Avg episode reward: [(0, '9.090'), (1, '9.870')] +[2023-09-25 20:53:15,870][109224] Updated weights for policy 1, policy_version 26400 (0.0012) +[2023-09-25 20:53:15,871][109225] Updated weights for policy 0, policy_version 26400 (0.0017) +[2023-09-25 20:53:20,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13541376. Throughput: 0: 819.2, 1: 820.1. Samples: 3385388. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:20,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.880')] +[2023-09-25 20:53:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13574144. Throughput: 0: 825.6, 1: 825.6. Samples: 3390455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:25,470][108279] Avg episode reward: [(0, '9.060'), (1, '9.880')] +[2023-09-25 20:53:28,380][109224] Updated weights for policy 1, policy_version 26560 (0.0016) +[2023-09-25 20:53:28,382][109225] Updated weights for policy 0, policy_version 26560 (0.0018) +[2023-09-25 20:53:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13606912. Throughput: 0: 822.8, 1: 825.5. Samples: 3400125. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:30,471][108279] Avg episode reward: [(0, '9.040'), (1, '9.880')] +[2023-09-25 20:53:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13639680. Throughput: 0: 819.7, 1: 819.2. Samples: 3409884. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:53:35,470][108279] Avg episode reward: [(0, '9.020'), (1, '9.880')] +[2023-09-25 20:53:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13672448. Throughput: 0: 817.7, 1: 817.2. Samples: 3414579. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:53:40,470][108279] Avg episode reward: [(0, '9.010'), (1, '9.880')] +[2023-09-25 20:53:41,049][109224] Updated weights for policy 1, policy_version 26720 (0.0017) +[2023-09-25 20:53:41,050][109225] Updated weights for policy 0, policy_version 26720 (0.0016) +[2023-09-25 20:53:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13705216. Throughput: 0: 816.6, 1: 816.6. Samples: 3424393. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-25 20:53:45,471][108279] Avg episode reward: [(0, '9.020'), (1, '9.890')] +[2023-09-25 20:53:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13737984. Throughput: 0: 819.2, 1: 819.2. Samples: 3434496. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:50,471][108279] Avg episode reward: [(0, '8.980'), (1, '9.890')] +[2023-09-25 20:53:53,486][109225] Updated weights for policy 0, policy_version 26880 (0.0016) +[2023-09-25 20:53:53,487][109224] Updated weights for policy 1, policy_version 26880 (0.0018) +[2023-09-25 20:53:55,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13770752. Throughput: 0: 818.0, 1: 818.4. Samples: 3439345. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:53:55,470][108279] Avg episode reward: [(0, '8.970'), (1, '9.900')] +[2023-09-25 20:54:00,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13803520. Throughput: 0: 817.0, 1: 816.9. Samples: 3449047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:00,470][108279] Avg episode reward: [(0, '8.980'), (1, '9.900')] +[2023-09-25 20:54:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13836288. Throughput: 0: 816.6, 1: 818.3. Samples: 3458956. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:05,470][108279] Avg episode reward: [(0, '8.990'), (1, '9.910')] +[2023-09-25 20:54:06,154][109224] Updated weights for policy 1, policy_version 27040 (0.0018) +[2023-09-25 20:54:06,155][109225] Updated weights for policy 0, policy_version 27040 (0.0018) +[2023-09-25 20:54:10,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13869056. Throughput: 0: 812.9, 1: 812.8. Samples: 3463614. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:10,471][108279] Avg episode reward: [(0, '9.010'), (1, '9.920')] +[2023-09-25 20:54:15,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13901824. Throughput: 0: 815.6, 1: 813.7. Samples: 3473444. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:15,471][108279] Avg episode reward: [(0, '9.010'), (1, '9.920')] +[2023-09-25 20:54:18,632][109224] Updated weights for policy 1, policy_version 27200 (0.0017) +[2023-09-25 20:54:18,632][109225] Updated weights for policy 0, policy_version 27200 (0.0018) +[2023-09-25 20:54:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13934592. Throughput: 0: 817.8, 1: 819.2. Samples: 3483549. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:20,471][108279] Avg episode reward: [(0, '9.020'), (1, '9.920')] +[2023-09-25 20:54:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 13967360. Throughput: 0: 818.3, 1: 818.9. Samples: 3488255. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:25,470][108279] Avg episode reward: [(0, '9.020'), (1, '9.920')] +[2023-09-25 20:54:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14000128. Throughput: 0: 819.0, 1: 816.4. Samples: 3497984. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:30,471][108279] Avg episode reward: [(0, '9.020'), (1, '9.920')] +[2023-09-25 20:54:30,481][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000027344_7000064.pth... +[2023-09-25 20:54:30,481][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000027344_7000064.pth... +[2023-09-25 20:54:30,510][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000024288_6217728.pth +[2023-09-25 20:54:30,515][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000024288_6217728.pth +[2023-09-25 20:54:31,440][109224] Updated weights for policy 1, policy_version 27360 (0.0019) +[2023-09-25 20:54:31,440][109225] Updated weights for policy 0, policy_version 27360 (0.0019) +[2023-09-25 20:54:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14032896. Throughput: 0: 809.7, 1: 812.2. Samples: 3507480. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:35,470][108279] Avg episode reward: [(0, '9.030'), (1, '9.930')] +[2023-09-25 20:54:35,471][109025] Saving new best policy, reward=9.930! +[2023-09-25 20:54:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14065664. Throughput: 0: 812.3, 1: 809.3. Samples: 3512316. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:40,471][108279] Avg episode reward: [(0, '9.030'), (1, '9.930')] +[2023-09-25 20:54:44,024][109225] Updated weights for policy 0, policy_version 27520 (0.0017) +[2023-09-25 20:54:44,024][109224] Updated weights for policy 1, policy_version 27520 (0.0017) +[2023-09-25 20:54:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14098432. Throughput: 0: 812.7, 1: 812.4. Samples: 3522179. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:45,471][108279] Avg episode reward: [(0, '9.070'), (1, '9.930')] +[2023-09-25 20:54:50,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14131200. Throughput: 0: 810.0, 1: 810.2. Samples: 3531863. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:50,470][108279] Avg episode reward: [(0, '9.070'), (1, '9.930')] +[2023-09-25 20:54:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14163968. Throughput: 0: 815.2, 1: 813.0. Samples: 3536880. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:54:55,471][108279] Avg episode reward: [(0, '9.120'), (1, '9.930')] +[2023-09-25 20:54:56,598][109225] Updated weights for policy 0, policy_version 27680 (0.0017) +[2023-09-25 20:54:56,598][109224] Updated weights for policy 1, policy_version 27680 (0.0017) +[2023-09-25 20:55:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14196736. Throughput: 0: 812.3, 1: 814.2. Samples: 3546638. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:00,470][108279] Avg episode reward: [(0, '9.110'), (1, '9.930')] +[2023-09-25 20:55:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14229504. Throughput: 0: 809.3, 1: 809.8. Samples: 3556410. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:55:05,470][108279] Avg episode reward: [(0, '9.100'), (1, '9.930')] +[2023-09-25 20:55:09,053][109224] Updated weights for policy 1, policy_version 27840 (0.0014) +[2023-09-25 20:55:09,054][109225] Updated weights for policy 0, policy_version 27840 (0.0016) +[2023-09-25 20:55:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14262272. Throughput: 0: 814.8, 1: 812.2. Samples: 3561472. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:55:10,471][108279] Avg episode reward: [(0, '9.080'), (1, '9.930')] +[2023-09-25 20:55:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14295040. Throughput: 0: 813.1, 1: 816.1. Samples: 3571297. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:55:15,471][108279] Avg episode reward: [(0, '9.080'), (1, '9.940')] +[2023-09-25 20:55:15,480][109025] Saving new best policy, reward=9.940! +[2023-09-25 20:55:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14327808. Throughput: 0: 818.6, 1: 818.9. Samples: 3581170. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-25 20:55:20,470][108279] Avg episode reward: [(0, '9.100'), (1, '9.940')] +[2023-09-25 20:55:21,453][109225] Updated weights for policy 0, policy_version 28000 (0.0014) +[2023-09-25 20:55:21,454][109224] Updated weights for policy 1, policy_version 28000 (0.0017) +[2023-09-25 20:55:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 14360576. Throughput: 0: 819.3, 1: 819.3. Samples: 3586053. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:25,470][108279] Avg episode reward: [(0, '9.130'), (1, '9.940')] +[2023-09-25 20:55:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14393344. Throughput: 0: 820.8, 1: 821.9. Samples: 3596098. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:30,470][108279] Avg episode reward: [(0, '9.150'), (1, '9.940')] +[2023-09-25 20:55:34,020][109225] Updated weights for policy 0, policy_version 28160 (0.0016) +[2023-09-25 20:55:34,020][109224] Updated weights for policy 1, policy_version 28160 (0.0015) +[2023-09-25 20:55:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14426112. Throughput: 0: 819.6, 1: 819.2. Samples: 3605611. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:35,470][108279] Avg episode reward: [(0, '9.180'), (1, '9.950')] +[2023-09-25 20:55:35,471][109025] Saving new best policy, reward=9.950! +[2023-09-25 20:55:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14458880. Throughput: 0: 819.6, 1: 819.2. Samples: 3610624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:40,471][108279] Avg episode reward: [(0, '9.190'), (1, '9.950')] +[2023-09-25 20:55:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14491648. Throughput: 0: 821.0, 1: 821.0. Samples: 3620530. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:45,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.960')] +[2023-09-25 20:55:45,484][109025] Saving new best policy, reward=9.960! +[2023-09-25 20:55:46,480][109225] Updated weights for policy 0, policy_version 28320 (0.0018) +[2023-09-25 20:55:46,481][109224] Updated weights for policy 1, policy_version 28320 (0.0018) +[2023-09-25 20:55:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14524416. Throughput: 0: 821.9, 1: 821.6. Samples: 3630369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:50,471][108279] Avg episode reward: [(0, '9.200'), (1, '9.960')] +[2023-09-25 20:55:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14557184. Throughput: 0: 819.2, 1: 819.2. Samples: 3635201. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 20:55:55,471][108279] Avg episode reward: [(0, '9.230'), (1, '9.960')] +[2023-09-25 20:55:58,932][109225] Updated weights for policy 0, policy_version 28480 (0.0017) +[2023-09-25 20:55:58,932][109224] Updated weights for policy 1, policy_version 28480 (0.0017) +[2023-09-25 20:56:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14589952. 
Throughput: 0: 821.1, 1: 821.2. Samples: 3645204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:56:00,470][108279] Avg episode reward: [(0, '9.230'), (1, '9.960')]
+[2023-09-25 20:56:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14622720. Throughput: 0: 818.6, 1: 817.9. Samples: 3654810. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:56:05,471][108279] Avg episode reward: [(0, '9.210'), (1, '9.960')]
+[2023-09-25 20:56:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14655488. Throughput: 0: 818.5, 1: 819.1. Samples: 3659743. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:56:10,470][108279] Avg episode reward: [(0, '9.230'), (1, '9.960')]
+[2023-09-25 20:56:11,554][109225] Updated weights for policy 0, policy_version 28640 (0.0014)
+[2023-09-25 20:56:11,554][109224] Updated weights for policy 1, policy_version 28640 (0.0018)
+[2023-09-25 20:56:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14688256. Throughput: 0: 817.3, 1: 816.6. Samples: 3669625. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:56:15,470][108279] Avg episode reward: [(0, '9.270'), (1, '9.960')]
+[2023-09-25 20:56:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14721024. Throughput: 0: 820.0, 1: 820.2. Samples: 3679422. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 20:56:20,471][108279] Avg episode reward: [(0, '9.270'), (1, '9.960')]
+[2023-09-25 20:56:24,006][109224] Updated weights for policy 1, policy_version 28800 (0.0015)
+[2023-09-25 20:56:24,007][109225] Updated weights for policy 0, policy_version 28800 (0.0017)
+[2023-09-25 20:56:25,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14753792. Throughput: 0: 819.2, 1: 819.2. Samples: 3684352. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 20:56:25,471][108279] Avg episode reward: [(0, '9.280'), (1, '9.960')]
+[2023-09-25 20:56:30,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 14782464. Throughput: 0: 813.7, 1: 813.6. Samples: 3693761. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 20:56:30,471][108279] Avg episode reward: [(0, '9.280'), (1, '9.950')]
+[2023-09-25 20:56:30,482][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000028880_7393280.pth...
+[2023-09-25 20:56:30,514][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000025808_6606848.pth
+[2023-09-25 20:56:30,517][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000028880_7393280.pth...
+[2023-09-25 20:56:30,546][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000025808_6606848.pth
+[2023-09-25 20:56:35,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 14811136. Throughput: 0: 811.7, 1: 812.0. Samples: 3703436. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 20:56:35,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.950')]
+[2023-09-25 20:56:36,742][109225] Updated weights for policy 0, policy_version 28960 (0.0017)
+[2023-09-25 20:56:36,742][109224] Updated weights for policy 1, policy_version 28960 (0.0017)
+[2023-09-25 20:56:40,470][108279] Fps is (10 sec: 6963.2, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14852096. Throughput: 0: 814.5, 1: 817.0. Samples: 3708618. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:56:40,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.950')]
+[2023-09-25 20:56:45,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14884864. Throughput: 0: 814.7, 1: 814.4. Samples: 3718516. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:56:45,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.950')]
+[2023-09-25 20:56:49,194][109225] Updated weights for policy 0, policy_version 29120 (0.0016)
+[2023-09-25 20:56:49,195][109224] Updated weights for policy 1, policy_version 29120 (0.0017)
+[2023-09-25 20:56:50,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 14913536. Throughput: 0: 814.3, 1: 815.5. Samples: 3728151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:56:50,471][108279] Avg episode reward: [(0, '9.240'), (1, '9.970')]
+[2023-09-25 20:56:50,472][109025] Saving new best policy, reward=9.970!
+[2023-09-25 20:56:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14950400. Throughput: 0: 816.2, 1: 818.0. Samples: 3733280. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:56:55,471][108279] Avg episode reward: [(0, '9.250'), (1, '9.970')]
+[2023-09-25 20:57:00,470][108279] Fps is (10 sec: 6963.1, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 14983168. Throughput: 0: 815.2, 1: 815.7. Samples: 3743014. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:57:00,471][108279] Avg episode reward: [(0, '9.260'), (1, '9.970')]
+[2023-09-25 20:57:01,692][109224] Updated weights for policy 1, policy_version 29280 (0.0016)
+[2023-09-25 20:57:01,692][109225] Updated weights for policy 0, policy_version 29280 (0.0017)
+[2023-09-25 20:57:05,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15007744. Throughput: 0: 811.9, 1: 811.3. Samples: 3752467. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:57:05,471][108279] Avg episode reward: [(0, '9.300'), (1, '9.970')]
+[2023-09-25 20:57:10,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15040512. Throughput: 0: 811.6, 1: 813.8. Samples: 3757499. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:57:10,471][108279] Avg episode reward: [(0, '9.300'), (1, '9.970')]
+[2023-09-25 20:57:14,397][109224] Updated weights for policy 1, policy_version 29440 (0.0017)
+[2023-09-25 20:57:14,397][109225] Updated weights for policy 0, policy_version 29440 (0.0017)
+[2023-09-25 20:57:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 15073280. Throughput: 0: 817.0, 1: 815.8. Samples: 3767237. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:57:15,471][108279] Avg episode reward: [(0, '9.290'), (1, '9.970')]
+[2023-09-25 20:57:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15106048. Throughput: 0: 817.5, 1: 817.4. Samples: 3777009. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 20:57:20,471][108279] Avg episode reward: [(0, '9.330'), (1, '9.980')]
+[2023-09-25 20:57:20,555][109025] Saving new best policy, reward=9.980!
+[2023-09-25 20:57:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15138816. Throughput: 0: 816.7, 1: 816.8. Samples: 3782127. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:25,471][108279] Avg episode reward: [(0, '9.320'), (1, '9.980')]
+[2023-09-25 20:57:26,808][109224] Updated weights for policy 1, policy_version 29600 (0.0017)
+[2023-09-25 20:57:26,808][109225] Updated weights for policy 0, policy_version 29600 (0.0018)
+[2023-09-25 20:57:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 15171584. Throughput: 0: 816.1, 1: 815.5. Samples: 3791939. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:30,470][108279] Avg episode reward: [(0, '9.300'), (1, '9.980')]
+[2023-09-25 20:57:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15204352. Throughput: 0: 814.7, 1: 813.5. Samples: 3801422. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:35,471][108279] Avg episode reward: [(0, '9.350'), (1, '9.980')]
+[2023-09-25 20:57:39,541][109224] Updated weights for policy 1, policy_version 29760 (0.0016)
+[2023-09-25 20:57:39,542][109225] Updated weights for policy 0, policy_version 29760 (0.0017)
+[2023-09-25 20:57:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15237120. Throughput: 0: 812.2, 1: 812.2. Samples: 3806378. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:40,471][108279] Avg episode reward: [(0, '9.360'), (1, '9.970')]
+[2023-09-25 20:57:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15269888. Throughput: 0: 807.7, 1: 807.4. Samples: 3815691. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:45,471][108279] Avg episode reward: [(0, '9.360'), (1, '9.970')]
+[2023-09-25 20:57:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 15302656. Throughput: 0: 814.2, 1: 812.4. Samples: 3825668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:50,471][108279] Avg episode reward: [(0, '9.400'), (1, '9.970')]
+[2023-09-25 20:57:52,133][109224] Updated weights for policy 1, policy_version 29920 (0.0016)
+[2023-09-25 20:57:52,133][109225] Updated weights for policy 0, policy_version 29920 (0.0017)
+[2023-09-25 20:57:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15335424. Throughput: 0: 813.3, 1: 813.8. Samples: 3830720. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:57:55,470][108279] Avg episode reward: [(0, '9.400'), (1, '9.970')]
+[2023-09-25 20:58:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 15368192. Throughput: 0: 811.4, 1: 812.4. Samples: 3840308. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:58:00,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.970')]
+[2023-09-25 20:58:04,703][109224] Updated weights for policy 1, policy_version 30080 (0.0018)
+[2023-09-25 20:58:04,704][109225] Updated weights for policy 0, policy_version 30080 (0.0018)
+[2023-09-25 20:58:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15400960. Throughput: 0: 815.0, 1: 812.5. Samples: 3850244. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:58:05,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.970')]
+[2023-09-25 20:58:10,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15433728. Throughput: 0: 811.4, 1: 811.4. Samples: 3855156. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:58:10,470][108279] Avg episode reward: [(0, '9.420'), (1, '9.970')]
+[2023-09-25 20:58:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15466496. Throughput: 0: 809.7, 1: 810.2. Samples: 3864837. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:58:15,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.970')]
+[2023-09-25 20:58:17,144][109224] Updated weights for policy 1, policy_version 30240 (0.0014)
+[2023-09-25 20:58:17,144][109225] Updated weights for policy 0, policy_version 30240 (0.0017)
+[2023-09-25 20:58:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15499264. Throughput: 0: 816.6, 1: 815.7. Samples: 3874874. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-25 20:58:20,471][108279] Avg episode reward: [(0, '9.390'), (1, '9.970')]
+[2023-09-25 20:58:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15532032. Throughput: 0: 815.9, 1: 816.4. Samples: 3879834. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:58:25,471][108279] Avg episode reward: [(0, '9.380'), (1, '9.970')]
+[2023-09-25 20:58:29,748][109224] Updated weights for policy 1, policy_version 30400 (0.0018)
+[2023-09-25 20:58:29,748][109225] Updated weights for policy 0, policy_version 30400 (0.0019)
+[2023-09-25 20:58:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15564800. Throughput: 0: 818.4, 1: 818.4. Samples: 3889347. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:58:30,471][108279] Avg episode reward: [(0, '9.390'), (1, '9.970')]
+[2023-09-25 20:58:30,480][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000030400_7782400.pth...
+[2023-09-25 20:58:30,480][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000030400_7782400.pth...
+[2023-09-25 20:58:30,509][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000027344_7000064.pth
+[2023-09-25 20:58:30,516][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000027344_7000064.pth
+[2023-09-25 20:58:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15597568. Throughput: 0: 816.3, 1: 819.0. Samples: 3899257. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:58:35,470][108279] Avg episode reward: [(0, '9.410'), (1, '9.970')]
+[2023-09-25 20:58:40,470][108279] Fps is (10 sec: 6553.9, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15630336. Throughput: 0: 813.0, 1: 812.6. Samples: 3903872. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-25 20:58:40,470][108279] Avg episode reward: [(0, '9.400'), (1, '9.970')]
+[2023-09-25 20:58:42,422][109224] Updated weights for policy 1, policy_version 30560 (0.0017)
+[2023-09-25 20:58:42,422][109225] Updated weights for policy 0, policy_version 30560 (0.0018)
+[2023-09-25 20:58:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15663104. Throughput: 0: 817.0, 1: 814.7. Samples: 3913736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-25 20:58:45,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.970')]
+[2023-09-25 20:58:50,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15695872. Throughput: 0: 817.9, 1: 819.1. Samples: 3923909. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-25 20:58:50,471][108279] Avg episode reward: [(0, '9.410'), (1, '9.970')]
+[2023-09-25 20:58:54,816][109225] Updated weights for policy 0, policy_version 30720 (0.0017)
+[2023-09-25 20:58:54,817][109224] Updated weights for policy 1, policy_version 30720 (0.0015)
+[2023-09-25 20:58:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15728640. Throughput: 0: 816.6, 1: 816.1. Samples: 3928627. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-25 20:58:55,471][108279] Avg episode reward: [(0, '9.420'), (1, '9.970')]
+[2023-09-25 20:59:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15761408. Throughput: 0: 817.7, 1: 817.1. Samples: 3938405. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-25 20:59:00,470][108279] Avg episode reward: [(0, '9.430'), (1, '9.970')]
+[2023-09-25 20:59:05,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15794176. Throughput: 0: 818.5, 1: 818.0. Samples: 3948515. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:05,470][108279] Avg episode reward: [(0, '9.430'), (1, '9.970')]
+[2023-09-25 20:59:07,339][109225] Updated weights for policy 0, policy_version 30880 (0.0017)
+[2023-09-25 20:59:07,339][109224] Updated weights for policy 1, policy_version 30880 (0.0016)
+[2023-09-25 20:59:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15826944. Throughput: 0: 814.8, 1: 814.6. Samples: 3953159. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:10,470][108279] Avg episode reward: [(0, '9.430'), (1, '9.970')]
+[2023-09-25 20:59:15,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15859712. Throughput: 0: 818.4, 1: 817.3. Samples: 3962953. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:15,471][108279] Avg episode reward: [(0, '9.450'), (1, '9.970')]
+[2023-09-25 20:59:19,922][109224] Updated weights for policy 1, policy_version 31040 (0.0015)
+[2023-09-25 20:59:19,922][109225] Updated weights for policy 0, policy_version 31040 (0.0017)
+[2023-09-25 20:59:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15892480. Throughput: 0: 819.7, 1: 818.5. Samples: 3972978. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:20,471][108279] Avg episode reward: [(0, '9.470'), (1, '9.970')]
+[2023-09-25 20:59:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15925248. Throughput: 0: 818.1, 1: 818.4. Samples: 3977514. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:25,471][108279] Avg episode reward: [(0, '9.490'), (1, '9.970')]
+[2023-09-25 20:59:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15958016. Throughput: 0: 818.8, 1: 819.0. Samples: 3987437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:30,470][108279] Avg episode reward: [(0, '9.500'), (1, '9.970')]
+[2023-09-25 20:59:32,695][109225] Updated weights for policy 0, policy_version 31200 (0.0017)
+[2023-09-25 20:59:32,695][109224] Updated weights for policy 1, policy_version 31200 (0.0017)
+[2023-09-25 20:59:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 15990784. Throughput: 0: 809.2, 1: 812.0. Samples: 3996861. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:35,471][108279] Avg episode reward: [(0, '9.500'), (1, '9.970')]
+[2023-09-25 20:59:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16023552. Throughput: 0: 814.0, 1: 811.9. Samples: 4001792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:40,471][108279] Avg episode reward: [(0, '9.510'), (1, '9.970')]
+[2023-09-25 20:59:45,200][109225] Updated weights for policy 0, policy_version 31360 (0.0014)
+[2023-09-25 20:59:45,200][109224] Updated weights for policy 1, policy_version 31360 (0.0017)
+[2023-09-25 20:59:45,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16056320. Throughput: 0: 815.0, 1: 815.9. Samples: 4011797. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:45,471][108279] Avg episode reward: [(0, '9.510'), (1, '9.970')]
+[2023-09-25 20:59:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16089088. Throughput: 0: 810.2, 1: 812.3. Samples: 4021531. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:50,471][108279] Avg episode reward: [(0, '9.510'), (1, '9.970')]
+[2023-09-25 20:59:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16121856. Throughput: 0: 814.9, 1: 812.0. Samples: 4026368. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 20:59:55,471][108279] Avg episode reward: [(0, '9.510'), (1, '9.970')]
+[2023-09-25 20:59:57,706][109224] Updated weights for policy 1, policy_version 31520 (0.0017)
+[2023-09-25 20:59:57,707][109225] Updated weights for policy 0, policy_version 31520 (0.0018)
+[2023-09-25 21:00:00,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16154624. Throughput: 0: 815.3, 1: 816.6. Samples: 4036387. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:00:00,471][108279] Avg episode reward: [(0, '9.530'), (1, '9.970')]
+[2023-09-25 21:00:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16187392. Throughput: 0: 812.2, 1: 813.6. Samples: 4046139. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 21:00:05,471][108279] Avg episode reward: [(0, '9.520'), (1, '9.970')]
+[2023-09-25 21:00:10,320][109225] Updated weights for policy 0, policy_version 31680 (0.0017)
+[2023-09-25 21:00:10,321][109224] Updated weights for policy 1, policy_version 31680 (0.0017)
+[2023-09-25 21:00:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16220160. Throughput: 0: 817.2, 1: 814.6. Samples: 4050944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 21:00:10,471][108279] Avg episode reward: [(0, '9.510'), (1, '9.970')]
+[2023-09-25 21:00:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16252928. Throughput: 0: 813.4, 1: 815.5. Samples: 4060738. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 21:00:15,471][108279] Avg episode reward: [(0, '9.540'), (1, '9.970')]
+[2023-09-25 21:00:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16285696. Throughput: 0: 818.1, 1: 815.9. Samples: 4070390. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 21:00:20,470][108279] Avg episode reward: [(0, '9.530'), (1, '9.970')]
+[2023-09-25 21:00:22,864][109225] Updated weights for policy 0, policy_version 31840 (0.0018)
+[2023-09-25 21:00:22,864][109224] Updated weights for policy 1, policy_version 31840 (0.0018)
+[2023-09-25 21:00:25,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16318464. Throughput: 0: 817.5, 1: 819.2. Samples: 4075444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-25 21:00:25,470][108279] Avg episode reward: [(0, '9.530'), (1, '9.970')]
+[2023-09-25 21:00:30,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16351232. Throughput: 0: 815.2, 1: 814.2. Samples: 4085120. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-25 21:00:30,471][108279] Avg episode reward: [(0, '9.520'), (1, '9.970')]
+[2023-09-25 21:00:30,480][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000031936_8175616.pth...
+[2023-09-25 21:00:30,481][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000031936_8175616.pth...
+[2023-09-25 21:00:30,519][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000028880_7393280.pth
+[2023-09-25 21:00:30,519][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000028880_7393280.pth
+[2023-09-25 21:00:35,470][108279] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 16375808. Throughput: 0: 813.3, 1: 812.8. Samples: 4094705. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-25 21:00:35,471][108279] Avg episode reward: [(0, '9.520'), (1, '9.970')]
+[2023-09-25 21:00:35,497][109224] Updated weights for policy 1, policy_version 32000 (0.0017)
+[2023-09-25 21:00:35,497][109225] Updated weights for policy 0, policy_version 32000 (0.0016)
+[2023-09-25 21:00:40,470][108279] Fps is (10 sec: 6144.2, 60 sec: 6485.3, 300 sec: 6511.9). Total num frames: 16412672. Throughput: 0: 814.4, 1: 816.8. Samples: 4099773. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-25 21:00:40,471][108279] Avg episode reward: [(0, '9.540'), (1, '9.970')]
+[2023-09-25 21:00:45,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16449536. Throughput: 0: 813.1, 1: 813.1. Samples: 4109563. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-25 21:00:45,470][108279] Avg episode reward: [(0, '9.540'), (1, '9.980')]
+[2023-09-25 21:00:47,990][109225] Updated weights for policy 0, policy_version 32160 (0.0017)
+[2023-09-25 21:00:47,990][109224] Updated weights for policy 1, policy_version 32160 (0.0017)
+[2023-09-25 21:00:50,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 16474112. Throughput: 0: 812.6, 1: 811.9. Samples: 4119244. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:00:50,471][108279] Avg episode reward: [(0, '9.580'), (1, '9.980')]
+[2023-09-25 21:00:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16515072. Throughput: 0: 815.0, 1: 817.6. Samples: 4124409. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:00:55,471][108279] Avg episode reward: [(0, '9.600'), (1, '9.980')]
+[2023-09-25 21:01:00,357][109225] Updated weights for policy 0, policy_version 32320 (0.0016)
+[2023-09-25 21:01:00,357][109224] Updated weights for policy 1, policy_version 32320 (0.0017)
+[2023-09-25 21:01:00,470][108279] Fps is (10 sec: 7372.9, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16547840. Throughput: 0: 817.6, 1: 817.4. Samples: 4134316. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:00,470][108279] Avg episode reward: [(0, '9.620'), (1, '9.980')]
+[2023-09-25 21:01:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16580608. Throughput: 0: 818.2, 1: 818.0. Samples: 4144021. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:05,471][108279] Avg episode reward: [(0, '9.650'), (1, '9.980')]
+[2023-09-25 21:01:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 16613376. Throughput: 0: 817.2, 1: 818.1. Samples: 4149034. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:10,471][108279] Avg episode reward: [(0, '9.650'), (1, '9.980')]
+[2023-09-25 21:01:12,965][109224] Updated weights for policy 1, policy_version 32480 (0.0017)
+[2023-09-25 21:01:12,965][109225] Updated weights for policy 0, policy_version 32480 (0.0016)
+[2023-09-25 21:01:15,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6498.1). Total num frames: 16637952. Throughput: 0: 817.6, 1: 817.1. Samples: 4158681. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:15,471][108279] Avg episode reward: [(0, '9.670'), (1, '9.980')]
+[2023-09-25 21:01:20,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.0, 300 sec: 6498.1). Total num frames: 16670720. Throughput: 0: 817.1, 1: 817.4. Samples: 4168255. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:20,471][108279] Avg episode reward: [(0, '9.660'), (1, '9.950')]
+[2023-09-25 21:01:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6511.9). Total num frames: 16703488. Throughput: 0: 819.1, 1: 818.7. Samples: 4173474. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:25,471][108279] Avg episode reward: [(0, '9.670'), (1, '9.950')]
+[2023-09-25 21:01:25,566][109224] Updated weights for policy 1, policy_version 32640 (0.0018)
+[2023-09-25 21:01:25,566][109225] Updated weights for policy 0, policy_version 32640 (0.0018)
+[2023-09-25 21:01:30,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 16736256. Throughput: 0: 817.7, 1: 817.8. Samples: 4183161. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:30,470][108279] Avg episode reward: [(0, '9.670'), (1, '9.950')]
+[2023-09-25 21:01:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6498.1). Total num frames: 16769024. Throughput: 0: 819.0, 1: 819.1. Samples: 4192961. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:35,471][108279] Avg episode reward: [(0, '9.660'), (1, '9.950')]
+[2023-09-25 21:01:37,988][109225] Updated weights for policy 0, policy_version 32800 (0.0016)
+[2023-09-25 21:01:37,988][109224] Updated weights for policy 1, policy_version 32800 (0.0017)
+[2023-09-25 21:01:40,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6485.3, 300 sec: 6498.1). Total num frames: 16801792. Throughput: 0: 818.4, 1: 819.1. Samples: 4198095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:40,471][108279] Avg episode reward: [(0, '9.660'), (1, '9.950')]
+[2023-09-25 21:01:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6511.9). Total num frames: 16834560. Throughput: 0: 815.2, 1: 815.3. Samples: 4207691. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:45,470][108279] Avg episode reward: [(0, '9.710'), (1, '9.950')]
+[2023-09-25 21:01:50,459][109225] Updated weights for policy 0, policy_version 32960 (0.0016)
+[2023-09-25 21:01:50,460][109224] Updated weights for policy 1, policy_version 32960 (0.0017)
+[2023-09-25 21:01:50,470][108279] Fps is (10 sec: 7372.8, 60 sec: 6690.1, 300 sec: 6525.8). Total num frames: 16875520. Throughput: 0: 816.9, 1: 817.6. Samples: 4217577. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:50,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.950')]
+[2023-09-25 21:01:55,470][108279] Fps is (10 sec: 6963.1, 60 sec: 6485.3, 300 sec: 6511.9). Total num frames: 16904192. Throughput: 0: 819.9, 1: 819.3. Samples: 4222800. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:01:55,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.960')]
+[2023-09-25 21:02:00,470][108279] Fps is (10 sec: 6144.0, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 16936960. Throughput: 0: 818.3, 1: 819.1. Samples: 4232364. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:00,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.960')]
+[2023-09-25 21:02:03,000][109224] Updated weights for policy 1, policy_version 33120 (0.0017)
+[2023-09-25 21:02:03,000][109225] Updated weights for policy 0, policy_version 33120 (0.0018)
+[2023-09-25 21:02:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 16969728. Throughput: 0: 821.0, 1: 820.6. Samples: 4242124. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:05,471][108279] Avg episode reward: [(0, '9.740'), (1, '9.960')]
+[2023-09-25 21:02:10,470][108279] Fps is (10 sec: 6963.3, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 17006592. Throughput: 0: 820.6, 1: 821.1. Samples: 4247352. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:10,470][108279] Avg episode reward: [(0, '9.740'), (1, '9.960')]
+[2023-09-25 21:02:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6621.9, 300 sec: 6539.7). Total num frames: 17035264. Throughput: 0: 821.6, 1: 821.0. Samples: 4257081. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:15,471][108279] Avg episode reward: [(0, '9.750'), (1, '9.960')]
+[2023-09-25 21:02:15,476][109224] Updated weights for policy 1, policy_version 33280 (0.0017)
+[2023-09-25 21:02:15,477][109225] Updated weights for policy 0, policy_version 33280 (0.0017)
+[2023-09-25 21:02:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.2, 300 sec: 6553.6). Total num frames: 17072128. Throughput: 0: 820.6, 1: 820.5. Samples: 4266812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:20,470][108279] Avg episode reward: [(0, '9.760'), (1, '9.960')]
+[2023-09-25 21:02:25,470][108279] Fps is (10 sec: 6963.4, 60 sec: 6690.2, 300 sec: 6553.6). Total num frames: 17104896. Throughput: 0: 821.1, 1: 820.7. Samples: 4271976. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:25,470][108279] Avg episode reward: [(0, '9.770'), (1, '9.960')]
+[2023-09-25 21:02:27,858][109225] Updated weights for policy 0, policy_version 33440 (0.0018)
+[2023-09-25 21:02:27,858][109224] Updated weights for policy 1, policy_version 33440 (0.0017)
+[2023-09-25 21:02:30,470][108279] Fps is (10 sec: 6553.3, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 17137664. Throughput: 0: 823.7, 1: 824.3. Samples: 4281851. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:30,471][108279] Avg episode reward: [(0, '9.780'), (1, '9.960')]
+[2023-09-25 21:02:30,482][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000033472_8568832.pth...
+[2023-09-25 21:02:30,482][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000033472_8568832.pth...
+[2023-09-25 21:02:30,516][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000030400_7782400.pth
+[2023-09-25 21:02:30,520][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000030400_7782400.pth
+[2023-09-25 21:02:35,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 17170432. Throughput: 0: 823.5, 1: 823.2. Samples: 4291676. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:35,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')]
+[2023-09-25 21:02:40,396][109224] Updated weights for policy 1, policy_version 33600 (0.0016)
+[2023-09-25 21:02:40,396][109225] Updated weights for policy 0, policy_version 33600 (0.0017)
+[2023-09-25 21:02:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6690.1, 300 sec: 6553.6). Total num frames: 17203200. Throughput: 0: 821.4, 1: 820.2. Samples: 4296669. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:40,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.960')]
+[2023-09-25 21:02:45,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17227776. Throughput: 0: 817.3, 1: 818.1. Samples: 4305954. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:45,473][108279] Avg episode reward: [(0, '9.820'), (1, '9.960')]
+[2023-09-25 21:02:50,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 17260544. Throughput: 0: 815.0, 1: 815.5. Samples: 4315494. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:50,470][108279] Avg episode reward: [(0, '9.820'), (1, '9.960')]
+[2023-09-25 21:02:53,143][109225] Updated weights for policy 0, policy_version 33760 (0.0017)
+[2023-09-25 21:02:53,143][109224] Updated weights for policy 1, policy_version 33760 (0.0017)
+[2023-09-25 21:02:55,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6485.4, 300 sec: 6525.8). Total num frames: 17293312. Throughput: 0: 814.6, 1: 814.3. Samples: 4320650. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:02:55,470][108279] Avg episode reward: [(0, '9.820'), (1, '9.960')]
+[2023-09-25 21:03:00,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 17326080. Throughput: 0: 815.0, 1: 815.1. Samples: 4330436. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:03:00,471][108279] Avg episode reward: [(0, '9.830'), (1, '9.960')]
+[2023-09-25 21:03:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 17358848. Throughput: 0: 814.4, 1: 814.9. Samples: 4340134. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:03:05,470][108279] Avg episode reward: [(0, '9.840'), (1, '9.970')]
+[2023-09-25 21:03:05,662][109225] Updated weights for policy 0, policy_version 33920 (0.0017)
+[2023-09-25 21:03:05,662][109224] Updated weights for policy 1, policy_version 33920 (0.0015)
+[2023-09-25 21:03:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 17391616. Throughput: 0: 814.2, 1: 813.9. Samples: 4345240. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:03:10,471][108279] Avg episode reward: [(0, '9.840'), (1, '9.970')]
+[2023-09-25 21:03:15,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6485.3, 300 sec: 6525.8). Total num frames: 17424384. Throughput: 0: 813.0, 1: 812.5. Samples: 4355001. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:03:15,471][108279] Avg episode reward: [(0, '9.850'), (1, '9.970')]
+[2023-09-25 21:03:18,085][109224] Updated weights for policy 1, policy_version 34080 (0.0017)
+[2023-09-25 21:03:18,086][109225] Updated weights for policy 0, policy_version 34080 (0.0016)
+[2023-09-25 21:03:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 17457152. Throughput: 0: 811.6, 1: 812.6. Samples: 4364764. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-25 21:03:20,471][108279] Avg episode reward: [(0, '9.850'), (1, '9.970')]
+[2023-09-25 21:03:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 17489920. Throughput: 0: 810.1, 1: 811.8. Samples: 4369655. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 21:03:25,471][108279] Avg episode reward: [(0, '9.850'), (1, '9.970')]
+[2023-09-25 21:03:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 17522688. Throughput: 0: 815.6, 1: 814.7. Samples: 4379318. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 21:03:30,471][108279] Avg episode reward: [(0, '9.840'), (1, '9.970')]
+[2023-09-25 21:03:30,756][109225] Updated weights for policy 0, policy_version 34240 (0.0017)
+[2023-09-25 21:03:30,757][109224] Updated weights for policy 1, policy_version 34240 (0.0016)
+[2023-09-25 21:03:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 17555456. Throughput: 0: 817.7, 1: 817.3. Samples: 4389070. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 21:03:35,471][108279] Avg episode reward: [(0, '9.840'), (1, '9.970')]
+[2023-09-25 21:03:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 17588224. Throughput: 0: 816.8, 1: 816.8. Samples: 4394162. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 21:03:40,471][108279] Avg episode reward: [(0, '9.840'), (1, '9.970')]
+[2023-09-25 21:03:43,276][109224] Updated weights for policy 1, policy_version 34400 (0.0017)
+[2023-09-25 21:03:43,276][109225] Updated weights for policy 0, policy_version 34400 (0.0018)
+[2023-09-25 21:03:45,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17620992. Throughput: 0: 816.0, 1: 816.1. Samples: 4403880. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-25 21:03:45,470][108279] Avg episode reward: [(0, '9.840'), (1, '9.970')]
+[2023-09-25 21:03:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17653760. Throughput: 0: 816.0, 1: 813.8. Samples: 4413476. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:03:50,471][108279] Avg episode reward: [(0, '9.830'), (1, '9.960')]
+[2023-09-25 21:03:55,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17686528. Throughput: 0: 813.3, 1: 814.0. Samples: 4418469. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:03:55,471][108279] Avg episode reward: [(0, '9.830'), (1, '9.960')]
+[2023-09-25 21:03:55,879][109225] Updated weights for policy 0, policy_version 34560 (0.0017)
+[2023-09-25 21:03:55,879][109224] Updated weights for policy 1, policy_version 34560 (0.0017)
+[2023-09-25 21:04:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17719296. Throughput: 0: 813.8, 1: 813.9. Samples: 4428251. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:04:00,470][108279] Avg episode reward: [(0, '9.840'), (1, '9.960')]
+[2023-09-25 21:04:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17752064. Throughput: 0: 815.4, 1: 812.4. Samples: 4438016. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:04:05,471][108279] Avg episode reward: [(0, '9.830'), (1, '9.960')]
+[2023-09-25 21:04:08,474][109225] Updated weights for policy 0, policy_version 34720 (0.0017)
+[2023-09-25 21:04:08,474][109224] Updated weights for policy 1, policy_version 34720 (0.0017)
+[2023-09-25 21:04:10,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17784832. Throughput: 0: 813.1, 1: 813.2. Samples: 4442840. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:04:10,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.960')]
+[2023-09-25 21:04:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17817600. Throughput: 0: 814.9, 1: 815.5. Samples: 4452687. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:04:15,471][108279] Avg episode reward: [(0, '9.820'), (1, '9.960')]
+[2023-09-25 21:04:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17850368. Throughput: 0: 818.0, 1: 815.8. Samples: 4462592. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-25 21:04:20,471][108279] Avg episode reward: [(0, '9.800'), (1, '9.960')]
+[2023-09-25 21:04:20,999][109225] Updated weights for policy 0, policy_version 34880 (0.0014)
+[2023-09-25 21:04:20,999][109224] Updated weights for policy 1, policy_version 34880 (0.0017)
+[2023-09-25 21:04:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17883136. Throughput: 0: 814.6, 1: 815.0. Samples: 4467492.
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 21:04:25,471][108279] Avg episode reward: [(0, '9.800'), (1, '9.960')] +[2023-09-25 21:04:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17915904. Throughput: 0: 816.4, 1: 816.5. Samples: 4477360. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-25 21:04:30,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')] +[2023-09-25 21:04:30,481][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000034992_8957952.pth... +[2023-09-25 21:04:30,481][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000034992_8957952.pth... +[2023-09-25 21:04:30,516][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000031936_8175616.pth +[2023-09-25 21:04:30,519][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000031936_8175616.pth +[2023-09-25 21:04:33,434][109224] Updated weights for policy 1, policy_version 35040 (0.0017) +[2023-09-25 21:04:33,434][109225] Updated weights for policy 0, policy_version 35040 (0.0018) +[2023-09-25 21:04:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17948672. Throughput: 0: 819.2, 1: 818.4. Samples: 4487168. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 21:04:35,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')] +[2023-09-25 21:04:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 17981440. Throughput: 0: 815.8, 1: 814.6. Samples: 4491839. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 21:04:40,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')] +[2023-09-25 21:04:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18014208. Throughput: 0: 815.2, 1: 812.8. Samples: 4501513. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 21:04:45,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')] +[2023-09-25 21:04:46,220][109225] Updated weights for policy 0, policy_version 35200 (0.0017) +[2023-09-25 21:04:46,220][109224] Updated weights for policy 1, policy_version 35200 (0.0017) +[2023-09-25 21:04:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18046976. Throughput: 0: 813.1, 1: 816.2. Samples: 4511335. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-25 21:04:50,471][108279] Avg episode reward: [(0, '9.800'), (1, '9.960')] +[2023-09-25 21:04:55,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18079744. Throughput: 0: 813.2, 1: 813.2. Samples: 4516030. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:04:55,471][108279] Avg episode reward: [(0, '9.800'), (1, '9.960')] +[2023-09-25 21:04:58,782][109224] Updated weights for policy 1, policy_version 35360 (0.0014) +[2023-09-25 21:04:58,783][109225] Updated weights for policy 0, policy_version 35360 (0.0017) +[2023-09-25 21:05:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18112512. Throughput: 0: 816.8, 1: 814.1. Samples: 4526080. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:05:00,471][108279] Avg episode reward: [(0, '9.810'), (1, '9.960')] +[2023-09-25 21:05:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18145280. Throughput: 0: 809.3, 1: 811.9. Samples: 4535544. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:05:05,471][108279] Avg episode reward: [(0, '9.810'), (1, '9.960')] +[2023-09-25 21:05:10,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18178048. Throughput: 0: 811.6, 1: 809.1. Samples: 4540421. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:05:10,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')] +[2023-09-25 21:05:11,334][109224] Updated weights for policy 1, policy_version 35520 (0.0017) +[2023-09-25 21:05:11,334][109225] Updated weights for policy 0, policy_version 35520 (0.0017) +[2023-09-25 21:05:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18210816. Throughput: 0: 815.7, 1: 813.1. Samples: 4550653. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:05:15,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.960')] +[2023-09-25 21:05:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18243584. Throughput: 0: 813.4, 1: 815.3. Samples: 4560457. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:20,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.960')] +[2023-09-25 21:05:23,807][109224] Updated weights for policy 1, policy_version 35680 (0.0018) +[2023-09-25 21:05:23,807][109225] Updated weights for policy 0, policy_version 35680 (0.0013) +[2023-09-25 21:05:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18276352. Throughput: 0: 813.9, 1: 814.4. Samples: 4565113. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:25,470][108279] Avg episode reward: [(0, '9.770'), (1, '9.960')] +[2023-09-25 21:05:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18309120. Throughput: 0: 817.7, 1: 819.0. Samples: 4575165. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:30,471][108279] Avg episode reward: [(0, '9.760'), (1, '9.960')] +[2023-09-25 21:05:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 18341888. Throughput: 0: 818.5, 1: 817.8. Samples: 4584967. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:35,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.990')] +[2023-09-25 21:05:35,472][109025] Saving new best policy, reward=9.990! +[2023-09-25 21:05:36,357][109224] Updated weights for policy 1, policy_version 35840 (0.0015) +[2023-09-25 21:05:36,357][109225] Updated weights for policy 0, policy_version 35840 (0.0018) +[2023-09-25 21:05:40,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18374656. Throughput: 0: 818.5, 1: 817.4. Samples: 4589643. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:40,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.990')] +[2023-09-25 21:05:45,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18407424. Throughput: 0: 817.4, 1: 818.7. Samples: 4599703. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:45,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.990')] +[2023-09-25 21:05:48,933][109224] Updated weights for policy 1, policy_version 36000 (0.0017) +[2023-09-25 21:05:48,933][109225] Updated weights for policy 0, policy_version 36000 (0.0016) +[2023-09-25 21:05:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18440192. Throughput: 0: 820.1, 1: 820.5. Samples: 4609373. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:50,471][108279] Avg episode reward: [(0, '9.780'), (1, '9.990')] +[2023-09-25 21:05:55,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18472960. Throughput: 0: 819.2, 1: 820.6. Samples: 4614211. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:05:55,470][108279] Avg episode reward: [(0, '9.780'), (1, '9.990')] +[2023-09-25 21:06:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18505728. Throughput: 0: 817.8, 1: 819.2. Samples: 4624320. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:00,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.980')] +[2023-09-25 21:06:01,317][109225] Updated weights for policy 0, policy_version 36160 (0.0016) +[2023-09-25 21:06:01,317][109224] Updated weights for policy 1, policy_version 36160 (0.0018) +[2023-09-25 21:06:05,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18538496. Throughput: 0: 818.3, 1: 819.1. Samples: 4634137. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:05,471][108279] Avg episode reward: [(0, '9.760'), (1, '9.980')] +[2023-09-25 21:06:10,470][108279] Fps is (10 sec: 6553.8, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18571264. Throughput: 0: 819.2, 1: 819.3. Samples: 4638846. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:10,470][108279] Avg episode reward: [(0, '9.770'), (1, '9.980')] +[2023-09-25 21:06:13,776][109224] Updated weights for policy 1, policy_version 36320 (0.0015) +[2023-09-25 21:06:13,776][109225] Updated weights for policy 0, policy_version 36320 (0.0017) +[2023-09-25 21:06:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18604032. Throughput: 0: 820.6, 1: 819.2. Samples: 4648958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:15,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.980')] +[2023-09-25 21:06:20,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18636800. Throughput: 0: 818.6, 1: 818.5. Samples: 4658635. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:20,471][108279] Avg episode reward: [(0, '9.740'), (1, '9.980')] +[2023-09-25 21:06:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18669568. Throughput: 0: 819.1, 1: 818.9. Samples: 4663356. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:25,471][108279] Avg episode reward: [(0, '9.740'), (1, '9.980')] +[2023-09-25 21:06:26,465][109224] Updated weights for policy 1, policy_version 36480 (0.0017) +[2023-09-25 21:06:26,465][109225] Updated weights for policy 0, policy_version 36480 (0.0016) +[2023-09-25 21:06:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18702336. Throughput: 0: 815.0, 1: 816.5. Samples: 4673122. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:30,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.980')] +[2023-09-25 21:06:30,479][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000036528_9351168.pth... +[2023-09-25 21:06:30,479][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000036528_9351168.pth... +[2023-09-25 21:06:30,508][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000033472_8568832.pth +[2023-09-25 21:06:30,514][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000033472_8568832.pth +[2023-09-25 21:06:35,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18735104. Throughput: 0: 816.6, 1: 816.3. Samples: 4682857. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:35,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.980')] +[2023-09-25 21:06:39,008][109224] Updated weights for policy 1, policy_version 36640 (0.0017) +[2023-09-25 21:06:39,008][109225] Updated weights for policy 0, policy_version 36640 (0.0017) +[2023-09-25 21:06:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 18767872. Throughput: 0: 819.1, 1: 817.7. Samples: 4687867. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:40,470][108279] Avg episode reward: [(0, '9.730'), (1, '9.980')] +[2023-09-25 21:06:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). 
Total num frames: 18800640. Throughput: 0: 816.6, 1: 817.1. Samples: 4697837. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:45,470][108279] Avg episode reward: [(0, '9.730'), (1, '9.980')] +[2023-09-25 21:06:50,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 18833408. Throughput: 0: 814.8, 1: 814.5. Samples: 4707452. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:06:50,470][108279] Avg episode reward: [(0, '9.730'), (1, '9.980')] +[2023-09-25 21:06:51,549][109224] Updated weights for policy 1, policy_version 36800 (0.0017) +[2023-09-25 21:06:51,550][109225] Updated weights for policy 0, policy_version 36800 (0.0014) +[2023-09-25 21:06:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 18866176. Throughput: 0: 818.7, 1: 816.6. Samples: 4712435. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 21:06:55,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.980')] +[2023-09-25 21:07:00,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 18898944. Throughput: 0: 813.0, 1: 815.5. Samples: 4722238. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 21:07:00,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.980')] +[2023-09-25 21:07:04,024][109225] Updated weights for policy 0, policy_version 36960 (0.0014) +[2023-09-25 21:07:04,024][109224] Updated weights for policy 1, policy_version 36960 (0.0015) +[2023-09-25 21:07:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18931712. Throughput: 0: 815.5, 1: 816.2. Samples: 4732060. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 21:07:05,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.980')] +[2023-09-25 21:07:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6539.7). Total num frames: 18964480. Throughput: 0: 819.2, 1: 817.9. Samples: 4737024. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 21:07:10,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.980')] +[2023-09-25 21:07:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 18997248. Throughput: 0: 820.7, 1: 820.5. Samples: 4746974. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-25 21:07:15,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.980')] +[2023-09-25 21:07:16,421][109225] Updated weights for policy 0, policy_version 37120 (0.0016) +[2023-09-25 21:07:16,421][109224] Updated weights for policy 1, policy_version 37120 (0.0015) +[2023-09-25 21:07:20,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 19030016. Throughput: 0: 822.1, 1: 820.9. Samples: 4756792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:20,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.970')] +[2023-09-25 21:07:25,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 19062784. Throughput: 0: 819.3, 1: 819.2. Samples: 4761600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:25,470][108279] Avg episode reward: [(0, '9.720'), (1, '9.970')] +[2023-09-25 21:07:29,116][109224] Updated weights for policy 1, policy_version 37280 (0.0017) +[2023-09-25 21:07:29,116][109225] Updated weights for policy 0, policy_version 37280 (0.0019) +[2023-09-25 21:07:30,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 19095552. Throughput: 0: 816.4, 1: 817.1. Samples: 4771342. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:30,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.970')] +[2023-09-25 21:07:35,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6525.8). Total num frames: 19128320. Throughput: 0: 814.6, 1: 815.3. Samples: 4780797. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:35,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.970')] +[2023-09-25 21:07:40,470][108279] Fps is (10 sec: 6144.2, 60 sec: 6485.3, 300 sec: 6539.7). Total num frames: 19156992. Throughput: 0: 814.7, 1: 817.2. Samples: 4785872. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:40,470][108279] Avg episode reward: [(0, '9.710'), (1, '9.970')] +[2023-09-25 21:07:41,731][109225] Updated weights for policy 0, policy_version 37440 (0.0017) +[2023-09-25 21:07:41,731][109224] Updated weights for policy 1, policy_version 37440 (0.0017) +[2023-09-25 21:07:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19193856. Throughput: 0: 815.3, 1: 815.7. Samples: 4795632. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:45,470][108279] Avg episode reward: [(0, '9.720'), (1, '9.970')] +[2023-09-25 21:07:50,470][108279] Fps is (10 sec: 6963.1, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19226624. Throughput: 0: 816.7, 1: 815.6. Samples: 4805514. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:50,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.970')] +[2023-09-25 21:07:54,147][109224] Updated weights for policy 1, policy_version 37600 (0.0019) +[2023-09-25 21:07:54,148][109225] Updated weights for policy 0, policy_version 37600 (0.0019) +[2023-09-25 21:07:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19259392. Throughput: 0: 816.0, 1: 819.0. Samples: 4810598. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:07:55,471][108279] Avg episode reward: [(0, '9.720'), (1, '9.960')] +[2023-09-25 21:08:00,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19292160. Throughput: 0: 816.8, 1: 816.6. Samples: 4820473. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:08:00,470][108279] Avg episode reward: [(0, '9.720'), (1, '9.960')] +[2023-09-25 21:08:05,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19324928. Throughput: 0: 813.6, 1: 815.1. Samples: 4830086. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 21:08:05,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.970')] +[2023-09-25 21:08:06,684][109225] Updated weights for policy 0, policy_version 37760 (0.0018) +[2023-09-25 21:08:06,684][109224] Updated weights for policy 1, policy_version 37760 (0.0017) +[2023-09-25 21:08:10,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19357696. Throughput: 0: 816.0, 1: 818.7. Samples: 4835164. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 21:08:10,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.970')] +[2023-09-25 21:08:15,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19390464. Throughput: 0: 817.6, 1: 818.5. Samples: 4844963. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 21:08:15,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.970')] +[2023-09-25 21:08:19,117][109224] Updated weights for policy 1, policy_version 37920 (0.0015) +[2023-09-25 21:08:19,117][109225] Updated weights for policy 0, policy_version 37920 (0.0017) +[2023-09-25 21:08:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19423232. Throughput: 0: 821.9, 1: 821.3. Samples: 4854740. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 21:08:20,470][108279] Avg episode reward: [(0, '9.730'), (1, '9.970')] +[2023-09-25 21:08:25,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19456000. Throughput: 0: 821.6, 1: 821.2. Samples: 4859798. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-25 21:08:25,471][108279] Avg episode reward: [(0, '9.740'), (1, '9.970')] +[2023-09-25 21:08:30,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19488768. Throughput: 0: 824.6, 1: 823.7. Samples: 4869803. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:08:30,471][108279] Avg episode reward: [(0, '9.770'), (1, '9.970')] +[2023-09-25 21:08:30,484][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000038064_9744384.pth... +[2023-09-25 21:08:30,485][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000038064_9744384.pth... +[2023-09-25 21:08:30,518][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000034992_8957952.pth +[2023-09-25 21:08:30,522][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000034992_8957952.pth +[2023-09-25 21:08:31,498][109224] Updated weights for policy 1, policy_version 38080 (0.0017) +[2023-09-25 21:08:31,499][109225] Updated weights for policy 0, policy_version 38080 (0.0019) +[2023-09-25 21:08:35,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19521536. Throughput: 0: 821.1, 1: 821.6. Samples: 4879434. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:08:35,470][108279] Avg episode reward: [(0, '9.770'), (1, '9.970')] +[2023-09-25 21:08:40,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6621.8, 300 sec: 6553.6). Total num frames: 19554304. Throughput: 0: 821.9, 1: 819.4. Samples: 4884457. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:08:40,471][108279] Avg episode reward: [(0, '9.780'), (1, '9.970')] +[2023-09-25 21:08:43,976][109225] Updated weights for policy 0, policy_version 38240 (0.0016) +[2023-09-25 21:08:43,976][109224] Updated weights for policy 1, policy_version 38240 (0.0016) +[2023-09-25 21:08:45,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6553.6). 
Total num frames: 19587072. Throughput: 0: 821.4, 1: 821.9. Samples: 4894419. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:08:45,471][108279] Avg episode reward: [(0, '9.780'), (1, '9.970')] +[2023-09-25 21:08:50,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19619840. Throughput: 0: 823.3, 1: 823.8. Samples: 4904203. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:08:50,470][108279] Avg episode reward: [(0, '9.790'), (1, '9.970')] +[2023-09-25 21:08:55,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19652608. Throughput: 0: 822.4, 1: 819.8. Samples: 4909060. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:08:55,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.970')] +[2023-09-25 21:08:56,459][109224] Updated weights for policy 1, policy_version 38400 (0.0016) +[2023-09-25 21:08:56,459][109225] Updated weights for policy 0, policy_version 38400 (0.0017) +[2023-09-25 21:09:00,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19685376. Throughput: 0: 824.3, 1: 823.0. Samples: 4919090. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:09:00,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.970')] +[2023-09-25 21:09:05,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19718144. Throughput: 0: 823.7, 1: 823.2. Samples: 4928850. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:09:05,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.970')] +[2023-09-25 21:09:08,915][109224] Updated weights for policy 1, policy_version 38560 (0.0016) +[2023-09-25 21:09:08,916][109225] Updated weights for policy 0, policy_version 38560 (0.0018) +[2023-09-25 21:09:10,470][108279] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19750912. Throughput: 0: 821.5, 1: 819.6. Samples: 4933649. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:09:10,471][108279] Avg episode reward: [(0, '9.790'), (1, '9.970')] +[2023-09-25 21:09:15,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19783680. Throughput: 0: 817.8, 1: 818.4. Samples: 4943430. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-25 21:09:15,470][108279] Avg episode reward: [(0, '9.760'), (1, '9.970')] +[2023-09-25 21:09:20,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19816448. Throughput: 0: 819.8, 1: 820.4. Samples: 4953245. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 21:09:20,471][108279] Avg episode reward: [(0, '9.760'), (1, '9.970')] +[2023-09-25 21:09:21,484][109224] Updated weights for policy 1, policy_version 38720 (0.0013) +[2023-09-25 21:09:21,485][109225] Updated weights for policy 0, policy_version 38720 (0.0018) +[2023-09-25 21:09:25,470][108279] Fps is (10 sec: 6553.4, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19849216. Throughput: 0: 819.7, 1: 819.2. Samples: 4958208. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 21:09:25,471][108279] Avg episode reward: [(0, '9.750'), (1, '9.970')] +[2023-09-25 21:09:30,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 19881984. Throughput: 0: 819.2, 1: 818.6. Samples: 4968118. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 21:09:30,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.970')] +[2023-09-25 21:09:34,236][109224] Updated weights for policy 1, policy_version 38880 (0.0017) +[2023-09-25 21:09:34,236][109225] Updated weights for policy 0, policy_version 38880 (0.0015) +[2023-09-25 21:09:35,470][108279] Fps is (10 sec: 5734.5, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 19906560. Throughput: 0: 812.9, 1: 812.3. Samples: 4977335. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 21:09:35,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.970')] +[2023-09-25 21:09:40,470][108279] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 19939328. Throughput: 0: 810.6, 1: 813.0. Samples: 4982120. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 21:09:40,471][108279] Avg episode reward: [(0, '9.710'), (1, '9.970')] +[2023-09-25 21:09:45,470][108279] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6525.8). Total num frames: 19972096. Throughput: 0: 808.3, 1: 807.7. Samples: 4991813. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-25 21:09:45,470][108279] Avg episode reward: [(0, '9.730'), (1, '9.960')] +[2023-09-25 21:09:47,148][109224] Updated weights for policy 1, policy_version 39040 (0.0016) +[2023-09-25 21:09:47,148][109225] Updated weights for policy 0, policy_version 39040 (0.0017) +[2023-09-25 21:09:50,470][108279] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6525.8). Total num frames: 20004864. Throughput: 0: 805.1, 1: 803.6. Samples: 5001241. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-25 21:09:50,471][108279] Avg episode reward: [(0, '9.730'), (1, '9.960')] +[2023-09-25 21:09:50,856][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000039088_10006528.pth... +[2023-09-25 21:09:50,856][109227] Stopping RolloutWorker_w0... +[2023-09-25 21:09:50,856][109259] Stopping RolloutWorker_w1... +[2023-09-25 21:09:50,856][109262] Stopping RolloutWorker_w4... +[2023-09-25 21:09:50,857][109227] Loop rollout_proc0_evt_loop terminating... +[2023-09-25 21:09:50,856][109266] Stopping RolloutWorker_w7... +[2023-09-25 21:09:50,856][109265] Stopping RolloutWorker_w6... +[2023-09-25 21:09:50,857][109262] Loop rollout_proc4_evt_loop terminating... +[2023-09-25 21:09:50,857][109261] Stopping RolloutWorker_w3... +[2023-09-25 21:09:50,856][109264] Stopping RolloutWorker_w5... 
+[2023-09-25 21:09:50,857][109263] Stopping RolloutWorker_w2... +[2023-09-25 21:09:50,856][108279] Component RolloutWorker_w1 stopped! +[2023-09-25 21:09:50,857][109259] Loop rollout_proc1_evt_loop terminating... +[2023-09-25 21:09:50,857][109266] Loop rollout_proc7_evt_loop terminating... +[2023-09-25 21:09:50,857][108926] Stopping Batcher_0... +[2023-09-25 21:09:50,857][109264] Loop rollout_proc5_evt_loop terminating... +[2023-09-25 21:09:50,857][109261] Loop rollout_proc3_evt_loop terminating... +[2023-09-25 21:09:50,857][109265] Loop rollout_proc6_evt_loop terminating... +[2023-09-25 21:09:50,857][109263] Loop rollout_proc2_evt_loop terminating... +[2023-09-25 21:09:50,858][108279] Component RolloutWorker_w0 stopped! +[2023-09-25 21:09:50,858][108926] Loop batcher_evt_loop terminating... +[2023-09-25 21:09:50,858][108279] Component RolloutWorker_w4 stopped! +[2023-09-25 21:09:50,859][108279] Component RolloutWorker_w6 stopped! +[2023-09-25 21:09:50,859][108279] Component Batcher_1 stopped! +[2023-09-25 21:09:50,860][108279] Component RolloutWorker_w7 stopped! +[2023-09-25 21:09:50,860][108279] Component RolloutWorker_w5 stopped! +[2023-09-25 21:09:50,861][108279] Component RolloutWorker_w3 stopped! +[2023-09-25 21:09:50,861][108279] Component RolloutWorker_w2 stopped! +[2023-09-25 21:09:50,861][108279] Component Batcher_0 stopped! +[2023-09-25 21:09:50,856][109025] Stopping Batcher_1... +[2023-09-25 21:09:50,875][109025] Loop batcher_evt_loop terminating... +[2023-09-25 21:09:50,893][109025] Removing ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000036528_9351168.pth +[2023-09-25 21:09:50,897][109025] Saving ./train_atari/atari_bowling/checkpoint_p1/checkpoint_000039088_10006528.pth... +[2023-09-25 21:09:50,913][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000039088_10006528.pth... +[2023-09-25 21:09:50,919][109225] Weights refcount: 2 0 +[2023-09-25 21:09:50,921][109225] Stopping InferenceWorker_p0-w0... 
+[2023-09-25 21:09:50,921][109225] Loop inference_proc0-0_evt_loop terminating...
+[2023-09-25 21:09:50,921][108279] Component InferenceWorker_p0-w0 stopped!
+[2023-09-25 21:09:50,925][109224] Weights refcount: 2 0
+[2023-09-25 21:09:50,926][109224] Stopping InferenceWorker_p1-w0...
+[2023-09-25 21:09:50,926][109224] Loop inference_proc1-0_evt_loop terminating...
+[2023-09-25 21:09:50,926][108279] Component InferenceWorker_p1-w0 stopped!
+[2023-09-25 21:09:50,936][109025] Stopping LearnerWorker_p1...
+[2023-09-25 21:09:50,936][109025] Loop learner_proc1_evt_loop terminating...
+[2023-09-25 21:09:50,938][108279] Component LearnerWorker_p1 stopped!
+[2023-09-25 21:09:50,942][108926] Removing ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000036528_9351168.pth
+[2023-09-25 21:09:50,946][108926] Saving ./train_atari/atari_bowling/checkpoint_p0/checkpoint_000039088_10006528.pth...
+[2023-09-25 21:09:50,982][108926] Stopping LearnerWorker_p0...
+[2023-09-25 21:09:50,982][108926] Loop learner_proc0_evt_loop terminating...
+[2023-09-25 21:09:50,982][108279] Component LearnerWorker_p0 stopped!
+[2023-09-25 21:09:50,983][108279] Waiting for process learner_proc0 to stop...
+[2023-09-25 21:09:51,643][108279] Waiting for process learner_proc1 to stop...
+[2023-09-25 21:09:51,644][108279] Waiting for process inference_proc0-0 to join...
+[2023-09-25 21:09:51,673][108279] Waiting for process inference_proc1-0 to join...
+[2023-09-25 21:09:51,674][108279] Waiting for process rollout_proc0 to join...
+[2023-09-25 21:09:51,675][108279] Waiting for process rollout_proc1 to join...
+[2023-09-25 21:09:51,676][108279] Waiting for process rollout_proc2 to join...
+[2023-09-25 21:09:51,676][108279] Waiting for process rollout_proc3 to join...
+[2023-09-25 21:09:51,677][108279] Waiting for process rollout_proc4 to join...
+[2023-09-25 21:09:51,678][108279] Waiting for process rollout_proc5 to join...
+[2023-09-25 21:09:51,679][108279] Waiting for process rollout_proc6 to join...
+[2023-09-25 21:09:51,680][108279] Waiting for process rollout_proc7 to join...
+[2023-09-25 21:09:51,680][108279] Batcher 0 profile tree view:
+batching: 20.6225, releasing_batches: 1.7891
+[2023-09-25 21:09:51,681][108279] Batcher 1 profile tree view:
+batching: 20.8886, releasing_batches: 1.8818
+[2023-09-25 21:09:51,681][108279] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0051
+ wait_policy_total: 616.9296
+update_model: 36.3037
+ weight_update: 0.0017
+one_step: 0.0011
+ handle_policy_step: 2218.6380
+ deserialize: 66.2649, stack: 15.9493, obs_to_device_normalize: 540.3936, forward: 1066.9566, send_messages: 92.6701
+ prepare_outputs: 294.9488
+ to_cpu: 147.0230
+[2023-09-25 21:09:51,681][108279] InferenceWorker_p1-w0 profile tree view:
+wait_policy: 0.0052
+ wait_policy_total: 619.4736
+update_model: 36.8075
+ weight_update: 0.0015
+one_step: 0.0012
+ handle_policy_step: 2212.1271
+ deserialize: 67.7604, stack: 16.3694, obs_to_device_normalize: 537.5653, forward: 1059.3219, send_messages: 93.4417
+ prepare_outputs: 292.5068
+ to_cpu: 146.2553
+[2023-09-25 21:09:51,682][108279] Learner 0 profile tree view:
+misc: 0.0152, prepare_batch: 31.9294
+train: 457.3288
+ epoch_init: 0.1032, minibatch_init: 3.0735, losses_postprocess: 62.6364, kl_divergence: 5.3809, after_optimizer: 23.2850
+ calculate_losses: 44.6980
+ losses_init: 0.0995, forward_head: 14.1978, bptt_initial: 0.4313, bptt: 0.4746, tail: 10.3070, advantages_returns: 3.0465, losses: 12.6316
+ update: 314.1444
+ clip: 162.4632
+[2023-09-25 21:09:51,682][108279] Learner 1 profile tree view:
+misc: 0.0152, prepare_batch: 32.1193
+train: 456.3154
+ epoch_init: 0.1005, minibatch_init: 3.2161, losses_postprocess: 61.2132, kl_divergence: 5.5294, after_optimizer: 23.1887
+ calculate_losses: 44.6887
+ losses_init: 0.1012, forward_head: 13.5405, bptt_initial: 0.4526, bptt: 0.4558, tail: 10.5134, advantages_returns: 3.1060, losses: 12.8752
+ update: 314.2029
+ clip: 161.0696
+[2023-09-25 21:09:51,683][108279] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.3973, enqueue_policy_requests: 43.3810, env_step: 969.9135, overhead: 29.1769, complete_rollouts: 1.0872
+save_policy_outputs: 54.0324
+ split_output_tensors: 18.8523
+[2023-09-25 21:09:51,683][108279] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.3966, enqueue_policy_requests: 42.2082, env_step: 1022.0259, overhead: 29.4626, complete_rollouts: 1.0551
+save_policy_outputs: 52.9515
+ split_output_tensors: 18.1506
+[2023-09-25 21:09:51,684][108279] Loop Runner_EvtLoop terminating...
+[2023-09-25 21:09:51,684][108279] Runner profile tree view:
+main_loop: 3077.0532
+[2023-09-25 21:09:51,685][108279] Collected {0: 10006528, 1: 10006528}, FPS: 6504.0