diff --git "a/sf_log.txt" "b/sf_log.txt"
new file mode 100644
--- /dev/null
+++ "b/sf_log.txt"
@@ -0,0 +1,2192 @@
+[2023-09-27 10:58:49,659][08744] Saving configuration to ./train_atari/atari_timepilot/config.json...
+[2023-09-27 10:58:49,975][08744] Rollout worker 0 uses device cpu
+[2023-09-27 10:58:49,976][08744] Rollout worker 1 uses device cpu
+[2023-09-27 10:58:49,976][08744] Rollout worker 2 uses device cpu
+[2023-09-27 10:58:49,977][08744] Rollout worker 3 uses device cpu
+[2023-09-27 10:58:49,977][08744] Rollout worker 4 uses device cpu
+[2023-09-27 10:58:49,978][08744] Rollout worker 5 uses device cpu
+[2023-09-27 10:58:49,978][08744] Rollout worker 6 uses device cpu
+[2023-09-27 10:58:49,979][08744] Rollout worker 7 uses device cpu
+[2023-09-27 10:58:49,979][08744] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
+[2023-09-27 10:58:50,027][08744] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-09-27 10:58:50,027][08744] InferenceWorker_p0-w0: min num requests: 1
+[2023-09-27 10:58:50,031][08744] Using GPUs [1] for process 1 (actually maps to GPUs [1])
+[2023-09-27 10:58:50,031][08744] InferenceWorker_p1-w0: min num requests: 1
+[2023-09-27 10:58:50,054][08744] Starting all processes...
+[2023-09-27 10:58:50,054][08744] Starting process learner_proc0
+[2023-09-27 10:58:51,708][08744] Starting process learner_proc1
+[2023-09-27 10:58:51,712][09606] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-09-27 10:58:51,713][09606] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2023-09-27 10:58:51,731][09606] Num visible devices: 1
+[2023-09-27 10:58:51,748][09606] Starting seed is not provided
+[2023-09-27 10:58:51,748][09606] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-09-27 10:58:51,748][09606] Initializing actor-critic model on device cuda:0
+[2023-09-27 10:58:51,749][09606] RunningMeanStd input shape: (4, 84, 84)
+[2023-09-27 10:58:51,749][09606] RunningMeanStd input shape: (1,)
+[2023-09-27 10:58:51,761][09606] ConvEncoder: input_channels=4
+[2023-09-27 10:58:51,921][09606] Conv encoder output size: 512
+[2023-09-27 10:58:51,923][09606] Created Actor Critic model with architecture:
+[2023-09-27 10:58:51,923][09606] ActorCriticSharedWeights(
+  (obs_normalizer): ObservationNormalizer(
+    (running_mean_std): RunningMeanStdDictInPlace(
+      (running_mean_std): ModuleDict(
+        (obs): RunningMeanStdInPlace()
+      )
+    )
+  )
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (encoder): MultiInputEncoder(
+    (encoders): ModuleDict(
+      (obs): ConvEncoder(
+        (enc): RecursiveScriptModule(
+          original_name=ConvEncoderImpl
+          (conv_head): RecursiveScriptModule(
+            original_name=Sequential
+            (0): RecursiveScriptModule(original_name=Conv2d)
+            (1): RecursiveScriptModule(original_name=ReLU)
+            (2): RecursiveScriptModule(original_name=Conv2d)
+            (3): RecursiveScriptModule(original_name=ReLU)
+            (4): RecursiveScriptModule(original_name=Conv2d)
+            (5): RecursiveScriptModule(original_name=ReLU)
+          )
+          (mlp_layers): RecursiveScriptModule(
+            original_name=Sequential
+            (0): RecursiveScriptModule(original_name=Linear)
+            (1): RecursiveScriptModule(original_name=ReLU)
+          )
+        )
+      )
+    )
+  )
+  (core): ModelCoreIdentity()
+  (decoder): MlpDecoder(
+    (mlp): Identity()
+  )
+  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+  (action_parameterization): ActionParameterizationDefault(
+    (distribution_linear): Linear(in_features=512, out_features=10, bias=True)
+  )
+)
+[2023-09-27 10:58:52,517][09606] Using optimizer
+[2023-09-27 10:58:52,518][09606] No checkpoints found
+[2023-09-27 10:58:52,518][09606] Did not load from checkpoint, starting from scratch!
+[2023-09-27 10:58:52,518][09606] Initialized policy 0 weights for model version 0
+[2023-09-27 10:58:52,520][09606] LearnerWorker_p0 finished initialization!
+[2023-09-27 10:58:52,520][09606] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-09-27 10:58:53,389][08744] Starting all processes...
+[2023-09-27 10:58:53,393][09742] Using GPUs [1] for process 1 (actually maps to GPUs [1])
+[2023-09-27 10:58:53,393][09742] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
+[2023-09-27 10:58:53,396][08744] Starting process inference_proc0-0
+[2023-09-27 10:58:53,396][08744] Starting process inference_proc1-0
+[2023-09-27 10:58:53,397][08744] Starting process rollout_proc0
+[2023-09-27 10:58:53,397][08744] Starting process rollout_proc1
+[2023-09-27 10:58:53,412][09742] Num visible devices: 1
+[2023-09-27 10:58:53,397][08744] Starting process rollout_proc2
+[2023-09-27 10:58:53,398][08744] Starting process rollout_proc3
+[2023-09-27 10:58:53,401][08744] Starting process rollout_proc4
+[2023-09-27 10:58:53,434][09742] Starting seed is not provided
+[2023-09-27 10:58:53,434][09742] Using GPUs [0] for process 1 (actually maps to GPUs [1])
+[2023-09-27 10:58:53,434][09742] Initializing actor-critic model on device cuda:0
+[2023-09-27 10:58:53,435][09742] RunningMeanStd input shape: (4, 84, 84)
+[2023-09-27 10:58:53,402][08744] Starting process rollout_proc5
+[2023-09-27 10:58:53,435][09742] RunningMeanStd input shape: (1,)
+[2023-09-27 10:58:53,404][08744] Starting process rollout_proc6
+[2023-09-27 10:58:53,406][08744] Starting process rollout_proc7
+[2023-09-27 10:58:53,448][09742] ConvEncoder: input_channels=4
+[2023-09-27 10:58:53,808][09742] Conv encoder output size: 512
+[2023-09-27 10:58:53,810][09742] Created Actor Critic model with architecture:
+[2023-09-27 10:58:53,810][09742] ActorCriticSharedWeights(
+  (obs_normalizer): ObservationNormalizer(
+    (running_mean_std): RunningMeanStdDictInPlace(
+      (running_mean_std): ModuleDict(
+        (obs): RunningMeanStdInPlace()
+      )
+    )
+  )
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (encoder): MultiInputEncoder(
+    (encoders): ModuleDict(
+      (obs): ConvEncoder(
+        (enc): RecursiveScriptModule(
+          original_name=ConvEncoderImpl
+          (conv_head): RecursiveScriptModule(
+            original_name=Sequential
+            (0): RecursiveScriptModule(original_name=Conv2d)
+            (1): RecursiveScriptModule(original_name=ReLU)
+            (2): RecursiveScriptModule(original_name=Conv2d)
+            (3): RecursiveScriptModule(original_name=ReLU)
+            (4): RecursiveScriptModule(original_name=Conv2d)
+            (5): RecursiveScriptModule(original_name=ReLU)
+          )
+          (mlp_layers): RecursiveScriptModule(
+            original_name=Sequential
+            (0): RecursiveScriptModule(original_name=Linear)
+            (1): RecursiveScriptModule(original_name=ReLU)
+          )
+        )
+      )
+    )
+  )
+  (core): ModelCoreIdentity()
+  (decoder): MlpDecoder(
+    (mlp): Identity()
+  )
+  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+  (action_parameterization): ActionParameterizationDefault(
+    (distribution_linear): Linear(in_features=512, out_features=10, bias=True)
+  )
+)
+[2023-09-27 10:58:54,514][09742] Using optimizer
+[2023-09-27 10:58:54,515][09742] No checkpoints found
+[2023-09-27 10:58:54,515][09742] Did not load from checkpoint, starting from scratch!
+[2023-09-27 10:58:54,515][09742] Initialized policy 1 weights for model version 0
+[2023-09-27 10:58:54,517][09742] LearnerWorker_p1 finished initialization!
+[2023-09-27 10:58:54,517][09742] Using GPUs [0] for process 1 (actually maps to GPUs [1])
+[2023-09-27 10:58:55,395][09878] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-09-27 10:58:55,395][09878] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2023-09-27 10:58:55,411][09909] Worker 0 uses CPU cores [0, 1, 2, 3]
+[2023-09-27 10:58:55,413][09878] Num visible devices: 1
+[2023-09-27 10:58:55,422][09918] Worker 6 uses CPU cores [24, 25, 26, 27]
+[2023-09-27 10:58:55,424][09919] Worker 7 uses CPU cores [28, 29, 30, 31]
+[2023-09-27 10:58:55,427][09879] Using GPUs [1] for process 1 (actually maps to GPUs [1])
+[2023-09-27 10:58:55,427][09879] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1
+[2023-09-27 10:58:55,430][09913] Worker 2 uses CPU cores [8, 9, 10, 11]
+[2023-09-27 10:58:55,439][09912] Worker 1 uses CPU cores [4, 5, 6, 7]
+[2023-09-27 10:58:55,474][09879] Num visible devices: 1
+[2023-09-27 10:58:55,600][09916] Worker 5 uses CPU cores [20, 21, 22, 23]
+[2023-09-27 10:58:55,685][09917] Worker 4 uses CPU cores [16, 17, 18, 19]
+[2023-09-27 10:58:55,754][08744] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-09-27 10:58:55,766][09915] Worker 3 uses CPU cores [12, 13, 14, 15]
+[2023-09-27 10:58:56,078][09878] RunningMeanStd input shape: (4, 84, 84)
+[2023-09-27 10:58:56,078][09878] RunningMeanStd input shape: (1,)
+[2023-09-27 10:58:56,084][09879] RunningMeanStd input shape: (4, 84, 84)
+[2023-09-27 10:58:56,085][09879] RunningMeanStd input shape: (1,)
+[2023-09-27 10:58:56,090][09878] ConvEncoder: input_channels=4
+[2023-09-27 10:58:56,096][09879] ConvEncoder: input_channels=4
+[2023-09-27 10:58:56,192][09878] Conv encoder output size: 512
+[2023-09-27 10:58:56,196][09879] Conv encoder output size: 512
+[2023-09-27 10:58:56,198][08744] Inference worker 0-0 is ready!
+[2023-09-27 10:58:56,201][08744] Inference worker 1-0 is ready!
+[2023-09-27 10:58:56,202][08744] All inference workers are ready! Signal rollout workers to start!
+[2023-09-27 10:58:56,656][09917] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,662][09918] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,662][09912] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,662][09919] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,664][09916] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,664][09909] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,665][09915] Decorrelating experience for 0 frames...
+[2023-09-27 10:58:56,665][09913] Decorrelating experience for 0 frames...
+[2023-09-27 10:59:00,583][08744] Fps is (10 sec: 1696.5, 60 sec: 1696.5, 300 sec: 1696.5). Total num frames: 8192. Throughput: 0: 212.1, 1: 212.1. Samples: 2048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 10:59:00,583][08744] Avg episode reward: [(0, '1.500'), (1, '1.000')]
+[2023-09-27 10:59:05,582][08744] Fps is (10 sec: 3334.0, 60 sec: 3334.0, 300 sec: 3334.0). Total num frames: 32768. Throughput: 0: 404.1, 1: 403.1. Samples: 7934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:05,583][08744] Avg episode reward: [(0, '2.583'), (1, '1.556')]
+[2023-09-27 10:59:10,014][08744] Heartbeat connected on Batcher_0
+[2023-09-27 10:59:10,017][08744] Heartbeat connected on LearnerWorker_p0
+[2023-09-27 10:59:10,020][08744] Heartbeat connected on Batcher_1
+[2023-09-27 10:59:10,023][08744] Heartbeat connected on LearnerWorker_p1
+[2023-09-27 10:59:10,029][08744] Heartbeat connected on InferenceWorker_p0-w0
+[2023-09-27 10:59:10,033][08744] Heartbeat connected on InferenceWorker_p1-w0
+[2023-09-27 10:59:10,034][08744] Heartbeat connected on RolloutWorker_w0
+[2023-09-27 10:59:10,038][08744] Heartbeat connected on RolloutWorker_w1
+[2023-09-27 10:59:10,040][08744] Heartbeat connected on RolloutWorker_w2
+[2023-09-27 10:59:10,042][08744] Heartbeat connected on RolloutWorker_w3
+[2023-09-27 10:59:10,048][08744] Heartbeat connected on RolloutWorker_w5
+[2023-09-27 10:59:10,050][08744] Heartbeat connected on RolloutWorker_w4
+[2023-09-27 10:59:10,052][08744] Heartbeat connected on RolloutWorker_w6
+[2023-09-27 10:59:10,054][08744] Heartbeat connected on RolloutWorker_w7
+[2023-09-27 10:59:10,582][08744] Fps is (10 sec: 4915.4, 60 sec: 3867.1, 300 sec: 3867.1). Total num frames: 57344. Throughput: 0: 414.3, 1: 414.4. Samples: 12289. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:10,583][08744] Avg episode reward: [(0, '2.280'), (1, '2.105')]
+[2023-09-27 10:59:13,446][09879] Updated weights for policy 1, policy_version 160 (0.0018)
+[2023-09-27 10:59:13,446][09878] Updated weights for policy 0, policy_version 160 (0.0018)
+[2023-09-27 10:59:15,582][08744] Fps is (10 sec: 5734.4, 60 sec: 4544.6, 300 sec: 4544.6). Total num frames: 90112. Throughput: 0: 553.8, 1: 554.0. Samples: 21966. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:15,583][08744] Avg episode reward: [(0, '2.333'), (1, '1.909')]
+[2023-09-27 10:59:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 4949.2, 300 sec: 4949.2). Total num frames: 122880. Throughput: 0: 633.5, 1: 633.3. Samples: 31453. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-27 10:59:20,583][08744] Avg episode reward: [(0, '2.458'), (1, '2.064')]
+[2023-09-27 10:59:25,583][08744] Fps is (10 sec: 6553.5, 60 sec: 5218.1, 300 sec: 5218.1). Total num frames: 155648. Throughput: 0: 606.1, 1: 606.7. Samples: 36174. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:25,584][08744] Avg episode reward: [(0, '2.262'), (1, '2.203')]
+[2023-09-27 10:59:26,323][09878] Updated weights for policy 0, policy_version 320 (0.0017)
+[2023-09-27 10:59:26,324][09879] Updated weights for policy 1, policy_version 320 (0.0017)
+[2023-09-27 10:59:30,582][08744] Fps is (10 sec: 6553.6, 60 sec: 5409.8, 300 sec: 5409.8). Total num frames: 188416. Throughput: 0: 656.1, 1: 655.0. Samples: 45663. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:30,583][08744] Avg episode reward: [(0, '2.293'), (1, '2.286')]
+[2023-09-27 10:59:35,583][08744] Fps is (10 sec: 6553.6, 60 sec: 5553.4, 300 sec: 5553.4). Total num frames: 221184. Throughput: 0: 694.2, 1: 694.2. Samples: 55296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:35,584][08744] Avg episode reward: [(0, '2.231'), (1, '2.232')]
+[2023-09-27 10:59:35,585][09606] Saving new best policy, reward=2.231!
+[2023-09-27 10:59:35,585][09742] Saving new best policy, reward=2.232!
+[2023-09-27 10:59:39,156][09878] Updated weights for policy 0, policy_version 480 (0.0017)
+[2023-09-27 10:59:39,157][09879] Updated weights for policy 1, policy_version 480 (0.0017)
+[2023-09-27 10:59:40,583][08744] Fps is (10 sec: 6553.4, 60 sec: 5664.9, 300 sec: 5664.9). Total num frames: 253952. Throughput: 0: 669.0, 1: 668.8. Samples: 59969. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 10:59:40,584][08744] Avg episode reward: [(0, '2.330'), (1, '2.213')]
+[2023-09-27 10:59:40,585][09606] Saving new best policy, reward=2.330!
+[2023-09-27 10:59:45,582][08744] Fps is (10 sec: 5734.4, 60 sec: 5589.7, 300 sec: 5589.7). Total num frames: 278528. Throughput: 0: 750.9, 1: 750.9. Samples: 69632. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 10:59:45,583][08744] Avg episode reward: [(0, '2.340'), (1, '2.310')]
+[2023-09-27 10:59:45,634][09606] Saving new best policy, reward=2.340!
+[2023-09-27 10:59:45,657][09742] Saving new best policy, reward=2.310!
+[2023-09-27 10:59:50,582][08744] Fps is (10 sec: 5734.5, 60 sec: 5677.6, 300 sec: 5677.6). Total num frames: 311296. Throughput: 0: 791.9, 1: 791.8. Samples: 79201. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-27 10:59:50,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.290')]
+[2023-09-27 10:59:50,748][09606] Saving new best policy, reward=2.480!
+[2023-09-27 10:59:52,081][09878] Updated weights for policy 0, policy_version 640 (0.0018)
+[2023-09-27 10:59:52,083][09879] Updated weights for policy 1, policy_version 640 (0.0017)
+[2023-09-27 10:59:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 5750.8, 300 sec: 5750.8). Total num frames: 344064. Throughput: 0: 796.4, 1: 796.4. Samples: 83968. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-27 10:59:55,583][08744] Avg episode reward: [(0, '2.560'), (1, '2.350')]
+[2023-09-27 10:59:55,583][09606] Saving new best policy, reward=2.560!
+[2023-09-27 10:59:55,584][09742] Saving new best policy, reward=2.350!
+[2023-09-27 11:00:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5812.7). Total num frames: 376832. Throughput: 0: 794.6, 1: 794.6. Samples: 93480. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:00,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.380')]
+[2023-09-27 11:00:00,590][09742] Saving new best policy, reward=2.380!
+[2023-09-27 11:00:04,972][09879] Updated weights for policy 1, policy_version 800 (0.0015)
+[2023-09-27 11:00:04,972][09878] Updated weights for policy 0, policy_version 800 (0.0016)
+[2023-09-27 11:00:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5865.8). Total num frames: 409600. Throughput: 0: 793.0, 1: 793.3. Samples: 102833. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:00:05,583][08744] Avg episode reward: [(0, '2.590'), (1, '2.380')]
+[2023-09-27 11:00:05,583][09606] Saving new best policy, reward=2.590!
+[2023-09-27 11:00:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.0, 300 sec: 5911.7). Total num frames: 442368. Throughput: 0: 793.3, 1: 793.1. Samples: 107561. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:00:10,583][08744] Avg episode reward: [(0, '2.650'), (1, '2.260')]
+[2023-09-27 11:00:10,584][09606] Saving new best policy, reward=2.650!
+[2023-09-27 11:00:15,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 5952.0). Total num frames: 475136. Throughput: 0: 789.3, 1: 790.1. Samples: 116738. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:15,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.290')]
+[2023-09-27 11:00:15,592][09606] Saving new best policy, reward=2.780!
+[2023-09-27 11:00:18,033][09878] Updated weights for policy 0, policy_version 960 (0.0018)
+[2023-09-27 11:00:18,033][09879] Updated weights for policy 1, policy_version 960 (0.0018)
+[2023-09-27 11:00:20,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 5890.8). Total num frames: 499712. Throughput: 0: 792.7, 1: 792.6. Samples: 126637. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:20,584][08744] Avg episode reward: [(0, '2.740'), (1, '2.300')]
+[2023-09-27 11:00:25,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 5927.7). Total num frames: 532480. Throughput: 0: 790.0, 1: 790.2. Samples: 131080. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:25,583][08744] Avg episode reward: [(0, '2.700'), (1, '2.240')]
+[2023-09-27 11:00:30,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 5960.7). Total num frames: 565248. Throughput: 0: 793.6, 1: 793.7. Samples: 141060. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:30,583][08744] Avg episode reward: [(0, '2.630'), (1, '2.380')]
+[2023-09-27 11:00:30,821][09879] Updated weights for policy 1, policy_version 1120 (0.0014)
+[2023-09-27 11:00:30,821][09878] Updated weights for policy 0, policy_version 1120 (0.0017)
+[2023-09-27 11:00:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5990.4). Total num frames: 598016. Throughput: 0: 791.1, 1: 791.6. Samples: 150421. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:35,584][08744] Avg episode reward: [(0, '2.550'), (1, '2.430')]
+[2023-09-27 11:00:35,585][09742] Saving new best policy, reward=2.430!
+[2023-09-27 11:00:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6017.3). Total num frames: 630784. Throughput: 0: 794.2, 1: 793.5. Samples: 155415. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:00:40,583][08744] Avg episode reward: [(0, '2.440'), (1, '2.460')]
+[2023-09-27 11:00:40,585][09742] Saving new best policy, reward=2.460!
+[2023-09-27 11:00:43,705][09879] Updated weights for policy 1, policy_version 1280 (0.0016)
+[2023-09-27 11:00:43,706][09878] Updated weights for policy 0, policy_version 1280 (0.0019)
+[2023-09-27 11:00:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6041.7). Total num frames: 663552. Throughput: 0: 792.2, 1: 792.3. Samples: 164782. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:00:45,584][08744] Avg episode reward: [(0, '2.430'), (1, '2.420')]
+[2023-09-27 11:00:45,594][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000001296_331776.pth...
+[2023-09-27 11:00:45,594][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000001296_331776.pth...
+[2023-09-27 11:00:50,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6064.0). Total num frames: 696320. Throughput: 0: 792.7, 1: 792.6. Samples: 174175. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:00:50,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.410')]
+[2023-09-27 11:00:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6084.4). Total num frames: 729088. Throughput: 0: 795.5, 1: 795.3. Samples: 179148. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:00:55,583][08744] Avg episode reward: [(0, '2.280'), (1, '2.460')]
+[2023-09-27 11:00:56,542][09879] Updated weights for policy 1, policy_version 1440 (0.0013)
+[2023-09-27 11:00:56,542][09878] Updated weights for policy 0, policy_version 1440 (0.0018)
+[2023-09-27 11:01:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6103.2). Total num frames: 761856. Throughput: 0: 798.3, 1: 798.3. Samples: 188584. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-27 11:01:00,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.540')]
+[2023-09-27 11:01:00,591][09742] Saving new best policy, reward=2.540!
+[2023-09-27 11:01:05,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6057.5). Total num frames: 786432. Throughput: 0: 796.0, 1: 796.2. Samples: 198288. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:01:05,584][08744] Avg episode reward: [(0, '2.280'), (1, '2.600')]
+[2023-09-27 11:01:05,600][09742] Saving new best policy, reward=2.600!
+[2023-09-27 11:01:09,523][09878] Updated weights for policy 0, policy_version 1600 (0.0017)
+[2023-09-27 11:01:09,523][09879] Updated weights for policy 1, policy_version 1600 (0.0017)
+[2023-09-27 11:01:10,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6075.9). Total num frames: 819200. Throughput: 0: 797.1, 1: 797.1. Samples: 202821. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
+[2023-09-27 11:01:10,583][08744] Avg episode reward: [(0, '2.180'), (1, '2.590')]
+[2023-09-27 11:01:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6092.9). Total num frames: 851968. Throughput: 0: 792.3, 1: 791.4. Samples: 212327. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:01:15,584][08744] Avg episode reward: [(0, '2.310'), (1, '2.520')]
+[2023-09-27 11:01:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6108.9). Total num frames: 884736. Throughput: 0: 792.8, 1: 792.3. Samples: 221752. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:01:20,583][08744] Avg episode reward: [(0, '2.390'), (1, '2.430')]
+[2023-09-27 11:01:22,466][09878] Updated weights for policy 0, policy_version 1760 (0.0018)
+[2023-09-27 11:01:22,467][09879] Updated weights for policy 1, policy_version 1760 (0.0016)
+[2023-09-27 11:01:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6123.7). Total num frames: 917504. Throughput: 0: 792.2, 1: 793.3. Samples: 226763. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:01:25,583][08744] Avg episode reward: [(0, '2.460'), (1, '2.480')]
+[2023-09-27 11:01:30,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6137.6). Total num frames: 950272. Throughput: 0: 793.3, 1: 791.9. Samples: 236116. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:01:30,583][08744] Avg episode reward: [(0, '2.460'), (1, '2.570')]
+[2023-09-27 11:01:35,412][09878] Updated weights for policy 0, policy_version 1920 (0.0017)
+[2023-09-27 11:01:35,412][09879] Updated weights for policy 1, policy_version 1920 (0.0016)
+[2023-09-27 11:01:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6150.6). Total num frames: 983040. Throughput: 0: 795.4, 1: 795.3. Samples: 245760. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:01:35,583][08744] Avg episode reward: [(0, '2.580'), (1, '2.520')]
+[2023-09-27 11:01:40,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6162.8). Total num frames: 1015808. Throughput: 0: 791.8, 1: 791.7. Samples: 250407. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-27 11:01:40,584][08744] Avg episode reward: [(0, '2.650'), (1, '2.520')]
+[2023-09-27 11:01:45,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6126.1). Total num frames: 1040384. Throughput: 0: 794.6, 1: 794.6. Samples: 260096. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:01:45,583][08744] Avg episode reward: [(0, '2.420'), (1, '2.530')]
+[2023-09-27 11:01:48,279][09878] Updated weights for policy 0, policy_version 2080 (0.0018)
+[2023-09-27 11:01:48,279][09879] Updated weights for policy 1, policy_version 2080 (0.0018)
+[2023-09-27 11:01:50,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6138.3). Total num frames: 1073152. Throughput: 0: 790.2, 1: 790.6. Samples: 269425. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:01:50,584][08744] Avg episode reward: [(0, '2.420'), (1, '2.510')]
+[2023-09-27 11:01:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6149.9). Total num frames: 1105920. Throughput: 0: 795.7, 1: 795.7. Samples: 274432. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:01:55,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.460')]
+[2023-09-27 11:02:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6160.8). Total num frames: 1138688. Throughput: 0: 795.9, 1: 796.9. Samples: 284002. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:02:00,583][08744] Avg episode reward: [(0, '2.400'), (1, '2.530')]
+[2023-09-27 11:02:01,086][09878] Updated weights for policy 0, policy_version 2240 (0.0016)
+[2023-09-27 11:02:01,086][09879] Updated weights for policy 1, policy_version 2240 (0.0016)
+[2023-09-27 11:02:05,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6171.1). Total num frames: 1171456. Throughput: 0: 794.4, 1: 795.6. Samples: 293305. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:02:05,584][08744] Avg episode reward: [(0, '2.480'), (1, '2.690')]
+[2023-09-27 11:02:05,585][09742] Saving new best policy, reward=2.690!
+[2023-09-27 11:02:10,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6180.9). Total num frames: 1204224. Throughput: 0: 791.0, 1: 790.6. Samples: 297937. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:02:10,584][08744] Avg episode reward: [(0, '2.630'), (1, '2.390')]
+[2023-09-27 11:02:14,102][09878] Updated weights for policy 0, policy_version 2400 (0.0019)
+[2023-09-27 11:02:14,102][09879] Updated weights for policy 1, policy_version 2400 (0.0017)
+[2023-09-27 11:02:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6190.3). Total num frames: 1236992. Throughput: 0: 791.4, 1: 792.4. Samples: 307386. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:02:15,583][08744] Avg episode reward: [(0, '2.690'), (1, '2.430')]
+[2023-09-27 11:02:20,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6199.1). Total num frames: 1269760. Throughput: 0: 796.4, 1: 796.4. Samples: 317437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:02:20,584][08744] Avg episode reward: [(0, '2.610'), (1, '2.460')]
+[2023-09-27 11:02:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6207.6). Total num frames: 1302528. Throughput: 0: 795.8, 1: 795.8. Samples: 322029. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:02:25,583][08744] Avg episode reward: [(0, '2.750'), (1, '2.520')]
+[2023-09-27 11:02:26,735][09879] Updated weights for policy 1, policy_version 2560 (0.0016)
+[2023-09-27 11:02:26,736][09878] Updated weights for policy 0, policy_version 2560 (0.0018)
+[2023-09-27 11:02:30,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6215.6). Total num frames: 1335296. Throughput: 0: 796.4, 1: 796.4. Samples: 331776. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:02:30,583][08744] Avg episode reward: [(0, '2.630'), (1, '2.440')]
+[2023-09-27 11:02:35,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6223.3). Total num frames: 1368064. Throughput: 0: 803.4, 1: 803.5. Samples: 341737. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
+[2023-09-27 11:02:35,584][08744] Avg episode reward: [(0, '2.620'), (1, '2.510')]
+[2023-09-27 11:02:39,347][09879] Updated weights for policy 1, policy_version 2720 (0.0017)
+[2023-09-27 11:02:39,347][09878] Updated weights for policy 0, policy_version 2720 (0.0017)
+[2023-09-27 11:02:40,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6194.2). Total num frames: 1392640. Throughput: 0: 799.1, 1: 799.2. Samples: 346357. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:02:40,583][08744] Avg episode reward: [(0, '2.530'), (1, '2.550')]
+[2023-09-27 11:02:45,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6202.0). Total num frames: 1425408. Throughput: 0: 803.8, 1: 802.2. Samples: 356271. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:02:45,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.680')]
+[2023-09-27 11:02:45,594][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000002784_712704.pth...
+[2023-09-27 11:02:45,594][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000002784_712704.pth...
+[2023-09-27 11:02:50,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6209.5). Total num frames: 1458176. Throughput: 0: 798.4, 1: 797.4. Samples: 365114. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:02:50,584][08744] Avg episode reward: [(0, '2.320'), (1, '2.720')]
+[2023-09-27 11:02:50,585][09742] Saving new best policy, reward=2.720!
+[2023-09-27 11:02:52,465][09879] Updated weights for policy 1, policy_version 2880 (0.0015)
+[2023-09-27 11:02:52,466][09878] Updated weights for policy 0, policy_version 2880 (0.0019)
+[2023-09-27 11:02:55,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6216.7). Total num frames: 1490944. Throughput: 0: 802.0, 1: 801.8. Samples: 370106. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:02:55,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.630')]
+[2023-09-27 11:03:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6223.6). Total num frames: 1523712. Throughput: 0: 804.2, 1: 804.2. Samples: 379765. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:00,583][08744] Avg episode reward: [(0, '2.440'), (1, '2.560')]
+[2023-09-27 11:03:05,199][09879] Updated weights for policy 1, policy_version 3040 (0.0017)
+[2023-09-27 11:03:05,199][09878] Updated weights for policy 0, policy_version 3040 (0.0017)
+[2023-09-27 11:03:05,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6230.2). Total num frames: 1556480. Throughput: 0: 796.6, 1: 796.6. Samples: 389130. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
+[2023-09-27 11:03:05,584][08744] Avg episode reward: [(0, '2.480'), (1, '2.520')]
+[2023-09-27 11:03:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6236.5). Total num frames: 1589248. Throughput: 0: 797.7, 1: 797.5. Samples: 393812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:10,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.590')]
+[2023-09-27 11:03:15,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6211.1). Total num frames: 1613824. Throughput: 0: 796.4, 1: 796.4. Samples: 403456. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:15,583][08744] Avg episode reward: [(0, '2.420'), (1, '2.500')]
+[2023-09-27 11:03:18,289][09879] Updated weights for policy 1, policy_version 3200 (0.0019)
+[2023-09-27 11:03:18,289][09878] Updated weights for policy 0, policy_version 3200 (0.0019)
+[2023-09-27 11:03:20,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6217.6). Total num frames: 1646592. Throughput: 0: 791.4, 1: 791.0. Samples: 412945. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:03:20,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.430')]
+[2023-09-27 11:03:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6223.8). Total num frames: 1679360. Throughput: 0: 793.8, 1: 793.7. Samples: 417792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:25,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.350')]
+[2023-09-27 11:03:30,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6229.8). Total num frames: 1712128. Throughput: 0: 789.3, 1: 791.4. Samples: 427403. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:30,584][08744] Avg episode reward: [(0, '2.560'), (1, '2.240')]
+[2023-09-27 11:03:31,052][09879] Updated weights for policy 1, policy_version 3360 (0.0018)
+[2023-09-27 11:03:31,052][09878] Updated weights for policy 0, policy_version 3360 (0.0018)
+[2023-09-27 11:03:35,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6235.6). Total num frames: 1744896. Throughput: 0: 794.3, 1: 794.7. Samples: 436616. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-27 11:03:35,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.340')]
+[2023-09-27 11:03:40,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6241.2). Total num frames: 1777664. Throughput: 0: 788.4, 1: 788.9. Samples: 441083. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:03:40,584][08744] Avg episode reward: [(0, '2.370'), (1, '2.460')]
+[2023-09-27 11:03:44,301][09879] Updated weights for policy 1, policy_version 3520 (0.0016)
+[2023-09-27 11:03:44,301][09878] Updated weights for policy 0, policy_version 3520 (0.0017)
+[2023-09-27 11:03:45,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6246.6). Total num frames: 1810432. Throughput: 0: 786.6, 1: 786.7. Samples: 450560. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:45,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.450')]
+[2023-09-27 11:03:50,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6224.0). Total num frames: 1835008. Throughput: 0: 791.8, 1: 792.1. Samples: 460403. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-09-27 11:03:50,584][08744] Avg episode reward: [(0, '2.470'), (1, '2.320')]
+[2023-09-27 11:03:55,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 1867776. Throughput: 0: 789.8, 1: 790.0. Samples: 464901. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:03:55,583][08744] Avg episode reward: [(0, '2.380'), (1, '2.500')]
+[2023-09-27 11:03:57,257][09879] Updated weights for policy 1, policy_version 3680 (0.0015)
+[2023-09-27 11:03:57,258][09878] Updated weights for policy 0, policy_version 3680 (0.0015)
+[2023-09-27 11:04:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 1900544. Throughput: 0: 786.5, 1: 787.5. Samples: 474288. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:04:00,583][08744] Avg episode reward: [(0, '2.330'), (1, '2.400')]
+[2023-09-27 11:04:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 1933312. Throughput: 0: 785.3, 1: 785.0. Samples: 483607. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:04:05,583][08744] Avg episode reward: [(0, '2.400'), (1, '2.360')]
+[2023-09-27 11:04:10,116][09879] Updated weights for policy 1, policy_version 3840 (0.0017)
+[2023-09-27 11:04:10,116][09878] Updated weights for policy 0, policy_version 3840 (0.0018)
+[2023-09-27 11:04:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 1966080. Throughput: 0: 787.6, 1: 787.3. Samples: 488664. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:04:10,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.320')]
+[2023-09-27 11:04:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 1998848. Throughput: 0: 787.8, 1: 787.4. Samples: 498285. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:04:15,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.380')]
+[2023-09-27 11:04:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2031616. Throughput: 0: 792.4, 1: 792.0. Samples: 507912.
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:04:20,583][08744] Avg episode reward: [(0, '2.640'), (1, '2.410')] +[2023-09-27 11:04:22,833][09878] Updated weights for policy 0, policy_version 4000 (0.0017) +[2023-09-27 11:04:22,833][09879] Updated weights for policy 1, policy_version 4000 (0.0018) +[2023-09-27 11:04:25,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2064384. Throughput: 0: 796.2, 1: 795.9. Samples: 512728. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:04:25,583][08744] Avg episode reward: [(0, '2.630'), (1, '2.390')] +[2023-09-27 11:04:30,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2097152. Throughput: 0: 796.4, 1: 796.4. Samples: 522240. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:04:30,583][08744] Avg episode reward: [(0, '2.800'), (1, '2.350')] +[2023-09-27 11:04:30,590][09606] Saving new best policy, reward=2.800! +[2023-09-27 11:04:35,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 2121728. Throughput: 0: 794.1, 1: 794.0. Samples: 531866. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:04:35,583][08744] Avg episode reward: [(0, '2.720'), (1, '2.360')] +[2023-09-27 11:04:35,753][09879] Updated weights for policy 1, policy_version 4160 (0.0015) +[2023-09-27 11:04:35,754][09878] Updated weights for policy 0, policy_version 4160 (0.0021) +[2023-09-27 11:04:40,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2154496. Throughput: 0: 796.4, 1: 796.4. Samples: 536577. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:04:40,584][08744] Avg episode reward: [(0, '2.730'), (1, '2.260')] +[2023-09-27 11:04:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2187264. Throughput: 0: 801.8, 1: 799.5. Samples: 546346. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:04:45,584][08744] Avg episode reward: [(0, '2.660'), (1, '2.370')] +[2023-09-27 11:04:45,594][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000004272_1093632.pth... +[2023-09-27 11:04:45,594][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000004272_1093632.pth... +[2023-09-27 11:04:45,622][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000001296_331776.pth +[2023-09-27 11:04:45,627][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000001296_331776.pth +[2023-09-27 11:04:48,444][09878] Updated weights for policy 0, policy_version 4320 (0.0015) +[2023-09-27 11:04:48,445][09879] Updated weights for policy 1, policy_version 4320 (0.0016) +[2023-09-27 11:04:50,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2220032. Throughput: 0: 803.8, 1: 803.2. Samples: 555922. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:04:50,583][08744] Avg episode reward: [(0, '2.640'), (1, '2.340')] +[2023-09-27 11:04:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2252800. Throughput: 0: 802.5, 1: 802.8. Samples: 560902. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:04:55,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.170')] +[2023-09-27 11:05:00,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2285568. Throughput: 0: 798.6, 1: 798.2. Samples: 570142. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:00,583][08744] Avg episode reward: [(0, '2.460'), (1, '2.130')] +[2023-09-27 11:05:01,338][09879] Updated weights for policy 1, policy_version 4480 (0.0017) +[2023-09-27 11:05:01,339][09878] Updated weights for policy 0, policy_version 4480 (0.0017) +[2023-09-27 11:05:05,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2318336. 
Throughput: 0: 796.5, 1: 796.5. Samples: 579594. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:05,583][08744] Avg episode reward: [(0, '2.410'), (1, '2.110')] +[2023-09-27 11:05:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2351104. Throughput: 0: 797.5, 1: 796.5. Samples: 584458. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:10,583][08744] Avg episode reward: [(0, '2.380'), (1, '2.030')] +[2023-09-27 11:05:14,278][09878] Updated weights for policy 0, policy_version 4640 (0.0019) +[2023-09-27 11:05:14,279][09879] Updated weights for policy 1, policy_version 4640 (0.0016) +[2023-09-27 11:05:15,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 2383872. Throughput: 0: 796.4, 1: 796.4. Samples: 593920. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:05:15,583][08744] Avg episode reward: [(0, '2.460'), (1, '2.280')] +[2023-09-27 11:05:20,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2408448. Throughput: 0: 797.6, 1: 797.9. Samples: 603666. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:05:20,583][08744] Avg episode reward: [(0, '2.340'), (1, '2.320')] +[2023-09-27 11:05:25,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 2441216. Throughput: 0: 796.5, 1: 796.4. Samples: 608256. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:25,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.080')] +[2023-09-27 11:05:27,181][09878] Updated weights for policy 0, policy_version 4800 (0.0017) +[2023-09-27 11:05:27,181][09879] Updated weights for policy 1, policy_version 4800 (0.0017) +[2023-09-27 11:05:30,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2473984. Throughput: 0: 795.3, 1: 796.5. Samples: 617977. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:30,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.010')] +[2023-09-27 11:05:35,582][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2506752. Throughput: 0: 791.8, 1: 792.7. Samples: 627223. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:35,583][08744] Avg episode reward: [(0, '2.450'), (1, '1.950')] +[2023-09-27 11:05:40,157][09878] Updated weights for policy 0, policy_version 4960 (0.0018) +[2023-09-27 11:05:40,157][09879] Updated weights for policy 1, policy_version 4960 (0.0017) +[2023-09-27 11:05:40,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2539520. Throughput: 0: 789.1, 1: 789.8. Samples: 631953. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:05:40,584][08744] Avg episode reward: [(0, '2.570'), (1, '1.820')] +[2023-09-27 11:05:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2572288. Throughput: 0: 791.8, 1: 792.3. Samples: 641426. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) +[2023-09-27 11:05:45,583][08744] Avg episode reward: [(0, '2.610'), (1, '1.670')] +[2023-09-27 11:05:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 2605056. Throughput: 0: 796.3, 1: 796.3. Samples: 651264. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:50,584][08744] Avg episode reward: [(0, '2.550'), (1, '1.830')] +[2023-09-27 11:05:52,971][09878] Updated weights for policy 0, policy_version 5120 (0.0016) +[2023-09-27 11:05:52,972][09879] Updated weights for policy 1, policy_version 5120 (0.0014) +[2023-09-27 11:05:55,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 2637824. Throughput: 0: 792.1, 1: 793.1. Samples: 655795. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:05:55,583][08744] Avg episode reward: [(0, '2.550'), (1, '1.880')] +[2023-09-27 11:06:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 2670592. Throughput: 0: 796.4, 1: 796.4. Samples: 665600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:06:00,584][08744] Avg episode reward: [(0, '2.600'), (1, '1.970')] +[2023-09-27 11:06:05,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2695168. Throughput: 0: 798.0, 1: 797.5. Samples: 675464. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:06:05,583][08744] Avg episode reward: [(0, '2.640'), (1, '2.040')] +[2023-09-27 11:06:05,617][09878] Updated weights for policy 0, policy_version 5280 (0.0018) +[2023-09-27 11:06:05,617][09879] Updated weights for policy 1, policy_version 5280 (0.0017) +[2023-09-27 11:06:10,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2727936. Throughput: 0: 797.9, 1: 797.8. Samples: 680066. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:06:10,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.020')] +[2023-09-27 11:06:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 2760704. Throughput: 0: 799.6, 1: 800.0. Samples: 689961. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:06:15,584][08744] Avg episode reward: [(0, '2.500'), (1, '2.040')] +[2023-09-27 11:06:18,304][09878] Updated weights for policy 0, policy_version 5440 (0.0015) +[2023-09-27 11:06:18,305][09879] Updated weights for policy 1, policy_version 5440 (0.0017) +[2023-09-27 11:06:20,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2793472. Throughput: 0: 803.9, 1: 804.6. Samples: 699607. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:06:20,584][08744] Avg episode reward: [(0, '2.490'), (1, '2.120')] +[2023-09-27 11:06:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2826240. Throughput: 0: 806.4, 1: 805.8. Samples: 704503. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:06:25,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.070')] +[2023-09-27 11:06:30,583][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 2859008. Throughput: 0: 804.8, 1: 804.6. Samples: 713846. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:06:30,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.000')] +[2023-09-27 11:06:31,248][09879] Updated weights for policy 1, policy_version 5600 (0.0016) +[2023-09-27 11:06:31,249][09878] Updated weights for policy 0, policy_version 5600 (0.0018) +[2023-09-27 11:06:35,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 2891776. Throughput: 0: 798.9, 1: 799.0. Samples: 723167. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-27 11:06:35,584][08744] Avg episode reward: [(0, '2.520'), (1, '1.970')] +[2023-09-27 11:06:40,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 2924544. Throughput: 0: 803.7, 1: 803.3. Samples: 728109. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-27 11:06:40,584][08744] Avg episode reward: [(0, '2.550'), (1, '1.990')] +[2023-09-27 11:06:44,090][09879] Updated weights for policy 1, policy_version 5760 (0.0015) +[2023-09-27 11:06:44,090][09878] Updated weights for policy 0, policy_version 5760 (0.0019) +[2023-09-27 11:06:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 2957312. Throughput: 0: 798.8, 1: 798.8. Samples: 737492. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-27 11:06:45,583][08744] Avg episode reward: [(0, '2.430'), (1, '1.920')] +[2023-09-27 11:06:45,596][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000005776_1478656.pth... +[2023-09-27 11:06:45,596][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000005776_1478656.pth... +[2023-09-27 11:06:45,631][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000002784_712704.pth +[2023-09-27 11:06:45,632][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000002784_712704.pth +[2023-09-27 11:06:50,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 2990080. Throughput: 0: 800.6, 1: 800.7. Samples: 747520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:06:50,583][08744] Avg episode reward: [(0, '2.490'), (1, '1.880')] +[2023-09-27 11:06:55,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3014656. Throughput: 0: 795.7, 1: 796.0. Samples: 751694. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-27 11:06:55,583][08744] Avg episode reward: [(0, '2.660'), (1, '1.990')] +[2023-09-27 11:06:57,049][09878] Updated weights for policy 0, policy_version 5920 (0.0017) +[2023-09-27 11:06:57,049][09879] Updated weights for policy 1, policy_version 5920 (0.0018) +[2023-09-27 11:07:00,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3047424. Throughput: 0: 795.3, 1: 795.3. Samples: 761538. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:00,584][08744] Avg episode reward: [(0, '2.520'), (1, '2.070')] +[2023-09-27 11:07:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3080192. Throughput: 0: 791.2, 1: 790.9. Samples: 770798. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:05,583][08744] Avg episode reward: [(0, '2.510'), (1, '2.050')] +[2023-09-27 11:07:10,045][09879] Updated weights for policy 1, policy_version 6080 (0.0016) +[2023-09-27 11:07:10,045][09878] Updated weights for policy 0, policy_version 6080 (0.0016) +[2023-09-27 11:07:10,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3112960. Throughput: 0: 792.0, 1: 790.1. Samples: 775696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:10,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.020')] +[2023-09-27 11:07:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3145728. Throughput: 0: 792.6, 1: 791.7. Samples: 785139. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:15,584][08744] Avg episode reward: [(0, '2.440'), (1, '2.150')] +[2023-09-27 11:07:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3178496. Throughput: 0: 794.0, 1: 794.0. Samples: 794626. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:20,583][08744] Avg episode reward: [(0, '2.420'), (1, '2.260')] +[2023-09-27 11:07:22,899][09878] Updated weights for policy 0, policy_version 6240 (0.0016) +[2023-09-27 11:07:22,899][09879] Updated weights for policy 1, policy_version 6240 (0.0018) +[2023-09-27 11:07:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3211264. Throughput: 0: 790.6, 1: 791.3. Samples: 799297. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:25,583][08744] Avg episode reward: [(0, '2.350'), (1, '2.140')] +[2023-09-27 11:07:30,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 3235840. Throughput: 0: 794.1, 1: 794.0. Samples: 808960. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:30,584][08744] Avg episode reward: [(0, '2.250'), (1, '2.030')] +[2023-09-27 11:07:35,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3268608. Throughput: 0: 787.7, 1: 787.7. Samples: 818414. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:35,583][08744] Avg episode reward: [(0, '2.200'), (1, '1.980')] +[2023-09-27 11:07:35,819][09879] Updated weights for policy 1, policy_version 6400 (0.0018) +[2023-09-27 11:07:35,819][09878] Updated weights for policy 0, policy_version 6400 (0.0020) +[2023-09-27 11:07:40,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3301376. Throughput: 0: 795.6, 1: 795.4. Samples: 823292. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:40,583][08744] Avg episode reward: [(0, '2.130'), (1, '1.810')] +[2023-09-27 11:07:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3334144. Throughput: 0: 792.7, 1: 792.7. Samples: 832878. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:45,584][08744] Avg episode reward: [(0, '2.100'), (1, '1.660')] +[2023-09-27 11:07:48,575][09878] Updated weights for policy 0, policy_version 6560 (0.0019) +[2023-09-27 11:07:48,575][09879] Updated weights for policy 1, policy_version 6560 (0.0018) +[2023-09-27 11:07:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3366912. Throughput: 0: 795.6, 1: 794.2. Samples: 842337. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:07:50,584][08744] Avg episode reward: [(0, '2.220'), (1, '1.690')] +[2023-09-27 11:07:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3399680. Throughput: 0: 793.8, 1: 795.9. Samples: 847234. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-09-27 11:07:55,583][08744] Avg episode reward: [(0, '2.220'), (1, '1.680')] +[2023-09-27 11:08:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3432448. Throughput: 0: 794.2, 1: 794.8. Samples: 856644. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:08:00,583][08744] Avg episode reward: [(0, '2.280'), (1, '1.760')] +[2023-09-27 11:08:01,492][09879] Updated weights for policy 1, policy_version 6720 (0.0017) +[2023-09-27 11:08:01,492][09878] Updated weights for policy 0, policy_version 6720 (0.0017) +[2023-09-27 11:08:05,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3465216. Throughput: 0: 796.4, 1: 796.4. Samples: 866304. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:08:05,584][08744] Avg episode reward: [(0, '2.260'), (1, '1.910')] +[2023-09-27 11:08:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 3497984. Throughput: 0: 797.1, 1: 797.0. Samples: 871032. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:08:10,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.020')] +[2023-09-27 11:08:14,245][09879] Updated weights for policy 1, policy_version 6880 (0.0018) +[2023-09-27 11:08:14,245][09878] Updated weights for policy 0, policy_version 6880 (0.0018) +[2023-09-27 11:08:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 3530752. Throughput: 0: 796.5, 1: 796.5. Samples: 880642. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:08:15,583][08744] Avg episode reward: [(0, '2.290'), (1, '2.020')] +[2023-09-27 11:08:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 3563520. Throughput: 0: 801.7, 1: 801.7. Samples: 890565. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-27 11:08:20,584][08744] Avg episode reward: [(0, '2.330'), (1, '2.140')] +[2023-09-27 11:08:25,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3588096. Throughput: 0: 796.8, 1: 796.9. Samples: 895006. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:08:25,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.390')] +[2023-09-27 11:08:27,092][09878] Updated weights for policy 0, policy_version 7040 (0.0017) +[2023-09-27 11:08:27,092][09879] Updated weights for policy 1, policy_version 7040 (0.0017) +[2023-09-27 11:08:30,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3620864. Throughput: 0: 798.6, 1: 798.8. Samples: 904761. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:08:30,583][08744] Avg episode reward: [(0, '2.340'), (1, '2.310')] +[2023-09-27 11:08:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3653632. Throughput: 0: 797.6, 1: 797.8. Samples: 914130. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:08:35,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.100')] +[2023-09-27 11:08:40,081][09878] Updated weights for policy 0, policy_version 7200 (0.0017) +[2023-09-27 11:08:40,081][09879] Updated weights for policy 1, policy_version 7200 (0.0015) +[2023-09-27 11:08:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 3686400. Throughput: 0: 797.5, 1: 798.6. Samples: 919057. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:08:40,583][08744] Avg episode reward: [(0, '2.320'), (1, '2.150')] +[2023-09-27 11:08:45,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 3719168. Throughput: 0: 796.2, 1: 795.9. Samples: 928289. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:08:45,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.140')] +[2023-09-27 11:08:45,591][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000007264_1859584.pth... +[2023-09-27 11:08:45,591][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000007264_1859584.pth... +[2023-09-27 11:08:45,626][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000004272_1093632.pth +[2023-09-27 11:08:45,629][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000004272_1093632.pth +[2023-09-27 11:08:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 3751936. Throughput: 0: 796.1, 1: 795.7. Samples: 937936. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:08:50,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.220')] +[2023-09-27 11:08:53,109][09878] Updated weights for policy 0, policy_version 7360 (0.0020) +[2023-09-27 11:08:53,109][09879] Updated weights for policy 1, policy_version 7360 (0.0016) +[2023-09-27 11:08:55,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 3776512. Throughput: 0: 791.3, 1: 791.0. Samples: 942234. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:08:55,583][08744] Avg episode reward: [(0, '2.540'), (1, '1.910')] +[2023-09-27 11:09:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3809280. Throughput: 0: 794.0, 1: 794.1. Samples: 952104. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:09:00,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.080')] +[2023-09-27 11:09:05,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 3842048. Throughput: 0: 790.4, 1: 790.3. Samples: 961699. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:09:05,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.150')] +[2023-09-27 11:09:05,858][09879] Updated weights for policy 1, policy_version 7520 (0.0017) +[2023-09-27 11:09:05,858][09878] Updated weights for policy 0, policy_version 7520 (0.0019) +[2023-09-27 11:09:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3874816. Throughput: 0: 796.1, 1: 796.1. Samples: 966656. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:09:10,583][08744] Avg episode reward: [(0, '2.410'), (1, '2.180')] +[2023-09-27 11:09:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 3907584. Throughput: 0: 793.1, 1: 793.1. Samples: 976141. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:09:15,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.090')] +[2023-09-27 11:09:18,582][09879] Updated weights for policy 1, policy_version 7680 (0.0017) +[2023-09-27 11:09:18,582][09878] Updated weights for policy 0, policy_version 7680 (0.0018) +[2023-09-27 11:09:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 3940352. Throughput: 0: 795.6, 1: 795.9. Samples: 985750. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:09:20,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.070')] +[2023-09-27 11:09:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 3973120. Throughput: 0: 797.2, 1: 796.6. Samples: 990774. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:09:25,584][08744] Avg episode reward: [(0, '2.490'), (1, '2.140')] +[2023-09-27 11:09:30,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4005888. Throughput: 0: 798.2, 1: 798.7. Samples: 1000148. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:09:30,583][08744] Avg episode reward: [(0, '2.550'), (1, '2.090')]
+[2023-09-27 11:09:31,393][09878] Updated weights for policy 0, policy_version 7840 (0.0018)
+[2023-09-27 11:09:31,393][09879] Updated weights for policy 1, policy_version 7840 (0.0016)
+[2023-09-27 11:09:35,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 4038656. Throughput: 0: 796.9, 1: 797.3. Samples: 1009674. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-27 11:09:35,584][08744] Avg episode reward: [(0, '2.630'), (1, '1.980')]
+[2023-09-27 11:09:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4071424. Throughput: 0: 801.6, 1: 802.4. Samples: 1014414. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
+[2023-09-27 11:09:40,583][08744] Avg episode reward: [(0, '2.580'), (1, '1.920')]
+[2023-09-27 11:09:44,303][09878] Updated weights for policy 0, policy_version 8000 (0.0020)
+[2023-09-27 11:09:44,303][09879] Updated weights for policy 1, policy_version 8000 (0.0019)
+[2023-09-27 11:09:45,582][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 4100096. Throughput: 0: 798.9, 1: 798.8. Samples: 1024000. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:09:45,583][08744] Avg episode reward: [(0, '2.620'), (1, '1.980')]
+[2023-09-27 11:09:50,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 4128768. Throughput: 0: 801.1, 1: 801.3. Samples: 1033805. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:09:50,583][08744] Avg episode reward: [(0, '2.530'), (1, '1.940')]
+[2023-09-27 11:09:55,582][08744] Fps is (10 sec: 6144.1, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4161536. Throughput: 0: 796.5, 1: 796.5. Samples: 1038338. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:09:55,583][08744] Avg episode reward: [(0, '2.430'), (1, '1.880')]
+[2023-09-27 11:09:57,131][09879] Updated weights for policy 1, policy_version 8160 (0.0016)
+[2023-09-27 11:09:57,131][09878] Updated weights for policy 0, policy_version 8160 (0.0015)
+[2023-09-27 11:10:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4194304. Throughput: 0: 799.2, 1: 799.0. Samples: 1048063. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:10:00,584][08744] Avg episode reward: [(0, '2.380'), (1, '1.950')]
+[2023-09-27 11:10:05,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4227072. Throughput: 0: 796.1, 1: 796.3. Samples: 1057406. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:05,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.000')]
+[2023-09-27 11:10:09,964][09878] Updated weights for policy 0, policy_version 8320 (0.0018)
+[2023-09-27 11:10:09,964][09879] Updated weights for policy 1, policy_version 8320 (0.0017)
+[2023-09-27 11:10:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4259840. Throughput: 0: 796.1, 1: 796.5. Samples: 1062438. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:10,584][08744] Avg episode reward: [(0, '2.420'), (1, '1.950')]
+[2023-09-27 11:10:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4292608. Throughput: 0: 796.3, 1: 796.0. Samples: 1071801. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:15,583][08744] Avg episode reward: [(0, '2.500'), (1, '2.000')]
+[2023-09-27 11:10:20,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 4325376. Throughput: 0: 796.4, 1: 796.4. Samples: 1081354. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:20,584][08744] Avg episode reward: [(0, '2.550'), (1, '1.980')]
+[2023-09-27 11:10:22,772][09878] Updated weights for policy 0, policy_version 8480 (0.0017)
+[2023-09-27 11:10:22,772][09879] Updated weights for policy 1, policy_version 8480 (0.0017)
+[2023-09-27 11:10:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4358144. Throughput: 0: 798.6, 1: 798.5. Samples: 1086280. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:25,583][08744] Avg episode reward: [(0, '2.590'), (1, '1.950')]
+[2023-09-27 11:10:30,583][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4390912. Throughput: 0: 798.0, 1: 798.3. Samples: 1095833. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:30,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.010')]
+[2023-09-27 11:10:35,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 4415488. Throughput: 0: 791.9, 1: 793.2. Samples: 1105136. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:35,584][08744] Avg episode reward: [(0, '2.470'), (1, '2.020')]
+[2023-09-27 11:10:35,917][09879] Updated weights for policy 1, policy_version 8640 (0.0018)
+[2023-09-27 11:10:35,918][09878] Updated weights for policy 0, policy_version 8640 (0.0017)
+[2023-09-27 11:10:40,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 4448256. Throughput: 0: 796.2, 1: 796.4. Samples: 1110007. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:40,583][08744] Avg episode reward: [(0, '2.290'), (1, '2.080')]
+[2023-09-27 11:10:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6348.8, 300 sec: 6359.2). Total num frames: 4481024. Throughput: 0: 794.9, 1: 794.5. Samples: 1119587. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:45,583][08744] Avg episode reward: [(0, '2.170'), (1, '2.130')]
+[2023-09-27 11:10:45,593][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000008752_2240512.pth...
+[2023-09-27 11:10:45,593][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000008752_2240512.pth...
+[2023-09-27 11:10:45,622][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000005776_1478656.pth
+[2023-09-27 11:10:45,629][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000005776_1478656.pth
+[2023-09-27 11:10:48,553][09878] Updated weights for policy 0, policy_version 8800 (0.0018)
+[2023-09-27 11:10:48,553][09879] Updated weights for policy 1, policy_version 8800 (0.0016)
+[2023-09-27 11:10:50,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4513792. Throughput: 0: 796.8, 1: 796.9. Samples: 1129121. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:10:50,583][08744] Avg episode reward: [(0, '2.150'), (1, '2.080')]
+[2023-09-27 11:10:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4546560. Throughput: 0: 797.5, 1: 796.9. Samples: 1134186. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:10:55,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.090')]
+[2023-09-27 11:11:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4579328. Throughput: 0: 797.4, 1: 797.1. Samples: 1143553. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:11:00,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.020')]
+[2023-09-27 11:11:01,365][09878] Updated weights for policy 0, policy_version 8960 (0.0017)
+[2023-09-27 11:11:01,367][09879] Updated weights for policy 1, policy_version 8960 (0.0018)
+[2023-09-27 11:11:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4612096. Throughput: 0: 796.4, 1: 796.4. Samples: 1153026. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:11:05,583][08744] Avg episode reward: [(0, '2.250'), (1, '2.000')]
+[2023-09-27 11:11:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4644864. Throughput: 0: 795.7, 1: 795.4. Samples: 1157878. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:10,584][08744] Avg episode reward: [(0, '2.190'), (1, '1.930')]
+[2023-09-27 11:11:14,200][09879] Updated weights for policy 1, policy_version 9120 (0.0016)
+[2023-09-27 11:11:14,200][09878] Updated weights for policy 0, policy_version 9120 (0.0017)
+[2023-09-27 11:11:15,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4677632. Throughput: 0: 795.0, 1: 794.7. Samples: 1167370. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:15,583][08744] Avg episode reward: [(0, '2.130'), (1, '1.930')]
+[2023-09-27 11:11:20,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4710400. Throughput: 0: 803.2, 1: 802.3. Samples: 1177383. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:20,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.090')]
+[2023-09-27 11:11:25,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 4734976. Throughput: 0: 798.8, 1: 798.7. Samples: 1181895. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:25,584][08744] Avg episode reward: [(0, '2.210'), (1, '1.920')]
+[2023-09-27 11:11:26,957][09878] Updated weights for policy 0, policy_version 9280 (0.0018)
+[2023-09-27 11:11:26,958][09879] Updated weights for policy 1, policy_version 9280 (0.0018)
+[2023-09-27 11:11:30,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 4767744. Throughput: 0: 802.4, 1: 803.0. Samples: 1191831. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:30,584][08744] Avg episode reward: [(0, '2.120'), (1, '1.920')]
+[2023-09-27 11:11:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4800512. Throughput: 0: 802.3, 1: 802.3. Samples: 1201326. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:35,583][08744] Avg episode reward: [(0, '2.060'), (1, '1.940')]
+[2023-09-27 11:11:39,735][09879] Updated weights for policy 1, policy_version 9440 (0.0017)
+[2023-09-27 11:11:39,735][09878] Updated weights for policy 0, policy_version 9440 (0.0016)
+[2023-09-27 11:11:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4833280. Throughput: 0: 801.1, 1: 800.7. Samples: 1206267. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:40,583][08744] Avg episode reward: [(0, '2.090'), (1, '1.940')]
+[2023-09-27 11:11:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 4866048. Throughput: 0: 804.0, 1: 804.0. Samples: 1215912. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:11:45,584][08744] Avg episode reward: [(0, '2.250'), (1, '1.830')]
+[2023-09-27 11:11:50,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4898816. Throughput: 0: 806.6, 1: 806.6. Samples: 1225618. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:11:50,583][08744] Avg episode reward: [(0, '2.120'), (1, '1.880')]
+[2023-09-27 11:11:52,250][09878] Updated weights for policy 0, policy_version 9600 (0.0018)
+[2023-09-27 11:11:52,250][09879] Updated weights for policy 1, policy_version 9600 (0.0016)
+[2023-09-27 11:11:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4931584. Throughput: 0: 808.2, 1: 808.9. Samples: 1230651. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:11:55,584][08744] Avg episode reward: [(0, '2.160'), (1, '1.920')]
+[2023-09-27 11:12:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4964352. Throughput: 0: 805.2, 1: 805.8. Samples: 1239864. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:00,583][08744] Avg episode reward: [(0, '2.150'), (1, '1.920')]
+[2023-09-27 11:12:05,255][09879] Updated weights for policy 1, policy_version 9760 (0.0015)
+[2023-09-27 11:12:05,256][09878] Updated weights for policy 0, policy_version 9760 (0.0013)
+[2023-09-27 11:12:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 4997120. Throughput: 0: 799.1, 1: 798.7. Samples: 1249284. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:12:05,583][08744] Avg episode reward: [(0, '2.370'), (1, '1.940')]
+[2023-09-27 11:12:10,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 5021696. Throughput: 0: 798.0, 1: 798.4. Samples: 1253729. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:12:10,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.020')]
+[2023-09-27 11:12:15,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 5054464. Throughput: 0: 793.3, 1: 793.8. Samples: 1263248. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
+[2023-09-27 11:12:15,583][08744] Avg episode reward: [(0, '2.350'), (1, '1.980')]
+[2023-09-27 11:12:18,496][09878] Updated weights for policy 0, policy_version 9920 (0.0019)
+[2023-09-27 11:12:18,496][09879] Updated weights for policy 1, policy_version 9920 (0.0019)
+[2023-09-27 11:12:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 5087232. Throughput: 0: 792.4, 1: 792.7. Samples: 1272656. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
+[2023-09-27 11:12:20,583][08744] Avg episode reward: [(0, '2.430'), (1, '1.910')]
+[2023-09-27 11:12:25,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5120000. Throughput: 0: 793.3, 1: 793.9. Samples: 1277690. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:25,583][08744] Avg episode reward: [(0, '2.350'), (1, '2.030')]
+[2023-09-27 11:12:30,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5152768. Throughput: 0: 790.4, 1: 790.5. Samples: 1287051. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:30,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.090')]
+[2023-09-27 11:12:31,276][09878] Updated weights for policy 0, policy_version 10080 (0.0018)
+[2023-09-27 11:12:31,277][09879] Updated weights for policy 1, policy_version 10080 (0.0012)
+[2023-09-27 11:12:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5185536. Throughput: 0: 788.3, 1: 788.3. Samples: 1296565. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:35,583][08744] Avg episode reward: [(0, '2.040'), (1, '1.950')]
+[2023-09-27 11:12:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5218304. Throughput: 0: 788.9, 1: 788.0. Samples: 1301611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:40,583][08744] Avg episode reward: [(0, '2.140'), (1, '2.060')]
+[2023-09-27 11:12:43,979][09878] Updated weights for policy 0, policy_version 10240 (0.0016)
+[2023-09-27 11:12:43,980][09879] Updated weights for policy 1, policy_version 10240 (0.0014)
+[2023-09-27 11:12:45,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5251072. Throughput: 0: 792.3, 1: 791.7. Samples: 1311146. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:45,583][08744] Avg episode reward: [(0, '2.040'), (1, '2.120')]
+[2023-09-27 11:12:45,592][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000010256_2625536.pth...
+[2023-09-27 11:12:45,593][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000010256_2625536.pth...
+[2023-09-27 11:12:45,625][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000007264_1859584.pth
+[2023-09-27 11:12:45,627][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000007264_1859584.pth
+[2023-09-27 11:12:50,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5283840. Throughput: 0: 796.4, 1: 796.4. Samples: 1320960. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:50,583][08744] Avg episode reward: [(0, '2.090'), (1, '1.970')]
+[2023-09-27 11:12:55,583][08744] Fps is (10 sec: 6143.9, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 5312512. Throughput: 0: 797.2, 1: 796.7. Samples: 1325453. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:12:55,583][08744] Avg episode reward: [(0, '2.160'), (1, '1.950')]
+[2023-09-27 11:12:56,844][09879] Updated weights for policy 1, policy_version 10400 (0.0017)
+[2023-09-27 11:12:56,845][09878] Updated weights for policy 0, policy_version 10400 (0.0017)
+[2023-09-27 11:13:00,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 5341184. Throughput: 0: 801.0, 1: 800.0. Samples: 1335296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:00,583][08744] Avg episode reward: [(0, '2.240'), (1, '1.980')]
+[2023-09-27 11:13:05,582][08744] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 5373952. Throughput: 0: 800.5, 1: 799.7. Samples: 1344664. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:05,583][08744] Avg episode reward: [(0, '2.150'), (1, '1.940')]
+[2023-09-27 11:13:09,762][09879] Updated weights for policy 1, policy_version 10560 (0.0018)
+[2023-09-27 11:13:09,762][09878] Updated weights for policy 0, policy_version 10560 (0.0018)
+[2023-09-27 11:13:10,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 5406720. Throughput: 0: 799.6, 1: 799.1. Samples: 1349632. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:10,583][08744] Avg episode reward: [(0, '2.120'), (1, '1.920')]
+[2023-09-27 11:13:15,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 5439488. Throughput: 0: 798.8, 1: 799.2. Samples: 1358958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:15,583][08744] Avg episode reward: [(0, '2.240'), (1, '1.980')]
+[2023-09-27 11:13:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5472256. Throughput: 0: 797.6, 1: 797.4. Samples: 1368343. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:20,583][08744] Avg episode reward: [(0, '2.140'), (1, '1.960')]
+[2023-09-27 11:13:22,635][09878] Updated weights for policy 0, policy_version 10720 (0.0016)
+[2023-09-27 11:13:22,635][09879] Updated weights for policy 1, policy_version 10720 (0.0015)
+[2023-09-27 11:13:25,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5505024. Throughput: 0: 796.4, 1: 796.5. Samples: 1373290. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:25,584][08744] Avg episode reward: [(0, '2.010'), (1, '1.930')]
+[2023-09-27 11:13:30,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5537792. Throughput: 0: 797.8, 1: 798.1. Samples: 1382961. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:30,583][08744] Avg episode reward: [(0, '2.010'), (1, '2.020')]
+[2023-09-27 11:13:35,445][09878] Updated weights for policy 0, policy_version 10880 (0.0019)
+[2023-09-27 11:13:35,445][09879] Updated weights for policy 1, policy_version 10880 (0.0019)
+[2023-09-27 11:13:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 5570560. Throughput: 0: 796.1, 1: 796.2. Samples: 1392614. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:35,583][08744] Avg episode reward: [(0, '1.960'), (1, '1.980')]
+[2023-09-27 11:13:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5603328. Throughput: 0: 796.4, 1: 796.4. Samples: 1397131. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:40,583][08744] Avg episode reward: [(0, '1.840'), (1, '2.030')]
+[2023-09-27 11:13:45,583][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 5632000. Throughput: 0: 796.4, 1: 796.4. Samples: 1406976. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:45,583][08744] Avg episode reward: [(0, '1.740'), (1, '2.040')]
+[2023-09-27 11:13:48,095][09878] Updated weights for policy 0, policy_version 11040 (0.0018)
+[2023-09-27 11:13:48,095][09879] Updated weights for policy 1, policy_version 11040 (0.0017)
+[2023-09-27 11:13:50,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 5660672. Throughput: 0: 801.5, 1: 802.3. Samples: 1416834. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:13:50,583][08744] Avg episode reward: [(0, '1.830'), (1, '1.990')]
+[2023-09-27 11:13:55,582][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6387.0). Total num frames: 5693440. Throughput: 0: 796.6, 1: 796.6. Samples: 1421329. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:13:55,583][08744] Avg episode reward: [(0, '1.920'), (1, '1.980')]
+[2023-09-27 11:14:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5726208. Throughput: 0: 802.9, 1: 802.1. Samples: 1431183. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:14:00,583][08744] Avg episode reward: [(0, '1.960'), (1, '2.010')]
+[2023-09-27 11:14:00,960][09879] Updated weights for policy 1, policy_version 11200 (0.0016)
+[2023-09-27 11:14:00,960][09878] Updated weights for policy 0, policy_version 11200 (0.0017)
+[2023-09-27 11:14:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5758976. Throughput: 0: 799.7, 1: 799.8. Samples: 1440320. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
+[2023-09-27 11:14:05,583][08744] Avg episode reward: [(0, '1.960'), (1, '2.100')]
+[2023-09-27 11:14:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 5791744. Throughput: 0: 798.7, 1: 798.6. Samples: 1445171. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:14:10,583][08744] Avg episode reward: [(0, '2.070'), (1, '2.200')]
+[2023-09-27 11:14:13,976][09878] Updated weights for policy 0, policy_version 11360 (0.0018)
+[2023-09-27 11:14:13,977][09879] Updated weights for policy 1, policy_version 11360 (0.0017)
+[2023-09-27 11:14:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5824512. Throughput: 0: 795.1, 1: 794.6. Samples: 1454498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:14:15,584][08744] Avg episode reward: [(0, '2.110'), (1, '2.130')]
+[2023-09-27 11:14:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5857280. Throughput: 0: 796.8, 1: 796.7. Samples: 1464320. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:14:20,583][08744] Avg episode reward: [(0, '2.050'), (1, '2.190')]
+[2023-09-27 11:14:25,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 5890048. Throughput: 0: 797.6, 1: 797.7. Samples: 1468919. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:14:25,584][08744] Avg episode reward: [(0, '2.010'), (1, '2.190')]
+[2023-09-27 11:14:26,805][09878] Updated weights for policy 0, policy_version 11520 (0.0016)
+[2023-09-27 11:14:26,805][09879] Updated weights for policy 1, policy_version 11520 (0.0016)
+[2023-09-27 11:14:30,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 5914624. Throughput: 0: 796.4, 1: 796.4. Samples: 1478656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:14:30,583][08744] Avg episode reward: [(0, '2.010'), (1, '2.210')]
+[2023-09-27 11:14:35,582][08744] Fps is (10 sec: 5734.6, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 5947392. Throughput: 0: 790.8, 1: 789.2. Samples: 1487933. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:14:35,583][08744] Avg episode reward: [(0, '1.980'), (1, '2.000')]
+[2023-09-27 11:14:39,856][09878] Updated weights for policy 0, policy_version 11680 (0.0018)
+[2023-09-27 11:14:39,856][09879] Updated weights for policy 1, policy_version 11680 (0.0018)
+[2023-09-27 11:14:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6373.1). Total num frames: 5980160. Throughput: 0: 793.9, 1: 793.3. Samples: 1492750. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:14:40,583][08744] Avg episode reward: [(0, '2.060'), (1, '2.040')]
+[2023-09-27 11:14:45,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6348.8, 300 sec: 6387.0). Total num frames: 6012928. Throughput: 0: 786.6, 1: 786.5. Samples: 1501973. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:14:45,583][08744] Avg episode reward: [(0, '2.020'), (1, '2.010')]
+[2023-09-27 11:14:45,591][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000011744_3006464.pth...
+[2023-09-27 11:14:45,592][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000011744_3006464.pth...
+[2023-09-27 11:14:45,628][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000008752_2240512.pth
+[2023-09-27 11:14:45,628][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000008752_2240512.pth
+[2023-09-27 11:14:50,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6045696. Throughput: 0: 790.0, 1: 790.2. Samples: 1511427. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:14:50,584][08744] Avg episode reward: [(0, '2.060'), (1, '2.030')]
+[2023-09-27 11:14:52,841][09878] Updated weights for policy 0, policy_version 11840 (0.0018)
+[2023-09-27 11:14:52,841][09879] Updated weights for policy 1, policy_version 11840 (0.0018)
+[2023-09-27 11:14:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6078464. Throughput: 0: 789.3, 1: 789.8. Samples: 1516228. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:14:55,583][08744] Avg episode reward: [(0, '2.000'), (1, '1.940')]
+[2023-09-27 11:15:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6111232. Throughput: 0: 791.8, 1: 792.0. Samples: 1525765. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:00,584][08744] Avg episode reward: [(0, '1.950'), (1, '1.960')]
+[2023-09-27 11:15:05,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6135808. Throughput: 0: 791.2, 1: 791.2. Samples: 1535532. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:05,583][08744] Avg episode reward: [(0, '2.020'), (1, '2.100')]
+[2023-09-27 11:15:05,655][09879] Updated weights for policy 1, policy_version 12000 (0.0017)
+[2023-09-27 11:15:05,655][09878] Updated weights for policy 0, policy_version 12000 (0.0015)
+[2023-09-27 11:15:10,583][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6168576. Throughput: 0: 791.0, 1: 790.9. Samples: 1540104. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:15:10,583][08744] Avg episode reward: [(0, '1.940'), (1, '2.120')]
+[2023-09-27 11:15:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6201344. Throughput: 0: 792.5, 1: 792.7. Samples: 1549992. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:15:15,583][08744] Avg episode reward: [(0, '1.890'), (1, '2.080')]
+[2023-09-27 11:15:18,438][09879] Updated weights for policy 1, policy_version 12160 (0.0017)
+[2023-09-27 11:15:18,438][09878] Updated weights for policy 0, policy_version 12160 (0.0018)
+[2023-09-27 11:15:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6234112. Throughput: 0: 793.4, 1: 794.6. Samples: 1559390. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:20,583][08744] Avg episode reward: [(0, '2.120'), (1, '2.210')]
+[2023-09-27 11:15:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 6266880. Throughput: 0: 795.8, 1: 796.5. Samples: 1564407. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:25,583][08744] Avg episode reward: [(0, '2.230'), (1, '2.210')]
+[2023-09-27 11:15:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 6299648. Throughput: 0: 799.9, 1: 799.8. Samples: 1573958. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:15:30,584][08744] Avg episode reward: [(0, '2.320'), (1, '2.210')]
+[2023-09-27 11:15:31,225][09878] Updated weights for policy 0, policy_version 12320 (0.0016)
+[2023-09-27 11:15:31,225][09879] Updated weights for policy 1, policy_version 12320 (0.0013)
+[2023-09-27 11:15:35,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 6332416. Throughput: 0: 798.6, 1: 798.5. Samples: 1583295. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:15:35,583][08744] Avg episode reward: [(0, '2.390'), (1, '2.290')]
+[2023-09-27 11:15:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6365184. Throughput: 0: 801.3, 1: 800.7. Samples: 1588320. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:40,583][08744] Avg episode reward: [(0, '2.440'), (1, '2.260')]
+[2023-09-27 11:15:44,135][09879] Updated weights for policy 1, policy_version 12480 (0.0018)
+[2023-09-27 11:15:44,136][09878] Updated weights for policy 0, policy_version 12480 (0.0018)
+[2023-09-27 11:15:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6397952. Throughput: 0: 797.6, 1: 797.6. Samples: 1597552. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:45,584][08744] Avg episode reward: [(0, '2.490'), (1, '2.260')]
+[2023-09-27 11:15:50,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6422528. Throughput: 0: 796.2, 1: 796.5. Samples: 1607204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:50,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.290')]
+[2023-09-27 11:15:55,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6455296. Throughput: 0: 796.4, 1: 796.4. Samples: 1611776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:15:55,583][08744] Avg episode reward: [(0, '2.300'), (1, '2.210')]
+[2023-09-27 11:15:57,126][09879] Updated weights for policy 1, policy_version 12640 (0.0016)
+[2023-09-27 11:15:57,127][09878] Updated weights for policy 0, policy_version 12640 (0.0018)
+[2023-09-27 11:16:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 6488064. Throughput: 0: 793.6, 1: 793.5. Samples: 1621412. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:16:00,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.160')]
+[2023-09-27 11:16:05,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 6520832. Throughput: 0: 792.9, 1: 792.9. Samples: 1630752. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:16:05,584][08744] Avg episode reward: [(0, '2.090'), (1, '2.140')]
+[2023-09-27 11:16:10,050][09879] Updated weights for policy 1, policy_version 12800 (0.0016)
+[2023-09-27 11:16:10,051][09878] Updated weights for policy 0, policy_version 12800 (0.0017)
+[2023-09-27 11:16:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 6553600. Throughput: 0: 792.3, 1: 792.1. Samples: 1635704. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:16:10,583][08744] Avg episode reward: [(0, '2.150'), (1, '2.330')]
+[2023-09-27 11:16:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 6586368. Throughput: 0: 789.7, 1: 790.4. Samples: 1645062. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:16:15,583][08744] Avg episode reward: [(0, '2.180'), (1, '2.340')]
+[2023-09-27 11:16:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6619136. Throughput: 0: 794.3, 1: 794.4. Samples: 1654784. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:16:20,583][08744] Avg episode reward: [(0, '2.150'), (1, '2.330')]
+[2023-09-27 11:16:22,958][09879] Updated weights for policy 1, policy_version 12960 (0.0015)
+[2023-09-27 11:16:22,958][09878] Updated weights for policy 0, policy_version 12960 (0.0018)
+[2023-09-27 11:16:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6651904. Throughput: 0: 788.7, 1: 789.4. Samples: 1659333. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:16:25,583][08744] Avg episode reward: [(0, '2.180'), (1, '2.390')]
+[2023-09-27 11:16:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6684672. Throughput: 0: 795.2, 1: 795.2. Samples: 1669120. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:16:30,584][08744] Avg episode reward: [(0, '2.090'), (1, '2.460')]
+[2023-09-27 11:16:35,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6709248. Throughput: 0: 795.5, 1: 795.2. Samples: 1678785. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:16:35,584][08744] Avg episode reward: [(0, '2.150'), (1, '2.330')]
+[2023-09-27 11:16:35,732][09878] Updated weights for policy 0, policy_version 13120 (0.0019)
+[2023-09-27 11:16:35,732][09879] Updated weights for policy 1, policy_version 13120 (0.0018)
+[2023-09-27 11:16:40,582][08744] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6742016. Throughput: 0: 796.6, 1: 796.5. Samples: 1683465. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:16:40,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.270')]
+[2023-09-27 11:16:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 6774784. Throughput: 0: 799.7, 1: 799.7. Samples: 1693386. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:16:45,584][08744] Avg episode reward: [(0, '2.220'), (1, '2.210')]
+[2023-09-27 11:16:45,592][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000013232_3387392.pth...
+[2023-09-27 11:16:45,592][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000013232_3387392.pth...
+[2023-09-27 11:16:45,621][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000010256_2625536.pth
+[2023-09-27 11:16:45,627][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000010256_2625536.pth
+[2023-09-27 11:16:48,442][09878] Updated weights for policy 0, policy_version 13280 (0.0018)
+[2023-09-27 11:16:48,442][09879] Updated weights for policy 1, policy_version 13280 (0.0016)
+[2023-09-27 11:16:50,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 6807552. Throughput: 0: 801.0, 1: 801.1. Samples: 1702847. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:16:50,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.210')]
+[2023-09-27 11:16:55,582][08744] Fps is (10 sec: 6553.9, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 6840320. Throughput: 0: 802.0, 1: 801.8. Samples: 1707873. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:16:55,583][08744] Avg episode reward: [(0, '2.380'), (1, '2.060')]
+[2023-09-27 11:17:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 6873088. Throughput: 0: 800.7, 1: 800.2. Samples: 1717102. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
+[2023-09-27 11:17:00,584][08744] Avg episode reward: [(0, '2.580'), (1, '2.030')]
+[2023-09-27 11:17:01,286][09878] Updated weights for policy 0, policy_version 13440 (0.0017)
+[2023-09-27 11:17:01,287][09879] Updated weights for policy 1, policy_version 13440 (0.0016)
+[2023-09-27 11:17:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6905856. Throughput: 0: 798.2, 1: 798.3. Samples: 1726628. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
+[2023-09-27 11:17:05,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.010')]
+[2023-09-27 11:17:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6938624. Throughput: 0: 802.4, 1: 801.2. Samples: 1731498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:17:10,583][08744] Avg episode reward: [(0, '2.410'), (1, '2.190')]
+[2023-09-27 11:17:13,995][09879] Updated weights for policy 1, policy_version 13600 (0.0017)
+[2023-09-27 11:17:13,995][09878] Updated weights for policy 0, policy_version 13600 (0.0016)
+[2023-09-27 11:17:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 6971392. Throughput: 0: 800.8, 1: 800.8. Samples: 1741193. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:17:15,583][08744] Avg episode reward: [(0, '2.400'), (1, '2.160')]
+[2023-09-27 11:17:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7004160. Throughput: 0: 802.8, 1: 802.9. Samples: 1751040. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:17:20,583][08744] Avg episode reward: [(0, '2.350'), (1, '2.170')]
+[2023-09-27 11:17:25,582][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7036928. Throughput: 0: 802.6, 1: 803.0. Samples: 1755717. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:17:25,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.160')]
+[2023-09-27 11:17:26,725][09879] Updated weights for policy 1, policy_version 13760 (0.0017)
+[2023-09-27 11:17:26,725][09878] Updated weights for policy 0, policy_version 13760 (0.0017)
+[2023-09-27 11:17:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7069696. Throughput: 0: 800.0, 1: 800.0. Samples: 1765386. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:17:30,583][08744] Avg episode reward: [(0, '2.190'), (1, '2.250')]
+[2023-09-27 11:17:35,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 7094272. Throughput: 0: 803.8, 1: 804.3. Samples: 1775212. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:17:35,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.230')]
+[2023-09-27 11:17:39,436][09879] Updated weights for policy 1, policy_version 13920 (0.0017)
+[2023-09-27 11:17:39,436][09878] Updated weights for policy 0, policy_version 13920 (0.0019)
+[2023-09-27 11:17:40,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 7127040. Throughput: 0: 798.5, 1: 798.9. Samples: 1779756. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
+[2023-09-27 11:17:40,583][08744] Avg episode reward: [(0, '2.300'), (1, '2.000')]
+[2023-09-27 11:17:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 7159808. Throughput: 0: 807.6, 1: 808.2. Samples: 1789812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:17:45,584][08744] Avg episode reward: [(0, '2.350'), (1, '2.010')]
+[2023-09-27 11:17:50,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6373.1). Total num frames: 7192576. Throughput: 0: 808.6, 1: 808.9. Samples: 1799417. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:17:50,583][08744] Avg episode reward: [(0, '2.380'), (1, '2.110')]
+[2023-09-27 11:17:52,098][09879] Updated weights for policy 1, policy_version 14080 (0.0013)
+[2023-09-27 11:17:52,098][09878] Updated weights for policy 0, policy_version 14080 (0.0018)
+[2023-09-27 11:17:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 7225344. Throughput: 0: 808.4, 1: 809.1. Samples: 1804282.
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:17:55,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.100')] +[2023-09-27 11:18:00,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7258112. Throughput: 0: 807.7, 1: 808.1. Samples: 1813903. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:00,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.190')] +[2023-09-27 11:18:04,883][09878] Updated weights for policy 0, policy_version 14240 (0.0017) +[2023-09-27 11:18:04,883][09879] Updated weights for policy 1, policy_version 14240 (0.0017) +[2023-09-27 11:18:05,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 7290880. Throughput: 0: 803.2, 1: 803.4. Samples: 1823337. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:05,584][08744] Avg episode reward: [(0, '2.180'), (1, '2.160')] +[2023-09-27 11:18:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7323648. Throughput: 0: 805.4, 1: 804.8. Samples: 1828179. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:10,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.240')] +[2023-09-27 11:18:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 7356416. Throughput: 0: 798.0, 1: 798.0. Samples: 1837206. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:15,583][08744] Avg episode reward: [(0, '2.180'), (1, '2.370')] +[2023-09-27 11:18:17,940][09879] Updated weights for policy 1, policy_version 14400 (0.0017) +[2023-09-27 11:18:17,940][09878] Updated weights for policy 0, policy_version 14400 (0.0018) +[2023-09-27 11:18:20,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7389184. Throughput: 0: 800.5, 1: 800.0. Samples: 1847232. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:18:20,584][08744] Avg episode reward: [(0, '2.120'), (1, '2.290')] +[2023-09-27 11:18:25,583][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 7417856. Throughput: 0: 798.9, 1: 798.7. Samples: 1851650. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:18:25,584][08744] Avg episode reward: [(0, '2.190'), (1, '2.340')] +[2023-09-27 11:18:30,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 7446528. Throughput: 0: 798.1, 1: 797.9. Samples: 1861632. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:30,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.270')] +[2023-09-27 11:18:30,666][09878] Updated weights for policy 0, policy_version 14560 (0.0016) +[2023-09-27 11:18:30,666][09879] Updated weights for policy 1, policy_version 14560 (0.0016) +[2023-09-27 11:18:35,582][08744] Fps is (10 sec: 6144.1, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 7479296. Throughput: 0: 797.5, 1: 797.4. Samples: 1871188. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:35,583][08744] Avg episode reward: [(0, '2.410'), (1, '2.310')] +[2023-09-27 11:18:40,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6373.1). Total num frames: 7512064. Throughput: 0: 796.6, 1: 796.4. Samples: 1875968. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:40,584][08744] Avg episode reward: [(0, '2.410'), (1, '2.060')] +[2023-09-27 11:18:43,575][09878] Updated weights for policy 0, policy_version 14720 (0.0017) +[2023-09-27 11:18:43,575][09879] Updated weights for policy 1, policy_version 14720 (0.0017) +[2023-09-27 11:18:45,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7544832. Throughput: 0: 794.5, 1: 793.5. Samples: 1885365. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:45,583][08744] Avg episode reward: [(0, '2.500'), (1, '2.030')] +[2023-09-27 11:18:45,591][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000014736_3772416.pth... +[2023-09-27 11:18:45,591][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000014736_3772416.pth... +[2023-09-27 11:18:45,626][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000011744_3006464.pth +[2023-09-27 11:18:45,628][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000011744_3006464.pth +[2023-09-27 11:18:50,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7577600. Throughput: 0: 794.7, 1: 794.1. Samples: 1894833. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:18:50,583][08744] Avg episode reward: [(0, '2.580'), (1, '1.890')] +[2023-09-27 11:18:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7610368. Throughput: 0: 794.3, 1: 794.5. Samples: 1899675. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:18:55,583][08744] Avg episode reward: [(0, '2.370'), (1, '1.840')] +[2023-09-27 11:18:56,408][09879] Updated weights for policy 1, policy_version 14880 (0.0017) +[2023-09-27 11:18:56,409][09878] Updated weights for policy 0, policy_version 14880 (0.0018) +[2023-09-27 11:19:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7643136. Throughput: 0: 797.3, 1: 797.2. Samples: 1908956. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:19:00,583][08744] Avg episode reward: [(0, '2.330'), (1, '1.980')] +[2023-09-27 11:19:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7675904. Throughput: 0: 796.8, 1: 796.0. Samples: 1918907. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:19:05,583][08744] Avg episode reward: [(0, '2.420'), (1, '2.040')] +[2023-09-27 11:19:09,354][09879] Updated weights for policy 1, policy_version 15040 (0.0017) +[2023-09-27 11:19:09,354][09878] Updated weights for policy 0, policy_version 15040 (0.0016) +[2023-09-27 11:19:10,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 7700480. Throughput: 0: 796.5, 1: 796.6. Samples: 1923339. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:19:10,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.060')] +[2023-09-27 11:19:15,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 7733248. Throughput: 0: 795.8, 1: 795.5. Samples: 1933242. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:19:15,583][08744] Avg episode reward: [(0, '2.380'), (1, '2.080')] +[2023-09-27 11:19:20,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 7766016. Throughput: 0: 796.8, 1: 796.3. Samples: 1942876. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:19:20,583][08744] Avg episode reward: [(0, '2.240'), (1, '2.210')] +[2023-09-27 11:19:22,056][09879] Updated weights for policy 1, policy_version 15200 (0.0019) +[2023-09-27 11:19:22,056][09878] Updated weights for policy 0, policy_version 15200 (0.0019) +[2023-09-27 11:19:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6348.8, 300 sec: 6387.0). Total num frames: 7798784. Throughput: 0: 796.4, 1: 796.4. Samples: 1947648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:19:25,583][08744] Avg episode reward: [(0, '2.290'), (1, '2.150')] +[2023-09-27 11:19:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7831552. Throughput: 0: 797.7, 1: 798.5. Samples: 1957191. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-27 11:19:30,583][08744] Avg episode reward: [(0, '2.130'), (1, '2.110')] +[2023-09-27 11:19:34,853][09878] Updated weights for policy 0, policy_version 15360 (0.0019) +[2023-09-27 11:19:34,853][09879] Updated weights for policy 1, policy_version 15360 (0.0018) +[2023-09-27 11:19:35,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7864320. Throughput: 0: 799.0, 1: 798.9. Samples: 1966738. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-09-27 11:19:35,583][08744] Avg episode reward: [(0, '2.050'), (1, '2.120')] +[2023-09-27 11:19:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7897088. Throughput: 0: 800.9, 1: 800.6. Samples: 1971743. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:19:40,583][08744] Avg episode reward: [(0, '2.050'), (1, '2.320')] +[2023-09-27 11:19:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 7929856. Throughput: 0: 801.2, 1: 801.4. Samples: 1981074. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:19:45,583][08744] Avg episode reward: [(0, '2.180'), (1, '2.370')] +[2023-09-27 11:19:47,614][09878] Updated weights for policy 0, policy_version 15520 (0.0017) +[2023-09-27 11:19:47,615][09879] Updated weights for policy 1, policy_version 15520 (0.0014) +[2023-09-27 11:19:50,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7962624. Throughput: 0: 797.4, 1: 797.9. Samples: 1990696. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-27 11:19:50,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.210')] +[2023-09-27 11:19:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 7995392. Throughput: 0: 803.6, 1: 803.6. Samples: 1995664. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-09-27 11:19:55,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.210')] +[2023-09-27 11:20:00,430][09879] Updated weights for policy 1, policy_version 15680 (0.0018) +[2023-09-27 11:20:00,430][09878] Updated weights for policy 0, policy_version 15680 (0.0018) +[2023-09-27 11:20:00,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6414.8). Total num frames: 8028160. Throughput: 0: 797.7, 1: 798.1. Samples: 2005053. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-27 11:20:00,583][08744] Avg episode reward: [(0, '2.230'), (1, '2.260')] +[2023-09-27 11:20:05,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 8052736. Throughput: 0: 798.0, 1: 799.4. Samples: 2014761. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-27 11:20:05,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.070')] +[2023-09-27 11:20:10,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8085504. Throughput: 0: 796.6, 1: 796.6. Samples: 2019339. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-09-27 11:20:10,584][08744] Avg episode reward: [(0, '2.290'), (1, '2.020')] +[2023-09-27 11:20:13,314][09878] Updated weights for policy 0, policy_version 15840 (0.0017) +[2023-09-27 11:20:13,314][09879] Updated weights for policy 1, policy_version 15840 (0.0016) +[2023-09-27 11:20:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8118272. Throughput: 0: 801.2, 1: 801.1. Samples: 2029294. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:15,583][08744] Avg episode reward: [(0, '2.170'), (1, '2.190')] +[2023-09-27 11:20:20,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8151040. Throughput: 0: 799.3, 1: 801.2. Samples: 2038763. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:20,583][08744] Avg episode reward: [(0, '2.090'), (1, '2.270')] +[2023-09-27 11:20:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 8183808. Throughput: 0: 799.1, 1: 799.3. Samples: 2043673. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:25,584][08744] Avg episode reward: [(0, '2.230'), (1, '2.320')] +[2023-09-27 11:20:26,134][09878] Updated weights for policy 0, policy_version 16000 (0.0018) +[2023-09-27 11:20:26,135][09879] Updated weights for policy 1, policy_version 16000 (0.0015) +[2023-09-27 11:20:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8216576. Throughput: 0: 799.1, 1: 799.0. Samples: 2052990. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:30,583][08744] Avg episode reward: [(0, '2.240'), (1, '2.550')] +[2023-09-27 11:20:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8249344. Throughput: 0: 796.0, 1: 796.0. Samples: 2062336. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:35,583][08744] Avg episode reward: [(0, '2.240'), (1, '2.560')] +[2023-09-27 11:20:39,122][09878] Updated weights for policy 0, policy_version 16160 (0.0017) +[2023-09-27 11:20:39,123][09879] Updated weights for policy 1, policy_version 16160 (0.0018) +[2023-09-27 11:20:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8282112. Throughput: 0: 793.3, 1: 792.9. Samples: 2067045. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:40,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.590')] +[2023-09-27 11:20:45,583][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6400.9). Total num frames: 8310784. Throughput: 0: 795.8, 1: 795.7. Samples: 2076672. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:45,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.550')] +[2023-09-27 11:20:45,596][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000016240_4157440.pth... +[2023-09-27 11:20:45,602][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000016240_4157440.pth... +[2023-09-27 11:20:45,631][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000013232_3387392.pth +[2023-09-27 11:20:45,631][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000013232_3387392.pth +[2023-09-27 11:20:50,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 8339456. Throughput: 0: 795.2, 1: 795.3. Samples: 2086335. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:50,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.410')] +[2023-09-27 11:20:52,004][09878] Updated weights for policy 0, policy_version 16320 (0.0017) +[2023-09-27 11:20:52,005][09879] Updated weights for policy 1, policy_version 16320 (0.0017) +[2023-09-27 11:20:55,582][08744] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 8372224. Throughput: 0: 796.3, 1: 796.3. Samples: 2091008. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:20:55,583][08744] Avg episode reward: [(0, '2.650'), (1, '2.380')] +[2023-09-27 11:21:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6387.0). Total num frames: 8404992. Throughput: 0: 793.2, 1: 793.5. Samples: 2100694. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:00,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.280')] +[2023-09-27 11:21:04,947][09878] Updated weights for policy 0, policy_version 16480 (0.0017) +[2023-09-27 11:21:04,947][09879] Updated weights for policy 1, policy_version 16480 (0.0013) +[2023-09-27 11:21:05,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). 
Total num frames: 8437760. Throughput: 0: 791.4, 1: 789.7. Samples: 2109916. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:05,583][08744] Avg episode reward: [(0, '2.630'), (1, '2.250')] +[2023-09-27 11:21:10,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8470528. Throughput: 0: 791.5, 1: 792.3. Samples: 2114944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:10,583][08744] Avg episode reward: [(0, '2.600'), (1, '2.360')] +[2023-09-27 11:21:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8503296. Throughput: 0: 790.8, 1: 790.7. Samples: 2124157. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:15,583][08744] Avg episode reward: [(0, '2.600'), (1, '2.330')] +[2023-09-27 11:21:17,800][09878] Updated weights for policy 0, policy_version 16640 (0.0017) +[2023-09-27 11:21:17,801][09879] Updated weights for policy 1, policy_version 16640 (0.0017) +[2023-09-27 11:21:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8536064. Throughput: 0: 796.4, 1: 796.4. Samples: 2134016. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:20,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.410')] +[2023-09-27 11:21:25,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8568832. Throughput: 0: 793.5, 1: 793.8. Samples: 2138474. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:25,583][08744] Avg episode reward: [(0, '2.460'), (1, '2.370')] +[2023-09-27 11:21:30,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 8593408. Throughput: 0: 796.4, 1: 796.4. Samples: 2148352. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:30,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.410')] +[2023-09-27 11:21:30,653][09878] Updated weights for policy 0, policy_version 16800 (0.0019) +[2023-09-27 11:21:30,653][09879] Updated weights for policy 1, policy_version 16800 (0.0015) +[2023-09-27 11:21:35,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 8626176. Throughput: 0: 795.2, 1: 794.6. Samples: 2157875. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:35,584][08744] Avg episode reward: [(0, '2.340'), (1, '2.260')] +[2023-09-27 11:21:40,583][08744] Fps is (10 sec: 6553.0, 60 sec: 6280.4, 300 sec: 6387.0). Total num frames: 8658944. Throughput: 0: 796.4, 1: 796.4. Samples: 2162688. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:40,584][08744] Avg episode reward: [(0, '2.290'), (1, '2.320')] +[2023-09-27 11:21:43,447][09879] Updated weights for policy 1, policy_version 16960 (0.0014) +[2023-09-27 11:21:43,447][09878] Updated weights for policy 0, policy_version 16960 (0.0018) +[2023-09-27 11:21:45,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6348.8, 300 sec: 6387.0). Total num frames: 8691712. Throughput: 0: 795.6, 1: 796.2. Samples: 2172326. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:45,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.280')] +[2023-09-27 11:21:50,582][08744] Fps is (10 sec: 6554.3, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8724480. Throughput: 0: 797.2, 1: 797.9. Samples: 2181692. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:21:50,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.190')] +[2023-09-27 11:21:55,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 8757248. Throughput: 0: 797.2, 1: 796.7. Samples: 2186668. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:21:55,584][08744] Avg episode reward: [(0, '2.260'), (1, '2.100')] +[2023-09-27 11:21:56,266][09879] Updated weights for policy 1, policy_version 17120 (0.0017) +[2023-09-27 11:21:56,268][09878] Updated weights for policy 0, policy_version 17120 (0.0017) +[2023-09-27 11:22:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 8790016. Throughput: 0: 797.6, 1: 798.0. Samples: 2195955. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:22:00,584][08744] Avg episode reward: [(0, '2.370'), (1, '2.040')] +[2023-09-27 11:22:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8822784. Throughput: 0: 796.4, 1: 796.4. Samples: 2205694. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:22:05,583][08744] Avg episode reward: [(0, '2.230'), (1, '2.000')] +[2023-09-27 11:22:09,265][09878] Updated weights for policy 0, policy_version 17280 (0.0017) +[2023-09-27 11:22:09,266][09879] Updated weights for policy 1, policy_version 17280 (0.0017) +[2023-09-27 11:22:10,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8855552. Throughput: 0: 797.1, 1: 797.3. Samples: 2210221. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:22:10,583][08744] Avg episode reward: [(0, '2.220'), (1, '1.990')] +[2023-09-27 11:22:15,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 8880128. Throughput: 0: 796.4, 1: 796.4. Samples: 2220031. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:22:15,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.030')] +[2023-09-27 11:22:20,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 8912896. Throughput: 0: 797.7, 1: 797.5. Samples: 2229662. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:22:20,584][08744] Avg episode reward: [(0, '2.400'), (1, '2.100')] +[2023-09-27 11:22:22,013][09878] Updated weights for policy 0, policy_version 17440 (0.0017) +[2023-09-27 11:22:22,013][09879] Updated weights for policy 1, policy_version 17440 (0.0017) +[2023-09-27 11:22:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 8945664. Throughput: 0: 796.5, 1: 796.5. Samples: 2234368. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:22:25,583][08744] Avg episode reward: [(0, '2.350'), (1, '2.120')] +[2023-09-27 11:22:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 8978432. Throughput: 0: 798.8, 1: 798.1. Samples: 2244188. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:22:30,583][08744] Avg episode reward: [(0, '2.300'), (1, '2.100')] +[2023-09-27 11:22:34,747][09878] Updated weights for policy 0, policy_version 17600 (0.0018) +[2023-09-27 11:22:34,747][09879] Updated weights for policy 1, policy_version 17600 (0.0015) +[2023-09-27 11:22:35,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9011200. Throughput: 0: 799.5, 1: 799.9. Samples: 2253665. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:22:35,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.210')] +[2023-09-27 11:22:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.2, 300 sec: 6387.0). Total num frames: 9043968. Throughput: 0: 800.3, 1: 800.6. Samples: 2258712. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:22:40,583][08744] Avg episode reward: [(0, '2.510'), (1, '2.140')] +[2023-09-27 11:22:45,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9076736. Throughput: 0: 801.9, 1: 802.1. Samples: 2268132. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:22:45,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.080')] +[2023-09-27 11:22:45,591][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000017728_4538368.pth... +[2023-09-27 11:22:45,592][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000017728_4538368.pth... +[2023-09-27 11:22:45,625][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000014736_3772416.pth +[2023-09-27 11:22:45,627][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000014736_3772416.pth +[2023-09-27 11:22:47,523][09878] Updated weights for policy 0, policy_version 17760 (0.0018) +[2023-09-27 11:22:47,523][09879] Updated weights for policy 1, policy_version 17760 (0.0017) +[2023-09-27 11:22:50,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9109504. Throughput: 0: 799.2, 1: 799.0. Samples: 2277616. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:22:50,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.250')] +[2023-09-27 11:22:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9142272. Throughput: 0: 802.5, 1: 802.8. Samples: 2282462. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:22:55,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.290')] +[2023-09-27 11:23:00,282][09878] Updated weights for policy 0, policy_version 17920 (0.0019) +[2023-09-27 11:23:00,282][09879] Updated weights for policy 1, policy_version 17920 (0.0018) +[2023-09-27 11:23:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9175040. Throughput: 0: 800.0, 1: 800.2. Samples: 2292042. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:23:00,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.220')] +[2023-09-27 11:23:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). 
Total num frames: 9207808. Throughput: 0: 803.4, 1: 803.0. Samples: 2301952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:23:05,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.280')]
+[2023-09-27 11:23:10,582][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 9236480. Throughput: 0: 800.1, 1: 800.3. Samples: 2306385. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:23:10,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.290')]
+[2023-09-27 11:23:13,277][09878] Updated weights for policy 0, policy_version 18080 (0.0019)
+[2023-09-27 11:23:13,277][09879] Updated weights for policy 1, policy_version 18080 (0.0017)
+[2023-09-27 11:23:15,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 9265152. Throughput: 0: 798.6, 1: 797.7. Samples: 2316020. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:23:15,583][08744] Avg episode reward: [(0, '2.130'), (1, '2.090')]
+[2023-09-27 11:23:20,582][08744] Fps is (10 sec: 6144.1, 60 sec: 6417.1, 300 sec: 6373.1). Total num frames: 9297920. Throughput: 0: 795.5, 1: 794.4. Samples: 2325212. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-09-27 11:23:20,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.130')]
+[2023-09-27 11:23:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9330688. Throughput: 0: 795.4, 1: 794.2. Samples: 2330246. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:23:25,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.140')]
+[2023-09-27 11:23:26,247][09878] Updated weights for policy 0, policy_version 18240 (0.0018)
+[2023-09-27 11:23:26,247][09879] Updated weights for policy 1, policy_version 18240 (0.0018)
+[2023-09-27 11:23:30,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9363456. Throughput: 0: 793.0, 1: 792.5. Samples: 2339479. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-09-27 11:23:30,584][08744] Avg episode reward: [(0, '2.300'), (1, '2.180')]
+[2023-09-27 11:23:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9396224. Throughput: 0: 793.7, 1: 793.8. Samples: 2349056. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:23:35,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.150')]
+[2023-09-27 11:23:39,110][09879] Updated weights for policy 1, policy_version 18400 (0.0018)
+[2023-09-27 11:23:39,110][09878] Updated weights for policy 0, policy_version 18400 (0.0019)
+[2023-09-27 11:23:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9428992. Throughput: 0: 793.0, 1: 792.7. Samples: 2353817. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:23:40,583][08744] Avg episode reward: [(0, '2.280'), (1, '2.150')]
+[2023-09-27 11:23:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 9461760. Throughput: 0: 792.9, 1: 792.8. Samples: 2363401. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:23:45,584][08744] Avg episode reward: [(0, '2.150'), (1, '2.170')]
+[2023-09-27 11:23:50,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9494528. Throughput: 0: 795.4, 1: 794.1. Samples: 2373481. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:23:50,584][08744] Avg episode reward: [(0, '2.130'), (1, '2.170')]
+[2023-09-27 11:23:51,856][09878] Updated weights for policy 0, policy_version 18560 (0.0017)
+[2023-09-27 11:23:51,858][09879] Updated weights for policy 1, policy_version 18560 (0.0016)
+[2023-09-27 11:23:55,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9519104. Throughput: 0: 794.0, 1: 794.0. Samples: 2377843. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:23:55,584][08744] Avg episode reward: [(0, '2.120'), (1, '2.030')]
+[2023-09-27 11:24:00,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9551872. Throughput: 0: 795.1, 1: 794.3. Samples: 2387542. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:00,584][08744] Avg episode reward: [(0, '2.180'), (1, '2.070')]
+[2023-09-27 11:24:04,919][09879] Updated weights for policy 1, policy_version 18720 (0.0016)
+[2023-09-27 11:24:04,920][09878] Updated weights for policy 0, policy_version 18720 (0.0017)
+[2023-09-27 11:24:05,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 9584640. Throughput: 0: 794.2, 1: 794.7. Samples: 2396717. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:05,584][08744] Avg episode reward: [(0, '2.090'), (1, '2.100')]
+[2023-09-27 11:24:10,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6348.8, 300 sec: 6387.0). Total num frames: 9617408. Throughput: 0: 792.8, 1: 794.0. Samples: 2401648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:10,583][08744] Avg episode reward: [(0, '2.120'), (1, '2.220')]
+[2023-09-27 11:24:15,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9650176. Throughput: 0: 795.4, 1: 796.1. Samples: 2411098. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:15,583][08744] Avg episode reward: [(0, '2.110'), (1, '2.270')]
+[2023-09-27 11:24:17,743][09878] Updated weights for policy 0, policy_version 18880 (0.0019)
+[2023-09-27 11:24:17,743][09879] Updated weights for policy 1, policy_version 18880 (0.0014)
+[2023-09-27 11:24:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 9682944. Throughput: 0: 796.4, 1: 796.4. Samples: 2420736. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:20,583][08744] Avg episode reward: [(0, '2.250'), (1, '2.270')]
+[2023-09-27 11:24:25,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9715712. Throughput: 0: 796.2, 1: 796.0. Samples: 2425467. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:25,583][08744] Avg episode reward: [(0, '2.120'), (1, '2.300')]
+[2023-09-27 11:24:30,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9740288. Throughput: 0: 796.4, 1: 796.3. Samples: 2435072. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:30,584][08744] Avg episode reward: [(0, '2.220'), (1, '2.190')]
+[2023-09-27 11:24:30,596][09878] Updated weights for policy 0, policy_version 19040 (0.0020)
+[2023-09-27 11:24:30,596][09879] Updated weights for policy 1, policy_version 19040 (0.0016)
+[2023-09-27 11:24:35,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9773056. Throughput: 0: 792.8, 1: 793.2. Samples: 2444847. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:24:35,583][08744] Avg episode reward: [(0, '2.160'), (1, '2.170')]
+[2023-09-27 11:24:40,582][08744] Fps is (10 sec: 6553.9, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9805824. Throughput: 0: 795.2, 1: 795.1. Samples: 2449408. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:24:40,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.090')]
+[2023-09-27 11:24:43,627][09878] Updated weights for policy 0, policy_version 19200 (0.0016)
+[2023-09-27 11:24:43,628][09879] Updated weights for policy 1, policy_version 19200 (0.0017)
+[2023-09-27 11:24:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9838592. Throughput: 0: 790.2, 1: 790.5. Samples: 2458673. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:45,584][08744] Avg episode reward: [(0, '2.300'), (1, '2.010')]
+[2023-09-27 11:24:45,597][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000019216_4919296.pth...
+[2023-09-27 11:24:45,597][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000019216_4919296.pth...
+[2023-09-27 11:24:45,631][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000016240_4157440.pth
+[2023-09-27 11:24:45,640][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000016240_4157440.pth
+[2023-09-27 11:24:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 9871360. Throughput: 0: 792.6, 1: 792.4. Samples: 2468043. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:50,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.020')]
+[2023-09-27 11:24:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 9904128. Throughput: 0: 790.3, 1: 789.5. Samples: 2472740. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:24:55,583][08744] Avg episode reward: [(0, '2.160'), (1, '2.100')]
+[2023-09-27 11:24:56,605][09878] Updated weights for policy 0, policy_version 19360 (0.0016)
+[2023-09-27 11:24:56,605][09879] Updated weights for policy 1, policy_version 19360 (0.0014)
+[2023-09-27 11:25:00,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 9936896. Throughput: 0: 790.7, 1: 790.2. Samples: 2482238. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:00,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.160')]
+[2023-09-27 11:25:05,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9961472. Throughput: 0: 788.5, 1: 789.5. Samples: 2491747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:05,583][08744] Avg episode reward: [(0, '2.370'), (1, '2.040')]
+[2023-09-27 11:25:09,647][09879] Updated weights for policy 1, policy_version 19520 (0.0016)
+[2023-09-27 11:25:09,647][09878] Updated weights for policy 0, policy_version 19520 (0.0018)
+[2023-09-27 11:25:10,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 9994240. Throughput: 0: 789.4, 1: 789.4. Samples: 2496512. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:10,583][08744] Avg episode reward: [(0, '2.320'), (1, '2.180')]
+[2023-09-27 11:25:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10027008. Throughput: 0: 790.2, 1: 790.3. Samples: 2506195. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:15,584][08744] Avg episode reward: [(0, '2.340'), (1, '2.290')]
+[2023-09-27 11:25:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10059776. Throughput: 0: 785.1, 1: 785.3. Samples: 2515516. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:20,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.310')]
+[2023-09-27 11:25:22,400][09878] Updated weights for policy 0, policy_version 19680 (0.0017)
+[2023-09-27 11:25:22,401][09879] Updated weights for policy 1, policy_version 19680 (0.0017)
+[2023-09-27 11:25:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10092544. Throughput: 0: 791.8, 1: 791.1. Samples: 2520637. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:25,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.250')]
+[2023-09-27 11:25:30,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10125312. Throughput: 0: 792.2, 1: 793.5. Samples: 2530032. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:30,583][08744] Avg episode reward: [(0, '2.530'), (1, '2.370')]
+[2023-09-27 11:25:35,339][09879] Updated weights for policy 1, policy_version 19840 (0.0016)
+[2023-09-27 11:25:35,340][09878] Updated weights for policy 0, policy_version 19840 (0.0018)
+[2023-09-27 11:25:35,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10158080. Throughput: 0: 794.2, 1: 794.2. Samples: 2539520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:35,584][08744] Avg episode reward: [(0, '2.490'), (1, '2.490')]
+[2023-09-27 11:25:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6373.1). Total num frames: 10190848. Throughput: 0: 793.4, 1: 794.3. Samples: 2544187. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:25:40,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.460')]
+[2023-09-27 11:25:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10223616. Throughput: 0: 795.8, 1: 795.7. Samples: 2553856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:25:45,584][08744] Avg episode reward: [(0, '2.490'), (1, '2.490')]
+[2023-09-27 11:25:48,090][09879] Updated weights for policy 1, policy_version 20000 (0.0017)
+[2023-09-27 11:25:48,090][09878] Updated weights for policy 0, policy_version 20000 (0.0016)
+[2023-09-27 11:25:50,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10248192. Throughput: 0: 800.0, 1: 798.5. Samples: 2563678. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:25:50,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.300')]
+[2023-09-27 11:25:55,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10280960. Throughput: 0: 796.7, 1: 796.7. Samples: 2568215. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:25:55,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.510')]
+[2023-09-27 11:26:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10313728. Throughput: 0: 796.1, 1: 796.1. Samples: 2577843. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:26:00,584][08744] Avg episode reward: [(0, '2.480'), (1, '2.550')]
+[2023-09-27 11:26:00,974][09878] Updated weights for policy 0, policy_version 20160 (0.0017)
+[2023-09-27 11:26:00,975][09879] Updated weights for policy 1, policy_version 20160 (0.0017)
+[2023-09-27 11:26:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10346496. Throughput: 0: 799.2, 1: 799.6. Samples: 2587464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:26:05,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.530')]
+[2023-09-27 11:26:10,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 10379264. Throughput: 0: 799.1, 1: 799.7. Samples: 2592581. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:26:10,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.500')]
+[2023-09-27 11:26:13,579][09879] Updated weights for policy 1, policy_version 20320 (0.0017)
+[2023-09-27 11:26:13,579][09878] Updated weights for policy 0, policy_version 20320 (0.0016)
+[2023-09-27 11:26:15,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10412032. Throughput: 0: 801.8, 1: 801.2. Samples: 2602167. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:26:15,584][08744] Avg episode reward: [(0, '2.230'), (1, '2.510')]
+[2023-09-27 11:26:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10444800. Throughput: 0: 800.6, 1: 800.9. Samples: 2611588. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:26:20,583][08744] Avg episode reward: [(0, '2.100'), (1, '2.520')]
+[2023-09-27 11:26:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10477568. Throughput: 0: 803.7, 1: 804.1. Samples: 2616537. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:26:25,583][08744] Avg episode reward: [(0, '2.040'), (1, '2.420')]
+[2023-09-27 11:26:26,379][09878] Updated weights for policy 0, policy_version 20480 (0.0017)
+[2023-09-27 11:26:26,379][09879] Updated weights for policy 1, policy_version 20480 (0.0017)
+[2023-09-27 11:26:30,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10510336. Throughput: 0: 801.7, 1: 802.0. Samples: 2626020. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:26:30,583][08744] Avg episode reward: [(0, '2.200'), (1, '2.180')]
+[2023-09-27 11:26:35,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10543104. Throughput: 0: 800.4, 1: 800.8. Samples: 2635734. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:26:35,584][08744] Avg episode reward: [(0, '2.290'), (1, '2.160')]
+[2023-09-27 11:26:39,263][09879] Updated weights for policy 1, policy_version 20640 (0.0017)
+[2023-09-27 11:26:39,264][09878] Updated weights for policy 0, policy_version 20640 (0.0018)
+[2023-09-27 11:26:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10575872. Throughput: 0: 800.9, 1: 800.9. Samples: 2640299. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:26:40,583][08744] Avg episode reward: [(0, '2.510'), (1, '2.210')]
+[2023-09-27 11:26:45,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10600448. Throughput: 0: 803.0, 1: 802.8. Samples: 2650105. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:26:45,584][08744] Avg episode reward: [(0, '2.600'), (1, '2.050')]
+[2023-09-27 11:26:45,673][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000020720_5304320.pth...
+[2023-09-27 11:26:45,695][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000020720_5304320.pth...
+[2023-09-27 11:26:45,708][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000017728_4538368.pth
+[2023-09-27 11:26:45,723][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000017728_4538368.pth
+[2023-09-27 11:26:50,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10633216. Throughput: 0: 802.4, 1: 802.8. Samples: 2659697. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:26:50,583][08744] Avg episode reward: [(0, '2.620'), (1, '1.930')]
+[2023-09-27 11:26:52,050][09878] Updated weights for policy 0, policy_version 20800 (0.0018)
+[2023-09-27 11:26:52,050][09879] Updated weights for policy 1, policy_version 20800 (0.0018)
+[2023-09-27 11:26:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10665984. Throughput: 0: 798.5, 1: 798.6. Samples: 2664448. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:26:55,583][08744] Avg episode reward: [(0, '2.740'), (1, '2.030')]
+[2023-09-27 11:27:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10698752. Throughput: 0: 802.4, 1: 802.6. Samples: 2674390. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:27:00,584][08744] Avg episode reward: [(0, '2.860'), (1, '2.140')]
+[2023-09-27 11:27:00,594][09606] Saving new best policy, reward=2.860!
+[2023-09-27 11:27:04,801][09878] Updated weights for policy 0, policy_version 20960 (0.0018)
+[2023-09-27 11:27:04,801][09879] Updated weights for policy 1, policy_version 20960 (0.0015)
+[2023-09-27 11:27:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10731520. Throughput: 0: 800.6, 1: 800.2. Samples: 2683624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:27:05,583][08744] Avg episode reward: [(0, '2.760'), (1, '2.140')]
+[2023-09-27 11:27:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10764288. Throughput: 0: 799.7, 1: 799.2. Samples: 2688485. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:27:10,583][08744] Avg episode reward: [(0, '2.760'), (1, '2.090')]
+[2023-09-27 11:27:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10797056. Throughput: 0: 795.8, 1: 795.7. Samples: 2697634. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:27:15,583][08744] Avg episode reward: [(0, '2.670'), (1, '2.130')]
+[2023-09-27 11:27:17,773][09879] Updated weights for policy 1, policy_version 21120 (0.0017)
+[2023-09-27 11:27:17,773][09878] Updated weights for policy 0, policy_version 21120 (0.0016)
+[2023-09-27 11:27:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10829824. Throughput: 0: 796.9, 1: 796.9. Samples: 2707456. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:27:20,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.110')]
+[2023-09-27 11:27:25,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 10862592. Throughput: 0: 796.1, 1: 796.0. Samples: 2711945. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:27:25,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.050')]
+[2023-09-27 11:27:30,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10887168. Throughput: 0: 796.4, 1: 796.6. Samples: 2721792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:27:30,583][08744] Avg episode reward: [(0, '2.560'), (1, '2.010')]
+[2023-09-27 11:27:30,648][09879] Updated weights for policy 1, policy_version 21280 (0.0016)
+[2023-09-27 11:27:30,648][09878] Updated weights for policy 0, policy_version 21280 (0.0017)
+[2023-09-27 11:27:35,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10919936. Throughput: 0: 795.1, 1: 795.5. Samples: 2731274. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:27:35,583][08744] Avg episode reward: [(0, '2.550'), (1, '2.000')]
+[2023-09-27 11:27:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 10952704. Throughput: 0: 796.4, 1: 796.4. Samples: 2736128. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:27:40,583][08744] Avg episode reward: [(0, '2.630'), (1, '2.030')]
+[2023-09-27 11:27:43,490][09879] Updated weights for policy 1, policy_version 21440 (0.0018)
+[2023-09-27 11:27:43,491][09878] Updated weights for policy 0, policy_version 21440 (0.0018)
+[2023-09-27 11:27:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 10985472. Throughput: 0: 792.3, 1: 792.7. Samples: 2745717. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:27:45,584][08744] Avg episode reward: [(0, '2.660'), (1, '1.990')]
+[2023-09-27 11:27:50,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11018240. Throughput: 0: 795.1, 1: 794.9. Samples: 2755171. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:27:50,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.020')]
+[2023-09-27 11:27:55,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11051008. Throughput: 0: 795.9, 1: 796.2. Samples: 2760128. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:27:55,584][08744] Avg episode reward: [(0, '2.500'), (1, '2.120')]
+[2023-09-27 11:27:56,367][09878] Updated weights for policy 0, policy_version 21600 (0.0019)
+[2023-09-27 11:27:56,367][09879] Updated weights for policy 1, policy_version 21600 (0.0015)
+[2023-09-27 11:28:00,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11083776. Throughput: 0: 797.6, 1: 797.5. Samples: 2769416. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:28:00,584][08744] Avg episode reward: [(0, '2.490'), (1, '2.130')]
+[2023-09-27 11:28:05,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6373.1). Total num frames: 11116544. Throughput: 0: 796.4, 1: 796.4. Samples: 2779136. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:28:05,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.210')]
+[2023-09-27 11:28:09,143][09878] Updated weights for policy 0, policy_version 21760 (0.0016)
+[2023-09-27 11:28:09,144][09879] Updated weights for policy 1, policy_version 21760 (0.0017)
+[2023-09-27 11:28:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11149312. Throughput: 0: 799.0, 1: 799.1. Samples: 2783857. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:28:10,583][08744] Avg episode reward: [(0, '2.170'), (1, '2.380')]
+[2023-09-27 11:28:15,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11173888. Throughput: 0: 796.4, 1: 796.4. Samples: 2793472. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:28:15,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.330')]
+[2023-09-27 11:28:20,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11206656. Throughput: 0: 796.1, 1: 795.9. Samples: 2802914. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:28:20,584][08744] Avg episode reward: [(0, '2.330'), (1, '2.240')]
+[2023-09-27 11:28:22,123][09878] Updated weights for policy 0, policy_version 21920 (0.0018)
+[2023-09-27 11:28:22,123][09879] Updated weights for policy 1, policy_version 21920 (0.0016)
+[2023-09-27 11:28:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11239424. Throughput: 0: 796.4, 1: 796.4. Samples: 2807806. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:28:25,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.240')]
+[2023-09-27 11:28:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11272192. Throughput: 0: 794.4, 1: 794.2. Samples: 2817200. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:28:30,584][08744] Avg episode reward: [(0, '2.450'), (1, '2.320')]
+[2023-09-27 11:28:34,987][09878] Updated weights for policy 0, policy_version 22080 (0.0015)
+[2023-09-27 11:28:34,988][09879] Updated weights for policy 1, policy_version 22080 (0.0015)
+[2023-09-27 11:28:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11304960. Throughput: 0: 794.0, 1: 794.2. Samples: 2826642. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:28:35,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.360')]
+[2023-09-27 11:28:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11337728. Throughput: 0: 793.7, 1: 793.1. Samples: 2831534. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:28:40,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.310')]
+[2023-09-27 11:28:45,583][08744] Fps is (10 sec: 6553.3, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11370496. Throughput: 0: 795.4, 1: 795.3. Samples: 2840998. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:28:45,584][08744] Avg episode reward: [(0, '2.530'), (1, '2.320')]
+[2023-09-27 11:28:45,596][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000022208_5685248.pth...
+[2023-09-27 11:28:45,596][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000022208_5685248.pth...
+[2023-09-27 11:28:45,635][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000019216_4919296.pth
+[2023-09-27 11:28:45,636][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000019216_4919296.pth
+[2023-09-27 11:28:47,888][09878] Updated weights for policy 0, policy_version 22240 (0.0018)
+[2023-09-27 11:28:47,888][09879] Updated weights for policy 1, policy_version 22240 (0.0017)
+[2023-09-27 11:28:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11403264. Throughput: 0: 796.4, 1: 796.3. Samples: 2850808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:28:50,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.580')]
+[2023-09-27 11:28:55,583][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11436032. Throughput: 0: 793.1, 1: 792.9. Samples: 2855224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:28:55,583][08744] Avg episode reward: [(0, '2.530'), (1, '2.560')]
+[2023-09-27 11:29:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11460608. Throughput: 0: 796.4, 1: 796.4. Samples: 2865152. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:00,583][08744] Avg episode reward: [(0, '2.580'), (1, '2.480')]
+[2023-09-27 11:29:00,652][09878] Updated weights for policy 0, policy_version 22400 (0.0017)
+[2023-09-27 11:29:00,652][09879] Updated weights for policy 1, policy_version 22400 (0.0016)
+[2023-09-27 11:29:05,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 11493376. Throughput: 0: 796.8, 1: 796.8. Samples: 2874624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:05,583][08744] Avg episode reward: [(0, '2.690'), (1, '2.350')]
+[2023-09-27 11:29:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11526144. Throughput: 0: 796.4, 1: 796.5. Samples: 2879488. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:10,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.350')]
+[2023-09-27 11:29:13,461][09878] Updated weights for policy 0, policy_version 22560 (0.0016)
+[2023-09-27 11:29:13,461][09879] Updated weights for policy 1, policy_version 22560 (0.0015)
+[2023-09-27 11:29:15,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11558912. Throughput: 0: 799.0, 1: 798.6. Samples: 2889088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:15,584][08744] Avg episode reward: [(0, '2.690'), (1, '2.000')]
+[2023-09-27 11:29:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 11591680. Throughput: 0: 798.2, 1: 799.2. Samples: 2898524. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:20,583][08744] Avg episode reward: [(0, '2.820'), (1, '1.980')]
+[2023-09-27 11:29:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11624448. Throughput: 0: 799.6, 1: 799.9. Samples: 2903512. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:25,583][08744] Avg episode reward: [(0, '2.680'), (1, '2.000')]
+[2023-09-27 11:29:26,285][09878] Updated weights for policy 0, policy_version 22720 (0.0017)
+[2023-09-27 11:29:26,285][09879] Updated weights for policy 1, policy_version 22720 (0.0014)
+[2023-09-27 11:29:30,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11657216. Throughput: 0: 796.6, 1: 796.8. Samples: 2912700. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:30,584][08744] Avg episode reward: [(0, '2.600'), (1, '2.090')]
+[2023-09-27 11:29:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11689984. Throughput: 0: 792.8, 1: 793.4. Samples: 2922188. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:35,583][08744] Avg episode reward: [(0, '2.260'), (1, '2.100')]
+[2023-09-27 11:29:39,567][09878] Updated weights for policy 0, policy_version 22880 (0.0016)
+[2023-09-27 11:29:39,568][09879] Updated weights for policy 1, policy_version 22880 (0.0015)
+[2023-09-27 11:29:40,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11714560. Throughput: 0: 793.0, 1: 793.0. Samples: 2926597. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:40,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.100')]
+[2023-09-27 11:29:45,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 11747328. Throughput: 0: 790.4, 1: 790.2. Samples: 2936278. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:29:45,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.230')]
+[2023-09-27 11:29:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11780096. Throughput: 0: 790.9, 1: 790.8. Samples: 2945803. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:29:50,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.460')]
+[2023-09-27 11:29:52,375][09879] Updated weights for policy 1, policy_version 23040 (0.0016)
+[2023-09-27 11:29:52,376][09878] Updated weights for policy 0, policy_version 23040 (0.0018)
+[2023-09-27 11:29:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11812864. Throughput: 0: 791.7, 1: 791.4. Samples: 2950726. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:29:55,584][08744] Avg episode reward: [(0, '2.550'), (1, '2.410')]
+[2023-09-27 11:30:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11845632. Throughput: 0: 788.9, 1: 789.2. Samples: 2960100. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-09-27 11:30:00,584][08744] Avg episode reward: [(0, '2.620'), (1, '2.480')]
+[2023-09-27 11:30:05,254][09879] Updated weights for policy 1, policy_version 23200 (0.0016)
+[2023-09-27 11:30:05,255][09878] Updated weights for policy 0, policy_version 23200 (0.0018)
+[2023-09-27 11:30:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11878400. Throughput: 0: 790.1, 1: 789.4. Samples: 2969601. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:30:05,583][08744] Avg episode reward: [(0, '2.790'), (1, '2.490')]
+[2023-09-27 11:30:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 11911168. Throughput: 0: 788.1, 1: 787.2. Samples: 2974399. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:30:10,583][08744] Avg episode reward: [(0, '2.510'), (1, '2.520')]
+[2023-09-27 11:30:15,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 11935744. Throughput: 0: 791.6, 1: 791.4. Samples: 2983936. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:30:15,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.360')]
+[2023-09-27 11:30:18,171][09878] Updated weights for policy 0, policy_version 23360 (0.0016)
+[2023-09-27 11:30:18,172][09879] Updated weights for policy 1, policy_version 23360 (0.0018)
+[2023-09-27 11:30:20,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 11968512. Throughput: 0: 793.2, 1: 791.6. Samples: 2993502. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:30:20,583][08744] Avg episode reward: [(0, '2.510'), (1, '2.290')]
+[2023-09-27 11:30:25,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12001280. Throughput: 0: 796.4, 1: 796.4. Samples: 2998272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:30:25,584][08744] Avg episode reward: [(0, '2.450'), (1, '2.120')]
+[2023-09-27 11:30:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12034048. Throughput: 0: 795.8, 1: 795.9. Samples: 3007908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:30:30,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.120')]
+[2023-09-27 11:30:31,033][09878] Updated weights for policy 0, policy_version 23520 (0.0017)
+[2023-09-27 11:30:31,033][09879] Updated weights for policy 1, policy_version 23520 (0.0016)
+[2023-09-27 11:30:35,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12066816. Throughput: 0: 793.7, 1: 794.1. Samples: 3017256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-09-27 11:30:35,584][08744] Avg episode reward: [(0, '2.510'), (1, '2.050')]
+[2023-09-27 11:30:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12099584. Throughput: 0: 794.9, 1: 795.9. Samples: 3022312. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:30:40,583][08744] Avg episode reward: [(0, '2.440'), (1, '2.200')]
+[2023-09-27 11:30:43,947][09879] Updated weights for policy 1, policy_version 23680 (0.0015)
+[2023-09-27 11:30:43,948][09878] Updated weights for policy 0, policy_version 23680 (0.0016)
+[2023-09-27 11:30:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 12132352. Throughput: 0: 793.8, 1: 793.5. Samples: 3031527. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:30:45,584][08744] Avg episode reward: [(0, '2.440'), (1, '2.330')]
+[2023-09-27 11:30:45,596][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000023696_6066176.pth...
+[2023-09-27 11:30:45,597][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000023696_6066176.pth...
+[2023-09-27 11:30:45,625][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000020720_5304320.pth
+[2023-09-27 11:30:45,635][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000020720_5304320.pth
+[2023-09-27 11:30:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 12165120. Throughput: 0: 796.4, 1: 796.4. Samples: 3041280. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:30:50,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.350')]
+[2023-09-27 11:30:55,583][08744] Fps is (10 sec: 6144.1, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 12193792. Throughput: 0: 792.5, 1: 793.2. Samples: 3045753. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:30:55,584][08744] Avg episode reward: [(0, '2.410'), (1, '2.400')]
+[2023-09-27 11:30:56,865][09879] Updated weights for policy 1, policy_version 23840 (0.0016)
+[2023-09-27 11:30:56,866][09878] Updated weights for policy 0, policy_version 23840 (0.0017)
+[2023-09-27 11:31:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12222464. Throughput: 0: 796.3, 1: 796.2. Samples: 3055602. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:31:00,583][08744] Avg episode reward: [(0, '2.360'), (1, '2.420')]
+[2023-09-27 11:31:05,582][08744] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12255232. Throughput: 0: 793.8, 1: 795.2. Samples: 3065006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:31:05,583][08744] Avg episode reward: [(0, '2.270'), (1, '2.270')]
+[2023-09-27 11:31:09,721][09879] Updated weights for policy 1, policy_version 24000 (0.0019)
+[2023-09-27 11:31:09,721][09878] Updated weights for policy 0, policy_version 24000 (0.0019)
+[2023-09-27 11:31:10,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12288000. Throughput: 0: 796.4, 1: 796.4. Samples: 3069952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:31:10,583][08744] Avg episode reward: [(0, '2.320'), (1, '2.240')]
+[2023-09-27 11:31:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12320768. Throughput: 0: 794.4, 1: 794.2. Samples: 3079394. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:31:15,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.420')]
+[2023-09-27 11:31:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12353536. Throughput: 0: 796.9, 1: 796.4. Samples: 3088955. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:31:20,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.280')]
+[2023-09-27 11:31:22,435][09879] Updated weights for policy 1, policy_version 24160 (0.0017)
+[2023-09-27 11:31:22,435][09878] Updated weights for policy 0, policy_version 24160 (0.0016)
+[2023-09-27 11:31:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12386304. Throughput: 0: 797.2, 1: 796.5. Samples: 3094028. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:31:25,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.410')]
+[2023-09-27 11:31:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12419072. Throughput: 0: 797.6, 1: 798.0. Samples: 3103332.
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:31:30,583][08744] Avg episode reward: [(0, '2.710'), (1, '2.380')] +[2023-09-27 11:31:35,324][09878] Updated weights for policy 0, policy_version 24320 (0.0018) +[2023-09-27 11:31:35,324][09879] Updated weights for policy 1, policy_version 24320 (0.0015) +[2023-09-27 11:31:35,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12451840. Throughput: 0: 796.4, 1: 796.4. Samples: 3112960. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:31:35,584][08744] Avg episode reward: [(0, '2.630'), (1, '2.160')] +[2023-09-27 11:31:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 12484608. Throughput: 0: 798.4, 1: 798.4. Samples: 3117611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:31:40,583][08744] Avg episode reward: [(0, '2.860'), (1, '2.050')] +[2023-09-27 11:31:45,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12509184. Throughput: 0: 796.6, 1: 796.6. Samples: 3127296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:31:45,584][08744] Avg episode reward: [(0, '2.900'), (1, '2.260')] +[2023-09-27 11:31:45,629][09606] Saving new best policy, reward=2.900! +[2023-09-27 11:31:48,162][09879] Updated weights for policy 1, policy_version 24480 (0.0017) +[2023-09-27 11:31:48,163][09878] Updated weights for policy 0, policy_version 24480 (0.0016) +[2023-09-27 11:31:50,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12541952. Throughput: 0: 799.9, 1: 799.9. Samples: 3136998. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:31:50,583][08744] Avg episode reward: [(0, '2.710'), (1, '2.160')] +[2023-09-27 11:31:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6348.8, 300 sec: 6359.2). Total num frames: 12574720. Throughput: 0: 796.4, 1: 796.4. Samples: 3141632. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:31:55,584][08744] Avg episode reward: [(0, '2.740'), (1, '2.230')] +[2023-09-27 11:32:00,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12607488. Throughput: 0: 797.1, 1: 797.6. Samples: 3151158. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:00,583][08744] Avg episode reward: [(0, '2.560'), (1, '2.260')] +[2023-09-27 11:32:01,097][09879] Updated weights for policy 1, policy_version 24640 (0.0018) +[2023-09-27 11:32:01,097][09878] Updated weights for policy 0, policy_version 24640 (0.0019) +[2023-09-27 11:32:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12640256. Throughput: 0: 795.6, 1: 795.7. Samples: 3160563. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:05,583][08744] Avg episode reward: [(0, '2.440'), (1, '2.430')] +[2023-09-27 11:32:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12673024. Throughput: 0: 795.8, 1: 795.5. Samples: 3165635. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:10,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.330')] +[2023-09-27 11:32:13,843][09879] Updated weights for policy 1, policy_version 24800 (0.0016) +[2023-09-27 11:32:13,843][09878] Updated weights for policy 0, policy_version 24800 (0.0018) +[2023-09-27 11:32:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12705792. Throughput: 0: 796.9, 1: 797.1. Samples: 3175061. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:15,584][08744] Avg episode reward: [(0, '2.710'), (1, '2.200')] +[2023-09-27 11:32:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12738560. Throughput: 0: 796.3, 1: 796.4. Samples: 3184633. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:32:20,583][08744] Avg episode reward: [(0, '2.690'), (1, '2.280')] +[2023-09-27 11:32:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 12771328. Throughput: 0: 795.4, 1: 795.6. Samples: 3189205. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:32:25,584][08744] Avg episode reward: [(0, '2.770'), (1, '2.370')] +[2023-09-27 11:32:26,761][09879] Updated weights for policy 1, policy_version 24960 (0.0016) +[2023-09-27 11:32:26,761][09878] Updated weights for policy 0, policy_version 24960 (0.0017) +[2023-09-27 11:32:30,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12795904. Throughput: 0: 796.4, 1: 796.4. Samples: 3198976. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:32:30,584][08744] Avg episode reward: [(0, '2.870'), (1, '2.510')] +[2023-09-27 11:32:35,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 12828672. Throughput: 0: 794.0, 1: 793.5. Samples: 3208437. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:35,583][08744] Avg episode reward: [(0, '2.820'), (1, '2.390')] +[2023-09-27 11:32:39,694][09878] Updated weights for policy 0, policy_version 25120 (0.0017) +[2023-09-27 11:32:39,695][09879] Updated weights for policy 1, policy_version 25120 (0.0018) +[2023-09-27 11:32:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 12861440. Throughput: 0: 796.4, 1: 796.4. Samples: 3213312. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:40,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.530')] +[2023-09-27 11:32:45,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12894208. Throughput: 0: 796.9, 1: 796.4. Samples: 3222859. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:45,583][08744] Avg episode reward: [(0, '2.420'), (1, '2.700')] +[2023-09-27 11:32:45,593][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000025184_6447104.pth... +[2023-09-27 11:32:45,593][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000025184_6447104.pth... +[2023-09-27 11:32:45,627][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000022208_5685248.pth +[2023-09-27 11:32:45,634][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000022208_5685248.pth +[2023-09-27 11:32:50,582][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12926976. Throughput: 0: 799.8, 1: 798.7. Samples: 3232499. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:32:50,583][08744] Avg episode reward: [(0, '2.310'), (1, '2.620')] +[2023-09-27 11:32:52,396][09878] Updated weights for policy 0, policy_version 25280 (0.0016) +[2023-09-27 11:32:52,396][09879] Updated weights for policy 1, policy_version 25280 (0.0017) +[2023-09-27 11:32:55,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12959744. Throughput: 0: 797.3, 1: 798.2. Samples: 3237429. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:32:55,583][08744] Avg episode reward: [(0, '2.410'), (1, '2.570')] +[2023-09-27 11:33:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 12992512. Throughput: 0: 796.8, 1: 796.8. Samples: 3246771. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:33:00,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.520')] +[2023-09-27 11:33:05,326][09879] Updated weights for policy 1, policy_version 25440 (0.0015) +[2023-09-27 11:33:05,326][09878] Updated weights for policy 0, policy_version 25440 (0.0018) +[2023-09-27 11:33:05,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6359.2). 
Total num frames: 13025280. Throughput: 0: 796.6, 1: 796.4. Samples: 3256320. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-09-27 11:33:05,584][08744] Avg episode reward: [(0, '2.500'), (1, '2.540')] +[2023-09-27 11:33:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13058048. Throughput: 0: 798.4, 1: 798.5. Samples: 3261065. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:10,583][08744] Avg episode reward: [(0, '2.560'), (1, '2.480')] +[2023-09-27 11:33:15,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 13082624. Throughput: 0: 796.4, 1: 796.4. Samples: 3270656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:15,584][08744] Avg episode reward: [(0, '2.650'), (1, '2.590')] +[2023-09-27 11:33:18,089][09878] Updated weights for policy 0, policy_version 25600 (0.0019) +[2023-09-27 11:33:18,089][09879] Updated weights for policy 1, policy_version 25600 (0.0018) +[2023-09-27 11:33:20,582][08744] Fps is (10 sec: 6143.9, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 13119488. Throughput: 0: 801.2, 1: 801.4. Samples: 3280555. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:20,583][08744] Avg episode reward: [(0, '2.590'), (1, '2.520')] +[2023-09-27 11:33:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 13148160. Throughput: 0: 796.5, 1: 796.6. Samples: 3285001. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:25,584][08744] Avg episode reward: [(0, '2.360'), (1, '2.430')] +[2023-09-27 11:33:30,583][08744] Fps is (10 sec: 6143.9, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13180928. Throughput: 0: 802.3, 1: 802.4. Samples: 3295069. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:30,584][08744] Avg episode reward: [(0, '2.400'), (1, '2.350')] +[2023-09-27 11:33:30,811][09879] Updated weights for policy 1, policy_version 25760 (0.0017) +[2023-09-27 11:33:30,811][09878] Updated weights for policy 0, policy_version 25760 (0.0017) +[2023-09-27 11:33:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13213696. Throughput: 0: 794.9, 1: 796.1. Samples: 3304095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:35,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.190')] +[2023-09-27 11:33:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13246464. Throughput: 0: 796.3, 1: 795.8. Samples: 3309073. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:33:40,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.260')] +[2023-09-27 11:33:43,733][09879] Updated weights for policy 1, policy_version 25920 (0.0018) +[2023-09-27 11:33:43,733][09878] Updated weights for policy 0, policy_version 25920 (0.0020) +[2023-09-27 11:33:45,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13279232. Throughput: 0: 799.1, 1: 798.6. Samples: 3318668. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:33:45,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.200')] +[2023-09-27 11:33:50,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13312000. Throughput: 0: 796.5, 1: 796.5. Samples: 3328005. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:33:50,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.210')] +[2023-09-27 11:33:55,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13344768. Throughput: 0: 797.9, 1: 797.9. Samples: 3332878. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:33:55,583][08744] Avg episode reward: [(0, '2.720'), (1, '2.210')] +[2023-09-27 11:33:56,769][09879] Updated weights for policy 1, policy_version 26080 (0.0017) +[2023-09-27 11:33:56,769][09878] Updated weights for policy 0, policy_version 26080 (0.0018) +[2023-09-27 11:34:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 13369344. Throughput: 0: 796.4, 1: 796.4. Samples: 3342336. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-09-27 11:34:00,584][08744] Avg episode reward: [(0, '2.640'), (1, '2.320')] +[2023-09-27 11:34:05,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 13402112. Throughput: 0: 794.2, 1: 794.4. Samples: 3352040. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:34:05,584][08744] Avg episode reward: [(0, '2.460'), (1, '2.200')] +[2023-09-27 11:34:09,522][09878] Updated weights for policy 0, policy_version 26240 (0.0017) +[2023-09-27 11:34:09,522][09879] Updated weights for policy 1, policy_version 26240 (0.0014) +[2023-09-27 11:34:10,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 13434880. Throughput: 0: 796.4, 1: 796.3. Samples: 3356672. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:34:10,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.230')] +[2023-09-27 11:34:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13467648. Throughput: 0: 792.8, 1: 793.8. Samples: 3366468. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:34:15,584][08744] Avg episode reward: [(0, '2.430'), (1, '2.340')] +[2023-09-27 11:34:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6348.8, 300 sec: 6359.2). Total num frames: 13500416. Throughput: 0: 795.5, 1: 795.6. Samples: 3375697. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:34:20,583][08744] Avg episode reward: [(0, '2.410'), (1, '2.380')] +[2023-09-27 11:34:22,400][09879] Updated weights for policy 1, policy_version 26400 (0.0017) +[2023-09-27 11:34:22,401][09878] Updated weights for policy 0, policy_version 26400 (0.0017) +[2023-09-27 11:34:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13533184. Throughput: 0: 796.7, 1: 796.8. Samples: 3380780. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:34:25,584][08744] Avg episode reward: [(0, '2.310'), (1, '2.380')] +[2023-09-27 11:34:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13565952. Throughput: 0: 794.6, 1: 794.7. Samples: 3390187. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:34:30,584][08744] Avg episode reward: [(0, '2.330'), (1, '2.360')] +[2023-09-27 11:34:35,206][09879] Updated weights for policy 1, policy_version 26560 (0.0015) +[2023-09-27 11:34:35,206][09878] Updated weights for policy 0, policy_version 26560 (0.0017) +[2023-09-27 11:34:35,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13598720. Throughput: 0: 796.5, 1: 796.4. Samples: 3399689. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:34:35,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.540')] +[2023-09-27 11:34:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 13631488. Throughput: 0: 796.5, 1: 796.4. Samples: 3404559. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:34:40,584][08744] Avg episode reward: [(0, '2.530'), (1, '2.430')] +[2023-09-27 11:34:45,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 13664256. Throughput: 0: 796.5, 1: 796.5. Samples: 3414018. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:34:45,584][08744] Avg episode reward: [(0, '2.650'), (1, '2.420')] +[2023-09-27 11:34:45,596][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000026688_6832128.pth... +[2023-09-27 11:34:45,596][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000026688_6832128.pth... +[2023-09-27 11:34:45,627][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000023696_6066176.pth +[2023-09-27 11:34:45,630][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000023696_6066176.pth +[2023-09-27 11:34:48,047][09879] Updated weights for policy 1, policy_version 26720 (0.0017) +[2023-09-27 11:34:48,047][09878] Updated weights for policy 0, policy_version 26720 (0.0017) +[2023-09-27 11:34:50,582][08744] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 13692928. Throughput: 0: 799.4, 1: 798.5. Samples: 3423948. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:34:50,583][08744] Avg episode reward: [(0, '2.550'), (1, '2.630')] +[2023-09-27 11:34:55,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 13721600. Throughput: 0: 796.5, 1: 796.5. Samples: 3428360. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:34:55,583][08744] Avg episode reward: [(0, '2.760'), (1, '2.560')] +[2023-09-27 11:35:00,582][08744] Fps is (10 sec: 6144.0, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 13754368. Throughput: 0: 798.9, 1: 797.8. Samples: 3438321. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:35:00,583][08744] Avg episode reward: [(0, '2.770'), (1, '2.540')] +[2023-09-27 11:35:00,838][09879] Updated weights for policy 1, policy_version 26880 (0.0018) +[2023-09-27 11:35:00,838][09878] Updated weights for policy 0, policy_version 26880 (0.0017) +[2023-09-27 11:35:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). 
Total num frames: 13787136. Throughput: 0: 800.5, 1: 800.5. Samples: 3447742. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:35:05,583][08744] Avg episode reward: [(0, '2.670'), (1, '2.330')] +[2023-09-27 11:35:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13819904. Throughput: 0: 798.1, 1: 798.5. Samples: 3452628. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:35:10,583][08744] Avg episode reward: [(0, '2.710'), (1, '2.330')] +[2023-09-27 11:35:13,704][09878] Updated weights for policy 0, policy_version 27040 (0.0018) +[2023-09-27 11:35:13,705][09879] Updated weights for policy 1, policy_version 27040 (0.0016) +[2023-09-27 11:35:15,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13852672. Throughput: 0: 798.0, 1: 798.2. Samples: 3462013. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:35:15,583][08744] Avg episode reward: [(0, '2.700'), (1, '2.690')] +[2023-09-27 11:35:20,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13885440. Throughput: 0: 798.3, 1: 798.1. Samples: 3471527. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:35:20,583][08744] Avg episode reward: [(0, '2.770'), (1, '2.600')] +[2023-09-27 11:35:25,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13918208. Throughput: 0: 799.4, 1: 799.6. Samples: 3476518. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:35:25,584][08744] Avg episode reward: [(0, '2.540'), (1, '2.700')] +[2023-09-27 11:35:26,459][09879] Updated weights for policy 1, policy_version 27200 (0.0015) +[2023-09-27 11:35:26,460][09878] Updated weights for policy 0, policy_version 27200 (0.0017) +[2023-09-27 11:35:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13950976. Throughput: 0: 801.0, 1: 801.3. Samples: 3486122. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:35:30,584][08744] Avg episode reward: [(0, '2.650'), (1, '2.800')] +[2023-09-27 11:35:30,596][09742] Saving new best policy, reward=2.800! +[2023-09-27 11:35:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 13983744. Throughput: 0: 797.6, 1: 798.7. Samples: 3495780. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:35:35,583][08744] Avg episode reward: [(0, '2.650'), (1, '2.930')] +[2023-09-27 11:35:35,584][09742] Saving new best policy, reward=2.930! +[2023-09-27 11:35:39,388][09879] Updated weights for policy 1, policy_version 27360 (0.0017) +[2023-09-27 11:35:39,388][09878] Updated weights for policy 0, policy_version 27360 (0.0017) +[2023-09-27 11:35:40,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14008320. Throughput: 0: 797.6, 1: 797.6. Samples: 3500142. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:35:40,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.830')] +[2023-09-27 11:35:45,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 14041088. Throughput: 0: 796.4, 1: 796.5. Samples: 3510000. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:35:45,583][08744] Avg episode reward: [(0, '2.580'), (1, '2.630')] +[2023-09-27 11:35:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 14073856. Throughput: 0: 796.2, 1: 796.0. Samples: 3519391. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) +[2023-09-27 11:35:50,583][08744] Avg episode reward: [(0, '2.470'), (1, '2.380')] +[2023-09-27 11:35:52,286][09878] Updated weights for policy 0, policy_version 27520 (0.0016) +[2023-09-27 11:35:52,286][09879] Updated weights for policy 1, policy_version 27520 (0.0017) +[2023-09-27 11:35:55,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14106624. Throughput: 0: 797.0, 1: 795.5. 
Samples: 3524293. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:35:55,583][08744] Avg episode reward: [(0, '2.540'), (1, '2.460')] +[2023-09-27 11:36:00,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14139392. Throughput: 0: 796.6, 1: 796.0. Samples: 3533681. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:36:00,583][08744] Avg episode reward: [(0, '2.390'), (1, '2.320')] +[2023-09-27 11:36:05,207][09878] Updated weights for policy 0, policy_version 27680 (0.0017) +[2023-09-27 11:36:05,207][09879] Updated weights for policy 1, policy_version 27680 (0.0017) +[2023-09-27 11:36:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14172160. Throughput: 0: 794.6, 1: 794.8. Samples: 3543049. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:36:05,583][08744] Avg episode reward: [(0, '2.560'), (1, '2.440')] +[2023-09-27 11:36:10,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14204928. Throughput: 0: 794.4, 1: 793.4. Samples: 3547971. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:36:10,584][08744] Avg episode reward: [(0, '2.670'), (1, '2.290')] +[2023-09-27 11:36:15,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14237696. Throughput: 0: 791.9, 1: 791.6. Samples: 3557381. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:36:15,584][08744] Avg episode reward: [(0, '2.650'), (1, '2.420')] +[2023-09-27 11:36:18,169][09878] Updated weights for policy 0, policy_version 27840 (0.0017) +[2023-09-27 11:36:18,169][09879] Updated weights for policy 1, policy_version 27840 (0.0016) +[2023-09-27 11:36:20,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14262272. Throughput: 0: 791.2, 1: 791.2. Samples: 3566990. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:36:20,583][08744] Avg episode reward: [(0, '2.790'), (1, '2.460')] +[2023-09-27 11:36:25,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14295040. Throughput: 0: 795.2, 1: 795.2. Samples: 3571712. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:36:25,583][08744] Avg episode reward: [(0, '2.730'), (1, '2.500')] +[2023-09-27 11:36:30,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14327808. Throughput: 0: 793.0, 1: 793.5. Samples: 3581393. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:36:30,583][08744] Avg episode reward: [(0, '2.700'), (1, '2.490')] +[2023-09-27 11:36:31,004][09878] Updated weights for policy 0, policy_version 28000 (0.0015) +[2023-09-27 11:36:31,004][09879] Updated weights for policy 1, policy_version 28000 (0.0017) +[2023-09-27 11:36:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14360576. Throughput: 0: 793.5, 1: 793.3. Samples: 3590795. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:36:35,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.500')] +[2023-09-27 11:36:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14393344. Throughput: 0: 795.1, 1: 795.5. Samples: 3595873. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:36:40,583][08744] Avg episode reward: [(0, '2.670'), (1, '2.510')] +[2023-09-27 11:36:43,742][09878] Updated weights for policy 0, policy_version 28160 (0.0018) +[2023-09-27 11:36:43,742][09879] Updated weights for policy 1, policy_version 28160 (0.0017) +[2023-09-27 11:36:45,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14426112. Throughput: 0: 796.1, 1: 796.7. Samples: 3605357. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-09-27 11:36:45,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.300')] +[2023-09-27 11:36:45,590][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000028176_7213056.pth... +[2023-09-27 11:36:45,590][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000028176_7213056.pth... +[2023-09-27 11:36:45,625][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000025184_6447104.pth +[2023-09-27 11:36:45,626][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000025184_6447104.pth +[2023-09-27 11:36:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14458880. Throughput: 0: 798.2, 1: 798.3. Samples: 3614892. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:36:50,584][08744] Avg episode reward: [(0, '2.530'), (1, '2.400')] +[2023-09-27 11:36:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14491648. Throughput: 0: 798.0, 1: 798.8. Samples: 3619828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:36:55,583][08744] Avg episode reward: [(0, '2.650'), (1, '2.360')] +[2023-09-27 11:36:56,483][09879] Updated weights for policy 1, policy_version 28320 (0.0017) +[2023-09-27 11:36:56,483][09878] Updated weights for policy 0, policy_version 28320 (0.0018) +[2023-09-27 11:37:00,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 14524416. Throughput: 0: 799.9, 1: 799.9. Samples: 3629372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:00,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.360')] +[2023-09-27 11:37:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14557184. Throughput: 0: 803.6, 1: 803.2. Samples: 3639296. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:05,583][08744] Avg episode reward: [(0, '2.830'), (1, '2.480')] +[2023-09-27 11:37:09,270][09879] Updated weights for policy 1, policy_version 28480 (0.0017) +[2023-09-27 11:37:09,270][09878] Updated weights for policy 0, policy_version 28480 (0.0015) +[2023-09-27 11:37:10,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14589952. Throughput: 0: 800.7, 1: 800.8. Samples: 3643781. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:37:10,584][08744] Avg episode reward: [(0, '2.770'), (1, '2.410')] +[2023-09-27 11:37:15,583][08744] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14614528. Throughput: 0: 802.8, 1: 802.5. Samples: 3653632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:37:15,584][08744] Avg episode reward: [(0, '2.760'), (1, '2.350')] +[2023-09-27 11:37:20,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 14647296. Throughput: 0: 806.3, 1: 806.9. Samples: 3663386. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:37:20,584][08744] Avg episode reward: [(0, '2.920'), (1, '2.320')] +[2023-09-27 11:37:20,676][09606] Saving new best policy, reward=2.920! +[2023-09-27 11:37:21,994][09879] Updated weights for policy 1, policy_version 28640 (0.0014) +[2023-09-27 11:37:21,994][09878] Updated weights for policy 0, policy_version 28640 (0.0018) +[2023-09-27 11:37:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14680064. Throughput: 0: 800.9, 1: 801.2. Samples: 3667968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:37:25,583][08744] Avg episode reward: [(0, '2.890'), (1, '2.310')] +[2023-09-27 11:37:30,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14712832. Throughput: 0: 799.4, 1: 798.6. Samples: 3677267. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:30,583][08744] Avg episode reward: [(0, '2.620'), (1, '2.380')] +[2023-09-27 11:37:35,110][09879] Updated weights for policy 1, policy_version 28800 (0.0017) +[2023-09-27 11:37:35,110][09878] Updated weights for policy 0, policy_version 28800 (0.0017) +[2023-09-27 11:37:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14745600. Throughput: 0: 796.5, 1: 796.4. Samples: 3686569. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:35,583][08744] Avg episode reward: [(0, '2.530'), (1, '2.400')] +[2023-09-27 11:37:40,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14778368. Throughput: 0: 797.4, 1: 797.5. Samples: 3691601. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:40,583][08744] Avg episode reward: [(0, '2.350'), (1, '2.430')] +[2023-09-27 11:37:45,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 14811136. Throughput: 0: 797.5, 1: 797.4. Samples: 3701143. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:45,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.500')] +[2023-09-27 11:37:47,862][09879] Updated weights for policy 1, policy_version 28960 (0.0020) +[2023-09-27 11:37:47,862][09878] Updated weights for policy 0, policy_version 28960 (0.0020) +[2023-09-27 11:37:50,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 14843904. Throughput: 0: 796.0, 1: 796.4. Samples: 3710954. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:50,584][08744] Avg episode reward: [(0, '2.550'), (1, '2.290')] +[2023-09-27 11:37:55,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14868480. Throughput: 0: 794.0, 1: 794.2. Samples: 3715247. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:37:55,583][08744] Avg episode reward: [(0, '2.680'), (1, '2.410')] +[2023-09-27 11:38:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14901248. Throughput: 0: 795.6, 1: 796.3. Samples: 3725264. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:38:00,583][08744] Avg episode reward: [(0, '2.670'), (1, '2.210')] +[2023-09-27 11:38:00,768][09879] Updated weights for policy 1, policy_version 29120 (0.0018) +[2023-09-27 11:38:00,768][09878] Updated weights for policy 0, policy_version 29120 (0.0018) +[2023-09-27 11:38:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 14934016. Throughput: 0: 790.2, 1: 790.0. Samples: 3734498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:38:05,583][08744] Avg episode reward: [(0, '2.850'), (1, '2.250')] +[2023-09-27 11:38:10,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6387.0). Total num frames: 14966784. Throughput: 0: 793.8, 1: 794.2. Samples: 3739429. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:38:10,583][08744] Avg episode reward: [(0, '2.880'), (1, '2.250')] +[2023-09-27 11:38:13,801][09879] Updated weights for policy 1, policy_version 29280 (0.0019) +[2023-09-27 11:38:13,802][09878] Updated weights for policy 0, policy_version 29280 (0.0018) +[2023-09-27 11:38:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6373.1). Total num frames: 14999552. Throughput: 0: 792.3, 1: 793.2. Samples: 3748616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:38:15,583][08744] Avg episode reward: [(0, '2.900'), (1, '2.420')] +[2023-09-27 11:38:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15032320. Throughput: 0: 794.7, 1: 794.6. Samples: 3758089. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:38:20,583][08744] Avg episode reward: [(0, '2.940'), (1, '2.390')] +[2023-09-27 11:38:20,583][09606] Saving new best policy, reward=2.940! +[2023-09-27 11:38:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15065088. Throughput: 0: 792.8, 1: 792.7. Samples: 3762949. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:38:25,583][08744] Avg episode reward: [(0, '2.870'), (1, '2.290')] +[2023-09-27 11:38:26,569][09879] Updated weights for policy 1, policy_version 29440 (0.0017) +[2023-09-27 11:38:26,569][09878] Updated weights for policy 0, policy_version 29440 (0.0016) +[2023-09-27 11:38:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 15097856. Throughput: 0: 792.4, 1: 792.7. Samples: 3772475. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:38:30,584][08744] Avg episode reward: [(0, '2.890'), (1, '2.320')] +[2023-09-27 11:38:35,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15122432. Throughput: 0: 792.4, 1: 792.2. Samples: 3782263. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:38:35,583][08744] Avg episode reward: [(0, '2.930'), (1, '2.380')] +[2023-09-27 11:38:39,405][09879] Updated weights for policy 1, policy_version 29600 (0.0016) +[2023-09-27 11:38:39,405][09878] Updated weights for policy 0, policy_version 29600 (0.0015) +[2023-09-27 11:38:40,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15155200. Throughput: 0: 795.7, 1: 795.6. Samples: 3786854. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:38:40,584][08744] Avg episode reward: [(0, '2.890'), (1, '2.420')] +[2023-09-27 11:38:45,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15187968. Throughput: 0: 796.7, 1: 794.5. Samples: 3796868. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-09-27 11:38:45,583][08744] Avg episode reward: [(0, '2.790'), (1, '2.380')] +[2023-09-27 11:38:45,755][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000029680_7598080.pth... +[2023-09-27 11:38:45,778][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000029680_7598080.pth... +[2023-09-27 11:38:45,782][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000026688_6832128.pth +[2023-09-27 11:38:45,815][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000026688_6832128.pth +[2023-09-27 11:38:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15220736. Throughput: 0: 796.2, 1: 795.4. Samples: 3806120. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:38:50,584][08744] Avg episode reward: [(0, '2.690'), (1, '2.480')] +[2023-09-27 11:38:52,251][09879] Updated weights for policy 1, policy_version 29760 (0.0016) +[2023-09-27 11:38:52,252][09878] Updated weights for policy 0, policy_version 29760 (0.0017) +[2023-09-27 11:38:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15253504. Throughput: 0: 797.6, 1: 797.2. Samples: 3811194. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:38:55,583][08744] Avg episode reward: [(0, '2.790'), (1, '2.540')] +[2023-09-27 11:39:00,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15286272. Throughput: 0: 799.7, 1: 797.9. Samples: 3820507. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:39:00,583][08744] Avg episode reward: [(0, '2.740'), (1, '2.620')] +[2023-09-27 11:39:05,198][09879] Updated weights for policy 1, policy_version 29920 (0.0018) +[2023-09-27 11:39:05,198][09878] Updated weights for policy 0, policy_version 29920 (0.0019) +[2023-09-27 11:39:05,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). 
Total num frames: 15319040. Throughput: 0: 796.5, 1: 796.4. Samples: 3829770. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-09-27 11:39:05,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.570')] +[2023-09-27 11:39:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15351808. Throughput: 0: 797.9, 1: 797.9. Samples: 3834763. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:39:10,583][08744] Avg episode reward: [(0, '2.750'), (1, '2.620')] +[2023-09-27 11:39:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15384576. Throughput: 0: 796.8, 1: 796.7. Samples: 3844182. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:39:15,583][08744] Avg episode reward: [(0, '2.910'), (1, '2.650')] +[2023-09-27 11:39:17,973][09878] Updated weights for policy 0, policy_version 30080 (0.0017) +[2023-09-27 11:39:17,973][09879] Updated weights for policy 1, policy_version 30080 (0.0015) +[2023-09-27 11:39:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15417344. Throughput: 0: 799.2, 1: 798.1. Samples: 3854140. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:39:20,583][08744] Avg episode reward: [(0, '2.940'), (1, '2.440')] +[2023-09-27 11:39:25,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15441920. Throughput: 0: 796.8, 1: 796.6. Samples: 3858555. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:39:25,583][08744] Avg episode reward: [(0, '2.870'), (1, '2.610')] +[2023-09-27 11:39:30,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 15474688. Throughput: 0: 789.7, 1: 789.7. Samples: 3867940. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:39:30,583][08744] Avg episode reward: [(0, '2.820'), (1, '2.550')] +[2023-09-27 11:39:31,094][09879] Updated weights for policy 1, policy_version 30240 (0.0017) +[2023-09-27 11:39:31,095][09878] Updated weights for policy 0, policy_version 30240 (0.0019) +[2023-09-27 11:39:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 15507456. Throughput: 0: 791.5, 1: 792.2. Samples: 3877388. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:39:35,583][08744] Avg episode reward: [(0, '2.860'), (1, '2.550')] +[2023-09-27 11:39:40,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 15540224. Throughput: 0: 792.0, 1: 791.9. Samples: 3882466. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:39:40,583][08744] Avg episode reward: [(0, '2.750'), (1, '2.370')] +[2023-09-27 11:39:43,772][09879] Updated weights for policy 1, policy_version 30400 (0.0014) +[2023-09-27 11:39:43,773][09878] Updated weights for policy 0, policy_version 30400 (0.0018) +[2023-09-27 11:39:45,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6373.1). Total num frames: 15572992. Throughput: 0: 794.0, 1: 795.0. Samples: 3892012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-09-27 11:39:45,583][08744] Avg episode reward: [(0, '2.660'), (1, '2.400')] +[2023-09-27 11:39:50,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15605760. Throughput: 0: 796.3, 1: 796.3. Samples: 3901440. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:39:50,584][08744] Avg episode reward: [(0, '2.660'), (1, '2.260')] +[2023-09-27 11:39:55,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15638528. Throughput: 0: 792.4, 1: 792.9. Samples: 3906103. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:39:55,583][08744] Avg episode reward: [(0, '2.680'), (1, '2.260')] +[2023-09-27 11:39:56,765][09878] Updated weights for policy 0, policy_version 30560 (0.0020) +[2023-09-27 11:39:56,765][09879] Updated weights for policy 1, policy_version 30560 (0.0018) +[2023-09-27 11:40:00,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15671296. Throughput: 0: 795.5, 1: 795.4. Samples: 3915776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:00,583][08744] Avg episode reward: [(0, '2.720'), (1, '2.200')] +[2023-09-27 11:40:05,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15695872. Throughput: 0: 790.8, 1: 792.0. Samples: 3925365. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:05,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.280')] +[2023-09-27 11:40:09,564][09879] Updated weights for policy 1, policy_version 30720 (0.0015) +[2023-09-27 11:40:09,564][09878] Updated weights for policy 0, policy_version 30720 (0.0015) +[2023-09-27 11:40:10,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15728640. Throughput: 0: 795.0, 1: 795.1. Samples: 3930112. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:10,583][08744] Avg episode reward: [(0, '2.570'), (1, '2.350')] +[2023-09-27 11:40:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15761408. Throughput: 0: 798.4, 1: 799.8. Samples: 3939862. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:15,584][08744] Avg episode reward: [(0, '2.680'), (1, '2.450')] +[2023-09-27 11:40:20,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15794176. Throughput: 0: 799.1, 1: 799.3. Samples: 3949315. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:20,584][08744] Avg episode reward: [(0, '2.650'), (1, '2.340')] +[2023-09-27 11:40:22,379][09878] Updated weights for policy 0, policy_version 30880 (0.0019) +[2023-09-27 11:40:22,379][09879] Updated weights for policy 1, policy_version 30880 (0.0018) +[2023-09-27 11:40:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 15826944. Throughput: 0: 797.7, 1: 797.4. Samples: 3954246. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:25,583][08744] Avg episode reward: [(0, '2.640'), (1, '2.440')] +[2023-09-27 11:40:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 15859712. Throughput: 0: 797.6, 1: 797.9. Samples: 3963812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:30,583][08744] Avg episode reward: [(0, '2.860'), (1, '2.410')] +[2023-09-27 11:40:35,177][09879] Updated weights for policy 1, policy_version 31040 (0.0018) +[2023-09-27 11:40:35,177][09878] Updated weights for policy 0, policy_version 31040 (0.0019) +[2023-09-27 11:40:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15892480. Throughput: 0: 796.8, 1: 796.9. Samples: 3973154. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:35,583][08744] Avg episode reward: [(0, '2.890'), (1, '2.380')] +[2023-09-27 11:40:40,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 15925248. Throughput: 0: 797.1, 1: 797.1. Samples: 3977841. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:40,583][08744] Avg episode reward: [(0, '2.790'), (1, '2.420')] +[2023-09-27 11:40:45,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15949824. Throughput: 0: 796.1, 1: 796.4. Samples: 3987440. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:40:45,583][08744] Avg episode reward: [(0, '2.800'), (1, '2.430')] +[2023-09-27 11:40:45,686][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000031168_7979008.pth... +[2023-09-27 11:40:45,713][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000028176_7213056.pth +[2023-09-27 11:40:45,718][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000031168_7979008.pth... +[2023-09-27 11:40:45,746][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000028176_7213056.pth +[2023-09-27 11:40:48,261][09879] Updated weights for policy 1, policy_version 31200 (0.0018) +[2023-09-27 11:40:48,261][09878] Updated weights for policy 0, policy_version 31200 (0.0019) +[2023-09-27 11:40:50,582][08744] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 15982592. Throughput: 0: 795.3, 1: 794.0. Samples: 3996887. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-27 11:40:50,583][08744] Avg episode reward: [(0, '2.840'), (1, '2.530')] +[2023-09-27 11:40:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16015360. Throughput: 0: 796.4, 1: 796.3. Samples: 4001781. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-27 11:40:55,583][08744] Avg episode reward: [(0, '2.870'), (1, '2.610')] +[2023-09-27 11:41:00,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16048128. Throughput: 0: 791.2, 1: 792.4. Samples: 4011121. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-27 11:41:00,583][08744] Avg episode reward: [(0, '2.820'), (1, '2.560')] +[2023-09-27 11:41:01,184][09879] Updated weights for policy 1, policy_version 31360 (0.0013) +[2023-09-27 11:41:01,185][09878] Updated weights for policy 0, policy_version 31360 (0.0015) +[2023-09-27 11:41:05,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). 
Total num frames: 16080896. Throughput: 0: 789.5, 1: 789.0. Samples: 4020349. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-09-27 11:41:05,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.710')] +[2023-09-27 11:41:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16113664. Throughput: 0: 788.2, 1: 788.4. Samples: 4025197. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:10,583][08744] Avg episode reward: [(0, '2.920'), (1, '2.770')] +[2023-09-27 11:41:14,330][09879] Updated weights for policy 1, policy_version 31520 (0.0015) +[2023-09-27 11:41:14,331][09878] Updated weights for policy 0, policy_version 31520 (0.0018) +[2023-09-27 11:41:15,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16138240. Throughput: 0: 786.0, 1: 786.2. Samples: 4034560. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:15,583][08744] Avg episode reward: [(0, '2.950'), (1, '2.680')] +[2023-09-27 11:41:15,597][09606] Saving new best policy, reward=2.950! +[2023-09-27 11:41:20,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 16171008. Throughput: 0: 784.9, 1: 784.0. Samples: 4043755. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:20,583][08744] Avg episode reward: [(0, '2.950'), (1, '2.640')] +[2023-09-27 11:41:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16203776. Throughput: 0: 787.9, 1: 786.4. Samples: 4048686. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:25,584][08744] Avg episode reward: [(0, '2.770'), (1, '2.540')] +[2023-09-27 11:41:27,398][09878] Updated weights for policy 0, policy_version 31680 (0.0017) +[2023-09-27 11:41:27,399][09879] Updated weights for policy 1, policy_version 31680 (0.0016) +[2023-09-27 11:41:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16236544. 
Throughput: 0: 786.1, 1: 785.9. Samples: 4058179. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:30,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.720')] +[2023-09-27 11:41:35,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16269312. Throughput: 0: 785.6, 1: 786.6. Samples: 4067640. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:35,583][08744] Avg episode reward: [(0, '2.810'), (1, '2.490')] +[2023-09-27 11:41:40,205][09879] Updated weights for policy 1, policy_version 31840 (0.0015) +[2023-09-27 11:41:40,206][09878] Updated weights for policy 0, policy_version 31840 (0.0018) +[2023-09-27 11:41:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 16302080. Throughput: 0: 786.9, 1: 786.5. Samples: 4072583. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:40,583][08744] Avg episode reward: [(0, '2.820'), (1, '2.550')] +[2023-09-27 11:41:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16334848. Throughput: 0: 788.2, 1: 787.4. Samples: 4082022. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:45,583][08744] Avg episode reward: [(0, '2.670'), (1, '2.640')] +[2023-09-27 11:41:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16367616. Throughput: 0: 795.0, 1: 795.1. Samples: 4091904. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:41:50,583][08744] Avg episode reward: [(0, '2.670'), (1, '2.660')] +[2023-09-27 11:41:52,846][09878] Updated weights for policy 0, policy_version 32000 (0.0016) +[2023-09-27 11:41:52,846][09879] Updated weights for policy 1, policy_version 32000 (0.0016) +[2023-09-27 11:41:55,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 16400384. Throughput: 0: 794.3, 1: 794.4. Samples: 4096687. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:41:55,584][08744] Avg episode reward: [(0, '2.610'), (1, '2.450')] +[2023-09-27 11:42:00,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16433152. Throughput: 0: 796.6, 1: 796.6. Samples: 4106251. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:42:00,583][08744] Avg episode reward: [(0, '2.640'), (1, '2.410')] +[2023-09-27 11:42:05,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6331.4). Total num frames: 16457728. Throughput: 0: 803.5, 1: 804.3. Samples: 4116106. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:42:05,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.270')] +[2023-09-27 11:42:05,615][09879] Updated weights for policy 1, policy_version 32160 (0.0018) +[2023-09-27 11:42:05,615][09878] Updated weights for policy 0, policy_version 32160 (0.0019) +[2023-09-27 11:42:10,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16490496. Throughput: 0: 798.3, 1: 799.3. Samples: 4120581. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-09-27 11:42:10,583][08744] Avg episode reward: [(0, '2.760'), (1, '2.200')] +[2023-09-27 11:42:15,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16523264. Throughput: 0: 802.3, 1: 802.6. Samples: 4130401. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:15,584][08744] Avg episode reward: [(0, '2.770'), (1, '2.180')] +[2023-09-27 11:42:18,400][09878] Updated weights for policy 0, policy_version 32320 (0.0019) +[2023-09-27 11:42:18,400][09879] Updated weights for policy 1, policy_version 32320 (0.0017) +[2023-09-27 11:42:20,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16556032. Throughput: 0: 804.2, 1: 803.5. Samples: 4139989. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:20,583][08744] Avg episode reward: [(0, '2.740'), (1, '2.110')] +[2023-09-27 11:42:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16588800. Throughput: 0: 804.1, 1: 804.3. Samples: 4144961. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:25,583][08744] Avg episode reward: [(0, '2.720'), (1, '2.260')] +[2023-09-27 11:42:30,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16621568. Throughput: 0: 803.1, 1: 802.2. Samples: 4154258. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:30,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.430')] +[2023-09-27 11:42:31,339][09878] Updated weights for policy 0, policy_version 32480 (0.0017) +[2023-09-27 11:42:31,339][09879] Updated weights for policy 1, policy_version 32480 (0.0015) +[2023-09-27 11:42:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16654336. Throughput: 0: 796.5, 1: 796.5. Samples: 4163589. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:35,583][08744] Avg episode reward: [(0, '2.830'), (1, '2.650')] +[2023-09-27 11:42:40,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 16687104. Throughput: 0: 798.7, 1: 798.8. Samples: 4168574. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:40,583][08744] Avg episode reward: [(0, '2.810'), (1, '2.610')] +[2023-09-27 11:42:44,163][09879] Updated weights for policy 1, policy_version 32640 (0.0014) +[2023-09-27 11:42:44,164][09878] Updated weights for policy 0, policy_version 32640 (0.0017) +[2023-09-27 11:42:45,583][08744] Fps is (10 sec: 6553.3, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 16719872. Throughput: 0: 796.9, 1: 797.0. Samples: 4177981. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:45,584][08744] Avg episode reward: [(0, '2.970'), (1, '2.840')] +[2023-09-27 11:42:45,597][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000032656_8359936.pth... +[2023-09-27 11:42:45,597][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000032656_8359936.pth... +[2023-09-27 11:42:45,631][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000029680_7598080.pth +[2023-09-27 11:42:45,632][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000029680_7598080.pth +[2023-09-27 11:42:45,635][09606] Saving new best policy, reward=2.970! +[2023-09-27 11:42:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 16752640. Throughput: 0: 800.5, 1: 800.5. Samples: 4188151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:50,583][08744] Avg episode reward: [(0, '2.990'), (1, '2.870')] +[2023-09-27 11:42:50,585][09606] Saving new best policy, reward=2.990! +[2023-09-27 11:42:55,583][08744] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16777216. Throughput: 0: 799.4, 1: 799.1. Samples: 4192517. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:42:55,584][08744] Avg episode reward: [(0, '3.050'), (1, '2.840')] +[2023-09-27 11:42:55,653][09606] Saving new best policy, reward=3.050! +[2023-09-27 11:42:56,976][09879] Updated weights for policy 1, policy_version 32800 (0.0017) +[2023-09-27 11:42:56,976][09878] Updated weights for policy 0, policy_version 32800 (0.0018) +[2023-09-27 11:43:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 16809984. Throughput: 0: 800.2, 1: 800.1. Samples: 4202413. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:43:00,584][08744] Avg episode reward: [(0, '2.980'), (1, '2.950')] +[2023-09-27 11:43:00,680][09742] Saving new best policy, reward=2.950! 
+[2023-09-27 11:43:05,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16842752. Throughput: 0: 799.4, 1: 800.2. Samples: 4211972. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:43:05,583][08744] Avg episode reward: [(0, '2.980'), (1, '2.790')] +[2023-09-27 11:43:09,606][09879] Updated weights for policy 1, policy_version 32960 (0.0016) +[2023-09-27 11:43:09,606][09878] Updated weights for policy 0, policy_version 32960 (0.0016) +[2023-09-27 11:43:10,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16875520. Throughput: 0: 798.4, 1: 798.7. Samples: 4216832. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:43:10,583][08744] Avg episode reward: [(0, '2.890'), (1, '2.760')] +[2023-09-27 11:43:15,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16908288. Throughput: 0: 802.2, 1: 803.7. Samples: 4226522. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-09-27 11:43:15,584][08744] Avg episode reward: [(0, '2.770'), (1, '2.830')] +[2023-09-27 11:43:20,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 16941056. Throughput: 0: 806.6, 1: 806.9. Samples: 4236197. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-09-27 11:43:20,584][08744] Avg episode reward: [(0, '2.900'), (1, '2.960')] +[2023-09-27 11:43:20,585][09742] Saving new best policy, reward=2.960! +[2023-09-27 11:43:22,276][09879] Updated weights for policy 1, policy_version 33120 (0.0015) +[2023-09-27 11:43:22,277][09878] Updated weights for policy 0, policy_version 33120 (0.0018) +[2023-09-27 11:43:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 16973824. Throughput: 0: 807.0, 1: 806.0. Samples: 4241161. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:25,583][08744] Avg episode reward: [(0, '2.860'), (1, '3.030')]
+[2023-09-27 11:43:25,584][09742] Saving new best policy, reward=3.030!
+[2023-09-27 11:43:30,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17006592. Throughput: 0: 808.3, 1: 807.4. Samples: 4250685. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:30,583][08744] Avg episode reward: [(0, '2.850'), (1, '2.900')]
+[2023-09-27 11:43:35,057][09878] Updated weights for policy 0, policy_version 33280 (0.0015)
+[2023-09-27 11:43:35,058][09879] Updated weights for policy 1, policy_version 33280 (0.0017)
+[2023-09-27 11:43:35,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17039360. Throughput: 0: 799.6, 1: 799.5. Samples: 4260112. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:35,583][08744] Avg episode reward: [(0, '2.830'), (1, '2.970')]
+[2023-09-27 11:43:40,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17072128. Throughput: 0: 807.4, 1: 807.8. Samples: 4265204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:40,583][08744] Avg episode reward: [(0, '2.730'), (1, '3.050')]
+[2023-09-27 11:43:40,584][09742] Saving new best policy, reward=3.050!
+[2023-09-27 11:43:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17104896. Throughput: 0: 801.1, 1: 800.9. Samples: 4274502. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:45,583][08744] Avg episode reward: [(0, '2.780'), (1, '3.020')]
+[2023-09-27 11:43:47,806][09879] Updated weights for policy 1, policy_version 33440 (0.0017)
+[2023-09-27 11:43:47,807][09878] Updated weights for policy 0, policy_version 33440 (0.0017)
+[2023-09-27 11:43:50,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17137664. Throughput: 0: 805.0, 1: 804.9. Samples: 4284416. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:50,583][08744] Avg episode reward: [(0, '2.680'), (1, '3.050')]
+[2023-09-27 11:43:55,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6387.0). Total num frames: 17170432. Throughput: 0: 801.6, 1: 801.7. Samples: 4288977. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:43:55,583][08744] Avg episode reward: [(0, '2.660'), (1, '3.010')]
+[2023-09-27 11:44:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 17195008. Throughput: 0: 801.1, 1: 801.0. Samples: 4298619. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:44:00,583][08744] Avg episode reward: [(0, '2.640'), (1, '3.020')]
+[2023-09-27 11:44:00,905][09879] Updated weights for policy 1, policy_version 33600 (0.0015)
+[2023-09-27 11:44:00,905][09878] Updated weights for policy 0, policy_version 33600 (0.0017)
+[2023-09-27 11:44:05,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 17227776. Throughput: 0: 797.4, 1: 797.3. Samples: 4307959. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:44:05,583][08744] Avg episode reward: [(0, '2.530'), (1, '2.800')]
+[2023-09-27 11:44:10,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 17260544. Throughput: 0: 797.6, 1: 799.2. Samples: 4313019. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:44:10,583][08744] Avg episode reward: [(0, '2.520'), (1, '2.800')]
+[2023-09-27 11:44:13,639][09878] Updated weights for policy 0, policy_version 33760 (0.0015)
+[2023-09-27 11:44:13,640][09879] Updated weights for policy 1, policy_version 33760 (0.0014)
+[2023-09-27 11:44:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 17293312. Throughput: 0: 796.0, 1: 797.0. Samples: 4322371. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:44:15,583][08744] Avg episode reward: [(0, '2.460'), (1, '2.830')]
+[2023-09-27 11:44:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17326080. Throughput: 0: 797.4, 1: 797.8. Samples: 4331896. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:44:20,583][08744] Avg episode reward: [(0, '2.300'), (1, '2.670')]
+[2023-09-27 11:44:25,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17358848. Throughput: 0: 796.2, 1: 796.6. Samples: 4336880. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-09-27 11:44:25,583][08744] Avg episode reward: [(0, '2.330'), (1, '2.650')]
+[2023-09-27 11:44:26,405][09878] Updated weights for policy 0, policy_version 33920 (0.0019)
+[2023-09-27 11:44:26,405][09879] Updated weights for policy 1, policy_version 33920 (0.0014)
+[2023-09-27 11:44:30,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 17391616. Throughput: 0: 798.2, 1: 797.8. Samples: 4346323. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:44:30,583][08744] Avg episode reward: [(0, '2.330'), (1, '2.660')]
+[2023-09-27 11:44:35,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17424384. Throughput: 0: 796.4, 1: 796.4. Samples: 4356096. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:44:35,583][08744] Avg episode reward: [(0, '2.330'), (1, '2.770')]
+[2023-09-27 11:44:39,178][09879] Updated weights for policy 1, policy_version 34080 (0.0016)
+[2023-09-27 11:44:39,178][09878] Updated weights for policy 0, policy_version 34080 (0.0020)
+[2023-09-27 11:44:40,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 17457152. Throughput: 0: 797.0, 1: 797.1. Samples: 4360711. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:44:40,584][08744] Avg episode reward: [(0, '2.260'), (1, '2.700')]
+[2023-09-27 11:44:45,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 17481728. Throughput: 0: 798.3, 1: 797.6. Samples: 4370432. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:44:45,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.730')]
+[2023-09-27 11:44:45,599][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000034160_8744960.pth...
+[2023-09-27 11:44:45,624][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000034160_8744960.pth...
+[2023-09-27 11:44:45,627][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000031168_7979008.pth
+[2023-09-27 11:44:45,657][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000031168_7979008.pth
+[2023-09-27 11:44:50,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 17514496. Throughput: 0: 798.3, 1: 799.5. Samples: 4379861. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:44:50,584][08744] Avg episode reward: [(0, '2.700'), (1, '2.520')]
+[2023-09-27 11:44:52,201][09878] Updated weights for policy 0, policy_version 34240 (0.0017)
+[2023-09-27 11:44:52,201][09879] Updated weights for policy 1, policy_version 34240 (0.0018)
+[2023-09-27 11:44:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 17547264. Throughput: 0: 797.0, 1: 796.5. Samples: 4384727. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:44:55,583][08744] Avg episode reward: [(0, '2.690'), (1, '2.610')]
+[2023-09-27 11:45:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17580032. Throughput: 0: 797.6, 1: 797.2. Samples: 4394133. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:45:00,583][08744] Avg episode reward: [(0, '2.710'), (1, '2.570')]
+[2023-09-27 11:45:04,915][09879] Updated weights for policy 1, policy_version 34400 (0.0016)
+[2023-09-27 11:45:04,916][09878] Updated weights for policy 0, policy_version 34400 (0.0017)
+[2023-09-27 11:45:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17612800. Throughput: 0: 798.4, 1: 798.4. Samples: 4403753. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:45:05,583][08744] Avg episode reward: [(0, '2.800'), (1, '2.670')]
+[2023-09-27 11:45:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17645568. Throughput: 0: 799.8, 1: 798.7. Samples: 4408812. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
+[2023-09-27 11:45:10,583][08744] Avg episode reward: [(0, '2.910'), (1, '2.540')]
+[2023-09-27 11:45:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17678336. Throughput: 0: 798.4, 1: 797.6. Samples: 4418144. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:45:15,583][08744] Avg episode reward: [(0, '2.800'), (1, '2.640')]
+[2023-09-27 11:45:17,690][09878] Updated weights for policy 0, policy_version 34560 (0.0019)
+[2023-09-27 11:45:17,690][09879] Updated weights for policy 1, policy_version 34560 (0.0016)
+[2023-09-27 11:45:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17711104. Throughput: 0: 796.4, 1: 796.4. Samples: 4427776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:45:20,583][08744] Avg episode reward: [(0, '2.680'), (1, '2.570')]
+[2023-09-27 11:45:25,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17743872. Throughput: 0: 797.8, 1: 797.1. Samples: 4432478. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:45:25,583][08744] Avg episode reward: [(0, '2.710'), (1, '2.520')]
+[2023-09-27 11:45:30,507][09879] Updated weights for policy 1, policy_version 34720 (0.0016)
+[2023-09-27 11:45:30,507][09878] Updated weights for policy 0, policy_version 34720 (0.0016)
+[2023-09-27 11:45:30,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17776640. Throughput: 0: 796.5, 1: 796.6. Samples: 4442119. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:45:30,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.590')]
+[2023-09-27 11:45:35,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 17801216. Throughput: 0: 801.4, 1: 800.2. Samples: 4451934. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-09-27 11:45:35,583][08744] Avg episode reward: [(0, '2.720'), (1, '2.670')]
+[2023-09-27 11:45:40,582][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.6, 300 sec: 6387.0). Total num frames: 17833984. Throughput: 0: 797.0, 1: 796.8. Samples: 4456448. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:45:40,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.630')]
+[2023-09-27 11:45:43,771][09879] Updated weights for policy 1, policy_version 34880 (0.0014)
+[2023-09-27 11:45:43,772][09878] Updated weights for policy 0, policy_version 34880 (0.0016)
+[2023-09-27 11:45:45,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17866752. Throughput: 0: 792.8, 1: 791.8. Samples: 4465441. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:45:45,583][08744] Avg episode reward: [(0, '2.820'), (1, '2.650')]
+[2023-09-27 11:45:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17899520. Throughput: 0: 790.4, 1: 790.2. Samples: 4474880. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:45:50,583][08744] Avg episode reward: [(0, '2.770'), (1, '2.580')]
+[2023-09-27 11:45:55,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 17932288. Throughput: 0: 784.1, 1: 785.0. Samples: 4479422. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-09-27 11:45:55,583][08744] Avg episode reward: [(0, '2.750'), (1, '2.580')]
+[2023-09-27 11:45:56,802][09879] Updated weights for policy 1, policy_version 35040 (0.0016)
+[2023-09-27 11:45:56,803][09878] Updated weights for policy 0, policy_version 35040 (0.0017)
+[2023-09-27 11:46:00,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 17956864. Throughput: 0: 789.2, 1: 790.1. Samples: 4489216. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:00,583][08744] Avg episode reward: [(0, '2.710'), (1, '2.770')]
+[2023-09-27 11:46:05,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 17989632. Throughput: 0: 784.6, 1: 785.0. Samples: 4498412. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:05,583][08744] Avg episode reward: [(0, '2.680'), (1, '2.680')]
+[2023-09-27 11:46:09,889][09878] Updated weights for policy 0, policy_version 35200 (0.0016)
+[2023-09-27 11:46:09,890][09879] Updated weights for policy 1, policy_version 35200 (0.0018)
+[2023-09-27 11:46:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 18022400. Throughput: 0: 786.4, 1: 789.1. Samples: 4503376. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:10,583][08744] Avg episode reward: [(0, '2.550'), (1, '2.750')]
+[2023-09-27 11:46:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 18055168. Throughput: 0: 783.4, 1: 784.1. Samples: 4512658. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:15,584][08744] Avg episode reward: [(0, '2.380'), (1, '2.860')]
+[2023-09-27 11:46:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 18087936. Throughput: 0: 778.4, 1: 778.2. Samples: 4521984. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:20,583][08744] Avg episode reward: [(0, '2.420'), (1, '2.870')]
+[2023-09-27 11:46:22,869][09879] Updated weights for policy 1, policy_version 35360 (0.0017)
+[2023-09-27 11:46:22,869][09878] Updated weights for policy 0, policy_version 35360 (0.0017)
+[2023-09-27 11:46:25,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 18120704. Throughput: 0: 780.7, 1: 781.0. Samples: 4526724. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:25,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.910')]
+[2023-09-27 11:46:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6387.0). Total num frames: 18153472. Throughput: 0: 787.0, 1: 788.2. Samples: 4536328. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:30,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.910')]
+[2023-09-27 11:46:35,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18178048. Throughput: 0: 787.5, 1: 788.4. Samples: 4545793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:35,584][08744] Avg episode reward: [(0, '2.390'), (1, '3.060')]
+[2023-09-27 11:46:35,585][09742] Saving new best policy, reward=3.060!
+[2023-09-27 11:46:35,821][09878] Updated weights for policy 0, policy_version 35520 (0.0017)
+[2023-09-27 11:46:35,822][09879] Updated weights for policy 1, policy_version 35520 (0.0017)
+[2023-09-27 11:46:40,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18210816. Throughput: 0: 791.6, 1: 791.4. Samples: 4550656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:40,583][08744] Avg episode reward: [(0, '2.450'), (1, '2.960')]
+[2023-09-27 11:46:45,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18243584. Throughput: 0: 789.1, 1: 788.5. Samples: 4560207. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:45,583][08744] Avg episode reward: [(0, '2.490'), (1, '2.890')]
+[2023-09-27 11:46:45,590][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000035632_9121792.pth...
+[2023-09-27 11:46:45,590][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000035632_9121792.pth...
+[2023-09-27 11:46:45,624][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000032656_8359936.pth
+[2023-09-27 11:46:45,628][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000032656_8359936.pth
+[2023-09-27 11:46:48,589][09878] Updated weights for policy 0, policy_version 35680 (0.0014)
+[2023-09-27 11:46:48,589][09879] Updated weights for policy 1, policy_version 35680 (0.0016)
+[2023-09-27 11:46:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18276352. Throughput: 0: 792.7, 1: 792.2. Samples: 4569732. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:50,583][08744] Avg episode reward: [(0, '2.650'), (1, '2.820')]
+[2023-09-27 11:46:55,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 18309120. Throughput: 0: 790.4, 1: 788.4. Samples: 4574426. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:46:55,583][08744] Avg episode reward: [(0, '2.590'), (1, '2.830')]
+[2023-09-27 11:47:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 18341888. Throughput: 0: 792.7, 1: 791.9. Samples: 4583965. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:00,583][08744] Avg episode reward: [(0, '2.480'), (1, '2.590')]
+[2023-09-27 11:47:01,535][09879] Updated weights for policy 1, policy_version 35840 (0.0015)
+[2023-09-27 11:47:01,536][09878] Updated weights for policy 0, policy_version 35840 (0.0017)
+[2023-09-27 11:47:05,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 18374656. Throughput: 0: 796.2, 1: 796.4. Samples: 4593654. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:05,583][08744] Avg episode reward: [(0, '2.430'), (1, '2.520')]
+[2023-09-27 11:47:10,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 18407424. Throughput: 0: 794.1, 1: 793.6. Samples: 4598172. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:47:10,583][08744] Avg episode reward: [(0, '2.510'), (1, '2.440')]
+[2023-09-27 11:47:14,494][09878] Updated weights for policy 0, policy_version 36000 (0.0018)
+[2023-09-27 11:47:14,494][09879] Updated weights for policy 1, policy_version 36000 (0.0019)
+[2023-09-27 11:47:15,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18432000. Throughput: 0: 794.4, 1: 795.0. Samples: 4607853. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:47:15,584][08744] Avg episode reward: [(0, '2.370'), (1, '2.580')]
+[2023-09-27 11:47:20,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18464768. Throughput: 0: 796.1, 1: 795.1. Samples: 4617399. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:47:20,584][08744] Avg episode reward: [(0, '2.340'), (1, '2.500')]
+[2023-09-27 11:47:25,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 18497536. Throughput: 0: 796.4, 1: 796.4. Samples: 4622336. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:47:25,583][08744] Avg episode reward: [(0, '2.380'), (1, '2.620')]
+[2023-09-27 11:47:27,136][09878] Updated weights for policy 0, policy_version 36160 (0.0016)
+[2023-09-27 11:47:27,137][09879] Updated weights for policy 1, policy_version 36160 (0.0018)
+[2023-09-27 11:47:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18530304. Throughput: 0: 799.1, 1: 799.4. Samples: 4632143. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
+[2023-09-27 11:47:30,584][08744] Avg episode reward: [(0, '2.590'), (1, '2.690')]
+[2023-09-27 11:47:35,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18563072. Throughput: 0: 797.6, 1: 797.9. Samples: 4641531. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:35,583][08744] Avg episode reward: [(0, '2.680'), (1, '2.590')]
+[2023-09-27 11:47:39,881][09878] Updated weights for policy 0, policy_version 36320 (0.0015)
+[2023-09-27 11:47:39,882][09879] Updated weights for policy 1, policy_version 36320 (0.0016)
+[2023-09-27 11:47:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18595840. Throughput: 0: 802.4, 1: 802.8. Samples: 4646662. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:40,583][08744] Avg episode reward: [(0, '2.750'), (1, '2.800')]
+[2023-09-27 11:47:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18628608. Throughput: 0: 799.6, 1: 800.1. Samples: 4655951. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:45,584][08744] Avg episode reward: [(0, '2.910'), (1, '2.760')]
+[2023-09-27 11:47:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 18661376. Throughput: 0: 798.1, 1: 797.8. Samples: 4665469. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:50,583][08744] Avg episode reward: [(0, '2.850'), (1, '2.650')]
+[2023-09-27 11:47:52,683][09878] Updated weights for policy 0, policy_version 36480 (0.0016)
+[2023-09-27 11:47:52,683][09879] Updated weights for policy 1, policy_version 36480 (0.0018)
+[2023-09-27 11:47:55,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6387.0). Total num frames: 18694144. Throughput: 0: 802.9, 1: 803.3. Samples: 4670449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:47:55,583][08744] Avg episode reward: [(0, '3.040'), (1, '2.560')]
+[2023-09-27 11:48:00,583][08744] Fps is (10 sec: 6553.4, 60 sec: 6417.0, 300 sec: 6387.0). Total num frames: 18726912. Throughput: 0: 800.1, 1: 799.6. Samples: 4679841. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:48:00,584][08744] Avg episode reward: [(0, '2.990'), (1, '2.670')]
+[2023-09-27 11:48:05,582][08744] Fps is (10 sec: 6143.9, 60 sec: 6348.8, 300 sec: 6373.1). Total num frames: 18755584. Throughput: 0: 801.3, 1: 802.6. Samples: 4689576. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:48:05,583][08744] Avg episode reward: [(0, '2.960'), (1, '2.690')]
+[2023-09-27 11:48:05,629][09878] Updated weights for policy 0, policy_version 36640 (0.0019)
+[2023-09-27 11:48:05,629][09879] Updated weights for policy 1, policy_version 36640 (0.0018)
+[2023-09-27 11:48:10,583][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 18784256. Throughput: 0: 797.0, 1: 797.0. Samples: 4694066. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:48:10,583][08744] Avg episode reward: [(0, '2.920'), (1, '2.700')]
+[2023-09-27 11:48:15,583][08744] Fps is (10 sec: 6143.9, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18817024. Throughput: 0: 797.2, 1: 797.6. Samples: 4703907. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:48:15,584][08744] Avg episode reward: [(0, '2.900'), (1, '2.620')]
+[2023-09-27 11:48:18,401][09879] Updated weights for policy 1, policy_version 36800 (0.0017)
+[2023-09-27 11:48:18,401][09878] Updated weights for policy 0, policy_version 36800 (0.0019)
+[2023-09-27 11:48:20,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18849792. Throughput: 0: 799.9, 1: 799.2. Samples: 4713490. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:48:20,583][08744] Avg episode reward: [(0, '2.770'), (1, '2.650')]
+[2023-09-27 11:48:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 18882560. Throughput: 0: 799.1, 1: 798.2. Samples: 4718542. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:25,584][08744] Avg episode reward: [(0, '2.760'), (1, '2.770')]
+[2023-09-27 11:48:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18915328. Throughput: 0: 800.5, 1: 801.2. Samples: 4728026. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:30,583][08744] Avg episode reward: [(0, '2.740'), (1, '2.760')]
+[2023-09-27 11:48:31,268][09878] Updated weights for policy 0, policy_version 36960 (0.0017)
+[2023-09-27 11:48:31,268][09879] Updated weights for policy 1, policy_version 36960 (0.0017)
+[2023-09-27 11:48:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18948096. Throughput: 0: 795.4, 1: 795.7. Samples: 4737068. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:35,583][08744] Avg episode reward: [(0, '2.700'), (1, '2.710')]
+[2023-09-27 11:48:40,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 18980864. Throughput: 0: 796.2, 1: 795.9. Samples: 4742093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:40,583][08744] Avg episode reward: [(0, '2.610'), (1, '2.810')]
+[2023-09-27 11:48:44,141][09878] Updated weights for policy 0, policy_version 37120 (0.0016)
+[2023-09-27 11:48:44,141][09879] Updated weights for policy 1, policy_version 37120 (0.0016)
+[2023-09-27 11:48:45,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19013632. Throughput: 0: 795.9, 1: 795.7. Samples: 4751463. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:45,584][08744] Avg episode reward: [(0, '2.820'), (1, '2.950')]
+[2023-09-27 11:48:45,597][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000037136_9506816.pth...
+[2023-09-27 11:48:45,597][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000037136_9506816.pth...
+[2023-09-27 11:48:45,631][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000034160_8744960.pth
+[2023-09-27 11:48:45,636][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000034160_8744960.pth
+[2023-09-27 11:48:50,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 19046400. Throughput: 0: 798.6, 1: 798.5. Samples: 4761446. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:50,583][08744] Avg episode reward: [(0, '2.840'), (1, '2.790')]
+[2023-09-27 11:48:55,582][08744] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19070976. Throughput: 0: 796.3, 1: 796.3. Samples: 4765732. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:48:55,583][08744] Avg episode reward: [(0, '3.050'), (1, '2.700')]
+[2023-09-27 11:48:57,080][09879] Updated weights for policy 1, policy_version 37280 (0.0017)
+[2023-09-27 11:48:57,080][09878] Updated weights for policy 0, policy_version 37280 (0.0015)
+[2023-09-27 11:49:00,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19103744. Throughput: 0: 795.6, 1: 796.1. Samples: 4775534. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:00,583][08744] Avg episode reward: [(0, '2.880'), (1, '2.590')]
+[2023-09-27 11:49:05,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6348.8, 300 sec: 6359.2). Total num frames: 19136512. Throughput: 0: 792.1, 1: 793.2. Samples: 4784827. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:05,583][08744] Avg episode reward: [(0, '3.160'), (1, '2.580')]
+[2023-09-27 11:49:05,584][09606] Saving new best policy, reward=3.160!
+[2023-09-27 11:49:09,958][09878] Updated weights for policy 0, policy_version 37440 (0.0016)
+[2023-09-27 11:49:09,958][09879] Updated weights for policy 1, policy_version 37440 (0.0016)
+[2023-09-27 11:49:10,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19169280. Throughput: 0: 791.2, 1: 791.3. Samples: 4789752. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:10,583][08744] Avg episode reward: [(0, '3.170'), (1, '2.400')]
+[2023-09-27 11:49:10,583][09606] Saving new best policy, reward=3.170!
+[2023-09-27 11:49:15,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19202048. Throughput: 0: 791.8, 1: 790.0. Samples: 4799208. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:15,584][08744] Avg episode reward: [(0, '3.300'), (1, '2.600')]
+[2023-09-27 11:49:15,595][09606] Saving new best policy, reward=3.300!
+[2023-09-27 11:49:20,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19234816. Throughput: 0: 796.0, 1: 795.9. Samples: 4808707. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:20,583][08744] Avg episode reward: [(0, '3.140'), (1, '2.750')]
+[2023-09-27 11:49:22,869][09879] Updated weights for policy 1, policy_version 37600 (0.0019)
+[2023-09-27 11:49:22,869][09878] Updated weights for policy 0, policy_version 37600 (0.0019)
+[2023-09-27 11:49:25,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19267584. Throughput: 0: 792.1, 1: 792.8. Samples: 4813413. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:25,584][08744] Avg episode reward: [(0, '3.250'), (1, '2.680')]
+[2023-09-27 11:49:30,583][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19300352. Throughput: 0: 795.3, 1: 795.3. Samples: 4823040. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:30,584][08744] Avg episode reward: [(0, '3.110'), (1, '2.670')]
+[2023-09-27 11:49:35,582][08744] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 19324928. Throughput: 0: 793.7, 1: 792.4. Samples: 4832820. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:35,583][08744] Avg episode reward: [(0, '3.270'), (1, '2.610')]
+[2023-09-27 11:49:35,678][09879] Updated weights for policy 1, policy_version 37760 (0.0016)
+[2023-09-27 11:49:35,678][09878] Updated weights for policy 0, policy_version 37760 (0.0016)
+[2023-09-27 11:49:40,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19357696. Throughput: 0: 796.1, 1: 796.0. Samples: 4837376. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:40,583][08744] Avg episode reward: [(0, '3.180'), (1, '2.300')]
+[2023-09-27 11:49:45,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 19390464. Throughput: 0: 793.2, 1: 792.8. Samples: 4846904. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:45,583][08744] Avg episode reward: [(0, '3.160'), (1, '2.230')]
+[2023-09-27 11:49:48,612][09878] Updated weights for policy 0, policy_version 37920 (0.0015)
+[2023-09-27 11:49:48,612][09879] Updated weights for policy 1, policy_version 37920 (0.0016)
+[2023-09-27 11:49:50,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19423232. Throughput: 0: 793.6, 1: 793.4. Samples: 4856241. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:50,583][08744] Avg episode reward: [(0, '3.290'), (1, '2.330')]
+[2023-09-27 11:49:55,582][08744] Fps is (10 sec: 6553.5, 60 sec: 6417.0, 300 sec: 6359.2). Total num frames: 19456000. Throughput: 0: 793.6, 1: 794.4. Samples: 4861210. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:49:55,584][08744] Avg episode reward: [(0, '3.150'), (1, '2.490')]
+[2023-09-27 11:50:00,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19488768. Throughput: 0: 793.3, 1: 794.2. Samples: 4870645. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:00,583][08744] Avg episode reward: [(0, '3.050'), (1, '2.720')]
+[2023-09-27 11:50:01,481][09878] Updated weights for policy 0, policy_version 38080 (0.0018)
+[2023-09-27 11:50:01,481][09879] Updated weights for policy 1, policy_version 38080 (0.0019)
+[2023-09-27 11:50:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19521536. Throughput: 0: 796.4, 1: 796.4. Samples: 4880386. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:05,584][08744] Avg episode reward: [(0, '3.040'), (1, '2.880')]
+[2023-09-27 11:50:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19554304. Throughput: 0: 797.9, 1: 797.3. Samples: 4885196. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:10,583][08744] Avg episode reward: [(0, '2.860'), (1, '2.960')]
+[2023-09-27 11:50:14,310][09879] Updated weights for policy 1, policy_version 38240 (0.0016)
+[2023-09-27 11:50:14,310][09878] Updated weights for policy 0, policy_version 38240 (0.0018)
+[2023-09-27 11:50:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19587072. Throughput: 0: 796.4, 1: 796.4. Samples: 4894720. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:15,583][08744] Avg episode reward: [(0, '2.870'), (1, '2.980')]
+[2023-09-27 11:50:20,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 19611648. Throughput: 0: 794.4, 1: 794.5. Samples: 4904318. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:20,583][08744] Avg episode reward: [(0, '2.780'), (1, '2.990')]
+[2023-09-27 11:50:25,583][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 19644416. Throughput: 0: 796.4, 1: 796.4. Samples: 4909056. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:25,584][08744] Avg episode reward: [(0, '2.900'), (1, '2.600')]
+[2023-09-27 11:50:27,113][09879] Updated weights for policy 1, policy_version 38400 (0.0019)
+[2023-09-27 11:50:27,114][09878] Updated weights for policy 0, policy_version 38400 (0.0018)
+[2023-09-27 11:50:30,583][08744] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19677184. Throughput: 0: 799.5, 1: 799.0. Samples: 4918837. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:30,584][08744] Avg episode reward: [(0, '2.940'), (1, '2.530')]
+[2023-09-27 11:50:35,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19709952. Throughput: 0: 796.6, 1: 795.8. Samples: 4927896. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:35,583][08744] Avg episode reward: [(0, '3.120'), (1, '2.720')]
+[2023-09-27 11:50:40,261][09878] Updated weights for policy 0, policy_version 38560 (0.0018)
+[2023-09-27 11:50:40,261][09879] Updated weights for policy 1, policy_version 38560 (0.0019)
+[2023-09-27 11:50:40,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19742720. Throughput: 0: 795.1, 1: 795.7. Samples: 4932795. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:40,583][08744] Avg episode reward: [(0, '2.960'), (1, '2.750')]
+[2023-09-27 11:50:45,582][08744] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19775488. Throughput: 0: 792.0, 1: 791.8. Samples: 4941915. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:50:45,583][08744] Avg episode reward: [(0, '3.090'), (1, '2.750')]
+[2023-09-27 11:50:45,592][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000038624_9887744.pth...
+[2023-09-27 11:50:45,592][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000038624_9887744.pth...
+[2023-09-27 11:50:45,621][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000035632_9121792.pth
+[2023-09-27 11:50:45,632][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000035632_9121792.pth
+[2023-09-27 11:50:50,583][08744] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6331.4). Total num frames: 19800064. Throughput: 0: 790.7, 1: 792.6. Samples: 4951634. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:50:50,584][08744] Avg episode reward: [(0, '2.970'), (1, '2.850')]
+[2023-09-27 11:50:53,358][09879] Updated weights for policy 1, policy_version 38720 (0.0015)
+[2023-09-27 11:50:53,359][09878] Updated weights for policy 0, policy_version 38720 (0.0017)
+[2023-09-27 11:50:55,582][08744] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19832832. Throughput: 0: 788.6, 1: 788.4. Samples: 4956160. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:50:55,583][08744] Avg episode reward: [(0, '2.860'), (1, '3.170')]
+[2023-09-27 11:50:55,584][09742] Saving new best policy, reward=3.170!
+[2023-09-27 11:51:00,582][08744] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19865600. Throughput: 0: 791.0, 1: 789.7. Samples: 4965854. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:51:00,583][08744] Avg episode reward: [(0, '2.770'), (1, '3.270')]
+[2023-09-27 11:51:00,589][09742] Saving new best policy, reward=3.270!
+[2023-09-27 11:51:05,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 19898368. Throughput: 0: 788.5, 1: 789.0. Samples: 4975306. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:51:05,583][08744] Avg episode reward: [(0, '2.600'), (1, '3.300')]
+[2023-09-27 11:51:05,584][09742] Saving new best policy, reward=3.300!
+[2023-09-27 11:51:06,122][09878] Updated weights for policy 0, policy_version 38880 (0.0016)
+[2023-09-27 11:51:06,123][09879] Updated weights for policy 1, policy_version 38880 (0.0016)
+[2023-09-27 11:51:10,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6359.2). Total num frames: 19931136. Throughput: 0: 790.6, 1: 790.8. Samples: 4980220. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
+[2023-09-27 11:51:10,583][08744] Avg episode reward: [(0, '2.810'), (1, '3.120')]
+[2023-09-27 11:51:15,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6359.2). Total num frames: 19963904. Throughput: 0: 784.5, 1: 784.8. Samples: 4989456. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:51:15,583][08744] Avg episode reward: [(0, '2.740'), (1, '3.170')]
+[2023-09-27 11:51:18,985][09878] Updated weights for policy 0, policy_version 39040 (0.0018)
+[2023-09-27 11:51:18,985][09879] Updated weights for policy 1, policy_version 39040 (0.0016)
+[2023-09-27 11:51:20,582][08744] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6359.2). Total num frames: 19996672. Throughput: 0: 791.7, 1: 792.2. Samples: 4999170. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
+[2023-09-27 11:51:20,583][08744] Avg episode reward: [(0, '2.850'), (1, '2.850')]
+[2023-09-27 11:51:22,764][09916] Stopping RolloutWorker_w5...
+[2023-09-27 11:51:22,764][09915] Stopping RolloutWorker_w3...
+[2023-09-27 11:51:22,764][09919] Stopping RolloutWorker_w7...
+[2023-09-27 11:51:22,764][08744] Component RolloutWorker_w5 stopped!
+[2023-09-27 11:51:22,764][09918] Stopping RolloutWorker_w6...
+[2023-09-27 11:51:22,764][09917] Stopping RolloutWorker_w4...
+[2023-09-27 11:51:22,764][09913] Stopping RolloutWorker_w2...
+[2023-09-27 11:51:22,764][09912] Stopping RolloutWorker_w1...
+[2023-09-27 11:51:22,764][09916] Loop rollout_proc5_evt_loop terminating...
+[2023-09-27 11:51:22,764][09915] Loop rollout_proc3_evt_loop terminating...
+[2023-09-27 11:51:22,764][09606] Stopping Batcher_0...
+[2023-09-27 11:51:22,764][09909] Stopping RolloutWorker_w0...
+[2023-09-27 11:51:22,764][09919] Loop rollout_proc7_evt_loop terminating...
+[2023-09-27 11:51:22,765][08744] Component RolloutWorker_w3 stopped!
+[2023-09-27 11:51:22,764][09742] Stopping Batcher_1...
+[2023-09-27 11:51:22,765][09917] Loop rollout_proc4_evt_loop terminating...
+[2023-09-27 11:51:22,765][09918] Loop rollout_proc6_evt_loop terminating...
+[2023-09-27 11:51:22,765][09913] Loop rollout_proc2_evt_loop terminating...
+[2023-09-27 11:51:22,765][08744] Component RolloutWorker_w6 stopped!
+[2023-09-27 11:51:22,765][09912] Loop rollout_proc1_evt_loop terminating...
+[2023-09-27 11:51:22,765][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000039088_10006528.pth... +[2023-09-27 11:51:22,765][08744] Component RolloutWorker_w4 stopped! +[2023-09-27 11:51:22,765][09909] Loop rollout_proc0_evt_loop terminating... +[2023-09-27 11:51:22,765][08744] Component RolloutWorker_w1 stopped! +[2023-09-27 11:51:22,765][09742] Loop batcher_evt_loop terminating... +[2023-09-27 11:51:22,766][08744] Component RolloutWorker_w2 stopped! +[2023-09-27 11:51:22,766][08744] Component RolloutWorker_w7 stopped! +[2023-09-27 11:51:22,766][08744] Component Batcher_0 stopped! +[2023-09-27 11:51:22,767][08744] Component Batcher_1 stopped! +[2023-09-27 11:51:22,767][08744] Component RolloutWorker_w0 stopped! +[2023-09-27 11:51:22,765][09606] Loop batcher_evt_loop terminating... +[2023-09-27 11:51:22,802][09606] Removing ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000037136_9506816.pth +[2023-09-27 11:51:22,806][09606] Saving ./train_atari/atari_timepilot/checkpoint_p0/checkpoint_000039088_10006528.pth... +[2023-09-27 11:51:22,812][09879] Weights refcount: 2 0 +[2023-09-27 11:51:22,813][09879] Stopping InferenceWorker_p1-w0... +[2023-09-27 11:51:22,814][09879] Loop inference_proc1-0_evt_loop terminating... +[2023-09-27 11:51:22,813][08744] Component InferenceWorker_p1-w0 stopped! +[2023-09-27 11:51:22,834][09878] Weights refcount: 2 0 +[2023-09-27 11:51:22,835][09878] Stopping InferenceWorker_p0-w0... +[2023-09-27 11:51:22,835][09878] Loop inference_proc0-0_evt_loop terminating... +[2023-09-27 11:51:22,835][08744] Component InferenceWorker_p0-w0 stopped! +[2023-09-27 11:51:22,841][09606] Stopping LearnerWorker_p0... +[2023-09-27 11:51:22,841][09606] Loop learner_proc0_evt_loop terminating... +[2023-09-27 11:51:22,843][08744] Component LearnerWorker_p0 stopped! +[2023-09-27 11:51:22,873][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000039088_10006528.pth... 
+[2023-09-27 11:51:22,902][09742] Removing ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000037136_9506816.pth +[2023-09-27 11:51:22,906][09742] Saving ./train_atari/atari_timepilot/checkpoint_p1/checkpoint_000039088_10006528.pth... +[2023-09-27 11:51:22,944][09742] Stopping LearnerWorker_p1... +[2023-09-27 11:51:22,944][09742] Loop learner_proc1_evt_loop terminating... +[2023-09-27 11:51:22,944][08744] Component LearnerWorker_p1 stopped! +[2023-09-27 11:51:22,945][08744] Waiting for process learner_proc0 to stop... +[2023-09-27 11:51:23,533][08744] Waiting for process learner_proc1 to stop... +[2023-09-27 11:51:23,614][08744] Waiting for process inference_proc0-0 to join... +[2023-09-27 11:51:23,615][08744] Waiting for process inference_proc1-0 to join... +[2023-09-27 11:51:23,615][08744] Waiting for process rollout_proc0 to join... +[2023-09-27 11:51:23,616][08744] Waiting for process rollout_proc1 to join... +[2023-09-27 11:51:23,617][08744] Waiting for process rollout_proc2 to join... +[2023-09-27 11:51:23,617][08744] Waiting for process rollout_proc3 to join... +[2023-09-27 11:51:23,618][08744] Waiting for process rollout_proc4 to join... +[2023-09-27 11:51:23,619][08744] Waiting for process rollout_proc5 to join... +[2023-09-27 11:51:23,619][08744] Waiting for process rollout_proc6 to join... +[2023-09-27 11:51:23,620][08744] Waiting for process rollout_proc7 to join... 
+[2023-09-27 11:51:23,620][08744] Batcher 0 profile tree view: +batching: 21.2075, releasing_batches: 1.7565 +[2023-09-27 11:51:23,621][08744] Batcher 1 profile tree view: +batching: 21.0297, releasing_batches: 1.7133 +[2023-09-27 11:51:23,621][08744] InferenceWorker_p0-w0 profile tree view: +wait_policy: 0.0051 + wait_policy_total: 638.0872 +update_model: 37.5715 + weight_update: 0.0019 +one_step: 0.0012 + handle_policy_step: 2267.1268 + deserialize: 67.7741, stack: 16.3997, obs_to_device_normalize: 551.7942, forward: 1085.7490, send_messages: 94.4382 + prepare_outputs: 302.5156 + to_cpu: 151.8268 +[2023-09-27 11:51:23,622][08744] InferenceWorker_p1-w0 profile tree view: +wait_policy: 0.0051 + wait_policy_total: 637.2362 +update_model: 36.8053 + weight_update: 0.0017 +one_step: 0.0012 + handle_policy_step: 2270.2299 + deserialize: 67.9006, stack: 16.1749, obs_to_device_normalize: 549.1679, forward: 1097.7771, send_messages: 95.9017 + prepare_outputs: 301.4217 + to_cpu: 152.1820 +[2023-09-27 11:51:23,622][08744] Learner 0 profile tree view: +misc: 0.0150, prepare_batch: 31.8350 +train: 458.0539 + epoch_init: 0.1079, minibatch_init: 3.1578, losses_postprocess: 61.7457, kl_divergence: 5.5408, after_optimizer: 21.3328 + calculate_losses: 46.2119 + losses_init: 0.1044, forward_head: 14.7047, bptt_initial: 0.4459, bptt: 0.4995, tail: 10.5941, advantages_returns: 3.1483, losses: 12.9986 + update: 315.7484 + clip: 163.7840 +[2023-09-27 11:51:23,623][08744] Learner 1 profile tree view: +misc: 0.0147, prepare_batch: 31.9707 +train: 455.9617 + epoch_init: 0.1088, minibatch_init: 3.2086, losses_postprocess: 61.4272, kl_divergence: 5.5481, after_optimizer: 21.2121 + calculate_losses: 46.1647 + losses_init: 0.1080, forward_head: 14.7075, bptt_initial: 0.4505, bptt: 0.4677, tail: 10.6196, advantages_returns: 3.1809, losses: 12.9352 + update: 314.1134 + clip: 163.3123 +[2023-09-27 11:51:23,623][08744] RolloutWorker_w0 profile tree view: +wait_for_trajectories: 0.3941, 
enqueue_policy_requests: 43.0700, env_step: 1047.6166, overhead: 30.3201, complete_rollouts: 1.0912 +save_policy_outputs: 54.7136 + split_output_tensors: 18.8964 +[2023-09-27 11:51:23,624][08744] RolloutWorker_w7 profile tree view: +wait_for_trajectories: 0.3970, enqueue_policy_requests: 42.8665, env_step: 1054.1617, overhead: 29.6892, complete_rollouts: 1.0749 +save_policy_outputs: 54.0906 + split_output_tensors: 18.5903 +[2023-09-27 11:51:23,624][08744] Loop Runner_EvtLoop terminating... +[2023-09-27 11:51:23,625][08744] Runner profile tree view: +main_loop: 3153.5713 +[2023-09-27 11:51:23,625][08744] Collected {0: 10006528, 1: 10006528}, FPS: 6346.2