[2023-09-30 05:18:11,337][123291] Saving configuration to ./train_atari/atari_freeway/config.json...
[2023-09-30 05:18:11,654][123291] Rollout worker 0 uses device cpu
[2023-09-30 05:18:11,655][123291] Rollout worker 1 uses device cpu
[2023-09-30 05:18:11,655][123291] Rollout worker 2 uses device cpu
[2023-09-30 05:18:11,656][123291] Rollout worker 3 uses device cpu
[2023-09-30 05:18:11,656][123291] Rollout worker 4 uses device cpu
[2023-09-30 05:18:11,657][123291] Rollout worker 5 uses device cpu
[2023-09-30 05:18:11,657][123291] Rollout worker 6 uses device cpu
[2023-09-30 05:18:11,658][123291] Rollout worker 7 uses device cpu
[2023-09-30 05:18:11,658][123291] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2023-09-30 05:18:11,706][123291] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-30 05:18:11,706][123291] InferenceWorker_p0-w0: min num requests: 1
[2023-09-30 05:18:11,709][123291] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-30 05:18:11,709][123291] InferenceWorker_p1-w0: min num requests: 1
[2023-09-30 05:18:11,733][123291] Starting all processes...
[2023-09-30 05:18:11,733][123291] Starting process learner_proc0
[2023-09-30 05:18:13,332][123291] Starting process learner_proc1
[2023-09-30 05:18:13,336][124965] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-30 05:18:13,336][124965] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-09-30 05:18:13,354][124965] Num visible devices: 1
[2023-09-30 05:18:13,371][124965] Starting seed is not provided
[2023-09-30 05:18:13,371][124965] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-30 05:18:13,371][124965] Initializing actor-critic model on device cuda:0
[2023-09-30 05:18:13,371][124965] RunningMeanStd input shape: (4, 84, 84)
[2023-09-30 05:18:13,372][124965] RunningMeanStd input shape: (1,)
[2023-09-30 05:18:13,383][124965] ConvEncoder: input_channels=4
[2023-09-30 05:18:13,567][124965] Conv encoder output size: 512
[2023-09-30 05:18:13,569][124965] Created Actor Critic model with architecture:
[2023-09-30 05:18:13,569][124965] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=3, bias=True)
  )
)
[2023-09-30 05:18:14,157][124965] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-30 05:18:14,158][124965] No checkpoints found
[2023-09-30 05:18:14,158][124965] Did not load from checkpoint, starting from scratch!
[2023-09-30 05:18:14,158][124965] Initialized policy 0 weights for model version 0
[2023-09-30 05:18:14,160][124965] LearnerWorker_p0 finished initialization!
[2023-09-30 05:18:14,160][124965] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-30 05:18:14,963][123291] Starting all processes...
[2023-09-30 05:18:14,967][125162] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-30 05:18:14,967][125162] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
[2023-09-30 05:18:14,970][123291] Starting process inference_proc0-0
[2023-09-30 05:18:14,971][123291] Starting process inference_proc1-0
[2023-09-30 05:18:14,971][123291] Starting process rollout_proc0
[2023-09-30 05:18:14,971][123291] Starting process rollout_proc1
[2023-09-30 05:18:14,986][125162] Num visible devices: 1
[2023-09-30 05:18:14,971][123291] Starting process rollout_proc2
[2023-09-30 05:18:14,972][123291] Starting process rollout_proc3
[2023-09-30 05:18:15,003][125162] Starting seed is not provided
[2023-09-30 05:18:15,003][125162] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-30 05:18:15,003][125162] Initializing actor-critic model on device cuda:0
[2023-09-30 05:18:15,004][125162] RunningMeanStd input shape: (4, 84, 84)
[2023-09-30 05:18:15,004][125162] RunningMeanStd input shape: (1,)
[2023-09-30 05:18:14,972][123291] Starting process rollout_proc4
[2023-09-30 05:18:14,975][123291] Starting process rollout_proc5
[2023-09-30 05:18:14,975][123291] Starting process rollout_proc6
[2023-09-30 05:18:14,976][123291] Starting process rollout_proc7
[2023-09-30 05:18:15,017][125162] ConvEncoder: input_channels=4
[2023-09-30 05:18:15,331][125162] Conv encoder output size: 512
[2023-09-30 05:18:15,333][125162] Created Actor Critic model with architecture:
[2023-09-30 05:18:15,333][125162] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=3, bias=True)
  )
)
[2023-09-30 05:18:15,941][125162] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-30 05:18:15,942][125162] No checkpoints found
[2023-09-30 05:18:15,942][125162] Did not load from checkpoint, starting from scratch!
[2023-09-30 05:18:15,942][125162] Initialized policy 1 weights for model version 0
[2023-09-30 05:18:15,944][125162] LearnerWorker_p1 finished initialization!
[2023-09-30 05:18:15,945][125162] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-30 05:18:16,893][125295] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-09-30 05:18:16,902][125298] Worker 4 uses CPU cores [16, 17, 18, 19]
[2023-09-30 05:18:16,914][125261] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-30 05:18:16,914][125261] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1
[2023-09-30 05:18:16,932][125299] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-09-30 05:18:16,932][125261] Num visible devices: 1
[2023-09-30 05:18:16,934][125294] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-09-30 05:18:16,942][125297] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-09-30 05:18:16,946][125300] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-09-30 05:18:16,949][125293] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-09-30 05:18:16,965][125260] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-30 05:18:16,966][125260] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-09-30 05:18:17,011][125260] Num visible devices: 1
[2023-09-30 05:18:17,051][125301] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-09-30 05:18:17,492][123291] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-09-30 05:18:17,561][125261] RunningMeanStd input shape: (4, 84, 84)
[2023-09-30 05:18:17,562][125261] RunningMeanStd input shape: (1,)
[2023-09-30 05:18:17,573][125261] ConvEncoder: input_channels=4
[2023-09-30 05:18:17,612][125260] RunningMeanStd input shape: (4, 84, 84)
[2023-09-30 05:18:17,612][125260] RunningMeanStd input shape: (1,)
[2023-09-30 05:18:17,624][125260] ConvEncoder: input_channels=4
[2023-09-30 05:18:17,673][125261] Conv encoder output size: 512
[2023-09-30 05:18:17,679][123291] Inference worker 1-0 is ready!
[2023-09-30 05:18:17,723][125260] Conv encoder output size: 512
[2023-09-30 05:18:17,729][123291] Inference worker 0-0 is ready!
[2023-09-30 05:18:17,730][123291] All inference workers are ready! Signal rollout workers to start!
[2023-09-30 05:18:18,221][125298] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,221][125297] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,222][125301] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,224][125293] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,228][125295] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,229][125299] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,303][125300] Decorrelating experience for 0 frames...
[2023-09-30 05:18:18,312][125294] Decorrelating experience for 0 frames...
[2023-09-30 05:18:22,491][123291] Fps is (10 sec: 1638.5, 60 sec: 1638.5, 300 sec: 1638.5). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:18:27,492][123291] Fps is (10 sec: 2457.6, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 24576. Throughput: 0: 359.2, 1: 360.9. Samples: 7201. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:18:31,693][123291] Heartbeat connected on Batcher_0
[2023-09-30 05:18:31,696][123291] Heartbeat connected on LearnerWorker_p0
[2023-09-30 05:18:31,699][123291] Heartbeat connected on Batcher_1
[2023-09-30 05:18:31,702][123291] Heartbeat connected on LearnerWorker_p1
[2023-09-30 05:18:31,710][123291] Heartbeat connected on InferenceWorker_p0-w0
[2023-09-30 05:18:31,711][123291] Heartbeat connected on InferenceWorker_p1-w0
[2023-09-30 05:18:31,715][123291] Heartbeat connected on RolloutWorker_w0
[2023-09-30 05:18:31,716][123291] Heartbeat connected on RolloutWorker_w1
[2023-09-30 05:18:31,718][123291] Heartbeat connected on RolloutWorker_w2
[2023-09-30 05:18:31,723][123291] Heartbeat connected on RolloutWorker_w3
[2023-09-30 05:18:31,726][123291] Heartbeat connected on RolloutWorker_w4
[2023-09-30 05:18:31,726][123291] Heartbeat connected on RolloutWorker_w5
[2023-09-30 05:18:31,729][123291] Heartbeat connected on RolloutWorker_w6
[2023-09-30 05:18:31,735][123291] Heartbeat connected on RolloutWorker_w7
[2023-09-30 05:18:32,491][123291] Fps is (10 sec: 4915.2, 60 sec: 3823.0, 300 sec: 3823.0). Total num frames: 57344. Throughput: 0: 388.9, 1: 387.2. Samples: 11641. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:18:35,954][125260] Updated weights for policy 0, policy_version 160 (0.0019)
[2023-09-30 05:18:35,954][125261] Updated weights for policy 1, policy_version 160 (0.0019)
[2023-09-30 05:18:37,491][123291] Fps is (10 sec: 6553.7, 60 sec: 4505.7, 300 sec: 4505.7). Total num frames: 90112. Throughput: 0: 512.0, 1: 510.4. Samples: 20448. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:18:37,492][123291] Avg episode reward: [(0, '0.000'), (1, '0.000')]
[2023-09-30 05:18:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 4587.5, 300 sec: 4587.5). Total num frames: 114688. Throughput: 0: 576.1, 1: 574.5. Samples: 28764. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:18:42,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.000')]
[2023-09-30 05:18:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 147456. Throughput: 0: 553.8, 1: 551.2. Samples: 33151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:18:47,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.000')]
[2023-09-30 05:18:50,053][125260] Updated weights for policy 0, policy_version 320 (0.0018)
[2023-09-30 05:18:50,054][125261] Updated weights for policy 1, policy_version 320 (0.0018)
[2023-09-30 05:18:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 172032. Throughput: 0: 603.0, 1: 601.2. Samples: 42146. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:18:52,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.000')]
[2023-09-30 05:18:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5120.0, 300 sec: 5120.0). Total num frames: 204800. Throughput: 0: 637.1, 1: 636.4. Samples: 50942. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:18:57,492][123291] Avg episode reward: [(0, '0.000'), (1, '0.250')]
[2023-09-30 05:18:57,493][124965] Saving new best policy, reward=0.000!
[2023-09-30 05:18:57,493][125162] Saving new best policy, reward=0.250!
[2023-09-30 05:19:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5097.3, 300 sec: 5097.3). Total num frames: 229376. Throughput: 0: 614.4, 1: 614.4. Samples: 55296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:02,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.250')]
[2023-09-30 05:19:04,139][125260] Updated weights for policy 0, policy_version 480 (0.0017)
[2023-09-30 05:19:04,139][125261] Updated weights for policy 1, policy_version 480 (0.0018)
[2023-09-30 05:19:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5242.9, 300 sec: 5242.9). Total num frames: 262144. Throughput: 0: 688.3, 1: 687.0. Samples: 63938. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:19:07,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.375')]
[2023-09-30 05:19:07,497][125162] Saving new best policy, reward=0.375!
[2023-09-30 05:19:12,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5362.0, 300 sec: 5362.0). Total num frames: 294912. Throughput: 0: 730.0, 1: 729.6. Samples: 72883. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:12,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.375')]
[2023-09-30 05:19:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5324.8, 300 sec: 5324.8). Total num frames: 319488. Throughput: 0: 732.5, 1: 729.7. Samples: 77442. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:17,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.375')]
[2023-09-30 05:19:18,098][125260] Updated weights for policy 0, policy_version 640 (0.0018)
[2023-09-30 05:19:18,098][125261] Updated weights for policy 1, policy_version 640 (0.0018)
[2023-09-30 05:19:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5419.3). Total num frames: 352256. Throughput: 0: 728.2, 1: 728.9. Samples: 86016. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:19:22,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.650')]
[2023-09-30 05:19:22,498][125162] Saving new best policy, reward=0.650!
[2023-09-30 05:19:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5383.3). Total num frames: 376832. Throughput: 0: 730.5, 1: 729.8. Samples: 94480. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:19:27,493][123291] Avg episode reward: [(0, '0.000'), (1, '0.650')]
[2023-09-30 05:19:32,212][125261] Updated weights for policy 1, policy_version 800 (0.0016)
[2023-09-30 05:19:32,212][125260] Updated weights for policy 0, policy_version 800 (0.0017)
[2023-09-30 05:19:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5461.3). Total num frames: 409600. Throughput: 0: 731.5, 1: 731.8. Samples: 99001. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:19:32,492][123291] Avg episode reward: [(0, '0.000'), (1, '1.542')]
[2023-09-30 05:19:32,493][125162] Saving new best policy, reward=1.542!
[2023-09-30 05:19:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5427.2). Total num frames: 434176. Throughput: 0: 725.8, 1: 726.6. Samples: 107504. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:37,493][123291] Avg episode reward: [(0, '0.000'), (1, '1.542')]
[2023-09-30 05:19:42,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5493.5). Total num frames: 466944. Throughput: 0: 728.6, 1: 725.4. Samples: 116369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:42,492][123291] Avg episode reward: [(0, '0.000'), (1, '3.786')]
[2023-09-30 05:19:42,493][125162] Saving new best policy, reward=3.786!
[2023-09-30 05:19:46,326][125260] Updated weights for policy 0, policy_version 960 (0.0015)
[2023-09-30 05:19:46,327][125261] Updated weights for policy 1, policy_version 960 (0.0020)
[2023-09-30 05:19:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5461.3). Total num frames: 491520. Throughput: 0: 728.2, 1: 728.2. Samples: 120832. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:47,492][123291] Avg episode reward: [(0, '0.000'), (1, '3.786')]
[2023-09-30 05:19:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5518.8). Total num frames: 524288. Throughput: 0: 727.1, 1: 727.8. Samples: 129412. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:19:52,493][123291] Avg episode reward: [(0, '0.000'), (1, '5.500')]
[2023-09-30 05:19:52,498][125162] Saving new best policy, reward=5.500!
[2023-09-30 05:19:57,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5570.6). Total num frames: 557056. Throughput: 0: 729.7, 1: 729.3. Samples: 138536. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:19:57,493][123291] Avg episode reward: [(0, '0.000'), (1, '5.500')]
[2023-09-30 05:20:00,107][125260] Updated weights for policy 0, policy_version 1120 (0.0017)
[2023-09-30 05:20:00,108][125261] Updated weights for policy 1, policy_version 1120 (0.0017)
[2023-09-30 05:20:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5539.4). Total num frames: 581632. Throughput: 0: 727.6, 1: 730.8. Samples: 143070. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:20:02,492][123291] Avg episode reward: [(0, '0.000'), (1, '5.500')]
[2023-09-30 05:20:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5585.5). Total num frames: 614400. Throughput: 0: 730.5, 1: 729.4. Samples: 151710. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:20:07,493][123291] Avg episode reward: [(0, '0.000'), (1, '6.944')]
[2023-09-30 05:20:07,498][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000001200_307200.pth...
[2023-09-30 05:20:07,498][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000001200_307200.pth...
[2023-09-30 05:20:07,533][125162] Saving new best policy, reward=6.944!
[2023-09-30 05:20:12,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.7, 300 sec: 5591.9). Total num frames: 643072. Throughput: 0: 737.1, 1: 737.2. Samples: 160824. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:20:12,493][123291] Avg episode reward: [(0, '0.000'), (1, '6.944')]
[2023-09-30 05:20:14,050][125261] Updated weights for policy 1, policy_version 1280 (0.0018)
[2023-09-30 05:20:14,051][125260] Updated weights for policy 0, policy_version 1280 (0.0017)
[2023-09-30 05:20:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5597.9). Total num frames: 671744. Throughput: 0: 732.8, 1: 733.7. Samples: 164994. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:20:17,493][123291] Avg episode reward: [(0, '0.000'), (1, '8.275')]
[2023-09-30 05:20:17,494][125162] Saving new best policy, reward=8.275!
[2023-09-30 05:20:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5802.7, 300 sec: 5603.3). Total num frames: 700416. Throughput: 0: 736.8, 1: 737.0. Samples: 173823. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:20:22,493][123291] Avg episode reward: [(0, '0.000'), (1, '8.275')]
[2023-09-30 05:20:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5608.4). Total num frames: 729088. Throughput: 0: 730.3, 1: 734.2. Samples: 182272. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:20:27,493][123291] Avg episode reward: [(0, '0.000'), (1, '9.318')]
[2023-09-30 05:20:27,494][125162] Saving new best policy, reward=9.318!
[2023-09-30 05:20:28,359][125260] Updated weights for policy 0, policy_version 1440 (0.0017)
[2023-09-30 05:20:28,359][125261] Updated weights for policy 1, policy_version 1440 (0.0017)
[2023-09-30 05:20:32,492][123291] Fps is (10 sec: 5324.9, 60 sec: 5734.4, 300 sec: 5582.7). Total num frames: 753664. Throughput: 0: 728.2, 1: 728.2. Samples: 186368. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:20:32,492][123291] Avg episode reward: [(0, '0.000'), (1, '9.318')]
[2023-09-30 05:20:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5617.4). Total num frames: 786432. Throughput: 0: 730.2, 1: 730.0. Samples: 195118. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:20:37,492][123291] Avg episode reward: [(0, '0.000'), (1, '10.146')]
[2023-09-30 05:20:37,496][125162] Saving new best policy, reward=10.146!
[2023-09-30 05:20:42,425][125261] Updated weights for policy 1, policy_version 1600 (0.0017)
[2023-09-30 05:20:42,425][125260] Updated weights for policy 0, policy_version 1600 (0.0017)
[2023-09-30 05:20:42,491][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5649.7). Total num frames: 819200. Throughput: 0: 729.0, 1: 727.8. Samples: 204092. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:20:42,492][123291] Avg episode reward: [(0, '0.000'), (1, '10.146')]
[2023-09-30 05:20:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5625.2). Total num frames: 843776. Throughput: 0: 727.8, 1: 725.5. Samples: 208468. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:20:47,493][123291] Avg episode reward: [(0, '0.000'), (1, '10.146')]
[2023-09-30 05:20:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5655.1). Total num frames: 876544. Throughput: 0: 725.9, 1: 727.0. Samples: 217089. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:20:52,493][123291] Avg episode reward: [(0, '0.000'), (1, '10.788')]
[2023-09-30 05:20:52,497][125162] Saving new best policy, reward=10.788!
[2023-09-30 05:20:56,245][125261] Updated weights for policy 1, policy_version 1760 (0.0015)
[2023-09-30 05:20:56,246][125260] Updated weights for policy 0, policy_version 1760 (0.0017)
[2023-09-30 05:20:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5632.0). Total num frames: 901120. Throughput: 0: 724.4, 1: 726.5. Samples: 226111. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:20:57,493][123291] Avg episode reward: [(0, '0.000'), (1, '10.788')]
[2023-09-30 05:21:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5659.9). Total num frames: 933888. Throughput: 0: 727.6, 1: 727.9. Samples: 230494. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:21:02,493][123291] Avg episode reward: [(0, '0.000'), (1, '11.411')]
[2023-09-30 05:21:02,494][125162] Saving new best policy, reward=11.411!
[2023-09-30 05:21:07,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5871.0, 300 sec: 5686.2). Total num frames: 966656. Throughput: 0: 728.8, 1: 729.5. Samples: 239446. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:21:07,492][123291] Avg episode reward: [(0, '0.000'), (1, '11.411')]
[2023-09-30 05:21:10,265][125260] Updated weights for policy 0, policy_version 1920 (0.0016)
[2023-09-30 05:21:10,266][125261] Updated weights for policy 1, policy_version 1920 (0.0017)
[2023-09-30 05:21:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5802.7, 300 sec: 5664.2). Total num frames: 991232. Throughput: 0: 728.3, 1: 728.2. Samples: 247816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:21:12,492][123291] Avg episode reward: [(0, '0.000'), (1, '11.917')]
[2023-09-30 05:21:12,493][125162] Saving new best policy, reward=11.917!
[2023-09-30 05:21:17,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5688.9). Total num frames: 1024000. Throughput: 0: 733.6, 1: 732.4. Samples: 252340. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:21:17,493][123291] Avg episode reward: [(0, '0.000'), (1, '11.917')]
[2023-09-30 05:21:22,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5668.0). Total num frames: 1048576. Throughput: 0: 729.2, 1: 729.4. Samples: 260756. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:21:22,492][123291] Avg episode reward: [(0, '0.000'), (1, '12.438')]
[2023-09-30 05:21:22,495][125162] Saving new best policy, reward=12.438!
[2023-09-30 05:21:24,574][125261] Updated weights for policy 1, policy_version 2080 (0.0016)
[2023-09-30 05:21:24,575][125260] Updated weights for policy 0, policy_version 2080 (0.0017)
[2023-09-30 05:21:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5691.3). Total num frames: 1081344. Throughput: 0: 726.4, 1: 727.4. Samples: 269513. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:21:27,493][123291] Avg episode reward: [(0, '0.000'), (1, '12.438')]
[2023-09-30 05:21:32,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5671.4). Total num frames: 1105920. Throughput: 0: 724.8, 1: 727.9. Samples: 273840. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:21:32,493][123291] Avg episode reward: [(0, '0.000'), (1, '12.438')]
[2023-09-30 05:21:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5693.4). Total num frames: 1138688. Throughput: 0: 728.2, 1: 728.2. Samples: 282625. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:21:37,492][123291] Avg episode reward: [(0, '0.000'), (1, '13.044')]
[2023-09-30 05:21:37,498][125162] Saving new best policy, reward=13.044!
[2023-09-30 05:21:38,634][125260] Updated weights for policy 0, policy_version 2240 (0.0017)
[2023-09-30 05:21:38,634][125261] Updated weights for policy 1, policy_version 2240 (0.0017)
[2023-09-30 05:21:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5674.5). Total num frames: 1163264. Throughput: 0: 725.1, 1: 723.0. Samples: 291279. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:21:42,492][123291] Avg episode reward: [(0, '0.000'), (1, '13.044')]
[2023-09-30 05:21:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5695.4). Total num frames: 1196032. Throughput: 0: 725.1, 1: 724.4. Samples: 295725. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:21:47,493][123291] Avg episode reward: [(0, '0.000'), (1, '13.556')]
[2023-09-30 05:21:47,494][125162] Saving new best policy, reward=13.556!
[2023-09-30 05:21:52,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5677.2). Total num frames: 1220608. Throughput: 0: 722.0, 1: 719.8. Samples: 304325. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:21:52,493][123291] Avg episode reward: [(0, '0.000'), (1, '13.556')]
[2023-09-30 05:21:52,995][125261] Updated weights for policy 1, policy_version 2400 (0.0017)
[2023-09-30 05:21:52,995][125260] Updated weights for policy 0, policy_version 2400 (0.0019)
[2023-09-30 05:21:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5697.2). Total num frames: 1253376. Throughput: 0: 721.5, 1: 720.3. Samples: 312696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:21:57,493][123291] Avg episode reward: [(0, '0.013'), (1, '14.079')]
[2023-09-30 05:21:57,494][124965] Saving new best policy, reward=0.013!
[2023-09-30 05:21:57,494][125162] Saving new best policy, reward=14.079!
[2023-09-30 05:22:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5679.8). Total num frames: 1277952. Throughput: 0: 721.1, 1: 720.8. Samples: 317226. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:22:02,493][123291] Avg episode reward: [(0, '0.013'), (1, '14.079')]
[2023-09-30 05:22:07,085][125260] Updated weights for policy 0, policy_version 2560 (0.0018)
[2023-09-30 05:22:07,085][125261] Updated weights for policy 1, policy_version 2560 (0.0017)
[2023-09-30 05:22:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5698.8). Total num frames: 1310720. Throughput: 0: 723.0, 1: 722.7. Samples: 325814. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:22:07,492][123291] Avg episode reward: [(0, '0.025'), (1, '14.625')]
[2023-09-30 05:22:07,495][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000002560_655360.pth...
[2023-09-30 05:22:07,495][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000002560_655360.pth...
[2023-09-30 05:22:07,530][125162] Saving new best policy, reward=14.625!
[2023-09-30 05:22:07,532][124965] Saving new best policy, reward=0.025!
[2023-09-30 05:22:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5682.1). Total num frames: 1335296. Throughput: 0: 720.2, 1: 719.4. Samples: 334293. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:22:12,493][123291] Avg episode reward: [(0, '0.025'), (1, '14.625')]
[2023-09-30 05:22:17,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5700.3). Total num frames: 1368064. Throughput: 0: 719.7, 1: 717.1. Samples: 338496. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:22:17,493][123291] Avg episode reward: [(0, '0.025'), (1, '14.625')]
[2023-09-30 05:22:21,297][125260] Updated weights for policy 0, policy_version 2720 (0.0015)
[2023-09-30 05:22:21,297][125261] Updated weights for policy 1, policy_version 2720 (0.0017)
[2023-09-30 05:22:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5684.2). Total num frames: 1392640. Throughput: 0: 721.5, 1: 721.0. Samples: 347535. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:22:22,493][123291] Avg episode reward: [(0, '0.024'), (1, '15.036')]
[2023-09-30 05:22:22,645][125162] Saving new best policy, reward=15.036!
[2023-09-30 05:22:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5701.6). Total num frames: 1425408. Throughput: 0: 722.3, 1: 723.8. Samples: 356352. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:22:27,492][123291] Avg episode reward: [(0, '0.024'), (1, '15.036')]
[2023-09-30 05:22:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5686.2). Total num frames: 1449984. Throughput: 0: 718.6, 1: 719.7. Samples: 360448. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:22:32,492][123291] Avg episode reward: [(0, '0.034'), (1, '15.602')]
[2023-09-30 05:22:32,493][125162] Saving new best policy, reward=15.602!
[2023-09-30 05:22:32,597][124965] Saving new best policy, reward=0.034!
[2023-09-30 05:22:35,427][125261] Updated weights for policy 1, policy_version 2880 (0.0018)
[2023-09-30 05:22:35,427][125260] Updated weights for policy 0, policy_version 2880 (0.0017)
[2023-09-30 05:22:37,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5702.9). Total num frames: 1482752. Throughput: 0: 720.0, 1: 721.0. Samples: 369170. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:22:37,493][123291] Avg episode reward: [(0, '0.034'), (1, '15.602')]
[2023-09-30 05:22:42,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.6, 300 sec: 5703.5). Total num frames: 1511424. Throughput: 0: 724.8, 1: 724.2. Samples: 377900. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:22:42,493][123291] Avg episode reward: [(0, '0.054'), (1, '16.033')]
[2023-09-30 05:22:42,494][124965] Saving new best policy, reward=0.054!
[2023-09-30 05:22:42,514][125162] Saving new best policy, reward=16.033!
[2023-09-30 05:22:47,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5734.4, 300 sec: 5704.1). Total num frames: 1540096. Throughput: 0: 725.9, 1: 726.0. Samples: 382562. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:22:47,492][123291] Avg episode reward: [(0, '0.054'), (1, '16.033')]
[2023-09-30 05:22:49,296][125260] Updated weights for policy 0, policy_version 3040 (0.0016)
[2023-09-30 05:22:49,297][125261] Updated weights for policy 1, policy_version 3040 (0.0017)
[2023-09-30 05:22:52,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5871.0, 300 sec: 5719.5). Total num frames: 1572864. Throughput: 0: 727.3, 1: 727.1. Samples: 391262. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:22:52,492][123291] Avg episode reward: [(0, '0.115'), (1, '16.385')]
[2023-09-30 05:22:52,495][124965] Saving new best policy, reward=0.115!
[2023-09-30 05:22:52,495][125162] Saving new best policy, reward=16.385!
[2023-09-30 05:22:57,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5705.1). Total num frames: 1597440. Throughput: 0: 731.8, 1: 733.0. Samples: 400211. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:22:57,492][123291] Avg episode reward: [(0, '0.115'), (1, '16.385')]
[2023-09-30 05:23:02,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5720.0). Total num frames: 1630208. Throughput: 0: 736.1, 1: 736.4. Samples: 404757. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:23:02,492][123291] Avg episode reward: [(0, '0.115'), (1, '16.505')]
[2023-09-30 05:23:02,493][125162] Saving new best policy, reward=16.505!
[2023-09-30 05:23:03,142][125260] Updated weights for policy 0, policy_version 3200 (0.0018)
[2023-09-30 05:23:03,142][125261] Updated weights for policy 1, policy_version 3200 (0.0016)
[2023-09-30 05:23:07,491][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5734.4). Total num frames: 1662976. Throughput: 0: 734.9, 1: 735.4. Samples: 413697. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:23:07,492][123291] Avg episode reward: [(0, '0.210'), (1, '16.790')]
[2023-09-30 05:23:07,501][124965] Saving new best policy, reward=0.210!
[2023-09-30 05:23:07,502][125162] Saving new best policy, reward=16.790!
[2023-09-30 05:23:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5720.5). Total num frames: 1687552. Throughput: 0: 734.7, 1: 733.1. Samples: 422403. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:23:12,493][123291] Avg episode reward: [(0, '0.210'), (1, '16.790')]
[2023-09-30 05:23:16,830][125261] Updated weights for policy 1, policy_version 3360 (0.0017)
[2023-09-30 05:23:16,830][125260] Updated weights for policy 0, policy_version 3360 (0.0016)
[2023-09-30 05:23:17,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 1720320. Throughput: 0: 739.6, 1: 739.0. Samples: 426988. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:23:17,492][123291] Avg episode reward: [(0, '0.480'), (1, '17.840')]
[2023-09-30 05:23:17,493][124965] Saving new best policy, reward=0.480!
[2023-09-30 05:23:17,493][125162] Saving new best policy, reward=17.840!
[2023-09-30 05:23:22,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 1753088. Throughput: 0: 744.4, 1: 745.7. Samples: 436224. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:23:22,493][123291] Avg episode reward: [(0, '0.480'), (1, '17.840')]
[2023-09-30 05:23:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 1777664. Throughput: 0: 746.2, 1: 746.4. Samples: 445067. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:23:27,493][123291] Avg episode reward: [(0, '0.820'), (1, '18.880')]
[2023-09-30 05:23:27,494][124965] Saving new best policy, reward=0.820!
[2023-09-30 05:23:27,494][125162] Saving new best policy, reward=18.880!
[2023-09-30 05:23:30,510][125260] Updated weights for policy 0, policy_version 3520 (0.0017)
[2023-09-30 05:23:30,510][125261] Updated weights for policy 1, policy_version 3520 (0.0015)
[2023-09-30 05:23:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5831.6). Total num frames: 1810432. Throughput: 0: 744.7, 1: 744.1. Samples: 449559. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:23:32,493][123291] Avg episode reward: [(0, '0.820'), (1, '18.880')]
[2023-09-30 05:23:37,491][123291] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 1843200. Throughput: 0: 748.6, 1: 750.1. Samples: 458704. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:23:37,492][123291] Avg episode reward: [(0, '1.380'), (1, '19.920')]
[2023-09-30 05:23:37,499][124965] Saving new best policy, reward=1.380!
[2023-09-30 05:23:37,499][125162] Saving new best policy, reward=19.920!
[2023-09-30 05:23:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5939.2, 300 sec: 5831.6). Total num frames: 1867776. Throughput: 0: 745.8, 1: 745.3. Samples: 467307. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:23:42,493][123291] Avg episode reward: [(0, '1.380'), (1, '19.920')]
[2023-09-30 05:23:44,395][125260] Updated weights for policy 0, policy_version 3680 (0.0018)
[2023-09-30 05:23:44,395][125261] Updated weights for policy 1, policy_version 3680 (0.0017)
[2023-09-30 05:23:47,491][123291] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 1900544. Throughput: 0: 742.6, 1: 742.9. Samples: 471605. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:23:47,492][123291] Avg episode reward: [(0, '1.910'), (1, '20.920')]
[2023-09-30 05:23:47,493][124965] Saving new best policy, reward=1.910!
[2023-09-30 05:23:47,493][125162] Saving new best policy, reward=20.920!
[2023-09-30 05:23:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 1925120. Throughput: 0: 741.3, 1: 740.4. Samples: 480374. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:23:52,493][123291] Avg episode reward: [(0, '1.910'), (1, '20.920')]
[2023-09-30 05:23:57,491][123291] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 1957888. Throughput: 0: 744.4, 1: 745.1. Samples: 489430. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:23:57,492][123291] Avg episode reward: [(0, '1.910'), (1, '20.920')]
[2023-09-30 05:23:58,366][125261] Updated weights for policy 1, policy_version 3840 (0.0017)
[2023-09-30 05:23:58,366][125260] Updated weights for policy 0, policy_version 3840 (0.0017)
[2023-09-30 05:24:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 1982464. Throughput: 0: 739.5, 1: 740.1. Samples: 493568. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:24:02,493][123291] Avg episode reward: [(0, '2.590'), (1, '21.830')]
[2023-09-30 05:24:02,495][125162] Saving new best policy, reward=21.830!
[2023-09-30 05:24:02,687][124965] Saving new best policy, reward=2.590!
[2023-09-30 05:24:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2015232. Throughput: 0: 733.0, 1: 731.6. Samples: 502134. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:24:07,492][123291] Avg episode reward: [(0, '2.590'), (1, '21.830')]
[2023-09-30 05:24:07,501][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000003936_1007616.pth...
[2023-09-30 05:24:07,501][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000003936_1007616.pth...
[2023-09-30 05:24:07,536][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000001200_307200.pth
[2023-09-30 05:24:07,539][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000001200_307200.pth
[2023-09-30 05:24:12,454][125260] Updated weights for policy 0, policy_version 4000 (0.0017)
[2023-09-30 05:24:12,454][125261] Updated weights for policy 1, policy_version 4000 (0.0017)
[2023-09-30 05:24:12,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 2048000. Throughput: 0: 732.5, 1: 734.1. Samples: 511061. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:24:12,493][123291] Avg episode reward: [(0, '3.300'), (1, '22.560')]
[2023-09-30 05:24:12,494][124965] Saving new best policy, reward=3.300!
[2023-09-30 05:24:12,494][125162] Saving new best policy, reward=22.560!
[2023-09-30 05:24:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2072576. Throughput: 0: 734.8, 1: 737.2. Samples: 515800. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:24:17,493][123291] Avg episode reward: [(0, '3.300'), (1, '22.560')]
[2023-09-30 05:24:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 2105344. Throughput: 0: 728.9, 1: 728.5. Samples: 524288. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:24:22,492][123291] Avg episode reward: [(0, '4.080'), (1, '22.920')]
[2023-09-30 05:24:22,503][124965] Saving new best policy, reward=4.080!
[2023-09-30 05:24:22,503][125162] Saving new best policy, reward=22.920!
[2023-09-30 05:24:26,202][125260] Updated weights for policy 0, policy_version 4160 (0.0015)
[2023-09-30 05:24:26,203][125261] Updated weights for policy 1, policy_version 4160 (0.0018)
[2023-09-30 05:24:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2129920. Throughput: 0: 732.9, 1: 734.0. Samples: 533316. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:24:27,493][123291] Avg episode reward: [(0, '4.080'), (1, '22.920')]
[2023-09-30 05:24:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 2162688. Throughput: 0: 733.7, 1: 734.4. Samples: 537668. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:24:32,492][123291] Avg episode reward: [(0, '4.830'), (1, '23.300')]
[2023-09-30 05:24:32,493][124965] Saving new best policy, reward=4.830!
[2023-09-30 05:24:32,494][125162] Saving new best policy, reward=23.300!
[2023-09-30 05:24:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2187264. Throughput: 0: 733.6, 1: 732.4. Samples: 546344. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:24:37,493][123291] Avg episode reward: [(0, '4.830'), (1, '23.300')]
[2023-09-30 05:24:40,505][125260] Updated weights for policy 0, policy_version 4320 (0.0015)
[2023-09-30 05:24:40,506][125261] Updated weights for policy 1, policy_version 4320 (0.0014)
[2023-09-30 05:24:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 2220032. Throughput: 0: 728.2, 1: 729.1. Samples: 555007. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:24:42,493][123291] Avg episode reward: [(0, '4.830'), (1, '23.360')]
[2023-09-30 05:24:42,494][125162] Saving new best policy, reward=23.360!
[2023-09-30 05:24:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2244608. Throughput: 0: 728.2, 1: 728.2. Samples: 559104. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:24:47,492][123291] Avg episode reward: [(0, '5.630'), (1, '23.630')]
[2023-09-30 05:24:47,492][124965] Saving new best policy, reward=5.630!
[2023-09-30 05:24:47,493][125162] Saving new best policy, reward=23.630!
[2023-09-30 05:24:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 2277376. Throughput: 0: 728.4, 1: 728.8. Samples: 567712. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:24:52,492][123291] Avg episode reward: [(0, '5.630'), (1, '23.630')]
[2023-09-30 05:24:54,853][125260] Updated weights for policy 0, policy_version 4480 (0.0018)
[2023-09-30 05:24:54,854][125261] Updated weights for policy 1, policy_version 4480 (0.0015)
[2023-09-30 05:24:57,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2301952. Throughput: 0: 723.5, 1: 723.2. Samples: 576160. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:24:57,492][123291] Avg episode reward: [(0, '6.370'), (1, '23.960')]
[2023-09-30 05:24:57,493][124965] Saving new best policy, reward=6.370!
[2023-09-30 05:24:57,493][125162] Saving new best policy, reward=23.960!
[2023-09-30 05:25:02,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 2334720. Throughput: 0: 721.3, 1: 719.8. Samples: 580649. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:25:02,492][123291] Avg episode reward: [(0, '6.370'), (1, '23.960')]
[2023-09-30 05:25:07,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 2367488. Throughput: 0: 725.3, 1: 725.0. Samples: 589549. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:25:07,493][123291] Avg episode reward: [(0, '7.260'), (1, '24.280')]
[2023-09-30 05:25:07,504][124965] Saving new best policy, reward=7.260!
[2023-09-30 05:25:07,504][125162] Saving new best policy, reward=24.280!
[2023-09-30 05:25:08,837][125260] Updated weights for policy 0, policy_version 4640 (0.0016)
[2023-09-30 05:25:08,837][125261] Updated weights for policy 1, policy_version 4640 (0.0017)
[2023-09-30 05:25:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2392064. Throughput: 0: 720.3, 1: 719.0. Samples: 598085. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:25:12,493][123291] Avg episode reward: [(0, '7.260'), (1, '24.280')]
[2023-09-30 05:25:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 2424832. Throughput: 0: 721.1, 1: 720.8. Samples: 602552. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:25:17,493][123291] Avg episode reward: [(0, '8.110'), (1, '24.660')]
[2023-09-30 05:25:17,494][124965] Saving new best policy, reward=8.110!
[2023-09-30 05:25:17,494][125162] Saving new best policy, reward=24.660!
[2023-09-30 05:25:22,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2449408. Throughput: 0: 720.6, 1: 721.6. Samples: 611244. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:25:22,492][123291] Avg episode reward: [(0, '8.110'), (1, '24.660')]
[2023-09-30 05:25:23,071][125260] Updated weights for policy 0, policy_version 4800 (0.0019)
[2023-09-30 05:25:23,071][125261] Updated weights for policy 1, policy_version 4800 (0.0016)
[2023-09-30 05:25:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 2482176. Throughput: 0: 724.2, 1: 721.8. Samples: 620080. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:25:27,493][123291] Avg episode reward: [(0, '8.110'), (1, '24.660')]
[2023-09-30 05:25:32,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2506752. Throughput: 0: 726.8, 1: 724.7. Samples: 624424. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:25:32,493][123291] Avg episode reward: [(0, '8.890'), (1, '25.020')]
[2023-09-30 05:25:32,494][124965] Saving new best policy, reward=8.890!
[2023-09-30 05:25:32,494][125162] Saving new best policy, reward=25.020!
[2023-09-30 05:25:37,244][125260] Updated weights for policy 0, policy_version 4960 (0.0017)
[2023-09-30 05:25:37,244][125261] Updated weights for policy 1, policy_version 4960 (0.0016)
[2023-09-30 05:25:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2539520. Throughput: 0: 723.0, 1: 724.1. Samples: 632832. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:25:37,492][123291] Avg episode reward: [(0, '8.890'), (1, '25.020')]
[2023-09-30 05:25:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2564096. Throughput: 0: 725.2, 1: 725.0. Samples: 641418. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:25:42,493][123291] Avg episode reward: [(0, '9.750'), (1, '25.350')]
[2023-09-30 05:25:42,494][125162] Saving new best policy, reward=25.350!
[2023-09-30 05:25:42,494][124965] Saving new best policy, reward=9.750!
[2023-09-30 05:25:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2596864. Throughput: 0: 722.5, 1: 722.7. Samples: 645682. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:25:47,493][123291] Avg episode reward: [(0, '9.750'), (1, '25.350')]
[2023-09-30 05:25:51,510][125260] Updated weights for policy 0, policy_version 5120 (0.0017)
[2023-09-30 05:25:51,511][125261] Updated weights for policy 1, policy_version 5120 (0.0018)
[2023-09-30 05:25:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2621440. Throughput: 0: 721.0, 1: 721.3. Samples: 654453. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:25:52,493][123291] Avg episode reward: [(0, '10.620'), (1, '25.790')]
[2023-09-30 05:25:52,505][124965] Saving new best policy, reward=10.620!
[2023-09-30 05:25:52,505][125162] Saving new best policy, reward=25.790!
[2023-09-30 05:25:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2654208. Throughput: 0: 724.4, 1: 724.5. Samples: 663288. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:25:57,493][123291] Avg episode reward: [(0, '10.620'), (1, '25.790')]
[2023-09-30 05:26:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 2678784. Throughput: 0: 722.9, 1: 723.5. Samples: 667639. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:26:02,493][123291] Avg episode reward: [(0, '11.560'), (1, '26.100')]
[2023-09-30 05:26:02,494][124965] Saving new best policy, reward=11.560!
[2023-09-30 05:26:02,494][125162] Saving new best policy, reward=26.100!
[2023-09-30 05:26:05,553][125260] Updated weights for policy 0, policy_version 5280 (0.0018)
[2023-09-30 05:26:05,553][125261] Updated weights for policy 1, policy_version 5280 (0.0018)
[2023-09-30 05:26:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2711552. Throughput: 0: 720.6, 1: 720.7. Samples: 676103. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:26:07,493][123291] Avg episode reward: [(0, '11.560'), (1, '26.100')]
[2023-09-30 05:26:07,504][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000005296_1355776.pth...
[2023-09-30 05:26:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000005296_1355776.pth...
[2023-09-30 05:26:07,538][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000002560_655360.pth
[2023-09-30 05:26:07,542][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000002560_655360.pth
[2023-09-30 05:26:12,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5817.7). Total num frames: 2740224. Throughput: 0: 721.7, 1: 722.2. Samples: 685057. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:26:12,493][123291] Avg episode reward: [(0, '11.560'), (1, '26.100')]
[2023-09-30 05:26:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2768896. Throughput: 0: 723.2, 1: 724.5. Samples: 689570. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:26:17,493][123291] Avg episode reward: [(0, '12.440'), (1, '26.340')]
[2023-09-30 05:26:17,494][124965] Saving new best policy, reward=12.440!
[2023-09-30 05:26:17,494][125162] Saving new best policy, reward=26.340!
[2023-09-30 05:26:19,320][125260] Updated weights for policy 0, policy_version 5440 (0.0016)
[2023-09-30 05:26:19,320][125261] Updated weights for policy 1, policy_version 5440 (0.0016)
[2023-09-30 05:26:22,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2801664. Throughput: 0: 728.3, 1: 728.2. Samples: 698372. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:26:22,493][123291] Avg episode reward: [(0, '12.440'), (1, '26.340')]
[2023-09-30 05:26:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2826240. Throughput: 0: 728.4, 1: 728.1. Samples: 706959. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:26:27,493][123291] Avg episode reward: [(0, '13.380'), (1, '26.600')]
[2023-09-30 05:26:27,494][124965] Saving new best policy, reward=13.380!
[2023-09-30 05:26:27,494][125162] Saving new best policy, reward=26.600!
[2023-09-30 05:26:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2859008. Throughput: 0: 732.8, 1: 732.7. Samples: 711628. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:26:32,493][123291] Avg episode reward: [(0, '13.380'), (1, '26.600')]
[2023-09-30 05:26:33,457][125261] Updated weights for policy 1, policy_version 5600 (0.0017)
[2023-09-30 05:26:33,457][125260] Updated weights for policy 0, policy_version 5600 (0.0015)
[2023-09-30 05:26:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2883584. Throughput: 0: 732.1, 1: 729.7. Samples: 720235. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:26:37,493][123291] Avg episode reward: [(0, '14.310'), (1, '26.790')]
[2023-09-30 05:26:37,504][124965] Saving new best policy, reward=14.310!
[2023-09-30 05:26:37,504][125162] Saving new best policy, reward=26.790!
[2023-09-30 05:26:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2916352. Throughput: 0: 728.9, 1: 728.0. Samples: 728851. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:26:42,493][123291] Avg episode reward: [(0, '14.310'), (1, '26.790')]
[2023-09-30 05:26:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 2940928. Throughput: 0: 728.2, 1: 728.4. Samples: 733184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:26:47,493][123291] Avg episode reward: [(0, '15.200'), (1, '26.950')]
[2023-09-30 05:26:47,674][124965] Saving new best policy, reward=15.200!
[2023-09-30 05:26:47,687][125162] Saving new best policy, reward=26.950!
[2023-09-30 05:26:47,689][125260] Updated weights for policy 0, policy_version 5760 (0.0017)
[2023-09-30 05:26:47,690][125261] Updated weights for policy 1, policy_version 5760 (0.0018)
[2023-09-30 05:26:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 2973696. Throughput: 0: 727.6, 1: 727.4. Samples: 741582. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:26:52,492][123291] Avg episode reward: [(0, '15.200'), (1, '26.950')]
[2023-09-30 05:26:57,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 3006464. Throughput: 0: 729.7, 1: 730.0. Samples: 750744. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:26:57,493][123291] Avg episode reward: [(0, '15.200'), (1, '26.950')]
[2023-09-30 05:27:01,688][125261] Updated weights for policy 1, policy_version 5920 (0.0018)
[2023-09-30 05:27:01,688][125260] Updated weights for policy 0, policy_version 5920 (0.0019)
[2023-09-30 05:27:02,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3031040. Throughput: 0: 729.0, 1: 727.1. Samples: 755094. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:27:02,492][123291] Avg episode reward: [(0, '16.150'), (1, '27.160')]
[2023-09-30 05:27:02,493][124965] Saving new best policy, reward=16.150!
[2023-09-30 05:27:02,493][125162] Saving new best policy, reward=27.160!
[2023-09-30 05:27:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 3063808. Throughput: 0: 728.1, 1: 728.2. Samples: 763904. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:27:07,492][123291] Avg episode reward: [(0, '16.150'), (1, '27.160')]
[2023-09-30 05:27:12,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5831.6). Total num frames: 3088384. Throughput: 0: 728.4, 1: 728.1. Samples: 772502. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:27:12,492][123291] Avg episode reward: [(0, '17.070'), (1, '27.150')]
[2023-09-30 05:27:12,493][124965] Saving new best policy, reward=17.070!
[2023-09-30 05:27:15,555][125260] Updated weights for policy 0, policy_version 6080 (0.0017)
[2023-09-30 05:27:15,556][125261] Updated weights for policy 1, policy_version 6080 (0.0015)
[2023-09-30 05:27:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 3121152. Throughput: 0: 727.8, 1: 726.6. Samples: 777074. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:17,493][123291] Avg episode reward: [(0, '17.070'), (1, '27.150')]
[2023-09-30 05:27:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 3145728. Throughput: 0: 732.6, 1: 732.2. Samples: 786154. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:22,493][123291] Avg episode reward: [(0, '17.950'), (1, '27.190')]
[2023-09-30 05:27:22,525][124965] Saving new best policy, reward=17.950!
[2023-09-30 05:27:22,603][125162] Saving new best policy, reward=27.190!
[2023-09-30 05:27:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 3178496. Throughput: 0: 731.8, 1: 733.1. Samples: 794771. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:27,492][123291] Avg episode reward: [(0, '17.950'), (1, '27.190')]
[2023-09-30 05:27:29,348][125260] Updated weights for policy 0, policy_version 6240 (0.0017)
[2023-09-30 05:27:29,348][125261] Updated weights for policy 1, policy_version 6240 (0.0017)
[2023-09-30 05:27:32,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 3211264. Throughput: 0: 736.0, 1: 734.3. Samples: 799347. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:32,492][123291] Avg episode reward: [(0, '18.740'), (1, '27.300')]
[2023-09-30 05:27:32,493][124965] Saving new best policy, reward=18.740!
[2023-09-30 05:27:32,493][125162] Saving new best policy, reward=27.300!
[2023-09-30 05:27:37,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 3235840. Throughput: 0: 738.2, 1: 738.2. Samples: 808021. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:27:37,493][123291] Avg episode reward: [(0, '18.740'), (1, '27.300')]
[2023-09-30 05:27:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 3268608. Throughput: 0: 731.8, 1: 731.9. Samples: 816609. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:42,493][123291] Avg episode reward: [(0, '18.740'), (1, '27.290')]
[2023-09-30 05:27:43,664][125261] Updated weights for policy 1, policy_version 6400 (0.0017)
[2023-09-30 05:27:43,664][125260] Updated weights for policy 0, policy_version 6400 (0.0017)
[2023-09-30 05:27:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3293184. Throughput: 0: 732.8, 1: 733.5. Samples: 821081. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:47,492][123291] Avg episode reward: [(0, '19.540'), (1, '27.300')]
[2023-09-30 05:27:47,680][124965] Saving new best policy, reward=19.540!
[2023-09-30 05:27:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 3325952. Throughput: 0: 732.5, 1: 731.5. Samples: 829784. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:52,493][123291] Avg episode reward: [(0, '19.540'), (1, '27.300')]
[2023-09-30 05:27:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 3350528. Throughput: 0: 736.0, 1: 735.7. Samples: 838729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:27:57,493][123291] Avg episode reward: [(0, '20.200'), (1, '27.350')]
[2023-09-30 05:27:57,495][124965] Saving new best policy, reward=20.200!
[2023-09-30 05:27:57,549][125162] Saving new best policy, reward=27.350!
[2023-09-30 05:27:57,552][125260] Updated weights for policy 0, policy_version 6560 (0.0016)
[2023-09-30 05:27:57,552][125261] Updated weights for policy 1, policy_version 6560 (0.0017)
[2023-09-30 05:28:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3383296. Throughput: 0: 731.2, 1: 733.5. Samples: 842985. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:28:02,493][123291] Avg episode reward: [(0, '20.200'), (1, '27.350')]
[2023-09-30 05:28:07,491][123291] Fps is (10 sec: 6553.8, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 3416064. Throughput: 0: 728.7, 1: 732.4. Samples: 851907. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:28:07,492][123291] Avg episode reward: [(0, '20.870'), (1, '27.420')]
[2023-09-30 05:28:07,501][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000006672_1708032.pth...
[2023-09-30 05:28:07,501][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000006672_1708032.pth...
[2023-09-30 05:28:07,534][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000003936_1007616.pth
[2023-09-30 05:28:07,538][125162] Saving new best policy, reward=27.420!
[2023-09-30 05:28:07,541][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000003936_1007616.pth
[2023-09-30 05:28:07,546][124965] Saving new best policy, reward=20.870!
[2023-09-30 05:28:11,758][125260] Updated weights for policy 0, policy_version 6720 (0.0019)
[2023-09-30 05:28:11,765][125261] Updated weights for policy 1, policy_version 6720 (0.0017)
[2023-09-30 05:28:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3440640. Throughput: 0: 726.3, 1: 726.8. Samples: 860162. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:28:12,493][123291] Avg episode reward: [(0, '20.870'), (1, '27.420')]
[2023-09-30 05:28:17,492][123291] Fps is (10 sec: 5324.7, 60 sec: 5802.7, 300 sec: 5817.7). Total num frames: 3469312. Throughput: 0: 721.8, 1: 722.4. Samples: 864336. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:28:17,493][123291] Avg episode reward: [(0, '21.360'), (1, '27.390')]
[2023-09-30 05:28:17,494][124965] Saving new best policy, reward=21.360!
[2023-09-30 05:28:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3497984. Throughput: 0: 724.1, 1: 724.0. Samples: 873188. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:28:22,493][123291] Avg episode reward: [(0, '21.360'), (1, '27.390')]
[2023-09-30 05:28:25,818][125260] Updated weights for policy 0, policy_version 6880 (0.0018)
[2023-09-30 05:28:25,818][125261] Updated weights for policy 1, policy_version 6880 (0.0018)
[2023-09-30 05:28:27,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3530752. Throughput: 0: 729.8, 1: 728.3. Samples: 882225. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:28:27,493][123291] Avg episode reward: [(0, '21.480'), (1, '27.440')]
[2023-09-30 05:28:27,494][124965] Saving new best policy, reward=21.480!
[2023-09-30 05:28:27,494][125162] Saving new best policy, reward=27.440!
[2023-09-30 05:28:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 3555328. Throughput: 0: 727.9, 1: 729.8. Samples: 886676. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:28:32,492][123291] Avg episode reward: [(0, '21.890'), (1, '27.500')]
[2023-09-30 05:28:32,493][124965] Saving new best policy, reward=21.890!
[2023-09-30 05:28:32,493][125162] Saving new best policy, reward=27.500!
[2023-09-30 05:28:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 3588096. Throughput: 0: 724.7, 1: 725.0. Samples: 895021. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:28:37,492][123291] Avg episode reward: [(0, '21.890'), (1, '27.500')]
[2023-09-30 05:28:39,917][125261] Updated weights for policy 1, policy_version 7040 (0.0019)
[2023-09-30 05:28:39,918][125260] Updated weights for policy 0, policy_version 7040 (0.0017)
[2023-09-30 05:28:42,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 3612672. Throughput: 0: 724.6, 1: 724.2. Samples: 903928. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:28:42,492][123291] Avg episode reward: [(0, '22.290'), (1, '27.620')]
[2023-09-30 05:28:42,644][124965] Saving new best policy, reward=22.290!
[2023-09-30 05:28:42,681][125162] Saving new best policy, reward=27.620!
[2023-09-30 05:28:47,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3645440. Throughput: 0: 728.1, 1: 725.7. Samples: 908405. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:28:47,492][123291] Avg episode reward: [(0, '22.290'), (1, '27.620')]
[2023-09-30 05:28:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 3670016. Throughput: 0: 727.4, 1: 724.2. Samples: 917230. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:28:52,493][123291] Avg episode reward: [(0, '22.700'), (1, '27.760')]
[2023-09-30 05:28:52,530][124965] Saving new best policy, reward=22.700!
[2023-09-30 05:28:52,647][125162] Saving new best policy, reward=27.760!
[2023-09-30 05:28:53,992][125261] Updated weights for policy 1, policy_version 7200 (0.0019)
[2023-09-30 05:28:53,992][125260] Updated weights for policy 0, policy_version 7200 (0.0018)
[2023-09-30 05:28:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3702784. Throughput: 0: 728.1, 1: 728.2. Samples: 925696. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:28:57,493][123291] Avg episode reward: [(0, '22.700'), (1, '27.760')]
[2023-09-30 05:29:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 3727360. Throughput: 0: 727.9, 1: 728.2. Samples: 929862. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:02,492][123291] Avg episode reward: [(0, '23.010'), (1, '27.790')]
[2023-09-30 05:29:02,493][124965] Saving new best policy, reward=23.010!
[2023-09-30 05:29:02,494][125162] Saving new best policy, reward=27.790!
[2023-09-30 05:29:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 3760128. Throughput: 0: 729.0, 1: 729.9. Samples: 938836. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:07,492][123291] Avg episode reward: [(0, '23.010'), (1, '27.790')]
[2023-09-30 05:29:08,060][125261] Updated weights for policy 1, policy_version 7360 (0.0017)
[2023-09-30 05:29:08,060][125260] Updated weights for policy 0, policy_version 7360 (0.0017)
[2023-09-30 05:29:12,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3792896. Throughput: 0: 729.2, 1: 730.5. Samples: 947911. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:12,493][123291] Avg episode reward: [(0, '23.330'), (1, '27.830')]
[2023-09-30 05:29:12,494][124965] Saving new best policy, reward=23.330!
[2023-09-30 05:29:12,494][125162] Saving new best policy, reward=27.830!
[2023-09-30 05:29:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5803.8). Total num frames: 3817472. Throughput: 0: 729.3, 1: 729.5. Samples: 952320. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:17,492][123291] Avg episode reward: [(0, '23.330'), (1, '27.850')]
[2023-09-30 05:29:17,658][125162] Saving new best policy, reward=27.850!
[2023-09-30 05:29:21,785][125260] Updated weights for policy 0, policy_version 7520 (0.0015)
[2023-09-30 05:29:21,786][125261] Updated weights for policy 1, policy_version 7520 (0.0019)
[2023-09-30 05:29:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3850240. Throughput: 0: 733.8, 1: 733.4. Samples: 961043. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:22,493][123291] Avg episode reward: [(0, '23.330'), (1, '27.850')]
[2023-09-30 05:29:27,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3883008. Throughput: 0: 733.6, 1: 734.8. Samples: 970003. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:29:27,493][123291] Avg episode reward: [(0, '23.620'), (1, '27.920')]
[2023-09-30 05:29:27,494][124965] Saving new best policy, reward=23.620!
[2023-09-30 05:29:27,494][125162] Saving new best policy, reward=27.920!
[2023-09-30 05:29:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3907584. Throughput: 0: 734.4, 1: 737.0. Samples: 974621. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:29:32,493][123291] Avg episode reward: [(0, '23.620'), (1, '27.920')]
[2023-09-30 05:29:35,734][125260] Updated weights for policy 0, policy_version 7680 (0.0017)
[2023-09-30 05:29:35,734][125261] Updated weights for policy 1, policy_version 7680 (0.0017)
[2023-09-30 05:29:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3940352. Throughput: 0: 730.2, 1: 732.4. Samples: 983046. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:37,492][123291] Avg episode reward: [(0, '23.970'), (1, '27.890')]
[2023-09-30 05:29:37,502][124965] Saving new best policy, reward=23.970!
[2023-09-30 05:29:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3964928. Throughput: 0: 735.1, 1: 734.5. Samples: 991827. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:42,493][123291] Avg episode reward: [(0, '23.970'), (1, '27.890')]
[2023-09-30 05:29:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 3997696. Throughput: 0: 737.9, 1: 739.0. Samples: 996321. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:29:47,493][123291] Avg episode reward: [(0, '24.190'), (1, '27.910')]
[2023-09-30 05:29:47,494][124965] Saving new best policy, reward=24.190!
[2023-09-30 05:29:49,873][125260] Updated weights for policy 0, policy_version 7840 (0.0017)
[2023-09-30 05:29:49,873][125261] Updated weights for policy 1, policy_version 7840 (0.0018)
[2023-09-30 05:29:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 4022272. Throughput: 0: 734.9, 1: 736.2. Samples: 1005039. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:29:52,492][123291] Avg episode reward: [(0, '24.190'), (1, '27.910')]
[2023-09-30 05:29:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4055040. Throughput: 0: 730.8, 1: 732.5. Samples: 1013761. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:29:57,492][123291] Avg episode reward: [(0, '24.400'), (1, '27.890')]
[2023-09-30 05:29:57,493][124965] Saving new best policy, reward=24.400!
[2023-09-30 05:30:02,492][123291] Fps is (10 sec: 6553.4, 60 sec: 6007.4, 300 sec: 5831.6). Total num frames: 4087808. Throughput: 0: 733.1, 1: 731.2. Samples: 1018213. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:30:02,493][123291] Avg episode reward: [(0, '24.400'), (1, '27.890')]
[2023-09-30 05:30:03,740][125261] Updated weights for policy 1, policy_version 8000 (0.0017)
[2023-09-30 05:30:03,740][125260] Updated weights for policy 0, policy_version 8000 (0.0017)
[2023-09-30 05:30:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4112384. Throughput: 0: 734.7, 1: 734.1. Samples: 1027139. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:30:07,493][123291] Avg episode reward: [(0, '24.400'), (1, '27.890')]
[2023-09-30 05:30:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000008032_2056192.pth...
[2023-09-30 05:30:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000008032_2056192.pth...
[2023-09-30 05:30:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000005296_1355776.pth
[2023-09-30 05:30:07,545][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000005296_1355776.pth
[2023-09-30 05:30:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4145152. Throughput: 0: 733.9, 1: 735.3. Samples: 1036119. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:30:12,493][123291] Avg episode reward: [(0, '24.700'), (1, '27.910')]
[2023-09-30 05:30:12,494][124965] Saving new best policy, reward=24.700!
[2023-09-30 05:30:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4169728. Throughput: 0: 730.8, 1: 730.1. Samples: 1040361. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:30:17,492][123291] Avg episode reward: [(0, '24.700'), (1, '27.910')]
[2023-09-30 05:30:17,806][125260] Updated weights for policy 0, policy_version 8160 (0.0017)
[2023-09-30 05:30:17,806][125261] Updated weights for policy 1, policy_version 8160 (0.0016)
[2023-09-30 05:30:22,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 4202496. Throughput: 0: 729.2, 1: 728.3. Samples: 1048633. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:30:22,492][123291] Avg episode reward: [(0, '24.970'), (1, '27.940')]
[2023-09-30 05:30:22,499][124965] Saving new best policy, reward=24.970!
[2023-09-30 05:30:22,500][125162] Saving new best policy, reward=27.940!
[2023-09-30 05:30:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 4227072. Throughput: 0: 729.5, 1: 730.0. Samples: 1057505. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:30:27,492][123291] Avg episode reward: [(0, '24.970'), (1, '27.940')]
[2023-09-30 05:30:31,974][125260] Updated weights for policy 0, policy_version 8320 (0.0017)
[2023-09-30 05:30:31,975][125261] Updated weights for policy 1, policy_version 8320 (0.0017)
[2023-09-30 05:30:32,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4259840. Throughput: 0: 729.1, 1: 728.4. Samples: 1061910. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:30:32,493][123291] Avg episode reward: [(0, '25.230'), (1, '27.910')]
[2023-09-30 05:30:32,494][124965] Saving new best policy, reward=25.230!
[2023-09-30 05:30:37,491][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4292608. Throughput: 0: 733.0, 1: 731.1. Samples: 1070925. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:30:37,492][123291] Avg episode reward: [(0, '25.230'), (1, '27.910')]
[2023-09-30 05:30:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4317184. Throughput: 0: 729.7, 1: 729.0. Samples: 1079404. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:30:42,493][123291] Avg episode reward: [(0, '25.420'), (1, '27.910')]
[2023-09-30 05:30:42,494][124965] Saving new best policy, reward=25.420!
[2023-09-30 05:30:45,697][125260] Updated weights for policy 0, policy_version 8480 (0.0018)
[2023-09-30 05:30:45,697][125261] Updated weights for policy 1, policy_version 8480 (0.0017)
[2023-09-30 05:30:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4349952. Throughput: 0: 731.2, 1: 731.4. Samples: 1084029. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:30:47,493][123291] Avg episode reward: [(0, '25.420'), (1, '27.910')]
[2023-09-30 05:30:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4374528. Throughput: 0: 731.0, 1: 728.9. Samples: 1092832. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:30:52,492][123291] Avg episode reward: [(0, '25.420'), (1, '27.910')]
[2023-09-30 05:30:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4407296. Throughput: 0: 729.8, 1: 728.6. Samples: 1101747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:30:57,493][123291] Avg episode reward: [(0, '25.640'), (1, '27.970')]
[2023-09-30 05:30:57,494][124965] Saving new best policy, reward=25.640!
[2023-09-30 05:30:57,494][125162] Saving new best policy, reward=27.970!
[2023-09-30 05:30:59,797][125260] Updated weights for policy 0, policy_version 8640 (0.0017)
[2023-09-30 05:30:59,798][125261] Updated weights for policy 1, policy_version 8640 (0.0016)
[2023-09-30 05:31:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 4431872. Throughput: 0: 728.2, 1: 728.7. Samples: 1105920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:02,493][123291] Avg episode reward: [(0, '25.640'), (1, '27.970')]
[2023-09-30 05:31:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5845.5). Total num frames: 4464640. Throughput: 0: 734.5, 1: 733.6. Samples: 1114700. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:31:07,492][123291] Avg episode reward: [(0, '25.830'), (1, '28.010')]
[2023-09-30 05:31:07,502][125162] Saving new best policy, reward=28.010!
[2023-09-30 05:31:07,502][124965] Saving new best policy, reward=25.830!
[2023-09-30 05:31:12,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4497408. Throughput: 0: 733.6, 1: 732.9. Samples: 1123498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:12,493][123291] Avg episode reward: [(0, '25.830'), (1, '28.010')]
[2023-09-30 05:31:13,815][125260] Updated weights for policy 0, policy_version 8800 (0.0016)
[2023-09-30 05:31:13,815][125261] Updated weights for policy 1, policy_version 8800 (0.0018)
[2023-09-30 05:31:17,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4521984. Throughput: 0: 734.7, 1: 734.7. Samples: 1128033. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:17,493][123291] Avg episode reward: [(0, '26.030'), (1, '28.090')]
[2023-09-30 05:31:17,494][124965] Saving new best policy, reward=26.030!
[2023-09-30 05:31:17,494][125162] Saving new best policy, reward=28.090!
[2023-09-30 05:31:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4554752. Throughput: 0: 729.7, 1: 730.8. Samples: 1136647. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:31:22,493][123291] Avg episode reward: [(0, '26.030'), (1, '28.090')]
[2023-09-30 05:31:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4579328. Throughput: 0: 735.1, 1: 734.5. Samples: 1145539. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:31:27,493][123291] Avg episode reward: [(0, '26.270'), (1, '28.060')]
[2023-09-30 05:31:27,654][124965] Saving new best policy, reward=26.270!
[2023-09-30 05:31:27,717][125261] Updated weights for policy 1, policy_version 8960 (0.0017)
[2023-09-30 05:31:27,718][125260] Updated weights for policy 0, policy_version 8960 (0.0014)
[2023-09-30 05:31:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4612096. Throughput: 0: 734.4, 1: 734.2. Samples: 1150115. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:32,493][123291] Avg episode reward: [(0, '26.270'), (1, '28.060')]
[2023-09-30 05:31:37,492][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4644864. Throughput: 0: 733.2, 1: 736.5. Samples: 1158972. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:37,492][123291] Avg episode reward: [(0, '26.380'), (1, '28.090')]
[2023-09-30 05:31:37,502][124965] Saving new best policy, reward=26.380!
[2023-09-30 05:31:41,705][125260] Updated weights for policy 0, policy_version 9120 (0.0017)
[2023-09-30 05:31:41,713][125261] Updated weights for policy 1, policy_version 9120 (0.0017)
[2023-09-30 05:31:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4669440. Throughput: 0: 729.0, 1: 729.5. Samples: 1167380. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:42,493][123291] Avg episode reward: [(0, '26.450'), (1, '28.060')]
[2023-09-30 05:31:42,493][124965] Saving new best policy, reward=26.450!
[2023-09-30 05:31:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4702208. Throughput: 0: 734.8, 1: 732.8. Samples: 1171962. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:47,492][123291] Avg episode reward: [(0, '26.450'), (1, '28.060')]
[2023-09-30 05:31:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4726784. Throughput: 0: 731.8, 1: 733.4. Samples: 1180631. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:31:52,493][123291] Avg episode reward: [(0, '26.580'), (1, '28.110')]
[2023-09-30 05:31:52,503][125162] Saving new best policy, reward=28.110!
[2023-09-30 05:31:52,503][124965] Saving new best policy, reward=26.580!
[2023-09-30 05:31:55,927][125261] Updated weights for policy 1, policy_version 9280 (0.0016)
[2023-09-30 05:31:55,927][125260] Updated weights for policy 0, policy_version 9280 (0.0018)
[2023-09-30 05:31:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4759552. Throughput: 0: 729.3, 1: 729.4. Samples: 1189142. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:31:57,493][123291] Avg episode reward: [(0, '26.580'), (1, '28.110')]
[2023-09-30 05:32:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4784128. Throughput: 0: 729.4, 1: 729.7. Samples: 1193690. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:32:02,493][123291] Avg episode reward: [(0, '26.820'), (1, '28.230')]
[2023-09-30 05:32:02,494][124965] Saving new best policy, reward=26.820!
[2023-09-30 05:32:02,494][125162] Saving new best policy, reward=28.230!
[2023-09-30 05:32:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4816896. Throughput: 0: 728.6, 1: 728.2. Samples: 1202203. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:32:07,492][123291] Avg episode reward: [(0, '26.820'), (1, '28.230')]
[2023-09-30 05:32:07,502][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000009408_2408448.pth...
[2023-09-30 05:32:07,502][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000009408_2408448.pth...
[2023-09-30 05:32:07,537][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000006672_1708032.pth
[2023-09-30 05:32:07,540][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000006672_1708032.pth
[2023-09-30 05:32:09,842][125261] Updated weights for policy 1, policy_version 9440 (0.0017)
[2023-09-30 05:32:09,842][125260] Updated weights for policy 0, policy_version 9440 (0.0017)
[2023-09-30 05:32:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 4841472. Throughput: 0: 726.8, 1: 728.6. Samples: 1211032. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:32:12,493][123291] Avg episode reward: [(0, '27.170'), (1, '28.330')]
[2023-09-30 05:32:12,494][124965] Saving new best policy, reward=27.170!
[2023-09-30 05:32:12,638][125162] Saving new best policy, reward=28.330!
[2023-09-30 05:32:17,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 4874240. Throughput: 0: 722.2, 1: 723.4. Samples: 1215163. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:32:17,492][123291] Avg episode reward: [(0, '27.170'), (1, '28.330')]
[2023-09-30 05:32:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 4898816. Throughput: 0: 723.4, 1: 723.5. Samples: 1224082. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:32:22,493][123291] Avg episode reward: [(0, '27.370'), (1, '28.400')]
[2023-09-30 05:32:22,609][124965] Saving new best policy, reward=27.370!
[2023-09-30 05:32:22,669][125162] Saving new best policy, reward=28.400!
[2023-09-30 05:32:24,076][125260] Updated weights for policy 0, policy_version 9600 (0.0017)
[2023-09-30 05:32:24,076][125261] Updated weights for policy 1, policy_version 9600 (0.0017)
[2023-09-30 05:32:27,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 4931584. Throughput: 0: 727.9, 1: 728.1. Samples: 1232898. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:32:27,493][123291] Avg episode reward: [(0, '27.420'), (1, '28.450')]
[2023-09-30 05:32:27,494][124965] Saving new best policy, reward=27.420!
[2023-09-30 05:32:27,494][125162] Saving new best policy, reward=28.450!
[2023-09-30 05:32:32,492][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 4964352. Throughput: 0: 725.4, 1: 725.9. Samples: 1237272. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:32:32,493][123291] Avg episode reward: [(0, '27.420'), (1, '28.450')]
[2023-09-30 05:32:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 4988928. Throughput: 0: 730.9, 1: 731.3. Samples: 1246431. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:32:37,493][123291] Avg episode reward: [(0, '27.700'), (1, '28.520')]
[2023-09-30 05:32:37,504][124965] Saving new best policy, reward=27.700!
[2023-09-30 05:32:37,504][125162] Saving new best policy, reward=28.520!
[2023-09-30 05:32:37,889][125261] Updated weights for policy 1, policy_version 9760 (0.0013)
[2023-09-30 05:32:37,889][125260] Updated weights for policy 0, policy_version 9760 (0.0017)
[2023-09-30 05:32:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5021696. Throughput: 0: 733.0, 1: 732.5. Samples: 1255090. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:32:42,493][123291] Avg episode reward: [(0, '27.700'), (1, '28.520')]
[2023-09-30 05:32:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 5046272. Throughput: 0: 731.4, 1: 731.5. Samples: 1259520. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:32:47,493][123291] Avg episode reward: [(0, '27.840'), (1, '28.550')]
[2023-09-30 05:32:47,635][124965] Saving new best policy, reward=27.840!
[2023-09-30 05:32:47,646][125162] Saving new best policy, reward=28.550!
[2023-09-30 05:32:51,814][125260] Updated weights for policy 0, policy_version 9920 (0.0015)
[2023-09-30 05:32:51,814][125261] Updated weights for policy 1, policy_version 9920 (0.0018)
[2023-09-30 05:32:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5079040. Throughput: 0: 733.7, 1: 733.5. Samples: 1268228. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:32:52,492][123291] Avg episode reward: [(0, '27.840'), (1, '28.550')]
[2023-09-30 05:32:57,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5111808. Throughput: 0: 737.4, 1: 736.3. Samples: 1277350. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:32:57,493][123291] Avg episode reward: [(0, '27.920'), (1, '28.650')]
[2023-09-30 05:32:57,494][124965] Saving new best policy, reward=27.920!
[2023-09-30 05:32:57,494][125162] Saving new best policy, reward=28.650!
[2023-09-30 05:33:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 5136384. Throughput: 0: 741.3, 1: 741.0. Samples: 1281867. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:33:02,493][123291] Avg episode reward: [(0, '27.920'), (1, '28.650')]
[2023-09-30 05:33:05,449][125260] Updated weights for policy 0, policy_version 10080 (0.0017)
[2023-09-30 05:33:05,449][125261] Updated weights for policy 1, policy_version 10080 (0.0017)
[2023-09-30 05:33:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5169152. Throughput: 0: 740.7, 1: 740.4. Samples: 1290731. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:33:07,493][123291] Avg episode reward: [(0, '28.010'), (1, '28.650')]
[2023-09-30 05:33:07,503][124965] Saving new best policy, reward=28.010!
[2023-09-30 05:33:12,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5873.2). Total num frames: 5201920. Throughput: 0: 743.7, 1: 741.7. Samples: 1299740. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:33:12,493][123291] Avg episode reward: [(0, '28.010'), (1, '28.650')]
[2023-09-30 05:33:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5226496. Throughput: 0: 742.9, 1: 743.6. Samples: 1304164. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:33:17,493][123291] Avg episode reward: [(0, '28.050'), (1, '28.650')]
[2023-09-30 05:33:17,494][124965] Saving new best policy, reward=28.050!
[2023-09-30 05:33:19,329][125260] Updated weights for policy 0, policy_version 10240 (0.0015)
[2023-09-30 05:33:19,329][125261] Updated weights for policy 1, policy_version 10240 (0.0018)
[2023-09-30 05:33:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 5259264. Throughput: 0: 737.5, 1: 737.0. Samples: 1312786. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:33:22,493][123291] Avg episode reward: [(0, '28.070'), (1, '28.690')]
[2023-09-30 05:33:22,504][124965] Saving new best policy, reward=28.070!
[2023-09-30 05:33:22,504][125162] Saving new best policy, reward=28.690!
[2023-09-30 05:33:27,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5873.2). Total num frames: 5287936. Throughput: 0: 743.3, 1: 742.7. Samples: 1321958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:33:27,493][123291] Avg episode reward: [(0, '28.070'), (1, '28.690')]
[2023-09-30 05:33:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 5316608. Throughput: 0: 744.1, 1: 741.9. Samples: 1326387. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:33:32,492][123291] Avg episode reward: [(0, '28.090'), (1, '28.780')]
[2023-09-30 05:33:32,492][124965] Saving new best policy, reward=28.090!
[2023-09-30 05:33:32,492][125162] Saving new best policy, reward=28.780!
[2023-09-30 05:33:33,200][125261] Updated weights for policy 1, policy_version 10400 (0.0018)
[2023-09-30 05:33:33,201][125260] Updated weights for policy 0, policy_version 10400 (0.0020)
[2023-09-30 05:33:37,491][123291] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 5349376. Throughput: 0: 744.9, 1: 745.5. Samples: 1335296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:33:37,492][123291] Avg episode reward: [(0, '28.090'), (1, '28.780')]
[2023-09-30 05:33:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5373952. Throughput: 0: 741.2, 1: 741.3. Samples: 1344064. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:33:42,493][123291] Avg episode reward: [(0, '28.140'), (1, '28.850')]
[2023-09-30 05:33:42,494][125162] Saving new best policy, reward=28.850!
[2023-09-30 05:33:42,494][124965] Saving new best policy, reward=28.140!
[2023-09-30 05:33:46,882][125261] Updated weights for policy 1, policy_version 10560 (0.0017)
[2023-09-30 05:33:46,883][125260] Updated weights for policy 0, policy_version 10560 (0.0017)
[2023-09-30 05:33:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 5406720. Throughput: 0: 741.5, 1: 741.9. Samples: 1348618. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:33:47,493][123291] Avg episode reward: [(0, '28.140'), (1, '28.850')]
[2023-09-30 05:33:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5431296. Throughput: 0: 741.7, 1: 742.7. Samples: 1357527. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:33:52,493][123291] Avg episode reward: [(0, '28.240'), (1, '28.860')]
[2023-09-30 05:33:52,506][124965] Saving new best policy, reward=28.240!
[2023-09-30 05:33:52,506][125162] Saving new best policy, reward=28.860!
[2023-09-30 05:33:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 5464064. Throughput: 0: 735.4, 1: 737.4. Samples: 1366016. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:33:57,493][123291] Avg episode reward: [(0, '28.240'), (1, '28.860')]
[2023-09-30 05:34:00,942][125260] Updated weights for policy 0, policy_version 10720 (0.0019)
[2023-09-30 05:34:00,943][125261] Updated weights for policy 1, policy_version 10720 (0.0019)
[2023-09-30 05:34:02,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 5496832. Throughput: 0: 735.1, 1: 734.5. Samples: 1370299. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:02,493][123291] Avg episode reward: [(0, '28.360'), (1, '28.840')]
[2023-09-30 05:34:02,494][124965] Saving new best policy, reward=28.360!
[2023-09-30 05:34:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5521408. Throughput: 0: 741.0, 1: 739.3. Samples: 1379397. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:07,493][123291] Avg episode reward: [(0, '28.360'), (1, '28.840')]
[2023-09-30 05:34:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000010784_2760704.pth...
[2023-09-30 05:34:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000010784_2760704.pth...
[2023-09-30 05:34:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000008032_2056192.pth
[2023-09-30 05:34:07,539][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000008032_2056192.pth
[2023-09-30 05:34:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 5554176. Throughput: 0: 738.6, 1: 739.5. Samples: 1388475. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:34:12,493][123291] Avg episode reward: [(0, '28.360'), (1, '28.840')]
[2023-09-30 05:34:14,859][125260] Updated weights for policy 0, policy_version 10880 (0.0016)
[2023-09-30 05:34:14,860][125261] Updated weights for policy 1, policy_version 10880 (0.0017)
[2023-09-30 05:34:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 5578752. Throughput: 0: 735.0, 1: 737.2. Samples: 1392640. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:34:17,492][123291] Avg episode reward: [(0, '28.480'), (1, '28.910')]
[2023-09-30 05:34:17,493][124965] Saving new best policy, reward=28.480!
[2023-09-30 05:34:17,493][125162] Saving new best policy, reward=28.910!
[2023-09-30 05:34:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 5611520. Throughput: 0: 728.2, 1: 727.8. Samples: 1400815. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:22,493][123291] Avg episode reward: [(0, '28.480'), (1, '28.910')]
[2023-09-30 05:34:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5859.4). Total num frames: 5636096. Throughput: 0: 721.0, 1: 719.9. Samples: 1408905. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:34:27,492][123291] Avg episode reward: [(0, '28.560'), (1, '28.960')]
[2023-09-30 05:34:27,492][124965] Saving new best policy, reward=28.560!
[2023-09-30 05:34:27,493][125162] Saving new best policy, reward=28.960!
[2023-09-30 05:34:30,253][125260] Updated weights for policy 0, policy_version 11040 (0.0018)
[2023-09-30 05:34:30,253][125261] Updated weights for policy 1, policy_version 11040 (0.0013)
[2023-09-30 05:34:32,491][123291] Fps is (10 sec: 4915.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 5660672. Throughput: 0: 713.4, 1: 711.6. Samples: 1412743. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:34:32,492][123291] Avg episode reward: [(0, '28.560'), (1, '28.960')]
[2023-09-30 05:34:37,491][123291] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5831.6). Total num frames: 5685248. Throughput: 0: 695.3, 1: 696.9. Samples: 1420174. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:37,492][123291] Avg episode reward: [(0, '28.560'), (1, '28.960')]
[2023-09-30 05:34:42,491][123291] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 5709824. Throughput: 0: 682.8, 1: 682.7. Samples: 1427461. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:42,492][123291] Avg episode reward: [(0, '28.750'), (1, '29.020')]
[2023-09-30 05:34:42,493][124965] Saving new best policy, reward=28.750!
[2023-09-30 05:34:42,493][125162] Saving new best policy, reward=29.020!
[2023-09-30 05:34:46,381][125261] Updated weights for policy 1, policy_version 11200 (0.0016)
[2023-09-30 05:34:46,381][125260] Updated weights for policy 0, policy_version 11200 (0.0016)
[2023-09-30 05:34:47,491][123291] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 5734400. Throughput: 0: 679.9, 1: 681.2. Samples: 1431552. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:47,492][123291] Avg episode reward: [(0, '28.750'), (1, '29.020')]
[2023-09-30 05:34:52,492][123291] Fps is (10 sec: 5324.7, 60 sec: 5529.6, 300 sec: 5789.9). Total num frames: 5763072. Throughput: 0: 668.0, 1: 669.7. Samples: 1439596. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:34:52,493][123291] Avg episode reward: [(0, '28.860'), (1, '29.060')]
[2023-09-30 05:34:52,505][125162] Saving new best policy, reward=29.060!
[2023-09-30 05:34:52,516][124965] Saving new best policy, reward=28.860!
[2023-09-30 05:34:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5776.1). Total num frames: 5791744. Throughput: 0: 660.2, 1: 660.9. Samples: 1447923. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:34:57,493][123291] Avg episode reward: [(0, '28.860'), (1, '29.060')]
[2023-09-30 05:35:00,986][125260] Updated weights for policy 0, policy_version 11360 (0.0017)
[2023-09-30 05:35:00,986][125261] Updated weights for policy 1, policy_version 11360 (0.0016)
[2023-09-30 05:35:02,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 5824512. Throughput: 0: 661.8, 1: 660.8. Samples: 1452157. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:35:02,492][123291] Avg episode reward: [(0, '28.910'), (1, '29.090')]
[2023-09-30 05:35:02,493][124965] Saving new best policy, reward=28.910!
[2023-09-30 05:35:02,493][125162] Saving new best policy, reward=29.090!
[2023-09-30 05:35:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5461.4, 300 sec: 5776.1). Total num frames: 5849088. Throughput: 0: 670.0, 1: 671.5. Samples: 1461180. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:35:07,492][123291] Avg episode reward: [(0, '28.910'), (1, '29.090')]
[2023-09-30 05:35:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 5881856. Throughput: 0: 680.7, 1: 679.8. Samples: 1470128. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:35:12,493][123291] Avg episode reward: [(0, '28.910'), (1, '29.090')]
[2023-09-30 05:35:14,992][125260] Updated weights for policy 0, policy_version 11520 (0.0018)
[2023-09-30 05:35:14,993][125261] Updated weights for policy 1, policy_version 11520 (0.0018)
[2023-09-30 05:35:17,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5776.1). Total num frames: 5906432. Throughput: 0: 685.4, 1: 687.6. Samples: 1474526. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:17,493][123291] Avg episode reward: [(0, '29.000'), (1, '29.060')]
[2023-09-30 05:35:17,494][124965] Saving new best policy, reward=29.000!
[2023-09-30 05:35:22,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5461.4, 300 sec: 5803.8). Total num frames: 5939200. Throughput: 0: 700.6, 1: 698.1. Samples: 1483116. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:22,492][123291] Avg episode reward: [(0, '29.000'), (1, '29.060')]
[2023-09-30 05:35:27,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 5971968. Throughput: 0: 718.8, 1: 718.0. Samples: 1492120. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:27,493][123291] Avg episode reward: [(0, '29.060'), (1, '29.110')]
[2023-09-30 05:35:27,494][124965] Saving new best policy, reward=29.060!
[2023-09-30 05:35:27,494][125162] Saving new best policy, reward=29.110!
[2023-09-30 05:35:28,838][125261] Updated weights for policy 1, policy_version 11680 (0.0017)
[2023-09-30 05:35:28,838][125260] Updated weights for policy 0, policy_version 11680 (0.0018)
[2023-09-30 05:35:32,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5776.1). Total num frames: 5996544. Throughput: 0: 723.3, 1: 721.7. Samples: 1496576. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:32,492][123291] Avg episode reward: [(0, '29.060'), (1, '29.110')]
[2023-09-30 05:35:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 6029312. Throughput: 0: 728.0, 1: 726.6. Samples: 1505050. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:35:37,493][123291] Avg episode reward: [(0, '29.160'), (1, '29.070')]
[2023-09-30 05:35:37,504][124965] Saving new best policy, reward=29.160!
[2023-09-30 05:35:42,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6053888. Throughput: 0: 729.0, 1: 728.6. Samples: 1513514. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:35:42,492][123291] Avg episode reward: [(0, '29.160'), (1, '29.070')]
[2023-09-30 05:35:43,050][125260] Updated weights for policy 0, policy_version 11840 (0.0017)
[2023-09-30 05:35:43,050][125261] Updated weights for policy 1, policy_version 11840 (0.0017)
[2023-09-30 05:35:47,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 6086656. Throughput: 0: 732.6, 1: 732.6. Samples: 1518093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:47,493][123291] Avg episode reward: [(0, '29.230'), (1, '28.990')]
[2023-09-30 05:35:47,494][124965] Saving new best policy, reward=29.230!
[2023-09-30 05:35:52,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5802.6, 300 sec: 5776.1). Total num frames: 6111232. Throughput: 0: 732.9, 1: 730.2. Samples: 1527018. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:52,492][123291] Avg episode reward: [(0, '29.230'), (1, '28.990')]
[2023-09-30 05:35:57,014][125260] Updated weights for policy 0, policy_version 12000 (0.0018)
[2023-09-30 05:35:57,014][125261] Updated weights for policy 1, policy_version 12000 (0.0018)
[2023-09-30 05:35:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 6144000. Throughput: 0: 729.6, 1: 732.4. Samples: 1535917. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:35:57,492][123291] Avg episode reward: [(0, '29.230'), (1, '28.990')]
[2023-09-30 05:36:02,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6168576. Throughput: 0: 728.6, 1: 728.6. Samples: 1540096. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:36:02,492][123291] Avg episode reward: [(0, '29.310'), (1, '28.860')]
[2023-09-30 05:36:02,493][124965] Saving new best policy, reward=29.310!
[2023-09-30 05:36:07,491][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6201344. Throughput: 0: 723.8, 1: 724.5. Samples: 1548288. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:36:07,492][123291] Avg episode reward: [(0, '29.310'), (1, '28.860')]
[2023-09-30 05:36:07,500][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000012112_3100672.pth...
[2023-09-30 05:36:07,501][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000012112_3100672.pth...
[2023-09-30 05:36:07,534][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000009408_2408448.pth
[2023-09-30 05:36:07,536][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000009408_2408448.pth
[2023-09-30 05:36:11,448][125260] Updated weights for policy 0, policy_version 12160 (0.0016)
[2023-09-30 05:36:11,449][125261] Updated weights for policy 1, policy_version 12160 (0.0018)
[2023-09-30 05:36:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6225920. Throughput: 0: 719.9, 1: 720.1. Samples: 1556919. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:36:12,493][123291] Avg episode reward: [(0, '29.390'), (1, '28.880')]
[2023-09-30 05:36:12,494][124965] Saving new best policy, reward=29.390!
[2023-09-30 05:36:17,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6258688. Throughput: 0: 718.6, 1: 719.6. Samples: 1561298. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:36:17,492][123291] Avg episode reward: [(0, '29.390'), (1, '28.880')]
[2023-09-30 05:36:22,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6283264. Throughput: 0: 721.9, 1: 722.9. Samples: 1570069. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:36:22,492][123291] Avg episode reward: [(0, '29.470'), (1, '28.870')]
[2023-09-30 05:36:22,500][124965] Saving new best policy, reward=29.470!
[2023-09-30 05:36:25,697][125260] Updated weights for policy 0, policy_version 12320 (0.0018)
[2023-09-30 05:36:25,697][125261] Updated weights for policy 1, policy_version 12320 (0.0017)
[2023-09-30 05:36:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6316032. Throughput: 0: 726.0, 1: 724.1. Samples: 1578766. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:36:27,492][123291] Avg episode reward: [(0, '29.470'), (1, '28.870')]
[2023-09-30 05:36:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 6340608. Throughput: 0: 721.8, 1: 722.7. Samples: 1583096. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:36:32,493][123291] Avg episode reward: [(0, '29.590'), (1, '28.910')]
[2023-09-30 05:36:32,494][124965] Saving new best policy, reward=29.590!
[2023-09-30 05:36:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6373376. Throughput: 0: 718.8, 1: 718.9. Samples: 1591715. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:36:37,492][123291] Avg episode reward: [(0, '29.590'), (1, '28.910')]
[2023-09-30 05:36:39,644][125260] Updated weights for policy 0, policy_version 12480 (0.0017)
[2023-09-30 05:36:39,644][125261] Updated weights for policy 1, policy_version 12480 (0.0015)
[2023-09-30 05:36:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 6397952. Throughput: 0: 718.7, 1: 717.2. Samples: 1600529. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:36:42,493][123291] Avg episode reward: [(0, '29.590'), (1, '28.910')]
[2023-09-30 05:36:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6430720. Throughput: 0: 719.4, 1: 719.2. Samples: 1604833. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:36:47,493][123291] Avg episode reward: [(0, '29.690'), (1, '28.900')]
[2023-09-30 05:36:47,494][124965] Saving new best policy, reward=29.690!
[2023-09-30 05:36:52,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5871.0, 300 sec: 5776.1). Total num frames: 6463488. Throughput: 0: 727.7, 1: 727.8. Samples: 1613784. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:36:52,492][123291] Avg episode reward: [(0, '29.690'), (1, '28.900')]
[2023-09-30 05:36:53,765][125260] Updated weights for policy 0, policy_version 12640 (0.0016)
[2023-09-30 05:36:53,765][125261] Updated weights for policy 1, policy_version 12640 (0.0017)
[2023-09-30 05:36:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6488064. Throughput: 0: 724.3, 1: 723.9. Samples: 1622088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:36:57,492][123291] Avg episode reward: [(0, '29.700'), (1, '28.830')]
[2023-09-30 05:36:57,493][124965] Saving new best policy, reward=29.700!
[2023-09-30 05:37:02,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6520832. Throughput: 0: 725.8, 1: 725.7. Samples: 1626614. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:37:02,492][123291] Avg episode reward: [(0, '29.700'), (1, '28.830')]
[2023-09-30 05:37:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 6545408. Throughput: 0: 730.0, 1: 729.0. Samples: 1635724. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:37:07,493][123291] Avg episode reward: [(0, '29.780'), (1, '28.780')]
[2023-09-30 05:37:07,596][124965] Saving new best policy, reward=29.780!
[2023-09-30 05:37:07,599][125261] Updated weights for policy 1, policy_version 12800 (0.0017)
[2023-09-30 05:37:07,599][125260] Updated weights for policy 0, policy_version 12800 (0.0016)
[2023-09-30 05:37:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.0). Total num frames: 6578176. Throughput: 0: 730.4, 1: 732.3. Samples: 1644585. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:37:12,493][123291] Avg episode reward: [(0, '29.780'), (1, '28.780')]
[2023-09-30 05:37:17,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 6610944. Throughput: 0: 733.5, 1: 732.7. Samples: 1649076. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:37:17,492][123291] Avg episode reward: [(0, '29.810'), (1, '28.760')]
[2023-09-30 05:37:17,493][124965] Saving new best policy, reward=29.810!
[2023-09-30 05:37:21,602][125260] Updated weights for policy 0, policy_version 12960 (0.0018)
[2023-09-30 05:37:21,603][125261] Updated weights for policy 1, policy_version 12960 (0.0016)
[2023-09-30 05:37:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6635520. Throughput: 0: 734.1, 1: 734.2. Samples: 1657787. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:37:22,494][123291] Avg episode reward: [(0, '29.810'), (1, '28.760')]
[2023-09-30 05:37:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6668288. Throughput: 0: 737.4, 1: 738.4. Samples: 1666939. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:37:27,493][123291] Avg episode reward: [(0, '29.810'), (1, '28.760')]
[2023-09-30 05:37:32,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6692864. Throughput: 0: 737.0, 1: 737.2. Samples: 1671168. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:37:32,493][123291] Avg episode reward: [(0, '29.890'), (1, '28.780')]
[2023-09-30 05:37:32,654][124965] Saving new best policy, reward=29.890!
[2023-09-30 05:37:35,446][125261] Updated weights for policy 1, policy_version 13120 (0.0018)
[2023-09-30 05:37:35,447][125260] Updated weights for policy 0, policy_version 13120 (0.0018)
[2023-09-30 05:37:37,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6725632. Throughput: 0: 733.6, 1: 733.0. Samples: 1679782. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:37:37,492][123291] Avg episode reward: [(0, '29.890'), (1, '28.780')]
[2023-09-30 05:37:42,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5803.8). Total num frames: 6758400. Throughput: 0: 740.9, 1: 740.7. Samples: 1688761. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:37:42,493][123291] Avg episode reward: [(0, '30.000'), (1, '28.790')]
[2023-09-30 05:37:42,494][124965] Saving new best policy, reward=30.000!
[2023-09-30 05:37:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6782976. Throughput: 0: 739.6, 1: 739.5. Samples: 1693171. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:37:47,493][123291] Avg episode reward: [(0, '30.000'), (1, '28.790')]
[2023-09-30 05:37:49,538][125260] Updated weights for policy 0, policy_version 13280 (0.0019)
[2023-09-30 05:37:49,538][125261] Updated weights for policy 1, policy_version 13280 (0.0018)
[2023-09-30 05:37:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6815744. Throughput: 0: 734.3, 1: 736.0. Samples: 1701888. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:37:52,493][123291] Avg episode reward: [(0, '30.100'), (1, '28.810')]
[2023-09-30 05:37:52,504][124965] Saving new best policy, reward=30.100!
[2023-09-30 05:37:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6840320. Throughput: 0: 738.2, 1: 738.2. Samples: 1711025. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:37:57,493][123291] Avg episode reward: [(0, '30.100'), (1, '28.810')]
[2023-09-30 05:38:02,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6873088. Throughput: 0: 739.3, 1: 738.9. Samples: 1715597. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:02,492][123291] Avg episode reward: [(0, '30.220'), (1, '28.830')]
[2023-09-30 05:38:02,492][124965] Saving new best policy, reward=30.220!
[2023-09-30 05:38:03,141][125261] Updated weights for policy 1, policy_version 13440 (0.0018)
[2023-09-30 05:38:03,142][125260] Updated weights for policy 0, policy_version 13440 (0.0017)
[2023-09-30 05:38:07,492][123291] Fps is (10 sec: 6553.5, 60 sec: 6007.4, 300 sec: 5776.1). Total num frames: 6905856. Throughput: 0: 739.6, 1: 741.0. Samples: 1724416. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:07,493][123291] Avg episode reward: [(0, '30.220'), (1, '28.830')]
[2023-09-30 05:38:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000013488_3452928.pth...
[2023-09-30 05:38:07,505][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000013488_3452928.pth...
[2023-09-30 05:38:07,540][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000010784_2760704.pth
[2023-09-30 05:38:07,541][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000010784_2760704.pth
[2023-09-30 05:38:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6930432. Throughput: 0: 736.4, 1: 735.8. Samples: 1733186. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:12,493][123291] Avg episode reward: [(0, '30.290'), (1, '28.800')]
[2023-09-30 05:38:12,494][124965] Saving new best policy, reward=30.290!
[2023-09-30 05:38:16,901][125261] Updated weights for policy 1, policy_version 13600 (0.0017)
[2023-09-30 05:38:16,902][125260] Updated weights for policy 0, policy_version 13600 (0.0017)
[2023-09-30 05:38:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6963200. Throughput: 0: 739.5, 1: 738.0. Samples: 1737657. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:17,492][123291] Avg episode reward: [(0, '30.290'), (1, '28.800')]
[2023-09-30 05:38:22,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5762.2). Total num frames: 6987776. Throughput: 0: 740.9, 1: 743.1. Samples: 1746562. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:22,492][123291] Avg episode reward: [(0, '30.290'), (1, '28.800')]
[2023-09-30 05:38:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5776.0). Total num frames: 7020544. Throughput: 0: 737.7, 1: 738.3. Samples: 1755180. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:27,493][123291] Avg episode reward: [(0, '30.420'), (1, '28.810')]
[2023-09-30 05:38:27,494][124965] Saving new best policy, reward=30.420!
[2023-09-30 05:38:30,816][125260] Updated weights for policy 0, policy_version 13760 (0.0017)
[2023-09-30 05:38:30,816][125261] Updated weights for policy 1, policy_version 13760 (0.0018)
[2023-09-30 05:38:32,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5776.1). Total num frames: 7053312. Throughput: 0: 740.0, 1: 739.4. Samples: 1759747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:32,493][123291] Avg episode reward: [(0, '30.420'), (1, '28.810')]
[2023-09-30 05:38:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7077888. Throughput: 0: 742.1, 1: 741.5. Samples: 1768651. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:38:37,493][123291] Avg episode reward: [(0, '30.470'), (1, '28.820')]
[2023-09-30 05:38:37,504][124965] Saving new best policy, reward=30.470!
[2023-09-30 05:38:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7110656. Throughput: 0: 740.1, 1: 737.6. Samples: 1777524. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:38:42,493][123291] Avg episode reward: [(0, '30.470'), (1, '28.820')]
[2023-09-30 05:38:44,828][125260] Updated weights for policy 0, policy_version 13920 (0.0017)
[2023-09-30 05:38:44,828][125261] Updated weights for policy 1, policy_version 13920 (0.0017)
[2023-09-30 05:38:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7135232. Throughput: 0: 734.4, 1: 735.8. Samples: 1781760. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:38:47,493][123291] Avg episode reward: [(0, '30.500'), (1, '28.910')]
[2023-09-30 05:38:47,545][124965] Saving new best policy, reward=30.500!
[2023-09-30 05:38:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7168000. Throughput: 0: 734.8, 1: 733.2. Samples: 1790476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:38:52,493][123291] Avg episode reward: [(0, '30.500'), (1, '28.910')]
[2023-09-30 05:38:57,491][123291] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5776.1). Total num frames: 7200768. Throughput: 0: 737.5, 1: 737.0. Samples: 1799538. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:38:57,492][123291] Avg episode reward: [(0, '30.540'), (1, '28.920')]
[2023-09-30 05:38:57,492][124965] Saving new best policy, reward=30.540!
[2023-09-30 05:38:58,944][125261] Updated weights for policy 1, policy_version 14080 (0.0016)
[2023-09-30 05:38:58,944][125260] Updated weights for policy 0, policy_version 14080 (0.0016)
[2023-09-30 05:39:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7225344. Throughput: 0: 733.0, 1: 732.9. Samples: 1803620. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:02,493][123291] Avg episode reward: [(0, '30.540'), (1, '28.920')]
[2023-09-30 05:39:07,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7258112. Throughput: 0: 733.2, 1: 730.5. Samples: 1812427. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:07,493][123291] Avg episode reward: [(0, '30.540'), (1, '28.920')]
[2023-09-30 05:39:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7282688. Throughput: 0: 730.4, 1: 730.5. Samples: 1820921. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:12,493][123291] Avg episode reward: [(0, '30.580'), (1, '28.920')]
[2023-09-30 05:39:12,494][124965] Saving new best policy, reward=30.580!
[2023-09-30 05:39:13,070][125260] Updated weights for policy 0, policy_version 14240 (0.0019)
[2023-09-30 05:39:13,070][125261] Updated weights for policy 1, policy_version 14240 (0.0017)
[2023-09-30 05:39:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 7315456. Throughput: 0: 726.8, 1: 726.7. Samples: 1825156. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:17,493][123291] Avg episode reward: [(0, '30.580'), (1, '28.920')]
[2023-09-30 05:39:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.0). Total num frames: 7340032. Throughput: 0: 724.1, 1: 724.0. Samples: 1833814. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:22,493][123291] Avg episode reward: [(0, '30.560'), (1, '28.910')]
[2023-09-30 05:39:27,415][125261] Updated weights for policy 1, policy_version 14400 (0.0017)
[2023-09-30 05:39:27,415][125260] Updated weights for policy 0, policy_version 14400 (0.0016)
[2023-09-30 05:39:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 7372800. Throughput: 0: 720.6, 1: 721.8. Samples: 1842430. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:39:27,492][123291] Avg episode reward: [(0, '30.560'), (1, '28.910')]
[2023-09-30 05:39:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 7397376. Throughput: 0: 724.8, 1: 723.0. Samples: 1846913. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:39:32,492][123291] Avg episode reward: [(0, '30.570'), (1, '28.920')]
[2023-09-30 05:39:37,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 7430144. Throughput: 0: 721.5, 1: 723.2. Samples: 1855488. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:37,493][123291] Avg episode reward: [(0, '30.570'), (1, '28.920')]
[2023-09-30 05:39:41,429][125260] Updated weights for policy 0, policy_version 14560 (0.0018)
[2023-09-30 05:39:41,429][125261] Updated weights for policy 1, policy_version 14560 (0.0018)
[2023-09-30 05:39:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 7454720. Throughput: 0: 718.2, 1: 718.3. Samples: 1864180. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-30 05:39:42,493][123291] Avg episode reward: [(0, '30.660'), (1, '28.960')]
[2023-09-30 05:39:42,494][124965] Saving new best policy, reward=30.660!
[2023-09-30 05:39:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 7487488. Throughput: 0: 722.6, 1: 722.8. Samples: 1868667. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:39:47,493][123291] Avg episode reward: [(0, '30.660'), (1, '28.960')]
[2023-09-30 05:39:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 7512064. Throughput: 0: 726.4, 1: 724.7. Samples: 1877726. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:39:52,492][123291] Avg episode reward: [(0, '30.660'), (1, '28.960')]
[2023-09-30 05:39:55,366][125260] Updated weights for policy 0, policy_version 14720 (0.0016)
[2023-09-30 05:39:55,366][125261] Updated weights for policy 1, policy_version 14720 (0.0017)
[2023-09-30 05:39:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 7544832. Throughput: 0: 726.0, 1: 725.9. Samples: 1886256. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:39:57,493][123291] Avg episode reward: [(0, '30.640'), (1, '29.060')]
[2023-09-30 05:40:02,491][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7577600. Throughput: 0: 729.5, 1: 729.9. Samples: 1890832. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:40:02,492][123291] Avg episode reward: [(0, '30.640'), (1, '29.060')]
[2023-09-30 05:40:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 7602176. Throughput: 0: 733.2, 1: 731.8. Samples: 1899740. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:40:07,492][123291] Avg episode reward: [(0, '30.650'), (1, '29.130')]
[2023-09-30 05:40:07,499][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000014848_3801088.pth...
[2023-09-30 05:40:07,499][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000014848_3801088.pth...
[2023-09-30 05:40:07,534][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000012112_3100672.pth
[2023-09-30 05:40:07,535][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000012112_3100672.pth
[2023-09-30 05:40:07,538][125162] Saving new best policy, reward=29.130!
[2023-09-30 05:40:09,116][125261] Updated weights for policy 1, policy_version 14880 (0.0017)
[2023-09-30 05:40:09,116][125260] Updated weights for policy 0, policy_version 14880 (0.0018)
[2023-09-30 05:40:12,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7634944. Throughput: 0: 735.8, 1: 737.7. Samples: 1908736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:40:12,492][123291] Avg episode reward: [(0, '30.650'), (1, '29.130')]
[2023-09-30 05:40:17,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7667712. Throughput: 0: 733.2, 1: 733.7. Samples: 1912925. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-30 05:40:17,493][123291] Avg episode reward: [(0, '30.600'), (1, '29.170')]
[2023-09-30 05:40:17,494][125162] Saving new best policy, reward=29.170!
[2023-09-30 05:40:22,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 7692288. Throughput: 0: 741.3, 1: 740.2. Samples: 1922157. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:22,492][123291] Avg episode reward: [(0, '30.600'), (1, '29.170')]
[2023-09-30 05:40:22,952][125261] Updated weights for policy 1, policy_version 15040 (0.0016)
[2023-09-30 05:40:22,952][125260] Updated weights for policy 0, policy_version 15040 (0.0016)
[2023-09-30 05:40:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7725056. Throughput: 0: 744.0, 1: 742.9. Samples: 1931090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:27,492][123291] Avg episode reward: [(0, '30.640'), (1, '29.240')]
[2023-09-30 05:40:27,494][125162] Saving new best policy, reward=29.240!
[2023-09-30 05:40:32,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 7749632. Throughput: 0: 740.3, 1: 741.7. Samples: 1935360. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:32,492][123291] Avg episode reward: [(0, '30.640'), (1, '29.240')]
[2023-09-30 05:40:36,914][125260] Updated weights for policy 0, policy_version 15200 (0.0014)
[2023-09-30 05:40:36,914][125261] Updated weights for policy 1, policy_version 15200 (0.0019)
[2023-09-30 05:40:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7782400. Throughput: 0: 732.7, 1: 734.7. Samples: 1943760. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:37,493][123291] Avg episode reward: [(0, '30.580'), (1, '29.410')]
[2023-09-30 05:40:37,507][125162] Saving new best policy, reward=29.410!
[2023-09-30 05:40:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 7806976. Throughput: 0: 738.7, 1: 738.9. Samples: 1952747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:42,493][123291] Avg episode reward: [(0, '30.580'), (1, '29.410')]
[2023-09-30 05:40:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7839744. Throughput: 0: 737.6, 1: 736.9. Samples: 1957184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:47,493][123291] Avg episode reward: [(0, '30.580'), (1, '29.410')]
[2023-09-30 05:40:51,085][125260] Updated weights for policy 0, policy_version 15360 (0.0015)
[2023-09-30 05:40:51,085][125261] Updated weights for policy 1, policy_version 15360 (0.0015)
[2023-09-30 05:40:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 7864320. Throughput: 0: 735.5, 1: 735.1. Samples: 1965918. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:52,493][123291] Avg episode reward: [(0, '30.580'), (1, '29.510')]
[2023-09-30 05:40:52,588][125162] Saving new best policy, reward=29.510!
[2023-09-30 05:40:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7897088. Throughput: 0: 728.2, 1: 728.2. Samples: 1974272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:40:57,493][123291] Avg episode reward: [(0, '30.580'), (1, '29.510')]
[2023-09-30 05:41:02,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7929856. Throughput: 0: 731.1, 1: 731.6. Samples: 1978748. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:02,493][123291] Avg episode reward: [(0, '30.640'), (1, '29.630')]
[2023-09-30 05:41:02,494][125162] Saving new best policy, reward=29.630!
[2023-09-30 05:41:05,013][125260] Updated weights for policy 0, policy_version 15520 (0.0018)
[2023-09-30 05:41:05,014][125261] Updated weights for policy 1, policy_version 15520 (0.0015)
[2023-09-30 05:41:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7954432. Throughput: 0: 729.0, 1: 729.6. Samples: 1987793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:07,493][123291] Avg episode reward: [(0, '30.640'), (1, '29.630')]
[2023-09-30 05:41:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 7987200. Throughput: 0: 728.6, 1: 729.8. Samples: 1996716. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:12,492][123291] Avg episode reward: [(0, '30.670'), (1, '29.730')]
[2023-09-30 05:41:12,492][124965] Saving new best policy, reward=30.670!
[2023-09-30 05:41:12,493][125162] Saving new best policy, reward=29.730!
[2023-09-30 05:41:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 8011776. Throughput: 0: 728.2, 1: 728.2. Samples: 2000896. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:17,493][123291] Avg episode reward: [(0, '30.670'), (1, '29.730')]
[2023-09-30 05:41:19,119][125260] Updated weights for policy 0, policy_version 15680 (0.0015)
[2023-09-30 05:41:19,119][125261] Updated weights for policy 1, policy_version 15680 (0.0018)
[2023-09-30 05:41:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8044544. Throughput: 0: 727.9, 1: 727.5. Samples: 2009254. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:22,493][123291] Avg episode reward: [(0, '30.650'), (1, '29.810')]
[2023-09-30 05:41:22,505][125162] Saving new best policy, reward=29.810!
[2023-09-30 05:41:27,491][123291] Fps is (10 sec: 6553.8, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 8077312. Throughput: 0: 730.6, 1: 730.2. Samples: 2018483. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:27,492][123291] Avg episode reward: [(0, '30.650'), (1, '29.810')]
[2023-09-30 05:41:32,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8101888. Throughput: 0: 731.0, 1: 730.9. Samples: 2022972. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:32,493][123291] Avg episode reward: [(0, '30.650'), (1, '29.810')]
[2023-09-30 05:41:33,068][125260] Updated weights for policy 0, policy_version 15840 (0.0020)
[2023-09-30 05:41:33,068][125261] Updated weights for policy 1, policy_version 15840 (0.0018)
[2023-09-30 05:41:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5887.1). Total num frames: 8134656. Throughput: 0: 728.7, 1: 731.3. Samples: 2031618. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:37,492][123291] Avg episode reward: [(0, '30.650'), (1, '29.920')]
[2023-09-30 05:41:37,503][125162] Saving new best policy, reward=29.920!
[2023-09-30 05:41:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8159232. Throughput: 0: 735.5, 1: 734.1. Samples: 2040406. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:42,493][123291] Avg episode reward: [(0, '30.650'), (1, '29.920')]
[2023-09-30 05:41:46,850][125260] Updated weights for policy 0, policy_version 16000 (0.0018)
[2023-09-30 05:41:46,850][125261] Updated weights for policy 1, policy_version 16000 (0.0016)
[2023-09-30 05:41:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8192000. Throughput: 0: 735.5, 1: 736.5. Samples: 2044987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:47,493][123291] Avg episode reward: [(0, '30.620'), (1, '30.030')]
[2023-09-30 05:41:47,494][125162] Saving new best policy, reward=30.030!
[2023-09-30 05:41:52,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5873.2). Total num frames: 8220672. Throughput: 0: 735.3, 1: 736.6. Samples: 2054027. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:41:52,493][123291] Avg episode reward: [(0, '30.620'), (1, '30.030')]
[2023-09-30 05:41:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8249344. Throughput: 0: 729.2, 1: 729.8. Samples: 2062375. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:41:57,493][123291] Avg episode reward: [(0, '30.540'), (1, '30.110')]
[2023-09-30 05:41:57,494][125162] Saving new best policy, reward=30.110!
[2023-09-30 05:42:00,932][125261] Updated weights for policy 1, policy_version 16160 (0.0020)
[2023-09-30 05:42:00,932][125260] Updated weights for policy 0, policy_version 16160 (0.0018)
[2023-09-30 05:42:02,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 8282112. Throughput: 0: 731.3, 1: 729.8. Samples: 2066648. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:42:02,493][123291] Avg episode reward: [(0, '30.540'), (1, '30.110')]
[2023-09-30 05:42:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8306688. Throughput: 0: 739.5, 1: 739.6. Samples: 2075814. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:42:07,493][123291] Avg episode reward: [(0, '30.460'), (1, '30.220')]
[2023-09-30 05:42:07,506][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000016224_4153344.pth...
[2023-09-30 05:42:07,506][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000016224_4153344.pth...
[2023-09-30 05:42:07,534][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000013488_3452928.pth
[2023-09-30 05:42:07,537][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000013488_3452928.pth
[2023-09-30 05:42:07,538][125162] Saving new best policy, reward=30.220!
[2023-09-30 05:42:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8339456. Throughput: 0: 736.7, 1: 737.8. Samples: 2084838. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:42:12,493][123291] Avg episode reward: [(0, '30.460'), (1, '30.220')]
[2023-09-30 05:42:14,767][125261] Updated weights for policy 1, policy_version 16320 (0.0017)
[2023-09-30 05:42:14,768][125260] Updated weights for policy 0, policy_version 16320 (0.0017)
[2023-09-30 05:42:17,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 8364032. Throughput: 0: 732.3, 1: 734.1. Samples: 2088961. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:42:17,492][123291] Avg episode reward: [(0, '30.460'), (1, '30.220')]
[2023-09-30 05:42:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8396800. Throughput: 0: 733.8, 1: 732.2. Samples: 2097588. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:42:22,493][123291] Avg episode reward: [(0, '30.420'), (1, '30.330')]
[2023-09-30 05:42:22,504][125162] Saving new best policy, reward=30.330!
[2023-09-30 05:42:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 8421376. Throughput: 0: 728.6, 1: 728.4. Samples: 2105975. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:42:27,493][123291] Avg episode reward: [(0, '30.420'), (1, '30.330')]
[2023-09-30 05:42:29,151][125261] Updated weights for policy 1, policy_version 16480 (0.0015)
[2023-09-30 05:42:29,151][125260] Updated weights for policy 0, policy_version 16480 (0.0016)
[2023-09-30 05:42:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8454144. Throughput: 0: 727.5, 1: 726.3. Samples: 2110406. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:42:32,492][123291] Avg episode reward: [(0, '30.380'), (1, '30.410')]
[2023-09-30 05:42:32,493][125162] Saving new best policy, reward=30.410!
[2023-09-30 05:42:37,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8486912. Throughput: 0: 729.7, 1: 728.1. Samples: 2119627. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:42:37,492][123291] Avg episode reward: [(0, '30.380'), (1, '30.410')]
[2023-09-30 05:42:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8511488. Throughput: 0: 727.7, 1: 728.2. Samples: 2127887. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:42:42,493][123291] Avg episode reward: [(0, '30.330'), (1, '30.470')]
[2023-09-30 05:42:42,494][125162] Saving new best policy, reward=30.470!
[2023-09-30 05:42:43,044][125260] Updated weights for policy 0, policy_version 16640 (0.0019)
[2023-09-30 05:42:43,044][125261] Updated weights for policy 1, policy_version 16640 (0.0019)
[2023-09-30 05:42:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8544256. Throughput: 0: 728.2, 1: 729.2. Samples: 2132230. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:42:47,493][123291] Avg episode reward: [(0, '30.330'), (1, '30.470')]
[2023-09-30 05:42:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5802.7, 300 sec: 5859.4). Total num frames: 8568832. Throughput: 0: 727.7, 1: 726.9. Samples: 2141272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:42:52,492][123291] Avg episode reward: [(0, '30.370'), (1, '30.540')]
[2023-09-30 05:42:52,500][125162] Saving new best policy, reward=30.540!
[2023-09-30 05:42:57,217][125260] Updated weights for policy 0, policy_version 16800 (0.0016)
[2023-09-30 05:42:57,219][125261] Updated weights for policy 1, policy_version 16800 (0.0016)
[2023-09-30 05:42:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8601600. Throughput: 0: 725.2, 1: 723.5. Samples: 2150030. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:42:57,493][123291] Avg episode reward: [(0, '30.370'), (1, '30.540')]
[2023-09-30 05:43:02,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 8626176. Throughput: 0: 727.3, 1: 727.4. Samples: 2154423. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:43:02,492][123291] Avg episode reward: [(0, '30.370'), (1, '30.540')]
[2023-09-30 05:43:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8658944. Throughput: 0: 727.1, 1: 727.1. Samples: 2163029. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:43:07,493][123291] Avg episode reward: [(0, '30.360'), (1, '30.600')]
[2023-09-30 05:43:07,505][125162] Saving new best policy, reward=30.600!
[2023-09-30 05:43:11,199][125260] Updated weights for policy 0, policy_version 16960 (0.0018)
[2023-09-30 05:43:11,200][125261] Updated weights for policy 1, policy_version 16960 (0.0018)
[2023-09-30 05:43:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 8683520. Throughput: 0: 729.8, 1: 731.2. Samples: 2171722. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:43:12,493][123291] Avg episode reward: [(0, '30.360'), (1, '30.600')]
[2023-09-30 05:43:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8716288. Throughput: 0: 729.0, 1: 729.3. Samples: 2176031. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:43:17,493][123291] Avg episode reward: [(0, '30.360'), (1, '30.650')]
[2023-09-30 05:43:17,494][125162] Saving new best policy, reward=30.650!
[2023-09-30 05:43:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 8740864. Throughput: 0: 726.6, 1: 726.0. Samples: 2184992. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:43:22,493][123291] Avg episode reward: [(0, '30.360'), (1, '30.650')]
[2023-09-30 05:43:25,413][125261] Updated weights for policy 1, policy_version 17120 (0.0018)
[2023-09-30 05:43:25,413][125260] Updated weights for policy 0, policy_version 17120 (0.0015)
[2023-09-30 05:43:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 8773632. Throughput: 0: 727.9, 1: 728.1. Samples: 2193408. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:43:27,493][123291] Avg episode reward: [(0, '30.350'), (1, '30.690')]
[2023-09-30 05:43:27,494][125162] Saving new best policy, reward=30.690!
[2023-09-30 05:43:32,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5845.5). Total num frames: 8802304. Throughput: 0: 726.6, 1: 726.2. Samples: 2197608. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:43:32,493][123291] Avg episode reward: [(0, '30.350'), (1, '30.690')]
[2023-09-30 05:43:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 8830976. Throughput: 0: 725.2, 1: 725.5. Samples: 2206551. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:43:37,492][123291] Avg episode reward: [(0, '30.310'), (1, '30.710')]
[2023-09-30 05:43:37,500][125162] Saving new best policy, reward=30.710!
[2023-09-30 05:43:39,344][125260] Updated weights for policy 0, policy_version 17280 (0.0018)
[2023-09-30 05:43:39,344][125261] Updated weights for policy 1, policy_version 17280 (0.0017)
[2023-09-30 05:43:42,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8863744. Throughput: 0: 729.5, 1: 729.4. Samples: 2215681. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:43:42,493][123291] Avg episode reward: [(0, '30.310'), (1, '30.710')]
[2023-09-30 05:43:47,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 8896512. Throughput: 0: 729.2, 1: 729.1. Samples: 2220043. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:43:47,493][123291] Avg episode reward: [(0, '30.310'), (1, '30.720')]
[2023-09-30 05:43:47,494][125162] Saving new best policy, reward=30.720!
[2023-09-30 05:43:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 8921088. Throughput: 0: 734.0, 1: 733.5. Samples: 2229070. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:43:52,493][123291] Avg episode reward: [(0, '30.330'), (1, '30.780')]
[2023-09-30 05:43:52,504][125162] Saving new best policy, reward=30.780!
[2023-09-30 05:43:53,040][125260] Updated weights for policy 0, policy_version 17440 (0.0016)
[2023-09-30 05:43:53,040][125261] Updated weights for policy 1, policy_version 17440 (0.0017)
[2023-09-30 05:43:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 8953856. Throughput: 0: 740.0, 1: 736.6. Samples: 2238170. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:43:57,492][123291] Avg episode reward: [(0, '30.330'), (1, '30.780')]
[2023-09-30 05:44:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 8978432. Throughput: 0: 738.8, 1: 738.7. Samples: 2242521. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:02,492][123291] Avg episode reward: [(0, '30.350'), (1, '30.830')]
[2023-09-30 05:44:02,493][125162] Saving new best policy, reward=30.830!
[2023-09-30 05:44:06,986][125260] Updated weights for policy 0, policy_version 17600 (0.0017)
[2023-09-30 05:44:06,986][125261] Updated weights for policy 1, policy_version 17600 (0.0017)
[2023-09-30 05:44:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9011200. Throughput: 0: 732.3, 1: 732.6. Samples: 2250910. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:07,493][123291] Avg episode reward: [(0, '30.350'), (1, '30.830')]
[2023-09-30 05:44:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000017600_4505600.pth...
[2023-09-30 05:44:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000017600_4505600.pth...
[2023-09-30 05:44:07,535][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000014848_3801088.pth
[2023-09-30 05:44:07,539][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000014848_3801088.pth
[2023-09-30 05:44:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 9035776. Throughput: 0: 738.4, 1: 736.5. Samples: 2259780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:12,492][123291] Avg episode reward: [(0, '30.380'), (1, '30.870')]
[2023-09-30 05:44:12,634][125162] Saving new best policy, reward=30.870!
[2023-09-30 05:44:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9068544. Throughput: 0: 741.4, 1: 741.1. Samples: 2264319. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:17,493][123291] Avg episode reward: [(0, '30.380'), (1, '30.870')]
[2023-09-30 05:44:20,976][125261] Updated weights for policy 1, policy_version 17760 (0.0017)
[2023-09-30 05:44:20,976][125260] Updated weights for policy 0, policy_version 17760 (0.0018)
[2023-09-30 05:44:22,492][123291] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 9101312. Throughput: 0: 740.7, 1: 740.9. Samples: 2273225. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:22,493][123291] Avg episode reward: [(0, '30.350'), (1, '30.860')]
[2023-09-30 05:44:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9125888. Throughput: 0: 735.4, 1: 736.1. Samples: 2281897. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:27,493][123291] Avg episode reward: [(0, '30.350'), (1, '30.860')]
[2023-09-30 05:44:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5939.2, 300 sec: 5859.4). Total num frames: 9158656. Throughput: 0: 736.4, 1: 735.8. Samples: 2286288. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:44:32,493][123291] Avg episode reward: [(0, '30.360'), (1, '30.850')]
[2023-09-30 05:44:34,919][125260] Updated weights for policy 0, policy_version 17920 (0.0017)
[2023-09-30 05:44:34,919][125261] Updated weights for policy 1, policy_version 17920 (0.0015)
[2023-09-30 05:44:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9183232. Throughput: 0: 735.8, 1: 735.6. Samples: 2295280. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:44:37,492][123291] Avg episode reward: [(0, '30.360'), (1, '30.850')]
[2023-09-30 05:44:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9216000. Throughput: 0: 729.6, 1: 733.3. Samples: 2304000. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:44:42,493][123291] Avg episode reward: [(0, '30.360'), (1, '30.850')]
[2023-09-30 05:44:47,491][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 9248768. Throughput: 0: 731.8, 1: 731.9. Samples: 2308386. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:44:47,492][123291] Avg episode reward: [(0, '30.410'), (1, '30.840')]
[2023-09-30 05:44:48,755][125261] Updated weights for policy 1, policy_version 18080 (0.0018)
[2023-09-30 05:44:48,755][125260] Updated weights for policy 0, policy_version 18080 (0.0016)
[2023-09-30 05:44:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9273344. Throughput: 0: 734.5, 1: 734.8. Samples: 2317028. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:44:52,493][123291] Avg episode reward: [(0, '30.410'), (1, '30.840')]
[2023-09-30 05:44:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9306112. Throughput: 0: 735.5, 1: 736.6. Samples: 2326026. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:44:57,493][123291] Avg episode reward: [(0, '30.490'), (1, '30.850')]
[2023-09-30 05:45:02,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9330688. Throughput: 0: 736.1, 1: 737.4. Samples: 2330624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:02,493][123291] Avg episode reward: [(0, '30.490'), (1, '30.850')]
[2023-09-30 05:45:02,652][125260] Updated weights for policy 0, policy_version 18240 (0.0016)
[2023-09-30 05:45:02,652][125261] Updated weights for policy 1, policy_version 18240 (0.0015)
[2023-09-30 05:45:07,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9363456. Throughput: 0: 733.0, 1: 733.0. Samples: 2339195. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:07,492][123291] Avg episode reward: [(0, '30.500'), (1, '30.900')]
[2023-09-30 05:45:07,502][125162] Saving new best policy, reward=30.900!
[2023-09-30 05:45:12,491][123291] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 9396224. Throughput: 0: 737.0, 1: 736.6. Samples: 2348208. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:12,492][123291] Avg episode reward: [(0, '30.500'), (1, '30.900')]
[2023-09-30 05:45:16,478][125260] Updated weights for policy 0, policy_version 18400 (0.0016)
[2023-09-30 05:45:16,479][125261] Updated weights for policy 1, policy_version 18400 (0.0016)
[2023-09-30 05:45:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 9420800. Throughput: 0: 739.7, 1: 738.5. Samples: 2352806. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:17,492][123291] Avg episode reward: [(0, '30.550'), (1, '30.880')]
[2023-09-30 05:45:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9453568. Throughput: 0: 733.4, 1: 735.3. Samples: 2361375. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:22,493][123291] Avg episode reward: [(0, '30.550'), (1, '30.880')]
[2023-09-30 05:45:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 9478144. Throughput: 0: 735.0, 1: 734.0. Samples: 2370105. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:27,492][123291] Avg episode reward: [(0, '30.570'), (1, '30.890')]
[2023-09-30 05:45:30,833][125260] Updated weights for policy 0, policy_version 18560 (0.0017)
[2023-09-30 05:45:30,833][125261] Updated weights for policy 1, policy_version 18560 (0.0016)
[2023-09-30 05:45:32,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 9510912. Throughput: 0: 731.6, 1: 730.5. Samples: 2374181. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:32,492][123291] Avg episode reward: [(0, '30.580'), (1, '30.870')]
[2023-09-30 05:45:37,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9535488. Throughput: 0: 732.8, 1: 732.0. Samples: 2382947. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:37,493][123291] Avg episode reward: [(0, '30.580'), (1, '30.870')]
[2023-09-30 05:45:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9568256. Throughput: 0: 733.2, 1: 731.8. Samples: 2391955. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:42,493][123291] Avg episode reward: [(0, '30.530'), (1, '30.820')]
[2023-09-30 05:45:44,788][125261] Updated weights for policy 1, policy_version 18720 (0.0018)
[2023-09-30 05:45:44,788][125260] Updated weights for policy 0, policy_version 18720 (0.0017)
[2023-09-30 05:45:47,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 9592832. Throughput: 0: 728.2, 1: 728.2. Samples: 2396162. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:47,492][123291] Avg episode reward: [(0, '30.530'), (1, '30.820')]
[2023-09-30 05:45:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9625600. Throughput: 0: 730.6, 1: 731.6. Samples: 2404992. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:52,493][123291] Avg episode reward: [(0, '30.590'), (1, '30.740')]
[2023-09-30 05:45:57,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5845.5). Total num frames: 9654272. Throughput: 0: 726.9, 1: 728.0. Samples: 2413681. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:45:57,493][123291] Avg episode reward: [(0, '30.590'), (1, '30.740')]
[2023-09-30 05:45:58,999][125261] Updated weights for policy 1, policy_version 18880 (0.0018)
[2023-09-30 05:45:58,999][125260] Updated weights for policy 0, policy_version 18880 (0.0018)
[2023-09-30 05:46:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9682944. Throughput: 0: 722.6, 1: 724.0. Samples: 2417900. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:02,492][123291] Avg episode reward: [(0, '30.620'), (1, '30.730')]
[2023-09-30 05:46:07,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9715712. Throughput: 0: 727.6, 1: 728.1. Samples: 2426880. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:07,493][123291] Avg episode reward: [(0, '30.620'), (1, '30.730')]
[2023-09-30 05:46:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000018976_4857856.pth...
[2023-09-30 05:46:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000018976_4857856.pth...
[2023-09-30 05:46:07,537][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000016224_4153344.pth
[2023-09-30 05:46:07,538][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000016224_4153344.pth
[2023-09-30 05:46:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 9740288. Throughput: 0: 726.7, 1: 725.9. Samples: 2435474. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:12,493][123291] Avg episode reward: [(0, '30.650'), (1, '30.740')]
[2023-09-30 05:46:12,881][125260] Updated weights for policy 0, policy_version 19040 (0.0015)
[2023-09-30 05:46:12,882][125261] Updated weights for policy 1, policy_version 19040 (0.0017)
[2023-09-30 05:46:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9773056. Throughput: 0: 731.3, 1: 733.0. Samples: 2440074. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:17,493][123291] Avg episode reward: [(0, '30.690'), (1, '30.720')]
[2023-09-30 05:46:17,494][124965] Saving new best policy, reward=30.690!
[2023-09-30 05:46:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 9797632. Throughput: 0: 732.0, 1: 733.1. Samples: 2448874. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:22,493][123291] Avg episode reward: [(0, '30.690'), (1, '30.720')]
[2023-09-30 05:46:26,849][125260] Updated weights for policy 0, policy_version 19200 (0.0014)
[2023-09-30 05:46:26,850][125261] Updated weights for policy 1, policy_version 19200 (0.0018)
[2023-09-30 05:46:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9830400. Throughput: 0: 728.3, 1: 730.5. Samples: 2457600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:27,493][123291] Avg episode reward: [(0, '30.740'), (1, '30.720')]
[2023-09-30 05:46:27,494][124965] Saving new best policy, reward=30.740!
[2023-09-30 05:46:32,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9863168. Throughput: 0: 731.4, 1: 730.4. Samples: 2461946. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:32,493][123291] Avg episode reward: [(0, '30.740'), (1, '30.720')]
[2023-09-30 05:46:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 9887744. Throughput: 0: 731.6, 1: 729.9. Samples: 2470762. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:37,492][123291] Avg episode reward: [(0, '30.820'), (1, '30.730')]
[2023-09-30 05:46:37,500][124965] Saving new best policy, reward=30.820!
[2023-09-30 05:46:40,847][125260] Updated weights for policy 0, policy_version 19360 (0.0016)
[2023-09-30 05:46:40,847][125261] Updated weights for policy 1, policy_version 19360 (0.0017)
[2023-09-30 05:46:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9920512. Throughput: 0: 733.1, 1: 733.2. Samples: 2479665. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:46:42,492][123291] Avg episode reward: [(0, '30.820'), (1, '30.730')]
[2023-09-30 05:46:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 9945088. Throughput: 0: 734.2, 1: 735.0. Samples: 2484018. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:46:47,493][123291] Avg episode reward: [(0, '30.910'), (1, '30.740')]
[2023-09-30 05:46:47,494][124965] Saving new best policy, reward=30.910!
[2023-09-30 05:46:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 9977856. Throughput: 0: 730.8, 1: 729.6. Samples: 2492595. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:46:52,493][123291] Avg episode reward: [(0, '30.910'), (1, '30.740')]
[2023-09-30 05:46:54,724][125261] Updated weights for policy 1, policy_version 19520 (0.0019)
[2023-09-30 05:46:54,724][125260] Updated weights for policy 0, policy_version 19520 (0.0018)
[2023-09-30 05:46:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5831.6). Total num frames: 10002432. Throughput: 0: 734.8, 1: 734.9. Samples: 2501608. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:46:57,493][123291] Avg episode reward: [(0, '30.920'), (1, '30.690')]
[2023-09-30 05:46:57,518][124965] Saving new best policy, reward=30.920!
[2023-09-30 05:47:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10035200. Throughput: 0: 730.2, 1: 728.6. Samples: 2505719. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:47:02,493][123291] Avg episode reward: [(0, '30.920'), (1, '30.690')]
[2023-09-30 05:47:07,492][123291] Fps is (10 sec: 6553.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10067968. Throughput: 0: 733.8, 1: 733.2. Samples: 2514890. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:47:07,493][123291] Avg episode reward: [(0, '30.920'), (1, '30.690')]
[2023-09-30 05:47:08,808][125260] Updated weights for policy 0, policy_version 19680 (0.0018)
[2023-09-30 05:47:08,808][125261] Updated weights for policy 1, policy_version 19680 (0.0018)
[2023-09-30 05:47:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10092544. Throughput: 0: 729.2, 1: 728.3. Samples: 2523189. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:47:12,493][123291] Avg episode reward: [(0, '30.900'), (1, '30.700')]
[2023-09-30 05:47:17,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10125312. Throughput: 0: 731.4, 1: 731.2. Samples: 2527760. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:47:17,492][123291] Avg episode reward: [(0, '30.900'), (1, '30.700')]
[2023-09-30 05:47:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10149888. Throughput: 0: 730.2, 1: 731.9. Samples: 2536556. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:47:22,493][123291] Avg episode reward: [(0, '30.950'), (1, '30.740')]
[2023-09-30 05:47:22,504][124965] Saving new best policy, reward=30.950!
[2023-09-30 05:47:22,854][125260] Updated weights for policy 0, policy_version 19840 (0.0017)
[2023-09-30 05:47:22,854][125261] Updated weights for policy 1, policy_version 19840 (0.0016)
[2023-09-30 05:47:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10182656. Throughput: 0: 731.7, 1: 731.0. Samples: 2545485. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:47:27,493][123291] Avg episode reward: [(0, '30.950'), (1, '30.740')]
[2023-09-30 05:47:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 10207232. Throughput: 0: 730.6, 1: 730.3. Samples: 2549760. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:47:32,492][123291] Avg episode reward: [(0, '30.900'), (1, '30.780')]
[2023-09-30 05:47:36,880][125261] Updated weights for policy 1, policy_version 20000 (0.0017)
[2023-09-30 05:47:36,881][125260] Updated weights for policy 0, policy_version 20000 (0.0019)
[2023-09-30 05:47:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10240000. Throughput: 0: 730.3, 1: 730.9. Samples: 2558351. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:47:37,493][123291] Avg episode reward: [(0, '30.900'), (1, '30.780')]
[2023-09-30 05:47:42,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.7, 300 sec: 5845.5). Total num frames: 10268672. Throughput: 0: 728.6, 1: 730.2. Samples: 2567251. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:47:42,493][123291] Avg episode reward: [(0, '30.880'), (1, '30.840')]
[2023-09-30 05:47:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10297344. Throughput: 0: 733.1, 1: 734.0. Samples: 2571737. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:47:47,493][123291] Avg episode reward: [(0, '30.880'), (1, '30.840')]
[2023-09-30 05:47:50,757][125260] Updated weights for policy 0, policy_version 20160 (0.0018)
[2023-09-30 05:47:50,758][125261] Updated weights for policy 1, policy_version 20160 (0.0019)
[2023-09-30 05:47:52,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10330112. Throughput: 0: 728.3, 1: 729.4. Samples: 2580484. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:47:52,493][123291] Avg episode reward: [(0, '30.880'), (1, '30.840')]
[2023-09-30 05:47:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10354688. Throughput: 0: 732.5, 1: 732.5. Samples: 2589116. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:47:57,493][123291] Avg episode reward: [(0, '30.920'), (1, '30.880')]
[2023-09-30 05:48:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10387456. Throughput: 0: 732.6, 1: 733.1. Samples: 2593714. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:48:02,493][123291] Avg episode reward: [(0, '30.920'), (1, '30.880')]
[2023-09-30 05:48:04,780][125261] Updated weights for policy 1, policy_version 20320 (0.0017)
[2023-09-30 05:48:04,780][125260] Updated weights for policy 0, policy_version 20320 (0.0016)
[2023-09-30 05:48:07,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.7, 300 sec: 5873.2). Total num frames: 10416128. Throughput: 0: 735.5, 1: 734.0. Samples: 2602687. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:48:07,493][123291] Avg episode reward: [(0, '30.960'), (1, '30.890')]
[2023-09-30 05:48:07,505][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000020352_5210112.pth...
[2023-09-30 05:48:07,505][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000020352_5210112.pth...
[2023-09-30 05:48:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000017600_4505600.pth
[2023-09-30 05:48:07,550][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000017600_4505600.pth
[2023-09-30 05:48:07,556][124965] Saving new best policy, reward=30.960!
[2023-09-30 05:48:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 10444800. Throughput: 0: 731.6, 1: 731.5. Samples: 2611323. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:48:12,492][123291] Avg episode reward: [(0, '30.960'), (1, '30.890')]
[2023-09-30 05:48:17,491][123291] Fps is (10 sec: 6144.2, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 10477568. Throughput: 0: 735.2, 1: 734.6. Samples: 2615903. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:48:17,492][123291] Avg episode reward: [(0, '31.000'), (1, '30.940')]
[2023-09-30 05:48:17,493][124965] Saving new best policy, reward=31.000!
[2023-09-30 05:48:17,493][125162] Saving new best policy, reward=30.940!
[2023-09-30 05:48:18,472][125260] Updated weights for policy 0, policy_version 20480 (0.0015)
[2023-09-30 05:48:18,473][125261] Updated weights for policy 1, policy_version 20480 (0.0015)
[2023-09-30 05:48:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10502144. Throughput: 0: 740.3, 1: 739.8. Samples: 2624957. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:48:22,493][123291] Avg episode reward: [(0, '31.000'), (1, '30.940')]
[2023-09-30 05:48:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5873.3). Total num frames: 10534912. Throughput: 0: 738.6, 1: 738.6. Samples: 2633726. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:48:27,492][123291] Avg episode reward: [(0, '31.020'), (1, '30.950')]
[2023-09-30 05:48:27,493][125162] Saving new best policy, reward=30.950!
[2023-09-30 05:48:27,493][124965] Saving new best policy, reward=31.020!
[2023-09-30 05:48:32,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5939.2, 300 sec: 5873.2). Total num frames: 10563584. Throughput: 0: 734.3, 1: 734.8. Samples: 2637845. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:48:32,492][123291] Avg episode reward: [(0, '31.020'), (1, '30.950')]
[2023-09-30 05:48:32,513][125260] Updated weights for policy 0, policy_version 20640 (0.0017)
[2023-09-30 05:48:32,514][125261] Updated weights for policy 1, policy_version 20640 (0.0016)
[2023-09-30 05:48:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10592256. Throughput: 0: 737.5, 1: 737.2. Samples: 2646847. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:48:37,493][123291] Avg episode reward: [(0, '31.020'), (1, '30.940')]
[2023-09-30 05:48:42,492][123291] Fps is (10 sec: 6143.8, 60 sec: 5939.2, 300 sec: 5859.4). Total num frames: 10625024. Throughput: 0: 740.9, 1: 738.8. Samples: 2655702. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:48:42,493][123291] Avg episode reward: [(0, '31.040'), (1, '30.940')]
[2023-09-30 05:48:42,494][124965] Saving new best policy, reward=31.040!
[2023-09-30 05:48:46,458][125260] Updated weights for policy 0, policy_version 20800 (0.0015)
[2023-09-30 05:48:46,458][125261] Updated weights for policy 1, policy_version 20800 (0.0017)
[2023-09-30 05:48:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10649600. Throughput: 0: 739.2, 1: 738.1. Samples: 2660195. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:48:47,493][123291] Avg episode reward: [(0, '31.040'), (1, '30.940')]
[2023-09-30 05:48:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10682368. Throughput: 0: 731.5, 1: 732.7. Samples: 2668574. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:48:52,493][123291] Avg episode reward: [(0, '31.060'), (1, '30.980')]
[2023-09-30 05:48:52,504][124965] Saving new best policy, reward=31.060!
[2023-09-30 05:48:52,504][125162] Saving new best policy, reward=30.980!
[2023-09-30 05:48:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10706944. Throughput: 0: 734.2, 1: 734.4. Samples: 2677407. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:48:57,493][123291] Avg episode reward: [(0, '31.060'), (1, '30.980')]
[2023-09-30 05:49:00,562][125260] Updated weights for policy 0, policy_version 20960 (0.0017)
[2023-09-30 05:49:00,563][125261] Updated weights for policy 1, policy_version 20960 (0.0017)
[2023-09-30 05:49:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10739712. Throughput: 0: 733.3, 1: 733.0. Samples: 2681888. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:49:02,493][123291] Avg episode reward: [(0, '31.090'), (1, '31.030')]
[2023-09-30 05:49:02,494][124965] Saving new best policy, reward=31.090!
[2023-09-30 05:49:02,494][125162] Saving new best policy, reward=31.030!
[2023-09-30 05:49:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5802.7, 300 sec: 5859.4). Total num frames: 10764288. Throughput: 0: 727.6, 1: 726.4. Samples: 2690387. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:49:07,492][123291] Avg episode reward: [(0, '31.090'), (1, '31.030')]
[2023-09-30 05:49:12,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10797056. Throughput: 0: 728.3, 1: 728.2. Samples: 2699268. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:49:12,492][123291] Avg episode reward: [(0, '31.100'), (1, '31.060')]
[2023-09-30 05:49:12,493][125162] Saving new best policy, reward=31.060!
[2023-09-30 05:49:12,493][124965] Saving new best policy, reward=31.100!
[2023-09-30 05:49:14,501][125260] Updated weights for policy 0, policy_version 21120 (0.0016)
[2023-09-30 05:49:14,502][125261] Updated weights for policy 1, policy_version 21120 (0.0013)
[2023-09-30 05:49:17,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10829824. Throughput: 0: 731.9, 1: 731.4. Samples: 2703696. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:49:17,492][123291] Avg episode reward: [(0, '31.100'), (1, '31.060')]
[2023-09-30 05:49:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10854400. Throughput: 0: 729.8, 1: 729.2. Samples: 2712505. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:49:22,493][123291] Avg episode reward: [(0, '31.080'), (1, '31.100')]
[2023-09-30 05:49:22,503][125162] Saving new best policy, reward=31.100!
[2023-09-30 05:49:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10887168. Throughput: 0: 731.4, 1: 733.3. Samples: 2721616. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:49:27,493][123291] Avg episode reward: [(0, '31.090'), (1, '31.100')]
[2023-09-30 05:49:28,385][125261] Updated weights for policy 1, policy_version 21280 (0.0018)
[2023-09-30 05:49:28,385][125260] Updated weights for policy 0, policy_version 21280 (0.0018)
[2023-09-30 05:49:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5859.4). Total num frames: 10911744. Throughput: 0: 729.0, 1: 730.8. Samples: 2725888. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:49:32,493][123291] Avg episode reward: [(0, '31.090'), (1, '31.100')]
[2023-09-30 05:49:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 10944512. Throughput: 0: 733.8, 1: 733.4. Samples: 2734597. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:49:37,492][123291] Avg episode reward: [(0, '31.140'), (1, '31.120')]
[2023-09-30 05:49:37,502][125162] Saving new best policy, reward=31.120!
[2023-09-30 05:49:37,502][124965] Saving new best policy, reward=31.140!
[2023-09-30 05:49:42,416][125260] Updated weights for policy 0, policy_version 21440 (0.0019)
[2023-09-30 05:49:42,417][125261] Updated weights for policy 1, policy_version 21440 (0.0020)
[2023-09-30 05:49:42,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 10977280. Throughput: 0: 733.6, 1: 733.4. Samples: 2743422. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 05:49:42,493][123291] Avg episode reward: [(0, '31.140'), (1, '31.120')]
[2023-09-30 05:49:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11001856. Throughput: 0: 735.1, 1: 734.4. Samples: 2748015. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:49:47,493][123291] Avg episode reward: [(0, '31.150'), (1, '31.160')]
[2023-09-30 05:49:47,494][124965] Saving new best policy, reward=31.150!
[2023-09-30 05:49:47,494][125162] Saving new best policy, reward=31.160!
[2023-09-30 05:49:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11034624. Throughput: 0: 734.7, 1: 736.9. Samples: 2756609. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:49:52,492][123291] Avg episode reward: [(0, '31.150'), (1, '31.160')]
[2023-09-30 05:49:56,387][125260] Updated weights for policy 0, policy_version 21600 (0.0016)
[2023-09-30 05:49:56,387][125261] Updated weights for policy 1, policy_version 21600 (0.0017)
[2023-09-30 05:49:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11059200. Throughput: 0: 735.2, 1: 734.3. Samples: 2765396. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:49:57,493][123291] Avg episode reward: [(0, '31.150'), (1, '31.200')]
[2023-09-30 05:49:57,494][125162] Saving new best policy, reward=31.200!
[2023-09-30 05:50:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11091968. Throughput: 0: 736.4, 1: 734.5. Samples: 2769888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:50:02,493][123291] Avg episode reward: [(0, '31.150'), (1, '31.200')]
[2023-09-30 05:50:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 11116544. Throughput: 0: 736.4, 1: 734.0. Samples: 2778670. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:50:07,493][123291] Avg episode reward: [(0, '31.100'), (1, '31.180')]
[2023-09-30 05:50:07,505][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000021712_5558272.pth...
[2023-09-30 05:50:07,538][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000018976_4857856.pth
[2023-09-30 05:50:07,668][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000021728_5562368.pth...
[2023-09-30 05:50:07,695][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000018976_4857856.pth
[2023-09-30 05:50:10,603][125260] Updated weights for policy 0, policy_version 21760 (0.0015)
[2023-09-30 05:50:10,603][125261] Updated weights for policy 1, policy_version 21760 (0.0016)
[2023-09-30 05:50:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11149312. Throughput: 0: 729.5, 1: 730.3. Samples: 2787307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:50:12,492][123291] Avg episode reward: [(0, '31.100'), (1, '31.180')]
[2023-09-30 05:50:17,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11182080. Throughput: 0: 729.0, 1: 728.3. Samples: 2791464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:50:17,492][123291] Avg episode reward: [(0, '31.100'), (1, '31.180')]
[2023-09-30 05:50:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11206656. Throughput: 0: 733.5, 1: 732.1. Samples: 2800546. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:50:22,493][123291] Avg episode reward: [(0, '31.110'), (1, '31.220')]
[2023-09-30 05:50:22,503][125162] Saving new best policy, reward=31.220!
[2023-09-30 05:50:24,511][125260] Updated weights for policy 0, policy_version 21920 (0.0017)
[2023-09-30 05:50:24,511][125261] Updated weights for policy 1, policy_version 21920 (0.0017)
[2023-09-30 05:50:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11239424. Throughput: 0: 731.4, 1: 731.1. Samples: 2809235. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:50:27,493][123291] Avg episode reward: [(0, '31.110'), (1, '31.220')]
[2023-09-30 05:50:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11264000. Throughput: 0: 729.9, 1: 730.5. Samples: 2813730. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:50:32,493][123291] Avg episode reward: [(0, '31.100'), (1, '31.280')]
[2023-09-30 05:50:32,494][125162] Saving new best policy, reward=31.280!
[2023-09-30 05:50:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11296768. Throughput: 0: 728.2, 1: 728.2. Samples: 2822144. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:50:37,492][123291] Avg episode reward: [(0, '31.100'), (1, '31.280')]
[2023-09-30 05:50:38,553][125261] Updated weights for policy 1, policy_version 22080 (0.0015)
[2023-09-30 05:50:38,553][125260] Updated weights for policy 0, policy_version 22080 (0.0016)
[2023-09-30 05:50:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 11321344. Throughput: 0: 727.2, 1: 727.6. Samples: 2830859. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:50:42,493][123291] Avg episode reward: [(0, '31.090'), (1, '31.290')]
[2023-09-30 05:50:42,494][125162] Saving new best policy, reward=31.290!
[2023-09-30 05:50:47,491][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11354112. Throughput: 0: 727.3, 1: 729.6. Samples: 2835450. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:50:47,492][123291] Avg episode reward: [(0, '31.090'), (1, '31.290')]
[2023-09-30 05:50:52,374][125261] Updated weights for policy 1, policy_version 22240 (0.0017)
[2023-09-30 05:50:52,374][125260] Updated weights for policy 0, policy_version 22240 (0.0017)
[2023-09-30 05:50:52,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5873.2). Total num frames: 11386880. Throughput: 0: 731.2, 1: 734.2. Samples: 2844614. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:50:52,493][123291] Avg episode reward: [(0, '31.010'), (1, '31.330')]
[2023-09-30 05:50:52,502][125162] Saving new best policy, reward=31.330!
[2023-09-30 05:50:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11411456. Throughput: 0: 731.1, 1: 730.9. Samples: 2853094. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 05:50:57,493][123291] Avg episode reward: [(0, '31.010'), (1, '31.330')]
[2023-09-30 05:51:02,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11444224. Throughput: 0: 734.9, 1: 734.5. Samples: 2857590. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:51:02,493][123291] Avg episode reward: [(0, '31.010'), (1, '31.330')]
[2023-09-30 05:51:06,383][125260] Updated weights for policy 0, policy_version 22400 (0.0018)
[2023-09-30 05:51:06,383][125261] Updated weights for policy 1, policy_version 22400 (0.0017)
[2023-09-30 05:51:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 11468800. Throughput: 0: 732.4, 1: 734.7. Samples: 2866562. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:51:07,492][123291] Avg episode reward: [(0, '31.010'), (1, '31.350')]
[2023-09-30 05:51:07,503][125162] Saving new best policy, reward=31.350!
[2023-09-30 05:51:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11501568. Throughput: 0: 734.4, 1: 735.7. Samples: 2875392. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-30 05:51:12,492][123291] Avg episode reward: [(0, '31.010'), (1, '31.350')]
[2023-09-30 05:51:17,491][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5873.3). Total num frames: 11530240. Throughput: 0: 731.6, 1: 731.5. Samples: 2879569. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:51:17,492][123291] Avg episode reward: [(0, '30.950'), (1, '31.370')]
[2023-09-30 05:51:17,587][125162] Saving new best policy, reward=31.370!
[2023-09-30 05:51:20,336][125260] Updated weights for policy 0, policy_version 22560 (0.0017)
[2023-09-30 05:51:20,336][125261] Updated weights for policy 1, policy_version 22560 (0.0017)
[2023-09-30 05:51:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11558912. Throughput: 0: 736.0, 1: 735.0. Samples: 2888340. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:51:22,492][123291] Avg episode reward: [(0, '30.950'), (1, '31.370')]
[2023-09-30 05:51:27,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11591680. Throughput: 0: 736.8, 1: 735.7. Samples: 2897120. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 05:51:27,493][123291] Avg episode reward: [(0, '30.940'), (1, '31.330')]
[2023-09-30 05:51:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11616256. Throughput: 0: 736.6, 1: 736.2. Samples: 2901726. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:51:32,492][123291] Avg episode reward: [(0, '30.940'), (1, '31.330')]
[2023-09-30 05:51:34,287][125260] Updated weights for policy 0, policy_version 22720 (0.0018)
[2023-09-30 05:51:34,287][125261] Updated weights for policy 1, policy_version 22720 (0.0018)
[2023-09-30 05:51:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11649024. Throughput: 0: 729.3, 1: 729.0. Samples: 2910239. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:51:37,493][123291] Avg episode reward: [(0, '30.940'), (1, '31.390')]
[2023-09-30 05:51:37,503][125162] Saving new best policy, reward=31.390!
[2023-09-30 05:51:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11673600. Throughput: 0: 733.7, 1: 733.0. Samples: 2919095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:51:42,493][123291] Avg episode reward: [(0, '30.940'), (1, '31.390')]
[2023-09-30 05:51:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11706368. Throughput: 0: 732.1, 1: 732.5. Samples: 2923498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:51:47,492][123291] Avg episode reward: [(0, '30.940'), (1, '31.390')]
[2023-09-30 05:51:48,348][125261] Updated weights for policy 1, policy_version 22880 (0.0018)
[2023-09-30 05:51:48,348][125260] Updated weights for policy 0, policy_version 22880 (0.0017)
[2023-09-30 05:51:52,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5873.2). Total num frames: 11735040. Throughput: 0: 732.2, 1: 730.4. Samples: 2932377. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:51:52,493][123291] Avg episode reward: [(0, '31.000'), (1, '31.400')]
[2023-09-30 05:51:52,503][125162] Saving new best policy, reward=31.400!
[2023-09-30 05:51:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11763712. Throughput: 0: 728.2, 1: 728.2. Samples: 2940928. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:51:57,493][123291] Avg episode reward: [(0, '31.000'), (1, '31.400')]
[2023-09-30 05:52:02,491][123291] Fps is (10 sec: 5324.9, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 11788288. Throughput: 0: 726.7, 1: 727.9. Samples: 2945024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:52:02,492][123291] Avg episode reward: [(0, '30.990'), (1, '31.400')]
[2023-09-30 05:52:02,707][125260] Updated weights for policy 0, policy_version 23040 (0.0017)
[2023-09-30 05:52:02,708][125261] Updated weights for policy 1, policy_version 23040 (0.0017)
[2023-09-30 05:52:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11821056. Throughput: 0: 725.0, 1: 724.9. Samples: 2953582. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:52:07,493][123291] Avg episode reward: [(0, '30.990'), (1, '31.400')]
[2023-09-30 05:52:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000023088_5910528.pth...
[2023-09-30 05:52:07,505][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000023088_5910528.pth...
[2023-09-30 05:52:07,540][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000020352_5210112.pth
[2023-09-30 05:52:07,548][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000020352_5210112.pth
[2023-09-30 05:52:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 11845632. Throughput: 0: 723.5, 1: 723.9. Samples: 2962251. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:52:12,493][123291] Avg episode reward: [(0, '31.060'), (1, '31.380')]
[2023-09-30 05:52:16,642][125260] Updated weights for policy 0, policy_version 23200 (0.0018)
[2023-09-30 05:52:16,643][125261] Updated weights for policy 1, policy_version 23200 (0.0017)
[2023-09-30 05:52:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.6, 300 sec: 5859.4). Total num frames: 11878400. Throughput: 0: 724.7, 1: 724.3. Samples: 2966932. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:52:17,493][123291] Avg episode reward: [(0, '31.060'), (1, '31.380')]
[2023-09-30 05:52:22,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11911168. Throughput: 0: 727.6, 1: 728.1. Samples: 2975745. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:22,493][123291] Avg episode reward: [(0, '31.090'), (1, '31.360')]
[2023-09-30 05:52:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 11935744. Throughput: 0: 725.4, 1: 725.2. Samples: 2984370. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:27,492][123291] Avg episode reward: [(0, '31.090'), (1, '31.360')]
[2023-09-30 05:52:30,528][125260] Updated weights for policy 0, policy_version 23360 (0.0015)
[2023-09-30 05:52:30,528][125261] Updated weights for policy 1, policy_version 23360 (0.0015)
[2023-09-30 05:52:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 11968512. Throughput: 0: 728.2, 1: 728.1. Samples: 2989031. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:32,493][123291] Avg episode reward: [(0, '31.100'), (1, '31.370')]
[2023-09-30 05:52:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5845.5). Total num frames: 11993088. Throughput: 0: 726.9, 1: 727.7. Samples: 2997832. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:37,493][123291] Avg episode reward: [(0, '31.100'), (1, '31.380')]
[2023-09-30 05:52:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 12025856. Throughput: 0: 728.2, 1: 728.2. Samples: 3006464. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:42,492][123291] Avg episode reward: [(0, '31.100'), (1, '31.380')]
[2023-09-30 05:52:44,734][125260] Updated weights for policy 0, policy_version 23520 (0.0016)
[2023-09-30 05:52:44,735][125261] Updated weights for policy 1, policy_version 23520 (0.0017)
[2023-09-30 05:52:47,492][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 12058624. Throughput: 0: 728.3, 1: 728.2. Samples: 3010566. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:47,493][123291] Avg episode reward: [(0, '31.080'), (1, '31.350')]
[2023-09-30 05:52:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5802.7, 300 sec: 5859.4). Total num frames: 12083200. Throughput: 0: 735.1, 1: 734.6. Samples: 3019722. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:52:52,493][123291] Avg episode reward: [(0, '31.080'), (1, '31.350')]
[2023-09-30 05:52:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 12115968. Throughput: 0: 737.6, 1: 735.2. Samples: 3028527. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:52:57,493][123291] Avg episode reward: [(0, '31.110'), (1, '31.310')]
[2023-09-30 05:52:58,700][125260] Updated weights for policy 0, policy_version 23680 (0.0018)
[2023-09-30 05:52:58,701][125261] Updated weights for policy 1, policy_version 23680 (0.0018)
[2023-09-30 05:53:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 12140544. Throughput: 0: 731.4, 1: 731.0. Samples: 3032736. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:53:02,492][123291] Avg episode reward: [(0, '31.110'), (1, '31.310')]
[2023-09-30 05:53:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 12173312. Throughput: 0: 728.2, 1: 728.2. Samples: 3041280. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 05:53:07,493][123291] Avg episode reward: [(0, '31.110'), (1, '31.260')]
[2023-09-30 05:53:12,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 12197888. Throughput: 0: 722.7, 1: 724.2. Samples: 3049479. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:12,492][123291] Avg episode reward: [(0, '31.110'), (1, '31.260')]
[2023-09-30 05:53:13,151][125260] Updated weights for policy 0, policy_version 23840 (0.0018)
[2023-09-30 05:53:13,151][125261] Updated weights for policy 1, policy_version 23840 (0.0017)
[2023-09-30 05:53:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 12230656. Throughput: 0: 718.8, 1: 718.9. Samples: 3053729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:17,492][123291] Avg episode reward: [(0, '31.110'), (1, '31.260')]
[2023-09-30 05:53:22,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12255232. Throughput: 0: 718.7, 1: 717.5. Samples: 3062462. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:22,493][123291] Avg episode reward: [(0, '31.120'), (1, '31.250')]
[2023-09-30 05:53:27,296][125261] Updated weights for policy 1, policy_version 24000 (0.0018)
[2023-09-30 05:53:27,297][125260] Updated weights for policy 0, policy_version 24000 (0.0018)
[2023-09-30 05:53:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5845.5). Total num frames: 12288000. Throughput: 0: 723.1, 1: 721.5. Samples: 3071470. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:27,492][123291] Avg episode reward: [(0, '31.120'), (1, '31.250')]
[2023-09-30 05:53:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12312576. Throughput: 0: 724.7, 1: 725.2. Samples: 3075810. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:32,493][123291] Avg episode reward: [(0, '31.150'), (1, '31.210')]
[2023-09-30 05:53:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12345344. Throughput: 0: 716.7, 1: 718.2. Samples: 3084292. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:37,493][123291] Avg episode reward: [(0, '31.150'), (1, '31.210')]
[2023-09-30 05:53:41,400][125261] Updated weights for policy 1, policy_version 24160 (0.0016)
[2023-09-30 05:53:41,401][125260] Updated weights for policy 0, policy_version 24160 (0.0017)
[2023-09-30 05:53:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12369920. Throughput: 0: 715.0, 1: 717.8. Samples: 3093004. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:53:42,493][123291] Avg episode reward: [(0, '31.140'), (1, '31.220')]
[2023-09-30 05:53:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12402688. Throughput: 0: 719.6, 1: 719.5. Samples: 3097494. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:53:47,493][123291] Avg episode reward: [(0, '31.140'), (1, '31.220')]
[2023-09-30 05:53:52,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 12435456. Throughput: 0: 725.5, 1: 725.5. Samples: 3106578. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:53:52,493][123291] Avg episode reward: [(0, '31.220'), (1, '31.230')]
[2023-09-30 05:53:52,503][124965] Saving new best policy, reward=31.220!
[2023-09-30 05:53:55,170][125261] Updated weights for policy 1, policy_version 24320 (0.0019)
[2023-09-30 05:53:55,170][125260] Updated weights for policy 0, policy_version 24320 (0.0019)
[2023-09-30 05:53:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12460032. Throughput: 0: 732.3, 1: 731.1. Samples: 3115333. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 05:53:57,492][123291] Avg episode reward: [(0, '31.220'), (1, '31.230')]
[2023-09-30 05:54:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 12492800. Throughput: 0: 734.2, 1: 733.2. Samples: 3119763. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:54:02,492][123291] Avg episode reward: [(0, '31.220'), (1, '31.230')]
[2023-09-30 05:54:07,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12517376. Throughput: 0: 730.3, 1: 731.4. Samples: 3128238. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:54:07,493][123291] Avg episode reward: [(0, '31.270'), (1, '31.230')]
[2023-09-30 05:54:07,506][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000024448_6258688.pth...
[2023-09-30 05:54:07,506][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000024448_6258688.pth...
[2023-09-30 05:54:07,542][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000021712_5558272.pth
[2023-09-30 05:54:07,543][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000021728_5562368.pth
[2023-09-30 05:54:07,546][124965] Saving new best policy, reward=31.270!
[2023-09-30 05:54:09,510][125260] Updated weights for policy 0, policy_version 24480 (0.0018)
[2023-09-30 05:54:09,511][125261] Updated weights for policy 1, policy_version 24480 (0.0017)
[2023-09-30 05:54:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12550144. Throughput: 0: 725.8, 1: 725.4. Samples: 3136772. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:54:12,493][123291] Avg episode reward: [(0, '31.270'), (1, '31.230')]
[2023-09-30 05:54:17,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12574720. Throughput: 0: 728.2, 1: 727.1. Samples: 3141296. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:54:17,492][123291] Avg episode reward: [(0, '31.260'), (1, '31.250')]
[2023-09-30 05:54:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12607488. Throughput: 0: 731.2, 1: 730.7. Samples: 3150081. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:54:22,493][123291] Avg episode reward: [(0, '31.260'), (1, '31.250')]
[2023-09-30 05:54:23,357][125260] Updated weights for policy 0, policy_version 24640 (0.0017)
[2023-09-30 05:54:23,357][125261] Updated weights for policy 1, policy_version 24640 (0.0016)
[2023-09-30 05:54:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12632064. Throughput: 0: 733.3, 1: 733.4. Samples: 3159007. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:54:27,493][123291] Avg episode reward: [(0, '31.250'), (1, '31.230')]
[2023-09-30 05:54:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12664832. Throughput: 0: 732.1, 1: 735.4. Samples: 3163531. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:54:32,493][123291] Avg episode reward: [(0, '31.250'), (1, '31.230')]
[2023-09-30 05:54:37,364][125260] Updated weights for policy 0, policy_version 24800 (0.0016)
[2023-09-30 05:54:37,366][125261] Updated weights for policy 1, policy_version 24800 (0.0016)
[2023-09-30 05:54:37,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12697600. Throughput: 0: 730.7, 1: 729.5. Samples: 3172286. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:54:37,493][123291] Avg episode reward: [(0, '31.300'), (1, '31.240')]
[2023-09-30 05:54:37,503][124965] Saving new best policy, reward=31.300!
[2023-09-30 05:54:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12722176. Throughput: 0: 725.4, 1: 725.6. Samples: 3180625. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:54:42,492][123291] Avg episode reward: [(0, '31.300'), (1, '31.240')]
[2023-09-30 05:54:47,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12754944. Throughput: 0: 725.4, 1: 726.5. Samples: 3185098. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:54:47,493][123291] Avg episode reward: [(0, '31.330'), (1, '31.250')]
[2023-09-30 05:54:47,494][124965] Saving new best policy, reward=31.330!
[2023-09-30 05:54:51,467][125261] Updated weights for policy 1, policy_version 24960 (0.0013)
[2023-09-30 05:54:51,468][125260] Updated weights for policy 0, policy_version 24960 (0.0017)
[2023-09-30 05:54:52,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12779520. Throughput: 0: 733.5, 1: 730.6. Samples: 3194122. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:54:52,492][123291] Avg episode reward: [(0, '31.360'), (1, '31.300')]
[2023-09-30 05:54:52,500][124965] Saving new best policy, reward=31.360!
[2023-09-30 05:54:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12812288. Throughput: 0: 732.4, 1: 734.4. Samples: 3202775. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:54:57,492][123291] Avg episode reward: [(0, '31.360'), (1, '31.300')]
[2023-09-30 05:55:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 12836864. Throughput: 0: 731.5, 1: 732.1. Samples: 3207158. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:55:02,493][123291] Avg episode reward: [(0, '31.410'), (1, '31.340')]
[2023-09-30 05:55:02,494][124965] Saving new best policy, reward=31.410!
[2023-09-30 05:55:05,449][125260] Updated weights for policy 0, policy_version 25120 (0.0018)
[2023-09-30 05:55:05,449][125261] Updated weights for policy 1, policy_version 25120 (0.0018)
[2023-09-30 05:55:07,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12869632. Throughput: 0: 730.5, 1: 730.2. Samples: 3215815. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-30 05:55:07,493][123291] Avg episode reward: [(0, '31.410'), (1, '31.340')]
[2023-09-30 05:55:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 12894208. Throughput: 0: 726.7, 1: 725.9. Samples: 3224374. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:55:12,492][123291] Avg episode reward: [(0, '31.430'), (1, '31.330')]
[2023-09-30 05:55:12,612][124965] Saving new best policy, reward=31.430!
[2023-09-30 05:55:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12926976. Throughput: 0: 729.2, 1: 726.8. Samples: 3229049. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:55:17,493][123291] Avg episode reward: [(0, '31.430'), (1, '31.330')]
[2023-09-30 05:55:19,485][125260] Updated weights for policy 0, policy_version 25280 (0.0017)
[2023-09-30 05:55:19,485][125261] Updated weights for policy 1, policy_version 25280 (0.0016)
[2023-09-30 05:55:22,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12959744. Throughput: 0: 728.3, 1: 729.4. Samples: 3237884. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 05:55:22,493][123291] Avg episode reward: [(0, '31.470'), (1, '31.370')]
[2023-09-30 05:55:22,503][124965] Saving new best policy, reward=31.470!
[2023-09-30 05:55:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 12984320. Throughput: 0: 726.9, 1: 727.8. Samples: 3246084. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:27,492][123291] Avg episode reward: [(0, '31.470'), (1, '31.370')]
[2023-09-30 05:55:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13017088. Throughput: 0: 725.7, 1: 725.6. Samples: 3250403. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:32,493][123291] Avg episode reward: [(0, '31.470'), (1, '31.370')]
[2023-09-30 05:55:33,758][125260] Updated weights for policy 0, policy_version 25440 (0.0018)
[2023-09-30 05:55:33,759][125261] Updated weights for policy 1, policy_version 25440 (0.0015)
[2023-09-30 05:55:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 13041664. Throughput: 0: 723.2, 1: 724.8. Samples: 3259283. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:37,493][123291] Avg episode reward: [(0, '31.520'), (1, '31.350')]
[2023-09-30 05:55:37,505][124965] Saving new best policy, reward=31.520!
[2023-09-30 05:55:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13074432. Throughput: 0: 722.4, 1: 722.0. Samples: 3267775. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:42,492][123291] Avg episode reward: [(0, '31.520'), (1, '31.350')]
[2023-09-30 05:55:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13099008. Throughput: 0: 725.0, 1: 722.8. Samples: 3272312. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:47,492][123291] Avg episode reward: [(0, '31.520'), (1, '31.390')]
[2023-09-30 05:55:48,027][125260] Updated weights for policy 0, policy_version 25600 (0.0017)
[2023-09-30 05:55:48,027][125261] Updated weights for policy 1, policy_version 25600 (0.0015)
[2023-09-30 05:55:52,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13131776. Throughput: 0: 722.7, 1: 723.6. Samples: 3280897. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:52,492][123291] Avg episode reward: [(0, '31.520'), (1, '31.390')]
[2023-09-30 05:55:57,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13156352. Throughput: 0: 725.8, 1: 726.4. Samples: 3289724. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:55:57,492][123291] Avg episode reward: [(0, '31.590'), (1, '31.390')]
[2023-09-30 05:55:57,663][124965] Saving new best policy, reward=31.590!
[2023-09-30 05:56:01,771][125260] Updated weights for policy 0, policy_version 25760 (0.0016)
[2023-09-30 05:56:01,771][125261] Updated weights for policy 1, policy_version 25760 (0.0017)
[2023-09-30 05:56:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13189120. Throughput: 0: 725.4, 1: 725.5. Samples: 3294338. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:56:02,493][123291] Avg episode reward: [(0, '31.590'), (1, '31.390')]
[2023-09-30 05:56:07,492][123291] Fps is (10 sec: 6553.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13221888. Throughput: 0: 728.2, 1: 728.3. Samples: 3303424. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:56:07,493][123291] Avg episode reward: [(0, '31.620'), (1, '31.410')]
[2023-09-30 05:56:07,504][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000025824_6610944.pth...
[2023-09-30 05:56:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000025824_6610944.pth...
[2023-09-30 05:56:07,541][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000023088_5910528.pth
[2023-09-30 05:56:07,543][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000023088_5910528.pth
[2023-09-30 05:56:07,544][124965] Saving new best policy, reward=31.620!
[2023-09-30 05:56:07,548][125162] Saving new best policy, reward=31.410!
[2023-09-30 05:56:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5817.7). Total num frames: 13246464. Throughput: 0: 736.2, 1: 735.4. Samples: 3312304. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:56:12,493][123291] Avg episode reward: [(0, '31.620'), (1, '31.410')]
[2023-09-30 05:56:15,661][125260] Updated weights for policy 0, policy_version 25920 (0.0015)
[2023-09-30 05:56:15,661][125261] Updated weights for policy 1, policy_version 25920 (0.0017)
[2023-09-30 05:56:17,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 13279232. Throughput: 0: 736.4, 1: 734.4. Samples: 3316590. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:56:17,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.440')]
[2023-09-30 05:56:17,493][125162] Saving new best policy, reward=31.440!
[2023-09-30 05:56:17,492][124965] Saving new best policy, reward=31.690!
[2023-09-30 05:56:22,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13303808. Throughput: 0: 730.7, 1: 732.9. Samples: 3325146. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:56:22,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.440')]
[2023-09-30 05:56:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13336576. Throughput: 0: 737.2, 1: 735.3. Samples: 3334039. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:56:27,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.440')]
[2023-09-30 05:56:29,904][125261] Updated weights for policy 1, policy_version 26080 (0.0019)
[2023-09-30 05:56:29,905][125260] Updated weights for policy 0, policy_version 26080 (0.0019)
[2023-09-30 05:56:32,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13361152. Throughput: 0: 731.3, 1: 733.7. Samples: 3338240. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:56:32,493][123291] Avg episode reward: [(0, '31.720'), (1, '31.470')]
[2023-09-30 05:56:32,494][124965] Saving new best policy, reward=31.720!
[2023-09-30 05:56:32,494][125162] Saving new best policy, reward=31.470!
[2023-09-30 05:56:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13393920. Throughput: 0: 730.2, 1: 729.4. Samples: 3346575. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:56:37,493][123291] Avg episode reward: [(0, '31.720'), (1, '31.470')]
[2023-09-30 05:56:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13418496. Throughput: 0: 727.2, 1: 728.5. Samples: 3355229. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:56:42,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.490')]
[2023-09-30 05:56:42,493][124965] Saving new best policy, reward=31.750!
[2023-09-30 05:56:42,674][125162] Saving new best policy, reward=31.490!
[2023-09-30 05:56:44,209][125260] Updated weights for policy 0, policy_version 26240 (0.0017)
[2023-09-30 05:56:44,210][125261] Updated weights for policy 1, policy_version 26240 (0.0017)
[2023-09-30 05:56:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5817.7). Total num frames: 13451264. Throughput: 0: 726.0, 1: 724.4. Samples: 3359607. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:56:47,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.490')]
[2023-09-30 05:56:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13475840. Throughput: 0: 723.4, 1: 722.3. Samples: 3368483. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-30 05:56:52,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.510')]
[2023-09-30 05:56:52,593][125162] Saving new best policy, reward=31.510!
[2023-09-30 05:56:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13508608. Throughput: 0: 720.4, 1: 721.1. Samples: 3377170. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:56:57,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.510')]
[2023-09-30 05:56:58,079][125260] Updated weights for policy 0, policy_version 26400 (0.0017)
[2023-09-30 05:56:58,079][125261] Updated weights for policy 1, policy_version 26400 (0.0016)
[2023-09-30 05:57:02,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13541376. Throughput: 0: 720.9, 1: 722.2. Samples: 3381527. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:57:02,493][123291] Avg episode reward: [(0, '31.750'), (1, '31.520')]
[2023-09-30 05:57:02,494][125162] Saving new best policy, reward=31.520!
[2023-09-30 05:57:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 13565952. Throughput: 0: 725.2, 1: 723.7. Samples: 3390343. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:57:07,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.510')]
[2023-09-30 05:57:12,048][125260] Updated weights for policy 0, policy_version 26560 (0.0017)
[2023-09-30 05:57:12,050][125261] Updated weights for policy 1, policy_version 26560 (0.0017)
[2023-09-30 05:57:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13598720. Throughput: 0: 726.4, 1: 729.3. Samples: 3399547. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:57:12,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.510')]
[2023-09-30 05:57:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 13623296. Throughput: 0: 728.2, 1: 728.2. Samples: 3403775. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:57:17,492][123291] Avg episode reward: [(0, '31.760'), (1, '31.510')]
[2023-09-30 05:57:17,612][124965] Saving new best policy, reward=31.760!
[2023-09-30 05:57:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13656064. Throughput: 0: 731.4, 1: 730.9. Samples: 3412377. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:57:22,493][123291] Avg episode reward: [(0, '31.760'), (1, '31.510')]
[2023-09-30 05:57:26,062][125260] Updated weights for policy 0, policy_version 26720 (0.0018)
[2023-09-30 05:57:26,062][125261] Updated weights for policy 1, policy_version 26720 (0.0017)
[2023-09-30 05:57:27,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13688832. Throughput: 0: 735.4, 1: 732.8. Samples: 3421299. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:57:27,493][123291] Avg episode reward: [(0, '31.790'), (1, '31.500')]
[2023-09-30 05:57:27,494][124965] Saving new best policy, reward=31.790!
[2023-09-30 05:57:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13713408. Throughput: 0: 737.0, 1: 737.5. Samples: 3425957. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:57:32,492][123291] Avg episode reward: [(0, '31.790'), (1, '31.500')]
[2023-09-30 05:57:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13746176. Throughput: 0: 733.0, 1: 734.0. Samples: 3434500. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:57:37,493][123291] Avg episode reward: [(0, '31.780'), (1, '31.590')]
[2023-09-30 05:57:37,504][125162] Saving new best policy, reward=31.590!
[2023-09-30 05:57:39,851][125261] Updated weights for policy 1, policy_version 26880 (0.0018)
[2023-09-30 05:57:39,851][125260] Updated weights for policy 0, policy_version 26880 (0.0018)
[2023-09-30 05:57:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 13770752. Throughput: 0: 737.2, 1: 736.5. Samples: 3443485. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:57:42,493][123291] Avg episode reward: [(0, '31.780'), (1, '31.590')]
[2023-09-30 05:57:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13803520. Throughput: 0: 737.4, 1: 738.3. Samples: 3447934. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:57:47,492][123291] Avg episode reward: [(0, '31.770'), (1, '31.650')]
[2023-09-30 05:57:47,493][125162] Saving new best policy, reward=31.650!
[2023-09-30 05:57:52,492][123291] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 13836288. Throughput: 0: 739.4, 1: 740.0. Samples: 3456916. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:57:52,493][123291] Avg episode reward: [(0, '31.770'), (1, '31.650')]
[2023-09-30 05:57:53,771][125261] Updated weights for policy 1, policy_version 27040 (0.0017)
[2023-09-30 05:57:53,771][125260] Updated weights for policy 0, policy_version 27040 (0.0016)
[2023-09-30 05:57:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13860864. Throughput: 0: 733.2, 1: 731.7. Samples: 3465471. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:57:57,493][123291] Avg episode reward: [(0, '31.770'), (1, '31.650')]
[2023-09-30 05:58:02,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 13893632. Throughput: 0: 737.8, 1: 736.2. Samples: 3470103. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:58:02,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.710')]
[2023-09-30 05:58:02,493][125162] Saving new best policy, reward=31.710!
[2023-09-30 05:58:07,485][125260] Updated weights for policy 0, policy_version 27200 (0.0018)
[2023-09-30 05:58:07,485][125261] Updated weights for policy 1, policy_version 27200 (0.0018)
[2023-09-30 05:58:07,492][123291] Fps is (10 sec: 6553.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 13926400. Throughput: 0: 745.0, 1: 744.3. Samples: 3479396. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-30 05:58:07,493][123291] Avg episode reward: [(0, '31.750'), (1, '31.710')]
[2023-09-30 05:58:07,506][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000027200_6963200.pth...
[2023-09-30 05:58:07,506][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000027200_6963200.pth...
[2023-09-30 05:58:07,542][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000024448_6258688.pth
[2023-09-30 05:58:07,543][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000024448_6258688.pth
[2023-09-30 05:58:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13950976. Throughput: 0: 737.3, 1: 739.4. Samples: 3487747. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:58:12,493][123291] Avg episode reward: [(0, '31.720'), (1, '31.770')]
[2023-09-30 05:58:12,494][125162] Saving new best policy, reward=31.770!
[2023-09-30 05:58:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 6007.4, 300 sec: 5859.4). Total num frames: 13983744. Throughput: 0: 732.8, 1: 733.8. Samples: 3491954. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:58:17,493][123291] Avg episode reward: [(0, '31.720'), (1, '31.770')]
[2023-09-30 05:58:21,601][125261] Updated weights for policy 1, policy_version 27360 (0.0017)
[2023-09-30 05:58:21,601][125260] Updated weights for policy 0, policy_version 27360 (0.0017)
[2023-09-30 05:58:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14008320. Throughput: 0: 739.4, 1: 738.2. Samples: 3500992. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 05:58:22,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.780')]
[2023-09-30 05:58:22,504][125162] Saving new best policy, reward=31.780!
[2023-09-30 05:58:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14041088. Throughput: 0: 739.3, 1: 738.5. Samples: 3509987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:58:27,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.780')]
[2023-09-30 05:58:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14065664. Throughput: 0: 737.8, 1: 737.8. Samples: 3514334. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:58:32,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.800')]
[2023-09-30 05:58:32,494][125162] Saving new best policy, reward=31.800!
[2023-09-30 05:58:35,445][125260] Updated weights for policy 0, policy_version 27520 (0.0017)
[2023-09-30 05:58:35,445][125261] Updated weights for policy 1, policy_version 27520 (0.0017)
[2023-09-30 05:58:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14098432. Throughput: 0: 735.2, 1: 734.0. Samples: 3523030. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:58:37,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.800')]
[2023-09-30 05:58:42,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5939.2, 300 sec: 5845.5). Total num frames: 14127104. Throughput: 0: 737.9, 1: 737.4. Samples: 3531857. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:58:42,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.810')]
[2023-09-30 05:58:42,506][125162] Saving new best policy, reward=31.810!
[2023-09-30 05:58:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14155776. Throughput: 0: 735.6, 1: 736.9. Samples: 3536367. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:58:47,493][123291] Avg episode reward: [(0, '31.670'), (1, '31.810')]
[2023-09-30 05:58:49,392][125260] Updated weights for policy 0, policy_version 27680 (0.0018)
[2023-09-30 05:58:49,392][125261] Updated weights for policy 1, policy_version 27680 (0.0018)
[2023-09-30 05:58:52,491][123291] Fps is (10 sec: 6144.0, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 14188544. Throughput: 0: 728.9, 1: 730.9. Samples: 3545088. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:58:52,492][123291] Avg episode reward: [(0, '31.670'), (1, '31.810')]
[2023-09-30 05:58:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14213120. Throughput: 0: 735.6, 1: 733.7. Samples: 3553864. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:58:57,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.840')]
[2023-09-30 05:58:57,494][125162] Saving new best policy, reward=31.840!
[2023-09-30 05:59:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14245888. Throughput: 0: 736.3, 1: 735.8. Samples: 3558200. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-30 05:59:02,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.840')]
[2023-09-30 05:59:03,371][125261] Updated weights for policy 1, policy_version 27840 (0.0016)
[2023-09-30 05:59:03,372][125260] Updated weights for policy 0, policy_version 27840 (0.0017)
[2023-09-30 05:59:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14270464. Throughput: 0: 735.2, 1: 736.3. Samples: 3567209. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:59:07,492][123291] Avg episode reward: [(0, '31.670'), (1, '31.890')]
[2023-09-30 05:59:07,577][125162] Saving new best policy, reward=31.890!
[2023-09-30 05:59:12,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 14303232. Throughput: 0: 730.5, 1: 732.2. Samples: 3575808. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:59:12,492][123291] Avg episode reward: [(0, '31.670'), (1, '31.890')]
[2023-09-30 05:59:17,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14327808. Throughput: 0: 728.3, 1: 728.9. Samples: 3579908. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:59:17,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.860')]
[2023-09-30 05:59:17,673][125260] Updated weights for policy 0, policy_version 28000 (0.0015)
[2023-09-30 05:59:17,673][125261] Updated weights for policy 1, policy_version 28000 (0.0017)
[2023-09-30 05:59:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14360576. Throughput: 0: 726.7, 1: 728.0. Samples: 3588488. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 05:59:22,492][123291] Avg episode reward: [(0, '31.680'), (1, '31.860')]
[2023-09-30 05:59:27,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14393344. Throughput: 0: 729.8, 1: 730.8. Samples: 3597583. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:27,493][123291] Avg episode reward: [(0, '31.670'), (1, '31.850')]
[2023-09-30 05:59:31,832][125260] Updated weights for policy 0, policy_version 28160 (0.0015)
[2023-09-30 05:59:31,833][125261] Updated weights for policy 1, policy_version 28160 (0.0018)
[2023-09-30 05:59:32,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14417920. Throughput: 0: 726.0, 1: 726.3. Samples: 3601718. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:32,493][123291] Avg episode reward: [(0, '31.670'), (1, '31.840')]
[2023-09-30 05:59:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14450688. Throughput: 0: 726.0, 1: 726.1. Samples: 3610431. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:37,493][123291] Avg episode reward: [(0, '31.670'), (1, '31.840')]
[2023-09-30 05:59:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.6, 300 sec: 5831.6). Total num frames: 14475264. Throughput: 0: 723.1, 1: 724.3. Samples: 3618997. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:42,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.840')]
[2023-09-30 05:59:45,982][125260] Updated weights for policy 0, policy_version 28320 (0.0016)
[2023-09-30 05:59:45,983][125261] Updated weights for policy 1, policy_version 28320 (0.0018)
[2023-09-30 05:59:47,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 14508032. Throughput: 0: 722.8, 1: 723.2. Samples: 3623267. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:47,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.840')]
[2023-09-30 05:59:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14532608. Throughput: 0: 718.5, 1: 717.0. Samples: 3631809. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:52,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.860')]
[2023-09-30 05:59:57,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14565376. Throughput: 0: 723.5, 1: 721.1. Samples: 3640812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 05:59:57,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.860')]
[2023-09-30 06:00:00,033][125260] Updated weights for policy 0, policy_version 28480 (0.0019)
[2023-09-30 06:00:00,033][125261] Updated weights for policy 1, policy_version 28480 (0.0018)
[2023-09-30 06:00:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14589952. Throughput: 0: 727.2, 1: 727.3. Samples: 3645359. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:00:02,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.850')]
[2023-09-30 06:00:07,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14622720. Throughput: 0: 723.5, 1: 724.2. Samples: 3653634. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:00:07,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.850')]
[2023-09-30 06:00:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000028560_7311360.pth...
[2023-09-30 06:00:07,504][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000028560_7311360.pth...
[2023-09-30 06:00:07,538][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000025824_6610944.pth
[2023-09-30 06:00:07,539][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000025824_6610944.pth
[2023-09-30 06:00:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14647296. Throughput: 0: 720.9, 1: 720.0. Samples: 3662425. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:00:12,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.900')]
[2023-09-30 06:00:12,493][125162] Saving new best policy, reward=31.900!
[2023-09-30 06:00:14,137][125260] Updated weights for policy 0, policy_version 28640 (0.0018)
[2023-09-30 06:00:14,137][125261] Updated weights for policy 1, policy_version 28640 (0.0018)
[2023-09-30 06:00:17,492][123291] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14680064. Throughput: 0: 724.8, 1: 724.0. Samples: 3666910. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:00:17,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.920')]
[2023-09-30 06:00:17,494][125162] Saving new best policy, reward=31.920!
[2023-09-30 06:00:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14704640. Throughput: 0: 726.1, 1: 725.0. Samples: 3675730. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:00:22,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.920')]
[2023-09-30 06:00:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14737408. Throughput: 0: 727.5, 1: 727.2. Samples: 3684461. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 06:00:27,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.880')]
[2023-09-30 06:00:27,983][125260] Updated weights for policy 0, policy_version 28800 (0.0018)
[2023-09-30 06:00:27,983][125261] Updated weights for policy 1, policy_version 28800 (0.0016)
[2023-09-30 06:00:32,491][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14770176. Throughput: 0: 731.6, 1: 731.7. Samples: 3689116. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 06:00:32,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.880')]
[2023-09-30 06:00:37,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 14794752. Throughput: 0: 736.3, 1: 735.1. Samples: 3698021. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 06:00:37,492][123291] Avg episode reward: [(0, '31.710'), (1, '31.890')]
[2023-09-30 06:00:41,992][125261] Updated weights for policy 1, policy_version 28960 (0.0016)
[2023-09-30 06:00:41,992][125260] Updated weights for policy 0, policy_version 28960 (0.0018)
[2023-09-30 06:00:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14827520. Throughput: 0: 732.4, 1: 733.5. Samples: 3706776. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-30 06:00:42,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.890')]
[2023-09-30 06:00:47,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.6, 300 sec: 5845.5). Total num frames: 14856192. Throughput: 0: 729.1, 1: 729.1. Samples: 3710976. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:00:47,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.890')]
[2023-09-30 06:00:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14884864. Throughput: 0: 735.2, 1: 735.3. Samples: 3719807. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:00:52,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.890')]
[2023-09-30 06:00:55,874][125261] Updated weights for policy 1, policy_version 29120 (0.0016)
[2023-09-30 06:00:55,875][125260] Updated weights for policy 0, policy_version 29120 (0.0016)
[2023-09-30 06:00:57,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 14917632. Throughput: 0: 738.3, 1: 739.0. Samples: 3728906. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:00:57,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.840')]
[2023-09-30 06:01:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14942208. Throughput: 0: 738.3, 1: 738.6. Samples: 3733372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:02,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.840')]
[2023-09-30 06:01:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 14974976. Throughput: 0: 734.0, 1: 734.2. Samples: 3741797. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:07,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.840')]
[2023-09-30 06:01:09,885][125260] Updated weights for policy 0, policy_version 29280 (0.0017)
[2023-09-30 06:01:09,886][125261] Updated weights for policy 1, policy_version 29280 (0.0016)
[2023-09-30 06:01:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 14999552. Throughput: 0: 734.7, 1: 734.6. Samples: 3750583. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:12,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.810')]
[2023-09-30 06:01:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15032320. Throughput: 0: 732.8, 1: 733.4. Samples: 3755098. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:17,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.810')]
[2023-09-30 06:01:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15056896. Throughput: 0: 730.7, 1: 730.9. Samples: 3763793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:22,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.790')]
[2023-09-30 06:01:23,925][125260] Updated weights for policy 0, policy_version 29440 (0.0015)
[2023-09-30 06:01:23,926][125261] Updated weights for policy 1, policy_version 29440 (0.0015)
[2023-09-30 06:01:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15089664. Throughput: 0: 730.2, 1: 730.3. Samples: 3772499. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:01:27,492][123291] Avg episode reward: [(0, '31.680'), (1, '31.790')]
[2023-09-30 06:01:32,491][123291] Fps is (10 sec: 6553.8, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15122432. Throughput: 0: 733.1, 1: 732.9. Samples: 3776945. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:01:32,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.780')]
[2023-09-30 06:01:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15147008. Throughput: 0: 731.8, 1: 729.9. Samples: 3785583. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:01:37,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.780')]
[2023-09-30 06:01:38,088][125260] Updated weights for policy 0, policy_version 29600 (0.0016)
[2023-09-30 06:01:38,088][125261] Updated weights for policy 1, policy_version 29600 (0.0018)
[2023-09-30 06:01:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15179776. Throughput: 0: 723.9, 1: 724.9. Samples: 3794101. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:01:42,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.740')]
[2023-09-30 06:01:47,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5802.7, 300 sec: 5859.4). Total num frames: 15204352. Throughput: 0: 725.3, 1: 724.3. Samples: 3798603. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:47,492][123291] Avg episode reward: [(0, '31.710'), (1, '31.740')]
[2023-09-30 06:01:52,186][125260] Updated weights for policy 0, policy_version 29760 (0.0019)
[2023-09-30 06:01:52,186][125261] Updated weights for policy 1, policy_version 29760 (0.0018)
[2023-09-30 06:01:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15237120. Throughput: 0: 726.7, 1: 727.4. Samples: 3807232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:52,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.740')]
[2023-09-30 06:01:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15261696. Throughput: 0: 727.8, 1: 727.3. Samples: 3816062. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:01:57,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.720')]
[2023-09-30 06:02:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5859.4). Total num frames: 15294464. Throughput: 0: 728.6, 1: 727.4. Samples: 3820620. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:02:02,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.720')]
[2023-09-30 06:02:05,949][125261] Updated weights for policy 1, policy_version 29920 (0.0019)
[2023-09-30 06:02:05,949][125260] Updated weights for policy 0, policy_version 29920 (0.0018)
[2023-09-30 06:02:07,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15327232. Throughput: 0: 731.6, 1: 734.1. Samples: 3829749. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:02:07,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.760')]
[2023-09-30 06:02:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000029936_7663616.pth...
[2023-09-30 06:02:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000029936_7663616.pth...
[2023-09-30 06:02:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000027200_6963200.pth
[2023-09-30 06:02:07,542][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000027200_6963200.pth
[2023-09-30 06:02:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15351808. Throughput: 0: 730.2, 1: 730.4. Samples: 3838227. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:02:12,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.760')]
[2023-09-30 06:02:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15384576. Throughput: 0: 732.9, 1: 731.8. Samples: 3842856. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:02:17,493][123291] Avg episode reward: [(0, '31.670'), (1, '31.700')]
[2023-09-30 06:02:19,961][125260] Updated weights for policy 0, policy_version 30080 (0.0017)
[2023-09-30 06:02:19,961][125261] Updated weights for policy 1, policy_version 30080 (0.0017)
[2023-09-30 06:02:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 15409152. Throughput: 0: 732.4, 1: 732.5. Samples: 3851505. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:02:22,493][123291] Avg episode reward: [(0, '31.670'), (1, '31.700')]
[2023-09-30 06:02:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15441920. Throughput: 0: 735.6, 1: 733.5. Samples: 3860213. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:02:27,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.710')]
[2023-09-30 06:02:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15466496. Throughput: 0: 732.3, 1: 733.2. Samples: 3864551. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:02:32,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.710')]
[2023-09-30 06:02:34,165][125261] Updated weights for policy 1, policy_version 30240 (0.0017)
[2023-09-30 06:02:34,165][125260] Updated weights for policy 0, policy_version 30240 (0.0017)
[2023-09-30 06:02:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 15499264. Throughput: 0: 731.2, 1: 730.0. Samples: 3872987. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:02:37,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.710')]
[2023-09-30 06:02:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15523840. Throughput: 0: 729.4, 1: 728.7. Samples: 3881674. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:02:42,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.720')]
[2023-09-30 06:02:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15556608. Throughput: 0: 726.4, 1: 725.6. Samples: 3885958. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:02:47,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.720')]
[2023-09-30 06:02:48,627][125260] Updated weights for policy 0, policy_version 30400 (0.0017)
[2023-09-30 06:02:48,627][125261] Updated weights for policy 1, policy_version 30400 (0.0017)
[2023-09-30 06:02:52,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15581184. Throughput: 0: 718.7, 1: 717.7. Samples: 3894386. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:02:52,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.690')]
[2023-09-30 06:02:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15613952. Throughput: 0: 722.8, 1: 722.7. Samples: 3903276. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:02:57,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.690')]
[2023-09-30 06:03:02,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 15638528. Throughput: 0: 718.6, 1: 719.8. Samples: 3907584. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:03:02,492][123291] Avg episode reward: [(0, '31.680'), (1, '31.670')]
[2023-09-30 06:03:02,621][125260] Updated weights for policy 0, policy_version 30560 (0.0016)
[2023-09-30 06:03:02,622][125261] Updated weights for policy 1, policy_version 30560 (0.0017)
[2023-09-30 06:03:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15671296. Throughput: 0: 717.8, 1: 717.6. Samples: 3916101. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:07,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.670')]
[2023-09-30 06:03:12,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15704064. Throughput: 0: 721.2, 1: 721.8. Samples: 3925146. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:12,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.640')]
[2023-09-30 06:03:16,654][125260] Updated weights for policy 0, policy_version 30720 (0.0016)
[2023-09-30 06:03:16,654][125261] Updated weights for policy 1, policy_version 30720 (0.0017)
[2023-09-30 06:03:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15728640. Throughput: 0: 724.9, 1: 723.4. Samples: 3929724. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:17,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.640')]
[2023-09-30 06:03:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15761408. Throughput: 0: 725.1, 1: 726.4. Samples: 3938304. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:22,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.640')]
[2023-09-30 06:03:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 15785984. Throughput: 0: 730.4, 1: 731.0. Samples: 3947439. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:03:27,493][123291] Avg episode reward: [(0, '31.730'), (1, '31.630')]
[2023-09-30 06:03:30,409][125261] Updated weights for policy 1, policy_version 30880 (0.0017)
[2023-09-30 06:03:30,409][125260] Updated weights for policy 0, policy_version 30880 (0.0017)
[2023-09-30 06:03:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15818752. Throughput: 0: 730.4, 1: 732.3. Samples: 3951781. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:03:32,493][123291] Avg episode reward: [(0, '31.730'), (1, '31.630')]
[2023-09-30 06:03:37,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5831.6). Total num frames: 15847424. Throughput: 0: 736.8, 1: 734.2. Samples: 3960581. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:03:37,493][123291] Avg episode reward: [(0, '31.750'), (1, '31.590')]
[2023-09-30 06:03:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15876096. Throughput: 0: 730.7, 1: 731.2. Samples: 3969060. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:03:42,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.590')]
[2023-09-30 06:03:44,487][125261] Updated weights for policy 1, policy_version 31040 (0.0017)
[2023-09-30 06:03:44,488][125260] Updated weights for policy 0, policy_version 31040 (0.0019)
[2023-09-30 06:03:47,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15908864. Throughput: 0: 733.1, 1: 732.1. Samples: 3973518. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:47,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.560')]
[2023-09-30 06:03:52,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15933440. Throughput: 0: 737.1, 1: 737.6. Samples: 3982462. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:52,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.560')]
[2023-09-30 06:03:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 15966208. Throughput: 0: 734.7, 1: 733.6. Samples: 3991223. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:03:57,492][123291] Avg episode reward: [(0, '31.740'), (1, '31.570')]
[2023-09-30 06:03:58,529][125261] Updated weights for policy 1, policy_version 31200 (0.0014)
[2023-09-30 06:03:58,530][125260] Updated weights for policy 0, policy_version 31200 (0.0016)
[2023-09-30 06:04:02,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 15990784. Throughput: 0: 731.4, 1: 732.3. Samples: 3995587. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:02,492][123291] Avg episode reward: [(0, '31.740'), (1, '31.570')]
[2023-09-30 06:04:07,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 16023552. Throughput: 0: 730.4, 1: 729.6. Samples: 4004006. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:04:07,492][123291] Avg episode reward: [(0, '31.740'), (1, '31.570')]
[2023-09-30 06:04:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000031296_8011776.pth...
[2023-09-30 06:04:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000031296_8011776.pth...
[2023-09-30 06:04:07,534][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000028560_7311360.pth
[2023-09-30 06:04:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000028560_7311360.pth
[2023-09-30 06:04:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 16048128. Throughput: 0: 722.7, 1: 723.4. Samples: 4012512. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:04:12,492][123291] Avg episode reward: [(0, '31.790'), (1, '31.570')]
[2023-09-30 06:04:12,776][125261] Updated weights for policy 1, policy_version 31360 (0.0017)
[2023-09-30 06:04:12,776][125260] Updated weights for policy 0, policy_version 31360 (0.0017)
[2023-09-30 06:04:17,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 16080896. Throughput: 0: 725.3, 1: 725.0. Samples: 4017044. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:04:17,492][123291] Avg episode reward: [(0, '31.790'), (1, '31.570')]
[2023-09-30 06:04:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16105472. Throughput: 0: 721.0, 1: 725.3. Samples: 4025665. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:04:22,493][123291] Avg episode reward: [(0, '31.820'), (1, '31.500')]
[2023-09-30 06:04:22,504][124965] Saving new best policy, reward=31.820!
[2023-09-30 06:04:26,920][125261] Updated weights for policy 1, policy_version 31520 (0.0018)
[2023-09-30 06:04:26,921][125260] Updated weights for policy 0, policy_version 31520 (0.0017)
[2023-09-30 06:04:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16138240. Throughput: 0: 727.5, 1: 728.1. Samples: 4034560. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-30 06:04:27,493][123291] Avg episode reward: [(0, '31.820'), (1, '31.500')]
[2023-09-30 06:04:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16162816. Throughput: 0: 723.3, 1: 724.2. Samples: 4038656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:32,493][123291] Avg episode reward: [(0, '31.810'), (1, '31.480')]
[2023-09-30 06:04:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5831.6). Total num frames: 16195584. Throughput: 0: 721.5, 1: 720.9. Samples: 4047372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:37,493][123291] Avg episode reward: [(0, '31.810'), (1, '31.480')]
[2023-09-30 06:04:41,306][125260] Updated weights for policy 0, policy_version 31680 (0.0017)
[2023-09-30 06:04:41,306][125261] Updated weights for policy 1, policy_version 31680 (0.0017)
[2023-09-30 06:04:42,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16220160. Throughput: 0: 716.5, 1: 717.5. Samples: 4055750. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:42,492][123291] Avg episode reward: [(0, '31.790'), (1, '31.450')]
[2023-09-30 06:04:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 16252928. Throughput: 0: 719.4, 1: 718.2. Samples: 4060276. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:47,493][123291] Avg episode reward: [(0, '31.790'), (1, '31.450')]
[2023-09-30 06:04:52,492][123291] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16277504. Throughput: 0: 722.9, 1: 722.7. Samples: 4069060. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:52,493][123291] Avg episode reward: [(0, '31.790'), (1, '31.450')]
[2023-09-30 06:04:55,235][125260] Updated weights for policy 0, policy_version 31840 (0.0017)
[2023-09-30 06:04:55,235][125261] Updated weights for policy 1, policy_version 31840 (0.0016)
[2023-09-30 06:04:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 16310272. Throughput: 0: 724.6, 1: 724.2. Samples: 4077707. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:04:57,493][123291] Avg episode reward: [(0, '31.780'), (1, '31.390')]
[2023-09-30 06:05:02,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16343040. Throughput: 0: 722.3, 1: 722.4. Samples: 4082054. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:02,493][123291] Avg episode reward: [(0, '31.780'), (1, '31.390')]
[2023-09-30 06:05:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 16367616. Throughput: 0: 726.1, 1: 724.8. Samples: 4090953. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:07,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.400')]
[2023-09-30 06:05:09,313][125260] Updated weights for policy 0, policy_version 32000 (0.0017)
[2023-09-30 06:05:09,313][125261] Updated weights for policy 1, policy_version 32000 (0.0017)
[2023-09-30 06:05:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16400384. Throughput: 0: 722.2, 1: 724.3. Samples: 4099651. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:12,493][123291] Avg episode reward: [(0, '31.750'), (1, '31.400')]
[2023-09-30 06:05:17,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5831.6). Total num frames: 16424960. Throughput: 0: 725.8, 1: 723.4. Samples: 4103872. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:17,492][123291] Avg episode reward: [(0, '31.760'), (1, '31.400')]
[2023-09-30 06:05:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16457728. Throughput: 0: 719.4, 1: 722.8. Samples: 4112272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:22,493][123291] Avg episode reward: [(0, '31.760'), (1, '31.400')]
[2023-09-30 06:05:23,935][125260] Updated weights for policy 0, policy_version 32160 (0.0019)
[2023-09-30 06:05:23,935][125261] Updated weights for policy 1, policy_version 32160 (0.0020)
[2023-09-30 06:05:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16482304. Throughput: 0: 719.9, 1: 721.0. Samples: 4120589. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:27,492][123291] Avg episode reward: [(0, '31.780'), (1, '31.420')]
[2023-09-30 06:05:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16515072. Throughput: 0: 715.6, 1: 716.9. Samples: 4124737. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 06:05:32,493][123291] Avg episode reward: [(0, '31.780'), (1, '31.420')]
[2023-09-30 06:05:37,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16539648. Throughput: 0: 719.7, 1: 719.5. Samples: 4133825. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 06:05:37,492][123291] Avg episode reward: [(0, '31.780'), (1, '31.420')]
[2023-09-30 06:05:37,838][125260] Updated weights for policy 0, policy_version 32320 (0.0017)
[2023-09-30 06:05:37,838][125261] Updated weights for policy 1, policy_version 32320 (0.0017)
[2023-09-30 06:05:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5817.7). Total num frames: 16572416. Throughput: 0: 725.5, 1: 727.2. Samples: 4143081. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 06:05:42,493][123291] Avg episode reward: [(0, '31.770'), (1, '31.420')]
[2023-09-30 06:05:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16596992. Throughput: 0: 723.5, 1: 724.2. Samples: 4147200. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 06:05:47,493][123291] Avg episode reward: [(0, '31.770'), (1, '31.420')]
[2023-09-30 06:05:51,890][125260] Updated weights for policy 0, policy_version 32480 (0.0017)
[2023-09-30 06:05:51,891][125261] Updated weights for policy 1, policy_version 32480 (0.0017)
[2023-09-30 06:05:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 16629760. Throughput: 0: 719.5, 1: 718.9. Samples: 4155679. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-30 06:05:52,492][123291] Avg episode reward: [(0, '31.720'), (1, '31.440')]
[2023-09-30 06:05:57,497][123291] Fps is (10 sec: 6140.5, 60 sec: 5802.1, 300 sec: 5817.6). Total num frames: 16658432. Throughput: 0: 722.6, 1: 720.2. Samples: 4164587. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:05:57,499][123291] Avg episode reward: [(0, '31.720'), (1, '31.440')]
[2023-09-30 06:06:02,491][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16687104. Throughput: 0: 722.5, 1: 724.8. Samples: 4169001. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:02,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.460')]
[2023-09-30 06:06:05,950][125260] Updated weights for policy 0, policy_version 32640 (0.0017)
[2023-09-30 06:06:05,951][125261] Updated weights for policy 1, policy_version 32640 (0.0017)
[2023-09-30 06:06:07,492][123291] Fps is (10 sec: 6147.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16719872. Throughput: 0: 730.1, 1: 728.8. Samples: 4177920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:07,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.460')]
[2023-09-30 06:06:07,502][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000032656_8359936.pth...
[2023-09-30 06:06:07,502][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000032656_8359936.pth...
[2023-09-30 06:06:07,537][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000029936_7663616.pth
[2023-09-30 06:06:07,539][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000029936_7663616.pth
[2023-09-30 06:06:12,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16744448. Throughput: 0: 730.1, 1: 729.0. Samples: 4186249. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:12,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.450')]
[2023-09-30 06:06:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 16777216. Throughput: 0: 733.7, 1: 734.6. Samples: 4190811. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:17,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.450')]
[2023-09-30 06:06:20,033][125260] Updated weights for policy 0, policy_version 32800 (0.0017)
[2023-09-30 06:06:20,033][125261] Updated weights for policy 1, policy_version 32800 (0.0015)
[2023-09-30 06:06:22,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16801792. Throughput: 0: 729.7, 1: 729.4. Samples: 4199482. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:22,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.450')]
[2023-09-30 06:06:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 16834560. Throughput: 0: 726.2, 1: 724.0. Samples: 4208340. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:27,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.510')]
[2023-09-30 06:06:32,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16859136. Throughput: 0: 728.2, 1: 728.2. Samples: 4212736. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:32,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.510')]
[2023-09-30 06:06:34,178][125261] Updated weights for policy 1, policy_version 32960 (0.0017)
[2023-09-30 06:06:34,178][125260] Updated weights for policy 0, policy_version 32960 (0.0016)
[2023-09-30 06:06:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 16891904. Throughput: 0: 726.4, 1: 727.4. Samples: 4221103. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:06:37,492][123291] Avg episode reward: [(0, '31.620'), (1, '31.520')]
[2023-09-30 06:06:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16916480. Throughput: 0: 729.1, 1: 726.7. Samples: 4230091. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:06:42,492][123291] Avg episode reward: [(0, '31.620'), (1, '31.520')]
[2023-09-30 06:06:47,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 16949248. Throughput: 0: 728.0, 1: 726.1. Samples: 4234436. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:06:47,492][123291] Avg episode reward: [(0, '31.600'), (1, '31.530')]
[2023-09-30 06:06:48,369][125261] Updated weights for policy 1, policy_version 33120 (0.0017)
[2023-09-30 06:06:48,369][125260] Updated weights for policy 0, policy_version 33120 (0.0018)
[2023-09-30 06:06:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 16973824. Throughput: 0: 724.2, 1: 721.6. Samples: 4242984. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:06:52,493][123291] Avg episode reward: [(0, '31.600'), (1, '31.530')]
[2023-09-30 06:06:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5803.2, 300 sec: 5803.8). Total num frames: 17006592. Throughput: 0: 726.2, 1: 727.3. Samples: 4251660. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:06:57,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.560')]
[2023-09-30 06:07:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17031168. Throughput: 0: 722.8, 1: 721.7. Samples: 4255813. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:02,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.560')]
[2023-09-30 06:07:02,600][125260] Updated weights for policy 0, policy_version 33280 (0.0018)
[2023-09-30 06:07:02,601][125261] Updated weights for policy 1, policy_version 33280 (0.0018)
[2023-09-30 06:07:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17063936. Throughput: 0: 721.3, 1: 722.1. Samples: 4264435. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:07,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.560')]
[2023-09-30 06:07:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17088512. Throughput: 0: 716.0, 1: 716.2. Samples: 4272786. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:12,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.550')]
[2023-09-30 06:07:16,927][125260] Updated weights for policy 0, policy_version 33440 (0.0017)
[2023-09-30 06:07:16,927][125261] Updated weights for policy 1, policy_version 33440 (0.0017)
[2023-09-30 06:07:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17121280. Throughput: 0: 715.6, 1: 714.8. Samples: 4277103. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:17,492][123291] Avg episode reward: [(0, '31.630'), (1, '31.550')]
[2023-09-30 06:07:22,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.0). Total num frames: 17145856. Throughput: 0: 719.4, 1: 719.0. Samples: 4285833. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:22,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.520')]
[2023-09-30 06:07:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17178624. Throughput: 0: 716.0, 1: 718.8. Samples: 4294656. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:07:27,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.520')]
[2023-09-30 06:07:30,931][125261] Updated weights for policy 1, policy_version 33600 (0.0018)
[2023-09-30 06:07:30,931][125260] Updated weights for policy 0, policy_version 33600 (0.0020)
[2023-09-30 06:07:32,491][123291] Fps is (10 sec: 6553.8, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 17211392. Throughput: 0: 716.8, 1: 717.8. Samples: 4298995. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:07:32,492][123291] Avg episode reward: [(0, '31.640'), (1, '31.560')]
[2023-09-30 06:07:37,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17235968. Throughput: 0: 720.9, 1: 720.6. Samples: 4307849. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:07:37,492][123291] Avg episode reward: [(0, '31.640'), (1, '31.560')]
[2023-09-30 06:07:42,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17268736. Throughput: 0: 720.0, 1: 718.4. Samples: 4316384. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:07:42,492][123291] Avg episode reward: [(0, '31.640'), (1, '31.560')]
[2023-09-30 06:07:45,380][125260] Updated weights for policy 0, policy_version 33760 (0.0017)
[2023-09-30 06:07:45,380][125261] Updated weights for policy 1, policy_version 33760 (0.0017)
[2023-09-30 06:07:47,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17293312. Throughput: 0: 720.7, 1: 719.5. Samples: 4320624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:47,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.580')]
[2023-09-30 06:07:52,492][123291] Fps is (10 sec: 5324.7, 60 sec: 5802.7, 300 sec: 5789.9). Total num frames: 17321984. Throughput: 0: 721.1, 1: 719.5. Samples: 4329262. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:52,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.580')]
[2023-09-30 06:07:57,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17350656. Throughput: 0: 720.1, 1: 721.6. Samples: 4337665. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:07:57,492][123291] Avg episode reward: [(0, '31.610'), (1, '31.610')]
[2023-09-30 06:07:59,529][125260] Updated weights for policy 0, policy_version 33920 (0.0017)
[2023-09-30 06:07:59,529][125261] Updated weights for policy 1, policy_version 33920 (0.0018)
[2023-09-30 06:08:02,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17383424. Throughput: 0: 721.5, 1: 721.5. Samples: 4342040. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:02,493][123291] Avg episode reward: [(0, '31.610'), (1, '31.610')]
[2023-09-30 06:08:07,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17408000. Throughput: 0: 722.0, 1: 720.9. Samples: 4350763. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:07,492][123291] Avg episode reward: [(0, '31.610'), (1, '31.620')]
[2023-09-30 06:08:07,500][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000034000_8704000.pth...
[2023-09-30 06:08:07,500][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000034000_8704000.pth...
[2023-09-30 06:08:07,532][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000031296_8011776.pth
[2023-09-30 06:08:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000031296_8011776.pth
[2023-09-30 06:08:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17440768. Throughput: 0: 723.6, 1: 722.5. Samples: 4359728. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:08:12,493][123291] Avg episode reward: [(0, '31.610'), (1, '31.620')]
[2023-09-30 06:08:13,617][125261] Updated weights for policy 1, policy_version 34080 (0.0018)
[2023-09-30 06:08:13,617][125260] Updated weights for policy 0, policy_version 34080 (0.0016)
[2023-09-30 06:08:17,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17465344. Throughput: 0: 725.0, 1: 723.9. Samples: 4364194. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:08:17,493][123291] Avg episode reward: [(0, '31.620'), (1, '31.650')]
[2023-09-30 06:08:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17498112. Throughput: 0: 717.0, 1: 719.7. Samples: 4372502. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:08:22,493][123291] Avg episode reward: [(0, '31.620'), (1, '31.650')]
[2023-09-30 06:08:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17522688. Throughput: 0: 724.5, 1: 725.0. Samples: 4381612. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-30 06:08:27,493][123291] Avg episode reward: [(0, '31.620'), (1, '31.650')]
[2023-09-30 06:08:27,546][125260] Updated weights for policy 0, policy_version 34240 (0.0016)
[2023-09-30 06:08:27,546][125261] Updated weights for policy 1, policy_version 34240 (0.0017)
[2023-09-30 06:08:32,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5789.9). Total num frames: 17555456. Throughput: 0: 725.4, 1: 727.5. Samples: 4386006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:32,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.610')]
[2023-09-30 06:08:37,492][123291] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17588224. Throughput: 0: 729.3, 1: 731.3. Samples: 4394989. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:37,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.610')]
[2023-09-30 06:08:41,492][125260] Updated weights for policy 0, policy_version 34400 (0.0017)
[2023-09-30 06:08:41,492][125261] Updated weights for policy 1, policy_version 34400 (0.0016)
[2023-09-30 06:08:42,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17612800. Throughput: 0: 732.6, 1: 731.7. Samples: 4403559. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:42,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.570')]
[2023-09-30 06:08:47,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 17645568. Throughput: 0: 730.9, 1: 730.5. Samples: 4407805. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:47,492][123291] Avg episode reward: [(0, '31.640'), (1, '31.570')]
[2023-09-30 06:08:52,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5802.7, 300 sec: 5776.1). Total num frames: 17670144. Throughput: 0: 731.1, 1: 732.7. Samples: 4416635. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:08:52,492][123291] Avg episode reward: [(0, '31.600'), (1, '31.580')]
[2023-09-30 06:08:55,698][125260] Updated weights for policy 0, policy_version 34560 (0.0017)
[2023-09-30 06:08:55,699][125261] Updated weights for policy 1, policy_version 34560 (0.0017)
[2023-09-30 06:08:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17702912. Throughput: 0: 731.8, 1: 729.4. Samples: 4425485. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:08:57,493][123291] Avg episode reward: [(0, '31.600'), (1, '31.580')]
[2023-09-30 06:09:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17727488. Throughput: 0: 727.5, 1: 730.2. Samples: 4429790. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:02,493][123291] Avg episode reward: [(0, '31.570'), (1, '31.630')]
[2023-09-30 06:09:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17760256. Throughput: 0: 728.9, 1: 728.2. Samples: 4438073. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:07,492][123291] Avg episode reward: [(0, '31.570'), (1, '31.630')]
[2023-09-30 06:09:09,770][125261] Updated weights for policy 1, policy_version 34720 (0.0016)
[2023-09-30 06:09:09,770][125260] Updated weights for policy 0, policy_version 34720 (0.0020)
[2023-09-30 06:09:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5776.1). Total num frames: 17784832. Throughput: 0: 728.0, 1: 729.1. Samples: 4447182. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:12,493][123291] Avg episode reward: [(0, '31.570'), (1, '31.630')]
[2023-09-30 06:09:17,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 17817600. Throughput: 0: 729.4, 1: 727.4. Samples: 4451563. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:17,492][123291] Avg episode reward: [(0, '31.590'), (1, '31.620')]
[2023-09-30 06:09:22,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 17850368. Throughput: 0: 728.2, 1: 728.1. Samples: 4460521. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:22,493][123291] Avg episode reward: [(0, '31.590'), (1, '31.620')]
[2023-09-30 06:09:23,759][125260] Updated weights for policy 0, policy_version 34880 (0.0019)
[2023-09-30 06:09:23,759][125261] Updated weights for policy 1, policy_version 34880 (0.0019)
[2023-09-30 06:09:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17874944. Throughput: 0: 727.8, 1: 727.4. Samples: 4469044. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:27,493][123291] Avg episode reward: [(0, '31.610'), (1, '31.660')]
[2023-09-30 06:09:32,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 17907712. Throughput: 0: 729.2, 1: 728.5. Samples: 4473400. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:32,492][123291] Avg episode reward: [(0, '31.610'), (1, '31.660')]
[2023-09-30 06:09:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 17932288. Throughput: 0: 728.5, 1: 728.4. Samples: 4482195. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-30 06:09:37,492][123291] Avg episode reward: [(0, '31.630'), (1, '31.680')]
[2023-09-30 06:09:37,802][125261] Updated weights for policy 1, policy_version 35040 (0.0016)
[2023-09-30 06:09:37,802][125260] Updated weights for policy 0, policy_version 35040 (0.0016)
[2023-09-30 06:09:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 17965056. Throughput: 0: 729.1, 1: 732.6. Samples: 4491262. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:09:42,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.680')]
[2023-09-30 06:09:47,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.7, 300 sec: 5817.7). Total num frames: 17993728. Throughput: 0: 729.0, 1: 728.4. Samples: 4495371. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:09:47,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.680')]
[2023-09-30 06:09:51,824][125261] Updated weights for policy 1, policy_version 35200 (0.0018)
[2023-09-30 06:09:51,824][125260] Updated weights for policy 0, policy_version 35200 (0.0017)
[2023-09-30 06:09:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18022400. Throughput: 0: 732.7, 1: 732.7. Samples: 4504016. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:09:52,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.680')]
[2023-09-30 06:09:57,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 18055168. Throughput: 0: 733.7, 1: 732.8. Samples: 4513177. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:09:57,492][123291] Avg episode reward: [(0, '31.680'), (1, '31.680')]
[2023-09-30 06:10:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18079744. Throughput: 0: 733.6, 1: 734.9. Samples: 4517647. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:02,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.690')]
[2023-09-30 06:10:05,697][125260] Updated weights for policy 0, policy_version 35360 (0.0017)
[2023-09-30 06:10:05,697][125261] Updated weights for policy 1, policy_version 35360 (0.0018)
[2023-09-30 06:10:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18112512. Throughput: 0: 728.6, 1: 728.8. Samples: 4526106. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:07,492][123291] Avg episode reward: [(0, '31.680'), (1, '31.690')]
[2023-09-30 06:10:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000035376_9056256.pth...
[2023-09-30 06:10:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000035376_9056256.pth...
[2023-09-30 06:10:07,538][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000032656_8359936.pth
[2023-09-30 06:10:07,539][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000032656_8359936.pth
[2023-09-30 06:10:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 18137088. Throughput: 0: 731.7, 1: 732.4. Samples: 4534932. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:12,492][123291] Avg episode reward: [(0, '31.660'), (1, '31.660')]
[2023-09-30 06:10:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18169856. Throughput: 0: 732.8, 1: 734.5. Samples: 4539426. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:17,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.660')]
[2023-09-30 06:10:19,756][125260] Updated weights for policy 0, policy_version 35520 (0.0016)
[2023-09-30 06:10:19,757][125261] Updated weights for policy 1, policy_version 35520 (0.0019)
[2023-09-30 06:10:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18194432. Throughput: 0: 733.3, 1: 733.8. Samples: 4548211. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:22,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.690')]
[2023-09-30 06:10:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18227200. Throughput: 0: 728.2, 1: 728.2. Samples: 4556801. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:27,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.690')]
[2023-09-30 06:10:32,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.6, 300 sec: 5817.7). Total num frames: 18255872. Throughput: 0: 731.5, 1: 730.2. Samples: 4561148. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:32,493][123291] Avg episode reward: [(0, '31.720'), (1, '31.680')]
[2023-09-30 06:10:33,949][125260] Updated weights for policy 0, policy_version 35680 (0.0019)
[2023-09-30 06:10:33,949][125261] Updated weights for policy 1, policy_version 35680 (0.0015)
[2023-09-30 06:10:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18284544. Throughput: 0: 731.0, 1: 732.0. Samples: 4569853. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:37,492][123291] Avg episode reward: [(0, '31.720'), (1, '31.680')]
[2023-09-30 06:10:42,492][123291] Fps is (10 sec: 5324.8, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18309120. Throughput: 0: 722.4, 1: 722.7. Samples: 4578204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:42,492][123291] Avg episode reward: [(0, '31.720'), (1, '31.690')]
[2023-09-30 06:10:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5803.8). Total num frames: 18341888. Throughput: 0: 720.7, 1: 722.6. Samples: 4582596. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:47,493][123291] Avg episode reward: [(0, '31.730'), (1, '31.710')]
[2023-09-30 06:10:48,383][125261] Updated weights for policy 1, policy_version 35840 (0.0017)
[2023-09-30 06:10:48,383][125260] Updated weights for policy 0, policy_version 35840 (0.0018)
[2023-09-30 06:10:52,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.7, 300 sec: 5803.9). Total num frames: 18370560. Throughput: 0: 724.9, 1: 723.4. Samples: 4591281. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:52,493][123291] Avg episode reward: [(0, '31.730'), (1, '31.710')]
[2023-09-30 06:10:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18399232. Throughput: 0: 720.5, 1: 721.2. Samples: 4599808. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:10:57,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.750')]
[2023-09-30 06:11:02,290][125260] Updated weights for policy 0, policy_version 36000 (0.0017)
[2023-09-30 06:11:02,291][125261] Updated weights for policy 1, policy_version 36000 (0.0017)
[2023-09-30 06:11:02,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18432000. Throughput: 0: 720.1, 1: 718.9. Samples: 4604183. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:02,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.750')]
[2023-09-30 06:11:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18456576. Throughput: 0: 721.4, 1: 720.9. Samples: 4613117. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:07,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.770')]
[2023-09-30 06:11:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18489344. Throughput: 0: 724.6, 1: 723.5. Samples: 4621969. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:12,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.770')]
[2023-09-30 06:11:16,402][125261] Updated weights for policy 1, policy_version 36160 (0.0017)
[2023-09-30 06:11:16,402][125260] Updated weights for policy 0, policy_version 36160 (0.0016)
[2023-09-30 06:11:17,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18513920. Throughput: 0: 724.7, 1: 725.5. Samples: 4626406. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:11:17,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.760')]
[2023-09-30 06:11:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18546688. Throughput: 0: 724.8, 1: 723.8. Samples: 4635043. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:11:22,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.760')]
[2023-09-30 06:11:27,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5817.7). Total num frames: 18575360. Throughput: 0: 729.9, 1: 729.6. Samples: 4643885. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:11:27,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.780')]
[2023-09-30 06:11:30,426][125261] Updated weights for policy 1, policy_version 36320 (0.0015)
[2023-09-30 06:11:30,426][125260] Updated weights for policy 0, policy_version 36320 (0.0016)
[2023-09-30 06:11:32,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5803.8). Total num frames: 18604032. Throughput: 0: 728.8, 1: 726.7. Samples: 4648092. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-30 06:11:32,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.760')]
[2023-09-30 06:11:37,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 18636800. Throughput: 0: 731.0, 1: 732.8. Samples: 4657152. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:37,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.760')]
[2023-09-30 06:11:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 18661376. Throughput: 0: 729.8, 1: 729.0. Samples: 4665451. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:42,492][123291] Avg episode reward: [(0, '31.670'), (1, '31.780')]
[2023-09-30 06:11:44,646][125260] Updated weights for policy 0, policy_version 36480 (0.0017)
[2023-09-30 06:11:44,646][125261] Updated weights for policy 1, policy_version 36480 (0.0018)
[2023-09-30 06:11:47,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 18694144. Throughput: 0: 726.0, 1: 726.3. Samples: 4669535. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:47,492][123291] Avg episode reward: [(0, '31.670'), (1, '31.780')]
[2023-09-30 06:11:52,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5803.8). Total num frames: 18718720. Throughput: 0: 723.7, 1: 724.4. Samples: 4678281. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:52,492][123291] Avg episode reward: [(0, '31.680'), (1, '31.790')]
[2023-09-30 06:11:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 18751488. Throughput: 0: 724.6, 1: 723.5. Samples: 4687132. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:11:57,493][123291] Avg episode reward: [(0, '31.680'), (1, '31.790')]
[2023-09-30 06:11:58,649][125260] Updated weights for policy 0, policy_version 36640 (0.0015)
[2023-09-30 06:11:58,650][125261] Updated weights for policy 1, policy_version 36640 (0.0016)
[2023-09-30 06:12:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18776064. Throughput: 0: 725.7, 1: 725.6. Samples: 4691714. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:02,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.810')]
[2023-09-30 06:12:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 18808832. Throughput: 0: 723.1, 1: 724.1. Samples: 4700164. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:07,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.810')]
[2023-09-30 06:12:07,503][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000036736_9404416.pth...
[2023-09-30 06:12:07,503][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000036736_9404416.pth...
[2023-09-30 06:12:07,537][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000034000_8704000.pth
[2023-09-30 06:12:07,541][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000034000_8704000.pth
[2023-09-30 06:12:12,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18833408. Throughput: 0: 717.1, 1: 717.2. Samples: 4708430. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:12,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.810')]
[2023-09-30 06:12:13,144][125260] Updated weights for policy 0, policy_version 36800 (0.0017)
[2023-09-30 06:12:13,144][125261] Updated weights for policy 1, policy_version 36800 (0.0016)
[2023-09-30 06:12:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 18866176. Throughput: 0: 719.7, 1: 720.0. Samples: 4712881. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:17,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.810')]
[2023-09-30 06:12:22,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18890752. Throughput: 0: 715.9, 1: 715.0. Samples: 4721540. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:22,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.810')]
[2023-09-30 06:12:27,277][125261] Updated weights for policy 1, policy_version 36960 (0.0017)
[2023-09-30 06:12:27,277][125260] Updated weights for policy 0, policy_version 36960 (0.0017)
[2023-09-30 06:12:27,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5802.7, 300 sec: 5803.8). Total num frames: 18923520. Throughput: 0: 721.0, 1: 720.6. Samples: 4730324. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:27,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.820')]
[2023-09-30 06:12:32,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18948096. Throughput: 0: 723.8, 1: 723.9. Samples: 4734680. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:32,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.820')]
[2023-09-30 06:12:37,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 18980864. Throughput: 0: 722.1, 1: 721.1. Samples: 4743227. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:37,493][123291] Avg episode reward: [(0, '31.700'), (1, '31.800')]
[2023-09-30 06:12:41,467][125260] Updated weights for policy 0, policy_version 37120 (0.0016)
[2023-09-30 06:12:41,467][125261] Updated weights for policy 1, policy_version 37120 (0.0016)
[2023-09-30 06:12:42,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19005440. Throughput: 0: 718.1, 1: 718.6. Samples: 4751780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:42,492][123291] Avg episode reward: [(0, '31.700'), (1, '31.800')]
[2023-09-30 06:12:47,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5817.7). Total num frames: 19038208. Throughput: 0: 714.0, 1: 714.4. Samples: 4755988. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:47,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.800')]
[2023-09-30 06:12:52,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19062784. Throughput: 0: 720.3, 1: 719.8. Samples: 4764969. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:52,492][123291] Avg episode reward: [(0, '31.710'), (1, '31.800')]
[2023-09-30 06:12:55,510][125261] Updated weights for policy 1, policy_version 37280 (0.0014)
[2023-09-30 06:12:55,510][125260] Updated weights for policy 0, policy_version 37280 (0.0016)
[2023-09-30 06:12:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19095552. Throughput: 0: 726.9, 1: 727.8. Samples: 4773888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:12:57,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.800')]
[2023-09-30 06:13:02,492][123291] Fps is (10 sec: 6143.9, 60 sec: 5802.7, 300 sec: 5817.7). Total num frames: 19124224. Throughput: 0: 723.0, 1: 723.8. Samples: 4777985. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:02,493][123291] Avg episode reward: [(0, '31.720'), (1, '31.810')]
[2023-09-30 06:13:07,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19152896. Throughput: 0: 725.7, 1: 725.4. Samples: 4786839. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:07,492][123291] Avg episode reward: [(0, '31.720'), (1, '31.810')]
[2023-09-30 06:13:09,577][125260] Updated weights for policy 0, policy_version 37440 (0.0017)
[2023-09-30 06:13:09,577][125261] Updated weights for policy 1, policy_version 37440 (0.0017)
[2023-09-30 06:13:12,491][123291] Fps is (10 sec: 6144.1, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19185664. Throughput: 0: 726.0, 1: 726.3. Samples: 4795677. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:12,492][123291] Avg episode reward: [(0, '31.730'), (1, '31.880')]
[2023-09-30 06:13:17,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19210240. Throughput: 0: 728.8, 1: 727.0. Samples: 4800188. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:17,492][123291] Avg episode reward: [(0, '31.730'), (1, '31.880')]
[2023-09-30 06:13:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19243008. Throughput: 0: 727.6, 1: 728.1. Samples: 4808731. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:22,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.940')]
[2023-09-30 06:13:22,503][125162] Saving new best policy, reward=31.940!
[2023-09-30 06:13:23,437][125261] Updated weights for policy 1, policy_version 37600 (0.0018)
[2023-09-30 06:13:23,438][125260] Updated weights for policy 0, policy_version 37600 (0.0019)
[2023-09-30 06:13:27,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19267584. Throughput: 0: 732.7, 1: 733.8. Samples: 4817772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:27,493][123291] Avg episode reward: [(0, '31.740'), (1, '31.940')]
[2023-09-30 06:13:32,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19300352. Throughput: 0: 737.1, 1: 738.2. Samples: 4822376. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:32,492][123291] Avg episode reward: [(0, '31.740'), (1, '31.920')]
[2023-09-30 06:13:37,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19324928. Throughput: 0: 732.3, 1: 732.2. Samples: 4830869. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:37,492][123291] Avg episode reward: [(0, '31.740'), (1, '31.920')]
[2023-09-30 06:13:37,530][125261] Updated weights for policy 1, policy_version 37760 (0.0014)
[2023-09-30 06:13:37,530][125260] Updated weights for policy 0, policy_version 37760 (0.0015)
[2023-09-30 06:13:42,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19357696. Throughput: 0: 728.5, 1: 728.3. Samples: 4839441. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:42,492][123291] Avg episode reward: [(0, '31.730'), (1, '31.920')]
[2023-09-30 06:13:47,491][123291] Fps is (10 sec: 6553.6, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 19390464. Throughput: 0: 731.8, 1: 730.2. Samples: 4843776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:47,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.890')]
[2023-09-30 06:13:51,577][125260] Updated weights for policy 0, policy_version 37920 (0.0017)
[2023-09-30 06:13:51,577][125261] Updated weights for policy 1, policy_version 37920 (0.0017)
[2023-09-30 06:13:52,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19415040. Throughput: 0: 730.3, 1: 730.8. Samples: 4852587. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:52,492][123291] Avg episode reward: [(0, '31.750'), (1, '31.890')]
[2023-09-30 06:13:57,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19447808. Throughput: 0: 728.2, 1: 727.4. Samples: 4861180. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:13:57,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.910')]
[2023-09-30 06:14:02,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5802.7, 300 sec: 5803.8). Total num frames: 19472384. Throughput: 0: 727.9, 1: 729.0. Samples: 4865749. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:02,493][123291] Avg episode reward: [(0, '31.710'), (1, '31.910')]
[2023-09-30 06:14:05,564][125261] Updated weights for policy 1, policy_version 38080 (0.0017)
[2023-09-30 06:14:05,565][125260] Updated weights for policy 0, policy_version 38080 (0.0015)
[2023-09-30 06:14:07,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19505152. Throughput: 0: 731.0, 1: 730.4. Samples: 4874497. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:14:07,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.890')]
[2023-09-30 06:14:07,504][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000038096_9752576.pth...
[2023-09-30 06:14:07,504][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000038096_9752576.pth...
[2023-09-30 06:14:07,539][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000035376_9056256.pth
[2023-09-30 06:14:07,539][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000035376_9056256.pth
[2023-09-30 06:14:12,492][123291] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19537920. Throughput: 0: 730.6, 1: 730.4. Samples: 4883519. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:14:12,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.890')]
[2023-09-30 06:14:17,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19562496. Throughput: 0: 733.1, 1: 729.6. Samples: 4888199. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:14:17,492][123291] Avg episode reward: [(0, '31.690'), (1, '31.880')]
[2023-09-30 06:14:19,605][125260] Updated weights for policy 0, policy_version 38240 (0.0017)
[2023-09-30 06:14:19,605][125261] Updated weights for policy 1, policy_version 38240 (0.0017)
[2023-09-30 06:14:22,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19595264. Throughput: 0: 731.2, 1: 729.7. Samples: 4896609. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:14:22,493][123291] Avg episode reward: [(0, '31.690'), (1, '31.880')]
[2023-09-30 06:14:27,491][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19619840. Throughput: 0: 730.3, 1: 729.4. Samples: 4905126. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-30 06:14:27,492][123291] Avg episode reward: [(0, '31.660'), (1, '31.870')]
[2023-09-30 06:14:32,491][123291] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19652608. Throughput: 0: 729.5, 1: 730.3. Samples: 4909470. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:32,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.880')]
[2023-09-30 06:14:33,666][125260] Updated weights for policy 0, policy_version 38400 (0.0015)
[2023-09-30 06:14:33,667][125261] Updated weights for policy 1, policy_version 38400 (0.0017)
[2023-09-30 06:14:37,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19677184. Throughput: 0: 731.3, 1: 730.0. Samples: 4918345. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:37,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.880')]
[2023-09-30 06:14:42,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5817.7). Total num frames: 19709952. Throughput: 0: 735.3, 1: 735.8. Samples: 4927379. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:42,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.850')]
[2023-09-30 06:14:47,416][125260] Updated weights for policy 0, policy_version 38560 (0.0016)
[2023-09-30 06:14:47,416][125261] Updated weights for policy 1, policy_version 38560 (0.0017)
[2023-09-30 06:14:47,492][123291] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19742720. Throughput: 0: 731.1, 1: 732.4. Samples: 4931609. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:47,493][123291] Avg episode reward: [(0, '31.640'), (1, '31.850')]
[2023-09-30 06:14:52,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19767296. Throughput: 0: 735.2, 1: 734.8. Samples: 4940648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:52,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.830')]
[2023-09-30 06:14:57,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19800064. Throughput: 0: 734.3, 1: 732.9. Samples: 4949544. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:14:57,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.830')]
[2023-09-30 06:15:01,501][125261] Updated weights for policy 1, policy_version 38720 (0.0016)
[2023-09-30 06:15:01,502][125260] Updated weights for policy 0, policy_version 38720 (0.0017)
[2023-09-30 06:15:02,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 19824640. Throughput: 0: 730.8, 1: 731.6. Samples: 4954008. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:15:02,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.800')]
[2023-09-30 06:15:07,492][123291] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19857408. Throughput: 0: 732.9, 1: 734.0. Samples: 4962617. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:15:07,493][123291] Avg episode reward: [(0, '31.660'), (1, '31.800')]
[2023-09-30 06:15:12,492][123291] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 19881984. Throughput: 0: 737.9, 1: 737.1. Samples: 4971504. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:15:12,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.780')]
[2023-09-30 06:15:15,485][125260] Updated weights for policy 0, policy_version 38880 (0.0018)
[2023-09-30 06:15:15,485][125261] Updated weights for policy 1, policy_version 38880 (0.0018)
[2023-09-30 06:15:17,492][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 19914752. Throughput: 0: 737.2, 1: 736.3. Samples: 4975781. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-30 06:15:17,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.780')]
[2023-09-30 06:15:22,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5802.7, 300 sec: 5817.7). Total num frames: 19943424. Throughput: 0: 736.6, 1: 735.8. Samples: 4984601. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:15:22,493][123291] Avg episode reward: [(0, '31.630'), (1, '31.780')]
[2023-09-30 06:15:27,491][123291] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5817.7). Total num frames: 19972096. Throughput: 0: 728.8, 1: 729.9. Samples: 4993024. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:15:27,492][123291] Avg episode reward: [(0, '31.650'), (1, '31.710')]
[2023-09-30 06:15:29,603][125261] Updated weights for policy 1, policy_version 39040 (0.0017)
[2023-09-30 06:15:29,603][125260] Updated weights for policy 0, policy_version 39040 (0.0018)
[2023-09-30 06:15:32,492][123291] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 20004864. Throughput: 0: 730.0, 1: 729.5. Samples: 4997286. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-30 06:15:32,493][123291] Avg episode reward: [(0, '31.650'), (1, '31.710')]
[2023-09-30 06:15:33,732][125293] Stopping RolloutWorker_w1...
[2023-09-30 06:15:33,732][125300] Stopping RolloutWorker_w6...
[2023-09-30 06:15:33,732][125299] Stopping RolloutWorker_w5...
[2023-09-30 06:15:33,732][125297] Stopping RolloutWorker_w3...
[2023-09-30 06:15:33,732][125298] Stopping RolloutWorker_w4...
[2023-09-30 06:15:33,732][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000039088_10006528.pth...
[2023-09-30 06:15:33,732][125301] Stopping RolloutWorker_w7...
[2023-09-30 06:15:33,732][125294] Stopping RolloutWorker_w2...
[2023-09-30 06:15:33,733][125293] Loop rollout_proc1_evt_loop terminating...
[2023-09-30 06:15:33,733][125301] Loop rollout_proc7_evt_loop terminating...
[2023-09-30 06:15:33,733][125295] Stopping RolloutWorker_w0...
[2023-09-30 06:15:33,733][125300] Loop rollout_proc6_evt_loop terminating...
[2023-09-30 06:15:33,733][125299] Loop rollout_proc5_evt_loop terminating...
[2023-09-30 06:15:33,733][125297] Loop rollout_proc3_evt_loop terminating...
[2023-09-30 06:15:33,733][125298] Loop rollout_proc4_evt_loop terminating...
[2023-09-30 06:15:33,733][123291] Component RolloutWorker_w6 stopped!
[2023-09-30 06:15:33,733][125294] Loop rollout_proc2_evt_loop terminating...
[2023-09-30 06:15:33,733][125295] Loop rollout_proc0_evt_loop terminating...
[2023-09-30 06:15:33,733][123291] Component RolloutWorker_w5 stopped!
[2023-09-30 06:15:33,734][123291] Component RolloutWorker_w4 stopped!
[2023-09-30 06:15:33,735][123291] Component RolloutWorker_w3 stopped!
[2023-09-30 06:15:33,735][123291] Component RolloutWorker_w1 stopped!
[2023-09-30 06:15:33,736][123291] Component RolloutWorker_w2 stopped!
[2023-09-30 06:15:33,736][124965] Stopping Batcher_0...
[2023-09-30 06:15:33,737][123291] Component Batcher_1 stopped!
[2023-09-30 06:15:33,737][124965] Loop batcher_evt_loop terminating...
[2023-09-30 06:15:33,737][123291] Component RolloutWorker_w7 stopped!
[2023-09-30 06:15:33,738][123291] Component RolloutWorker_w0 stopped!
[2023-09-30 06:15:33,732][125162] Stopping Batcher_1...
[2023-09-30 06:15:33,738][123291] Component Batcher_0 stopped!
[2023-09-30 06:15:33,739][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000039088_10006528.pth...
[2023-09-30 06:15:33,752][125162] Loop batcher_evt_loop terminating...
[2023-09-30 06:15:33,768][124965] Removing ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000036736_9404416.pth
[2023-09-30 06:15:33,772][124965] Saving ./train_atari/atari_freeway/checkpoint_p0/checkpoint_000039088_10006528.pth...
[2023-09-30 06:15:33,777][125260] Weights refcount: 2 0
[2023-09-30 06:15:33,778][125162] Removing ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000036736_9404416.pth
[2023-09-30 06:15:33,779][125260] Stopping InferenceWorker_p0-w0...
[2023-09-30 06:15:33,779][125260] Loop inference_proc0-0_evt_loop terminating...
[2023-09-30 06:15:33,779][123291] Component InferenceWorker_p0-w0 stopped!
[2023-09-30 06:15:33,784][125162] Saving ./train_atari/atari_freeway/checkpoint_p1/checkpoint_000039088_10006528.pth...
[2023-09-30 06:15:33,797][125261] Weights refcount: 2 0
[2023-09-30 06:15:33,799][125261] Stopping InferenceWorker_p1-w0...
[2023-09-30 06:15:33,800][125261] Loop inference_proc1-0_evt_loop terminating...
[2023-09-30 06:15:33,800][123291] Component InferenceWorker_p1-w0 stopped!
[2023-09-30 06:15:33,843][124965] Stopping LearnerWorker_p0...
[2023-09-30 06:15:33,843][124965] Loop learner_proc0_evt_loop terminating...
[2023-09-30 06:15:33,844][125162] Stopping LearnerWorker_p1...
[2023-09-30 06:15:33,843][123291] Component LearnerWorker_p0 stopped!
[2023-09-30 06:15:33,844][125162] Loop learner_proc1_evt_loop terminating...
[2023-09-30 06:15:33,846][123291] Component LearnerWorker_p1 stopped!
[2023-09-30 06:15:33,847][123291] Waiting for process learner_proc0 to stop...
[2023-09-30 06:15:34,517][123291] Waiting for process learner_proc1 to stop...
[2023-09-30 06:15:34,587][123291] Waiting for process inference_proc0-0 to join...
[2023-09-30 06:15:34,588][123291] Waiting for process inference_proc1-0 to join...
[2023-09-30 06:15:34,589][123291] Waiting for process rollout_proc0 to join...
[2023-09-30 06:15:34,590][123291] Waiting for process rollout_proc1 to join...
[2023-09-30 06:15:34,590][123291] Waiting for process rollout_proc2 to join...
[2023-09-30 06:15:34,591][123291] Waiting for process rollout_proc3 to join...
[2023-09-30 06:15:34,592][123291] Waiting for process rollout_proc4 to join...
[2023-09-30 06:15:34,592][123291] Waiting for process rollout_proc5 to join...
[2023-09-30 06:15:34,593][123291] Waiting for process rollout_proc6 to join...
[2023-09-30 06:15:34,593][123291] Waiting for process rollout_proc7 to join...
[2023-09-30 06:15:34,594][123291] Batcher 0 profile tree view:
batching: 21.9047, releasing_batches: 1.9731
[2023-09-30 06:15:34,595][123291] Batcher 1 profile tree view:
batching: 21.7394, releasing_batches: 2.0775
[2023-09-30 06:15:34,595][123291] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0052
wait_policy_total: 736.1730
update_model: 39.4480
weight_update: 0.0017
one_step: 0.0012
handle_policy_step: 2442.8137
deserialize: 70.8718, stack: 17.4333, obs_to_device_normalize: 588.7658, forward: 1188.4960, send_messages: 96.9384
prepare_outputs: 328.2861
to_cpu: 165.8783
[2023-09-30 06:15:34,596][123291] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0052
wait_policy_total: 743.0885
update_model: 39.5372
weight_update: 0.0017
one_step: 0.0012
handle_policy_step: 2437.6575
deserialize: 72.3292, stack: 17.1788, obs_to_device_normalize: 590.2663, forward: 1180.2937, send_messages: 95.7770
prepare_outputs: 326.9959
to_cpu: 164.3966
[2023-09-30 06:15:34,596][123291] Learner 0 profile tree view:
misc: 0.0179, prepare_batch: 32.6797
train: 458.0023
epoch_init: 0.0913, minibatch_init: 3.1285, losses_postprocess: 62.5257, kl_divergence: 5.4859, after_optimizer: 21.2469
calculate_losses: 45.6093
losses_init: 0.0795, forward_head: 14.6312, bptt_initial: 0.4336, bptt: 0.4494, tail: 10.3959, advantages_returns: 3.0797, losses: 12.9245
update: 315.7455
clip: 164.8358
[2023-09-30 06:15:34,597][123291] Learner 1 profile tree view:
misc: 0.0160, prepare_batch: 32.9209
train: 458.0885
epoch_init: 0.0895, minibatch_init: 3.2639, losses_postprocess: 61.3471, kl_divergence: 5.6418, after_optimizer: 21.1493
calculate_losses: 46.4135
losses_init: 0.0977, forward_head: 14.9108, bptt_initial: 0.4283, bptt: 0.4431, tail: 10.6515, advantages_returns: 3.1596, losses: 13.0457
update: 315.9825
clip: 162.7958
[2023-09-30 06:15:34,597][123291] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4016, enqueue_policy_requests: 42.1799, env_step: 1409.8691, overhead: 28.0971, complete_rollouts: 1.0726
save_policy_outputs: 53.8260
split_output_tensors: 18.7937
[2023-09-30 06:15:34,597][123291] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4002, enqueue_policy_requests: 42.9877, env_step: 1412.7800, overhead: 27.9677, complete_rollouts: 1.0683
save_policy_outputs: 53.1721
split_output_tensors: 18.3138
[2023-09-30 06:15:34,597][123291] Loop Runner_EvtLoop terminating...
[2023-09-30 06:15:34,598][123291] Runner profile tree view:
main_loop: 3442.8651
[2023-09-30 06:15:34,598][123291] Collected {0: 10006528, 1: 10006528}, FPS: 5812.9