appo-atari-amidar / sf_log.txt
[2023-09-22 10:05:05,837][82473] Saving configuration to ./train_atari/Amidar/config.json...
[2023-09-22 10:05:06,103][82473] Rollout worker 0 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 1 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 2 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 3 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 4 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 5 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 6 uses device cpu
[2023-09-22 10:05:06,104][82473] Rollout worker 7 uses device cpu
[2023-09-22 10:05:06,104][82473] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2023-09-22 10:05:06,163][82473] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:05:06,164][82473] InferenceWorker_p0-w0: min num requests: 2
[2023-09-22 10:05:06,188][82473] Starting all processes...
[2023-09-22 10:05:06,188][82473] Starting process learner_proc0
[2023-09-22 10:05:07,808][82473] Starting all processes...
[2023-09-22 10:05:07,811][82914] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:05:07,812][82914] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-09-22 10:05:07,815][82473] Starting process inference_proc0-0
[2023-09-22 10:05:07,815][82473] Starting process rollout_proc0
[2023-09-22 10:05:07,816][82473] Starting process rollout_proc1
[2023-09-22 10:05:07,816][82473] Starting process rollout_proc2
[2023-09-22 10:05:07,820][82473] Starting process rollout_proc3
[2023-09-22 10:05:07,823][82473] Starting process rollout_proc4
[2023-09-22 10:05:07,823][82473] Starting process rollout_proc5
[2023-09-22 10:05:07,824][82473] Starting process rollout_proc6
[2023-09-22 10:05:07,825][82473] Starting process rollout_proc7
[2023-09-22 10:05:07,849][82914] Num visible devices: 1
[2023-09-22 10:05:07,937][82914] Starting seed is not provided
[2023-09-22 10:05:07,937][82914] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:05:07,938][82914] Initializing actor-critic model on device cuda:0
[2023-09-22 10:05:07,938][82914] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 10:05:07,939][82914] RunningMeanStd input shape: (1,)
[2023-09-22 10:05:08,013][82914] ConvEncoder: input_channels=4
[2023-09-22 10:05:08,353][82914] Conv encoder output size: 512
[2023-09-22 10:05:08,355][82914] Created Actor Critic model with architecture:
[2023-09-22 10:05:08,355][82914] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=10, bias=True)
  )
)
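The log confirms three Conv2d+ReLU pairs feeding an MLP with a 512-dim output, but not the filter sizes. Assuming the classic Nature-DQN layout (32 filters of 8 stride 4, 64 of 4 stride 2, 64 of 3 stride 1 — an assumption, since the actual filters are not logged) on 84×84 inputs, the flattened feature count into the final Linear works out as:

```python
def conv_out(size, kernel, stride):
    # output spatial size of a valid (no padding) convolution
    return (size - kernel) // stride + 1

# Assumed Nature-DQN filters; the log only confirms the three Conv2d+ReLU
# pairs and the 512-dim encoder output.
size = 84
for out_channels, kernel, stride in [(32, 8, 4), (64, 4, 2), (64, 3, 1)]:
    size = conv_out(size, kernel, stride)
flat_features = 64 * size * size  # input width of the mlp_layers Linear -> 512
```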
[2023-09-22 10:05:08,946][82914] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-22 10:05:08,947][82914] No checkpoints found
[2023-09-22 10:05:08,947][82914] Did not load from checkpoint, starting from scratch!
[2023-09-22 10:05:08,948][82914] Initialized policy 0 weights for model version 0
[2023-09-22 10:05:08,950][82914] LearnerWorker_p0 finished initialization!
[2023-09-22 10:05:08,950][82914] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:05:09,803][82977] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-09-22 10:05:09,807][82982] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-09-22 10:05:09,807][82981] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-09-22 10:05:09,811][82978] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-09-22 10:05:09,824][82980] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-09-22 10:05:09,827][82983] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-09-22 10:05:09,849][82984] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-09-22 10:05:09,898][82976] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:05:09,898][82976] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-09-22 10:05:09,916][82976] Num visible devices: 1
[2023-09-22 10:05:09,982][82979] Worker 4 uses CPU cores [16, 17, 18, 19]
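Each of the 8 rollout workers above is pinned to a contiguous block of 4 CPU cores (worker 0 gets [0..3], worker 7 gets [28..31]). A sketch of that assignment rule, inferred from the log rather than taken from the library's code:

```python
def cores_for_worker(worker_idx, cores_per_worker=4):
    """Contiguous core pinning as seen in the log: worker i occupies cores
    [i*k .. i*k + k - 1]. The rule is inferred from the output, not quoted
    from Sample Factory's scheduler."""
    first = worker_idx * cores_per_worker
    return list(range(first, first + cores_per_worker))
```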
[2023-09-22 10:05:10,515][82976] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 10:05:10,516][82976] RunningMeanStd input shape: (1,)
[2023-09-22 10:05:10,527][82976] ConvEncoder: input_channels=4
[2023-09-22 10:05:10,632][82976] Conv encoder output size: 512
[2023-09-22 10:05:10,637][82473] Inference worker 0-0 is ready!
[2023-09-22 10:05:10,638][82473] All inference workers are ready! Signal rollout workers to start!
[2023-09-22 10:05:11,105][82984] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,107][82983] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,108][82978] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,109][82982] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,112][82980] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,112][82979] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,113][82981] Decorrelating experience for 0 frames...
[2023-09-22 10:05:11,127][82977] Decorrelating experience for 0 frames...
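"Decorrelating experience for 0 frames" means the workers start without staggering; normally each rollout worker would step its environments a different number of frames before training so that their episodes do not stay in lockstep. A hypothetical staggering rule — the formula is purely illustrative, not Sample Factory's:

```python
def decorrelation_frames(worker_idx, num_workers, rollout=32, frameskip=4):
    """Hypothetical warm-up schedule: give each worker a different frame
    count so rollouts desynchronize. This run logged 0 frames, i.e.
    decorrelation was effectively disabled."""
    return worker_idx * rollout * frameskip // num_workers
```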
[2023-09-22 10:05:11,685][82473] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-09-22 10:05:16,685][82473] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 551.2. Samples: 2756. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:05:16,686][82473] Avg episode reward: [(0, '0.125')]
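The Fps lines report throughput over 10/60/300-second sliding windows, showing nan until enough samples have accumulated. A sketch of such windowed FPS tracking — an assumed implementation, with timestamps passed in explicitly so it can be exercised deterministically:

```python
from collections import deque

class WindowedFps:
    """Sliding-window FPS readings like the log's (10 sec, 60 sec, 300 sec).
    Assumed implementation, not Sample Factory's actual reporting code."""

    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.samples = deque()  # (timestamp, total_frames) pairs

    def record(self, now, total_frames):
        self.samples.append((now, total_frames))
        # keep only what the widest window needs
        while self.samples and now - self.samples[0][0] > max(self.windows):
            self.samples.popleft()

    def fps(self, now, window):
        past = [(t, f) for t, f in self.samples if now - t <= window]
        if len(past) < 2:
            return float("nan")  # matches 'Fps is (10 sec: nan, ...)'
        (t0, f0), (t1, f1) = past[0], past[-1]
        return (f1 - f0) / (t1 - t0)
```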
[2023-09-22 10:05:20,138][82473] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 82473], exiting...
[2023-09-22 10:05:20,138][82473] Runner profile tree view:
main_loop: 13.9507
[2023-09-22 10:05:20,138][82978] Stopping RolloutWorker_w1...
[2023-09-22 10:05:20,138][82981] Stopping RolloutWorker_w3...
[2023-09-22 10:05:20,138][82979] Stopping RolloutWorker_w4...
[2023-09-22 10:05:20,139][82473] Collected {0: 24576}, FPS: 1761.6
[2023-09-22 10:05:20,139][82978] Loop rollout_proc1_evt_loop terminating...
[2023-09-22 10:05:20,139][82979] Loop rollout_proc4_evt_loop terminating...
[2023-09-22 10:05:20,139][82981] Loop rollout_proc3_evt_loop terminating...
[2023-09-22 10:05:20,139][82984] Stopping RolloutWorker_w7...
[2023-09-22 10:05:20,139][82914] Stopping Batcher_0...
[2023-09-22 10:05:20,140][82980] Stopping RolloutWorker_w2...
[2023-09-22 10:05:20,140][82984] Loop rollout_proc7_evt_loop terminating...
[2023-09-22 10:05:20,140][82980] Loop rollout_proc2_evt_loop terminating...
[2023-09-22 10:05:20,140][82914] Loop batcher_evt_loop terminating...
[2023-09-22 10:05:20,140][82982] Stopping RolloutWorker_w6...
[2023-09-22 10:05:20,140][82977] Stopping RolloutWorker_w0...
[2023-09-22 10:05:20,140][82982] Loop rollout_proc6_evt_loop terminating...
[2023-09-22 10:05:20,141][82977] Loop rollout_proc0_evt_loop terminating...
[2023-09-22 10:05:20,141][82914] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000000096_24576.pth...
[2023-09-22 10:05:20,141][82983] Stopping RolloutWorker_w5...
[2023-09-22 10:05:20,142][82983] Loop rollout_proc5_evt_loop terminating...
[2023-09-22 10:05:20,154][82976] Weights refcount: 2 0
[2023-09-22 10:05:20,156][82976] Stopping InferenceWorker_p0-w0...
[2023-09-22 10:05:20,156][82976] Loop inference_proc0-0_evt_loop terminating...
[2023-09-22 10:05:20,183][82914] Stopping LearnerWorker_p0...
[2023-09-22 10:05:20,184][82914] Loop learner_proc0_evt_loop terminating...
[2023-09-22 10:06:14,636][86732] Saving configuration to ./train_atari/Amidar/config.json...
[2023-09-22 10:06:14,908][86732] Rollout worker 0 uses device cpu
[2023-09-22 10:06:14,909][86732] Rollout worker 1 uses device cpu
[2023-09-22 10:06:14,909][86732] Rollout worker 2 uses device cpu
[2023-09-22 10:06:14,910][86732] Rollout worker 3 uses device cpu
[2023-09-22 10:06:14,910][86732] Rollout worker 4 uses device cpu
[2023-09-22 10:06:14,911][86732] Rollout worker 5 uses device cpu
[2023-09-22 10:06:14,912][86732] Rollout worker 6 uses device cpu
[2023-09-22 10:06:14,912][86732] Rollout worker 7 uses device cpu
[2023-09-22 10:06:14,912][86732] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2023-09-22 10:06:14,975][86732] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:06:14,975][86732] InferenceWorker_p0-w0: min num requests: 1
[2023-09-22 10:06:14,979][86732] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-22 10:06:14,979][86732] InferenceWorker_p1-w0: min num requests: 1
[2023-09-22 10:06:15,003][86732] Starting all processes...
[2023-09-22 10:06:15,004][86732] Starting process learner_proc0
[2023-09-22 10:06:16,808][86732] Starting process learner_proc1
[2023-09-22 10:06:16,811][88211] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:06:16,812][88211] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-09-22 10:06:16,850][88211] Num visible devices: 1
[2023-09-22 10:06:16,932][88211] Starting seed is not provided
[2023-09-22 10:06:16,932][88211] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:06:16,932][88211] Initializing actor-critic model on device cuda:0
[2023-09-22 10:06:16,933][88211] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 10:06:16,933][88211] RunningMeanStd input shape: (1,)
[2023-09-22 10:06:16,952][88211] ConvEncoder: input_channels=4
[2023-09-22 10:06:17,094][88211] Conv encoder output size: 512
[2023-09-22 10:06:17,095][88211] Created Actor Critic model with architecture:
[2023-09-22 10:06:17,096][88211] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=10, bias=True)
  )
)
[2023-09-22 10:06:17,636][88211] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-22 10:06:17,636][88211] Loading state from checkpoint ./train_atari/Amidar/checkpoint_p0/checkpoint_000000096_24576.pth...
[2023-09-22 10:06:17,654][88211] Loading model from checkpoint
[2023-09-22 10:06:17,657][88211] Loaded experiment state at self.train_step=96, self.env_steps=24576
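The checkpoint filename encodes both counters the learner restores here (train_step=96, env_steps=24576). A hypothetical parser for that naming scheme — the pattern is inferred from the filenames in this log, not taken from the library:

```python
import re

def parse_checkpoint_name(filename):
    """Recover (train_step, env_steps) from names like
    'checkpoint_000000096_24576.pth' as they appear in this log.
    Hypothetical helper; the naming convention is inferred, not quoted."""
    m = re.fullmatch(r"checkpoint_(\d+)_(\d+)\.pth", filename)
    if m is None:
        raise ValueError(f"not a checkpoint name: {filename!r}")
    return int(m.group(1)), int(m.group(2))
```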
[2023-09-22 10:06:17,657][88211] Initialized policy 0 weights for model version 96
[2023-09-22 10:06:17,659][88211] LearnerWorker_p0 finished initialization!
[2023-09-22 10:06:17,659][88211] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:06:18,490][86732] Starting all processes...
[2023-09-22 10:06:18,494][88352] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-22 10:06:18,494][88352] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
[2023-09-22 10:06:18,497][86732] Starting process inference_proc0-0
[2023-09-22 10:06:18,497][86732] Starting process inference_proc1-0
[2023-09-22 10:06:18,498][86732] Starting process rollout_proc0
[2023-09-22 10:06:18,498][86732] Starting process rollout_proc1
[2023-09-22 10:06:18,501][86732] Starting process rollout_proc2
[2023-09-22 10:06:18,505][86732] Starting process rollout_proc3
[2023-09-22 10:06:18,505][86732] Starting process rollout_proc4
[2023-09-22 10:06:18,506][86732] Starting process rollout_proc5
[2023-09-22 10:06:18,530][88352] Num visible devices: 1
[2023-09-22 10:06:18,507][86732] Starting process rollout_proc6
[2023-09-22 10:06:18,510][86732] Starting process rollout_proc7
[2023-09-22 10:06:18,679][88352] Starting seed is not provided
[2023-09-22 10:06:18,680][88352] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-22 10:06:18,680][88352] Initializing actor-critic model on device cuda:0
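Lines like "Using GPUs [0] for process 1 (actually maps to GPUs [1])" reflect CUDA device renumbering: once CUDA_VISIBLE_DEVICES='1' is set, physical GPU 1 becomes cuda:0 inside that process. A minimal sketch of this bookkeeping — a hypothetical helper, not Sample Factory's actual code:

```python
import os

def assign_gpus(process_idx, gpus_per_process):
    """Hypothetical sketch: restrict a worker process to its share of GPUs.
    After CUDA_VISIBLE_DEVICES is set, CUDA renumbers visible devices from
    0, so inside learner process 1 the physical GPU 1 is local cuda:0."""
    physical = [process_idx * gpus_per_process + i for i in range(gpus_per_process)]
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in physical)
    local = list(range(gpus_per_process))  # indices torch sees in-process
    return local, physical
```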
[2023-09-22 10:06:18,680][88352] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 10:06:18,681][88352] RunningMeanStd input shape: (1,)
[2023-09-22 10:06:18,719][88352] ConvEncoder: input_channels=4
[2023-09-22 10:06:18,970][88352] Conv encoder output size: 512
[2023-09-22 10:06:18,973][88352] Created Actor Critic model with architecture:
[2023-09-22 10:06:18,973][88352] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=10, bias=True)
  )
)
[2023-09-22 10:06:19,595][88352] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-22 10:06:19,596][88352] No checkpoints found
[2023-09-22 10:06:19,596][88352] Did not load from checkpoint, starting from scratch!
[2023-09-22 10:06:19,596][88352] Initialized policy 1 weights for model version 0
[2023-09-22 10:06:19,597][88352] LearnerWorker_p1 finished initialization!
[2023-09-22 10:06:19,598][88352] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-22 10:06:20,502][88480] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-09-22 10:06:20,502][88474] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-22 10:06:20,502][88474] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1
[2023-09-22 10:06:20,519][88474] Num visible devices: 1
[2023-09-22 10:06:20,644][88473] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 10:06:20,644][88473] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-09-22 10:06:20,646][88479] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-09-22 10:06:20,662][88473] Num visible devices: 1
[2023-09-22 10:06:20,735][88681] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-09-22 10:06:20,783][88485] Worker 4 uses CPU cores [16, 17, 18, 19]
[2023-09-22 10:06:20,799][88476] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-09-22 10:06:20,802][88478] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-09-22 10:06:20,831][88686] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-09-22 10:06:20,930][86732] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 24576. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-09-22 10:06:20,932][88687] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-09-22 10:06:21,154][88474] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 10:06:21,154][88474] RunningMeanStd input shape: (1,)
[2023-09-22 10:06:21,167][88474] ConvEncoder: input_channels=4
[2023-09-22 10:06:21,252][88473] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 10:06:21,253][88473] RunningMeanStd input shape: (1,)
[2023-09-22 10:06:21,264][88473] ConvEncoder: input_channels=4
[2023-09-22 10:06:21,274][88474] Conv encoder output size: 512
[2023-09-22 10:06:21,280][86732] Inference worker 1-0 is ready!
[2023-09-22 10:06:21,367][88473] Conv encoder output size: 512
[2023-09-22 10:06:21,373][86732] Inference worker 0-0 is ready!
[2023-09-22 10:06:21,374][86732] All inference workers are ready! Signal rollout workers to start!
[2023-09-22 10:06:21,846][88681] Decorrelating experience for 0 frames...
[2023-09-22 10:06:21,847][88479] Decorrelating experience for 0 frames...
[2023-09-22 10:06:21,850][88480] Decorrelating experience for 0 frames...
[2023-09-22 10:06:21,896][88686] Decorrelating experience for 0 frames...
[2023-09-22 10:06:21,933][88476] Decorrelating experience for 0 frames...
[2023-09-22 10:06:21,935][88485] Decorrelating experience for 0 frames...
[2023-09-22 10:06:22,163][88478] Decorrelating experience for 0 frames...
[2023-09-22 10:06:22,164][88687] Decorrelating experience for 0 frames...
[2023-09-22 10:06:25,440][86732] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 24576. Throughput: 0: 227.0, 1: 227.0. Samples: 2048. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-09-22 10:06:25,441][86732] Avg episode reward: [(0, '1.667'), (1, '0.000')]
[2023-09-22 10:06:30,440][86732] Fps is (10 sec: 2584.1, 60 sec: 2584.1, 300 sec: 2584.1). Total num frames: 49152. Throughput: 0: 389.3, 1: 380.5. Samples: 7321. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:06:30,441][86732] Avg episode reward: [(0, '0.500'), (1, '0.125')]
[2023-09-22 10:06:34,962][86732] Heartbeat connected on Batcher_0
[2023-09-22 10:06:34,971][86732] Heartbeat connected on Batcher_1
[2023-09-22 10:06:34,983][86732] Heartbeat connected on RolloutWorker_w0
[2023-09-22 10:06:34,985][86732] Heartbeat connected on RolloutWorker_w1
[2023-09-22 10:06:34,988][86732] Heartbeat connected on RolloutWorker_w2
[2023-09-22 10:06:34,991][86732] Heartbeat connected on RolloutWorker_w3
[2023-09-22 10:06:34,994][86732] Heartbeat connected on RolloutWorker_w4
[2023-09-22 10:06:34,997][86732] Heartbeat connected on RolloutWorker_w5
[2023-09-22 10:06:34,999][86732] Heartbeat connected on RolloutWorker_w6
[2023-09-22 10:06:35,002][86732] Heartbeat connected on RolloutWorker_w7
[2023-09-22 10:06:35,020][86732] Heartbeat connected on InferenceWorker_p1-w0
[2023-09-22 10:06:35,024][86732] Heartbeat connected on InferenceWorker_p0-w0
[2023-09-22 10:06:35,059][86732] Heartbeat connected on LearnerWorker_p1
[2023-09-22 10:06:35,064][86732] Heartbeat connected on LearnerWorker_p0
[2023-09-22 10:06:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 3951.9, 300 sec: 3951.9). Total num frames: 81920. Throughput: 0: 418.0, 1: 410.6. Samples: 12024. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:06:35,441][86732] Avg episode reward: [(0, '0.510'), (1, '0.135')]
[2023-09-22 10:06:39,129][88473] Updated weights for policy 0, policy_version 256 (0.0019)
[2023-09-22 10:06:39,129][88474] Updated weights for policy 1, policy_version 160 (0.0018)
[2023-09-22 10:06:40,440][86732] Fps is (10 sec: 6144.0, 60 sec: 4408.7, 300 sec: 4408.7). Total num frames: 110592. Throughput: 0: 536.1, 1: 531.4. Samples: 20828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:06:40,441][86732] Avg episode reward: [(0, '0.569'), (1, '0.163')]
[2023-09-22 10:06:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 4679.2, 300 sec: 4679.2). Total num frames: 139264. Throughput: 0: 612.2, 1: 611.5. Samples: 29993. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:06:45,441][86732] Avg episode reward: [(0, '0.814'), (1, '0.230')]
[2023-09-22 10:06:50,440][86732] Fps is (10 sec: 6144.1, 60 sec: 4996.8, 300 sec: 4996.8). Total num frames: 172032. Throughput: 0: 589.9, 1: 586.6. Samples: 34720. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:06:50,440][86732] Avg episode reward: [(0, '0.920'), (1, '0.360')]
[2023-09-22 10:06:52,775][88473] Updated weights for policy 0, policy_version 416 (0.0016)
[2023-09-22 10:06:52,775][88474] Updated weights for policy 1, policy_version 320 (0.0015)
[2023-09-22 10:06:55,440][86732] Fps is (10 sec: 6553.7, 60 sec: 5222.3, 300 sec: 5222.3). Total num frames: 204800. Throughput: 0: 630.2, 1: 627.8. Samples: 43412. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:06:55,440][86732] Avg episode reward: [(0, '1.240'), (1, '0.370')]
[2023-09-22 10:06:55,446][88211] Saving new best policy, reward=1.240!
[2023-09-22 10:07:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5183.5, 300 sec: 5183.5). Total num frames: 229376. Throughput: 0: 671.4, 1: 668.5. Samples: 52939. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:00,441][86732] Avg episode reward: [(0, '1.570'), (1, '0.520')]
[2023-09-22 10:07:00,442][88352] Saving new best policy, reward=0.520!
[2023-09-22 10:07:00,442][88211] Saving new best policy, reward=1.570!
[2023-09-22 10:07:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5337.4, 300 sec: 5337.4). Total num frames: 262144. Throughput: 0: 644.2, 1: 644.2. Samples: 57344. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:07:05,440][86732] Avg episode reward: [(0, '1.600'), (1, '0.810')]
[2023-09-22 10:07:05,441][88211] Saving new best policy, reward=1.600!
[2023-09-22 10:07:05,441][88352] Saving new best policy, reward=0.810!
[2023-09-22 10:07:06,184][88474] Updated weights for policy 1, policy_version 480 (0.0017)
[2023-09-22 10:07:06,185][88473] Updated weights for policy 0, policy_version 576 (0.0017)
[2023-09-22 10:07:10,440][86732] Fps is (10 sec: 6553.5, 60 sec: 5460.2, 300 sec: 5460.2). Total num frames: 294912. Throughput: 0: 716.5, 1: 715.2. Samples: 66473. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:07:10,441][86732] Avg episode reward: [(0, '1.870'), (1, '0.870')]
[2023-09-22 10:07:10,449][88211] Saving new best policy, reward=1.870!
[2023-09-22 10:07:10,449][88352] Saving new best policy, reward=0.870!
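The repeated "Saving new best policy" lines show each learner (policy 0 on PID 88211, policy 1 on 88352) tracking its own best average episode reward and checkpointing on improvement. A sketch of that bookkeeping — an assumed threshold-free comparison, not the library's actual logic:

```python
def maybe_save_best(best_reward, avg_reward, save_fn):
    """Save a 'new best policy' checkpoint whenever the average episode
    reward improves on the learner's running best. Assumed logic inferred
    from the log lines; save_fn stands in for the real checkpoint writer."""
    if best_reward is None or avg_reward > best_reward:
        save_fn(avg_reward)
        return avg_reward
    return best_reward
```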
[2023-09-22 10:07:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5410.2, 300 sec: 5410.2). Total num frames: 319488. Throughput: 0: 759.5, 1: 759.0. Samples: 75652. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:07:15,440][86732] Avg episode reward: [(0, '2.200'), (1, '1.150')]
[2023-09-22 10:07:15,441][88352] Saving new best policy, reward=1.150!
[2023-09-22 10:07:15,598][88211] Saving new best policy, reward=2.200!
[2023-09-22 10:07:19,924][88474] Updated weights for policy 1, policy_version 640 (0.0015)
[2023-09-22 10:07:19,926][88473] Updated weights for policy 0, policy_version 736 (0.0016)
[2023-09-22 10:07:20,440][86732] Fps is (10 sec: 5734.6, 60 sec: 5506.3, 300 sec: 5506.3). Total num frames: 352256. Throughput: 0: 752.7, 1: 754.9. Samples: 79864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:07:20,440][86732] Avg episode reward: [(0, '2.230'), (1, '1.370')]
[2023-09-22 10:07:20,441][88352] Saving new best policy, reward=1.370!
[2023-09-22 10:07:20,441][88211] Saving new best policy, reward=2.230!
[2023-09-22 10:07:25,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5587.5). Total num frames: 385024. Throughput: 0: 754.9, 1: 755.4. Samples: 88790. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:25,441][86732] Avg episode reward: [(0, '2.470'), (1, '1.290')]
[2023-09-22 10:07:25,450][88211] Saving new best policy, reward=2.470!
[2023-09-22 10:07:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5539.1). Total num frames: 409600. Throughput: 0: 757.7, 1: 756.4. Samples: 98131. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:30,441][86732] Avg episode reward: [(0, '2.350'), (1, '1.660')]
[2023-09-22 10:07:30,630][88352] Saving new best policy, reward=1.660!
[2023-09-22 10:07:33,337][88474] Updated weights for policy 1, policy_version 800 (0.0016)
[2023-09-22 10:07:33,337][88473] Updated weights for policy 0, policy_version 896 (0.0018)
[2023-09-22 10:07:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5607.2). Total num frames: 442368. Throughput: 0: 751.6, 1: 753.1. Samples: 102433. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:35,441][86732] Avg episode reward: [(0, '2.260'), (1, '1.770')]
[2023-09-22 10:07:35,441][88352] Saving new best policy, reward=1.770!
[2023-09-22 10:07:40,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6075.7, 300 sec: 5666.7). Total num frames: 475136. Throughput: 0: 761.1, 1: 760.4. Samples: 111882. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:40,440][86732] Avg episode reward: [(0, '2.520'), (1, '1.890')]
[2023-09-22 10:07:40,447][88352] Saving new best policy, reward=1.890!
[2023-09-22 10:07:40,447][88211] Saving new best policy, reward=2.520!
[2023-09-22 10:07:45,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5719.2). Total num frames: 507904. Throughput: 0: 757.8, 1: 758.1. Samples: 121154. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:45,441][86732] Avg episode reward: [(0, '2.680'), (1, '2.220')]
[2023-09-22 10:07:45,442][88211] Saving new best policy, reward=2.680!
[2023-09-22 10:07:45,442][88352] Saving new best policy, reward=2.220!
[2023-09-22 10:07:46,561][88474] Updated weights for policy 1, policy_version 960 (0.0016)
[2023-09-22 10:07:46,562][88473] Updated weights for policy 0, policy_version 1056 (0.0017)
[2023-09-22 10:07:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5674.3). Total num frames: 532480. Throughput: 0: 761.3, 1: 759.3. Samples: 125769. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:50,440][86732] Avg episode reward: [(0, '2.770'), (1, '2.080')]
[2023-09-22 10:07:50,623][88211] Saving new best policy, reward=2.770!
[2023-09-22 10:07:55,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5720.8). Total num frames: 565248. Throughput: 0: 762.6, 1: 763.8. Samples: 135162. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:07:55,440][86732] Avg episode reward: [(0, '2.920'), (1, '2.050')]
[2023-09-22 10:07:55,450][88211] Saving new best policy, reward=2.920!
[2023-09-22 10:07:59,998][88474] Updated weights for policy 1, policy_version 1120 (0.0019)
[2023-09-22 10:07:59,998][88473] Updated weights for policy 0, policy_version 1216 (0.0020)
[2023-09-22 10:08:00,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5762.6). Total num frames: 598016. Throughput: 0: 759.0, 1: 759.4. Samples: 143979. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:00,441][86732] Avg episode reward: [(0, '2.780'), (1, '2.060')]
[2023-09-22 10:08:05,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 5800.5). Total num frames: 630784. Throughput: 0: 765.5, 1: 763.4. Samples: 148666. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:08:05,441][86732] Avg episode reward: [(0, '3.020'), (1, '2.060')]
[2023-09-22 10:08:05,442][88211] Saving new best policy, reward=3.020!
[2023-09-22 10:08:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5760.1). Total num frames: 655360. Throughput: 0: 765.0, 1: 766.4. Samples: 157705. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:08:10,440][86732] Avg episode reward: [(0, '2.970'), (1, '2.100')]
[2023-09-22 10:08:10,638][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000001248_319488.pth...
[2023-09-22 10:08:10,646][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000001344_344064.pth...
[2023-09-22 10:08:13,490][88473] Updated weights for policy 0, policy_version 1376 (0.0018)
[2023-09-22 10:08:13,491][88474] Updated weights for policy 1, policy_version 1280 (0.0018)
[2023-09-22 10:08:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5794.7). Total num frames: 688128. Throughput: 0: 763.5, 1: 763.3. Samples: 166835. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:08:15,441][86732] Avg episode reward: [(0, '2.830'), (1, '2.190')]
[2023-09-22 10:08:20,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5826.4). Total num frames: 720896. Throughput: 0: 769.0, 1: 766.5. Samples: 171530. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:20,441][86732] Avg episode reward: [(0, '2.920'), (1, '2.250')]
[2023-09-22 10:08:20,442][88352] Saving new best policy, reward=2.250!
[2023-09-22 10:08:25,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5789.8). Total num frames: 745472. Throughput: 0: 758.2, 1: 760.6. Samples: 180228. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:25,441][86732] Avg episode reward: [(0, '2.770'), (1, '2.180')]
[2023-09-22 10:08:27,206][88473] Updated weights for policy 0, policy_version 1536 (0.0018)
[2023-09-22 10:08:27,207][88474] Updated weights for policy 1, policy_version 1440 (0.0019)
[2023-09-22 10:08:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5819.3). Total num frames: 778240. Throughput: 0: 756.2, 1: 756.0. Samples: 189203. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:08:30,441][86732] Avg episode reward: [(0, '2.950'), (1, '2.200')]
[2023-09-22 10:08:35,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5846.6). Total num frames: 811008. Throughput: 0: 758.2, 1: 757.7. Samples: 193985. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:35,441][86732] Avg episode reward: [(0, '3.270'), (1, '2.120')]
[2023-09-22 10:08:35,442][88211] Saving new best policy, reward=3.270!
[2023-09-22 10:08:40,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5813.3). Total num frames: 835584. Throughput: 0: 754.8, 1: 753.2. Samples: 203024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:40,440][86732] Avg episode reward: [(0, '3.570'), (1, '2.170')]
[2023-09-22 10:08:40,496][88211] Saving new best policy, reward=3.570!
[2023-09-22 10:08:40,499][88473] Updated weights for policy 0, policy_version 1696 (0.0017)
[2023-09-22 10:08:40,500][88474] Updated weights for policy 1, policy_version 1600 (0.0017)
[2023-09-22 10:08:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5838.9). Total num frames: 868352. Throughput: 0: 761.5, 1: 761.5. Samples: 212514. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:45,441][86732] Avg episode reward: [(0, '3.660'), (1, '2.300')]
[2023-09-22 10:08:45,442][88352] Saving new best policy, reward=2.300!
[2023-09-22 10:08:45,442][88211] Saving new best policy, reward=3.660!
[2023-09-22 10:08:50,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5862.8). Total num frames: 901120. Throughput: 0: 759.1, 1: 761.2. Samples: 217083. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:08:50,441][86732] Avg episode reward: [(0, '3.630'), (1, '2.300')]
[2023-09-22 10:08:53,890][88473] Updated weights for policy 0, policy_version 1856 (0.0016)
[2023-09-22 10:08:53,890][88474] Updated weights for policy 1, policy_version 1760 (0.0016)
[2023-09-22 10:08:55,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5885.1). Total num frames: 933888. Throughput: 0: 760.7, 1: 759.0. Samples: 226091. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:08:55,441][86732] Avg episode reward: [(0, '3.610'), (1, '2.490')]
[2023-09-22 10:08:55,452][88352] Saving new best policy, reward=2.490!
[2023-09-22 10:09:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5854.7). Total num frames: 958464. Throughput: 0: 759.4, 1: 759.0. Samples: 235162. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:09:00,441][86732] Avg episode reward: [(0, '3.310'), (1, '2.650')]
[2023-09-22 10:09:00,442][88352] Saving new best policy, reward=2.650!
[2023-09-22 10:09:05,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5876.0). Total num frames: 991232. Throughput: 0: 755.1, 1: 758.0. Samples: 239620. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:09:05,440][86732] Avg episode reward: [(0, '3.510'), (1, '2.670')]
[2023-09-22 10:09:05,441][88352] Saving new best policy, reward=2.670!
[2023-09-22 10:09:07,391][88474] Updated weights for policy 1, policy_version 1920 (0.0018)
[2023-09-22 10:09:07,391][88473] Updated weights for policy 0, policy_version 2016 (0.0016)
[2023-09-22 10:09:10,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5895.9). Total num frames: 1024000. Throughput: 0: 764.8, 1: 763.7. Samples: 249013. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:09:10,441][86732] Avg episode reward: [(0, '3.500'), (1, '2.660')]
[2023-09-22 10:09:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5867.9). Total num frames: 1048576. Throughput: 0: 713.2, 1: 766.1. Samples: 255772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:09:15,440][86732] Avg episode reward: [(0, '3.370'), (1, '2.710')]
[2023-09-22 10:09:15,450][88352] Saving new best policy, reward=2.710!
[2023-09-22 10:09:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5886.9). Total num frames: 1081344. Throughput: 0: 761.7, 1: 761.9. Samples: 262549. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:09:20,441][86732] Avg episode reward: [(0, '3.870'), (1, '3.060')]
[2023-09-22 10:09:20,442][88211] Saving new best policy, reward=3.870!
[2023-09-22 10:09:20,442][88352] Saving new best policy, reward=3.060!
[2023-09-22 10:09:20,815][88474] Updated weights for policy 1, policy_version 2080 (0.0017)
[2023-09-22 10:09:20,816][88473] Updated weights for policy 0, policy_version 2176 (0.0015)
[2023-09-22 10:09:25,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5905.0). Total num frames: 1114112. Throughput: 0: 766.5, 1: 766.0. Samples: 271985. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:09:25,440][86732] Avg episode reward: [(0, '3.940'), (1, '3.190')]
[2023-09-22 10:09:25,450][88352] Saving new best policy, reward=3.190!
[2023-09-22 10:09:25,450][88211] Saving new best policy, reward=3.940!
[2023-09-22 10:09:30,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5922.1). Total num frames: 1146880. Throughput: 0: 758.2, 1: 758.2. Samples: 280752. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:09:30,440][86732] Avg episode reward: [(0, '4.130'), (1, '3.420')]
[2023-09-22 10:09:30,441][88211] Saving new best policy, reward=4.130!
[2023-09-22 10:09:30,441][88352] Saving new best policy, reward=3.420!
[2023-09-22 10:09:34,345][88473] Updated weights for policy 0, policy_version 2336 (0.0016)
[2023-09-22 10:09:34,345][88474] Updated weights for policy 1, policy_version 2240 (0.0017)
[2023-09-22 10:09:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5896.2). Total num frames: 1171456. Throughput: 0: 757.8, 1: 755.6. Samples: 285186. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:09:35,441][86732] Avg episode reward: [(0, '4.420'), (1, '3.220')]
[2023-09-22 10:09:35,442][88211] Saving new best policy, reward=4.420!
[2023-09-22 10:09:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5912.7). Total num frames: 1204224. Throughput: 0: 763.2, 1: 763.2. Samples: 294779. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:09:40,441][86732] Avg episode reward: [(0, '4.170'), (1, '3.100')]
[2023-09-22 10:09:45,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5928.4). Total num frames: 1236992. Throughput: 0: 764.2, 1: 765.0. Samples: 303976. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:09:45,441][86732] Avg episode reward: [(0, '4.040'), (1, '3.040')]
[2023-09-22 10:09:47,501][88473] Updated weights for policy 0, policy_version 2496 (0.0017)
[2023-09-22 10:09:47,501][88474] Updated weights for policy 1, policy_version 2400 (0.0014)
[2023-09-22 10:09:50,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5943.3). Total num frames: 1269760. Throughput: 0: 769.7, 1: 768.3. Samples: 308827. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:09:50,440][86732] Avg episode reward: [(0, '3.970'), (1, '2.680')]
[2023-09-22 10:09:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5919.3). Total num frames: 1294336. Throughput: 0: 761.8, 1: 761.1. Samples: 317545. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:09:55,441][86732] Avg episode reward: [(0, '3.810'), (1, '2.830')]
[2023-09-22 10:10:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5933.8). Total num frames: 1327104. Throughput: 0: 814.7, 1: 762.2. Samples: 326731. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:10:00,440][86732] Avg episode reward: [(0, '3.920'), (1, '2.840')]
[2023-09-22 10:10:01,079][88473] Updated weights for policy 0, policy_version 2656 (0.0013)
[2023-09-22 10:10:01,080][88474] Updated weights for policy 1, policy_version 2560 (0.0016)
[2023-09-22 10:10:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5947.6). Total num frames: 1359872. Throughput: 0: 764.7, 1: 765.2. Samples: 331391. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:10:05,441][86732] Avg episode reward: [(0, '4.210'), (1, '2.840')]
[2023-09-22 10:10:10,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 5960.8). Total num frames: 1392640. Throughput: 0: 760.8, 1: 760.7. Samples: 340454. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:10:10,441][86732] Avg episode reward: [(0, '4.190'), (1, '2.760')]
[2023-09-22 10:10:10,453][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000002672_684032.pth...
[2023-09-22 10:10:10,454][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000002768_708608.pth...
[2023-09-22 10:10:10,490][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000000096_24576.pth
[2023-09-22 10:10:14,421][88474] Updated weights for policy 1, policy_version 2720 (0.0016)
[2023-09-22 10:10:14,422][88473] Updated weights for policy 0, policy_version 2816 (0.0017)
[2023-09-22 10:10:15,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5938.5). Total num frames: 1417216. Throughput: 0: 767.3, 1: 766.7. Samples: 349782. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:10:15,440][86732] Avg episode reward: [(0, '4.500'), (1, '2.890')]
[2023-09-22 10:10:15,441][88211] Saving new best policy, reward=4.500!
[2023-09-22 10:10:20,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 5951.3). Total num frames: 1449984. Throughput: 0: 766.8, 1: 769.1. Samples: 354304. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:10:20,440][86732] Avg episode reward: [(0, '4.480'), (1, '2.660')]
[2023-09-22 10:10:25,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5963.7). Total num frames: 1482752. Throughput: 0: 763.1, 1: 762.9. Samples: 363447. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:10:25,440][86732] Avg episode reward: [(0, '4.530'), (1, '2.830')]
[2023-09-22 10:10:25,450][88211] Saving new best policy, reward=4.530!
[2023-09-22 10:10:27,759][88473] Updated weights for policy 0, policy_version 2976 (0.0015)
[2023-09-22 10:10:27,759][88474] Updated weights for policy 1, policy_version 2880 (0.0016)
[2023-09-22 10:10:30,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 5959.1). Total num frames: 1511424. Throughput: 0: 763.2, 1: 764.8. Samples: 372736. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:10:30,441][86732] Avg episode reward: [(0, '4.600'), (1, '3.070')]
[2023-09-22 10:10:30,494][88211] Saving new best policy, reward=4.600!
[2023-09-22 10:10:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5954.7). Total num frames: 1540096. Throughput: 0: 757.6, 1: 757.3. Samples: 376997. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:10:35,441][86732] Avg episode reward: [(0, '4.830'), (1, '3.240')]
[2023-09-22 10:10:35,442][88211] Saving new best policy, reward=4.830!
[2023-09-22 10:10:40,440][86732] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 5966.2). Total num frames: 1572864. Throughput: 0: 760.1, 1: 759.8. Samples: 385940. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:10:40,440][86732] Avg episode reward: [(0, '4.790'), (1, '3.500')]
[2023-09-22 10:10:40,448][88352] Saving new best policy, reward=3.500!
[2023-09-22 10:10:41,617][88474] Updated weights for policy 1, policy_version 3040 (0.0016)
[2023-09-22 10:10:41,617][88473] Updated weights for policy 0, policy_version 3136 (0.0015)
[2023-09-22 10:10:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5946.3). Total num frames: 1597440. Throughput: 0: 758.2, 1: 759.4. Samples: 395021. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:10:45,441][86732] Avg episode reward: [(0, '4.910'), (1, '3.840')]
[2023-09-22 10:10:45,442][88352] Saving new best policy, reward=3.840!
[2023-09-22 10:10:45,442][88211] Saving new best policy, reward=4.910!
[2023-09-22 10:10:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5957.6). Total num frames: 1630208. Throughput: 0: 754.3, 1: 756.1. Samples: 399360. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:10:50,441][86732] Avg episode reward: [(0, '4.960'), (1, '3.670')]
[2023-09-22 10:10:50,442][88211] Saving new best policy, reward=4.960!
[2023-09-22 10:10:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5938.6). Total num frames: 1654784. Throughput: 0: 748.5, 1: 748.7. Samples: 407828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:10:55,441][86732] Avg episode reward: [(0, '5.180'), (1, '3.650')]
[2023-09-22 10:10:55,449][88211] Saving new best policy, reward=5.180!
[2023-09-22 10:10:55,451][88474] Updated weights for policy 1, policy_version 3200 (0.0018)
[2023-09-22 10:10:55,451][88473] Updated weights for policy 0, policy_version 3296 (0.0016)
[2023-09-22 10:11:00,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5949.6). Total num frames: 1687552. Throughput: 0: 751.2, 1: 749.6. Samples: 417318. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:00,440][86732] Avg episode reward: [(0, '5.100'), (1, '3.750')]
[2023-09-22 10:11:05,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5960.2). Total num frames: 1720320. Throughput: 0: 750.8, 1: 749.9. Samples: 421834. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:11:05,440][86732] Avg episode reward: [(0, '5.160'), (1, '4.040')]
[2023-09-22 10:11:05,441][88352] Saving new best policy, reward=4.040!
[2023-09-22 10:11:09,091][88474] Updated weights for policy 1, policy_version 3360 (0.0016)
[2023-09-22 10:11:09,092][88473] Updated weights for policy 0, policy_version 3456 (0.0017)
[2023-09-22 10:11:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.2). Total num frames: 1744896. Throughput: 0: 745.3, 1: 745.4. Samples: 430526. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:10,441][86732] Avg episode reward: [(0, '5.300'), (1, '4.230')]
[2023-09-22 10:11:10,450][88211] Saving new best policy, reward=5.300!
[2023-09-22 10:11:10,450][88352] Saving new best policy, reward=4.230!
[2023-09-22 10:11:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5952.6). Total num frames: 1777664. Throughput: 0: 744.4, 1: 742.2. Samples: 439630. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:15,441][86732] Avg episode reward: [(0, '5.220'), (1, '4.120')]
[2023-09-22 10:11:20,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 1810432. Throughput: 0: 746.8, 1: 746.1. Samples: 444180. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:20,441][86732] Avg episode reward: [(0, '5.320'), (1, '3.770')]
[2023-09-22 10:11:20,441][88211] Saving new best policy, reward=5.320!
[2023-09-22 10:11:23,207][88473] Updated weights for policy 0, policy_version 3616 (0.0017)
[2023-09-22 10:11:23,207][88474] Updated weights for policy 1, policy_version 3520 (0.0014)
[2023-09-22 10:11:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6053.7). Total num frames: 1835008. Throughput: 0: 739.6, 1: 740.7. Samples: 452555. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:25,441][86732] Avg episode reward: [(0, '5.590'), (1, '3.630')]
[2023-09-22 10:11:25,453][88211] Saving new best policy, reward=5.590!
[2023-09-22 10:11:30,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5939.2, 300 sec: 6053.8). Total num frames: 1867776. Throughput: 0: 735.5, 1: 734.3. Samples: 461163. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:30,440][86732] Avg episode reward: [(0, '5.410'), (1, '3.690')]
[2023-09-22 10:11:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 6039.9). Total num frames: 1892352. Throughput: 0: 738.6, 1: 737.2. Samples: 465769. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:35,440][86732] Avg episode reward: [(0, '5.680'), (1, '3.810')]
[2023-09-22 10:11:35,471][88211] Saving new best policy, reward=5.680!
[2023-09-22 10:11:36,828][88473] Updated weights for policy 0, policy_version 3776 (0.0016)
[2023-09-22 10:11:36,828][88474] Updated weights for policy 1, policy_version 3680 (0.0018)
[2023-09-22 10:11:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6053.8). Total num frames: 1925120. Throughput: 0: 746.8, 1: 748.6. Samples: 475123. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:40,440][86732] Avg episode reward: [(0, '5.460'), (1, '4.120')]
[2023-09-22 10:11:45,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 1957888. Throughput: 0: 733.8, 1: 735.7. Samples: 483449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:45,440][86732] Avg episode reward: [(0, '5.270'), (1, '4.220')]
[2023-09-22 10:11:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 1982464. Throughput: 0: 737.0, 1: 735.7. Samples: 488105. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:11:50,441][86732] Avg episode reward: [(0, '5.440'), (1, '4.560')]
[2023-09-22 10:11:50,442][88352] Saving new best policy, reward=4.560!
[2023-09-22 10:11:50,753][88474] Updated weights for policy 1, policy_version 3840 (0.0016)
[2023-09-22 10:11:50,753][88473] Updated weights for policy 0, policy_version 3936 (0.0015)
[2023-09-22 10:11:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 2015232. Throughput: 0: 738.8, 1: 738.9. Samples: 497022. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:11:55,440][86732] Avg episode reward: [(0, '5.260'), (1, '4.980')]
[2023-09-22 10:11:55,447][88352] Saving new best policy, reward=4.980!
[2023-09-22 10:12:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 2039808. Throughput: 0: 686.2, 1: 736.9. Samples: 503670. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:00,441][86732] Avg episode reward: [(0, '5.190'), (1, '4.850')]
[2023-09-22 10:12:04,641][88473] Updated weights for policy 0, policy_version 4096 (0.0014)
[2023-09-22 10:12:04,643][88474] Updated weights for policy 1, policy_version 4000 (0.0017)
[2023-09-22 10:12:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 2072576. Throughput: 0: 732.2, 1: 732.6. Samples: 510095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:05,441][86732] Avg episode reward: [(0, '5.110'), (1, '4.950')]
[2023-09-22 10:12:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 2105344. Throughput: 0: 740.2, 1: 740.0. Samples: 519165. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:10,440][86732] Avg episode reward: [(0, '5.050'), (1, '4.930')]
[2023-09-22 10:12:10,447][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000004064_1040384.pth...
[2023-09-22 10:12:10,448][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000004160_1064960.pth...
[2023-09-22 10:12:10,476][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000001248_319488.pth
[2023-09-22 10:12:10,482][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000001344_344064.pth
[2023-09-22 10:12:15,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 2129920. Throughput: 0: 744.4, 1: 744.2. Samples: 528151. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:12:15,440][86732] Avg episode reward: [(0, '4.720'), (1, '4.760')]
[2023-09-22 10:12:18,382][88474] Updated weights for policy 1, policy_version 4160 (0.0016)
[2023-09-22 10:12:18,382][88473] Updated weights for policy 0, policy_version 4256 (0.0017)
[2023-09-22 10:12:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 2162688. Throughput: 0: 740.7, 1: 741.9. Samples: 532485. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:20,441][86732] Avg episode reward: [(0, '4.550'), (1, '4.660')]
[2023-09-22 10:12:25,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 2195456. Throughput: 0: 740.6, 1: 738.8. Samples: 541694. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:25,441][86732] Avg episode reward: [(0, '4.720'), (1, '4.780')]
[2023-09-22 10:12:30,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 2228224. Throughput: 0: 748.6, 1: 750.7. Samples: 550916. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:30,441][86732] Avg episode reward: [(0, '4.940'), (1, '5.120')]
[2023-09-22 10:12:30,442][88352] Saving new best policy, reward=5.120!
[2023-09-22 10:12:31,757][88474] Updated weights for policy 1, policy_version 4320 (0.0015)
[2023-09-22 10:12:31,757][88473] Updated weights for policy 0, policy_version 4416 (0.0015)
[2023-09-22 10:12:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 2252800. Throughput: 0: 748.0, 1: 748.1. Samples: 555428. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:35,441][86732] Avg episode reward: [(0, '5.270'), (1, '5.220')]
[2023-09-22 10:12:35,442][88352] Saving new best policy, reward=5.220!
[2023-09-22 10:12:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 2285568. Throughput: 0: 756.3, 1: 757.8. Samples: 565157. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:12:40,441][86732] Avg episode reward: [(0, '5.520'), (1, '5.410')]
[2023-09-22 10:12:40,452][88352] Saving new best policy, reward=5.410!
[2023-09-22 10:12:44,880][88473] Updated weights for policy 0, policy_version 4576 (0.0018)
[2023-09-22 10:12:44,881][88474] Updated weights for policy 1, policy_version 4480 (0.0016)
[2023-09-22 10:12:45,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 2318336. Throughput: 0: 809.5, 1: 758.9. Samples: 574247. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:45,440][86732] Avg episode reward: [(0, '5.400'), (1, '5.330')]
[2023-09-22 10:12:50,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2351104. Throughput: 0: 765.1, 1: 764.8. Samples: 578939. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:50,441][86732] Avg episode reward: [(0, '5.250'), (1, '5.590')]
[2023-09-22 10:12:50,442][88352] Saving new best policy, reward=5.590!
[2023-09-22 10:12:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 2375680. Throughput: 0: 762.5, 1: 763.1. Samples: 587816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:12:55,441][86732] Avg episode reward: [(0, '5.410'), (1, '6.010')]
[2023-09-22 10:12:55,451][88352] Saving new best policy, reward=6.010!
[2023-09-22 10:12:58,289][88473] Updated weights for policy 0, policy_version 4736 (0.0016)
[2023-09-22 10:12:58,289][88474] Updated weights for policy 1, policy_version 4640 (0.0016)
[2023-09-22 10:13:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 2408448. Throughput: 0: 766.3, 1: 767.4. Samples: 597168. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:13:00,441][86732] Avg episode reward: [(0, '5.390'), (1, '5.990')]
[2023-09-22 10:13:05,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2441216. Throughput: 0: 771.8, 1: 769.4. Samples: 601838. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:05,440][86732] Avg episode reward: [(0, '6.050'), (1, '6.150')]
[2023-09-22 10:13:05,441][88352] Saving new best policy, reward=6.150!
[2023-09-22 10:13:05,441][88211] Saving new best policy, reward=6.050!
[2023-09-22 10:13:10,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2473984. Throughput: 0: 767.9, 1: 767.3. Samples: 610778. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:10,440][86732] Avg episode reward: [(0, '5.730'), (1, '6.450')]
[2023-09-22 10:13:10,450][88352] Saving new best policy, reward=6.450!
[2023-09-22 10:13:11,807][88474] Updated weights for policy 1, policy_version 4800 (0.0017)
[2023-09-22 10:13:11,809][88473] Updated weights for policy 0, policy_version 4896 (0.0018)
[2023-09-22 10:13:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 2498560. Throughput: 0: 770.0, 1: 767.6. Samples: 620107. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:15,440][86732] Avg episode reward: [(0, '5.670'), (1, '6.300')]
[2023-09-22 10:13:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2531328. Throughput: 0: 768.0, 1: 770.1. Samples: 624644. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:13:20,441][86732] Avg episode reward: [(0, '5.830'), (1, '5.960')]
[2023-09-22 10:13:25,114][88473] Updated weights for policy 0, policy_version 5056 (0.0016)
[2023-09-22 10:13:25,114][88474] Updated weights for policy 1, policy_version 4960 (0.0017)
[2023-09-22 10:13:25,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2564096. Throughput: 0: 763.3, 1: 761.4. Samples: 633769. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:13:25,441][86732] Avg episode reward: [(0, '5.330'), (1, '6.110')]
[2023-09-22 10:13:30,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2596864. Throughput: 0: 763.8, 1: 765.8. Samples: 643076. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:13:30,441][86732] Avg episode reward: [(0, '5.430'), (1, '5.730')]
[2023-09-22 10:13:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2621440. Throughput: 0: 764.1, 1: 764.2. Samples: 647715. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:35,441][86732] Avg episode reward: [(0, '5.500'), (1, '6.010')]
[2023-09-22 10:13:38,354][88473] Updated weights for policy 0, policy_version 5216 (0.0014)
[2023-09-22 10:13:38,355][88474] Updated weights for policy 1, policy_version 5120 (0.0016)
[2023-09-22 10:13:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2654208. Throughput: 0: 771.3, 1: 770.3. Samples: 657187. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:13:40,441][86732] Avg episode reward: [(0, '5.630'), (1, '5.980')]
[2023-09-22 10:13:45,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2686976. Throughput: 0: 769.4, 1: 768.7. Samples: 666383. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:45,440][86732] Avg episode reward: [(0, '5.680'), (1, '6.040')]
[2023-09-22 10:13:50,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2719744. Throughput: 0: 771.1, 1: 771.7. Samples: 671261. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:50,441][86732] Avg episode reward: [(0, '6.060'), (1, '5.970')]
[2023-09-22 10:13:50,442][88211] Saving new best policy, reward=6.060!
[2023-09-22 10:13:51,534][88474] Updated weights for policy 1, policy_version 5280 (0.0016)
[2023-09-22 10:13:51,535][88473] Updated weights for policy 0, policy_version 5376 (0.0016)
[2023-09-22 10:13:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2744320. Throughput: 0: 767.4, 1: 769.8. Samples: 679952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:13:55,440][86732] Avg episode reward: [(0, '5.990'), (1, '5.980')]
[2023-09-22 10:14:00,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2777088. Throughput: 0: 764.1, 1: 766.0. Samples: 688961. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:14:00,441][86732] Avg episode reward: [(0, '5.970'), (1, '5.860')]
[2023-09-22 10:14:05,168][88474] Updated weights for policy 1, policy_version 5440 (0.0017)
[2023-09-22 10:14:05,169][88473] Updated weights for policy 0, policy_version 5536 (0.0017)
[2023-09-22 10:14:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2809856. Throughput: 0: 768.1, 1: 766.6. Samples: 693708. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:14:05,440][86732] Avg episode reward: [(0, '5.890'), (1, '6.050')]
[2023-09-22 10:14:10,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 6067.6). Total num frames: 2838528. Throughput: 0: 766.8, 1: 767.4. Samples: 702808. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:14:10,441][86732] Avg episode reward: [(0, '6.100'), (1, '5.400')]
[2023-09-22 10:14:10,455][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000005600_1433600.pth...
[2023-09-22 10:14:10,455][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000005504_1409024.pth...
[2023-09-22 10:14:10,498][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000002672_684032.pth
[2023-09-22 10:14:10,499][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000002768_708608.pth
[2023-09-22 10:14:10,503][88211] Saving new best policy, reward=6.100!
[2023-09-22 10:14:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2867200. Throughput: 0: 770.0, 1: 768.1. Samples: 712289. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:14:15,441][86732] Avg episode reward: [(0, '6.120'), (1, '4.780')]
[2023-09-22 10:14:15,441][88211] Saving new best policy, reward=6.120!
[2023-09-22 10:14:18,394][88473] Updated weights for policy 0, policy_version 5696 (0.0017)
[2023-09-22 10:14:18,394][88474] Updated weights for policy 1, policy_version 5600 (0.0014)
[2023-09-22 10:14:20,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 2899968. Throughput: 0: 766.6, 1: 768.7. Samples: 716804. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:14:20,441][86732] Avg episode reward: [(0, '6.000'), (1, '4.730')]
[2023-09-22 10:14:25,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2932736. Throughput: 0: 762.3, 1: 762.0. Samples: 725784. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:14:25,440][86732] Avg episode reward: [(0, '6.120'), (1, '4.830')]
[2023-09-22 10:14:30,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 2957312. Throughput: 0: 762.9, 1: 762.1. Samples: 735006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:14:30,440][86732] Avg episode reward: [(0, '6.460'), (1, '5.130')]
[2023-09-22 10:14:30,441][88211] Saving new best policy, reward=6.460!
[2023-09-22 10:14:32,122][88473] Updated weights for policy 0, policy_version 5856 (0.0015)
[2023-09-22 10:14:32,123][88474] Updated weights for policy 1, policy_version 5760 (0.0017)
[2023-09-22 10:14:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 2990080. Throughput: 0: 755.4, 1: 757.3. Samples: 739329. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:14:35,441][86732] Avg episode reward: [(0, '6.220'), (1, '5.350')]
[2023-09-22 10:14:40,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 3022848. Throughput: 0: 763.8, 1: 762.6. Samples: 748640. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:14:40,440][86732] Avg episode reward: [(0, '6.160'), (1, '5.720')]
[2023-09-22 10:14:45,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 3047424. Throughput: 0: 711.7, 1: 763.6. Samples: 755350. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:14:45,440][86732] Avg episode reward: [(0, '6.150'), (1, '5.880')]
[2023-09-22 10:14:45,577][88474] Updated weights for policy 1, policy_version 5920 (0.0017)
[2023-09-22 10:14:45,578][88473] Updated weights for policy 0, policy_version 6016 (0.0013)
[2023-09-22 10:14:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 3080192. Throughput: 0: 758.0, 1: 758.1. Samples: 761934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:14:50,441][86732] Avg episode reward: [(0, '6.140'), (1, '5.450')]
[2023-09-22 10:14:55,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 3112960. Throughput: 0: 761.7, 1: 761.1. Samples: 771335. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:14:55,440][86732] Avg episode reward: [(0, '6.500'), (1, '5.370')]
[2023-09-22 10:14:55,450][88211] Saving new best policy, reward=6.500!
[2023-09-22 10:14:59,003][88474] Updated weights for policy 1, policy_version 6080 (0.0014)
[2023-09-22 10:14:59,003][88473] Updated weights for policy 0, policy_version 6176 (0.0017)
[2023-09-22 10:15:00,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 3145728. Throughput: 0: 754.5, 1: 756.6. Samples: 780288. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:15:00,440][86732] Avg episode reward: [(0, '6.380'), (1, '5.380')]
[2023-09-22 10:15:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 3170304. Throughput: 0: 754.5, 1: 752.5. Samples: 784617. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:15:05,440][86732] Avg episode reward: [(0, '6.780'), (1, '5.320')]
[2023-09-22 10:15:05,441][88211] Saving new best policy, reward=6.780!
[2023-09-22 10:15:10,440][86732] Fps is (10 sec: 5734.2, 60 sec: 6075.7, 300 sec: 6053.7). Total num frames: 3203072. Throughput: 0: 754.0, 1: 755.0. Samples: 793693. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:10,441][86732] Avg episode reward: [(0, '6.830'), (1, '5.440')]
[2023-09-22 10:15:10,451][88211] Saving new best policy, reward=6.830!
[2023-09-22 10:15:12,893][88474] Updated weights for policy 1, policy_version 6240 (0.0019)
[2023-09-22 10:15:12,893][88473] Updated weights for policy 0, policy_version 6336 (0.0016)
[2023-09-22 10:15:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 3227648. Throughput: 0: 752.1, 1: 752.9. Samples: 802729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:15,440][86732] Avg episode reward: [(0, '6.820'), (1, '5.290')]
[2023-09-22 10:15:20,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 3260416. Throughput: 0: 753.4, 1: 751.3. Samples: 807042. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:20,440][86732] Avg episode reward: [(0, '6.820'), (1, '5.350')]
[2023-09-22 10:15:25,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6007.4, 300 sec: 6039.9). Total num frames: 3293184. Throughput: 0: 749.2, 1: 747.8. Samples: 816006. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:15:25,441][86732] Avg episode reward: [(0, '6.810'), (1, '5.110')]
[2023-09-22 10:15:26,641][88474] Updated weights for policy 1, policy_version 6400 (0.0016)
[2023-09-22 10:15:26,642][88473] Updated weights for policy 0, policy_version 6496 (0.0017)
[2023-09-22 10:15:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 3317760. Throughput: 0: 802.2, 1: 748.1. Samples: 825114. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:15:30,440][86732] Avg episode reward: [(0, '6.620'), (1, '4.970')]
[2023-09-22 10:15:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 3350528. Throughput: 0: 749.3, 1: 750.8. Samples: 829441. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:35,441][86732] Avg episode reward: [(0, '6.620'), (1, '5.140')]
[2023-09-22 10:15:39,863][88473] Updated weights for policy 0, policy_version 6656 (0.0016)
[2023-09-22 10:15:39,864][88474] Updated weights for policy 1, policy_version 6560 (0.0015)
[2023-09-22 10:15:40,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 3383296. Throughput: 0: 752.6, 1: 753.0. Samples: 839087. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:40,441][86732] Avg episode reward: [(0, '6.570'), (1, '5.250')]
[2023-09-22 10:15:45,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 3416064. Throughput: 0: 751.6, 1: 750.9. Samples: 847902. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:45,441][86732] Avg episode reward: [(0, '6.190'), (1, '5.320')]
[2023-09-22 10:15:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 3440640. Throughput: 0: 754.3, 1: 754.4. Samples: 852506. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:15:50,441][86732] Avg episode reward: [(0, '6.290'), (1, '5.400')]
[2023-09-22 10:15:53,498][88473] Updated weights for policy 0, policy_version 6816 (0.0016)
[2023-09-22 10:15:53,499][88474] Updated weights for policy 1, policy_version 6720 (0.0014)
[2023-09-22 10:15:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 3473408. Throughput: 0: 756.1, 1: 754.8. Samples: 861683. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:15:55,440][86732] Avg episode reward: [(0, '5.940'), (1, '5.570')]
[2023-09-22 10:16:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 3497984. Throughput: 0: 742.2, 1: 742.7. Samples: 869547. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:00,441][86732] Avg episode reward: [(0, '6.100'), (1, '5.460')]
[2023-09-22 10:16:05,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 3522560. Throughput: 0: 735.6, 1: 735.8. Samples: 873255. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:16:05,440][86732] Avg episode reward: [(0, '6.140'), (1, '5.630')]
[2023-09-22 10:16:09,346][88473] Updated weights for policy 0, policy_version 6976 (0.0015)
[2023-09-22 10:16:09,346][88474] Updated weights for policy 1, policy_version 6880 (0.0014)
[2023-09-22 10:16:10,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5734.4, 300 sec: 5998.2). Total num frames: 3547136. Throughput: 0: 717.2, 1: 719.6. Samples: 880663. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:16:10,441][86732] Avg episode reward: [(0, '6.510'), (1, '5.610')]
[2023-09-22 10:16:10,450][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000006976_1785856.pth...
[2023-09-22 10:16:10,450][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000006880_1761280.pth...
[2023-09-22 10:16:10,480][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000004160_1064960.pth
[2023-09-22 10:16:10,490][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000004064_1040384.pth
[2023-09-22 10:16:15,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5734.4, 300 sec: 5970.4). Total num frames: 3571712. Throughput: 0: 705.8, 1: 706.8. Samples: 888679. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:16:15,440][86732] Avg episode reward: [(0, '6.480'), (1, '5.540')]
[2023-09-22 10:16:20,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5998.2). Total num frames: 3604480. Throughput: 0: 702.4, 1: 702.5. Samples: 892660. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:20,440][86732] Avg episode reward: [(0, '6.930'), (1, '5.740')]
[2023-09-22 10:16:20,441][88211] Saving new best policy, reward=6.930!
[2023-09-22 10:16:25,109][88473] Updated weights for policy 0, policy_version 7136 (0.0014)
[2023-09-22 10:16:25,109][88474] Updated weights for policy 1, policy_version 7040 (0.0014)
[2023-09-22 10:16:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5970.4). Total num frames: 3629056. Throughput: 0: 680.0, 1: 680.2. Samples: 900297. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:25,440][86732] Avg episode reward: [(0, '6.880'), (1, '5.740')]
[2023-09-22 10:16:30,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5597.9, 300 sec: 5970.4). Total num frames: 3653632. Throughput: 0: 667.9, 1: 666.8. Samples: 907964. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:16:30,441][86732] Avg episode reward: [(0, '6.810'), (1, '5.530')]
[2023-09-22 10:16:35,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5942.7). Total num frames: 3678208. Throughput: 0: 659.4, 1: 659.7. Samples: 911862. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:35,440][86732] Avg episode reward: [(0, '6.780'), (1, '5.420')]
[2023-09-22 10:16:40,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5324.8, 300 sec: 5914.9). Total num frames: 3702784. Throughput: 0: 641.8, 1: 642.5. Samples: 919477. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:40,441][86732] Avg episode reward: [(0, '6.840'), (1, '5.420')]
[2023-09-22 10:16:41,246][88474] Updated weights for policy 1, policy_version 7200 (0.0018)
[2023-09-22 10:16:41,246][88473] Updated weights for policy 0, policy_version 7296 (0.0013)
[2023-09-22 10:16:45,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5914.9). Total num frames: 3727360. Throughput: 0: 640.7, 1: 639.8. Samples: 927173. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:45,441][86732] Avg episode reward: [(0, '6.530'), (1, '5.070')]
[2023-09-22 10:16:50,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5188.3, 300 sec: 5887.1). Total num frames: 3751936. Throughput: 0: 640.6, 1: 641.3. Samples: 930941. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:50,440][86732] Avg episode reward: [(0, '6.470'), (1, '5.190')]
[2023-09-22 10:16:55,440][86732] Fps is (10 sec: 5324.8, 60 sec: 5120.0, 300 sec: 5901.0). Total num frames: 3780608. Throughput: 0: 644.2, 1: 642.6. Samples: 938565. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:16:55,440][86732] Avg episode reward: [(0, '6.700'), (1, '5.210')]
[2023-09-22 10:16:57,039][88474] Updated weights for policy 1, policy_version 7360 (0.0014)
[2023-09-22 10:16:57,039][88473] Updated weights for policy 0, policy_version 7456 (0.0014)
[2023-09-22 10:17:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5887.1). Total num frames: 3809280. Throughput: 0: 641.0, 1: 641.0. Samples: 946367. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:17:00,440][86732] Avg episode reward: [(0, '6.650'), (1, '5.020')]
[2023-09-22 10:17:05,440][86732] Fps is (10 sec: 5324.7, 60 sec: 5188.2, 300 sec: 5859.4). Total num frames: 3833856. Throughput: 0: 640.1, 1: 640.1. Samples: 950272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:17:05,442][86732] Avg episode reward: [(0, '6.660'), (1, '5.050')]
[2023-09-22 10:17:10,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5188.3, 300 sec: 5859.4). Total num frames: 3858432. Throughput: 0: 640.8, 1: 641.8. Samples: 958012. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:17:10,440][86732] Avg episode reward: [(0, '6.630'), (1, '4.980')]
[2023-09-22 10:17:13,167][88474] Updated weights for policy 1, policy_version 7520 (0.0013)
[2023-09-22 10:17:13,168][88473] Updated weights for policy 0, policy_version 7616 (0.0014)
[2023-09-22 10:17:15,440][86732] Fps is (10 sec: 4915.4, 60 sec: 5188.3, 300 sec: 5831.6). Total num frames: 3883008. Throughput: 0: 639.0, 1: 638.1. Samples: 965430. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:17:15,440][86732] Avg episode reward: [(0, '6.780'), (1, '5.170')]
[2023-09-22 10:17:20,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5051.7, 300 sec: 5803.8). Total num frames: 3907584. Throughput: 0: 636.5, 1: 636.7. Samples: 969156. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:17:20,440][86732] Avg episode reward: [(0, '6.860'), (1, '5.270')]
[2023-09-22 10:17:25,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5051.7, 300 sec: 5776.1). Total num frames: 3932160. Throughput: 0: 637.2, 1: 638.8. Samples: 976897. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:17:25,441][86732] Avg episode reward: [(0, '6.780'), (1, '5.420')]
[2023-09-22 10:17:29,153][88473] Updated weights for policy 0, policy_version 7776 (0.0011)
[2023-09-22 10:17:29,154][88474] Updated weights for policy 1, policy_version 7680 (0.0013)
[2023-09-22 10:17:30,440][86732] Fps is (10 sec: 4915.0, 60 sec: 5051.7, 300 sec: 5776.1). Total num frames: 3956736. Throughput: 0: 641.7, 1: 640.8. Samples: 984883. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:17:30,442][86732] Avg episode reward: [(0, '6.860'), (1, '5.240')]
[2023-09-22 10:17:35,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5051.7, 300 sec: 5748.3). Total num frames: 3981312. Throughput: 0: 642.0, 1: 639.9. Samples: 988626. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:17:35,440][86732] Avg episode reward: [(0, '6.970'), (1, '5.450')]
[2023-09-22 10:17:35,562][88211] Saving new best policy, reward=6.970!
[2023-09-22 10:17:40,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5188.3, 300 sec: 5748.3). Total num frames: 4014080. Throughput: 0: 639.6, 1: 639.2. Samples: 996111. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:17:40,441][86732] Avg episode reward: [(0, '6.750'), (1, '5.410')]
[2023-09-22 10:17:45,070][88474] Updated weights for policy 1, policy_version 7840 (0.0012)
[2023-09-22 10:17:45,079][88473] Updated weights for policy 0, policy_version 7936 (0.0013)
[2023-09-22 10:17:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5720.5). Total num frames: 4038656. Throughput: 0: 639.4, 1: 639.4. Samples: 1003913. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:17:45,440][86732] Avg episode reward: [(0, '6.380'), (1, '5.180')]
[2023-09-22 10:17:50,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.2, 300 sec: 5720.5). Total num frames: 4063232. Throughput: 0: 640.7, 1: 639.2. Samples: 1007870. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:17:50,441][86732] Avg episode reward: [(0, '6.540'), (1, '5.340')]
[2023-09-22 10:17:55,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5120.0, 300 sec: 5692.7). Total num frames: 4087808. Throughput: 0: 641.8, 1: 642.6. Samples: 1015808. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:17:55,441][86732] Avg episode reward: [(0, '6.660'), (1, '5.190')]
[2023-09-22 10:18:00,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5051.7, 300 sec: 5665.0). Total num frames: 4112384. Throughput: 0: 647.0, 1: 649.0. Samples: 1023751. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:18:00,441][86732] Avg episode reward: [(0, '6.750'), (1, '4.880')]
[2023-09-22 10:18:00,739][88474] Updated weights for policy 1, policy_version 8000 (0.0017)
[2023-09-22 10:18:00,739][88473] Updated weights for policy 0, policy_version 8096 (0.0016)
[2023-09-22 10:18:05,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5051.8, 300 sec: 5637.2). Total num frames: 4136960. Throughput: 0: 646.8, 1: 647.3. Samples: 1027393. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:18:05,440][86732] Avg episode reward: [(0, '6.790'), (1, '4.670')]
[2023-09-22 10:18:10,440][86732] Fps is (10 sec: 5324.9, 60 sec: 5120.0, 300 sec: 5651.1). Total num frames: 4165632. Throughput: 0: 645.8, 1: 643.6. Samples: 1034921. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:18:10,440][86732] Avg episode reward: [(0, '6.800'), (1, '4.840')]
[2023-09-22 10:18:10,450][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000008192_2097152.pth...
[2023-09-22 10:18:10,479][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000005600_1433600.pth
[2023-09-22 10:18:10,534][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000008096_2072576.pth...
[2023-09-22 10:18:10,572][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000005504_1409024.pth
[2023-09-22 10:18:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5637.2). Total num frames: 4194304. Throughput: 0: 640.4, 1: 641.3. Samples: 1042559. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:18:15,440][86732] Avg episode reward: [(0, '6.680'), (1, '4.980')]
[2023-09-22 10:18:16,869][88473] Updated weights for policy 0, policy_version 8256 (0.0015)
[2023-09-22 10:18:16,869][88474] Updated weights for policy 1, policy_version 8160 (0.0016)
[2023-09-22 10:18:20,440][86732] Fps is (10 sec: 5324.8, 60 sec: 5188.3, 300 sec: 5609.4). Total num frames: 4218880. Throughput: 0: 641.8, 1: 644.9. Samples: 1046529. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:18:20,440][86732] Avg episode reward: [(0, '6.420'), (1, '5.140')]
[2023-09-22 10:18:25,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5581.7). Total num frames: 4243456. Throughput: 0: 643.8, 1: 645.1. Samples: 1054113. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:18:25,440][86732] Avg episode reward: [(0, '6.380'), (1, '5.150')]
[2023-09-22 10:18:30,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5188.3, 300 sec: 5581.7). Total num frames: 4268032. Throughput: 0: 641.0, 1: 640.9. Samples: 1061595. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:18:30,441][86732] Avg episode reward: [(0, '6.630'), (1, '5.390')]
[2023-09-22 10:18:33,019][88474] Updated weights for policy 1, policy_version 8320 (0.0018)
[2023-09-22 10:18:33,019][88473] Updated weights for policy 0, policy_version 8416 (0.0014)
[2023-09-22 10:18:35,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5553.9). Total num frames: 4292608. Throughput: 0: 639.6, 1: 639.6. Samples: 1065437. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:18:35,440][86732] Avg episode reward: [(0, '6.660'), (1, '5.520')]
[2023-09-22 10:18:40,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5051.7, 300 sec: 5526.1). Total num frames: 4317184. Throughput: 0: 637.3, 1: 637.2. Samples: 1073157. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:18:40,441][86732] Avg episode reward: [(0, '6.830'), (1, '5.350')]
[2023-09-22 10:18:45,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5051.7, 300 sec: 5498.4). Total num frames: 4341760. Throughput: 0: 637.9, 1: 637.5. Samples: 1081146. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:18:45,440][86732] Avg episode reward: [(0, '6.850'), (1, '5.720')]
[2023-09-22 10:18:48,936][88474] Updated weights for policy 1, policy_version 8480 (0.0012)
[2023-09-22 10:18:48,937][88473] Updated weights for policy 0, policy_version 8576 (0.0013)
[2023-09-22 10:18:50,440][86732] Fps is (10 sec: 5324.9, 60 sec: 5120.0, 300 sec: 5512.2). Total num frames: 4370432. Throughput: 0: 640.2, 1: 638.6. Samples: 1084939. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:18:50,441][86732] Avg episode reward: [(0, '7.070'), (1, '6.220')]
[2023-09-22 10:18:50,442][88211] Saving new best policy, reward=7.070!
[2023-09-22 10:18:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5498.4). Total num frames: 4399104. Throughput: 0: 639.6, 1: 640.7. Samples: 1092535. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:18:55,440][86732] Avg episode reward: [(0, '7.090'), (1, '6.190')]
[2023-09-22 10:18:55,447][88211] Saving new best policy, reward=7.090!
[2023-09-22 10:19:00,440][86732] Fps is (10 sec: 5324.8, 60 sec: 5188.3, 300 sec: 5470.6). Total num frames: 4423680. Throughput: 0: 641.9, 1: 641.5. Samples: 1100314. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:19:00,440][86732] Avg episode reward: [(0, '6.960'), (1, '6.090')]
[2023-09-22 10:19:04,762][88473] Updated weights for policy 0, policy_version 8736 (0.0013)
[2023-09-22 10:19:04,762][88474] Updated weights for policy 1, policy_version 8640 (0.0013)
[2023-09-22 10:19:05,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5188.3, 300 sec: 5456.7). Total num frames: 4448256. Throughput: 0: 640.5, 1: 638.7. Samples: 1104090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:19:05,441][86732] Avg episode reward: [(0, '6.800'), (1, '6.260')]
[2023-09-22 10:19:10,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5120.0, 300 sec: 5442.8). Total num frames: 4472832. Throughput: 0: 643.3, 1: 644.5. Samples: 1112064. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:19:10,440][86732] Avg episode reward: [(0, '6.890'), (1, '6.380')]
[2023-09-22 10:19:15,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5051.7, 300 sec: 5415.1). Total num frames: 4497408. Throughput: 0: 650.1, 1: 649.0. Samples: 1120053. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:19:15,441][86732] Avg episode reward: [(0, '6.920'), (1, '6.160')]
[2023-09-22 10:19:20,391][88474] Updated weights for policy 1, policy_version 8800 (0.0012)
[2023-09-22 10:19:20,391][88473] Updated weights for policy 0, policy_version 8896 (0.0014)
[2023-09-22 10:19:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5415.1). Total num frames: 4530176. Throughput: 0: 650.6, 1: 650.2. Samples: 1123976. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:19:20,441][86732] Avg episode reward: [(0, '6.930'), (1, '6.360')]
[2023-09-22 10:19:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5415.1). Total num frames: 4554752. Throughput: 0: 651.5, 1: 650.2. Samples: 1131734. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:19:25,440][86732] Avg episode reward: [(0, '6.910'), (1, '6.250')]
[2023-09-22 10:19:30,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5387.3). Total num frames: 4579328. Throughput: 0: 643.6, 1: 643.2. Samples: 1139054. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:19:30,440][86732] Avg episode reward: [(0, '7.040'), (1, '6.430')]
[2023-09-22 10:19:35,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5359.5). Total num frames: 4603904. Throughput: 0: 641.5, 1: 644.0. Samples: 1142789. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:19:35,440][86732] Avg episode reward: [(0, '7.080'), (1, '6.340')]
[2023-09-22 10:19:36,481][88474] Updated weights for policy 1, policy_version 8960 (0.0015)
[2023-09-22 10:19:36,482][88473] Updated weights for policy 0, policy_version 9056 (0.0015)
[2023-09-22 10:19:40,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5359.5). Total num frames: 4628480. Throughput: 0: 648.8, 1: 649.3. Samples: 1150948. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:19:40,440][86732] Avg episode reward: [(0, '7.030'), (1, '6.370')]
[2023-09-22 10:19:45,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5331.7). Total num frames: 4653056. Throughput: 0: 649.8, 1: 649.2. Samples: 1158772. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:19:45,440][86732] Avg episode reward: [(0, '7.010'), (1, '6.290')]
[2023-09-22 10:19:50,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5120.0, 300 sec: 5304.0). Total num frames: 4677632. Throughput: 0: 650.3, 1: 649.1. Samples: 1162562. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:19:50,440][86732] Avg episode reward: [(0, '6.680'), (1, '6.050')]
[2023-09-22 10:19:52,294][88474] Updated weights for policy 1, policy_version 9120 (0.0015)
[2023-09-22 10:19:52,294][88473] Updated weights for policy 0, policy_version 9216 (0.0012)
[2023-09-22 10:19:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5304.0). Total num frames: 4710400. Throughput: 0: 647.6, 1: 645.8. Samples: 1170265. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:19:55,440][86732] Avg episode reward: [(0, '6.710'), (1, '6.340')]
[2023-09-22 10:20:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5304.0). Total num frames: 4734976. Throughput: 0: 649.4, 1: 650.4. Samples: 1178547. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:00,441][86732] Avg episode reward: [(0, '6.600'), (1, '5.910')]
[2023-09-22 10:20:05,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5276.2). Total num frames: 4759552. Throughput: 0: 649.7, 1: 649.3. Samples: 1182429. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:05,440][86732] Avg episode reward: [(0, '6.840'), (1, '5.870')]
[2023-09-22 10:20:07,395][88473] Updated weights for policy 0, policy_version 9376 (0.0015)
[2023-09-22 10:20:07,395][88474] Updated weights for policy 1, policy_version 9280 (0.0016)
[2023-09-22 10:20:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5324.8, 300 sec: 5304.0). Total num frames: 4792320. Throughput: 0: 654.5, 1: 654.3. Samples: 1190630. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:20:10,441][86732] Avg episode reward: [(0, '6.850'), (1, '6.030')]
[2023-09-22 10:20:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000009312_2383872.pth...
[2023-09-22 10:20:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000009408_2408448.pth...
[2023-09-22 10:20:10,495][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000006880_1761280.pth
[2023-09-22 10:20:10,496][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000006976_1785856.pth
[2023-09-22 10:20:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5324.8, 300 sec: 5276.2). Total num frames: 4816896. Throughput: 0: 661.4, 1: 662.0. Samples: 1198604. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:20:15,440][86732] Avg episode reward: [(0, '6.910'), (1, '6.240')]
[2023-09-22 10:20:20,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5248.4). Total num frames: 4841472. Throughput: 0: 667.4, 1: 665.4. Samples: 1202765. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:20:20,441][86732] Avg episode reward: [(0, '7.110'), (1, '6.130')]
[2023-09-22 10:20:20,442][88211] Saving new best policy, reward=7.110!
[2023-09-22 10:20:22,466][88473] Updated weights for policy 0, policy_version 9536 (0.0016)
[2023-09-22 10:20:22,467][88474] Updated weights for policy 1, policy_version 9440 (0.0018)
[2023-09-22 10:20:25,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5248.4). Total num frames: 4866048. Throughput: 0: 665.8, 1: 664.6. Samples: 1210818. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:20:25,441][86732] Avg episode reward: [(0, '7.140'), (1, '6.400')]
[2023-09-22 10:20:25,511][88211] Saving new best policy, reward=7.140!
[2023-09-22 10:20:30,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5324.8, 300 sec: 5248.4). Total num frames: 4898816. Throughput: 0: 668.1, 1: 669.5. Samples: 1218966. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:20:30,441][86732] Avg episode reward: [(0, '7.220'), (1, '6.600')]
[2023-09-22 10:20:30,441][88352] Saving new best policy, reward=6.600!
[2023-09-22 10:20:30,441][88211] Saving new best policy, reward=7.220!
[2023-09-22 10:20:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5324.8, 300 sec: 5220.7). Total num frames: 4923392. Throughput: 0: 671.5, 1: 672.6. Samples: 1223047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:35,440][86732] Avg episode reward: [(0, '7.220'), (1, '6.630')]
[2023-09-22 10:20:35,441][88352] Saving new best policy, reward=6.630!
[2023-09-22 10:20:37,670][88474] Updated weights for policy 1, policy_version 9600 (0.0013)
[2023-09-22 10:20:37,670][88473] Updated weights for policy 0, policy_version 9696 (0.0013)
[2023-09-22 10:20:40,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5324.8, 300 sec: 5192.9). Total num frames: 4947968. Throughput: 0: 674.6, 1: 674.9. Samples: 1230992. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:40,440][86732] Avg episode reward: [(0, '7.350'), (1, '6.640')]
[2023-09-22 10:20:40,446][88352] Saving new best policy, reward=6.640!
[2023-09-22 10:20:40,446][88211] Saving new best policy, reward=7.350!
[2023-09-22 10:20:45,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5220.7). Total num frames: 4980736. Throughput: 0: 672.8, 1: 673.3. Samples: 1239125. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:45,441][86732] Avg episode reward: [(0, '7.320'), (1, '6.800')]
[2023-09-22 10:20:45,442][88352] Saving new best policy, reward=6.800!
[2023-09-22 10:20:50,440][86732] Fps is (10 sec: 5734.2, 60 sec: 5461.3, 300 sec: 5192.9). Total num frames: 5005312. Throughput: 0: 674.0, 1: 675.6. Samples: 1243165. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:50,441][86732] Avg episode reward: [(0, '6.880'), (1, '6.950')]
[2023-09-22 10:20:50,442][88352] Saving new best policy, reward=6.950!
[2023-09-22 10:20:52,949][88474] Updated weights for policy 1, policy_version 9760 (0.0019)
[2023-09-22 10:20:52,949][88473] Updated weights for policy 0, policy_version 9856 (0.0015)
[2023-09-22 10:20:55,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5324.8, 300 sec: 5192.9). Total num frames: 5029888. Throughput: 0: 673.7, 1: 675.3. Samples: 1251332. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:20:55,440][86732] Avg episode reward: [(0, '6.960'), (1, '6.930')]
[2023-09-22 10:21:00,440][86732] Fps is (10 sec: 5324.7, 60 sec: 5393.0, 300 sec: 5206.8). Total num frames: 5058560. Throughput: 0: 630.9, 1: 677.3. Samples: 1257473. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:21:00,441][86732] Avg episode reward: [(0, '7.040'), (1, '7.120')]
[2023-09-22 10:21:00,462][88352] Saving new best policy, reward=7.120!
[2023-09-22 10:21:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5220.7). Total num frames: 5087232. Throughput: 0: 675.1, 1: 677.2. Samples: 1263616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:21:05,440][86732] Avg episode reward: [(0, '6.870'), (1, '6.940')]
[2023-09-22 10:21:07,941][88474] Updated weights for policy 1, policy_version 9920 (0.0017)
[2023-09-22 10:21:07,941][88473] Updated weights for policy 0, policy_version 10016 (0.0014)
[2023-09-22 10:21:10,440][86732] Fps is (10 sec: 5324.8, 60 sec: 5324.8, 300 sec: 5220.7). Total num frames: 5111808. Throughput: 0: 676.8, 1: 678.6. Samples: 1271809. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:21:10,441][86732] Avg episode reward: [(0, '6.820'), (1, '6.900')]
[2023-09-22 10:21:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5220.7). Total num frames: 5144576. Throughput: 0: 677.4, 1: 678.9. Samples: 1280001. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:21:15,441][86732] Avg episode reward: [(0, '6.950'), (1, '7.290')]
[2023-09-22 10:21:15,443][88352] Saving new best policy, reward=7.290!
[2023-09-22 10:21:20,440][86732] Fps is (10 sec: 5734.6, 60 sec: 5461.4, 300 sec: 5220.7). Total num frames: 5169152. Throughput: 0: 677.4, 1: 679.2. Samples: 1284097. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:21:20,440][86732] Avg episode reward: [(0, '7.080'), (1, '7.350')]
[2023-09-22 10:21:20,441][88352] Saving new best policy, reward=7.350!
[2023-09-22 10:21:23,092][88474] Updated weights for policy 1, policy_version 10080 (0.0015)
[2023-09-22 10:21:23,093][88473] Updated weights for policy 0, policy_version 10176 (0.0015)
[2023-09-22 10:21:25,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5461.3, 300 sec: 5220.7). Total num frames: 5193728. Throughput: 0: 680.3, 1: 681.4. Samples: 1292267. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:21:25,441][86732] Avg episode reward: [(0, '7.210'), (1, '7.310')]
[2023-09-22 10:21:30,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5324.8, 300 sec: 5220.7). Total num frames: 5218304. Throughput: 0: 680.9, 1: 682.1. Samples: 1300460. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:21:30,441][86732] Avg episode reward: [(0, '7.120'), (1, '7.740')]
[2023-09-22 10:21:30,542][88352] Saving new best policy, reward=7.740!
[2023-09-22 10:21:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5248.4). Total num frames: 5251072. Throughput: 0: 681.4, 1: 682.2. Samples: 1304528. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:21:35,440][86732] Avg episode reward: [(0, '7.090'), (1, '7.600')]
[2023-09-22 10:21:38,116][88473] Updated weights for policy 0, policy_version 10336 (0.0015)
[2023-09-22 10:21:38,116][88474] Updated weights for policy 1, policy_version 10240 (0.0017)
[2023-09-22 10:21:40,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5461.3, 300 sec: 5248.4). Total num frames: 5275648. Throughput: 0: 682.6, 1: 682.6. Samples: 1312765. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:21:40,440][86732] Avg episode reward: [(0, '7.200'), (1, '8.010')]
[2023-09-22 10:21:40,448][88352] Saving new best policy, reward=8.010!
[2023-09-22 10:21:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5461.4, 300 sec: 5276.2). Total num frames: 5308416. Throughput: 0: 735.3, 1: 688.1. Samples: 1321528. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:21:45,440][86732] Avg episode reward: [(0, '7.250'), (1, '7.990')]
[2023-09-22 10:21:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5461.4, 300 sec: 5262.3). Total num frames: 5332992. Throughput: 0: 694.0, 1: 692.2. Samples: 1325993. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:21:50,440][86732] Avg episode reward: [(0, '6.860'), (1, '8.120')]
[2023-09-22 10:21:50,441][88352] Saving new best policy, reward=8.120!
[2023-09-22 10:21:52,085][88474] Updated weights for policy 1, policy_version 10400 (0.0014)
[2023-09-22 10:21:52,086][88473] Updated weights for policy 0, policy_version 10496 (0.0015)
[2023-09-22 10:21:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5276.2). Total num frames: 5365760. Throughput: 0: 703.9, 1: 701.9. Samples: 1335070. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:21:55,441][86732] Avg episode reward: [(0, '7.120'), (1, '8.300')]
[2023-09-22 10:21:55,449][88352] Saving new best policy, reward=8.300!
[2023-09-22 10:22:00,440][86732] Fps is (10 sec: 6553.5, 60 sec: 5666.1, 300 sec: 5304.0). Total num frames: 5398528. Throughput: 0: 710.3, 1: 708.2. Samples: 1343832. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:22:00,441][86732] Avg episode reward: [(0, '7.250'), (1, '8.410')]
[2023-09-22 10:22:00,442][88352] Saving new best policy, reward=8.410!
[2023-09-22 10:22:05,440][86732] Fps is (10 sec: 6144.1, 60 sec: 5666.1, 300 sec: 5317.9). Total num frames: 5427200. Throughput: 0: 715.2, 1: 713.2. Samples: 1348372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:05,440][86732] Avg episode reward: [(0, '7.330'), (1, '8.150')]
[2023-09-22 10:22:05,466][88473] Updated weights for policy 0, policy_version 10656 (0.0017)
[2023-09-22 10:22:05,466][88474] Updated weights for policy 1, policy_version 10560 (0.0013)
[2023-09-22 10:22:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5331.7). Total num frames: 5455872. Throughput: 0: 728.2, 1: 728.2. Samples: 1357804. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:10,440][86732] Avg episode reward: [(0, '6.780'), (1, '7.970')]
[2023-09-22 10:22:10,451][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000010704_2740224.pth...
[2023-09-22 10:22:10,451][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000010608_2715648.pth...
[2023-09-22 10:22:10,486][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000008096_2072576.pth
[2023-09-22 10:22:10,486][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000008192_2097152.pth
[2023-09-22 10:22:15,440][86732] Fps is (10 sec: 6144.0, 60 sec: 5734.4, 300 sec: 5359.5). Total num frames: 5488640. Throughput: 0: 739.4, 1: 737.4. Samples: 1366919. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:15,440][86732] Avg episode reward: [(0, '7.230'), (1, '7.940')]
[2023-09-22 10:22:18,883][88474] Updated weights for policy 1, policy_version 10720 (0.0015)
[2023-09-22 10:22:18,883][88473] Updated weights for policy 0, policy_version 10816 (0.0016)
[2023-09-22 10:22:20,440][86732] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5387.3). Total num frames: 5521408. Throughput: 0: 747.4, 1: 745.2. Samples: 1371697. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:22:20,441][86732] Avg episode reward: [(0, '7.230'), (1, '7.710')]
[2023-09-22 10:22:25,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5387.3). Total num frames: 5545984. Throughput: 0: 753.2, 1: 751.5. Samples: 1380474. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:22:25,442][86732] Avg episode reward: [(0, '7.140'), (1, '7.730')]
[2023-09-22 10:22:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5415.0). Total num frames: 5578752. Throughput: 0: 758.4, 1: 758.3. Samples: 1389778. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:30,441][86732] Avg episode reward: [(0, '7.420'), (1, '7.260')]
[2023-09-22 10:22:30,442][88211] Saving new best policy, reward=7.420!
[2023-09-22 10:22:32,310][88474] Updated weights for policy 1, policy_version 10880 (0.0015)
[2023-09-22 10:22:32,310][88473] Updated weights for policy 0, policy_version 10976 (0.0014)
[2023-09-22 10:22:35,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.4, 300 sec: 5415.1). Total num frames: 5611520. Throughput: 0: 762.4, 1: 762.5. Samples: 1394614. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:35,441][86732] Avg episode reward: [(0, '7.630'), (1, '7.480')]
[2023-09-22 10:22:35,442][88211] Saving new best policy, reward=7.630!
[2023-09-22 10:22:40,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6075.7, 300 sec: 5428.9). Total num frames: 5640192. Throughput: 0: 759.8, 1: 758.9. Samples: 1403413. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:40,441][86732] Avg episode reward: [(0, '7.250'), (1, '7.370')]
[2023-09-22 10:22:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5442.8). Total num frames: 5668864. Throughput: 0: 765.8, 1: 766.0. Samples: 1412759. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:45,441][86732] Avg episode reward: [(0, '7.330'), (1, '7.100')]
[2023-09-22 10:22:45,792][88473] Updated weights for policy 0, policy_version 11136 (0.0015)
[2023-09-22 10:22:45,792][88474] Updated weights for policy 1, policy_version 11040 (0.0015)
[2023-09-22 10:22:50,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 5470.6). Total num frames: 5701632. Throughput: 0: 763.9, 1: 765.9. Samples: 1417216. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:50,441][86732] Avg episode reward: [(0, '7.460'), (1, '6.910')]
[2023-09-22 10:22:55,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5498.4). Total num frames: 5734400. Throughput: 0: 758.2, 1: 756.8. Samples: 1425978. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:22:55,441][86732] Avg episode reward: [(0, '7.270'), (1, '6.850')]
[2023-09-22 10:22:59,516][88473] Updated weights for policy 0, policy_version 11296 (0.0019)
[2023-09-22 10:22:59,517][88474] Updated weights for policy 1, policy_version 11200 (0.0020)
[2023-09-22 10:23:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5498.4). Total num frames: 5758976. Throughput: 0: 757.3, 1: 757.7. Samples: 1435097. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:00,441][86732] Avg episode reward: [(0, '7.240'), (1, '7.030')]
[2023-09-22 10:23:05,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6075.7, 300 sec: 5512.2). Total num frames: 5791744. Throughput: 0: 755.0, 1: 756.8. Samples: 1439730. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:05,440][86732] Avg episode reward: [(0, '7.270'), (1, '7.240')]
[2023-09-22 10:23:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5526.1). Total num frames: 5824512. Throughput: 0: 760.1, 1: 759.1. Samples: 1448840. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:10,441][86732] Avg episode reward: [(0, '6.800'), (1, '7.180')]
[2023-09-22 10:23:12,983][88473] Updated weights for policy 0, policy_version 11456 (0.0018)
[2023-09-22 10:23:12,983][88474] Updated weights for policy 1, policy_version 11360 (0.0018)
[2023-09-22 10:23:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5526.1). Total num frames: 5849088. Throughput: 0: 756.9, 1: 755.4. Samples: 1457833. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:15,440][86732] Avg episode reward: [(0, '6.760'), (1, '6.690')]
[2023-09-22 10:23:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5553.9). Total num frames: 5881856. Throughput: 0: 750.9, 1: 752.6. Samples: 1462272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:20,441][86732] Avg episode reward: [(0, '6.520'), (1, '6.880')]
[2023-09-22 10:23:25,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 5581.7). Total num frames: 5914624. Throughput: 0: 751.2, 1: 752.1. Samples: 1471062. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:25,441][86732] Avg episode reward: [(0, '6.830'), (1, '6.680')]
[2023-09-22 10:23:26,687][88473] Updated weights for policy 0, policy_version 11616 (0.0015)
[2023-09-22 10:23:26,687][88474] Updated weights for policy 1, policy_version 11520 (0.0017)
[2023-09-22 10:23:30,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5581.7). Total num frames: 5939200. Throughput: 0: 750.2, 1: 749.9. Samples: 1480263. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:23:30,441][86732] Avg episode reward: [(0, '6.780'), (1, '6.820')]
[2023-09-22 10:23:35,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5609.4). Total num frames: 5971968. Throughput: 0: 750.9, 1: 750.8. Samples: 1484792. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:23:35,440][86732] Avg episode reward: [(0, '6.880'), (1, '6.900')]
[2023-09-22 10:23:40,194][88474] Updated weights for policy 1, policy_version 11680 (0.0017)
[2023-09-22 10:23:40,194][88473] Updated weights for policy 0, policy_version 11776 (0.0016)
[2023-09-22 10:23:40,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6075.7, 300 sec: 5637.2). Total num frames: 6004736. Throughput: 0: 754.0, 1: 754.1. Samples: 1493844. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:23:40,441][86732] Avg episode reward: [(0, '6.960'), (1, '7.110')]
[2023-09-22 10:23:45,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5623.3). Total num frames: 6029312. Throughput: 0: 705.3, 1: 757.5. Samples: 1500924. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 10:23:45,441][86732] Avg episode reward: [(0, '7.100'), (1, '7.470')]
[2023-09-22 10:23:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5637.2). Total num frames: 6062080. Throughput: 0: 752.4, 1: 751.3. Samples: 1507400. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:23:50,440][86732] Avg episode reward: [(0, '7.470'), (1, '7.460')]
[2023-09-22 10:23:53,991][88473] Updated weights for policy 0, policy_version 11936 (0.0014)
[2023-09-22 10:23:53,991][88474] Updated weights for policy 1, policy_version 11840 (0.0017)
[2023-09-22 10:23:55,443][86732] Fps is (10 sec: 6142.3, 60 sec: 5938.9, 300 sec: 5651.0). Total num frames: 6090752. Throughput: 0: 746.6, 1: 747.7. Samples: 1516086. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:23:55,444][86732] Avg episode reward: [(0, '7.550'), (1, '7.760')]
[2023-09-22 10:24:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 6119424. Throughput: 0: 741.8, 1: 743.2. Samples: 1524657. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:00,441][86732] Avg episode reward: [(0, '7.500'), (1, '7.760')]
[2023-09-22 10:24:05,440][86732] Fps is (10 sec: 6145.7, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 6152192. Throughput: 0: 745.5, 1: 743.8. Samples: 1529288. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:24:05,441][86732] Avg episode reward: [(0, '7.620'), (1, '7.940')]
[2023-09-22 10:24:07,939][88473] Updated weights for policy 0, policy_version 12096 (0.0014)
[2023-09-22 10:24:07,941][88474] Updated weights for policy 1, policy_version 12000 (0.0014)
[2023-09-22 10:24:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5692.7). Total num frames: 6176768. Throughput: 0: 743.5, 1: 745.3. Samples: 1538055. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:24:10,441][86732] Avg episode reward: [(0, '7.780'), (1, '8.080')]
[2023-09-22 10:24:10,450][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000012112_3100672.pth...
[2023-09-22 10:24:10,450][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000012016_3076096.pth...
[2023-09-22 10:24:10,482][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000009312_2383872.pth
[2023-09-22 10:24:10,488][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000009408_2408448.pth
[2023-09-22 10:24:10,492][88211] Saving new best policy, reward=7.780!
[2023-09-22 10:24:15,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 6209536. Throughput: 0: 739.1, 1: 739.4. Samples: 1546796. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:24:15,440][86732] Avg episode reward: [(0, '7.890'), (1, '8.110')]
[2023-09-22 10:24:15,441][88211] Saving new best policy, reward=7.890!
[2023-09-22 10:24:20,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5692.7). Total num frames: 6234112. Throughput: 0: 739.6, 1: 738.3. Samples: 1551295. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:24:20,441][86732] Avg episode reward: [(0, '7.640'), (1, '8.250')]
[2023-09-22 10:24:21,981][88473] Updated weights for policy 0, policy_version 12256 (0.0013)
[2023-09-22 10:24:21,982][88474] Updated weights for policy 1, policy_version 12160 (0.0016)
[2023-09-22 10:24:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5720.5). Total num frames: 6266880. Throughput: 0: 739.3, 1: 738.8. Samples: 1560360. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:25,440][86732] Avg episode reward: [(0, '7.940'), (1, '8.050')]
[2023-09-22 10:24:25,448][88211] Saving new best policy, reward=7.940!
[2023-09-22 10:24:30,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5748.3). Total num frames: 6299648. Throughput: 0: 782.0, 1: 729.7. Samples: 1568949. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:30,440][86732] Avg episode reward: [(0, '7.780'), (1, '7.970')]
[2023-09-22 10:24:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5748.3). Total num frames: 6324224. Throughput: 0: 735.4, 1: 734.8. Samples: 1573557. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:35,440][86732] Avg episode reward: [(0, '8.010'), (1, '7.810')]
[2023-09-22 10:24:35,562][88211] Saving new best policy, reward=8.010!
[2023-09-22 10:24:35,565][88473] Updated weights for policy 0, policy_version 12416 (0.0016)
[2023-09-22 10:24:35,565][88474] Updated weights for policy 1, policy_version 12320 (0.0018)
[2023-09-22 10:24:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6356992. Throughput: 0: 743.9, 1: 743.6. Samples: 1583021. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:40,440][86732] Avg episode reward: [(0, '7.580'), (1, '8.090')]
[2023-09-22 10:24:45,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5803.8). Total num frames: 6389760. Throughput: 0: 745.3, 1: 745.6. Samples: 1591749. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:45,441][86732] Avg episode reward: [(0, '7.570'), (1, '7.940')]
[2023-09-22 10:24:49,225][88473] Updated weights for policy 0, policy_version 12576 (0.0016)
[2023-09-22 10:24:49,225][88474] Updated weights for policy 1, policy_version 12480 (0.0016)
[2023-09-22 10:24:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 6414336. Throughput: 0: 744.8, 1: 745.0. Samples: 1596328. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:24:50,440][86732] Avg episode reward: [(0, '7.500'), (1, '7.890')]
[2023-09-22 10:24:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5939.5, 300 sec: 5803.8). Total num frames: 6447104. Throughput: 0: 750.8, 1: 749.8. Samples: 1605583. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:24:55,440][86732] Avg episode reward: [(0, '7.270'), (1, '7.990')]
[2023-09-22 10:25:00,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 6479872. Throughput: 0: 748.6, 1: 748.5. Samples: 1614168. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:25:00,441][86732] Avg episode reward: [(0, '7.400'), (1, '8.110')]
[2023-09-22 10:25:02,988][88474] Updated weights for policy 1, policy_version 12640 (0.0015)
[2023-09-22 10:25:02,990][88473] Updated weights for policy 0, policy_version 12736 (0.0012)
[2023-09-22 10:25:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 6504448. Throughput: 0: 747.5, 1: 746.9. Samples: 1618542. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:25:05,441][86732] Avg episode reward: [(0, '7.230'), (1, '7.900')]
[2023-09-22 10:25:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 6537216. Throughput: 0: 746.3, 1: 747.7. Samples: 1627591. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:25:10,440][86732] Avg episode reward: [(0, '7.380'), (1, '7.740')]
[2023-09-22 10:25:15,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 6569984. Throughput: 0: 748.0, 1: 749.9. Samples: 1636353. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:25:15,441][86732] Avg episode reward: [(0, '7.340'), (1, '8.120')]
[2023-09-22 10:25:16,908][88474] Updated weights for policy 1, policy_version 12800 (0.0017)
[2023-09-22 10:25:16,908][88473] Updated weights for policy 0, policy_version 12896 (0.0016)
[2023-09-22 10:25:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 6594560. Throughput: 0: 745.6, 1: 745.9. Samples: 1640674. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:25:20,440][86732] Avg episode reward: [(0, '7.450'), (1, '8.510')]
[2023-09-22 10:25:20,441][88352] Saving new best policy, reward=8.510!
[2023-09-22 10:25:25,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 6627328. Throughput: 0: 746.1, 1: 744.9. Samples: 1650116. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:25:25,440][86732] Avg episode reward: [(0, '7.570'), (1, '8.270')]
[2023-09-22 10:25:30,385][88474] Updated weights for policy 1, policy_version 12960 (0.0016)
[2023-09-22 10:25:30,386][88473] Updated weights for policy 0, policy_version 13056 (0.0017)
[2023-09-22 10:25:30,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 6660096. Throughput: 0: 745.2, 1: 746.7. Samples: 1658885. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:25:30,440][86732] Avg episode reward: [(0, '7.550'), (1, '8.130')]
[2023-09-22 10:25:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5887.1). Total num frames: 6684672. Throughput: 0: 745.8, 1: 745.3. Samples: 1663426. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:25:35,441][86732] Avg episode reward: [(0, '7.310'), (1, '8.390')]
[2023-09-22 10:25:40,440][86732] Fps is (10 sec: 5734.2, 60 sec: 6007.4, 300 sec: 5887.1). Total num frames: 6717440. Throughput: 0: 750.9, 1: 749.7. Samples: 1673110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:25:40,441][86732] Avg episode reward: [(0, '7.320'), (1, '8.530')]
[2023-09-22 10:25:40,451][88352] Saving new best policy, reward=8.530!
[2023-09-22 10:25:43,604][88474] Updated weights for policy 1, policy_version 13120 (0.0014)
[2023-09-22 10:25:43,604][88473] Updated weights for policy 0, policy_version 13216 (0.0015)
[2023-09-22 10:25:45,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 6750208. Throughput: 0: 753.7, 1: 753.5. Samples: 1681992. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:25:45,440][86732] Avg episode reward: [(0, '7.440'), (1, '8.450')]
[2023-09-22 10:25:50,440][86732] Fps is (10 sec: 6144.1, 60 sec: 6075.7, 300 sec: 5928.8). Total num frames: 6778880. Throughput: 0: 758.6, 1: 757.4. Samples: 1686758. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:25:50,441][86732] Avg episode reward: [(0, '7.270'), (1, '8.730')]
[2023-09-22 10:25:50,441][88352] Saving new best policy, reward=8.730!
[2023-09-22 10:25:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5928.8). Total num frames: 6807552. Throughput: 0: 756.9, 1: 757.7. Samples: 1695748. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:25:55,440][86732] Avg episode reward: [(0, '7.350'), (1, '8.640')]
[2023-09-22 10:25:57,081][88474] Updated weights for policy 1, policy_version 13280 (0.0017)
[2023-09-22 10:25:57,081][88473] Updated weights for policy 0, policy_version 13376 (0.0017)
[2023-09-22 10:26:00,440][86732] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 6840320. Throughput: 0: 762.6, 1: 761.8. Samples: 1704950. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:26:00,440][86732] Avg episode reward: [(0, '7.440'), (1, '8.230')]
[2023-09-22 10:26:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5970.4). Total num frames: 6873088. Throughput: 0: 764.2, 1: 763.8. Samples: 1709431. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:26:05,440][86732] Avg episode reward: [(0, '7.170'), (1, '8.170')]
[2023-09-22 10:26:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 6897664. Throughput: 0: 755.8, 1: 758.8. Samples: 1718272. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:26:10,441][86732] Avg episode reward: [(0, '6.920'), (1, '8.260')]
[2023-09-22 10:26:10,451][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000013520_3461120.pth...
[2023-09-22 10:26:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000013424_3436544.pth...
[2023-09-22 10:26:10,479][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000010704_2740224.pth
[2023-09-22 10:26:10,487][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000010608_2715648.pth
[2023-09-22 10:26:10,742][88474] Updated weights for policy 1, policy_version 13440 (0.0016)
[2023-09-22 10:26:10,743][88473] Updated weights for policy 0, policy_version 13536 (0.0016)
[2023-09-22 10:26:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6930432. Throughput: 0: 759.9, 1: 757.7. Samples: 1727175. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:26:15,441][86732] Avg episode reward: [(0, '6.770'), (1, '8.200')]
[2023-09-22 10:26:20,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5998.2). Total num frames: 6963200. Throughput: 0: 758.5, 1: 759.3. Samples: 1731729. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:26:20,440][86732] Avg episode reward: [(0, '6.840'), (1, '8.330')]
[2023-09-22 10:26:24,426][88473] Updated weights for policy 0, policy_version 13696 (0.0016)
[2023-09-22 10:26:24,426][88474] Updated weights for policy 1, policy_version 13600 (0.0017)
[2023-09-22 10:26:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 6987776. Throughput: 0: 751.0, 1: 753.3. Samples: 1740801. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:26:25,441][86732] Avg episode reward: [(0, '6.880'), (1, '7.890')]
[2023-09-22 10:26:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 7020544. Throughput: 0: 754.1, 1: 754.0. Samples: 1749859. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:26:30,441][86732] Avg episode reward: [(0, '6.770'), (1, '8.300')]
[2023-09-22 10:26:35,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 7053312. Throughput: 0: 753.1, 1: 754.0. Samples: 1754574. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:26:35,440][86732] Avg episode reward: [(0, '6.950'), (1, '8.380')]
[2023-09-22 10:26:37,874][88473] Updated weights for policy 0, policy_version 13856 (0.0015)
[2023-09-22 10:26:37,874][88474] Updated weights for policy 1, policy_version 13760 (0.0016)
[2023-09-22 10:26:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7077888. Throughput: 0: 753.4, 1: 751.4. Samples: 1763465. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:26:40,441][86732] Avg episode reward: [(0, '6.390'), (1, '8.590')]
[2023-09-22 10:26:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 7110656. Throughput: 0: 756.0, 1: 756.2. Samples: 1772996. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:26:45,440][86732] Avg episode reward: [(0, '6.440'), (1, '8.670')]
[2023-09-22 10:26:50,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6075.7, 300 sec: 6026.0). Total num frames: 7143424. Throughput: 0: 757.1, 1: 757.8. Samples: 1777599. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:26:50,441][86732] Avg episode reward: [(0, '6.640'), (1, '8.570')]
[2023-09-22 10:26:51,414][88473] Updated weights for policy 0, policy_version 14016 (0.0014)
[2023-09-22 10:26:51,415][88474] Updated weights for policy 1, policy_version 13920 (0.0017)
[2023-09-22 10:26:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 7168000. Throughput: 0: 754.2, 1: 752.3. Samples: 1786065. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:26:55,441][86732] Avg episode reward: [(0, '6.560'), (1, '9.030')]
[2023-09-22 10:26:55,540][88352] Saving new best policy, reward=9.030!
[2023-09-22 10:27:00,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6012.1). Total num frames: 7200768. Throughput: 0: 754.2, 1: 754.3. Samples: 1795057. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:27:00,440][86732] Avg episode reward: [(0, '6.340'), (1, '8.770')]
[2023-09-22 10:27:05,373][88473] Updated weights for policy 0, policy_version 14176 (0.0014)
[2023-09-22 10:27:05,373][88474] Updated weights for policy 1, policy_version 14080 (0.0015)
[2023-09-22 10:27:05,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 7233536. Throughput: 0: 753.0, 1: 752.5. Samples: 1799475. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:05,440][86732] Avg episode reward: [(0, '5.970'), (1, '8.750')]
[2023-09-22 10:27:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7258112. Throughput: 0: 750.4, 1: 747.4. Samples: 1808204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:10,440][86732] Avg episode reward: [(0, '6.030'), (1, '8.470')]
[2023-09-22 10:27:15,441][86732] Fps is (10 sec: 5733.9, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 7290880. Throughput: 0: 740.4, 1: 742.4. Samples: 1816586. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:15,441][86732] Avg episode reward: [(0, '6.210'), (1, '8.840')]
[2023-09-22 10:27:19,430][88473] Updated weights for policy 0, policy_version 14336 (0.0015)
[2023-09-22 10:27:19,431][88474] Updated weights for policy 1, policy_version 14240 (0.0017)
[2023-09-22 10:27:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 7315456. Throughput: 0: 739.1, 1: 739.2. Samples: 1821099. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:20,441][86732] Avg episode reward: [(0, '6.010'), (1, '8.870')]
[2023-09-22 10:27:25,440][86732] Fps is (10 sec: 5734.8, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7348224. Throughput: 0: 743.6, 1: 744.0. Samples: 1830410. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:25,441][86732] Avg episode reward: [(0, '6.530'), (1, '8.650')]
[2023-09-22 10:27:30,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7380992. Throughput: 0: 738.2, 1: 736.9. Samples: 1839379. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:30,441][86732] Avg episode reward: [(0, '6.800'), (1, '8.720')]
[2023-09-22 10:27:32,948][88474] Updated weights for policy 1, policy_version 14400 (0.0019)
[2023-09-22 10:27:32,948][88473] Updated weights for policy 0, policy_version 14496 (0.0017)
[2023-09-22 10:27:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5984.3). Total num frames: 7405568. Throughput: 0: 737.5, 1: 737.0. Samples: 1843952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:35,441][86732] Avg episode reward: [(0, '6.900'), (1, '9.310')]
[2023-09-22 10:27:35,441][88352] Saving new best policy, reward=9.310!
[2023-09-22 10:27:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7438336. Throughput: 0: 745.4, 1: 745.4. Samples: 1853151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:40,441][86732] Avg episode reward: [(0, '6.980'), (1, '8.860')]
[2023-09-22 10:27:45,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7471104. Throughput: 0: 740.9, 1: 741.0. Samples: 1861743. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:45,440][86732] Avg episode reward: [(0, '7.080'), (1, '8.930')]
[2023-09-22 10:27:46,652][88474] Updated weights for policy 1, policy_version 14560 (0.0018)
[2023-09-22 10:27:46,652][88473] Updated weights for policy 0, policy_version 14656 (0.0017)
[2023-09-22 10:27:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 7495680. Throughput: 0: 744.6, 1: 744.6. Samples: 1866485. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:50,441][86732] Avg episode reward: [(0, '7.210'), (1, '8.760')]
[2023-09-22 10:27:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7528448. Throughput: 0: 750.8, 1: 752.1. Samples: 1875833. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:27:55,441][86732] Avg episode reward: [(0, '7.120'), (1, '9.080')]
[2023-09-22 10:27:59,948][88473] Updated weights for policy 0, policy_version 14816 (0.0016)
[2023-09-22 10:27:59,949][88474] Updated weights for policy 1, policy_version 14720 (0.0017)
[2023-09-22 10:28:00,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 7561216. Throughput: 0: 759.6, 1: 757.6. Samples: 1884858. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:28:00,441][86732] Avg episode reward: [(0, '7.040'), (1, '9.130')]
[2023-09-22 10:28:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 7593984. Throughput: 0: 760.6, 1: 760.7. Samples: 1889558. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:28:05,441][86732] Avg episode reward: [(0, '7.370'), (1, '8.950')]
[2023-09-22 10:28:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7618560. Throughput: 0: 755.8, 1: 757.3. Samples: 1898501. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:28:10,440][86732] Avg episode reward: [(0, '7.250'), (1, '9.390')]
[2023-09-22 10:28:10,447][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000014832_3796992.pth...
[2023-09-22 10:28:10,447][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000014928_3821568.pth...
[2023-09-22 10:28:10,478][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000012112_3100672.pth
[2023-09-22 10:28:10,480][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000012016_3076096.pth
[2023-09-22 10:28:10,483][88352] Saving new best policy, reward=9.390!
[2023-09-22 10:28:13,467][88473] Updated weights for policy 0, policy_version 14976 (0.0013)
[2023-09-22 10:28:13,468][88474] Updated weights for policy 1, policy_version 14880 (0.0014)
[2023-09-22 10:28:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7651328. Throughput: 0: 757.5, 1: 758.4. Samples: 1907593. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:28:15,441][86732] Avg episode reward: [(0, '7.020'), (1, '9.170')]
[2023-09-22 10:28:20,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 5984.3). Total num frames: 7680000. Throughput: 0: 755.2, 1: 757.2. Samples: 1912013. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:28:20,441][86732] Avg episode reward: [(0, '7.460'), (1, '9.150')]
[2023-09-22 10:28:25,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7708672. Throughput: 0: 751.5, 1: 750.5. Samples: 1920740. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:28:25,440][86732] Avg episode reward: [(0, '7.640'), (1, '9.080')]
[2023-09-22 10:28:27,441][88473] Updated weights for policy 0, policy_version 15136 (0.0015)
[2023-09-22 10:28:27,441][88474] Updated weights for policy 1, policy_version 15040 (0.0015)
[2023-09-22 10:28:30,440][86732] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7741440. Throughput: 0: 754.4, 1: 754.8. Samples: 1929658. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:28:30,440][86732] Avg episode reward: [(0, '7.240'), (1, '9.480')]
[2023-09-22 10:28:30,441][88352] Saving new best policy, reward=9.480!
[2023-09-22 10:28:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7766016. Throughput: 0: 752.0, 1: 751.1. Samples: 1934125. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:28:35,441][86732] Avg episode reward: [(0, '7.410'), (1, '9.790')]
[2023-09-22 10:28:35,442][88352] Saving new best policy, reward=9.790!
[2023-09-22 10:28:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7798784. Throughput: 0: 751.4, 1: 750.6. Samples: 1943421. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:28:40,440][86732] Avg episode reward: [(0, '7.650'), (1, '9.470')]
[2023-09-22 10:28:41,001][88474] Updated weights for policy 1, policy_version 15200 (0.0016)
[2023-09-22 10:28:41,001][88473] Updated weights for policy 0, policy_version 15296 (0.0016)
[2023-09-22 10:28:45,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7831552. Throughput: 0: 750.4, 1: 750.7. Samples: 1952406. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:28:45,440][86732] Avg episode reward: [(0, '7.510'), (1, '9.940')]
[2023-09-22 10:28:45,441][88352] Saving new best policy, reward=9.940!
[2023-09-22 10:28:50,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6012.2). Total num frames: 7864320. Throughput: 0: 753.1, 1: 753.0. Samples: 1957332. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:28:50,441][86732] Avg episode reward: [(0, '7.420'), (1, '9.910')]
[2023-09-22 10:28:54,373][88473] Updated weights for policy 0, policy_version 15456 (0.0018)
[2023-09-22 10:28:54,373][88474] Updated weights for policy 1, policy_version 15360 (0.0018)
[2023-09-22 10:28:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7888896. Throughput: 0: 750.9, 1: 750.9. Samples: 1966085. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:28:55,440][86732] Avg episode reward: [(0, '7.750'), (1, '9.830')]
[2023-09-22 10:29:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 7921664. Throughput: 0: 748.6, 1: 748.4. Samples: 1974959. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:29:00,441][86732] Avg episode reward: [(0, '7.930'), (1, '9.750')]
[2023-09-22 10:29:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 7946240. Throughput: 0: 748.6, 1: 747.2. Samples: 1979328. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:29:05,441][86732] Avg episode reward: [(0, '7.870'), (1, '9.590')]
[2023-09-22 10:29:08,249][88473] Updated weights for policy 0, policy_version 15616 (0.0016)
[2023-09-22 10:29:08,249][88474] Updated weights for policy 1, policy_version 15520 (0.0017)
[2023-09-22 10:29:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 7979008. Throughput: 0: 751.7, 1: 753.7. Samples: 1988481. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:10,441][86732] Avg episode reward: [(0, '7.810'), (1, '9.620')]
[2023-09-22 10:29:15,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 8011776. Throughput: 0: 752.9, 1: 752.4. Samples: 1997397. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:15,440][86732] Avg episode reward: [(0, '7.710'), (1, '10.330')]
[2023-09-22 10:29:15,441][88352] Saving new best policy, reward=10.330!
[2023-09-22 10:29:20,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6075.7, 300 sec: 6026.0). Total num frames: 8044544. Throughput: 0: 755.2, 1: 756.2. Samples: 2002136. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:20,441][86732] Avg episode reward: [(0, '8.000'), (1, '10.210')]
[2023-09-22 10:29:21,786][88474] Updated weights for policy 1, policy_version 15680 (0.0019)
[2023-09-22 10:29:21,786][88473] Updated weights for policy 0, policy_version 15776 (0.0018)
[2023-09-22 10:29:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 8069120. Throughput: 0: 751.1, 1: 753.7. Samples: 2011136. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:25,440][86732] Avg episode reward: [(0, '7.930'), (1, '9.700')]
[2023-09-22 10:29:30,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 8101888. Throughput: 0: 748.7, 1: 749.1. Samples: 2019809. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:30,440][86732] Avg episode reward: [(0, '7.510'), (1, '9.920')]
[2023-09-22 10:29:35,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 6012.1). Total num frames: 8130560. Throughput: 0: 743.0, 1: 743.5. Samples: 2024224. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:29:35,441][86732] Avg episode reward: [(0, '7.500'), (1, '9.830')]
[2023-09-22 10:29:35,480][88473] Updated weights for policy 0, policy_version 15936 (0.0014)
[2023-09-22 10:29:35,480][88474] Updated weights for policy 1, policy_version 15840 (0.0015)
[2023-09-22 10:29:40,440][86732] Fps is (10 sec: 5734.2, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 8159232. Throughput: 0: 747.5, 1: 744.8. Samples: 2033238. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:29:40,441][86732] Avg episode reward: [(0, '7.800'), (1, '9.940')]
[2023-09-22 10:29:45,440][86732] Fps is (10 sec: 5324.8, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 8183808. Throughput: 0: 732.5, 1: 731.4. Samples: 2040837. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:45,441][86732] Avg episode reward: [(0, '7.850'), (1, '9.640')]
[2023-09-22 10:29:50,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5734.4, 300 sec: 5970.4). Total num frames: 8208384. Throughput: 0: 726.1, 1: 725.0. Samples: 2044631. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:50,441][86732] Avg episode reward: [(0, '7.830'), (1, '9.890')]
[2023-09-22 10:29:51,106][88474] Updated weights for policy 1, policy_version 16000 (0.0015)
[2023-09-22 10:29:51,106][88473] Updated weights for policy 0, policy_version 16096 (0.0015)
[2023-09-22 10:29:55,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5734.4, 300 sec: 5942.7). Total num frames: 8232960. Throughput: 0: 708.5, 1: 707.7. Samples: 2052210. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:29:55,441][86732] Avg episode reward: [(0, '7.880'), (1, '10.080')]
[2023-09-22 10:30:00,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5942.7). Total num frames: 8257536. Throughput: 0: 652.8, 1: 699.8. Samples: 2058264. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:30:00,440][86732] Avg episode reward: [(0, '8.040'), (1, '10.420')]
[2023-09-22 10:30:00,464][88352] Saving new best policy, reward=10.420!
[2023-09-22 10:30:00,487][88211] Saving new best policy, reward=8.040!
[2023-09-22 10:30:05,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5942.7). Total num frames: 8290304. Throughput: 0: 690.8, 1: 691.7. Samples: 2064350. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:30:05,441][86732] Avg episode reward: [(0, '7.870'), (1, '10.140')]
[2023-09-22 10:30:06,703][88473] Updated weights for policy 0, policy_version 16256 (0.0014)
[2023-09-22 10:30:06,704][88474] Updated weights for policy 1, policy_version 16160 (0.0014)
[2023-09-22 10:30:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5914.9). Total num frames: 8314880. Throughput: 0: 677.8, 1: 676.6. Samples: 2072081. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:30:10,441][86732] Avg episode reward: [(0, '7.810'), (1, '10.370')]
[2023-09-22 10:30:10,449][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000016288_4169728.pth...
[2023-09-22 10:30:10,449][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000016192_4145152.pth...
[2023-09-22 10:30:10,478][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000013424_3436544.pth
[2023-09-22 10:30:10,492][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000013520_3461120.pth
[2023-09-22 10:30:15,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5914.9). Total num frames: 8339456. Throughput: 0: 666.6, 1: 668.4. Samples: 2079880. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:30:15,441][86732] Avg episode reward: [(0, '7.510'), (1, '10.630')]
[2023-09-22 10:30:15,443][88352] Saving new best policy, reward=10.630!
[2023-09-22 10:30:20,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5324.8, 300 sec: 5887.1). Total num frames: 8364032. Throughput: 0: 662.0, 1: 661.5. Samples: 2083783. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:30:20,440][86732] Avg episode reward: [(0, '7.650'), (1, '10.880')]
[2023-09-22 10:30:20,441][88352] Saving new best policy, reward=10.880!
[2023-09-22 10:30:22,438][88473] Updated weights for policy 0, policy_version 16416 (0.0015)
[2023-09-22 10:30:22,438][88474] Updated weights for policy 1, policy_version 16320 (0.0015)
[2023-09-22 10:30:25,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5324.8, 300 sec: 5859.4). Total num frames: 8388608. Throughput: 0: 647.3, 1: 648.5. Samples: 2091551. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:30:25,440][86732] Avg episode reward: [(0, '7.590'), (1, '10.770')]
[2023-09-22 10:30:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5324.8, 300 sec: 5887.1). Total num frames: 8421376. Throughput: 0: 648.6, 1: 649.8. Samples: 2099267. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:30:30,441][86732] Avg episode reward: [(0, '7.650'), (1, '10.450')]
[2023-09-22 10:30:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5256.5, 300 sec: 5859.4). Total num frames: 8445952. Throughput: 0: 650.7, 1: 653.0. Samples: 2103294. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:30:35,440][86732] Avg episode reward: [(0, '7.710'), (1, '10.400')]
[2023-09-22 10:30:38,188][88474] Updated weights for policy 1, policy_version 16480 (0.0012)
[2023-09-22 10:30:38,188][88473] Updated weights for policy 0, policy_version 16576 (0.0015)
[2023-09-22 10:30:40,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5188.3, 300 sec: 5831.6). Total num frames: 8470528. Throughput: 0: 655.4, 1: 655.5. Samples: 2111203. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:30:40,440][86732] Avg episode reward: [(0, '7.980'), (1, '10.580')]
[2023-09-22 10:30:45,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5817.7). Total num frames: 8495104. Throughput: 0: 698.9, 1: 650.4. Samples: 2118980. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:30:45,440][86732] Avg episode reward: [(0, '7.990'), (1, '10.440')]
[2023-09-22 10:30:50,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5803.8). Total num frames: 8519680. Throughput: 0: 651.3, 1: 649.8. Samples: 2122898. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:30:50,440][86732] Avg episode reward: [(0, '7.880'), (1, '10.620')]
[2023-09-22 10:30:53,964][88473] Updated weights for policy 0, policy_version 16736 (0.0013)
[2023-09-22 10:30:53,964][88474] Updated weights for policy 1, policy_version 16640 (0.0014)
[2023-09-22 10:30:55,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5776.1). Total num frames: 8544256. Throughput: 0: 649.3, 1: 648.8. Samples: 2130498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:30:55,441][86732] Avg episode reward: [(0, '7.920'), (1, '11.140')]
[2023-09-22 10:30:55,477][88352] Saving new best policy, reward=11.140!
[2023-09-22 10:31:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5324.8, 300 sec: 5776.1). Total num frames: 8577024. Throughput: 0: 647.2, 1: 646.9. Samples: 2138117. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:31:00,440][86732] Avg episode reward: [(0, '7.790'), (1, '10.890')]
[2023-09-22 10:31:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5776.1). Total num frames: 8601600. Throughput: 0: 647.0, 1: 646.9. Samples: 2142010. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:31:05,440][86732] Avg episode reward: [(0, '7.920'), (1, '10.800')]
[2023-09-22 10:31:10,391][88473] Updated weights for policy 0, policy_version 16896 (0.0013)
[2023-09-22 10:31:10,391][88474] Updated weights for policy 1, policy_version 16800 (0.0011)
[2023-09-22 10:31:10,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5188.3, 300 sec: 5748.3). Total num frames: 8626176. Throughput: 0: 640.6, 1: 640.9. Samples: 2149218. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:31:10,441][86732] Avg episode reward: [(0, '7.720'), (1, '10.640')]
[2023-09-22 10:31:15,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5720.5). Total num frames: 8650752. Throughput: 0: 641.3, 1: 641.0. Samples: 2156970. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:31:15,440][86732] Avg episode reward: [(0, '7.650'), (1, '10.410')]
[2023-09-22 10:31:20,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5188.3, 300 sec: 5720.5). Total num frames: 8675328. Throughput: 0: 640.6, 1: 638.8. Samples: 2160866. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:31:20,440][86732] Avg episode reward: [(0, '7.560'), (1, '10.270')]
[2023-09-22 10:31:25,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5692.7). Total num frames: 8699904. Throughput: 0: 639.6, 1: 641.1. Samples: 2168836. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:31:25,440][86732] Avg episode reward: [(0, '7.340'), (1, '10.150')]
[2023-09-22 10:31:25,937][88473] Updated weights for policy 0, policy_version 17056 (0.0013)
[2023-09-22 10:31:25,938][88474] Updated weights for policy 1, policy_version 16960 (0.0016)
[2023-09-22 10:31:30,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5051.8, 300 sec: 5665.0). Total num frames: 8724480. Throughput: 0: 642.8, 1: 644.7. Samples: 2176920. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:31:30,440][86732] Avg episode reward: [(0, '7.400'), (1, '9.590')]
[2023-09-22 10:31:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5692.7). Total num frames: 8757248. Throughput: 0: 642.2, 1: 642.4. Samples: 2180709. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:31:35,441][86732] Avg episode reward: [(0, '7.450'), (1, '9.520')]
[2023-09-22 10:31:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5188.2, 300 sec: 5665.0). Total num frames: 8781824. Throughput: 0: 652.6, 1: 654.3. Samples: 2189312. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:31:40,441][86732] Avg episode reward: [(0, '7.250'), (1, '9.670')]
[2023-09-22 10:31:40,881][88473] Updated weights for policy 0, policy_version 17216 (0.0022)
[2023-09-22 10:31:40,881][88474] Updated weights for policy 1, policy_version 17120 (0.0017)
[2023-09-22 10:31:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5324.8, 300 sec: 5665.0). Total num frames: 8814592. Throughput: 0: 669.2, 1: 667.2. Samples: 2198254. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:31:45,440][86732] Avg episode reward: [(0, '7.100'), (1, '9.960')]
[2023-09-22 10:31:50,440][86732] Fps is (10 sec: 6553.6, 60 sec: 5461.3, 300 sec: 5692.7). Total num frames: 8847360. Throughput: 0: 676.0, 1: 677.4. Samples: 2202914. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:31:50,441][86732] Avg episode reward: [(0, '7.410'), (1, '10.310')]
[2023-09-22 10:31:54,414][88473] Updated weights for policy 0, policy_version 17376 (0.0016)
[2023-09-22 10:31:54,415][88474] Updated weights for policy 1, policy_version 17280 (0.0017)
[2023-09-22 10:31:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5665.0). Total num frames: 8871936. Throughput: 0: 695.2, 1: 696.4. Samples: 2211841. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:31:55,441][86732] Avg episode reward: [(0, '7.560'), (1, '10.570')]
[2023-09-22 10:32:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5665.0). Total num frames: 8904704. Throughput: 0: 713.0, 1: 713.1. Samples: 2221143. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:32:00,441][86732] Avg episode reward: [(0, '7.240'), (1, '10.590')]
[2023-09-22 10:32:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 5597.9, 300 sec: 5692.7). Total num frames: 8937472. Throughput: 0: 722.7, 1: 722.7. Samples: 2225910. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:32:05,441][86732] Avg episode reward: [(0, '7.620'), (1, '10.760')]
[2023-09-22 10:32:07,624][88473] Updated weights for policy 0, policy_version 17536 (0.0015)
[2023-09-22 10:32:07,624][88474] Updated weights for policy 1, policy_version 17440 (0.0019)
[2023-09-22 10:32:10,440][86732] Fps is (10 sec: 6553.4, 60 sec: 5734.4, 300 sec: 5692.8). Total num frames: 8970240. Throughput: 0: 736.2, 1: 735.0. Samples: 2235041. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:32:10,442][86732] Avg episode reward: [(0, '7.430'), (1, '10.910')]
[2023-09-22 10:32:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000017472_4472832.pth...
[2023-09-22 10:32:10,453][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000017568_4497408.pth...
[2023-09-22 10:32:10,484][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000014832_3796992.pth
[2023-09-22 10:32:10,486][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000014928_3821568.pth
[2023-09-22 10:32:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 8994816. Throughput: 0: 748.6, 1: 748.1. Samples: 2244271. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:32:15,441][86732] Avg episode reward: [(0, '7.230'), (1, '11.060')]
[2023-09-22 10:32:20,440][86732] Fps is (10 sec: 5734.7, 60 sec: 5870.9, 300 sec: 5692.8). Total num frames: 9027584. Throughput: 0: 754.5, 1: 756.6. Samples: 2248705. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:32:20,440][86732] Avg episode reward: [(0, '7.400'), (1, '11.390')]
[2023-09-22 10:32:20,441][88352] Saving new best policy, reward=11.390!
[2023-09-22 10:32:21,248][88474] Updated weights for policy 1, policy_version 17600 (0.0014)
[2023-09-22 10:32:21,248][88473] Updated weights for policy 0, policy_version 17696 (0.0016)
[2023-09-22 10:32:25,440][86732] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5678.9). Total num frames: 9056256. Throughput: 0: 756.0, 1: 754.2. Samples: 2257267. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:32:25,441][86732] Avg episode reward: [(0, '7.400'), (1, '11.150')]
[2023-09-22 10:32:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9084928. Throughput: 0: 759.6, 1: 759.5. Samples: 2266614. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:32:30,441][86732] Avg episode reward: [(0, '7.250'), (1, '11.410')]
[2023-09-22 10:32:30,442][88352] Saving new best policy, reward=11.410!
[2023-09-22 10:32:34,924][88474] Updated weights for policy 1, policy_version 17760 (0.0014)
[2023-09-22 10:32:34,924][88473] Updated weights for policy 0, policy_version 17856 (0.0017)
[2023-09-22 10:32:35,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9117696. Throughput: 0: 758.6, 1: 758.5. Samples: 2271185. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:32:35,441][86732] Avg episode reward: [(0, '7.140'), (1, '10.940')]
[2023-09-22 10:32:40,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5692.7). Total num frames: 9150464. Throughput: 0: 758.9, 1: 756.7. Samples: 2280044. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:32:40,440][86732] Avg episode reward: [(0, '7.370'), (1, '10.900')]
[2023-09-22 10:32:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9175040. Throughput: 0: 755.6, 1: 755.8. Samples: 2289157. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:32:45,440][86732] Avg episode reward: [(0, '6.870'), (1, '11.050')]
[2023-09-22 10:32:48,662][88474] Updated weights for policy 1, policy_version 17920 (0.0016)
[2023-09-22 10:32:48,663][88473] Updated weights for policy 0, policy_version 18016 (0.0017)
[2023-09-22 10:32:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5692.8). Total num frames: 9207808. Throughput: 0: 752.0, 1: 752.1. Samples: 2293593. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:32:50,440][86732] Avg episode reward: [(0, '7.030'), (1, '11.040')]
[2023-09-22 10:32:55,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5692.7). Total num frames: 9240576. Throughput: 0: 749.8, 1: 749.0. Samples: 2302490. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:32:55,441][86732] Avg episode reward: [(0, '7.380'), (1, '11.010')]
[2023-09-22 10:33:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 9265152. Throughput: 0: 751.1, 1: 751.7. Samples: 2311899. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:33:00,440][86732] Avg episode reward: [(0, '7.400'), (1, '11.080')]
[2023-09-22 10:33:02,119][88473] Updated weights for policy 0, policy_version 18176 (0.0015)
[2023-09-22 10:33:02,120][88474] Updated weights for policy 1, policy_version 18080 (0.0017)
[2023-09-22 10:33:05,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9297920. Throughput: 0: 750.9, 1: 750.9. Samples: 2316288. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:33:05,440][86732] Avg episode reward: [(0, '7.450'), (1, '11.000')]
[2023-09-22 10:33:10,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9330688. Throughput: 0: 758.1, 1: 757.9. Samples: 2325484. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:33:10,441][86732] Avg episode reward: [(0, '7.660'), (1, '11.100')]
[2023-09-22 10:33:15,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6075.8, 300 sec: 5692.7). Total num frames: 9359360. Throughput: 0: 705.9, 1: 757.8. Samples: 2332482. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:33:15,440][86732] Avg episode reward: [(0, '7.640'), (1, '10.980')]
[2023-09-22 10:33:15,469][88474] Updated weights for policy 1, policy_version 18240 (0.0016)
[2023-09-22 10:33:15,470][88473] Updated weights for policy 0, policy_version 18336 (0.0017)
[2023-09-22 10:33:20,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9388032. Throughput: 0: 752.3, 1: 752.1. Samples: 2338882. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:33:20,440][86732] Avg episode reward: [(0, '7.390'), (1, '11.420')]
[2023-09-22 10:33:20,441][88352] Saving new best policy, reward=11.420!
[2023-09-22 10:33:25,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6075.7, 300 sec: 5692.7). Total num frames: 9420800. Throughput: 0: 756.2, 1: 756.9. Samples: 2348136. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:33:25,440][86732] Avg episode reward: [(0, '7.430'), (1, '11.430')]
[2023-09-22 10:33:25,450][88352] Saving new best policy, reward=11.430!
[2023-09-22 10:33:29,071][88474] Updated weights for policy 1, policy_version 18400 (0.0017)
[2023-09-22 10:33:29,071][88473] Updated weights for policy 0, policy_version 18496 (0.0017)
[2023-09-22 10:33:30,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5720.5). Total num frames: 9453568. Throughput: 0: 756.0, 1: 757.2. Samples: 2357249. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:33:30,440][86732] Avg episode reward: [(0, '7.730'), (1, '11.650')]
[2023-09-22 10:33:30,441][88352] Saving new best policy, reward=11.650!
[2023-09-22 10:33:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9478144. Throughput: 0: 757.1, 1: 756.6. Samples: 2361709. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:33:35,441][86732] Avg episode reward: [(0, '7.960'), (1, '11.220')]
[2023-09-22 10:33:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9510912. Throughput: 0: 763.5, 1: 763.8. Samples: 2371219. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:33:40,440][86732] Avg episode reward: [(0, '7.890'), (1, '10.940')]
[2023-09-22 10:33:42,463][88474] Updated weights for policy 1, policy_version 18560 (0.0016)
[2023-09-22 10:33:42,464][88473] Updated weights for policy 0, policy_version 18656 (0.0017)
[2023-09-22 10:33:45,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5692.7). Total num frames: 9543680. Throughput: 0: 760.1, 1: 759.4. Samples: 2380277. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:33:45,441][86732] Avg episode reward: [(0, '7.870'), (1, '10.980')]
[2023-09-22 10:33:50,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 5706.6). Total num frames: 9572352. Throughput: 0: 763.7, 1: 762.4. Samples: 2384961. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:33:50,441][86732] Avg episode reward: [(0, '7.870'), (1, '10.580')]
[2023-09-22 10:33:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9601024. Throughput: 0: 761.5, 1: 763.3. Samples: 2394102. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:33:55,440][86732] Avg episode reward: [(0, '8.330'), (1, '10.070')]
[2023-09-22 10:33:55,449][88211] Saving new best policy, reward=8.330!
[2023-09-22 10:33:55,933][88474] Updated weights for policy 1, policy_version 18720 (0.0018)
[2023-09-22 10:33:55,933][88473] Updated weights for policy 0, policy_version 18816 (0.0018)
[2023-09-22 10:34:00,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 5720.5). Total num frames: 9633792. Throughput: 0: 808.6, 1: 757.3. Samples: 2402945. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:34:00,441][86732] Avg episode reward: [(0, '8.200'), (1, '9.710')]
[2023-09-22 10:34:05,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6075.7, 300 sec: 5706.6). Total num frames: 9662464. Throughput: 0: 762.4, 1: 762.6. Samples: 2407508. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:05,440][86732] Avg episode reward: [(0, '7.910'), (1, '9.760')]
[2023-09-22 10:34:09,549][88473] Updated weights for policy 0, policy_version 18976 (0.0016)
[2023-09-22 10:34:09,550][88474] Updated weights for policy 1, policy_version 18880 (0.0016)
[2023-09-22 10:34:10,440][86732] Fps is (10 sec: 5734.2, 60 sec: 6007.4, 300 sec: 5692.7). Total num frames: 9691136. Throughput: 0: 760.4, 1: 761.2. Samples: 2416609. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:10,441][86732] Avg episode reward: [(0, '8.190'), (1, '8.900')]
[2023-09-22 10:34:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000018976_4857856.pth...
[2023-09-22 10:34:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000018880_4833280.pth...
[2023-09-22 10:34:10,487][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000016288_4169728.pth
[2023-09-22 10:34:10,489][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000016192_4145152.pth
[2023-09-22 10:34:15,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 5692.7). Total num frames: 9723904. Throughput: 0: 759.3, 1: 756.1. Samples: 2425440. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:15,441][86732] Avg episode reward: [(0, '8.040'), (1, '9.000')]
[2023-09-22 10:34:20,440][86732] Fps is (10 sec: 6144.2, 60 sec: 6075.7, 300 sec: 5706.6). Total num frames: 9752576. Throughput: 0: 759.0, 1: 759.0. Samples: 2430019. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:20,441][86732] Avg episode reward: [(0, '8.160'), (1, '8.860')]
[2023-09-22 10:34:23,140][88473] Updated weights for policy 0, policy_version 19136 (0.0016)
[2023-09-22 10:34:23,140][88474] Updated weights for policy 1, policy_version 19040 (0.0017)
[2023-09-22 10:34:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 9781248. Throughput: 0: 754.1, 1: 755.9. Samples: 2439169. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:25,440][86732] Avg episode reward: [(0, '7.780'), (1, '8.970')]
[2023-09-22 10:34:30,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.4, 300 sec: 5706.6). Total num frames: 9814016. Throughput: 0: 755.1, 1: 757.2. Samples: 2448330. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:30,441][86732] Avg episode reward: [(0, '7.800'), (1, '8.440')]
[2023-09-22 10:34:35,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5720.5). Total num frames: 9846784. Throughput: 0: 752.7, 1: 752.3. Samples: 2452683. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:35,440][86732] Avg episode reward: [(0, '7.860'), (1, '8.130')]
[2023-09-22 10:34:36,692][88473] Updated weights for policy 0, policy_version 19296 (0.0016)
[2023-09-22 10:34:36,693][88474] Updated weights for policy 1, policy_version 19200 (0.0017)
[2023-09-22 10:34:40,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5720.5). Total num frames: 9871360. Throughput: 0: 751.0, 1: 751.2. Samples: 2461697. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:40,440][86732] Avg episode reward: [(0, '8.110'), (1, '8.270')]
[2023-09-22 10:34:45,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5748.3). Total num frames: 9904128. Throughput: 0: 753.6, 1: 753.6. Samples: 2470771. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:45,441][86732] Avg episode reward: [(0, '8.250'), (1, '8.370')]
[2023-09-22 10:34:50,288][88473] Updated weights for policy 0, policy_version 19456 (0.0017)
[2023-09-22 10:34:50,288][88474] Updated weights for policy 1, policy_version 19360 (0.0017)
[2023-09-22 10:34:50,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6075.7, 300 sec: 5776.1). Total num frames: 9936896. Throughput: 0: 755.3, 1: 754.0. Samples: 2475426. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:50,441][86732] Avg episode reward: [(0, '8.420'), (1, '8.370')]
[2023-09-22 10:34:50,442][88211] Saving new best policy, reward=8.420!
[2023-09-22 10:34:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5776.1). Total num frames: 9961472. Throughput: 0: 750.9, 1: 751.6. Samples: 2484224. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:34:55,440][86732] Avg episode reward: [(0, '8.350'), (1, '8.700')]
[2023-09-22 10:35:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5776.1). Total num frames: 9994240. Throughput: 0: 750.4, 1: 752.2. Samples: 2493058. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:35:00,441][86732] Avg episode reward: [(0, '7.800'), (1, '9.080')]
[2023-09-22 10:35:04,088][88473] Updated weights for policy 0, policy_version 19616 (0.0016)
[2023-09-22 10:35:04,089][88474] Updated weights for policy 1, policy_version 19520 (0.0016)
[2023-09-22 10:35:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6075.7, 300 sec: 5803.8). Total num frames: 10027008. Throughput: 0: 750.5, 1: 751.4. Samples: 2497604. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:35:05,441][86732] Avg episode reward: [(0, '7.800'), (1, '9.130')]
[2023-09-22 10:35:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5803.8). Total num frames: 10051584. Throughput: 0: 750.9, 1: 750.8. Samples: 2506745. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:35:10,441][86732] Avg episode reward: [(0, '8.070'), (1, '9.080')]
[2023-09-22 10:35:15,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 10084352. Throughput: 0: 749.4, 1: 747.7. Samples: 2515698. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:35:15,440][86732] Avg episode reward: [(0, '7.880'), (1, '9.220')]
[2023-09-22 10:35:17,638][88473] Updated weights for policy 0, policy_version 19776 (0.0015)
[2023-09-22 10:35:17,639][88474] Updated weights for policy 1, policy_version 19680 (0.0017)
[2023-09-22 10:35:20,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6075.7, 300 sec: 5859.4). Total num frames: 10117120. Throughput: 0: 752.2, 1: 751.6. Samples: 2520357. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:35:20,441][86732] Avg episode reward: [(0, '7.650'), (1, '9.440')]
[2023-09-22 10:35:25,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 10141696. Throughput: 0: 753.0, 1: 751.1. Samples: 2529380. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:35:25,441][86732] Avg episode reward: [(0, '7.830'), (1, '9.500')]
[2023-09-22 10:35:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 10174464. Throughput: 0: 757.8, 1: 757.2. Samples: 2538942. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:35:30,441][86732] Avg episode reward: [(0, '7.960'), (1, '9.860')]
[2023-09-22 10:35:30,868][88474] Updated weights for policy 1, policy_version 19840 (0.0018)
[2023-09-22 10:35:30,869][88473] Updated weights for policy 0, policy_version 19936 (0.0017)
[2023-09-22 10:35:35,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 10207232. Throughput: 0: 756.5, 1: 758.5. Samples: 2543598. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:35:35,441][86732] Avg episode reward: [(0, '7.950'), (1, '9.540')]
[2023-09-22 10:35:40,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5914.9). Total num frames: 10240000. Throughput: 0: 761.4, 1: 758.7. Samples: 2552629. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 10:35:40,440][86732] Avg episode reward: [(0, '8.140'), (1, '9.850')]
[2023-09-22 10:35:44,249][88473] Updated weights for policy 0, policy_version 20096 (0.0018)
[2023-09-22 10:35:44,249][88474] Updated weights for policy 1, policy_version 20000 (0.0015)
[2023-09-22 10:35:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 10264576. Throughput: 0: 765.8, 1: 766.4. Samples: 2562008. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:35:45,441][86732] Avg episode reward: [(0, '8.390'), (1, '10.250')]
[2023-09-22 10:35:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 10297344. Throughput: 0: 763.3, 1: 762.7. Samples: 2566275. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:35:50,441][86732] Avg episode reward: [(0, '8.400'), (1, '10.480')]
[2023-09-22 10:35:55,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5942.7). Total num frames: 10330112. Throughput: 0: 765.8, 1: 764.5. Samples: 2575607. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:35:55,440][86732] Avg episode reward: [(0, '8.460'), (1, '10.790')]
[2023-09-22 10:35:55,450][88211] Saving new best policy, reward=8.460!
[2023-09-22 10:35:57,644][88473] Updated weights for policy 0, policy_version 20256 (0.0016)
[2023-09-22 10:35:57,644][88474] Updated weights for policy 1, policy_version 20160 (0.0017)
[2023-09-22 10:36:00,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 5970.4). Total num frames: 10362880. Throughput: 0: 767.2, 1: 767.0. Samples: 2584737. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:00,440][86732] Avg episode reward: [(0, '8.560'), (1, '10.860')]
[2023-09-22 10:36:00,441][88211] Saving new best policy, reward=8.560!
[2023-09-22 10:36:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 10387456. Throughput: 0: 764.9, 1: 765.6. Samples: 2589233. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:05,440][86732] Avg episode reward: [(0, '8.480'), (1, '10.910')]
[2023-09-22 10:36:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5998.2). Total num frames: 10420224. Throughput: 0: 766.3, 1: 765.4. Samples: 2598307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:10,440][86732] Avg episode reward: [(0, '8.830'), (1, '10.920')]
[2023-09-22 10:36:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000020304_5197824.pth...
[2023-09-22 10:36:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000020400_5222400.pth...
[2023-09-22 10:36:10,482][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000017472_4472832.pth
[2023-09-22 10:36:10,487][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000017568_4497408.pth
[2023-09-22 10:36:10,490][88211] Saving new best policy, reward=8.830!
[2023-09-22 10:36:11,204][88474] Updated weights for policy 1, policy_version 20320 (0.0016)
[2023-09-22 10:36:11,205][88473] Updated weights for policy 0, policy_version 20416 (0.0016)
[2023-09-22 10:36:15,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 10452992. Throughput: 0: 760.1, 1: 760.0. Samples: 2607345. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:15,441][86732] Avg episode reward: [(0, '8.410'), (1, '10.860')]
[2023-09-22 10:36:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 10477568. Throughput: 0: 760.7, 1: 759.0. Samples: 2611986. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:20,440][86732] Avg episode reward: [(0, '8.520'), (1, '10.880')]
[2023-09-22 10:36:24,716][88473] Updated weights for policy 0, policy_version 20576 (0.0019)
[2023-09-22 10:36:24,717][88474] Updated weights for policy 1, policy_version 20480 (0.0016)
[2023-09-22 10:36:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 10510336. Throughput: 0: 760.7, 1: 761.2. Samples: 2621116. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:25,441][86732] Avg episode reward: [(0, '8.730'), (1, '10.660')]
[2023-09-22 10:36:30,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 10543104. Throughput: 0: 757.0, 1: 756.0. Samples: 2630095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:30,441][86732] Avg episode reward: [(0, '8.640'), (1, '10.800')]
[2023-09-22 10:36:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 10567680. Throughput: 0: 758.2, 1: 759.6. Samples: 2634574. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:35,441][86732] Avg episode reward: [(0, '9.150'), (1, '11.470')]
[2023-09-22 10:36:35,540][88211] Saving new best policy, reward=9.150!
[2023-09-22 10:36:38,272][88474] Updated weights for policy 1, policy_version 20640 (0.0018)
[2023-09-22 10:36:38,272][88473] Updated weights for policy 0, policy_version 20736 (0.0017)
[2023-09-22 10:36:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 10600448. Throughput: 0: 757.3, 1: 757.5. Samples: 2643776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:40,441][86732] Avg episode reward: [(0, '9.080'), (1, '11.270')]
[2023-09-22 10:36:45,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 10633216. Throughput: 0: 754.4, 1: 755.3. Samples: 2652673. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:45,441][86732] Avg episode reward: [(0, '8.790'), (1, '11.150')]
[2023-09-22 10:36:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 10657792. Throughput: 0: 757.8, 1: 757.1. Samples: 2657407. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:50,441][86732] Avg episode reward: [(0, '8.570'), (1, '11.370')]
[2023-09-22 10:36:51,793][88473] Updated weights for policy 0, policy_version 20896 (0.0017)
[2023-09-22 10:36:51,793][88474] Updated weights for policy 1, policy_version 20800 (0.0015)
[2023-09-22 10:36:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 10690560. Throughput: 0: 756.2, 1: 759.1. Samples: 2666497. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:36:55,441][86732] Avg episode reward: [(0, '8.700'), (1, '11.520')]
[2023-09-22 10:37:00,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 10723328. Throughput: 0: 760.2, 1: 759.7. Samples: 2675739. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:37:00,441][86732] Avg episode reward: [(0, '8.580'), (1, '11.710')]
[2023-09-22 10:37:00,442][88352] Saving new best policy, reward=11.710!
[2023-09-22 10:37:05,069][88473] Updated weights for policy 0, policy_version 21056 (0.0017)
[2023-09-22 10:37:05,070][88474] Updated weights for policy 1, policy_version 20960 (0.0016)
[2023-09-22 10:37:05,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 10756096. Throughput: 0: 760.9, 1: 761.1. Samples: 2680478. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:37:05,441][86732] Avg episode reward: [(0, '8.740'), (1, '11.470')]
[2023-09-22 10:37:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 10788864. Throughput: 0: 760.7, 1: 761.1. Samples: 2689596. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:37:10,440][86732] Avg episode reward: [(0, '8.410'), (1, '11.450')]
[2023-09-22 10:37:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 10813440. Throughput: 0: 760.9, 1: 761.3. Samples: 2698594. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:37:15,441][86732] Avg episode reward: [(0, '8.250'), (1, '11.470')]
[2023-09-22 10:37:18,611][88473] Updated weights for policy 0, policy_version 21216 (0.0017)
[2023-09-22 10:37:18,612][88474] Updated weights for policy 1, policy_version 21120 (0.0017)
[2023-09-22 10:37:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6067.6). Total num frames: 10846208. Throughput: 0: 764.0, 1: 762.4. Samples: 2703260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:37:20,440][86732] Avg episode reward: [(0, '8.040'), (1, '11.210')]
[2023-09-22 10:37:25,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 10878976. Throughput: 0: 759.8, 1: 758.2. Samples: 2712086. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:37:25,440][86732] Avg episode reward: [(0, '8.080'), (1, '11.220')]
[2023-09-22 10:37:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 10903552. Throughput: 0: 765.7, 1: 764.1. Samples: 2721512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:37:30,441][86732] Avg episode reward: [(0, '7.920'), (1, '11.350')]
[2023-09-22 10:37:32,169][88473] Updated weights for policy 0, policy_version 21376 (0.0015)
[2023-09-22 10:37:32,169][88474] Updated weights for policy 1, policy_version 21280 (0.0016)
[2023-09-22 10:37:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 10936320. Throughput: 0: 759.8, 1: 762.0. Samples: 2725888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:37:35,441][86732] Avg episode reward: [(0, '8.030'), (1, '11.360')]
[2023-09-22 10:37:40,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 10969088. Throughput: 0: 764.3, 1: 761.9. Samples: 2735179. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:37:40,441][86732] Avg episode reward: [(0, '8.010'), (1, '11.730')]
[2023-09-22 10:37:40,453][88352] Saving new best policy, reward=11.730!
[2023-09-22 10:37:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 10993664. Throughput: 0: 753.5, 1: 754.4. Samples: 2743592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:37:45,441][86732] Avg episode reward: [(0, '8.050'), (1, '11.240')]
[2023-09-22 10:37:45,978][88473] Updated weights for policy 0, policy_version 21536 (0.0016)
[2023-09-22 10:37:45,978][88474] Updated weights for policy 1, policy_version 21440 (0.0018)
[2023-09-22 10:37:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 11026432. Throughput: 0: 753.1, 1: 753.2. Samples: 2748259. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:37:50,440][86732] Avg episode reward: [(0, '8.300'), (1, '11.380')]
[2023-09-22 10:37:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11051008. Throughput: 0: 748.3, 1: 748.1. Samples: 2756934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:37:55,441][86732] Avg episode reward: [(0, '8.470'), (1, '10.930')]
[2023-09-22 10:37:59,637][88473] Updated weights for policy 0, policy_version 21696 (0.0014)
[2023-09-22 10:37:59,638][88474] Updated weights for policy 1, policy_version 21600 (0.0016)
[2023-09-22 10:38:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11083776. Throughput: 0: 750.4, 1: 749.7. Samples: 2766101. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:38:00,440][86732] Avg episode reward: [(0, '8.770'), (1, '11.090')]
[2023-09-22 10:38:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 11116544. Throughput: 0: 748.1, 1: 748.4. Samples: 2770606. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:38:05,441][86732] Avg episode reward: [(0, '9.080'), (1, '11.160')]
[2023-09-22 10:38:10,440][86732] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 6039.9). Total num frames: 11141120. Throughput: 0: 747.3, 1: 748.0. Samples: 2779374. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:38:10,441][86732] Avg episode reward: [(0, '8.770'), (1, '11.010')]
[2023-09-22 10:38:10,511][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000021728_5562368.pth...
[2023-09-22 10:38:10,538][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000018880_4833280.pth
[2023-09-22 10:38:10,570][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000021824_5586944.pth...
[2023-09-22 10:38:10,598][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000018976_4857856.pth
[2023-09-22 10:38:13,247][88474] Updated weights for policy 1, policy_version 21760 (0.0017)
[2023-09-22 10:38:13,247][88473] Updated weights for policy 0, policy_version 21856 (0.0019)
[2023-09-22 10:38:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11173888. Throughput: 0: 745.9, 1: 746.2. Samples: 2788658. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:38:15,441][86732] Avg episode reward: [(0, '8.870'), (1, '11.210')]
[2023-09-22 10:38:20,440][86732] Fps is (10 sec: 6553.9, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 11206656. Throughput: 0: 749.1, 1: 747.4. Samples: 2793232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:38:20,440][86732] Avg episode reward: [(0, '8.920'), (1, '11.200')]
[2023-09-22 10:38:25,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 11231232. Throughput: 0: 737.5, 1: 739.9. Samples: 2801664. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:38:25,440][86732] Avg episode reward: [(0, '8.970'), (1, '11.330')]
[2023-09-22 10:38:27,103][88474] Updated weights for policy 1, policy_version 21920 (0.0014)
[2023-09-22 10:38:27,103][88473] Updated weights for policy 0, policy_version 22016 (0.0014)
[2023-09-22 10:38:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 11264000. Throughput: 0: 745.0, 1: 745.7. Samples: 2810675. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:38:30,440][86732] Avg episode reward: [(0, '8.740'), (1, '11.460')]
[2023-09-22 10:38:35,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11296768. Throughput: 0: 745.9, 1: 745.4. Samples: 2815368. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:38:35,441][86732] Avg episode reward: [(0, '8.650'), (1, '11.330')]
[2023-09-22 10:38:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 11321344. Throughput: 0: 746.4, 1: 748.3. Samples: 2824199. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:38:40,441][86732] Avg episode reward: [(0, '8.560'), (1, '11.500')]
[2023-09-22 10:38:40,665][88473] Updated weights for policy 0, policy_version 22176 (0.0015)
[2023-09-22 10:38:40,665][88474] Updated weights for policy 1, policy_version 22080 (0.0017)
[2023-09-22 10:38:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6039.9). Total num frames: 11354112. Throughput: 0: 749.9, 1: 749.7. Samples: 2833584. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:38:45,441][86732] Avg episode reward: [(0, '8.410'), (1, '11.600')]
[2023-09-22 10:38:50,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11386880. Throughput: 0: 752.9, 1: 752.7. Samples: 2838358. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:38:50,441][86732] Avg episode reward: [(0, '8.400'), (1, '11.630')]
[2023-09-22 10:38:54,113][88473] Updated weights for policy 0, policy_version 22336 (0.0018)
[2023-09-22 10:38:54,113][88474] Updated weights for policy 1, policy_version 22240 (0.0020)
[2023-09-22 10:38:55,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 11419648. Throughput: 0: 752.8, 1: 752.8. Samples: 2847129. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:38:55,441][86732] Avg episode reward: [(0, '8.230'), (1, '11.530')]
[2023-09-22 10:39:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6039.9). Total num frames: 11444224. Throughput: 0: 751.6, 1: 752.1. Samples: 2856321. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:39:00,441][86732] Avg episode reward: [(0, '8.210'), (1, '11.630')]
[2023-09-22 10:39:05,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 11476992. Throughput: 0: 752.8, 1: 752.6. Samples: 2860971. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:39:05,440][86732] Avg episode reward: [(0, '7.840'), (1, '11.440')]
[2023-09-22 10:39:07,765][88474] Updated weights for policy 1, policy_version 22400 (0.0015)
[2023-09-22 10:39:07,766][88473] Updated weights for policy 0, policy_version 22496 (0.0015)
[2023-09-22 10:39:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 11501568. Throughput: 0: 755.6, 1: 753.7. Samples: 2869582. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:39:10,441][86732] Avg episode reward: [(0, '8.390'), (1, '11.100')]
[2023-09-22 10:39:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6039.9). Total num frames: 11534336. Throughput: 0: 760.8, 1: 760.2. Samples: 2879118. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:39:15,440][86732] Avg episode reward: [(0, '8.650'), (1, '11.130')]
[2023-09-22 10:39:20,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 11567104. Throughput: 0: 757.3, 1: 759.1. Samples: 2883608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:39:20,441][86732] Avg episode reward: [(0, '8.680'), (1, '11.580')]
[2023-09-22 10:39:21,089][88473] Updated weights for policy 0, policy_version 22656 (0.0016)
[2023-09-22 10:39:21,090][88474] Updated weights for policy 1, policy_version 22560 (0.0016)
[2023-09-22 10:39:25,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 11599872. Throughput: 0: 760.7, 1: 758.7. Samples: 2892571. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:39:25,441][86732] Avg episode reward: [(0, '8.650'), (1, '11.530')]
[2023-09-22 10:39:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 11624448. Throughput: 0: 757.0, 1: 756.4. Samples: 2901688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:39:30,441][86732] Avg episode reward: [(0, '8.520'), (1, '11.460')]
[2023-09-22 10:39:34,815][88473] Updated weights for policy 0, policy_version 22816 (0.0015)
[2023-09-22 10:39:34,815][88474] Updated weights for policy 1, policy_version 22720 (0.0015)
[2023-09-22 10:39:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11657216. Throughput: 0: 751.8, 1: 753.8. Samples: 2906113. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:39:35,441][86732] Avg episode reward: [(0, '8.870'), (1, '11.560')]
[2023-09-22 10:39:40,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 11689984. Throughput: 0: 757.4, 1: 757.2. Samples: 2915289. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:39:40,440][86732] Avg episode reward: [(0, '8.740'), (1, '11.660')]
[2023-09-22 10:39:45,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 11714560. Throughput: 0: 757.0, 1: 757.5. Samples: 2924472. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:39:45,440][86732] Avg episode reward: [(0, '8.160'), (1, '11.970')]
[2023-09-22 10:39:45,557][88352] Saving new best policy, reward=11.970!
[2023-09-22 10:39:48,288][88474] Updated weights for policy 1, policy_version 22880 (0.0019)
[2023-09-22 10:39:48,288][88473] Updated weights for policy 0, policy_version 22976 (0.0018)
[2023-09-22 10:39:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11747328. Throughput: 0: 752.7, 1: 752.9. Samples: 2928725. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:39:50,441][86732] Avg episode reward: [(0, '8.100'), (1, '11.890')]
[2023-09-22 10:39:55,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 11780096. Throughput: 0: 756.6, 1: 756.7. Samples: 2937678. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:39:55,440][86732] Avg episode reward: [(0, '8.780'), (1, '11.750')]
[2023-09-22 10:40:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 11804672. Throughput: 0: 753.9, 1: 753.4. Samples: 2946945. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:40:00,441][86732] Avg episode reward: [(0, '8.340'), (1, '11.840')]
[2023-09-22 10:40:02,013][88474] Updated weights for policy 1, policy_version 23040 (0.0013)
[2023-09-22 10:40:02,014][88473] Updated weights for policy 0, policy_version 23136 (0.0017)
[2023-09-22 10:40:05,441][86732] Fps is (10 sec: 5733.5, 60 sec: 6007.3, 300 sec: 6053.7). Total num frames: 11837440. Throughput: 0: 750.4, 1: 750.9. Samples: 2951168. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:40:05,442][86732] Avg episode reward: [(0, '8.490'), (1, '12.080')]
[2023-09-22 10:40:05,443][88352] Saving new best policy, reward=12.080!
[2023-09-22 10:40:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 11870208. Throughput: 0: 754.6, 1: 754.8. Samples: 2960495. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:40:10,440][86732] Avg episode reward: [(0, '8.860'), (1, '12.150')]
[2023-09-22 10:40:10,450][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000023136_5922816.pth...
[2023-09-22 10:40:10,451][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000023232_5947392.pth...
[2023-09-22 10:40:10,483][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000020304_5197824.pth
[2023-09-22 10:40:10,486][88352] Saving new best policy, reward=12.150!
[2023-09-22 10:40:10,490][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000020400_5222400.pth
[2023-09-22 10:40:15,440][86732] Fps is (10 sec: 5735.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 11894784. Throughput: 0: 753.1, 1: 755.8. Samples: 2969588. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:40:15,440][86732] Avg episode reward: [(0, '8.590'), (1, '11.990')]
[2023-09-22 10:40:15,522][88474] Updated weights for policy 1, policy_version 23200 (0.0017)
[2023-09-22 10:40:15,522][88473] Updated weights for policy 0, policy_version 23296 (0.0017)
[2023-09-22 10:40:20,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 11927552. Throughput: 0: 754.4, 1: 752.4. Samples: 2973919. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:40:20,440][86732] Avg episode reward: [(0, '8.600'), (1, '12.400')]
[2023-09-22 10:40:20,441][88352] Saving new best policy, reward=12.400!
[2023-09-22 10:40:25,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 11960320. Throughput: 0: 754.6, 1: 754.7. Samples: 2983205. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:40:25,441][86732] Avg episode reward: [(0, '8.860'), (1, '12.590')]
[2023-09-22 10:40:25,450][88352] Saving new best policy, reward=12.590!
[2023-09-22 10:40:28,955][88474] Updated weights for policy 1, policy_version 23360 (0.0019)
[2023-09-22 10:40:28,955][88473] Updated weights for policy 0, policy_version 23456 (0.0017)
[2023-09-22 10:40:30,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 11993088. Throughput: 0: 751.0, 1: 752.5. Samples: 2992133. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:40:30,440][86732] Avg episode reward: [(0, '8.930'), (1, '12.440')]
[2023-09-22 10:40:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12017664. Throughput: 0: 752.7, 1: 752.3. Samples: 2996447. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:40:35,441][86732] Avg episode reward: [(0, '8.930'), (1, '12.270')]
[2023-09-22 10:40:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 12050432. Throughput: 0: 754.0, 1: 753.7. Samples: 3005527. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:40:40,441][86732] Avg episode reward: [(0, '9.320'), (1, '11.950')]
[2023-09-22 10:40:40,448][88211] Saving new best policy, reward=9.320!
[2023-09-22 10:40:42,759][88474] Updated weights for policy 1, policy_version 23520 (0.0018)
[2023-09-22 10:40:42,760][88473] Updated weights for policy 0, policy_version 23616 (0.0018)
[2023-09-22 10:40:45,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 12083200. Throughput: 0: 751.4, 1: 753.4. Samples: 3014661. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:40:45,441][86732] Avg episode reward: [(0, '9.430'), (1, '11.930')]
[2023-09-22 10:40:45,441][88211] Saving new best policy, reward=9.430!
[2023-09-22 10:40:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12107776. Throughput: 0: 753.4, 1: 751.3. Samples: 3018879. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:40:50,441][86732] Avg episode reward: [(0, '9.140'), (1, '11.620')]
[2023-09-22 10:40:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12140544. Throughput: 0: 748.0, 1: 748.4. Samples: 3027835. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:40:55,440][86732] Avg episode reward: [(0, '9.810'), (1, '11.450')]
[2023-09-22 10:40:55,449][88211] Saving new best policy, reward=9.810!
[2023-09-22 10:40:56,651][88473] Updated weights for policy 0, policy_version 23776 (0.0016)
[2023-09-22 10:40:56,651][88474] Updated weights for policy 1, policy_version 23680 (0.0016)
[2023-09-22 10:41:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 12165120. Throughput: 0: 750.4, 1: 748.5. Samples: 3037038. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:41:00,442][86732] Avg episode reward: [(0, '9.740'), (1, '11.420')]
[2023-09-22 10:41:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.6, 300 sec: 6026.0). Total num frames: 12197888. Throughput: 0: 748.3, 1: 749.5. Samples: 3041320. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:41:05,441][86732] Avg episode reward: [(0, '9.500'), (1, '11.380')]
[2023-09-22 10:41:10,037][88473] Updated weights for policy 0, policy_version 23936 (0.0015)
[2023-09-22 10:41:10,037][88474] Updated weights for policy 1, policy_version 23840 (0.0016)
[2023-09-22 10:41:10,440][86732] Fps is (10 sec: 6553.9, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12230656. Throughput: 0: 749.3, 1: 748.8. Samples: 3050623. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:41:10,440][86732] Avg episode reward: [(0, '9.380'), (1, '11.390')]
[2023-09-22 10:41:15,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 12263424. Throughput: 0: 750.9, 1: 750.9. Samples: 3059717. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:41:15,441][86732] Avg episode reward: [(0, '9.550'), (1, '11.660')]
[2023-09-22 10:41:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 12288000. Throughput: 0: 753.5, 1: 753.2. Samples: 3064246. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:41:20,441][86732] Avg episode reward: [(0, '9.470'), (1, '11.560')]
[2023-09-22 10:41:23,504][88473] Updated weights for policy 0, policy_version 24096 (0.0017)
[2023-09-22 10:41:23,504][88474] Updated weights for policy 1, policy_version 24000 (0.0018)
[2023-09-22 10:41:25,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12320768. Throughput: 0: 756.3, 1: 756.6. Samples: 3073608. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:25,440][86732] Avg episode reward: [(0, '9.650'), (1, '11.700')]
[2023-09-22 10:41:30,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 12353536. Throughput: 0: 753.9, 1: 752.0. Samples: 3082423. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:30,441][86732] Avg episode reward: [(0, '9.000'), (1, '11.680')]
[2023-09-22 10:41:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12378112. Throughput: 0: 757.4, 1: 757.9. Samples: 3087069. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:35,440][86732] Avg episode reward: [(0, '9.200'), (1, '11.840')]
[2023-09-22 10:41:36,896][88473] Updated weights for policy 0, policy_version 24256 (0.0016)
[2023-09-22 10:41:36,896][88474] Updated weights for policy 1, policy_version 24160 (0.0016)
[2023-09-22 10:41:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12410880. Throughput: 0: 762.8, 1: 763.5. Samples: 3096522. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:40,441][86732] Avg episode reward: [(0, '9.180'), (1, '11.670')]
[2023-09-22 10:41:45,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 12443648. Throughput: 0: 759.3, 1: 760.5. Samples: 3105429. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:45,440][86732] Avg episode reward: [(0, '8.820'), (1, '11.450')]
[2023-09-22 10:41:50,382][88474] Updated weights for policy 1, policy_version 24320 (0.0016)
[2023-09-22 10:41:50,382][88473] Updated weights for policy 0, policy_version 24416 (0.0017)
[2023-09-22 10:41:50,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 12476416. Throughput: 0: 765.3, 1: 763.8. Samples: 3110130. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:50,440][86732] Avg episode reward: [(0, '9.220'), (1, '11.730')]
[2023-09-22 10:41:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 12500992. Throughput: 0: 759.2, 1: 762.1. Samples: 3119084. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:41:55,441][86732] Avg episode reward: [(0, '9.500'), (1, '11.830')]
[2023-09-22 10:42:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 12533760. Throughput: 0: 753.4, 1: 751.5. Samples: 3127440. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:42:00,441][86732] Avg episode reward: [(0, '9.310'), (1, '11.670')]
[2023-09-22 10:42:04,460][88474] Updated weights for policy 1, policy_version 24480 (0.0014)
[2023-09-22 10:42:04,460][88473] Updated weights for policy 0, policy_version 24576 (0.0017)
[2023-09-22 10:42:05,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 12558336. Throughput: 0: 752.4, 1: 752.5. Samples: 3131964. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:42:05,440][86732] Avg episode reward: [(0, '9.230'), (1, '11.690')]
[2023-09-22 10:42:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 12591104. Throughput: 0: 747.9, 1: 747.0. Samples: 3140878. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:42:10,441][86732] Avg episode reward: [(0, '9.520'), (1, '11.750')]
[2023-09-22 10:42:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000024544_6283264.pth...
[2023-09-22 10:42:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000024640_6307840.pth...
[2023-09-22 10:42:10,486][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000021728_5562368.pth
[2023-09-22 10:42:10,493][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000021824_5586944.pth
[2023-09-22 10:42:15,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12623872. Throughput: 0: 748.0, 1: 749.9. Samples: 3149830. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 10:42:15,441][86732] Avg episode reward: [(0, '9.230'), (1, '11.820')]
[2023-09-22 10:42:17,972][88473] Updated weights for policy 0, policy_version 24736 (0.0018)
[2023-09-22 10:42:17,972][88474] Updated weights for policy 1, policy_version 24640 (0.0015)
[2023-09-22 10:42:20,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 12648448. Throughput: 0: 751.0, 1: 750.3. Samples: 3154629. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:42:20,440][86732] Avg episode reward: [(0, '9.270'), (1, '11.790')]
[2023-09-22 10:42:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 12681216. Throughput: 0: 748.8, 1: 747.4. Samples: 3163851. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:42:25,441][86732] Avg episode reward: [(0, '9.250'), (1, '11.920')]
[2023-09-22 10:42:30,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12713984. Throughput: 0: 749.2, 1: 748.4. Samples: 3172822. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 10:42:30,441][86732] Avg episode reward: [(0, '9.060'), (1, '12.070')]
[2023-09-22 10:42:31,400][88473] Updated weights for policy 0, policy_version 24896 (0.0016)
[2023-09-22 10:42:31,400][88474] Updated weights for policy 1, policy_version 24800 (0.0017)
[2023-09-22 10:42:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 12738560. Throughput: 0: 747.5, 1: 747.9. Samples: 3177422. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:42:35,440][86732] Avg episode reward: [(0, '8.890'), (1, '12.350')]
[2023-09-22 10:42:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12771328. Throughput: 0: 751.2, 1: 750.6. Samples: 3186669. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:42:40,441][86732] Avg episode reward: [(0, '9.040'), (1, '12.590')]
[2023-09-22 10:42:44,900][88474] Updated weights for policy 1, policy_version 24960 (0.0016)
[2023-09-22 10:42:44,900][88473] Updated weights for policy 0, policy_version 25056 (0.0017)
[2023-09-22 10:42:45,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12804096. Throughput: 0: 757.6, 1: 757.5. Samples: 3195620. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:42:45,441][86732] Avg episode reward: [(0, '8.920'), (1, '12.320')]
[2023-09-22 10:42:50,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 12836864. Throughput: 0: 758.4, 1: 759.0. Samples: 3200243. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:42:50,441][86732] Avg episode reward: [(0, '8.670'), (1, '12.240')]
[2023-09-22 10:42:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12861440. Throughput: 0: 757.9, 1: 758.5. Samples: 3209114. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:42:55,441][86732] Avg episode reward: [(0, '8.630'), (1, '12.100')]
[2023-09-22 10:42:59,014][88474] Updated weights for policy 1, policy_version 25120 (0.0014)
[2023-09-22 10:42:59,014][88473] Updated weights for policy 0, policy_version 25216 (0.0014)
[2023-09-22 10:43:00,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12894208. Throughput: 0: 750.9, 1: 750.9. Samples: 3217413. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:43:00,440][86732] Avg episode reward: [(0, '8.810'), (1, '12.060')]
[2023-09-22 10:43:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12918784. Throughput: 0: 747.3, 1: 747.0. Samples: 3221873. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 10:43:05,441][86732] Avg episode reward: [(0, '8.960'), (1, '12.190')]
[2023-09-22 10:43:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12951552. Throughput: 0: 744.2, 1: 745.0. Samples: 3230865. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:43:10,441][86732] Avg episode reward: [(0, '9.110'), (1, '11.950')]
[2023-09-22 10:43:12,805][88474] Updated weights for policy 1, policy_version 25280 (0.0016)
[2023-09-22 10:43:12,806][88473] Updated weights for policy 0, policy_version 25376 (0.0015)
[2023-09-22 10:43:15,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 12984320. Throughput: 0: 744.9, 1: 746.6. Samples: 3239941. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:43:15,440][86732] Avg episode reward: [(0, '9.090'), (1, '11.820')]
[2023-09-22 10:43:20,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 13008896. Throughput: 0: 743.9, 1: 743.4. Samples: 3244353. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:43:20,441][86732] Avg episode reward: [(0, '9.410'), (1, '12.170')]
[2023-09-22 10:43:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 13041664. Throughput: 0: 743.7, 1: 741.5. Samples: 3253505. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:43:25,440][86732] Avg episode reward: [(0, '9.520'), (1, '11.910')]
[2023-09-22 10:43:26,394][88473] Updated weights for policy 0, policy_version 25536 (0.0012)
[2023-09-22 10:43:26,396][88474] Updated weights for policy 1, policy_version 25440 (0.0017)
[2023-09-22 10:43:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 13066240. Throughput: 0: 741.4, 1: 743.5. Samples: 3262441. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:43:30,441][86732] Avg episode reward: [(0, '9.430'), (1, '12.320')]
[2023-09-22 10:43:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 13099008. Throughput: 0: 736.0, 1: 737.7. Samples: 3266560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:43:35,440][86732] Avg episode reward: [(0, '9.240'), (1, '12.060')]
[2023-09-22 10:43:40,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 13123584. Throughput: 0: 730.0, 1: 730.6. Samples: 3274842. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:43:40,440][86732] Avg episode reward: [(0, '8.870'), (1, '11.910')]
[2023-09-22 10:43:40,831][88473] Updated weights for policy 0, policy_version 25696 (0.0016)
[2023-09-22 10:43:40,831][88474] Updated weights for policy 1, policy_version 25600 (0.0014)
[2023-09-22 10:43:45,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5734.4, 300 sec: 5970.4). Total num frames: 13148160. Throughput: 0: 727.3, 1: 725.6. Samples: 3282796. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:43:45,440][86732] Avg episode reward: [(0, '9.380'), (1, '11.650')]
[2023-09-22 10:43:50,440][86732] Fps is (10 sec: 5324.7, 60 sec: 5666.1, 300 sec: 5956.6). Total num frames: 13176832. Throughput: 0: 720.2, 1: 720.7. Samples: 3286713. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:43:50,441][86732] Avg episode reward: [(0, '9.250'), (1, '12.060')]
[2023-09-22 10:43:55,448][86732] Fps is (10 sec: 5729.7, 60 sec: 5733.6, 300 sec: 5970.3). Total num frames: 13205504. Throughput: 0: 700.5, 1: 700.2. Samples: 3293907. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:43:55,449][86732] Avg episode reward: [(0, '9.100'), (1, '11.730')]
[2023-09-22 10:43:57,055][88473] Updated weights for policy 0, policy_version 25856 (0.0016)
[2023-09-22 10:43:57,055][88474] Updated weights for policy 1, policy_version 25760 (0.0018)
[2023-09-22 10:44:00,440][86732] Fps is (10 sec: 5324.8, 60 sec: 5597.9, 300 sec: 5942.7). Total num frames: 13230080. Throughput: 0: 688.2, 1: 686.4. Samples: 3301797. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 10:44:00,441][86732] Avg episode reward: [(0, '9.000'), (1, '11.870')]
[2023-09-22 10:44:05,440][86732] Fps is (10 sec: 4919.2, 60 sec: 5597.9, 300 sec: 5942.7). Total num frames: 13254656. Throughput: 0: 681.1, 1: 681.5. Samples: 3305670. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:44:05,440][86732] Avg episode reward: [(0, '9.150'), (1, '12.140')]
[2023-09-22 10:44:10,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5461.3, 300 sec: 5914.9). Total num frames: 13279232. Throughput: 0: 667.1, 1: 667.9. Samples: 3313583. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:44:10,440][86732] Avg episode reward: [(0, '8.720'), (1, '11.790')]
[2023-09-22 10:44:10,448][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000025888_6627328.pth...
[2023-09-22 10:44:10,448][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000025984_6651904.pth...
[2023-09-22 10:44:10,479][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000023136_5922816.pth
[2023-09-22 10:44:10,481][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000023232_5947392.pth
[2023-09-22 10:44:12,802][88474] Updated weights for policy 1, policy_version 25920 (0.0016)
[2023-09-22 10:44:12,802][88473] Updated weights for policy 0, policy_version 26016 (0.0015)
[2023-09-22 10:44:15,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5324.8, 300 sec: 5887.1). Total num frames: 13303808. Throughput: 0: 655.6, 1: 654.2. Samples: 3321382. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:44:15,440][86732] Avg episode reward: [(0, '8.610'), (1, '12.490')]
[2023-09-22 10:44:20,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5324.8, 300 sec: 5859.4). Total num frames: 13328384. Throughput: 0: 651.6, 1: 651.2. Samples: 3325188. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:44:20,441][86732] Avg episode reward: [(0, '9.220'), (1, '12.570')]
[2023-09-22 10:44:25,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5324.8, 300 sec: 5887.1). Total num frames: 13361152. Throughput: 0: 644.2, 1: 645.4. Samples: 3332871. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:44:25,440][86732] Avg episode reward: [(0, '9.430'), (1, '12.540')]
[2023-09-22 10:44:28,503][88474] Updated weights for policy 1, policy_version 26080 (0.0012)
[2023-09-22 10:44:28,504][88473] Updated weights for policy 0, policy_version 26176 (0.0015)
[2023-09-22 10:44:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5324.8, 300 sec: 5859.4). Total num frames: 13385728. Throughput: 0: 642.6, 1: 642.9. Samples: 3340645. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:44:30,441][86732] Avg episode reward: [(0, '8.950'), (1, '12.430')]
[2023-09-22 10:44:35,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5831.6). Total num frames: 13410304. Throughput: 0: 642.1, 1: 642.5. Samples: 3344520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:44:35,440][86732] Avg episode reward: [(0, '9.040'), (1, '12.290')]
[2023-09-22 10:44:40,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5831.6). Total num frames: 13434880. Throughput: 0: 651.2, 1: 652.6. Samples: 3352565. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:44:40,441][86732] Avg episode reward: [(0, '8.790'), (1, '12.330')]
[2023-09-22 10:44:44,171][88474] Updated weights for policy 1, policy_version 26240 (0.0014)
[2023-09-22 10:44:44,172][88473] Updated weights for policy 0, policy_version 26336 (0.0012)
[2023-09-22 10:44:45,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5803.8). Total num frames: 13459456. Throughput: 0: 653.6, 1: 653.7. Samples: 3360626. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:44:45,440][86732] Avg episode reward: [(0, '9.080'), (1, '12.320')]
[2023-09-22 10:44:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5256.5, 300 sec: 5803.8). Total num frames: 13492224. Throughput: 0: 654.6, 1: 654.7. Samples: 3364590. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:44:50,440][86732] Avg episode reward: [(0, '9.000'), (1, '12.630')]
[2023-09-22 10:44:50,441][88352] Saving new best policy, reward=12.630!
[2023-09-22 10:44:55,440][86732] Fps is (10 sec: 5734.2, 60 sec: 5189.0, 300 sec: 5803.8). Total num frames: 13516800. Throughput: 0: 651.9, 1: 652.1. Samples: 3372265. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:44:55,441][86732] Avg episode reward: [(0, '9.350'), (1, '12.700')]
[2023-09-22 10:44:55,451][88352] Saving new best policy, reward=12.700!
[2023-09-22 10:45:00,009][88474] Updated weights for policy 1, policy_version 26400 (0.0014)
[2023-09-22 10:45:00,010][88473] Updated weights for policy 0, policy_version 26496 (0.0015)
[2023-09-22 10:45:00,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5776.1). Total num frames: 13541376. Throughput: 0: 647.7, 1: 647.6. Samples: 3379670. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:00,440][86732] Avg episode reward: [(0, '9.990'), (1, '12.150')]
[2023-09-22 10:45:00,441][88211] Saving new best policy, reward=9.990!
[2023-09-22 10:45:05,440][86732] Fps is (10 sec: 4915.3, 60 sec: 5188.3, 300 sec: 5748.3). Total num frames: 13565952. Throughput: 0: 647.0, 1: 646.0. Samples: 3383369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:05,441][86732] Avg episode reward: [(0, '9.970'), (1, '12.270')]
[2023-09-22 10:45:10,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5188.2, 300 sec: 5748.3). Total num frames: 13590528. Throughput: 0: 650.0, 1: 648.1. Samples: 3391286. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:10,441][86732] Avg episode reward: [(0, '9.630'), (1, '12.620')]
[2023-09-22 10:45:15,440][86732] Fps is (10 sec: 4915.1, 60 sec: 5188.2, 300 sec: 5720.5). Total num frames: 13615104. Throughput: 0: 652.0, 1: 652.7. Samples: 3399359. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:15,441][86732] Avg episode reward: [(0, '9.650'), (1, '12.620')]
[2023-09-22 10:45:15,823][88473] Updated weights for policy 0, policy_version 26656 (0.0013)
[2023-09-22 10:45:15,824][88474] Updated weights for policy 1, policy_version 26560 (0.0014)
[2023-09-22 10:45:20,440][86732] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5692.7). Total num frames: 13639680. Throughput: 0: 653.3, 1: 652.3. Samples: 3403274. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:20,441][86732] Avg episode reward: [(0, '10.280'), (1, '12.720')]
[2023-09-22 10:45:20,462][88211] Saving new best policy, reward=10.280!
[2023-09-22 10:45:20,535][88352] Saving new best policy, reward=12.720!
[2023-09-22 10:45:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.2, 300 sec: 5692.7). Total num frames: 13672448. Throughput: 0: 648.9, 1: 646.6. Samples: 3410862. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:25,441][86732] Avg episode reward: [(0, '10.540'), (1, '12.350')]
[2023-09-22 10:45:25,452][88211] Saving new best policy, reward=10.540!
[2023-09-22 10:45:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5692.7). Total num frames: 13697024. Throughput: 0: 644.6, 1: 644.0. Samples: 3418611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:30,441][86732] Avg episode reward: [(0, '10.380'), (1, '12.350')]
[2023-09-22 10:45:31,591][88474] Updated weights for policy 1, policy_version 26720 (0.0014)
[2023-09-22 10:45:31,592][88473] Updated weights for policy 0, policy_version 26816 (0.0014)
[2023-09-22 10:45:35,440][86732] Fps is (10 sec: 4915.4, 60 sec: 5188.3, 300 sec: 5665.0). Total num frames: 13721600. Throughput: 0: 644.0, 1: 644.3. Samples: 3422561. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:45:35,440][86732] Avg episode reward: [(0, '9.950'), (1, '12.250')]
[2023-09-22 10:45:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5324.8, 300 sec: 5665.0). Total num frames: 13754368. Throughput: 0: 662.0, 1: 660.9. Samples: 3431793. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:45:40,441][86732] Avg episode reward: [(0, '9.710'), (1, '12.130')]
[2023-09-22 10:45:45,403][88473] Updated weights for policy 0, policy_version 26976 (0.0014)
[2023-09-22 10:45:45,404][88474] Updated weights for policy 1, policy_version 26880 (0.0017)
[2023-09-22 10:45:45,440][86732] Fps is (10 sec: 6553.5, 60 sec: 5461.3, 300 sec: 5692.7). Total num frames: 13787136. Throughput: 0: 676.7, 1: 678.3. Samples: 3440644. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:45:45,441][86732] Avg episode reward: [(0, '9.990'), (1, '12.060')]
[2023-09-22 10:45:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5324.8, 300 sec: 5665.0). Total num frames: 13811712. Throughput: 0: 685.0, 1: 684.3. Samples: 3444987. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:45:50,440][86732] Avg episode reward: [(0, '10.560'), (1, '12.200')]
[2023-09-22 10:45:50,441][88211] Saving new best policy, reward=10.560!
[2023-09-22 10:45:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5461.4, 300 sec: 5692.8). Total num frames: 13844480. Throughput: 0: 698.3, 1: 698.1. Samples: 3454125. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:45:55,440][86732] Avg episode reward: [(0, '10.840'), (1, '12.200')]
[2023-09-22 10:45:55,450][88211] Saving new best policy, reward=10.840!
[2023-09-22 10:45:58,993][88473] Updated weights for policy 0, policy_version 27136 (0.0016)
[2023-09-22 10:45:58,993][88474] Updated weights for policy 1, policy_version 27040 (0.0015)
[2023-09-22 10:46:00,440][86732] Fps is (10 sec: 6553.4, 60 sec: 5597.8, 300 sec: 5692.7). Total num frames: 13877248. Throughput: 0: 708.7, 1: 709.4. Samples: 3463172. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:00,441][86732] Avg episode reward: [(0, '10.360'), (1, '12.400')]
[2023-09-22 10:46:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 13901824. Throughput: 0: 710.2, 1: 712.4. Samples: 3467289. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:05,440][86732] Avg episode reward: [(0, '11.000'), (1, '12.530')]
[2023-09-22 10:46:05,441][88211] Saving new best policy, reward=11.000!
[2023-09-22 10:46:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 13934592. Throughput: 0: 723.7, 1: 724.6. Samples: 3476035. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:10,441][86732] Avg episode reward: [(0, '10.280'), (1, '12.770')]
[2023-09-22 10:46:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000027168_6955008.pth...
[2023-09-22 10:46:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000027264_6979584.pth...
[2023-09-22 10:46:10,488][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000024640_6307840.pth
[2023-09-22 10:46:10,489][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000024544_6283264.pth
[2023-09-22 10:46:10,493][88352] Saving new best policy, reward=12.770!
[2023-09-22 10:46:13,170][88474] Updated weights for policy 1, policy_version 27200 (0.0015)
[2023-09-22 10:46:13,171][88473] Updated weights for policy 0, policy_version 27296 (0.0015)
[2023-09-22 10:46:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 13959168. Throughput: 0: 739.8, 1: 740.9. Samples: 3485245. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:15,441][86732] Avg episode reward: [(0, '10.560'), (1, '13.080')]
[2023-09-22 10:46:15,442][88352] Saving new best policy, reward=13.080!
[2023-09-22 10:46:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5665.0). Total num frames: 13991936. Throughput: 0: 746.2, 1: 747.7. Samples: 3489788. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:20,441][86732] Avg episode reward: [(0, '10.410'), (1, '12.720')]
[2023-09-22 10:46:25,440][86732] Fps is (10 sec: 6553.8, 60 sec: 5871.0, 300 sec: 5665.0). Total num frames: 14024704. Throughput: 0: 742.8, 1: 742.8. Samples: 3498644. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:25,440][86732] Avg episode reward: [(0, '10.840'), (1, '12.730')]
[2023-09-22 10:46:26,634][88473] Updated weights for policy 0, policy_version 27456 (0.0015)
[2023-09-22 10:46:26,634][88474] Updated weights for policy 1, policy_version 27360 (0.0017)
[2023-09-22 10:46:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5665.0). Total num frames: 14049280. Throughput: 0: 750.8, 1: 749.0. Samples: 3508139. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:30,441][86732] Avg episode reward: [(0, '10.370'), (1, '12.740')]
[2023-09-22 10:46:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5665.0). Total num frames: 14082048. Throughput: 0: 747.2, 1: 749.2. Samples: 3512324. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:46:35,441][86732] Avg episode reward: [(0, '10.720'), (1, '13.090')]
[2023-09-22 10:46:35,442][88352] Saving new best policy, reward=13.090!
[2023-09-22 10:46:40,200][88474] Updated weights for policy 1, policy_version 27520 (0.0018)
[2023-09-22 10:46:40,200][88473] Updated weights for policy 0, policy_version 27616 (0.0018)
[2023-09-22 10:46:40,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14114816. Throughput: 0: 747.5, 1: 748.3. Samples: 3521436. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:46:40,440][86732] Avg episode reward: [(0, '11.130'), (1, '12.560')]
[2023-09-22 10:46:40,448][88211] Saving new best policy, reward=11.130!
[2023-09-22 10:46:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5637.2). Total num frames: 14139392. Throughput: 0: 750.8, 1: 750.7. Samples: 3530740. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:46:45,441][86732] Avg episode reward: [(0, '11.370'), (1, '12.460')]
[2023-09-22 10:46:45,516][88211] Saving new best policy, reward=11.370!
[2023-09-22 10:46:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5665.0). Total num frames: 14172160. Throughput: 0: 754.4, 1: 752.6. Samples: 3535105. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:50,441][86732] Avg episode reward: [(0, '11.180'), (1, '12.590')]
[2023-09-22 10:46:53,633][88474] Updated weights for policy 1, policy_version 27680 (0.0017)
[2023-09-22 10:46:53,633][88473] Updated weights for policy 0, policy_version 27776 (0.0014)
[2023-09-22 10:46:55,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14204928. Throughput: 0: 759.2, 1: 758.7. Samples: 3544342. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:46:55,440][86732] Avg episode reward: [(0, '10.630'), (1, '12.890')]
[2023-09-22 10:47:00,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 14237696. Throughput: 0: 757.3, 1: 757.0. Samples: 3553387. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:00,441][86732] Avg episode reward: [(0, '10.560'), (1, '12.790')]
[2023-09-22 10:47:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5665.0). Total num frames: 14262272. Throughput: 0: 759.4, 1: 757.4. Samples: 3558041. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:05,441][86732] Avg episode reward: [(0, '10.740'), (1, '12.720')]
[2023-09-22 10:47:06,989][88473] Updated weights for policy 0, policy_version 27936 (0.0015)
[2023-09-22 10:47:06,990][88474] Updated weights for policy 1, policy_version 27840 (0.0016)
[2023-09-22 10:47:10,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14295040. Throughput: 0: 762.8, 1: 765.2. Samples: 3567405. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:10,440][86732] Avg episode reward: [(0, '10.550'), (1, '12.640')]
[2023-09-22 10:47:15,440][86732] Fps is (10 sec: 6144.1, 60 sec: 6075.8, 300 sec: 5678.9). Total num frames: 14323712. Throughput: 0: 704.8, 1: 752.8. Samples: 3573731. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:15,440][86732] Avg episode reward: [(0, '10.540'), (1, '13.240')]
[2023-09-22 10:47:15,441][88352] Saving new best policy, reward=13.240!
[2023-09-22 10:47:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14352384. Throughput: 0: 755.1, 1: 752.8. Samples: 3580179. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:20,441][86732] Avg episode reward: [(0, '10.420'), (1, '12.870')]
[2023-09-22 10:47:20,908][88474] Updated weights for policy 1, policy_version 28000 (0.0014)
[2023-09-22 10:47:20,908][88473] Updated weights for policy 0, policy_version 28096 (0.0017)
[2023-09-22 10:47:25,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14385152. Throughput: 0: 755.1, 1: 755.0. Samples: 3589392. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:25,440][86732] Avg episode reward: [(0, '11.230'), (1, '13.020')]
[2023-09-22 10:47:30,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5692.7). Total num frames: 14417920. Throughput: 0: 751.0, 1: 751.2. Samples: 3598340. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:47:30,440][86732] Avg episode reward: [(0, '11.430'), (1, '12.820')]
[2023-09-22 10:47:30,441][88211] Saving new best policy, reward=11.430!
[2023-09-22 10:47:34,427][88473] Updated weights for policy 0, policy_version 28256 (0.0016)
[2023-09-22 10:47:34,427][88474] Updated weights for policy 1, policy_version 28160 (0.0018)
[2023-09-22 10:47:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14442496. Throughput: 0: 751.6, 1: 751.7. Samples: 3602755. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:47:35,440][86732] Avg episode reward: [(0, '11.080'), (1, '12.860')]
[2023-09-22 10:47:40,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14475264. Throughput: 0: 750.8, 1: 750.6. Samples: 3611907. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:47:40,440][86732] Avg episode reward: [(0, '10.570'), (1, '12.480')]
[2023-09-22 10:47:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5637.2). Total num frames: 14499840. Throughput: 0: 748.9, 1: 750.6. Samples: 3620864. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:47:45,440][86732] Avg episode reward: [(0, '10.070'), (1, '12.150')]
[2023-09-22 10:47:48,406][88473] Updated weights for policy 0, policy_version 28416 (0.0016)
[2023-09-22 10:47:48,407][88474] Updated weights for policy 1, policy_version 28320 (0.0016)
[2023-09-22 10:47:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14532608. Throughput: 0: 742.6, 1: 744.6. Samples: 3624964. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:50,440][86732] Avg episode reward: [(0, '10.680'), (1, '12.680')]
[2023-09-22 10:47:55,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14565376. Throughput: 0: 738.6, 1: 737.6. Samples: 3633832. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:47:55,440][86732] Avg episode reward: [(0, '10.760'), (1, '12.350')]
[2023-09-22 10:48:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5665.0). Total num frames: 14589952. Throughput: 0: 791.7, 1: 743.3. Samples: 3642806. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:48:00,440][86732] Avg episode reward: [(0, '10.550'), (1, '12.630')]
[2023-09-22 10:48:02,242][88474] Updated weights for policy 1, policy_version 28480 (0.0014)
[2023-09-22 10:48:02,242][88473] Updated weights for policy 0, policy_version 28576 (0.0012)
[2023-09-22 10:48:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14622720. Throughput: 0: 746.4, 1: 748.8. Samples: 3647466. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:48:05,441][86732] Avg episode reward: [(0, '10.800'), (1, '12.180')]
[2023-09-22 10:48:10,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6007.4, 300 sec: 5665.0). Total num frames: 14655488. Throughput: 0: 743.1, 1: 742.8. Samples: 3656260. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:48:10,441][86732] Avg episode reward: [(0, '10.930'), (1, '12.290')]
[2023-09-22 10:48:10,453][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000028672_7340032.pth...
[2023-09-22 10:48:10,453][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000028576_7315456.pth...
[2023-09-22 10:48:10,488][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000025888_6627328.pth
[2023-09-22 10:48:10,492][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000025984_6651904.pth
[2023-09-22 10:48:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5939.2, 300 sec: 5665.0). Total num frames: 14680064. Throughput: 0: 747.1, 1: 744.1. Samples: 3665446. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:48:15,441][86732] Avg episode reward: [(0, '10.840'), (1, '12.460')]
[2023-09-22 10:48:15,872][88474] Updated weights for policy 1, policy_version 28640 (0.0016)
[2023-09-22 10:48:15,872][88473] Updated weights for policy 0, policy_version 28736 (0.0018)
[2023-09-22 10:48:20,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5665.0). Total num frames: 14712832. Throughput: 0: 742.2, 1: 741.4. Samples: 3669517. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:48:20,440][86732] Avg episode reward: [(0, '10.980'), (1, '12.660')]
[2023-09-22 10:48:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5665.0). Total num frames: 14737408. Throughput: 0: 735.5, 1: 737.8. Samples: 3678208. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:48:25,440][86732] Avg episode reward: [(0, '11.710'), (1, '12.350')]
[2023-09-22 10:48:25,449][88211] Saving new best policy, reward=11.710!
[2023-09-22 10:48:29,824][88474] Updated weights for policy 1, policy_version 28800 (0.0015)
[2023-09-22 10:48:29,824][88473] Updated weights for policy 0, policy_version 28896 (0.0013)
[2023-09-22 10:48:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5665.0). Total num frames: 14770176. Throughput: 0: 739.8, 1: 737.3. Samples: 3687335. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:48:30,440][86732] Avg episode reward: [(0, '11.120'), (1, '12.610')]
[2023-09-22 10:48:35,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5692.7). Total num frames: 14802944. Throughput: 0: 744.5, 1: 743.1. Samples: 3691904. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:48:35,441][86732] Avg episode reward: [(0, '11.520'), (1, '12.600')]
[2023-09-22 10:48:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5692.7). Total num frames: 14827520. Throughput: 0: 742.9, 1: 744.0. Samples: 3700741. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:48:40,441][86732] Avg episode reward: [(0, '11.190'), (1, '12.700')]
[2023-09-22 10:48:43,418][88474] Updated weights for policy 1, policy_version 28960 (0.0016)
[2023-09-22 10:48:43,418][88473] Updated weights for policy 0, policy_version 29056 (0.0015)
[2023-09-22 10:48:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5706.6). Total num frames: 14860288. Throughput: 0: 747.8, 1: 748.2. Samples: 3710122. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:48:45,441][86732] Avg episode reward: [(0, '10.380'), (1, '12.630')]
[2023-09-22 10:48:50,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5720.7). Total num frames: 14893056. Throughput: 0: 751.2, 1: 749.7. Samples: 3715008. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:48:50,440][86732] Avg episode reward: [(0, '10.500'), (1, '12.450')]
[2023-09-22 10:48:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5720.5). Total num frames: 14917632. Throughput: 0: 747.5, 1: 747.7. Samples: 3723543. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:48:55,441][86732] Avg episode reward: [(0, '10.910'), (1, '12.650')]
[2023-09-22 10:48:56,931][88474] Updated weights for policy 1, policy_version 29120 (0.0015)
[2023-09-22 10:48:56,931][88473] Updated weights for policy 0, policy_version 29216 (0.0014)
[2023-09-22 10:49:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5748.3). Total num frames: 14950400. Throughput: 0: 749.1, 1: 750.6. Samples: 3732932. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 10:49:00,441][86732] Avg episode reward: [(0, '11.160'), (1, '13.120')]
[2023-09-22 10:49:05,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5776.1). Total num frames: 14983168. Throughput: 0: 755.0, 1: 758.0. Samples: 3737600. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:49:05,440][86732] Avg episode reward: [(0, '10.850'), (1, '12.730')]
[2023-09-22 10:49:10,312][88473] Updated weights for policy 0, policy_version 29376 (0.0016)
[2023-09-22 10:49:10,312][88474] Updated weights for policy 1, policy_version 29280 (0.0016)
[2023-09-22 10:49:10,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5803.8). Total num frames: 15015936. Throughput: 0: 760.2, 1: 757.8. Samples: 3746515. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:49:10,440][86732] Avg episode reward: [(0, '10.640'), (1, '12.450')]
[2023-09-22 10:49:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5803.8). Total num frames: 15040512. Throughput: 0: 759.9, 1: 761.7. Samples: 3755808. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:49:15,441][86732] Avg episode reward: [(0, '10.750'), (1, '12.380')]
[2023-09-22 10:49:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5803.8). Total num frames: 15073280. Throughput: 0: 757.5, 1: 758.8. Samples: 3760136. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:49:20,441][86732] Avg episode reward: [(0, '11.050'), (1, '12.400')]
[2023-09-22 10:49:23,772][88474] Updated weights for policy 1, policy_version 29440 (0.0016)
[2023-09-22 10:49:23,772][88473] Updated weights for policy 0, policy_version 29536 (0.0013)
[2023-09-22 10:49:25,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5831.6). Total num frames: 15106048. Throughput: 0: 763.0, 1: 760.0. Samples: 3769276. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:49:25,441][86732] Avg episode reward: [(0, '10.430'), (1, '12.220')]
[2023-09-22 10:49:30,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 15130624. Throughput: 0: 755.6, 1: 754.4. Samples: 3778071. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:49:30,440][86732] Avg episode reward: [(0, '10.990'), (1, '12.130')]
[2023-09-22 10:49:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 15163392. Throughput: 0: 750.9, 1: 751.8. Samples: 3782631. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:49:35,440][86732] Avg episode reward: [(0, '11.180'), (1, '12.310')]
[2023-09-22 10:49:37,599][88474] Updated weights for policy 1, policy_version 29600 (0.0016)
[2023-09-22 10:49:37,600][88473] Updated weights for policy 0, policy_version 29696 (0.0015)
[2023-09-22 10:49:40,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5887.1). Total num frames: 15196160. Throughput: 0: 755.1, 1: 755.3. Samples: 3791511. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:49:40,440][86732] Avg episode reward: [(0, '11.020'), (1, '12.250')]
[2023-09-22 10:49:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 15220736. Throughput: 0: 751.2, 1: 751.4. Samples: 3800552. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:49:45,440][86732] Avg episode reward: [(0, '11.030'), (1, '12.790')]
[2023-09-22 10:49:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 15253504. Throughput: 0: 750.9, 1: 750.6. Samples: 3805169. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:49:50,440][86732] Avg episode reward: [(0, '11.020'), (1, '12.860')]
[2023-09-22 10:49:51,161][88473] Updated weights for policy 0, policy_version 29856 (0.0015)
[2023-09-22 10:49:51,162][88474] Updated weights for policy 1, policy_version 29760 (0.0015)
[2023-09-22 10:49:55,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5914.9). Total num frames: 15286272. Throughput: 0: 751.8, 1: 751.8. Samples: 3814177. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:49:55,440][86732] Avg episode reward: [(0, '11.410'), (1, '12.750')]
[2023-09-22 10:50:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 15310848. Throughput: 0: 753.1, 1: 753.1. Samples: 3823584. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:50:00,441][86732] Avg episode reward: [(0, '11.390'), (1, '13.080')]
[2023-09-22 10:50:04,658][88473] Updated weights for policy 0, policy_version 30016 (0.0015)
[2023-09-22 10:50:04,657][88474] Updated weights for policy 1, policy_version 29920 (0.0018)
[2023-09-22 10:50:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 15343616. Throughput: 0: 752.2, 1: 751.0. Samples: 3827783. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:50:05,440][86732] Avg episode reward: [(0, '11.250'), (1, '13.420')]
[2023-09-22 10:50:05,441][88352] Saving new best policy, reward=13.420!
[2023-09-22 10:50:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 15376384. Throughput: 0: 751.8, 1: 752.8. Samples: 3836987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:10,441][86732] Avg episode reward: [(0, '10.630'), (1, '12.940')]
[2023-09-22 10:50:10,447][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000029984_7675904.pth...
[2023-09-22 10:50:10,448][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000030080_7700480.pth...
[2023-09-22 10:50:10,485][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000027168_6955008.pth
[2023-09-22 10:50:10,488][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000027264_6979584.pth
[2023-09-22 10:50:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 15400960. Throughput: 0: 754.9, 1: 757.9. Samples: 3846144. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:15,440][86732] Avg episode reward: [(0, '10.690'), (1, '12.580')]
[2023-09-22 10:50:18,045][88473] Updated weights for policy 0, policy_version 30176 (0.0015)
[2023-09-22 10:50:18,045][88474] Updated weights for policy 1, policy_version 30080 (0.0017)
[2023-09-22 10:50:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 15433728. Throughput: 0: 758.2, 1: 756.6. Samples: 3850794. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:20,440][86732] Avg episode reward: [(0, '10.910'), (1, '12.380')]
[2023-09-22 10:50:25,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 15466496. Throughput: 0: 759.1, 1: 758.8. Samples: 3859813. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:25,440][86732] Avg episode reward: [(0, '11.610'), (1, '12.630')]
[2023-09-22 10:50:30,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 6012.1). Total num frames: 15495168. Throughput: 0: 756.2, 1: 757.6. Samples: 3868672. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:30,442][86732] Avg episode reward: [(0, '11.230'), (1, '12.570')]
[2023-09-22 10:50:31,897][88474] Updated weights for policy 1, policy_version 30240 (0.0016)
[2023-09-22 10:50:31,897][88473] Updated weights for policy 0, policy_version 30336 (0.0018)
[2023-09-22 10:50:35,442][86732] Fps is (10 sec: 5733.1, 60 sec: 6007.2, 300 sec: 5998.2). Total num frames: 15523840. Throughput: 0: 753.2, 1: 751.5. Samples: 3872883. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:35,443][86732] Avg episode reward: [(0, '11.110'), (1, '12.520')]
[2023-09-22 10:50:40,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 15556608. Throughput: 0: 755.9, 1: 756.8. Samples: 3882246. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:40,441][86732] Avg episode reward: [(0, '11.130'), (1, '12.110')]
[2023-09-22 10:50:45,440][86732] Fps is (10 sec: 6555.1, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 15589376. Throughput: 0: 750.9, 1: 751.6. Samples: 3891200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:45,440][88473] Updated weights for policy 0, policy_version 30496 (0.0015)
[2023-09-22 10:50:45,441][86732] Avg episode reward: [(0, '11.130'), (1, '12.510')]
[2023-09-22 10:50:45,442][88474] Updated weights for policy 1, policy_version 30400 (0.0016)
[2023-09-22 10:50:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 15613952. Throughput: 0: 756.1, 1: 755.4. Samples: 3895801. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:50,441][86732] Avg episode reward: [(0, '10.750'), (1, '12.700')]
[2023-09-22 10:50:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 15646720. Throughput: 0: 755.9, 1: 754.7. Samples: 3904967. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:50:55,440][86732] Avg episode reward: [(0, '11.090'), (1, '12.720')]
[2023-09-22 10:50:58,860][88473] Updated weights for policy 0, policy_version 30656 (0.0015)
[2023-09-22 10:50:58,860][88474] Updated weights for policy 1, policy_version 30560 (0.0016)
[2023-09-22 10:51:00,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 15679488. Throughput: 0: 753.8, 1: 751.9. Samples: 3913898. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:51:00,441][86732] Avg episode reward: [(0, '10.670'), (1, '12.720')]
[2023-09-22 10:51:05,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 15704064. Throughput: 0: 752.8, 1: 754.0. Samples: 3918596. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:51:05,441][86732] Avg episode reward: [(0, '11.120'), (1, '12.790')]
[2023-09-22 10:51:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 15736832. Throughput: 0: 757.4, 1: 758.8. Samples: 3928042. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:51:10,440][86732] Avg episode reward: [(0, '10.880'), (1, '12.640')]
[2023-09-22 10:51:12,342][88473] Updated weights for policy 0, policy_version 30816 (0.0016)
[2023-09-22 10:51:12,343][88474] Updated weights for policy 1, policy_version 30720 (0.0016)
[2023-09-22 10:51:15,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 15769600. Throughput: 0: 760.0, 1: 758.2. Samples: 3936991. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:51:15,441][86732] Avg episode reward: [(0, '11.490'), (1, '12.920')]
[2023-09-22 10:51:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 15794176. Throughput: 0: 763.3, 1: 761.8. Samples: 3941510. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:51:20,441][86732] Avg episode reward: [(0, '11.360'), (1, '12.370')]
[2023-09-22 10:51:25,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 15826944. Throughput: 0: 758.7, 1: 760.1. Samples: 3950592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:51:25,441][86732] Avg episode reward: [(0, '11.510'), (1, '12.440')]
[2023-09-22 10:51:25,802][88474] Updated weights for policy 1, policy_version 30880 (0.0015)
[2023-09-22 10:51:25,802][88473] Updated weights for policy 0, policy_version 30976 (0.0016)
[2023-09-22 10:51:30,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6075.7, 300 sec: 6026.0). Total num frames: 15859712. Throughput: 0: 762.1, 1: 760.4. Samples: 3959711. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:51:30,441][86732] Avg episode reward: [(0, '11.350'), (1, '12.570')]
[2023-09-22 10:51:35,440][86732] Fps is (10 sec: 6553.9, 60 sec: 6144.2, 300 sec: 6026.0). Total num frames: 15892480. Throughput: 0: 762.4, 1: 762.2. Samples: 3964408. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:51:35,440][86732] Avg episode reward: [(0, '10.650'), (1, '12.390')]
[2023-09-22 10:51:39,424][88473] Updated weights for policy 0, policy_version 31136 (0.0017)
[2023-09-22 10:51:39,424][88474] Updated weights for policy 1, policy_version 31040 (0.0017)
[2023-09-22 10:51:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 15917056. Throughput: 0: 755.6, 1: 758.9. Samples: 3973122. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:51:40,441][86732] Avg episode reward: [(0, '11.400'), (1, '12.650')]
[2023-09-22 10:51:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 15949824. Throughput: 0: 758.4, 1: 758.6. Samples: 3982161. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:51:45,440][86732] Avg episode reward: [(0, '11.990'), (1, '12.660')]
[2023-09-22 10:51:45,441][88211] Saving new best policy, reward=11.990!
[2023-09-22 10:51:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 15974400. Throughput: 0: 754.0, 1: 752.8. Samples: 3986402. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:51:50,441][86732] Avg episode reward: [(0, '11.420'), (1, '12.430')]
[2023-09-22 10:51:53,466][88473] Updated weights for policy 0, policy_version 31296 (0.0016)
[2023-09-22 10:51:53,466][88474] Updated weights for policy 1, policy_version 31200 (0.0017)
[2023-09-22 10:51:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 16007168. Throughput: 0: 748.1, 1: 746.5. Samples: 3995299. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:51:55,440][86732] Avg episode reward: [(0, '11.340'), (1, '12.870')]
[2023-09-22 10:52:00,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16039936. Throughput: 0: 745.9, 1: 745.6. Samples: 4004111. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:52:00,441][86732] Avg episode reward: [(0, '11.750'), (1, '12.960')]
[2023-09-22 10:52:05,440][86732] Fps is (10 sec: 5734.2, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 16064512. Throughput: 0: 747.3, 1: 748.4. Samples: 4008816. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:52:05,441][86732] Avg episode reward: [(0, '11.720'), (1, '12.980')]
[2023-09-22 10:52:06,923][88473] Updated weights for policy 0, policy_version 31456 (0.0016)
[2023-09-22 10:52:06,923][88474] Updated weights for policy 1, policy_version 31360 (0.0017)
[2023-09-22 10:52:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6012.1). Total num frames: 16097280. Throughput: 0: 750.9, 1: 750.1. Samples: 4018137. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:10,440][86732] Avg episode reward: [(0, '11.420'), (1, '12.630')]
[2023-09-22 10:52:10,449][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000031392_8036352.pth...
[2023-09-22 10:52:10,451][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000031488_8060928.pth...
[2023-09-22 10:52:10,487][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000028672_7340032.pth
[2023-09-22 10:52:10,487][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000028576_7315456.pth
[2023-09-22 10:52:15,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16130048. Throughput: 0: 747.0, 1: 746.8. Samples: 4026929. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:15,441][86732] Avg episode reward: [(0, '11.490'), (1, '12.160')]
[2023-09-22 10:52:20,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 6012.1). Total num frames: 16158720. Throughput: 0: 746.0, 1: 746.1. Samples: 4031553. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:20,441][86732] Avg episode reward: [(0, '10.940'), (1, '12.470')]
[2023-09-22 10:52:20,444][88474] Updated weights for policy 1, policy_version 31520 (0.0017)
[2023-09-22 10:52:20,445][88473] Updated weights for policy 0, policy_version 31616 (0.0016)
[2023-09-22 10:52:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 16187392. Throughput: 0: 751.0, 1: 750.9. Samples: 4040708. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:25,441][86732] Avg episode reward: [(0, '11.310'), (1, '12.150')]
[2023-09-22 10:52:30,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16220160. Throughput: 0: 754.9, 1: 754.7. Samples: 4050093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:30,441][86732] Avg episode reward: [(0, '10.920'), (1, '12.160')]
[2023-09-22 10:52:33,654][88474] Updated weights for policy 1, policy_version 31680 (0.0015)
[2023-09-22 10:52:33,656][88473] Updated weights for policy 0, policy_version 31776 (0.0017)
[2023-09-22 10:52:35,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16252928. Throughput: 0: 760.6, 1: 761.3. Samples: 4054888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:35,440][86732] Avg episode reward: [(0, '10.670'), (1, '12.040')]
[2023-09-22 10:52:40,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 16285696. Throughput: 0: 763.1, 1: 763.0. Samples: 4063974. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:40,441][86732] Avg episode reward: [(0, '9.830'), (1, '12.130')]
[2023-09-22 10:52:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16310272. Throughput: 0: 764.3, 1: 764.6. Samples: 4072912. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:45,440][86732] Avg episode reward: [(0, '10.110'), (1, '12.280')]
[2023-09-22 10:52:47,317][88473] Updated weights for policy 0, policy_version 31936 (0.0014)
[2023-09-22 10:52:47,317][88474] Updated weights for policy 1, policy_version 31840 (0.0014)
[2023-09-22 10:52:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 16343040. Throughput: 0: 761.6, 1: 762.8. Samples: 4077416. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:52:50,440][86732] Avg episode reward: [(0, '9.960'), (1, '12.120')]
[2023-09-22 10:52:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 16367616. Throughput: 0: 752.5, 1: 751.9. Samples: 4085834. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:52:55,441][86732] Avg episode reward: [(0, '10.430'), (1, '11.790')]
[2023-09-22 10:53:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16400384. Throughput: 0: 756.8, 1: 756.5. Samples: 4095027. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:53:00,441][86732] Avg episode reward: [(0, '10.220'), (1, '12.040')]
[2023-09-22 10:53:01,074][88473] Updated weights for policy 0, policy_version 32096 (0.0018)
[2023-09-22 10:53:01,075][88474] Updated weights for policy 1, policy_version 32000 (0.0017)
[2023-09-22 10:53:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 16433152. Throughput: 0: 758.0, 1: 757.2. Samples: 4099738. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:53:05,441][86732] Avg episode reward: [(0, '10.470'), (1, '11.560')]
[2023-09-22 10:53:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16457728. Throughput: 0: 752.9, 1: 751.2. Samples: 4108391. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:53:10,440][86732] Avg episode reward: [(0, '10.690'), (1, '11.480')]
[2023-09-22 10:53:14,865][88473] Updated weights for policy 0, policy_version 32256 (0.0018)
[2023-09-22 10:53:14,866][88474] Updated weights for policy 1, policy_version 32160 (0.0017)
[2023-09-22 10:53:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16490496. Throughput: 0: 747.3, 1: 747.4. Samples: 4117357. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:15,441][86732] Avg episode reward: [(0, '10.660'), (1, '11.280')]
[2023-09-22 10:53:20,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6075.7, 300 sec: 6053.7). Total num frames: 16523264. Throughput: 0: 747.4, 1: 747.8. Samples: 4122170. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:20,440][86732] Avg episode reward: [(0, '10.740'), (1, '11.330')]
[2023-09-22 10:53:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16547840. Throughput: 0: 745.4, 1: 745.6. Samples: 4131071. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:25,441][86732] Avg episode reward: [(0, '11.160'), (1, '10.770')]
[2023-09-22 10:53:28,405][88474] Updated weights for policy 1, policy_version 32320 (0.0017)
[2023-09-22 10:53:28,405][88473] Updated weights for policy 0, policy_version 32416 (0.0015)
[2023-09-22 10:53:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16580608. Throughput: 0: 744.8, 1: 744.5. Samples: 4139934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:30,441][86732] Avg episode reward: [(0, '11.460'), (1, '11.260')]
[2023-09-22 10:53:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 16605184. Throughput: 0: 743.9, 1: 743.8. Samples: 4144361. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:35,441][86732] Avg episode reward: [(0, '11.300'), (1, '11.180')]
[2023-09-22 10:53:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 16637952. Throughput: 0: 745.3, 1: 745.0. Samples: 4152898. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:40,441][86732] Avg episode reward: [(0, '11.370'), (1, '11.460')]
[2023-09-22 10:53:42,592][88473] Updated weights for policy 0, policy_version 32576 (0.0015)
[2023-09-22 10:53:42,592][88474] Updated weights for policy 1, policy_version 32480 (0.0016)
[2023-09-22 10:53:45,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16670720. Throughput: 0: 739.7, 1: 740.2. Samples: 4161622. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:45,440][86732] Avg episode reward: [(0, '11.360'), (1, '11.600')]
[2023-09-22 10:53:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 6026.0). Total num frames: 16695296. Throughput: 0: 737.7, 1: 738.9. Samples: 4166184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:50,441][86732] Avg episode reward: [(0, '11.630'), (1, '11.320')]
[2023-09-22 10:53:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16728064. Throughput: 0: 741.1, 1: 740.2. Samples: 4175049. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:53:55,440][86732] Avg episode reward: [(0, '11.340'), (1, '11.590')]
[2023-09-22 10:53:56,463][88473] Updated weights for policy 0, policy_version 32736 (0.0016)
[2023-09-22 10:53:56,463][88474] Updated weights for policy 1, policy_version 32640 (0.0017)
[2023-09-22 10:54:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 16752640. Throughput: 0: 740.1, 1: 740.8. Samples: 4183998. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:54:00,441][86732] Avg episode reward: [(0, '10.200'), (1, '11.450')]
[2023-09-22 10:54:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 16785408. Throughput: 0: 732.8, 1: 733.8. Samples: 4188170. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:54:05,440][86732] Avg episode reward: [(0, '10.710'), (1, '11.130')]
[2023-09-22 10:54:10,265][88474] Updated weights for policy 1, policy_version 32800 (0.0014)
[2023-09-22 10:54:10,266][88473] Updated weights for policy 0, policy_version 32896 (0.0016)
[2023-09-22 10:54:10,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.4, 300 sec: 6026.0). Total num frames: 16818176. Throughput: 0: 735.0, 1: 735.8. Samples: 4197260. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:54:10,441][86732] Avg episode reward: [(0, '10.700'), (1, '11.260')]
[2023-09-22 10:54:10,450][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000032896_8421376.pth...
[2023-09-22 10:54:10,451][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000032800_8396800.pth...
[2023-09-22 10:54:10,479][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000030080_7700480.pth
[2023-09-22 10:54:10,488][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000029984_7675904.pth
[2023-09-22 10:54:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 16842752. Throughput: 0: 735.4, 1: 735.0. Samples: 4206101. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:54:15,441][86732] Avg episode reward: [(0, '10.640'), (1, '10.900')]
[2023-09-22 10:54:20,440][86732] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 16875520. Throughput: 0: 736.2, 1: 737.7. Samples: 4210688. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:54:20,440][86732] Avg episode reward: [(0, '10.650'), (1, '11.090')]
[2023-09-22 10:54:23,939][88474] Updated weights for policy 1, policy_version 32960 (0.0019)
[2023-09-22 10:54:23,940][88473] Updated weights for policy 0, policy_version 33056 (0.0016)
[2023-09-22 10:54:25,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 16908288. Throughput: 0: 740.7, 1: 739.9. Samples: 4219523. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:54:25,441][86732] Avg episode reward: [(0, '10.960'), (1, '11.570')]
[2023-09-22 10:54:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 16932864. Throughput: 0: 743.6, 1: 743.0. Samples: 4228516. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:54:30,440][86732] Avg episode reward: [(0, '11.810'), (1, '11.040')]
[2023-09-22 10:54:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 16965632. Throughput: 0: 743.8, 1: 743.0. Samples: 4233090. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:54:35,441][86732] Avg episode reward: [(0, '11.890'), (1, '11.230')]
[2023-09-22 10:54:37,809][88474] Updated weights for policy 1, policy_version 33120 (0.0017)
[2023-09-22 10:54:37,809][88473] Updated weights for policy 0, policy_version 33216 (0.0016)
[2023-09-22 10:54:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 16990208. Throughput: 0: 738.7, 1: 739.9. Samples: 4241588. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 10:54:40,441][86732] Avg episode reward: [(0, '11.810'), (1, '11.350')]
[2023-09-22 10:54:45,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 17022976. Throughput: 0: 741.7, 1: 740.6. Samples: 4250705. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:54:45,440][86732] Avg episode reward: [(0, '11.690'), (1, '11.230')]
[2023-09-22 10:54:50,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 17055744. Throughput: 0: 745.7, 1: 744.8. Samples: 4255242. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:54:50,440][86732] Avg episode reward: [(0, '13.040'), (1, '12.070')]
[2023-09-22 10:54:50,441][88211] Saving new best policy, reward=13.040!
[2023-09-22 10:54:51,632][88474] Updated weights for policy 1, policy_version 33280 (0.0013)
[2023-09-22 10:54:51,632][88473] Updated weights for policy 0, policy_version 33376 (0.0012)
[2023-09-22 10:54:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5998.2). Total num frames: 17080320. Throughput: 0: 742.7, 1: 742.1. Samples: 4264076. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:54:55,440][86732] Avg episode reward: [(0, '13.000'), (1, '11.990')]
[2023-09-22 10:55:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 17113088. Throughput: 0: 739.6, 1: 740.0. Samples: 4272684. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 10:55:00,440][86732] Avg episode reward: [(0, '12.850'), (1, '12.180')]
[2023-09-22 10:55:05,440][86732] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5984.3). Total num frames: 17141760. Throughput: 0: 738.6, 1: 737.0. Samples: 4277090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:55:05,440][86732] Avg episode reward: [(0, '13.010'), (1, '12.070')]
[2023-09-22 10:55:05,491][88473] Updated weights for policy 0, policy_version 33536 (0.0015)
[2023-09-22 10:55:05,492][88474] Updated weights for policy 1, policy_version 33440 (0.0016)
[2023-09-22 10:55:10,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5871.0, 300 sec: 5998.2). Total num frames: 17170432. Throughput: 0: 739.8, 1: 742.6. Samples: 4286229. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:55:10,441][86732] Avg episode reward: [(0, '13.170'), (1, '12.260')]
[2023-09-22 10:55:10,451][88211] Saving new best policy, reward=13.170!
[2023-09-22 10:55:15,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 17203200. Throughput: 0: 735.6, 1: 736.1. Samples: 4294741. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:55:15,440][86732] Avg episode reward: [(0, '13.120'), (1, '12.060')]
[2023-09-22 10:55:19,490][88474] Updated weights for policy 1, policy_version 33600 (0.0014)
[2023-09-22 10:55:19,490][88473] Updated weights for policy 0, policy_version 33696 (0.0011)
[2023-09-22 10:55:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 17227776. Throughput: 0: 734.0, 1: 735.8. Samples: 4299232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:55:20,441][86732] Avg episode reward: [(0, '13.150'), (1, '12.060')]
[2023-09-22 10:55:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5984.3). Total num frames: 17260544. Throughput: 0: 743.5, 1: 743.2. Samples: 4308487. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:55:25,440][86732] Avg episode reward: [(0, '13.140'), (1, '12.050')]
[2023-09-22 10:55:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.5). Total num frames: 17285120. Throughput: 0: 690.7, 1: 739.7. Samples: 4315074. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:55:30,441][86732] Avg episode reward: [(0, '13.340'), (1, '11.940')]
[2023-09-22 10:55:30,466][88211] Saving new best policy, reward=13.340!
[2023-09-22 10:55:33,239][88474] Updated weights for policy 1, policy_version 33760 (0.0013)
[2023-09-22 10:55:33,239][88473] Updated weights for policy 0, policy_version 33856 (0.0012)
[2023-09-22 10:55:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5970.4). Total num frames: 17317888. Throughput: 0: 736.1, 1: 735.1. Samples: 4321447. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:55:35,440][86732] Avg episode reward: [(0, '13.350'), (1, '11.690')]
[2023-09-22 10:55:35,441][88211] Saving new best policy, reward=13.350!
[2023-09-22 10:55:40,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17350656. Throughput: 0: 739.3, 1: 739.6. Samples: 4330629. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:55:40,441][86732] Avg episode reward: [(0, '13.680'), (1, '11.950')]
[2023-09-22 10:55:40,450][88211] Saving new best policy, reward=13.680!
[2023-09-22 10:55:45,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 17375232. Throughput: 0: 743.8, 1: 745.4. Samples: 4339699. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:55:45,440][86732] Avg episode reward: [(0, '13.470'), (1, '11.860')]
[2023-09-22 10:55:46,953][88473] Updated weights for policy 0, policy_version 34016 (0.0017)
[2023-09-22 10:55:46,953][88474] Updated weights for policy 1, policy_version 33920 (0.0018)
[2023-09-22 10:55:50,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 17408000. Throughput: 0: 743.7, 1: 743.2. Samples: 4344002. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:55:50,441][86732] Avg episode reward: [(0, '12.630'), (1, '11.680')]
[2023-09-22 10:55:55,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17440768. Throughput: 0: 744.8, 1: 743.0. Samples: 4353180. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:55:55,441][86732] Avg episode reward: [(0, '12.730'), (1, '11.650')]
[2023-09-22 10:56:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 17465344. Throughput: 0: 749.2, 1: 749.2. Samples: 4362168. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:56:00,441][86732] Avg episode reward: [(0, '12.700'), (1, '10.890')]
[2023-09-22 10:56:00,571][88474] Updated weights for policy 1, policy_version 34080 (0.0015)
[2023-09-22 10:56:00,572][88473] Updated weights for policy 0, policy_version 34176 (0.0015)
[2023-09-22 10:56:05,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5939.2, 300 sec: 5970.4). Total num frames: 17498112. Throughput: 0: 746.5, 1: 746.1. Samples: 4366397. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 10:56:05,441][86732] Avg episode reward: [(0, '13.080'), (1, '11.160')]
[2023-09-22 10:56:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17530880. Throughput: 0: 743.4, 1: 743.6. Samples: 4375399. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:56:10,440][86732] Avg episode reward: [(0, '12.570'), (1, '11.100')]
[2023-09-22 10:56:10,449][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000034288_8777728.pth...
[2023-09-22 10:56:10,450][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000034192_8753152.pth...
[2023-09-22 10:56:10,479][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000031488_8060928.pth
[2023-09-22 10:56:10,489][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000031392_8036352.pth
[2023-09-22 10:56:14,363][88473] Updated weights for policy 0, policy_version 34336 (0.0015)
[2023-09-22 10:56:14,363][88474] Updated weights for policy 1, policy_version 34240 (0.0015)
[2023-09-22 10:56:15,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 17555456. Throughput: 0: 796.0, 1: 746.9. Samples: 4384507. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:56:15,440][86732] Avg episode reward: [(0, '12.260'), (1, '11.240')]
[2023-09-22 10:56:20,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17588224. Throughput: 0: 748.0, 1: 750.1. Samples: 4388865. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:56:20,441][86732] Avg episode reward: [(0, '13.000'), (1, '11.390')]
[2023-09-22 10:56:25,440][86732] Fps is (10 sec: 6553.4, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 17620992. Throughput: 0: 744.7, 1: 743.9. Samples: 4397616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:56:25,441][86732] Avg episode reward: [(0, '12.850'), (1, '11.330')]
[2023-09-22 10:56:27,995][88474] Updated weights for policy 1, policy_version 34400 (0.0016)
[2023-09-22 10:56:27,997][88473] Updated weights for policy 0, policy_version 34496 (0.0016)
[2023-09-22 10:56:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 17645568. Throughput: 0: 747.6, 1: 747.1. Samples: 4406958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:56:30,441][86732] Avg episode reward: [(0, '13.300'), (1, '11.440')]
[2023-09-22 10:56:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 17678336. Throughput: 0: 747.8, 1: 749.8. Samples: 4411392. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:56:35,441][86732] Avg episode reward: [(0, '12.820'), (1, '11.360')]
[2023-09-22 10:56:40,440][86732] Fps is (10 sec: 6144.1, 60 sec: 5939.2, 300 sec: 5956.6). Total num frames: 17707008. Throughput: 0: 741.9, 1: 742.0. Samples: 4419959. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:56:40,440][86732] Avg episode reward: [(0, '12.570'), (1, '11.950')]
[2023-09-22 10:56:41,837][88473] Updated weights for policy 0, policy_version 34656 (0.0018)
[2023-09-22 10:56:41,837][88474] Updated weights for policy 1, policy_version 34560 (0.0016)
[2023-09-22 10:56:45,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17735680. Throughput: 0: 743.9, 1: 743.5. Samples: 4429099. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:56:45,440][86732] Avg episode reward: [(0, '13.160'), (1, '11.930')]
[2023-09-22 10:56:50,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17768448. Throughput: 0: 749.6, 1: 749.8. Samples: 4433867. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:56:50,440][86732] Avg episode reward: [(0, '13.010'), (1, '12.030')]
[2023-09-22 10:56:55,438][88473] Updated weights for policy 0, policy_version 34816 (0.0016)
[2023-09-22 10:56:55,438][88474] Updated weights for policy 1, policy_version 34720 (0.0013)
[2023-09-22 10:56:55,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17801216. Throughput: 0: 746.2, 1: 745.8. Samples: 4442536. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 10:56:55,441][86732] Avg episode reward: [(0, '13.250'), (1, '12.050')]
[2023-09-22 10:57:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17825792. Throughput: 0: 747.9, 1: 747.3. Samples: 4451792. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:57:00,440][86732] Avg episode reward: [(0, '12.760'), (1, '12.140')]
[2023-09-22 10:57:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17858560. Throughput: 0: 750.2, 1: 747.3. Samples: 4456252. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:57:05,441][86732] Avg episode reward: [(0, '12.620'), (1, '11.920')]
[2023-09-22 10:57:09,140][88474] Updated weights for policy 1, policy_version 34880 (0.0016)
[2023-09-22 10:57:09,141][88473] Updated weights for policy 0, policy_version 34976 (0.0016)
[2023-09-22 10:57:10,440][86732] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 17883136. Throughput: 0: 748.1, 1: 748.7. Samples: 4464972. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:57:10,441][86732] Avg episode reward: [(0, '13.240'), (1, '12.070')]
[2023-09-22 10:57:15,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5956.6). Total num frames: 17915904. Throughput: 0: 743.8, 1: 743.2. Samples: 4473876. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 10:57:15,441][86732] Avg episode reward: [(0, '12.710'), (1, '12.070')]
[2023-09-22 10:57:20,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 17948672. Throughput: 0: 746.2, 1: 744.5. Samples: 4478473. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:57:20,441][86732] Avg episode reward: [(0, '12.530'), (1, '12.190')]
[2023-09-22 10:57:23,039][88474] Updated weights for policy 1, policy_version 35040 (0.0016)
[2023-09-22 10:57:23,039][88473] Updated weights for policy 0, policy_version 35136 (0.0016)
[2023-09-22 10:57:25,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 17973248. Throughput: 0: 746.0, 1: 747.5. Samples: 4487168. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:57:25,440][86732] Avg episode reward: [(0, '12.700'), (1, '11.260')]
[2023-09-22 10:57:30,440][86732] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 18006016. Throughput: 0: 742.5, 1: 742.9. Samples: 4495943. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:57:30,440][86732] Avg episode reward: [(0, '12.840'), (1, '11.200')]
[2023-09-22 10:57:35,440][86732] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 18030592. Throughput: 0: 739.4, 1: 737.4. Samples: 4500322. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:57:35,440][86732] Avg episode reward: [(0, '12.760'), (1, '10.840')]
[2023-09-22 10:57:36,867][88473] Updated weights for policy 0, policy_version 35296 (0.0019)
[2023-09-22 10:57:36,867][88474] Updated weights for policy 1, policy_version 35200 (0.0018)
[2023-09-22 10:57:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5942.7). Total num frames: 18063360. Throughput: 0: 745.2, 1: 746.2. Samples: 4509648. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 10:57:40,440][86732] Avg episode reward: [(0, '12.150'), (1, '10.720')]
[2023-09-22 10:57:45,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 18096128. Throughput: 0: 739.7, 1: 740.6. Samples: 4518408. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:57:45,441][86732] Avg episode reward: [(0, '12.590'), (1, '10.730')]
[2023-09-22 10:57:50,440][86732] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 18120704. Throughput: 0: 741.3, 1: 742.3. Samples: 4523014. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:57:50,441][86732] Avg episode reward: [(0, '12.070'), (1, '10.870')]
[2023-09-22 10:57:50,511][88474] Updated weights for policy 1, policy_version 35360 (0.0018)
[2023-09-22 10:57:50,512][88473] Updated weights for policy 0, policy_version 35456 (0.0017)
[2023-09-22 10:57:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 18153472. Throughput: 0: 746.3, 1: 748.2. Samples: 4532224. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:57:55,440][86732] Avg episode reward: [(0, '12.450'), (1, '10.710')]
[2023-09-22 10:58:00,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 18186240. Throughput: 0: 747.6, 1: 746.8. Samples: 4541122. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:00,441][86732] Avg episode reward: [(0, '12.070'), (1, '11.080')]
[2023-09-22 10:58:03,960][88474] Updated weights for policy 1, policy_version 35520 (0.0015)
[2023-09-22 10:58:03,960][88473] Updated weights for policy 0, policy_version 35616 (0.0018)
[2023-09-22 10:58:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18219008. Throughput: 0: 748.2, 1: 747.5. Samples: 4545780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:05,440][86732] Avg episode reward: [(0, '11.700'), (1, '11.250')]
[2023-09-22 10:58:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 18243584. Throughput: 0: 751.1, 1: 750.9. Samples: 4554761. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:58:10,441][86732] Avg episode reward: [(0, '11.210'), (1, '11.710')]
[2023-09-22 10:58:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000035584_9109504.pth...
[2023-09-22 10:58:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000035680_9134080.pth...
[2023-09-22 10:58:10,481][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000032800_8396800.pth
[2023-09-22 10:58:10,492][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000032896_8421376.pth
[2023-09-22 10:58:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 18276352. Throughput: 0: 754.9, 1: 754.4. Samples: 4563864. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:58:15,440][86732] Avg episode reward: [(0, '12.010'), (1, '11.870')]
[2023-09-22 10:58:17,652][88473] Updated weights for policy 0, policy_version 35776 (0.0015)
[2023-09-22 10:58:17,653][88474] Updated weights for policy 1, policy_version 35680 (0.0016)
[2023-09-22 10:58:20,440][86732] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18309120. Throughput: 0: 754.9, 1: 756.2. Samples: 4568323. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:58:20,440][86732] Avg episode reward: [(0, '11.530'), (1, '12.070')]
[2023-09-22 10:58:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 18333696. Throughput: 0: 751.0, 1: 752.0. Samples: 4577281. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 10:58:25,440][86732] Avg episode reward: [(0, '12.060'), (1, '11.740')]
[2023-09-22 10:58:30,441][86732] Fps is (10 sec: 5733.9, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 18366464. Throughput: 0: 755.7, 1: 755.2. Samples: 4586401. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:30,441][86732] Avg episode reward: [(0, '12.140'), (1, '12.310')]
[2023-09-22 10:58:31,213][88474] Updated weights for policy 1, policy_version 35840 (0.0016)
[2023-09-22 10:58:31,213][88473] Updated weights for policy 0, policy_version 35936 (0.0013)
[2023-09-22 10:58:35,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5970.4). Total num frames: 18399232. Throughput: 0: 755.4, 1: 754.9. Samples: 4590975. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:35,441][86732] Avg episode reward: [(0, '12.340'), (1, '12.220')]
[2023-09-22 10:58:40,440][86732] Fps is (10 sec: 5734.8, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 18423808. Throughput: 0: 752.0, 1: 751.0. Samples: 4599862. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:40,441][86732] Avg episode reward: [(0, '13.080'), (1, '12.140')]
[2023-09-22 10:58:44,751][88474] Updated weights for policy 1, policy_version 36000 (0.0017)
[2023-09-22 10:58:44,751][88473] Updated weights for policy 0, policy_version 36096 (0.0017)
[2023-09-22 10:58:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18456576. Throughput: 0: 754.8, 1: 755.3. Samples: 4609078. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:45,441][86732] Avg episode reward: [(0, '12.680'), (1, '12.120')]
[2023-09-22 10:58:50,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5970.4). Total num frames: 18489344. Throughput: 0: 755.0, 1: 755.6. Samples: 4613755. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:58:50,440][86732] Avg episode reward: [(0, '13.050'), (1, '11.500')]
[2023-09-22 10:58:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 18513920. Throughput: 0: 751.1, 1: 750.9. Samples: 4622352. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:58:55,441][86732] Avg episode reward: [(0, '12.970'), (1, '11.760')]
[2023-09-22 10:58:58,552][88474] Updated weights for policy 1, policy_version 36160 (0.0017)
[2023-09-22 10:58:58,553][88473] Updated weights for policy 0, policy_version 36256 (0.0016)
[2023-09-22 10:59:00,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18546688. Throughput: 0: 748.4, 1: 748.5. Samples: 4631224. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:59:00,441][86732] Avg episode reward: [(0, '13.670'), (1, '11.840')]
[2023-09-22 10:59:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 18579456. Throughput: 0: 749.5, 1: 749.7. Samples: 4635787. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:59:05,441][86732] Avg episode reward: [(0, '13.050'), (1, '11.380')]
[2023-09-22 10:59:10,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18604032. Throughput: 0: 750.9, 1: 750.9. Samples: 4644864. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:59:10,441][86732] Avg episode reward: [(0, '13.430'), (1, '11.720')]
[2023-09-22 10:59:12,176][88474] Updated weights for policy 1, policy_version 36320 (0.0015)
[2023-09-22 10:59:12,177][88473] Updated weights for policy 0, policy_version 36416 (0.0014)
[2023-09-22 10:59:15,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18636800. Throughput: 0: 752.3, 1: 752.7. Samples: 4654124. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 10:59:15,440][86732] Avg episode reward: [(0, '13.580'), (1, '11.930')]
[2023-09-22 10:59:20,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 18669568. Throughput: 0: 755.0, 1: 754.7. Samples: 4658912. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:20,441][86732] Avg episode reward: [(0, '12.850'), (1, '11.960')]
[2023-09-22 10:59:25,443][86732] Fps is (10 sec: 5732.6, 60 sec: 6007.1, 300 sec: 5970.4). Total num frames: 18694144. Throughput: 0: 749.9, 1: 750.8. Samples: 4667397. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:25,444][86732] Avg episode reward: [(0, '13.960'), (1, '12.110')]
[2023-09-22 10:59:25,451][88211] Saving new best policy, reward=13.960!
[2023-09-22 10:59:25,758][88473] Updated weights for policy 0, policy_version 36576 (0.0016)
[2023-09-22 10:59:25,759][88474] Updated weights for policy 1, policy_version 36480 (0.0018)
[2023-09-22 10:59:30,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18726912. Throughput: 0: 747.3, 1: 746.3. Samples: 4676291. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:30,440][86732] Avg episode reward: [(0, '13.730'), (1, '12.040')]
[2023-09-22 10:59:35,440][86732] Fps is (10 sec: 6555.7, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 18759680. Throughput: 0: 746.5, 1: 746.2. Samples: 4680926. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:35,440][86732] Avg episode reward: [(0, '14.310'), (1, '11.850')]
[2023-09-22 10:59:35,441][88211] Saving new best policy, reward=14.310!
[2023-09-22 10:59:39,393][88474] Updated weights for policy 1, policy_version 36640 (0.0018)
[2023-09-22 10:59:39,393][88473] Updated weights for policy 0, policy_version 36736 (0.0018)
[2023-09-22 10:59:40,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18784256. Throughput: 0: 750.7, 1: 750.9. Samples: 4689924. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:40,441][86732] Avg episode reward: [(0, '14.410'), (1, '11.960')]
[2023-09-22 10:59:40,451][88211] Saving new best policy, reward=14.410!
[2023-09-22 10:59:45,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18817024. Throughput: 0: 755.3, 1: 754.8. Samples: 4699178. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:45,441][86732] Avg episode reward: [(0, '13.980'), (1, '11.370')]
[2023-09-22 10:59:50,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 18849792. Throughput: 0: 756.3, 1: 756.4. Samples: 4703857. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:50,440][86732] Avg episode reward: [(0, '14.280'), (1, '12.020')]
[2023-09-22 10:59:52,974][88473] Updated weights for policy 0, policy_version 36896 (0.0016)
[2023-09-22 10:59:52,974][88474] Updated weights for policy 1, policy_version 36800 (0.0016)
[2023-09-22 10:59:55,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 18874368. Throughput: 0: 751.1, 1: 750.9. Samples: 4712454. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 10:59:55,441][86732] Avg episode reward: [(0, '14.130'), (1, '11.910')]
[2023-09-22 11:00:00,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5984.3). Total num frames: 18907136. Throughput: 0: 749.6, 1: 749.5. Samples: 4721584. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:00,440][86732] Avg episode reward: [(0, '13.510'), (1, '11.590')]
[2023-09-22 11:00:05,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 18939904. Throughput: 0: 748.7, 1: 749.4. Samples: 4726328. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:05,440][86732] Avg episode reward: [(0, '12.710'), (1, '11.430')]
[2023-09-22 11:00:06,434][88473] Updated weights for policy 0, policy_version 37056 (0.0015)
[2023-09-22 11:00:06,435][88474] Updated weights for policy 1, policy_version 36960 (0.0016)
[2023-09-22 11:00:10,440][86732] Fps is (10 sec: 6143.9, 60 sec: 6075.8, 300 sec: 5984.3). Total num frames: 18968576. Throughput: 0: 756.0, 1: 753.9. Samples: 4735338. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:10,441][86732] Avg episode reward: [(0, '13.160'), (1, '11.670')]
[2023-09-22 11:00:10,452][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000037104_9498624.pth...
[2023-09-22 11:00:10,452][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000037008_9474048.pth...
[2023-09-22 11:00:10,485][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000034192_8753152.pth
[2023-09-22 11:00:10,486][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000034288_8777728.pth
[2023-09-22 11:00:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 18997248. Throughput: 0: 758.3, 1: 759.5. Samples: 4744593. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:15,441][86732] Avg episode reward: [(0, '13.590'), (1, '11.710')]
[2023-09-22 11:00:19,942][88473] Updated weights for policy 0, policy_version 37216 (0.0018)
[2023-09-22 11:00:19,943][88474] Updated weights for policy 1, policy_version 37120 (0.0020)
[2023-09-22 11:00:20,440][86732] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19030016. Throughput: 0: 758.7, 1: 758.9. Samples: 4749220. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:20,440][86732] Avg episode reward: [(0, '13.410'), (1, '11.360')]
[2023-09-22 11:00:25,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.3, 300 sec: 6026.0). Total num frames: 19062784. Throughput: 0: 758.8, 1: 757.4. Samples: 4758152. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:25,441][86732] Avg episode reward: [(0, '13.000'), (1, '10.900')]
[2023-09-22 11:00:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19087360. Throughput: 0: 755.7, 1: 757.2. Samples: 4767259. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:30,440][86732] Avg episode reward: [(0, '12.900'), (1, '11.070')]
[2023-09-22 11:00:33,655][88474] Updated weights for policy 1, policy_version 37280 (0.0015)
[2023-09-22 11:00:33,655][88473] Updated weights for policy 0, policy_version 37376 (0.0014)
[2023-09-22 11:00:35,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 19120128. Throughput: 0: 754.6, 1: 753.7. Samples: 4771729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:35,441][86732] Avg episode reward: [(0, '13.700'), (1, '10.940')]
[2023-09-22 11:00:40,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 19152896. Throughput: 0: 758.5, 1: 756.6. Samples: 4780633. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:40,440][86732] Avg episode reward: [(0, '12.640'), (1, '11.210')]
[2023-09-22 11:00:45,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19177472. Throughput: 0: 759.0, 1: 760.3. Samples: 4789952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:45,441][86732] Avg episode reward: [(0, '12.640'), (1, '11.240')]
[2023-09-22 11:00:47,114][88473] Updated weights for policy 0, policy_version 37536 (0.0017)
[2023-09-22 11:00:47,114][88474] Updated weights for policy 1, policy_version 37440 (0.0017)
[2023-09-22 11:00:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19210240. Throughput: 0: 755.1, 1: 757.0. Samples: 4794372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:00:50,441][86732] Avg episode reward: [(0, '13.150'), (1, '11.670')]
[2023-09-22 11:00:55,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 19243008. Throughput: 0: 759.3, 1: 759.0. Samples: 4803660. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:00:55,440][86732] Avg episode reward: [(0, '12.330'), (1, '11.920')]
[2023-09-22 11:01:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 19267584. Throughput: 0: 757.0, 1: 756.8. Samples: 4812715. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:01:00,441][86732] Avg episode reward: [(0, '12.330'), (1, '11.940')]
[2023-09-22 11:01:00,612][88474] Updated weights for policy 1, policy_version 37600 (0.0015)
[2023-09-22 11:01:00,612][88473] Updated weights for policy 0, policy_version 37696 (0.0014)
[2023-09-22 11:01:05,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19300352. Throughput: 0: 750.9, 1: 753.0. Samples: 4816896. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:01:05,440][86732] Avg episode reward: [(0, '12.270'), (1, '11.800')]
[2023-09-22 11:01:10,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6075.7, 300 sec: 6026.0). Total num frames: 19333120. Throughput: 0: 751.9, 1: 750.7. Samples: 4825767. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:01:10,440][86732] Avg episode reward: [(0, '12.750'), (1, '12.120')]
[2023-09-22 11:01:14,336][88473] Updated weights for policy 0, policy_version 37856 (0.0015)
[2023-09-22 11:01:14,338][88474] Updated weights for policy 1, policy_version 37760 (0.0017)
[2023-09-22 11:01:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19357696. Throughput: 0: 754.6, 1: 753.7. Samples: 4835131. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:01:15,441][86732] Avg episode reward: [(0, '12.310'), (1, '12.050')]
[2023-09-22 11:01:20,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19390464. Throughput: 0: 752.8, 1: 753.5. Samples: 4839515. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:20,440][86732] Avg episode reward: [(0, '13.730'), (1, '12.160')]
[2023-09-22 11:01:25,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19423232. Throughput: 0: 755.5, 1: 755.5. Samples: 4848627. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:25,441][86732] Avg episode reward: [(0, '13.580'), (1, '12.020')]
[2023-09-22 11:01:27,749][88473] Updated weights for policy 0, policy_version 38016 (0.0016)
[2023-09-22 11:01:27,749][88474] Updated weights for policy 1, policy_version 37920 (0.0017)
[2023-09-22 11:01:30,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5998.2). Total num frames: 19447808. Throughput: 0: 754.0, 1: 754.7. Samples: 4857845. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:30,441][86732] Avg episode reward: [(0, '13.860'), (1, '11.580')]
[2023-09-22 11:01:35,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6012.1). Total num frames: 19480576. Throughput: 0: 751.2, 1: 750.9. Samples: 4861967. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:35,440][86732] Avg episode reward: [(0, '13.580'), (1, '12.010')]
[2023-09-22 11:01:40,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19513344. Throughput: 0: 751.1, 1: 750.9. Samples: 4871251. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:40,441][86732] Avg episode reward: [(0, '13.370'), (1, '11.690')]
[2023-09-22 11:01:41,495][88474] Updated weights for policy 1, policy_version 38080 (0.0013)
[2023-09-22 11:01:41,496][88473] Updated weights for policy 0, policy_version 38176 (0.0017)
[2023-09-22 11:01:45,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 19546112. Throughput: 0: 751.1, 1: 752.7. Samples: 4880384. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:45,440][86732] Avg episode reward: [(0, '12.630'), (1, '11.890')]
[2023-09-22 11:01:50,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5998.2). Total num frames: 19570688. Throughput: 0: 758.4, 1: 756.2. Samples: 4885050. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:50,441][86732] Avg episode reward: [(0, '12.250'), (1, '12.040')]
[2023-09-22 11:01:54,639][88473] Updated weights for policy 0, policy_version 38336 (0.0016)
[2023-09-22 11:01:54,639][88474] Updated weights for policy 1, policy_version 38240 (0.0017)
[2023-09-22 11:01:55,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19603456. Throughput: 0: 763.9, 1: 764.5. Samples: 4894544. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:01:55,440][86732] Avg episode reward: [(0, '11.630'), (1, '11.970')]
[2023-09-22 11:02:00,440][86732] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 19636224. Throughput: 0: 761.1, 1: 760.7. Samples: 4903612. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:00,441][86732] Avg episode reward: [(0, '11.480'), (1, '11.870')]
[2023-09-22 11:02:05,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 19668992. Throughput: 0: 763.1, 1: 762.5. Samples: 4908167. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:05,440][86732] Avg episode reward: [(0, '12.450'), (1, '11.720')]
[2023-09-22 11:02:08,019][88474] Updated weights for policy 1, policy_version 38400 (0.0015)
[2023-09-22 11:02:08,019][88473] Updated weights for policy 0, policy_version 38496 (0.0016)
[2023-09-22 11:02:10,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19693568. Throughput: 0: 761.5, 1: 763.5. Samples: 4917252. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:10,440][86732] Avg episode reward: [(0, '12.510'), (1, '11.940')]
[2023-09-22 11:02:10,450][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000038512_9859072.pth...
[2023-09-22 11:02:10,450][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000038416_9834496.pth...
[2023-09-22 11:02:10,484][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000035584_9109504.pth
[2023-09-22 11:02:10,484][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000035680_9134080.pth
[2023-09-22 11:02:15,440][86732] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6026.0). Total num frames: 19726336. Throughput: 0: 761.4, 1: 759.9. Samples: 4926303. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:15,441][86732] Avg episode reward: [(0, '12.840'), (1, '11.830')]
[2023-09-22 11:02:20,440][86732] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 19759104. Throughput: 0: 768.0, 1: 767.2. Samples: 4931053. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:20,441][86732] Avg episode reward: [(0, '13.430'), (1, '12.000')]
[2023-09-22 11:02:21,511][88473] Updated weights for policy 0, policy_version 38656 (0.0017)
[2023-09-22 11:02:21,511][88474] Updated weights for policy 1, policy_version 38560 (0.0016)
[2023-09-22 11:02:25,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19783680. Throughput: 0: 762.1, 1: 763.0. Samples: 4939881. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:25,441][86732] Avg episode reward: [(0, '14.050'), (1, '11.730')]
[2023-09-22 11:02:30,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 19816448. Throughput: 0: 764.1, 1: 761.7. Samples: 4949048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:02:30,441][86732] Avg episode reward: [(0, '14.620'), (1, '11.890')]
[2023-09-22 11:02:30,442][88211] Saving new best policy, reward=14.620!
[2023-09-22 11:02:35,130][88473] Updated weights for policy 0, policy_version 38816 (0.0016)
[2023-09-22 11:02:35,132][88474] Updated weights for policy 1, policy_version 38720 (0.0019)
[2023-09-22 11:02:35,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 19849216. Throughput: 0: 762.0, 1: 762.6. Samples: 4953658. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:02:35,441][86732] Avg episode reward: [(0, '14.110'), (1, '11.960')]
[2023-09-22 11:02:40,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19873792. Throughput: 0: 755.5, 1: 756.0. Samples: 4962562. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:02:40,441][86732] Avg episode reward: [(0, '14.030'), (1, '12.010')]
[2023-09-22 11:02:45,440][86732] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 19906560. Throughput: 0: 754.0, 1: 754.2. Samples: 4971480. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:02:45,440][86732] Avg episode reward: [(0, '13.840'), (1, '12.060')]
[2023-09-22 11:02:48,995][88474] Updated weights for policy 1, policy_version 38880 (0.0015)
[2023-09-22 11:02:48,996][88473] Updated weights for policy 0, policy_version 38976 (0.0012)
[2023-09-22 11:02:50,440][86732] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 19939328. Throughput: 0: 751.8, 1: 752.5. Samples: 4975860. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:02:50,440][86732] Avg episode reward: [(0, '14.320'), (1, '12.680')]
[2023-09-22 11:02:55,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19963904. Throughput: 0: 750.9, 1: 750.9. Samples: 4984833. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:02:55,440][86732] Avg episode reward: [(0, '14.090'), (1, '12.470')]
[2023-09-22 11:03:00,440][86732] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6026.0). Total num frames: 19996672. Throughput: 0: 752.5, 1: 751.8. Samples: 4993995. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:03:00,440][86732] Avg episode reward: [(0, '14.750'), (1, '12.520')]
[2023-09-22 11:03:00,441][88211] Saving new best policy, reward=14.750!
[2023-09-22 11:03:02,557][88473] Updated weights for policy 0, policy_version 39136 (0.0018)
[2023-09-22 11:03:02,557][88474] Updated weights for policy 1, policy_version 39040 (0.0018)
[2023-09-22 11:03:05,135][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000039168_10027008.pth...
[2023-09-22 11:03:05,136][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000039072_10002432.pth...
[2023-09-22 11:03:05,136][88687] Stopping RolloutWorker_w6...
[2023-09-22 11:03:05,136][88476] Stopping RolloutWorker_w0...
[2023-09-22 11:03:05,136][88485] Stopping RolloutWorker_w4...
[2023-09-22 11:03:05,136][88681] Stopping RolloutWorker_w5...
[2023-09-22 11:03:05,136][88478] Stopping RolloutWorker_w2...
[2023-09-22 11:03:05,136][88480] Stopping RolloutWorker_w3...
[2023-09-22 11:03:05,136][88686] Stopping RolloutWorker_w7...
[2023-09-22 11:03:05,136][88687] Loop rollout_proc6_evt_loop terminating...
[2023-09-22 11:03:05,136][88479] Stopping RolloutWorker_w1...
[2023-09-22 11:03:05,136][88476] Loop rollout_proc0_evt_loop terminating...
[2023-09-22 11:03:05,137][88485] Loop rollout_proc4_evt_loop terminating...
[2023-09-22 11:03:05,136][86732] Component RolloutWorker_w6 stopped!
[2023-09-22 11:03:05,137][88681] Loop rollout_proc5_evt_loop terminating...
[2023-09-22 11:03:05,137][88478] Loop rollout_proc2_evt_loop terminating...
[2023-09-22 11:03:05,137][88686] Loop rollout_proc7_evt_loop terminating...
[2023-09-22 11:03:05,137][88480] Loop rollout_proc3_evt_loop terminating...
[2023-09-22 11:03:05,137][88479] Loop rollout_proc1_evt_loop terminating...
[2023-09-22 11:03:05,137][86732] Component RolloutWorker_w0 stopped!
[2023-09-22 11:03:05,138][86732] Component RolloutWorker_w5 stopped!
[2023-09-22 11:03:05,139][86732] Component RolloutWorker_w4 stopped!
[2023-09-22 11:03:05,139][86732] Component RolloutWorker_w2 stopped!
[2023-09-22 11:03:05,140][86732] Component RolloutWorker_w7 stopped!
[2023-09-22 11:03:05,140][86732] Component RolloutWorker_w3 stopped!
[2023-09-22 11:03:05,141][86732] Component RolloutWorker_w1 stopped!
[2023-09-22 11:03:05,142][86732] Component Batcher_1 stopped!
[2023-09-22 11:03:05,137][88352] Stopping Batcher_1...
[2023-09-22 11:03:05,147][86732] Component Batcher_0 stopped!
[2023-09-22 11:03:05,160][88474] Weights refcount: 2 0
[2023-09-22 11:03:05,161][88474] Stopping InferenceWorker_p1-w0...
[2023-09-22 11:03:05,161][88474] Loop inference_proc1-0_evt_loop terminating...
[2023-09-22 11:03:05,161][86732] Component InferenceWorker_p1-w0 stopped!
[2023-09-22 11:03:05,156][88211] Stopping Batcher_0...
[2023-09-22 11:03:05,156][88352] Loop batcher_evt_loop terminating...
[2023-09-22 11:03:05,166][88211] Loop batcher_evt_loop terminating...
[2023-09-22 11:03:05,167][88211] Removing ./train_atari/Amidar/checkpoint_p0/checkpoint_000037104_9498624.pth
[2023-09-22 11:03:05,167][88473] Weights refcount: 2 0
[2023-09-22 11:03:05,169][88473] Stopping InferenceWorker_p0-w0...
[2023-09-22 11:03:05,169][88473] Loop inference_proc0-0_evt_loop terminating...
[2023-09-22 11:03:05,169][86732] Component InferenceWorker_p0-w0 stopped!
[2023-09-22 11:03:05,171][88211] Saving ./train_atari/Amidar/checkpoint_p0/checkpoint_000039168_10027008.pth...
[2023-09-22 11:03:05,176][88352] Removing ./train_atari/Amidar/checkpoint_p1/checkpoint_000037008_9474048.pth
[2023-09-22 11:03:05,184][88352] Saving ./train_atari/Amidar/checkpoint_p1/checkpoint_000039072_10002432.pth...
[2023-09-22 11:03:05,207][88211] Stopping LearnerWorker_p0...
[2023-09-22 11:03:05,207][88211] Loop learner_proc0_evt_loop terminating...
[2023-09-22 11:03:05,208][86732] Component LearnerWorker_p0 stopped!
[2023-09-22 11:03:05,223][88352] Stopping LearnerWorker_p1...
[2023-09-22 11:03:05,223][88352] Loop learner_proc1_evt_loop terminating...
[2023-09-22 11:03:05,223][86732] Component LearnerWorker_p1 stopped!
[2023-09-22 11:03:05,224][86732] Waiting for process learner_proc0 to stop...
[2023-09-22 11:03:05,988][86732] Waiting for process learner_proc1 to stop...
[2023-09-22 11:03:05,989][86732] Waiting for process inference_proc0-0 to join...
[2023-09-22 11:03:05,990][86732] Waiting for process inference_proc1-0 to join...
[2023-09-22 11:03:05,991][86732] Waiting for process rollout_proc0 to join...
[2023-09-22 11:03:05,991][86732] Waiting for process rollout_proc1 to join...
[2023-09-22 11:03:05,992][86732] Waiting for process rollout_proc2 to join...
[2023-09-22 11:03:05,993][86732] Waiting for process rollout_proc3 to join...
[2023-09-22 11:03:05,994][86732] Waiting for process rollout_proc4 to join...
[2023-09-22 11:03:05,994][86732] Waiting for process rollout_proc5 to join...
[2023-09-22 11:03:05,995][86732] Waiting for process rollout_proc6 to join...
[2023-09-22 11:03:05,996][86732] Waiting for process rollout_proc7 to join...
[2023-09-22 11:03:05,997][86732] Batcher 0 profile tree view:
batching: 20.4799, releasing_batches: 1.8250
[2023-09-22 11:03:05,997][86732] Batcher 1 profile tree view:
batching: 20.3136, releasing_batches: 1.7504
[2023-09-22 11:03:05,998][86732] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0001
wait_policy_total: 719.7419
update_model: 38.5099
weight_update: 0.0019
one_step: 0.0031
handle_policy_step: 2428.7191
deserialize: 71.8022, stack: 17.9179, obs_to_device_normalize: 587.0532, forward: 1176.0751, send_messages: 98.7827
prepare_outputs: 322.0972
to_cpu: 162.2869
[2023-09-22 11:03:05,999][86732] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0001
wait_policy_total: 717.6337
update_model: 39.0581
weight_update: 0.0018
one_step: 0.0026
handle_policy_step: 2429.7293
deserialize: 71.8570, stack: 17.6634, obs_to_device_normalize: 594.1591, forward: 1168.9557, send_messages: 99.1936
prepare_outputs: 321.3196
to_cpu: 162.3301
[2023-09-22 11:03:05,999][86732] Learner 0 profile tree view:
misc: 0.0147, prepare_batch: 31.6024
train: 463.8803
epoch_init: 0.1146, minibatch_init: 3.6370, losses_postprocess: 56.9979, kl_divergence: 5.9465, after_optimizer: 10.6003
calculate_losses: 51.0610
losses_init: 0.1313, forward_head: 16.2124, bptt_initial: 0.4840, bptt: 0.5591, tail: 11.7632, advantages_returns: 3.4595, losses: 14.4186
update: 330.9945
clip: 165.1813
[2023-09-22 11:03:06,000][86732] Learner 1 profile tree view:
misc: 0.0156, prepare_batch: 31.5935
train: 454.9218
epoch_init: 0.1164, minibatch_init: 3.6835, losses_postprocess: 57.2089, kl_divergence: 6.0121, after_optimizer: 19.9756
calculate_losses: 50.3010
losses_init: 0.1204, forward_head: 15.2418, bptt_initial: 0.4905, bptt: 0.5588, tail: 11.7280, advantages_returns: 3.5095, losses: 14.5875
update: 313.0746
clip: 165.3535
[2023-09-22 11:03:06,001][86732] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3876, enqueue_policy_requests: 46.9614, env_step: 1276.4616, overhead: 32.2210, complete_rollouts: 1.1462
save_policy_outputs: 61.3017
split_output_tensors: 21.0355
[2023-09-22 11:03:06,001][86732] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3937, enqueue_policy_requests: 46.3951, env_step: 1240.2290, overhead: 32.4553, complete_rollouts: 1.1093
save_policy_outputs: 59.3548
split_output_tensors: 20.2771
[2023-09-22 11:03:06,001][86732] Loop Runner_EvtLoop terminating...
[2023-09-22 11:03:06,002][86732] Runner profile tree view:
main_loop: 3410.9988
[2023-09-22 11:03:06,002][86732] Collected {0: 10027008, 1: 10002432}, FPS: 5864.8