MattStammers: Upload folder using huggingface_hub (commit b06761a)
[2023-10-01 10:30:35,929][117973] Saving configuration to ./train_atari/atari_videopinball/config.json...
[2023-10-01 10:30:36,246][117973] Rollout worker 0 uses device cpu
[2023-10-01 10:30:36,246][117973] Rollout worker 1 uses device cpu
[2023-10-01 10:30:36,247][117973] Rollout worker 2 uses device cpu
[2023-10-01 10:30:36,248][117973] Rollout worker 3 uses device cpu
[2023-10-01 10:30:36,248][117973] Rollout worker 4 uses device cpu
[2023-10-01 10:30:36,249][117973] Rollout worker 5 uses device cpu
[2023-10-01 10:30:36,249][117973] Rollout worker 6 uses device cpu
[2023-10-01 10:30:36,249][117973] Rollout worker 7 uses device cpu
[2023-10-01 10:30:36,250][117973] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2023-10-01 10:30:36,299][117973] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-01 10:30:36,299][117973] InferenceWorker_p0-w0: min num requests: 1
[2023-10-01 10:30:36,302][117973] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-10-01 10:30:36,302][117973] InferenceWorker_p1-w0: min num requests: 1
[2023-10-01 10:30:36,327][117973] Starting all processes...
[2023-10-01 10:30:36,327][117973] Starting process learner_proc0
[2023-10-01 10:30:37,978][117973] Starting process learner_proc1
[2023-10-01 10:30:37,981][118645] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-01 10:30:37,982][118645] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-10-01 10:30:38,000][118645] Num visible devices: 1
[2023-10-01 10:30:38,020][118645] Starting seed is not provided
[2023-10-01 10:30:38,020][118645] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-01 10:30:38,020][118645] Initializing actor-critic model on device cuda:0
[2023-10-01 10:30:38,021][118645] RunningMeanStd input shape: (4, 84, 84)
[2023-10-01 10:30:38,021][118645] RunningMeanStd input shape: (1,)
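The two `RunningMeanStd input shape` lines show that observations (4 stacked 84×84 frames) and returns (a scalar) are normalized with running-statistics trackers. A generic streaming mean/variance update of this kind can be sketched as follows (an illustrative implementation only; Sample Factory's `RunningMeanStdInPlace` differs in detail, e.g. it operates in-place on tensors):

```python
class RunningMeanStd:
    """Streaming mean/variance tracker using a parallel-variance merge.

    Illustrative sketch only -- not Sample Factory's actual class.
    """

    def __init__(self):
        # Small initial count avoids division by zero before the first update.
        self.mean, self.var, self.count = 0.0, 1.0, 1e-4

    def update(self, batch):
        b_count = len(batch)
        b_mean = sum(batch) / b_count
        b_var = sum((x - b_mean) ** 2 for x in batch) / b_count

        delta = b_mean - self.mean
        total = self.count + b_count
        # Merge batch statistics into the running statistics.
        self.mean += delta * b_count / total
        m_a = self.var * self.count
        m_b = b_var * b_count
        self.var = (m_a + m_b + delta ** 2 * self.count * b_count / total) / total
        self.count = total


rms = RunningMeanStd()
rms.update([1.0, 2.0, 3.0])
print(rms.mean)  # close to 2.0 (initial pseudo-count 1e-4 shifts it slightly)
```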
[2023-10-01 10:30:38,032][118645] ConvEncoder: input_channels=4
[2023-10-01 10:30:38,181][118645] Conv encoder output size: 512
[2023-10-01 10:30:38,183][118645] Created Actor Critic model with architecture:
[2023-10-01 10:30:38,183][118645] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=9, bias=True)
  )
)
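The architecture dump shows a three-layer conv head whose flattened output feeds a `Linear` layer producing the 512-dim "Conv encoder output size" logged above. The log does not record the actual filter sizes, but the classic Nature-DQN stack (8×8 stride 4, 4×4 stride 2, 3×3 stride 1, ending in 64 channels) is one plausible instantiation; the standard conv output-size arithmetic for it, starting from the (4, 84, 84) input, is:

```python
def conv2d_out(size, kernel, stride, padding=0):
    """Spatial output size of one Conv2d layer (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


# Nature-DQN-style conv head (assumed; the exact filters are not in the log).
h = w = 84
for kernel, stride in [(8, 4), (4, 2), (3, 1)]:
    h = conv2d_out(h, kernel, stride)
    w = conv2d_out(w, kernel, stride)

flat = 64 * h * w  # 64 channels after the last conv, flattened
print(h, w, flat)  # 7 7 3136 -- fed into the Linear(..., 512) mlp layer
```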
[2023-10-01 10:30:38,795][118645] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-10-01 10:30:38,796][118645] No checkpoints found
[2023-10-01 10:30:38,796][118645] Did not load from checkpoint, starting from scratch!
[2023-10-01 10:30:38,796][118645] Initialized policy 0 weights for model version 0
[2023-10-01 10:30:38,798][118645] LearnerWorker_p0 finished initialization!
[2023-10-01 10:30:38,798][118645] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-01 10:30:39,670][117973] Starting all processes...
[2023-10-01 10:30:39,674][118715] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-10-01 10:30:39,674][118715] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
[2023-10-01 10:30:39,678][117973] Starting process inference_proc0-0
[2023-10-01 10:30:39,678][117973] Starting process inference_proc1-0
[2023-10-01 10:30:39,678][117973] Starting process rollout_proc0
[2023-10-01 10:30:39,678][117973] Starting process rollout_proc1
[2023-10-01 10:30:39,679][117973] Starting process rollout_proc2
[2023-10-01 10:30:39,679][117973] Starting process rollout_proc3
[2023-10-01 10:30:39,680][117973] Starting process rollout_proc4
[2023-10-01 10:30:39,683][117973] Starting process rollout_proc5
[2023-10-01 10:30:39,686][117973] Starting process rollout_proc6
[2023-10-01 10:30:39,688][117973] Starting process rollout_proc7
[2023-10-01 10:30:39,693][118715] Num visible devices: 1
[2023-10-01 10:30:39,711][118715] Starting seed is not provided
[2023-10-01 10:30:39,711][118715] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-10-01 10:30:39,712][118715] Initializing actor-critic model on device cuda:0
[2023-10-01 10:30:39,712][118715] RunningMeanStd input shape: (4, 84, 84)
[2023-10-01 10:30:39,713][118715] RunningMeanStd input shape: (1,)
[2023-10-01 10:30:39,725][118715] ConvEncoder: input_channels=4
[2023-10-01 10:30:40,025][118715] Conv encoder output size: 512
[2023-10-01 10:30:40,027][118715] Created Actor Critic model with architecture:
[2023-10-01 10:30:40,027][118715] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=9, bias=True)
  )
)
[2023-10-01 10:30:40,674][118715] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-10-01 10:30:40,675][118715] No checkpoints found
[2023-10-01 10:30:40,675][118715] Did not load from checkpoint, starting from scratch!
[2023-10-01 10:30:40,675][118715] Initialized policy 1 weights for model version 0
[2023-10-01 10:30:40,677][118715] LearnerWorker_p1 finished initialization!
[2023-10-01 10:30:40,677][118715] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-10-01 10:30:41,627][119083] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-10-01 10:30:41,627][119041] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-10-01 10:30:41,627][119041] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-10-01 10:30:41,635][119085] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-10-01 10:30:41,639][119041] Num visible devices: 1
[2023-10-01 10:30:41,664][119091] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-10-01 10:30:41,664][119086] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-10-01 10:30:41,672][119088] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-10-01 10:30:41,675][119089] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-10-01 10:30:41,735][119090] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-10-01 10:30:41,843][119042] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-10-01 10:30:41,843][119042] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1
[2023-10-01 10:30:41,850][119087] Worker 4 uses CPU cores [16, 17, 18, 19]
[2023-10-01 10:30:41,855][119042] Num visible devices: 1
[2023-10-01 10:30:42,052][117973] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-10-01 10:30:42,313][119041] RunningMeanStd input shape: (4, 84, 84)
[2023-10-01 10:30:42,314][119041] RunningMeanStd input shape: (1,)
[2023-10-01 10:30:42,325][119041] ConvEncoder: input_channels=4
[2023-10-01 10:30:42,429][119041] Conv encoder output size: 512
[2023-10-01 10:30:42,435][117973] Inference worker 0-0 is ready!
[2023-10-01 10:30:42,451][119042] RunningMeanStd input shape: (4, 84, 84)
[2023-10-01 10:30:42,452][119042] RunningMeanStd input shape: (1,)
[2023-10-01 10:30:42,463][119042] ConvEncoder: input_channels=4
[2023-10-01 10:30:42,561][119042] Conv encoder output size: 512
[2023-10-01 10:30:42,567][117973] Inference worker 1-0 is ready!
[2023-10-01 10:30:42,567][117973] All inference workers are ready! Signal rollout workers to start!
[2023-10-01 10:30:43,059][119087] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,062][119085] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,063][119089] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,064][119090] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,064][119083] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,067][119088] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,145][119091] Decorrelating experience for 0 frames...
[2023-10-01 10:30:43,161][119086] Decorrelating experience for 0 frames...
[2023-10-01 10:30:47,052][117973] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:30:47,052][117973] Avg episode reward: [(1, '6.000')]
[2023-10-01 10:30:52,052][117973] Fps is (10 sec: 2457.6, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 24576. Throughput: 0: 368.0, 1: 379.7. Samples: 7477. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:30:52,053][117973] Avg episode reward: [(0, '22.667'), (1, '13.600')]
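The `Fps is (10 sec: …, 60 sec: …, 300 sec: …)` lines report throughput averaged over trailing time windows of the frame counter. A minimal sketch of such windowed averaging (assumed bookkeeping, not Sample Factory's actual code) reproduces the numbers above: 8192 frames over the first ~5 s gives 1638.4, and 24576 frames over 10 s gives 2457.6.

```python
def windowed_fps(samples, window):
    """Frames per second over the trailing `window` seconds.

    samples: chronological list of (timestamp_sec, total_frames) pairs.
    When less than `window` seconds have elapsed, the whole history is used.
    """
    now, frames_now = samples[-1]
    past = [(t, f) for t, f in samples if now - t <= window]
    t0, f0 = past[0]
    if now == t0:
        return float("nan")  # no elapsed time yet
    return (frames_now - f0) / (now - t0)


# Frame counts taken from the log lines above (timestamps relative to start).
history = [(0.0, 0), (5.0, 8192), (10.0, 24576)]
print(windowed_fps(history[:2], 10))  # 1638.4
print(windowed_fps(history, 10))      # 2457.6
```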
[2023-10-01 10:30:56,286][117973] Heartbeat connected on Batcher_0
[2023-10-01 10:30:56,289][117973] Heartbeat connected on LearnerWorker_p0
[2023-10-01 10:30:56,292][117973] Heartbeat connected on Batcher_1
[2023-10-01 10:30:56,294][117973] Heartbeat connected on LearnerWorker_p1
[2023-10-01 10:30:56,301][117973] Heartbeat connected on InferenceWorker_p0-w0
[2023-10-01 10:30:56,305][117973] Heartbeat connected on InferenceWorker_p1-w0
[2023-10-01 10:30:56,306][117973] Heartbeat connected on RolloutWorker_w0
[2023-10-01 10:30:56,311][117973] Heartbeat connected on RolloutWorker_w1
[2023-10-01 10:30:56,312][117973] Heartbeat connected on RolloutWorker_w2
[2023-10-01 10:30:56,315][117973] Heartbeat connected on RolloutWorker_w3
[2023-10-01 10:30:56,319][117973] Heartbeat connected on RolloutWorker_w4
[2023-10-01 10:30:56,321][117973] Heartbeat connected on RolloutWorker_w5
[2023-10-01 10:30:56,323][117973] Heartbeat connected on RolloutWorker_w6
[2023-10-01 10:30:56,328][117973] Heartbeat connected on RolloutWorker_w7
[2023-10-01 10:30:57,051][117973] Fps is (10 sec: 4915.3, 60 sec: 3823.0, 300 sec: 3823.0). Total num frames: 57344. Throughput: 0: 398.7, 1: 404.7. Samples: 12051. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:30:57,052][117973] Avg episode reward: [(0, '26.143'), (1, '27.455')]
[2023-10-01 10:31:00,324][119041] Updated weights for policy 0, policy_version 160 (0.0017)
[2023-10-01 10:31:00,324][119042] Updated weights for policy 1, policy_version 160 (0.0018)
[2023-10-01 10:31:02,051][117973] Fps is (10 sec: 6553.7, 60 sec: 4505.6, 300 sec: 4505.6). Total num frames: 90112. Throughput: 0: 516.6, 1: 522.8. Samples: 20787. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:02,052][117973] Avg episode reward: [(0, '39.833'), (1, '28.250')]
[2023-10-01 10:31:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 4587.5, 300 sec: 4587.5). Total num frames: 114688. Throughput: 0: 594.6, 1: 603.7. Samples: 29958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:07,052][117973] Avg episode reward: [(0, '36.833'), (1, '38.250')]
[2023-10-01 10:31:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 147456. Throughput: 0: 570.1, 1: 574.8. Samples: 34348. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:31:12,052][117973] Avg episode reward: [(0, '36.045'), (1, '39.941')]
[2023-10-01 10:31:14,097][119042] Updated weights for policy 1, policy_version 320 (0.0020)
[2023-10-01 10:31:14,097][119041] Updated weights for policy 0, policy_version 320 (0.0021)
[2023-10-01 10:31:17,052][117973] Fps is (10 sec: 6553.5, 60 sec: 5149.2, 300 sec: 5149.2). Total num frames: 180224. Throughput: 0: 614.5, 1: 617.2. Samples: 43110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:17,053][117973] Avg episode reward: [(0, '39.741'), (1, '47.792')]
[2023-10-01 10:31:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5120.0, 300 sec: 5120.0). Total num frames: 204800. Throughput: 0: 648.2, 1: 652.4. Samples: 52025. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:22,052][117973] Avg episode reward: [(0, '37.484'), (1, '44.862')]
[2023-10-01 10:31:22,053][118645] Saving new best policy, reward=37.484!
[2023-10-01 10:31:22,053][118715] Saving new best policy, reward=44.862!
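Each learner tracks its own best average episode reward and writes a `Saving new best policy, reward=…!` line whenever it improves. The bookkeeping behind this is presumably as simple as (an assumed sketch, with a hypothetical `save_fn` callback standing in for the checkpoint write):

```python
def maybe_save_best(avg_reward, best_reward, save_fn):
    """Save when the new average reward beats the best so far; return the new best."""
    if avg_reward > best_reward:
        save_fn(avg_reward)  # stand-in for writing a best_*.pth checkpoint
        return avg_reward
    return best_reward


saved = []
best = float("-inf")
best = maybe_save_best(37.484, best, saved.append)  # improves -> saved
best = maybe_save_best(36.900, best, saved.append)  # does not improve
print(best, saved)  # 37.484 [37.484]
```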
[2023-10-01 10:31:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5279.3, 300 sec: 5279.3). Total num frames: 237568. Throughput: 0: 626.6, 1: 629.9. Samples: 56543. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:31:27,053][117973] Avg episode reward: [(0, '40.167'), (1, '42.778')]
[2023-10-01 10:31:27,054][118645] Saving new best policy, reward=40.167!
[2023-10-01 10:31:27,883][119042] Updated weights for policy 1, policy_version 480 (0.0019)
[2023-10-01 10:31:27,884][119041] Updated weights for policy 0, policy_version 480 (0.0017)
[2023-10-01 10:31:32,052][117973] Fps is (10 sec: 6553.5, 60 sec: 5406.7, 300 sec: 5406.7). Total num frames: 270336. Throughput: 0: 705.4, 1: 705.4. Samples: 65536. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:31:32,052][117973] Avg episode reward: [(0, '38.925'), (1, '42.103')]
[2023-10-01 10:31:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5362.1, 300 sec: 5362.1). Total num frames: 294912. Throughput: 0: 743.1, 1: 743.2. Samples: 74360. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:31:37,052][117973] Avg episode reward: [(0, '37.333'), (1, '41.773')]
[2023-10-01 10:31:41,729][119042] Updated weights for policy 1, policy_version 640 (0.0019)
[2023-10-01 10:31:41,729][119041] Updated weights for policy 0, policy_version 640 (0.0019)
[2023-10-01 10:31:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5461.3). Total num frames: 327680. Throughput: 0: 742.4, 1: 742.4. Samples: 78865. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:42,053][117973] Avg episode reward: [(0, '36.755'), (1, '41.085')]
[2023-10-01 10:31:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5419.3). Total num frames: 352256. Throughput: 0: 744.0, 1: 744.9. Samples: 87787. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:47,052][117973] Avg episode reward: [(0, '38.464'), (1, '41.700')]
[2023-10-01 10:31:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5500.4). Total num frames: 385024. Throughput: 0: 739.2, 1: 735.0. Samples: 96296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:31:52,052][117973] Avg episode reward: [(0, '38.933'), (1, '43.037')]
[2023-10-01 10:31:55,577][119042] Updated weights for policy 1, policy_version 800 (0.0017)
[2023-10-01 10:31:55,578][119041] Updated weights for policy 0, policy_version 800 (0.0017)
[2023-10-01 10:31:57,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5570.6). Total num frames: 417792. Throughput: 0: 739.3, 1: 739.0. Samples: 100872. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:31:57,052][117973] Avg episode reward: [(0, '39.308'), (1, '42.136')]
[2023-10-01 10:32:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5529.6). Total num frames: 442368. Throughput: 0: 744.0, 1: 744.5. Samples: 110092. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:02,052][117973] Avg episode reward: [(0, '38.129'), (1, '42.312')]
[2023-10-01 10:32:07,051][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5589.8). Total num frames: 475136. Throughput: 0: 743.7, 1: 742.1. Samples: 118888. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:07,052][117973] Avg episode reward: [(0, '37.568'), (1, '42.879')]
[2023-10-01 10:32:09,155][119041] Updated weights for policy 0, policy_version 960 (0.0017)
[2023-10-01 10:32:09,156][119042] Updated weights for policy 1, policy_version 960 (0.0018)
[2023-10-01 10:32:12,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5643.4). Total num frames: 507904. Throughput: 0: 744.3, 1: 743.9. Samples: 123512. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:32:12,052][117973] Avg episode reward: [(0, '37.938'), (1, '43.400')]
[2023-10-01 10:32:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5605.1). Total num frames: 532480. Throughput: 0: 742.3, 1: 744.4. Samples: 132435. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:17,052][117973] Avg episode reward: [(0, '37.583'), (1, '42.610')]
[2023-10-01 10:32:22,052][117973] Fps is (10 sec: 5734.0, 60 sec: 6007.4, 300 sec: 5652.4). Total num frames: 565248. Throughput: 0: 745.2, 1: 742.7. Samples: 141316. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:32:22,053][117973] Avg episode reward: [(0, '37.562'), (1, '42.712')]
[2023-10-01 10:32:23,006][119041] Updated weights for policy 0, policy_version 1120 (0.0019)
[2023-10-01 10:32:23,006][119042] Updated weights for policy 1, policy_version 1120 (0.0017)
[2023-10-01 10:32:27,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5695.4). Total num frames: 598016. Throughput: 0: 742.4, 1: 743.5. Samples: 145731. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:32:27,052][117973] Avg episode reward: [(0, '36.638'), (1, '42.859')]
[2023-10-01 10:32:32,052][117973] Fps is (10 sec: 5734.7, 60 sec: 5870.9, 300 sec: 5659.9). Total num frames: 622592. Throughput: 0: 746.0, 1: 745.3. Samples: 154895. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:32:32,053][117973] Avg episode reward: [(0, '39.030'), (1, '43.110')]
[2023-10-01 10:32:32,061][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000001216_311296.pth...
[2023-10-01 10:32:32,062][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000001216_311296.pth...
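The checkpoint filenames encode a zero-padded policy version followed by the environment step count, e.g. `checkpoint_000001216_311296.pth`. Notably, in this run the step count is always exactly 256× the version (1216 × 256 = 311296, and the later 2608 × 256 = 667648), consistent with 256 samples consumed per policy update. The naming pattern can be reconstructed like this (the format string is inferred from the log, not taken from Sample Factory's source):

```python
def checkpoint_name(policy_version, env_steps):
    """Rebuild the checkpoint filename pattern observed in the log (assumed format)."""
    return f"checkpoint_{policy_version:09d}_{env_steps}.pth"


# Matches both checkpoints saved in this log.
print(checkpoint_name(1216, 1216 * 256))  # checkpoint_000001216_311296.pth
print(checkpoint_name(2608, 2608 * 256))  # checkpoint_000002608_667648.pth
```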
[2023-10-01 10:32:36,721][119041] Updated weights for policy 0, policy_version 1280 (0.0018)
[2023-10-01 10:32:36,721][119042] Updated weights for policy 1, policy_version 1280 (0.0017)
[2023-10-01 10:32:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5698.8). Total num frames: 655360. Throughput: 0: 750.6, 1: 750.0. Samples: 163823. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:32:37,052][117973] Avg episode reward: [(0, '39.730'), (1, '41.562')]
[2023-10-01 10:32:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5666.1). Total num frames: 679936. Throughput: 0: 746.2, 1: 743.7. Samples: 167921. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:42,053][117973] Avg episode reward: [(0, '38.190'), (1, '41.410')]
[2023-10-01 10:32:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5701.6). Total num frames: 712704. Throughput: 0: 736.1, 1: 736.1. Samples: 176340. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:47,052][117973] Avg episode reward: [(0, '38.050'), (1, '42.300')]
[2023-10-01 10:32:50,961][119041] Updated weights for policy 0, policy_version 1440 (0.0018)
[2023-10-01 10:32:50,962][119042] Updated weights for policy 1, policy_version 1440 (0.0015)
[2023-10-01 10:32:52,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5671.4). Total num frames: 737280. Throughput: 0: 738.2, 1: 738.1. Samples: 185322. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:52,053][117973] Avg episode reward: [(0, '38.110'), (1, '43.700')]
[2023-10-01 10:32:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5704.1). Total num frames: 770048. Throughput: 0: 735.0, 1: 735.4. Samples: 189681. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:32:57,053][117973] Avg episode reward: [(0, '38.180'), (1, '41.670')]
[2023-10-01 10:33:02,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5734.4). Total num frames: 802816. Throughput: 0: 736.8, 1: 734.8. Samples: 198656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:33:02,052][117973] Avg episode reward: [(0, '39.080'), (1, '38.520')]
[2023-10-01 10:33:04,758][119042] Updated weights for policy 1, policy_version 1600 (0.0018)
[2023-10-01 10:33:04,758][119041] Updated weights for policy 0, policy_version 1600 (0.0017)
[2023-10-01 10:33:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5706.2). Total num frames: 827392. Throughput: 0: 734.8, 1: 737.4. Samples: 207567. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:33:07,053][117973] Avg episode reward: [(0, '38.110'), (1, '39.250')]
[2023-10-01 10:33:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5734.4). Total num frames: 860160. Throughput: 0: 736.8, 1: 736.2. Samples: 212014. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:33:12,052][117973] Avg episode reward: [(0, '37.410'), (1, '38.890')]
[2023-10-01 10:33:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5708.0). Total num frames: 884736. Throughput: 0: 736.9, 1: 735.0. Samples: 221131. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:33:17,052][117973] Avg episode reward: [(0, '37.630'), (1, '40.160')]
[2023-10-01 10:33:18,408][119042] Updated weights for policy 1, policy_version 1760 (0.0017)
[2023-10-01 10:33:18,409][119041] Updated weights for policy 0, policy_version 1760 (0.0019)
[2023-10-01 10:33:22,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5871.0, 300 sec: 5734.4). Total num frames: 917504. Throughput: 0: 732.8, 1: 734.9. Samples: 229870. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:33:22,053][117973] Avg episode reward: [(0, '36.080'), (1, '40.270')]
[2023-10-01 10:33:27,051][117973] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5759.2). Total num frames: 950272. Throughput: 0: 737.9, 1: 739.6. Samples: 234408. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:33:27,052][117973] Avg episode reward: [(0, '35.310'), (1, '40.490')]
[2023-10-01 10:33:32,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5734.4). Total num frames: 974848. Throughput: 0: 744.1, 1: 745.9. Samples: 243388. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:33:32,052][117973] Avg episode reward: [(0, '35.790'), (1, '40.360')]
[2023-10-01 10:33:32,307][119041] Updated weights for policy 0, policy_version 1920 (0.0016)
[2023-10-01 10:33:32,309][119042] Updated weights for policy 1, policy_version 1920 (0.0016)
[2023-10-01 10:33:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5757.8). Total num frames: 1007616. Throughput: 0: 741.2, 1: 741.7. Samples: 252053. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 10:33:37,052][117973] Avg episode reward: [(0, '36.060'), (1, '38.890')]
[2023-10-01 10:33:42,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5779.9). Total num frames: 1040384. Throughput: 0: 745.0, 1: 744.3. Samples: 256699. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:33:42,053][117973] Avg episode reward: [(0, '34.760'), (1, '38.110')]
[2023-10-01 10:33:46,068][119041] Updated weights for policy 0, policy_version 2080 (0.0016)
[2023-10-01 10:33:46,068][119042] Updated weights for policy 1, policy_version 2080 (0.0017)
[2023-10-01 10:33:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5756.5). Total num frames: 1064960. Throughput: 0: 742.1, 1: 746.7. Samples: 265651. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:33:47,053][117973] Avg episode reward: [(0, '33.650'), (1, '37.830')]
[2023-10-01 10:33:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5777.5). Total num frames: 1097728. Throughput: 0: 744.3, 1: 741.6. Samples: 274433. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:33:52,052][117973] Avg episode reward: [(0, '33.560'), (1, '38.340')]
[2023-10-01 10:33:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5755.4). Total num frames: 1122304. Throughput: 0: 740.5, 1: 741.0. Samples: 278679. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:33:57,053][117973] Avg episode reward: [(0, '32.050'), (1, '37.040')]
[2023-10-01 10:33:59,940][119041] Updated weights for policy 0, policy_version 2240 (0.0019)
[2023-10-01 10:33:59,940][119042] Updated weights for policy 1, policy_version 2240 (0.0018)
[2023-10-01 10:34:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5775.4). Total num frames: 1155072. Throughput: 0: 739.7, 1: 740.9. Samples: 287758. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:34:02,053][117973] Avg episode reward: [(0, '31.970'), (1, '36.430')]
[2023-10-01 10:34:07,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5794.3). Total num frames: 1187840. Throughput: 0: 743.7, 1: 742.9. Samples: 296766. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:34:07,053][117973] Avg episode reward: [(0, '33.930'), (1, '37.300')]
[2023-10-01 10:34:12,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5773.4). Total num frames: 1212416. Throughput: 0: 740.3, 1: 739.5. Samples: 300998. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:34:12,052][117973] Avg episode reward: [(0, '34.810'), (1, '36.720')]
[2023-10-01 10:34:14,017][119041] Updated weights for policy 0, policy_version 2400 (0.0019)
[2023-10-01 10:34:14,017][119042] Updated weights for policy 1, policy_version 2400 (0.0018)
[2023-10-01 10:34:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5791.6). Total num frames: 1245184. Throughput: 0: 734.1, 1: 731.4. Samples: 309335. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:34:17,053][117973] Avg episode reward: [(0, '34.680'), (1, '36.720')]
[2023-10-01 10:34:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5771.6). Total num frames: 1269760. Throughput: 0: 735.9, 1: 736.4. Samples: 318307. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:34:22,052][117973] Avg episode reward: [(0, '37.160'), (1, '38.710')]
[2023-10-01 10:34:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5789.0). Total num frames: 1302528. Throughput: 0: 733.9, 1: 732.6. Samples: 322689. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:34:27,053][117973] Avg episode reward: [(0, '37.790'), (1, '37.760')]
[2023-10-01 10:34:27,961][119041] Updated weights for policy 0, policy_version 2560 (0.0016)
[2023-10-01 10:34:27,962][119042] Updated weights for policy 1, policy_version 2560 (0.0016)
[2023-10-01 10:34:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5770.0). Total num frames: 1327104. Throughput: 0: 735.5, 1: 732.4. Samples: 331709. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:34:32,052][117973] Avg episode reward: [(0, '36.140'), (1, '36.310')]
[2023-10-01 10:34:32,104][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000002608_667648.pth...
[2023-10-01 10:34:32,121][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000002608_667648.pth...
[2023-10-01 10:34:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5786.7). Total num frames: 1359872. Throughput: 0: 732.9, 1: 736.6. Samples: 340561. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:34:37,053][117973] Avg episode reward: [(0, '35.310'), (1, '38.770')]
[2023-10-01 10:34:41,520][119041] Updated weights for policy 0, policy_version 2720 (0.0018)
[2023-10-01 10:34:41,520][119042] Updated weights for policy 1, policy_version 2720 (0.0017)
[2023-10-01 10:34:42,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5871.0, 300 sec: 5802.7). Total num frames: 1392640. Throughput: 0: 739.8, 1: 739.6. Samples: 345252. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:34:42,052][117973] Avg episode reward: [(0, '37.230'), (1, '37.530')]
[2023-10-01 10:34:47,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5801.3). Total num frames: 1421312. Throughput: 0: 740.1, 1: 738.2. Samples: 354282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:34:47,053][117973] Avg episode reward: [(0, '36.940'), (1, '37.410')]
[2023-10-01 10:34:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5799.9). Total num frames: 1449984. Throughput: 0: 736.2, 1: 736.4. Samples: 363032. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:34:52,052][117973] Avg episode reward: [(0, '38.410'), (1, '37.850')]
[2023-10-01 10:34:55,185][119041] Updated weights for policy 0, policy_version 2880 (0.0016)
[2023-10-01 10:34:55,185][119042] Updated weights for policy 1, policy_version 2880 (0.0017)
[2023-10-01 10:34:57,052][117973] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5814.7). Total num frames: 1482752. Throughput: 0: 741.7, 1: 742.9. Samples: 367805. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:34:57,053][117973] Avg episode reward: [(0, '39.230'), (1, '38.200')]
[2023-10-01 10:35:02,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5828.9). Total num frames: 1515520. Throughput: 0: 750.8, 1: 749.1. Samples: 376832. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:35:02,053][117973] Avg episode reward: [(0, '41.520'), (1, '38.230')]
[2023-10-01 10:35:02,060][118645] Saving new best policy, reward=41.520!
[2023-10-01 10:35:07,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5811.7). Total num frames: 1540096. Throughput: 0: 747.3, 1: 746.8. Samples: 385539. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:35:07,052][117973] Avg episode reward: [(0, '42.860'), (1, '39.370')]
[2023-10-01 10:35:07,052][118645] Saving new best policy, reward=42.860!
[2023-10-01 10:35:08,881][119042] Updated weights for policy 1, policy_version 3040 (0.0017)
[2023-10-01 10:35:08,881][119041] Updated weights for policy 0, policy_version 3040 (0.0018)
[2023-10-01 10:35:12,051][117973] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5825.4). Total num frames: 1572864. Throughput: 0: 749.9, 1: 751.0. Samples: 390229. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:35:12,052][117973] Avg episode reward: [(0, '42.570'), (1, '39.730')]
[2023-10-01 10:35:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5808.9). Total num frames: 1597440. Throughput: 0: 747.8, 1: 748.4. Samples: 399038. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:35:17,052][117973] Avg episode reward: [(0, '43.320'), (1, '38.440')]
[2023-10-01 10:35:17,249][118645] Saving new best policy, reward=43.320!
[2023-10-01 10:35:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5822.2). Total num frames: 1630208. Throughput: 0: 746.3, 1: 744.5. Samples: 407646. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:35:22,052][117973] Avg episode reward: [(0, '43.450'), (1, '39.410')]
[2023-10-01 10:35:22,053][118645] Saving new best policy, reward=43.450!
[2023-10-01 10:35:22,950][119041] Updated weights for policy 0, policy_version 3200 (0.0019)
[2023-10-01 10:35:22,950][119042] Updated weights for policy 1, policy_version 3200 (0.0018)
[2023-10-01 10:35:27,052][117973] Fps is (10 sec: 6553.4, 60 sec: 6007.5, 300 sec: 5835.0). Total num frames: 1662976. Throughput: 0: 741.0, 1: 740.8. Samples: 411933. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:35:27,053][117973] Avg episode reward: [(0, '45.080'), (1, '39.330')]
[2023-10-01 10:35:27,055][118645] Saving new best policy, reward=45.080!
[2023-10-01 10:35:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5819.1). Total num frames: 1687552. Throughput: 0: 740.1, 1: 742.9. Samples: 421017. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:35:32,052][117973] Avg episode reward: [(0, '46.470'), (1, '40.650')]
[2023-10-01 10:35:32,057][118645] Saving new best policy, reward=46.470!
[2023-10-01 10:35:36,674][119041] Updated weights for policy 0, policy_version 3360 (0.0018)
[2023-10-01 10:35:36,674][119042] Updated weights for policy 1, policy_version 3360 (0.0017)
[2023-10-01 10:35:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 1720320. Throughput: 0: 745.7, 1: 744.0. Samples: 430070. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:35:37,053][117973] Avg episode reward: [(0, '45.110'), (1, '41.460')]
[2023-10-01 10:35:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 1744896. Throughput: 0: 739.2, 1: 739.2. Samples: 434332. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:35:42,053][117973] Avg episode reward: [(0, '46.790'), (1, '40.180')]
[2023-10-01 10:35:42,119][118645] Saving new best policy, reward=46.790!
[2023-10-01 10:35:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5939.2, 300 sec: 5942.7). Total num frames: 1777664. Throughput: 0: 739.4, 1: 742.4. Samples: 443515. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 10:35:47,052][117973] Avg episode reward: [(0, '48.690'), (1, '41.880')]
[2023-10-01 10:35:47,059][118645] Saving new best policy, reward=48.690!
[2023-10-01 10:35:50,285][119042] Updated weights for policy 1, policy_version 3520 (0.0016)
[2023-10-01 10:35:50,285][119041] Updated weights for policy 0, policy_version 3520 (0.0019)
[2023-10-01 10:35:52,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 1810432. Throughput: 0: 746.4, 1: 744.1. Samples: 452612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:35:52,052][117973] Avg episode reward: [(0, '50.380'), (1, '41.530')]
[2023-10-01 10:35:52,053][118645] Saving new best policy, reward=50.380!
[2023-10-01 10:35:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 1835008. Throughput: 0: 741.7, 1: 742.4. Samples: 457011. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:35:57,053][117973] Avg episode reward: [(0, '50.730'), (1, '41.240')]
[2023-10-01 10:35:57,203][118645] Saving new best policy, reward=50.730!
[2023-10-01 10:36:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 1867776. Throughput: 0: 741.6, 1: 742.9. Samples: 465842. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:02,052][117973] Avg episode reward: [(0, '49.790'), (1, '41.760')]
[2023-10-01 10:36:04,012][119042] Updated weights for policy 1, policy_version 3680 (0.0021)
[2023-10-01 10:36:04,012][119041] Updated weights for policy 0, policy_version 3680 (0.0020)
[2023-10-01 10:36:07,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 1900544. Throughput: 0: 749.8, 1: 748.8. Samples: 475085. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:07,053][117973] Avg episode reward: [(0, '47.470'), (1, '41.760')]
[2023-10-01 10:36:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 1925120. Throughput: 0: 748.9, 1: 746.6. Samples: 479233. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:12,053][117973] Avg episode reward: [(0, '47.210'), (1, '40.510')]
[2023-10-01 10:36:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 1957888. Throughput: 0: 748.0, 1: 747.7. Samples: 488324. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:17,053][117973] Avg episode reward: [(0, '47.420'), (1, '41.880')]
[2023-10-01 10:36:17,848][119042] Updated weights for policy 1, policy_version 3840 (0.0018)
[2023-10-01 10:36:17,849][119041] Updated weights for policy 0, policy_version 3840 (0.0019)
[2023-10-01 10:36:22,051][117973] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 1990656. Throughput: 0: 746.2, 1: 748.8. Samples: 497347. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:22,052][117973] Avg episode reward: [(0, '46.380'), (1, '40.090')]
[2023-10-01 10:36:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 2015232. Throughput: 0: 750.3, 1: 748.0. Samples: 501755. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:27,052][117973] Avg episode reward: [(0, '46.870'), (1, '40.120')]
[2023-10-01 10:36:31,756][119041] Updated weights for policy 0, policy_version 4000 (0.0016)
[2023-10-01 10:36:31,756][119042] Updated weights for policy 1, policy_version 4000 (0.0017)
[2023-10-01 10:36:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2048000. Throughput: 0: 740.6, 1: 739.8. Samples: 510130. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:32,052][117973] Avg episode reward: [(0, '48.030'), (1, '40.690')]
[2023-10-01 10:36:32,061][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000004000_1024000.pth...
[2023-10-01 10:36:32,061][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000004000_1024000.pth...
[2023-10-01 10:36:32,097][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000001216_311296.pth
[2023-10-01 10:36:32,098][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000001216_311296.pth
[2023-10-01 10:36:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2072576. Throughput: 0: 740.3, 1: 742.7. Samples: 519350. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:37,052][117973] Avg episode reward: [(0, '48.630'), (1, '41.500')]
[2023-10-01 10:36:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2105344. Throughput: 0: 744.0, 1: 744.5. Samples: 523991. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:42,053][117973] Avg episode reward: [(0, '48.470'), (1, '42.240')]
[2023-10-01 10:36:45,371][119041] Updated weights for policy 0, policy_version 4160 (0.0018)
[2023-10-01 10:36:45,371][119042] Updated weights for policy 1, policy_version 4160 (0.0018)
[2023-10-01 10:36:47,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2138112. Throughput: 0: 743.4, 1: 742.2. Samples: 532697. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:36:47,052][117973] Avg episode reward: [(0, '47.690'), (1, '39.200')]
[2023-10-01 10:36:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2162688. Throughput: 0: 738.6, 1: 740.0. Samples: 541621. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:52,052][117973] Avg episode reward: [(0, '48.050'), (1, '39.890')]
[2023-10-01 10:36:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2195456. Throughput: 0: 742.8, 1: 747.7. Samples: 546305. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:36:57,053][117973] Avg episode reward: [(0, '46.470'), (1, '39.130')]
[2023-10-01 10:36:59,218][119042] Updated weights for policy 1, policy_version 4320 (0.0017)
[2023-10-01 10:36:59,218][119041] Updated weights for policy 0, policy_version 4320 (0.0019)
[2023-10-01 10:37:02,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2228224. Throughput: 0: 742.4, 1: 739.5. Samples: 555012. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:02,052][117973] Avg episode reward: [(0, '46.720'), (1, '40.880')]
[2023-10-01 10:37:07,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 2252800. Throughput: 0: 738.2, 1: 738.0. Samples: 563779. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:07,052][117973] Avg episode reward: [(0, '46.140'), (1, '40.880')]
[2023-10-01 10:37:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2285568. Throughput: 0: 739.6, 1: 741.6. Samples: 568413. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:12,052][117973] Avg episode reward: [(0, '45.590'), (1, '41.660')]
[2023-10-01 10:37:13,097][119041] Updated weights for policy 0, policy_version 4480 (0.0018)
[2023-10-01 10:37:13,098][119042] Updated weights for policy 1, policy_version 4480 (0.0017)
[2023-10-01 10:37:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2310144. Throughput: 0: 747.4, 1: 746.0. Samples: 577333. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:37:17,053][117973] Avg episode reward: [(0, '47.200'), (1, '41.550')]
[2023-10-01 10:37:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2342912. Throughput: 0: 738.9, 1: 738.8. Samples: 585846. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:37:22,052][117973] Avg episode reward: [(0, '45.310'), (1, '39.230')]
[2023-10-01 10:37:26,984][119042] Updated weights for policy 1, policy_version 4640 (0.0019)
[2023-10-01 10:37:26,985][119041] Updated weights for policy 0, policy_version 4640 (0.0018)
[2023-10-01 10:37:27,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2375680. Throughput: 0: 737.0, 1: 736.2. Samples: 590287. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:27,052][117973] Avg episode reward: [(0, '44.480'), (1, '40.270')]
[2023-10-01 10:37:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2400256. Throughput: 0: 741.8, 1: 742.3. Samples: 599479. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:32,053][117973] Avg episode reward: [(0, '42.960'), (1, '39.370')]
[2023-10-01 10:37:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2433024. Throughput: 0: 741.6, 1: 739.3. Samples: 608260. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:37,052][117973] Avg episode reward: [(0, '43.160'), (1, '39.760')]
[2023-10-01 10:37:40,753][119041] Updated weights for policy 0, policy_version 4800 (0.0020)
[2023-10-01 10:37:40,753][119042] Updated weights for policy 1, policy_version 4800 (0.0021)
[2023-10-01 10:37:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2457600. Throughput: 0: 736.8, 1: 734.8. Samples: 612526. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:37:42,053][117973] Avg episode reward: [(0, '43.130'), (1, '39.360')]
[2023-10-01 10:37:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 2490368. Throughput: 0: 740.6, 1: 743.4. Samples: 621796. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:37:47,053][117973] Avg episode reward: [(0, '42.770'), (1, '39.000')]
[2023-10-01 10:37:52,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2523136. Throughput: 0: 744.9, 1: 743.1. Samples: 630742. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:37:52,052][117973] Avg episode reward: [(0, '44.060'), (1, '39.030')]
[2023-10-01 10:37:54,691][119042] Updated weights for policy 1, policy_version 4960 (0.0017)
[2023-10-01 10:37:54,691][119041] Updated weights for policy 0, policy_version 4960 (0.0018)
[2023-10-01 10:37:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2547712. Throughput: 0: 739.6, 1: 737.5. Samples: 634879. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:37:57,052][117973] Avg episode reward: [(0, '43.960'), (1, '38.560')]
[2023-10-01 10:38:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 2580480. Throughput: 0: 733.0, 1: 735.0. Samples: 643389. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:38:02,053][117973] Avg episode reward: [(0, '45.270'), (1, '38.740')]
[2023-10-01 10:38:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2605056. Throughput: 0: 740.2, 1: 740.9. Samples: 652494. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:07,053][117973] Avg episode reward: [(0, '45.270'), (1, '39.020')]
[2023-10-01 10:38:08,587][119041] Updated weights for policy 0, policy_version 5120 (0.0019)
[2023-10-01 10:38:08,587][119042] Updated weights for policy 1, policy_version 5120 (0.0017)
[2023-10-01 10:38:12,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 2637824. Throughput: 0: 740.4, 1: 741.3. Samples: 656961. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:12,052][117973] Avg episode reward: [(0, '45.580'), (1, '38.840')]
[2023-10-01 10:38:17,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5928.8). Total num frames: 2666496. Throughput: 0: 734.7, 1: 733.3. Samples: 665541. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:17,052][117973] Avg episode reward: [(0, '46.250'), (1, '35.510')]
[2023-10-01 10:38:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2695168. Throughput: 0: 732.5, 1: 735.4. Samples: 674318. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:22,052][117973] Avg episode reward: [(0, '45.460'), (1, '36.390')]
[2023-10-01 10:38:22,597][119041] Updated weights for policy 0, policy_version 5280 (0.0017)
[2023-10-01 10:38:22,598][119042] Updated weights for policy 1, policy_version 5280 (0.0018)
[2023-10-01 10:38:27,051][117973] Fps is (10 sec: 6144.1, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 2727936. Throughput: 0: 737.2, 1: 736.2. Samples: 678829. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:27,052][117973] Avg episode reward: [(0, '45.340'), (1, '36.850')]
[2023-10-01 10:38:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2752512. Throughput: 0: 735.0, 1: 734.7. Samples: 687931. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:32,052][117973] Avg episode reward: [(0, '46.030'), (1, '36.870')]
[2023-10-01 10:38:32,225][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000005392_1380352.pth...
[2023-10-01 10:38:32,238][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000005392_1380352.pth...
[2023-10-01 10:38:32,254][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000002608_667648.pth
[2023-10-01 10:38:32,267][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000002608_667648.pth
[2023-10-01 10:38:36,327][119041] Updated weights for policy 0, policy_version 5440 (0.0018)
[2023-10-01 10:38:36,335][119042] Updated weights for policy 1, policy_version 5440 (0.0018)
[2023-10-01 10:38:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2785280. Throughput: 0: 731.2, 1: 733.0. Samples: 696632. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:38:37,052][117973] Avg episode reward: [(0, '44.380'), (1, '34.870')]
[2023-10-01 10:38:42,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 2818048. Throughput: 0: 737.1, 1: 739.2. Samples: 701309. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:38:42,053][117973] Avg episode reward: [(0, '44.090'), (1, '35.820')]
[2023-10-01 10:38:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 2842624. Throughput: 0: 741.4, 1: 739.9. Samples: 710047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:47,052][117973] Avg episode reward: [(0, '45.300'), (1, '35.770')]
[2023-10-01 10:38:50,344][119041] Updated weights for policy 0, policy_version 5600 (0.0016)
[2023-10-01 10:38:50,345][119042] Updated weights for policy 1, policy_version 5600 (0.0016)
[2023-10-01 10:38:52,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 2875392. Throughput: 0: 735.6, 1: 735.8. Samples: 718708. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:38:52,053][117973] Avg episode reward: [(0, '47.640'), (1, '37.840')]
[2023-10-01 10:38:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2899968. Throughput: 0: 735.0, 1: 731.4. Samples: 722953. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:38:57,053][117973] Avg episode reward: [(0, '47.320'), (1, '39.690')]
[2023-10-01 10:39:02,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2932736. Throughput: 0: 735.4, 1: 737.6. Samples: 731826. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:39:02,053][117973] Avg episode reward: [(0, '48.040'), (1, '38.540')]
[2023-10-01 10:39:04,341][119042] Updated weights for policy 1, policy_version 5760 (0.0018)
[2023-10-01 10:39:04,341][119041] Updated weights for policy 0, policy_version 5760 (0.0019)
[2023-10-01 10:39:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2957312. Throughput: 0: 735.3, 1: 735.1. Samples: 740484. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:39:07,053][117973] Avg episode reward: [(0, '47.420'), (1, '38.620')]
[2023-10-01 10:39:12,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 2990080. Throughput: 0: 734.6, 1: 734.1. Samples: 744918. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:39:12,053][117973] Avg episode reward: [(0, '47.570'), (1, '38.280')]
[2023-10-01 10:39:17,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5939.2, 300 sec: 5942.7). Total num frames: 3022848. Throughput: 0: 731.6, 1: 730.2. Samples: 753713. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:39:17,052][117973] Avg episode reward: [(0, '46.430'), (1, '38.350')]
[2023-10-01 10:39:18,370][119042] Updated weights for policy 1, policy_version 5920 (0.0019)
[2023-10-01 10:39:18,370][119041] Updated weights for policy 0, policy_version 5920 (0.0019)
[2023-10-01 10:39:22,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3047424. Throughput: 0: 732.5, 1: 731.6. Samples: 762517. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:39:22,053][117973] Avg episode reward: [(0, '46.080'), (1, '38.550')]
[2023-10-01 10:39:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 3080192. Throughput: 0: 726.5, 1: 727.4. Samples: 766734. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:39:27,052][117973] Avg episode reward: [(0, '46.090'), (1, '38.410')]
[2023-10-01 10:39:32,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3104768. Throughput: 0: 730.3, 1: 730.5. Samples: 775784. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:39:32,053][117973] Avg episode reward: [(0, '45.810'), (1, '40.420')]
[2023-10-01 10:39:32,395][119041] Updated weights for policy 0, policy_version 6080 (0.0018)
[2023-10-01 10:39:32,395][119042] Updated weights for policy 1, policy_version 6080 (0.0018)
[2023-10-01 10:39:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3137536. Throughput: 0: 731.3, 1: 728.5. Samples: 784397. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:39:37,052][117973] Avg episode reward: [(0, '44.940'), (1, '40.060')]
[2023-10-01 10:39:42,052][117973] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5928.8). Total num frames: 3170304. Throughput: 0: 732.1, 1: 734.6. Samples: 788957. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:39:42,053][117973] Avg episode reward: [(0, '45.090'), (1, '39.870')]
[2023-10-01 10:39:46,019][119041] Updated weights for policy 0, policy_version 6240 (0.0016)
[2023-10-01 10:39:46,019][119042] Updated weights for policy 1, policy_version 6240 (0.0019)
[2023-10-01 10:39:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3194880. Throughput: 0: 737.8, 1: 737.0. Samples: 798190. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:39:47,053][117973] Avg episode reward: [(0, '44.860'), (1, '40.290')]
[2023-10-01 10:39:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 3227648. Throughput: 0: 739.5, 1: 736.8. Samples: 806916. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:39:52,052][117973] Avg episode reward: [(0, '44.760'), (1, '40.700')]
[2023-10-01 10:39:57,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5901.0). Total num frames: 3256320. Throughput: 0: 736.1, 1: 737.4. Samples: 811229. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:39:57,053][117973] Avg episode reward: [(0, '43.350'), (1, '40.130')]
[2023-10-01 10:39:59,933][119042] Updated weights for policy 1, policy_version 6400 (0.0018)
[2023-10-01 10:39:59,933][119041] Updated weights for policy 0, policy_version 6400 (0.0017)
[2023-10-01 10:40:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 3284992. Throughput: 0: 739.0, 1: 740.1. Samples: 820273. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:02,052][117973] Avg episode reward: [(0, '41.950'), (1, '38.450')]
[2023-10-01 10:40:07,052][117973] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3317760. Throughput: 0: 740.3, 1: 741.8. Samples: 829210. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:07,053][117973] Avg episode reward: [(0, '43.750'), (1, '39.120')]
[2023-10-01 10:40:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 3342336. Throughput: 0: 741.6, 1: 740.8. Samples: 833442. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:12,052][117973] Avg episode reward: [(0, '43.010'), (1, '39.290')]
[2023-10-01 10:40:13,849][119042] Updated weights for policy 1, policy_version 6560 (0.0016)
[2023-10-01 10:40:13,849][119041] Updated weights for policy 0, policy_version 6560 (0.0019)
[2023-10-01 10:40:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3375104. Throughput: 0: 738.4, 1: 740.1. Samples: 842319. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:17,053][117973] Avg episode reward: [(0, '42.620'), (1, '39.400')]
[2023-10-01 10:40:22,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3407872. Throughput: 0: 746.8, 1: 748.6. Samples: 851691. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:22,052][117973] Avg episode reward: [(0, '44.040'), (1, '40.390')]
[2023-10-01 10:40:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3432448. Throughput: 0: 745.2, 1: 744.2. Samples: 855981. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:27,053][117973] Avg episode reward: [(0, '42.550'), (1, '44.430')]
[2023-10-01 10:40:27,550][119041] Updated weights for policy 0, policy_version 6720 (0.0018)
[2023-10-01 10:40:27,550][119042] Updated weights for policy 1, policy_version 6720 (0.0017)
[2023-10-01 10:40:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3465216. Throughput: 0: 736.7, 1: 736.3. Samples: 864474. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:32,052][117973] Avg episode reward: [(0, '43.360'), (1, '43.520')]
[2023-10-01 10:40:32,061][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000006768_1732608.pth...
[2023-10-01 10:40:32,061][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000006768_1732608.pth...
[2023-10-01 10:40:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000004000_1024000.pth
[2023-10-01 10:40:32,103][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000004000_1024000.pth
[2023-10-01 10:40:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3489792. Throughput: 0: 738.6, 1: 741.0. Samples: 873498. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:40:37,052][117973] Avg episode reward: [(0, '43.620'), (1, '43.630')]
[2023-10-01 10:40:41,649][119042] Updated weights for policy 1, policy_version 6880 (0.0017)
[2023-10-01 10:40:41,651][119041] Updated weights for policy 0, policy_version 6880 (0.0019)
[2023-10-01 10:40:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 3522560. Throughput: 0: 741.6, 1: 740.8. Samples: 877940. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:40:42,052][117973] Avg episode reward: [(0, '42.270'), (1, '42.880')]
[2023-10-01 10:40:47,052][117973] Fps is (10 sec: 6143.9, 60 sec: 5939.2, 300 sec: 5901.0). Total num frames: 3551232. Throughput: 0: 739.2, 1: 737.9. Samples: 886742. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:40:47,053][117973] Avg episode reward: [(0, '41.030'), (1, '42.800')]
[2023-10-01 10:40:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3579904. Throughput: 0: 734.6, 1: 734.0. Samples: 895300. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:52,053][117973] Avg episode reward: [(0, '40.640'), (1, '44.410')]
[2023-10-01 10:40:55,456][119042] Updated weights for policy 1, policy_version 7040 (0.0018)
[2023-10-01 10:40:55,457][119041] Updated weights for policy 0, policy_version 7040 (0.0017)
[2023-10-01 10:40:57,051][117973] Fps is (10 sec: 6144.1, 60 sec: 5939.2, 300 sec: 5914.9). Total num frames: 3612672. Throughput: 0: 736.7, 1: 737.3. Samples: 899770. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:40:57,052][117973] Avg episode reward: [(0, '39.980'), (1, '40.000')]
[2023-10-01 10:41:02,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 3637248. Throughput: 0: 739.4, 1: 738.7. Samples: 908834. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:02,053][117973] Avg episode reward: [(0, '40.200'), (1, '37.960')]
[2023-10-01 10:41:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3670016. Throughput: 0: 732.9, 1: 733.1. Samples: 917660. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:41:07,053][117973] Avg episode reward: [(0, '40.210'), (1, '39.040')]
[2023-10-01 10:41:09,174][119042] Updated weights for policy 1, policy_version 7200 (0.0020)
[2023-10-01 10:41:09,174][119041] Updated weights for policy 0, policy_version 7200 (0.0020)
[2023-10-01 10:41:12,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5914.9). Total num frames: 3702784. Throughput: 0: 734.9, 1: 736.3. Samples: 922187. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:12,053][117973] Avg episode reward: [(0, '41.330'), (1, '39.160')]
[2023-10-01 10:41:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 3727360. Throughput: 0: 743.6, 1: 744.7. Samples: 931449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:17,052][117973] Avg episode reward: [(0, '42.170'), (1, '38.320')]
[2023-10-01 10:41:22,052][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3760128. Throughput: 0: 740.5, 1: 738.4. Samples: 940048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:22,053][117973] Avg episode reward: [(0, '41.320'), (1, '37.400')]
[2023-10-01 10:41:22,872][119041] Updated weights for policy 0, policy_version 7360 (0.0016)
[2023-10-01 10:41:22,872][119042] Updated weights for policy 1, policy_version 7360 (0.0019)
[2023-10-01 10:41:27,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3792896. Throughput: 0: 741.5, 1: 742.2. Samples: 944706. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:41:27,053][117973] Avg episode reward: [(0, '41.350'), (1, '36.900')]
[2023-10-01 10:41:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 3817472. Throughput: 0: 744.7, 1: 746.3. Samples: 953837. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:41:32,052][117973] Avg episode reward: [(0, '42.980'), (1, '40.170')]
[2023-10-01 10:41:36,454][119042] Updated weights for policy 1, policy_version 7520 (0.0016)
[2023-10-01 10:41:36,454][119041] Updated weights for policy 0, policy_version 7520 (0.0016)
[2023-10-01 10:41:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3850240. Throughput: 0: 748.8, 1: 748.4. Samples: 962673. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:37,052][117973] Avg episode reward: [(0, '44.970'), (1, '37.660')]
[2023-10-01 10:41:42,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3883008. Throughput: 0: 749.0, 1: 749.1. Samples: 967183. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:42,052][117973] Avg episode reward: [(0, '45.210'), (1, '37.640')]
[2023-10-01 10:41:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5914.9). Total num frames: 3907584. Throughput: 0: 750.3, 1: 750.5. Samples: 976372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:47,053][117973] Avg episode reward: [(0, '41.140'), (1, '39.680')]
[2023-10-01 10:41:50,195][119041] Updated weights for policy 0, policy_version 7680 (0.0019)
[2023-10-01 10:41:50,195][119042] Updated weights for policy 1, policy_version 7680 (0.0018)
[2023-10-01 10:41:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3940352. Throughput: 0: 750.4, 1: 748.2. Samples: 985093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:52,053][117973] Avg episode reward: [(0, '41.820'), (1, '39.680')]
[2023-10-01 10:41:57,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3973120. Throughput: 0: 748.6, 1: 748.2. Samples: 989546. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:41:57,052][117973] Avg episode reward: [(0, '42.160'), (1, '40.940')]
[2023-10-01 10:42:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 3997696. Throughput: 0: 748.6, 1: 747.5. Samples: 998770. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:42:02,053][117973] Avg episode reward: [(0, '43.540'), (1, '42.480')]
[2023-10-01 10:42:03,871][119042] Updated weights for policy 1, policy_version 7840 (0.0018)
[2023-10-01 10:42:03,871][119041] Updated weights for policy 0, policy_version 7840 (0.0018)
[2023-10-01 10:42:07,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 4030464. Throughput: 0: 750.9, 1: 750.6. Samples: 1007616. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:42:07,052][117973] Avg episode reward: [(0, '43.340'), (1, '39.190')]
[2023-10-01 10:42:12,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 4055040. Throughput: 0: 745.9, 1: 743.2. Samples: 1011716. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:42:12,052][117973] Avg episode reward: [(0, '43.790'), (1, '39.010')]
[2023-10-01 10:42:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 4087808. Throughput: 0: 741.1, 1: 741.7. Samples: 1020563. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:42:17,052][117973] Avg episode reward: [(0, '44.500'), (1, '38.950')]
[2023-10-01 10:42:17,878][119041] Updated weights for policy 0, policy_version 8000 (0.0017)
[2023-10-01 10:42:17,879][119042] Updated weights for policy 1, policy_version 8000 (0.0018)
[2023-10-01 10:42:22,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 4120576. Throughput: 0: 746.0, 1: 745.6. Samples: 1029797. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:42:22,052][117973] Avg episode reward: [(0, '44.000'), (1, '40.580')]
[2023-10-01 10:42:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4145152. Throughput: 0: 746.5, 1: 743.7. Samples: 1034241. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:42:27,053][117973] Avg episode reward: [(0, '46.180'), (1, '40.610')]
[2023-10-01 10:42:31,732][119041] Updated weights for policy 0, policy_version 8160 (0.0018)
[2023-10-01 10:42:31,732][119042] Updated weights for policy 1, policy_version 8160 (0.0018)
[2023-10-01 10:42:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 4177920. Throughput: 0: 738.1, 1: 737.8. Samples: 1042787. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:42:32,052][117973] Avg episode reward: [(0, '46.180'), (1, '41.250')]
[2023-10-01 10:42:32,059][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000008160_2088960.pth...
[2023-10-01 10:42:32,059][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000008160_2088960.pth...
[2023-10-01 10:42:32,088][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000005392_1380352.pth
[2023-10-01 10:42:32,093][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000005392_1380352.pth
[2023-10-01 10:42:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4202496. Throughput: 0: 739.0, 1: 741.1. Samples: 1051697. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:42:37,053][117973] Avg episode reward: [(0, '47.580'), (1, '40.620')]
[2023-10-01 10:42:42,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4235264. Throughput: 0: 742.4, 1: 741.9. Samples: 1056341. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:42:42,052][117973] Avg episode reward: [(0, '47.270'), (1, '42.530')]
[2023-10-01 10:42:45,380][119041] Updated weights for policy 0, policy_version 8320 (0.0018)
[2023-10-01 10:42:45,380][119042] Updated weights for policy 1, policy_version 8320 (0.0018)
[2023-10-01 10:42:47,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 4268032. Throughput: 0: 737.6, 1: 737.7. Samples: 1065160. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:42:47,052][117973] Avg episode reward: [(0, '46.190'), (1, '44.010')]
[2023-10-01 10:42:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 4292608. Throughput: 0: 737.7, 1: 740.8. Samples: 1074148. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:42:52,052][117973] Avg episode reward: [(0, '46.360'), (1, '43.480')]
[2023-10-01 10:42:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4325376. Throughput: 0: 741.2, 1: 744.3. Samples: 1078563. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:42:57,053][117973] Avg episode reward: [(0, '44.260'), (1, '43.430')]
[2023-10-01 10:42:59,352][119041] Updated weights for policy 0, policy_version 8480 (0.0018)
[2023-10-01 10:42:59,352][119042] Updated weights for policy 1, policy_version 8480 (0.0018)
[2023-10-01 10:43:02,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4358144. Throughput: 0: 745.0, 1: 742.1. Samples: 1087484. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:43:02,053][117973] Avg episode reward: [(0, '44.300'), (1, '43.910')]
[2023-10-01 10:43:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4382720. Throughput: 0: 740.0, 1: 740.3. Samples: 1096410. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:43:07,053][117973] Avg episode reward: [(0, '46.330'), (1, '45.690')]
[2023-10-01 10:43:07,054][118715] Saving new best policy, reward=45.690!
[2023-10-01 10:43:12,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5928.8). Total num frames: 4415488. Throughput: 0: 737.8, 1: 741.1. Samples: 1100790. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:43:12,052][117973] Avg episode reward: [(0, '44.200'), (1, '48.620')]
[2023-10-01 10:43:12,053][118715] Saving new best policy, reward=48.620!
[2023-10-01 10:43:13,068][119041] Updated weights for policy 0, policy_version 8640 (0.0017)
[2023-10-01 10:43:13,068][119042] Updated weights for policy 1, policy_version 8640 (0.0018)
[2023-10-01 10:43:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4440064. Throughput: 0: 746.3, 1: 745.7. Samples: 1109929. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:43:17,052][117973] Avg episode reward: [(0, '44.420'), (1, '49.710')]
[2023-10-01 10:43:17,166][118715] Saving new best policy, reward=49.710!
[2023-10-01 10:43:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4472832. Throughput: 0: 742.4, 1: 742.4. Samples: 1118511. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:43:22,052][117973] Avg episode reward: [(0, '44.300'), (1, '47.330')]
[2023-10-01 10:43:26,780][119042] Updated weights for policy 1, policy_version 8800 (0.0017)
[2023-10-01 10:43:26,781][119041] Updated weights for policy 0, policy_version 8800 (0.0020)
[2023-10-01 10:43:27,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4505600. Throughput: 0: 741.0, 1: 741.3. Samples: 1123044. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:43:27,052][117973] Avg episode reward: [(0, '44.930'), (1, '48.290')]
[2023-10-01 10:43:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4530176. Throughput: 0: 744.7, 1: 744.3. Samples: 1132170. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:43:32,053][117973] Avg episode reward: [(0, '47.100'), (1, '48.120')]
[2023-10-01 10:43:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 4562944. Throughput: 0: 741.8, 1: 741.1. Samples: 1140878. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:43:37,053][117973] Avg episode reward: [(0, '46.050'), (1, '49.930')]
[2023-10-01 10:43:37,054][118715] Saving new best policy, reward=49.930!
[2023-10-01 10:43:40,544][119042] Updated weights for policy 1, policy_version 8960 (0.0019)
[2023-10-01 10:43:40,544][119041] Updated weights for policy 0, policy_version 8960 (0.0019)
[2023-10-01 10:43:42,051][117973] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4595712. Throughput: 0: 743.2, 1: 742.8. Samples: 1145432. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:43:42,052][117973] Avg episode reward: [(0, '43.670'), (1, '45.050')]
[2023-10-01 10:43:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 4620288. Throughput: 0: 746.2, 1: 747.5. Samples: 1154697. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:43:47,052][117973] Avg episode reward: [(0, '43.480'), (1, '44.320')]
[2023-10-01 10:43:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4653056. Throughput: 0: 744.0, 1: 743.7. Samples: 1163358. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:43:52,052][117973] Avg episode reward: [(0, '44.460'), (1, '45.360')]
[2023-10-01 10:43:54,216][119041] Updated weights for policy 0, policy_version 9120 (0.0018)
[2023-10-01 10:43:54,216][119042] Updated weights for policy 1, policy_version 9120 (0.0018)
[2023-10-01 10:43:57,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4685824. Throughput: 0: 745.8, 1: 744.7. Samples: 1167860. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:43:57,053][117973] Avg episode reward: [(0, '44.930'), (1, '46.000')]
[2023-10-01 10:44:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 4710400. Throughput: 0: 746.6, 1: 747.6. Samples: 1177169. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:44:02,052][117973] Avg episode reward: [(0, '44.500'), (1, '46.650')]
[2023-10-01 10:44:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4743168. Throughput: 0: 748.7, 1: 747.9. Samples: 1185858. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:44:07,053][117973] Avg episode reward: [(0, '41.580'), (1, '44.970')]
[2023-10-01 10:44:07,796][119041] Updated weights for policy 0, policy_version 9280 (0.0018)
[2023-10-01 10:44:07,796][119042] Updated weights for policy 1, policy_version 9280 (0.0017)
[2023-10-01 10:44:12,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5928.8). Total num frames: 4771840. Throughput: 0: 747.7, 1: 748.2. Samples: 1190358. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:44:12,053][117973] Avg episode reward: [(0, '41.490'), (1, '44.050')]
[2023-10-01 10:44:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 4800512. Throughput: 0: 745.5, 1: 744.8. Samples: 1199234. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:44:17,053][117973] Avg episode reward: [(0, '41.190'), (1, '43.990')]
[2023-10-01 10:44:21,778][119042] Updated weights for policy 1, policy_version 9440 (0.0014)
[2023-10-01 10:44:21,779][119041] Updated weights for policy 0, policy_version 9440 (0.0017)
[2023-10-01 10:44:22,051][117973] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4833280. Throughput: 0: 749.8, 1: 748.2. Samples: 1208290. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:44:22,052][117973] Avg episode reward: [(0, '39.840'), (1, '44.950')]
[2023-10-01 10:44:27,052][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 4857856. Throughput: 0: 745.9, 1: 745.3. Samples: 1212535. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:44:27,053][117973] Avg episode reward: [(0, '41.560'), (1, '47.290')]
[2023-10-01 10:44:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4890624. Throughput: 0: 742.2, 1: 742.1. Samples: 1221491. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:44:32,052][117973] Avg episode reward: [(0, '42.650'), (1, '44.940')]
[2023-10-01 10:44:32,060][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000009552_2445312.pth...
[2023-10-01 10:44:32,060][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000009552_2445312.pth...
[2023-10-01 10:44:32,093][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000006768_1732608.pth
[2023-10-01 10:44:32,093][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000006768_1732608.pth
[2023-10-01 10:44:35,551][119041] Updated weights for policy 0, policy_version 9600 (0.0019)
[2023-10-01 10:44:35,551][119042] Updated weights for policy 1, policy_version 9600 (0.0020)
[2023-10-01 10:44:37,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4923392. Throughput: 0: 747.1, 1: 746.6. Samples: 1230572. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:44:37,052][117973] Avg episode reward: [(0, '42.710'), (1, '45.090')]
[2023-10-01 10:44:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 4947968. Throughput: 0: 746.5, 1: 744.2. Samples: 1234945. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:44:42,053][117973] Avg episode reward: [(0, '43.090'), (1, '44.080')]
[2023-10-01 10:44:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 4980736. Throughput: 0: 741.0, 1: 740.0. Samples: 1243813. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:44:47,052][117973] Avg episode reward: [(0, '44.500'), (1, '44.900')]
[2023-10-01 10:44:49,142][119042] Updated weights for policy 1, policy_version 9760 (0.0019)
[2023-10-01 10:44:49,142][119041] Updated weights for policy 0, policy_version 9760 (0.0018)
[2023-10-01 10:44:52,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5956.6). Total num frames: 5013504. Throughput: 0: 745.7, 1: 749.1. Samples: 1253124. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:44:52,052][117973] Avg episode reward: [(0, '44.570'), (1, '43.070')]
[2023-10-01 10:44:57,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 5038080. Throughput: 0: 747.2, 1: 744.3. Samples: 1257476. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:44:57,052][117973] Avg episode reward: [(0, '46.150'), (1, '41.700')]
[2023-10-01 10:45:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 5070848. Throughput: 0: 746.4, 1: 747.7. Samples: 1266468. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:02,052][117973] Avg episode reward: [(0, '46.220'), (1, '42.670')]
[2023-10-01 10:45:02,822][119041] Updated weights for policy 0, policy_version 9920 (0.0018)
[2023-10-01 10:45:02,822][119042] Updated weights for policy 1, policy_version 9920 (0.0018)
[2023-10-01 10:45:07,052][117973] Fps is (10 sec: 6553.4, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 5103616. Throughput: 0: 747.7, 1: 750.7. Samples: 1275717. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:07,053][117973] Avg episode reward: [(0, '46.850'), (1, '41.760')]
[2023-10-01 10:45:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5942.7). Total num frames: 5128192. Throughput: 0: 750.8, 1: 748.6. Samples: 1280004. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:12,053][117973] Avg episode reward: [(0, '48.310'), (1, '42.020')]
[2023-10-01 10:45:16,275][119042] Updated weights for policy 1, policy_version 10080 (0.0016)
[2023-10-01 10:45:16,275][119041] Updated weights for policy 0, policy_version 10080 (0.0016)
[2023-10-01 10:45:17,051][117973] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 5160960. Throughput: 0: 750.8, 1: 752.8. Samples: 1289151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:17,052][117973] Avg episode reward: [(0, '47.200'), (1, '43.230')]
[2023-10-01 10:45:22,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5193728. Throughput: 0: 753.3, 1: 753.3. Samples: 1298371. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:22,052][117973] Avg episode reward: [(0, '47.680'), (1, '43.870')]
[2023-10-01 10:45:27,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 5218304. Throughput: 0: 750.9, 1: 751.0. Samples: 1302530. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:45:27,053][117973] Avg episode reward: [(0, '48.810'), (1, '43.570')]
[2023-10-01 10:45:29,995][119041] Updated weights for policy 0, policy_version 10240 (0.0015)
[2023-10-01 10:45:29,995][119042] Updated weights for policy 1, policy_version 10240 (0.0016)
[2023-10-01 10:45:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5251072. Throughput: 0: 753.2, 1: 753.6. Samples: 1311618. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:45:32,052][117973] Avg episode reward: [(0, '49.750'), (1, '44.710')]
[2023-10-01 10:45:37,052][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5283840. Throughput: 0: 756.1, 1: 751.4. Samples: 1320960. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:37,052][117973] Avg episode reward: [(0, '48.830'), (1, '43.580')]
[2023-10-01 10:45:42,052][117973] Fps is (10 sec: 6144.0, 60 sec: 6075.7, 300 sec: 5970.4). Total num frames: 5312512. Throughput: 0: 752.7, 1: 755.4. Samples: 1325344. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:45:42,052][117973] Avg episode reward: [(0, '48.920'), (1, '43.240')]
[2023-10-01 10:45:43,385][119042] Updated weights for policy 1, policy_version 10400 (0.0015)
[2023-10-01 10:45:43,385][119041] Updated weights for policy 0, policy_version 10400 (0.0018)
[2023-10-01 10:45:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5341184. Throughput: 0: 755.6, 1: 756.5. Samples: 1334516. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:45:47,053][117973] Avg episode reward: [(0, '48.030'), (1, '42.390')]
[2023-10-01 10:45:52,051][117973] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5373952. Throughput: 0: 754.1, 1: 751.2. Samples: 1343454. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:45:52,052][117973] Avg episode reward: [(0, '48.020'), (1, '41.690')]
[2023-10-01 10:45:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5398528. Throughput: 0: 752.5, 1: 755.6. Samples: 1347871. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:45:57,053][117973] Avg episode reward: [(0, '46.840'), (1, '38.020')]
[2023-10-01 10:45:57,132][119042] Updated weights for policy 1, policy_version 10560 (0.0018)
[2023-10-01 10:45:57,132][119041] Updated weights for policy 0, policy_version 10560 (0.0019)
[2023-10-01 10:46:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5431296. Throughput: 0: 753.5, 1: 752.4. Samples: 1356917. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:46:02,052][117973] Avg episode reward: [(0, '47.720'), (1, '37.340')]
[2023-10-01 10:46:07,051][117973] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5464064. Throughput: 0: 749.4, 1: 750.2. Samples: 1365853. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:46:07,052][117973] Avg episode reward: [(0, '46.680'), (1, '34.200')]
[2023-10-01 10:46:10,805][119041] Updated weights for policy 0, policy_version 10720 (0.0016)
[2023-10-01 10:46:10,806][119042] Updated weights for policy 1, policy_version 10720 (0.0018)
[2023-10-01 10:46:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5488640. Throughput: 0: 751.0, 1: 752.9. Samples: 1370203. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:46:12,053][117973] Avg episode reward: [(0, '47.460'), (1, '33.660')]
[2023-10-01 10:46:17,052][117973] Fps is (10 sec: 5734.2, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 5521408. Throughput: 0: 753.1, 1: 753.4. Samples: 1379411. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:46:17,053][117973] Avg episode reward: [(0, '48.140'), (1, '35.530')]
[2023-10-01 10:46:22,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5554176. Throughput: 0: 749.4, 1: 750.9. Samples: 1388473. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:46:22,053][117973] Avg episode reward: [(0, '45.530'), (1, '33.830')]
[2023-10-01 10:46:24,532][119041] Updated weights for policy 0, policy_version 10880 (0.0016)
[2023-10-01 10:46:24,532][119042] Updated weights for policy 1, policy_version 10880 (0.0018)
[2023-10-01 10:46:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5578752. Throughput: 0: 749.1, 1: 746.9. Samples: 1392665. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:46:27,053][117973] Avg episode reward: [(0, '44.430'), (1, '33.010')]
[2023-10-01 10:46:32,052][117973] Fps is (10 sec: 5734.2, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 5611520. Throughput: 0: 743.3, 1: 742.3. Samples: 1401366. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:46:32,053][117973] Avg episode reward: [(0, '44.360'), (1, '31.880')]
[2023-10-01 10:46:32,063][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000010960_2805760.pth...
[2023-10-01 10:46:32,064][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000010960_2805760.pth...
[2023-10-01 10:46:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000008160_2088960.pth
[2023-10-01 10:46:32,101][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000008160_2088960.pth
[2023-10-01 10:46:37,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5956.6). Total num frames: 5640192. Throughput: 0: 744.6, 1: 745.6. Samples: 1410512. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:46:37,053][117973] Avg episode reward: [(0, '44.520'), (1, '32.150')]
[2023-10-01 10:46:38,432][119041] Updated weights for policy 0, policy_version 11040 (0.0018)
[2023-10-01 10:46:38,433][119042] Updated weights for policy 1, policy_version 11040 (0.0018)
[2023-10-01 10:46:42,052][117973] Fps is (10 sec: 5734.5, 60 sec: 5939.2, 300 sec: 5970.4). Total num frames: 5668864. Throughput: 0: 746.4, 1: 746.2. Samples: 1415037. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:46:42,053][117973] Avg episode reward: [(0, '40.450'), (1, '32.010')]
[2023-10-01 10:46:47,051][117973] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5701632. Throughput: 0: 742.9, 1: 743.0. Samples: 1423783. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:46:47,052][117973] Avg episode reward: [(0, '38.890'), (1, '33.780')]
[2023-10-01 10:46:52,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 5726208. Throughput: 0: 738.5, 1: 739.1. Samples: 1432346. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:46:52,052][117973] Avg episode reward: [(0, '39.700'), (1, '34.050')]
[2023-10-01 10:46:52,408][119042] Updated weights for policy 1, policy_version 11200 (0.0017)
[2023-10-01 10:46:52,409][119041] Updated weights for policy 0, policy_version 11200 (0.0019)
[2023-10-01 10:46:57,051][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5758976. Throughput: 0: 741.9, 1: 742.3. Samples: 1436993. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:46:57,052][117973] Avg episode reward: [(0, '38.550'), (1, '34.900')]
[2023-10-01 10:47:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 5783552. Throughput: 0: 738.6, 1: 737.3. Samples: 1445825. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:47:02,052][117973] Avg episode reward: [(0, '38.830'), (1, '36.660')]
[2023-10-01 10:47:06,108][119042] Updated weights for policy 1, policy_version 11360 (0.0018)
[2023-10-01 10:47:06,108][119041] Updated weights for policy 0, policy_version 11360 (0.0019)
[2023-10-01 10:47:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 5816320. Throughput: 0: 735.6, 1: 737.7. Samples: 1454774. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:47:07,053][117973] Avg episode reward: [(0, '39.420'), (1, '36.280')]
[2023-10-01 10:47:12,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5849088. Throughput: 0: 739.8, 1: 742.4. Samples: 1459361. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:47:12,053][117973] Avg episode reward: [(0, '39.710'), (1, '36.010')]
[2023-10-01 10:47:17,051][117973] Fps is (10 sec: 6144.0, 60 sec: 5939.2, 300 sec: 5956.6). Total num frames: 5877760. Throughput: 0: 746.1, 1: 743.7. Samples: 1468402. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:47:17,052][117973] Avg episode reward: [(0, '40.390'), (1, '39.280')]
[2023-10-01 10:47:19,983][119041] Updated weights for policy 0, policy_version 11520 (0.0018)
[2023-10-01 10:47:19,983][119042] Updated weights for policy 1, policy_version 11520 (0.0018)
[2023-10-01 10:47:22,052][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 5906432. Throughput: 0: 736.6, 1: 737.8. Samples: 1476861. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:47:22,053][117973] Avg episode reward: [(0, '41.170'), (1, '39.900')]
[2023-10-01 10:47:27,051][117973] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 5939200. Throughput: 0: 740.3, 1: 738.3. Samples: 1481573. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:47:27,052][117973] Avg episode reward: [(0, '42.030'), (1, '40.330')]
[2023-10-01 10:47:32,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5970.4). Total num frames: 5963776. Throughput: 0: 742.6, 1: 744.7. Samples: 1490712. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:47:32,053][117973] Avg episode reward: [(0, '42.530'), (1, '41.140')]
[2023-10-01 10:47:33,723][119041] Updated weights for policy 0, policy_version 11680 (0.0017)
[2023-10-01 10:47:33,723][119042] Updated weights for policy 1, policy_version 11680 (0.0018)
[2023-10-01 10:47:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5970.4). Total num frames: 5996544. Throughput: 0: 743.6, 1: 742.6. Samples: 1499222. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:47:37,053][117973] Avg episode reward: [(0, '43.380'), (1, '41.810')]
[2023-10-01 10:47:42,051][117973] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6029312. Throughput: 0: 741.8, 1: 742.2. Samples: 1503774. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:47:42,052][117973] Avg episode reward: [(0, '42.710'), (1, '42.890')]
[2023-10-01 10:47:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 6053888. Throughput: 0: 742.5, 1: 746.5. Samples: 1512828. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:47:47,052][117973] Avg episode reward: [(0, '45.920'), (1, '43.800')]
[2023-10-01 10:47:47,621][119042] Updated weights for policy 1, policy_version 11840 (0.0018)
[2023-10-01 10:47:47,621][119041] Updated weights for policy 0, policy_version 11840 (0.0017)
[2023-10-01 10:47:52,051][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6086656. Throughput: 0: 744.1, 1: 741.4. Samples: 1521622. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:47:52,052][117973] Avg episode reward: [(0, '47.740'), (1, '47.140')]
[2023-10-01 10:47:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 6111232. Throughput: 0: 740.0, 1: 739.4. Samples: 1525935. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:47:57,053][117973] Avg episode reward: [(0, '46.710'), (1, '46.710')]
[2023-10-01 10:48:01,279][119041] Updated weights for policy 0, policy_version 12000 (0.0017)
[2023-10-01 10:48:01,279][119042] Updated weights for policy 1, policy_version 12000 (0.0019)
[2023-10-01 10:48:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6144000. Throughput: 0: 739.1, 1: 741.5. Samples: 1535029. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:48:02,052][117973] Avg episode reward: [(0, '45.310'), (1, '46.580')]
[2023-10-01 10:48:07,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6176768. Throughput: 0: 747.4, 1: 743.8. Samples: 1543963. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:48:07,052][117973] Avg episode reward: [(0, '44.670'), (1, '45.500')]
[2023-10-01 10:48:12,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 6201344. Throughput: 0: 741.6, 1: 740.8. Samples: 1548285. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:48:12,053][117973] Avg episode reward: [(0, '45.280'), (1, '47.470')]
[2023-10-01 10:48:15,216][119041] Updated weights for policy 0, policy_version 12160 (0.0019)
[2023-10-01 10:48:15,216][119042] Updated weights for policy 1, policy_version 12160 (0.0018)
[2023-10-01 10:48:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5970.4). Total num frames: 6234112. Throughput: 0: 736.9, 1: 736.1. Samples: 1556995. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:48:17,052][117973] Avg episode reward: [(0, '44.900'), (1, '48.000')]
[2023-10-01 10:48:22,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6266880. Throughput: 0: 746.6, 1: 747.6. Samples: 1566459. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:48:22,052][117973] Avg episode reward: [(0, '45.220'), (1, '48.650')]
[2023-10-01 10:48:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 6291456. Throughput: 0: 746.2, 1: 743.6. Samples: 1570816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:48:27,053][117973] Avg episode reward: [(0, '43.530'), (1, '49.980')]
[2023-10-01 10:48:27,054][118715] Saving new best policy, reward=49.980!
[2023-10-01 10:48:28,793][119041] Updated weights for policy 0, policy_version 12320 (0.0020)
[2023-10-01 10:48:28,793][119042] Updated weights for policy 1, policy_version 12320 (0.0019)
[2023-10-01 10:48:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6324224. Throughput: 0: 743.2, 1: 740.1. Samples: 1579576. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:48:32,053][117973] Avg episode reward: [(0, '43.070'), (1, '49.600')]
[2023-10-01 10:48:32,064][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000012352_3162112.pth...
[2023-10-01 10:48:32,064][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000012352_3162112.pth...
[2023-10-01 10:48:32,101][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000009552_2445312.pth
[2023-10-01 10:48:32,105][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000009552_2445312.pth
[2023-10-01 10:48:37,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6356992. Throughput: 0: 745.8, 1: 746.6. Samples: 1588780. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:48:37,053][117973] Avg episode reward: [(0, '44.040'), (1, '46.700')]
[2023-10-01 10:48:42,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 6381568. Throughput: 0: 746.4, 1: 745.4. Samples: 1593066. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 10:48:42,052][117973] Avg episode reward: [(0, '44.320'), (1, '42.470')]
[2023-10-01 10:48:42,862][119041] Updated weights for policy 0, policy_version 12480 (0.0020)
[2023-10-01 10:48:42,862][119042] Updated weights for policy 1, policy_version 12480 (0.0018)
[2023-10-01 10:48:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6414336. Throughput: 0: 740.3, 1: 737.7. Samples: 1601540. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:48:47,052][117973] Avg episode reward: [(0, '43.080'), (1, '43.390')]
[2023-10-01 10:48:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 6438912. Throughput: 0: 736.8, 1: 740.3. Samples: 1610436. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:48:52,053][117973] Avg episode reward: [(0, '42.960'), (1, '41.520')]
[2023-10-01 10:48:56,596][119041] Updated weights for policy 0, policy_version 12640 (0.0016)
[2023-10-01 10:48:56,598][119042] Updated weights for policy 1, policy_version 12640 (0.0017)
[2023-10-01 10:48:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6471680. Throughput: 0: 739.8, 1: 741.0. Samples: 1614922. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:48:57,053][117973] Avg episode reward: [(0, '43.840'), (1, '43.000')]
[2023-10-01 10:49:02,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 6496256. Throughput: 0: 746.4, 1: 743.3. Samples: 1624032. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:49:02,053][117973] Avg episode reward: [(0, '43.990'), (1, '41.060')]
[2023-10-01 10:49:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5956.6). Total num frames: 6529024. Throughput: 0: 738.6, 1: 738.9. Samples: 1632949. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:49:07,052][117973] Avg episode reward: [(0, '45.970'), (1, '39.250')]
[2023-10-01 10:49:10,309][119041] Updated weights for policy 0, policy_version 12800 (0.0018)
[2023-10-01 10:49:10,310][119042] Updated weights for policy 1, policy_version 12800 (0.0016)
[2023-10-01 10:49:12,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6561792. Throughput: 0: 739.7, 1: 741.5. Samples: 1637469. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:49:12,053][117973] Avg episode reward: [(0, '45.150'), (1, '40.930')]
[2023-10-01 10:49:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 6586368. Throughput: 0: 743.8, 1: 743.4. Samples: 1646501. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:49:17,052][117973] Avg episode reward: [(0, '40.130'), (1, '39.750')]
[2023-10-01 10:49:22,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 6619136. Throughput: 0: 734.9, 1: 736.0. Samples: 1654974. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:49:22,052][117973] Avg episode reward: [(0, '39.540'), (1, '36.990')]
[2023-10-01 10:49:24,039][119042] Updated weights for policy 1, policy_version 12960 (0.0018)
[2023-10-01 10:49:24,039][119041] Updated weights for policy 0, policy_version 12960 (0.0018)
[2023-10-01 10:49:27,052][117973] Fps is (10 sec: 6553.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6651904. Throughput: 0: 739.7, 1: 739.6. Samples: 1659633. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:49:27,053][117973] Avg episode reward: [(0, '41.210'), (1, '39.440')]
[2023-10-01 10:49:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 6676480. Throughput: 0: 743.6, 1: 744.5. Samples: 1668505. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:49:32,052][117973] Avg episode reward: [(0, '41.680'), (1, '39.430')]
[2023-10-01 10:49:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 6709248. Throughput: 0: 744.5, 1: 741.8. Samples: 1677317. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:49:37,053][117973] Avg episode reward: [(0, '42.370'), (1, '39.100')]
[2023-10-01 10:49:38,043][119042] Updated weights for policy 1, policy_version 13120 (0.0015)
[2023-10-01 10:49:38,044][119041] Updated weights for policy 0, policy_version 13120 (0.0018)
[2023-10-01 10:49:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 6733824. Throughput: 0: 739.5, 1: 740.2. Samples: 1681509. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:49:42,053][117973] Avg episode reward: [(0, '45.440'), (1, '40.560')]
[2023-10-01 10:49:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 6766592. Throughput: 0: 740.1, 1: 741.6. Samples: 1690708. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:49:47,053][117973] Avg episode reward: [(0, '45.580'), (1, '40.560')]
[2023-10-01 10:49:51,793][119042] Updated weights for policy 1, policy_version 13280 (0.0017)
[2023-10-01 10:49:51,793][119041] Updated weights for policy 0, policy_version 13280 (0.0017)
[2023-10-01 10:49:52,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6799360. Throughput: 0: 742.0, 1: 741.7. Samples: 1699712. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:49:52,053][117973] Avg episode reward: [(0, '45.590'), (1, '40.940')]
[2023-10-01 10:49:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 6823936. Throughput: 0: 739.4, 1: 737.8. Samples: 1703944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:49:57,053][117973] Avg episode reward: [(0, '47.400'), (1, '41.220')]
[2023-10-01 10:50:02,051][117973] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 6856704. Throughput: 0: 740.8, 1: 740.6. Samples: 1713166. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:02,052][117973] Avg episode reward: [(0, '48.050'), (1, '39.730')]
[2023-10-01 10:50:05,610][119042] Updated weights for policy 1, policy_version 13440 (0.0018)
[2023-10-01 10:50:05,610][119041] Updated weights for policy 0, policy_version 13440 (0.0018)
[2023-10-01 10:50:07,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6889472. Throughput: 0: 743.9, 1: 745.7. Samples: 1722003. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:07,053][117973] Avg episode reward: [(0, '50.040'), (1, '40.940')]
[2023-10-01 10:50:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 6914048. Throughput: 0: 742.1, 1: 741.8. Samples: 1726407. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:12,052][117973] Avg episode reward: [(0, '50.550'), (1, '41.670')]
[2023-10-01 10:50:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 6946816. Throughput: 0: 742.9, 1: 743.5. Samples: 1735393. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:17,053][117973] Avg episode reward: [(0, '51.990'), (1, '40.410')]
[2023-10-01 10:50:17,065][118645] Saving new best policy, reward=51.990!
[2023-10-01 10:50:19,163][119042] Updated weights for policy 1, policy_version 13600 (0.0017)
[2023-10-01 10:50:19,163][119041] Updated weights for policy 0, policy_version 13600 (0.0018)
[2023-10-01 10:50:22,052][117973] Fps is (10 sec: 6553.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 6979584. Throughput: 0: 748.2, 1: 750.8. Samples: 1744775. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:50:22,053][117973] Avg episode reward: [(0, '52.810'), (1, '42.200')]
[2023-10-01 10:50:22,054][118645] Saving new best policy, reward=52.810!
[2023-10-01 10:50:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7004160. Throughput: 0: 750.9, 1: 750.6. Samples: 1749079. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 10:50:27,053][117973] Avg episode reward: [(0, '52.810'), (1, '42.740')]
[2023-10-01 10:50:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7036928. Throughput: 0: 748.3, 1: 749.7. Samples: 1758119. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:50:32,052][117973] Avg episode reward: [(0, '58.280'), (1, '43.000')]
[2023-10-01 10:50:32,062][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000013744_3518464.pth...
[2023-10-01 10:50:32,062][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000013744_3518464.pth...
[2023-10-01 10:50:32,093][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000010960_2805760.pth
[2023-10-01 10:50:32,099][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000010960_2805760.pth
[2023-10-01 10:50:32,104][118645] Saving new best policy, reward=58.280!
[2023-10-01 10:50:32,769][119042] Updated weights for policy 1, policy_version 13760 (0.0016)
[2023-10-01 10:50:32,770][119041] Updated weights for policy 0, policy_version 13760 (0.0019)
[2023-10-01 10:50:37,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5956.6). Total num frames: 7069696. Throughput: 0: 752.3, 1: 750.9. Samples: 1767359. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:50:37,052][117973] Avg episode reward: [(0, '58.790'), (1, '44.780')]
[2023-10-01 10:50:37,052][118645] Saving new best policy, reward=58.790!
[2023-10-01 10:50:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7094272. Throughput: 0: 750.9, 1: 750.9. Samples: 1771525. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:50:42,052][117973] Avg episode reward: [(0, '56.880'), (1, '44.250')]
[2023-10-01 10:50:46,489][119042] Updated weights for policy 1, policy_version 13920 (0.0016)
[2023-10-01 10:50:46,489][119041] Updated weights for policy 0, policy_version 13920 (0.0016)
[2023-10-01 10:50:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7127040. Throughput: 0: 747.5, 1: 746.9. Samples: 1780416. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:47,052][117973] Avg episode reward: [(0, '56.550'), (1, '44.910')]
[2023-10-01 10:50:52,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7159808. Throughput: 0: 749.8, 1: 750.5. Samples: 1789520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:52,052][117973] Avg episode reward: [(0, '55.840'), (1, '44.570')]
[2023-10-01 10:50:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7184384. Throughput: 0: 750.2, 1: 750.9. Samples: 1793958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:50:57,052][117973] Avg episode reward: [(0, '53.530'), (1, '44.060')]
[2023-10-01 10:51:00,123][119041] Updated weights for policy 0, policy_version 14080 (0.0016)
[2023-10-01 10:51:00,124][119042] Updated weights for policy 1, policy_version 14080 (0.0018)
[2023-10-01 10:51:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7217152. Throughput: 0: 749.4, 1: 751.0. Samples: 1802910. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:02,052][117973] Avg episode reward: [(0, '53.290'), (1, '43.080')]
[2023-10-01 10:51:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7241728. Throughput: 0: 743.4, 1: 743.3. Samples: 1811677. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:07,053][117973] Avg episode reward: [(0, '54.870'), (1, '42.770')]
[2023-10-01 10:51:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 7274496. Throughput: 0: 744.6, 1: 744.6. Samples: 1816093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:12,053][117973] Avg episode reward: [(0, '57.010'), (1, '41.450')]
[2023-10-01 10:51:14,108][119041] Updated weights for policy 0, policy_version 14240 (0.0019)
[2023-10-01 10:51:14,108][119042] Updated weights for policy 1, policy_version 14240 (0.0019)
[2023-10-01 10:51:17,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7307264. Throughput: 0: 742.4, 1: 741.0. Samples: 1824874. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:17,053][117973] Avg episode reward: [(0, '58.560'), (1, '41.810')]
[2023-10-01 10:51:22,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 7331840. Throughput: 0: 739.8, 1: 741.3. Samples: 1834007. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:22,052][117973] Avg episode reward: [(0, '58.290'), (1, '41.810')]
[2023-10-01 10:51:27,052][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7364608. Throughput: 0: 740.9, 1: 742.4. Samples: 1838274. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:27,053][117973] Avg episode reward: [(0, '57.510'), (1, '42.800')]
[2023-10-01 10:51:28,137][119041] Updated weights for policy 0, policy_version 14400 (0.0018)
[2023-10-01 10:51:28,138][119042] Updated weights for policy 1, policy_version 14400 (0.0018)
[2023-10-01 10:51:32,052][117973] Fps is (10 sec: 5734.1, 60 sec: 5870.9, 300 sec: 5928.8). Total num frames: 7389184. Throughput: 0: 740.1, 1: 740.8. Samples: 1847060. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:32,053][117973] Avg episode reward: [(0, '58.270'), (1, '42.720')]
[2023-10-01 10:51:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7421952. Throughput: 0: 737.3, 1: 734.6. Samples: 1855757. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:37,053][117973] Avg episode reward: [(0, '58.200'), (1, '43.740')]
[2023-10-01 10:51:41,744][119041] Updated weights for policy 0, policy_version 14560 (0.0019)
[2023-10-01 10:51:41,744][119042] Updated weights for policy 1, policy_version 14560 (0.0018)
[2023-10-01 10:51:42,052][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 7454720. Throughput: 0: 737.5, 1: 738.9. Samples: 1860396. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:42,053][117973] Avg episode reward: [(0, '59.390'), (1, '44.130')]
[2023-10-01 10:51:42,054][118645] Saving new best policy, reward=59.390!
[2023-10-01 10:51:47,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7479296. Throughput: 0: 742.2, 1: 740.9. Samples: 1869648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:51:47,053][117973] Avg episode reward: [(0, '59.480'), (1, '44.730')]
[2023-10-01 10:51:47,173][118645] Saving new best policy, reward=59.480!
[2023-10-01 10:51:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7512064. Throughput: 0: 739.8, 1: 739.8. Samples: 1878255. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:51:52,052][117973] Avg episode reward: [(0, '60.470'), (1, '45.770')]
[2023-10-01 10:51:52,053][118645] Saving new best policy, reward=60.470!
[2023-10-01 10:51:55,437][119041] Updated weights for policy 0, policy_version 14720 (0.0018)
[2023-10-01 10:51:55,437][119042] Updated weights for policy 1, policy_version 14720 (0.0015)
[2023-10-01 10:51:57,051][117973] Fps is (10 sec: 6554.0, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7544832. Throughput: 0: 741.1, 1: 742.3. Samples: 1882847. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:51:57,052][117973] Avg episode reward: [(0, '61.820'), (1, '44.960')]
[2023-10-01 10:51:57,052][118645] Saving new best policy, reward=61.820!
[2023-10-01 10:52:02,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7569408. Throughput: 0: 747.0, 1: 748.7. Samples: 1892179. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:02,053][117973] Avg episode reward: [(0, '59.910'), (1, '45.500')]
[2023-10-01 10:52:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7602176. Throughput: 0: 742.0, 1: 741.7. Samples: 1900772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:07,053][117973] Avg episode reward: [(0, '59.820'), (1, '43.200')]
[2023-10-01 10:52:09,156][119041] Updated weights for policy 0, policy_version 14880 (0.0018)
[2023-10-01 10:52:09,157][119042] Updated weights for policy 1, policy_version 14880 (0.0015)
[2023-10-01 10:52:12,052][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5956.6). Total num frames: 7634944. Throughput: 0: 743.8, 1: 745.2. Samples: 1905279. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:12,053][117973] Avg episode reward: [(0, '60.080'), (1, '43.050')]
[2023-10-01 10:52:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 7659520. Throughput: 0: 752.0, 1: 752.6. Samples: 1914769. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:17,053][117973] Avg episode reward: [(0, '61.310'), (1, '43.500')]
[2023-10-01 10:52:22,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 5942.7). Total num frames: 7692288. Throughput: 0: 753.6, 1: 753.8. Samples: 1923591. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:22,053][117973] Avg episode reward: [(0, '61.800'), (1, '40.980')]
[2023-10-01 10:52:22,535][119042] Updated weights for policy 1, policy_version 15040 (0.0017)
[2023-10-01 10:52:22,536][119041] Updated weights for policy 0, policy_version 15040 (0.0018)
[2023-10-01 10:52:27,052][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7725056. Throughput: 0: 753.2, 1: 752.2. Samples: 1928138. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:52:27,053][117973] Avg episode reward: [(0, '61.410'), (1, '41.260')]
[2023-10-01 10:52:32,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5970.4). Total num frames: 7757824. Throughput: 0: 753.8, 1: 752.0. Samples: 1937408. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 10:52:32,053][117973] Avg episode reward: [(0, '59.700'), (1, '40.150')]
[2023-10-01 10:52:32,062][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000015152_3878912.pth...
[2023-10-01 10:52:32,062][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000015152_3878912.pth...
[2023-10-01 10:52:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000012352_3162112.pth
[2023-10-01 10:52:32,100][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000012352_3162112.pth
[2023-10-01 10:52:36,141][119042] Updated weights for policy 1, policy_version 15200 (0.0016)
[2023-10-01 10:52:36,141][119041] Updated weights for policy 0, policy_version 15200 (0.0015)
[2023-10-01 10:52:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7782400. Throughput: 0: 755.1, 1: 755.6. Samples: 1946238. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:52:37,052][117973] Avg episode reward: [(0, '57.690'), (1, '40.880')]
[2023-10-01 10:52:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7815168. Throughput: 0: 755.3, 1: 755.5. Samples: 1950834. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:52:42,053][117973] Avg episode reward: [(0, '60.480'), (1, '41.960')]
[2023-10-01 10:52:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7839744. Throughput: 0: 747.7, 1: 748.4. Samples: 1959505. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:47,053][117973] Avg episode reward: [(0, '59.920'), (1, '42.410')]
[2023-10-01 10:52:50,204][119042] Updated weights for policy 1, policy_version 15360 (0.0018)
[2023-10-01 10:52:50,204][119041] Updated weights for policy 0, policy_version 15360 (0.0018)
[2023-10-01 10:52:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7872512. Throughput: 0: 749.7, 1: 747.3. Samples: 1968140. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:52,052][117973] Avg episode reward: [(0, '60.570'), (1, '43.280')]
[2023-10-01 10:52:57,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 7905280. Throughput: 0: 749.0, 1: 749.0. Samples: 1972686. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:52:57,053][117973] Avg episode reward: [(0, '61.360'), (1, '42.550')]
[2023-10-01 10:53:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 7929856. Throughput: 0: 743.3, 1: 743.2. Samples: 1981665. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:53:02,053][117973] Avg episode reward: [(0, '61.110'), (1, '42.470')]
[2023-10-01 10:53:03,906][119041] Updated weights for policy 0, policy_version 15520 (0.0015)
[2023-10-01 10:53:03,907][119042] Updated weights for policy 1, policy_version 15520 (0.0017)
[2023-10-01 10:53:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 7962624. Throughput: 0: 746.6, 1: 743.7. Samples: 1990654. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:53:07,052][117973] Avg episode reward: [(0, '62.480'), (1, '44.210')]
[2023-10-01 10:53:07,053][118645] Saving new best policy, reward=62.480!
[2023-10-01 10:53:12,052][117973] Fps is (10 sec: 6144.1, 60 sec: 5939.2, 300 sec: 5956.6). Total num frames: 7991296. Throughput: 0: 741.5, 1: 741.9. Samples: 1994892. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:53:12,053][117973] Avg episode reward: [(0, '63.700'), (1, '42.890')]
[2023-10-01 10:53:12,054][118645] Saving new best policy, reward=63.700!
[2023-10-01 10:53:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 8019968. Throughput: 0: 740.7, 1: 743.9. Samples: 2004216. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:53:17,052][117973] Avg episode reward: [(0, '65.750'), (1, '41.720')]
[2023-10-01 10:53:17,064][118645] Saving new best policy, reward=65.750!
[2023-10-01 10:53:17,525][119041] Updated weights for policy 0, policy_version 15680 (0.0019)
[2023-10-01 10:53:17,526][119042] Updated weights for policy 1, policy_version 15680 (0.0016)
[2023-10-01 10:53:22,052][117973] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8052736. Throughput: 0: 744.7, 1: 742.2. Samples: 2013149. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:53:22,052][117973] Avg episode reward: [(0, '68.190'), (1, '41.970')]
[2023-10-01 10:53:22,053][118645] Saving new best policy, reward=68.190!
[2023-10-01 10:53:27,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8077312. Throughput: 0: 739.9, 1: 738.1. Samples: 2017341. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:53:27,052][117973] Avg episode reward: [(0, '67.490'), (1, '42.750')]
[2023-10-01 10:53:31,418][119041] Updated weights for policy 0, policy_version 15840 (0.0018)
[2023-10-01 10:53:31,419][119042] Updated weights for policy 1, policy_version 15840 (0.0018)
[2023-10-01 10:53:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8110080. Throughput: 0: 743.9, 1: 744.0. Samples: 2026457. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 10:53:32,053][117973] Avg episode reward: [(0, '65.440'), (1, '43.680')]
[2023-10-01 10:53:37,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8142848. Throughput: 0: 746.3, 1: 749.3. Samples: 2035440. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:53:37,052][117973] Avg episode reward: [(0, '65.450'), (1, '44.570')]
[2023-10-01 10:53:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8167424. Throughput: 0: 747.3, 1: 744.4. Samples: 2039810. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:53:42,053][117973] Avg episode reward: [(0, '64.840'), (1, '43.550')]
[2023-10-01 10:53:45,086][119041] Updated weights for policy 0, policy_version 16000 (0.0017)
[2023-10-01 10:53:45,088][119042] Updated weights for policy 1, policy_version 16000 (0.0017)
[2023-10-01 10:53:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8200192. Throughput: 0: 745.7, 1: 745.6. Samples: 2048771. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 10:53:47,052][117973] Avg episode reward: [(0, '63.820'), (1, '43.970')]
[2023-10-01 10:53:52,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8232960. Throughput: 0: 746.8, 1: 748.2. Samples: 2057931. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:53:52,052][117973] Avg episode reward: [(0, '63.880'), (1, '42.260')]
[2023-10-01 10:53:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5970.4). Total num frames: 8257536. Throughput: 0: 750.7, 1: 748.0. Samples: 2062336. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:53:57,052][117973] Avg episode reward: [(0, '61.530'), (1, '42.910')]
[2023-10-01 10:53:58,801][119041] Updated weights for policy 0, policy_version 16160 (0.0019)
[2023-10-01 10:53:58,801][119042] Updated weights for policy 1, policy_version 16160 (0.0017)
[2023-10-01 10:54:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8290304. Throughput: 0: 743.6, 1: 742.0. Samples: 2071069. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 10:54:02,052][117973] Avg episode reward: [(0, '62.500'), (1, '42.990')]
[2023-10-01 10:54:07,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8323072. Throughput: 0: 745.4, 1: 746.9. Samples: 2080303. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 10:54:07,053][117973] Avg episode reward: [(0, '62.460'), (1, '45.190')]
[2023-10-01 10:54:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5970.4). Total num frames: 8347648. Throughput: 0: 749.5, 1: 749.6. Samples: 2084802. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 10:54:12,053][117973] Avg episode reward: [(0, '63.930'), (1, '47.340')]
[2023-10-01 10:54:12,514][119041] Updated weights for policy 0, policy_version 16320 (0.0019)
[2023-10-01 10:54:12,514][119042] Updated weights for policy 1, policy_version 16320 (0.0017)
[2023-10-01 10:54:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8380416. Throughput: 0: 747.9, 1: 746.3. Samples: 2093694. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:54:17,052][117973] Avg episode reward: [(0, '64.020'), (1, '44.680')]
[2023-10-01 10:54:22,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8404992. Throughput: 0: 746.4, 1: 746.3. Samples: 2102611. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:54:22,053][117973] Avg episode reward: [(0, '63.570'), (1, '45.510')]
[2023-10-01 10:54:26,309][119041] Updated weights for policy 0, policy_version 16480 (0.0018)
[2023-10-01 10:54:26,309][119042] Updated weights for policy 1, policy_version 16480 (0.0018)
[2023-10-01 10:54:27,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8437760. Throughput: 0: 745.6, 1: 749.7. Samples: 2107100. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:54:27,052][117973] Avg episode reward: [(0, '67.140'), (1, '43.570')]
[2023-10-01 10:54:32,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8470528. Throughput: 0: 744.4, 1: 744.6. Samples: 2115778. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:54:32,053][117973] Avg episode reward: [(0, '63.590'), (1, '43.820')]
[2023-10-01 10:54:32,066][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000016544_4235264.pth...
[2023-10-01 10:54:32,066][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000016544_4235264.pth...
[2023-10-01 10:54:32,099][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000013744_3518464.pth
[2023-10-01 10:54:32,102][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000013744_3518464.pth
[2023-10-01 10:54:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 8495104. Throughput: 0: 738.5, 1: 741.6. Samples: 2124535. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:54:37,052][117973] Avg episode reward: [(0, '63.590'), (1, '44.030')]
[2023-10-01 10:54:40,266][119041] Updated weights for policy 0, policy_version 16640 (0.0017)
[2023-10-01 10:54:40,266][119042] Updated weights for policy 1, policy_version 16640 (0.0018)
[2023-10-01 10:54:42,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8527872. Throughput: 0: 738.0, 1: 740.8. Samples: 2128879. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:54:42,052][117973] Avg episode reward: [(0, '63.840'), (1, '43.040')]
[2023-10-01 10:54:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8552448. Throughput: 0: 745.7, 1: 744.1. Samples: 2138113. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 10:54:47,053][117973] Avg episode reward: [(0, '67.190'), (1, '42.800')]
[2023-10-01 10:54:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 8585216. Throughput: 0: 740.1, 1: 740.6. Samples: 2146938. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:54:52,052][117973] Avg episode reward: [(0, '66.660'), (1, '42.540')]
[2023-10-01 10:54:53,843][119042] Updated weights for policy 1, policy_version 16800 (0.0019)
[2023-10-01 10:54:53,843][119041] Updated weights for policy 0, policy_version 16800 (0.0017)
[2023-10-01 10:54:57,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 8617984. Throughput: 0: 741.0, 1: 742.6. Samples: 2151567. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:54:57,053][117973] Avg episode reward: [(0, '64.010'), (1, '42.940')]
[2023-10-01 10:55:02,052][117973] Fps is (10 sec: 6553.4, 60 sec: 6007.4, 300 sec: 5970.4). Total num frames: 8650752. Throughput: 0: 745.4, 1: 742.3. Samples: 2160640. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:55:02,053][117973] Avg episode reward: [(0, '65.280'), (1, '44.320')]
[2023-10-01 10:55:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 8675328. Throughput: 0: 742.6, 1: 742.6. Samples: 2169445. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:55:07,053][117973] Avg episode reward: [(0, '64.430'), (1, '44.310')]
[2023-10-01 10:55:07,494][119042] Updated weights for policy 1, policy_version 16960 (0.0017)
[2023-10-01 10:55:07,495][119041] Updated weights for policy 0, policy_version 16960 (0.0017)
[2023-10-01 10:55:12,052][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8708096. Throughput: 0: 743.2, 1: 742.5. Samples: 2173954. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:55:12,053][117973] Avg episode reward: [(0, '66.330'), (1, '44.310')]
[2023-10-01 10:55:17,052][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8732672. Throughput: 0: 745.4, 1: 745.0. Samples: 2182849. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:55:17,053][117973] Avg episode reward: [(0, '62.130'), (1, '44.680')]
[2023-10-01 10:55:21,367][119041] Updated weights for policy 0, policy_version 17120 (0.0017)
[2023-10-01 10:55:21,368][119042] Updated weights for policy 1, policy_version 17120 (0.0015)
[2023-10-01 10:55:22,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8765440. Throughput: 0: 746.1, 1: 744.2. Samples: 2191602. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:55:22,052][117973] Avg episode reward: [(0, '61.620'), (1, '42.590')]
[2023-10-01 10:55:27,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8798208. Throughput: 0: 747.8, 1: 747.8. Samples: 2196179. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:55:27,053][117973] Avg episode reward: [(0, '60.800'), (1, '42.460')]
[2023-10-01 10:55:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5871.0, 300 sec: 5942.7). Total num frames: 8822784. Throughput: 0: 743.7, 1: 745.3. Samples: 2205120. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:55:32,052][117973] Avg episode reward: [(0, '59.910'), (1, '42.810')]
[2023-10-01 10:55:35,274][119041] Updated weights for policy 0, policy_version 17280 (0.0015)
[2023-10-01 10:55:35,274][119042] Updated weights for policy 1, policy_version 17280 (0.0015)
[2023-10-01 10:55:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8855552. Throughput: 0: 745.3, 1: 742.6. Samples: 2213892. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:55:37,053][117973] Avg episode reward: [(0, '61.280'), (1, '41.620')]
[2023-10-01 10:55:42,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8888320. Throughput: 0: 743.4, 1: 742.6. Samples: 2218441. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:55:42,052][117973] Avg episode reward: [(0, '61.280'), (1, '42.020')]
[2023-10-01 10:55:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 8912896. Throughput: 0: 744.6, 1: 747.0. Samples: 2227759. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 10:55:47,052][117973] Avg episode reward: [(0, '64.320'), (1, '42.220')]
[2023-10-01 10:55:48,692][119042] Updated weights for policy 1, policy_version 17440 (0.0017)
[2023-10-01 10:55:48,692][119041] Updated weights for policy 0, policy_version 17440 (0.0014)
[2023-10-01 10:55:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 8945664. Throughput: 0: 745.7, 1: 743.5. Samples: 2236459. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:55:52,053][117973] Avg episode reward: [(0, '66.310'), (1, '42.250')]
[2023-10-01 10:55:57,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 8970240. Throughput: 0: 741.6, 1: 740.5. Samples: 2240650. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:55:57,053][117973] Avg episode reward: [(0, '68.050'), (1, '41.760')]
[2023-10-01 10:56:02,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5970.4). Total num frames: 9003008. Throughput: 0: 738.9, 1: 740.0. Samples: 2249403. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:02,053][117973] Avg episode reward: [(0, '66.930'), (1, '43.120')]
[2023-10-01 10:56:02,895][119041] Updated weights for policy 0, policy_version 17600 (0.0017)
[2023-10-01 10:56:02,895][119042] Updated weights for policy 1, policy_version 17600 (0.0018)
[2023-10-01 10:56:07,052][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9027584. Throughput: 0: 740.5, 1: 739.2. Samples: 2258186. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:56:07,053][117973] Avg episode reward: [(0, '65.530'), (1, '42.120')]
[2023-10-01 10:56:12,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9060352. Throughput: 0: 738.4, 1: 737.9. Samples: 2262611. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:56:12,052][117973] Avg episode reward: [(0, '63.760'), (1, '42.460')]
[2023-10-01 10:56:16,975][119041] Updated weights for policy 0, policy_version 17760 (0.0017)
[2023-10-01 10:56:16,976][119042] Updated weights for policy 1, policy_version 17760 (0.0016)
[2023-10-01 10:56:17,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5970.4). Total num frames: 9093120. Throughput: 0: 735.4, 1: 733.9. Samples: 2271237. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:56:17,052][117973] Avg episode reward: [(0, '60.330'), (1, '41.980')]
[2023-10-01 10:56:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9117696. Throughput: 0: 728.8, 1: 731.2. Samples: 2279589. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:56:22,052][117973] Avg episode reward: [(0, '59.340'), (1, '42.420')]
[2023-10-01 10:56:27,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5734.4, 300 sec: 5942.7). Total num frames: 9142272. Throughput: 0: 725.4, 1: 726.3. Samples: 2283770. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:56:27,052][117973] Avg episode reward: [(0, '58.810'), (1, '43.730')]
[2023-10-01 10:56:31,260][119042] Updated weights for policy 1, policy_version 17920 (0.0017)
[2023-10-01 10:56:31,261][119041] Updated weights for policy 0, policy_version 17920 (0.0018)
[2023-10-01 10:56:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9175040. Throughput: 0: 723.2, 1: 723.4. Samples: 2292855. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:32,052][117973] Avg episode reward: [(0, '56.510'), (1, '44.580')]
[2023-10-01 10:56:32,059][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000017920_4587520.pth...
[2023-10-01 10:56:32,059][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000017920_4587520.pth...
[2023-10-01 10:56:32,097][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000015152_3878912.pth
[2023-10-01 10:56:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000015152_3878912.pth
[2023-10-01 10:56:37,052][117973] Fps is (10 sec: 6553.5, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9207808. Throughput: 0: 723.6, 1: 725.0. Samples: 2301646. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:37,053][117973] Avg episode reward: [(0, '56.850'), (1, '45.530')]
[2023-10-01 10:56:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5942.7). Total num frames: 9232384. Throughput: 0: 727.7, 1: 725.5. Samples: 2306043. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:42,052][117973] Avg episode reward: [(0, '55.910'), (1, '44.030')]
[2023-10-01 10:56:45,337][119041] Updated weights for policy 0, policy_version 18080 (0.0017)
[2023-10-01 10:56:45,337][119042] Updated weights for policy 1, policy_version 18080 (0.0015)
[2023-10-01 10:56:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9265152. Throughput: 0: 724.0, 1: 723.1. Samples: 2314520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:47,053][117973] Avg episode reward: [(0, '57.780'), (1, '40.820')]
[2023-10-01 10:56:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5914.9). Total num frames: 9289728. Throughput: 0: 726.5, 1: 729.1. Samples: 2323686. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:52,053][117973] Avg episode reward: [(0, '57.810'), (1, '41.430')]
[2023-10-01 10:56:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9322496. Throughput: 0: 728.0, 1: 728.7. Samples: 2328163. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:56:57,053][117973] Avg episode reward: [(0, '59.330'), (1, '42.130')]
[2023-10-01 10:56:59,189][119042] Updated weights for policy 1, policy_version 18240 (0.0017)
[2023-10-01 10:56:59,189][119041] Updated weights for policy 0, policy_version 18240 (0.0018)
[2023-10-01 10:57:02,052][117973] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9355264. Throughput: 0: 728.2, 1: 728.6. Samples: 2336790. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:57:02,053][117973] Avg episode reward: [(0, '59.590'), (1, '42.250')]
[2023-10-01 10:57:07,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9379840. Throughput: 0: 734.6, 1: 735.6. Samples: 2345752. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:57:07,052][117973] Avg episode reward: [(0, '59.860'), (1, '42.250')]
[2023-10-01 10:57:12,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5942.7). Total num frames: 9412608. Throughput: 0: 741.1, 1: 740.8. Samples: 2350455. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:57:12,052][117973] Avg episode reward: [(0, '59.350'), (1, '41.410')]
[2023-10-01 10:57:12,961][119041] Updated weights for policy 0, policy_version 18400 (0.0018)
[2023-10-01 10:57:12,961][119042] Updated weights for policy 1, policy_version 18400 (0.0017)
[2023-10-01 10:57:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5914.9). Total num frames: 9437184. Throughput: 0: 738.0, 1: 736.9. Samples: 2359227. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:57:17,052][117973] Avg episode reward: [(0, '59.120'), (1, '40.190')]
[2023-10-01 10:57:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9469952. Throughput: 0: 737.1, 1: 737.2. Samples: 2367990. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:57:22,052][117973] Avg episode reward: [(0, '59.780'), (1, '40.080')]
[2023-10-01 10:57:26,704][119042] Updated weights for policy 1, policy_version 18560 (0.0017)
[2023-10-01 10:57:26,704][119041] Updated weights for policy 0, policy_version 18560 (0.0017)
[2023-10-01 10:57:27,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 9502720. Throughput: 0: 738.4, 1: 741.8. Samples: 2372652. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:57:27,052][117973] Avg episode reward: [(0, '58.420'), (1, '42.180')]
[2023-10-01 10:57:32,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9527296. Throughput: 0: 748.5, 1: 746.7. Samples: 2381803. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:57:32,053][117973] Avg episode reward: [(0, '60.470'), (1, '42.600')]
[2023-10-01 10:57:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9560064. Throughput: 0: 742.0, 1: 740.9. Samples: 2390415. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 10:57:37,053][117973] Avg episode reward: [(0, '61.240'), (1, '42.600')]
[2023-10-01 10:57:40,207][119041] Updated weights for policy 0, policy_version 18720 (0.0018)
[2023-10-01 10:57:40,207][119042] Updated weights for policy 1, policy_version 18720 (0.0018)
[2023-10-01 10:57:42,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 9592832. Throughput: 0: 745.4, 1: 745.2. Samples: 2395239. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:57:42,052][117973] Avg episode reward: [(0, '62.760'), (1, '41.620')]
[2023-10-01 10:57:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9617408. Throughput: 0: 750.3, 1: 750.4. Samples: 2404324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:57:47,053][117973] Avg episode reward: [(0, '65.530'), (1, '43.150')]
[2023-10-01 10:57:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 9650176. Throughput: 0: 745.4, 1: 744.9. Samples: 2412813. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:57:52,052][117973] Avg episode reward: [(0, '63.860'), (1, '42.100')]
[2023-10-01 10:57:54,087][119041] Updated weights for policy 0, policy_version 18880 (0.0020)
[2023-10-01 10:57:54,087][119042] Updated weights for policy 1, policy_version 18880 (0.0019)
[2023-10-01 10:57:57,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 9682944. Throughput: 0: 743.7, 1: 743.2. Samples: 2417369. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:57:57,053][117973] Avg episode reward: [(0, '61.490'), (1, '44.570')]
[2023-10-01 10:58:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 9707520. Throughput: 0: 745.1, 1: 746.3. Samples: 2426337. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:58:02,052][117973] Avg episode reward: [(0, '61.990'), (1, '43.460')]
[2023-10-01 10:58:07,051][117973] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5928.8). Total num frames: 9740288. Throughput: 0: 746.6, 1: 744.5. Samples: 2435090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:58:07,052][117973] Avg episode reward: [(0, '62.880'), (1, '42.490')]
[2023-10-01 10:58:07,906][119041] Updated weights for policy 0, policy_version 19040 (0.0018)
[2023-10-01 10:58:07,906][119042] Updated weights for policy 1, policy_version 19040 (0.0017)
[2023-10-01 10:58:12,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 9773056. Throughput: 0: 744.5, 1: 743.6. Samples: 2439614. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:58:12,052][117973] Avg episode reward: [(0, '62.180'), (1, '42.650')]
[2023-10-01 10:58:17,051][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 9797632. Throughput: 0: 740.4, 1: 742.0. Samples: 2448507. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 10:58:17,052][117973] Avg episode reward: [(0, '62.780'), (1, '44.900')]
[2023-10-01 10:58:21,992][119041] Updated weights for policy 0, policy_version 19200 (0.0014)
[2023-10-01 10:58:21,993][119042] Updated weights for policy 1, policy_version 19200 (0.0015)
[2023-10-01 10:58:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 9830400. Throughput: 0: 741.2, 1: 740.4. Samples: 2457089. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:58:22,052][117973] Avg episode reward: [(0, '63.730'), (1, '46.390')]
[2023-10-01 10:58:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9854976. Throughput: 0: 738.9, 1: 737.1. Samples: 2461657. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:58:27,052][117973] Avg episode reward: [(0, '66.420'), (1, '46.050')]
[2023-10-01 10:58:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 9887744. Throughput: 0: 732.2, 1: 733.9. Samples: 2470298. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 10:58:32,053][117973] Avg episode reward: [(0, '67.480'), (1, '45.540')]
[2023-10-01 10:58:32,063][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000019312_4943872.pth...
[2023-10-01 10:58:32,063][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000019312_4943872.pth...
[2023-10-01 10:58:32,095][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000016544_4235264.pth
[2023-10-01 10:58:32,098][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000016544_4235264.pth
[2023-10-01 10:58:35,811][119041] Updated weights for policy 0, policy_version 19360 (0.0018)
[2023-10-01 10:58:35,811][119042] Updated weights for policy 1, policy_version 19360 (0.0017)
[2023-10-01 10:58:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9912320. Throughput: 0: 739.8, 1: 739.4. Samples: 2479380. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:58:37,053][117973] Avg episode reward: [(0, '64.010'), (1, '44.030')]
[2023-10-01 10:58:42,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 9945088. Throughput: 0: 738.1, 1: 736.6. Samples: 2483732. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:58:42,052][117973] Avg episode reward: [(0, '66.270'), (1, '44.590')]
[2023-10-01 10:58:47,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 9977856. Throughput: 0: 735.7, 1: 734.6. Samples: 2492499. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:58:47,053][117973] Avg episode reward: [(0, '67.440'), (1, '45.160')]
[2023-10-01 10:58:49,771][119042] Updated weights for policy 1, policy_version 19520 (0.0017)
[2023-10-01 10:58:49,771][119041] Updated weights for policy 0, policy_version 19520 (0.0018)
[2023-10-01 10:58:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 10002432. Throughput: 0: 731.0, 1: 733.3. Samples: 2500984. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:58:52,053][117973] Avg episode reward: [(0, '65.790'), (1, '45.530')]
[2023-10-01 10:58:57,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 10035200. Throughput: 0: 734.2, 1: 733.4. Samples: 2505653. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 10:58:57,052][117973] Avg episode reward: [(0, '65.910'), (1, '49.480')]
[2023-10-01 10:59:02,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 10059776. Throughput: 0: 732.2, 1: 732.5. Samples: 2514417. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:59:02,052][117973] Avg episode reward: [(0, '68.410'), (1, '49.900')]
[2023-10-01 10:59:02,060][118645] Saving new best policy, reward=68.410!
[2023-10-01 10:59:03,766][119041] Updated weights for policy 0, policy_version 19680 (0.0017)
[2023-10-01 10:59:03,766][119042] Updated weights for policy 1, policy_version 19680 (0.0019)
[2023-10-01 10:59:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 10092544. Throughput: 0: 735.9, 1: 736.6. Samples: 2523353. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:59:07,053][117973] Avg episode reward: [(0, '67.610'), (1, '52.400')]
[2023-10-01 10:59:07,054][118715] Saving new best policy, reward=52.400!
[2023-10-01 10:59:12,051][117973] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 10125312. Throughput: 0: 735.0, 1: 738.3. Samples: 2527959. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 10:59:12,052][117973] Avg episode reward: [(0, '67.610'), (1, '53.050')]
[2023-10-01 10:59:12,053][118715] Saving new best policy, reward=53.050!
[2023-10-01 10:59:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 10149888. Throughput: 0: 737.7, 1: 738.1. Samples: 2536710. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:59:17,052][117973] Avg episode reward: [(0, '67.320'), (1, '53.190')]
[2023-10-01 10:59:17,060][118715] Saving new best policy, reward=53.190!
[2023-10-01 10:59:17,539][119042] Updated weights for policy 1, policy_version 19840 (0.0017)
[2023-10-01 10:59:17,539][119041] Updated weights for policy 0, policy_version 19840 (0.0018)
[2023-10-01 10:59:22,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 10182656. Throughput: 0: 737.8, 1: 735.2. Samples: 2545664. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:59:22,053][117973] Avg episode reward: [(0, '68.280'), (1, '54.600')]
[2023-10-01 10:59:22,054][118715] Saving new best policy, reward=54.600!
[2023-10-01 10:59:27,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 10207232. Throughput: 0: 734.6, 1: 736.5. Samples: 2549932. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 10:59:27,053][117973] Avg episode reward: [(0, '67.130'), (1, '54.580')]
[2023-10-01 10:59:31,322][119041] Updated weights for policy 0, policy_version 20000 (0.0018)
[2023-10-01 10:59:31,322][119042] Updated weights for policy 1, policy_version 20000 (0.0019)
[2023-10-01 10:59:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 10240000. Throughput: 0: 738.2, 1: 740.0. Samples: 2559021. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:59:32,052][117973] Avg episode reward: [(0, '67.720'), (1, '58.200')]
[2023-10-01 10:59:32,061][118715] Saving new best policy, reward=58.200!
[2023-10-01 10:59:37,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 10272768. Throughput: 0: 748.1, 1: 745.4. Samples: 2568190. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:59:37,053][117973] Avg episode reward: [(0, '67.870'), (1, '57.130')]
[2023-10-01 10:59:42,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5942.7). Total num frames: 10305536. Throughput: 0: 743.8, 1: 744.5. Samples: 2572628. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 10:59:42,052][117973] Avg episode reward: [(0, '68.100'), (1, '58.680')]
[2023-10-01 10:59:42,053][118715] Saving new best policy, reward=58.680!
[2023-10-01 10:59:44,772][119042] Updated weights for policy 1, policy_version 20160 (0.0019)
[2023-10-01 10:59:44,773][119041] Updated weights for policy 0, policy_version 20160 (0.0018)
[2023-10-01 10:59:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 10330112. Throughput: 0: 749.4, 1: 750.0. Samples: 2581891. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:59:47,053][117973] Avg episode reward: [(0, '68.610'), (1, '56.760')]
[2023-10-01 10:59:47,063][118645] Saving new best policy, reward=68.610!
[2023-10-01 10:59:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 10362880. Throughput: 0: 745.0, 1: 744.3. Samples: 2590372. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:59:52,052][117973] Avg episode reward: [(0, '65.630'), (1, '58.960')]
[2023-10-01 10:59:52,052][118715] Saving new best policy, reward=58.960!
[2023-10-01 10:59:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 10387456. Throughput: 0: 739.8, 1: 738.2. Samples: 2594471. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 10:59:57,053][117973] Avg episode reward: [(0, '65.420'), (1, '59.800')]
[2023-10-01 10:59:57,054][118715] Saving new best policy, reward=59.800!
[2023-10-01 10:59:59,429][119042] Updated weights for policy 1, policy_version 20320 (0.0013)
[2023-10-01 10:59:59,430][119041] Updated weights for policy 0, policy_version 20320 (0.0012)
[2023-10-01 11:00:02,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 10412032. Throughput: 0: 734.4, 1: 733.3. Samples: 2602757. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:02,052][117973] Avg episode reward: [(0, '65.500'), (1, '58.090')]
[2023-10-01 11:00:07,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5871.0, 300 sec: 5887.1). Total num frames: 10444800. Throughput: 0: 722.0, 1: 725.6. Samples: 2610804. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:07,052][117973] Avg episode reward: [(0, '64.900'), (1, '59.780')]
[2023-10-01 11:00:12,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5887.1). Total num frames: 10469376. Throughput: 0: 720.1, 1: 719.3. Samples: 2614708. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:12,052][117973] Avg episode reward: [(0, '64.530'), (1, '60.420')]
[2023-10-01 11:00:12,053][118715] Saving new best policy, reward=60.420!
[2023-10-01 11:00:14,571][119041] Updated weights for policy 0, policy_version 20480 (0.0013)
[2023-10-01 11:00:14,572][119042] Updated weights for policy 1, policy_version 20480 (0.0012)
[2023-10-01 11:00:17,051][117973] Fps is (10 sec: 4915.1, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 10493952. Throughput: 0: 711.9, 1: 711.2. Samples: 2623064. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:17,052][117973] Avg episode reward: [(0, '64.580'), (1, '61.030')]
[2023-10-01 11:00:17,061][118715] Saving new best policy, reward=61.030!
[2023-10-01 11:00:22,052][117973] Fps is (10 sec: 5324.7, 60 sec: 5666.1, 300 sec: 5845.5). Total num frames: 10522624. Throughput: 0: 698.7, 1: 700.6. Samples: 2631161. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:22,053][117973] Avg episode reward: [(0, '66.280'), (1, '60.720')]
[2023-10-01 11:00:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 10551296. Throughput: 0: 696.0, 1: 695.1. Samples: 2635227. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:27,052][117973] Avg episode reward: [(0, '67.910'), (1, '59.460')]
[2023-10-01 11:00:29,860][119042] Updated weights for policy 1, policy_version 20640 (0.0012)
[2023-10-01 11:00:29,860][119041] Updated weights for policy 0, policy_version 20640 (0.0013)
[2023-10-01 11:00:32,051][117973] Fps is (10 sec: 5324.9, 60 sec: 5597.9, 300 sec: 5831.6). Total num frames: 10575872. Throughput: 0: 678.3, 1: 677.0. Samples: 2642881. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:00:32,052][117973] Avg episode reward: [(0, '66.260'), (1, '62.120')]
[2023-10-01 11:00:32,060][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000020656_5287936.pth...
[2023-10-01 11:00:32,060][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000020656_5287936.pth...
[2023-10-01 11:00:32,096][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000017920_4587520.pth
[2023-10-01 11:00:32,101][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000017920_4587520.pth
[2023-10-01 11:00:32,106][118715] Saving new best policy, reward=62.120!
[2023-10-01 11:00:37,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 10600448. Throughput: 0: 665.7, 1: 665.7. Samples: 2650285. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:00:37,052][117973] Avg episode reward: [(0, '68.990'), (1, '62.150')]
[2023-10-01 11:00:37,052][118645] Saving new best policy, reward=68.990!
[2023-10-01 11:00:37,053][118715] Saving new best policy, reward=62.150!
[2023-10-01 11:00:42,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5324.8, 300 sec: 5803.8). Total num frames: 10625024. Throughput: 0: 665.0, 1: 662.4. Samples: 2654208. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:00:42,053][117973] Avg episode reward: [(0, '69.550'), (1, '61.510')]
[2023-10-01 11:00:42,054][118645] Saving new best policy, reward=69.550!
[2023-10-01 11:00:45,380][119041] Updated weights for policy 0, policy_version 20800 (0.0014)
[2023-10-01 11:00:45,380][119042] Updated weights for policy 1, policy_version 20800 (0.0014)
[2023-10-01 11:00:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 10657792. Throughput: 0: 664.5, 1: 665.4. Samples: 2662602. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:47,053][117973] Avg episode reward: [(0, '69.450'), (1, '63.690')]
[2023-10-01 11:00:47,065][118715] Saving new best policy, reward=63.690!
[2023-10-01 11:00:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5324.8, 300 sec: 5803.8). Total num frames: 10682368. Throughput: 0: 675.2, 1: 672.8. Samples: 2671467. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:52,052][117973] Avg episode reward: [(0, '68.750'), (1, '64.240')]
[2023-10-01 11:00:52,053][118715] Saving new best policy, reward=64.240!
[2023-10-01 11:00:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 10715136. Throughput: 0: 678.8, 1: 679.8. Samples: 2675847. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:00:57,053][117973] Avg episode reward: [(0, '68.940'), (1, '66.190')]
[2023-10-01 11:00:57,054][118715] Saving new best policy, reward=66.190!
[2023-10-01 11:00:59,187][119041] Updated weights for policy 0, policy_version 20960 (0.0018)
[2023-10-01 11:00:59,188][119042] Updated weights for policy 1, policy_version 20960 (0.0016)
[2023-10-01 11:01:02,051][117973] Fps is (10 sec: 6553.6, 60 sec: 5597.9, 300 sec: 5831.6). Total num frames: 10747904. Throughput: 0: 688.8, 1: 686.1. Samples: 2684932. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:01:02,052][117973] Avg episode reward: [(0, '68.900'), (1, '68.580')]
[2023-10-01 11:01:02,060][118715] Saving new best policy, reward=68.580!
[2023-10-01 11:01:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 10772480. Throughput: 0: 697.4, 1: 698.0. Samples: 2693951. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:01:07,053][117973] Avg episode reward: [(0, '70.740'), (1, '70.800')]
[2023-10-01 11:01:07,054][118645] Saving new best policy, reward=70.740!
[2023-10-01 11:01:07,054][118715] Saving new best policy, reward=70.800!
[2023-10-01 11:01:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 10805248. Throughput: 0: 701.8, 1: 701.8. Samples: 2698389. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:01:12,053][117973] Avg episode reward: [(0, '70.200'), (1, '69.580')]
[2023-10-01 11:01:12,987][119042] Updated weights for policy 1, policy_version 21120 (0.0015)
[2023-10-01 11:01:12,988][119041] Updated weights for policy 0, policy_version 21120 (0.0018)
[2023-10-01 11:01:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 10829824. Throughput: 0: 718.1, 1: 716.6. Samples: 2707445. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:01:17,053][117973] Avg episode reward: [(0, '71.360'), (1, '69.430')]
[2023-10-01 11:01:17,166][118645] Saving new best policy, reward=71.360!
[2023-10-01 11:01:22,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5666.2, 300 sec: 5831.6). Total num frames: 10862592. Throughput: 0: 727.5, 1: 728.2. Samples: 2715790. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:01:22,052][117973] Avg episode reward: [(0, '71.740'), (1, '65.210')]
[2023-10-01 11:01:22,052][118645] Saving new best policy, reward=71.740!
[2023-10-01 11:01:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.8, 300 sec: 5803.8). Total num frames: 10887168. Throughput: 0: 728.6, 1: 731.2. Samples: 2719899. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:01:27,053][117973] Avg episode reward: [(0, '73.680'), (1, '65.700')]
[2023-10-01 11:01:27,112][118645] Saving new best policy, reward=73.680!
[2023-10-01 11:01:27,176][119041] Updated weights for policy 0, policy_version 21280 (0.0018)
[2023-10-01 11:01:27,177][119042] Updated weights for policy 1, policy_version 21280 (0.0018)
[2023-10-01 11:01:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5803.8). Total num frames: 10919936. Throughput: 0: 736.9, 1: 736.5. Samples: 2728903. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 11:01:32,052][117973] Avg episode reward: [(0, '76.710'), (1, '62.580')]
[2023-10-01 11:01:32,061][118645] Saving new best policy, reward=76.710!
[2023-10-01 11:01:37,052][117973] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 10952704. Throughput: 0: 738.8, 1: 740.3. Samples: 2738030. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 11:01:37,053][117973] Avg episode reward: [(0, '75.560'), (1, '62.020')]
[2023-10-01 11:01:40,853][119041] Updated weights for policy 0, policy_version 21440 (0.0018)
[2023-10-01 11:01:40,853][119042] Updated weights for policy 1, policy_version 21440 (0.0017)
[2023-10-01 11:01:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 10977280. Throughput: 0: 739.5, 1: 737.9. Samples: 2742331. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-10-01 11:01:42,053][117973] Avg episode reward: [(0, '73.940'), (1, '61.550')]
[2023-10-01 11:01:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 11010048. Throughput: 0: 739.6, 1: 741.2. Samples: 2751566. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:01:47,053][117973] Avg episode reward: [(0, '76.650'), (1, '58.080')]
[2023-10-01 11:01:52,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 11042816. Throughput: 0: 741.0, 1: 740.4. Samples: 2760616. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:01:52,053][117973] Avg episode reward: [(0, '77.510'), (1, '58.850')]
[2023-10-01 11:01:52,053][118645] Saving new best policy, reward=77.510!
[2023-10-01 11:01:54,503][119041] Updated weights for policy 0, policy_version 21600 (0.0015)
[2023-10-01 11:01:54,504][119042] Updated weights for policy 1, policy_version 21600 (0.0017)
[2023-10-01 11:01:57,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 11067392. Throughput: 0: 738.8, 1: 738.3. Samples: 2764856. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:01:57,052][117973] Avg episode reward: [(0, '78.800'), (1, '57.700')]
[2023-10-01 11:01:57,247][118645] Saving new best policy, reward=78.800!
[2023-10-01 11:02:02,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 11100160. Throughput: 0: 736.6, 1: 737.8. Samples: 2773792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:02:02,052][117973] Avg episode reward: [(0, '77.190'), (1, '57.650')]
[2023-10-01 11:02:07,052][117973] Fps is (10 sec: 6143.9, 60 sec: 5939.2, 300 sec: 5817.7). Total num frames: 11128832. Throughput: 0: 742.9, 1: 744.3. Samples: 2782714. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:02:07,053][117973] Avg episode reward: [(0, '77.210'), (1, '59.020')]
[2023-10-01 11:02:08,520][119041] Updated weights for policy 0, policy_version 21760 (0.0018)
[2023-10-01 11:02:08,520][119042] Updated weights for policy 1, policy_version 21760 (0.0016)
[2023-10-01 11:02:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5831.6). Total num frames: 11157504. Throughput: 0: 746.1, 1: 746.4. Samples: 2787061. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:02:12,052][117973] Avg episode reward: [(0, '78.730'), (1, '57.850')]
[2023-10-01 11:02:17,052][117973] Fps is (10 sec: 6144.0, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 11190272. Throughput: 0: 741.3, 1: 739.5. Samples: 2795536. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:02:17,053][117973] Avg episode reward: [(0, '78.280'), (1, '56.130')]
[2023-10-01 11:02:22,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 11214848. Throughput: 0: 735.2, 1: 736.0. Samples: 2804232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:02:22,052][117973] Avg episode reward: [(0, '76.980'), (1, '56.880')]
[2023-10-01 11:02:22,582][119042] Updated weights for policy 1, policy_version 21920 (0.0015)
[2023-10-01 11:02:22,583][119041] Updated weights for policy 0, policy_version 21920 (0.0017)
[2023-10-01 11:02:27,051][117973] Fps is (10 sec: 5734.6, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 11247616. Throughput: 0: 739.2, 1: 740.4. Samples: 2808914. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:02:27,052][117973] Avg episode reward: [(0, '77.840'), (1, '55.140')]
[2023-10-01 11:02:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 11272192. Throughput: 0: 733.1, 1: 733.0. Samples: 2817542. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:02:32,052][117973] Avg episode reward: [(0, '77.460'), (1, '55.840')]
[2023-10-01 11:02:32,061][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000022016_5636096.pth...
[2023-10-01 11:02:32,061][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000022016_5636096.pth...
[2023-10-01 11:02:32,090][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000019312_4943872.pth
[2023-10-01 11:02:32,097][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000019312_4943872.pth
[2023-10-01 11:02:36,573][119042] Updated weights for policy 1, policy_version 22080 (0.0015)
[2023-10-01 11:02:36,574][119041] Updated weights for policy 0, policy_version 22080 (0.0018)
[2023-10-01 11:02:37,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 11304960. Throughput: 0: 730.1, 1: 728.3. Samples: 2826245. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:02:37,052][117973] Avg episode reward: [(0, '77.900'), (1, '55.560')]
[2023-10-01 11:02:42,051][117973] Fps is (10 sec: 5324.8, 60 sec: 5802.7, 300 sec: 5789.9). Total num frames: 11325440. Throughput: 0: 719.3, 1: 720.2. Samples: 2829630. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:02:42,052][117973] Avg episode reward: [(0, '76.510'), (1, '55.770')]
[2023-10-01 11:02:47,051][117973] Fps is (10 sec: 4096.0, 60 sec: 5597.9, 300 sec: 5748.3). Total num frames: 11345920. Throughput: 0: 697.1, 1: 695.8. Samples: 2836474. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:02:47,052][117973] Avg episode reward: [(0, '75.420'), (1, '55.650')]
[2023-10-01 11:02:52,051][117973] Fps is (10 sec: 4505.6, 60 sec: 5461.3, 300 sec: 5720.5). Total num frames: 11370496. Throughput: 0: 671.9, 1: 669.8. Samples: 2843090. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:02:52,052][117973] Avg episode reward: [(0, '74.920'), (1, '54.900')]
[2023-10-01 11:02:54,694][119041] Updated weights for policy 0, policy_version 22240 (0.0011)
[2023-10-01 11:02:54,694][119042] Updated weights for policy 1, policy_version 22240 (0.0012)
[2023-10-01 11:02:57,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5720.5). Total num frames: 11395072. Throughput: 0: 663.8, 1: 661.4. Samples: 2846695. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:02:57,052][117973] Avg episode reward: [(0, '75.760'), (1, '55.820')]
[2023-10-01 11:03:02,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5324.8, 300 sec: 5692.7). Total num frames: 11419648. Throughput: 0: 643.1, 1: 645.3. Samples: 2853516. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:03:02,052][117973] Avg episode reward: [(0, '75.030'), (1, '56.360')]
[2023-10-01 11:03:07,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5256.5, 300 sec: 5665.0). Total num frames: 11444224. Throughput: 0: 633.0, 1: 629.6. Samples: 2861051. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:03:07,052][117973] Avg episode reward: [(0, '74.270'), (1, '56.530')]
[2023-10-01 11:03:12,010][119041] Updated weights for policy 0, policy_version 22400 (0.0012)
[2023-10-01 11:03:12,010][119042] Updated weights for policy 1, policy_version 22400 (0.0012)
[2023-10-01 11:03:12,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5188.3, 300 sec: 5665.0). Total num frames: 11468800. Throughput: 0: 616.9, 1: 616.8. Samples: 2864429. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:03:12,052][117973] Avg episode reward: [(0, '74.960'), (1, '57.890')]
[2023-10-01 11:03:17,052][117973] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 5609.4). Total num frames: 11485184. Throughput: 0: 598.1, 1: 597.0. Samples: 2871320. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:03:17,052][117973] Avg episode reward: [(0, '74.740'), (1, '56.220')]
[2023-10-01 11:03:22,051][117973] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 5609.4). Total num frames: 11509760. Throughput: 0: 577.0, 1: 578.7. Samples: 2878250. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:03:22,052][117973] Avg episode reward: [(0, '72.070'), (1, '53.140')]
[2023-10-01 11:03:27,052][117973] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 5581.7). Total num frames: 11534336. Throughput: 0: 578.0, 1: 578.0. Samples: 2881654. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:03:27,052][117973] Avg episode reward: [(0, '71.550'), (1, '54.130')]
[2023-10-01 11:03:29,621][119041] Updated weights for policy 0, policy_version 22560 (0.0012)
[2023-10-01 11:03:29,621][119042] Updated weights for policy 1, policy_version 22560 (0.0010)
[2023-10-01 11:03:32,052][117973] Fps is (10 sec: 4915.0, 60 sec: 4778.6, 300 sec: 5581.7). Total num frames: 11558912. Throughput: 0: 581.3, 1: 583.2. Samples: 2888877. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:03:32,053][117973] Avg episode reward: [(0, '69.620'), (1, '55.100')]
[2023-10-01 11:03:37,051][117973] Fps is (10 sec: 4915.3, 60 sec: 4642.1, 300 sec: 5553.9). Total num frames: 11583488. Throughput: 0: 587.4, 1: 585.5. Samples: 2895872. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:03:37,052][117973] Avg episode reward: [(0, '69.680'), (1, '55.300')]
[2023-10-01 11:03:42,051][117973] Fps is (10 sec: 4915.3, 60 sec: 4710.4, 300 sec: 5526.1). Total num frames: 11608064. Throughput: 0: 584.2, 1: 585.2. Samples: 2899317. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:03:42,052][117973] Avg episode reward: [(0, '66.530'), (1, '53.690')]
[2023-10-01 11:03:47,051][117973] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 5498.4). Total num frames: 11624448. Throughput: 0: 585.8, 1: 585.0. Samples: 2906202. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:03:47,052][117973] Avg episode reward: [(0, '67.080'), (1, '55.580')]
[2023-10-01 11:03:47,206][119042] Updated weights for policy 1, policy_version 22720 (0.0011)
[2023-10-01 11:03:47,206][119041] Updated weights for policy 0, policy_version 22720 (0.0011)
[2023-10-01 11:03:52,052][117973] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 5470.6). Total num frames: 11649024. Throughput: 0: 580.5, 1: 582.4. Samples: 2913385. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:03:52,053][117973] Avg episode reward: [(0, '65.450'), (1, '55.990')]
[2023-10-01 11:03:57,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 5470.6). Total num frames: 11673600. Throughput: 0: 582.5, 1: 582.4. Samples: 2916848. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:03:57,052][117973] Avg episode reward: [(0, '65.990'), (1, '54.710')]
[2023-10-01 11:04:02,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 5442.8). Total num frames: 11698176. Throughput: 0: 586.6, 1: 587.8. Samples: 2924167. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:04:02,052][117973] Avg episode reward: [(0, '67.120'), (1, '54.380')]
[2023-10-01 11:04:04,519][119041] Updated weights for policy 0, policy_version 22880 (0.0009)
[2023-10-01 11:04:04,519][119042] Updated weights for policy 1, policy_version 22880 (0.0010)
[2023-10-01 11:04:07,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 5415.1). Total num frames: 11722752. Throughput: 0: 585.2, 1: 585.8. Samples: 2930945. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:07,052][117973] Avg episode reward: [(0, '66.260'), (1, '55.150')]
[2023-10-01 11:04:12,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 5415.1). Total num frames: 11747328. Throughput: 0: 588.9, 1: 589.0. Samples: 2934662. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:12,052][117973] Avg episode reward: [(0, '66.010'), (1, '56.160')]
[2023-10-01 11:04:17,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 5387.3). Total num frames: 11771904. Throughput: 0: 584.6, 1: 584.4. Samples: 2941479. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:17,052][117973] Avg episode reward: [(0, '67.120'), (1, '54.370')]
[2023-10-01 11:04:21,585][119041] Updated weights for policy 0, policy_version 23040 (0.0013)
[2023-10-01 11:04:21,585][119042] Updated weights for policy 1, policy_version 23040 (0.0012)
[2023-10-01 11:04:22,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 5387.3). Total num frames: 11796480. Throughput: 0: 591.6, 1: 591.7. Samples: 2949124. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:04:22,052][117973] Avg episode reward: [(0, '64.110'), (1, '53.260')]
[2023-10-01 11:04:27,051][117973] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 5359.5). Total num frames: 11821056. Throughput: 0: 599.7, 1: 598.2. Samples: 2953223. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:04:27,052][117973] Avg episode reward: [(0, '61.640'), (1, '53.390')]
[2023-10-01 11:04:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 4915.2, 300 sec: 5359.5). Total num frames: 11853824. Throughput: 0: 614.4, 1: 614.4. Samples: 2961498. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:04:32,053][117973] Avg episode reward: [(0, '63.170'), (1, '52.950')]
[2023-10-01 11:04:32,063][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000023152_5926912.pth...
[2023-10-01 11:04:32,063][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000023152_5926912.pth...
[2023-10-01 11:04:32,099][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000020656_5287936.pth
[2023-10-01 11:04:32,106][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000020656_5287936.pth
[2023-10-01 11:04:36,495][119041] Updated weights for policy 0, policy_version 23200 (0.0016)
[2023-10-01 11:04:36,501][119042] Updated weights for policy 1, policy_version 23200 (0.0016)
[2023-10-01 11:04:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 5331.7). Total num frames: 11878400. Throughput: 0: 625.6, 1: 624.7. Samples: 2969651. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:37,052][117973] Avg episode reward: [(0, '61.370'), (1, '51.630')]
[2023-10-01 11:04:42,051][117973] Fps is (10 sec: 4915.3, 60 sec: 4915.2, 300 sec: 5331.7). Total num frames: 11902976. Throughput: 0: 633.6, 1: 634.2. Samples: 2973897. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:42,052][117973] Avg episode reward: [(0, '63.020'), (1, '51.910')]
[2023-10-01 11:04:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5331.7). Total num frames: 11935744. Throughput: 0: 643.4, 1: 644.2. Samples: 2982108. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:47,052][117973] Avg episode reward: [(0, '63.000'), (1, '52.250')]
[2023-10-01 11:04:51,456][119042] Updated weights for policy 1, policy_version 23360 (0.0016)
[2023-10-01 11:04:51,456][119041] Updated weights for policy 0, policy_version 23360 (0.0016)
[2023-10-01 11:04:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5331.7). Total num frames: 11960320. Throughput: 0: 658.4, 1: 657.9. Samples: 2990177. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:04:52,052][117973] Avg episode reward: [(0, '63.070'), (1, '53.130')]
[2023-10-01 11:04:57,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5188.3, 300 sec: 5331.7). Total num frames: 11984896. Throughput: 0: 662.4, 1: 660.5. Samples: 2994190. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:04:57,052][117973] Avg episode reward: [(0, '62.590'), (1, '55.430')]
[2023-10-01 11:05:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5324.8, 300 sec: 5331.7). Total num frames: 12017664. Throughput: 0: 677.5, 1: 675.7. Samples: 3002373. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:05:02,054][117973] Avg episode reward: [(0, '61.570'), (1, '55.200')]
[2023-10-01 11:05:06,437][119041] Updated weights for policy 0, policy_version 23520 (0.0016)
[2023-10-01 11:05:06,438][119042] Updated weights for policy 1, policy_version 23520 (0.0018)
[2023-10-01 11:05:07,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5324.8, 300 sec: 5331.7). Total num frames: 12042240. Throughput: 0: 682.8, 1: 684.8. Samples: 3010668. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:05:07,053][117973] Avg episode reward: [(0, '61.580'), (1, '55.330')]
[2023-10-01 11:05:12,051][117973] Fps is (10 sec: 4915.3, 60 sec: 5324.8, 300 sec: 5331.7). Total num frames: 12066816. Throughput: 0: 682.8, 1: 685.0. Samples: 3014775. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:05:12,052][117973] Avg episode reward: [(0, '60.830'), (1, '56.240')]
[2023-10-01 11:05:17,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5461.3, 300 sec: 5345.6). Total num frames: 12099584. Throughput: 0: 682.7, 1: 682.6. Samples: 3022936. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:05:17,052][117973] Avg episode reward: [(0, '61.530'), (1, '57.370')]
[2023-10-01 11:05:21,440][119042] Updated weights for policy 1, policy_version 23680 (0.0018)
[2023-10-01 11:05:21,440][119041] Updated weights for policy 0, policy_version 23680 (0.0020)
[2023-10-01 11:05:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5331.7). Total num frames: 12124160. Throughput: 0: 682.8, 1: 683.9. Samples: 3031151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:05:22,052][117973] Avg episode reward: [(0, '62.070'), (1, '58.490')]
[2023-10-01 11:05:27,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5331.7). Total num frames: 12148736. Throughput: 0: 682.4, 1: 682.2. Samples: 3035300. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:05:27,052][117973] Avg episode reward: [(0, '62.410'), (1, '58.930')]
[2023-10-01 11:05:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5359.5). Total num frames: 12181504. Throughput: 0: 686.0, 1: 686.0. Samples: 3043845. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:05:32,052][117973] Avg episode reward: [(0, '63.160'), (1, '59.080')]
[2023-10-01 11:05:36,153][119041] Updated weights for policy 0, policy_version 23840 (0.0018)
[2023-10-01 11:05:36,154][119042] Updated weights for policy 1, policy_version 23840 (0.0016)
[2023-10-01 11:05:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5461.3, 300 sec: 5359.5). Total num frames: 12206080. Throughput: 0: 688.0, 1: 688.0. Samples: 3052094. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:05:37,052][117973] Avg episode reward: [(0, '64.310'), (1, '59.090')]
[2023-10-01 11:05:42,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5331.7). Total num frames: 12230656. Throughput: 0: 685.5, 1: 688.4. Samples: 3056014. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:05:42,053][117973] Avg episode reward: [(0, '61.890'), (1, '60.230')]
[2023-10-01 11:05:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5359.5). Total num frames: 12263424. Throughput: 0: 686.8, 1: 688.9. Samples: 3064281. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:05:47,052][117973] Avg episode reward: [(0, '61.580'), (1, '61.350')]
[2023-10-01 11:05:50,939][119042] Updated weights for policy 1, policy_version 24000 (0.0016)
[2023-10-01 11:05:50,939][119041] Updated weights for policy 0, policy_version 24000 (0.0018)
[2023-10-01 11:05:52,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5331.7). Total num frames: 12288000. Throughput: 0: 693.0, 1: 693.0. Samples: 3073039. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:05:52,052][117973] Avg episode reward: [(0, '61.910'), (1, '61.340')]
[2023-10-01 11:05:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5331.7). Total num frames: 12320768. Throughput: 0: 694.9, 1: 696.0. Samples: 3077368. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:05:57,053][117973] Avg episode reward: [(0, '62.510'), (1, '61.750')]
[2023-10-01 11:06:02,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5597.9, 300 sec: 5359.5). Total num frames: 12353536. Throughput: 0: 705.3, 1: 703.7. Samples: 3086341. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:06:02,052][117973] Avg episode reward: [(0, '63.780'), (1, '60.260')]
[2023-10-01 11:06:04,585][119041] Updated weights for policy 0, policy_version 24160 (0.0018)
[2023-10-01 11:06:04,585][119042] Updated weights for policy 1, policy_version 24160 (0.0017)
[2023-10-01 11:06:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5331.7). Total num frames: 12378112. Throughput: 0: 716.2, 1: 716.5. Samples: 3095623. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:06:07,053][117973] Avg episode reward: [(0, '62.050'), (1, '61.710')]
[2023-10-01 11:06:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5359.5). Total num frames: 12410880. Throughput: 0: 720.0, 1: 720.7. Samples: 3100131. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:06:12,052][117973] Avg episode reward: [(0, '60.820'), (1, '61.820')]
[2023-10-01 11:06:17,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5666.1, 300 sec: 5345.6). Total num frames: 12439552. Throughput: 0: 723.5, 1: 721.1. Samples: 3108854. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:06:17,053][117973] Avg episode reward: [(0, '55.980'), (1, '61.650')]
[2023-10-01 11:06:18,467][119041] Updated weights for policy 0, policy_version 24320 (0.0020)
[2023-10-01 11:06:18,467][119042] Updated weights for policy 1, policy_version 24320 (0.0019)
[2023-10-01 11:06:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5359.5). Total num frames: 12468224. Throughput: 0: 727.8, 1: 728.6. Samples: 3117636. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:06:22,052][117973] Avg episode reward: [(0, '54.710'), (1, '64.930')]
[2023-10-01 11:06:27,052][117973] Fps is (10 sec: 6144.0, 60 sec: 5870.9, 300 sec: 5359.5). Total num frames: 12500992. Throughput: 0: 736.8, 1: 736.4. Samples: 3122307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:06:27,053][117973] Avg episode reward: [(0, '55.300'), (1, '64.790')]
[2023-10-01 11:06:32,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5331.7). Total num frames: 12525568. Throughput: 0: 740.5, 1: 740.9. Samples: 3130944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:06:32,053][117973] Avg episode reward: [(0, '52.360'), (1, '66.930')]
[2023-10-01 11:06:32,063][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000024464_6262784.pth...
[2023-10-01 11:06:32,063][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000024464_6262784.pth...
[2023-10-01 11:06:32,104][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000022016_5636096.pth
[2023-10-01 11:06:32,105][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000022016_5636096.pth
[2023-10-01 11:06:32,445][119041] Updated weights for policy 0, policy_version 24480 (0.0019)
[2023-10-01 11:06:32,445][119042] Updated weights for policy 1, policy_version 24480 (0.0019)
[2023-10-01 11:06:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5359.5). Total num frames: 12558336. Throughput: 0: 740.0, 1: 738.4. Samples: 3139564. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:06:37,052][117973] Avg episode reward: [(0, '51.220'), (1, '69.170')]
[2023-10-01 11:06:42,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5870.9, 300 sec: 5331.7). Total num frames: 12582912. Throughput: 0: 738.5, 1: 735.3. Samples: 3143691. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:06:42,052][117973] Avg episode reward: [(0, '52.510'), (1, '68.670')]
[2023-10-01 11:06:46,299][119041] Updated weights for policy 0, policy_version 24640 (0.0017)
[2023-10-01 11:06:46,301][119042] Updated weights for policy 1, policy_version 24640 (0.0016)
[2023-10-01 11:06:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5331.7). Total num frames: 12615680. Throughput: 0: 738.1, 1: 740.9. Samples: 3152898. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:06:47,052][117973] Avg episode reward: [(0, '53.130'), (1, '69.030')]
[2023-10-01 11:06:52,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5359.5). Total num frames: 12648448. Throughput: 0: 734.3, 1: 734.7. Samples: 3161729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:06:52,053][117973] Avg episode reward: [(0, '52.480'), (1, '70.850')]
[2023-10-01 11:06:52,054][118715] Saving new best policy, reward=70.850!
[2023-10-01 11:06:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5331.7). Total num frames: 12673024. Throughput: 0: 734.4, 1: 732.6. Samples: 3166145. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:06:57,052][117973] Avg episode reward: [(0, '52.040'), (1, '75.370')]
[2023-10-01 11:06:57,054][118715] Saving new best policy, reward=75.370!
[2023-10-01 11:07:00,269][119042] Updated weights for policy 1, policy_version 24800 (0.0016)
[2023-10-01 11:07:00,269][119041] Updated weights for policy 0, policy_version 24800 (0.0017)
[2023-10-01 11:07:02,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5345.6). Total num frames: 12705792. Throughput: 0: 731.6, 1: 734.2. Samples: 3174812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:02,052][117973] Avg episode reward: [(0, '52.130'), (1, '74.150')]
[2023-10-01 11:07:07,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5359.5). Total num frames: 12738560. Throughput: 0: 738.0, 1: 738.6. Samples: 3184079. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:07,052][117973] Avg episode reward: [(0, '50.900'), (1, '74.900')]
[2023-10-01 11:07:12,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5331.7). Total num frames: 12763136. Throughput: 0: 737.1, 1: 736.7. Samples: 3188629. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:12,052][117973] Avg episode reward: [(0, '50.600'), (1, '77.280')]
[2023-10-01 11:07:12,053][118715] Saving new best policy, reward=77.280!
[2023-10-01 11:07:14,046][119042] Updated weights for policy 1, policy_version 24960 (0.0015)
[2023-10-01 11:07:14,046][119041] Updated weights for policy 0, policy_version 24960 (0.0018)
[2023-10-01 11:07:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5939.2, 300 sec: 5359.5). Total num frames: 12795904. Throughput: 0: 735.0, 1: 735.4. Samples: 3197110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:17,053][117973] Avg episode reward: [(0, '48.820'), (1, '77.980')]
[2023-10-01 11:07:17,063][118715] Saving new best policy, reward=77.980!
[2023-10-01 11:07:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5331.7). Total num frames: 12820480. Throughput: 0: 741.7, 1: 745.3. Samples: 3206477. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:22,052][117973] Avg episode reward: [(0, '50.850'), (1, '77.890')]
[2023-10-01 11:07:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5359.5). Total num frames: 12853248. Throughput: 0: 747.4, 1: 750.7. Samples: 3211105. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:07:27,053][117973] Avg episode reward: [(0, '52.840'), (1, '80.720')]
[2023-10-01 11:07:27,054][118715] Saving new best policy, reward=80.720!
[2023-10-01 11:07:27,581][119042] Updated weights for policy 1, policy_version 25120 (0.0017)
[2023-10-01 11:07:27,581][119041] Updated weights for policy 0, policy_version 25120 (0.0018)
[2023-10-01 11:07:32,052][117973] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 5359.5). Total num frames: 12886016. Throughput: 0: 741.9, 1: 741.6. Samples: 3219655. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:07:32,053][117973] Avg episode reward: [(0, '53.830'), (1, '81.230')]
[2023-10-01 11:07:32,065][118715] Saving new best policy, reward=81.230!
[2023-10-01 11:07:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5373.4). Total num frames: 12910592. Throughput: 0: 743.6, 1: 743.9. Samples: 3228667. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:07:37,052][117973] Avg episode reward: [(0, '54.660'), (1, '78.740')]
[2023-10-01 11:07:41,571][119042] Updated weights for policy 1, policy_version 25280 (0.0019)
[2023-10-01 11:07:41,571][119041] Updated weights for policy 0, policy_version 25280 (0.0021)
[2023-10-01 11:07:42,052][117973] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5415.0). Total num frames: 12943360. Throughput: 0: 745.3, 1: 745.3. Samples: 3233222. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:07:42,053][117973] Avg episode reward: [(0, '51.900'), (1, '76.400')]
[2023-10-01 11:07:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5415.0). Total num frames: 12967936. Throughput: 0: 742.5, 1: 742.0. Samples: 3241611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:47,052][117973] Avg episode reward: [(0, '53.210'), (1, '80.050')]
[2023-10-01 11:07:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5442.8). Total num frames: 13000704. Throughput: 0: 734.8, 1: 732.7. Samples: 3250117. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:52,052][117973] Avg episode reward: [(0, '51.950'), (1, '80.580')]
[2023-10-01 11:07:55,814][119041] Updated weights for policy 0, policy_version 25440 (0.0020)
[2023-10-01 11:07:55,814][119042] Updated weights for policy 1, policy_version 25440 (0.0017)
[2023-10-01 11:07:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5442.8). Total num frames: 13025280. Throughput: 0: 730.6, 1: 730.6. Samples: 3254386. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:07:57,053][117973] Avg episode reward: [(0, '52.090'), (1, '80.980')]
[2023-10-01 11:08:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5470.6). Total num frames: 13058048. Throughput: 0: 738.0, 1: 738.3. Samples: 3263544. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:02,053][117973] Avg episode reward: [(0, '54.500'), (1, '80.840')]
[2023-10-01 11:08:07,052][117973] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5498.4). Total num frames: 13090816. Throughput: 0: 737.9, 1: 733.8. Samples: 3272704. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:07,053][117973] Avg episode reward: [(0, '55.110'), (1, '78.010')]
[2023-10-01 11:08:09,346][119041] Updated weights for policy 0, policy_version 25600 (0.0018)
[2023-10-01 11:08:09,346][119042] Updated weights for policy 1, policy_version 25600 (0.0017)
[2023-10-01 11:08:12,051][117973] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 5553.9). Total num frames: 13123584. Throughput: 0: 733.4, 1: 732.6. Samples: 3277075. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:12,052][117973] Avg episode reward: [(0, '58.560'), (1, '77.460')]
[2023-10-01 11:08:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5553.9). Total num frames: 13148160. Throughput: 0: 741.7, 1: 743.4. Samples: 3286483. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:17,053][117973] Avg episode reward: [(0, '59.240'), (1, '78.320')]
[2023-10-01 11:08:22,052][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5581.7). Total num frames: 13180928. Throughput: 0: 738.2, 1: 738.0. Samples: 3295097. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:08:22,053][117973] Avg episode reward: [(0, '61.810'), (1, '78.900')]
[2023-10-01 11:08:23,257][119041] Updated weights for policy 0, policy_version 25760 (0.0017)
[2023-10-01 11:08:23,259][119042] Updated weights for policy 1, policy_version 25760 (0.0015)
[2023-10-01 11:08:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5581.7). Total num frames: 13205504. Throughput: 0: 734.7, 1: 733.8. Samples: 3299305. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:08:27,052][117973] Avg episode reward: [(0, '62.790'), (1, '79.890')]
[2023-10-01 11:08:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5609.4). Total num frames: 13238272. Throughput: 0: 735.0, 1: 735.5. Samples: 3307783. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:08:32,052][117973] Avg episode reward: [(0, '63.190'), (1, '79.690')]
[2023-10-01 11:08:32,060][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000025856_6619136.pth...
[2023-10-01 11:08:32,060][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000025856_6619136.pth...
[2023-10-01 11:08:32,097][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000023152_5926912.pth
[2023-10-01 11:08:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000023152_5926912.pth
[2023-10-01 11:08:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5609.4). Total num frames: 13262848. Throughput: 0: 738.5, 1: 739.8. Samples: 3316641. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:08:37,052][117973] Avg episode reward: [(0, '64.960'), (1, '77.790')]
[2023-10-01 11:08:37,335][119041] Updated weights for policy 0, policy_version 25920 (0.0019)
[2023-10-01 11:08:37,335][119042] Updated weights for policy 1, policy_version 25920 (0.0018)
[2023-10-01 11:08:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5665.0). Total num frames: 13295616. Throughput: 0: 740.6, 1: 741.4. Samples: 3321076. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:42,053][117973] Avg episode reward: [(0, '66.520'), (1, '77.080')]
[2023-10-01 11:08:47,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5870.9, 300 sec: 5665.0). Total num frames: 13320192. Throughput: 0: 739.3, 1: 737.3. Samples: 3329993. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:47,053][117973] Avg episode reward: [(0, '66.070'), (1, '79.070')]
[2023-10-01 11:08:51,209][119041] Updated weights for policy 0, policy_version 26080 (0.0019)
[2023-10-01 11:08:51,209][119042] Updated weights for policy 1, policy_version 26080 (0.0017)
[2023-10-01 11:08:52,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5692.7). Total num frames: 13352960. Throughput: 0: 732.9, 1: 734.4. Samples: 3338736. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:08:52,053][117973] Avg episode reward: [(0, '68.840'), (1, '75.920')]
[2023-10-01 11:08:57,052][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5720.5). Total num frames: 13385728. Throughput: 0: 737.6, 1: 737.5. Samples: 3343451. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:08:57,052][117973] Avg episode reward: [(0, '68.650'), (1, '74.600')]
[2023-10-01 11:09:02,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5720.5). Total num frames: 13410304. Throughput: 0: 736.3, 1: 732.2. Samples: 3352565. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:09:02,052][117973] Avg episode reward: [(0, '68.900'), (1, '73.340')]
[2023-10-01 11:09:04,882][119041] Updated weights for policy 0, policy_version 26240 (0.0019)
[2023-10-01 11:09:04,883][119042] Updated weights for policy 1, policy_version 26240 (0.0018)
[2023-10-01 11:09:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5748.3). Total num frames: 13443072. Throughput: 0: 736.0, 1: 735.5. Samples: 3361311. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:09:07,053][117973] Avg episode reward: [(0, '68.570'), (1, '75.190')]
[2023-10-01 11:09:12,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 13475840. Throughput: 0: 737.4, 1: 738.6. Samples: 3365722. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:09:12,052][117973] Avg episode reward: [(0, '68.550'), (1, '73.310')]
[2023-10-01 11:09:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 13500416. Throughput: 0: 746.9, 1: 745.4. Samples: 3374936. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 11:09:17,053][117973] Avg episode reward: [(0, '68.980'), (1, '69.940')]
[2023-10-01 11:09:18,774][119041] Updated weights for policy 0, policy_version 26400 (0.0019)
[2023-10-01 11:09:18,774][119042] Updated weights for policy 1, policy_version 26400 (0.0018)
[2023-10-01 11:09:22,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 13533184. Throughput: 0: 742.0, 1: 739.4. Samples: 3383300. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 11:09:22,053][117973] Avg episode reward: [(0, '70.370'), (1, '71.260')]
[2023-10-01 11:09:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5776.1). Total num frames: 13557760. Throughput: 0: 738.5, 1: 737.6. Samples: 3387500. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-10-01 11:09:27,053][117973] Avg episode reward: [(0, '70.640'), (1, '71.250')]
[2023-10-01 11:09:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5803.8). Total num frames: 13590528. Throughput: 0: 738.4, 1: 739.9. Samples: 3396515. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:09:32,052][117973] Avg episode reward: [(0, '71.700'), (1, '67.780')]
[2023-10-01 11:09:32,722][119041] Updated weights for policy 0, policy_version 26560 (0.0019)
[2023-10-01 11:09:32,723][119042] Updated weights for policy 1, policy_version 26560 (0.0018)
[2023-10-01 11:09:37,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5831.6). Total num frames: 13623296. Throughput: 0: 740.8, 1: 743.0. Samples: 3405508. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:09:37,053][117973] Avg episode reward: [(0, '71.700'), (1, '69.300')]
[2023-10-01 11:09:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5803.8). Total num frames: 13647872. Throughput: 0: 737.9, 1: 737.2. Samples: 3409833. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:09:42,052][117973] Avg episode reward: [(0, '72.750'), (1, '68.310')]
[2023-10-01 11:09:46,527][119042] Updated weights for policy 1, policy_version 26720 (0.0016)
[2023-10-01 11:09:46,528][119041] Updated weights for policy 0, policy_version 26720 (0.0018)
[2023-10-01 11:09:47,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 13680640. Throughput: 0: 734.1, 1: 736.9. Samples: 3418761. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:09:47,053][117973] Avg episode reward: [(0, '78.920'), (1, '69.250')]
[2023-10-01 11:09:47,065][118645] Saving new best policy, reward=78.920!
[2023-10-01 11:09:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13705216. Throughput: 0: 734.5, 1: 735.4. Samples: 3427455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:09:52,053][117973] Avg episode reward: [(0, '79.740'), (1, '68.990')]
[2023-10-01 11:09:52,054][118645] Saving new best policy, reward=79.740!
[2023-10-01 11:09:57,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 13737984. Throughput: 0: 734.8, 1: 736.4. Samples: 3431924. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:09:57,053][117973] Avg episode reward: [(0, '81.440'), (1, '70.370')]
[2023-10-01 11:09:57,054][118645] Saving new best policy, reward=81.440!
[2023-10-01 11:10:00,435][119042] Updated weights for policy 1, policy_version 26880 (0.0015)
[2023-10-01 11:10:00,436][119041] Updated weights for policy 0, policy_version 26880 (0.0016)
[2023-10-01 11:10:02,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 13770752. Throughput: 0: 730.8, 1: 732.2. Samples: 3440770. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:02,052][117973] Avg episode reward: [(0, '81.650'), (1, '71.780')]
[2023-10-01 11:10:02,061][118645] Saving new best policy, reward=81.650!
[2023-10-01 11:10:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 13795328. Throughput: 0: 740.5, 1: 743.6. Samples: 3450084. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:07,053][117973] Avg episode reward: [(0, '83.670'), (1, '69.680')]
[2023-10-01 11:10:07,215][118645] Saving new best policy, reward=83.670!
[2023-10-01 11:10:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5859.4). Total num frames: 13828096. Throughput: 0: 744.2, 1: 744.8. Samples: 3454503. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:12,052][117973] Avg episode reward: [(0, '92.130'), (1, '67.040')]
[2023-10-01 11:10:12,053][118645] Saving new best policy, reward=92.130!
[2023-10-01 11:10:14,236][119041] Updated weights for policy 0, policy_version 27040 (0.0015)
[2023-10-01 11:10:14,236][119042] Updated weights for policy 1, policy_version 27040 (0.0017)
[2023-10-01 11:10:17,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 13860864. Throughput: 0: 742.0, 1: 739.5. Samples: 3463180. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:17,053][117973] Avg episode reward: [(0, '92.630'), (1, '67.150')]
[2023-10-01 11:10:17,064][118645] Saving new best policy, reward=92.630!
[2023-10-01 11:10:22,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 13885440. Throughput: 0: 737.2, 1: 735.9. Samples: 3471795. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:22,053][117973] Avg episode reward: [(0, '92.120'), (1, '69.340')]
[2023-10-01 11:10:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5887.1). Total num frames: 13918208. Throughput: 0: 737.8, 1: 738.7. Samples: 3476275. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:27,053][117973] Avg episode reward: [(0, '87.040'), (1, '69.040')]
[2023-10-01 11:10:28,168][119041] Updated weights for policy 0, policy_version 27200 (0.0017)
[2023-10-01 11:10:28,168][119042] Updated weights for policy 1, policy_version 27200 (0.0017)
[2023-10-01 11:10:32,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 13942784. Throughput: 0: 740.0, 1: 741.2. Samples: 3485416. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:32,052][117973] Avg episode reward: [(0, '86.240'), (1, '71.260')]
[2023-10-01 11:10:32,061][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000027232_6971392.pth...
[2023-10-01 11:10:32,062][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000027232_6971392.pth...
[2023-10-01 11:10:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000024464_6262784.pth
[2023-10-01 11:10:32,097][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000024464_6262784.pth
[2023-10-01 11:10:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 13975552. Throughput: 0: 738.4, 1: 736.4. Samples: 3493821. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:37,052][117973] Avg episode reward: [(0, '85.340'), (1, '70.370')]
[2023-10-01 11:10:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 14000128. Throughput: 0: 735.6, 1: 733.1. Samples: 3498018. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:42,053][117973] Avg episode reward: [(0, '86.570'), (1, '67.770')]
[2023-10-01 11:10:42,189][119041] Updated weights for policy 0, policy_version 27360 (0.0018)
[2023-10-01 11:10:42,190][119042] Updated weights for policy 1, policy_version 27360 (0.0018)
[2023-10-01 11:10:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5871.0, 300 sec: 5914.9). Total num frames: 14032896. Throughput: 0: 741.3, 1: 740.9. Samples: 3507472. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:47,052][117973] Avg episode reward: [(0, '84.780'), (1, '66.480')]
[2023-10-01 11:10:52,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 14065664. Throughput: 0: 737.8, 1: 735.4. Samples: 3516379. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:52,052][117973] Avg episode reward: [(0, '83.580'), (1, '66.840')]
[2023-10-01 11:10:55,804][119042] Updated weights for policy 1, policy_version 27520 (0.0018)
[2023-10-01 11:10:55,804][119041] Updated weights for policy 0, policy_version 27520 (0.0018)
[2023-10-01 11:10:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 14090240. Throughput: 0: 734.8, 1: 733.3. Samples: 3520568. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:10:57,053][117973] Avg episode reward: [(0, '85.790'), (1, '66.300')]
[2023-10-01 11:11:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5914.9). Total num frames: 14123008. Throughput: 0: 737.7, 1: 741.6. Samples: 3529748. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:11:02,052][117973] Avg episode reward: [(0, '88.400'), (1, '68.740')]
[2023-10-01 11:11:07,052][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5914.9). Total num frames: 14155776. Throughput: 0: 743.0, 1: 742.3. Samples: 3538634. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:11:07,053][117973] Avg episode reward: [(0, '91.370'), (1, '69.400')]
[2023-10-01 11:11:09,781][119042] Updated weights for policy 1, policy_version 27680 (0.0017)
[2023-10-01 11:11:09,782][119041] Updated weights for policy 0, policy_version 27680 (0.0015)
[2023-10-01 11:11:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5901.0). Total num frames: 14180352. Throughput: 0: 742.2, 1: 740.4. Samples: 3542990. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:11:12,052][117973] Avg episode reward: [(0, '89.240'), (1, '69.220')]
[2023-10-01 11:11:17,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5734.4, 300 sec: 5887.1). Total num frames: 14204928. Throughput: 0: 730.4, 1: 729.2. Samples: 3551100. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:11:17,053][117973] Avg episode reward: [(0, '90.070'), (1, '70.640')]
[2023-10-01 11:11:22,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5887.1). Total num frames: 14237696. Throughput: 0: 727.5, 1: 728.2. Samples: 3559328. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:11:22,052][117973] Avg episode reward: [(0, '90.420'), (1, '70.090')]
[2023-10-01 11:11:24,724][119042] Updated weights for policy 1, policy_version 27840 (0.0015)
[2023-10-01 11:11:24,724][119041] Updated weights for policy 0, policy_version 27840 (0.0018)
[2023-10-01 11:11:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5887.1). Total num frames: 14262272. Throughput: 0: 727.0, 1: 727.4. Samples: 3563468. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:11:27,053][117973] Avg episode reward: [(0, '90.990'), (1, '72.000')]
[2023-10-01 11:11:32,051][117973] Fps is (10 sec: 5324.8, 60 sec: 5802.7, 300 sec: 5873.2). Total num frames: 14290944. Throughput: 0: 714.9, 1: 713.3. Samples: 3571742. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:11:32,052][117973] Avg episode reward: [(0, '95.480'), (1, '70.610')]
[2023-10-01 11:11:32,062][118645] Saving new best policy, reward=95.480!
[2023-10-01 11:11:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5887.1). Total num frames: 14319616. Throughput: 0: 706.2, 1: 705.9. Samples: 3579926. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:11:37,052][117973] Avg episode reward: [(0, '99.890'), (1, '72.070')]
[2023-10-01 11:11:37,052][118645] Saving new best policy, reward=99.890!
[2023-10-01 11:11:39,428][119041] Updated weights for policy 0, policy_version 28000 (0.0015)
[2023-10-01 11:11:39,429][119042] Updated weights for policy 1, policy_version 28000 (0.0018)
[2023-10-01 11:11:42,052][117973] Fps is (10 sec: 5324.7, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 14344192. Throughput: 0: 705.6, 1: 706.5. Samples: 3584114. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:11:42,053][117973] Avg episode reward: [(0, '100.530'), (1, '73.090')]
[2023-10-01 11:11:42,054][118645] Saving new best policy, reward=100.530!
[2023-10-01 11:11:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 14376960. Throughput: 0: 696.3, 1: 694.7. Samples: 3592345. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:11:47,052][117973] Avg episode reward: [(0, '100.980'), (1, '73.990')]
[2023-10-01 11:11:47,059][118645] Saving new best policy, reward=100.980!
[2023-10-01 11:11:52,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5597.9, 300 sec: 5859.4). Total num frames: 14401536. Throughput: 0: 687.2, 1: 688.0. Samples: 3600521. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:11:52,052][117973] Avg episode reward: [(0, '98.810'), (1, '75.350')]
[2023-10-01 11:11:54,281][119041] Updated weights for policy 0, policy_version 28160 (0.0014)
[2023-10-01 11:11:54,281][119042] Updated weights for policy 1, policy_version 28160 (0.0013)
[2023-10-01 11:11:57,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 14434304. Throughput: 0: 687.0, 1: 688.6. Samples: 3604890. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:11:57,053][117973] Avg episode reward: [(0, '99.260'), (1, '78.010')]
[2023-10-01 11:12:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5831.6). Total num frames: 14458880. Throughput: 0: 694.3, 1: 694.6. Samples: 3613599. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:12:02,052][117973] Avg episode reward: [(0, '96.270'), (1, '78.440')]
[2023-10-01 11:12:07,052][117973] Fps is (10 sec: 4915.3, 60 sec: 5461.3, 300 sec: 5831.6). Total num frames: 14483456. Throughput: 0: 695.7, 1: 695.0. Samples: 3621910. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:12:07,052][117973] Avg episode reward: [(0, '98.650'), (1, '79.320')]
[2023-10-01 11:12:08,798][119042] Updated weights for policy 1, policy_version 28320 (0.0014)
[2023-10-01 11:12:08,799][119041] Updated weights for policy 0, policy_version 28320 (0.0017)
[2023-10-01 11:12:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5831.6). Total num frames: 14516224. Throughput: 0: 694.5, 1: 696.0. Samples: 3626043. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:12:12,052][117973] Avg episode reward: [(0, '96.450'), (1, '79.450')]
[2023-10-01 11:12:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5831.6). Total num frames: 14540800. Throughput: 0: 695.4, 1: 697.6. Samples: 3634425. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:12:17,052][117973] Avg episode reward: [(0, '94.040'), (1, '79.820')]
[2023-10-01 11:12:22,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5803.8). Total num frames: 14565376. Throughput: 0: 697.1, 1: 698.8. Samples: 3642745. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:12:22,052][117973] Avg episode reward: [(0, '88.880'), (1, '81.460')]
[2023-10-01 11:12:22,128][118715] Saving new best policy, reward=81.460!
[2023-10-01 11:12:23,619][119041] Updated weights for policy 0, policy_version 28480 (0.0015)
[2023-10-01 11:12:23,620][119042] Updated weights for policy 1, policy_version 28480 (0.0016)
[2023-10-01 11:12:27,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 14598144. Throughput: 0: 698.7, 1: 698.6. Samples: 3646993. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:12:27,052][117973] Avg episode reward: [(0, '80.500'), (1, '79.880')]
[2023-10-01 11:12:32,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5529.6, 300 sec: 5803.8). Total num frames: 14622720. Throughput: 0: 700.0, 1: 699.7. Samples: 3655331. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:12:32,053][117973] Avg episode reward: [(0, '79.480'), (1, '78.270')]
[2023-10-01 11:12:32,062][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000028560_7311360.pth...
[2023-10-01 11:12:32,062][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000028560_7311360.pth...
[2023-10-01 11:12:32,096][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000025856_6619136.pth
[2023-10-01 11:12:32,098][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000025856_6619136.pth
[2023-10-01 11:12:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.8, 300 sec: 5803.8). Total num frames: 14655488. Throughput: 0: 700.0, 1: 700.8. Samples: 3663557. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:12:37,053][117973] Avg episode reward: [(0, '81.390'), (1, '75.920')]
[2023-10-01 11:12:38,432][119042] Updated weights for policy 1, policy_version 28640 (0.0012)
[2023-10-01 11:12:38,432][119041] Updated weights for policy 0, policy_version 28640 (0.0013)
[2023-10-01 11:12:42,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 14680064. Throughput: 0: 697.6, 1: 698.2. Samples: 3667698. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:12:42,052][117973] Avg episode reward: [(0, '81.390'), (1, '75.520')]
[2023-10-01 11:12:47,051][117973] Fps is (10 sec: 4915.3, 60 sec: 5461.3, 300 sec: 5776.1). Total num frames: 14704640. Throughput: 0: 694.1, 1: 692.1. Samples: 3675977. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:12:47,052][117973] Avg episode reward: [(0, '83.900'), (1, '76.590')]
[2023-10-01 11:12:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5803.8). Total num frames: 14737408. Throughput: 0: 693.7, 1: 693.1. Samples: 3684318. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:12:52,052][117973] Avg episode reward: [(0, '84.790'), (1, '74.080')]
[2023-10-01 11:12:53,233][119042] Updated weights for policy 1, policy_version 28800 (0.0017)
[2023-10-01 11:12:53,233][119041] Updated weights for policy 0, policy_version 28800 (0.0017)
[2023-10-01 11:12:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5776.1). Total num frames: 14761984. Throughput: 0: 693.4, 1: 692.0. Samples: 3688389. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:12:57,053][117973] Avg episode reward: [(0, '87.320'), (1, '73.200')]
[2023-10-01 11:13:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5776.1). Total num frames: 14794752. Throughput: 0: 692.7, 1: 690.8. Samples: 3696683. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:02,052][117973] Avg episode reward: [(0, '87.790'), (1, '74.200')]
[2023-10-01 11:13:07,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5748.3). Total num frames: 14819328. Throughput: 0: 691.3, 1: 691.8. Samples: 3704987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:07,052][117973] Avg episode reward: [(0, '89.600'), (1, '74.000')]
[2023-10-01 11:13:07,880][119041] Updated weights for policy 0, policy_version 28960 (0.0017)
[2023-10-01 11:13:07,880][119042] Updated weights for policy 1, policy_version 28960 (0.0016)
[2023-10-01 11:13:12,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5748.3). Total num frames: 14843904. Throughput: 0: 690.1, 1: 690.5. Samples: 3709118. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:12,052][117973] Avg episode reward: [(0, '85.580'), (1, '73.060')]
[2023-10-01 11:13:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5748.3). Total num frames: 14876672. Throughput: 0: 690.9, 1: 691.1. Samples: 3717523. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:17,052][117973] Avg episode reward: [(0, '84.820'), (1, '74.150')]
[2023-10-01 11:13:22,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.8, 300 sec: 5748.3). Total num frames: 14901248. Throughput: 0: 692.0, 1: 691.6. Samples: 3725820. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:22,053][117973] Avg episode reward: [(0, '86.520'), (1, '76.800')]
[2023-10-01 11:13:22,697][119042] Updated weights for policy 1, policy_version 29120 (0.0016)
[2023-10-01 11:13:22,697][119041] Updated weights for policy 0, policy_version 29120 (0.0015)
[2023-10-01 11:13:27,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5720.5). Total num frames: 14925824. Throughput: 0: 690.5, 1: 689.8. Samples: 3729813. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:27,052][117973] Avg episode reward: [(0, '86.520'), (1, '77.330')]
[2023-10-01 11:13:32,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5748.3). Total num frames: 14958592. Throughput: 0: 690.0, 1: 691.0. Samples: 3738124. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:32,053][117973] Avg episode reward: [(0, '80.340'), (1, '71.170')]
[2023-10-01 11:13:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5720.5). Total num frames: 14983168. Throughput: 0: 689.2, 1: 691.3. Samples: 3746438. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:37,053][117973] Avg episode reward: [(0, '81.240'), (1, '72.760')]
[2023-10-01 11:13:37,572][119042] Updated weights for policy 1, policy_version 29280 (0.0017)
[2023-10-01 11:13:37,572][119041] Updated weights for policy 0, policy_version 29280 (0.0015)
[2023-10-01 11:13:42,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5748.3). Total num frames: 15015936. Throughput: 0: 691.9, 1: 693.3. Samples: 3750723. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:42,052][117973] Avg episode reward: [(0, '81.240'), (1, '70.600')]
[2023-10-01 11:13:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5720.5). Total num frames: 15040512. Throughput: 0: 694.3, 1: 694.8. Samples: 3759191. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:47,052][117973] Avg episode reward: [(0, '84.320'), (1, '68.400')]
[2023-10-01 11:13:52,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5692.7). Total num frames: 15065088. Throughput: 0: 694.6, 1: 695.7. Samples: 3767548. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:52,053][117973] Avg episode reward: [(0, '90.530'), (1, '67.530')]
[2023-10-01 11:13:52,239][119041] Updated weights for policy 0, policy_version 29440 (0.0018)
[2023-10-01 11:13:52,240][119042] Updated weights for policy 1, policy_version 29440 (0.0016)
[2023-10-01 11:13:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5720.5). Total num frames: 15097856. Throughput: 0: 693.7, 1: 693.3. Samples: 3771532. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:13:57,052][117973] Avg episode reward: [(0, '90.410'), (1, '66.410')]
[2023-10-01 11:14:02,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5692.7). Total num frames: 15122432. Throughput: 0: 692.3, 1: 692.6. Samples: 3779845. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:14:02,053][117973] Avg episode reward: [(0, '86.670'), (1, '66.790')]
[2023-10-01 11:14:07,052][117973] Fps is (10 sec: 5324.7, 60 sec: 5529.6, 300 sec: 5678.9). Total num frames: 15151104. Throughput: 0: 694.0, 1: 693.8. Samples: 3788269. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:14:07,053][117973] Avg episode reward: [(0, '78.820'), (1, '64.250')]
[2023-10-01 11:14:07,066][119042] Updated weights for policy 1, policy_version 29600 (0.0014)
[2023-10-01 11:14:07,067][119041] Updated weights for policy 0, policy_version 29600 (0.0015)
[2023-10-01 11:14:12,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5597.9, 300 sec: 5692.7). Total num frames: 15179776. Throughput: 0: 696.3, 1: 695.1. Samples: 3792423. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:14:12,052][117973] Avg episode reward: [(0, '80.500'), (1, '63.040')]
[2023-10-01 11:14:17,052][117973] Fps is (10 sec: 5324.8, 60 sec: 5461.3, 300 sec: 5665.0). Total num frames: 15204352. Throughput: 0: 697.1, 1: 697.8. Samples: 3800894. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-10-01 11:14:17,053][117973] Avg episode reward: [(0, '79.270'), (1, '62.860')]
[2023-10-01 11:14:21,708][119041] Updated weights for policy 0, policy_version 29760 (0.0014)
[2023-10-01 11:14:21,708][119042] Updated weights for policy 1, policy_version 29760 (0.0016)
[2023-10-01 11:14:22,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5692.8). Total num frames: 15237120. Throughput: 0: 699.0, 1: 696.8. Samples: 3809249. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:14:22,052][117973] Avg episode reward: [(0, '80.490'), (1, '63.470')]
[2023-10-01 11:14:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 15261696. Throughput: 0: 697.5, 1: 694.8. Samples: 3813376. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:14:27,052][117973] Avg episode reward: [(0, '83.240'), (1, '63.480')]
[2023-10-01 11:14:32,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5637.2). Total num frames: 15286272. Throughput: 0: 693.7, 1: 692.3. Samples: 3821562. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:14:32,053][117973] Avg episode reward: [(0, '84.680'), (1, '68.320')]
[2023-10-01 11:14:32,121][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000029872_7647232.pth...
[2023-10-01 11:14:32,152][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000027232_6971392.pth
[2023-10-01 11:14:32,173][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000029872_7647232.pth...
[2023-10-01 11:14:32,201][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000027232_6971392.pth
[2023-10-01 11:14:36,499][119042] Updated weights for policy 1, policy_version 29920 (0.0010)
[2023-10-01 11:14:36,500][119041] Updated weights for policy 0, policy_version 29920 (0.0017)
[2023-10-01 11:14:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 15319040. Throughput: 0: 693.2, 1: 690.6. Samples: 3829819. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:14:37,053][117973] Avg episode reward: [(0, '86.570'), (1, '66.520')]
[2023-10-01 11:14:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5637.2). Total num frames: 15343616. Throughput: 0: 693.5, 1: 691.6. Samples: 3833861. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:14:42,053][117973] Avg episode reward: [(0, '80.340'), (1, '65.900')]
[2023-10-01 11:14:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 15376384. Throughput: 0: 692.7, 1: 692.8. Samples: 3842193. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:14:47,052][117973] Avg episode reward: [(0, '77.860'), (1, '65.900')]
[2023-10-01 11:14:51,305][119041] Updated weights for policy 0, policy_version 30080 (0.0012)
[2023-10-01 11:14:51,306][119042] Updated weights for policy 1, policy_version 30080 (0.0014)
[2023-10-01 11:14:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 15400960. Throughput: 0: 692.4, 1: 692.3. Samples: 3850580. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:14:52,052][117973] Avg episode reward: [(0, '80.240'), (1, '69.330')]
[2023-10-01 11:14:57,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5609.4). Total num frames: 15425536. Throughput: 0: 692.9, 1: 693.5. Samples: 3854809. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-10-01 11:14:57,053][117973] Avg episode reward: [(0, '82.590'), (1, '70.120')]
[2023-10-01 11:15:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 15458304. Throughput: 0: 692.9, 1: 692.9. Samples: 3863253. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:15:02,052][117973] Avg episode reward: [(0, '83.770'), (1, '70.870')]
[2023-10-01 11:15:05,964][119041] Updated weights for policy 0, policy_version 30240 (0.0014)
[2023-10-01 11:15:05,964][119042] Updated weights for policy 1, policy_version 30240 (0.0016)
[2023-10-01 11:15:07,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5529.6, 300 sec: 5609.4). Total num frames: 15482880. Throughput: 0: 692.6, 1: 694.4. Samples: 3871661. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:15:07,052][117973] Avg episode reward: [(0, '84.110'), (1, '71.260')]
[2023-10-01 11:15:12,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5597.8, 300 sec: 5609.4). Total num frames: 15515648. Throughput: 0: 693.7, 1: 696.4. Samples: 3875933. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:15:12,053][117973] Avg episode reward: [(0, '87.460'), (1, '72.740')]
[2023-10-01 11:15:17,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5609.4). Total num frames: 15540224. Throughput: 0: 697.2, 1: 699.1. Samples: 3884395. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:15:17,052][117973] Avg episode reward: [(0, '85.970'), (1, '72.010')]
[2023-10-01 11:15:20,633][119042] Updated weights for policy 1, policy_version 30400 (0.0017)
[2023-10-01 11:15:20,633][119041] Updated weights for policy 0, policy_version 30400 (0.0018)
[2023-10-01 11:15:22,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5609.4). Total num frames: 15572992. Throughput: 0: 699.4, 1: 699.9. Samples: 3892785. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:15:22,052][117973] Avg episode reward: [(0, '87.890'), (1, '71.530')]
[2023-10-01 11:15:27,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.8, 300 sec: 5609.4). Total num frames: 15597568. Throughput: 0: 701.0, 1: 704.6. Samples: 3897116. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:15:27,052][117973] Avg episode reward: [(0, '86.130'), (1, '71.960')]
[2023-10-01 11:15:32,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 15622144. Throughput: 0: 702.0, 1: 702.1. Samples: 3905376. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:15:32,052][117973] Avg episode reward: [(0, '86.130'), (1, '70.320')]
[2023-10-01 11:15:35,317][119041] Updated weights for policy 0, policy_version 30560 (0.0016)
[2023-10-01 11:15:35,317][119042] Updated weights for policy 1, policy_version 30560 (0.0015)
[2023-10-01 11:15:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5609.4). Total num frames: 15654912. Throughput: 0: 700.4, 1: 700.1. Samples: 3913601. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:15:37,052][117973] Avg episode reward: [(0, '94.180'), (1, '70.810')]
[2023-10-01 11:15:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 15679488. Throughput: 0: 700.9, 1: 699.2. Samples: 3917812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:15:42,052][117973] Avg episode reward: [(0, '94.180'), (1, '67.270')]
[2023-10-01 11:15:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 15712256. Throughput: 0: 699.0, 1: 697.8. Samples: 3926110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:15:47,052][117973] Avg episode reward: [(0, '99.210'), (1, '68.310')]
[2023-10-01 11:15:49,937][119041] Updated weights for policy 0, policy_version 30720 (0.0012)
[2023-10-01 11:15:49,937][119042] Updated weights for policy 1, policy_version 30720 (0.0012)
[2023-10-01 11:15:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 15736832. Throughput: 0: 697.0, 1: 697.5. Samples: 3934415. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:15:52,052][117973] Avg episode reward: [(0, '107.960'), (1, '69.490')]
[2023-10-01 11:15:52,053][118645] Saving new best policy, reward=107.960!
[2023-10-01 11:15:57,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 15761408. Throughput: 0: 697.6, 1: 697.0. Samples: 3938687. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:15:57,053][117973] Avg episode reward: [(0, '107.410'), (1, '69.440')]
[2023-10-01 11:16:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 15794176. Throughput: 0: 696.4, 1: 697.5. Samples: 3947124. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:02,052][117973] Avg episode reward: [(0, '106.490'), (1, '68.370')]
[2023-10-01 11:16:04,472][119042] Updated weights for policy 1, policy_version 30880 (0.0016)
[2023-10-01 11:16:04,472][119041] Updated weights for policy 0, policy_version 30880 (0.0018)
[2023-10-01 11:16:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.8, 300 sec: 5553.9). Total num frames: 15818752. Throughput: 0: 698.4, 1: 698.6. Samples: 3955653. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:07,053][117973] Avg episode reward: [(0, '106.750'), (1, '68.320')]
[2023-10-01 11:16:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 15851520. Throughput: 0: 695.4, 1: 694.1. Samples: 3959644. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:12,052][117973] Avg episode reward: [(0, '107.120'), (1, '68.640')]
[2023-10-01 11:16:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 15876096. Throughput: 0: 695.0, 1: 693.8. Samples: 3967870. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:17,052][117973] Avg episode reward: [(0, '108.580'), (1, '66.210')]
[2023-10-01 11:16:17,059][118645] Saving new best policy, reward=108.580!
[2023-10-01 11:16:19,224][119041] Updated weights for policy 0, policy_version 31040 (0.0016)
[2023-10-01 11:16:19,225][119042] Updated weights for policy 1, policy_version 31040 (0.0012)
[2023-10-01 11:16:22,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 15900672. Throughput: 0: 698.4, 1: 699.0. Samples: 3976486. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:22,052][117973] Avg episode reward: [(0, '108.550'), (1, '66.180')]
[2023-10-01 11:16:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5567.8). Total num frames: 15933440. Throughput: 0: 698.6, 1: 701.1. Samples: 3980799. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:27,052][117973] Avg episode reward: [(0, '117.030'), (1, '66.970')]
[2023-10-01 11:16:27,052][118645] Saving new best policy, reward=117.030!
[2023-10-01 11:16:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 15958016. Throughput: 0: 702.8, 1: 702.9. Samples: 3989364. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:32,052][117973] Avg episode reward: [(0, '117.030'), (1, '67.310')]
[2023-10-01 11:16:32,060][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000031168_7979008.pth...
[2023-10-01 11:16:32,060][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000031168_7979008.pth...
[2023-10-01 11:16:32,098][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000028560_7311360.pth
[2023-10-01 11:16:32,101][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000028560_7311360.pth
[2023-10-01 11:16:33,830][119041] Updated weights for policy 0, policy_version 31200 (0.0015)
[2023-10-01 11:16:33,830][119042] Updated weights for policy 1, policy_version 31200 (0.0017)
[2023-10-01 11:16:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 15990784. Throughput: 0: 704.6, 1: 701.6. Samples: 3997696. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:37,053][117973] Avg episode reward: [(0, '117.030'), (1, '65.000')]
[2023-10-01 11:16:42,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 16015360. Throughput: 0: 702.2, 1: 700.2. Samples: 4001797. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-10-01 11:16:42,052][117973] Avg episode reward: [(0, '133.540'), (1, '59.070')]
[2023-10-01 11:16:42,052][118645] Saving new best policy, reward=133.540!
[2023-10-01 11:16:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16048128. Throughput: 0: 703.7, 1: 703.4. Samples: 4010445. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:16:47,052][117973] Avg episode reward: [(0, '135.270'), (1, '60.600')]
[2023-10-01 11:16:47,058][118645] Saving new best policy, reward=135.270!
[2023-10-01 11:16:48,129][119041] Updated weights for policy 0, policy_version 31360 (0.0014)
[2023-10-01 11:16:48,129][119042] Updated weights for policy 1, policy_version 31360 (0.0017)
[2023-10-01 11:16:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 16072704. Throughput: 0: 700.6, 1: 701.1. Samples: 4018728. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:16:52,052][117973] Avg episode reward: [(0, '137.500'), (1, '61.930')]
[2023-10-01 11:16:52,052][118645] Saving new best policy, reward=137.500!
[2023-10-01 11:16:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5581.7). Total num frames: 16105472. Throughput: 0: 703.8, 1: 703.8. Samples: 4022986. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:16:57,052][117973] Avg episode reward: [(0, '140.690'), (1, '60.190')]
[2023-10-01 11:16:57,053][118645] Saving new best policy, reward=140.690!
[2023-10-01 11:17:02,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16130048. Throughput: 0: 707.2, 1: 707.0. Samples: 4031508. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-10-01 11:17:02,052][117973] Avg episode reward: [(0, '133.420'), (1, '60.030')]
[2023-10-01 11:17:02,832][119042] Updated weights for policy 1, policy_version 31520 (0.0014)
[2023-10-01 11:17:02,833][119041] Updated weights for policy 0, policy_version 31520 (0.0016)
[2023-10-01 11:17:07,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5553.9). Total num frames: 16154624. Throughput: 0: 704.7, 1: 702.8. Samples: 4039824. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:07,052][117973] Avg episode reward: [(0, '134.090'), (1, '61.100')]
[2023-10-01 11:17:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16187392. Throughput: 0: 701.9, 1: 703.5. Samples: 4044043. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:12,052][117973] Avg episode reward: [(0, '134.940'), (1, '58.930')]
[2023-10-01 11:17:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16211968. Throughput: 0: 703.3, 1: 702.5. Samples: 4052628. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:17,052][117973] Avg episode reward: [(0, '135.240'), (1, '57.500')]
[2023-10-01 11:17:17,435][119041] Updated weights for policy 0, policy_version 31680 (0.0015)
[2023-10-01 11:17:17,435][119042] Updated weights for policy 1, policy_version 31680 (0.0017)
[2023-10-01 11:17:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5581.7). Total num frames: 16244736. Throughput: 0: 700.4, 1: 702.1. Samples: 4060809. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:22,052][117973] Avg episode reward: [(0, '134.860'), (1, '58.090')]
[2023-10-01 11:17:27,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16269312. Throughput: 0: 702.1, 1: 704.3. Samples: 4065088. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:17:27,052][117973] Avg episode reward: [(0, '138.260'), (1, '58.000')]
[2023-10-01 11:17:32,052][117973] Fps is (10 sec: 4915.0, 60 sec: 5597.8, 300 sec: 5553.9). Total num frames: 16293888. Throughput: 0: 700.9, 1: 698.8. Samples: 4073433. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:17:32,053][117973] Avg episode reward: [(0, '139.920'), (1, '55.140')]
[2023-10-01 11:17:32,253][119041] Updated weights for policy 0, policy_version 31840 (0.0014)
[2023-10-01 11:17:32,253][119042] Updated weights for policy 1, policy_version 31840 (0.0012)
[2023-10-01 11:17:37,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16326656. Throughput: 0: 700.5, 1: 698.1. Samples: 4081668. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:17:37,053][117973] Avg episode reward: [(0, '139.920'), (1, '56.380')]
[2023-10-01 11:17:42,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.8, 300 sec: 5581.7). Total num frames: 16351232. Throughput: 0: 698.6, 1: 696.4. Samples: 4085764. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:17:42,052][117973] Avg episode reward: [(0, '139.920'), (1, '56.380')]
[2023-10-01 11:17:46,928][119042] Updated weights for policy 1, policy_version 32000 (0.0013)
[2023-10-01 11:17:46,928][119041] Updated weights for policy 0, policy_version 32000 (0.0014)
[2023-10-01 11:17:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16384000. Throughput: 0: 694.7, 1: 695.1. Samples: 4094048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:47,052][117973] Avg episode reward: [(0, '140.790'), (1, '55.820')]
[2023-10-01 11:17:47,058][118645] Saving new best policy, reward=140.790!
[2023-10-01 11:17:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16408576. Throughput: 0: 694.4, 1: 695.3. Samples: 4102362. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:52,052][117973] Avg episode reward: [(0, '148.660'), (1, '55.400')]
[2023-10-01 11:17:52,053][118645] Saving new best policy, reward=148.660!
[2023-10-01 11:17:57,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 16433152. Throughput: 0: 695.4, 1: 693.8. Samples: 4106557. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:17:57,053][117973] Avg episode reward: [(0, '155.310'), (1, '53.100')]
[2023-10-01 11:17:57,054][118645] Saving new best policy, reward=155.310!
[2023-10-01 11:18:01,835][119041] Updated weights for policy 0, policy_version 32160 (0.0015)
[2023-10-01 11:18:01,835][119042] Updated weights for policy 1, policy_version 32160 (0.0017)
[2023-10-01 11:18:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16465920. Throughput: 0: 687.5, 1: 689.0. Samples: 4114567. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:02,052][117973] Avg episode reward: [(0, '155.310'), (1, '54.080')]
[2023-10-01 11:18:07,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16490496. Throughput: 0: 692.3, 1: 693.2. Samples: 4123160. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:07,052][117973] Avg episode reward: [(0, '162.360'), (1, '56.020')]
[2023-10-01 11:18:07,053][118645] Saving new best policy, reward=162.360!
[2023-10-01 11:18:12,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 16515072. Throughput: 0: 689.9, 1: 690.7. Samples: 4127214. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:18:12,052][117973] Avg episode reward: [(0, '165.750'), (1, '55.410')]
[2023-10-01 11:18:12,199][118645] Saving new best policy, reward=165.750!
[2023-10-01 11:18:16,512][119041] Updated weights for policy 0, policy_version 32320 (0.0013)
[2023-10-01 11:18:16,512][119042] Updated weights for policy 1, policy_version 32320 (0.0014)
[2023-10-01 11:18:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16547840. Throughput: 0: 690.9, 1: 692.3. Samples: 4135679. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:18:17,052][117973] Avg episode reward: [(0, '168.300'), (1, '55.510')]
[2023-10-01 11:18:17,062][118645] Saving new best policy, reward=168.300!
[2023-10-01 11:18:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5461.3, 300 sec: 5581.7). Total num frames: 16572416. Throughput: 0: 695.3, 1: 697.6. Samples: 4144347. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:18:22,052][117973] Avg episode reward: [(0, '179.190'), (1, '58.120')]
[2023-10-01 11:18:22,053][118645] Saving new best policy, reward=179.190!
[2023-10-01 11:18:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16605184. Throughput: 0: 696.5, 1: 698.8. Samples: 4148550. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-10-01 11:18:27,052][117973] Avg episode reward: [(0, '179.190'), (1, '57.490')]
[2023-10-01 11:18:31,116][119041] Updated weights for policy 0, policy_version 32480 (0.0014)
[2023-10-01 11:18:31,117][119042] Updated weights for policy 1, policy_version 32480 (0.0015)
[2023-10-01 11:18:32,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16629760. Throughput: 0: 697.4, 1: 697.1. Samples: 4156798. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:32,053][117973] Avg episode reward: [(0, '179.190'), (1, '55.850')]
[2023-10-01 11:18:32,064][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000032480_8314880.pth...
[2023-10-01 11:18:32,065][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000032480_8314880.pth...
[2023-10-01 11:18:32,093][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000029872_7647232.pth
[2023-10-01 11:18:32,104][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000029872_7647232.pth
[2023-10-01 11:18:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16662528. Throughput: 0: 698.0, 1: 698.2. Samples: 4165193. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:37,053][117973] Avg episode reward: [(0, '184.680'), (1, '56.560')]
[2023-10-01 11:18:37,053][118645] Saving new best policy, reward=184.680!
[2023-10-01 11:18:42,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16687104. Throughput: 0: 696.0, 1: 695.7. Samples: 4169186. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:42,053][117973] Avg episode reward: [(0, '185.610'), (1, '56.980')]
[2023-10-01 11:18:42,054][118645] Saving new best policy, reward=185.610!
[2023-10-01 11:18:46,038][119042] Updated weights for policy 1, policy_version 32640 (0.0017)
[2023-10-01 11:18:46,038][119041] Updated weights for policy 0, policy_version 32640 (0.0017)
[2023-10-01 11:18:47,051][117973] Fps is (10 sec: 4915.3, 60 sec: 5461.3, 300 sec: 5581.7). Total num frames: 16711680. Throughput: 0: 699.8, 1: 698.4. Samples: 4177486. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:47,052][117973] Avg episode reward: [(0, '188.120'), (1, '56.570')]
[2023-10-01 11:18:47,060][118645] Saving new best policy, reward=188.120!
[2023-10-01 11:18:52,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16744448. Throughput: 0: 696.6, 1: 696.5. Samples: 4185852. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:18:52,052][117973] Avg episode reward: [(0, '187.030'), (1, '56.320')]
[2023-10-01 11:18:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 16769024. Throughput: 0: 698.6, 1: 697.8. Samples: 4190055. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:18:57,052][117973] Avg episode reward: [(0, '186.310'), (1, '53.640')]
[2023-10-01 11:19:01,428][119041] Updated weights for policy 0, policy_version 32800 (0.0012)
[2023-10-01 11:19:01,428][119042] Updated weights for policy 1, policy_version 32800 (0.0011)
[2023-10-01 11:19:02,051][117973] Fps is (10 sec: 4915.1, 60 sec: 5461.3, 300 sec: 5567.8). Total num frames: 16793600. Throughput: 0: 687.3, 1: 686.4. Samples: 4197496. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:19:02,053][117973] Avg episode reward: [(0, '186.310'), (1, '52.460')]
[2023-10-01 11:19:07,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 16818176. Throughput: 0: 670.1, 1: 667.6. Samples: 4204544. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:19:07,052][117973] Avg episode reward: [(0, '188.730'), (1, '51.700')]
[2023-10-01 11:19:07,052][118645] Saving new best policy, reward=188.730!
[2023-10-01 11:19:12,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 16842752. Throughput: 0: 661.9, 1: 662.3. Samples: 4208137. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-10-01 11:19:12,052][117973] Avg episode reward: [(0, '193.690'), (1, '49.820')]
[2023-10-01 11:19:12,053][118645] Saving new best policy, reward=193.690!
[2023-10-01 11:19:17,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5324.8, 300 sec: 5526.1). Total num frames: 16867328. Throughput: 0: 654.3, 1: 655.2. Samples: 4215727. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:19:17,052][117973] Avg episode reward: [(0, '193.690'), (1, '49.820')]
[2023-10-01 11:19:17,843][119042] Updated weights for policy 1, policy_version 32960 (0.0015)
[2023-10-01 11:19:17,843][119041] Updated weights for policy 0, policy_version 32960 (0.0014)
[2023-10-01 11:19:22,051][117973] Fps is (10 sec: 4915.3, 60 sec: 5324.8, 300 sec: 5526.1). Total num frames: 16891904. Throughput: 0: 654.1, 1: 655.2. Samples: 4224111. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:19:22,052][117973] Avg episode reward: [(0, '193.310'), (1, '54.310')]
[2023-10-01 11:19:27,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5324.8, 300 sec: 5553.9). Total num frames: 16924672. Throughput: 0: 657.8, 1: 656.8. Samples: 4228344. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:19:27,053][117973] Avg episode reward: [(0, '193.320'), (1, '55.770')]
[2023-10-01 11:19:32,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5324.8, 300 sec: 5526.1). Total num frames: 16949248. Throughput: 0: 657.2, 1: 658.0. Samples: 4236666. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:19:32,053][117973] Avg episode reward: [(0, '193.320'), (1, '54.980')]
[2023-10-01 11:19:32,588][119041] Updated weights for policy 0, policy_version 33120 (0.0019)
[2023-10-01 11:19:32,588][119042] Updated weights for policy 1, policy_version 33120 (0.0018)
[2023-10-01 11:19:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5324.8, 300 sec: 5553.9). Total num frames: 16982016. Throughput: 0: 663.4, 1: 661.6. Samples: 4245476. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:19:37,052][117973] Avg episode reward: [(0, '193.320'), (1, '56.740')]
[2023-10-01 11:19:42,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5461.4, 300 sec: 5553.9). Total num frames: 17014784. Throughput: 0: 663.5, 1: 664.4. Samples: 4249812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:19:42,052][117973] Avg episode reward: [(0, '193.320'), (1, '57.780')]
[2023-10-01 11:19:46,204][119041] Updated weights for policy 0, policy_version 33280 (0.0018)
[2023-10-01 11:19:46,204][119042] Updated weights for policy 1, policy_version 33280 (0.0017)
[2023-10-01 11:19:47,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 17039360. Throughput: 0: 685.7, 1: 684.2. Samples: 4259142. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:19:47,052][117973] Avg episode reward: [(0, '196.760'), (1, '59.100')]
[2023-10-01 11:19:47,058][118645] Saving new best policy, reward=196.760!
[2023-10-01 11:19:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5581.7). Total num frames: 17072128. Throughput: 0: 703.4, 1: 705.4. Samples: 4267943. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:19:52,053][117973] Avg episode reward: [(0, '192.610'), (1, '60.000')]
[2023-10-01 11:19:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5553.9). Total num frames: 17096704. Throughput: 0: 712.4, 1: 711.0. Samples: 4272192. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:19:57,053][117973] Avg episode reward: [(0, '192.610'), (1, '60.040')]
[2023-10-01 11:19:59,946][119041] Updated weights for policy 0, policy_version 33440 (0.0019)
[2023-10-01 11:19:59,946][119042] Updated weights for policy 1, policy_version 33440 (0.0019)
[2023-10-01 11:20:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5581.7). Total num frames: 17129472. Throughput: 0: 728.7, 1: 728.4. Samples: 4281296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:02,052][117973] Avg episode reward: [(0, '192.610'), (1, '60.960')]
[2023-10-01 11:20:07,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5734.4, 300 sec: 5581.7). Total num frames: 17162240. Throughput: 0: 736.4, 1: 736.3. Samples: 4290385. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:07,052][117973] Avg episode reward: [(0, '192.610'), (1, '63.700')]
[2023-10-01 11:20:12,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5581.7). Total num frames: 17186816. Throughput: 0: 737.4, 1: 736.2. Samples: 4294660. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:12,052][117973] Avg episode reward: [(0, '192.610'), (1, '63.560')]
[2023-10-01 11:20:13,689][119041] Updated weights for policy 0, policy_version 33600 (0.0019)
[2023-10-01 11:20:13,689][119042] Updated weights for policy 1, policy_version 33600 (0.0018)
[2023-10-01 11:20:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5581.7). Total num frames: 17219584. Throughput: 0: 742.8, 1: 744.4. Samples: 4303591. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:17,053][117973] Avg episode reward: [(0, '192.610'), (1, '63.510')]
[2023-10-01 11:20:22,051][117973] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5609.4). Total num frames: 17252352. Throughput: 0: 745.4, 1: 748.6. Samples: 4312704. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:22,052][117973] Avg episode reward: [(0, '192.610'), (1, '63.870')]
[2023-10-01 11:20:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5609.4). Total num frames: 17276928. Throughput: 0: 749.9, 1: 747.0. Samples: 4317172. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:27,052][117973] Avg episode reward: [(0, '192.610'), (1, '63.740')]
[2023-10-01 11:20:27,566][119041] Updated weights for policy 0, policy_version 33760 (0.0018)
[2023-10-01 11:20:27,567][119042] Updated weights for policy 1, policy_version 33760 (0.0015)
[2023-10-01 11:20:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5609.4). Total num frames: 17309696. Throughput: 0: 735.9, 1: 736.2. Samples: 4325384. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:32,052][117973] Avg episode reward: [(0, '192.610'), (1, '62.090')]
[2023-10-01 11:20:32,062][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000033808_8654848.pth...
[2023-10-01 11:20:32,062][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000033808_8654848.pth...
[2023-10-01 11:20:32,098][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000031168_7979008.pth
[2023-10-01 11:20:32,102][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000031168_7979008.pth
[2023-10-01 11:20:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5609.4). Total num frames: 17334272. Throughput: 0: 736.0, 1: 737.5. Samples: 4334247. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:37,053][117973] Avg episode reward: [(0, '218.110'), (1, '61.840')]
[2023-10-01 11:20:37,054][118645] Saving new best policy, reward=218.110!
[2023-10-01 11:20:41,493][119042] Updated weights for policy 1, policy_version 33920 (0.0016)
[2023-10-01 11:20:41,493][119041] Updated weights for policy 0, policy_version 33920 (0.0018)
[2023-10-01 11:20:42,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5609.4). Total num frames: 17367040. Throughput: 0: 740.5, 1: 741.8. Samples: 4338892. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:20:42,053][117973] Avg episode reward: [(0, '218.480'), (1, '63.500')]
[2023-10-01 11:20:42,054][118645] Saving new best policy, reward=218.480!
[2023-10-01 11:20:47,051][117973] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 5637.2). Total num frames: 17399808. Throughput: 0: 741.2, 1: 739.1. Samples: 4347912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:20:47,052][117973] Avg episode reward: [(0, '218.480'), (1, '64.300')]
[2023-10-01 11:20:52,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5637.2). Total num frames: 17424384. Throughput: 0: 737.4, 1: 736.2. Samples: 4356695. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:20:52,052][117973] Avg episode reward: [(0, '222.070'), (1, '65.480')]
[2023-10-01 11:20:52,053][118645] Saving new best policy, reward=222.070!
[2023-10-01 11:20:55,537][119042] Updated weights for policy 1, policy_version 34080 (0.0016)
[2023-10-01 11:20:55,537][119041] Updated weights for policy 0, policy_version 34080 (0.0016)
[2023-10-01 11:20:57,051][117973] Fps is (10 sec: 5324.9, 60 sec: 5939.2, 300 sec: 5623.3). Total num frames: 17453056. Throughput: 0: 735.5, 1: 736.7. Samples: 4360909. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:20:57,052][117973] Avg episode reward: [(0, '305.060'), (1, '65.540')]
[2023-10-01 11:20:57,052][118645] Saving new best policy, reward=305.060!
[2023-10-01 11:21:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5637.2). Total num frames: 17481728. Throughput: 0: 726.6, 1: 724.9. Samples: 4368906. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-10-01 11:21:02,052][117973] Avg episode reward: [(0, '295.790'), (1, '66.870')]
[2023-10-01 11:21:07,052][117973] Fps is (10 sec: 5324.6, 60 sec: 5734.4, 300 sec: 5609.4). Total num frames: 17506304. Throughput: 0: 720.8, 1: 719.2. Samples: 4377506. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:07,053][117973] Avg episode reward: [(0, '287.930'), (1, '68.920')]
[2023-10-01 11:21:10,230][119041] Updated weights for policy 0, policy_version 34240 (0.0013)
[2023-10-01 11:21:10,231][119042] Updated weights for policy 1, policy_version 34240 (0.0016)
[2023-10-01 11:21:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5637.2). Total num frames: 17539072. Throughput: 0: 718.6, 1: 721.0. Samples: 4381952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:12,052][117973] Avg episode reward: [(0, '287.930'), (1, '67.760')]
[2023-10-01 11:21:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17563648. Throughput: 0: 722.2, 1: 723.8. Samples: 4390453. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:17,052][117973] Avg episode reward: [(0, '287.930'), (1, '67.970')]
[2023-10-01 11:21:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17596416. Throughput: 0: 720.1, 1: 717.1. Samples: 4398923. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:22,052][117973] Avg episode reward: [(0, '287.930'), (1, '69.500')]
[2023-10-01 11:21:24,928][119041] Updated weights for policy 0, policy_version 34400 (0.0011)
[2023-10-01 11:21:24,929][119042] Updated weights for policy 1, policy_version 34400 (0.0016)
[2023-10-01 11:21:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17620992. Throughput: 0: 711.4, 1: 712.0. Samples: 4402944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:27,056][117973] Avg episode reward: [(0, '287.930'), (1, '72.620')]
[2023-10-01 11:21:32,052][117973] Fps is (10 sec: 4915.1, 60 sec: 5597.9, 300 sec: 5609.4). Total num frames: 17645568. Throughput: 0: 702.6, 1: 704.8. Samples: 4411249. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:32,052][117973] Avg episode reward: [(0, '287.930'), (1, '76.990')]
[2023-10-01 11:21:37,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17678336. Throughput: 0: 699.7, 1: 698.3. Samples: 4419607. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:37,052][117973] Avg episode reward: [(0, '287.930'), (1, '75.350')]
[2023-10-01 11:21:39,291][119042] Updated weights for policy 1, policy_version 34560 (0.0013)
[2023-10-01 11:21:39,292][119041] Updated weights for policy 0, policy_version 34560 (0.0015)
[2023-10-01 11:21:42,052][117973] Fps is (10 sec: 6553.6, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17711104. Throughput: 0: 700.9, 1: 702.3. Samples: 4424053. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:42,052][117973] Avg episode reward: [(0, '287.930'), (1, '76.110')]
[2023-10-01 11:21:47,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 17735680. Throughput: 0: 712.0, 1: 712.8. Samples: 4433023. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:47,052][117973] Avg episode reward: [(0, '287.930'), (1, '72.140')]
[2023-10-01 11:21:52,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5609.4). Total num frames: 17760256. Throughput: 0: 710.8, 1: 711.2. Samples: 4441498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:52,052][117973] Avg episode reward: [(0, '323.930'), (1, '71.340')]
[2023-10-01 11:21:52,097][118645] Saving new best policy, reward=323.930!
[2023-10-01 11:21:53,542][119041] Updated weights for policy 0, policy_version 34720 (0.0014)
[2023-10-01 11:21:53,542][119042] Updated weights for policy 1, policy_version 34720 (0.0009)
[2023-10-01 11:21:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5666.1, 300 sec: 5637.2). Total num frames: 17793024. Throughput: 0: 710.8, 1: 709.6. Samples: 4445866. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:21:57,052][117973] Avg episode reward: [(0, '334.640'), (1, '72.240')]
[2023-10-01 11:21:57,053][118645] Saving new best policy, reward=334.640!
[2023-10-01 11:22:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 17817600. Throughput: 0: 710.8, 1: 709.7. Samples: 4454375. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:02,052][117973] Avg episode reward: [(0, '334.640'), (1, '71.160')]
[2023-10-01 11:22:07,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17850368. Throughput: 0: 707.7, 1: 707.2. Samples: 4462593. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:07,052][117973] Avg episode reward: [(0, '334.640'), (1, '70.390')]
[2023-10-01 11:22:08,023][119041] Updated weights for policy 0, policy_version 34880 (0.0009)
[2023-10-01 11:22:08,023][119042] Updated weights for policy 1, policy_version 34880 (0.0011)
[2023-10-01 11:22:12,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.8, 300 sec: 5637.2). Total num frames: 17874944. Throughput: 0: 710.0, 1: 708.4. Samples: 4466772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:12,053][117973] Avg episode reward: [(0, '336.880'), (1, '70.310')]
[2023-10-01 11:22:12,172][118645] Saving new best policy, reward=336.880!
[2023-10-01 11:22:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17907712. Throughput: 0: 716.8, 1: 717.3. Samples: 4475784. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:17,052][117973] Avg episode reward: [(0, '353.490'), (1, '68.680')]
[2023-10-01 11:22:17,060][118645] Saving new best policy, reward=353.490!
[2023-10-01 11:22:22,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 17932288. Throughput: 0: 718.9, 1: 720.9. Samples: 4484396. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:22,052][117973] Avg episode reward: [(0, '352.780'), (1, '65.140')]
[2023-10-01 11:22:22,127][119041] Updated weights for policy 0, policy_version 35040 (0.0012)
[2023-10-01 11:22:22,128][119042] Updated weights for policy 1, policy_version 35040 (0.0014)
[2023-10-01 11:22:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 17965056. Throughput: 0: 719.6, 1: 717.6. Samples: 4488727. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:27,052][117973] Avg episode reward: [(0, '352.780'), (1, '63.670')]
[2023-10-01 11:22:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5637.2). Total num frames: 17989632. Throughput: 0: 713.6, 1: 711.0. Samples: 4497129. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:32,052][117973] Avg episode reward: [(0, '353.360'), (1, '62.460')]
[2023-10-01 11:22:32,058][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000035136_8994816.pth...
[2023-10-01 11:22:32,059][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000035136_8994816.pth...
[2023-10-01 11:22:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000032480_8314880.pth
[2023-10-01 11:22:32,100][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000032480_8314880.pth
[2023-10-01 11:22:36,676][119041] Updated weights for policy 0, policy_version 35200 (0.0011)
[2023-10-01 11:22:36,677][119042] Updated weights for policy 1, policy_version 35200 (0.0013)
[2023-10-01 11:22:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 18022400. Throughput: 0: 713.5, 1: 711.0. Samples: 4505600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:37,052][117973] Avg episode reward: [(0, '352.910'), (1, '61.440')]
[2023-10-01 11:22:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 18046976. Throughput: 0: 710.0, 1: 708.5. Samples: 4509696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:42,052][117973] Avg episode reward: [(0, '352.910'), (1, '64.660')]
[2023-10-01 11:22:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 18079744. Throughput: 0: 707.8, 1: 710.0. Samples: 4518177. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:47,052][117973] Avg episode reward: [(0, '352.910'), (1, '63.070')]
[2023-10-01 11:22:51,258][119041] Updated weights for policy 0, policy_version 35360 (0.0015)
[2023-10-01 11:22:51,259][119042] Updated weights for policy 1, policy_version 35360 (0.0016)
[2023-10-01 11:22:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 18104320. Throughput: 0: 709.0, 1: 711.4. Samples: 4526514. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:52,052][117973] Avg episode reward: [(0, '352.910'), (1, '61.630')]
[2023-10-01 11:22:57,051][117973] Fps is (10 sec: 5324.8, 60 sec: 5666.1, 300 sec: 5651.1). Total num frames: 18132992. Throughput: 0: 710.9, 1: 712.2. Samples: 4530811. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:22:57,052][117973] Avg episode reward: [(0, '376.820'), (1, '60.960')]
[2023-10-01 11:22:57,053][118645] Saving new best policy, reward=376.820!
[2023-10-01 11:23:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 18161664. Throughput: 0: 705.2, 1: 704.0. Samples: 4539201. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:23:02,052][117973] Avg episode reward: [(0, '376.820'), (1, '61.580')]
[2023-10-01 11:23:05,793][119041] Updated weights for policy 0, policy_version 35520 (0.0012)
[2023-10-01 11:23:05,794][119042] Updated weights for policy 1, policy_version 35520 (0.0014)
[2023-10-01 11:23:07,051][117973] Fps is (10 sec: 5324.8, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 18186240. Throughput: 0: 705.2, 1: 704.1. Samples: 4547813. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:23:07,052][117973] Avg episode reward: [(0, '376.820'), (1, '59.840')]
[2023-10-01 11:23:12,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 18219008. Throughput: 0: 703.6, 1: 704.9. Samples: 4552111. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:23:12,052][117973] Avg episode reward: [(0, '376.820'), (1, '61.490')]
[2023-10-01 11:23:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 18243584. Throughput: 0: 702.3, 1: 705.0. Samples: 4560461. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:23:17,052][117973] Avg episode reward: [(0, '376.820'), (1, '56.610')]
[2023-10-01 11:23:20,590][119042] Updated weights for policy 1, policy_version 35680 (0.0013)
[2023-10-01 11:23:20,590][119041] Updated weights for policy 0, policy_version 35680 (0.0011)
[2023-10-01 11:23:22,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5637.2). Total num frames: 18268160. Throughput: 0: 698.4, 1: 701.1. Samples: 4568578. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:23:22,052][117973] Avg episode reward: [(0, '381.480'), (1, '55.010')]
[2023-10-01 11:23:22,076][118645] Saving new best policy, reward=381.480!
[2023-10-01 11:23:27,052][117973] Fps is (10 sec: 5734.0, 60 sec: 5597.8, 300 sec: 5665.0). Total num frames: 18300928. Throughput: 0: 701.0, 1: 703.5. Samples: 4572897. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:23:27,053][117973] Avg episode reward: [(0, '388.980'), (1, '54.850')]
[2023-10-01 11:23:27,054][118645] Saving new best policy, reward=388.980!
[2023-10-01 11:23:32,051][117973] Fps is (10 sec: 6553.5, 60 sec: 5734.4, 300 sec: 5665.0). Total num frames: 18333696. Throughput: 0: 703.6, 1: 701.6. Samples: 4581410. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:23:32,052][117973] Avg episode reward: [(0, '388.980'), (1, '54.450')]
[2023-10-01 11:23:34,909][119042] Updated weights for policy 1, policy_version 35840 (0.0012)
[2023-10-01 11:23:34,910][119041] Updated weights for policy 0, policy_version 35840 (0.0014)
[2023-10-01 11:23:37,051][117973] Fps is (10 sec: 5734.8, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 18358272. Throughput: 0: 703.4, 1: 703.5. Samples: 4589824. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:23:37,052][117973] Avg episode reward: [(0, '388.980'), (1, '53.470')]
[2023-10-01 11:23:42,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.8, 300 sec: 5665.0). Total num frames: 18382848. Throughput: 0: 701.1, 1: 701.2. Samples: 4593914. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:23:42,053][117973] Avg episode reward: [(0, '388.980'), (1, '51.770')]
[2023-10-01 11:23:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 18415616. Throughput: 0: 707.0, 1: 708.0. Samples: 4602875. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-10-01 11:23:47,052][117973] Avg episode reward: [(0, '388.980'), (1, '52.080')]
[2023-10-01 11:23:49,177][119042] Updated weights for policy 1, policy_version 36000 (0.0014)
[2023-10-01 11:23:49,177][119041] Updated weights for policy 0, policy_version 36000 (0.0015)
[2023-10-01 11:23:52,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 18448384. Throughput: 0: 709.8, 1: 710.2. Samples: 4611716. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:23:52,052][117973] Avg episode reward: [(0, '388.980'), (1, '51.730')]
[2023-10-01 11:23:57,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5666.1, 300 sec: 5692.7). Total num frames: 18472960. Throughput: 0: 709.3, 1: 711.0. Samples: 4616021. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:23:57,053][117973] Avg episode reward: [(0, '388.980'), (1, '50.210')]
[2023-10-01 11:24:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18505728. Throughput: 0: 711.8, 1: 710.6. Samples: 4624466. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:24:02,052][117973] Avg episode reward: [(0, '432.740'), (1, '49.570')]
[2023-10-01 11:24:02,060][118645] Saving new best policy, reward=432.740!
[2023-10-01 11:24:03,330][119042] Updated weights for policy 1, policy_version 36160 (0.0012)
[2023-10-01 11:24:03,330][119041] Updated weights for policy 0, policy_version 36160 (0.0014)
[2023-10-01 11:24:07,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18530304. Throughput: 0: 715.4, 1: 715.5. Samples: 4632969. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:24:07,053][117973] Avg episode reward: [(0, '427.000'), (1, '51.730')]
[2023-10-01 11:24:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 18563072. Throughput: 0: 716.7, 1: 716.8. Samples: 4637405. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-10-01 11:24:12,052][117973] Avg episode reward: [(0, '427.000'), (1, '52.530')]
[2023-10-01 11:24:17,052][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 18587648. Throughput: 0: 720.3, 1: 723.9. Samples: 4646400. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:24:17,052][117973] Avg episode reward: [(0, '441.710'), (1, '54.740')]
[2023-10-01 11:24:17,061][118645] Saving new best policy, reward=441.710!
[2023-10-01 11:24:17,501][119042] Updated weights for policy 1, policy_version 36320 (0.0011)
[2023-10-01 11:24:17,502][119041] Updated weights for policy 0, policy_version 36320 (0.0015)
[2023-10-01 11:24:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 5748.3). Total num frames: 18620416. Throughput: 0: 724.3, 1: 724.0. Samples: 4654998. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:24:22,052][117973] Avg episode reward: [(0, '438.670'), (1, '55.920')]
[2023-10-01 11:24:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.5, 300 sec: 5748.3). Total num frames: 18644992. Throughput: 0: 726.9, 1: 723.9. Samples: 4659200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:24:27,052][117973] Avg episode reward: [(0, '438.120'), (1, '50.650')]
[2023-10-01 11:24:31,747][119041] Updated weights for policy 0, policy_version 36480 (0.0015)
[2023-10-01 11:24:31,747][119042] Updated weights for policy 1, policy_version 36480 (0.0012)
[2023-10-01 11:24:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 18677760. Throughput: 0: 719.2, 1: 719.1. Samples: 4667598. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:24:32,052][117973] Avg episode reward: [(0, '440.990'), (1, '50.820')]
[2023-10-01 11:24:32,059][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000036480_9338880.pth...
[2023-10-01 11:24:32,059][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000036480_9338880.pth...
[2023-10-01 11:24:32,092][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000033808_8654848.pth
[2023-10-01 11:24:32,097][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000033808_8654848.pth
[2023-10-01 11:24:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18702336. Throughput: 0: 720.6, 1: 721.5. Samples: 4676612. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:24:37,052][117973] Avg episode reward: [(0, '435.890'), (1, '46.000')]
[2023-10-01 11:24:42,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5871.0, 300 sec: 5748.3). Total num frames: 18735104. Throughput: 0: 722.2, 1: 721.4. Samples: 4680982. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:24:42,052][117973] Avg episode reward: [(0, '445.760'), (1, '45.840')]
[2023-10-01 11:24:42,052][118645] Saving new best policy, reward=445.760!
[2023-10-01 11:24:46,072][119042] Updated weights for policy 1, policy_version 36640 (0.0016)
[2023-10-01 11:24:46,073][119041] Updated weights for policy 0, policy_version 36640 (0.0018)
[2023-10-01 11:24:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18759680. Throughput: 0: 720.9, 1: 721.2. Samples: 4689358. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:24:47,052][117973] Avg episode reward: [(0, '483.650'), (1, '45.510')]
[2023-10-01 11:24:47,060][118645] Saving new best policy, reward=483.650!
[2023-10-01 11:24:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 18792448. Throughput: 0: 722.0, 1: 722.0. Samples: 4697950. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:24:52,052][117973] Avg episode reward: [(0, '477.270'), (1, '46.490')]
[2023-10-01 11:24:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18817024. Throughput: 0: 720.6, 1: 718.8. Samples: 4702178. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-10-01 11:24:57,052][117973] Avg episode reward: [(0, '477.920'), (1, '45.100')]
[2023-10-01 11:25:00,463][119042] Updated weights for policy 1, policy_version 36800 (0.0011)
[2023-10-01 11:25:00,464][119041] Updated weights for policy 0, policy_version 36800 (0.0014)
[2023-10-01 11:25:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18849792. Throughput: 0: 713.4, 1: 711.1. Samples: 4710499. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:02,052][117973] Avg episode reward: [(0, '470.300'), (1, '45.320')]
[2023-10-01 11:25:07,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18874368. Throughput: 0: 716.2, 1: 716.0. Samples: 4719445. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:07,052][117973] Avg episode reward: [(0, '462.810'), (1, '47.230')]
[2023-10-01 11:25:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18907136. Throughput: 0: 717.4, 1: 720.8. Samples: 4723917. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:12,052][117973] Avg episode reward: [(0, '462.810'), (1, '50.530')]
[2023-10-01 11:25:14,506][119041] Updated weights for policy 0, policy_version 36960 (0.0014)
[2023-10-01 11:25:14,507][119042] Updated weights for policy 1, policy_version 36960 (0.0015)
[2023-10-01 11:25:17,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 18931712. Throughput: 0: 722.7, 1: 722.8. Samples: 4732645. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:17,052][117973] Avg episode reward: [(0, '460.920'), (1, '50.110')]
[2023-10-01 11:25:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 18964480. Throughput: 0: 718.1, 1: 715.6. Samples: 4741131. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:22,052][117973] Avg episode reward: [(0, '460.920'), (1, '52.000')]
[2023-10-01 11:25:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 18989056. Throughput: 0: 715.3, 1: 714.5. Samples: 4745323. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:25:27,052][117973] Avg episode reward: [(0, '461.410'), (1, '51.730')]
[2023-10-01 11:25:28,880][119042] Updated weights for policy 1, policy_version 37120 (0.0011)
[2023-10-01 11:25:28,881][119041] Updated weights for policy 0, policy_version 37120 (0.0012)
[2023-10-01 11:25:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19021824. Throughput: 0: 713.4, 1: 713.6. Samples: 4753576. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:25:32,052][117973] Avg episode reward: [(0, '461.740'), (1, '52.380')]
[2023-10-01 11:25:37,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19046400. Throughput: 0: 710.7, 1: 710.2. Samples: 4761890. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:25:37,052][117973] Avg episode reward: [(0, '456.990'), (1, '52.710')]
[2023-10-01 11:25:42,051][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 19070976. Throughput: 0: 708.9, 1: 711.0. Samples: 4766076. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:25:42,052][117973] Avg episode reward: [(0, '456.990'), (1, '53.450')]
[2023-10-01 11:25:43,672][119041] Updated weights for policy 0, policy_version 37280 (0.0018)
[2023-10-01 11:25:43,673][119042] Updated weights for policy 1, policy_version 37280 (0.0016)
[2023-10-01 11:25:47,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19103744. Throughput: 0: 712.4, 1: 714.2. Samples: 4774697. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-10-01 11:25:47,052][117973] Avg episode reward: [(0, '456.990'), (1, '54.370')]
[2023-10-01 11:25:52,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5678.9). Total num frames: 19128320. Throughput: 0: 711.6, 1: 709.7. Samples: 4783403. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:52,052][117973] Avg episode reward: [(0, '459.890'), (1, '55.180')]
[2023-10-01 11:25:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19161088. Throughput: 0: 707.0, 1: 704.4. Samples: 4787430. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:25:57,052][117973] Avg episode reward: [(0, '457.790'), (1, '56.940')]
[2023-10-01 11:25:58,194][119042] Updated weights for policy 1, policy_version 37440 (0.0015)
[2023-10-01 11:25:58,194][119041] Updated weights for policy 0, policy_version 37440 (0.0015)
[2023-10-01 11:26:02,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5692.8). Total num frames: 19185664. Throughput: 0: 701.2, 1: 700.8. Samples: 4795731. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:02,052][117973] Avg episode reward: [(0, '457.790'), (1, '59.340')]
[2023-10-01 11:26:07,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19218432. Throughput: 0: 702.6, 1: 704.6. Samples: 4804455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:07,052][117973] Avg episode reward: [(0, '457.790'), (1, '59.630')]
[2023-10-01 11:26:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5692.7). Total num frames: 19243008. Throughput: 0: 702.8, 1: 703.1. Samples: 4808592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:12,052][117973] Avg episode reward: [(0, '457.790'), (1, '59.540')]
[2023-10-01 11:26:12,585][119041] Updated weights for policy 0, policy_version 37600 (0.0016)
[2023-10-01 11:26:12,586][119042] Updated weights for policy 1, policy_version 37600 (0.0013)
[2023-10-01 11:26:17,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19275776. Throughput: 0: 704.8, 1: 704.6. Samples: 4816999. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:17,053][117973] Avg episode reward: [(0, '457.790'), (1, '58.350')]
[2023-10-01 11:26:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5692.7). Total num frames: 19300352. Throughput: 0: 710.0, 1: 710.0. Samples: 4825789. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:22,052][117973] Avg episode reward: [(0, '479.010'), (1, '58.530')]
[2023-10-01 11:26:26,848][119042] Updated weights for policy 1, policy_version 37760 (0.0014)
[2023-10-01 11:26:26,848][119041] Updated weights for policy 0, policy_version 37760 (0.0015)
[2023-10-01 11:26:27,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19333120. Throughput: 0: 711.1, 1: 711.3. Samples: 4830082. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:27,052][117973] Avg episode reward: [(0, '478.440'), (1, '58.730')]
[2023-10-01 11:26:32,051][117973] Fps is (10 sec: 5734.3, 60 sec: 5597.9, 300 sec: 5692.7). Total num frames: 19357696. Throughput: 0: 709.2, 1: 708.1. Samples: 4838473. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:32,052][117973] Avg episode reward: [(0, '478.440'), (1, '60.020')]
[2023-10-01 11:26:32,059][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000037808_9678848.pth...
[2023-10-01 11:26:32,059][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000037808_9678848.pth...
[2023-10-01 11:26:32,091][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000035136_8994816.pth
[2023-10-01 11:26:32,095][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000035136_8994816.pth
[2023-10-01 11:26:37,052][117973] Fps is (10 sec: 4915.2, 60 sec: 5597.9, 300 sec: 5665.0). Total num frames: 19382272. Throughput: 0: 705.8, 1: 707.2. Samples: 4846987. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:37,052][117973] Avg episode reward: [(0, '497.310'), (1, '60.780')]
[2023-10-01 11:26:37,072][118645] Saving new best policy, reward=497.310!
[2023-10-01 11:26:41,328][119042] Updated weights for policy 1, policy_version 37920 (0.0014)
[2023-10-01 11:26:41,329][119041] Updated weights for policy 0, policy_version 37920 (0.0016)
[2023-10-01 11:26:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19415040. Throughput: 0: 708.9, 1: 710.7. Samples: 4851311. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:42,052][117973] Avg episode reward: [(0, '491.190'), (1, '63.340')]
[2023-10-01 11:26:47,051][117973] Fps is (10 sec: 6553.7, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19447808. Throughput: 0: 714.1, 1: 712.9. Samples: 4859948. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:47,052][117973] Avg episode reward: [(0, '491.190'), (1, '63.450')]
[2023-10-01 11:26:52,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19472384. Throughput: 0: 714.8, 1: 714.6. Samples: 4868778. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:52,053][117973] Avg episode reward: [(0, '494.290'), (1, '63.370')]
[2023-10-01 11:26:55,489][119041] Updated weights for policy 0, policy_version 38080 (0.0015)
[2023-10-01 11:26:55,490][119042] Updated weights for policy 1, policy_version 38080 (0.0011)
[2023-10-01 11:26:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19505152. Throughput: 0: 716.6, 1: 716.0. Samples: 4873061. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:26:57,053][117973] Avg episode reward: [(0, '494.310'), (1, '63.380')]
[2023-10-01 11:27:02,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19529728. Throughput: 0: 720.2, 1: 720.4. Samples: 4881826. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:27:02,052][117973] Avg episode reward: [(0, '494.310'), (1, '64.200')]
[2023-10-01 11:27:07,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19562496. Throughput: 0: 721.6, 1: 719.2. Samples: 4890624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:27:07,052][117973] Avg episode reward: [(0, '494.310'), (1, '63.620')]
[2023-10-01 11:27:09,479][119042] Updated weights for policy 1, policy_version 38240 (0.0015)
[2023-10-01 11:27:09,480][119041] Updated weights for policy 0, policy_version 38240 (0.0014)
[2023-10-01 11:27:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19587072. Throughput: 0: 719.7, 1: 718.2. Samples: 4894787. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:27:12,052][117973] Avg episode reward: [(0, '494.310'), (1, '63.900')]
[2023-10-01 11:27:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19619840. Throughput: 0: 722.1, 1: 722.1. Samples: 4903459. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:27:17,052][117973] Avg episode reward: [(0, '494.310'), (1, '63.340')]
[2023-10-01 11:27:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19644416. Throughput: 0: 722.6, 1: 724.1. Samples: 4912088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:27:22,052][117973] Avg episode reward: [(0, '498.060'), (1, '59.000')]
[2023-10-01 11:27:22,052][118645] Saving new best policy, reward=498.060!
[2023-10-01 11:27:23,857][119042] Updated weights for policy 1, policy_version 38400 (0.0012)
[2023-10-01 11:27:23,857][119041] Updated weights for policy 0, policy_version 38400 (0.0013)
[2023-10-01 11:27:27,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19677184. Throughput: 0: 721.3, 1: 723.1. Samples: 4916310. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:27:27,052][117973] Avg episode reward: [(0, '500.050'), (1, '59.140')]
[2023-10-01 11:27:27,052][118645] Saving new best policy, reward=500.050!
[2023-10-01 11:27:32,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19701760. Throughput: 0: 722.6, 1: 723.0. Samples: 4925001. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 11:27:32,052][117973] Avg episode reward: [(0, '474.990'), (1, '56.810')]
[2023-10-01 11:27:37,052][117973] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 5720.5). Total num frames: 19734528. Throughput: 0: 721.7, 1: 719.5. Samples: 4933631. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 11:27:37,052][117973] Avg episode reward: [(0, '474.990'), (1, '58.670')]
[2023-10-01 11:27:38,159][119042] Updated weights for policy 1, policy_version 38560 (0.0013)
[2023-10-01 11:27:38,160][119041] Updated weights for policy 0, policy_version 38560 (0.0015)
[2023-10-01 11:27:42,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5692.7). Total num frames: 19759104. Throughput: 0: 719.5, 1: 717.6. Samples: 4937732. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 11:27:42,052][117973] Avg episode reward: [(0, '474.990'), (1, '57.830')]
[2023-10-01 11:27:47,051][117973] Fps is (10 sec: 5734.5, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19791872. Throughput: 0: 716.7, 1: 717.3. Samples: 4946355. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 11:27:47,052][117973] Avg episode reward: [(0, '474.990'), (1, '57.120')]
[2023-10-01 11:27:52,052][117973] Fps is (10 sec: 5734.2, 60 sec: 5734.4, 300 sec: 5706.6). Total num frames: 19816448. Throughput: 0: 715.4, 1: 718.3. Samples: 4955138. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-10-01 11:27:52,053][117973] Avg episode reward: [(0, '436.360'), (1, '55.420')]
[2023-10-01 11:27:52,324][119041] Updated weights for policy 0, policy_version 38720 (0.0014)
[2023-10-01 11:27:52,326][119042] Updated weights for policy 1, policy_version 38720 (0.0013)
[2023-10-01 11:27:57,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19849216. Throughput: 0: 718.9, 1: 718.4. Samples: 4959467. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:27:57,052][117973] Avg episode reward: [(0, '457.350'), (1, '55.640')]
[2023-10-01 11:28:02,051][117973] Fps is (10 sec: 5734.6, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19873792. Throughput: 0: 715.8, 1: 715.1. Samples: 4967851. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:28:02,052][117973] Avg episode reward: [(0, '457.350'), (1, '55.200')]
[2023-10-01 11:28:06,846][119042] Updated weights for policy 1, policy_version 38880 (0.0014)
[2023-10-01 11:28:06,846][119041] Updated weights for policy 0, policy_version 38880 (0.0015)
[2023-10-01 11:28:07,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19906560. Throughput: 0: 715.5, 1: 715.5. Samples: 4976484. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:28:07,067][117973] Avg episode reward: [(0, '457.350'), (1, '53.860')]
[2023-10-01 11:28:12,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19931136. Throughput: 0: 716.4, 1: 713.6. Samples: 4980659. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:28:12,052][117973] Avg episode reward: [(0, '457.350'), (1, '53.550')]
[2023-10-01 11:28:17,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5748.3). Total num frames: 19963904. Throughput: 0: 710.9, 1: 709.8. Samples: 4988932. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-10-01 11:28:17,052][117973] Avg episode reward: [(0, '457.350'), (1, '53.200')]
[2023-10-01 11:28:21,469][119042] Updated weights for policy 1, policy_version 39040 (0.0015)
[2023-10-01 11:28:21,469][119041] Updated weights for policy 0, policy_version 39040 (0.0016)
[2023-10-01 11:28:22,051][117973] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 5720.5). Total num frames: 19988480. Throughput: 0: 705.5, 1: 707.3. Samples: 4997210. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-10-01 11:28:22,052][117973] Avg episode reward: [(0, '457.350'), (1, '50.840')]
[2023-10-01 11:28:25,749][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000039088_10006528.pth...
[2023-10-01 11:28:25,750][119087] Stopping RolloutWorker_w4...
[2023-10-01 11:28:25,750][119090] Stopping RolloutWorker_w6...
[2023-10-01 11:28:25,750][119087] Loop rollout_proc4_evt_loop terminating...
[2023-10-01 11:28:25,750][119085] Stopping RolloutWorker_w2...
[2023-10-01 11:28:25,750][119088] Stopping RolloutWorker_w5...
[2023-10-01 11:28:25,750][119086] Stopping RolloutWorker_w3...
[2023-10-01 11:28:25,750][119090] Loop rollout_proc6_evt_loop terminating...
[2023-10-01 11:28:25,750][119085] Loop rollout_proc2_evt_loop terminating...
[2023-10-01 11:28:25,750][119089] Stopping RolloutWorker_w1...
[2023-10-01 11:28:25,750][119091] Stopping RolloutWorker_w7...
[2023-10-01 11:28:25,750][117973] Component RolloutWorker_w4 stopped!
[2023-10-01 11:28:25,750][119088] Loop rollout_proc5_evt_loop terminating...
[2023-10-01 11:28:25,751][119086] Loop rollout_proc3_evt_loop terminating...
[2023-10-01 11:28:25,750][119083] Stopping RolloutWorker_w0...
[2023-10-01 11:28:25,751][118645] Stopping Batcher_0...
[2023-10-01 11:28:25,751][119091] Loop rollout_proc7_evt_loop terminating...
[2023-10-01 11:28:25,751][119089] Loop rollout_proc1_evt_loop terminating...
[2023-10-01 11:28:25,751][118645] Loop batcher_evt_loop terminating...
[2023-10-01 11:28:25,751][119083] Loop rollout_proc0_evt_loop terminating...
[2023-10-01 11:28:25,751][117973] Component RolloutWorker_w6 stopped!
[2023-10-01 11:28:25,752][117973] Component RolloutWorker_w5 stopped!
[2023-10-01 11:28:25,753][117973] Component RolloutWorker_w3 stopped!
[2023-10-01 11:28:25,753][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000039088_10006528.pth...
[2023-10-01 11:28:25,753][117973] Component RolloutWorker_w2 stopped!
[2023-10-01 11:28:25,754][117973] Component RolloutWorker_w1 stopped!
[2023-10-01 11:28:25,754][117973] Component RolloutWorker_w7 stopped!
[2023-10-01 11:28:25,755][117973] Component RolloutWorker_w0 stopped!
[2023-10-01 11:28:25,755][117973] Component Batcher_0 stopped!
[2023-10-01 11:28:25,756][117973] Component Batcher_1 stopped!
[2023-10-01 11:28:25,771][118715] Stopping Batcher_1...
[2023-10-01 11:28:25,787][118645] Removing ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000036480_9338880.pth
[2023-10-01 11:28:25,793][118645] Saving ./train_atari/atari_videopinball/checkpoint_p0/checkpoint_000039088_10006528.pth...
[2023-10-01 11:28:25,795][119042] Weights refcount: 2 0
[2023-10-01 11:28:25,796][119042] Stopping InferenceWorker_p1-w0...
[2023-10-01 11:28:25,796][117973] Component InferenceWorker_p1-w0 stopped!
[2023-10-01 11:28:25,797][119041] Weights refcount: 2 0
[2023-10-01 11:28:25,798][118715] Loop batcher_evt_loop terminating...
[2023-10-01 11:28:25,799][119042] Loop inference_proc1-0_evt_loop terminating...
[2023-10-01 11:28:25,800][118715] Removing ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000036480_9338880.pth
[2023-10-01 11:28:25,800][119041] Stopping InferenceWorker_p0-w0...
[2023-10-01 11:28:25,800][117973] Component InferenceWorker_p0-w0 stopped!
[2023-10-01 11:28:25,800][119041] Loop inference_proc0-0_evt_loop terminating...
[2023-10-01 11:28:25,806][118715] Saving ./train_atari/atari_videopinball/checkpoint_p1/checkpoint_000039088_10006528.pth...
[2023-10-01 11:28:25,849][118645] Stopping LearnerWorker_p0...
[2023-10-01 11:28:25,849][118645] Loop learner_proc0_evt_loop terminating...
[2023-10-01 11:28:25,849][117973] Component LearnerWorker_p0 stopped!
[2023-10-01 11:28:25,903][118715] Stopping LearnerWorker_p1...
[2023-10-01 11:28:25,904][118715] Loop learner_proc1_evt_loop terminating...
[2023-10-01 11:28:25,904][117973] Component LearnerWorker_p1 stopped!
[2023-10-01 11:28:25,906][117973] Waiting for process learner_proc0 to stop...
[2023-10-01 11:28:26,561][117973] Waiting for process learner_proc1 to stop...
[2023-10-01 11:28:26,685][117973] Waiting for process inference_proc0-0 to join...
[2023-10-01 11:28:26,685][117973] Waiting for process inference_proc1-0 to join...
[2023-10-01 11:28:26,686][117973] Waiting for process rollout_proc0 to join...
[2023-10-01 11:28:26,686][117973] Waiting for process rollout_proc1 to join...
[2023-10-01 11:28:26,687][117973] Waiting for process rollout_proc2 to join...
[2023-10-01 11:28:26,687][117973] Waiting for process rollout_proc3 to join...
[2023-10-01 11:28:26,688][117973] Waiting for process rollout_proc4 to join...
[2023-10-01 11:28:26,688][117973] Waiting for process rollout_proc5 to join...
[2023-10-01 11:28:26,689][117973] Waiting for process rollout_proc6 to join...
[2023-10-01 11:28:26,689][117973] Waiting for process rollout_proc7 to join...
[2023-10-01 11:28:26,690][117973] Batcher 0 profile tree view:
batching: 20.3219, releasing_batches: 1.8075
[2023-10-01 11:28:26,690][117973] Batcher 1 profile tree view:
batching: 20.0259, releasing_batches: 1.6284
[2023-10-01 11:28:26,690][117973] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0051
wait_policy_total: 705.2156
update_model: 39.9754
weight_update: 0.0015
one_step: 0.0014
handle_policy_step: 2497.6383
deserialize: 72.0465, stack: 17.4908, obs_to_device_normalize: 609.6813, forward: 1210.0172, send_messages: 100.4661
prepare_outputs: 332.2135
to_cpu: 166.6091
[2023-10-01 11:28:26,691][117973] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0052
wait_policy_total: 717.8632
update_model: 40.8072
weight_update: 0.0013
one_step: 0.0013
handle_policy_step: 2483.5724
deserialize: 73.1755, stack: 17.7367, obs_to_device_normalize: 596.4739, forward: 1213.7785, send_messages: 99.8161
prepare_outputs: 324.4171
to_cpu: 163.3392
[2023-10-01 11:28:26,691][117973] Learner 0 profile tree view:
misc: 0.0173, prepare_batch: 31.5675
train: 456.1824
epoch_init: 0.0935, minibatch_init: 3.4039, losses_postprocess: 59.4576, kl_divergence: 5.8500, after_optimizer: 20.4619
calculate_losses: 48.4074
losses_init: 0.0963, forward_head: 14.9784, bptt_initial: 0.4475, bptt: 0.5437, tail: 11.3755, advantages_returns: 3.3528, losses: 13.7156
update: 313.9645
clip: 165.3935
[2023-10-01 11:28:26,692][117973] Learner 1 profile tree view:
misc: 0.0148, prepare_batch: 31.3666
train: 457.6373
epoch_init: 0.0967, minibatch_init: 3.4317, losses_postprocess: 59.3565, kl_divergence: 5.8646, after_optimizer: 20.3838
calculate_losses: 48.2323
losses_init: 0.0815, forward_head: 14.9061, bptt_initial: 0.4542, bptt: 0.5102, tail: 11.4531, advantages_returns: 3.3226, losses: 13.6540
update: 315.8212
clip: 166.9706
[2023-10-01 11:28:26,692][117973] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3915, enqueue_policy_requests: 47.4726, env_step: 1374.0457, overhead: 30.4934, complete_rollouts: 1.1148
save_policy_outputs: 57.6937
split_output_tensors: 19.7605
[2023-10-01 11:28:26,692][117973] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3961, enqueue_policy_requests: 46.8503, env_step: 1353.3416, overhead: 30.0414, complete_rollouts: 1.1416
save_policy_outputs: 57.2968
split_output_tensors: 19.5440
[2023-10-01 11:28:26,693][117973] Loop Runner_EvtLoop terminating...
[2023-10-01 11:28:26,693][117973] Runner profile tree view:
main_loop: 3470.3665
[2023-10-01 11:28:26,693][117973] Collected {0: 10006528, 1: 10006528}, FPS: 5766.8