diff --git "a/sf_log.txt" "b/sf_log.txt" --- "a/sf_log.txt" +++ "b/sf_log.txt" @@ -1,34 +1,32 @@ -[2023-07-08 13:52:26,047][977264] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/config.json... -[2023-07-08 13:52:26,066][977264] Rollout worker 0 uses device cpu -[2023-07-08 13:52:26,066][977264] Rollout worker 1 uses device cpu -[2023-07-08 13:52:26,066][977264] Rollout worker 2 uses device cpu -[2023-07-08 13:52:26,067][977264] Rollout worker 3 uses device cpu -[2023-07-08 13:52:26,067][977264] Rollout worker 4 uses device cpu -[2023-07-08 13:52:26,067][977264] Rollout worker 5 uses device cpu -[2023-07-08 13:52:26,067][977264] Rollout worker 6 uses device cpu -[2023-07-08 13:52:26,067][977264] Rollout worker 7 uses device cpu -[2023-07-08 13:52:26,067][977264] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 -[2023-07-08 13:52:26,080][977264] InferenceWorker_p0-w0: min num requests: 2 -[2023-07-08 13:52:26,100][977264] Starting all processes... -[2023-07-08 13:52:26,101][977264] Starting process learner_proc0 -[2023-07-08 13:52:26,149][977264] Starting all processes... 
-[2023-07-08 13:52:26,192][977264] Starting process inference_proc0-0 -[2023-07-08 13:52:26,192][977264] Starting process rollout_proc0 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc1 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc2 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc3 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc4 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc5 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc6 -[2023-07-08 13:52:26,193][977264] Starting process rollout_proc7 -[2023-07-08 13:52:28,102][977553] Worker 0 uses CPU cores [0, 1, 2, 3] -[2023-07-08 13:52:28,280][977508] Starting seed is not provided -[2023-07-08 13:52:28,280][977508] Initializing actor-critic model on device cpu -[2023-07-08 13:52:28,280][977508] RunningMeanStd input shape: (39,) -[2023-07-08 13:52:28,281][977508] RunningMeanStd input shape: (1,) -[2023-07-08 13:52:28,292][977555] Worker 2 uses CPU cores [8, 9, 10, 11] -[2023-07-08 13:52:28,340][977508] Created Actor Critic model with architecture: -[2023-07-08 13:52:28,341][977508] ActorCriticSharedWeights( +[2023-07-16 20:02:27,159][221941] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/config.json... +[2023-07-16 20:02:27,174][221941] Rollout worker 0 uses device cpu +[2023-07-16 20:02:27,174][221941] Rollout worker 1 uses device cpu +[2023-07-16 20:02:27,174][221941] Rollout worker 2 uses device cpu +[2023-07-16 20:02:27,175][221941] Rollout worker 3 uses device cpu +[2023-07-16 20:02:27,175][221941] Rollout worker 4 uses device cpu +[2023-07-16 20:02:27,175][221941] Rollout worker 5 uses device cpu +[2023-07-16 20:02:27,175][221941] Rollout worker 6 uses device cpu +[2023-07-16 20:02:27,175][221941] Rollout worker 7 uses device cpu +[2023-07-16 20:02:27,175][221941] In synchronous mode, we only accumulate one batch. 
Setting num_batches_to_accumulate to 1 +[2023-07-16 20:02:27,186][221941] InferenceWorker_p0-w0: min num requests: 2 +[2023-07-16 20:02:27,203][221941] Starting all processes... +[2023-07-16 20:02:27,203][221941] Starting process learner_proc0 +[2023-07-16 20:02:27,252][221941] Starting all processes... +[2023-07-16 20:02:27,295][221941] Starting process inference_proc0-0 +[2023-07-16 20:02:27,304][221941] Starting process rollout_proc0 +[2023-07-16 20:02:27,305][221941] Starting process rollout_proc1 +[2023-07-16 20:02:27,305][221941] Starting process rollout_proc2 +[2023-07-16 20:02:27,305][221941] Starting process rollout_proc3 +[2023-07-16 20:02:27,305][221941] Starting process rollout_proc4 +[2023-07-16 20:02:27,305][221941] Starting process rollout_proc5 +[2023-07-16 20:02:27,305][221941] Starting process rollout_proc6 +[2023-07-16 20:02:27,306][221941] Starting process rollout_proc7 +[2023-07-16 20:02:29,133][222182] Starting seed is not provided +[2023-07-16 20:02:29,133][222182] Initializing actor-critic model on device cpu +[2023-07-16 20:02:29,133][222182] RunningMeanStd input shape: (39,) +[2023-07-16 20:02:29,134][222182] RunningMeanStd input shape: (1,) +[2023-07-16 20:02:29,193][222182] Created Actor Critic model with architecture: +[2023-07-16 20:02:29,193][222182] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -59,970 +57,823 @@ (distribution_linear): Linear(in_features=64, out_features=4, bias=True) ) ) -[2023-07-08 13:52:28,485][977562] Worker 3 uses CPU cores [12, 13, 14, 15] -[2023-07-08 13:52:28,531][977554] Worker 1 uses CPU cores [4, 5, 6, 7] -[2023-07-08 13:52:28,708][977508] Using optimizer -[2023-07-08 13:52:28,709][977508] No checkpoints found -[2023-07-08 13:52:28,709][977508] Did not load from checkpoint, starting from scratch! 
-[2023-07-08 13:52:28,710][977508] Initialized policy 0 weights for model version 0 -[2023-07-08 13:52:28,711][977508] LearnerWorker_p0 finished initialization! -[2023-07-08 13:52:28,764][977620] Worker 5 uses CPU cores [20, 21, 22, 23] -[2023-07-08 13:52:28,791][977552] RunningMeanStd input shape: (39,) -[2023-07-08 13:52:28,792][977552] RunningMeanStd input shape: (1,) -[2023-07-08 13:52:28,877][977264] Inference worker 0-0 is ready! -[2023-07-08 13:52:28,878][977264] All inference workers are ready! Signal rollout workers to start! -[2023-07-08 13:52:29,008][977684] Worker 6 uses CPU cores [24, 25, 26, 27] -[2023-07-08 13:52:29,021][977683] Worker 7 uses CPU cores [28, 29, 30, 31] -[2023-07-08 13:52:29,231][977619] Worker 4 uses CPU cores [16, 17, 18, 19] -[2023-07-08 13:52:32,898][977562] Decorrelating experience for 0 frames... -[2023-07-08 13:52:32,911][977562] Decorrelating experience for 64 frames... -[2023-07-08 13:52:32,917][977620] Decorrelating experience for 0 frames... -[2023-07-08 13:52:32,927][977553] Decorrelating experience for 0 frames... -[2023-07-08 13:52:32,930][977620] Decorrelating experience for 64 frames... -[2023-07-08 13:52:32,940][977553] Decorrelating experience for 64 frames... -[2023-07-08 13:52:32,943][977562] Decorrelating experience for 128 frames... -[2023-07-08 13:52:32,961][977620] Decorrelating experience for 128 frames... -[2023-07-08 13:52:32,972][977553] Decorrelating experience for 128 frames... -[2023-07-08 13:52:32,973][977555] Decorrelating experience for 0 frames... -[2023-07-08 13:52:32,985][977555] Decorrelating experience for 64 frames... -[2023-07-08 13:52:33,005][977562] Decorrelating experience for 192 frames... -[2023-07-08 13:52:33,017][977555] Decorrelating experience for 128 frames... -[2023-07-08 13:52:33,023][977620] Decorrelating experience for 192 frames... -[2023-07-08 13:52:33,034][977553] Decorrelating experience for 192 frames... 
-[2023-07-08 13:52:33,079][977555] Decorrelating experience for 192 frames... -[2023-07-08 13:52:33,108][977683] Decorrelating experience for 0 frames... -[2023-07-08 13:52:33,121][977683] Decorrelating experience for 64 frames... -[2023-07-08 13:52:33,147][977554] Decorrelating experience for 0 frames... -[2023-07-08 13:52:33,152][977683] Decorrelating experience for 128 frames... -[2023-07-08 13:52:33,159][977554] Decorrelating experience for 64 frames... -[2023-07-08 13:52:33,179][977684] Decorrelating experience for 0 frames... -[2023-07-08 13:52:33,191][977554] Decorrelating experience for 128 frames... -[2023-07-08 13:52:33,193][977684] Decorrelating experience for 64 frames... -[2023-07-08 13:52:33,215][977683] Decorrelating experience for 192 frames... -[2023-07-08 13:52:33,224][977684] Decorrelating experience for 128 frames... -[2023-07-08 13:52:33,234][977264] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-07-08 13:52:33,254][977554] Decorrelating experience for 192 frames... -[2023-07-08 13:52:33,287][977684] Decorrelating experience for 192 frames... -[2023-07-08 13:52:33,347][977619] Decorrelating experience for 0 frames... -[2023-07-08 13:52:33,360][977619] Decorrelating experience for 64 frames... -[2023-07-08 13:52:33,391][977619] Decorrelating experience for 128 frames... -[2023-07-08 13:52:33,454][977619] Decorrelating experience for 192 frames... -[2023-07-08 13:52:36,967][977562] Decorrelating experience for 256 frames... -[2023-07-08 13:52:36,994][977620] Decorrelating experience for 256 frames... -[2023-07-08 13:52:37,033][977553] Decorrelating experience for 256 frames... -[2023-07-08 13:52:37,078][977562] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,116][977620] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,123][977555] Decorrelating experience for 256 frames... 
-[2023-07-08 13:52:37,145][977553] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,176][977683] Decorrelating experience for 256 frames... -[2023-07-08 13:52:37,220][977562] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,233][977555] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,243][977554] Decorrelating experience for 256 frames... -[2023-07-08 13:52:37,258][977620] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,286][977553] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,287][977683] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,295][977684] Decorrelating experience for 256 frames... -[2023-07-08 13:52:37,353][977554] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,375][977555] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,378][977562] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,406][977684] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,416][977620] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,427][977683] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,446][977553] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,492][977619] Decorrelating experience for 256 frames... -[2023-07-08 13:52:37,493][977554] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,534][977555] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,549][977684] Decorrelating experience for 384 frames... -[2023-07-08 13:52:37,586][977683] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,606][977619] Decorrelating experience for 320 frames... -[2023-07-08 13:52:37,654][977554] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,711][977684] Decorrelating experience for 448 frames... -[2023-07-08 13:52:37,750][977619] Decorrelating experience for 384 frames... 
-[2023-07-08 13:52:37,911][977619] Decorrelating experience for 448 frames... -[2023-07-08 13:52:38,234][977264] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 201.6. Samples: 1008. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-07-08 13:52:38,234][977264] Avg episode reward: [(0, '9.662')] -[2023-07-08 13:52:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000000_0.pth... -[2023-07-08 13:52:42,072][977552] Updated weights for policy 0, policy_version 80 (0.0006) -[2023-07-08 13:52:43,233][977264] Fps is (10 sec: 5324.8, 60 sec: 5324.8, 300 sec: 5324.8). Total num frames: 53248. Throughput: 0: 2866.4. Samples: 28664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:52:43,234][977264] Avg episode reward: [(0, '99.776')] -[2023-07-08 13:52:45,957][977552] Updated weights for policy 0, policy_version 160 (0.0005) -[2023-07-08 13:52:46,075][977264] Heartbeat connected on Batcher_0 -[2023-07-08 13:52:46,078][977264] Heartbeat connected on LearnerWorker_p0 -[2023-07-08 13:52:46,082][977264] Heartbeat connected on InferenceWorker_p0-w0 -[2023-07-08 13:52:46,084][977264] Heartbeat connected on RolloutWorker_w0 -[2023-07-08 13:52:46,088][977264] Heartbeat connected on RolloutWorker_w1 -[2023-07-08 13:52:46,090][977264] Heartbeat connected on RolloutWorker_w2 -[2023-07-08 13:52:46,105][977264] Heartbeat connected on RolloutWorker_w3 -[2023-07-08 13:52:46,108][977264] Heartbeat connected on RolloutWorker_w4 -[2023-07-08 13:52:46,108][977264] Heartbeat connected on RolloutWorker_w6 -[2023-07-08 13:52:46,109][977264] Heartbeat connected on RolloutWorker_w7 -[2023-07-08 13:52:46,109][977264] Heartbeat connected on RolloutWorker_w5 -[2023-07-08 13:52:48,233][977264] Fps is (10 sec: 10649.7, 60 sec: 7099.8, 300 sec: 7099.8). Total num frames: 106496. Throughput: 0: 6160.5. Samples: 92408. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:52:48,235][977264] Avg episode reward: [(0, '218.334')] -[2023-07-08 13:52:48,235][977508] Saving new best policy, reward=218.334! -[2023-07-08 13:52:49,569][977552] Updated weights for policy 0, policy_version 240 (0.0005) -[2023-07-08 13:52:53,234][977264] Fps is (10 sec: 10649.5, 60 sec: 7987.2, 300 sec: 7987.2). Total num frames: 159744. Throughput: 0: 7982.2. Samples: 159644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:52:53,235][977264] Avg episode reward: [(0, '248.666')] -[2023-07-08 13:52:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000312_159744.pth... -[2023-07-08 13:52:53,241][977508] Saving new best policy, reward=248.666! -[2023-07-08 13:52:53,359][977552] Updated weights for policy 0, policy_version 320 (0.0005) -[2023-07-08 13:52:57,067][977552] Updated weights for policy 0, policy_version 400 (0.0005) -[2023-07-08 13:52:58,233][977264] Fps is (10 sec: 11059.3, 60 sec: 8683.6, 300 sec: 8683.6). Total num frames: 217088. Throughput: 0: 7681.5. Samples: 192036. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 13:52:58,234][977264] Avg episode reward: [(0, '264.106')] -[2023-07-08 13:52:58,234][977508] Saving new best policy, reward=264.106! -[2023-07-08 13:53:00,773][977552] Updated weights for policy 0, policy_version 480 (0.0005) -[2023-07-08 13:53:03,233][977264] Fps is (10 sec: 11059.3, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 270336. Throughput: 0: 8610.9. Samples: 258328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:03,234][977264] Avg episode reward: [(0, '275.732')] -[2023-07-08 13:53:03,234][977508] Saving new best policy, reward=275.732! 
-[2023-07-08 13:53:04,459][977552] Updated weights for policy 0, policy_version 560 (0.0005) -[2023-07-08 13:53:08,155][977552] Updated weights for policy 0, policy_version 640 (0.0005) -[2023-07-08 13:53:08,233][977264] Fps is (10 sec: 11059.1, 60 sec: 9362.3, 300 sec: 9362.3). Total num frames: 327680. Throughput: 0: 9308.6. Samples: 325800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:08,234][977264] Avg episode reward: [(0, '275.766')] -[2023-07-08 13:53:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000640_327680.pth... -[2023-07-08 13:53:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000000_0.pth -[2023-07-08 13:53:08,240][977508] Saving new best policy, reward=275.766! -[2023-07-08 13:53:11,780][977552] Updated weights for policy 0, policy_version 720 (0.0005) -[2023-07-08 13:53:13,233][977264] Fps is (10 sec: 11468.8, 60 sec: 9625.6, 300 sec: 9625.6). Total num frames: 385024. Throughput: 0: 8920.5. Samples: 356820. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 13:53:13,234][977264] Avg episode reward: [(0, '283.620')] -[2023-07-08 13:53:13,234][977508] Saving new best policy, reward=283.620! -[2023-07-08 13:53:14,902][977552] Updated weights for policy 0, policy_version 800 (0.0005) -[2023-07-08 13:53:18,233][977264] Fps is (10 sec: 11468.8, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 442368. Throughput: 0: 9562.4. Samples: 430308. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 13:53:18,234][977264] Avg episode reward: [(0, '289.703')] -[2023-07-08 13:53:18,234][977508] Saving new best policy, reward=289.703! 
-[2023-07-08 13:53:18,738][977552] Updated weights for policy 0, policy_version 880 (0.0005) -[2023-07-08 13:53:22,678][977552] Updated weights for policy 0, policy_version 960 (0.0005) -[2023-07-08 13:53:23,233][977264] Fps is (10 sec: 11059.1, 60 sec: 9912.3, 300 sec: 9912.3). Total num frames: 495616. Throughput: 0: 10958.0. Samples: 494120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:23,234][977264] Avg episode reward: [(0, '334.681')] -[2023-07-08 13:53:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000968_495616.pth... -[2023-07-08 13:53:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000312_159744.pth -[2023-07-08 13:53:23,240][977508] Saving new best policy, reward=334.681! -[2023-07-08 13:53:26,718][977552] Updated weights for policy 0, policy_version 1040 (0.0005) -[2023-07-08 13:53:28,233][977264] Fps is (10 sec: 10239.9, 60 sec: 9904.9, 300 sec: 9904.9). Total num frames: 544768. Throughput: 0: 11015.3. Samples: 524352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:28,234][977264] Avg episode reward: [(0, '379.395')] -[2023-07-08 13:53:28,235][977508] Saving new best policy, reward=379.395! -[2023-07-08 13:53:30,877][977552] Updated weights for policy 0, policy_version 1120 (0.0005) -[2023-07-08 13:53:33,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9898.7, 300 sec: 9898.7). Total num frames: 593920. Throughput: 0: 10957.5. Samples: 585496. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 13:53:33,234][977264] Avg episode reward: [(0, '428.253')] -[2023-07-08 13:53:33,245][977508] Saving new best policy, reward=428.253! -[2023-07-08 13:53:34,915][977552] Updated weights for policy 0, policy_version 1200 (0.0005) -[2023-07-08 13:53:38,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10717.9, 300 sec: 9893.4). 
Total num frames: 643072. Throughput: 0: 10746.9. Samples: 643252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:38,234][977264] Avg episode reward: [(0, '450.154')] -[2023-07-08 13:53:38,243][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001264_647168.pth... -[2023-07-08 13:53:38,245][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000640_327680.pth -[2023-07-08 13:53:38,245][977508] Saving new best policy, reward=450.154! -[2023-07-08 13:53:39,112][977552] Updated weights for policy 0, policy_version 1280 (0.0005) -[2023-07-08 13:53:43,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10649.6, 300 sec: 9888.9). Total num frames: 692224. Throughput: 0: 10666.4. Samples: 672024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:43,234][977264] Avg episode reward: [(0, '451.929')] -[2023-07-08 13:53:43,275][977508] Saving new best policy, reward=451.929! -[2023-07-08 13:53:43,276][977552] Updated weights for policy 0, policy_version 1360 (0.0005) -[2023-07-08 13:53:47,587][977552] Updated weights for policy 0, policy_version 1440 (0.0005) -[2023-07-08 13:53:48,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10581.3, 300 sec: 9885.0). Total num frames: 741376. Throughput: 0: 10533.2. Samples: 732320. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 13:53:48,234][977264] Avg episode reward: [(0, '459.969')] -[2023-07-08 13:53:48,234][977508] Saving new best policy, reward=459.969! -[2023-07-08 13:53:51,643][977552] Updated weights for policy 0, policy_version 1520 (0.0005) -[2023-07-08 13:53:53,234][977264] Fps is (10 sec: 10239.9, 60 sec: 10581.3, 300 sec: 9932.8). Total num frames: 794624. Throughput: 0: 10346.6. Samples: 791396. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:53:53,234][977264] Avg episode reward: [(0, '448.335')] -[2023-07-08 13:53:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001552_794624.pth... -[2023-07-08 13:53:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000968_495616.pth -[2023-07-08 13:53:55,518][977552] Updated weights for policy 0, policy_version 1600 (0.0005) -[2023-07-08 13:53:58,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 9975.0). Total num frames: 847872. Throughput: 0: 10367.7. Samples: 823368. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 13:53:58,234][977264] Avg episode reward: [(0, '472.578')] -[2023-07-08 13:53:58,234][977508] Saving new best policy, reward=472.578! -[2023-07-08 13:53:59,281][977552] Updated weights for policy 0, policy_version 1680 (0.0005) -[2023-07-08 13:54:03,233][977264] Fps is (10 sec: 10240.2, 60 sec: 10444.8, 300 sec: 9966.9). Total num frames: 897024. Throughput: 0: 10136.4. Samples: 886444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:03,234][977264] Avg episode reward: [(0, '473.813')] -[2023-07-08 13:54:03,234][977508] Saving new best policy, reward=473.813! -[2023-07-08 13:54:03,378][977552] Updated weights for policy 0, policy_version 1760 (0.0005) -[2023-07-08 13:54:07,340][977552] Updated weights for policy 0, policy_version 1840 (0.0005) -[2023-07-08 13:54:08,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10376.5, 300 sec: 10002.9). Total num frames: 950272. Throughput: 0: 10084.9. Samples: 947940. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:08,234][977264] Avg episode reward: [(0, '476.509')] -[2023-07-08 13:54:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001856_950272.pth... -[2023-07-08 13:54:08,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001264_647168.pth -[2023-07-08 13:54:08,240][977508] Saving new best policy, reward=476.509! -[2023-07-08 13:54:11,489][977552] Updated weights for policy 0, policy_version 1920 (0.0005) -[2023-07-08 13:54:13,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10240.0, 300 sec: 9994.2). Total num frames: 999424. Throughput: 0: 10096.4. Samples: 978688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:13,234][977264] Avg episode reward: [(0, '471.423')] -[2023-07-08 13:54:15,460][977552] Updated weights for policy 0, policy_version 2000 (0.0005) -[2023-07-08 13:54:18,233][977264] Fps is (10 sec: 10240.2, 60 sec: 10171.7, 300 sec: 10025.5). Total num frames: 1052672. Throughput: 0: 10072.8. Samples: 1038772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:18,234][977264] Avg episode reward: [(0, '479.757')] -[2023-07-08 13:54:18,234][977508] Saving new best policy, reward=479.757! -[2023-07-08 13:54:19,285][977552] Updated weights for policy 0, policy_version 2080 (0.0005) -[2023-07-08 13:54:23,077][977552] Updated weights for policy 0, policy_version 2160 (0.0004) -[2023-07-08 13:54:23,233][977264] Fps is (10 sec: 10649.7, 60 sec: 10171.8, 300 sec: 10053.8). Total num frames: 1105920. Throughput: 0: 10239.8. Samples: 1104040. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:23,234][977264] Avg episode reward: [(0, '474.063')] -[2023-07-08 13:54:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002160_1105920.pth... -[2023-07-08 13:54:23,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001552_794624.pth -[2023-07-08 13:54:27,048][977552] Updated weights for policy 0, policy_version 2240 (0.0005) -[2023-07-08 13:54:28,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10044.1). Total num frames: 1155072. Throughput: 0: 10291.6. Samples: 1135144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:28,234][977264] Avg episode reward: [(0, '456.746')] -[2023-07-08 13:54:31,341][977552] Updated weights for policy 0, policy_version 2320 (0.0005) -[2023-07-08 13:54:33,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10171.7, 300 sec: 10035.2). Total num frames: 1204224. Throughput: 0: 10242.9. Samples: 1193252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:33,234][977264] Avg episode reward: [(0, '470.708')] -[2023-07-08 13:54:35,524][977552] Updated weights for policy 0, policy_version 2400 (0.0005) -[2023-07-08 13:54:38,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10171.8, 300 sec: 10027.0). Total num frames: 1253376. Throughput: 0: 10288.8. Samples: 1254388. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 13:54:38,234][977264] Avg episode reward: [(0, '471.450')] -[2023-07-08 13:54:38,253][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002456_1257472.pth... 
-[2023-07-08 13:54:38,255][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001856_950272.pth -[2023-07-08 13:54:39,474][977552] Updated weights for policy 0, policy_version 2480 (0.0004) -[2023-07-08 13:54:43,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10171.7, 300 sec: 10019.4). Total num frames: 1302528. Throughput: 0: 10253.7. Samples: 1284784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:43,234][977264] Avg episode reward: [(0, '460.937')] -[2023-07-08 13:54:43,708][977552] Updated weights for policy 0, policy_version 2560 (0.0005) -[2023-07-08 13:54:47,557][977552] Updated weights for policy 0, policy_version 2640 (0.0005) -[2023-07-08 13:54:48,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10240.0, 300 sec: 10042.8). Total num frames: 1355776. Throughput: 0: 10209.2. Samples: 1345860. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:48,234][977264] Avg episode reward: [(0, '463.597')] -[2023-07-08 13:54:51,790][977552] Updated weights for policy 0, policy_version 2720 (0.0005) -[2023-07-08 13:54:53,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10035.2). Total num frames: 1404928. Throughput: 0: 10141.8. Samples: 1404320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:54:53,234][977264] Avg episode reward: [(0, '461.270')] -[2023-07-08 13:54:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002744_1404928.pth... -[2023-07-08 13:54:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002160_1105920.pth -[2023-07-08 13:54:55,892][977552] Updated weights for policy 0, policy_version 2800 (0.0005) -[2023-07-08 13:54:58,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10028.1). Total num frames: 1454080. Throughput: 0: 10110.6. 
Samples: 1433664. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 13:54:58,234][977264] Avg episode reward: [(0, '481.602')] -[2023-07-08 13:54:58,234][977508] Saving new best policy, reward=481.602! -[2023-07-08 13:55:00,009][977552] Updated weights for policy 0, policy_version 2880 (0.0005) -[2023-07-08 13:55:03,233][977264] Fps is (10 sec: 9830.6, 60 sec: 10103.5, 300 sec: 10021.6). Total num frames: 1503232. Throughput: 0: 10106.3. Samples: 1493556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:03,234][977264] Avg episode reward: [(0, '478.964')] -[2023-07-08 13:55:04,131][977552] Updated weights for policy 0, policy_version 2960 (0.0005) -[2023-07-08 13:55:08,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10035.2, 300 sec: 10015.4). Total num frames: 1552384. Throughput: 0: 9963.4. Samples: 1552396. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 13:55:08,234][977264] Avg episode reward: [(0, '478.060')] -[2023-07-08 13:55:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003032_1552384.pth... -[2023-07-08 13:55:08,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002456_1257472.pth -[2023-07-08 13:55:08,372][977552] Updated weights for policy 0, policy_version 3040 (0.0005) -[2023-07-08 13:55:12,707][977552] Updated weights for policy 0, policy_version 3120 (0.0005) -[2023-07-08 13:55:13,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10035.2, 300 sec: 10009.6). Total num frames: 1601536. Throughput: 0: 9910.8. Samples: 1581128. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 13:55:13,234][977264] Avg episode reward: [(0, '476.976')] -[2023-07-08 13:55:16,724][977552] Updated weights for policy 0, policy_version 3200 (0.0004) -[2023-07-08 13:55:18,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10029.0). 
Total num frames: 1654784. Throughput: 0: 9944.0. Samples: 1640732. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:18,234][977264] Avg episode reward: [(0, '481.529')] -[2023-07-08 13:55:20,685][977552] Updated weights for policy 0, policy_version 3280 (0.0005) -[2023-07-08 13:55:23,233][977264] Fps is (10 sec: 10240.1, 60 sec: 9966.9, 300 sec: 10023.2). Total num frames: 1703936. Throughput: 0: 9956.6. Samples: 1702436. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:23,234][977264] Avg episode reward: [(0, '468.825')] -[2023-07-08 13:55:23,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003328_1703936.pth... -[2023-07-08 13:55:23,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002744_1404928.pth -[2023-07-08 13:55:24,632][977552] Updated weights for policy 0, policy_version 3360 (0.0005) -[2023-07-08 13:55:28,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10017.6). Total num frames: 1753088. Throughput: 0: 9951.8. Samples: 1732616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:28,234][977264] Avg episode reward: [(0, '484.260')] -[2023-07-08 13:55:28,234][977508] Saving new best policy, reward=484.260! -[2023-07-08 13:55:28,872][977552] Updated weights for policy 0, policy_version 3440 (0.0006) -[2023-07-08 13:55:32,960][977552] Updated weights for policy 0, policy_version 3520 (0.0005) -[2023-07-08 13:55:33,233][977264] Fps is (10 sec: 9830.3, 60 sec: 9966.9, 300 sec: 10012.4). Total num frames: 1802240. Throughput: 0: 9894.8. Samples: 1791124. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 13:55:33,234][977264] Avg episode reward: [(0, '484.564')] -[2023-07-08 13:55:33,234][977508] Saving new best policy, reward=484.564! 
-[2023-07-08 13:55:36,963][977552] Updated weights for policy 0, policy_version 3600 (0.0005) -[2023-07-08 13:55:38,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10007.5). Total num frames: 1851392. Throughput: 0: 9942.9. Samples: 1851748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:38,234][977264] Avg episode reward: [(0, '476.673')] -[2023-07-08 13:55:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003616_1851392.pth... -[2023-07-08 13:55:38,237][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003032_1552384.pth -[2023-07-08 13:55:41,265][977552] Updated weights for policy 0, policy_version 3680 (0.0005) -[2023-07-08 13:55:43,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9966.9, 300 sec: 10002.9). Total num frames: 1900544. Throughput: 0: 9920.3. Samples: 1880076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:43,234][977264] Avg episode reward: [(0, '472.288')] -[2023-07-08 13:55:45,418][977552] Updated weights for policy 0, policy_version 3760 (0.0005) -[2023-07-08 13:55:48,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9898.7, 300 sec: 9998.4). Total num frames: 1949696. Throughput: 0: 9926.5. Samples: 1940248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:48,234][977264] Avg episode reward: [(0, '479.227')] -[2023-07-08 13:55:49,550][977552] Updated weights for policy 0, policy_version 3840 (0.0006) -[2023-07-08 13:55:53,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9898.7, 300 sec: 9994.2). Total num frames: 1998848. Throughput: 0: 9920.9. Samples: 1998836. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:53,234][977264] Avg episode reward: [(0, '474.147')] -[2023-07-08 13:55:53,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003904_1998848.pth... -[2023-07-08 13:55:53,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003328_1703936.pth -[2023-07-08 13:55:53,777][977552] Updated weights for policy 0, policy_version 3920 (0.0005) -[2023-07-08 13:55:57,828][977552] Updated weights for policy 0, policy_version 4000 (0.0005) -[2023-07-08 13:55:58,233][977264] Fps is (10 sec: 9830.3, 60 sec: 9898.7, 300 sec: 9990.2). Total num frames: 2048000. Throughput: 0: 9919.1. Samples: 2027488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:55:58,234][977264] Avg episode reward: [(0, '471.741')] -[2023-07-08 13:56:01,897][977552] Updated weights for policy 0, policy_version 4080 (0.0005) -[2023-07-08 13:56:03,233][977264] Fps is (10 sec: 10240.0, 60 sec: 9966.9, 300 sec: 10005.9). Total num frames: 2101248. Throughput: 0: 9962.1. Samples: 2089024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:03,234][977264] Avg episode reward: [(0, '468.479')] -[2023-07-08 13:56:05,833][977552] Updated weights for policy 0, policy_version 4160 (0.0005) -[2023-07-08 13:56:08,233][977264] Fps is (10 sec: 11059.2, 60 sec: 10103.5, 300 sec: 10040.0). Total num frames: 2158592. Throughput: 0: 10050.7. Samples: 2154720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 13:56:08,234][977264] Avg episode reward: [(0, '471.134')] -[2023-07-08 13:56:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004216_2158592.pth... 
-[2023-07-08 13:56:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003616_1851392.pth -[2023-07-08 13:56:09,410][977552] Updated weights for policy 0, policy_version 4240 (0.0005) -[2023-07-08 13:56:13,233][977264] Fps is (10 sec: 10649.5, 60 sec: 10103.5, 300 sec: 10035.2). Total num frames: 2207744. Throughput: 0: 10091.7. Samples: 2186744. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:13,234][977264] Avg episode reward: [(0, '468.933')] -[2023-07-08 13:56:13,570][977552] Updated weights for policy 0, policy_version 4320 (0.0005) -[2023-07-08 13:56:17,578][977552] Updated weights for policy 0, policy_version 4400 (0.0005) -[2023-07-08 13:56:18,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10103.5, 300 sec: 10048.9). Total num frames: 2260992. Throughput: 0: 10078.8. Samples: 2244672. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 13:56:18,234][977264] Avg episode reward: [(0, '482.153')] -[2023-07-08 13:56:21,452][977552] Updated weights for policy 0, policy_version 4480 (0.0005) -[2023-07-08 13:56:23,234][977264] Fps is (10 sec: 10649.5, 60 sec: 10171.7, 300 sec: 10061.9). Total num frames: 2314240. Throughput: 0: 10191.9. Samples: 2310384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:23,234][977264] Avg episode reward: [(0, '482.740')] -[2023-07-08 13:56:23,238][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004520_2314240.pth... -[2023-07-08 13:56:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003904_1998848.pth -[2023-07-08 13:56:25,066][977552] Updated weights for policy 0, policy_version 4560 (0.0005) -[2023-07-08 13:56:28,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10171.7, 300 sec: 10057.0). Total num frames: 2363392. Throughput: 0: 10285.2. 
Samples: 2342908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:28,234][977264] Avg episode reward: [(0, '487.482')] -[2023-07-08 13:56:28,234][977508] Saving new best policy, reward=487.482! -[2023-07-08 13:56:29,277][977552] Updated weights for policy 0, policy_version 4640 (0.0006) -[2023-07-08 13:56:33,233][977264] Fps is (10 sec: 9830.6, 60 sec: 10171.7, 300 sec: 10052.3). Total num frames: 2412544. Throughput: 0: 10279.5. Samples: 2402824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:33,234][977264] Avg episode reward: [(0, '483.875')] -[2023-07-08 13:56:33,429][977552] Updated weights for policy 0, policy_version 4720 (0.0005) -[2023-07-08 13:56:37,610][977552] Updated weights for policy 0, policy_version 4800 (0.0005) -[2023-07-08 13:56:38,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10171.7, 300 sec: 10047.7). Total num frames: 2461696. Throughput: 0: 10274.8. Samples: 2461200. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 13:56:38,234][977264] Avg episode reward: [(0, '482.620')] -[2023-07-08 13:56:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004808_2461696.pth... -[2023-07-08 13:56:38,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004216_2158592.pth -[2023-07-08 13:56:41,648][977552] Updated weights for policy 0, policy_version 4880 (0.0005) -[2023-07-08 13:56:43,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10171.7, 300 sec: 10043.4). Total num frames: 2510848. Throughput: 0: 10295.5. Samples: 2490784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:43,234][977264] Avg episode reward: [(0, '489.667')] -[2023-07-08 13:56:43,234][977508] Saving new best policy, reward=489.667! 
-[2023-07-08 13:56:45,840][977552] Updated weights for policy 0, policy_version 4960 (0.0004) -[2023-07-08 13:56:48,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10171.7, 300 sec: 10039.2). Total num frames: 2560000. Throughput: 0: 10273.4. Samples: 2551328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:48,234][977264] Avg episode reward: [(0, '483.699')] -[2023-07-08 13:56:50,015][977552] Updated weights for policy 0, policy_version 5040 (0.0005) -[2023-07-08 13:56:53,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10171.7, 300 sec: 10035.2). Total num frames: 2609152. Throughput: 0: 10113.6. Samples: 2609832. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:53,234][977264] Avg episode reward: [(0, '482.685')] -[2023-07-08 13:56:53,249][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005104_2613248.pth... -[2023-07-08 13:56:53,251][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004520_2314240.pth -[2023-07-08 13:56:54,030][977552] Updated weights for policy 0, policy_version 5120 (0.0005) -[2023-07-08 13:56:58,041][977552] Updated weights for policy 0, policy_version 5200 (0.0005) -[2023-07-08 13:56:58,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10046.8). Total num frames: 2662400. Throughput: 0: 10088.1. Samples: 2640708. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:56:58,234][977264] Avg episode reward: [(0, '480.533')] -[2023-07-08 13:57:02,219][977552] Updated weights for policy 0, policy_version 5280 (0.0005) -[2023-07-08 13:57:03,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10042.8). Total num frames: 2711552. Throughput: 0: 10126.0. Samples: 2700340. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:03,234][977264] Avg episode reward: [(0, '486.535')] -[2023-07-08 13:57:06,293][977552] Updated weights for policy 0, policy_version 5360 (0.0006) -[2023-07-08 13:57:08,234][977264] Fps is (10 sec: 9830.3, 60 sec: 10035.2, 300 sec: 10038.9). Total num frames: 2760704. Throughput: 0: 10006.7. Samples: 2760684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:08,234][977264] Avg episode reward: [(0, '481.262')] -[2023-07-08 13:57:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005392_2760704.pth... -[2023-07-08 13:57:08,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004808_2461696.pth -[2023-07-08 13:57:10,409][977552] Updated weights for policy 0, policy_version 5440 (0.0005) -[2023-07-08 13:57:13,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10035.2). Total num frames: 2809856. Throughput: 0: 9945.6. Samples: 2790460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:13,234][977264] Avg episode reward: [(0, '487.739')] -[2023-07-08 13:57:14,565][977552] Updated weights for policy 0, policy_version 5520 (0.0006) -[2023-07-08 13:57:18,233][977264] Fps is (10 sec: 10240.2, 60 sec: 10035.2, 300 sec: 10046.0). Total num frames: 2863104. Throughput: 0: 9950.8. Samples: 2850612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:18,234][977264] Avg episode reward: [(0, '493.783')] -[2023-07-08 13:57:18,234][977508] Saving new best policy, reward=493.783! -[2023-07-08 13:57:18,511][977552] Updated weights for policy 0, policy_version 5600 (0.0005) -[2023-07-08 13:57:22,566][977552] Updated weights for policy 0, policy_version 5680 (0.0005) -[2023-07-08 13:57:23,233][977264] Fps is (10 sec: 10239.9, 60 sec: 9966.9, 300 sec: 10042.3). Total num frames: 2912256. 
Throughput: 0: 10020.9. Samples: 2912144. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 13:57:23,234][977264] Avg episode reward: [(0, '483.809')] -[2023-07-08 13:57:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005688_2912256.pth... -[2023-07-08 13:57:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005104_2613248.pth -[2023-07-08 13:57:26,825][977552] Updated weights for policy 0, policy_version 5760 (0.0006) -[2023-07-08 13:57:28,233][977264] Fps is (10 sec: 9830.3, 60 sec: 9966.9, 300 sec: 10038.7). Total num frames: 2961408. Throughput: 0: 9992.9. Samples: 2940464. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 13:57:28,234][977264] Avg episode reward: [(0, '483.517')] -[2023-07-08 13:57:30,844][977552] Updated weights for policy 0, policy_version 5840 (0.0005) -[2023-07-08 13:57:33,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9966.9, 300 sec: 10205.3). Total num frames: 3010560. Throughput: 0: 9999.6. Samples: 3001312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:33,234][977264] Avg episode reward: [(0, '482.809')] -[2023-07-08 13:57:34,946][977552] Updated weights for policy 0, policy_version 5920 (0.0005) -[2023-07-08 13:57:38,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10205.3). Total num frames: 3063808. Throughput: 0: 10054.5. Samples: 3062284. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:38,234][977264] Avg episode reward: [(0, '479.053')] -[2023-07-08 13:57:38,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005984_3063808.pth... 
-[2023-07-08 13:57:38,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005392_2760704.pth -[2023-07-08 13:57:38,873][977552] Updated weights for policy 0, policy_version 6000 (0.0005) -[2023-07-08 13:57:42,907][977552] Updated weights for policy 0, policy_version 6080 (0.0005) -[2023-07-08 13:57:43,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10035.2, 300 sec: 10191.4). Total num frames: 3112960. Throughput: 0: 10048.5. Samples: 3092888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:43,234][977264] Avg episode reward: [(0, '484.666')] -[2023-07-08 13:57:47,174][977552] Updated weights for policy 0, policy_version 6160 (0.0005) -[2023-07-08 13:57:48,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10177.5). Total num frames: 3162112. Throughput: 0: 10019.6. Samples: 3151220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:48,234][977264] Avg episode reward: [(0, '488.867')] -[2023-07-08 13:57:50,969][977552] Updated weights for policy 0, policy_version 6240 (0.0005) -[2023-07-08 13:57:53,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10103.5, 300 sec: 10163.6). Total num frames: 3215360. Throughput: 0: 10099.3. Samples: 3215152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:57:53,234][977264] Avg episode reward: [(0, '487.344')] -[2023-07-08 13:57:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006280_3215360.pth... -[2023-07-08 13:57:53,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005688_2912256.pth -[2023-07-08 13:57:55,047][977552] Updated weights for policy 0, policy_version 6320 (0.0004) -[2023-07-08 13:57:58,233][977264] Fps is (10 sec: 10649.7, 60 sec: 10103.5, 300 sec: 10163.6). Total num frames: 3268608. Throughput: 0: 10080.8. 
Samples: 3244096. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 13:57:58,234][977264] Avg episode reward: [(0, '484.528')] -[2023-07-08 13:57:59,017][977552] Updated weights for policy 0, policy_version 6400 (0.0005) -[2023-07-08 13:58:03,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10122.0). Total num frames: 3313664. Throughput: 0: 10097.7. Samples: 3305008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:03,234][977264] Avg episode reward: [(0, '477.861')] -[2023-07-08 13:58:03,285][977552] Updated weights for policy 0, policy_version 6480 (0.0005) -[2023-07-08 13:58:07,607][977552] Updated weights for policy 0, policy_version 6560 (0.0005) -[2023-07-08 13:58:08,233][977264] Fps is (10 sec: 9420.7, 60 sec: 10035.2, 300 sec: 10094.2). Total num frames: 3362816. Throughput: 0: 10001.8. Samples: 3362224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:08,234][977264] Avg episode reward: [(0, '484.992')] -[2023-07-08 13:58:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006568_3362816.pth... -[2023-07-08 13:58:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005984_3063808.pth -[2023-07-08 13:58:11,674][977552] Updated weights for policy 0, policy_version 6640 (0.0005) -[2023-07-08 13:58:13,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10035.2, 300 sec: 10066.4). Total num frames: 3411968. Throughput: 0: 10026.0. Samples: 3391632. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 13:58:13,234][977264] Avg episode reward: [(0, '489.225')] -[2023-07-08 13:58:15,989][977552] Updated weights for policy 0, policy_version 6720 (0.0005) -[2023-07-08 13:58:18,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9966.9, 300 sec: 10052.6). Total num frames: 3461120. Throughput: 0: 9981.1. Samples: 3450460. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:18,234][977264] Avg episode reward: [(0, '478.643')] -[2023-07-08 13:58:19,802][977552] Updated weights for policy 0, policy_version 6800 (0.0005) -[2023-07-08 13:58:23,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10066.4). Total num frames: 3514368. Throughput: 0: 10020.1. Samples: 3513188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:23,234][977264] Avg episode reward: [(0, '483.845')] -[2023-07-08 13:58:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006864_3514368.pth... -[2023-07-08 13:58:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006280_3215360.pth -[2023-07-08 13:58:23,918][977552] Updated weights for policy 0, policy_version 6880 (0.0006) -[2023-07-08 13:58:28,121][977552] Updated weights for policy 0, policy_version 6960 (0.0005) -[2023-07-08 13:58:28,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10035.2, 300 sec: 10066.4). Total num frames: 3563520. Throughput: 0: 9982.5. Samples: 3542100. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:28,234][977264] Avg episode reward: [(0, '482.530')] -[2023-07-08 13:58:32,246][977552] Updated weights for policy 0, policy_version 7040 (0.0005) -[2023-07-08 13:58:33,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10066.4). Total num frames: 3612672. Throughput: 0: 9995.8. Samples: 3601032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:33,234][977264] Avg episode reward: [(0, '482.090')] -[2023-07-08 13:58:36,337][977552] Updated weights for policy 0, policy_version 7120 (0.0005) -[2023-07-08 13:58:38,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10066.4). Total num frames: 3661824. Throughput: 0: 9925.2. Samples: 3661784. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 13:58:38,234][977264] Avg episode reward: [(0, '486.369')] -[2023-07-08 13:58:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007152_3661824.pth... -[2023-07-08 13:58:38,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006568_3362816.pth -[2023-07-08 13:58:40,543][977552] Updated weights for policy 0, policy_version 7200 (0.0005) -[2023-07-08 13:58:43,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9966.9, 300 sec: 10066.4). Total num frames: 3710976. Throughput: 0: 9919.8. Samples: 3690488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:43,234][977264] Avg episode reward: [(0, '478.830')] -[2023-07-08 13:58:44,463][977552] Updated weights for policy 0, policy_version 7280 (0.0005) -[2023-07-08 13:58:48,178][977552] Updated weights for policy 0, policy_version 7360 (0.0005) -[2023-07-08 13:58:48,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10103.5, 300 sec: 10080.3). Total num frames: 3768320. Throughput: 0: 9973.6. Samples: 3753820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:48,234][977264] Avg episode reward: [(0, '486.292')] -[2023-07-08 13:58:52,249][977552] Updated weights for policy 0, policy_version 7440 (0.0004) -[2023-07-08 13:58:53,233][977264] Fps is (10 sec: 10649.5, 60 sec: 10035.2, 300 sec: 10066.4). Total num frames: 3817472. Throughput: 0: 10094.0. Samples: 3816452. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 13:58:53,234][977264] Avg episode reward: [(0, '486.247')] -[2023-07-08 13:58:53,238][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007456_3817472.pth... 
-[2023-07-08 13:58:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006864_3514368.pth -[2023-07-08 13:58:56,374][977552] Updated weights for policy 0, policy_version 7520 (0.0005) -[2023-07-08 13:58:58,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9966.9, 300 sec: 10066.4). Total num frames: 3866624. Throughput: 0: 10101.7. Samples: 3846208. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:58:58,234][977264] Avg episode reward: [(0, '480.418')] -[2023-07-08 13:59:00,459][977552] Updated weights for policy 0, policy_version 7600 (0.0005) -[2023-07-08 13:59:03,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10052.6). Total num frames: 3915776. Throughput: 0: 10092.4. Samples: 3904620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 13:59:03,234][977264] Avg episode reward: [(0, '489.984')] -[2023-07-08 13:59:04,728][977552] Updated weights for policy 0, policy_version 7680 (0.0005) -[2023-07-08 13:59:08,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10035.2, 300 sec: 10052.6). Total num frames: 3964928. Throughput: 0: 10007.9. Samples: 3963544. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 13:59:08,234][977264] Avg episode reward: [(0, '484.334')] -[2023-07-08 13:59:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007744_3964928.pth... -[2023-07-08 13:59:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007152_3661824.pth -[2023-07-08 13:59:08,933][977552] Updated weights for policy 0, policy_version 7760 (0.0005) -[2023-07-08 13:59:13,146][977552] Updated weights for policy 0, policy_version 7840 (0.0005) -[2023-07-08 13:59:13,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10035.2, 300 sec: 10038.7). Total num frames: 4014080. Throughput: 0: 10027.0. 
Samples: 3993312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:13,234][977264] Avg episode reward: [(0, '486.121')] -[2023-07-08 13:59:17,235][977552] Updated weights for policy 0, policy_version 7920 (0.0005) -[2023-07-08 13:59:18,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10035.2, 300 sec: 10024.8). Total num frames: 4063232. Throughput: 0: 10013.6. Samples: 4051644. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 13:59:18,234][977264] Avg episode reward: [(0, '482.406')] -[2023-07-08 13:59:21,010][977552] Updated weights for policy 0, policy_version 8000 (0.0005) -[2023-07-08 13:59:23,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10035.2, 300 sec: 10038.7). Total num frames: 4116480. Throughput: 0: 10104.4. Samples: 4116480. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 13:59:23,234][977264] Avg episode reward: [(0, '489.823')] -[2023-07-08 13:59:23,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008040_4116480.pth... -[2023-07-08 13:59:23,237][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007456_3817472.pth -[2023-07-08 13:59:24,992][977552] Updated weights for policy 0, policy_version 8080 (0.0005) -[2023-07-08 13:59:28,233][977264] Fps is (10 sec: 10649.5, 60 sec: 10103.5, 300 sec: 10052.6). Total num frames: 4169728. Throughput: 0: 10170.6. Samples: 4148168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:28,234][977264] Avg episode reward: [(0, '479.900')] -[2023-07-08 13:59:28,982][977552] Updated weights for policy 0, policy_version 8160 (0.0005) -[2023-07-08 13:59:33,043][977552] Updated weights for policy 0, policy_version 8240 (0.0005) -[2023-07-08 13:59:33,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10103.5, 300 sec: 10052.6). Total num frames: 4218880. Throughput: 0: 10077.8. Samples: 4207320. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:33,234][977264] Avg episode reward: [(0, '483.808')] -[2023-07-08 13:59:36,849][977552] Updated weights for policy 0, policy_version 8320 (0.0005) -[2023-07-08 13:59:38,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10171.7, 300 sec: 10066.4). Total num frames: 4272128. Throughput: 0: 10090.9. Samples: 4270540. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:38,234][977264] Avg episode reward: [(0, '475.594')] -[2023-07-08 13:59:38,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008344_4272128.pth... -[2023-07-08 13:59:38,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007744_3964928.pth -[2023-07-08 13:59:41,080][977552] Updated weights for policy 0, policy_version 8400 (0.0005) -[2023-07-08 13:59:43,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10103.4, 300 sec: 10038.7). Total num frames: 4317184. Throughput: 0: 10081.9. Samples: 4299892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:43,234][977264] Avg episode reward: [(0, '490.789')] -[2023-07-08 13:59:45,219][977552] Updated weights for policy 0, policy_version 8480 (0.0005) -[2023-07-08 13:59:48,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10052.6). Total num frames: 4370432. Throughput: 0: 10162.2. Samples: 4361920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:48,234][977264] Avg episode reward: [(0, '483.592')] -[2023-07-08 13:59:49,048][977552] Updated weights for policy 0, policy_version 8560 (0.0005) -[2023-07-08 13:59:53,173][977552] Updated weights for policy 0, policy_version 8640 (0.0006) -[2023-07-08 13:59:53,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 4423680. Throughput: 0: 10168.2. Samples: 4421112. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:53,234][977264] Avg episode reward: [(0, '485.308')] -[2023-07-08 13:59:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008640_4423680.pth... -[2023-07-08 13:59:53,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008040_4116480.pth -[2023-07-08 13:59:57,182][977552] Updated weights for policy 0, policy_version 8720 (0.0005) -[2023-07-08 13:59:58,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 4472832. Throughput: 0: 10202.5. Samples: 4452424. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 13:59:58,234][977264] Avg episode reward: [(0, '480.315')] -[2023-07-08 14:00:01,294][977552] Updated weights for policy 0, policy_version 8800 (0.0005) -[2023-07-08 14:00:03,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 4521984. Throughput: 0: 10219.7. Samples: 4511532. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:03,234][977264] Avg episode reward: [(0, '484.948')] -[2023-07-08 14:00:05,194][977552] Updated weights for policy 0, policy_version 8880 (0.0005) -[2023-07-08 14:00:08,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10080.3). Total num frames: 4575232. Throughput: 0: 10177.2. Samples: 4574456. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:08,234][977264] Avg episode reward: [(0, '495.570')] -[2023-07-08 14:00:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008936_4575232.pth... 
-[2023-07-08 14:00:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008344_4272128.pth -[2023-07-08 14:00:08,240][977508] Saving new best policy, reward=495.570! -[2023-07-08 14:00:09,201][977552] Updated weights for policy 0, policy_version 8960 (0.0005) -[2023-07-08 14:00:13,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10066.4). Total num frames: 4624384. Throughput: 0: 10127.7. Samples: 4603912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:13,234][977264] Avg episode reward: [(0, '486.486')] -[2023-07-08 14:00:13,324][977552] Updated weights for policy 0, policy_version 9040 (0.0005) -[2023-07-08 14:00:17,176][977552] Updated weights for policy 0, policy_version 9120 (0.0005) -[2023-07-08 14:00:18,233][977264] Fps is (10 sec: 10649.8, 60 sec: 10308.3, 300 sec: 10094.2). Total num frames: 4681728. Throughput: 0: 10201.4. Samples: 4666384. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:00:18,234][977264] Avg episode reward: [(0, '485.626')] -[2023-07-08 14:00:21,112][977552] Updated weights for policy 0, policy_version 9200 (0.0005) -[2023-07-08 14:00:23,233][977264] Fps is (10 sec: 10649.5, 60 sec: 10240.0, 300 sec: 10094.2). Total num frames: 4730880. Throughput: 0: 10178.6. Samples: 4728576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:23,234][977264] Avg episode reward: [(0, '483.712')] -[2023-07-08 14:00:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009240_4730880.pth... 
-[2023-07-08 14:00:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008640_4423680.pth -[2023-07-08 14:00:25,132][977552] Updated weights for policy 0, policy_version 9280 (0.0005) -[2023-07-08 14:00:28,234][977264] Fps is (10 sec: 10239.8, 60 sec: 10240.0, 300 sec: 10108.1). Total num frames: 4784128. Throughput: 0: 10199.7. Samples: 4758880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:28,234][977264] Avg episode reward: [(0, '486.740')] -[2023-07-08 14:00:28,739][977552] Updated weights for policy 0, policy_version 9360 (0.0005) -[2023-07-08 14:00:32,513][977552] Updated weights for policy 0, policy_version 9440 (0.0005) -[2023-07-08 14:00:33,233][977264] Fps is (10 sec: 10649.7, 60 sec: 10308.3, 300 sec: 10122.0). Total num frames: 4837376. Throughput: 0: 10306.4. Samples: 4825708. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:00:33,234][977264] Avg episode reward: [(0, '482.929')] -[2023-07-08 14:00:36,349][977552] Updated weights for policy 0, policy_version 9520 (0.0005) -[2023-07-08 14:00:38,233][977264] Fps is (10 sec: 10649.7, 60 sec: 10308.3, 300 sec: 10135.9). Total num frames: 4890624. Throughput: 0: 10422.6. Samples: 4890128. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:00:38,234][977264] Avg episode reward: [(0, '492.628')] -[2023-07-08 14:00:38,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009552_4890624.pth... -[2023-07-08 14:00:38,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008936_4575232.pth -[2023-07-08 14:00:40,004][977552] Updated weights for policy 0, policy_version 9600 (0.0005) -[2023-07-08 14:00:43,233][977264] Fps is (10 sec: 11059.2, 60 sec: 10513.1, 300 sec: 10163.6). Total num frames: 4947968. Throughput: 0: 10527.3. 
Samples: 4926152. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:00:43,234][977264] Avg episode reward: [(0, '492.329')] -[2023-07-08 14:00:43,741][977552] Updated weights for policy 0, policy_version 9680 (0.0005) -[2023-07-08 14:00:47,946][977552] Updated weights for policy 0, policy_version 9760 (0.0005) -[2023-07-08 14:00:48,233][977264] Fps is (10 sec: 10649.7, 60 sec: 10444.8, 300 sec: 10163.6). Total num frames: 4997120. Throughput: 0: 10579.4. Samples: 4987604. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:00:48,234][977264] Avg episode reward: [(0, '492.997')] -[2023-07-08 14:00:52,121][977552] Updated weights for policy 0, policy_version 9840 (0.0005) -[2023-07-08 14:00:53,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10376.5, 300 sec: 10163.6). Total num frames: 5046272. Throughput: 0: 10485.1. Samples: 5046284. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:53,234][977264] Avg episode reward: [(0, '485.031')] -[2023-07-08 14:00:53,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009856_5046272.pth... -[2023-07-08 14:00:53,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009240_4730880.pth -[2023-07-08 14:00:56,250][977552] Updated weights for policy 0, policy_version 9920 (0.0005) -[2023-07-08 14:00:58,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10163.6). Total num frames: 5099520. Throughput: 0: 10467.6. Samples: 5074956. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:00:58,234][977264] Avg episode reward: [(0, '488.010')] -[2023-07-08 14:01:00,082][977552] Updated weights for policy 0, policy_version 10000 (0.0005) -[2023-07-08 14:01:03,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10149.8). Total num frames: 5152768. Throughput: 0: 10536.9. Samples: 5140544. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:03,234][977264] Avg episode reward: [(0, '489.376')] -[2023-07-08 14:01:03,977][977552] Updated weights for policy 0, policy_version 10080 (0.0005) -[2023-07-08 14:01:08,101][977552] Updated weights for policy 0, policy_version 10160 (0.0005) -[2023-07-08 14:01:08,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10149.8). Total num frames: 5201920. Throughput: 0: 10505.3. Samples: 5201312. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:01:08,234][977264] Avg episode reward: [(0, '485.762')] -[2023-07-08 14:01:08,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010160_5201920.pth... -[2023-07-08 14:01:08,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009552_4890624.pth -[2023-07-08 14:01:12,164][977552] Updated weights for policy 0, policy_version 10240 (0.0005) -[2023-07-08 14:01:13,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10444.8, 300 sec: 10135.9). Total num frames: 5251072. Throughput: 0: 10483.9. Samples: 5230656. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:01:13,234][977264] Avg episode reward: [(0, '483.619')] -[2023-07-08 14:01:15,996][977552] Updated weights for policy 0, policy_version 10320 (0.0005) -[2023-07-08 14:01:18,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10376.5, 300 sec: 10135.9). Total num frames: 5304320. Throughput: 0: 10387.5. Samples: 5293148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:18,234][977264] Avg episode reward: [(0, '488.756')] -[2023-07-08 14:01:20,035][977552] Updated weights for policy 0, policy_version 10400 (0.0005) -[2023-07-08 14:01:23,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10149.7). Total num frames: 5357568. Throughput: 0: 10334.7. Samples: 5355192. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:23,234][977264] Avg episode reward: [(0, '497.667')] -[2023-07-08 14:01:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010464_5357568.pth... -[2023-07-08 14:01:23,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009856_5046272.pth -[2023-07-08 14:01:23,240][977508] Saving new best policy, reward=497.667! -[2023-07-08 14:01:24,011][977552] Updated weights for policy 0, policy_version 10480 (0.0005) -[2023-07-08 14:01:27,939][977552] Updated weights for policy 0, policy_version 10560 (0.0006) -[2023-07-08 14:01:28,233][977264] Fps is (10 sec: 10240.2, 60 sec: 10376.6, 300 sec: 10149.8). Total num frames: 5406720. Throughput: 0: 10224.4. Samples: 5386248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:28,234][977264] Avg episode reward: [(0, '489.298')] -[2023-07-08 14:01:31,799][977552] Updated weights for policy 0, policy_version 10640 (0.0005) -[2023-07-08 14:01:33,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10163.6). Total num frames: 5459968. Throughput: 0: 10250.8. Samples: 5448892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:33,234][977264] Avg episode reward: [(0, '484.435')] -[2023-07-08 14:01:35,917][977552] Updated weights for policy 0, policy_version 10720 (0.0005) -[2023-07-08 14:01:38,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10308.3, 300 sec: 10163.6). Total num frames: 5509120. Throughput: 0: 10289.8. Samples: 5509324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:38,234][977264] Avg episode reward: [(0, '485.771')] -[2023-07-08 14:01:38,250][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010768_5513216.pth... 
-[2023-07-08 14:01:38,252][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010160_5201920.pth -[2023-07-08 14:01:39,713][977552] Updated weights for policy 0, policy_version 10800 (0.0005) -[2023-07-08 14:01:43,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10240.0, 300 sec: 10177.5). Total num frames: 5562368. Throughput: 0: 10377.7. Samples: 5541952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:43,234][977264] Avg episode reward: [(0, '481.370')] -[2023-07-08 14:01:43,811][977552] Updated weights for policy 0, policy_version 10880 (0.0005) -[2023-07-08 14:01:47,879][977552] Updated weights for policy 0, policy_version 10960 (0.0005) -[2023-07-08 14:01:48,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10177.5). Total num frames: 5611520. Throughput: 0: 10260.2. Samples: 5602252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:48,234][977264] Avg episode reward: [(0, '491.669')] -[2023-07-08 14:01:51,977][977552] Updated weights for policy 0, policy_version 11040 (0.0005) -[2023-07-08 14:01:53,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10308.3, 300 sec: 10177.5). Total num frames: 5664768. Throughput: 0: 10244.3. Samples: 5662308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:53,234][977264] Avg episode reward: [(0, '486.616')] -[2023-07-08 14:01:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011064_5664768.pth... -[2023-07-08 14:01:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010464_5357568.pth -[2023-07-08 14:01:55,889][977552] Updated weights for policy 0, policy_version 11120 (0.0004) -[2023-07-08 14:01:58,233][977264] Fps is (10 sec: 10649.5, 60 sec: 10308.3, 300 sec: 10191.4). Total num frames: 5718016. 
Throughput: 0: 10285.5. Samples: 5693504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:01:58,234][977264] Avg episode reward: [(0, '485.894')] -[2023-07-08 14:01:59,708][977552] Updated weights for policy 0, policy_version 11200 (0.0005) -[2023-07-08 14:02:03,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10191.4). Total num frames: 5767168. Throughput: 0: 10323.1. Samples: 5757688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:02:03,234][977264] Avg episode reward: [(0, '491.909')] -[2023-07-08 14:02:03,809][977552] Updated weights for policy 0, policy_version 11280 (0.0005) -[2023-07-08 14:02:07,986][977552] Updated weights for policy 0, policy_version 11360 (0.0005) -[2023-07-08 14:02:08,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10240.0, 300 sec: 10191.4). Total num frames: 5816320. Throughput: 0: 10246.3. Samples: 5816276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:02:08,234][977264] Avg episode reward: [(0, '487.241')] -[2023-07-08 14:02:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011360_5816320.pth... -[2023-07-08 14:02:08,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010768_5513216.pth -[2023-07-08 14:02:12,257][977552] Updated weights for policy 0, policy_version 11440 (0.0005) -[2023-07-08 14:02:13,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10240.0, 300 sec: 10177.5). Total num frames: 5865472. Throughput: 0: 10195.9. Samples: 5845064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:02:13,234][977264] Avg episode reward: [(0, '488.835')] -[2023-07-08 14:02:16,380][977552] Updated weights for policy 0, policy_version 11520 (0.0005) -[2023-07-08 14:02:18,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10171.7, 300 sec: 10177.5). Total num frames: 5914624. Throughput: 0: 10113.2. 
Samples: 5903984. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:02:18,234][977264] Avg episode reward: [(0, '487.516')] -[2023-07-08 14:02:20,456][977552] Updated weights for policy 0, policy_version 11600 (0.0005) -[2023-07-08 14:02:23,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10177.5). Total num frames: 5963776. Throughput: 0: 10098.8. Samples: 5963772. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:02:23,234][977264] Avg episode reward: [(0, '490.235')] -[2023-07-08 14:02:23,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011648_5963776.pth... -[2023-07-08 14:02:23,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011064_5664768.pth -[2023-07-08 14:02:24,688][977552] Updated weights for policy 0, policy_version 11680 (0.0005) -[2023-07-08 14:02:28,233][977264] Fps is (10 sec: 9420.9, 60 sec: 10035.2, 300 sec: 10163.6). Total num frames: 6008832. Throughput: 0: 10007.7. Samples: 5992296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:02:28,234][977264] Avg episode reward: [(0, '490.878')] -[2023-07-08 14:02:29,099][977552] Updated weights for policy 0, policy_version 11760 (0.0005) -[2023-07-08 14:02:33,233][977264] Fps is (10 sec: 9420.8, 60 sec: 9966.9, 300 sec: 10149.7). Total num frames: 6057984. Throughput: 0: 9941.1. Samples: 6049600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:02:33,234][977264] Avg episode reward: [(0, '490.882')] -[2023-07-08 14:02:33,275][977552] Updated weights for policy 0, policy_version 11840 (0.0005) -[2023-07-08 14:02:37,353][977552] Updated weights for policy 0, policy_version 11920 (0.0005) -[2023-07-08 14:02:38,233][977264] Fps is (10 sec: 9830.3, 60 sec: 9966.9, 300 sec: 10149.7). Total num frames: 6107136. Throughput: 0: 9891.8. Samples: 6107440. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:02:38,234][977264] Avg episode reward: [(0, '491.866')] -[2023-07-08 14:02:38,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011928_6107136.pth... -[2023-07-08 14:02:38,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011360_5816320.pth -[2023-07-08 14:02:41,445][977552] Updated weights for policy 0, policy_version 12000 (0.0005) -[2023-07-08 14:02:43,233][977264] Fps is (10 sec: 10240.0, 60 sec: 9966.9, 300 sec: 10163.6). Total num frames: 6160384. Throughput: 0: 9858.3. Samples: 6137128. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:02:43,234][977264] Avg episode reward: [(0, '484.766')] -[2023-07-08 14:02:45,625][977552] Updated weights for policy 0, policy_version 12080 (0.0005) -[2023-07-08 14:02:48,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9898.7, 300 sec: 10135.9). Total num frames: 6205440. Throughput: 0: 9777.4. Samples: 6197672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:02:48,234][977264] Avg episode reward: [(0, '483.706')] -[2023-07-08 14:02:49,949][977552] Updated weights for policy 0, policy_version 12160 (0.0005) -[2023-07-08 14:02:53,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9898.7, 300 sec: 10135.9). Total num frames: 6258688. Throughput: 0: 9778.1. Samples: 6256292. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:02:53,234][977264] Avg episode reward: [(0, '487.530')] -[2023-07-08 14:02:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012224_6258688.pth... 
-[2023-07-08 14:02:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011648_5963776.pth -[2023-07-08 14:02:53,789][977552] Updated weights for policy 0, policy_version 12240 (0.0005) -[2023-07-08 14:02:57,768][977552] Updated weights for policy 0, policy_version 12320 (0.0005) -[2023-07-08 14:02:58,233][977264] Fps is (10 sec: 10649.5, 60 sec: 9898.7, 300 sec: 10163.6). Total num frames: 6311936. Throughput: 0: 9874.8. Samples: 6289428. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:02:58,234][977264] Avg episode reward: [(0, '482.646')] -[2023-07-08 14:03:01,505][977552] Updated weights for policy 0, policy_version 12400 (0.0005) -[2023-07-08 14:03:03,233][977264] Fps is (10 sec: 10649.6, 60 sec: 9966.9, 300 sec: 10177.5). Total num frames: 6365184. Throughput: 0: 9978.8. Samples: 6353032. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:03:03,234][977264] Avg episode reward: [(0, '487.252')] -[2023-07-08 14:03:05,701][977552] Updated weights for policy 0, policy_version 12480 (0.0005) -[2023-07-08 14:03:08,233][977264] Fps is (10 sec: 10240.0, 60 sec: 9966.9, 300 sec: 10177.5). Total num frames: 6414336. Throughput: 0: 10012.3. Samples: 6414324. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:03:08,234][977264] Avg episode reward: [(0, '490.533')] -[2023-07-08 14:03:08,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012528_6414336.pth... -[2023-07-08 14:03:08,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011928_6107136.pth -[2023-07-08 14:03:09,653][977552] Updated weights for policy 0, policy_version 12560 (0.0005) -[2023-07-08 14:03:13,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9966.9, 300 sec: 10177.5). Total num frames: 6463488. Throughput: 0: 10016.1. 
Samples: 6443020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:03:13,234][977264] Avg episode reward: [(0, '485.693')] -[2023-07-08 14:03:13,732][977552] Updated weights for policy 0, policy_version 12640 (0.0005) -[2023-07-08 14:03:17,821][977552] Updated weights for policy 0, policy_version 12720 (0.0004) -[2023-07-08 14:03:18,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10163.6). Total num frames: 6512640. Throughput: 0: 10108.0. Samples: 6504460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:03:18,234][977264] Avg episode reward: [(0, '491.158')] -[2023-07-08 14:03:22,047][977552] Updated weights for policy 0, policy_version 12800 (0.0005) -[2023-07-08 14:03:23,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10177.5). Total num frames: 6565888. Throughput: 0: 10120.4. Samples: 6562860. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:03:23,234][977264] Avg episode reward: [(0, '485.691')] -[2023-07-08 14:03:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012824_6565888.pth... -[2023-07-08 14:03:23,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012224_6258688.pth -[2023-07-08 14:03:26,020][977552] Updated weights for policy 0, policy_version 12880 (0.0005) -[2023-07-08 14:03:28,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10103.5, 300 sec: 10177.5). Total num frames: 6615040. Throughput: 0: 10133.7. Samples: 6593144. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:03:28,234][977264] Avg episode reward: [(0, '482.275')] -[2023-07-08 14:03:29,839][977552] Updated weights for policy 0, policy_version 12960 (0.0005) -[2023-07-08 14:03:33,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10171.7, 300 sec: 10191.4). Total num frames: 6668288. Throughput: 0: 10221.1. Samples: 6657620. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:03:33,234][977264] Avg episode reward: [(0, '493.856')] -[2023-07-08 14:03:33,783][977552] Updated weights for policy 0, policy_version 13040 (0.0005) -[2023-07-08 14:03:37,931][977552] Updated weights for policy 0, policy_version 13120 (0.0005) -[2023-07-08 14:03:38,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10191.4). Total num frames: 6717440. Throughput: 0: 10247.9. Samples: 6717448. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:03:38,234][977264] Avg episode reward: [(0, '491.420')] -[2023-07-08 14:03:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013120_6717440.pth... -[2023-07-08 14:03:38,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012528_6414336.pth -[2023-07-08 14:03:41,955][977552] Updated weights for policy 0, policy_version 13200 (0.0005) -[2023-07-08 14:03:43,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10177.5). Total num frames: 6770688. Throughput: 0: 10184.3. Samples: 6747720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:03:43,234][977264] Avg episode reward: [(0, '485.865')] -[2023-07-08 14:03:46,054][977552] Updated weights for policy 0, policy_version 13280 (0.0004) -[2023-07-08 14:03:48,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10240.0, 300 sec: 10177.5). Total num frames: 6819840. Throughput: 0: 10101.9. Samples: 6807616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:03:48,234][977264] Avg episode reward: [(0, '482.221')] -[2023-07-08 14:03:50,073][977552] Updated weights for policy 0, policy_version 13360 (0.0005) -[2023-07-08 14:03:53,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10240.0, 300 sec: 10191.4). Total num frames: 6873088. Throughput: 0: 10176.4. Samples: 6872264. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 14:03:53,234][977264] Avg episode reward: [(0, '494.584')] -[2023-07-08 14:03:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013424_6873088.pth... -[2023-07-08 14:03:53,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012824_6565888.pth -[2023-07-08 14:03:53,827][977552] Updated weights for policy 0, policy_version 13440 (0.0004) -[2023-07-08 14:03:57,984][977552] Updated weights for policy 0, policy_version 13520 (0.0005) -[2023-07-08 14:03:58,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10191.4). Total num frames: 6922240. Throughput: 0: 10195.6. Samples: 6901824. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 14:03:58,234][977264] Avg episode reward: [(0, '479.405')] -[2023-07-08 14:04:02,186][977552] Updated weights for policy 0, policy_version 13600 (0.0006) -[2023-07-08 14:04:03,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10191.4). Total num frames: 6971392. Throughput: 0: 10133.2. Samples: 6960456. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:04:03,234][977264] Avg episode reward: [(0, '484.932')] -[2023-07-08 14:04:06,390][977552] Updated weights for policy 0, policy_version 13680 (0.0005) -[2023-07-08 14:04:08,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10205.3). Total num frames: 7024640. Throughput: 0: 10186.2. Samples: 7021240. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:04:08,234][977264] Avg episode reward: [(0, '482.755')] -[2023-07-08 14:04:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013720_7024640.pth... 
-[2023-07-08 14:04:08,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013120_6717440.pth -[2023-07-08 14:04:10,267][977552] Updated weights for policy 0, policy_version 13760 (0.0005) -[2023-07-08 14:04:13,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10171.7, 300 sec: 10205.3). Total num frames: 7073792. Throughput: 0: 10200.2. Samples: 7052152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:13,235][977264] Avg episode reward: [(0, '485.252')] -[2023-07-08 14:04:14,403][977552] Updated weights for policy 0, policy_version 13840 (0.0005) -[2023-07-08 14:04:18,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10171.7, 300 sec: 10191.4). Total num frames: 7122944. Throughput: 0: 10069.6. Samples: 7110752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:18,234][977264] Avg episode reward: [(0, '482.015')] -[2023-07-08 14:04:18,554][977552] Updated weights for policy 0, policy_version 13920 (0.0006) -[2023-07-08 14:04:22,707][977552] Updated weights for policy 0, policy_version 14000 (0.0005) -[2023-07-08 14:04:23,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10103.5, 300 sec: 10177.5). Total num frames: 7172096. Throughput: 0: 10062.8. Samples: 7170276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:23,234][977264] Avg episode reward: [(0, '492.785')] -[2023-07-08 14:04:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014008_7172096.pth... -[2023-07-08 14:04:23,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013424_6873088.pth -[2023-07-08 14:04:26,782][977552] Updated weights for policy 0, policy_version 14080 (0.0005) -[2023-07-08 14:04:28,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10103.5, 300 sec: 10177.5). Total num frames: 7221248. Throughput: 0: 10046.5. 
Samples: 7199812. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:28,235][977264] Avg episode reward: [(0, '490.103')] -[2023-07-08 14:04:31,167][977552] Updated weights for policy 0, policy_version 14160 (0.0005) -[2023-07-08 14:04:33,233][977264] Fps is (10 sec: 9830.6, 60 sec: 10035.2, 300 sec: 10163.6). Total num frames: 7270400. Throughput: 0: 10002.2. Samples: 7257716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:33,234][977264] Avg episode reward: [(0, '490.483')] -[2023-07-08 14:04:35,209][977552] Updated weights for policy 0, policy_version 14240 (0.0005) -[2023-07-08 14:04:38,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10103.5, 300 sec: 10191.4). Total num frames: 7323648. Throughput: 0: 9979.4. Samples: 7321336. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:38,235][977264] Avg episode reward: [(0, '492.468')] -[2023-07-08 14:04:38,239][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014304_7323648.pth... -[2023-07-08 14:04:38,241][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013720_7024640.pth -[2023-07-08 14:04:38,975][977552] Updated weights for policy 0, policy_version 14320 (0.0005) -[2023-07-08 14:04:43,195][977552] Updated weights for policy 0, policy_version 14400 (0.0005) -[2023-07-08 14:04:43,234][977264] Fps is (10 sec: 10239.8, 60 sec: 10035.2, 300 sec: 10177.5). Total num frames: 7372800. Throughput: 0: 9978.3. Samples: 7350848. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:04:43,235][977264] Avg episode reward: [(0, '489.332')] -[2023-07-08 14:04:46,924][977552] Updated weights for policy 0, policy_version 14480 (0.0005) -[2023-07-08 14:04:48,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10103.5, 300 sec: 10177.5). Total num frames: 7426048. Throughput: 0: 10073.6. Samples: 7413768. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:04:48,234][977264] Avg episode reward: [(0, '484.785')] -[2023-07-08 14:04:51,008][977552] Updated weights for policy 0, policy_version 14560 (0.0005) -[2023-07-08 14:04:53,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10035.2, 300 sec: 10177.5). Total num frames: 7475200. Throughput: 0: 10084.5. Samples: 7475040. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:04:53,234][977264] Avg episode reward: [(0, '491.927')] -[2023-07-08 14:04:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014600_7475200.pth... -[2023-07-08 14:04:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014008_7172096.pth -[2023-07-08 14:04:55,020][977552] Updated weights for policy 0, policy_version 14640 (0.0005) -[2023-07-08 14:04:58,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10103.5, 300 sec: 10191.4). Total num frames: 7528448. Throughput: 0: 10059.9. Samples: 7504848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:04:58,235][977264] Avg episode reward: [(0, '487.223')] -[2023-07-08 14:04:59,070][977552] Updated weights for policy 0, policy_version 14720 (0.0005) -[2023-07-08 14:05:03,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10163.6). Total num frames: 7573504. Throughput: 0: 10054.7. Samples: 7563216. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:05:03,235][977264] Avg episode reward: [(0, '483.721')] -[2023-07-08 14:05:03,425][977552] Updated weights for policy 0, policy_version 14800 (0.0005) -[2023-07-08 14:05:07,496][977552] Updated weights for policy 0, policy_version 14880 (0.0005) -[2023-07-08 14:05:08,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10035.2, 300 sec: 10177.5). Total num frames: 7626752. Throughput: 0: 10053.1. Samples: 7622664. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 14:05:08,234][977264] Avg episode reward: [(0, '490.998')] -[2023-07-08 14:05:08,238][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014896_7626752.pth... -[2023-07-08 14:05:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014304_7323648.pth -[2023-07-08 14:05:11,651][977552] Updated weights for policy 0, policy_version 14960 (0.0006) -[2023-07-08 14:05:13,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10135.9). Total num frames: 7671808. Throughput: 0: 10039.8. Samples: 7651604. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 14:05:13,234][977264] Avg episode reward: [(0, '492.352')] -[2023-07-08 14:05:15,773][977552] Updated weights for policy 0, policy_version 15040 (0.0006) -[2023-07-08 14:05:18,233][977264] Fps is (10 sec: 9420.8, 60 sec: 9966.9, 300 sec: 10135.9). Total num frames: 7720960. Throughput: 0: 10097.3. Samples: 7712096. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:05:18,234][977264] Avg episode reward: [(0, '485.160')] -[2023-07-08 14:05:20,005][977552] Updated weights for policy 0, policy_version 15120 (0.0005) -[2023-07-08 14:05:23,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10035.2, 300 sec: 10135.9). Total num frames: 7774208. Throughput: 0: 10065.3. Samples: 7774272. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 14:05:23,234][977264] Avg episode reward: [(0, '490.859')] -[2023-07-08 14:05:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015184_7774208.pth... 
-[2023-07-08 14:05:23,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014600_7475200.pth -[2023-07-08 14:05:23,782][977552] Updated weights for policy 0, policy_version 15200 (0.0005) -[2023-07-08 14:05:27,613][977552] Updated weights for policy 0, policy_version 15280 (0.0005) -[2023-07-08 14:05:28,233][977264] Fps is (10 sec: 10649.7, 60 sec: 10103.5, 300 sec: 10135.9). Total num frames: 7827456. Throughput: 0: 10136.4. Samples: 7806984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:05:28,234][977264] Avg episode reward: [(0, '490.980')] -[2023-07-08 14:05:31,792][977552] Updated weights for policy 0, policy_version 15360 (0.0005) -[2023-07-08 14:05:33,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10103.5, 300 sec: 10122.0). Total num frames: 7876608. Throughput: 0: 10044.6. Samples: 7865776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:05:33,234][977264] Avg episode reward: [(0, '487.258')] -[2023-07-08 14:05:35,799][977552] Updated weights for policy 0, policy_version 15440 (0.0005) -[2023-07-08 14:05:38,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10094.2). Total num frames: 7925760. Throughput: 0: 10023.0. Samples: 7926076. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 14:05:38,234][977264] Avg episode reward: [(0, '490.847')] -[2023-07-08 14:05:38,244][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015488_7929856.pth... -[2023-07-08 14:05:38,245][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014896_7626752.pth -[2023-07-08 14:05:39,945][977552] Updated weights for policy 0, policy_version 15520 (0.0006) -[2023-07-08 14:05:43,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10035.2, 300 sec: 10094.2). Total num frames: 7974912. 
Throughput: 0: 10001.0. Samples: 7954892. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 14:05:43,234][977264] Avg episode reward: [(0, '488.597')] -[2023-07-08 14:05:44,172][977552] Updated weights for policy 0, policy_version 15600 (0.0006) -[2023-07-08 14:05:48,014][977552] Updated weights for policy 0, policy_version 15680 (0.0005) -[2023-07-08 14:05:48,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10108.1). Total num frames: 8028160. Throughput: 0: 10084.0. Samples: 8016996. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 14:05:48,234][977264] Avg episode reward: [(0, '480.957')] -[2023-07-08 14:05:52,278][977552] Updated weights for policy 0, policy_version 15760 (0.0005) -[2023-07-08 14:05:53,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10035.2, 300 sec: 10094.2). Total num frames: 8077312. Throughput: 0: 10083.7. Samples: 8076428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:05:53,234][977264] Avg episode reward: [(0, '490.652')] -[2023-07-08 14:05:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015776_8077312.pth... -[2023-07-08 14:05:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015184_7774208.pth -[2023-07-08 14:05:56,316][977552] Updated weights for policy 0, policy_version 15840 (0.0005) -[2023-07-08 14:05:58,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10080.3). Total num frames: 8126464. Throughput: 0: 10105.1. Samples: 8106336. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:05:58,234][977264] Avg episode reward: [(0, '485.119')] -[2023-07-08 14:06:00,347][977552] Updated weights for policy 0, policy_version 15920 (0.0005) -[2023-07-08 14:06:03,233][977264] Fps is (10 sec: 10240.1, 60 sec: 10103.5, 300 sec: 10094.2). Total num frames: 8179712. Throughput: 0: 10117.4. 
Samples: 8167380. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:03,241][977264] Avg episode reward: [(0, '495.501')] -[2023-07-08 14:06:04,167][977552] Updated weights for policy 0, policy_version 16000 (0.0005) -[2023-07-08 14:06:08,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10094.2). Total num frames: 8228864. Throughput: 0: 10102.0. Samples: 8228864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:08,234][977264] Avg episode reward: [(0, '494.055')] -[2023-07-08 14:06:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016072_8228864.pth... -[2023-07-08 14:06:08,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015488_7929856.pth -[2023-07-08 14:06:08,382][977552] Updated weights for policy 0, policy_version 16080 (0.0004) -[2023-07-08 14:06:12,093][977552] Updated weights for policy 0, policy_version 16160 (0.0005) -[2023-07-08 14:06:13,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10094.2). Total num frames: 8282112. Throughput: 0: 10046.9. Samples: 8259092. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:06:13,234][977264] Avg episode reward: [(0, '489.341')] -[2023-07-08 14:06:16,033][977552] Updated weights for policy 0, policy_version 16240 (0.0006) -[2023-07-08 14:06:18,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10240.0, 300 sec: 10094.2). Total num frames: 8335360. Throughput: 0: 10163.5. Samples: 8323136. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:06:18,234][977264] Avg episode reward: [(0, '498.973')] -[2023-07-08 14:06:18,235][977508] Saving new best policy, reward=498.973! -[2023-07-08 14:06:20,186][977552] Updated weights for policy 0, policy_version 16320 (0.0005) -[2023-07-08 14:06:23,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10094.2). 
Total num frames: 8384512. Throughput: 0: 10187.6. Samples: 8384520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:06:23,234][977264] Avg episode reward: [(0, '490.038')] -[2023-07-08 14:06:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016376_8384512.pth... -[2023-07-08 14:06:23,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015776_8077312.pth -[2023-07-08 14:06:24,199][977552] Updated weights for policy 0, policy_version 16400 (0.0005) -[2023-07-08 14:06:28,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10103.5, 300 sec: 10080.3). Total num frames: 8433664. Throughput: 0: 10184.4. Samples: 8413192. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:28,235][977264] Avg episode reward: [(0, '493.254')] -[2023-07-08 14:06:28,255][977552] Updated weights for policy 0, policy_version 16480 (0.0005) -[2023-07-08 14:06:32,364][977552] Updated weights for policy 0, policy_version 16560 (0.0005) -[2023-07-08 14:06:33,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10094.2). Total num frames: 8486912. Throughput: 0: 10169.5. Samples: 8474624. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:33,234][977264] Avg episode reward: [(0, '489.010')] -[2023-07-08 14:06:36,500][977552] Updated weights for policy 0, policy_version 16640 (0.0005) -[2023-07-08 14:06:38,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10171.7, 300 sec: 10080.3). Total num frames: 8536064. Throughput: 0: 10151.8. Samples: 8533260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:38,264][977264] Avg episode reward: [(0, '492.120')] -[2023-07-08 14:06:38,268][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016672_8536064.pth... 
-[2023-07-08 14:06:38,271][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016072_8228864.pth -[2023-07-08 14:06:40,793][977552] Updated weights for policy 0, policy_version 16720 (0.0005) -[2023-07-08 14:06:43,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10171.7, 300 sec: 10080.3). Total num frames: 8585216. Throughput: 0: 10120.6. Samples: 8561764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:43,234][977264] Avg episode reward: [(0, '488.501')] -[2023-07-08 14:06:44,572][977552] Updated weights for policy 0, policy_version 16800 (0.0005) -[2023-07-08 14:06:48,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 8634368. Throughput: 0: 10197.0. Samples: 8626248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:06:48,234][977264] Avg episode reward: [(0, '494.433')] -[2023-07-08 14:06:48,681][977552] Updated weights for policy 0, policy_version 16880 (0.0005) -[2023-07-08 14:06:52,905][977552] Updated weights for policy 0, policy_version 16960 (0.0005) -[2023-07-08 14:06:53,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10052.6). Total num frames: 8683520. Throughput: 0: 10104.9. Samples: 8683584. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:06:53,234][977264] Avg episode reward: [(0, '486.195')] -[2023-07-08 14:06:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016960_8683520.pth... -[2023-07-08 14:06:53,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016376_8384512.pth -[2023-07-08 14:06:57,057][977552] Updated weights for policy 0, policy_version 17040 (0.0005) -[2023-07-08 14:06:58,234][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10066.4). Total num frames: 8736768. Throughput: 0: 10100.1. 
Samples: 8713600. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 14:06:58,235][977264] Avg episode reward: [(0, '491.315')] -[2023-07-08 14:07:01,076][977552] Updated weights for policy 0, policy_version 17120 (0.0005) -[2023-07-08 14:07:03,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10035.2, 300 sec: 10052.6). Total num frames: 8781824. Throughput: 0: 10011.2. Samples: 8773640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:07:03,234][977264] Avg episode reward: [(0, '493.402')] -[2023-07-08 14:07:05,363][977552] Updated weights for policy 0, policy_version 17200 (0.0005) -[2023-07-08 14:07:08,233][977264] Fps is (10 sec: 9830.6, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 8835072. Throughput: 0: 9990.3. Samples: 8834084. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:07:08,234][977264] Avg episode reward: [(0, '493.901')] -[2023-07-08 14:07:08,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017256_8835072.pth... -[2023-07-08 14:07:08,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016672_8536064.pth -[2023-07-08 14:07:09,200][977552] Updated weights for policy 0, policy_version 17280 (0.0005) -[2023-07-08 14:07:13,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 10066.4). Total num frames: 8884224. Throughput: 0: 10036.0. Samples: 8864812. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:07:13,234][977264] Avg episode reward: [(0, '486.657')] -[2023-07-08 14:07:13,294][977552] Updated weights for policy 0, policy_version 17360 (0.0005) -[2023-07-08 14:07:17,517][977552] Updated weights for policy 0, policy_version 17440 (0.0005) -[2023-07-08 14:07:18,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10066.4). Total num frames: 8933376. Throughput: 0: 9994.3. Samples: 8924368. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:07:18,234][977264] Avg episode reward: [(0, '489.651')] -[2023-07-08 14:07:21,774][977552] Updated weights for policy 0, policy_version 17520 (0.0005) -[2023-07-08 14:07:23,234][977264] Fps is (10 sec: 9830.2, 60 sec: 9966.9, 300 sec: 10080.3). Total num frames: 8982528. Throughput: 0: 9979.0. Samples: 8982316. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 14:07:23,234][977264] Avg episode reward: [(0, '487.487')] -[2023-07-08 14:07:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017544_8982528.pth... -[2023-07-08 14:07:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016960_8683520.pth -[2023-07-08 14:07:25,955][977552] Updated weights for policy 0, policy_version 17600 (0.0006) -[2023-07-08 14:07:28,233][977264] Fps is (10 sec: 9830.3, 60 sec: 9966.9, 300 sec: 10080.3). Total num frames: 9031680. Throughput: 0: 9986.8. Samples: 9011168. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:07:28,234][977264] Avg episode reward: [(0, '484.911')] -[2023-07-08 14:07:29,876][977552] Updated weights for policy 0, policy_version 17680 (0.0005) -[2023-07-08 14:07:33,233][977264] Fps is (10 sec: 9830.6, 60 sec: 9898.7, 300 sec: 10080.3). Total num frames: 9080832. Throughput: 0: 9895.2. Samples: 9071532. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:07:33,234][977264] Avg episode reward: [(0, '492.052')] -[2023-07-08 14:07:34,133][977552] Updated weights for policy 0, policy_version 17760 (0.0005) -[2023-07-08 14:07:38,020][977552] Updated weights for policy 0, policy_version 17840 (0.0005) -[2023-07-08 14:07:38,233][977264] Fps is (10 sec: 10240.0, 60 sec: 9966.9, 300 sec: 10080.3). Total num frames: 9134080. Throughput: 0: 9993.3. Samples: 9133284. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:07:38,234][977264] Avg episode reward: [(0, '488.262')] -[2023-07-08 14:07:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017840_9134080.pth... -[2023-07-08 14:07:38,238][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017256_8835072.pth -[2023-07-08 14:07:42,162][977552] Updated weights for policy 0, policy_version 17920 (0.0005) -[2023-07-08 14:07:43,233][977264] Fps is (10 sec: 10649.5, 60 sec: 10035.2, 300 sec: 10108.1). Total num frames: 9187328. Throughput: 0: 9986.4. Samples: 9162988. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:07:43,234][977264] Avg episode reward: [(0, '497.159')] -[2023-07-08 14:07:45,628][977552] Updated weights for policy 0, policy_version 18000 (0.0005) -[2023-07-08 14:07:48,233][977264] Fps is (10 sec: 10649.6, 60 sec: 10103.5, 300 sec: 10108.1). Total num frames: 9240576. Throughput: 0: 10104.9. Samples: 9228364. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 14:07:48,234][977264] Avg episode reward: [(0, '487.345')] -[2023-07-08 14:07:49,606][977552] Updated weights for policy 0, policy_version 18080 (0.0005) -[2023-07-08 14:07:53,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10103.5, 300 sec: 10094.2). Total num frames: 9289728. Throughput: 0: 10126.8. Samples: 9289792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:07:53,234][977264] Avg episode reward: [(0, '490.375')] -[2023-07-08 14:07:53,260][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018152_9293824.pth... 
-[2023-07-08 14:07:53,262][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017544_8982528.pth -[2023-07-08 14:07:53,721][977552] Updated weights for policy 0, policy_version 18160 (0.0005) -[2023-07-08 14:07:57,988][977552] Updated weights for policy 0, policy_version 18240 (0.0005) -[2023-07-08 14:07:58,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10080.3). Total num frames: 9338880. Throughput: 0: 10081.5. Samples: 9318480. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:07:58,234][977264] Avg episode reward: [(0, '488.751')] -[2023-07-08 14:08:02,111][977552] Updated weights for policy 0, policy_version 18320 (0.0005) -[2023-07-08 14:08:03,233][977264] Fps is (10 sec: 9830.3, 60 sec: 10103.5, 300 sec: 10080.3). Total num frames: 9388032. Throughput: 0: 10095.8. Samples: 9378680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:03,234][977264] Avg episode reward: [(0, '482.480')] -[2023-07-08 14:08:06,319][977552] Updated weights for policy 0, policy_version 18400 (0.0005) -[2023-07-08 14:08:08,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10080.3). Total num frames: 9437184. Throughput: 0: 10107.2. Samples: 9437140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:08,234][977264] Avg episode reward: [(0, '487.353')] -[2023-07-08 14:08:08,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018432_9437184.pth... -[2023-07-08 14:08:08,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017840_9134080.pth -[2023-07-08 14:08:10,175][977552] Updated weights for policy 0, policy_version 18480 (0.0005) -[2023-07-08 14:08:13,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10103.5, 300 sec: 10094.2). Total num frames: 9490432. Throughput: 0: 10179.9. 
Samples: 9469264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:13,234][977264] Avg episode reward: [(0, '495.369')] -[2023-07-08 14:08:14,249][977552] Updated weights for policy 0, policy_version 18560 (0.0005) -[2023-07-08 14:08:18,233][977264] Fps is (10 sec: 10240.0, 60 sec: 10103.5, 300 sec: 10080.3). Total num frames: 9539584. Throughput: 0: 10210.4. Samples: 9531000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:18,234][977264] Avg episode reward: [(0, '493.564')] -[2023-07-08 14:08:18,324][977552] Updated weights for policy 0, policy_version 18640 (0.0005) -[2023-07-08 14:08:22,444][977552] Updated weights for policy 0, policy_version 18720 (0.0005) -[2023-07-08 14:08:23,233][977264] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10094.2). Total num frames: 9592832. Throughput: 0: 10124.3. Samples: 9588876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:23,234][977264] Avg episode reward: [(0, '489.757')] -[2023-07-08 14:08:23,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018736_9592832.pth... -[2023-07-08 14:08:23,240][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018152_9293824.pth -[2023-07-08 14:08:26,729][977552] Updated weights for policy 0, policy_version 18800 (0.0005) -[2023-07-08 14:08:28,233][977264] Fps is (10 sec: 9830.5, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 9637888. Throughput: 0: 10110.9. Samples: 9617976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:28,234][977264] Avg episode reward: [(0, '490.559')] -[2023-07-08 14:08:30,743][977552] Updated weights for policy 0, policy_version 18880 (0.0005) -[2023-07-08 14:08:33,233][977264] Fps is (10 sec: 9420.9, 60 sec: 10103.5, 300 sec: 10066.4). Total num frames: 9687040. Throughput: 0: 9993.6. Samples: 9678076. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:33,234][977264] Avg episode reward: [(0, '488.626')] -[2023-07-08 14:08:34,862][977552] Updated weights for policy 0, policy_version 18960 (0.0005) -[2023-07-08 14:08:38,233][977264] Fps is (10 sec: 9830.4, 60 sec: 10035.2, 300 sec: 10052.6). Total num frames: 9736192. Throughput: 0: 9925.4. Samples: 9736436. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:38,234][977264] Avg episode reward: [(0, '490.584')] -[2023-07-08 14:08:38,236][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019016_9736192.pth... -[2023-07-08 14:08:38,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018432_9437184.pth -[2023-07-08 14:08:39,148][977552] Updated weights for policy 0, policy_version 19040 (0.0005) -[2023-07-08 14:08:43,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9966.9, 300 sec: 10052.6). Total num frames: 9785344. Throughput: 0: 9945.5. Samples: 9766028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:43,234][977264] Avg episode reward: [(0, '487.486')] -[2023-07-08 14:08:43,342][977552] Updated weights for policy 0, policy_version 19120 (0.0005) -[2023-07-08 14:08:47,423][977552] Updated weights for policy 0, policy_version 19200 (0.0005) -[2023-07-08 14:08:48,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9898.7, 300 sec: 10038.7). Total num frames: 9834496. Throughput: 0: 9942.1. Samples: 9826072. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:48,234][977264] Avg episode reward: [(0, '495.025')] -[2023-07-08 14:08:51,792][977552] Updated weights for policy 0, policy_version 19280 (0.0006) -[2023-07-08 14:08:53,233][977264] Fps is (10 sec: 9830.3, 60 sec: 9898.7, 300 sec: 10038.7). Total num frames: 9883648. Throughput: 0: 9911.3. Samples: 9883148. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:53,234][977264] Avg episode reward: [(0, '493.686')] -[2023-07-08 14:08:53,237][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019304_9883648.pth... -[2023-07-08 14:08:53,239][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018736_9592832.pth -[2023-07-08 14:08:55,856][977552] Updated weights for policy 0, policy_version 19360 (0.0005) -[2023-07-08 14:08:58,233][977264] Fps is (10 sec: 9830.4, 60 sec: 9898.7, 300 sec: 10038.7). Total num frames: 9932800. Throughput: 0: 9854.1. Samples: 9912696. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:08:58,234][977264] Avg episode reward: [(0, '481.864')] -[2023-07-08 14:09:00,040][977552] Updated weights for policy 0, policy_version 19440 (0.0004) -[2023-07-08 14:09:03,233][977264] Fps is (10 sec: 9830.5, 60 sec: 9898.7, 300 sec: 10024.8). Total num frames: 9981952. Throughput: 0: 9778.2. Samples: 9971020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 14:09:03,234][977264] Avg episode reward: [(0, '488.932')] -[2023-07-08 14:09:04,433][977552] Updated weights for policy 0, policy_version 19520 (0.0005) -[2023-07-08 14:09:05,691][977508] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000 -[2023-07-08 14:09:05,692][977684] Stopping RolloutWorker_w6... -[2023-07-08 14:09:05,692][977562] Stopping RolloutWorker_w3... -[2023-07-08 14:09:05,692][977555] Stopping RolloutWorker_w2... -[2023-07-08 14:09:05,692][977554] Stopping RolloutWorker_w1... -[2023-07-08 14:09:05,692][977620] Stopping RolloutWorker_w5... -[2023-07-08 14:09:05,692][977619] Stopping RolloutWorker_w4... -[2023-07-08 14:09:05,692][977553] Stopping RolloutWorker_w0... -[2023-07-08 14:09:05,692][977684] Loop rollout_proc6_evt_loop terminating... 
-[2023-07-08 14:09:05,692][977562] Loop rollout_proc3_evt_loop terminating... -[2023-07-08 14:09:05,692][977683] Stopping RolloutWorker_w7... -[2023-07-08 14:09:05,693][977554] Loop rollout_proc1_evt_loop terminating... -[2023-07-08 14:09:05,693][977620] Loop rollout_proc5_evt_loop terminating... -[2023-07-08 14:09:05,692][977264] Component RolloutWorker_w2 stopped! -[2023-07-08 14:09:05,693][977555] Loop rollout_proc2_evt_loop terminating... -[2023-07-08 14:09:05,693][977619] Loop rollout_proc4_evt_loop terminating... -[2023-07-08 14:09:05,693][977553] Loop rollout_proc0_evt_loop terminating... -[2023-07-08 14:09:05,693][977683] Loop rollout_proc7_evt_loop terminating... -[2023-07-08 14:09:05,692][977508] Stopping Batcher_0... -[2023-07-08 14:09:05,693][977264] Component RolloutWorker_w6 stopped! -[2023-07-08 14:09:05,693][977508] Loop batcher_evt_loop terminating... -[2023-07-08 14:09:05,693][977264] Component RolloutWorker_w3 stopped! -[2023-07-08 14:09:05,693][977264] Component RolloutWorker_w1 stopped! -[2023-07-08 14:09:05,693][977264] Component RolloutWorker_w4 stopped! -[2023-07-08 14:09:05,694][977264] Component RolloutWorker_w5 stopped! -[2023-07-08 14:09:05,694][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019544_10006528.pth... -[2023-07-08 14:09:05,694][977264] Component RolloutWorker_w0 stopped! -[2023-07-08 14:09:05,694][977264] Component Batcher_0 stopped! -[2023-07-08 14:09:05,694][977264] Component RolloutWorker_w7 stopped! -[2023-07-08 14:09:05,697][977508] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019016_9736192.pth -[2023-07-08 14:09:05,698][977508] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019544_10006528.pth... -[2023-07-08 14:09:05,701][977508] Stopping LearnerWorker_p0... 
-[2023-07-08 14:09:05,702][977508] Loop learner_proc0_evt_loop terminating... -[2023-07-08 14:09:05,702][977264] Component LearnerWorker_p0 stopped! -[2023-07-08 14:09:05,776][977552] Weights refcount: 2 0 -[2023-07-08 14:09:05,777][977552] Stopping InferenceWorker_p0-w0... -[2023-07-08 14:09:05,778][977552] Loop inference_proc0-0_evt_loop terminating... -[2023-07-08 14:09:05,778][977264] Component InferenceWorker_p0-w0 stopped! -[2023-07-08 14:09:05,778][977264] Waiting for process learner_proc0 to stop... -[2023-07-08 14:09:06,410][977264] Waiting for process inference_proc0-0 to join... -[2023-07-08 14:09:06,437][977264] Waiting for process rollout_proc0 to join... -[2023-07-08 14:09:06,437][977264] Waiting for process rollout_proc1 to join... -[2023-07-08 14:09:06,438][977264] Waiting for process rollout_proc2 to join... -[2023-07-08 14:09:06,438][977264] Waiting for process rollout_proc3 to join... -[2023-07-08 14:09:06,438][977264] Waiting for process rollout_proc4 to join... -[2023-07-08 14:09:06,438][977264] Waiting for process rollout_proc5 to join... -[2023-07-08 14:09:06,438][977264] Waiting for process rollout_proc6 to join... -[2023-07-08 14:09:06,438][977264] Waiting for process rollout_proc7 to join... 
-[2023-07-08 14:09:06,439][977264] Batcher 0 profile tree view: -batching: 1.8328, releasing_batches: 1.5495 -[2023-07-08 14:09:06,439][977264] InferenceWorker_p0-w0 profile tree view: -wait_policy: 0.0052 - wait_policy_total: 375.7369 -update_model: 11.8402 +[2023-07-16 20:02:29,273][222228] Worker 2 uses CPU cores [8, 9, 10, 11] +[2023-07-16 20:02:29,289][222263] Worker 6 uses CPU cores [24, 25, 26, 27] +[2023-07-16 20:02:29,400][222289] Worker 5 uses CPU cores [20, 21, 22, 23] +[2023-07-16 20:02:29,410][222327] Worker 7 uses CPU cores [28, 29, 30, 31] +[2023-07-16 20:02:29,463][222231] Worker 4 uses CPU cores [16, 17, 18, 19] +[2023-07-16 20:02:29,510][222182] Using optimizer +[2023-07-16 20:02:29,511][222182] No checkpoints found +[2023-07-16 20:02:29,511][222182] Did not load from checkpoint, starting from scratch! +[2023-07-16 20:02:29,511][222182] Initialized policy 0 weights for model version 0 +[2023-07-16 20:02:29,512][222182] LearnerWorker_p0 finished initialization! +[2023-07-16 20:02:29,646][222226] RunningMeanStd input shape: (39,) +[2023-07-16 20:02:29,647][222226] RunningMeanStd input shape: (1,) +[2023-07-16 20:02:29,687][222230] Worker 3 uses CPU cores [12, 13, 14, 15] +[2023-07-16 20:02:29,705][221941] Inference worker 0-0 is ready! +[2023-07-16 20:02:29,706][221941] All inference workers are ready! Signal rollout workers to start! +[2023-07-16 20:02:29,731][222227] Worker 1 uses CPU cores [4, 5, 6, 7] +[2023-07-16 20:02:29,938][222229] Worker 0 uses CPU cores [0, 1, 2, 3] +[2023-07-16 20:02:30,129][221941] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-07-16 20:02:31,219][222231] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,219][222263] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,225][222327] Decorrelating experience for 0 frames... 
+[2023-07-16 20:02:31,225][222231] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,225][222263] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,231][222327] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,245][222289] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,246][222228] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,250][222231] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,250][222263] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,251][222289] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,253][222228] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,256][222327] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,276][222289] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,278][222227] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,278][222228] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,281][222230] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,284][222227] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,288][222230] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,300][222231] Decorrelating experience for 192 frames... +[2023-07-16 20:02:31,300][222263] Decorrelating experience for 192 frames... +[2023-07-16 20:02:31,306][222327] Decorrelating experience for 192 frames... +[2023-07-16 20:02:31,309][222227] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,313][222230] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,326][222289] Decorrelating experience for 192 frames... +[2023-07-16 20:02:31,327][222228] Decorrelating experience for 192 frames... +[2023-07-16 20:02:31,358][222227] Decorrelating experience for 192 frames... +[2023-07-16 20:02:31,362][222230] Decorrelating experience for 192 frames... 
+[2023-07-16 20:02:31,500][222229] Decorrelating experience for 0 frames... +[2023-07-16 20:02:31,506][222229] Decorrelating experience for 64 frames... +[2023-07-16 20:02:31,531][222229] Decorrelating experience for 128 frames... +[2023-07-16 20:02:31,581][222229] Decorrelating experience for 192 frames... +[2023-07-16 20:02:32,788][222231] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,790][222263] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,790][222327] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,796][222289] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,805][222228] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,833][222227] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,839][222230] Decorrelating experience for 256 frames... +[2023-07-16 20:02:32,880][222231] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,881][222263] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,882][222327] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,888][222289] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,897][222228] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,924][222227] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,930][222230] Decorrelating experience for 320 frames... +[2023-07-16 20:02:32,996][222231] Decorrelating experience for 384 frames... +[2023-07-16 20:02:32,997][222263] Decorrelating experience for 384 frames... +[2023-07-16 20:02:32,998][222327] Decorrelating experience for 384 frames... +[2023-07-16 20:02:33,004][222289] Decorrelating experience for 384 frames... +[2023-07-16 20:02:33,013][222228] Decorrelating experience for 384 frames... +[2023-07-16 20:02:33,040][222227] Decorrelating experience for 384 frames... +[2023-07-16 20:02:33,046][222230] Decorrelating experience for 384 frames... 
+[2023-07-16 20:02:33,052][222229] Decorrelating experience for 256 frames... +[2023-07-16 20:02:33,129][222263] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,129][222231] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,132][222327] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,136][222289] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,143][222229] Decorrelating experience for 320 frames... +[2023-07-16 20:02:33,146][222228] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,175][222227] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,179][222230] Decorrelating experience for 448 frames... +[2023-07-16 20:02:33,258][222229] Decorrelating experience for 384 frames... +[2023-07-16 20:02:33,391][222229] Decorrelating experience for 448 frames... +[2023-07-16 20:02:35,129][221941] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 16384. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:02:35,130][221941] Avg episode reward: [(0, '36.753')] +[2023-07-16 20:02:36,558][222226] Updated weights for policy 0, policy_version 80 (0.0004) +[2023-07-16 20:02:39,197][222226] Updated weights for policy 0, policy_version 160 (0.0004) +[2023-07-16 20:02:40,129][221941] Fps is (10 sec: 9420.8, 60 sec: 9420.8, 300 sec: 9420.8). Total num frames: 94208. Throughput: 0: 7483.2. Samples: 74832. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:02:40,130][221941] Avg episode reward: [(0, '218.181')] +[2023-07-16 20:02:40,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000184_94208.pth... 
+[2023-07-16 20:02:42,001][222226] Updated weights for policy 0, policy_version 240 (0.0004) +[2023-07-16 20:02:44,783][222226] Updated weights for policy 0, policy_version 320 (0.0004) +[2023-07-16 20:02:45,129][221941] Fps is (10 sec: 15155.3, 60 sec: 11195.8, 300 sec: 11195.8). Total num frames: 167936. Throughput: 0: 10869.4. Samples: 163040. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:02:45,130][221941] Avg episode reward: [(0, '250.889')] +[2023-07-16 20:02:45,130][222182] Saving new best policy, reward=250.889! +[2023-07-16 20:02:47,182][221941] Heartbeat connected on Batcher_0 +[2023-07-16 20:02:47,183][221941] Heartbeat connected on LearnerWorker_p0 +[2023-07-16 20:02:47,187][221941] Heartbeat connected on InferenceWorker_p0-w0 +[2023-07-16 20:02:47,191][221941] Heartbeat connected on RolloutWorker_w0 +[2023-07-16 20:02:47,193][221941] Heartbeat connected on RolloutWorker_w1 +[2023-07-16 20:02:47,194][221941] Heartbeat connected on RolloutWorker_w2 +[2023-07-16 20:02:47,196][221941] Heartbeat connected on RolloutWorker_w3 +[2023-07-16 20:02:47,198][221941] Heartbeat connected on RolloutWorker_w4 +[2023-07-16 20:02:47,199][221941] Heartbeat connected on RolloutWorker_w5 +[2023-07-16 20:02:47,201][221941] Heartbeat connected on RolloutWorker_w6 +[2023-07-16 20:02:47,204][221941] Heartbeat connected on RolloutWorker_w7 +[2023-07-16 20:02:47,355][222226] Updated weights for policy 0, policy_version 400 (0.0004) +[2023-07-16 20:02:49,908][222226] Updated weights for policy 0, policy_version 480 (0.0003) +[2023-07-16 20:02:50,129][221941] Fps is (10 sec: 15155.3, 60 sec: 12288.0, 300 sec: 12288.0). Total num frames: 245760. Throughput: 0: 10555.6. Samples: 211112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:02:50,130][221941] Avg episode reward: [(0, '277.895')] +[2023-07-16 20:02:50,130][222182] Saving new best policy, reward=277.895! 
+[2023-07-16 20:02:52,578][222226] Updated weights for policy 0, policy_version 560 (0.0004) +[2023-07-16 20:02:55,129][221941] Fps is (10 sec: 15155.0, 60 sec: 12779.5, 300 sec: 12779.5). Total num frames: 319488. Throughput: 0: 12126.7. Samples: 303168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:02:55,130][221941] Avg episode reward: [(0, '282.110')] +[2023-07-16 20:02:55,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000624_319488.pth... +[2023-07-16 20:02:55,135][222182] Saving new best policy, reward=282.110! +[2023-07-16 20:02:55,492][222226] Updated weights for policy 0, policy_version 640 (0.0005) +[2023-07-16 20:02:58,346][222226] Updated weights for policy 0, policy_version 720 (0.0005) +[2023-07-16 20:03:00,129][221941] Fps is (10 sec: 14745.5, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 393216. Throughput: 0: 13023.2. Samples: 390696. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:03:00,130][221941] Avg episode reward: [(0, '288.425')] +[2023-07-16 20:03:00,131][222182] Saving new best policy, reward=288.425! +[2023-07-16 20:03:00,954][222226] Updated weights for policy 0, policy_version 800 (0.0004) +[2023-07-16 20:03:03,503][222226] Updated weights for policy 0, policy_version 880 (0.0004) +[2023-07-16 20:03:05,129][221941] Fps is (10 sec: 15564.9, 60 sec: 13575.3, 300 sec: 13575.3). Total num frames: 475136. Throughput: 0: 12522.1. Samples: 438272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:05,130][221941] Avg episode reward: [(0, '297.238')] +[2023-07-16 20:03:05,130][222182] Saving new best policy, reward=297.238! +[2023-07-16 20:03:06,075][222226] Updated weights for policy 0, policy_version 960 (0.0004) +[2023-07-16 20:03:08,694][222226] Updated weights for policy 0, policy_version 1040 (0.0004) +[2023-07-16 20:03:10,129][221941] Fps is (10 sec: 15974.3, 60 sec: 13824.0, 300 sec: 13824.0). 
Total num frames: 552960. Throughput: 0: 13318.9. Samples: 532756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:10,131][221941] Avg episode reward: [(0, '323.451')] +[2023-07-16 20:03:10,134][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001080_552960.pth... +[2023-07-16 20:03:10,137][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000184_94208.pth +[2023-07-16 20:03:10,137][222182] Saving new best policy, reward=323.451! +[2023-07-16 20:03:11,306][222226] Updated weights for policy 0, policy_version 1120 (0.0004) +[2023-07-16 20:03:13,982][222226] Updated weights for policy 0, policy_version 1200 (0.0004) +[2023-07-16 20:03:15,129][221941] Fps is (10 sec: 15564.8, 60 sec: 14017.4, 300 sec: 14017.4). Total num frames: 630784. Throughput: 0: 13904.6. Samples: 625708. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:15,130][221941] Avg episode reward: [(0, '427.038')] +[2023-07-16 20:03:15,131][222182] Saving new best policy, reward=427.038! +[2023-07-16 20:03:16,774][222226] Updated weights for policy 0, policy_version 1280 (0.0005) +[2023-07-16 20:03:19,509][222226] Updated weights for policy 0, policy_version 1360 (0.0004) +[2023-07-16 20:03:20,129][221941] Fps is (10 sec: 15155.3, 60 sec: 14090.3, 300 sec: 14090.3). Total num frames: 704512. Throughput: 0: 14883.1. Samples: 669740. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-16 20:03:20,130][221941] Avg episode reward: [(0, '458.502')] +[2023-07-16 20:03:20,131][222182] Saving new best policy, reward=458.502! +[2023-07-16 20:03:22,301][222226] Updated weights for policy 0, policy_version 1440 (0.0004) +[2023-07-16 20:03:25,062][222226] Updated weights for policy 0, policy_version 1520 (0.0004) +[2023-07-16 20:03:25,129][221941] Fps is (10 sec: 14745.5, 60 sec: 14149.8, 300 sec: 14149.8). 
Total num frames: 778240. Throughput: 0: 15177.6. Samples: 757824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:25,130][221941] Avg episode reward: [(0, '463.988')] +[2023-07-16 20:03:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001520_778240.pth... +[2023-07-16 20:03:25,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000000624_319488.pth +[2023-07-16 20:03:25,136][222182] Saving new best policy, reward=463.988! +[2023-07-16 20:03:27,830][222226] Updated weights for policy 0, policy_version 1600 (0.0004) +[2023-07-16 20:03:30,129][221941] Fps is (10 sec: 14745.6, 60 sec: 14199.5, 300 sec: 14199.5). Total num frames: 851968. Throughput: 0: 15215.0. Samples: 847716. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-16 20:03:30,130][221941] Avg episode reward: [(0, '468.502')] +[2023-07-16 20:03:30,130][222182] Saving new best policy, reward=468.502! +[2023-07-16 20:03:30,632][222226] Updated weights for policy 0, policy_version 1680 (0.0004) +[2023-07-16 20:03:33,447][222226] Updated weights for policy 0, policy_version 1760 (0.0005) +[2023-07-16 20:03:35,129][221941] Fps is (10 sec: 14745.7, 60 sec: 15155.2, 300 sec: 14241.5). Total num frames: 925696. Throughput: 0: 15117.8. Samples: 891412. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:35,130][221941] Avg episode reward: [(0, '457.655')] +[2023-07-16 20:03:36,152][222226] Updated weights for policy 0, policy_version 1840 (0.0004) +[2023-07-16 20:03:38,929][222226] Updated weights for policy 0, policy_version 1920 (0.0004) +[2023-07-16 20:03:40,129][221941] Fps is (10 sec: 14745.6, 60 sec: 15086.9, 300 sec: 14277.5). Total num frames: 999424. Throughput: 0: 15042.6. Samples: 980084. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:03:40,130][221941] Avg episode reward: [(0, '467.380')] +[2023-07-16 20:03:40,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001952_999424.pth... +[2023-07-16 20:03:40,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001080_552960.pth +[2023-07-16 20:03:41,643][222226] Updated weights for policy 0, policy_version 2000 (0.0004) +[2023-07-16 20:03:44,328][222226] Updated weights for policy 0, policy_version 2080 (0.0004) +[2023-07-16 20:03:45,129][221941] Fps is (10 sec: 14745.6, 60 sec: 15086.9, 300 sec: 14308.7). Total num frames: 1073152. Throughput: 0: 15120.6. Samples: 1071124. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:03:45,130][221941] Avg episode reward: [(0, '467.312')] +[2023-07-16 20:03:47,261][222226] Updated weights for policy 0, policy_version 2160 (0.0005) +[2023-07-16 20:03:50,090][222226] Updated weights for policy 0, policy_version 2240 (0.0005) +[2023-07-16 20:03:50,129][221941] Fps is (10 sec: 14745.6, 60 sec: 15018.7, 300 sec: 14336.0). Total num frames: 1146880. Throughput: 0: 14990.3. Samples: 1112836. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:50,130][221941] Avg episode reward: [(0, '477.774')] +[2023-07-16 20:03:50,130][222182] Saving new best policy, reward=477.774! +[2023-07-16 20:03:52,897][222226] Updated weights for policy 0, policy_version 2320 (0.0005) +[2023-07-16 20:03:55,129][221941] Fps is (10 sec: 14336.1, 60 sec: 14950.4, 300 sec: 14311.9). Total num frames: 1216512. Throughput: 0: 14830.8. Samples: 1200140. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:03:55,129][221941] Avg episode reward: [(0, '474.040')] +[2023-07-16 20:03:55,160][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002384_1220608.pth... +[2023-07-16 20:03:55,162][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001520_778240.pth +[2023-07-16 20:03:55,695][222226] Updated weights for policy 0, policy_version 2400 (0.0004) +[2023-07-16 20:03:58,523][222226] Updated weights for policy 0, policy_version 2480 (0.0005) +[2023-07-16 20:04:00,129][221941] Fps is (10 sec: 14336.1, 60 sec: 14950.4, 300 sec: 14336.0). Total num frames: 1290240. Throughput: 0: 14684.2. Samples: 1286496. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-16 20:04:00,130][221941] Avg episode reward: [(0, '477.093')] +[2023-07-16 20:04:01,316][222226] Updated weights for policy 0, policy_version 2560 (0.0004) +[2023-07-16 20:04:04,123][222226] Updated weights for policy 0, policy_version 2640 (0.0004) +[2023-07-16 20:04:05,129][221941] Fps is (10 sec: 14745.5, 60 sec: 14813.9, 300 sec: 14357.6). Total num frames: 1363968. Throughput: 0: 14700.5. Samples: 1331264. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:04:05,130][221941] Avg episode reward: [(0, '477.171')] +[2023-07-16 20:04:06,934][222226] Updated weights for policy 0, policy_version 2720 (0.0004) +[2023-07-16 20:04:09,761][222226] Updated weights for policy 0, policy_version 2800 (0.0004) +[2023-07-16 20:04:10,129][221941] Fps is (10 sec: 14745.5, 60 sec: 14745.6, 300 sec: 14377.0). Total num frames: 1437696. Throughput: 0: 14669.3. Samples: 1417940. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:04:10,130][221941] Avg episode reward: [(0, '486.084')] +[2023-07-16 20:04:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002808_1437696.pth... +[2023-07-16 20:04:10,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000001952_999424.pth +[2023-07-16 20:04:10,136][222182] Saving new best policy, reward=486.084! +[2023-07-16 20:04:12,677][222226] Updated weights for policy 0, policy_version 2880 (0.0004) +[2023-07-16 20:04:15,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14609.1, 300 sec: 14355.5). Total num frames: 1507328. Throughput: 0: 14567.3. Samples: 1503244. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:04:15,130][221941] Avg episode reward: [(0, '488.558')] +[2023-07-16 20:04:15,130][222182] Saving new best policy, reward=488.558! +[2023-07-16 20:04:15,630][222226] Updated weights for policy 0, policy_version 2960 (0.0005) +[2023-07-16 20:04:18,627][222226] Updated weights for policy 0, policy_version 3040 (0.0005) +[2023-07-16 20:04:20,129][221941] Fps is (10 sec: 13516.9, 60 sec: 14472.6, 300 sec: 14298.8). Total num frames: 1572864. Throughput: 0: 14512.1. Samples: 1544456. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:04:20,129][221941] Avg episode reward: [(0, '485.199')] +[2023-07-16 20:04:21,767][222226] Updated weights for policy 0, policy_version 3120 (0.0005) +[2023-07-16 20:04:24,869][222226] Updated weights for policy 0, policy_version 3200 (0.0005) +[2023-07-16 20:04:25,129][221941] Fps is (10 sec: 13107.3, 60 sec: 14336.0, 300 sec: 14247.0). Total num frames: 1638400. Throughput: 0: 14277.4. Samples: 1622564. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:04:25,129][221941] Avg episode reward: [(0, '489.883')] +[2023-07-16 20:04:25,147][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003208_1642496.pth... +[2023-07-16 20:04:25,148][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002384_1220608.pth +[2023-07-16 20:04:25,149][222182] Saving new best policy, reward=489.883! +[2023-07-16 20:04:27,775][222226] Updated weights for policy 0, policy_version 3280 (0.0004) +[2023-07-16 20:04:30,129][221941] Fps is (10 sec: 13926.3, 60 sec: 14336.0, 300 sec: 14267.7). Total num frames: 1712128. Throughput: 0: 14133.6. Samples: 1707136. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:04:30,130][221941] Avg episode reward: [(0, '485.877')] +[2023-07-16 20:04:30,678][222226] Updated weights for policy 0, policy_version 3360 (0.0004) +[2023-07-16 20:04:33,621][222226] Updated weights for policy 0, policy_version 3440 (0.0004) +[2023-07-16 20:04:35,129][221941] Fps is (10 sec: 14335.9, 60 sec: 14267.7, 300 sec: 14254.1). Total num frames: 1781760. Throughput: 0: 14138.2. Samples: 1749056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:04:35,130][221941] Avg episode reward: [(0, '474.537')] +[2023-07-16 20:04:36,480][222226] Updated weights for policy 0, policy_version 3520 (0.0004) +[2023-07-16 20:04:39,419][222226] Updated weights for policy 0, policy_version 3600 (0.0005) +[2023-07-16 20:04:40,129][221941] Fps is (10 sec: 13926.4, 60 sec: 14199.5, 300 sec: 14241.5). Total num frames: 1851392. Throughput: 0: 14082.3. Samples: 1833844. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:04:40,130][221941] Avg episode reward: [(0, '488.258')] +[2023-07-16 20:04:40,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003616_1851392.pth... +[2023-07-16 20:04:40,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000002808_1437696.pth +[2023-07-16 20:04:42,397][222226] Updated weights for policy 0, policy_version 3680 (0.0004) +[2023-07-16 20:04:45,129][221941] Fps is (10 sec: 13516.8, 60 sec: 14062.9, 300 sec: 14199.5). Total num frames: 1916928. Throughput: 0: 13980.7. Samples: 1915628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:04:45,146][221941] Avg episode reward: [(0, '480.247')] +[2023-07-16 20:04:45,505][222226] Updated weights for policy 0, policy_version 3760 (0.0005) +[2023-07-16 20:04:48,686][222226] Updated weights for policy 0, policy_version 3840 (0.0005) +[2023-07-16 20:04:50,129][221941] Fps is (10 sec: 13107.3, 60 sec: 13926.4, 300 sec: 14160.5). Total num frames: 1982464. Throughput: 0: 13835.4. Samples: 1953856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:04:50,130][221941] Avg episode reward: [(0, '483.473')] +[2023-07-16 20:04:51,911][222226] Updated weights for policy 0, policy_version 3920 (0.0005) +[2023-07-16 20:04:55,067][222226] Updated weights for policy 0, policy_version 4000 (0.0005) +[2023-07-16 20:04:55,129][221941] Fps is (10 sec: 13107.1, 60 sec: 13858.1, 300 sec: 14124.1). Total num frames: 2048000. Throughput: 0: 13625.1. Samples: 2031072. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:04:55,142][221941] Avg episode reward: [(0, '467.623')] +[2023-07-16 20:04:55,145][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004000_2048000.pth... 
+[2023-07-16 20:04:55,147][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003208_1642496.pth +[2023-07-16 20:04:58,319][222226] Updated weights for policy 0, policy_version 4080 (0.0005) +[2023-07-16 20:05:00,129][221941] Fps is (10 sec: 12697.6, 60 sec: 13653.3, 300 sec: 14062.9). Total num frames: 2109440. Throughput: 0: 13421.4. Samples: 2107208. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:05:00,130][221941] Avg episode reward: [(0, '492.153')] +[2023-07-16 20:05:00,131][222182] Saving new best policy, reward=492.153! +[2023-07-16 20:05:01,540][222226] Updated weights for policy 0, policy_version 4160 (0.0005) +[2023-07-16 20:05:04,783][222226] Updated weights for policy 0, policy_version 4240 (0.0005) +[2023-07-16 20:05:05,129][221941] Fps is (10 sec: 12697.7, 60 sec: 13516.8, 300 sec: 14032.1). Total num frames: 2174976. Throughput: 0: 13343.9. Samples: 2144932. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:05:05,130][221941] Avg episode reward: [(0, '471.847')] +[2023-07-16 20:05:08,052][222226] Updated weights for policy 0, policy_version 4320 (0.0005) +[2023-07-16 20:05:10,129][221941] Fps is (10 sec: 12697.5, 60 sec: 13312.0, 300 sec: 13977.6). Total num frames: 2236416. Throughput: 0: 13278.6. Samples: 2220104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:05:10,130][221941] Avg episode reward: [(0, '477.881')] +[2023-07-16 20:05:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004368_2236416.pth... 
+[2023-07-16 20:05:10,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000003616_1851392.pth +[2023-07-16 20:05:11,217][222226] Updated weights for policy 0, policy_version 4400 (0.0005) +[2023-07-16 20:05:14,451][222226] Updated weights for policy 0, policy_version 4480 (0.0005) +[2023-07-16 20:05:15,129][221941] Fps is (10 sec: 12697.6, 60 sec: 13243.7, 300 sec: 13951.2). Total num frames: 2301952. Throughput: 0: 13126.8. Samples: 2297840. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:05:15,130][221941] Avg episode reward: [(0, '492.589')] +[2023-07-16 20:05:15,131][222182] Saving new best policy, reward=492.589! +[2023-07-16 20:05:17,657][222226] Updated weights for policy 0, policy_version 4560 (0.0005) +[2023-07-16 20:05:20,129][221941] Fps is (10 sec: 12697.7, 60 sec: 13175.5, 300 sec: 13902.3). Total num frames: 2363392. Throughput: 0: 13035.7. Samples: 2335660. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:05:20,129][221941] Avg episode reward: [(0, '485.022')] +[2023-07-16 20:05:20,758][222226] Updated weights for policy 0, policy_version 4640 (0.0005) +[2023-07-16 20:05:24,004][222226] Updated weights for policy 0, policy_version 4720 (0.0005) +[2023-07-16 20:05:25,129][221941] Fps is (10 sec: 12697.5, 60 sec: 13175.4, 300 sec: 13879.6). Total num frames: 2428928. Throughput: 0: 12874.9. Samples: 2413216. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:05:25,130][221941] Avg episode reward: [(0, '479.947')] +[2023-07-16 20:05:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004744_2428928.pth... 
+[2023-07-16 20:05:25,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004000_2048000.pth +[2023-07-16 20:05:27,176][222226] Updated weights for policy 0, policy_version 4800 (0.0005) +[2023-07-16 20:05:30,129][221941] Fps is (10 sec: 13107.1, 60 sec: 13038.9, 300 sec: 13858.1). Total num frames: 2494464. Throughput: 0: 12773.4. Samples: 2490432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:05:30,130][221941] Avg episode reward: [(0, '424.263')] +[2023-07-16 20:05:30,334][222226] Updated weights for policy 0, policy_version 4880 (0.0005) +[2023-07-16 20:05:33,304][222226] Updated weights for policy 0, policy_version 4960 (0.0004) +[2023-07-16 20:05:35,129][221941] Fps is (10 sec: 13516.8, 60 sec: 13038.9, 300 sec: 13860.0). Total num frames: 2564096. Throughput: 0: 12839.2. Samples: 2531620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:05:35,130][221941] Avg episode reward: [(0, '495.079')] +[2023-07-16 20:05:35,130][222182] Saving new best policy, reward=495.079! +[2023-07-16 20:05:36,187][222226] Updated weights for policy 0, policy_version 5040 (0.0004) +[2023-07-16 20:05:39,210][222226] Updated weights for policy 0, policy_version 5120 (0.0005) +[2023-07-16 20:05:40,129][221941] Fps is (10 sec: 13926.1, 60 sec: 13038.9, 300 sec: 13861.7). Total num frames: 2633728. Throughput: 0: 12972.2. Samples: 2614824. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:05:40,130][221941] Avg episode reward: [(0, '480.877')] +[2023-07-16 20:05:40,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005144_2633728.pth... 
+[2023-07-16 20:05:40,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004368_2236416.pth +[2023-07-16 20:05:42,172][222226] Updated weights for policy 0, policy_version 5200 (0.0004) +[2023-07-16 20:05:45,129][221941] Fps is (10 sec: 13516.7, 60 sec: 13038.9, 300 sec: 13842.4). Total num frames: 2699264. Throughput: 0: 13066.0. Samples: 2695180. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:05:45,130][221941] Avg episode reward: [(0, '477.693')] +[2023-07-16 20:05:45,439][222226] Updated weights for policy 0, policy_version 5280 (0.0005) +[2023-07-16 20:05:48,663][222226] Updated weights for policy 0, policy_version 5360 (0.0005) +[2023-07-16 20:05:50,129][221941] Fps is (10 sec: 12697.9, 60 sec: 12970.7, 300 sec: 13803.5). Total num frames: 2760704. Throughput: 0: 13060.9. Samples: 2732672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:05:50,130][221941] Avg episode reward: [(0, '486.820')] +[2023-07-16 20:05:51,901][222226] Updated weights for policy 0, policy_version 5440 (0.0006) +[2023-07-16 20:05:55,129][221941] Fps is (10 sec: 12288.1, 60 sec: 12902.4, 300 sec: 13766.6). Total num frames: 2822144. Throughput: 0: 13073.3. Samples: 2808404. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:05:55,129][221941] Avg episode reward: [(0, '487.997')] +[2023-07-16 20:05:55,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005512_2822144.pth... 
+[2023-07-16 20:05:55,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000004744_2428928.pth +[2023-07-16 20:05:55,184][222226] Updated weights for policy 0, policy_version 5520 (0.0005) +[2023-07-16 20:05:58,121][222226] Updated weights for policy 0, policy_version 5600 (0.0004) +[2023-07-16 20:06:00,129][221941] Fps is (10 sec: 13107.2, 60 sec: 13038.9, 300 sec: 13770.4). Total num frames: 2891776. Throughput: 0: 13166.4. Samples: 2890328. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:06:00,130][221941] Avg episode reward: [(0, '491.215')] +[2023-07-16 20:06:01,021][222226] Updated weights for policy 0, policy_version 5680 (0.0004) +[2023-07-16 20:06:03,979][222226] Updated weights for policy 0, policy_version 5760 (0.0004) +[2023-07-16 20:06:05,129][221941] Fps is (10 sec: 14336.0, 60 sec: 13175.5, 300 sec: 13793.0). Total num frames: 2965504. Throughput: 0: 13258.2. Samples: 2932280. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-16 20:06:05,130][221941] Avg episode reward: [(0, '488.776')] +[2023-07-16 20:06:06,827][222226] Updated weights for policy 0, policy_version 5840 (0.0004) +[2023-07-16 20:06:09,786][222226] Updated weights for policy 0, policy_version 5920 (0.0004) +[2023-07-16 20:06:10,129][221941] Fps is (10 sec: 14336.0, 60 sec: 13312.0, 300 sec: 13796.1). Total num frames: 3035136. Throughput: 0: 13417.4. Samples: 3017000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:10,130][221941] Avg episode reward: [(0, '485.532')] +[2023-07-16 20:06:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005928_3035136.pth... 
+[2023-07-16 20:06:10,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005144_2633728.pth +[2023-07-16 20:06:12,669][222226] Updated weights for policy 0, policy_version 6000 (0.0004) +[2023-07-16 20:06:15,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13380.3, 300 sec: 13799.0). Total num frames: 3104768. Throughput: 0: 13555.8. Samples: 3100444. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:06:15,130][221941] Avg episode reward: [(0, '481.391')] +[2023-07-16 20:06:15,706][222226] Updated weights for policy 0, policy_version 6080 (0.0005) +[2023-07-16 20:06:18,969][222226] Updated weights for policy 0, policy_version 6160 (0.0005) +[2023-07-16 20:06:20,129][221941] Fps is (10 sec: 13107.3, 60 sec: 13380.3, 300 sec: 13766.1). Total num frames: 3166208. Throughput: 0: 13486.2. Samples: 3138500. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:06:20,130][221941] Avg episode reward: [(0, '485.633')] +[2023-07-16 20:06:22,223][222226] Updated weights for policy 0, policy_version 6240 (0.0005) +[2023-07-16 20:06:25,129][221941] Fps is (10 sec: 12697.5, 60 sec: 13380.3, 300 sec: 13752.1). Total num frames: 3231744. Throughput: 0: 13327.2. Samples: 3214548. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:25,130][221941] Avg episode reward: [(0, '499.600')] +[2023-07-16 20:06:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006312_3231744.pth... +[2023-07-16 20:06:25,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005512_2822144.pth +[2023-07-16 20:06:25,136][222182] Saving new best policy, reward=499.600! 
+[2023-07-16 20:06:25,423][222226] Updated weights for policy 0, policy_version 6320 (0.0006) +[2023-07-16 20:06:28,722][222226] Updated weights for policy 0, policy_version 6400 (0.0006) +[2023-07-16 20:06:30,129][221941] Fps is (10 sec: 12697.5, 60 sec: 13312.0, 300 sec: 13721.6). Total num frames: 3293184. Throughput: 0: 13208.1. Samples: 3289544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:30,130][221941] Avg episode reward: [(0, '484.827')] +[2023-07-16 20:06:31,984][222226] Updated weights for policy 0, policy_version 6480 (0.0006) +[2023-07-16 20:06:35,116][222226] Updated weights for policy 0, policy_version 6560 (0.0005) +[2023-07-16 20:06:35,129][221941] Fps is (10 sec: 12697.6, 60 sec: 13243.7, 300 sec: 13709.1). Total num frames: 3358720. Throughput: 0: 13229.0. Samples: 3327980. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:35,130][221941] Avg episode reward: [(0, '486.603')] +[2023-07-16 20:06:38,214][222226] Updated weights for policy 0, policy_version 6640 (0.0005) +[2023-07-16 20:06:40,129][221941] Fps is (10 sec: 13107.2, 60 sec: 13175.5, 300 sec: 13697.0). Total num frames: 3424256. Throughput: 0: 13315.1. Samples: 3407584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:40,130][221941] Avg episode reward: [(0, '485.024')] +[2023-07-16 20:06:40,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006688_3424256.pth... +[2023-07-16 20:06:40,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000005928_3035136.pth +[2023-07-16 20:06:41,179][222226] Updated weights for policy 0, policy_version 6720 (0.0004) +[2023-07-16 20:06:44,127][222226] Updated weights for policy 0, policy_version 6800 (0.0005) +[2023-07-16 20:06:45,129][221941] Fps is (10 sec: 13516.9, 60 sec: 13243.7, 300 sec: 13701.5). Total num frames: 3493888. 
Throughput: 0: 13318.7. Samples: 3489672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:45,130][221941] Avg episode reward: [(0, '486.257')] +[2023-07-16 20:06:47,424][222226] Updated weights for policy 0, policy_version 6880 (0.0006) +[2023-07-16 20:06:50,129][221941] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13674.3). Total num frames: 3555328. Throughput: 0: 13208.8. Samples: 3526676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:50,130][221941] Avg episode reward: [(0, '469.687')] +[2023-07-16 20:06:50,721][222226] Updated weights for policy 0, policy_version 6960 (0.0006) +[2023-07-16 20:06:53,993][222226] Updated weights for policy 0, policy_version 7040 (0.0005) +[2023-07-16 20:06:55,129][221941] Fps is (10 sec: 12288.1, 60 sec: 13243.7, 300 sec: 13648.2). Total num frames: 3616768. Throughput: 0: 12985.2. Samples: 3601336. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:06:55,130][221941] Avg episode reward: [(0, '475.193')] +[2023-07-16 20:06:55,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007064_3616768.pth... +[2023-07-16 20:06:55,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006312_3231744.pth +[2023-07-16 20:06:57,234][222226] Updated weights for policy 0, policy_version 7120 (0.0006) +[2023-07-16 20:07:00,129][221941] Fps is (10 sec: 12288.1, 60 sec: 13107.2, 300 sec: 13623.0). Total num frames: 3678208. Throughput: 0: 12805.5. Samples: 3676692. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:00,129][221941] Avg episode reward: [(0, '478.204')] +[2023-07-16 20:07:00,468][222226] Updated weights for policy 0, policy_version 7200 (0.0005) +[2023-07-16 20:07:03,394][222226] Updated weights for policy 0, policy_version 7280 (0.0004) +[2023-07-16 20:07:05,129][221941] Fps is (10 sec: 13107.3, 60 sec: 13038.9, 300 sec: 13628.5). Total num frames: 3747840. Throughput: 0: 12878.0. Samples: 3718008. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:07:05,129][221941] Avg episode reward: [(0, '472.326')] +[2023-07-16 20:07:06,321][222226] Updated weights for policy 0, policy_version 7360 (0.0005) +[2023-07-16 20:07:09,432][222226] Updated weights for policy 0, policy_version 7440 (0.0005) +[2023-07-16 20:07:10,129][221941] Fps is (10 sec: 13926.3, 60 sec: 13038.9, 300 sec: 13633.8). Total num frames: 3817472. Throughput: 0: 13034.2. Samples: 3801088. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-16 20:07:10,130][221941] Avg episode reward: [(0, '484.615')] +[2023-07-16 20:07:10,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007456_3817472.pth... +[2023-07-16 20:07:10,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000006688_3424256.pth +[2023-07-16 20:07:12,654][222226] Updated weights for policy 0, policy_version 7520 (0.0006) +[2023-07-16 20:07:15,129][221941] Fps is (10 sec: 13107.1, 60 sec: 12902.4, 300 sec: 13610.2). Total num frames: 3878912. Throughput: 0: 13030.5. Samples: 3875916. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:07:15,130][221941] Avg episode reward: [(0, '481.903')] +[2023-07-16 20:07:15,897][222226] Updated weights for policy 0, policy_version 7600 (0.0006) +[2023-07-16 20:07:18,990][222226] Updated weights for policy 0, policy_version 7680 (0.0005) +[2023-07-16 20:07:20,129][221941] Fps is (10 sec: 12697.7, 60 sec: 12970.7, 300 sec: 13601.5). Total num frames: 3944448. Throughput: 0: 13062.7. Samples: 3915800. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:07:20,130][221941] Avg episode reward: [(0, '487.389')] +[2023-07-16 20:07:22,251][222226] Updated weights for policy 0, policy_version 7760 (0.0006) +[2023-07-16 20:07:25,129][221941] Fps is (10 sec: 13107.2, 60 sec: 12970.7, 300 sec: 13593.2). Total num frames: 4009984. Throughput: 0: 13024.0. Samples: 3993664. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:07:25,130][221941] Avg episode reward: [(0, '483.680')] +[2023-07-16 20:07:25,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007832_4009984.pth... +[2023-07-16 20:07:25,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007064_3616768.pth +[2023-07-16 20:07:25,227][222226] Updated weights for policy 0, policy_version 7840 (0.0004) +[2023-07-16 20:07:28,211][222226] Updated weights for policy 0, policy_version 7920 (0.0005) +[2023-07-16 20:07:30,129][221941] Fps is (10 sec: 13516.8, 60 sec: 13107.2, 300 sec: 13773.7). Total num frames: 4079616. Throughput: 0: 13040.0. Samples: 4076472. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:30,130][221941] Avg episode reward: [(0, '475.429')] +[2023-07-16 20:07:31,091][222226] Updated weights for policy 0, policy_version 8000 (0.0004) +[2023-07-16 20:07:34,030][222226] Updated weights for policy 0, policy_version 8080 (0.0004) +[2023-07-16 20:07:35,129][221941] Fps is (10 sec: 13926.5, 60 sec: 13175.5, 300 sec: 13745.9). Total num frames: 4149248. Throughput: 0: 13153.3. Samples: 4118576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:35,129][221941] Avg episode reward: [(0, '472.431')] +[2023-07-16 20:07:36,949][222226] Updated weights for policy 0, policy_version 8160 (0.0004) +[2023-07-16 20:07:39,831][222226] Updated weights for policy 0, policy_version 8240 (0.0004) +[2023-07-16 20:07:40,129][221941] Fps is (10 sec: 14335.9, 60 sec: 13312.0, 300 sec: 13745.9). Total num frames: 4222976. Throughput: 0: 13366.4. Samples: 4202824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:40,130][221941] Avg episode reward: [(0, '472.830')] +[2023-07-16 20:07:40,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008248_4222976.pth... +[2023-07-16 20:07:40,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007456_3817472.pth +[2023-07-16 20:07:42,847][222226] Updated weights for policy 0, policy_version 8320 (0.0005) +[2023-07-16 20:07:45,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13243.8, 300 sec: 13704.2). Total num frames: 4288512. Throughput: 0: 13539.4. Samples: 4285964. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:45,130][221941] Avg episode reward: [(0, '478.791')] +[2023-07-16 20:07:45,763][222226] Updated weights for policy 0, policy_version 8400 (0.0005) +[2023-07-16 20:07:48,620][222226] Updated weights for policy 0, policy_version 8480 (0.0004) +[2023-07-16 20:07:50,129][221941] Fps is (10 sec: 13926.5, 60 sec: 13448.5, 300 sec: 13704.2). Total num frames: 4362240. Throughput: 0: 13584.7. Samples: 4329320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:50,130][221941] Avg episode reward: [(0, '485.148')] +[2023-07-16 20:07:51,475][222226] Updated weights for policy 0, policy_version 8560 (0.0004) +[2023-07-16 20:07:54,320][222226] Updated weights for policy 0, policy_version 8640 (0.0004) +[2023-07-16 20:07:55,129][221941] Fps is (10 sec: 14336.0, 60 sec: 13585.1, 300 sec: 13690.4). Total num frames: 4431872. Throughput: 0: 13653.7. Samples: 4415504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:07:55,130][221941] Avg episode reward: [(0, '487.111')] +[2023-07-16 20:07:55,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008656_4431872.pth... +[2023-07-16 20:07:55,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000007832_4009984.pth +[2023-07-16 20:07:57,312][222226] Updated weights for policy 0, policy_version 8720 (0.0005) +[2023-07-16 20:08:00,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13721.6, 300 sec: 13648.7). Total num frames: 4501504. Throughput: 0: 13823.7. Samples: 4497984. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-16 20:08:00,130][221941] Avg episode reward: [(0, '484.727')] +[2023-07-16 20:08:00,229][222226] Updated weights for policy 0, policy_version 8800 (0.0004) +[2023-07-16 20:08:02,988][222226] Updated weights for policy 0, policy_version 8880 (0.0004) +[2023-07-16 20:08:05,129][221941] Fps is (10 sec: 14336.1, 60 sec: 13789.9, 300 sec: 13634.8). Total num frames: 4575232. Throughput: 0: 13927.5. Samples: 4542536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:08:05,130][221941] Avg episode reward: [(0, '479.084')] +[2023-07-16 20:08:05,921][222226] Updated weights for policy 0, policy_version 8960 (0.0005) +[2023-07-16 20:08:08,764][222226] Updated weights for policy 0, policy_version 9040 (0.0004) +[2023-07-16 20:08:10,129][221941] Fps is (10 sec: 14335.9, 60 sec: 13789.9, 300 sec: 13607.0). Total num frames: 4644864. Throughput: 0: 14107.3. Samples: 4628492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:08:10,130][221941] Avg episode reward: [(0, '479.950')] +[2023-07-16 20:08:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009072_4644864.pth... +[2023-07-16 20:08:10,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008248_4222976.pth +[2023-07-16 20:08:11,812][222226] Updated weights for policy 0, policy_version 9120 (0.0005) +[2023-07-16 20:08:15,025][222226] Updated weights for policy 0, policy_version 9200 (0.0005) +[2023-07-16 20:08:15,129][221941] Fps is (10 sec: 13516.8, 60 sec: 13858.1, 300 sec: 13579.3). Total num frames: 4710400. Throughput: 0: 13999.2. Samples: 4706436. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:08:15,130][221941] Avg episode reward: [(0, '479.539')] +[2023-07-16 20:08:15,666][222182] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000010 +[2023-07-16 20:08:17,944][222226] Updated weights for policy 0, policy_version 9280 (0.0004) +[2023-07-16 20:08:20,129][221941] Fps is (10 sec: 13107.3, 60 sec: 13858.1, 300 sec: 13551.5). Total num frames: 4775936. Throughput: 0: 13983.5. Samples: 4747832. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:08:20,129][221941] Avg episode reward: [(0, '482.801')] +[2023-07-16 20:08:21,097][222226] Updated weights for policy 0, policy_version 9360 (0.0005) +[2023-07-16 20:08:24,313][222226] Updated weights for policy 0, policy_version 9440 (0.0005) +[2023-07-16 20:08:25,129][221941] Fps is (10 sec: 13107.1, 60 sec: 13858.1, 300 sec: 13523.7). Total num frames: 4841472. Throughput: 0: 13839.5. Samples: 4825600. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:08:25,130][221941] Avg episode reward: [(0, '484.086')] +[2023-07-16 20:08:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009456_4841472.pth... +[2023-07-16 20:08:25,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000008656_4431872.pth +[2023-07-16 20:08:27,410][222226] Updated weights for policy 0, policy_version 9520 (0.0005) +[2023-07-16 20:08:30,129][221941] Fps is (10 sec: 13107.2, 60 sec: 13789.9, 300 sec: 13496.0). Total num frames: 4907008. Throughput: 0: 13750.3. Samples: 4904728. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:08:30,130][221941] Avg episode reward: [(0, '464.458')] +[2023-07-16 20:08:30,511][222226] Updated weights for policy 0, policy_version 9600 (0.0005) +[2023-07-16 20:08:33,569][222226] Updated weights for policy 0, policy_version 9680 (0.0005) +[2023-07-16 20:08:35,129][221941] Fps is (10 sec: 13516.9, 60 sec: 13789.9, 300 sec: 13482.1). Total num frames: 4976640. Throughput: 0: 13658.3. Samples: 4943944. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:08:35,130][221941] Avg episode reward: [(0, '493.296')] +[2023-07-16 20:08:36,407][222226] Updated weights for policy 0, policy_version 9760 (0.0004) +[2023-07-16 20:08:39,343][222226] Updated weights for policy 0, policy_version 9840 (0.0005) +[2023-07-16 20:08:40,129][221941] Fps is (10 sec: 13926.3, 60 sec: 13721.6, 300 sec: 13468.2). Total num frames: 5046272. Throughput: 0: 13654.4. Samples: 5029952. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:08:40,130][221941] Avg episode reward: [(0, '486.579')] +[2023-07-16 20:08:40,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009856_5046272.pth... +[2023-07-16 20:08:40,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009072_4644864.pth +[2023-07-16 20:08:42,204][222226] Updated weights for policy 0, policy_version 9920 (0.0004) +[2023-07-16 20:08:45,003][222226] Updated weights for policy 0, policy_version 10000 (0.0004) +[2023-07-16 20:08:45,129][221941] Fps is (10 sec: 14335.9, 60 sec: 13858.1, 300 sec: 13468.2). Total num frames: 5120000. Throughput: 0: 13732.9. Samples: 5115968. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:08:45,130][221941] Avg episode reward: [(0, '488.626')] +[2023-07-16 20:08:47,786][222226] Updated weights for policy 0, policy_version 10080 (0.0004) +[2023-07-16 20:08:50,129][221941] Fps is (10 sec: 14745.6, 60 sec: 13858.1, 300 sec: 13482.1). Total num frames: 5193728. Throughput: 0: 13738.3. Samples: 5160760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:08:50,130][221941] Avg episode reward: [(0, '485.793')] +[2023-07-16 20:08:50,646][222226] Updated weights for policy 0, policy_version 10160 (0.0004) +[2023-07-16 20:08:53,481][222226] Updated weights for policy 0, policy_version 10240 (0.0004) +[2023-07-16 20:08:55,129][221941] Fps is (10 sec: 14336.2, 60 sec: 13858.1, 300 sec: 13468.2). Total num frames: 5263360. Throughput: 0: 13742.9. Samples: 5246924. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:08:55,129][221941] Avg episode reward: [(0, '487.179')] +[2023-07-16 20:08:55,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010280_5263360.pth... +[2023-07-16 20:08:55,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009456_4841472.pth +[2023-07-16 20:08:56,383][222226] Updated weights for policy 0, policy_version 10320 (0.0005) +[2023-07-16 20:08:59,355][222226] Updated weights for policy 0, policy_version 10400 (0.0005) +[2023-07-16 20:09:00,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13858.1, 300 sec: 13454.3). Total num frames: 5332992. Throughput: 0: 13868.1. Samples: 5330500. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:09:00,130][221941] Avg episode reward: [(0, '488.496')] +[2023-07-16 20:09:02,127][222226] Updated weights for policy 0, policy_version 10480 (0.0004) +[2023-07-16 20:09:05,072][222226] Updated weights for policy 0, policy_version 10560 (0.0005) +[2023-07-16 20:09:05,129][221941] Fps is (10 sec: 14336.0, 60 sec: 13858.1, 300 sec: 13454.3). Total num frames: 5406720. Throughput: 0: 13915.4. Samples: 5374024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:09:05,130][221941] Avg episode reward: [(0, '485.476')] +[2023-07-16 20:09:08,275][222226] Updated weights for policy 0, policy_version 10640 (0.0005) +[2023-07-16 20:09:10,129][221941] Fps is (10 sec: 13516.7, 60 sec: 13721.6, 300 sec: 13426.5). Total num frames: 5468160. Throughput: 0: 13951.1. Samples: 5453400. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:09:10,130][221941] Avg episode reward: [(0, '492.634')] +[2023-07-16 20:09:10,189][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010688_5472256.pth... +[2023-07-16 20:09:10,192][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000009856_5046272.pth +[2023-07-16 20:09:11,448][222226] Updated weights for policy 0, policy_version 10720 (0.0006) +[2023-07-16 20:09:14,544][222226] Updated weights for policy 0, policy_version 10800 (0.0005) +[2023-07-16 20:09:15,129][221941] Fps is (10 sec: 12697.6, 60 sec: 13721.6, 300 sec: 13426.5). Total num frames: 5533696. Throughput: 0: 13947.2. Samples: 5532352. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:09:15,130][221941] Avg episode reward: [(0, '486.081')] +[2023-07-16 20:09:17,519][222226] Updated weights for policy 0, policy_version 10880 (0.0005) +[2023-07-16 20:09:20,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13858.1, 300 sec: 13454.3). 
Total num frames: 5607424. Throughput: 0: 14000.4. Samples: 5573964. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:09:20,130][221941] Avg episode reward: [(0, '495.112')] +[2023-07-16 20:09:20,372][222226] Updated weights for policy 0, policy_version 10960 (0.0004) +[2023-07-16 20:09:23,133][222226] Updated weights for policy 0, policy_version 11040 (0.0004) +[2023-07-16 20:09:25,129][221941] Fps is (10 sec: 14745.5, 60 sec: 13994.7, 300 sec: 13454.3). Total num frames: 5681152. Throughput: 0: 14022.4. Samples: 5660960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:09:25,130][221941] Avg episode reward: [(0, '490.333')] +[2023-07-16 20:09:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011096_5681152.pth... +[2023-07-16 20:09:25,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010280_5263360.pth +[2023-07-16 20:09:25,954][222226] Updated weights for policy 0, policy_version 11120 (0.0004) +[2023-07-16 20:09:28,834][222226] Updated weights for policy 0, policy_version 11200 (0.0004) +[2023-07-16 20:09:30,129][221941] Fps is (10 sec: 14336.1, 60 sec: 14062.9, 300 sec: 13454.3). Total num frames: 5750784. Throughput: 0: 14042.4. Samples: 5747872. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:09:30,130][221941] Avg episode reward: [(0, '488.695')] +[2023-07-16 20:09:31,613][222226] Updated weights for policy 0, policy_version 11280 (0.0003) +[2023-07-16 20:09:34,449][222226] Updated weights for policy 0, policy_version 11360 (0.0004) +[2023-07-16 20:09:35,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 13468.2). Total num frames: 5824512. Throughput: 0: 14023.3. Samples: 5791808. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:09:35,130][221941] Avg episode reward: [(0, '492.821')] +[2023-07-16 20:09:37,349][222226] Updated weights for policy 0, policy_version 11440 (0.0004) +[2023-07-16 20:09:40,129][221941] Fps is (10 sec: 14335.8, 60 sec: 14131.2, 300 sec: 13482.1). Total num frames: 5894144. Throughput: 0: 14020.0. Samples: 5877824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:09:40,130][221941] Avg episode reward: [(0, '494.571')] +[2023-07-16 20:09:40,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011512_5894144.pth... +[2023-07-16 20:09:40,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000010688_5472256.pth +[2023-07-16 20:09:40,228][222226] Updated weights for policy 0, policy_version 11520 (0.0004) +[2023-07-16 20:09:43,023][222226] Updated weights for policy 0, policy_version 11600 (0.0004) +[2023-07-16 20:09:45,129][221941] Fps is (10 sec: 13926.5, 60 sec: 14063.0, 300 sec: 13496.0). Total num frames: 5963776. Throughput: 0: 14068.6. Samples: 5963588. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-16 20:09:45,130][221941] Avg episode reward: [(0, '497.532')] +[2023-07-16 20:09:46,085][222226] Updated weights for policy 0, policy_version 11680 (0.0004) +[2023-07-16 20:09:49,264][222226] Updated weights for policy 0, policy_version 11760 (0.0005) +[2023-07-16 20:09:50,129][221941] Fps is (10 sec: 13516.9, 60 sec: 13926.4, 300 sec: 13496.0). Total num frames: 6029312. Throughput: 0: 13936.9. Samples: 6001184. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:09:50,130][221941] Avg episode reward: [(0, '488.164')] +[2023-07-16 20:09:52,273][222226] Updated weights for policy 0, policy_version 11840 (0.0004) +[2023-07-16 20:09:55,129][221941] Fps is (10 sec: 13107.1, 60 sec: 13858.1, 300 sec: 13509.9). 
Total num frames: 6094848. Throughput: 0: 13961.3. Samples: 6081660. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:09:55,137][221941] Avg episode reward: [(0, '494.815')] +[2023-07-16 20:09:55,176][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011912_6098944.pth... +[2023-07-16 20:09:55,179][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011096_5681152.pth +[2023-07-16 20:09:55,488][222226] Updated weights for policy 0, policy_version 11920 (0.0005) +[2023-07-16 20:09:58,357][222226] Updated weights for policy 0, policy_version 12000 (0.0004) +[2023-07-16 20:10:00,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13926.4, 300 sec: 13537.6). Total num frames: 6168576. Throughput: 0: 14031.0. Samples: 6163748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:00,130][221941] Avg episode reward: [(0, '492.276')] +[2023-07-16 20:10:01,178][222226] Updated weights for policy 0, policy_version 12080 (0.0003) +[2023-07-16 20:10:04,142][222226] Updated weights for policy 0, policy_version 12160 (0.0004) +[2023-07-16 20:10:05,129][221941] Fps is (10 sec: 14336.0, 60 sec: 13858.1, 300 sec: 13565.4). Total num frames: 6238208. Throughput: 0: 14054.3. Samples: 6206408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:05,130][221941] Avg episode reward: [(0, '490.019')] +[2023-07-16 20:10:06,994][222226] Updated weights for policy 0, policy_version 12240 (0.0004) +[2023-07-16 20:10:09,835][222226] Updated weights for policy 0, policy_version 12320 (0.0004) +[2023-07-16 20:10:10,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13994.7, 300 sec: 13579.3). Total num frames: 6307840. Throughput: 0: 14021.2. Samples: 6291912. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:10,129][221941] Avg episode reward: [(0, '486.681')] +[2023-07-16 20:10:10,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012328_6311936.pth... +[2023-07-16 20:10:10,134][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011512_5894144.pth +[2023-07-16 20:10:12,710][222226] Updated weights for policy 0, policy_version 12400 (0.0004) +[2023-07-16 20:10:15,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 13620.9). Total num frames: 6381568. Throughput: 0: 13992.5. Samples: 6377536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:15,130][221941] Avg episode reward: [(0, '490.544')] +[2023-07-16 20:10:15,594][222226] Updated weights for policy 0, policy_version 12480 (0.0004) +[2023-07-16 20:10:18,398][222226] Updated weights for policy 0, policy_version 12560 (0.0003) +[2023-07-16 20:10:20,129][221941] Fps is (10 sec: 14745.6, 60 sec: 14131.2, 300 sec: 13648.7). Total num frames: 6455296. Throughput: 0: 13989.3. Samples: 6421324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:20,130][221941] Avg episode reward: [(0, '491.438')] +[2023-07-16 20:10:21,258][222226] Updated weights for policy 0, policy_version 12640 (0.0004) +[2023-07-16 20:10:24,151][222226] Updated weights for policy 0, policy_version 12720 (0.0004) +[2023-07-16 20:10:25,129][221941] Fps is (10 sec: 14335.9, 60 sec: 14062.9, 300 sec: 13662.6). Total num frames: 6524928. Throughput: 0: 13975.0. Samples: 6506700. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:25,130][221941] Avg episode reward: [(0, '490.952')] +[2023-07-16 20:10:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012744_6524928.pth... 
+[2023-07-16 20:10:25,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000011912_6098944.pth +[2023-07-16 20:10:27,079][222226] Updated weights for policy 0, policy_version 12800 (0.0004) +[2023-07-16 20:10:29,911][222226] Updated weights for policy 0, policy_version 12880 (0.0003) +[2023-07-16 20:10:30,129][221941] Fps is (10 sec: 13926.4, 60 sec: 14062.9, 300 sec: 13662.6). Total num frames: 6594560. Throughput: 0: 13966.2. Samples: 6592068. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:10:30,130][221941] Avg episode reward: [(0, '488.092')] +[2023-07-16 20:10:32,688][222226] Updated weights for policy 0, policy_version 12960 (0.0003) +[2023-07-16 20:10:35,129][221941] Fps is (10 sec: 14336.2, 60 sec: 14062.9, 300 sec: 13676.5). Total num frames: 6668288. Throughput: 0: 14105.1. Samples: 6635912. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:10:35,130][221941] Avg episode reward: [(0, '494.385')] +[2023-07-16 20:10:35,518][222226] Updated weights for policy 0, policy_version 13040 (0.0004) +[2023-07-16 20:10:38,354][222226] Updated weights for policy 0, policy_version 13120 (0.0004) +[2023-07-16 20:10:40,129][221941] Fps is (10 sec: 14745.4, 60 sec: 14131.2, 300 sec: 13704.2). Total num frames: 6742016. Throughput: 0: 14250.4. Samples: 6722928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:40,130][221941] Avg episode reward: [(0, '496.361')] +[2023-07-16 20:10:40,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013168_6742016.pth... 
+[2023-07-16 20:10:40,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012328_6311936.pth +[2023-07-16 20:10:41,137][222226] Updated weights for policy 0, policy_version 13200 (0.0004) +[2023-07-16 20:10:44,022][222226] Updated weights for policy 0, policy_version 13280 (0.0004) +[2023-07-16 20:10:45,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 13732.0). Total num frames: 6811648. Throughput: 0: 14351.6. Samples: 6809568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:10:45,130][221941] Avg episode reward: [(0, '497.065')] +[2023-07-16 20:10:46,858][222226] Updated weights for policy 0, policy_version 13360 (0.0004) +[2023-07-16 20:10:49,746][222226] Updated weights for policy 0, policy_version 13440 (0.0004) +[2023-07-16 20:10:50,129][221941] Fps is (10 sec: 14336.2, 60 sec: 14267.7, 300 sec: 13773.7). Total num frames: 6885376. Throughput: 0: 14361.4. Samples: 6852672. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:10:50,129][221941] Avg episode reward: [(0, '497.269')] +[2023-07-16 20:10:52,530][222226] Updated weights for policy 0, policy_version 13520 (0.0003) +[2023-07-16 20:10:55,129][221941] Fps is (10 sec: 14745.5, 60 sec: 14404.3, 300 sec: 13787.6). Total num frames: 6959104. Throughput: 0: 14397.2. Samples: 6939788. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-16 20:10:55,130][221941] Avg episode reward: [(0, '493.876')] +[2023-07-16 20:10:55,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013592_6959104.pth... 
+[2023-07-16 20:10:55,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000012744_6524928.pth +[2023-07-16 20:10:55,343][222226] Updated weights for policy 0, policy_version 13600 (0.0004) +[2023-07-16 20:10:58,320][222226] Updated weights for policy 0, policy_version 13680 (0.0004) +[2023-07-16 20:11:00,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14336.0, 300 sec: 13773.7). Total num frames: 7028736. Throughput: 0: 14380.1. Samples: 7024640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:11:00,130][221941] Avg episode reward: [(0, '493.746')] +[2023-07-16 20:11:01,184][222226] Updated weights for policy 0, policy_version 13760 (0.0004) +[2023-07-16 20:11:04,021][222226] Updated weights for policy 0, policy_version 13840 (0.0004) +[2023-07-16 20:11:05,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 13787.6). Total num frames: 7102464. Throughput: 0: 14365.3. Samples: 7067764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:11:05,130][221941] Avg episode reward: [(0, '497.523')] +[2023-07-16 20:11:06,781][222226] Updated weights for policy 0, policy_version 13920 (0.0003) +[2023-07-16 20:11:09,709][222226] Updated weights for policy 0, policy_version 14000 (0.0004) +[2023-07-16 20:11:10,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 13787.6). Total num frames: 7172096. Throughput: 0: 14383.9. Samples: 7153976. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:11:10,130][221941] Avg episode reward: [(0, '496.158')] +[2023-07-16 20:11:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014008_7172096.pth... 
+[2023-07-16 20:11:10,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013168_6742016.pth +[2023-07-16 20:11:12,661][222226] Updated weights for policy 0, policy_version 14080 (0.0004) +[2023-07-16 20:11:15,129][221941] Fps is (10 sec: 13926.4, 60 sec: 14336.0, 300 sec: 13815.3). Total num frames: 7241728. Throughput: 0: 14362.6. Samples: 7238384. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:11:15,130][221941] Avg episode reward: [(0, '491.887')] +[2023-07-16 20:11:15,522][222226] Updated weights for policy 0, policy_version 14160 (0.0004) +[2023-07-16 20:11:18,331][222226] Updated weights for policy 0, policy_version 14240 (0.0004) +[2023-07-16 20:11:20,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14336.0, 300 sec: 13843.1). Total num frames: 7315456. Throughput: 0: 14373.0. Samples: 7282696. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:11:20,130][221941] Avg episode reward: [(0, '499.612')] +[2023-07-16 20:11:20,131][222182] Saving new best policy, reward=499.612! +[2023-07-16 20:11:21,203][222226] Updated weights for policy 0, policy_version 14320 (0.0004) +[2023-07-16 20:11:23,994][222226] Updated weights for policy 0, policy_version 14400 (0.0004) +[2023-07-16 20:11:25,129][221941] Fps is (10 sec: 14335.9, 60 sec: 14336.0, 300 sec: 13870.9). Total num frames: 7385088. Throughput: 0: 14352.0. Samples: 7368768. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:11:25,130][221941] Avg episode reward: [(0, '494.430')] +[2023-07-16 20:11:25,157][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014432_7389184.pth... 
+[2023-07-16 20:11:25,159][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000013592_6959104.pth +[2023-07-16 20:11:26,870][222226] Updated weights for policy 0, policy_version 14480 (0.0004) +[2023-07-16 20:11:29,741][222226] Updated weights for policy 0, policy_version 14560 (0.0004) +[2023-07-16 20:11:30,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14404.2, 300 sec: 13898.6). Total num frames: 7458816. Throughput: 0: 14338.1. Samples: 7454784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:11:30,130][221941] Avg episode reward: [(0, '492.307')] +[2023-07-16 20:11:32,550][222226] Updated weights for policy 0, policy_version 14640 (0.0004) +[2023-07-16 20:11:35,129][221941] Fps is (10 sec: 14745.7, 60 sec: 14404.3, 300 sec: 13926.4). Total num frames: 7532544. Throughput: 0: 14351.5. Samples: 7498492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:11:35,130][221941] Avg episode reward: [(0, '497.252')] +[2023-07-16 20:11:35,341][222226] Updated weights for policy 0, policy_version 14720 (0.0003) +[2023-07-16 20:11:38,157][222226] Updated weights for policy 0, policy_version 14800 (0.0004) +[2023-07-16 20:11:40,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14336.0, 300 sec: 13926.4). Total num frames: 7602176. Throughput: 0: 14357.1. Samples: 7585856. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:11:40,130][221941] Avg episode reward: [(0, '495.675')] +[2023-07-16 20:11:40,140][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014856_7606272.pth... 
+[2023-07-16 20:11:40,141][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014008_7172096.pth +[2023-07-16 20:11:40,969][222226] Updated weights for policy 0, policy_version 14880 (0.0004) +[2023-07-16 20:11:43,797][222226] Updated weights for policy 0, policy_version 14960 (0.0004) +[2023-07-16 20:11:45,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 13968.1). Total num frames: 7675904. Throughput: 0: 14422.2. Samples: 7673640. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-16 20:11:45,130][221941] Avg episode reward: [(0, '491.918')] +[2023-07-16 20:11:46,545][222226] Updated weights for policy 0, policy_version 15040 (0.0003) +[2023-07-16 20:11:49,361][222226] Updated weights for policy 0, policy_version 15120 (0.0004) +[2023-07-16 20:11:50,129][221941] Fps is (10 sec: 14745.7, 60 sec: 14404.3, 300 sec: 14009.7). Total num frames: 7749632. Throughput: 0: 14452.6. Samples: 7718132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:11:50,130][221941] Avg episode reward: [(0, '497.789')] +[2023-07-16 20:11:52,147][222226] Updated weights for policy 0, policy_version 15200 (0.0004) +[2023-07-16 20:11:54,992][222226] Updated weights for policy 0, policy_version 15280 (0.0004) +[2023-07-16 20:11:55,129][221941] Fps is (10 sec: 14745.6, 60 sec: 14404.3, 300 sec: 14051.4). Total num frames: 7823360. Throughput: 0: 14475.6. Samples: 7805376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:11:55,130][221941] Avg episode reward: [(0, '497.015')] +[2023-07-16 20:11:55,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015280_7823360.pth... 
+[2023-07-16 20:11:55,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014432_7389184.pth +[2023-07-16 20:11:57,817][222226] Updated weights for policy 0, policy_version 15360 (0.0004) +[2023-07-16 20:12:00,129][221941] Fps is (10 sec: 14745.5, 60 sec: 14472.5, 300 sec: 14065.2). Total num frames: 7897088. Throughput: 0: 14524.1. Samples: 7891968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:12:00,130][221941] Avg episode reward: [(0, '495.310')] +[2023-07-16 20:12:00,656][222226] Updated weights for policy 0, policy_version 15440 (0.0004) +[2023-07-16 20:12:03,497][222226] Updated weights for policy 0, policy_version 15520 (0.0004) +[2023-07-16 20:12:05,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 14065.2). Total num frames: 7966720. Throughput: 0: 14499.5. Samples: 7935172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:12:05,160][221941] Avg episode reward: [(0, '492.825')] +[2023-07-16 20:12:06,250][222226] Updated weights for policy 0, policy_version 15600 (0.0003) +[2023-07-16 20:12:09,155][222226] Updated weights for policy 0, policy_version 15680 (0.0004) +[2023-07-16 20:12:10,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14472.5, 300 sec: 14106.9). Total num frames: 8040448. Throughput: 0: 14528.9. Samples: 8022568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-16 20:12:10,130][221941] Avg episode reward: [(0, '498.153')] +[2023-07-16 20:12:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015704_8040448.pth... 
+[2023-07-16 20:12:10,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000014856_7606272.pth +[2023-07-16 20:12:12,032][222226] Updated weights for policy 0, policy_version 15760 (0.0004) +[2023-07-16 20:12:14,839][222226] Updated weights for policy 0, policy_version 15840 (0.0004) +[2023-07-16 20:12:15,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14472.5, 300 sec: 14120.8). Total num frames: 8110080. Throughput: 0: 14536.1. Samples: 8108908. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:12:15,130][221941] Avg episode reward: [(0, '495.751')] +[2023-07-16 20:12:17,696][222226] Updated weights for policy 0, policy_version 15920 (0.0004) +[2023-07-16 20:12:20,129][221941] Fps is (10 sec: 14336.0, 60 sec: 14472.5, 300 sec: 14148.6). Total num frames: 8183808. Throughput: 0: 14505.2. Samples: 8151224. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-16 20:12:20,130][221941] Avg episode reward: [(0, '495.794')] +[2023-07-16 20:12:20,470][222226] Updated weights for policy 0, policy_version 16000 (0.0004) +[2023-07-16 20:12:23,335][222226] Updated weights for policy 0, policy_version 16080 (0.0004) +[2023-07-16 20:12:25,129][221941] Fps is (10 sec: 14745.4, 60 sec: 14540.8, 300 sec: 14162.4). Total num frames: 8257536. Throughput: 0: 14500.1. Samples: 8238364. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-16 20:12:25,131][221941] Avg episode reward: [(0, '492.101')] +[2023-07-16 20:12:25,134][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016128_8257536.pth... 
+[2023-07-16 20:12:25,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015280_7823360.pth +[2023-07-16 20:12:26,170][222226] Updated weights for policy 0, policy_version 16160 (0.0004) +[2023-07-16 20:12:29,266][222226] Updated weights for policy 0, policy_version 16240 (0.0005) +[2023-07-16 20:12:30,129][221941] Fps is (10 sec: 13926.5, 60 sec: 14404.3, 300 sec: 14148.6). Total num frames: 8323072. Throughput: 0: 14395.8. Samples: 8321452. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-16 20:12:30,130][221941] Avg episode reward: [(0, '497.049')] +[2023-07-16 20:12:32,460][222226] Updated weights for policy 0, policy_version 16320 (0.0005) +[2023-07-16 20:12:35,129][221941] Fps is (10 sec: 13107.4, 60 sec: 14267.7, 300 sec: 14120.8). Total num frames: 8388608. Throughput: 0: 14262.3. Samples: 8359936. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-16 20:12:35,130][221941] Avg episode reward: [(0, '490.886')] +[2023-07-16 20:12:35,670][222226] Updated weights for policy 0, policy_version 16400 (0.0005) +[2023-07-16 20:12:38,707][222226] Updated weights for policy 0, policy_version 16480 (0.0005) +[2023-07-16 20:12:40,129][221941] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14120.8). Total num frames: 8454144. Throughput: 0: 14054.7. Samples: 8437836. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:12:40,130][221941] Avg episode reward: [(0, '494.412')] +[2023-07-16 20:12:40,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016512_8454144.pth... 
+[2023-07-16 20:12:40,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000015704_8040448.pth +[2023-07-16 20:12:41,929][222226] Updated weights for policy 0, policy_version 16560 (0.0005) +[2023-07-16 20:12:45,025][222226] Updated weights for policy 0, policy_version 16640 (0.0005) +[2023-07-16 20:12:45,129][221941] Fps is (10 sec: 13107.2, 60 sec: 14062.9, 300 sec: 14093.0). Total num frames: 8519680. Throughput: 0: 13862.6. Samples: 8515784. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:12:45,130][221941] Avg episode reward: [(0, '495.050')] +[2023-07-16 20:12:48,131][222226] Updated weights for policy 0, policy_version 16720 (0.0005) +[2023-07-16 20:12:50,129][221941] Fps is (10 sec: 13107.2, 60 sec: 13926.4, 300 sec: 14079.1). Total num frames: 8585216. Throughput: 0: 13792.8. Samples: 8555848. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:12:50,130][221941] Avg episode reward: [(0, '498.323')] +[2023-07-16 20:12:51,251][222226] Updated weights for policy 0, policy_version 16800 (0.0005) +[2023-07-16 20:12:54,519][222226] Updated weights for policy 0, policy_version 16880 (0.0005) +[2023-07-16 20:12:55,129][221941] Fps is (10 sec: 12697.5, 60 sec: 13721.6, 300 sec: 14051.4). Total num frames: 8646656. Throughput: 0: 13577.8. Samples: 8633568. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-16 20:12:55,130][221941] Avg episode reward: [(0, '489.125')] +[2023-07-16 20:12:55,159][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016896_8650752.pth... 
+[2023-07-16 20:12:55,162][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016128_8257536.pth
+[2023-07-16 20:12:57,762][222226] Updated weights for policy 0, policy_version 16960 (0.0005)
+[2023-07-16 20:13:00,129][221941] Fps is (10 sec: 12697.6, 60 sec: 13585.1, 300 sec: 14023.6). Total num frames: 8712192. Throughput: 0: 13332.4. Samples: 8708864. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 20:13:00,130][221941] Avg episode reward: [(0, '495.948')]
+[2023-07-16 20:13:00,953][222226] Updated weights for policy 0, policy_version 17040 (0.0005)
+[2023-07-16 20:13:03,792][222226] Updated weights for policy 0, policy_version 17120 (0.0003)
+[2023-07-16 20:13:05,129][221941] Fps is (10 sec: 13516.9, 60 sec: 13585.1, 300 sec: 14023.6). Total num frames: 8781824. Throughput: 0: 13286.9. Samples: 8749136. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 20:13:05,130][221941] Avg episode reward: [(0, '502.184')]
+[2023-07-16 20:13:05,130][222182] Saving new best policy, reward=502.184!
+[2023-07-16 20:13:06,640][222226] Updated weights for policy 0, policy_version 17200 (0.0004)
+[2023-07-16 20:13:09,494][222226] Updated weights for policy 0, policy_version 17280 (0.0004)
+[2023-07-16 20:13:10,129][221941] Fps is (10 sec: 14335.9, 60 sec: 13585.1, 300 sec: 14051.4). Total num frames: 8855552. Throughput: 0: 13288.0. Samples: 8836324. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 20:13:10,130][221941] Avg episode reward: [(0, '494.839')]
+[2023-07-16 20:13:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017296_8855552.pth...
+[2023-07-16 20:13:10,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016512_8454144.pth
+[2023-07-16 20:13:12,420][222226] Updated weights for policy 0, policy_version 17360 (0.0004)
+[2023-07-16 20:13:15,129][221941] Fps is (10 sec: 13926.5, 60 sec: 13516.8, 300 sec: 14051.4). Total num frames: 8921088. Throughput: 0: 13264.3. Samples: 8918344. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 20:13:15,130][221941] Avg episode reward: [(0, '493.079')]
+[2023-07-16 20:13:15,536][222226] Updated weights for policy 0, policy_version 17440 (0.0005)
+[2023-07-16 20:13:18,621][222226] Updated weights for policy 0, policy_version 17520 (0.0005)
+[2023-07-16 20:13:20,129][221941] Fps is (10 sec: 13516.9, 60 sec: 13448.5, 300 sec: 14065.3). Total num frames: 8990720. Throughput: 0: 13290.7. Samples: 8958016. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 20:13:20,130][221941] Avg episode reward: [(0, '495.237')]
+[2023-07-16 20:13:21,476][222226] Updated weights for policy 0, policy_version 17600 (0.0004)
+[2023-07-16 20:13:24,458][222226] Updated weights for policy 0, policy_version 17680 (0.0004)
+[2023-07-16 20:13:25,129][221941] Fps is (10 sec: 13926.3, 60 sec: 13380.3, 300 sec: 14079.1). Total num frames: 9060352. Throughput: 0: 13436.0. Samples: 9042456. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 20:13:25,130][221941] Avg episode reward: [(0, '497.460')]
+[2023-07-16 20:13:25,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017696_9060352.pth...
+[2023-07-16 20:13:25,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000016896_8650752.pth
+[2023-07-16 20:13:27,324][222226] Updated weights for policy 0, policy_version 17760 (0.0004)
+[2023-07-16 20:13:30,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13448.5, 300 sec: 14079.1). Total num frames: 9129984. Throughput: 0: 13577.1. Samples: 9126752. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 20:13:30,130][221941] Avg episode reward: [(0, '494.273')]
+[2023-07-16 20:13:30,259][222226] Updated weights for policy 0, policy_version 17840 (0.0004)
+[2023-07-16 20:13:33,154][222226] Updated weights for policy 0, policy_version 17920 (0.0004)
+[2023-07-16 20:13:35,129][221941] Fps is (10 sec: 14336.1, 60 sec: 13585.1, 300 sec: 14093.0). Total num frames: 9203712. Throughput: 0: 13635.1. Samples: 9169428. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 20:13:35,129][221941] Avg episode reward: [(0, '493.236')]
+[2023-07-16 20:13:35,959][222226] Updated weights for policy 0, policy_version 18000 (0.0003)
+[2023-07-16 20:13:38,834][222226] Updated weights for policy 0, policy_version 18080 (0.0004)
+[2023-07-16 20:13:40,129][221941] Fps is (10 sec: 14335.9, 60 sec: 13653.3, 300 sec: 14079.1). Total num frames: 9273344. Throughput: 0: 13817.0. Samples: 9255332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:13:40,130][221941] Avg episode reward: [(0, '489.273')]
+[2023-07-16 20:13:40,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018112_9273344.pth...
+[2023-07-16 20:13:40,136][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017296_8855552.pth
+[2023-07-16 20:13:41,765][222226] Updated weights for policy 0, policy_version 18160 (0.0004)
+[2023-07-16 20:13:44,674][222226] Updated weights for policy 0, policy_version 18240 (0.0004)
+[2023-07-16 20:13:45,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13721.6, 300 sec: 14065.2). Total num frames: 9342976. Throughput: 0: 14014.0. Samples: 9339492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:13:45,130][221941] Avg episode reward: [(0, '497.643')]
+[2023-07-16 20:13:47,553][222226] Updated weights for policy 0, policy_version 18320 (0.0004)
+[2023-07-16 20:13:50,129][221941] Fps is (10 sec: 13926.6, 60 sec: 13789.9, 300 sec: 14065.3). Total num frames: 9412608. Throughput: 0: 14076.5. Samples: 9382576. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 20:13:50,129][221941] Avg episode reward: [(0, '499.781')]
+[2023-07-16 20:13:50,427][222226] Updated weights for policy 0, policy_version 18400 (0.0004)
+[2023-07-16 20:13:53,455][222226] Updated weights for policy 0, policy_version 18480 (0.0004)
+[2023-07-16 20:13:55,129][221941] Fps is (10 sec: 13926.4, 60 sec: 13926.4, 300 sec: 14065.2). Total num frames: 9482240. Throughput: 0: 13989.6. Samples: 9465856. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 20:13:55,130][221941] Avg episode reward: [(0, '494.464')]
+[2023-07-16 20:13:55,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018520_9482240.pth...
+[2023-07-16 20:13:55,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000017696_9060352.pth
+[2023-07-16 20:13:56,433][222226] Updated weights for policy 0, policy_version 18560 (0.0004)
+[2023-07-16 20:13:59,246][222226] Updated weights for policy 0, policy_version 18640 (0.0004)
+[2023-07-16 20:14:00,129][221941] Fps is (10 sec: 14335.9, 60 sec: 14062.9, 300 sec: 14065.2). Total num frames: 9555968. Throughput: 0: 14054.8. Samples: 9550812. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 20:14:00,130][221941] Avg episode reward: [(0, '496.235')]
+[2023-07-16 20:14:02,062][222226] Updated weights for policy 0, policy_version 18720 (0.0004)
+[2023-07-16 20:14:05,129][221941] Fps is (10 sec: 13926.5, 60 sec: 13994.7, 300 sec: 14079.1). Total num frames: 9621504. Throughput: 0: 14134.4. Samples: 9594064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:14:05,130][221941] Avg episode reward: [(0, '489.772')]
+[2023-07-16 20:14:05,172][222226] Updated weights for policy 0, policy_version 18800 (0.0004)
+[2023-07-16 20:14:08,261][222226] Updated weights for policy 0, policy_version 18880 (0.0003)
+[2023-07-16 20:14:10,129][221941] Fps is (10 sec: 13516.7, 60 sec: 13926.4, 300 sec: 14093.0). Total num frames: 9691136. Throughput: 0: 14008.9. Samples: 9672856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:14:10,130][221941] Avg episode reward: [(0, '497.832')]
+[2023-07-16 20:14:10,133][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018928_9691136.pth...
+[2023-07-16 20:14:10,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018112_9273344.pth
+[2023-07-16 20:14:11,313][222226] Updated weights for policy 0, policy_version 18960 (0.0005)
+[2023-07-16 20:14:14,448][222226] Updated weights for policy 0, policy_version 19040 (0.0005)
+[2023-07-16 20:14:15,129][221941] Fps is (10 sec: 13516.8, 60 sec: 13926.4, 300 sec: 14065.2). Total num frames: 9756672. Throughput: 0: 13908.6. Samples: 9752640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:14:15,130][221941] Avg episode reward: [(0, '493.554')]
+[2023-07-16 20:14:17,617][222226] Updated weights for policy 0, policy_version 19120 (0.0005)
+[2023-07-16 20:14:20,129][221941] Fps is (10 sec: 13107.2, 60 sec: 13858.1, 300 sec: 14037.5). Total num frames: 9822208. Throughput: 0: 13809.2. Samples: 9790844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:14:20,130][221941] Avg episode reward: [(0, '496.531')]
+[2023-07-16 20:14:20,729][222226] Updated weights for policy 0, policy_version 19200 (0.0005)
+[2023-07-16 20:14:23,866][222226] Updated weights for policy 0, policy_version 19280 (0.0005)
+[2023-07-16 20:14:25,129][221941] Fps is (10 sec: 13107.1, 60 sec: 13789.9, 300 sec: 14023.6). Total num frames: 9887744. Throughput: 0: 13662.2. Samples: 9870132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:14:25,130][221941] Avg episode reward: [(0, '497.687')]
+[2023-07-16 20:14:25,132][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019312_9887744.pth...
+[2023-07-16 20:14:25,135][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018520_9482240.pth
+[2023-07-16 20:14:27,037][222226] Updated weights for policy 0, policy_version 19360 (0.0005)
+[2023-07-16 20:14:30,129][221941] Fps is (10 sec: 12697.6, 60 sec: 13653.3, 300 sec: 13981.9). Total num frames: 9949184. Throughput: 0: 13517.7. Samples: 9947788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 20:14:30,130][221941] Avg episode reward: [(0, '501.179')]
+[2023-07-16 20:14:30,194][222226] Updated weights for policy 0, policy_version 19440 (0.0005)
+[2023-07-16 20:14:33,327][222226] Updated weights for policy 0, policy_version 19520 (0.0005)
+[2023-07-16 20:14:34,321][222182] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
+[2023-07-16 20:14:34,322][222228] Stopping RolloutWorker_w2...
+[2023-07-16 20:14:34,322][222289] Stopping RolloutWorker_w5...
+[2023-07-16 20:14:34,322][222227] Stopping RolloutWorker_w1...
+[2023-07-16 20:14:34,322][222230] Stopping RolloutWorker_w3...
+[2023-07-16 20:14:34,322][222263] Stopping RolloutWorker_w6...
+[2023-07-16 20:14:34,322][222231] Stopping RolloutWorker_w4...
+[2023-07-16 20:14:34,322][222229] Stopping RolloutWorker_w0...
+[2023-07-16 20:14:34,322][222228] Loop rollout_proc2_evt_loop terminating...
+[2023-07-16 20:14:34,322][222289] Loop rollout_proc5_evt_loop terminating...
+[2023-07-16 20:14:34,322][222327] Stopping RolloutWorker_w7...
+[2023-07-16 20:14:34,322][222227] Loop rollout_proc1_evt_loop terminating...
+[2023-07-16 20:14:34,322][222230] Loop rollout_proc3_evt_loop terminating...
+[2023-07-16 20:14:34,322][222263] Loop rollout_proc6_evt_loop terminating...
+[2023-07-16 20:14:34,322][222231] Loop rollout_proc4_evt_loop terminating...
+[2023-07-16 20:14:34,322][222327] Loop rollout_proc7_evt_loop terminating...
+[2023-07-16 20:14:34,322][222229] Loop rollout_proc0_evt_loop terminating...
+[2023-07-16 20:14:34,322][221941] Component RolloutWorker_w2 stopped!
+[2023-07-16 20:14:34,322][222182] Stopping Batcher_0...
+[2023-07-16 20:14:34,322][221941] Component RolloutWorker_w1 stopped!
+[2023-07-16 20:14:34,323][222182] Loop batcher_evt_loop terminating...
+[2023-07-16 20:14:34,323][221941] Component RolloutWorker_w5 stopped!
+[2023-07-16 20:14:34,323][221941] Component RolloutWorker_w3 stopped!
+[2023-07-16 20:14:34,323][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2023-07-16 20:14:34,323][221941] Component RolloutWorker_w6 stopped!
+[2023-07-16 20:14:34,323][221941] Component RolloutWorker_w4 stopped!
+[2023-07-16 20:14:34,324][221941] Component RolloutWorker_w0 stopped!
+[2023-07-16 20:14:34,324][221941] Component RolloutWorker_w7 stopped!
+[2023-07-16 20:14:34,324][221941] Component Batcher_0 stopped!
+[2023-07-16 20:14:34,326][222182] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000018928_9691136.pth
+[2023-07-16 20:14:34,326][222182] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/button-press-topdown-wall-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2023-07-16 20:14:34,328][222182] Stopping LearnerWorker_p0...
+[2023-07-16 20:14:34,329][222182] Loop learner_proc0_evt_loop terminating...
+[2023-07-16 20:14:34,329][221941] Component LearnerWorker_p0 stopped!
+[2023-07-16 20:14:34,388][222226] Weights refcount: 2 0
+[2023-07-16 20:14:34,389][222226] Stopping InferenceWorker_p0-w0...
+[2023-07-16 20:14:34,389][222226] Loop inference_proc0-0_evt_loop terminating...
+[2023-07-16 20:14:34,389][221941] Component InferenceWorker_p0-w0 stopped!
+[2023-07-16 20:14:34,390][221941] Waiting for process learner_proc0 to stop...
+[2023-07-16 20:14:34,912][221941] Waiting for process inference_proc0-0 to join...
+[2023-07-16 20:14:34,932][221941] Waiting for process rollout_proc0 to join...
+[2023-07-16 20:14:34,932][221941] Waiting for process rollout_proc1 to join...
+[2023-07-16 20:14:34,933][221941] Waiting for process rollout_proc2 to join...
+[2023-07-16 20:14:34,933][221941] Waiting for process rollout_proc3 to join...
+[2023-07-16 20:14:34,933][221941] Waiting for process rollout_proc4 to join...
+[2023-07-16 20:14:34,933][221941] Waiting for process rollout_proc5 to join...
+[2023-07-16 20:14:34,933][221941] Waiting for process rollout_proc6 to join...
+[2023-07-16 20:14:34,933][221941] Waiting for process rollout_proc7 to join...
+[2023-07-16 20:14:34,934][221941] Batcher 0 profile tree view:
+batching: 1.8505, releasing_batches: 1.5739
+[2023-07-16 20:14:34,934][221941] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0051
+ wait_policy_total: 226.2036
+update_model: 9.8376
 weight_update: 0.0005
 one_step: 0.0006
- handle_policy_step: 544.3367
- deserialize: 23.2006, stack: 5.7697, obs_to_device_normalize: 96.4417, forward: 268.2320, send_messages: 43.0287
- prepare_outputs: 60.8445
- to_cpu: 9.1399
-[2023-07-08 14:09:06,439][977264] Learner 0 profile tree view:
-misc: 0.0099, prepare_batch: 8.3734
-train: 85.6720
- epoch_init: 0.0367, minibatch_init: 1.1955, losses_postprocess: 1.2593, kl_divergence: 0.4103, after_optimizer: 0.6102
- calculate_losses: 36.1694
- losses_init: 0.0353, forward_head: 13.7435, bptt_initial: 0.1314, bptt: 0.1177, tail: 10.5751, advantages_returns: 0.8163, losses: 9.4465
- update: 44.5250
- clip: 5.3650
-[2023-07-08 14:09:06,439][977264] RolloutWorker_w0 profile tree view:
-wait_for_trajectories: 0.4541, enqueue_policy_requests: 14.3688, env_step: 596.9103, overhead: 22.5802, complete_rollouts: 0.3804
-save_policy_outputs: 42.6703
- split_output_tensors: 14.5287
-[2023-07-08 14:09:06,439][977264] RolloutWorker_w7 profile tree view:
-wait_for_trajectories: 0.4153, enqueue_policy_requests: 14.5390, env_step: 592.6366, overhead: 22.1743, complete_rollouts: 0.3974
-save_policy_outputs: 42.4443
- split_output_tensors: 14.4920
-[2023-07-08 14:09:06,440][977264] Loop Runner_EvtLoop terminating...
-[2023-07-08 14:09:06,440][977264] Runner profile tree view:
-main_loop: 1000.3406
-[2023-07-08 14:09:06,440][977264] Collected {0: 10006528}, FPS: 10003.1
+ handle_policy_step: 439.1341
+ deserialize: 18.7207, stack: 4.6238, obs_to_device_normalize: 77.8507, forward: 216.1656, send_messages: 34.1246
+ prepare_outputs: 50.3532
+ to_cpu: 7.6623
+[2023-07-16 20:14:34,934][221941] Learner 0 profile tree view:
+misc: 0.0083, prepare_batch: 8.3629
+train: 85.1675
+ epoch_init: 0.0310, minibatch_init: 1.1838, losses_postprocess: 1.1480, kl_divergence: 0.3931, after_optimizer: 0.5141
+ calculate_losses: 36.3619
+ losses_init: 0.0288, forward_head: 14.2380, bptt_initial: 0.1234, bptt: 0.1101, tail: 10.2297, advantages_returns: 0.7852, losses: 9.5746
+ update: 44.0842
+ clip: 5.2612
+[2023-07-16 20:14:34,934][221941] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.2702, enqueue_policy_requests: 12.4641, env_step: 471.7971, overhead: 20.0044, complete_rollouts: 0.3247
+save_policy_outputs: 38.6772
+ split_output_tensors: 13.4433
+[2023-07-16 20:14:34,934][221941] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.2625, enqueue_policy_requests: 12.3390, env_step: 470.8625, overhead: 19.8809, complete_rollouts: 0.3100
+save_policy_outputs: 37.6542
+ split_output_tensors: 12.9685
+[2023-07-16 20:14:34,934][221941] Loop Runner_EvtLoop terminating...
+[2023-07-16 20:14:34,935][221941] Runner profile tree view:
+main_loop: 727.7330
+[2023-07-16 20:14:34,935][221941] Collected {0: 10006528}, FPS: 13750.3