diff --git "a/sf_log.txt" "b/sf_log.txt" --- "a/sf_log.txt" +++ "b/sf_log.txt" @@ -1,32 +1,32 @@ -[2023-07-08 20:06:00,858][1063098] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/config.json... -[2023-07-08 20:06:00,881][1063098] Rollout worker 0 uses device cpu -[2023-07-08 20:06:00,881][1063098] Rollout worker 1 uses device cpu -[2023-07-08 20:06:00,881][1063098] Rollout worker 2 uses device cpu -[2023-07-08 20:06:00,882][1063098] Rollout worker 3 uses device cpu -[2023-07-08 20:06:00,882][1063098] Rollout worker 4 uses device cpu -[2023-07-08 20:06:00,882][1063098] Rollout worker 5 uses device cpu -[2023-07-08 20:06:00,882][1063098] Rollout worker 6 uses device cpu -[2023-07-08 20:06:00,882][1063098] Rollout worker 7 uses device cpu -[2023-07-08 20:06:00,882][1063098] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 -[2023-07-08 20:06:00,897][1063098] InferenceWorker_p0-w0: min num requests: 2 -[2023-07-08 20:06:00,922][1063098] Starting all processes... -[2023-07-08 20:06:00,922][1063098] Starting process learner_proc0 -[2023-07-08 20:06:00,971][1063098] Starting all processes... 
-[2023-07-08 20:06:01,016][1063098] Starting process inference_proc0-0 -[2023-07-08 20:06:01,016][1063098] Starting process rollout_proc0 -[2023-07-08 20:06:01,016][1063098] Starting process rollout_proc1 -[2023-07-08 20:06:01,017][1063098] Starting process rollout_proc2 -[2023-07-08 20:06:01,017][1063098] Starting process rollout_proc3 -[2023-07-08 20:06:01,017][1063098] Starting process rollout_proc4 -[2023-07-08 20:06:01,017][1063098] Starting process rollout_proc5 -[2023-07-08 20:06:01,017][1063098] Starting process rollout_proc6 -[2023-07-08 20:06:01,017][1063098] Starting process rollout_proc7 -[2023-07-08 20:06:03,077][1063339] Starting seed is not provided -[2023-07-08 20:06:03,077][1063339] Initializing actor-critic model on device cpu -[2023-07-08 20:06:03,078][1063339] RunningMeanStd input shape: (39,) -[2023-07-08 20:06:03,078][1063339] RunningMeanStd input shape: (1,) -[2023-07-08 20:06:03,138][1063339] Created Actor Critic model with architecture: -[2023-07-08 20:06:03,138][1063339] ActorCriticSharedWeights( +[2023-07-17 00:32:22,455][276985] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/config.json... +[2023-07-17 00:32:22,471][276985] Rollout worker 0 uses device cpu +[2023-07-17 00:32:22,471][276985] Rollout worker 1 uses device cpu +[2023-07-17 00:32:22,471][276985] Rollout worker 2 uses device cpu +[2023-07-17 00:32:22,471][276985] Rollout worker 3 uses device cpu +[2023-07-17 00:32:22,472][276985] Rollout worker 4 uses device cpu +[2023-07-17 00:32:22,472][276985] Rollout worker 5 uses device cpu +[2023-07-17 00:32:22,472][276985] Rollout worker 6 uses device cpu +[2023-07-17 00:32:22,472][276985] Rollout worker 7 uses device cpu +[2023-07-17 00:32:22,472][276985] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 +[2023-07-17 00:32:22,483][276985] InferenceWorker_p0-w0: min num requests: 2 +[2023-07-17 00:32:22,504][276985] Starting all processes... 
+[2023-07-17 00:32:22,504][276985] Starting process learner_proc0 +[2023-07-17 00:32:22,553][276985] Starting all processes... +[2023-07-17 00:32:22,587][276985] Starting process inference_proc0-0 +[2023-07-17 00:32:22,596][276985] Starting process rollout_proc0 +[2023-07-17 00:32:22,597][276985] Starting process rollout_proc1 +[2023-07-17 00:32:22,597][276985] Starting process rollout_proc2 +[2023-07-17 00:32:22,597][276985] Starting process rollout_proc3 +[2023-07-17 00:32:22,597][276985] Starting process rollout_proc4 +[2023-07-17 00:32:22,598][276985] Starting process rollout_proc5 +[2023-07-17 00:32:22,598][276985] Starting process rollout_proc6 +[2023-07-17 00:32:22,598][276985] Starting process rollout_proc7 +[2023-07-17 00:32:24,375][277226] Starting seed is not provided +[2023-07-17 00:32:24,375][277226] Initializing actor-critic model on device cpu +[2023-07-17 00:32:24,376][277226] RunningMeanStd input shape: (39,) +[2023-07-17 00:32:24,376][277226] RunningMeanStd input shape: (1,) +[2023-07-17 00:32:24,434][277226] Created Actor Critic model with architecture: +[2023-07-17 00:32:24,434][277226] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -57,945 +57,799 @@ (distribution_linear): Linear(in_features=64, out_features=4, bias=True) ) ) -[2023-07-08 20:06:03,177][1063388] Worker 4 uses CPU cores [16, 17, 18, 19] -[2023-07-08 20:06:03,192][1063385] Worker 1 uses CPU cores [4, 5, 6, 7] -[2023-07-08 20:06:03,284][1063386] Worker 0 uses CPU cores [0, 1, 2, 3] -[2023-07-08 20:06:03,459][1063339] Using optimizer -[2023-07-08 20:06:03,460][1063339] No checkpoints found -[2023-07-08 20:06:03,460][1063339] Did not load from checkpoint, starting from scratch! -[2023-07-08 20:06:03,461][1063339] Initialized policy 0 weights for model version 0 -[2023-07-08 20:06:03,462][1063339] LearnerWorker_p0 finished initialization! 
-[2023-07-08 20:06:03,594][1063483] Worker 5 uses CPU cores [20, 21, 22, 23] -[2023-07-08 20:06:03,605][1063384] Worker 2 uses CPU cores [8, 9, 10, 11] -[2023-07-08 20:06:03,647][1063451] Worker 6 uses CPU cores [24, 25, 26, 27] -[2023-07-08 20:06:03,717][1063387] Worker 3 uses CPU cores [12, 13, 14, 15] -[2023-07-08 20:06:03,784][1063484] Worker 7 uses CPU cores [28, 29, 30, 31] -[2023-07-08 20:06:03,874][1063383] RunningMeanStd input shape: (39,) -[2023-07-08 20:06:03,875][1063383] RunningMeanStd input shape: (1,) -[2023-07-08 20:06:03,931][1063098] Inference worker 0-0 is ready! -[2023-07-08 20:06:03,931][1063098] All inference workers are ready! Signal rollout workers to start! -[2023-07-08 20:06:08,025][1063098] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-07-08 20:06:08,030][1063387] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,045][1063387] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,053][1063388] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,056][1063484] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,057][1063386] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,067][1063388] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,071][1063484] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,071][1063386] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,075][1063451] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,076][1063384] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,077][1063387] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,089][1063451] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,091][1063384] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,099][1063388] Decorrelating experience for 128 frames... 
-[2023-07-08 20:06:08,104][1063484] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,107][1063386] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,124][1063451] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,125][1063384] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,135][1063385] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,144][1063387] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,148][1063385] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,164][1063388] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,169][1063484] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,172][1063386] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,184][1063385] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,189][1063384] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,193][1063451] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,251][1063385] Decorrelating experience for 192 frames... -[2023-07-08 20:06:08,827][1063483] Decorrelating experience for 0 frames... -[2023-07-08 20:06:08,841][1063483] Decorrelating experience for 64 frames... -[2023-07-08 20:06:08,874][1063483] Decorrelating experience for 128 frames... -[2023-07-08 20:06:08,938][1063483] Decorrelating experience for 192 frames... -[2023-07-08 20:06:12,172][1063387] Decorrelating experience for 256 frames... -[2023-07-08 20:06:12,223][1063484] Decorrelating experience for 256 frames... -[2023-07-08 20:06:12,245][1063386] Decorrelating experience for 256 frames... -[2023-07-08 20:06:12,248][1063384] Decorrelating experience for 256 frames... -[2023-07-08 20:06:12,253][1063451] Decorrelating experience for 256 frames... -[2023-07-08 20:06:12,290][1063387] Decorrelating experience for 320 frames... -[2023-07-08 20:06:12,307][1063385] Decorrelating experience for 256 frames... 
-[2023-07-08 20:06:12,337][1063484] Decorrelating experience for 320 frames... -[2023-07-08 20:06:12,363][1063386] Decorrelating experience for 320 frames... -[2023-07-08 20:06:12,375][1063451] Decorrelating experience for 320 frames... -[2023-07-08 20:06:12,376][1063384] Decorrelating experience for 320 frames... -[2023-07-08 20:06:12,422][1063385] Decorrelating experience for 320 frames... -[2023-07-08 20:06:12,440][1063387] Decorrelating experience for 384 frames... -[2023-07-08 20:06:12,486][1063484] Decorrelating experience for 384 frames... -[2023-07-08 20:06:12,517][1063386] Decorrelating experience for 384 frames... -[2023-07-08 20:06:12,525][1063384] Decorrelating experience for 384 frames... -[2023-07-08 20:06:12,525][1063451] Decorrelating experience for 384 frames... -[2023-07-08 20:06:12,569][1063385] Decorrelating experience for 384 frames... -[2023-07-08 20:06:12,608][1063387] Decorrelating experience for 448 frames... -[2023-07-08 20:06:12,651][1063484] Decorrelating experience for 448 frames... -[2023-07-08 20:06:12,683][1063386] Decorrelating experience for 448 frames... -[2023-07-08 20:06:12,693][1063384] Decorrelating experience for 448 frames... -[2023-07-08 20:06:12,695][1063451] Decorrelating experience for 448 frames... -[2023-07-08 20:06:12,737][1063385] Decorrelating experience for 448 frames... -[2023-07-08 20:06:12,961][1063483] Decorrelating experience for 256 frames... -[2023-07-08 20:06:13,025][1063098] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 88.8. Samples: 444. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-07-08 20:06:13,025][1063098] Avg episode reward: [(0, '4.405')] -[2023-07-08 20:06:13,026][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000000_0.pth... -[2023-07-08 20:06:13,075][1063483] Decorrelating experience for 320 frames... 
-[2023-07-08 20:06:13,157][1063388] Decorrelating experience for 256 frames... -[2023-07-08 20:06:13,221][1063483] Decorrelating experience for 384 frames... -[2023-07-08 20:06:13,308][1063388] Decorrelating experience for 320 frames... -[2023-07-08 20:06:13,383][1063483] Decorrelating experience for 448 frames... -[2023-07-08 20:06:13,500][1063388] Decorrelating experience for 384 frames... -[2023-07-08 20:06:13,660][1063388] Decorrelating experience for 448 frames... -[2023-07-08 20:06:17,980][1063383] Updated weights for policy 0, policy_version 80 (0.0006) -[2023-07-08 20:06:18,025][1063098] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4096.0). Total num frames: 40960. Throughput: 0: 1954.0. Samples: 19540. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 20:06:18,025][1063098] Avg episode reward: [(0, '372.390')] -[2023-07-08 20:06:20,891][1063098] Heartbeat connected on Batcher_0 -[2023-07-08 20:06:20,894][1063098] Heartbeat connected on LearnerWorker_p0 -[2023-07-08 20:06:20,898][1063098] Heartbeat connected on InferenceWorker_p0-w0 -[2023-07-08 20:06:20,909][1063098] Heartbeat connected on RolloutWorker_w0 -[2023-07-08 20:06:20,911][1063098] Heartbeat connected on RolloutWorker_w2 -[2023-07-08 20:06:20,917][1063098] Heartbeat connected on RolloutWorker_w5 -[2023-07-08 20:06:20,922][1063098] Heartbeat connected on RolloutWorker_w1 -[2023-07-08 20:06:20,923][1063098] Heartbeat connected on RolloutWorker_w7 -[2023-07-08 20:06:20,925][1063098] Heartbeat connected on RolloutWorker_w4 -[2023-07-08 20:06:20,925][1063098] Heartbeat connected on RolloutWorker_w3 -[2023-07-08 20:06:20,928][1063098] Heartbeat connected on RolloutWorker_w6 -[2023-07-08 20:06:22,274][1063383] Updated weights for policy 0, policy_version 160 (0.0005) -[2023-07-08 20:06:23,025][1063098] Fps is (10 sec: 8601.6, 60 sec: 5734.4, 300 sec: 5734.4). Total num frames: 86016. Throughput: 0: 5220.0. Samples: 78300. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:06:23,025][1063098] Avg episode reward: [(0, '525.397')] -[2023-07-08 20:06:26,324][1063383] Updated weights for policy 0, policy_version 240 (0.0005) -[2023-07-08 20:06:28,025][1063098] Fps is (10 sec: 9830.3, 60 sec: 6963.2, 300 sec: 6963.2). Total num frames: 139264. Throughput: 0: 7014.8. Samples: 140296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:06:28,026][1063098] Avg episode reward: [(0, '568.859')] -[2023-07-08 20:06:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000272_139264.pth... -[2023-07-08 20:06:28,031][1063339] Saving new best policy, reward=568.859! -[2023-07-08 20:06:30,181][1063383] Updated weights for policy 0, policy_version 320 (0.0005) -[2023-07-08 20:06:33,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 7536.6, 300 sec: 7536.6). Total num frames: 188416. Throughput: 0: 6883.8. Samples: 172096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:06:33,025][1063098] Avg episode reward: [(0, '563.865')] -[2023-07-08 20:06:34,301][1063383] Updated weights for policy 0, policy_version 400 (0.0005) -[2023-07-08 20:06:38,024][1063098] Fps is (10 sec: 10240.2, 60 sec: 8055.5, 300 sec: 8055.5). Total num frames: 241664. Throughput: 0: 7749.4. Samples: 232480. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:06:38,025][1063098] Avg episode reward: [(0, '654.605')] -[2023-07-08 20:06:38,025][1063339] Saving new best policy, reward=654.605! -[2023-07-08 20:06:38,322][1063383] Updated weights for policy 0, policy_version 480 (0.0005) -[2023-07-08 20:06:42,508][1063383] Updated weights for policy 0, policy_version 560 (0.0005) -[2023-07-08 20:06:43,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 8309.0, 300 sec: 8309.0). Total num frames: 290816. Throughput: 0: 8311.1. Samples: 290888. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:06:43,025][1063098] Avg episode reward: [(0, '684.799')] -[2023-07-08 20:06:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000568_290816.pth... -[2023-07-08 20:06:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000000_0.pth -[2023-07-08 20:06:43,031][1063339] Saving new best policy, reward=684.799! -[2023-07-08 20:06:46,717][1063383] Updated weights for policy 0, policy_version 640 (0.0005) -[2023-07-08 20:06:48,027][1063098] Fps is (10 sec: 9828.1, 60 sec: 8498.7, 300 sec: 8498.7). Total num frames: 339968. Throughput: 0: 8030.9. Samples: 321252. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:06:48,028][1063098] Avg episode reward: [(0, '617.262')] -[2023-07-08 20:06:51,073][1063383] Updated weights for policy 0, policy_version 720 (0.0005) -[2023-07-08 20:06:53,025][1063098] Fps is (10 sec: 9420.8, 60 sec: 8556.1, 300 sec: 8556.1). Total num frames: 385024. Throughput: 0: 8394.7. Samples: 377760. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 20:06:53,025][1063098] Avg episode reward: [(0, '654.827')] -[2023-07-08 20:06:55,223][1063383] Updated weights for policy 0, policy_version 800 (0.0005) -[2023-07-08 20:06:58,024][1063098] Fps is (10 sec: 9423.0, 60 sec: 8683.5, 300 sec: 8683.5). Total num frames: 434176. Throughput: 0: 9711.1. Samples: 437444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:06:58,025][1063098] Avg episode reward: [(0, '703.285')] -[2023-07-08 20:06:58,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000848_434176.pth... 
-[2023-07-08 20:06:58,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000272_139264.pth -[2023-07-08 20:06:58,029][1063339] Saving new best policy, reward=703.285! -[2023-07-08 20:06:59,355][1063383] Updated weights for policy 0, policy_version 880 (0.0005) -[2023-07-08 20:07:03,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 8862.3, 300 sec: 8862.3). Total num frames: 487424. Throughput: 0: 9942.3. Samples: 466944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:07:03,025][1063098] Avg episode reward: [(0, '709.366')] -[2023-07-08 20:07:03,026][1063339] Saving new best policy, reward=709.366! -[2023-07-08 20:07:03,321][1063383] Updated weights for policy 0, policy_version 960 (0.0005) -[2023-07-08 20:07:07,371][1063383] Updated weights for policy 0, policy_version 1040 (0.0005) -[2023-07-08 20:07:08,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 8942.9, 300 sec: 8942.9). Total num frames: 536576. Throughput: 0: 10001.9. Samples: 528384. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:07:08,025][1063098] Avg episode reward: [(0, '708.933')] -[2023-07-08 20:07:11,268][1063383] Updated weights for policy 0, policy_version 1120 (0.0005) -[2023-07-08 20:07:13,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 9830.4, 300 sec: 9074.2). Total num frames: 589824. Throughput: 0: 9990.9. Samples: 589888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:07:13,025][1063098] Avg episode reward: [(0, '718.434')] -[2023-07-08 20:07:13,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001152_589824.pth... -[2023-07-08 20:07:13,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000568_290816.pth -[2023-07-08 20:07:13,031][1063339] Saving new best policy, reward=718.434! 
-[2023-07-08 20:07:15,468][1063383] Updated weights for policy 0, policy_version 1200 (0.0005) -[2023-07-08 20:07:18,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 9966.9, 300 sec: 9128.2). Total num frames: 638976. Throughput: 0: 9931.8. Samples: 619028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:07:18,025][1063098] Avg episode reward: [(0, '752.480')] -[2023-07-08 20:07:18,026][1063339] Saving new best policy, reward=752.480! -[2023-07-08 20:07:19,564][1063383] Updated weights for policy 0, policy_version 1280 (0.0005) -[2023-07-08 20:07:23,025][1063098] Fps is (10 sec: 9830.6, 60 sec: 10035.2, 300 sec: 9175.1). Total num frames: 688128. Throughput: 0: 9940.4. Samples: 679800. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:07:23,025][1063098] Avg episode reward: [(0, '782.824')] -[2023-07-08 20:07:23,025][1063339] Saving new best policy, reward=782.824! -[2023-07-08 20:07:23,546][1063383] Updated weights for policy 0, policy_version 1360 (0.0005) -[2023-07-08 20:07:27,427][1063383] Updated weights for policy 0, policy_version 1440 (0.0005) -[2023-07-08 20:07:28,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10035.2, 300 sec: 9267.2). Total num frames: 741376. Throughput: 0: 10078.4. Samples: 744416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:07:28,025][1063098] Avg episode reward: [(0, '781.432')] -[2023-07-08 20:07:28,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001448_741376.pth... -[2023-07-08 20:07:28,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000848_434176.pth -[2023-07-08 20:07:31,199][1063383] Updated weights for policy 0, policy_version 1520 (0.0005) -[2023-07-08 20:07:33,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10103.5, 300 sec: 9348.5). Total num frames: 794624. Throughput: 0: 10115.1. Samples: 776408. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:07:33,025][1063098] Avg episode reward: [(0, '781.342')] -[2023-07-08 20:07:35,042][1063383] Updated weights for policy 0, policy_version 1600 (0.0005) -[2023-07-08 20:07:38,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10103.4, 300 sec: 9420.8). Total num frames: 847872. Throughput: 0: 10261.2. Samples: 839516. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:07:38,025][1063098] Avg episode reward: [(0, '786.483')] -[2023-07-08 20:07:38,026][1063339] Saving new best policy, reward=786.483! -[2023-07-08 20:07:38,835][1063383] Updated weights for policy 0, policy_version 1680 (0.0005) -[2023-07-08 20:07:42,653][1063383] Updated weights for policy 0, policy_version 1760 (0.0006) -[2023-07-08 20:07:43,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10171.7, 300 sec: 9485.5). Total num frames: 901120. Throughput: 0: 10391.3. Samples: 905052. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:07:43,025][1063098] Avg episode reward: [(0, '821.915')] -[2023-07-08 20:07:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001760_901120.pth... -[2023-07-08 20:07:43,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001152_589824.pth -[2023-07-08 20:07:43,030][1063339] Saving new best policy, reward=821.915! -[2023-07-08 20:07:46,542][1063383] Updated weights for policy 0, policy_version 1840 (0.0005) -[2023-07-08 20:07:48,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10308.7, 300 sec: 9584.6). Total num frames: 958464. Throughput: 0: 10441.6. Samples: 936816. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:07:48,025][1063098] Avg episode reward: [(0, '796.543')] -[2023-07-08 20:07:50,591][1063383] Updated weights for policy 0, policy_version 1920 (0.0005) -[2023-07-08 20:07:53,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10376.5, 300 sec: 9596.3). Total num frames: 1007616. Throughput: 0: 10436.0. Samples: 998004. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 20:07:53,025][1063098] Avg episode reward: [(0, '832.509')] -[2023-07-08 20:07:53,026][1063339] Saving new best policy, reward=832.509! -[2023-07-08 20:07:54,633][1063383] Updated weights for policy 0, policy_version 2000 (0.0005) -[2023-07-08 20:07:58,025][1063098] Fps is (10 sec: 9830.3, 60 sec: 10376.5, 300 sec: 9607.0). Total num frames: 1056768. Throughput: 0: 10376.5. Samples: 1056832. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:07:58,025][1063098] Avg episode reward: [(0, '795.714')] -[2023-07-08 20:07:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002064_1056768.pth... -[2023-07-08 20:07:58,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001448_741376.pth -[2023-07-08 20:07:58,752][1063383] Updated weights for policy 0, policy_version 2080 (0.0006) -[2023-07-08 20:08:02,571][1063383] Updated weights for policy 0, policy_version 2160 (0.0005) -[2023-07-08 20:08:03,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 9652.3). Total num frames: 1110016. Throughput: 0: 10453.3. Samples: 1089428. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:08:03,025][1063098] Avg episode reward: [(0, '817.271')] -[2023-07-08 20:08:06,356][1063383] Updated weights for policy 0, policy_version 2240 (0.0004) -[2023-07-08 20:08:08,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10444.8, 300 sec: 9693.9). Total num frames: 1163264. Throughput: 0: 10520.1. 
Samples: 1153204. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:08:08,038][1063098] Avg episode reward: [(0, '830.772')] -[2023-07-08 20:08:10,282][1063383] Updated weights for policy 0, policy_version 2320 (0.0006) -[2023-07-08 20:08:13,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10376.6, 300 sec: 9699.3). Total num frames: 1212416. Throughput: 0: 10441.1. Samples: 1214264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:08:13,025][1063098] Avg episode reward: [(0, '748.282')] -[2023-07-08 20:08:13,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002368_1212416.pth... -[2023-07-08 20:08:13,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001760_901120.pth -[2023-07-08 20:08:14,272][1063383] Updated weights for policy 0, policy_version 2400 (0.0005) -[2023-07-08 20:08:18,024][1063098] Fps is (10 sec: 10240.2, 60 sec: 10444.8, 300 sec: 9735.9). Total num frames: 1265664. Throughput: 0: 10439.1. Samples: 1246168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:08:18,025][1063098] Avg episode reward: [(0, '730.117')] -[2023-07-08 20:08:18,237][1063383] Updated weights for policy 0, policy_version 2480 (0.0006) -[2023-07-08 20:08:22,238][1063383] Updated weights for policy 0, policy_version 2560 (0.0005) -[2023-07-08 20:08:23,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10513.0, 300 sec: 9769.7). Total num frames: 1318912. Throughput: 0: 10420.7. Samples: 1308448. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:08:23,025][1063098] Avg episode reward: [(0, '780.767')] -[2023-07-08 20:08:26,099][1063383] Updated weights for policy 0, policy_version 2640 (0.0005) -[2023-07-08 20:08:28,025][1063098] Fps is (10 sec: 10239.8, 60 sec: 10444.8, 300 sec: 9771.9). Total num frames: 1368064. Throughput: 0: 10370.1. Samples: 1371708. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:08:28,025][1063098] Avg episode reward: [(0, '802.417')] -[2023-07-08 20:08:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002672_1368064.pth... -[2023-07-08 20:08:28,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002064_1056768.pth -[2023-07-08 20:08:30,052][1063383] Updated weights for policy 0, policy_version 2720 (0.0005) -[2023-07-08 20:08:33,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 9802.2). Total num frames: 1421312. Throughput: 0: 10329.5. Samples: 1401644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:08:33,025][1063098] Avg episode reward: [(0, '786.106')] -[2023-07-08 20:08:33,976][1063383] Updated weights for policy 0, policy_version 2800 (0.0005) -[2023-07-08 20:08:37,577][1063383] Updated weights for policy 0, policy_version 2880 (0.0005) -[2023-07-08 20:08:38,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10513.1, 300 sec: 9857.7). Total num frames: 1478656. Throughput: 0: 10410.5. Samples: 1466476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:08:38,025][1063098] Avg episode reward: [(0, '799.419')] -[2023-07-08 20:08:41,464][1063383] Updated weights for policy 0, policy_version 2960 (0.0005) -[2023-07-08 20:08:43,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10513.1, 300 sec: 9883.3). Total num frames: 1531904. Throughput: 0: 10555.8. Samples: 1531840. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:08:43,025][1063098] Avg episode reward: [(0, '830.487')] -[2023-07-08 20:08:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002992_1531904.pth... 
-[2023-07-08 20:08:43,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002368_1212416.pth -[2023-07-08 20:08:45,458][1063383] Updated weights for policy 0, policy_version 3040 (0.0005) -[2023-07-08 20:08:48,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 9881.6). Total num frames: 1581056. Throughput: 0: 10494.1. Samples: 1561660. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:08:48,025][1063098] Avg episode reward: [(0, '747.254')] -[2023-07-08 20:08:49,343][1063383] Updated weights for policy 0, policy_version 3120 (0.0005) -[2023-07-08 20:08:52,886][1063383] Updated weights for policy 0, policy_version 3200 (0.0005) -[2023-07-08 20:08:53,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10513.1, 300 sec: 9929.7). Total num frames: 1638400. Throughput: 0: 10550.7. Samples: 1627984. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:08:53,025][1063098] Avg episode reward: [(0, '806.428')] -[2023-07-08 20:08:56,477][1063383] Updated weights for policy 0, policy_version 3280 (0.0004) -[2023-07-08 20:08:58,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10581.3, 300 sec: 9950.9). Total num frames: 1691648. Throughput: 0: 10694.8. Samples: 1695532. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:08:58,026][1063098] Avg episode reward: [(0, '810.341')] -[2023-07-08 20:08:58,055][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003312_1695744.pth... -[2023-07-08 20:08:58,057][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002672_1368064.pth -[2023-07-08 20:09:00,325][1063383] Updated weights for policy 0, policy_version 3360 (0.0005) -[2023-07-08 20:09:03,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 9970.8). Total num frames: 1744896. Throughput: 0: 10694.4. Samples: 1727416. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:09:03,025][1063098] Avg episode reward: [(0, '702.854')] -[2023-07-08 20:09:04,283][1063383] Updated weights for policy 0, policy_version 3440 (0.0005) -[2023-07-08 20:09:08,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 9989.7). Total num frames: 1798144. Throughput: 0: 10701.7. Samples: 1790024. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:09:08,025][1063098] Avg episode reward: [(0, '758.219')] -[2023-07-08 20:09:08,166][1063383] Updated weights for policy 0, policy_version 3520 (0.0005) -[2023-07-08 20:09:11,868][1063383] Updated weights for policy 0, policy_version 3600 (0.0005) -[2023-07-08 20:09:13,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10649.6, 300 sec: 10007.5). Total num frames: 1851392. Throughput: 0: 10742.2. Samples: 1855108. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:09:13,025][1063098] Avg episode reward: [(0, '790.875')] -[2023-07-08 20:09:13,048][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003624_1855488.pth... -[2023-07-08 20:09:13,049][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002992_1531904.pth -[2023-07-08 20:09:15,843][1063383] Updated weights for policy 0, policy_version 3680 (0.0005) -[2023-07-08 20:09:18,024][1063098] Fps is (10 sec: 10649.8, 60 sec: 10649.6, 300 sec: 10024.4). Total num frames: 1904640. Throughput: 0: 10749.4. Samples: 1885368. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:09:18,025][1063098] Avg episode reward: [(0, '813.871')] -[2023-07-08 20:09:19,664][1063383] Updated weights for policy 0, policy_version 3760 (0.0005) -[2023-07-08 20:09:23,024][1063098] Fps is (10 sec: 11059.4, 60 sec: 10717.9, 300 sec: 10061.5). Total num frames: 1961984. Throughput: 0: 10834.1. Samples: 1954008. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:09:23,025][1063098] Avg episode reward: [(0, '825.005')]
-[2023-07-08 20:09:23,096][1063383] Updated weights for policy 0, policy_version 3840 (0.0005)
-[2023-07-08 20:09:26,831][1063383] Updated weights for policy 0, policy_version 3920 (0.0005)
-[2023-07-08 20:09:28,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10786.1, 300 sec: 10076.2). Total num frames: 2015232. Throughput: 0: 10816.5. Samples: 2018580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:09:28,025][1063098] Avg episode reward: [(0, '806.623')]
-[2023-07-08 20:09:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003936_2015232.pth...
-[2023-07-08 20:09:28,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003312_1695744.pth
-[2023-07-08 20:09:30,875][1063383] Updated weights for policy 0, policy_version 4000 (0.0005)
-[2023-07-08 20:09:33,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10786.1, 300 sec: 10090.1). Total num frames: 2068480. Throughput: 0: 10825.6. Samples: 2048812. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:09:33,025][1063098] Avg episode reward: [(0, '827.855')]
-[2023-07-08 20:09:34,717][1063383] Updated weights for policy 0, policy_version 4080 (0.0005)
-[2023-07-08 20:09:38,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10717.9, 300 sec: 10103.5). Total num frames: 2121728. Throughput: 0: 10766.5. Samples: 2112476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:09:38,025][1063098] Avg episode reward: [(0, '804.111')]
-[2023-07-08 20:09:38,806][1063383] Updated weights for policy 0, policy_version 4160 (0.0004)
-[2023-07-08 20:09:42,840][1063383] Updated weights for policy 0, policy_version 4240 (0.0006)
-[2023-07-08 20:09:43,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10649.6, 300 sec: 10097.1). Total num frames: 2170880. Throughput: 0: 10595.0. Samples: 2172308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:09:43,025][1063098] Avg episode reward: [(0, '809.408')]
-[2023-07-08 20:09:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004240_2170880.pth...
-[2023-07-08 20:09:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003624_1855488.pth
-[2023-07-08 20:09:46,628][1063383] Updated weights for policy 0, policy_version 4320 (0.0005)
-[2023-07-08 20:09:48,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10717.9, 300 sec: 10109.7). Total num frames: 2224128. Throughput: 0: 10609.2. Samples: 2204832. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:09:48,025][1063098] Avg episode reward: [(0, '826.646')]
-[2023-07-08 20:09:50,651][1063383] Updated weights for policy 0, policy_version 4400 (0.0006)
-[2023-07-08 20:09:53,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10581.3, 300 sec: 10103.5). Total num frames: 2273280. Throughput: 0: 10568.1. Samples: 2265588. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:09:53,025][1063098] Avg episode reward: [(0, '840.987')]
-[2023-07-08 20:09:53,025][1063339] Saving new best policy, reward=840.987!
-[2023-07-08 20:09:54,718][1063383] Updated weights for policy 0, policy_version 4480 (0.0005)
-[2023-07-08 20:09:58,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10581.3, 300 sec: 10115.3). Total num frames: 2326528. Throughput: 0: 10532.6. Samples: 2329076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:09:58,025][1063098] Avg episode reward: [(0, '842.234')]
-[2023-07-08 20:09:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004544_2326528.pth...
-[2023-07-08 20:09:58,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003936_2015232.pth
-[2023-07-08 20:09:58,031][1063339] Saving new best policy, reward=842.234!
-[2023-07-08 20:09:58,576][1063383] Updated weights for policy 0, policy_version 4560 (0.0005)
-[2023-07-08 20:10:02,463][1063383] Updated weights for policy 0, policy_version 4640 (0.0005)
-[2023-07-08 20:10:03,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10581.3, 300 sec: 10126.7). Total num frames: 2379776. Throughput: 0: 10538.5. Samples: 2359600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:10:03,025][1063098] Avg episode reward: [(0, '834.959')]
-[2023-07-08 20:10:06,291][1063383] Updated weights for policy 0, policy_version 4720 (0.0005)
-[2023-07-08 20:10:08,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10581.3, 300 sec: 10137.6). Total num frames: 2433024. Throughput: 0: 10457.5. Samples: 2424596. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 20:10:08,025][1063098] Avg episode reward: [(0, '816.909')]
-[2023-07-08 20:10:10,283][1063383] Updated weights for policy 0, policy_version 4800 (0.0006)
-[2023-07-08 20:10:13,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10513.1, 300 sec: 10131.3). Total num frames: 2482176. Throughput: 0: 10392.3. Samples: 2486236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:10:13,025][1063098] Avg episode reward: [(0, '824.821')]
-[2023-07-08 20:10:13,059][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004856_2486272.pth...
-[2023-07-08 20:10:13,060][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004240_2170880.pth
-[2023-07-08 20:10:14,277][1063383] Updated weights for policy 0, policy_version 4880 (0.0005)
-[2023-07-08 20:10:18,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10141.7). Total num frames: 2535424. Throughput: 0: 10377.4. Samples: 2515796. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:10:18,025][1063098] Avg episode reward: [(0, '833.799')]
-[2023-07-08 20:10:18,139][1063383] Updated weights for policy 0, policy_version 4960 (0.0005)
-[2023-07-08 20:10:21,947][1063383] Updated weights for policy 0, policy_version 5040 (0.0005)
-[2023-07-08 20:10:23,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10513.1, 300 sec: 10167.7). Total num frames: 2592768. Throughput: 0: 10401.5. Samples: 2580544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:10:23,025][1063098] Avg episode reward: [(0, '813.854')]
-[2023-07-08 20:10:25,653][1063383] Updated weights for policy 0, policy_version 5120 (0.0005)
-[2023-07-08 20:10:28,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10161.2). Total num frames: 2641920. Throughput: 0: 10516.6. Samples: 2645552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:10:28,025][1063098] Avg episode reward: [(0, '837.824')]
-[2023-07-08 20:10:28,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005160_2641920.pth...
-[2023-07-08 20:10:28,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004544_2326528.pth
-[2023-07-08 20:10:29,829][1063383] Updated weights for policy 0, policy_version 5200 (0.0005)
-[2023-07-08 20:10:33,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10170.4). Total num frames: 2695168. Throughput: 0: 10444.6. Samples: 2674840. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 20:10:33,025][1063098] Avg episode reward: [(0, '851.236')]
-[2023-07-08 20:10:33,026][1063339] Saving new best policy, reward=851.236!
-[2023-07-08 20:10:33,609][1063383] Updated weights for policy 0, policy_version 5280 (0.0006)
-[2023-07-08 20:10:37,572][1063383] Updated weights for policy 0, policy_version 5360 (0.0005)
-[2023-07-08 20:10:38,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10179.3). Total num frames: 2748416. Throughput: 0: 10515.2. Samples: 2738772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:10:38,025][1063098] Avg episode reward: [(0, '843.582')]
-[2023-07-08 20:10:41,133][1063383] Updated weights for policy 0, policy_version 5440 (0.0005)
-[2023-07-08 20:10:43,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10187.9). Total num frames: 2801664. Throughput: 0: 10585.4. Samples: 2805420. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:10:43,025][1063098] Avg episode reward: [(0, '848.824')]
-[2023-07-08 20:10:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005472_2801664.pth...
-[2023-07-08 20:10:43,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004856_2486272.pth
-[2023-07-08 20:10:44,960][1063383] Updated weights for policy 0, policy_version 5520 (0.0005)
-[2023-07-08 20:10:48,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10196.1). Total num frames: 2854912. Throughput: 0: 10579.3. Samples: 2835668. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:10:48,025][1063098] Avg episode reward: [(0, '848.818')]
-[2023-07-08 20:10:48,984][1063383] Updated weights for policy 0, policy_version 5600 (0.0004)
-[2023-07-08 20:10:52,989][1063383] Updated weights for policy 0, policy_version 5680 (0.0005)
-[2023-07-08 20:10:53,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10204.1). Total num frames: 2908160. Throughput: 0: 10517.4. Samples: 2897880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:10:53,025][1063098] Avg episode reward: [(0, '833.154')]
-[2023-07-08 20:10:56,927][1063383] Updated weights for policy 0, policy_version 5760 (0.0005)
-[2023-07-08 20:10:58,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10197.6). Total num frames: 2957312. Throughput: 0: 10521.4. Samples: 2959700. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 20:10:58,025][1063098] Avg episode reward: [(0, '847.602')]
-[2023-07-08 20:10:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005776_2957312.pth...
-[2023-07-08 20:10:58,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005160_2641920.pth
-[2023-07-08 20:11:00,845][1063383] Updated weights for policy 0, policy_version 5840 (0.0005)
-[2023-07-08 20:11:03,024][1063098] Fps is (10 sec: 10240.1, 60 sec: 10513.1, 300 sec: 10205.3). Total num frames: 3010560. Throughput: 0: 10565.9. Samples: 2991260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:03,025][1063098] Avg episode reward: [(0, '829.747')]
-[2023-07-08 20:11:04,878][1063383] Updated weights for policy 0, policy_version 5920 (0.0005)
-[2023-07-08 20:11:08,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10513.1, 300 sec: 10385.8). Total num frames: 3063808. Throughput: 0: 10505.0. Samples: 3053268. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:08,025][1063098] Avg episode reward: [(0, '824.155')]
-[2023-07-08 20:11:08,557][1063383] Updated weights for policy 0, policy_version 6000 (0.0005)
-[2023-07-08 20:11:12,690][1063383] Updated weights for policy 0, policy_version 6080 (0.0005)
-[2023-07-08 20:11:13,025][1063098] Fps is (10 sec: 10239.8, 60 sec: 10513.1, 300 sec: 10413.6). Total num frames: 3112960. Throughput: 0: 10468.1. Samples: 3116616. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 20:11:13,025][1063098] Avg episode reward: [(0, '750.220')]
-[2023-07-08 20:11:13,067][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006088_3117056.pth...
-[2023-07-08 20:11:13,069][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005472_2801664.pth
-[2023-07-08 20:11:16,651][1063383] Updated weights for policy 0, policy_version 6160 (0.0005)
-[2023-07-08 20:11:18,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10513.1, 300 sec: 10441.3). Total num frames: 3166208. Throughput: 0: 10501.7. Samples: 3147416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 20:11:18,025][1063098] Avg episode reward: [(0, '803.599')]
-[2023-07-08 20:11:20,629][1063383] Updated weights for policy 0, policy_version 6240 (0.0005)
-[2023-07-08 20:11:23,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 10427.4). Total num frames: 3215360. Throughput: 0: 10419.9. Samples: 3207668. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 20:11:23,025][1063098] Avg episode reward: [(0, '813.616')]
-[2023-07-08 20:11:24,705][1063383] Updated weights for policy 0, policy_version 6320 (0.0006)
-[2023-07-08 20:11:28,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10441.3). Total num frames: 3268608. Throughput: 0: 10348.5. Samples: 3271104. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:11:28,025][1063098] Avg episode reward: [(0, '844.186')]
-[2023-07-08 20:11:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006384_3268608.pth...
-[2023-07-08 20:11:28,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005776_2957312.pth
-[2023-07-08 20:11:28,531][1063383] Updated weights for policy 0, policy_version 6400 (0.0006)
-[2023-07-08 20:11:32,681][1063383] Updated weights for policy 0, policy_version 6480 (0.0005)
-[2023-07-08 20:11:33,024][1063098] Fps is (10 sec: 10240.1, 60 sec: 10376.6, 300 sec: 10427.4). Total num frames: 3317760. Throughput: 0: 10346.3. Samples: 3301252. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:11:33,025][1063098] Avg episode reward: [(0, '807.695')]
-[2023-07-08 20:11:36,595][1063383] Updated weights for policy 0, policy_version 6560 (0.0005)
-[2023-07-08 20:11:38,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10441.3). Total num frames: 3371008. Throughput: 0: 10333.3. Samples: 3362880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:38,025][1063098] Avg episode reward: [(0, '844.831')]
-[2023-07-08 20:11:40,457][1063383] Updated weights for policy 0, policy_version 6640 (0.0005)
-[2023-07-08 20:11:40,804][1063339] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000006
-[2023-07-08 20:11:43,025][1063098] Fps is (10 sec: 11059.0, 60 sec: 10444.8, 300 sec: 10469.2). Total num frames: 3428352. Throughput: 0: 10413.6. Samples: 3428312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:43,025][1063098] Avg episode reward: [(0, '847.225')]
-[2023-07-08 20:11:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006696_3428352.pth...
-[2023-07-08 20:11:43,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006088_3117056.pth
-[2023-07-08 20:11:44,203][1063383] Updated weights for policy 0, policy_version 6720 (0.0005)
-[2023-07-08 20:11:47,767][1063383] Updated weights for policy 0, policy_version 6800 (0.0005)
-[2023-07-08 20:11:48,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10444.8, 300 sec: 10496.9). Total num frames: 3481600. Throughput: 0: 10442.7. Samples: 3461184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:48,025][1063098] Avg episode reward: [(0, '817.897')]
-[2023-07-08 20:11:51,519][1063383] Updated weights for policy 0, policy_version 6880 (0.0005)
-[2023-07-08 20:11:53,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10444.8, 300 sec: 10510.8). Total num frames: 3534848. Throughput: 0: 10521.2. Samples: 3526720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:53,025][1063098] Avg episode reward: [(0, '839.917')]
-[2023-07-08 20:11:55,261][1063383] Updated weights for policy 0, policy_version 6960 (0.0005)
-[2023-07-08 20:11:58,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10510.7). Total num frames: 3588096. Throughput: 0: 10550.8. Samples: 3591400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:11:58,025][1063098] Avg episode reward: [(0, '827.547')]
-[2023-07-08 20:11:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007008_3588096.pth...
-[2023-07-08 20:11:58,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006384_3268608.pth
-[2023-07-08 20:11:59,272][1063383] Updated weights for policy 0, policy_version 7040 (0.0005)
-[2023-07-08 20:12:03,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10524.6). Total num frames: 3641344. Throughput: 0: 10570.9. Samples: 3623104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:03,025][1063098] Avg episode reward: [(0, '829.140')]
-[2023-07-08 20:12:03,144][1063383] Updated weights for policy 0, policy_version 7120 (0.0005)
-[2023-07-08 20:12:06,891][1063383] Updated weights for policy 0, policy_version 7200 (0.0005)
-[2023-07-08 20:12:08,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10581.3, 300 sec: 10538.5). Total num frames: 3698688. Throughput: 0: 10650.3. Samples: 3686932. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:08,026][1063098] Avg episode reward: [(0, '844.145')]
-[2023-07-08 20:12:10,520][1063383] Updated weights for policy 0, policy_version 7280 (0.0005)
-[2023-07-08 20:12:13,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10649.6, 300 sec: 10552.4). Total num frames: 3751936. Throughput: 0: 10723.3. Samples: 3753652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:13,026][1063098] Avg episode reward: [(0, '839.090')]
-[2023-07-08 20:12:13,029][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007328_3751936.pth...
-[2023-07-08 20:12:13,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006696_3428352.pth
-[2023-07-08 20:12:14,382][1063383] Updated weights for policy 0, policy_version 7360 (0.0005)
-[2023-07-08 20:12:17,707][1063383] Updated weights for policy 0, policy_version 7440 (0.0005)
-[2023-07-08 20:12:18,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10717.9, 300 sec: 10580.2). Total num frames: 3809280. Throughput: 0: 10806.9. Samples: 3787564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:18,025][1063098] Avg episode reward: [(0, '854.024')]
-[2023-07-08 20:12:18,037][1063339] Saving new best policy, reward=854.024!
-[2023-07-08 20:12:21,624][1063383] Updated weights for policy 0, policy_version 7520 (0.0005)
-[2023-07-08 20:12:23,024][1063098] Fps is (10 sec: 11059.3, 60 sec: 10786.1, 300 sec: 10580.2). Total num frames: 3862528. Throughput: 0: 10920.0. Samples: 3854280. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:23,025][1063098] Avg episode reward: [(0, '836.246')]
-[2023-07-08 20:12:25,180][1063383] Updated weights for policy 0, policy_version 7600 (0.0005)
-[2023-07-08 20:12:28,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10854.4, 300 sec: 10594.1). Total num frames: 3919872. Throughput: 0: 10930.4. Samples: 3920180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:28,025][1063098] Avg episode reward: [(0, '843.957')]
-[2023-07-08 20:12:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007656_3919872.pth...
-[2023-07-08 20:12:28,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007008_3588096.pth
-[2023-07-08 20:12:29,076][1063383] Updated weights for policy 0, policy_version 7680 (0.0006)
-[2023-07-08 20:12:32,959][1063383] Updated weights for policy 0, policy_version 7760 (0.0005)
-[2023-07-08 20:12:33,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10922.6, 300 sec: 10594.1). Total num frames: 3973120. Throughput: 0: 10903.2. Samples: 3951828. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 20:12:33,025][1063098] Avg episode reward: [(0, '831.052')]
-[2023-07-08 20:12:36,617][1063383] Updated weights for policy 0, policy_version 7840 (0.0005)
-[2023-07-08 20:12:38,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10922.7, 300 sec: 10594.1). Total num frames: 4026368. Throughput: 0: 10918.3. Samples: 4018044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:38,025][1063098] Avg episode reward: [(0, '832.250')]
-[2023-07-08 20:12:40,507][1063383] Updated weights for policy 0, policy_version 7920 (0.0005)
-[2023-07-08 20:12:43,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10854.4, 300 sec: 10580.2). Total num frames: 4079616. Throughput: 0: 10877.6. Samples: 4080892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:43,025][1063098] Avg episode reward: [(0, '825.648')]
-[2023-07-08 20:12:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007968_4079616.pth...
-[2023-07-08 20:12:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007328_3751936.pth
-[2023-07-08 20:12:44,203][1063383] Updated weights for policy 0, policy_version 8000 (0.0005)
-[2023-07-08 20:12:48,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10854.4, 300 sec: 10594.1). Total num frames: 4132864. Throughput: 0: 10923.3. Samples: 4114652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:12:48,025][1063098] Avg episode reward: [(0, '855.862')]
-[2023-07-08 20:12:48,026][1063339] Saving new best policy, reward=855.862!
-[2023-07-08 20:12:48,182][1063383] Updated weights for policy 0, policy_version 8080 (0.0005)
-[2023-07-08 20:12:51,815][1063383] Updated weights for policy 0, policy_version 8160 (0.0005)
-[2023-07-08 20:12:53,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10922.7, 300 sec: 10621.8). Total num frames: 4190208. Throughput: 0: 10948.6. Samples: 4179620. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 20:12:53,025][1063098] Avg episode reward: [(0, '843.824')]
-[2023-07-08 20:12:55,416][1063383] Updated weights for policy 0, policy_version 8240 (0.0005)
-[2023-07-08 20:12:58,025][1063098] Fps is (10 sec: 11468.8, 60 sec: 10990.9, 300 sec: 10635.7). Total num frames: 4247552. Throughput: 0: 10991.4. Samples: 4248264. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 20:12:58,025][1063098] Avg episode reward: [(0, '854.846')]
-[2023-07-08 20:12:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008296_4247552.pth...
-[2023-07-08 20:12:58,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007656_3919872.pth
-[2023-07-08 20:12:59,083][1063383] Updated weights for policy 0, policy_version 8320 (0.0005)
-[2023-07-08 20:13:02,811][1063383] Updated weights for policy 0, policy_version 8400 (0.0006)
-[2023-07-08 20:13:03,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 10635.7). Total num frames: 4300800. Throughput: 0: 10947.5. Samples: 4280200. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:13:03,025][1063098] Avg episode reward: [(0, '837.872')]
-[2023-07-08 20:13:06,657][1063383] Updated weights for policy 0, policy_version 8480 (0.0005)
-[2023-07-08 20:13:08,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10922.7, 300 sec: 10649.6). Total num frames: 4354048. Throughput: 0: 10920.6. Samples: 4345708. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:13:08,025][1063098] Avg episode reward: [(0, '858.074')]
-[2023-07-08 20:13:08,026][1063339] Saving new best policy, reward=858.074!
-[2023-07-08 20:13:10,605][1063383] Updated weights for policy 0, policy_version 8560 (0.0005)
-[2023-07-08 20:13:13,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10922.7, 300 sec: 10649.6). Total num frames: 4407296. Throughput: 0: 10863.3. Samples: 4409028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:13:13,025][1063098] Avg episode reward: [(0, '862.432')]
-[2023-07-08 20:13:13,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008608_4407296.pth...
-[2023-07-08 20:13:13,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007968_4079616.pth
-[2023-07-08 20:13:13,031][1063339] Saving new best policy, reward=862.432!
-[2023-07-08 20:13:14,330][1063383] Updated weights for policy 0, policy_version 8640 (0.0005)
-[2023-07-08 20:13:18,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10854.4, 300 sec: 10649.6). Total num frames: 4460544. Throughput: 0: 10890.7. Samples: 4441912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:13:18,025][1063098] Avg episode reward: [(0, '795.972')]
-[2023-07-08 20:13:18,297][1063383] Updated weights for policy 0, policy_version 8720 (0.0005)
-[2023-07-08 20:13:22,418][1063383] Updated weights for policy 0, policy_version 8800 (0.0005)
-[2023-07-08 20:13:23,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10786.1, 300 sec: 10649.6). Total num frames: 4509696. Throughput: 0: 10743.5. Samples: 4501504. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 20:13:23,025][1063098] Avg episode reward: [(0, '829.157')]
-[2023-07-08 20:13:26,331][1063383] Updated weights for policy 0, policy_version 8880 (0.0005)
-[2023-07-08 20:13:28,025][1063098] Fps is (10 sec: 9830.4, 60 sec: 10649.6, 300 sec: 10635.7). Total num frames: 4558848. Throughput: 0: 10709.9. Samples: 4562840. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:13:28,025][1063098] Avg episode reward: [(0, '841.959')]
-[2023-07-08 20:13:28,030][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008912_4562944.pth...
-[2023-07-08 20:13:28,032][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008296_4247552.pth
-[2023-07-08 20:13:30,331][1063383] Updated weights for policy 0, policy_version 8960 (0.0005)
-[2023-07-08 20:13:33,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10717.9, 300 sec: 10635.7). Total num frames: 4616192. Throughput: 0: 10683.7. Samples: 4595420. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:13:33,025][1063098] Avg episode reward: [(0, '858.872')]
-[2023-07-08 20:13:34,307][1063383] Updated weights for policy 0, policy_version 9040 (0.0005)
-[2023-07-08 20:13:38,024][1063098] Fps is (10 sec: 10649.8, 60 sec: 10649.6, 300 sec: 10621.8). Total num frames: 4665344. Throughput: 0: 10611.8. Samples: 4657152. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 20:13:38,025][1063098] Avg episode reward: [(0, '856.199')]
-[2023-07-08 20:13:38,181][1063383] Updated weights for policy 0, policy_version 9120 (0.0005)
-[2023-07-08 20:13:41,949][1063383] Updated weights for policy 0, policy_version 9200 (0.0005)
-[2023-07-08 20:13:43,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10649.6, 300 sec: 10635.7). Total num frames: 4718592. Throughput: 0: 10520.5. Samples: 4721684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:13:43,025][1063098] Avg episode reward: [(0, '852.258')]
-[2023-07-08 20:13:43,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009216_4718592.pth...
-[2023-07-08 20:13:43,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008608_4407296.pth
-[2023-07-08 20:13:45,819][1063383] Updated weights for policy 0, policy_version 9280 (0.0005)
-[2023-07-08 20:13:48,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10717.9, 300 sec: 10635.7). Total num frames: 4775936. Throughput: 0: 10507.1. Samples: 4753020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:13:48,025][1063098] Avg episode reward: [(0, '865.325')]
-[2023-07-08 20:13:48,026][1063339] Saving new best policy, reward=865.325!
-[2023-07-08 20:13:49,386][1063383] Updated weights for policy 0, policy_version 9360 (0.0005)
-[2023-07-08 20:13:53,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10649.6, 300 sec: 10635.7). Total num frames: 4829184. Throughput: 0: 10541.5. Samples: 4820076. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
-[2023-07-08 20:13:53,025][1063098] Avg episode reward: [(0, '861.077')]
-[2023-07-08 20:13:53,291][1063383] Updated weights for policy 0, policy_version 9440 (0.0005)
-[2023-07-08 20:13:57,120][1063383] Updated weights for policy 0, policy_version 9520 (0.0005)
-[2023-07-08 20:13:58,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 4882432. Throughput: 0: 10532.6. Samples: 4882996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:13:58,025][1063098] Avg episode reward: [(0, '845.046')]
-[2023-07-08 20:13:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009536_4882432.pth...
-[2023-07-08 20:13:58,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008912_4562944.pth
-[2023-07-08 20:14:00,948][1063383] Updated weights for policy 0, policy_version 9600 (0.0005)
-[2023-07-08 20:14:03,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 4935680. Throughput: 0: 10519.0. Samples: 4915264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:14:03,025][1063098] Avg episode reward: [(0, '859.897')]
-[2023-07-08 20:14:04,764][1063383] Updated weights for policy 0, policy_version 9680 (0.0005)
-[2023-07-08 20:14:08,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 4988928. Throughput: 0: 10630.2. Samples: 4979864. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 20:14:08,025][1063098] Avg episode reward: [(0, '860.037')]
-[2023-07-08 20:14:08,607][1063383] Updated weights for policy 0, policy_version 9760 (0.0005)
-[2023-07-08 20:14:12,705][1063383] Updated weights for policy 0, policy_version 9840 (0.0004)
-[2023-07-08 20:14:13,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10621.8). Total num frames: 5038080. Throughput: 0: 10640.8. Samples: 5041676. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 20:14:13,025][1063098] Avg episode reward: [(0, '870.349')]
-[2023-07-08 20:14:13,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009848_5042176.pth...
-[2023-07-08 20:14:13,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009216_4718592.pth
-[2023-07-08 20:14:13,030][1063339] Saving new best policy, reward=870.349!
-[2023-07-08 20:14:16,363][1063383] Updated weights for policy 0, policy_version 9920 (0.0006)
-[2023-07-08 20:14:18,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10621.8). Total num frames: 5095424. Throughput: 0: 10640.9. Samples: 5074260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 20:14:18,025][1063098] Avg episode reward: [(0, '853.779')]
-[2023-07-08 20:14:20,198][1063383] Updated weights for policy 0, policy_version 10000 (0.0005)
-[2023-07-08 20:14:23,024][1063098] Fps is (10 sec: 11059.3, 60 sec: 10649.6, 300 sec: 10621.8). Total num frames: 5148672. Throughput: 0: 10718.8. Samples: 5139500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:14:23,025][1063098] Avg episode reward: [(0, '862.510')]
-[2023-07-08 20:14:24,058][1063383] Updated weights for policy 0, policy_version 10080 (0.0005)
-[2023-07-08 20:14:28,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10649.6, 300 sec: 10607.9). Total num frames: 5197824. Throughput: 0: 10609.2. Samples: 5199100. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:14:28,025][1063098] Avg episode reward: [(0, '863.733')]
-[2023-07-08 20:14:28,029][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010152_5197824.pth...
-[2023-07-08 20:14:28,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009536_4882432.pth
-[2023-07-08 20:14:28,238][1063383] Updated weights for policy 0, policy_version 10160 (0.0005)
-[2023-07-08 20:14:32,068][1063383] Updated weights for policy 0, policy_version 10240 (0.0004)
-[2023-07-08 20:14:33,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10581.3, 300 sec: 10607.9). Total num frames: 5251072. Throughput: 0: 10634.9. Samples: 5231592. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 20:14:33,025][1063098] Avg episode reward: [(0, '847.126')]
-[2023-07-08 20:14:35,920][1063383] Updated weights for policy 0, policy_version 10320 (0.0005)
-[2023-07-08 20:14:38,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10649.6, 300 sec: 10621.8). Total num frames: 5304320. Throughput: 0: 10558.0. Samples: 5295184. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 20:14:38,025][1063098] Avg episode reward: [(0, '859.672')]
-[2023-07-08 20:14:39,789][1063383] Updated weights for policy 0, policy_version 10400 (0.0005)
-[2023-07-08 20:14:43,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10649.6, 300 sec: 10621.8). Total num frames: 5357568. Throughput: 0: 10560.1. Samples: 5358200. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 20:14:43,025][1063098] Avg episode reward: [(0, '864.618')]
-[2023-07-08 20:14:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010464_5357568.pth...
-[2023-07-08 20:14:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009848_5042176.pth -[2023-07-08 20:14:43,631][1063383] Updated weights for policy 0, policy_version 10480 (0.0005) -[2023-07-08 20:14:47,548][1063383] Updated weights for policy 0, policy_version 10560 (0.0005) -[2023-07-08 20:14:48,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 5410816. Throughput: 0: 10541.6. Samples: 5389636. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:14:48,025][1063098] Avg episode reward: [(0, '864.034')] -[2023-07-08 20:14:51,276][1063383] Updated weights for policy 0, policy_version 10640 (0.0005) -[2023-07-08 20:14:53,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 5464064. Throughput: 0: 10562.2. Samples: 5455164. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:14:53,025][1063098] Avg episode reward: [(0, '855.767')] -[2023-07-08 20:14:55,212][1063383] Updated weights for policy 0, policy_version 10720 (0.0005) -[2023-07-08 20:14:58,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 5517312. Throughput: 0: 10582.9. Samples: 5517908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:14:58,025][1063098] Avg episode reward: [(0, '851.540')] -[2023-07-08 20:14:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010776_5517312.pth... 
-[2023-07-08 20:14:58,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010152_5197824.pth -[2023-07-08 20:14:59,023][1063383] Updated weights for policy 0, policy_version 10800 (0.0005) -[2023-07-08 20:15:02,852][1063383] Updated weights for policy 0, policy_version 10880 (0.0006) -[2023-07-08 20:15:03,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10635.7). Total num frames: 5570560. Throughput: 0: 10573.4. Samples: 5550064. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:15:03,025][1063098] Avg episode reward: [(0, '848.841')] -[2023-07-08 20:15:06,553][1063383] Updated weights for policy 0, policy_version 10960 (0.0005) -[2023-07-08 20:15:08,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10649.6, 300 sec: 10663.5). Total num frames: 5627904. Throughput: 0: 10581.8. Samples: 5615680. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:15:08,025][1063098] Avg episode reward: [(0, '870.601')] -[2023-07-08 20:15:08,026][1063339] Saving new best policy, reward=870.601! -[2023-07-08 20:15:10,180][1063383] Updated weights for policy 0, policy_version 11040 (0.0005) -[2023-07-08 20:15:13,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10717.9, 300 sec: 10663.5). Total num frames: 5681152. Throughput: 0: 10730.2. Samples: 5681960. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:15:13,025][1063098] Avg episode reward: [(0, '877.350')] -[2023-07-08 20:15:13,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011096_5681152.pth... -[2023-07-08 20:15:13,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010464_5357568.pth -[2023-07-08 20:15:13,031][1063339] Saving new best policy, reward=877.350! 
-[2023-07-08 20:15:14,116][1063383] Updated weights for policy 0, policy_version 11120 (0.0005) -[2023-07-08 20:15:17,926][1063383] Updated weights for policy 0, policy_version 11200 (0.0006) -[2023-07-08 20:15:18,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10649.6, 300 sec: 10649.6). Total num frames: 5734400. Throughput: 0: 10715.2. Samples: 5713776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-07-08 20:15:18,025][1063098] Avg episode reward: [(0, '869.143')] -[2023-07-08 20:15:21,705][1063383] Updated weights for policy 0, policy_version 11280 (0.0005) -[2023-07-08 20:15:23,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10649.6, 300 sec: 10663.5). Total num frames: 5787648. Throughput: 0: 10736.8. Samples: 5778340. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-07-08 20:15:23,025][1063098] Avg episode reward: [(0, '862.510')] -[2023-07-08 20:15:25,620][1063383] Updated weights for policy 0, policy_version 11360 (0.0005) -[2023-07-08 20:15:28,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10717.9, 300 sec: 10663.5). Total num frames: 5840896. Throughput: 0: 10776.6. Samples: 5843148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:15:28,025][1063098] Avg episode reward: [(0, '859.206')] -[2023-07-08 20:15:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011408_5840896.pth... -[2023-07-08 20:15:28,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010776_5517312.pth -[2023-07-08 20:15:29,333][1063383] Updated weights for policy 0, policy_version 11440 (0.0005) -[2023-07-08 20:15:33,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10717.9, 300 sec: 10663.5). Total num frames: 5894144. Throughput: 0: 10787.8. Samples: 5875088. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:15:33,025][1063098] Avg episode reward: [(0, '851.159')] -[2023-07-08 20:15:33,111][1063383] Updated weights for policy 0, policy_version 11520 (0.0005) -[2023-07-08 20:15:36,974][1063383] Updated weights for policy 0, policy_version 11600 (0.0005) -[2023-07-08 20:15:38,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10717.9, 300 sec: 10663.5). Total num frames: 5947392. Throughput: 0: 10756.6. Samples: 5939212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:15:38,025][1063098] Avg episode reward: [(0, '858.911')] -[2023-07-08 20:15:40,968][1063383] Updated weights for policy 0, policy_version 11680 (0.0005) -[2023-07-08 20:15:43,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10717.9, 300 sec: 10663.5). Total num frames: 6000640. Throughput: 0: 10726.7. Samples: 6000608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:15:43,025][1063098] Avg episode reward: [(0, '856.363')] -[2023-07-08 20:15:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011720_6000640.pth... -[2023-07-08 20:15:43,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011096_5681152.pth -[2023-07-08 20:15:44,953][1063383] Updated weights for policy 0, policy_version 11760 (0.0006) -[2023-07-08 20:15:48,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10717.9, 300 sec: 10663.5). Total num frames: 6053888. Throughput: 0: 10722.0. Samples: 6032552. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:15:48,159][1063098] Avg episode reward: [(0, '834.372')] -[2023-07-08 20:15:48,577][1063383] Updated weights for policy 0, policy_version 11840 (0.0005) -[2023-07-08 20:15:52,212][1063383] Updated weights for policy 0, policy_version 11920 (0.0005) -[2023-07-08 20:15:53,024][1063098] Fps is (10 sec: 11059.3, 60 sec: 10786.1, 300 sec: 10691.3). 
Total num frames: 6111232. Throughput: 0: 10776.1. Samples: 6100604. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:15:53,025][1063098] Avg episode reward: [(0, '840.400')] -[2023-07-08 20:15:56,050][1063383] Updated weights for policy 0, policy_version 12000 (0.0005) -[2023-07-08 20:15:58,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10786.2, 300 sec: 10691.3). Total num frames: 6164480. Throughput: 0: 10722.7. Samples: 6164480. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:15:58,025][1063098] Avg episode reward: [(0, '842.629')] -[2023-07-08 20:15:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012040_6164480.pth... -[2023-07-08 20:15:58,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011408_5840896.pth -[2023-07-08 20:15:59,834][1063383] Updated weights for policy 0, policy_version 12080 (0.0004) -[2023-07-08 20:16:03,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10786.1, 300 sec: 10691.3). Total num frames: 6217728. Throughput: 0: 10763.1. Samples: 6198116. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:16:03,026][1063098] Avg episode reward: [(0, '846.958')] -[2023-07-08 20:16:03,615][1063383] Updated weights for policy 0, policy_version 12160 (0.0005) -[2023-07-08 20:16:07,238][1063383] Updated weights for policy 0, policy_version 12240 (0.0005) -[2023-07-08 20:16:08,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 6275072. Throughput: 0: 10787.0. Samples: 6263752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:16:08,025][1063098] Avg episode reward: [(0, '843.470')] -[2023-07-08 20:16:11,098][1063383] Updated weights for policy 0, policy_version 12320 (0.0004) -[2023-07-08 20:16:13,024][1063098] Fps is (10 sec: 10649.7, 60 sec: 10717.9, 300 sec: 10705.1). Total num frames: 6324224. 
Throughput: 0: 10781.0. Samples: 6328292. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:16:13,025][1063098] Avg episode reward: [(0, '857.354')] -[2023-07-08 20:16:13,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012360_6328320.pth... -[2023-07-08 20:16:13,029][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011720_6000640.pth -[2023-07-08 20:16:14,979][1063383] Updated weights for policy 0, policy_version 12400 (0.0005) -[2023-07-08 20:16:18,024][1063098] Fps is (10 sec: 10649.6, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 6381568. Throughput: 0: 10800.0. Samples: 6361088. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:16:18,026][1063098] Avg episode reward: [(0, '852.305')] -[2023-07-08 20:16:18,668][1063383] Updated weights for policy 0, policy_version 12480 (0.0005) -[2023-07-08 20:16:22,444][1063383] Updated weights for policy 0, policy_version 12560 (0.0004) -[2023-07-08 20:16:23,025][1063098] Fps is (10 sec: 11059.0, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 6434816. Throughput: 0: 10821.7. Samples: 6426188. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:16:23,026][1063098] Avg episode reward: [(0, '861.998')] -[2023-07-08 20:16:26,175][1063383] Updated weights for policy 0, policy_version 12640 (0.0005) -[2023-07-08 20:16:28,025][1063098] Fps is (10 sec: 11059.0, 60 sec: 10854.4, 300 sec: 10760.7). Total num frames: 6492160. Throughput: 0: 10921.6. Samples: 6492080. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:16:28,025][1063098] Avg episode reward: [(0, '860.369')] -[2023-07-08 20:16:28,029][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012680_6492160.pth... 
-[2023-07-08 20:16:28,032][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012040_6164480.pth -[2023-07-08 20:16:29,856][1063383] Updated weights for policy 0, policy_version 12720 (0.0005) -[2023-07-08 20:16:33,025][1063098] Fps is (10 sec: 11059.4, 60 sec: 10854.4, 300 sec: 10760.7). Total num frames: 6545408. Throughput: 0: 10937.2. Samples: 6524728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:16:33,025][1063098] Avg episode reward: [(0, '842.568')] -[2023-07-08 20:16:33,594][1063383] Updated weights for policy 0, policy_version 12800 (0.0005) -[2023-07-08 20:16:37,501][1063383] Updated weights for policy 0, policy_version 12880 (0.0005) -[2023-07-08 20:16:38,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10854.4, 300 sec: 10746.8). Total num frames: 6598656. Throughput: 0: 10881.2. Samples: 6590260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:16:38,025][1063098] Avg episode reward: [(0, '859.330')] -[2023-07-08 20:16:41,431][1063383] Updated weights for policy 0, policy_version 12960 (0.0005) -[2023-07-08 20:16:43,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10854.4, 300 sec: 10746.8). Total num frames: 6651904. Throughput: 0: 10892.0. Samples: 6654620. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:16:43,025][1063098] Avg episode reward: [(0, '866.574')] -[2023-07-08 20:16:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012992_6651904.pth... -[2023-07-08 20:16:43,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012360_6328320.pth -[2023-07-08 20:16:45,006][1063383] Updated weights for policy 0, policy_version 13040 (0.0005) -[2023-07-08 20:16:48,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10854.4, 300 sec: 10746.8). Total num frames: 6705152. Throughput: 0: 10841.4. Samples: 6685980. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:16:48,025][1063098] Avg episode reward: [(0, '857.714')] -[2023-07-08 20:16:48,786][1063383] Updated weights for policy 0, policy_version 13120 (0.0005) -[2023-07-08 20:16:52,783][1063383] Updated weights for policy 0, policy_version 13200 (0.0004) -[2023-07-08 20:16:53,024][1063098] Fps is (10 sec: 10649.8, 60 sec: 10786.1, 300 sec: 10746.8). Total num frames: 6758400. Throughput: 0: 10811.6. Samples: 6750272. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:16:53,025][1063098] Avg episode reward: [(0, '864.302')] -[2023-07-08 20:16:56,558][1063383] Updated weights for policy 0, policy_version 13280 (0.0005) -[2023-07-08 20:16:58,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10786.1, 300 sec: 10746.8). Total num frames: 6811648. Throughput: 0: 10822.6. Samples: 6815308. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:16:58,025][1063098] Avg episode reward: [(0, '850.611')] -[2023-07-08 20:16:58,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013304_6811648.pth... -[2023-07-08 20:16:58,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012680_6492160.pth -[2023-07-08 20:17:00,575][1063383] Updated weights for policy 0, policy_version 13360 (0.0005) -[2023-07-08 20:17:03,024][1063098] Fps is (10 sec: 10649.6, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 6864896. Throughput: 0: 10742.0. Samples: 6844480. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:03,025][1063098] Avg episode reward: [(0, '859.202')] -[2023-07-08 20:17:04,365][1063383] Updated weights for policy 0, policy_version 13440 (0.0005) -[2023-07-08 20:17:08,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10717.9, 300 sec: 10732.9). Total num frames: 6918144. Throughput: 0: 10729.8. Samples: 6909028. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:08,025][1063098] Avg episode reward: [(0, '863.307')] -[2023-07-08 20:17:08,362][1063383] Updated weights for policy 0, policy_version 13520 (0.0005) -[2023-07-08 20:17:12,093][1063383] Updated weights for policy 0, policy_version 13600 (0.0005) -[2023-07-08 20:17:13,025][1063098] Fps is (10 sec: 10649.4, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 6971392. Throughput: 0: 10686.8. Samples: 6972988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:13,025][1063098] Avg episode reward: [(0, '862.298')] -[2023-07-08 20:17:13,029][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013616_6971392.pth... -[2023-07-08 20:17:13,032][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012992_6651904.pth -[2023-07-08 20:17:15,708][1063383] Updated weights for policy 0, policy_version 13680 (0.0005) -[2023-07-08 20:17:18,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10717.9, 300 sec: 10719.0). Total num frames: 7024640. Throughput: 0: 10731.8. Samples: 7007660. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:18,025][1063098] Avg episode reward: [(0, '861.233')] -[2023-07-08 20:17:19,710][1063383] Updated weights for policy 0, policy_version 13760 (0.0004) -[2023-07-08 20:17:23,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10717.9, 300 sec: 10705.1). Total num frames: 7077888. Throughput: 0: 10636.1. Samples: 7068884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:23,025][1063098] Avg episode reward: [(0, '854.685')] -[2023-07-08 20:17:23,740][1063383] Updated weights for policy 0, policy_version 13840 (0.0005) -[2023-07-08 20:17:27,556][1063383] Updated weights for policy 0, policy_version 13920 (0.0005) -[2023-07-08 20:17:28,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10649.6, 300 sec: 10705.1). 
Total num frames: 7131136. Throughput: 0: 10590.7. Samples: 7131200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:28,025][1063098] Avg episode reward: [(0, '864.162')] -[2023-07-08 20:17:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013928_7131136.pth... -[2023-07-08 20:17:28,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013304_6811648.pth -[2023-07-08 20:17:31,399][1063383] Updated weights for policy 0, policy_version 14000 (0.0005) -[2023-07-08 20:17:33,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10649.6, 300 sec: 10705.1). Total num frames: 7184384. Throughput: 0: 10618.9. Samples: 7163828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:33,025][1063098] Avg episode reward: [(0, '861.113')] -[2023-07-08 20:17:35,269][1063383] Updated weights for policy 0, policy_version 14080 (0.0005) -[2023-07-08 20:17:38,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10649.6, 300 sec: 10705.1). Total num frames: 7237632. Throughput: 0: 10607.2. Samples: 7227596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:38,025][1063098] Avg episode reward: [(0, '855.352')] -[2023-07-08 20:17:39,124][1063383] Updated weights for policy 0, policy_version 14160 (0.0005) -[2023-07-08 20:17:42,904][1063383] Updated weights for policy 0, policy_version 14240 (0.0006) -[2023-07-08 20:17:43,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10649.6, 300 sec: 10705.1). Total num frames: 7290880. Throughput: 0: 10577.9. Samples: 7291316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:43,025][1063098] Avg episode reward: [(0, '865.286')] -[2023-07-08 20:17:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014240_7290880.pth... 
-[2023-07-08 20:17:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013616_6971392.pth -[2023-07-08 20:17:46,644][1063383] Updated weights for policy 0, policy_version 14320 (0.0005) -[2023-07-08 20:17:48,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10649.6, 300 sec: 10691.3). Total num frames: 7344128. Throughput: 0: 10648.3. Samples: 7323656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:48,025][1063098] Avg episode reward: [(0, '857.723')] -[2023-07-08 20:17:50,445][1063383] Updated weights for policy 0, policy_version 14400 (0.0005) -[2023-07-08 20:17:53,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10581.3, 300 sec: 10663.5). Total num frames: 7393280. Throughput: 0: 10639.0. Samples: 7387784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:53,025][1063098] Avg episode reward: [(0, '867.559')] -[2023-07-08 20:17:54,667][1063383] Updated weights for policy 0, policy_version 14480 (0.0006) -[2023-07-08 20:17:58,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10581.3, 300 sec: 10663.5). Total num frames: 7446528. Throughput: 0: 10561.3. Samples: 7448248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:17:58,025][1063098] Avg episode reward: [(0, '857.359')] -[2023-07-08 20:17:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014544_7446528.pth... -[2023-07-08 20:17:58,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013928_7131136.pth -[2023-07-08 20:17:58,593][1063383] Updated weights for policy 0, policy_version 14560 (0.0005) -[2023-07-08 20:18:02,637][1063383] Updated weights for policy 0, policy_version 14640 (0.0005) -[2023-07-08 20:18:03,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10649.6). Total num frames: 7495680. Throughput: 0: 10468.9. Samples: 7478760. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:03,025][1063098] Avg episode reward: [(0, '862.842')] -[2023-07-08 20:18:06,432][1063383] Updated weights for policy 0, policy_version 14720 (0.0005) -[2023-07-08 20:18:08,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10513.1, 300 sec: 10649.6). Total num frames: 7548928. Throughput: 0: 10516.5. Samples: 7542128. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:18:08,025][1063098] Avg episode reward: [(0, '863.737')] -[2023-07-08 20:18:10,500][1063383] Updated weights for policy 0, policy_version 14800 (0.0005) -[2023-07-08 20:18:13,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10649.6). Total num frames: 7602176. Throughput: 0: 10467.7. Samples: 7602248. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:18:13,025][1063098] Avg episode reward: [(0, '860.665')] -[2023-07-08 20:18:13,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014848_7602176.pth... -[2023-07-08 20:18:13,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014240_7290880.pth -[2023-07-08 20:18:14,405][1063383] Updated weights for policy 0, policy_version 14880 (0.0005) -[2023-07-08 20:18:18,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10663.5). Total num frames: 7655424. Throughput: 0: 10470.8. Samples: 7635016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:18,025][1063098] Avg episode reward: [(0, '864.419')] -[2023-07-08 20:18:18,260][1063383] Updated weights for policy 0, policy_version 14960 (0.0005) -[2023-07-08 20:18:22,202][1063383] Updated weights for policy 0, policy_version 15040 (0.0005) -[2023-07-08 20:18:23,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 10663.5). Total num frames: 7704576. Throughput: 0: 10476.5. Samples: 7699040. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:23,025][1063098] Avg episode reward: [(0, '853.038')] -[2023-07-08 20:18:26,144][1063383] Updated weights for policy 0, policy_version 15120 (0.0005) -[2023-07-08 20:18:28,025][1063098] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 10649.6). Total num frames: 7757824. Throughput: 0: 10415.5. Samples: 7760012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:28,025][1063098] Avg episode reward: [(0, '858.557')] -[2023-07-08 20:18:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015152_7757824.pth... -[2023-07-08 20:18:28,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014544_7446528.pth -[2023-07-08 20:18:29,983][1063383] Updated weights for policy 0, policy_version 15200 (0.0005) -[2023-07-08 20:18:33,025][1063098] Fps is (10 sec: 10649.8, 60 sec: 10444.8, 300 sec: 10663.5). Total num frames: 7811072. Throughput: 0: 10424.3. Samples: 7792748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:33,025][1063098] Avg episode reward: [(0, '870.329')] -[2023-07-08 20:18:33,949][1063383] Updated weights for policy 0, policy_version 15280 (0.0005) -[2023-07-08 20:18:38,025][1063098] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 10649.6). Total num frames: 7860224. Throughput: 0: 10348.1. Samples: 7853448. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:38,025][1063098] Avg episode reward: [(0, '855.521')] -[2023-07-08 20:18:38,060][1063383] Updated weights for policy 0, policy_version 15360 (0.0005) -[2023-07-08 20:18:41,707][1063383] Updated weights for policy 0, policy_version 15440 (0.0005) -[2023-07-08 20:18:43,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10649.6). Total num frames: 7917568. Throughput: 0: 10486.9. Samples: 7920156. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:43,025][1063098] Avg episode reward: [(0, '862.218')] -[2023-07-08 20:18:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015464_7917568.pth... -[2023-07-08 20:18:43,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014848_7602176.pth -[2023-07-08 20:18:45,522][1063383] Updated weights for policy 0, policy_version 15520 (0.0005) -[2023-07-08 20:18:48,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10444.8, 300 sec: 10649.6). Total num frames: 7970816. Throughput: 0: 10487.0. Samples: 7950676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:48,026][1063098] Avg episode reward: [(0, '862.671')] -[2023-07-08 20:18:49,013][1063383] Updated weights for policy 0, policy_version 15600 (0.0005) -[2023-07-08 20:18:52,628][1063383] Updated weights for policy 0, policy_version 15680 (0.0005) -[2023-07-08 20:18:53,025][1063098] Fps is (10 sec: 11468.8, 60 sec: 10649.6, 300 sec: 10677.4). Total num frames: 8032256. Throughput: 0: 10624.0. Samples: 8020208. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:53,026][1063098] Avg episode reward: [(0, '866.775')] -[2023-07-08 20:18:56,171][1063383] Updated weights for policy 0, policy_version 15760 (0.0005) -[2023-07-08 20:18:58,025][1063098] Fps is (10 sec: 11468.8, 60 sec: 10649.6, 300 sec: 10677.4). Total num frames: 8085504. Throughput: 0: 10814.7. Samples: 8088908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:18:58,025][1063098] Avg episode reward: [(0, '870.730')] -[2023-07-08 20:18:58,044][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015800_8089600.pth... 
-[2023-07-08 20:18:58,047][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015152_7757824.pth -[2023-07-08 20:18:59,864][1063383] Updated weights for policy 0, policy_version 15840 (0.0005) -[2023-07-08 20:19:03,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10786.1, 300 sec: 10691.3). Total num frames: 8142848. Throughput: 0: 10831.5. Samples: 8122432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:19:03,026][1063098] Avg episode reward: [(0, '865.762')] -[2023-07-08 20:19:03,611][1063383] Updated weights for policy 0, policy_version 15920 (0.0006) -[2023-07-08 20:19:07,515][1063383] Updated weights for policy 0, policy_version 16000 (0.0005) -[2023-07-08 20:19:08,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10786.1, 300 sec: 10705.1). Total num frames: 8196096. Throughput: 0: 10823.4. Samples: 8186092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:19:08,025][1063098] Avg episode reward: [(0, '865.981')] -[2023-07-08 20:19:11,355][1063383] Updated weights for policy 0, policy_version 16080 (0.0005) -[2023-07-08 20:19:13,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10786.1, 300 sec: 10691.3). Total num frames: 8249344. Throughput: 0: 10895.7. Samples: 8250320. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 20:19:13,026][1063098] Avg episode reward: [(0, '874.528')] -[2023-07-08 20:19:13,029][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016112_8249344.pth... -[2023-07-08 20:19:13,032][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015464_7917568.pth -[2023-07-08 20:19:15,174][1063383] Updated weights for policy 0, policy_version 16160 (0.0005) -[2023-07-08 20:19:18,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10786.1, 300 sec: 10691.3). Total num frames: 8302592. Throughput: 0: 10899.2. Samples: 8283212. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 20:19:18,025][1063098] Avg episode reward: [(0, '867.506')] -[2023-07-08 20:19:18,933][1063383] Updated weights for policy 0, policy_version 16240 (0.0005) -[2023-07-08 20:19:22,824][1063383] Updated weights for policy 0, policy_version 16320 (0.0005) -[2023-07-08 20:19:23,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10854.4, 300 sec: 10705.1). Total num frames: 8355840. Throughput: 0: 10981.2. Samples: 8347604. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-07-08 20:19:23,025][1063098] Avg episode reward: [(0, '861.966')] -[2023-07-08 20:19:26,567][1063383] Updated weights for policy 0, policy_version 16400 (0.0005) -[2023-07-08 20:19:28,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10854.4, 300 sec: 10705.1). Total num frames: 8409088. Throughput: 0: 10945.7. Samples: 8412712. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:19:28,025][1063098] Avg episode reward: [(0, '872.811')] -[2023-07-08 20:19:28,027][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016424_8409088.pth... -[2023-07-08 20:19:28,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015800_8089600.pth -[2023-07-08 20:19:30,313][1063383] Updated weights for policy 0, policy_version 16480 (0.0005) -[2023-07-08 20:19:33,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10922.7, 300 sec: 10719.0). Total num frames: 8466432. Throughput: 0: 11003.8. Samples: 8445848. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-07-08 20:19:33,025][1063098] Avg episode reward: [(0, '859.743')] -[2023-07-08 20:19:34,066][1063383] Updated weights for policy 0, policy_version 16560 (0.0005) -[2023-07-08 20:19:37,744][1063383] Updated weights for policy 0, policy_version 16640 (0.0005) -[2023-07-08 20:19:38,024][1063098] Fps is (10 sec: 11059.3, 60 sec: 10991.0, 300 sec: 10719.0). 
Total num frames: 8519680. Throughput: 0: 10918.8. Samples: 8511552. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:19:38,025][1063098] Avg episode reward: [(0, '866.693')] -[2023-07-08 20:19:41,614][1063383] Updated weights for policy 0, policy_version 16720 (0.0005) -[2023-07-08 20:19:43,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10922.7, 300 sec: 10719.0). Total num frames: 8572928. Throughput: 0: 10797.0. Samples: 8574772. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-07-08 20:19:43,025][1063098] Avg episode reward: [(0, '874.492')] -[2023-07-08 20:19:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016744_8572928.pth... -[2023-07-08 20:19:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016112_8249344.pth -[2023-07-08 20:19:45,426][1063383] Updated weights for policy 0, policy_version 16800 (0.0005) -[2023-07-08 20:19:48,025][1063098] Fps is (10 sec: 11059.1, 60 sec: 10990.9, 300 sec: 10732.9). Total num frames: 8630272. Throughput: 0: 10760.9. Samples: 8606672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:19:48,025][1063098] Avg episode reward: [(0, '866.421')] -[2023-07-08 20:19:48,878][1063383] Updated weights for policy 0, policy_version 16880 (0.0005) -[2023-07-08 20:19:52,903][1063383] Updated weights for policy 0, policy_version 16960 (0.0005) -[2023-07-08 20:19:53,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10854.4, 300 sec: 10732.9). Total num frames: 8683520. Throughput: 0: 10868.6. Samples: 8675180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:19:53,025][1063098] Avg episode reward: [(0, '870.867')] -[2023-07-08 20:19:56,551][1063383] Updated weights for policy 0, policy_version 17040 (0.0005) -[2023-07-08 20:19:58,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10854.4, 300 sec: 10732.9). Total num frames: 8736768. 
Throughput: 0: 10857.3. Samples: 8738900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:19:58,025][1063098] Avg episode reward: [(0, '865.394')] -[2023-07-08 20:19:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017064_8736768.pth... -[2023-07-08 20:19:58,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016424_8409088.pth -[2023-07-08 20:20:00,527][1063383] Updated weights for policy 0, policy_version 17120 (0.0005) -[2023-07-08 20:20:03,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 8790016. Throughput: 0: 10812.0. Samples: 8769752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:03,025][1063098] Avg episode reward: [(0, '868.905')] -[2023-07-08 20:20:04,456][1063383] Updated weights for policy 0, policy_version 17200 (0.0005) -[2023-07-08 20:20:08,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 8843264. Throughput: 0: 10789.3. Samples: 8833124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:08,025][1063098] Avg episode reward: [(0, '869.392')] -[2023-07-08 20:20:08,342][1063383] Updated weights for policy 0, policy_version 17280 (0.0005) -[2023-07-08 20:20:12,354][1063383] Updated weights for policy 0, policy_version 17360 (0.0005) -[2023-07-08 20:20:13,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10717.9, 300 sec: 10705.1). Total num frames: 8892416. Throughput: 0: 10731.6. Samples: 8895636. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:20:13,025][1063098] Avg episode reward: [(0, '874.183')] -[2023-07-08 20:20:13,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017368_8892416.pth... 
-[2023-07-08 20:20:13,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016744_8572928.pth -[2023-07-08 20:20:16,073][1063383] Updated weights for policy 0, policy_version 17440 (0.0005) -[2023-07-08 20:20:18,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 8949760. Throughput: 0: 10734.4. Samples: 8928896. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:20:18,025][1063098] Avg episode reward: [(0, '864.798')] -[2023-07-08 20:20:19,645][1063383] Updated weights for policy 0, policy_version 17520 (0.0005) -[2023-07-08 20:20:23,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 9003008. Throughput: 0: 10748.3. Samples: 8995228. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:20:23,025][1063098] Avg episode reward: [(0, '857.568')] -[2023-07-08 20:20:23,533][1063383] Updated weights for policy 0, policy_version 17600 (0.0006) -[2023-07-08 20:20:27,393][1063383] Updated weights for policy 0, policy_version 17680 (0.0005) -[2023-07-08 20:20:28,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10786.1, 300 sec: 10719.0). Total num frames: 9056256. Throughput: 0: 10730.3. Samples: 9057636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:28,025][1063098] Avg episode reward: [(0, '871.977')] -[2023-07-08 20:20:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017688_9056256.pth... -[2023-07-08 20:20:28,030][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017064_8736768.pth -[2023-07-08 20:20:31,150][1063383] Updated weights for policy 0, policy_version 17760 (0.0005) -[2023-07-08 20:20:33,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 9113600. Throughput: 0: 10767.4. Samples: 9091204. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:33,025][1063098] Avg episode reward: [(0, '872.172')] -[2023-07-08 20:20:34,765][1063383] Updated weights for policy 0, policy_version 17840 (0.0005) -[2023-07-08 20:20:38,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 9166848. Throughput: 0: 10730.8. Samples: 9158064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:38,025][1063098] Avg episode reward: [(0, '873.173')] -[2023-07-08 20:20:38,489][1063383] Updated weights for policy 0, policy_version 17920 (0.0005) -[2023-07-08 20:20:42,272][1063383] Updated weights for policy 0, policy_version 18000 (0.0005) -[2023-07-08 20:20:43,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 9220096. Throughput: 0: 10769.4. Samples: 9223524. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:43,025][1063098] Avg episode reward: [(0, '873.582')] -[2023-07-08 20:20:43,046][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018016_9224192.pth... -[2023-07-08 20:20:43,048][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017368_8892416.pth -[2023-07-08 20:20:46,031][1063383] Updated weights for policy 0, policy_version 18080 (0.0004) -[2023-07-08 20:20:48,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 9277440. Throughput: 0: 10822.0. Samples: 9256740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:48,025][1063098] Avg episode reward: [(0, '864.473')] -[2023-07-08 20:20:49,601][1063383] Updated weights for policy 0, policy_version 18160 (0.0004) -[2023-07-08 20:20:53,025][1063098] Fps is (10 sec: 11468.9, 60 sec: 10854.4, 300 sec: 10746.8). Total num frames: 9334784. Throughput: 0: 10965.2. Samples: 9326556. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:53,025][1063098] Avg episode reward: [(0, '873.108')] -[2023-07-08 20:20:53,182][1063383] Updated weights for policy 0, policy_version 18240 (0.0005) -[2023-07-08 20:20:56,874][1063383] Updated weights for policy 0, policy_version 18320 (0.0005) -[2023-07-08 20:20:58,025][1063098] Fps is (10 sec: 11468.7, 60 sec: 10922.7, 300 sec: 10760.7). Total num frames: 9392128. Throughput: 0: 11029.4. Samples: 9391960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:20:58,025][1063098] Avg episode reward: [(0, '861.664')] -[2023-07-08 20:20:58,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018344_9392128.pth... -[2023-07-08 20:20:58,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017688_9056256.pth -[2023-07-08 20:21:00,751][1063383] Updated weights for policy 0, policy_version 18400 (0.0005) -[2023-07-08 20:21:03,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10854.4, 300 sec: 10732.9). Total num frames: 9441280. Throughput: 0: 10987.9. Samples: 9423352. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:21:03,025][1063098] Avg episode reward: [(0, '868.507')] -[2023-07-08 20:21:04,752][1063383] Updated weights for policy 0, policy_version 18480 (0.0005) -[2023-07-08 20:21:08,025][1063098] Fps is (10 sec: 10240.0, 60 sec: 10854.4, 300 sec: 10746.8). Total num frames: 9494528. Throughput: 0: 10867.3. Samples: 9484256. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:21:08,025][1063098] Avg episode reward: [(0, '871.972')] -[2023-07-08 20:21:08,607][1063383] Updated weights for policy 0, policy_version 18560 (0.0005) -[2023-07-08 20:21:12,601][1063383] Updated weights for policy 0, policy_version 18640 (0.0005) -[2023-07-08 20:21:13,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10922.7, 300 sec: 10732.9). 
Total num frames: 9547776. Throughput: 0: 10891.7. Samples: 9547764. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-07-08 20:21:13,025][1063098] Avg episode reward: [(0, '871.268')] -[2023-07-08 20:21:13,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018648_9547776.pth... -[2023-07-08 20:21:13,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018016_9224192.pth -[2023-07-08 20:21:16,304][1063383] Updated weights for policy 0, policy_version 18720 (0.0005) -[2023-07-08 20:21:18,025][1063098] Fps is (10 sec: 11059.2, 60 sec: 10922.7, 300 sec: 10746.8). Total num frames: 9605120. Throughput: 0: 10873.5. Samples: 9580512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:21:18,025][1063098] Avg episode reward: [(0, '866.535')] -[2023-07-08 20:21:19,603][1063383] Updated weights for policy 0, policy_version 18800 (0.0005) -[2023-07-08 20:21:23,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10922.7, 300 sec: 10732.9). Total num frames: 9658368. Throughput: 0: 10948.6. Samples: 9650752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:21:23,025][1063098] Avg episode reward: [(0, '868.296')] -[2023-07-08 20:21:23,598][1063383] Updated weights for policy 0, policy_version 18880 (0.0006) -[2023-07-08 20:21:27,414][1063383] Updated weights for policy 0, policy_version 18960 (0.0005) -[2023-07-08 20:21:28,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10922.7, 300 sec: 10732.9). Total num frames: 9711616. Throughput: 0: 10894.4. Samples: 9713772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:21:28,025][1063098] Avg episode reward: [(0, '873.064')] -[2023-07-08 20:21:28,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018968_9711616.pth... 
-[2023-07-08 20:21:28,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018344_9392128.pth -[2023-07-08 20:21:31,204][1063383] Updated weights for policy 0, policy_version 19040 (0.0005) -[2023-07-08 20:21:33,025][1063098] Fps is (10 sec: 10649.6, 60 sec: 10854.4, 300 sec: 10732.9). Total num frames: 9764864. Throughput: 0: 10892.5. Samples: 9746904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:21:33,025][1063098] Avg episode reward: [(0, '862.190')] -[2023-07-08 20:21:35,012][1063383] Updated weights for policy 0, policy_version 19120 (0.0005) -[2023-07-08 20:21:38,025][1063098] Fps is (10 sec: 10649.7, 60 sec: 10854.4, 300 sec: 10732.9). Total num frames: 9818112. Throughput: 0: 10742.9. Samples: 9809984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-07-08 20:21:38,025][1063098] Avg episode reward: [(0, '872.740')] -[2023-07-08 20:21:38,848][1063383] Updated weights for policy 0, policy_version 19200 (0.0005) -[2023-07-08 20:21:42,798][1063383] Updated weights for policy 0, policy_version 19280 (0.0005) -[2023-07-08 20:21:43,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10854.4, 300 sec: 10732.9). Total num frames: 9871360. Throughput: 0: 10696.9. Samples: 9873320. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:21:43,025][1063098] Avg episode reward: [(0, '873.281')] -[2023-07-08 20:21:43,028][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019280_9871360.pth... -[2023-07-08 20:21:43,031][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018648_9547776.pth -[2023-07-08 20:21:46,681][1063383] Updated weights for policy 0, policy_version 19360 (0.0006) -[2023-07-08 20:21:48,025][1063098] Fps is (10 sec: 10649.5, 60 sec: 10786.1, 300 sec: 10732.9). Total num frames: 9924608. Throughput: 0: 10692.4. Samples: 9904512. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:21:48,025][1063098] Avg episode reward: [(0, '869.429')] -[2023-07-08 20:21:50,365][1063383] Updated weights for policy 0, policy_version 19440 (0.0005) -[2023-07-08 20:21:53,025][1063098] Fps is (10 sec: 11059.3, 60 sec: 10786.1, 300 sec: 10746.8). Total num frames: 9981952. Throughput: 0: 10815.8. Samples: 9970968. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-07-08 20:21:53,025][1063098] Avg episode reward: [(0, '864.647')] -[2023-07-08 20:21:53,984][1063383] Updated weights for policy 0, policy_version 19520 (0.0005) -[2023-07-08 20:21:54,948][1063339] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000 -[2023-07-08 20:21:54,949][1063385] Stopping RolloutWorker_w1... -[2023-07-08 20:21:54,949][1063388] Stopping RolloutWorker_w4... -[2023-07-08 20:21:54,949][1063386] Stopping RolloutWorker_w0... -[2023-07-08 20:21:54,949][1063384] Stopping RolloutWorker_w2... -[2023-07-08 20:21:54,949][1063387] Stopping RolloutWorker_w3... -[2023-07-08 20:21:54,949][1063484] Stopping RolloutWorker_w7... -[2023-07-08 20:21:54,949][1063451] Stopping RolloutWorker_w6... -[2023-07-08 20:21:54,949][1063385] Loop rollout_proc1_evt_loop terminating... -[2023-07-08 20:21:54,949][1063483] Stopping RolloutWorker_w5... -[2023-07-08 20:21:54,949][1063388] Loop rollout_proc4_evt_loop terminating... -[2023-07-08 20:21:54,949][1063386] Loop rollout_proc0_evt_loop terminating... -[2023-07-08 20:21:54,949][1063384] Loop rollout_proc2_evt_loop terminating... -[2023-07-08 20:21:54,949][1063387] Loop rollout_proc3_evt_loop terminating... -[2023-07-08 20:21:54,949][1063484] Loop rollout_proc7_evt_loop terminating... -[2023-07-08 20:21:54,949][1063451] Loop rollout_proc6_evt_loop terminating... -[2023-07-08 20:21:54,949][1063098] Component RolloutWorker_w1 stopped! -[2023-07-08 20:21:54,950][1063483] Loop rollout_proc5_evt_loop terminating... -[2023-07-08 20:21:54,949][1063339] Stopping Batcher_0... 
-[2023-07-08 20:21:54,950][1063339] Loop batcher_evt_loop terminating... -[2023-07-08 20:21:54,950][1063098] Component RolloutWorker_w4 stopped! -[2023-07-08 20:21:54,950][1063098] Component RolloutWorker_w0 stopped! -[2023-07-08 20:21:54,950][1063098] Component RolloutWorker_w6 stopped! -[2023-07-08 20:21:54,950][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019544_10006528.pth... -[2023-07-08 20:21:54,950][1063098] Component RolloutWorker_w2 stopped! -[2023-07-08 20:21:54,951][1063098] Component RolloutWorker_w3 stopped! -[2023-07-08 20:21:54,951][1063098] Component RolloutWorker_w7 stopped! -[2023-07-08 20:21:54,951][1063098] Component RolloutWorker_w5 stopped! -[2023-07-08 20:21:54,951][1063098] Component Batcher_0 stopped! -[2023-07-08 20:21:54,953][1063339] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018968_9711616.pth -[2023-07-08 20:21:54,953][1063339] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019544_10006528.pth... -[2023-07-08 20:21:54,956][1063339] Stopping LearnerWorker_p0... -[2023-07-08 20:21:54,956][1063339] Loop learner_proc0_evt_loop terminating... -[2023-07-08 20:21:54,956][1063098] Component LearnerWorker_p0 stopped! -[2023-07-08 20:21:55,014][1063383] Weights refcount: 2 0 -[2023-07-08 20:21:55,015][1063383] Stopping InferenceWorker_p0-w0... -[2023-07-08 20:21:55,015][1063383] Loop inference_proc0-0_evt_loop terminating... -[2023-07-08 20:21:55,015][1063098] Component InferenceWorker_p0-w0 stopped! -[2023-07-08 20:21:55,016][1063098] Waiting for process learner_proc0 to stop... -[2023-07-08 20:21:55,672][1063098] Waiting for process inference_proc0-0 to join... -[2023-07-08 20:21:55,692][1063098] Waiting for process rollout_proc0 to join... -[2023-07-08 20:21:55,692][1063098] Waiting for process rollout_proc1 to join... 
-[2023-07-08 20:21:55,693][1063098] Waiting for process rollout_proc2 to join... -[2023-07-08 20:21:55,693][1063098] Waiting for process rollout_proc3 to join... -[2023-07-08 20:21:55,693][1063098] Waiting for process rollout_proc4 to join... -[2023-07-08 20:21:55,693][1063098] Waiting for process rollout_proc5 to join... -[2023-07-08 20:21:55,694][1063098] Waiting for process rollout_proc6 to join... -[2023-07-08 20:21:55,694][1063098] Waiting for process rollout_proc7 to join... -[2023-07-08 20:21:55,694][1063098] Batcher 0 profile tree view: -batching: 1.8467, releasing_batches: 1.5545 -[2023-07-08 20:21:55,694][1063098] InferenceWorker_p0-w0 profile tree view: +[2023-07-17 00:32:24,558][277276] Worker 5 uses CPU cores [20, 21, 22, 23] +[2023-07-17 00:32:24,614][277366] Worker 7 uses CPU cores [28, 29, 30, 31] +[2023-07-17 00:32:24,685][277272] Worker 1 uses CPU cores [4, 5, 6, 7] +[2023-07-17 00:32:24,754][277226] Using optimizer +[2023-07-17 00:32:24,754][277226] No checkpoints found +[2023-07-17 00:32:24,754][277226] Did not load from checkpoint, starting from scratch! +[2023-07-17 00:32:24,754][277226] Initialized policy 0 weights for model version 0 +[2023-07-17 00:32:24,755][277226] LearnerWorker_p0 finished initialization! +[2023-07-17 00:32:24,757][277270] RunningMeanStd input shape: (39,) +[2023-07-17 00:32:24,757][277270] RunningMeanStd input shape: (1,) +[2023-07-17 00:32:24,830][276985] Inference worker 0-0 is ready! +[2023-07-17 00:32:24,831][276985] All inference workers are ready! Signal rollout workers to start! 
+[2023-07-17 00:32:24,867][277308] Worker 6 uses CPU cores [24, 25, 26, 27] +[2023-07-17 00:32:24,868][277273] Worker 2 uses CPU cores [8, 9, 10, 11] +[2023-07-17 00:32:24,961][277271] Worker 0 uses CPU cores [0, 1, 2, 3] +[2023-07-17 00:32:25,119][277274] Worker 3 uses CPU cores [12, 13, 14, 15] +[2023-07-17 00:32:25,239][277275] Worker 4 uses CPU cores [16, 17, 18, 19] +[2023-07-17 00:32:25,434][276985] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-07-17 00:32:26,286][277272] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,288][277276] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,289][277366] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,293][277272] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,295][277276] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,296][277366] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,318][277272] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,322][277276] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,322][277366] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,372][277272] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,376][277276] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,377][277366] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,381][277273] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,388][277273] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,411][277308] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,417][277273] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,418][277308] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,443][277308] Decorrelating experience for 128 frames... 
+[2023-07-17 00:32:26,473][277273] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,477][277271] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,484][277271] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,500][277308] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,512][277271] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,563][277271] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,626][277274] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,633][277274] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,659][277274] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,710][277274] Decorrelating experience for 192 frames... +[2023-07-17 00:32:26,732][277275] Decorrelating experience for 0 frames... +[2023-07-17 00:32:26,739][277275] Decorrelating experience for 64 frames... +[2023-07-17 00:32:26,767][277275] Decorrelating experience for 128 frames... +[2023-07-17 00:32:26,818][277275] Decorrelating experience for 192 frames... +[2023-07-17 00:32:27,800][277272] Decorrelating experience for 256 frames... +[2023-07-17 00:32:27,808][277276] Decorrelating experience for 256 frames... +[2023-07-17 00:32:27,810][277366] Decorrelating experience for 256 frames... +[2023-07-17 00:32:27,890][277273] Decorrelating experience for 256 frames... +[2023-07-17 00:32:27,898][277272] Decorrelating experience for 320 frames... +[2023-07-17 00:32:27,908][277366] Decorrelating experience for 320 frames... +[2023-07-17 00:32:27,913][277276] Decorrelating experience for 320 frames... +[2023-07-17 00:32:27,923][277308] Decorrelating experience for 256 frames... +[2023-07-17 00:32:27,981][277271] Decorrelating experience for 256 frames... +[2023-07-17 00:32:27,984][277273] Decorrelating experience for 320 frames... +[2023-07-17 00:32:28,017][277272] Decorrelating experience for 384 frames... 
+[2023-07-17 00:32:28,020][277308] Decorrelating experience for 320 frames... +[2023-07-17 00:32:28,035][277366] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,039][277276] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,081][277271] Decorrelating experience for 320 frames... +[2023-07-17 00:32:28,105][277273] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,137][277274] Decorrelating experience for 256 frames... +[2023-07-17 00:32:28,139][277308] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,163][277272] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,180][277366] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,180][277276] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,208][277271] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,233][277274] Decorrelating experience for 320 frames... +[2023-07-17 00:32:28,233][277275] Decorrelating experience for 256 frames... +[2023-07-17 00:32:28,247][277273] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,281][277308] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,331][277275] Decorrelating experience for 320 frames... +[2023-07-17 00:32:28,344][277271] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,348][277274] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,461][277275] Decorrelating experience for 384 frames... +[2023-07-17 00:32:28,485][277274] Decorrelating experience for 448 frames... +[2023-07-17 00:32:28,602][277275] Decorrelating experience for 448 frames... +[2023-07-17 00:32:30,434][276985] Fps is (10 sec: 4096.1, 60 sec: 4096.1, 300 sec: 4096.1). Total num frames: 20480. Throughput: 0: 0.0. Samples: 0. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:32:30,435][276985] Avg episode reward: [(0, '208.884')] +[2023-07-17 00:32:31,885][277270] Updated weights for policy 0, policy_version 80 (0.0004) +[2023-07-17 00:32:34,810][277270] Updated weights for policy 0, policy_version 160 (0.0003) +[2023-07-17 00:32:35,434][276985] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 90112. Throughput: 0: 6904.0. Samples: 69040. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-17 00:32:35,435][276985] Avg episode reward: [(0, '459.181')] +[2023-07-17 00:32:35,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000176_90112.pth... +[2023-07-17 00:32:37,544][277270] Updated weights for policy 0, policy_version 240 (0.0003) +[2023-07-17 00:32:40,216][277270] Updated weights for policy 0, policy_version 320 (0.0003) +[2023-07-17 00:32:40,434][276985] Fps is (10 sec: 14336.1, 60 sec: 10922.8, 300 sec: 10922.8). Total num frames: 163840. Throughput: 0: 10548.9. Samples: 158232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:32:40,435][276985] Avg episode reward: [(0, '586.920')] +[2023-07-17 00:32:40,435][277226] Saving new best policy, reward=586.920! 
+[2023-07-17 00:32:42,479][276985] Heartbeat connected on Batcher_0 +[2023-07-17 00:32:42,481][276985] Heartbeat connected on LearnerWorker_p0 +[2023-07-17 00:32:42,484][276985] Heartbeat connected on InferenceWorker_p0-w0 +[2023-07-17 00:32:42,487][276985] Heartbeat connected on RolloutWorker_w0 +[2023-07-17 00:32:42,489][276985] Heartbeat connected on RolloutWorker_w1 +[2023-07-17 00:32:42,491][276985] Heartbeat connected on RolloutWorker_w2 +[2023-07-17 00:32:42,494][276985] Heartbeat connected on RolloutWorker_w3 +[2023-07-17 00:32:42,499][276985] Heartbeat connected on RolloutWorker_w5 +[2023-07-17 00:32:42,499][276985] Heartbeat connected on RolloutWorker_w4 +[2023-07-17 00:32:42,502][276985] Heartbeat connected on RolloutWorker_w6 +[2023-07-17 00:32:42,506][276985] Heartbeat connected on RolloutWorker_w7 +[2023-07-17 00:32:43,337][277270] Updated weights for policy 0, policy_version 400 (0.0005) +[2023-07-17 00:32:45,434][276985] Fps is (10 sec: 14336.1, 60 sec: 11673.7, 300 sec: 11673.7). Total num frames: 233472. Throughput: 0: 9946.7. Samples: 198932. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:32:45,435][276985] Avg episode reward: [(0, '676.186')] +[2023-07-17 00:32:45,435][277226] Saving new best policy, reward=676.186! +[2023-07-17 00:32:46,050][277270] Updated weights for policy 0, policy_version 480 (0.0004) +[2023-07-17 00:32:48,813][277270] Updated weights for policy 0, policy_version 560 (0.0004) +[2023-07-17 00:32:50,434][276985] Fps is (10 sec: 14745.4, 60 sec: 12451.9, 300 sec: 12451.9). Total num frames: 311296. Throughput: 0: 11484.2. Samples: 287104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:32:50,435][276985] Avg episode reward: [(0, '715.286')] +[2023-07-17 00:32:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000608_311296.pth... +[2023-07-17 00:32:50,441][277226] Saving new best policy, reward=715.286! 
+[2023-07-17 00:32:51,515][277270] Updated weights for policy 0, policy_version 640 (0.0004) +[2023-07-17 00:32:54,222][277270] Updated weights for policy 0, policy_version 720 (0.0003) +[2023-07-17 00:32:55,434][276985] Fps is (10 sec: 15155.2, 60 sec: 12834.2, 300 sec: 12834.2). Total num frames: 385024. Throughput: 0: 12587.1. Samples: 377612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:32:55,435][276985] Avg episode reward: [(0, '671.475')] +[2023-07-17 00:32:57,067][277270] Updated weights for policy 0, policy_version 800 (0.0004) +[2023-07-17 00:32:59,945][277270] Updated weights for policy 0, policy_version 880 (0.0004) +[2023-07-17 00:33:00,434][276985] Fps is (10 sec: 14336.1, 60 sec: 12990.2, 300 sec: 12990.2). Total num frames: 454656. Throughput: 0: 12030.7. Samples: 421072. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:33:00,456][276985] Avg episode reward: [(0, '662.890')] +[2023-07-17 00:33:03,119][277270] Updated weights for policy 0, policy_version 960 (0.0005) +[2023-07-17 00:33:05,434][276985] Fps is (10 sec: 13516.7, 60 sec: 13004.8, 300 sec: 13004.8). Total num frames: 520192. Throughput: 0: 12531.5. Samples: 501260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:33:05,435][276985] Avg episode reward: [(0, '716.888')] +[2023-07-17 00:33:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001016_520192.pth... +[2023-07-17 00:33:05,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000176_90112.pth +[2023-07-17 00:33:05,442][277226] Saving new best policy, reward=716.888! +[2023-07-17 00:33:06,241][277270] Updated weights for policy 0, policy_version 1040 (0.0005) +[2023-07-17 00:33:09,173][277270] Updated weights for policy 0, policy_version 1120 (0.0005) +[2023-07-17 00:33:10,434][276985] Fps is (10 sec: 13516.8, 60 sec: 13107.2, 300 sec: 13107.2). 
Total num frames: 589824. Throughput: 0: 12946.7. Samples: 582600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:33:10,435][276985] Avg episode reward: [(0, '727.992')] +[2023-07-17 00:33:10,436][277226] Saving new best policy, reward=727.992! +[2023-07-17 00:33:12,003][277270] Updated weights for policy 0, policy_version 1200 (0.0004) +[2023-07-17 00:33:14,735][277270] Updated weights for policy 0, policy_version 1280 (0.0004) +[2023-07-17 00:33:15,434][276985] Fps is (10 sec: 14336.0, 60 sec: 13271.1, 300 sec: 13271.1). Total num frames: 663552. Throughput: 0: 13938.7. Samples: 627240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:33:15,435][276985] Avg episode reward: [(0, '778.222')] +[2023-07-17 00:33:15,436][277226] Saving new best policy, reward=778.222! +[2023-07-17 00:33:17,486][277270] Updated weights for policy 0, policy_version 1360 (0.0004) +[2023-07-17 00:33:20,202][277270] Updated weights for policy 0, policy_version 1440 (0.0004) +[2023-07-17 00:33:20,434][276985] Fps is (10 sec: 14745.4, 60 sec: 13405.1, 300 sec: 13405.1). Total num frames: 737280. Throughput: 0: 14394.8. Samples: 716808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:33:20,435][276985] Avg episode reward: [(0, '776.181')] +[2023-07-17 00:33:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001440_737280.pth... +[2023-07-17 00:33:20,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000000608_311296.pth +[2023-07-17 00:33:22,849][277270] Updated weights for policy 0, policy_version 1520 (0.0004) +[2023-07-17 00:33:25,434][276985] Fps is (10 sec: 14745.7, 60 sec: 13516.8, 300 sec: 13516.8). Total num frames: 811008. Throughput: 0: 14406.3. Samples: 806516. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:33:25,435][276985] Avg episode reward: [(0, '759.320')] +[2023-07-17 00:33:25,781][277270] Updated weights for policy 0, policy_version 1600 (0.0005) +[2023-07-17 00:33:28,746][277270] Updated weights for policy 0, policy_version 1680 (0.0005) +[2023-07-17 00:33:30,434][276985] Fps is (10 sec: 14336.2, 60 sec: 14336.0, 300 sec: 13548.3). Total num frames: 880640. Throughput: 0: 14427.9. Samples: 848188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:33:30,435][276985] Avg episode reward: [(0, '772.672')] +[2023-07-17 00:33:31,751][277270] Updated weights for policy 0, policy_version 1760 (0.0005) +[2023-07-17 00:33:34,698][277270] Updated weights for policy 0, policy_version 1840 (0.0005) +[2023-07-17 00:33:35,434][276985] Fps is (10 sec: 13926.3, 60 sec: 14336.0, 300 sec: 13575.3). Total num frames: 950272. Throughput: 0: 14302.7. Samples: 930724. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:33:35,435][276985] Avg episode reward: [(0, '738.653')] +[2023-07-17 00:33:35,437][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001856_950272.pth... +[2023-07-17 00:33:35,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001016_520192.pth +[2023-07-17 00:33:37,669][277270] Updated weights for policy 0, policy_version 1920 (0.0005) +[2023-07-17 00:33:40,434][276985] Fps is (10 sec: 13926.3, 60 sec: 14267.7, 300 sec: 13598.7). Total num frames: 1019904. Throughput: 0: 14143.5. Samples: 1014068. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:33:40,435][276985] Avg episode reward: [(0, '831.193')] +[2023-07-17 00:33:40,435][277226] Saving new best policy, reward=831.193! 
+[2023-07-17 00:33:40,546][277270] Updated weights for policy 0, policy_version 2000 (0.0005) +[2023-07-17 00:33:43,285][277270] Updated weights for policy 0, policy_version 2080 (0.0004) +[2023-07-17 00:33:45,434][276985] Fps is (10 sec: 14336.1, 60 sec: 14336.0, 300 sec: 13670.4). Total num frames: 1093632. Throughput: 0: 14159.0. Samples: 1058228. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:33:45,435][276985] Avg episode reward: [(0, '743.802')] +[2023-07-17 00:33:45,982][277270] Updated weights for policy 0, policy_version 2160 (0.0004) +[2023-07-17 00:33:48,600][277270] Updated weights for policy 0, policy_version 2240 (0.0004) +[2023-07-17 00:33:50,434][276985] Fps is (10 sec: 15155.3, 60 sec: 14336.0, 300 sec: 13781.9). Total num frames: 1171456. Throughput: 0: 14438.3. Samples: 1150984. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:33:50,434][276985] Avg episode reward: [(0, '776.743')] +[2023-07-17 00:33:50,437][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002288_1171456.pth... +[2023-07-17 00:33:50,438][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001440_737280.pth +[2023-07-17 00:33:51,251][277270] Updated weights for policy 0, policy_version 2320 (0.0004) +[2023-07-17 00:33:53,902][277270] Updated weights for policy 0, policy_version 2400 (0.0004) +[2023-07-17 00:33:55,434][276985] Fps is (10 sec: 15564.8, 60 sec: 14404.3, 300 sec: 13880.9). Total num frames: 1249280. Throughput: 0: 14696.1. Samples: 1243924. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:33:55,435][276985] Avg episode reward: [(0, '747.670')] +[2023-07-17 00:33:56,567][277270] Updated weights for policy 0, policy_version 2480 (0.0004) +[2023-07-17 00:33:59,252][277270] Updated weights for policy 0, policy_version 2560 (0.0004) +[2023-07-17 00:34:00,434][276985] Fps is (10 sec: 15564.7, 60 sec: 14540.8, 300 sec: 13969.5). Total num frames: 1327104. Throughput: 0: 14733.6. Samples: 1290252. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:34:00,435][276985] Avg episode reward: [(0, '787.007')] +[2023-07-17 00:34:01,839][277270] Updated weights for policy 0, policy_version 2640 (0.0004) +[2023-07-17 00:34:04,404][277270] Updated weights for policy 0, policy_version 2720 (0.0003) +[2023-07-17 00:34:05,434][276985] Fps is (10 sec: 15564.8, 60 sec: 14745.6, 300 sec: 14049.3). Total num frames: 1404928. Throughput: 0: 14820.9. Samples: 1383748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:34:05,435][276985] Avg episode reward: [(0, '801.321')] +[2023-07-17 00:34:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002744_1404928.pth... +[2023-07-17 00:34:05,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000001856_950272.pth +[2023-07-17 00:34:07,318][277270] Updated weights for policy 0, policy_version 2800 (0.0004) +[2023-07-17 00:34:09,935][277270] Updated weights for policy 0, policy_version 2880 (0.0003) +[2023-07-17 00:34:10,434][276985] Fps is (10 sec: 15155.3, 60 sec: 14813.9, 300 sec: 14082.5). Total num frames: 1478656. Throughput: 0: 14815.6. Samples: 1473220. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:34:10,435][276985] Avg episode reward: [(0, '813.237')] +[2023-07-17 00:34:12,649][277270] Updated weights for policy 0, policy_version 2960 (0.0004) +[2023-07-17 00:34:15,302][277270] Updated weights for policy 0, policy_version 3040 (0.0004) +[2023-07-17 00:34:15,434][276985] Fps is (10 sec: 15155.2, 60 sec: 14882.1, 300 sec: 14149.8). Total num frames: 1556480. Throughput: 0: 14909.8. Samples: 1519128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:34:15,435][276985] Avg episode reward: [(0, '810.552')] +[2023-07-17 00:34:17,912][277270] Updated weights for policy 0, policy_version 3120 (0.0004) +[2023-07-17 00:34:20,434][276985] Fps is (10 sec: 15564.8, 60 sec: 14950.4, 300 sec: 14211.4). Total num frames: 1634304. Throughput: 0: 15138.8. Samples: 1611968. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:34:20,435][276985] Avg episode reward: [(0, '804.959')] +[2023-07-17 00:34:20,437][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003192_1634304.pth... +[2023-07-17 00:34:20,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002288_1171456.pth +[2023-07-17 00:34:20,583][277270] Updated weights for policy 0, policy_version 3200 (0.0004) +[2023-07-17 00:34:23,210][277270] Updated weights for policy 0, policy_version 3280 (0.0003) +[2023-07-17 00:34:25,434][276985] Fps is (10 sec: 15564.7, 60 sec: 15018.7, 300 sec: 14267.7). Total num frames: 1712128. Throughput: 0: 15331.8. Samples: 1704000. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:34:25,435][276985] Avg episode reward: [(0, '779.928')] +[2023-07-17 00:34:25,951][277270] Updated weights for policy 0, policy_version 3360 (0.0004) +[2023-07-17 00:34:28,636][277270] Updated weights for policy 0, policy_version 3440 (0.0004) +[2023-07-17 00:34:30,434][276985] Fps is (10 sec: 15155.1, 60 sec: 15086.9, 300 sec: 14286.9). Total num frames: 1785856. Throughput: 0: 15354.7. Samples: 1749192. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:34:30,435][276985] Avg episode reward: [(0, '792.288')] +[2023-07-17 00:34:31,247][277270] Updated weights for policy 0, policy_version 3520 (0.0003) +[2023-07-17 00:34:33,849][277270] Updated weights for policy 0, policy_version 3600 (0.0003) +[2023-07-17 00:34:35,434][276985] Fps is (10 sec: 15155.3, 60 sec: 15223.5, 300 sec: 14336.0). Total num frames: 1863680. Throughput: 0: 15384.0. Samples: 1843264. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-17 00:34:35,435][276985] Avg episode reward: [(0, '808.768')] +[2023-07-17 00:34:35,450][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003648_1867776.pth... +[2023-07-17 00:34:35,453][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000002744_1404928.pth +[2023-07-17 00:34:36,483][277270] Updated weights for policy 0, policy_version 3680 (0.0004) +[2023-07-17 00:34:39,091][277270] Updated weights for policy 0, policy_version 3760 (0.0003) +[2023-07-17 00:34:40,434][276985] Fps is (10 sec: 15974.6, 60 sec: 15428.3, 300 sec: 14411.9). Total num frames: 1945600. Throughput: 0: 15410.9. Samples: 1937416. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-17 00:34:40,435][276985] Avg episode reward: [(0, '791.806')] +[2023-07-17 00:34:41,894][277270] Updated weights for policy 0, policy_version 3840 (0.0004) +[2023-07-17 00:34:44,977][277270] Updated weights for policy 0, policy_version 3920 (0.0005) +[2023-07-17 00:34:45,434][276985] Fps is (10 sec: 14745.5, 60 sec: 15291.7, 300 sec: 14365.3). Total num frames: 2011136. Throughput: 0: 15296.9. Samples: 1978612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:34:45,435][276985] Avg episode reward: [(0, '830.363')] +[2023-07-17 00:34:47,925][277270] Updated weights for policy 0, policy_version 4000 (0.0005) +[2023-07-17 00:34:50,434][276985] Fps is (10 sec: 13926.3, 60 sec: 15223.5, 300 sec: 14378.4). Total num frames: 2084864. Throughput: 0: 15067.0. Samples: 2061764. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:34:50,435][276985] Avg episode reward: [(0, '782.287')] +[2023-07-17 00:34:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004072_2084864.pth... +[2023-07-17 00:34:50,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003192_1634304.pth +[2023-07-17 00:34:50,616][277270] Updated weights for policy 0, policy_version 4080 (0.0004) +[2023-07-17 00:34:53,291][277270] Updated weights for policy 0, policy_version 4160 (0.0004) +[2023-07-17 00:34:55,434][276985] Fps is (10 sec: 14745.6, 60 sec: 15155.2, 300 sec: 14390.6). Total num frames: 2158592. Throughput: 0: 15132.5. Samples: 2154184. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:34:55,435][276985] Avg episode reward: [(0, '792.664')] +[2023-07-17 00:34:56,032][277270] Updated weights for policy 0, policy_version 4240 (0.0004) +[2023-07-17 00:34:58,708][277270] Updated weights for policy 0, policy_version 4320 (0.0004) +[2023-07-17 00:35:00,434][276985] Fps is (10 sec: 15155.3, 60 sec: 15155.2, 300 sec: 14428.5). Total num frames: 2236416. Throughput: 0: 15101.0. Samples: 2198672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:00,435][276985] Avg episode reward: [(0, '802.974')] +[2023-07-17 00:35:01,428][277270] Updated weights for policy 0, policy_version 4400 (0.0004) +[2023-07-17 00:35:04,118][277270] Updated weights for policy 0, policy_version 4480 (0.0004) +[2023-07-17 00:35:05,434][276985] Fps is (10 sec: 15155.2, 60 sec: 15086.9, 300 sec: 14438.4). Total num frames: 2310144. Throughput: 0: 15061.6. Samples: 2289740. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:35:05,435][276985] Avg episode reward: [(0, '823.880')] +[2023-07-17 00:35:05,455][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004520_2314240.pth... +[2023-07-17 00:35:05,457][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000003648_1867776.pth +[2023-07-17 00:35:06,840][277270] Updated weights for policy 0, policy_version 4560 (0.0004) +[2023-07-17 00:35:09,819][277270] Updated weights for policy 0, policy_version 4640 (0.0005) +[2023-07-17 00:35:10,434][276985] Fps is (10 sec: 14745.6, 60 sec: 15086.9, 300 sec: 14447.7). Total num frames: 2383872. Throughput: 0: 14931.6. Samples: 2375920. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:10,435][276985] Avg episode reward: [(0, '798.436')] +[2023-07-17 00:35:12,810][277270] Updated weights for policy 0, policy_version 4720 (0.0005) +[2023-07-17 00:35:15,434][276985] Fps is (10 sec: 14336.1, 60 sec: 14950.4, 300 sec: 14432.4). Total num frames: 2453504. Throughput: 0: 14841.5. Samples: 2417060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:15,435][276985] Avg episode reward: [(0, '838.434')] +[2023-07-17 00:35:15,435][277226] Saving new best policy, reward=838.434! +[2023-07-17 00:35:15,669][277270] Updated weights for policy 0, policy_version 4800 (0.0005) +[2023-07-17 00:35:18,704][277270] Updated weights for policy 0, policy_version 4880 (0.0005) +[2023-07-17 00:35:20,434][276985] Fps is (10 sec: 13516.7, 60 sec: 14745.6, 300 sec: 14394.5). Total num frames: 2519040. Throughput: 0: 14611.5. Samples: 2500784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:20,435][276985] Avg episode reward: [(0, '833.792')] +[2023-07-17 00:35:20,494][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004928_2523136.pth... +[2023-07-17 00:35:20,496][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004072_2084864.pth +[2023-07-17 00:35:21,691][277270] Updated weights for policy 0, policy_version 4960 (0.0005) +[2023-07-17 00:35:24,664][277270] Updated weights for policy 0, policy_version 5040 (0.0005) +[2023-07-17 00:35:25,434][276985] Fps is (10 sec: 13516.7, 60 sec: 14609.1, 300 sec: 14381.5). Total num frames: 2588672. Throughput: 0: 14362.6. Samples: 2583736. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:25,435][276985] Avg episode reward: [(0, '834.314')] +[2023-07-17 00:35:27,553][277270] Updated weights for policy 0, policy_version 5120 (0.0005) +[2023-07-17 00:35:30,202][277270] Updated weights for policy 0, policy_version 5200 (0.0004) +[2023-07-17 00:35:30,434][276985] Fps is (10 sec: 14336.1, 60 sec: 14609.1, 300 sec: 14391.4). Total num frames: 2662400. Throughput: 0: 14377.7. Samples: 2625608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:30,435][276985] Avg episode reward: [(0, '819.081')] +[2023-07-17 00:35:33,161][277270] Updated weights for policy 0, policy_version 5280 (0.0005) +[2023-07-17 00:35:35,434][276985] Fps is (10 sec: 13926.2, 60 sec: 14404.2, 300 sec: 14357.6). Total num frames: 2727936. Throughput: 0: 14421.1. Samples: 2710716. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:35:35,435][276985] Avg episode reward: [(0, '825.870')] +[2023-07-17 00:35:35,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005336_2732032.pth... +[2023-07-17 00:35:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004520_2314240.pth +[2023-07-17 00:35:36,318][277270] Updated weights for policy 0, policy_version 5360 (0.0005) +[2023-07-17 00:35:39,168][277270] Updated weights for policy 0, policy_version 5440 (0.0004) +[2023-07-17 00:35:40,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14267.7, 300 sec: 14367.5). Total num frames: 2801664. Throughput: 0: 14253.0. Samples: 2795568. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:40,435][276985] Avg episode reward: [(0, '800.398')] +[2023-07-17 00:35:41,827][277270] Updated weights for policy 0, policy_version 5520 (0.0004) +[2023-07-17 00:35:44,400][277270] Updated weights for policy 0, policy_version 5600 (0.0004) +[2023-07-17 00:35:45,434][276985] Fps is (10 sec: 15155.5, 60 sec: 14472.5, 300 sec: 14397.4). Total num frames: 2879488. Throughput: 0: 14311.5. Samples: 2842688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:45,435][276985] Avg episode reward: [(0, '820.171')] +[2023-07-17 00:35:47,113][277270] Updated weights for policy 0, policy_version 5680 (0.0004) +[2023-07-17 00:35:49,667][277270] Updated weights for policy 0, policy_version 5760 (0.0003) +[2023-07-17 00:35:50,434][276985] Fps is (10 sec: 15564.8, 60 sec: 14540.8, 300 sec: 14425.9). Total num frames: 2957312. Throughput: 0: 14381.2. Samples: 2936896. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:35:50,435][276985] Avg episode reward: [(0, '795.964')] +[2023-07-17 00:35:50,459][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005784_2961408.pth... +[2023-07-17 00:35:50,460][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000004928_2523136.pth +[2023-07-17 00:35:52,298][277270] Updated weights for policy 0, policy_version 5840 (0.0003) +[2023-07-17 00:35:54,903][277270] Updated weights for policy 0, policy_version 5920 (0.0004) +[2023-07-17 00:35:55,434][276985] Fps is (10 sec: 15974.4, 60 sec: 14677.3, 300 sec: 14472.5). Total num frames: 3039232. Throughput: 0: 14547.1. Samples: 3030540. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:35:55,435][276985] Avg episode reward: [(0, '831.320')] +[2023-07-17 00:35:57,565][277270] Updated weights for policy 0, policy_version 6000 (0.0004) +[2023-07-17 00:36:00,229][277270] Updated weights for policy 0, policy_version 6080 (0.0004) +[2023-07-17 00:36:00,434][276985] Fps is (10 sec: 15564.9, 60 sec: 14609.1, 300 sec: 14478.9). Total num frames: 3112960. Throughput: 0: 14646.7. Samples: 3076160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:00,435][276985] Avg episode reward: [(0, '818.442')] +[2023-07-17 00:36:02,889][277270] Updated weights for policy 0, policy_version 6160 (0.0004) +[2023-07-17 00:36:05,434][276985] Fps is (10 sec: 15155.2, 60 sec: 14677.3, 300 sec: 14503.6). Total num frames: 3190784. Throughput: 0: 14859.6. Samples: 3169464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:05,435][276985] Avg episode reward: [(0, '811.583')] +[2023-07-17 00:36:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006232_3190784.pth... +[2023-07-17 00:36:05,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005336_2732032.pth +[2023-07-17 00:36:05,506][277270] Updated weights for policy 0, policy_version 6240 (0.0004) +[2023-07-17 00:36:08,107][277270] Updated weights for policy 0, policy_version 6320 (0.0003) +[2023-07-17 00:36:10,434][276985] Fps is (10 sec: 15564.7, 60 sec: 14745.6, 300 sec: 14527.2). Total num frames: 3268608. Throughput: 0: 15083.3. Samples: 3262484. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:10,435][276985] Avg episode reward: [(0, '807.621')] +[2023-07-17 00:36:10,787][277270] Updated weights for policy 0, policy_version 6400 (0.0004) +[2023-07-17 00:36:13,408][277270] Updated weights for policy 0, policy_version 6480 (0.0003) +[2023-07-17 00:36:15,434][276985] Fps is (10 sec: 15564.7, 60 sec: 14882.1, 300 sec: 14549.7). Total num frames: 3346432. Throughput: 0: 15199.4. Samples: 3309580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:15,435][276985] Avg episode reward: [(0, '777.123')] +[2023-07-17 00:36:16,135][277270] Updated weights for policy 0, policy_version 6560 (0.0004) +[2023-07-17 00:36:18,779][277270] Updated weights for policy 0, policy_version 6640 (0.0004) +[2023-07-17 00:36:20,434][276985] Fps is (10 sec: 15155.2, 60 sec: 15018.7, 300 sec: 14553.9). Total num frames: 3420160. Throughput: 0: 15333.0. Samples: 3400700. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:20,435][276985] Avg episode reward: [(0, '816.968')] +[2023-07-17 00:36:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006688_3424256.pth... +[2023-07-17 00:36:20,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000005784_2961408.pth +[2023-07-17 00:36:21,471][277270] Updated weights for policy 0, policy_version 6720 (0.0004) +[2023-07-17 00:36:24,049][277270] Updated weights for policy 0, policy_version 6800 (0.0004) +[2023-07-17 00:36:25,434][276985] Fps is (10 sec: 15564.8, 60 sec: 15223.5, 300 sec: 14592.0). Total num frames: 3502080. Throughput: 0: 15533.8. Samples: 3494588. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:25,435][276985] Avg episode reward: [(0, '848.224')] +[2023-07-17 00:36:25,435][277226] Saving new best policy, reward=848.224! 
+[2023-07-17 00:36:26,578][277270] Updated weights for policy 0, policy_version 6880 (0.0003) +[2023-07-17 00:36:29,279][277270] Updated weights for policy 0, policy_version 6960 (0.0004) +[2023-07-17 00:36:30,434][276985] Fps is (10 sec: 15565.0, 60 sec: 15223.5, 300 sec: 14595.1). Total num frames: 3575808. Throughput: 0: 15563.3. Samples: 3543036. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:30,434][276985] Avg episode reward: [(0, '822.579')] +[2023-07-17 00:36:32,204][277270] Updated weights for policy 0, policy_version 7040 (0.0005) +[2023-07-17 00:36:35,140][277270] Updated weights for policy 0, policy_version 7120 (0.0005) +[2023-07-17 00:36:35,434][276985] Fps is (10 sec: 14336.0, 60 sec: 15291.8, 300 sec: 14581.8). Total num frames: 3645440. Throughput: 0: 15350.2. Samples: 3627656. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-17 00:36:35,434][276985] Avg episode reward: [(0, '830.247')] +[2023-07-17 00:36:35,437][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007128_3649536.pth... +[2023-07-17 00:36:35,439][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006232_3190784.pth +[2023-07-17 00:36:38,140][277270] Updated weights for policy 0, policy_version 7200 (0.0005) +[2023-07-17 00:36:40,434][276985] Fps is (10 sec: 13926.4, 60 sec: 15223.5, 300 sec: 14568.9). Total num frames: 3715072. Throughput: 0: 15121.1. Samples: 3710988. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-17 00:36:40,435][276985] Avg episode reward: [(0, '849.273')] +[2023-07-17 00:36:40,469][277226] Saving new best policy, reward=849.273! 
+[2023-07-17 00:36:41,029][277270] Updated weights for policy 0, policy_version 7280 (0.0005) +[2023-07-17 00:36:43,985][277270] Updated weights for policy 0, policy_version 7360 (0.0005) +[2023-07-17 00:36:45,434][276985] Fps is (10 sec: 13926.3, 60 sec: 15086.9, 300 sec: 14556.6). Total num frames: 3784704. Throughput: 0: 15022.7. Samples: 3752180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:45,435][276985] Avg episode reward: [(0, '813.668')] +[2023-07-17 00:36:46,914][277270] Updated weights for policy 0, policy_version 7440 (0.0005) +[2023-07-17 00:36:49,912][277270] Updated weights for policy 0, policy_version 7520 (0.0005) +[2023-07-17 00:36:50,434][276985] Fps is (10 sec: 13926.2, 60 sec: 14950.4, 300 sec: 14544.7). Total num frames: 3854336. Throughput: 0: 14812.1. Samples: 3836012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:50,435][276985] Avg episode reward: [(0, '809.743')] +[2023-07-17 00:36:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007528_3854336.pth... +[2023-07-17 00:36:50,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000006688_3424256.pth +[2023-07-17 00:36:52,833][277270] Updated weights for policy 0, policy_version 7600 (0.0005) +[2023-07-17 00:36:55,434][276985] Fps is (10 sec: 14336.0, 60 sec: 14813.9, 300 sec: 14548.4). Total num frames: 3928064. Throughput: 0: 14612.5. Samples: 3920048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:36:55,435][276985] Avg episode reward: [(0, '840.363')] +[2023-07-17 00:36:55,706][277270] Updated weights for policy 0, policy_version 7680 (0.0005) +[2023-07-17 00:36:58,625][277270] Updated weights for policy 0, policy_version 7760 (0.0005) +[2023-07-17 00:37:00,434][276985] Fps is (10 sec: 14336.2, 60 sec: 14745.6, 300 sec: 14537.1). Total num frames: 3997696. Throughput: 0: 14518.8. 
Samples: 3962924. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:37:00,435][276985] Avg episode reward: [(0, '779.185')] +[2023-07-17 00:37:01,426][277270] Updated weights for policy 0, policy_version 7840 (0.0004) +[2023-07-17 00:37:04,018][277270] Updated weights for policy 0, policy_version 7920 (0.0003) +[2023-07-17 00:37:05,434][276985] Fps is (10 sec: 14745.6, 60 sec: 14745.6, 300 sec: 14555.4). Total num frames: 4075520. Throughput: 0: 14487.4. Samples: 4052632. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:37:05,435][276985] Avg episode reward: [(0, '841.352')] +[2023-07-17 00:37:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007960_4075520.pth... +[2023-07-17 00:37:05,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007128_3649536.pth +[2023-07-17 00:37:06,655][277270] Updated weights for policy 0, policy_version 8000 (0.0004) +[2023-07-17 00:37:09,376][277270] Updated weights for policy 0, policy_version 8080 (0.0004) +[2023-07-17 00:37:10,434][276985] Fps is (10 sec: 15155.1, 60 sec: 14677.3, 300 sec: 14558.8). Total num frames: 4149248. Throughput: 0: 14426.2. Samples: 4143768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:37:10,435][276985] Avg episode reward: [(0, '805.051')] +[2023-07-17 00:37:12,358][277270] Updated weights for policy 0, policy_version 8160 (0.0004) +[2023-07-17 00:37:14,932][277270] Updated weights for policy 0, policy_version 8240 (0.0003) +[2023-07-17 00:37:15,434][276985] Fps is (10 sec: 15155.0, 60 sec: 14677.3, 300 sec: 14576.1). Total num frames: 4227072. Throughput: 0: 14291.9. Samples: 4186176. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:37:15,435][276985] Avg episode reward: [(0, '817.797')] +[2023-07-17 00:37:17,621][277270] Updated weights for policy 0, policy_version 8320 (0.0004) +[2023-07-17 00:37:20,255][277270] Updated weights for policy 0, policy_version 8400 (0.0004) +[2023-07-17 00:37:20,434][276985] Fps is (10 sec: 15155.1, 60 sec: 14677.3, 300 sec: 14579.0). Total num frames: 4300800. Throughput: 0: 14481.7. Samples: 4279332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:37:20,435][276985] Avg episode reward: [(0, '827.285')] +[2023-07-17 00:37:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008400_4300800.pth... +[2023-07-17 00:37:20,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007528_3854336.pth +[2023-07-17 00:37:22,860][277270] Updated weights for policy 0, policy_version 8480 (0.0004) +[2023-07-17 00:37:25,434][276985] Fps is (10 sec: 14745.8, 60 sec: 14540.8, 300 sec: 14759.5). Total num frames: 4374528. Throughput: 0: 14654.3. Samples: 4370432. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:37:25,435][276985] Avg episode reward: [(0, '850.820')] +[2023-07-17 00:37:25,435][277226] Saving new best policy, reward=850.820! +[2023-07-17 00:37:25,813][277270] Updated weights for policy 0, policy_version 8560 (0.0005) +[2023-07-17 00:37:28,716][277270] Updated weights for policy 0, policy_version 8640 (0.0005) +[2023-07-17 00:37:30,434][276985] Fps is (10 sec: 14745.7, 60 sec: 14540.8, 300 sec: 14773.4). Total num frames: 4448256. Throughput: 0: 14650.6. Samples: 4411456. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:37:30,435][276985] Avg episode reward: [(0, '809.249')] +[2023-07-17 00:37:31,547][277270] Updated weights for policy 0, policy_version 8720 (0.0005) +[2023-07-17 00:37:34,341][277270] Updated weights for policy 0, policy_version 8800 (0.0005) +[2023-07-17 00:37:35,434][276985] Fps is (10 sec: 14336.0, 60 sec: 14540.8, 300 sec: 14759.5). Total num frames: 4517888. Throughput: 0: 14723.0. Samples: 4498548. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:37:35,435][276985] Avg episode reward: [(0, '838.928')] +[2023-07-17 00:37:35,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008824_4517888.pth... +[2023-07-17 00:37:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000007960_4075520.pth +[2023-07-17 00:37:37,301][277270] Updated weights for policy 0, policy_version 8880 (0.0005) +[2023-07-17 00:37:40,263][277270] Updated weights for policy 0, policy_version 8960 (0.0005) +[2023-07-17 00:37:40,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14540.8, 300 sec: 14759.5). Total num frames: 4587520. Throughput: 0: 14702.0. Samples: 4581636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:37:40,435][276985] Avg episode reward: [(0, '821.416')] +[2023-07-17 00:37:43,192][277270] Updated weights for policy 0, policy_version 9040 (0.0005) +[2023-07-17 00:37:45,434][276985] Fps is (10 sec: 13926.5, 60 sec: 14540.8, 300 sec: 14731.7). Total num frames: 4657152. Throughput: 0: 14699.1. Samples: 4624384. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:37:45,435][276985] Avg episode reward: [(0, '789.879')] +[2023-07-17 00:37:46,160][277270] Updated weights for policy 0, policy_version 9120 (0.0005) +[2023-07-17 00:37:49,227][277270] Updated weights for policy 0, policy_version 9200 (0.0005) +[2023-07-17 00:37:50,434][276985] Fps is (10 sec: 13516.9, 60 sec: 14472.6, 300 sec: 14703.9). Total num frames: 4722688. Throughput: 0: 14518.2. Samples: 4705948. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:37:50,434][276985] Avg episode reward: [(0, '837.827')] +[2023-07-17 00:37:50,477][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009232_4726784.pth... +[2023-07-17 00:37:50,479][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008400_4300800.pth +[2023-07-17 00:37:52,212][277270] Updated weights for policy 0, policy_version 9280 (0.0005) +[2023-07-17 00:37:54,867][277270] Updated weights for policy 0, policy_version 9360 (0.0004) +[2023-07-17 00:37:55,434][276985] Fps is (10 sec: 14336.0, 60 sec: 14540.8, 300 sec: 14731.7). Total num frames: 4800512. Throughput: 0: 14413.7. Samples: 4792384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:37:55,435][276985] Avg episode reward: [(0, '816.845')] +[2023-07-17 00:37:57,479][277270] Updated weights for policy 0, policy_version 9440 (0.0003) +[2023-07-17 00:38:00,073][277270] Updated weights for policy 0, policy_version 9520 (0.0003) +[2023-07-17 00:38:00,434][276985] Fps is (10 sec: 15564.7, 60 sec: 14677.3, 300 sec: 14773.4). Total num frames: 4878336. Throughput: 0: 14517.7. Samples: 4839472. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:38:00,435][276985] Avg episode reward: [(0, '822.083')] +[2023-07-17 00:38:02,832][277270] Updated weights for policy 0, policy_version 9600 (0.0004) +[2023-07-17 00:38:05,434][276985] Fps is (10 sec: 15155.1, 60 sec: 14609.1, 300 sec: 14787.3). Total num frames: 4952064. Throughput: 0: 14450.6. Samples: 4929608. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:38:05,435][276985] Avg episode reward: [(0, '837.500')] +[2023-07-17 00:38:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009672_4952064.pth... +[2023-07-17 00:38:05,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000008824_4517888.pth +[2023-07-17 00:38:05,666][277270] Updated weights for policy 0, policy_version 9680 (0.0004) +[2023-07-17 00:38:08,290][277270] Updated weights for policy 0, policy_version 9760 (0.0004) +[2023-07-17 00:38:10,434][276985] Fps is (10 sec: 14745.7, 60 sec: 14609.1, 300 sec: 14787.3). Total num frames: 5025792. Throughput: 0: 14453.3. Samples: 5020828. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-07-17 00:38:10,435][276985] Avg episode reward: [(0, '814.835')] +[2023-07-17 00:38:10,947][277270] Updated weights for policy 0, policy_version 9840 (0.0004) +[2023-07-17 00:38:13,615][277270] Updated weights for policy 0, policy_version 9920 (0.0004) +[2023-07-17 00:38:15,434][276985] Fps is (10 sec: 14745.6, 60 sec: 14540.8, 300 sec: 14787.3). Total num frames: 5099520. Throughput: 0: 14563.7. Samples: 5066824. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:38:15,435][276985] Avg episode reward: [(0, '812.871')] +[2023-07-17 00:38:16,660][277270] Updated weights for policy 0, policy_version 10000 (0.0006) +[2023-07-17 00:38:19,636][277270] Updated weights for policy 0, policy_version 10080 (0.0004) +[2023-07-17 00:38:20,434][276985] Fps is (10 sec: 14335.9, 60 sec: 14472.5, 300 sec: 14773.4). Total num frames: 5169152. Throughput: 0: 14483.9. Samples: 5150324. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:38:20,435][276985] Avg episode reward: [(0, '832.958')] +[2023-07-17 00:38:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010096_5169152.pth... +[2023-07-17 00:38:20,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009232_4726784.pth +[2023-07-17 00:38:22,585][277270] Updated weights for policy 0, policy_version 10160 (0.0005) +[2023-07-17 00:38:25,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14404.3, 300 sec: 14773.4). Total num frames: 5238784. Throughput: 0: 14471.3. Samples: 5232844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:38:25,435][276985] Avg episode reward: [(0, '839.821')] +[2023-07-17 00:38:25,610][277270] Updated weights for policy 0, policy_version 10240 (0.0005) +[2023-07-17 00:38:28,643][277270] Updated weights for policy 0, policy_version 10320 (0.0005) +[2023-07-17 00:38:30,434][276985] Fps is (10 sec: 13516.9, 60 sec: 14267.8, 300 sec: 14759.5). Total num frames: 5304320. Throughput: 0: 14419.9. Samples: 5273280. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:38:30,435][276985] Avg episode reward: [(0, '835.887')] +[2023-07-17 00:38:31,633][277270] Updated weights for policy 0, policy_version 10400 (0.0005) +[2023-07-17 00:38:34,496][277270] Updated weights for policy 0, policy_version 10480 (0.0004) +[2023-07-17 00:38:35,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14336.0, 300 sec: 14773.4). Total num frames: 5378048. Throughput: 0: 14475.4. Samples: 5357344. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:38:35,435][276985] Avg episode reward: [(0, '839.972')] +[2023-07-17 00:38:35,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010504_5378048.pth... +[2023-07-17 00:38:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000009672_4952064.pth +[2023-07-17 00:38:37,419][277270] Updated weights for policy 0, policy_version 10560 (0.0004) +[2023-07-17 00:38:40,293][277270] Updated weights for policy 0, policy_version 10640 (0.0004) +[2023-07-17 00:38:40,434][276985] Fps is (10 sec: 14335.9, 60 sec: 14336.0, 300 sec: 14759.5). Total num frames: 5447680. Throughput: 0: 14435.7. Samples: 5441992. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:38:40,435][276985] Avg episode reward: [(0, '834.114')] +[2023-07-17 00:38:43,209][277270] Updated weights for policy 0, policy_version 10720 (0.0005) +[2023-07-17 00:38:45,434][276985] Fps is (10 sec: 14336.1, 60 sec: 14404.3, 300 sec: 14745.6). Total num frames: 5521408. Throughput: 0: 14326.5. Samples: 5484164. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:38:45,435][276985] Avg episode reward: [(0, '847.237')] +[2023-07-17 00:38:45,825][277270] Updated weights for policy 0, policy_version 10800 (0.0004) +[2023-07-17 00:38:48,587][277270] Updated weights for policy 0, policy_version 10880 (0.0004) +[2023-07-17 00:38:50,434][276985] Fps is (10 sec: 14745.6, 60 sec: 14540.8, 300 sec: 14731.7). Total num frames: 5595136. Throughput: 0: 14334.4. Samples: 5574656. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:38:50,435][276985] Avg episode reward: [(0, '849.395')] +[2023-07-17 00:38:50,437][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010928_5595136.pth... +[2023-07-17 00:38:50,439][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010096_5169152.pth +[2023-07-17 00:38:51,248][277270] Updated weights for policy 0, policy_version 10960 (0.0004) +[2023-07-17 00:38:53,933][277270] Updated weights for policy 0, policy_version 11040 (0.0004) +[2023-07-17 00:38:55,434][276985] Fps is (10 sec: 15155.1, 60 sec: 14540.8, 300 sec: 14731.7). Total num frames: 5672960. Throughput: 0: 14331.9. Samples: 5665764. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:38:55,435][276985] Avg episode reward: [(0, '851.638')] +[2023-07-17 00:38:55,436][277226] Saving new best policy, reward=851.638! +[2023-07-17 00:38:56,869][277270] Updated weights for policy 0, policy_version 11120 (0.0005) +[2023-07-17 00:38:59,763][277270] Updated weights for policy 0, policy_version 11200 (0.0005) +[2023-07-17 00:39:00,434][276985] Fps is (10 sec: 14745.5, 60 sec: 14404.3, 300 sec: 14703.9). Total num frames: 5742592. Throughput: 0: 14254.2. Samples: 5708264. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:39:00,435][276985] Avg episode reward: [(0, '848.789')] +[2023-07-17 00:39:02,599][277270] Updated weights for policy 0, policy_version 11280 (0.0004) +[2023-07-17 00:39:05,238][277270] Updated weights for policy 0, policy_version 11360 (0.0004) +[2023-07-17 00:39:05,434][276985] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 14703.9). Total num frames: 5816320. Throughput: 0: 14316.6. Samples: 5794572. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:39:05,435][276985] Avg episode reward: [(0, '824.108')] +[2023-07-17 00:39:05,439][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011360_5816320.pth... +[2023-07-17 00:39:05,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010504_5378048.pth +[2023-07-17 00:39:07,999][277270] Updated weights for policy 0, policy_version 11440 (0.0004) +[2023-07-17 00:39:10,434][276985] Fps is (10 sec: 14745.7, 60 sec: 14404.3, 300 sec: 14690.1). Total num frames: 5890048. Throughput: 0: 14445.8. Samples: 5882904. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2023-07-17 00:39:10,435][276985] Avg episode reward: [(0, '833.659')] +[2023-07-17 00:39:10,985][277270] Updated weights for policy 0, policy_version 11520 (0.0005) +[2023-07-17 00:39:13,762][277270] Updated weights for policy 0, policy_version 11600 (0.0004) +[2023-07-17 00:39:15,434][276985] Fps is (10 sec: 14745.7, 60 sec: 14404.3, 300 sec: 14676.2). Total num frames: 5963776. Throughput: 0: 14475.1. Samples: 5924660. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:39:15,435][276985] Avg episode reward: [(0, '839.667')] +[2023-07-17 00:39:16,388][277270] Updated weights for policy 0, policy_version 11680 (0.0004) +[2023-07-17 00:39:19,036][277270] Updated weights for policy 0, policy_version 11760 (0.0004) +[2023-07-17 00:39:20,434][276985] Fps is (10 sec: 14745.4, 60 sec: 14472.5, 300 sec: 14662.3). Total num frames: 6037504. Throughput: 0: 14692.8. Samples: 6018520. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:39:20,435][276985] Avg episode reward: [(0, '825.320')] +[2023-07-17 00:39:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011792_6037504.pth... +[2023-07-17 00:39:20,442][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000010928_5595136.pth +[2023-07-17 00:39:21,950][277270] Updated weights for policy 0, policy_version 11840 (0.0005) +[2023-07-17 00:39:24,745][277270] Updated weights for policy 0, policy_version 11920 (0.0004) +[2023-07-17 00:39:25,434][276985] Fps is (10 sec: 14745.6, 60 sec: 14540.8, 300 sec: 14662.3). Total num frames: 6111232. Throughput: 0: 14722.0. Samples: 6104484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:39:25,435][276985] Avg episode reward: [(0, '844.398')] +[2023-07-17 00:39:27,628][277270] Updated weights for policy 0, policy_version 12000 (0.0005) +[2023-07-17 00:39:30,434][276985] Fps is (10 sec: 14336.2, 60 sec: 14609.0, 300 sec: 14634.5). Total num frames: 6180864. Throughput: 0: 14743.5. Samples: 6147624. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:39:30,435][276985] Avg episode reward: [(0, '838.241')] +[2023-07-17 00:39:30,605][277270] Updated weights for policy 0, policy_version 12080 (0.0005) +[2023-07-17 00:39:33,556][277270] Updated weights for policy 0, policy_version 12160 (0.0005) +[2023-07-17 00:39:35,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14540.8, 300 sec: 14592.9). Total num frames: 6250496. Throughput: 0: 14565.1. Samples: 6230088. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:39:35,435][276985] Avg episode reward: [(0, '847.675')] +[2023-07-17 00:39:35,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012208_6250496.pth... +[2023-07-17 00:39:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011360_5816320.pth +[2023-07-17 00:39:36,515][277270] Updated weights for policy 0, policy_version 12240 (0.0005) +[2023-07-17 00:39:39,531][277270] Updated weights for policy 0, policy_version 12320 (0.0005) +[2023-07-17 00:39:40,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14540.8, 300 sec: 14606.8). Total num frames: 6320128. Throughput: 0: 14361.0. Samples: 6312008. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:39:40,435][276985] Avg episode reward: [(0, '849.039')] +[2023-07-17 00:39:42,462][277270] Updated weights for policy 0, policy_version 12400 (0.0006) +[2023-07-17 00:39:45,434][276985] Fps is (10 sec: 13516.9, 60 sec: 14404.3, 300 sec: 14579.0). Total num frames: 6385664. Throughput: 0: 14361.7. Samples: 6354540. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:39:45,435][276985] Avg episode reward: [(0, '849.687')] +[2023-07-17 00:39:45,494][277270] Updated weights for policy 0, policy_version 12480 (0.0005) +[2023-07-17 00:39:48,399][277270] Updated weights for policy 0, policy_version 12560 (0.0004) +[2023-07-17 00:39:50,434][276985] Fps is (10 sec: 13516.7, 60 sec: 14336.0, 300 sec: 14565.1). Total num frames: 6455296. Throughput: 0: 14279.9. Samples: 6437168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:39:50,435][276985] Avg episode reward: [(0, '805.988')] +[2023-07-17 00:39:50,440][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012616_6459392.pth... +[2023-07-17 00:39:50,443][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000011792_6037504.pth +[2023-07-17 00:39:51,324][277270] Updated weights for policy 0, policy_version 12640 (0.0005) +[2023-07-17 00:39:54,251][277270] Updated weights for policy 0, policy_version 12720 (0.0005) +[2023-07-17 00:39:55,434][276985] Fps is (10 sec: 13926.3, 60 sec: 14199.5, 300 sec: 14537.3). Total num frames: 6524928. Throughput: 0: 14176.3. Samples: 6520840. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:39:55,435][276985] Avg episode reward: [(0, '865.641')] +[2023-07-17 00:39:55,439][277226] Saving new best policy, reward=865.641! +[2023-07-17 00:39:57,168][277270] Updated weights for policy 0, policy_version 12800 (0.0005) +[2023-07-17 00:40:00,102][277270] Updated weights for policy 0, policy_version 12880 (0.0005) +[2023-07-17 00:40:00,434][276985] Fps is (10 sec: 14336.1, 60 sec: 14267.8, 300 sec: 14537.3). Total num frames: 6598656. Throughput: 0: 14187.4. Samples: 6563092. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-07-17 00:40:00,435][276985] Avg episode reward: [(0, '857.995')] +[2023-07-17 00:40:03,035][277270] Updated weights for policy 0, policy_version 12960 (0.0005) +[2023-07-17 00:40:05,434][276985] Fps is (10 sec: 14335.9, 60 sec: 14199.5, 300 sec: 14523.4). Total num frames: 6668288. Throughput: 0: 13957.6. Samples: 6646612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:40:05,435][276985] Avg episode reward: [(0, '847.272')] +[2023-07-17 00:40:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013024_6668288.pth... +[2023-07-17 00:40:05,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012208_6250496.pth +[2023-07-17 00:40:06,013][277270] Updated weights for policy 0, policy_version 13040 (0.0005) +[2023-07-17 00:40:08,874][277270] Updated weights for policy 0, policy_version 13120 (0.0005) +[2023-07-17 00:40:10,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14131.2, 300 sec: 14523.4). Total num frames: 6737920. Throughput: 0: 13922.8. Samples: 6731008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:40:10,435][276985] Avg episode reward: [(0, '849.974')] +[2023-07-17 00:40:11,740][277270] Updated weights for policy 0, policy_version 13200 (0.0005) +[2023-07-17 00:40:14,706][277270] Updated weights for policy 0, policy_version 13280 (0.0005) +[2023-07-17 00:40:15,434][276985] Fps is (10 sec: 13926.5, 60 sec: 14062.9, 300 sec: 14537.3). Total num frames: 6807552. Throughput: 0: 13937.1. Samples: 6774792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:40:15,435][276985] Avg episode reward: [(0, '854.907')] +[2023-07-17 00:40:17,586][277270] Updated weights for policy 0, policy_version 13360 (0.0005) +[2023-07-17 00:40:20,434][276985] Fps is (10 sec: 13926.3, 60 sec: 13994.7, 300 sec: 14537.3). Total num frames: 6877184. 
Throughput: 0: 13954.4. Samples: 6858036. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:40:20,435][276985] Avg episode reward: [(0, '851.130')] +[2023-07-17 00:40:20,487][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013440_6881280.pth... +[2023-07-17 00:40:20,487][277270] Updated weights for policy 0, policy_version 13440 (0.0005) +[2023-07-17 00:40:20,489][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000012616_6459392.pth +[2023-07-17 00:40:23,348][277270] Updated weights for policy 0, policy_version 13520 (0.0005) +[2023-07-17 00:40:25,434][276985] Fps is (10 sec: 14335.8, 60 sec: 13994.6, 300 sec: 14537.3). Total num frames: 6950912. Throughput: 0: 14025.4. Samples: 6943152. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:40:25,435][276985] Avg episode reward: [(0, '840.539')] +[2023-07-17 00:40:26,286][277270] Updated weights for policy 0, policy_version 13600 (0.0005) +[2023-07-17 00:40:29,137][277270] Updated weights for policy 0, policy_version 13680 (0.0005) +[2023-07-17 00:40:30,434][276985] Fps is (10 sec: 14336.1, 60 sec: 13994.7, 300 sec: 14551.2). Total num frames: 7020544. Throughput: 0: 14048.9. Samples: 6986740. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-07-17 00:40:30,435][276985] Avg episode reward: [(0, '852.357')] +[2023-07-17 00:40:32,211][277270] Updated weights for policy 0, policy_version 13760 (0.0005) +[2023-07-17 00:40:35,108][277270] Updated weights for policy 0, policy_version 13840 (0.0005) +[2023-07-17 00:40:35,434][276985] Fps is (10 sec: 13926.5, 60 sec: 13994.6, 300 sec: 14537.3). Total num frames: 7090176. Throughput: 0: 14044.4. Samples: 7069168. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:40:35,435][276985] Avg episode reward: [(0, '851.582')] +[2023-07-17 00:40:35,439][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013848_7090176.pth... +[2023-07-17 00:40:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013024_6668288.pth +[2023-07-17 00:40:38,072][277270] Updated weights for policy 0, policy_version 13920 (0.0005) +[2023-07-17 00:40:40,434][276985] Fps is (10 sec: 13516.9, 60 sec: 13926.4, 300 sec: 14495.7). Total num frames: 7155712. Throughput: 0: 14018.7. Samples: 7151680. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-07-17 00:40:40,435][276985] Avg episode reward: [(0, '847.746')] +[2023-07-17 00:40:41,026][277270] Updated weights for policy 0, policy_version 14000 (0.0005) +[2023-07-17 00:40:43,993][277270] Updated weights for policy 0, policy_version 14080 (0.0005) +[2023-07-17 00:40:45,434][276985] Fps is (10 sec: 13926.6, 60 sec: 14062.9, 300 sec: 14481.8). Total num frames: 7229440. Throughput: 0: 13993.7. Samples: 7192808. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-17 00:40:45,435][276985] Avg episode reward: [(0, '858.452')] +[2023-07-17 00:40:46,842][277270] Updated weights for policy 0, policy_version 14160 (0.0005) +[2023-07-17 00:40:49,682][277270] Updated weights for policy 0, policy_version 14240 (0.0004) +[2023-07-17 00:40:50,434][276985] Fps is (10 sec: 14335.8, 60 sec: 14062.9, 300 sec: 14440.1). Total num frames: 7299072. Throughput: 0: 14057.1. Samples: 7279180. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-07-17 00:40:50,435][276985] Avg episode reward: [(0, '845.123')] +[2023-07-17 00:40:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014256_7299072.pth... 
+[2023-07-17 00:40:50,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013440_6881280.pth +[2023-07-17 00:40:52,676][277270] Updated weights for policy 0, policy_version 14320 (0.0005) +[2023-07-17 00:40:55,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14062.9, 300 sec: 14426.2). Total num frames: 7368704. Throughput: 0: 14071.1. Samples: 7364208. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-17 00:40:55,435][276985] Avg episode reward: [(0, '862.105')] +[2023-07-17 00:40:55,518][277270] Updated weights for policy 0, policy_version 14400 (0.0005) +[2023-07-17 00:40:58,449][277270] Updated weights for policy 0, policy_version 14480 (0.0005) +[2023-07-17 00:41:00,434][276985] Fps is (10 sec: 13926.4, 60 sec: 13994.7, 300 sec: 14398.5). Total num frames: 7438336. Throughput: 0: 14017.5. Samples: 7405580. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-07-17 00:41:00,435][276985] Avg episode reward: [(0, '829.627')] +[2023-07-17 00:41:01,459][277270] Updated weights for policy 0, policy_version 14560 (0.0005) +[2023-07-17 00:41:04,317][277270] Updated weights for policy 0, policy_version 14640 (0.0005) +[2023-07-17 00:41:05,434][276985] Fps is (10 sec: 13926.4, 60 sec: 13994.7, 300 sec: 14370.7). Total num frames: 7507968. Throughput: 0: 14022.5. Samples: 7489048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:05,435][276985] Avg episode reward: [(0, '844.752')] +[2023-07-17 00:41:05,456][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014672_7512064.pth... 
+[2023-07-17 00:41:05,459][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000013848_7090176.pth +[2023-07-17 00:41:07,265][277270] Updated weights for policy 0, policy_version 14720 (0.0005) +[2023-07-17 00:41:10,205][277270] Updated weights for policy 0, policy_version 14800 (0.0005) +[2023-07-17 00:41:10,434][276985] Fps is (10 sec: 13926.4, 60 sec: 13994.7, 300 sec: 14342.9). Total num frames: 7577600. Throughput: 0: 13999.9. Samples: 7573144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:10,435][276985] Avg episode reward: [(0, '851.320')] +[2023-07-17 00:41:12,854][277270] Updated weights for policy 0, policy_version 14880 (0.0004) +[2023-07-17 00:41:15,434][276985] Fps is (10 sec: 14745.6, 60 sec: 14131.2, 300 sec: 14356.8). Total num frames: 7655424. Throughput: 0: 14041.9. Samples: 7618624. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:15,435][276985] Avg episode reward: [(0, '862.146')] +[2023-07-17 00:41:15,556][277270] Updated weights for policy 0, policy_version 14960 (0.0004) +[2023-07-17 00:41:18,191][277270] Updated weights for policy 0, policy_version 15040 (0.0004) +[2023-07-17 00:41:20,434][276985] Fps is (10 sec: 15564.7, 60 sec: 14267.7, 300 sec: 14342.9). Total num frames: 7733248. Throughput: 0: 14251.0. Samples: 7710464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:20,435][276985] Avg episode reward: [(0, '859.769')] +[2023-07-17 00:41:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015104_7733248.pth... 
+[2023-07-17 00:41:20,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014256_7299072.pth +[2023-07-17 00:41:20,812][277270] Updated weights for policy 0, policy_version 15120 (0.0004) +[2023-07-17 00:41:23,463][277270] Updated weights for policy 0, policy_version 15200 (0.0004) +[2023-07-17 00:41:25,434][276985] Fps is (10 sec: 15564.7, 60 sec: 14336.0, 300 sec: 14356.8). Total num frames: 7811072. Throughput: 0: 14487.1. Samples: 7803600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:25,435][276985] Avg episode reward: [(0, '841.602')] +[2023-07-17 00:41:26,129][277270] Updated weights for policy 0, policy_version 15280 (0.0004) +[2023-07-17 00:41:29,104][277270] Updated weights for policy 0, policy_version 15360 (0.0005) +[2023-07-17 00:41:30,434][276985] Fps is (10 sec: 14745.7, 60 sec: 14336.0, 300 sec: 14356.8). Total num frames: 7880704. Throughput: 0: 14557.7. Samples: 7847904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:30,435][276985] Avg episode reward: [(0, '850.189')] +[2023-07-17 00:41:32,067][277270] Updated weights for policy 0, policy_version 15440 (0.0005) +[2023-07-17 00:41:34,984][277270] Updated weights for policy 0, policy_version 15520 (0.0005) +[2023-07-17 00:41:35,434][276985] Fps is (10 sec: 13926.6, 60 sec: 14336.0, 300 sec: 14356.8). Total num frames: 7950336. Throughput: 0: 14470.0. Samples: 7930328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:35,435][276985] Avg episode reward: [(0, '835.591')] +[2023-07-17 00:41:35,437][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015528_7950336.pth... 
+[2023-07-17 00:41:35,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000014672_7512064.pth +[2023-07-17 00:41:37,977][277270] Updated weights for policy 0, policy_version 15600 (0.0005) +[2023-07-17 00:41:40,434][276985] Fps is (10 sec: 13926.5, 60 sec: 14404.3, 300 sec: 14356.8). Total num frames: 8019968. Throughput: 0: 14416.7. Samples: 8012960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:40,435][276985] Avg episode reward: [(0, '854.954')] +[2023-07-17 00:41:40,889][277270] Updated weights for policy 0, policy_version 15680 (0.0005) +[2023-07-17 00:41:43,534][277270] Updated weights for policy 0, policy_version 15760 (0.0004) +[2023-07-17 00:41:45,434][276985] Fps is (10 sec: 14745.7, 60 sec: 14472.6, 300 sec: 14384.6). Total num frames: 8097792. Throughput: 0: 14513.3. Samples: 8058676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:45,435][276985] Avg episode reward: [(0, '860.554')] +[2023-07-17 00:41:46,089][277270] Updated weights for policy 0, policy_version 15840 (0.0004) +[2023-07-17 00:41:48,774][277270] Updated weights for policy 0, policy_version 15920 (0.0004) +[2023-07-17 00:41:50,434][276985] Fps is (10 sec: 15564.6, 60 sec: 14609.1, 300 sec: 14398.5). Total num frames: 8175616. Throughput: 0: 14737.0. Samples: 8152212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:50,435][276985] Avg episode reward: [(0, '858.705')] +[2023-07-17 00:41:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015968_8175616.pth... 
+[2023-07-17 00:41:50,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015104_7733248.pth +[2023-07-17 00:41:51,391][277270] Updated weights for policy 0, policy_version 16000 (0.0004) +[2023-07-17 00:41:53,961][277270] Updated weights for policy 0, policy_version 16080 (0.0004) +[2023-07-17 00:41:55,434][276985] Fps is (10 sec: 15564.6, 60 sec: 14745.6, 300 sec: 14426.3). Total num frames: 8253440. Throughput: 0: 14954.1. Samples: 8246076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:41:55,435][276985] Avg episode reward: [(0, '842.072')] +[2023-07-17 00:41:56,576][277270] Updated weights for policy 0, policy_version 16160 (0.0004) +[2023-07-17 00:41:59,466][277270] Updated weights for policy 0, policy_version 16240 (0.0005) +[2023-07-17 00:42:00,434][276985] Fps is (10 sec: 15155.3, 60 sec: 14813.9, 300 sec: 14412.4). Total num frames: 8327168. Throughput: 0: 15001.2. Samples: 8293680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:00,435][276985] Avg episode reward: [(0, '854.967')] +[2023-07-17 00:42:02,389][277270] Updated weights for policy 0, policy_version 16320 (0.0005) +[2023-07-17 00:42:05,303][277270] Updated weights for policy 0, policy_version 16400 (0.0005) +[2023-07-17 00:42:05,434][276985] Fps is (10 sec: 14335.9, 60 sec: 14813.9, 300 sec: 14398.5). Total num frames: 8396800. Throughput: 0: 14806.1. Samples: 8376740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:05,435][276985] Avg episode reward: [(0, '859.111')] +[2023-07-17 00:42:05,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016400_8396800.pth... 
+[2023-07-17 00:42:05,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015528_7950336.pth +[2023-07-17 00:42:08,204][277270] Updated weights for policy 0, policy_version 16480 (0.0005) +[2023-07-17 00:42:10,434][276985] Fps is (10 sec: 13926.5, 60 sec: 14813.9, 300 sec: 14370.7). Total num frames: 8466432. Throughput: 0: 14629.5. Samples: 8461924. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:10,435][276985] Avg episode reward: [(0, '865.814')] +[2023-07-17 00:42:10,463][277226] Saving new best policy, reward=865.814! +[2023-07-17 00:42:11,044][277270] Updated weights for policy 0, policy_version 16560 (0.0005) +[2023-07-17 00:42:14,032][277270] Updated weights for policy 0, policy_version 16640 (0.0005) +[2023-07-17 00:42:15,434][276985] Fps is (10 sec: 13926.5, 60 sec: 14677.3, 300 sec: 14356.8). Total num frames: 8536064. Throughput: 0: 14578.0. Samples: 8503912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:15,434][276985] Avg episode reward: [(0, '859.109')] +[2023-07-17 00:42:16,999][277270] Updated weights for policy 0, policy_version 16720 (0.0005) +[2023-07-17 00:42:19,990][277270] Updated weights for policy 0, policy_version 16800 (0.0005) +[2023-07-17 00:42:20,434][276985] Fps is (10 sec: 13926.2, 60 sec: 14540.8, 300 sec: 14342.9). Total num frames: 8605696. Throughput: 0: 14562.5. Samples: 8585644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:20,435][276985] Avg episode reward: [(0, '853.919')] +[2023-07-17 00:42:20,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016808_8605696.pth... 
+[2023-07-17 00:42:20,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000015968_8175616.pth +[2023-07-17 00:42:22,929][277270] Updated weights for policy 0, policy_version 16880 (0.0005) +[2023-07-17 00:42:25,434][276985] Fps is (10 sec: 13926.3, 60 sec: 14404.3, 300 sec: 14329.1). Total num frames: 8675328. Throughput: 0: 14567.4. Samples: 8668492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:25,435][276985] Avg episode reward: [(0, '856.414')] +[2023-07-17 00:42:25,940][277270] Updated weights for policy 0, policy_version 16960 (0.0005) +[2023-07-17 00:42:28,968][277270] Updated weights for policy 0, policy_version 17040 (0.0005) +[2023-07-17 00:42:30,434][276985] Fps is (10 sec: 13516.9, 60 sec: 14336.0, 300 sec: 14315.2). Total num frames: 8740864. Throughput: 0: 14452.1. Samples: 8709024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:30,435][276985] Avg episode reward: [(0, '842.645')] +[2023-07-17 00:42:31,947][277270] Updated weights for policy 0, policy_version 17120 (0.0005) +[2023-07-17 00:42:34,973][277270] Updated weights for policy 0, policy_version 17200 (0.0005) +[2023-07-17 00:42:35,434][276985] Fps is (10 sec: 13516.8, 60 sec: 14336.0, 300 sec: 14315.2). Total num frames: 8810496. Throughput: 0: 14190.0. Samples: 8790760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:35,435][276985] Avg episode reward: [(0, '865.283')] +[2023-07-17 00:42:35,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017208_8810496.pth... 
+[2023-07-17 00:42:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016400_8396800.pth +[2023-07-17 00:42:37,698][277270] Updated weights for policy 0, policy_version 17280 (0.0004) +[2023-07-17 00:42:40,419][277270] Updated weights for policy 0, policy_version 17360 (0.0004) +[2023-07-17 00:42:40,434][276985] Fps is (10 sec: 14745.6, 60 sec: 14472.5, 300 sec: 14342.9). Total num frames: 8888320. Throughput: 0: 14090.3. Samples: 8880140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:40,435][276985] Avg episode reward: [(0, '851.598')] +[2023-07-17 00:42:43,101][277270] Updated weights for policy 0, policy_version 17440 (0.0004) +[2023-07-17 00:42:45,434][276985] Fps is (10 sec: 15155.3, 60 sec: 14404.2, 300 sec: 14370.7). Total num frames: 8962048. Throughput: 0: 14040.5. Samples: 8925504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:45,435][276985] Avg episode reward: [(0, '855.794')] +[2023-07-17 00:42:45,739][277270] Updated weights for policy 0, policy_version 17520 (0.0004) +[2023-07-17 00:42:48,336][277270] Updated weights for policy 0, policy_version 17600 (0.0004) +[2023-07-17 00:42:50,434][276985] Fps is (10 sec: 15155.2, 60 sec: 14404.3, 300 sec: 14370.7). Total num frames: 9039872. Throughput: 0: 14281.4. Samples: 9019404. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:50,435][276985] Avg episode reward: [(0, '824.514')] +[2023-07-17 00:42:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017656_9039872.pth... 
+[2023-07-17 00:42:50,440][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000016808_8605696.pth +[2023-07-17 00:42:50,995][277270] Updated weights for policy 0, policy_version 17680 (0.0005) +[2023-07-17 00:42:53,579][277270] Updated weights for policy 0, policy_version 17760 (0.0004) +[2023-07-17 00:42:55,434][276985] Fps is (10 sec: 15974.4, 60 sec: 14472.5, 300 sec: 14384.6). Total num frames: 9121792. Throughput: 0: 14483.3. Samples: 9113672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:42:55,435][276985] Avg episode reward: [(0, '852.618')] +[2023-07-17 00:42:56,198][277270] Updated weights for policy 0, policy_version 17840 (0.0004) +[2023-07-17 00:42:59,156][277270] Updated weights for policy 0, policy_version 17920 (0.0005) +[2023-07-17 00:43:00,434][276985] Fps is (10 sec: 15155.3, 60 sec: 14404.3, 300 sec: 14370.7). Total num frames: 9191424. Throughput: 0: 14509.6. Samples: 9156844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:00,435][276985] Avg episode reward: [(0, '860.753')] +[2023-07-17 00:43:02,106][277270] Updated weights for policy 0, policy_version 18000 (0.0005) +[2023-07-17 00:43:05,065][277270] Updated weights for policy 0, policy_version 18080 (0.0005) +[2023-07-17 00:43:05,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14404.3, 300 sec: 14356.8). Total num frames: 9261056. Throughput: 0: 14539.7. Samples: 9239928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:05,454][276985] Avg episode reward: [(0, '844.413')] +[2023-07-17 00:43:05,457][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018088_9261056.pth... 
+[2023-07-17 00:43:05,459][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017208_8810496.pth +[2023-07-17 00:43:07,993][277270] Updated weights for policy 0, policy_version 18160 (0.0005) +[2023-07-17 00:43:10,434][276985] Fps is (10 sec: 13926.4, 60 sec: 14404.3, 300 sec: 14342.9). Total num frames: 9330688. Throughput: 0: 14595.3. Samples: 9325280. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:10,435][276985] Avg episode reward: [(0, '861.607')] +[2023-07-17 00:43:10,706][277270] Updated weights for policy 0, policy_version 18240 (0.0005) +[2023-07-17 00:43:13,314][277270] Updated weights for policy 0, policy_version 18320 (0.0004) +[2023-07-17 00:43:15,434][276985] Fps is (10 sec: 15155.3, 60 sec: 14609.1, 300 sec: 14384.6). Total num frames: 9412608. Throughput: 0: 14749.6. Samples: 9372756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:15,435][276985] Avg episode reward: [(0, '865.390')] +[2023-07-17 00:43:15,883][277270] Updated weights for policy 0, policy_version 18400 (0.0004) +[2023-07-17 00:43:18,420][277270] Updated weights for policy 0, policy_version 18480 (0.0004) +[2023-07-17 00:43:20,434][276985] Fps is (10 sec: 15974.4, 60 sec: 14745.6, 300 sec: 14412.4). Total num frames: 9490432. Throughput: 0: 15062.2. Samples: 9468560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:20,442][276985] Avg episode reward: [(0, '854.361')] +[2023-07-17 00:43:20,444][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018536_9490432.pth... 
+[2023-07-17 00:43:20,446][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000017656_9039872.pth +[2023-07-17 00:43:20,996][277270] Updated weights for policy 0, policy_version 18560 (0.0004) +[2023-07-17 00:43:23,525][277270] Updated weights for policy 0, policy_version 18640 (0.0004) +[2023-07-17 00:43:25,434][276985] Fps is (10 sec: 15974.4, 60 sec: 14950.4, 300 sec: 14467.9). Total num frames: 9572352. Throughput: 0: 15224.0. Samples: 9565220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:25,435][276985] Avg episode reward: [(0, '847.125')] +[2023-07-17 00:43:26,063][277270] Updated weights for policy 0, policy_version 18720 (0.0004) +[2023-07-17 00:43:28,559][277270] Updated weights for policy 0, policy_version 18800 (0.0004) +[2023-07-17 00:43:30,434][276985] Fps is (10 sec: 16384.1, 60 sec: 15223.5, 300 sec: 14495.7). Total num frames: 9654272. Throughput: 0: 15289.8. Samples: 9613544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:30,435][276985] Avg episode reward: [(0, '855.625')] +[2023-07-17 00:43:31,146][277270] Updated weights for policy 0, policy_version 18880 (0.0004) +[2023-07-17 00:43:33,701][277270] Updated weights for policy 0, policy_version 18960 (0.0004) +[2023-07-17 00:43:35,434][276985] Fps is (10 sec: 15974.3, 60 sec: 15360.0, 300 sec: 14523.4). Total num frames: 9732096. Throughput: 0: 15349.5. Samples: 9710132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:35,435][276985] Avg episode reward: [(0, '856.418')] +[2023-07-17 00:43:35,439][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019008_9732096.pth... 
+[2023-07-17 00:43:35,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018088_9261056.pth +[2023-07-17 00:43:36,247][277270] Updated weights for policy 0, policy_version 19040 (0.0004) +[2023-07-17 00:43:38,795][277270] Updated weights for policy 0, policy_version 19120 (0.0004) +[2023-07-17 00:43:40,434][276985] Fps is (10 sec: 15974.4, 60 sec: 15428.3, 300 sec: 14551.2). Total num frames: 9814016. Throughput: 0: 15382.8. Samples: 9805896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:40,435][276985] Avg episode reward: [(0, '863.397')] +[2023-07-17 00:43:41,376][277270] Updated weights for policy 0, policy_version 19200 (0.0004) +[2023-07-17 00:43:44,005][277270] Updated weights for policy 0, policy_version 19280 (0.0004) +[2023-07-17 00:43:45,434][276985] Fps is (10 sec: 15974.5, 60 sec: 15496.5, 300 sec: 14565.1). Total num frames: 9891840. Throughput: 0: 15487.2. Samples: 9853768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:45,435][276985] Avg episode reward: [(0, '855.930')] +[2023-07-17 00:43:46,650][277270] Updated weights for policy 0, policy_version 19360 (0.0004) +[2023-07-17 00:43:49,236][277270] Updated weights for policy 0, policy_version 19440 (0.0004) +[2023-07-17 00:43:50,434][276985] Fps is (10 sec: 15564.7, 60 sec: 15496.5, 300 sec: 14565.1). Total num frames: 9969664. Throughput: 0: 15720.7. Samples: 9947360. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-07-17 00:43:50,435][276985] Avg episode reward: [(0, '854.390')] +[2023-07-17 00:43:50,438][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019472_9969664.pth... 
+[2023-07-17 00:43:50,441][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000018536_9490432.pth +[2023-07-17 00:43:51,859][277270] Updated weights for policy 0, policy_version 19520 (0.0004) +[2023-07-17 00:43:52,408][277226] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000009 +[2023-07-17 00:43:52,691][277226] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000 +[2023-07-17 00:43:52,692][277276] Stopping RolloutWorker_w5... +[2023-07-17 00:43:52,692][277273] Stopping RolloutWorker_w2... +[2023-07-17 00:43:52,692][277274] Stopping RolloutWorker_w3... +[2023-07-17 00:43:52,692][277308] Stopping RolloutWorker_w6... +[2023-07-17 00:43:52,692][277276] Loop rollout_proc5_evt_loop terminating... +[2023-07-17 00:43:52,692][277272] Stopping RolloutWorker_w1... +[2023-07-17 00:43:52,692][277366] Stopping RolloutWorker_w7... +[2023-07-17 00:43:52,692][277273] Loop rollout_proc2_evt_loop terminating... +[2023-07-17 00:43:52,693][277274] Loop rollout_proc3_evt_loop terminating... +[2023-07-17 00:43:52,693][277308] Loop rollout_proc6_evt_loop terminating... +[2023-07-17 00:43:52,692][277275] Stopping RolloutWorker_w4... +[2023-07-17 00:43:52,693][277272] Loop rollout_proc1_evt_loop terminating... +[2023-07-17 00:43:52,693][277366] Loop rollout_proc7_evt_loop terminating... +[2023-07-17 00:43:52,692][277271] Stopping RolloutWorker_w0... +[2023-07-17 00:43:52,692][276985] Component RolloutWorker_w5 stopped! +[2023-07-17 00:43:52,693][277275] Loop rollout_proc4_evt_loop terminating... +[2023-07-17 00:43:52,693][277271] Loop rollout_proc0_evt_loop terminating... +[2023-07-17 00:43:52,693][276985] Component RolloutWorker_w2 stopped! +[2023-07-17 00:43:52,693][277226] Stopping Batcher_0... +[2023-07-17 00:43:52,693][276985] Component RolloutWorker_w3 stopped! +[2023-07-17 00:43:52,693][277226] Loop batcher_evt_loop terminating... 
+[2023-07-17 00:43:52,693][276985] Component RolloutWorker_w6 stopped! +[2023-07-17 00:43:52,694][276985] Component RolloutWorker_w1 stopped! +[2023-07-17 00:43:52,694][276985] Component RolloutWorker_w7 stopped! +[2023-07-17 00:43:52,694][276985] Component RolloutWorker_w4 stopped! +[2023-07-17 00:43:52,694][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019544_10006528.pth... +[2023-07-17 00:43:52,694][276985] Component RolloutWorker_w0 stopped! +[2023-07-17 00:43:52,694][276985] Component Batcher_0 stopped! +[2023-07-17 00:43:52,697][277226] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019008_9732096.pth +[2023-07-17 00:43:52,697][277226] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-press-v2/checkpoint_p0/checkpoint_000019544_10006528.pth... +[2023-07-17 00:43:52,700][277226] Stopping LearnerWorker_p0... +[2023-07-17 00:43:52,700][277226] Loop learner_proc0_evt_loop terminating... +[2023-07-17 00:43:52,700][276985] Component LearnerWorker_p0 stopped! +[2023-07-17 00:43:52,757][277270] Weights refcount: 2 0 +[2023-07-17 00:43:52,758][277270] Stopping InferenceWorker_p0-w0... +[2023-07-17 00:43:52,759][277270] Loop inference_proc0-0_evt_loop terminating... +[2023-07-17 00:43:52,759][276985] Component InferenceWorker_p0-w0 stopped! +[2023-07-17 00:43:52,759][276985] Waiting for process learner_proc0 to stop... +[2023-07-17 00:43:53,300][276985] Waiting for process inference_proc0-0 to join... +[2023-07-17 00:43:53,312][276985] Waiting for process rollout_proc0 to join... +[2023-07-17 00:43:53,312][276985] Waiting for process rollout_proc1 to join... +[2023-07-17 00:43:53,313][276985] Waiting for process rollout_proc2 to join... +[2023-07-17 00:43:53,313][276985] Waiting for process rollout_proc3 to join... +[2023-07-17 00:43:53,313][276985] Waiting for process rollout_proc4 to join... 
+[2023-07-17 00:43:53,313][276985] Waiting for process rollout_proc5 to join... +[2023-07-17 00:43:53,313][276985] Waiting for process rollout_proc6 to join... +[2023-07-17 00:43:53,313][276985] Waiting for process rollout_proc7 to join... +[2023-07-17 00:43:53,314][276985] Batcher 0 profile tree view: +batching: 1.8851, releasing_batches: 1.6531 +[2023-07-17 00:43:53,314][276985] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0051 - wait_policy_total: 355.9224 -update_model: 11.6203 - weight_update: 0.0005 + wait_policy_total: 221.3787 +update_model: 9.5503 + weight_update: 0.0004 one_step: 0.0005 - handle_policy_step: 521.3656 - deserialize: 22.2442, stack: 5.4398, obs_to_device_normalize: 92.0136, forward: 255.8458, send_messages: 41.5410 - prepare_outputs: 58.7343 - to_cpu: 8.7414 -[2023-07-08 20:21:55,695][1063098] Learner 0 profile tree view: -misc: 0.0097, prepare_batch: 8.3864 -train: 86.4734 - epoch_init: 0.0337, minibatch_init: 1.2335, losses_postprocess: 1.2809, kl_divergence: 0.4138, after_optimizer: 0.6064 - calculate_losses: 36.4529 - losses_init: 0.0294, forward_head: 13.8126, bptt_initial: 0.1318, bptt: 0.1293, tail: 10.6291, advantages_returns: 0.8332, losses: 9.6060 - update: 44.9841 - clip: 5.4240 -[2023-07-08 20:21:55,695][1063098] RolloutWorker_w0 profile tree view: -wait_for_trajectories: 0.4520, enqueue_policy_requests: 14.6312, env_step: 555.0006, overhead: 21.9602, complete_rollouts: 0.3771 -save_policy_outputs: 43.4221 - split_output_tensors: 15.0391 -[2023-07-08 20:21:55,695][1063098] RolloutWorker_w7 profile tree view: -wait_for_trajectories: 0.4283, enqueue_policy_requests: 14.5276, env_step: 548.3563, overhead: 21.9353, complete_rollouts: 0.3938 -save_policy_outputs: 43.2492 - split_output_tensors: 14.7428 -[2023-07-08 20:21:55,696][1063098] Loop Runner_EvtLoop terminating... 
-[2023-07-08 20:21:55,696][1063098] Runner profile tree view: -main_loop: 954.7748 -[2023-07-08 20:21:55,696][1063098] Collected {0: 10006528}, FPS: 10480.5 + handle_policy_step: 410.2379 + deserialize: 17.6666, stack: 4.2759, obs_to_device_normalize: 72.5388, forward: 200.7743, send_messages: 33.0709 + prepare_outputs: 47.5177 + to_cpu: 7.0143 +[2023-07-17 00:43:53,314][276985] Learner 0 profile tree view: +misc: 0.0115, prepare_batch: 9.4094 +train: 95.1067 + epoch_init: 0.0356, minibatch_init: 1.3080, losses_postprocess: 1.2606, kl_divergence: 0.4330, after_optimizer: 0.5758 + calculate_losses: 40.6366 + losses_init: 0.0306, forward_head: 16.0305, bptt_initial: 0.1407, bptt: 0.1298, tail: 11.4246, advantages_returns: 0.8664, losses: 10.6118 + update: 49.2792 + clip: 5.8341 +[2023-07-17 00:43:53,314][276985] RolloutWorker_w0 profile tree view: +wait_for_trajectories: 0.3016, enqueue_policy_requests: 12.4608, env_step: 421.9889, overhead: 19.8273, complete_rollouts: 0.3179 +save_policy_outputs: 38.8494 + split_output_tensors: 13.1810 +[2023-07-17 00:43:53,314][276985] RolloutWorker_w7 profile tree view: +wait_for_trajectories: 0.2857, enqueue_policy_requests: 12.3956, env_step: 425.0600, overhead: 20.0267, complete_rollouts: 0.3117 +save_policy_outputs: 38.8148 + split_output_tensors: 13.2013 +[2023-07-17 00:43:53,315][276985] Loop Runner_EvtLoop terminating... +[2023-07-17 00:43:53,315][276985] Runner profile tree view: +main_loop: 690.8121 +[2023-07-17 00:43:53,315][276985] Collected {0: 10006528}, FPS: 14485.2