diff --git "a/sf_log.txt" "b/sf_log.txt"
new file mode 100644
--- /dev/null
+++ "b/sf_log.txt"
@@ -0,0 +1,1069 @@
+[2023-07-08 04:33:53,381][832030] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/config.json...
+[2023-07-08 04:33:53,399][832030] Rollout worker 0 uses device cpu
+[2023-07-08 04:33:53,399][832030] Rollout worker 1 uses device cpu
+[2023-07-08 04:33:53,399][832030] Rollout worker 2 uses device cpu
+[2023-07-08 04:33:53,400][832030] Rollout worker 3 uses device cpu
+[2023-07-08 04:33:53,400][832030] Rollout worker 4 uses device cpu
+[2023-07-08 04:33:53,400][832030] Rollout worker 5 uses device cpu
+[2023-07-08 04:33:53,400][832030] Rollout worker 6 uses device cpu
+[2023-07-08 04:33:53,400][832030] Rollout worker 7 uses device cpu
+[2023-07-08 04:33:53,400][832030] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
+[2023-07-08 04:33:53,411][832030] InferenceWorker_p0-w0: min num requests: 2
+[2023-07-08 04:33:53,429][832030] Starting all processes...
+[2023-07-08 04:33:53,429][832030] Starting process learner_proc0
+[2023-07-08 04:33:53,478][832030] Starting all processes...
+[2023-07-08 04:33:53,524][832030] Starting process inference_proc0-0
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc0
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc1
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc2
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc3
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc4
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc5
+[2023-07-08 04:33:53,524][832030] Starting process rollout_proc6
+[2023-07-08 04:33:53,525][832030] Starting process rollout_proc7
+[2023-07-08 04:33:55,344][832272] Starting seed is not provided
+[2023-07-08 04:33:55,345][832272] Initializing actor-critic model on device cpu
+[2023-07-08 04:33:55,346][832272] RunningMeanStd input shape: (39,)
+[2023-07-08 04:33:55,346][832272] RunningMeanStd input shape: (1,)
+[2023-07-08 04:33:55,406][832272] Created Actor Critic model with architecture:
+[2023-07-08 04:33:55,406][832272] ActorCriticSharedWeights(
+  (obs_normalizer): ObservationNormalizer(
+    (running_mean_std): RunningMeanStdDictInPlace(
+      (running_mean_std): ModuleDict(
+        (obs): RunningMeanStdInPlace()
+      )
+    )
+  )
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (encoder): MultiInputEncoder(
+    (encoders): ModuleDict(
+      (obs): MlpEncoder(
+        (mlp_head): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Linear)
+          (1): RecursiveScriptModule(original_name=Tanh)
+          (2): RecursiveScriptModule(original_name=Linear)
+          (3): RecursiveScriptModule(original_name=Tanh)
+        )
+      )
+    )
+  )
+  (core): ModelCoreIdentity()
+  (decoder): MlpDecoder(
+    (mlp): Identity()
+  )
+  (critic_linear): Linear(in_features=64, out_features=1, bias=True)
+  (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev(
+    (distribution_linear): Linear(in_features=64, out_features=4, bias=True)
+  )
+)
+[2023-07-08 04:33:55,419][832317] Worker 0 uses CPU cores [0, 1, 2, 3]
+[2023-07-08 04:33:55,505][832320] Worker 2 uses CPU cores [8, 9, 10, 11]
+[2023-07-08 04:33:55,573][832385] Worker 6 uses CPU cores [24, 25, 26, 27]
+[2023-07-08 04:33:55,675][832417] Worker 7 uses CPU cores [28, 29, 30, 31]
+[2023-07-08 04:33:55,724][832319] Worker 3 uses CPU cores [12, 13, 14, 15]
+[2023-07-08 04:33:55,727][832272] Using optimizer <class 'torch.optim.adam.Adam'>
+[2023-07-08 04:33:55,728][832272] No checkpoints found
+[2023-07-08 04:33:55,728][832272] Did not load from checkpoint, starting from scratch!
+[2023-07-08 04:33:55,728][832272] Initialized policy 0 weights for model version 0
+[2023-07-08 04:33:55,729][832272] LearnerWorker_p0 finished initialization!
+[2023-07-08 04:33:55,757][832316] RunningMeanStd input shape: (39,)
+[2023-07-08 04:33:55,757][832316] RunningMeanStd input shape: (1,)
+[2023-07-08 04:33:55,813][832030] Inference worker 0-0 is ready!
+[2023-07-08 04:33:55,814][832030] All inference workers are ready! Signal rollout workers to start!
+[2023-07-08 04:33:55,885][832338] Worker 5 uses CPU cores [20, 21, 22, 23]
+[2023-07-08 04:33:55,967][832318] Worker 1 uses CPU cores [4, 5, 6, 7]
+[2023-07-08 04:33:56,053][832321] Worker 4 uses CPU cores [16, 17, 18, 19]
+[2023-07-08 04:33:59,746][832385] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,751][832317] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,751][832320] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,759][832385] Decorrelating experience for 64 frames...
+[2023-07-08 04:33:59,764][832417] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,764][832320] Decorrelating experience for 64 frames...
+[2023-07-08 04:33:59,765][832317] Decorrelating experience for 64 frames...
+[2023-07-08 04:33:59,766][832319] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,776][832417] Decorrelating experience for 64 frames...
+[2023-07-08 04:33:59,778][832319] Decorrelating experience for 64 frames...
+[2023-07-08 04:33:59,794][832385] Decorrelating experience for 128 frames...
+[2023-07-08 04:33:59,799][832317] Decorrelating experience for 128 frames...
+[2023-07-08 04:33:59,799][832320] Decorrelating experience for 128 frames...
+[2023-07-08 04:33:59,810][832417] Decorrelating experience for 128 frames...
+[2023-07-08 04:33:59,812][832319] Decorrelating experience for 128 frames...
+[2023-07-08 04:33:59,822][832338] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,835][832338] Decorrelating experience for 64 frames...
+[2023-07-08 04:33:59,862][832385] Decorrelating experience for 192 frames...
+[2023-07-08 04:33:59,865][832317] Decorrelating experience for 192 frames...
+[2023-07-08 04:33:59,868][832320] Decorrelating experience for 192 frames...
+[2023-07-08 04:33:59,869][832338] Decorrelating experience for 128 frames...
+[2023-07-08 04:33:59,877][832417] Decorrelating experience for 192 frames...
+[2023-07-08 04:33:59,880][832319] Decorrelating experience for 192 frames...
+[2023-07-08 04:33:59,936][832338] Decorrelating experience for 192 frames...
+[2023-07-08 04:33:59,957][832318] Decorrelating experience for 0 frames...
+[2023-07-08 04:33:59,971][832318] Decorrelating experience for 64 frames...
+[2023-07-08 04:34:00,005][832318] Decorrelating experience for 128 frames...
+[2023-07-08 04:34:00,046][832321] Decorrelating experience for 0 frames...
+[2023-07-08 04:34:00,060][832321] Decorrelating experience for 64 frames...
+[2023-07-08 04:34:00,079][832318] Decorrelating experience for 192 frames...
+[2023-07-08 04:34:00,095][832321] Decorrelating experience for 128 frames...
+[2023-07-08 04:34:00,162][832321] Decorrelating experience for 192 frames...
+[2023-07-08 04:34:00,729][832030] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-07-08 04:34:03,743][832317] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:03,745][832385] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:03,766][832417] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:03,766][832320] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:03,776][832319] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:03,821][832338] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:03,862][832317] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:03,865][832385] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:03,886][832320] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:03,887][832417] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:03,896][832319] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:03,942][832338] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:04,015][832317] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,018][832385] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,021][832318] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:04,041][832320] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,045][832417] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,049][832321] Decorrelating experience for 256 frames...
+[2023-07-08 04:34:04,050][832319] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,095][832338] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,143][832318] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:04,168][832321] Decorrelating experience for 320 frames...
+[2023-07-08 04:34:04,191][832317] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,193][832385] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,216][832320] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,224][832417] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,224][832319] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,271][832338] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,299][832318] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,323][832321] Decorrelating experience for 384 frames...
+[2023-07-08 04:34:04,489][832318] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:04,496][832321] Decorrelating experience for 448 frames...
+[2023-07-08 04:34:05,729][832030] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 335.2. Samples: 1676. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:34:05,729][832030] Avg episode reward: [(0, '5.597')]
+[2023-07-08 04:34:05,731][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000000016_8192.pth...
+[2023-07-08 04:34:08,379][832316] Updated weights for policy 0, policy_version 80 (0.0005)
+[2023-07-08 04:34:10,729][832030] Fps is (10 sec: 6553.6, 60 sec: 6553.6, 300 sec: 6553.6). Total num frames: 65536. Throughput: 0: 3556.0. Samples: 35560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:34:10,729][832030] Avg episode reward: [(0, '42.517')]
+[2023-07-08 04:34:12,347][832316] Updated weights for policy 0, policy_version 160 (0.0006)
+[2023-07-08 04:34:13,407][832030] Heartbeat connected on Batcher_0
+[2023-07-08 04:34:13,409][832030] Heartbeat connected on LearnerWorker_p0
+[2023-07-08 04:34:13,412][832030] Heartbeat connected on InferenceWorker_p0-w0
+[2023-07-08 04:34:13,416][832030] Heartbeat connected on RolloutWorker_w0
+[2023-07-08 04:34:13,419][832030] Heartbeat connected on RolloutWorker_w1
+[2023-07-08 04:34:13,420][832030] Heartbeat connected on RolloutWorker_w2
+[2023-07-08 04:34:13,422][832030] Heartbeat connected on RolloutWorker_w3
+[2023-07-08 04:34:13,429][832030] Heartbeat connected on RolloutWorker_w6
+[2023-07-08 04:34:13,431][832030] Heartbeat connected on RolloutWorker_w7
+[2023-07-08 04:34:13,438][832030] Heartbeat connected on RolloutWorker_w4
+[2023-07-08 04:34:13,451][832030] Heartbeat connected on RolloutWorker_w5
+[2023-07-08 04:34:15,729][832030] Fps is (10 sec: 10649.8, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 114688. Throughput: 0: 6551.0. Samples: 98264. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:34:15,729][832030] Avg episode reward: [(0, '200.520')]
+[2023-07-08 04:34:15,730][832272] Saving new best policy, reward=201.155!
+[2023-07-08 04:34:16,281][832316] Updated weights for policy 0, policy_version 240 (0.0006)
+[2023-07-08 04:34:19,746][832316] Updated weights for policy 0, policy_version 320 (0.0005)
+[2023-07-08 04:34:20,729][832030] Fps is (10 sec: 10649.6, 60 sec: 8601.6, 300 sec: 8601.6). Total num frames: 172032. Throughput: 0: 8249.8. Samples: 164996. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:34:20,730][832030] Avg episode reward: [(0, '311.343')]
+[2023-07-08 04:34:20,767][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000000344_176128.pth...
+[2023-07-08 04:34:20,772][832272] Saving new best policy, reward=311.343!
+[2023-07-08 04:34:23,138][832316] Updated weights for policy 0, policy_version 400 (0.0004)
+[2023-07-08 04:34:25,729][832030] Fps is (10 sec: 11878.3, 60 sec: 9338.9, 300 sec: 9338.9). Total num frames: 233472. Throughput: 0: 8031.1. Samples: 200776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:34:25,729][832030] Avg episode reward: [(0, '324.731')]
+[2023-07-08 04:34:25,730][832272] Saving new best policy, reward=324.731!
+[2023-07-08 04:34:26,650][832316] Updated weights for policy 0, policy_version 480 (0.0005)
+[2023-07-08 04:34:30,405][832316] Updated weights for policy 0, policy_version 560 (0.0006)
+[2023-07-08 04:34:30,729][832030] Fps is (10 sec: 11468.9, 60 sec: 9557.3, 300 sec: 9557.3). Total num frames: 286720. Throughput: 0: 9013.3. Samples: 270400. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:34:30,729][832030] Avg episode reward: [(0, '331.553')]
+[2023-07-08 04:34:30,730][832272] Saving new best policy, reward=331.553!
+[2023-07-08 04:34:34,145][832316] Updated weights for policy 0, policy_version 640 (0.0005)
+[2023-07-08 04:34:35,729][832030] Fps is (10 sec: 11059.2, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 344064. Throughput: 0: 9598.2. Samples: 335936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:34:35,729][832030] Avg episode reward: [(0, '336.170')]
+[2023-07-08 04:34:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000000672_344064.pth...
+[2023-07-08 04:34:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000000016_8192.pth
+[2023-07-08 04:34:35,736][832272] Saving new best policy, reward=336.170!
+[2023-07-08 04:34:37,557][832316] Updated weights for policy 0, policy_version 720 (0.0005)
+[2023-07-08 04:34:40,729][832030] Fps is (10 sec: 11878.4, 60 sec: 10137.6, 300 sec: 10137.6). Total num frames: 405504. Throughput: 0: 9310.0. Samples: 372400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:34:40,729][832030] Avg episode reward: [(0, '340.707')]
+[2023-07-08 04:34:40,730][832272] Saving new best policy, reward=340.707!
+[2023-07-08 04:34:40,995][832316] Updated weights for policy 0, policy_version 800 (0.0005)
+[2023-07-08 04:34:44,720][832316] Updated weights for policy 0, policy_version 880 (0.0006)
+[2023-07-08 04:34:45,729][832030] Fps is (10 sec: 11468.8, 60 sec: 10194.5, 300 sec: 10194.5). Total num frames: 458752. Throughput: 0: 9814.7. Samples: 441660. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:34:45,729][832030] Avg episode reward: [(0, '343.702')]
+[2023-07-08 04:34:45,763][832272] Saving new best policy, reward=343.702!
+[2023-07-08 04:34:48,331][832316] Updated weights for policy 0, policy_version 960 (0.0005)
+[2023-07-08 04:34:50,729][832030] Fps is (10 sec: 11059.1, 60 sec: 10321.9, 300 sec: 10321.9). Total num frames: 516096. Throughput: 0: 11267.8. Samples: 508728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:34:50,729][832030] Avg episode reward: [(0, '343.593')]
+[2023-07-08 04:34:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000001008_516096.pth...
+[2023-07-08 04:34:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000000344_176128.pth
+[2023-07-08 04:34:51,963][832316] Updated weights for policy 0, policy_version 1040 (0.0005)
+[2023-07-08 04:34:55,606][832316] Updated weights for policy 0, policy_version 1120 (0.0005)
+[2023-07-08 04:34:55,729][832030] Fps is (10 sec: 11468.9, 60 sec: 10426.2, 300 sec: 10426.2). Total num frames: 573440. Throughput: 0: 11254.6. Samples: 542016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:34:55,729][832030] Avg episode reward: [(0, '344.742')]
+[2023-07-08 04:34:55,730][832272] Saving new best policy, reward=344.742!
+[2023-07-08 04:34:59,359][832316] Updated weights for policy 0, policy_version 1200 (0.0005)
+[2023-07-08 04:35:00,729][832030] Fps is (10 sec: 11059.3, 60 sec: 10444.8, 300 sec: 10444.8). Total num frames: 626688. Throughput: 0: 11356.7. Samples: 609316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:35:00,729][832030] Avg episode reward: [(0, '344.376')]
+[2023-07-08 04:35:02,980][832316] Updated weights for policy 0, policy_version 1280 (0.0005)
+[2023-07-08 04:35:05,729][832030] Fps is (10 sec: 11059.1, 60 sec: 11264.0, 300 sec: 10523.6). Total num frames: 684032. Throughput: 0: 11364.7. Samples: 676408. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:35:05,729][832030] Avg episode reward: [(0, '345.591')]
+[2023-07-08 04:35:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000001336_684032.pth...
+[2023-07-08 04:35:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000000672_344064.pth
+[2023-07-08 04:35:05,735][832272] Saving new best policy, reward=345.591!
+[2023-07-08 04:35:06,656][832316] Updated weights for policy 0, policy_version 1360 (0.0005)
+[2023-07-08 04:35:10,301][832316] Updated weights for policy 0, policy_version 1440 (0.0005)
+[2023-07-08 04:35:10,729][832030] Fps is (10 sec: 11468.8, 60 sec: 11264.0, 300 sec: 10591.1). Total num frames: 741376. Throughput: 0: 11301.8. Samples: 709356. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:35:10,729][832030] Avg episode reward: [(0, '345.462')]
+[2023-07-08 04:35:13,939][832316] Updated weights for policy 0, policy_version 1520 (0.0005)
+[2023-07-08 04:35:15,729][832030] Fps is (10 sec: 11059.2, 60 sec: 11332.2, 300 sec: 10595.0). Total num frames: 794624. Throughput: 0: 11278.7. Samples: 777944. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:35:15,729][832030] Avg episode reward: [(0, '344.656')]
+[2023-07-08 04:35:17,703][832316] Updated weights for policy 0, policy_version 1600 (0.0005)
+[2023-07-08 04:35:20,729][832030] Fps is (10 sec: 11059.2, 60 sec: 11332.3, 300 sec: 10649.6). Total num frames: 851968. Throughput: 0: 11302.9. Samples: 844564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:35:20,729][832030] Avg episode reward: [(0, '346.913')]
+[2023-07-08 04:35:20,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000001664_851968.pth...
+[2023-07-08 04:35:20,733][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000001008_516096.pth
+[2023-07-08 04:35:20,734][832272] Saving new best policy, reward=346.913!
+[2023-07-08 04:35:21,122][832316] Updated weights for policy 0, policy_version 1680 (0.0004)
+[2023-07-08 04:35:24,592][832316] Updated weights for policy 0, policy_version 1760 (0.0004)
+[2023-07-08 04:35:25,729][832030] Fps is (10 sec: 11878.4, 60 sec: 11332.3, 300 sec: 10746.0). Total num frames: 913408. Throughput: 0: 11317.2. Samples: 881676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:35:25,729][832030] Avg episode reward: [(0, '341.477')]
+[2023-07-08 04:35:28,323][832316] Updated weights for policy 0, policy_version 1840 (0.0005)
+[2023-07-08 04:35:30,729][832030] Fps is (10 sec: 11468.8, 60 sec: 11332.3, 300 sec: 10740.6). Total num frames: 966656. Throughput: 0: 11244.2. Samples: 947648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:35:30,729][832030] Avg episode reward: [(0, '342.716')]
+[2023-07-08 04:35:31,856][832316] Updated weights for policy 0, policy_version 1920 (0.0005)
+[2023-07-08 04:35:35,460][832316] Updated weights for policy 0, policy_version 2000 (0.0005)
+[2023-07-08 04:35:35,729][832030] Fps is (10 sec: 11059.2, 60 sec: 11332.3, 300 sec: 10778.9). Total num frames: 1024000. Throughput: 0: 11289.7. Samples: 1016764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:35:35,729][832030] Avg episode reward: [(0, '345.214')]
+[2023-07-08 04:35:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002000_1024000.pth...
+[2023-07-08 04:35:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000001336_684032.pth
+[2023-07-08 04:35:39,188][832316] Updated weights for policy 0, policy_version 2080 (0.0005)
+[2023-07-08 04:35:40,729][832030] Fps is (10 sec: 11468.8, 60 sec: 11264.0, 300 sec: 10813.4). Total num frames: 1081344. Throughput: 0: 11293.5. Samples: 1050224. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:35:40,729][832030] Avg episode reward: [(0, '344.462')]
+[2023-07-08 04:35:42,881][832316] Updated weights for policy 0, policy_version 2160 (0.0005)
+[2023-07-08 04:35:45,729][832030] Fps is (10 sec: 11059.2, 60 sec: 11264.0, 300 sec: 10805.6). Total num frames: 1134592. Throughput: 0: 11278.3. Samples: 1116840. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:35:45,729][832030] Avg episode reward: [(0, '345.069')]
+[2023-07-08 04:35:46,613][832316] Updated weights for policy 0, policy_version 2240 (0.0005)
+[2023-07-08 04:35:50,243][832316] Updated weights for policy 0, policy_version 2320 (0.0005)
+[2023-07-08 04:35:50,729][832030] Fps is (10 sec: 11059.1, 60 sec: 11264.0, 300 sec: 10835.8). Total num frames: 1191936. Throughput: 0: 11274.3. Samples: 1183752. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:35:50,729][832030] Avg episode reward: [(0, '348.043')]
+[2023-07-08 04:35:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002328_1191936.pth...
+[2023-07-08 04:35:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000001664_851968.pth
+[2023-07-08 04:35:50,736][832272] Saving new best policy, reward=348.043!
+[2023-07-08 04:35:54,006][832316] Updated weights for policy 0, policy_version 2400 (0.0005)
+[2023-07-08 04:35:55,729][832030] Fps is (10 sec: 11059.3, 60 sec: 11195.7, 300 sec: 10827.7). Total num frames: 1245184. Throughput: 0: 11270.2. Samples: 1216512. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:35:55,729][832030] Avg episode reward: [(0, '354.603')]
+[2023-07-08 04:35:55,729][832272] Saving new best policy, reward=354.603!
+[2023-07-08 04:35:57,810][832316] Updated weights for policy 0, policy_version 2480 (0.0005)
+[2023-07-08 04:36:00,729][832030] Fps is (10 sec: 11059.3, 60 sec: 11264.0, 300 sec: 10854.4). Total num frames: 1302528. Throughput: 0: 11197.6. Samples: 1281836. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:36:00,729][832030] Avg episode reward: [(0, '356.724')]
+[2023-07-08 04:36:00,730][832272] Saving new best policy, reward=356.724!
+[2023-07-08 04:36:01,429][832316] Updated weights for policy 0, policy_version 2560 (0.0005)
+[2023-07-08 04:36:05,123][832316] Updated weights for policy 0, policy_version 2640 (0.0005)
+[2023-07-08 04:36:05,729][832030] Fps is (10 sec: 11059.1, 60 sec: 11195.7, 300 sec: 10846.2). Total num frames: 1355776. Throughput: 0: 11207.1. Samples: 1348884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:05,729][832030] Avg episode reward: [(0, '354.314')]
+[2023-07-08 04:36:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002648_1355776.pth...
+[2023-07-08 04:36:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002000_1024000.pth
+[2023-07-08 04:36:08,766][832316] Updated weights for policy 0, policy_version 2720 (0.0005)
+[2023-07-08 04:36:10,729][832030] Fps is (10 sec: 11059.2, 60 sec: 11195.7, 300 sec: 10870.2). Total num frames: 1413120. Throughput: 0: 11119.3. Samples: 1382044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:10,729][832030] Avg episode reward: [(0, '355.588')]
+[2023-07-08 04:36:12,498][832316] Updated weights for policy 0, policy_version 2800 (0.0005)
+[2023-07-08 04:36:15,729][832030] Fps is (10 sec: 11059.3, 60 sec: 11195.7, 300 sec: 10862.0). Total num frames: 1466368. Throughput: 0: 11133.2. Samples: 1448644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:36:15,729][832030] Avg episode reward: [(0, '352.157')]
+[2023-07-08 04:36:16,287][832316] Updated weights for policy 0, policy_version 2880 (0.0005)
+[2023-07-08 04:36:20,027][832316] Updated weights for policy 0, policy_version 2960 (0.0005)
+[2023-07-08 04:36:20,729][832030] Fps is (10 sec: 10649.5, 60 sec: 11127.4, 300 sec: 10854.4). Total num frames: 1519616. Throughput: 0: 11047.6. Samples: 1513908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:20,729][832030] Avg episode reward: [(0, '390.455')]
+[2023-07-08 04:36:20,772][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002976_1523712.pth...
+[2023-07-08 04:36:20,775][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002328_1191936.pth
+[2023-07-08 04:36:20,775][832272] Saving new best policy, reward=390.455!
+[2023-07-08 04:36:23,726][832316] Updated weights for policy 0, policy_version 3040 (0.0005)
+[2023-07-08 04:36:25,729][832030] Fps is (10 sec: 11059.2, 60 sec: 11059.2, 300 sec: 10875.6). Total num frames: 1576960. Throughput: 0: 11045.5. Samples: 1547272. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:36:25,729][832030] Avg episode reward: [(0, '414.749')]
+[2023-07-08 04:36:25,730][832272] Saving new best policy, reward=414.749!
+[2023-07-08 04:36:27,531][832316] Updated weights for policy 0, policy_version 3120 (0.0005)
+[2023-07-08 04:36:30,729][832030] Fps is (10 sec: 11059.3, 60 sec: 11059.2, 300 sec: 10868.1). Total num frames: 1630208. Throughput: 0: 10993.9. Samples: 1611564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:30,729][832030] Avg episode reward: [(0, '359.194')]
+[2023-07-08 04:36:31,344][832316] Updated weights for policy 0, policy_version 3200 (0.0005)
+[2023-07-08 04:36:35,263][832316] Updated weights for policy 0, policy_version 3280 (0.0005)
+[2023-07-08 04:36:35,729][832030] Fps is (10 sec: 10649.6, 60 sec: 10990.9, 300 sec: 10861.0). Total num frames: 1683456. Throughput: 0: 10922.7. Samples: 1675272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:35,729][832030] Avg episode reward: [(0, '368.330')]
+[2023-07-08 04:36:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000003288_1683456.pth...
+[2023-07-08 04:36:35,734][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002648_1355776.pth
+[2023-07-08 04:36:39,185][832316] Updated weights for policy 0, policy_version 3360 (0.0005)
+[2023-07-08 04:36:40,729][832030] Fps is (10 sec: 10240.0, 60 sec: 10854.4, 300 sec: 10828.8). Total num frames: 1732608. Throughput: 0: 10887.4. Samples: 1706444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:40,729][832030] Avg episode reward: [(0, '278.615')]
+[2023-07-08 04:36:43,204][832316] Updated weights for policy 0, policy_version 3440 (0.0005)
+[2023-07-08 04:36:45,729][832030] Fps is (10 sec: 10240.0, 60 sec: 10854.4, 300 sec: 10823.4). Total num frames: 1785856. Throughput: 0: 10798.3. Samples: 1767760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:45,729][832030] Avg episode reward: [(0, '278.884')]
+[2023-07-08 04:36:47,270][832316] Updated weights for policy 0, policy_version 3520 (0.0005)
+[2023-07-08 04:36:50,729][832030] Fps is (10 sec: 10239.9, 60 sec: 10717.9, 300 sec: 10794.2). Total num frames: 1835008. Throughput: 0: 10672.3. Samples: 1829136. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:36:50,729][832030] Avg episode reward: [(0, '295.324')]
+[2023-07-08 04:36:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000003584_1835008.pth...
+[2023-07-08 04:36:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000002976_1523712.pth
+[2023-07-08 04:36:51,241][832316] Updated weights for policy 0, policy_version 3600 (0.0005)
+[2023-07-08 04:36:55,231][832316] Updated weights for policy 0, policy_version 3680 (0.0005)
+[2023-07-08 04:36:55,729][832030] Fps is (10 sec: 10240.1, 60 sec: 10717.9, 300 sec: 10790.0). Total num frames: 1888256. Throughput: 0: 10626.2. Samples: 1860224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:36:55,729][832030] Avg episode reward: [(0, '286.298')]
+[2023-07-08 04:36:59,389][832316] Updated weights for policy 0, policy_version 3760 (0.0005)
+[2023-07-08 04:37:00,729][832030] Fps is (10 sec: 10240.0, 60 sec: 10581.3, 300 sec: 10763.4). Total num frames: 1937408. Throughput: 0: 10481.1. Samples: 1920296. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:37:00,729][832030] Avg episode reward: [(0, '255.288')]
+[2023-07-08 04:37:03,349][832316] Updated weights for policy 0, policy_version 3840 (0.0005)
+[2023-07-08 04:37:05,729][832030] Fps is (10 sec: 9830.3, 60 sec: 10513.1, 300 sec: 10738.2). Total num frames: 1986560. Throughput: 0: 10389.8. Samples: 1981448. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:37:05,729][832030] Avg episode reward: [(0, '247.537')]
+[2023-07-08 04:37:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000003880_1986560.pth...
+[2023-07-08 04:37:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000003288_1683456.pth
+[2023-07-08 04:37:07,454][832316] Updated weights for policy 0, policy_version 3920 (0.0005)
+[2023-07-08 04:37:10,729][832030] Fps is (10 sec: 9830.4, 60 sec: 10376.5, 300 sec: 10714.3). Total num frames: 2035712. Throughput: 0: 10308.3. Samples: 2011144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:10,729][832030] Avg episode reward: [(0, '223.121')]
+[2023-07-08 04:37:11,755][832316] Updated weights for policy 0, policy_version 4000 (0.0005)
+[2023-07-08 04:37:15,729][832030] Fps is (10 sec: 9830.4, 60 sec: 10308.3, 300 sec: 10691.6). Total num frames: 2084864. Throughput: 0: 10155.9. Samples: 2068580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:15,729][832030] Avg episode reward: [(0, '210.647')]
+[2023-07-08 04:37:15,907][832316] Updated weights for policy 0, policy_version 4080 (0.0005)
+[2023-07-08 04:37:20,074][832316] Updated weights for policy 0, policy_version 4160 (0.0005)
+[2023-07-08 04:37:20,729][832030] Fps is (10 sec: 9830.4, 60 sec: 10240.0, 300 sec: 10670.1). Total num frames: 2134016. Throughput: 0: 10044.9. Samples: 2127292. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:37:20,729][832030] Avg episode reward: [(0, '240.214')]
+[2023-07-08 04:37:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004168_2134016.pth...
+[2023-07-08 04:37:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000003584_1835008.pth
+[2023-07-08 04:37:24,310][832316] Updated weights for policy 0, policy_version 4240 (0.0005)
+[2023-07-08 04:37:25,729][832030] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 10649.6). Total num frames: 2183168. Throughput: 0: 10012.8. Samples: 2157020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:25,729][832030] Avg episode reward: [(0, '214.940')]
+[2023-07-08 04:37:28,560][832316] Updated weights for policy 0, policy_version 4320 (0.0005)
+[2023-07-08 04:37:30,729][832030] Fps is (10 sec: 9830.5, 60 sec: 10035.2, 300 sec: 10630.1). Total num frames: 2232320. Throughput: 0: 9938.1. Samples: 2214972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:30,729][832030] Avg episode reward: [(0, '230.372')]
+[2023-07-08 04:37:32,816][832316] Updated weights for policy 0, policy_version 4400 (0.0005)
+[2023-07-08 04:37:35,729][832030] Fps is (10 sec: 9420.9, 60 sec: 9898.7, 300 sec: 10592.5). Total num frames: 2277376. Throughput: 0: 9866.4. Samples: 2273124. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:37:35,729][832030] Avg episode reward: [(0, '242.554')]
+[2023-07-08 04:37:35,769][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004456_2281472.pth...
+[2023-07-08 04:37:35,772][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000003880_1986560.pth
+[2023-07-08 04:37:37,051][832316] Updated weights for policy 0, policy_version 4480 (0.0005)
+[2023-07-08 04:37:40,729][832030] Fps is (10 sec: 9420.8, 60 sec: 9898.7, 300 sec: 10575.1). Total num frames: 2326528. Throughput: 0: 9812.7. Samples: 2301796. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:37:40,729][832030] Avg episode reward: [(0, '186.777')]
+[2023-07-08 04:37:41,523][832316] Updated weights for policy 0, policy_version 4560 (0.0005)
+[2023-07-08 04:37:45,729][832030] Fps is (10 sec: 9420.7, 60 sec: 9762.1, 300 sec: 10540.4). Total num frames: 2371584. Throughput: 0: 9695.5. Samples: 2356592. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:45,729][832030] Avg episode reward: [(0, '161.900')]
+[2023-07-08 04:37:45,931][832316] Updated weights for policy 0, policy_version 4640 (0.0005)
+[2023-07-08 04:37:50,428][832316] Updated weights for policy 0, policy_version 4720 (0.0005)
+[2023-07-08 04:37:50,729][832030] Fps is (10 sec: 9011.1, 60 sec: 9693.9, 300 sec: 10507.1). Total num frames: 2416640. Throughput: 0: 9561.0. Samples: 2411692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:50,729][832030] Avg episode reward: [(0, '156.645')]
+[2023-07-08 04:37:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004720_2416640.pth...
+[2023-07-08 04:37:50,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004168_2134016.pth
+[2023-07-08 04:37:54,920][832316] Updated weights for policy 0, policy_version 4800 (0.0005)
+[2023-07-08 04:37:55,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9557.3, 300 sec: 10475.3). Total num frames: 2461696. Throughput: 0: 9502.1. Samples: 2438736. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:37:55,729][832030] Avg episode reward: [(0, '149.771')]
+[2023-07-08 04:37:59,457][832316] Updated weights for policy 0, policy_version 4880 (0.0005)
+[2023-07-08 04:38:00,729][832030] Fps is (10 sec: 9011.3, 60 sec: 9489.1, 300 sec: 10444.8). Total num frames: 2506752. Throughput: 0: 9442.5. Samples: 2493492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:00,729][832030] Avg episode reward: [(0, '155.091')]
+[2023-07-08 04:38:03,985][832316] Updated weights for policy 0, policy_version 4960 (0.0005)
+[2023-07-08 04:38:05,729][832030] Fps is (10 sec: 9011.0, 60 sec: 9420.8, 300 sec: 10415.5). Total num frames: 2551808. Throughput: 0: 9342.8. Samples: 2547720. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:38:05,729][832030] Avg episode reward: [(0, '162.142')]
+[2023-07-08 04:38:05,765][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004992_2555904.pth...
+[2023-07-08 04:38:05,767][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004456_2281472.pth
+[2023-07-08 04:38:08,506][832316] Updated weights for policy 0, policy_version 5040 (0.0005)
+[2023-07-08 04:38:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9352.5, 300 sec: 10387.5). Total num frames: 2596864. Throughput: 0: 9285.8. Samples: 2574880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:10,729][832030] Avg episode reward: [(0, '188.301')]
+[2023-07-08 04:38:13,183][832316] Updated weights for policy 0, policy_version 5120 (0.0005)
+[2023-07-08 04:38:15,729][832030] Fps is (10 sec: 9011.4, 60 sec: 9284.3, 300 sec: 10360.5). Total num frames: 2641920. Throughput: 0: 9170.3. Samples: 2627636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:15,729][832030] Avg episode reward: [(0, '153.760')]
+[2023-07-08 04:38:17,857][832316] Updated weights for policy 0, policy_version 5200 (0.0005)
+[2023-07-08 04:38:20,729][832030] Fps is (10 sec: 9011.1, 60 sec: 9216.0, 300 sec: 10334.5). Total num frames: 2686976. Throughput: 0: 9037.7. Samples: 2679820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:20,730][832030] Avg episode reward: [(0, '163.505')]
+[2023-07-08 04:38:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000005248_2686976.pth...
+[2023-07-08 04:38:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004720_2416640.pth
+[2023-07-08 04:38:22,331][832316] Updated weights for policy 0, policy_version 5280 (0.0005)
+[2023-07-08 04:38:25,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9147.7, 300 sec: 10309.6). Total num frames: 2732032. Throughput: 0: 9025.1. Samples: 2707928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:25,729][832030] Avg episode reward: [(0, '156.021')]
+[2023-07-08 04:38:26,943][832316] Updated weights for policy 0, policy_version 5360 (0.0005)
+[2023-07-08 04:38:30,729][832030] Fps is (10 sec: 9011.3, 60 sec: 9079.5, 300 sec: 10285.5). Total num frames: 2777088. Throughput: 0: 9002.1. Samples: 2761688. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:38:30,729][832030] Avg episode reward: [(0, '154.092')]
+[2023-07-08 04:38:31,527][832316] Updated weights for policy 0, policy_version 5440 (0.0005)
+[2023-07-08 04:38:35,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 10262.3). Total num frames: 2822144. Throughput: 0: 8987.1. Samples: 2816112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:35,729][832030] Avg episode reward: [(0, '167.742')]
+[2023-07-08 04:38:35,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000005512_2822144.pth...
+[2023-07-08 04:38:35,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000004992_2555904.pth
+[2023-07-08 04:38:36,002][832316] Updated weights for policy 0, policy_version 5520 (0.0005)
+[2023-07-08 04:38:40,370][832316] Updated weights for policy 0, policy_version 5600 (0.0005)
+[2023-07-08 04:38:40,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 10240.0). Total num frames: 2867200. Throughput: 0: 8999.0. Samples: 2843692. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:38:40,729][832030] Avg episode reward: [(0, '170.656')]
+[2023-07-08 04:38:44,976][832316] Updated weights for policy 0, policy_version 5680 (0.0005)
+[2023-07-08 04:38:45,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 10218.4). Total num frames: 2912256. Throughput: 0: 9007.7. Samples: 2898840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:45,729][832030] Avg episode reward: [(0, '155.881')]
+[2023-07-08 04:38:49,331][832316] Updated weights for policy 0, policy_version 5760 (0.0005)
+[2023-07-08 04:38:50,729][832030] Fps is (10 sec: 9420.8, 60 sec: 9079.5, 300 sec: 10211.8). Total num frames: 2961408. Throughput: 0: 9020.5. Samples: 2953640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:50,729][832030] Avg episode reward: [(0, '140.622')]
+[2023-07-08 04:38:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000005784_2961408.pth...
+[2023-07-08 04:38:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000005248_2686976.pth
+[2023-07-08 04:38:53,790][832316] Updated weights for policy 0, policy_version 5840 (0.0005)
+[2023-07-08 04:38:55,729][832030] Fps is (10 sec: 9420.8, 60 sec: 9079.5, 300 sec: 10191.4). Total num frames: 3006464. Throughput: 0: 9030.6. Samples: 2981256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:38:55,729][832030] Avg episode reward: [(0, '133.737')]
+[2023-07-08 04:38:58,284][832316] Updated weights for policy 0, policy_version 5920 (0.0006)
+[2023-07-08 04:39:00,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 10316.4). Total num frames: 3051520. Throughput: 0: 9076.4. Samples: 3036076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:00,729][832030] Avg episode reward: [(0, '147.663')]
+[2023-07-08 04:39:02,901][832316] Updated weights for policy 0, policy_version 6000 (0.0006)
+[2023-07-08 04:39:05,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 10274.7). Total num frames: 3096576. Throughput: 0: 9108.2. Samples: 3089688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:05,729][832030] Avg episode reward: [(0, '142.238')]
+[2023-07-08 04:39:05,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006048_3096576.pth...
+[2023-07-08 04:39:05,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000005512_2822144.pth
+[2023-07-08 04:39:07,306][832316] Updated weights for policy 0, policy_version 6080 (0.0005)
+[2023-07-08 04:39:10,729][832030] Fps is (10 sec: 9011.3, 60 sec: 9079.5, 300 sec: 10260.8). Total num frames: 3141632. Throughput: 0: 9106.8. Samples: 3117732. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:10,729][832030] Avg episode reward: [(0, '146.971')]
+[2023-07-08 04:39:11,682][832316] Updated weights for policy 0, policy_version 6160 (0.0005)
+[2023-07-08 04:39:15,729][832030] Fps is (10 sec: 9420.9, 60 sec: 9147.7, 300 sec: 10233.1). Total num frames: 3190784. Throughput: 0: 9177.3. Samples: 3174664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:15,729][832030] Avg episode reward: [(0, '132.634')]
+[2023-07-08 04:39:15,904][832316] Updated weights for policy 0, policy_version 6240 (0.0005)
+[2023-07-08 04:39:20,419][832316] Updated weights for policy 0, policy_version 6320 (0.0005)
+[2023-07-08 04:39:20,729][832030] Fps is (10 sec: 9420.7, 60 sec: 9147.7, 300 sec: 10177.5). Total num frames: 3235840. Throughput: 0: 9222.9. Samples: 3231144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:20,729][832030] Avg episode reward: [(0, '142.964')]
+[2023-07-08 04:39:20,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006320_3235840.pth...
+[2023-07-08 04:39:20,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000005784_2961408.pth
+[2023-07-08 04:39:25,160][832316] Updated weights for policy 0, policy_version 6400 (0.0006)
+[2023-07-08 04:39:25,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9147.8, 300 sec: 10149.8). Total num frames: 3280896. Throughput: 0: 9183.1. Samples: 3256932. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:25,729][832030] Avg episode reward: [(0, '121.627')]
+[2023-07-08 04:39:29,846][832316] Updated weights for policy 0, policy_version 6480 (0.0005)
+[2023-07-08 04:39:30,729][832030] Fps is (10 sec: 8601.7, 60 sec: 9079.5, 300 sec: 10094.2). Total num frames: 3321856. Throughput: 0: 9118.8. Samples: 3309184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:30,729][832030] Avg episode reward: [(0, '135.269')]
+[2023-07-08 04:39:34,459][832316] Updated weights for policy 0, policy_version 6560 (0.0005)
+[2023-07-08 04:39:35,729][832030] Fps is (10 sec: 8601.5, 60 sec: 9079.5, 300 sec: 10038.7). Total num frames: 3366912. Throughput: 0: 9087.7. Samples: 3362588. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:39:35,729][832030] Avg episode reward: [(0, '120.489')]
+[2023-07-08 04:39:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006576_3366912.pth...
+[2023-07-08 04:39:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006048_3096576.pth
+[2023-07-08 04:39:39,225][832316] Updated weights for policy 0, policy_version 6640 (0.0005)
+[2023-07-08 04:39:40,729][832030] Fps is (10 sec: 9011.1, 60 sec: 9079.5, 300 sec: 10010.9). Total num frames: 3411968. Throughput: 0: 9027.8. Samples: 3387508. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:40,729][832030] Avg episode reward: [(0, '119.863')]
+[2023-07-08 04:39:44,051][832316] Updated weights for policy 0, policy_version 6720 (0.0005)
+[2023-07-08 04:39:45,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 9955.4). Total num frames: 3452928. Throughput: 0: 8960.3. Samples: 3439288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:45,729][832030] Avg episode reward: [(0, '144.237')]
+[2023-07-08 04:39:48,755][832316] Updated weights for policy 0, policy_version 6800 (0.0005)
+[2023-07-08 04:39:50,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8942.9, 300 sec: 9913.7). Total num frames: 3497984. Throughput: 0: 8922.8. Samples: 3491212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:50,729][832030] Avg episode reward: [(0, '127.851')]
+[2023-07-08 04:39:50,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006832_3497984.pth...
+[2023-07-08 04:39:50,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006320_3235840.pth
+[2023-07-08 04:39:53,479][832316] Updated weights for policy 0, policy_version 6880 (0.0005)
+[2023-07-08 04:39:55,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 9872.1). Total num frames: 3538944. Throughput: 0: 8887.6. Samples: 3517676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:39:55,729][832030] Avg episode reward: [(0, '125.535')]
+[2023-07-08 04:39:58,330][832316] Updated weights for policy 0, policy_version 6960 (0.0005)
+[2023-07-08 04:40:00,729][832030] Fps is (10 sec: 8192.0, 60 sec: 8806.4, 300 sec: 9816.5). Total num frames: 3579904. Throughput: 0: 8733.7. Samples: 3567680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:00,729][832030] Avg episode reward: [(0, '128.638')]
+[2023-07-08 04:40:03,128][832316] Updated weights for policy 0, policy_version 7040 (0.0005)
+[2023-07-08 04:40:05,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8806.4, 300 sec: 9774.9). Total num frames: 3624960. Throughput: 0: 8639.9. Samples: 3619940. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:40:05,729][832030] Avg episode reward: [(0, '129.345')]
+[2023-07-08 04:40:05,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007080_3624960.pth...
+[2023-07-08 04:40:05,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006576_3366912.pth
+[2023-07-08 04:40:07,745][832316] Updated weights for policy 0, policy_version 7120 (0.0005)
+[2023-07-08 04:40:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 9747.1). Total num frames: 3670016. Throughput: 0: 8649.3. Samples: 3646152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:10,729][832030] Avg episode reward: [(0, '121.535')]
+[2023-07-08 04:40:12,529][832316] Updated weights for policy 0, policy_version 7200 (0.0005)
+[2023-07-08 04:40:15,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8669.9, 300 sec: 9691.6). Total num frames: 3710976. Throughput: 0: 8646.5. Samples: 3698276. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-08 04:40:15,729][832030] Avg episode reward: [(0, '130.862')]
+[2023-07-08 04:40:17,237][832316] Updated weights for policy 0, policy_version 7280 (0.0005)
+[2023-07-08 04:40:20,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 9636.0). Total num frames: 3756032. Throughput: 0: 8652.2. Samples: 3751936. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-08 04:40:20,729][832030] Avg episode reward: [(0, '129.193')]
+[2023-07-08 04:40:20,750][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007344_3760128.pth...
+[2023-07-08 04:40:20,752][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000006832_3497984.pth
+[2023-07-08 04:40:21,663][832316] Updated weights for policy 0, policy_version 7360 (0.0005)
+[2023-07-08 04:40:25,729][832030] Fps is (10 sec: 9420.8, 60 sec: 8738.1, 300 sec: 9622.1). Total num frames: 3805184. Throughput: 0: 8723.0. Samples: 3780044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:25,729][832030] Avg episode reward: [(0, '132.125')]
+[2023-07-08 04:40:26,132][832316] Updated weights for policy 0, policy_version 7440 (0.0005)
+[2023-07-08 04:40:30,583][832316] Updated weights for policy 0, policy_version 7520 (0.0005)
+[2023-07-08 04:40:30,729][832030] Fps is (10 sec: 9420.8, 60 sec: 8806.4, 300 sec: 9580.5). Total num frames: 3850240. Throughput: 0: 8769.6. Samples: 3833920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:30,729][832030] Avg episode reward: [(0, '137.002')]
+[2023-07-08 04:40:34,983][832316] Updated weights for policy 0, policy_version 7600 (0.0004)
+[2023-07-08 04:40:35,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8806.4, 300 sec: 9538.8). Total num frames: 3895296. Throughput: 0: 8868.0. Samples: 3890272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:35,729][832030] Avg episode reward: [(0, '147.760')]
+[2023-07-08 04:40:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007608_3895296.pth...
+[2023-07-08 04:40:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007080_3624960.pth
+[2023-07-08 04:40:39,393][832316] Updated weights for policy 0, policy_version 7680 (0.0005)
+[2023-07-08 04:40:40,729][832030] Fps is (10 sec: 9420.8, 60 sec: 8874.7, 300 sec: 9524.9). Total num frames: 3944448. Throughput: 0: 8903.0. Samples: 3918312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:40,729][832030] Avg episode reward: [(0, '136.942')]
+[2023-07-08 04:40:43,737][832316] Updated weights for policy 0, policy_version 7760 (0.0005)
+[2023-07-08 04:40:45,729][832030] Fps is (10 sec: 9420.7, 60 sec: 8942.9, 300 sec: 9483.3). Total num frames: 3989504. Throughput: 0: 9029.1. Samples: 3973992. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:40:45,729][832030] Avg episode reward: [(0, '123.855')]
+[2023-07-08 04:40:48,174][832316] Updated weights for policy 0, policy_version 7840 (0.0005)
+[2023-07-08 04:40:50,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 9455.5). Total num frames: 4034560. Throughput: 0: 9115.9. Samples: 4030156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:50,729][832030] Avg episode reward: [(0, '123.901')]
+[2023-07-08 04:40:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007880_4034560.pth...
+[2023-07-08 04:40:50,734][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007344_3760128.pth
+[2023-07-08 04:40:52,535][832316] Updated weights for policy 0, policy_version 7920 (0.0005)
+[2023-07-08 04:40:55,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9413.9). Total num frames: 4079616. Throughput: 0: 9155.5. Samples: 4058148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:40:55,729][832030] Avg episode reward: [(0, '119.766')]
+[2023-07-08 04:40:57,183][832316] Updated weights for policy 0, policy_version 8000 (0.0006)
+[2023-07-08 04:41:00,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 9386.1). Total num frames: 4124672. Throughput: 0: 9160.2. Samples: 4110484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:00,729][832030] Avg episode reward: [(0, '118.219')]
+[2023-07-08 04:41:01,916][832316] Updated weights for policy 0, policy_version 8080 (0.0006)
+[2023-07-08 04:41:05,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 9330.5). Total num frames: 4165632. Throughput: 0: 9111.1. Samples: 4161936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:05,729][832030] Avg episode reward: [(0, '116.622')]
+[2023-07-08 04:41:05,755][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008144_4169728.pth...
+[2023-07-08 04:41:05,758][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007608_3895296.pth
+[2023-07-08 04:41:06,763][832316] Updated weights for policy 0, policy_version 8160 (0.0005)
+[2023-07-08 04:41:10,729][832030] Fps is (10 sec: 8601.7, 60 sec: 9011.2, 300 sec: 9302.8). Total num frames: 4210688. Throughput: 0: 9043.9. Samples: 4187020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:10,729][832030] Avg episode reward: [(0, '125.598')]
+[2023-07-08 04:41:11,407][832316] Updated weights for policy 0, policy_version 8240 (0.0006)
+[2023-07-08 04:41:15,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 9275.0). Total num frames: 4255744. Throughput: 0: 9028.2. Samples: 4240188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:15,729][832030] Avg episode reward: [(0, '128.358')]
+[2023-07-08 04:41:16,115][832316] Updated weights for policy 0, policy_version 8320 (0.0005)
+[2023-07-08 04:41:20,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 9219.5). Total num frames: 4296704. Throughput: 0: 8932.7. Samples: 4292244. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:41:20,729][832030] Avg episode reward: [(0, '128.575')]
+[2023-07-08 04:41:20,731][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008392_4296704.pth...
+[2023-07-08 04:41:20,734][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000007880_4034560.pth
+[2023-07-08 04:41:20,915][832316] Updated weights for policy 0, policy_version 8400 (0.0006)
+[2023-07-08 04:41:25,729][832030] Fps is (10 sec: 8192.0, 60 sec: 8874.7, 300 sec: 9177.8). Total num frames: 4337664. Throughput: 0: 8865.2. Samples: 4317248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:25,729][832030] Avg episode reward: [(0, '132.175')]
+[2023-07-08 04:41:25,789][832316] Updated weights for policy 0, policy_version 8480 (0.0005)
+[2023-07-08 04:41:30,529][832316] Updated weights for policy 0, policy_version 8560 (0.0005)
+[2023-07-08 04:41:30,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 9150.0). Total num frames: 4382720. Throughput: 0: 8763.6. Samples: 4368352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:30,729][832030] Avg episode reward: [(0, '138.738')]
+[2023-07-08 04:41:35,292][832316] Updated weights for policy 0, policy_version 8640 (0.0005)
+[2023-07-08 04:41:35,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 9122.3). Total num frames: 4423680. Throughput: 0: 8655.6. Samples: 4419656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:35,729][832030] Avg episode reward: [(0, '141.694')]
+[2023-07-08 04:41:35,736][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008648_4427776.pth...
+[2023-07-08 04:41:35,739][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008144_4169728.pth
+[2023-07-08 04:41:39,956][832316] Updated weights for policy 0, policy_version 8720 (0.0005)
+[2023-07-08 04:41:40,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8738.2, 300 sec: 9094.5). Total num frames: 4468736. Throughput: 0: 8624.3. Samples: 4446240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:41:40,729][832030] Avg episode reward: [(0, '136.772')]
+[2023-07-08 04:41:44,510][832316] Updated weights for policy 0, policy_version 8800 (0.0005)
+[2023-07-08 04:41:45,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8738.2, 300 sec: 9080.6). Total num frames: 4513792. Throughput: 0: 8639.6. Samples: 4499264. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:41:45,729][832030] Avg episode reward: [(0, '133.353')]
+[2023-07-08 04:41:48,911][832316] Updated weights for policy 0, policy_version 8880 (0.0005)
+[2023-07-08 04:41:50,729][832030] Fps is (10 sec: 9420.7, 60 sec: 8806.4, 300 sec: 9066.7). Total num frames: 4562944. Throughput: 0: 8739.3. Samples: 4555204. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:41:50,729][832030] Avg episode reward: [(0, '125.767')]
+[2023-07-08 04:41:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008912_4562944.pth...
+[2023-07-08 04:41:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008392_4296704.pth
+[2023-07-08 04:41:53,363][832316] Updated weights for policy 0, policy_version 8960 (0.0005)
+[2023-07-08 04:41:55,729][832030] Fps is (10 sec: 9420.7, 60 sec: 8806.4, 300 sec: 9052.9). Total num frames: 4608000. Throughput: 0: 8808.0. Samples: 4583380. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:41:55,729][832030] Avg episode reward: [(0, '120.736')]
+[2023-07-08 04:41:57,750][832316] Updated weights for policy 0, policy_version 9040 (0.0005)
+[2023-07-08 04:42:00,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 9039.0). Total num frames: 4653056. Throughput: 0: 8855.8. Samples: 4638700. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:00,729][832030] Avg episode reward: [(0, '126.076')]
+[2023-07-08 04:42:02,251][832316] Updated weights for policy 0, policy_version 9120 (0.0005)
+[2023-07-08 04:42:05,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 9025.1). Total num frames: 4698112. Throughput: 0: 8928.4. Samples: 4694024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:05,729][832030] Avg episode reward: [(0, '117.687')]
+[2023-07-08 04:42:05,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009176_4698112.pth...
+[2023-07-08 04:42:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008648_4427776.pth
+[2023-07-08 04:42:06,726][832316] Updated weights for policy 0, policy_version 9200 (0.0005)
+[2023-07-08 04:42:10,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8874.7, 300 sec: 9011.2). Total num frames: 4743168. Throughput: 0: 8964.2. Samples: 4720636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:10,729][832030] Avg episode reward: [(0, '137.470')]
+[2023-07-08 04:42:11,228][832316] Updated weights for policy 0, policy_version 9280 (0.0005)
+[2023-07-08 04:42:15,708][832316] Updated weights for policy 0, policy_version 9360 (0.0005)
+[2023-07-08 04:42:15,729][832030] Fps is (10 sec: 9420.9, 60 sec: 8942.9, 300 sec: 9011.2). Total num frames: 4792320. Throughput: 0: 9052.6. Samples: 4775716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:15,729][832030] Avg episode reward: [(0, '130.399')]
+[2023-07-08 04:42:20,118][832316] Updated weights for policy 0, policy_version 9440 (0.0005)
+[2023-07-08 04:42:20,729][832030] Fps is (10 sec: 9420.7, 60 sec: 9011.2, 300 sec: 8997.3). Total num frames: 4837376. Throughput: 0: 9136.7. Samples: 4830808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:20,729][832030] Avg episode reward: [(0, '140.462')]
+[2023-07-08 04:42:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009448_4837376.pth...
+[2023-07-08 04:42:20,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000008912_4562944.pth
+[2023-07-08 04:42:24,628][832316] Updated weights for policy 0, policy_version 9520 (0.0005)
+[2023-07-08 04:42:25,729][832030] Fps is (10 sec: 9011.1, 60 sec: 9079.5, 300 sec: 8983.4). Total num frames: 4882432. Throughput: 0: 9172.3. Samples: 4858992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:42:25,729][832030] Avg episode reward: [(0, '126.418')]
+[2023-07-08 04:42:29,373][832316] Updated weights for policy 0, policy_version 9600 (0.0005)
+[2023-07-08 04:42:30,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 8969.5). Total num frames: 4923392. Throughput: 0: 9151.5. Samples: 4911084. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:42:30,729][832030] Avg episode reward: [(0, '122.074')]
+[2023-07-08 04:42:34,172][832316] Updated weights for policy 0, policy_version 9680 (0.0005)
+[2023-07-08 04:42:35,729][832030] Fps is (10 sec: 8601.5, 60 sec: 9079.5, 300 sec: 8955.7). Total num frames: 4968448. Throughput: 0: 9042.6. Samples: 4962120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:35,729][832030] Avg episode reward: [(0, '119.058')]
+[2023-07-08 04:42:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009704_4968448.pth...
+[2023-07-08 04:42:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009176_4698112.pth
+[2023-07-08 04:42:38,865][832316] Updated weights for policy 0, policy_version 9760 (0.0006)
+[2023-07-08 04:42:40,729][832030] Fps is (10 sec: 8601.7, 60 sec: 9011.2, 300 sec: 8941.8). Total num frames: 5009408. Throughput: 0: 8998.8. Samples: 4988324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:40,729][832030] Avg episode reward: [(0, '120.987')]
+[2023-07-08 04:42:43,494][832316] Updated weights for policy 0, policy_version 9840 (0.0005)
+[2023-07-08 04:42:45,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 8941.8). Total num frames: 5054464. Throughput: 0: 8954.6. Samples: 5041656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:45,729][832030] Avg episode reward: [(0, '128.043')]
+[2023-07-08 04:42:48,200][832316] Updated weights for policy 0, policy_version 9920 (0.0005)
+[2023-07-08 04:42:50,729][832030] Fps is (10 sec: 9011.0, 60 sec: 8942.9, 300 sec: 8941.8). Total num frames: 5099520. Throughput: 0: 8865.6. Samples: 5092976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:50,729][832030] Avg episode reward: [(0, '118.036')]
+[2023-07-08 04:42:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009960_5099520.pth...
+[2023-07-08 04:42:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009448_4837376.pth
+[2023-07-08 04:42:53,020][832316] Updated weights for policy 0, policy_version 10000 (0.0005)
+[2023-07-08 04:42:55,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8927.9). Total num frames: 5140480. Throughput: 0: 8845.2. Samples: 5118672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:42:55,729][832030] Avg episode reward: [(0, '119.563')]
+[2023-07-08 04:42:57,764][832316] Updated weights for policy 0, policy_version 10080 (0.0006)
+[2023-07-08 04:43:00,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8927.9). Total num frames: 5185536. Throughput: 0: 8777.5. Samples: 5170704. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:00,729][832030] Avg episode reward: [(0, '124.557')]
+[2023-07-08 04:43:02,253][832316] Updated weights for policy 0, policy_version 10160 (0.0005)
+[2023-07-08 04:43:05,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8927.9). Total num frames: 5230592. Throughput: 0: 8783.6. Samples: 5226072. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:43:05,729][832030] Avg episode reward: [(0, '115.830')]
+[2023-07-08 04:43:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010216_5230592.pth...
+[2023-07-08 04:43:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009704_4968448.pth
+[2023-07-08 04:43:06,813][832316] Updated weights for policy 0, policy_version 10240 (0.0005)
+[2023-07-08 04:43:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8927.9). Total num frames: 5275648. Throughput: 0: 8730.6. Samples: 5251868. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-08 04:43:10,729][832030] Avg episode reward: [(0, '143.553')]
+[2023-07-08 04:43:11,469][832316] Updated weights for policy 0, policy_version 10320 (0.0005)
+[2023-07-08 04:43:15,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8914.0). Total num frames: 5316608. Throughput: 0: 8738.8. Samples: 5304328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:15,729][832030] Avg episode reward: [(0, '128.060')]
+[2023-07-08 04:43:16,286][832316] Updated weights for policy 0, policy_version 10400 (0.0005)
+[2023-07-08 04:43:20,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8738.1, 300 sec: 8914.0). Total num frames: 5361664. Throughput: 0: 8739.0. Samples: 5355376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:20,729][832030] Avg episode reward: [(0, '125.885')]
+[2023-07-08 04:43:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010472_5361664.pth...
+[2023-07-08 04:43:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000009960_5099520.pth
+[2023-07-08 04:43:21,022][832316] Updated weights for policy 0, policy_version 10480 (0.0005)
+[2023-07-08 04:43:25,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8900.1). Total num frames: 5402624. Throughput: 0: 8738.5. Samples: 5381556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:25,729][832030] Avg episode reward: [(0, '121.601')]
+[2023-07-08 04:43:25,833][832316] Updated weights for policy 0, policy_version 10560 (0.0005)
+[2023-07-08 04:43:30,544][832316] Updated weights for policy 0, policy_version 10640 (0.0005)
+[2023-07-08 04:43:30,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8900.1). Total num frames: 5447680. Throughput: 0: 8683.4. Samples: 5432408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:30,729][832030] Avg episode reward: [(0, '124.817')]
+[2023-07-08 04:43:35,378][832316] Updated weights for policy 0, policy_version 10720 (0.0005)
+[2023-07-08 04:43:35,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8886.2). Total num frames: 5488640. Throughput: 0: 8701.3. Samples: 5484536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:35,729][832030] Avg episode reward: [(0, '116.270')]
+[2023-07-08 04:43:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010720_5488640.pth...
+[2023-07-08 04:43:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010216_5230592.pth
+[2023-07-08 04:43:40,160][832316] Updated weights for policy 0, policy_version 10800 (0.0005)
+[2023-07-08 04:43:40,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8886.2). Total num frames: 5533696. Throughput: 0: 8697.6. Samples: 5510064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:40,729][832030] Avg episode reward: [(0, '114.374')]
+[2023-07-08 04:43:44,822][832316] Updated weights for policy 0, policy_version 10880 (0.0005)
+[2023-07-08 04:43:45,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8738.1, 300 sec: 8872.4). Total num frames: 5578752. Throughput: 0: 8698.4. Samples: 5562132. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:43:45,729][832030] Avg episode reward: [(0, '116.060')]
+[2023-07-08 04:43:49,358][832316] Updated weights for policy 0, policy_version 10960 (0.0005)
+[2023-07-08 04:43:50,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8872.4). Total num frames: 5623808. Throughput: 0: 8658.0. Samples: 5615680. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:43:50,729][832030] Avg episode reward: [(0, '124.053')]
+[2023-07-08 04:43:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010984_5623808.pth...
+[2023-07-08 04:43:50,734][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010472_5361664.pth
+[2023-07-08 04:43:53,892][832316] Updated weights for policy 0, policy_version 11040 (0.0005)
+[2023-07-08 04:43:55,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8872.4). Total num frames: 5668864. Throughput: 0: 8698.7. Samples: 5643308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:43:55,729][832030] Avg episode reward: [(0, '124.489')]
+[2023-07-08 04:43:58,261][832316] Updated weights for policy 0, policy_version 11120 (0.0004)
+[2023-07-08 04:44:00,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8738.2, 300 sec: 8858.5). Total num frames: 5709824. Throughput: 0: 8754.0. Samples: 5698256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:00,729][832030] Avg episode reward: [(0, '115.052')]
+[2023-07-08 04:44:03,195][832316] Updated weights for policy 0, policy_version 11200 (0.0005)
+[2023-07-08 04:44:05,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8738.1, 300 sec: 8858.5). Total num frames: 5754880. Throughput: 0: 8740.1. Samples: 5748680. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:44:05,729][832030] Avg episode reward: [(0, '119.720')]
+[2023-07-08 04:44:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000011240_5754880.pth...
+[2023-07-08 04:44:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010720_5488640.pth
+[2023-07-08 04:44:08,141][832316] Updated weights for policy 0, policy_version 11280 (0.0005)
+[2023-07-08 04:44:10,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8669.9, 300 sec: 8830.7). Total num frames: 5795840. Throughput: 0: 8699.6. Samples: 5773040. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:44:10,729][832030] Avg episode reward: [(0, '113.877')]
+[2023-07-08 04:44:12,792][832316] Updated weights for policy 0, policy_version 11360 (0.0005)
+[2023-07-08 04:44:15,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8830.7). Total num frames: 5840896. Throughput: 0: 8740.4. Samples: 5825724. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:15,729][832030] Avg episode reward: [(0, '115.262')]
+[2023-07-08 04:44:17,285][832316] Updated weights for policy 0, policy_version 11440 (0.0005)
+[2023-07-08 04:44:20,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8830.7). Total num frames: 5885952. Throughput: 0: 8808.5. Samples: 5880920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:20,729][832030] Avg episode reward: [(0, '123.614')]
+[2023-07-08 04:44:20,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000011496_5885952.pth...
+[2023-07-08 04:44:20,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000010984_5623808.pth
+[2023-07-08 04:44:21,730][832316] Updated weights for policy 0, policy_version 11520 (0.0005)
+[2023-07-08 04:44:25,729][832030] Fps is (10 sec: 9420.8, 60 sec: 8874.7, 300 sec: 8858.5). Total num frames: 5935104. Throughput: 0: 8861.1. Samples: 5908812. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:25,729][832030] Avg episode reward: [(0, '117.023')]
+[2023-07-08 04:44:26,157][832316] Updated weights for policy 0, policy_version 11600 (0.0005)
+[2023-07-08 04:44:30,610][832316] Updated weights for policy 0, policy_version 11680 (0.0005)
+[2023-07-08 04:44:30,729][832030] Fps is (10 sec: 9420.8, 60 sec: 8874.7, 300 sec: 8858.5). Total num frames: 5980160. Throughput: 0: 8926.8. Samples: 5963840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:30,729][832030] Avg episode reward: [(0, '134.608')]
+[2023-07-08 04:44:35,031][832316] Updated weights for policy 0, policy_version 11760 (0.0005)
+[2023-07-08 04:44:35,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8942.9, 300 sec: 8858.5). Total num frames: 6025216. Throughput: 0: 8975.8. Samples: 6019592. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:35,729][832030] Avg episode reward: [(0, '126.130')]
+[2023-07-08 04:44:35,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000011768_6025216.pth...
+[2023-07-08 04:44:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000011240_5754880.pth
+[2023-07-08 04:44:39,487][832316] Updated weights for policy 0, policy_version 11840 (0.0005)
+[2023-07-08 04:44:40,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8872.4). Total num frames: 6070272. Throughput: 0: 8970.7. Samples: 6046992. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:44:40,729][832030] Avg episode reward: [(0, '114.550')]
+[2023-07-08 04:44:43,969][832316] Updated weights for policy 0, policy_version 11920 (0.0005)
+[2023-07-08 04:44:45,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8872.4). Total num frames: 6115328. Throughput: 0: 8978.6. Samples: 6102292. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:44:45,729][832030] Avg episode reward: [(0, '113.710')]
+[2023-07-08 04:44:48,459][832316] Updated weights for policy 0, policy_version 12000 (0.0005)
+[2023-07-08 04:44:50,729][832030] Fps is (10 sec: 9420.7, 60 sec: 9011.2, 300 sec: 8900.1). Total num frames: 6164480. Throughput: 0: 9061.9. Samples: 6156468. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:50,729][832030] Avg episode reward: [(0, '119.342')]
+[2023-07-08 04:44:50,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012040_6164480.pth...
+[2023-07-08 04:44:50,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000011496_5885952.pth
+[2023-07-08 04:44:52,960][832316] Updated weights for policy 0, policy_version 12080 (0.0005)
+[2023-07-08 04:44:55,729][832030] Fps is (10 sec: 9420.8, 60 sec: 9011.2, 300 sec: 8914.0). Total num frames: 6209536. Throughput: 0: 9138.7. Samples: 6184284. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:44:55,729][832030] Avg episode reward: [(0, '125.288')]
+[2023-07-08 04:44:57,399][832316] Updated weights for policy 0, policy_version 12160 (0.0004)
+[2023-07-08 04:45:00,729][832030] Fps is (10 sec: 9011.3, 60 sec: 9079.5, 300 sec: 8914.0). Total num frames: 6254592. Throughput: 0: 9187.5. Samples: 6239164. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:45:00,729][832030] Avg episode reward: [(0, '120.510')]
+[2023-07-08 04:45:01,853][832316] Updated weights for policy 0, policy_version 12240 (0.0004)
+[2023-07-08 04:45:05,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 8914.0). Total num frames: 6299648. Throughput: 0: 9193.9. Samples: 6294644. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:45:05,729][832030] Avg episode reward: [(0, '109.274')]
+[2023-07-08 04:45:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012304_6299648.pth...
+[2023-07-08 04:45:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000011768_6025216.pth
+[2023-07-08 04:45:06,347][832316] Updated weights for policy 0, policy_version 12320 (0.0005)
+[2023-07-08 04:45:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9147.7, 300 sec: 8927.9). Total num frames: 6344704. Throughput: 0: 9175.1. Samples: 6321692. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-08 04:45:10,729][832030] Avg episode reward: [(0, '110.828')]
+[2023-07-08 04:45:11,031][832316] Updated weights for policy 0, policy_version 12400 (0.0005)
+[2023-07-08 04:45:15,729][832030] Fps is (10 sec: 8601.7, 60 sec: 9079.5, 300 sec: 8914.0). Total num frames: 6385664. Throughput: 0: 9101.0. Samples: 6373384. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-08 04:45:15,729][832030] Avg episode reward: [(0, '110.104')]
+[2023-07-08 04:45:15,731][832316] Updated weights for policy 0, policy_version 12480 (0.0005)
+[2023-07-08 04:45:20,313][832316] Updated weights for policy 0, policy_version 12560 (0.0005)
+[2023-07-08 04:45:20,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9079.5, 300 sec: 8900.1). Total num frames: 6430720. Throughput: 0: 9049.4. Samples: 6426816. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:45:20,729][832030] Avg episode reward: [(0, '110.613')]
+[2023-07-08 04:45:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012560_6430720.pth...
+[2023-07-08 04:45:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012040_6164480.pth
+[2023-07-08 04:45:25,082][832316] Updated weights for policy 0, policy_version 12640 (0.0005)
+[2023-07-08 04:45:25,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8900.1). Total num frames: 6475776. Throughput: 0: 8998.2. Samples: 6451912. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:45:25,729][832030] Avg episode reward: [(0, '113.354')]
+[2023-07-08 04:45:29,972][832316] Updated weights for policy 0, policy_version 12720 (0.0005)
+[2023-07-08 04:45:30,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8886.2). Total num frames: 6516736. Throughput: 0: 8925.0. Samples: 6503916. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:45:30,729][832030] Avg episode reward: [(0, '113.499')]
+[2023-07-08 04:45:34,777][832316] Updated weights for policy 0, policy_version 12800 (0.0005)
+[2023-07-08 04:45:35,729][832030] Fps is (10 sec: 8192.0, 60 sec: 8874.7, 300 sec: 8858.5). Total num frames: 6557696. Throughput: 0: 8836.2. Samples: 6554096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:45:35,729][832030] Avg episode reward: [(0, '110.471')]
+[2023-07-08 04:45:35,752][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012816_6561792.pth...
+[2023-07-08 04:45:35,754][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012304_6299648.pth
+[2023-07-08 04:45:39,461][832316] Updated weights for policy 0, policy_version 12880 (0.0005)
+[2023-07-08 04:45:40,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8874.7, 300 sec: 8858.5). Total num frames: 6602752. Throughput: 0: 8819.4. Samples: 6581156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:45:40,729][832030] Avg episode reward: [(0, '112.793')]
+[2023-07-08 04:45:44,082][832316] Updated weights for policy 0, policy_version 12960 (0.0005)
+[2023-07-08 04:45:45,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8858.5). Total num frames: 6647808. Throughput: 0: 8762.9. Samples: 6633492. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:45:45,729][832030] Avg episode reward: [(0, '121.147')]
+[2023-07-08 04:45:48,794][832316] Updated weights for policy 0, policy_version 13040 (0.0005)
+[2023-07-08 04:45:50,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8738.1, 300 sec: 8844.6). Total num frames: 6688768. Throughput: 0: 8677.2. Samples: 6685120. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-08 04:45:50,729][832030] Avg episode reward: [(0, '110.873')]
+[2023-07-08 04:45:50,755][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013072_6692864.pth...
+[2023-07-08 04:45:50,758][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012560_6430720.pth
+[2023-07-08 04:45:53,644][832316] Updated weights for policy 0, policy_version 13120 (0.0005)
+[2023-07-08 04:45:55,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8844.6). Total num frames: 6733824. Throughput: 0: 8639.4. Samples: 6710464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:45:55,729][832030] Avg episode reward: [(0, '128.579')]
+[2023-07-08 04:45:58,258][832316] Updated weights for policy 0, policy_version 13200 (0.0005)
+[2023-07-08 04:46:00,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8738.1, 300 sec: 8858.5). Total num frames: 6778880. Throughput: 0: 8671.1. Samples: 6763584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:00,729][832030] Avg episode reward: [(0, '114.662')]
+[2023-07-08 04:46:03,052][832316] Updated weights for policy 0, policy_version 13280 (0.0005)
+[2023-07-08 04:46:05,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8844.6). Total num frames: 6819840. Throughput: 0: 8623.1. Samples: 6814856. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:46:05,729][832030] Avg episode reward: [(0, '117.219')]
+[2023-07-08 04:46:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013320_6819840.pth...
+[2023-07-08 04:46:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000012816_6561792.pth
+[2023-07-08 04:46:07,896][832316] Updated weights for policy 0, policy_version 13360 (0.0005)
+[2023-07-08 04:46:10,729][832030] Fps is (10 sec: 8192.1, 60 sec: 8601.6, 300 sec: 8830.7). Total num frames: 6860800. Throughput: 0: 8630.8. Samples: 6840296. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:46:10,729][832030] Avg episode reward: [(0, '114.374')]
+[2023-07-08 04:46:12,722][832316] Updated weights for policy 0, policy_version 13440 (0.0005)
+[2023-07-08 04:46:15,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8844.6). Total num frames: 6905856. Throughput: 0: 8609.1. Samples: 6891328. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:46:15,729][832030] Avg episode reward: [(0, '113.500')]
+[2023-07-08 04:46:17,452][832316] Updated weights for policy 0, policy_version 13520 (0.0005)
+[2023-07-08 04:46:20,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8669.9, 300 sec: 8858.5). Total num frames: 6950912. Throughput: 0: 8640.2. Samples: 6942908. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:46:20,729][832030] Avg episode reward: [(0, '109.705')]
+[2023-07-08 04:46:20,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013576_6950912.pth...
+[2023-07-08 04:46:20,734][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013072_6692864.pth
+[2023-07-08 04:46:22,182][832316] Updated weights for policy 0, policy_version 13600 (0.0005)
+[2023-07-08 04:46:25,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8844.6). Total num frames: 6991872. Throughput: 0: 8620.4. Samples: 6969076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:25,729][832030] Avg episode reward: [(0, '109.740')]
+[2023-07-08 04:46:26,962][832316] Updated weights for policy 0, policy_version 13680 (0.0005)
+[2023-07-08 04:46:30,729][832030] Fps is (10 sec: 8192.0, 60 sec: 8601.6, 300 sec: 8844.6). Total num frames: 7032832. Throughput: 0: 8602.6. Samples: 7020608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:30,729][832030] Avg episode reward: [(0, '108.234')]
+[2023-07-08 04:46:31,715][832316] Updated weights for policy 0, policy_version 13760 (0.0005)
+[2023-07-08 04:46:35,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8844.6). Total num frames: 7077888. Throughput: 0: 8611.4. Samples: 7072632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:35,729][832030] Avg episode reward: [(0, '110.853')]
+[2023-07-08 04:46:35,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013824_7077888.pth...
+[2023-07-08 04:46:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013320_6819840.pth
+[2023-07-08 04:46:36,488][832316] Updated weights for policy 0, policy_version 13840 (0.0005)
+[2023-07-08 04:46:40,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8830.7). Total num frames: 7118848. Throughput: 0: 8615.7. Samples: 7098172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:40,729][832030] Avg episode reward: [(0, '110.242')]
+[2023-07-08 04:46:41,179][832316] Updated weights for policy 0, policy_version 13920 (0.0005)
+[2023-07-08 04:46:45,632][832316] Updated weights for policy 0, policy_version 14000 (0.0005)
+[2023-07-08 04:46:45,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8830.7). Total num frames: 7168000. Throughput: 0: 8624.7. Samples: 7151696. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:45,729][832030] Avg episode reward: [(0, '112.031')]
+[2023-07-08 04:46:50,392][832316] Updated weights for policy 0, policy_version 14080 (0.0005)
+[2023-07-08 04:46:50,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8816.8). Total num frames: 7208960. Throughput: 0: 8663.6. Samples: 7204720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:50,729][832030] Avg episode reward: [(0, '111.748')]
+[2023-07-08 04:46:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014080_7208960.pth...
+[2023-07-08 04:46:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013576_6950912.pth
+[2023-07-08 04:46:55,055][832316] Updated weights for policy 0, policy_version 14160 (0.0005)
+[2023-07-08 04:46:55,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8816.8). Total num frames: 7254016. Throughput: 0: 8679.9. Samples: 7230892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:46:55,729][832030] Avg episode reward: [(0, '109.722')]
+[2023-07-08 04:46:59,627][832316] Updated weights for policy 0, policy_version 14240 (0.0005)
+[2023-07-08 04:47:00,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8669.9, 300 sec: 8816.8). Total num frames: 7299072. Throughput: 0: 8720.0. Samples: 7283728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:00,729][832030] Avg episode reward: [(0, '106.526')]
+[2023-07-08 04:47:04,056][832316] Updated weights for policy 0, policy_version 14320 (0.0005)
+[2023-07-08 04:47:05,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8816.8). Total num frames: 7344128. Throughput: 0: 8810.4. Samples: 7339376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:05,729][832030] Avg episode reward: [(0, '104.776')]
+[2023-07-08 04:47:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014344_7344128.pth...
+[2023-07-08 04:47:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000013824_7077888.pth
+[2023-07-08 04:47:08,559][832316] Updated weights for policy 0, policy_version 14400 (0.0005)
+[2023-07-08 04:47:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8802.9). Total num frames: 7389184. Throughput: 0: 8822.7. Samples: 7366096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:10,729][832030] Avg episode reward: [(0, '118.188')]
+[2023-07-08 04:47:13,094][832316] Updated weights for policy 0, policy_version 14480 (0.0005)
+[2023-07-08 04:47:15,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8802.9). Total num frames: 7434240. Throughput: 0: 8900.3. Samples: 7421120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:15,729][832030] Avg episode reward: [(0, '111.298')]
+[2023-07-08 04:47:17,531][832316] Updated weights for policy 0, policy_version 14560 (0.0005)
+[2023-07-08 04:47:20,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8802.9). Total num frames: 7479296. Throughput: 0: 8953.2. Samples: 7475528. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:47:20,729][832030] Avg episode reward: [(0, '114.683')]
+[2023-07-08 04:47:20,753][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014616_7483392.pth...
+[2023-07-08 04:47:20,756][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014080_7208960.pth
+[2023-07-08 04:47:22,092][832316] Updated weights for policy 0, policy_version 14640 (0.0005)
+[2023-07-08 04:47:25,729][832030] Fps is (10 sec: 9420.8, 60 sec: 8942.9, 300 sec: 8830.7). Total num frames: 7528448. Throughput: 0: 8993.5. Samples: 7502880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-08 04:47:25,729][832030] Avg episode reward: [(0, '106.778')]
+[2023-07-08 04:47:26,552][832316] Updated weights for policy 0, policy_version 14720 (0.0005)
+[2023-07-08 04:47:30,729][832030] Fps is (10 sec: 9420.8, 60 sec: 9011.2, 300 sec: 8830.7). Total num frames: 7573504. Throughput: 0: 9023.7. Samples: 7557764. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:47:30,729][832030] Avg episode reward: [(0, '124.460')]
+[2023-07-08 04:47:30,961][832316] Updated weights for policy 0, policy_version 14800 (0.0005)
+[2023-07-08 04:47:35,419][832316] Updated weights for policy 0, policy_version 14880 (0.0005)
+[2023-07-08 04:47:35,729][832030] Fps is (10 sec: 9011.1, 60 sec: 9011.2, 300 sec: 8844.6). Total num frames: 7618560. Throughput: 0: 9086.2. Samples: 7613600. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:47:35,729][832030] Avg episode reward: [(0, '110.144')]
+[2023-07-08 04:47:35,739][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014880_7618560.pth...
+[2023-07-08 04:47:35,742][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014344_7344128.pth
+[2023-07-08 04:47:39,875][832316] Updated weights for policy 0, policy_version 14960 (0.0005)
+[2023-07-08 04:47:40,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9079.5, 300 sec: 8844.6). Total num frames: 7663616. Throughput: 0: 9115.0. Samples: 7641068. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:40,729][832030] Avg episode reward: [(0, '109.284')]
+[2023-07-08 04:47:44,769][832316] Updated weights for policy 0, policy_version 15040 (0.0005)
+[2023-07-08 04:47:45,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8942.9, 300 sec: 8830.7). Total num frames: 7704576. Throughput: 0: 9088.3. Samples: 7692700. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:45,729][832030] Avg episode reward: [(0, '105.091')]
+[2023-07-08 04:47:49,563][832316] Updated weights for policy 0, policy_version 15120 (0.0005)
+[2023-07-08 04:47:50,729][832030] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 8844.6). Total num frames: 7749632. Throughput: 0: 8996.0. Samples: 7744196. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:50,729][832030] Avg episode reward: [(0, '129.789')]
+[2023-07-08 04:47:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015136_7749632.pth...
+[2023-07-08 04:47:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014616_7483392.pth
+[2023-07-08 04:47:54,351][832316] Updated weights for policy 0, policy_version 15200 (0.0005)
+[2023-07-08 04:47:55,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8942.9, 300 sec: 8830.7). Total num frames: 7790592. Throughput: 0: 8978.1. Samples: 7770112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:47:55,729][832030] Avg episode reward: [(0, '113.950')]
+[2023-07-08 04:47:59,089][832316] Updated weights for policy 0, policy_version 15280 (0.0005)
+[2023-07-08 04:48:00,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8830.7). Total num frames: 7835648. Throughput: 0: 8895.0. Samples: 7821396. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:00,729][832030] Avg episode reward: [(0, '114.059')]
+[2023-07-08 04:48:03,843][832316] Updated weights for policy 0, policy_version 15360 (0.0005)
+[2023-07-08 04:48:05,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8874.7, 300 sec: 8816.8). Total num frames: 7876608. Throughput: 0: 8827.4. Samples: 7872760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:05,729][832030] Avg episode reward: [(0, '112.034')]
+[2023-07-08 04:48:05,763][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015392_7880704.pth...
+[2023-07-08 04:48:05,766][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000014880_7618560.pth
+[2023-07-08 04:48:08,596][832316] Updated weights for policy 0, policy_version 15440 (0.0005)
+[2023-07-08 04:48:10,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8830.7). Total num frames: 7921664. Throughput: 0: 8800.4. Samples: 7898900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:10,729][832030] Avg episode reward: [(0, '109.283')]
+[2023-07-08 04:48:13,408][832316] Updated weights for policy 0, policy_version 15520 (0.0005)
+[2023-07-08 04:48:15,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8816.8). Total num frames: 7962624. Throughput: 0: 8724.0. Samples: 7950344. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:48:15,729][832030] Avg episode reward: [(0, '131.998')]
+[2023-07-08 04:48:18,168][832316] Updated weights for policy 0, policy_version 15600 (0.0004)
+[2023-07-08 04:48:20,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8830.7). Total num frames: 8007680. Throughput: 0: 8622.6. Samples: 8001616. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:48:20,729][832030] Avg episode reward: [(0, '110.692')]
+[2023-07-08 04:48:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015640_8007680.pth...
+[2023-07-08 04:48:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015136_7749632.pth
+[2023-07-08 04:48:22,956][832316] Updated weights for policy 0, policy_version 15680 (0.0005)
+[2023-07-08 04:48:25,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8816.8). Total num frames: 8048640. Throughput: 0: 8598.2. Samples: 8027988. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:48:25,729][832030] Avg episode reward: [(0, '114.841')]
+[2023-07-08 04:48:27,770][832316] Updated weights for policy 0, policy_version 15760 (0.0005)
+[2023-07-08 04:48:30,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8830.7). Total num frames: 8093696. Throughput: 0: 8568.5. Samples: 8078284. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:48:30,729][832030] Avg episode reward: [(0, '114.031')]
+[2023-07-08 04:48:32,535][832316] Updated weights for policy 0, policy_version 15840 (0.0005)
+[2023-07-08 04:48:35,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8816.8). Total num frames: 8134656. Throughput: 0: 8585.9. Samples: 8130560. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-08 04:48:35,729][832030] Avg episode reward: [(0, '110.687')]
+[2023-07-08 04:48:35,731][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015888_8134656.pth...
+[2023-07-08 04:48:35,734][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015392_7880704.pth
+[2023-07-08 04:48:37,244][832316] Updated weights for policy 0, policy_version 15920 (0.0005)
+[2023-07-08 04:48:40,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8816.8). Total num frames: 8179712. Throughput: 0: 8581.2. Samples: 8156268. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:40,729][832030] Avg episode reward: [(0, '111.257')]
+[2023-07-08 04:48:41,996][832316] Updated weights for policy 0, policy_version 16000 (0.0005)
+[2023-07-08 04:48:45,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8802.9). Total num frames: 8220672. Throughput: 0: 8599.8. Samples: 8208384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:45,729][832030] Avg episode reward: [(0, '109.951')]
+[2023-07-08 04:48:46,693][832316] Updated weights for policy 0, policy_version 16080 (0.0005)
+[2023-07-08 04:48:50,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8601.6, 300 sec: 8802.9). Total num frames: 8265728. Throughput: 0: 8579.1. Samples: 8258820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:50,729][832030] Avg episode reward: [(0, '109.333')]
+[2023-07-08 04:48:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016144_8265728.pth...
+[2023-07-08 04:48:50,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015640_8007680.pth
+[2023-07-08 04:48:51,586][832316] Updated weights for policy 0, policy_version 16160 (0.0005)
+[2023-07-08 04:48:55,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8601.6, 300 sec: 8802.9). Total num frames: 8306688. Throughput: 0: 8575.4. Samples: 8284792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:48:55,729][832030] Avg episode reward: [(0, '105.955')]
+[2023-07-08 04:48:56,478][832316] Updated weights for policy 0, policy_version 16240 (0.0005)
+[2023-07-08 04:49:00,729][832030] Fps is (10 sec: 8192.1, 60 sec: 8533.3, 300 sec: 8789.0). Total num frames: 8347648. Throughput: 0: 8557.3. Samples: 8335424. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:00,729][832030] Avg episode reward: [(0, '107.845')]
+[2023-07-08 04:49:01,231][832316] Updated weights for policy 0, policy_version 16320 (0.0005)
+[2023-07-08 04:49:05,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8802.9). Total num frames: 8392704. Throughput: 0: 8567.6. Samples: 8387156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:05,729][832030] Avg episode reward: [(0, '108.484')]
+[2023-07-08 04:49:05,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016392_8392704.pth...
+[2023-07-08 04:49:05,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000015888_8134656.pth
+[2023-07-08 04:49:05,911][832316] Updated weights for policy 0, policy_version 16400 (0.0005)
+[2023-07-08 04:49:10,475][832316] Updated weights for policy 0, policy_version 16480 (0.0005)
+[2023-07-08 04:49:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8601.6, 300 sec: 8802.9). Total num frames: 8437760. Throughput: 0: 8582.4. Samples: 8414196. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:10,729][832030] Avg episode reward: [(0, '113.720')]
+[2023-07-08 04:49:15,011][832316] Updated weights for policy 0, policy_version 16560 (0.0005)
+[2023-07-08 04:49:15,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8669.9, 300 sec: 8802.9). Total num frames: 8482816. Throughput: 0: 8673.1. Samples: 8468572. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:15,729][832030] Avg episode reward: [(0, '108.366')]
+[2023-07-08 04:49:19,541][832316] Updated weights for policy 0, policy_version 16640 (0.0005)
+[2023-07-08 04:49:20,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8669.9, 300 sec: 8789.0). Total num frames: 8527872. Throughput: 0: 8711.3. Samples: 8522568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:20,729][832030] Avg episode reward: [(0, '107.261')]
+[2023-07-08 04:49:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016656_8527872.pth...
+[2023-07-08 04:49:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016144_8265728.pth
+[2023-07-08 04:49:24,251][832316] Updated weights for policy 0, policy_version 16720 (0.0005)
+[2023-07-08 04:49:25,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8789.0). Total num frames: 8572928. Throughput: 0: 8712.4. Samples: 8548324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:25,729][832030] Avg episode reward: [(0, '111.233')]
+[2023-07-08 04:49:28,984][832316] Updated weights for policy 0, policy_version 16800 (0.0005)
+[2023-07-08 04:49:30,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8669.9, 300 sec: 8775.2). Total num frames: 8613888. Throughput: 0: 8725.4. Samples: 8601028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:30,729][832030] Avg episode reward: [(0, '114.754')]
+[2023-07-08 04:49:33,793][832316] Updated weights for policy 0, policy_version 16880 (0.0005)
+[2023-07-08 04:49:35,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8775.2). Total num frames: 8658944. Throughput: 0: 8720.8. Samples: 8651256. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:49:35,729][832030] Avg episode reward: [(0, '112.327')]
+[2023-07-08 04:49:35,834][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016912_8658944.pth...
+[2023-07-08 04:49:35,837][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016392_8392704.pth
+[2023-07-08 04:49:38,420][832316] Updated weights for policy 0, policy_version 16960 (0.0005)
+[2023-07-08 04:49:40,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8738.1, 300 sec: 8775.2). Total num frames: 8704000. Throughput: 0: 8752.6. Samples: 8678660. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:49:40,729][832030] Avg episode reward: [(0, '112.494')]
+[2023-07-08 04:49:43,037][832316] Updated weights for policy 0, policy_version 17040 (0.0005)
+[2023-07-08 04:49:45,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8747.4). Total num frames: 8744960. Throughput: 0: 8812.2. Samples: 8731972. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:49:45,729][832030] Avg episode reward: [(0, '117.821')]
+[2023-07-08 04:49:47,855][832316] Updated weights for policy 0, policy_version 17120 (0.0005)
+[2023-07-08 04:49:50,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8747.4). Total num frames: 8790016. Throughput: 0: 8801.8. Samples: 8783236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:50,729][832030] Avg episode reward: [(0, '122.928')]
+[2023-07-08 04:49:50,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017168_8790016.pth...
+[2023-07-08 04:49:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016656_8527872.pth
+[2023-07-08 04:49:52,313][832316] Updated weights for policy 0, policy_version 17200 (0.0005)
+[2023-07-08 04:49:55,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8747.4). Total num frames: 8835072. Throughput: 0: 8817.7. Samples: 8810992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:49:55,729][832030] Avg episode reward: [(0, '108.015')]
+[2023-07-08 04:49:56,817][832316] Updated weights for policy 0, policy_version 17280 (0.0005)
+[2023-07-08 04:50:00,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8747.4). Total num frames: 8880128. Throughput: 0: 8827.3. Samples: 8865800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:00,729][832030] Avg episode reward: [(0, '104.923')]
+[2023-07-08 04:50:01,418][832316] Updated weights for policy 0, policy_version 17360 (0.0005)
+[2023-07-08 04:50:05,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8747.4). Total num frames: 8925184. Throughput: 0: 8804.0. Samples: 8918748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:05,729][832030] Avg episode reward: [(0, '108.618')]
+[2023-07-08 04:50:05,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017432_8925184.pth...
+[2023-07-08 04:50:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000016912_8658944.pth
+[2023-07-08 04:50:06,052][832316] Updated weights for policy 0, policy_version 17440 (0.0005)
+[2023-07-08 04:50:10,715][832316] Updated weights for policy 0, policy_version 17520 (0.0005)
+[2023-07-08 04:50:10,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8761.3). Total num frames: 8970240. Throughput: 0: 8828.8. Samples: 8945620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:10,729][832030] Avg episode reward: [(0, '104.087')]
+[2023-07-08 04:50:15,232][832316] Updated weights for policy 0, policy_version 17600 (0.0005)
+[2023-07-08 04:50:15,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8874.7, 300 sec: 8761.3). Total num frames: 9015296. Throughput: 0: 8842.0. Samples: 8998920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:15,729][832030] Avg episode reward: [(0, '105.241')]
+[2023-07-08 04:50:19,726][832316] Updated weights for policy 0, policy_version 17680 (0.0005)
+[2023-07-08 04:50:20,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8874.7, 300 sec: 8761.3). Total num frames: 9060352. Throughput: 0: 8928.4. Samples: 9053036. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:20,729][832030] Avg episode reward: [(0, '117.601')]
+[2023-07-08 04:50:20,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017696_9060352.pth...
+[2023-07-08 04:50:20,736][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017168_8790016.pth
+[2023-07-08 04:50:24,307][832316] Updated weights for policy 0, policy_version 17760 (0.0005)
+[2023-07-08 04:50:25,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8775.2). Total num frames: 9105408. Throughput: 0: 8928.9. Samples: 9080460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:25,729][832030] Avg episode reward: [(0, '116.269')]
+[2023-07-08 04:50:28,878][832316] Updated weights for policy 0, policy_version 17840 (0.0005)
+[2023-07-08 04:50:30,729][832030] Fps is (10 sec: 9011.3, 60 sec: 8942.9, 300 sec: 8789.0). Total num frames: 9150464. Throughput: 0: 8935.2. Samples: 9134056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:30,729][832030] Avg episode reward: [(0, '124.831')]
+[2023-07-08 04:50:33,369][832316] Updated weights for policy 0, policy_version 17920 (0.0004)
+[2023-07-08 04:50:35,729][832030] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8789.0). Total num frames: 9195520. Throughput: 0: 8992.4. Samples: 9187896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:50:35,730][832030] Avg episode reward: [(0, '110.106')]
+[2023-07-08 04:50:35,733][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017960_9195520.pth...
+[2023-07-08 04:50:35,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017432_8925184.pth
+[2023-07-08 04:50:37,970][832316] Updated weights for policy 0, policy_version 18000 (0.0005)
+[2023-07-08 04:50:40,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8942.9, 300 sec: 8789.0). Total num frames: 9240576. Throughput: 0: 8982.9. Samples: 9215224. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:50:40,729][832030] Avg episode reward: [(0, '105.734')]
+[2023-07-08 04:50:42,454][832316] Updated weights for policy 0, policy_version 18080 (0.0005)
+[2023-07-08 04:50:45,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8802.9). Total num frames: 9285632. Throughput: 0: 8966.9. Samples: 9269312. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-08 04:50:45,729][832030] Avg episode reward: [(0, '106.982')]
+[2023-07-08 04:50:47,031][832316] Updated weights for policy 0, policy_version 18160 (0.0005)
+[2023-07-08 04:50:50,729][832030] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8802.9). Total num frames: 9330688. Throughput: 0: 8981.9. Samples: 9322932. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:50:50,729][832030] Avg episode reward: [(0, '115.463')]
+[2023-07-08 04:50:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018224_9330688.pth...
+[2023-07-08 04:50:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017696_9060352.pth
+[2023-07-08 04:50:51,658][832316] Updated weights for policy 0, policy_version 18240 (0.0005)
+[2023-07-08 04:50:55,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8789.0). Total num frames: 9371648. Throughput: 0: 8973.9. Samples: 9349444. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:50:55,729][832030] Avg episode reward: [(0, '115.718')]
+[2023-07-08 04:50:56,454][832316] Updated weights for policy 0, policy_version 18320 (0.0005)
+[2023-07-08 04:51:00,729][832030] Fps is (10 sec: 8192.1, 60 sec: 8874.7, 300 sec: 8789.0). Total num frames: 9412608. Throughput: 0: 8921.4. Samples: 9400384. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-08 04:51:00,729][832030] Avg episode reward: [(0, '112.649')]
+[2023-07-08 04:51:01,219][832316] Updated weights for policy 0, policy_version 18400 (0.0005)
+[2023-07-08 04:51:05,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8802.9). Total num frames: 9457664. Throughput: 0: 8854.8. Samples: 9451500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:05,729][832030] Avg episode reward: [(0, '116.311')]
+[2023-07-08 04:51:05,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018472_9457664.pth...
+[2023-07-08 04:51:05,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000017960_9195520.pth
+[2023-07-08 04:51:06,032][832316] Updated weights for policy 0, policy_version 18480 (0.0005)
+[2023-07-08 04:51:10,729][832030] Fps is (10 sec: 8601.5, 60 sec: 8806.4, 300 sec: 8789.0). Total num frames: 9498624. Throughput: 0: 8833.4. Samples: 9477964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:10,729][832030] Avg episode reward: [(0, '113.915')]
+[2023-07-08 04:51:10,765][832316] Updated weights for policy 0, policy_version 18560 (0.0005)
+[2023-07-08 04:51:15,568][832316] Updated weights for policy 0, policy_version 18640 (0.0005)
+[2023-07-08 04:51:15,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8789.0). Total num frames: 9543680. Throughput: 0: 8786.9. Samples: 9529468. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:15,729][832030] Avg episode reward: [(0, '117.030')]
+[2023-07-08 04:51:20,037][832316] Updated weights for policy 0, policy_version 18720 (0.0005)
+[2023-07-08 04:51:20,729][832030] Fps is (10 sec: 9011.1, 60 sec: 8806.4, 300 sec: 8802.9). Total num frames: 9588736. Throughput: 0: 8785.1. Samples: 9583228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:20,729][832030] Avg episode reward: [(0, '109.413')]
+[2023-07-08 04:51:20,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018728_9588736.pth...
+[2023-07-08 04:51:20,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018224_9330688.pth
+[2023-07-08 04:51:24,851][832316] Updated weights for policy 0, policy_version 18800 (0.0005)
+[2023-07-08 04:51:25,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8738.1, 300 sec: 8802.9). Total num frames: 9629696. Throughput: 0: 8742.6. Samples: 9608640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:25,729][832030] Avg episode reward: [(0, '113.432')]
+[2023-07-08 04:51:29,487][832316] Updated weights for policy 0, policy_version 18880 (0.0005)
+[2023-07-08 04:51:30,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8802.9). Total num frames: 9674752. Throughput: 0: 8712.8. Samples: 9661388. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:30,729][832030] Avg episode reward: [(0, '110.503')]
+[2023-07-08 04:51:34,324][832316] Updated weights for policy 0, policy_version 18960 (0.0006)
+[2023-07-08 04:51:35,729][832030] Fps is (10 sec: 8601.4, 60 sec: 8669.8, 300 sec: 8802.9). Total num frames: 9715712. Throughput: 0: 8639.0. Samples: 9711688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:35,729][832030] Avg episode reward: [(0, '104.446')]
+[2023-07-08 04:51:35,776][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018984_9719808.pth...
+[2023-07-08 04:51:35,779][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018472_9457664.pth
+[2023-07-08 04:51:39,069][832316] Updated weights for policy 0, policy_version 19040 (0.0005)
+[2023-07-08 04:51:40,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8789.0). Total num frames: 9760768. Throughput: 0: 8632.5. Samples: 9737908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:40,729][832030] Avg episode reward: [(0, '109.179')]
+[2023-07-08 04:51:43,903][832316] Updated weights for policy 0, policy_version 19120 (0.0005)
+[2023-07-08 04:51:45,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8601.6, 300 sec: 8789.0). Total num frames: 9801728. Throughput: 0: 8645.7. Samples: 9789440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:45,729][832030] Avg episode reward: [(0, '116.075')]
+[2023-07-08 04:51:48,837][832316] Updated weights for policy 0, policy_version 19200 (0.0005)
+[2023-07-08 04:51:50,729][832030] Fps is (10 sec: 8191.9, 60 sec: 8533.3, 300 sec: 8775.2). Total num frames: 9842688. Throughput: 0: 8606.2. Samples: 9838780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:50,729][832030] Avg episode reward: [(0, '104.232')]
+[2023-07-08 04:51:50,732][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000019224_9842688.pth...
+[2023-07-08 04:51:50,735][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018728_9588736.pth
+[2023-07-08 04:51:53,737][832316] Updated weights for policy 0, policy_version 19280 (0.0005)
+[2023-07-08 04:51:55,729][832030] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8775.2). Total num frames: 9887744. Throughput: 0: 8571.8. Samples: 9863696. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:51:55,729][832030] Avg episode reward: [(0, '104.717')]
+[2023-07-08 04:51:58,543][832316] Updated weights for policy 0, policy_version 19360 (0.0005)
+[2023-07-08 04:52:00,729][832030] Fps is (10 sec: 8601.7, 60 sec: 8601.6, 300 sec: 8761.3). Total num frames: 9928704. Throughput: 0: 8574.6. Samples: 9915324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:52:00,729][832030] Avg episode reward: [(0, '114.003')]
+[2023-07-08 04:52:03,324][832316] Updated weights for policy 0, policy_version 19440 (0.0005)
+[2023-07-08 04:52:05,729][832030] Fps is (10 sec: 8191.9, 60 sec: 8533.3, 300 sec: 8747.4). Total num frames: 9969664. Throughput: 0: 8503.3. Samples: 9965876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-08 04:52:05,729][832030] Avg episode reward: [(0, '118.880')]
+[2023-07-08 04:52:05,745][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000019480_9973760.pth...
+[2023-07-08 04:52:05,748][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000018984_9719808.pth
+[2023-07-08 04:52:08,128][832316] Updated weights for policy 0, policy_version 19520 (0.0005)
+[2023-07-08 04:52:09,607][832272] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
+[2023-07-08 04:52:09,608][832338] Stopping RolloutWorker_w5...
+[2023-07-08 04:52:09,608][832319] Stopping RolloutWorker_w3...
+[2023-07-08 04:52:09,608][832321] Stopping RolloutWorker_w4...
+[2023-07-08 04:52:09,608][832318] Stopping RolloutWorker_w1...
+[2023-07-08 04:52:09,608][832385] Stopping RolloutWorker_w6...
+[2023-07-08 04:52:09,609][832338] Loop rollout_proc5_evt_loop terminating...
+[2023-07-08 04:52:09,608][832417] Stopping RolloutWorker_w7...
+[2023-07-08 04:52:09,608][832317] Stopping RolloutWorker_w0...
+[2023-07-08 04:52:09,609][832319] Loop rollout_proc3_evt_loop terminating...
+[2023-07-08 04:52:09,608][832320] Stopping RolloutWorker_w2...
+[2023-07-08 04:52:09,609][832321] Loop rollout_proc4_evt_loop terminating...
+[2023-07-08 04:52:09,609][832318] Loop rollout_proc1_evt_loop terminating...
+[2023-07-08 04:52:09,609][832385] Loop rollout_proc6_evt_loop terminating...
+[2023-07-08 04:52:09,609][832417] Loop rollout_proc7_evt_loop terminating...
+[2023-07-08 04:52:09,609][832317] Loop rollout_proc0_evt_loop terminating...
+[2023-07-08 04:52:09,609][832320] Loop rollout_proc2_evt_loop terminating...
+[2023-07-08 04:52:09,608][832030] Component RolloutWorker_w5 stopped!
+[2023-07-08 04:52:09,609][832030] Component RolloutWorker_w3 stopped!
+[2023-07-08 04:52:09,609][832272] Stopping Batcher_0...
+[2023-07-08 04:52:09,609][832030] Component RolloutWorker_w4 stopped!
+[2023-07-08 04:52:09,609][832272] Loop batcher_evt_loop terminating...
+[2023-07-08 04:52:09,609][832030] Component RolloutWorker_w1 stopped!
+[2023-07-08 04:52:09,610][832030] Component RolloutWorker_w6 stopped!
+[2023-07-08 04:52:09,610][832030] Component RolloutWorker_w2 stopped!
+[2023-07-08 04:52:09,610][832030] Component RolloutWorker_w7 stopped!
+[2023-07-08 04:52:09,610][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2023-07-08 04:52:09,610][832030] Component RolloutWorker_w0 stopped!
+[2023-07-08 04:52:09,610][832030] Component Batcher_0 stopped!
+[2023-07-08 04:52:09,613][832272] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000019224_9842688.pth
+[2023-07-08 04:52:09,613][832272] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/handle-pull-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2023-07-08 04:52:09,616][832272] Stopping LearnerWorker_p0...
+[2023-07-08 04:52:09,616][832272] Loop learner_proc0_evt_loop terminating...
+[2023-07-08 04:52:09,616][832030] Component LearnerWorker_p0 stopped!
+[2023-07-08 04:52:09,681][832316] Weights refcount: 2 0
+[2023-07-08 04:52:09,682][832316] Stopping InferenceWorker_p0-w0...
+[2023-07-08 04:52:09,682][832316] Loop inference_proc0-0_evt_loop terminating...
+[2023-07-08 04:52:09,682][832030] Component InferenceWorker_p0-w0 stopped!
+[2023-07-08 04:52:09,683][832030] Waiting for process learner_proc0 to stop...
+[2023-07-08 04:52:10,210][832030] Waiting for process inference_proc0-0 to join...
+[2023-07-08 04:52:10,228][832030] Waiting for process rollout_proc0 to join...
+[2023-07-08 04:52:10,228][832030] Waiting for process rollout_proc1 to join...
+[2023-07-08 04:52:10,229][832030] Waiting for process rollout_proc2 to join...
+[2023-07-08 04:52:10,229][832030] Waiting for process rollout_proc3 to join...
+[2023-07-08 04:52:10,229][832030] Waiting for process rollout_proc4 to join...
+[2023-07-08 04:52:10,229][832030] Waiting for process rollout_proc5 to join...
+[2023-07-08 04:52:10,229][832030] Waiting for process rollout_proc6 to join...
+[2023-07-08 04:52:10,229][832030] Waiting for process rollout_proc7 to join...
+[2023-07-08 04:52:10,230][832030] Batcher 0 profile tree view:
+batching: 1.8479, releasing_batches: 1.5639
+[2023-07-08 04:52:10,230][832030] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0051
+  wait_policy_total: 429.9317
+update_model: 13.0468
+  weight_update: 0.0005
+one_step: 0.0006
+  handle_policy_step: 580.0260
+    deserialize: 24.1324, stack: 6.2415, obs_to_device_normalize: 105.6743, forward: 288.4035, send_messages: 39.6439
+    prepare_outputs: 66.1291
+      to_cpu: 10.2717
+[2023-07-08 04:52:10,230][832030] Learner 0 profile tree view:
+misc: 0.0107, prepare_batch: 9.9828
+train: 104.5216
+  epoch_init: 0.0376, minibatch_init: 1.4644, losses_postprocess: 1.3839, kl_divergence: 0.4834, after_optimizer: 0.6753
+  calculate_losses: 44.6060
+    losses_init: 0.0334, forward_head: 17.5054, bptt_initial: 0.1490, bptt: 0.1345, tail: 12.5785, advantages_returns: 0.9544, losses: 11.7150
+  update: 54.1596
+    clip: 6.3812
+[2023-07-08 04:52:10,230][832030] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.3226, enqueue_policy_requests: 12.6094, env_step: 756.7704, overhead: 19.7635, complete_rollouts: 0.3296
+save_policy_outputs: 38.5413
+  split_output_tensors: 13.1427
+[2023-07-08 04:52:10,230][832030] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.3011, enqueue_policy_requests: 12.9601, env_step: 761.7134, overhead: 19.8592, complete_rollouts: 0.3236
+save_policy_outputs: 39.2278
+  split_output_tensors: 13.5911
+[2023-07-08 04:52:10,231][832030] Loop Runner_EvtLoop terminating...
+[2023-07-08 04:52:10,231][832030] Runner profile tree view:
+main_loop: 1096.8025
+[2023-07-08 04:52:10,231][832030] Collected {0: 10006528}, FPS: 9123.4