diff --git "a/sf_log.txt" "b/sf_log.txt"
--- "a/sf_log.txt"
+++ "b/sf_log.txt"
@@ -1,33 +1,32 @@
-[2023-07-08 15:48:11,161][1003682] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/config.json...
-[2023-07-08 15:48:11,185][1003682] Rollout worker 0 uses device cpu
-[2023-07-08 15:48:11,186][1003682] Rollout worker 1 uses device cpu
-[2023-07-08 15:48:11,186][1003682] Rollout worker 2 uses device cpu
-[2023-07-08 15:48:11,186][1003682] Rollout worker 3 uses device cpu
-[2023-07-08 15:48:11,186][1003682] Rollout worker 4 uses device cpu
-[2023-07-08 15:48:11,186][1003682] Rollout worker 5 uses device cpu
-[2023-07-08 15:48:11,186][1003682] Rollout worker 6 uses device cpu
-[2023-07-08 15:48:11,187][1003682] Rollout worker 7 uses device cpu
-[2023-07-08 15:48:11,187][1003682] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
-[2023-07-08 15:48:11,203][1003682] InferenceWorker_p0-w0: min num requests: 2
-[2023-07-08 15:48:11,223][1003682] Starting all processes...
-[2023-07-08 15:48:11,224][1003682] Starting process learner_proc0
-[2023-07-08 15:48:11,273][1003682] Starting all processes...
-[2023-07-08 15:48:11,319][1003682] Starting process inference_proc0-0
-[2023-07-08 15:48:11,319][1003682] Starting process rollout_proc2
-[2023-07-08 15:48:11,319][1003682] Starting process rollout_proc1
-[2023-07-08 15:48:11,319][1003682] Starting process rollout_proc0
-[2023-07-08 15:48:11,321][1003682] Starting process rollout_proc3
-[2023-07-08 15:48:11,321][1003682] Starting process rollout_proc4
-[2023-07-08 15:48:11,334][1003682] Starting process rollout_proc5
-[2023-07-08 15:48:11,335][1003682] Starting process rollout_proc6
-[2023-07-08 15:48:11,335][1003682] Starting process rollout_proc7
-[2023-07-08 15:48:13,357][1003924] Starting seed is not provided
-[2023-07-08 15:48:13,358][1003924] Initializing actor-critic model on device cpu
-[2023-07-08 15:48:13,358][1003924] RunningMeanStd input shape: (39,)
-[2023-07-08 15:48:13,358][1003924] RunningMeanStd input shape: (1,)
-[2023-07-08 15:48:13,413][1003972] Worker 4 uses CPU cores [16, 17, 18, 19]
-[2023-07-08 15:48:13,417][1003924] Created Actor Critic model with architecture:
-[2023-07-08 15:48:13,417][1003924] ActorCriticSharedWeights(
+[2023-07-16 21:27:35,904][239306] Saving configuration to /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/config.json...
+[2023-07-16 21:27:35,920][239306] Rollout worker 0 uses device cpu
+[2023-07-16 21:27:35,920][239306] Rollout worker 1 uses device cpu
+[2023-07-16 21:27:35,920][239306] Rollout worker 2 uses device cpu
+[2023-07-16 21:27:35,920][239306] Rollout worker 3 uses device cpu
+[2023-07-16 21:27:35,921][239306] Rollout worker 4 uses device cpu
+[2023-07-16 21:27:35,921][239306] Rollout worker 5 uses device cpu
+[2023-07-16 21:27:35,921][239306] Rollout worker 6 uses device cpu
+[2023-07-16 21:27:35,921][239306] Rollout worker 7 uses device cpu
+[2023-07-16 21:27:35,921][239306] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
+[2023-07-16 21:27:35,932][239306] InferenceWorker_p0-w0: min num requests: 2
+[2023-07-16 21:27:35,949][239306] Starting all processes...
+[2023-07-16 21:27:35,949][239306] Starting process learner_proc0
+[2023-07-16 21:27:35,998][239306] Starting all processes...
+[2023-07-16 21:27:36,039][239306] Starting process inference_proc0-0
+[2023-07-16 21:27:36,048][239306] Starting process rollout_proc0
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc1
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc2
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc3
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc4
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc5
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc6
+[2023-07-16 21:27:36,049][239306] Starting process rollout_proc7
+[2023-07-16 21:27:37,865][239551] Starting seed is not provided
+[2023-07-16 21:27:37,865][239551] Initializing actor-critic model on device cpu
+[2023-07-16 21:27:37,866][239551] RunningMeanStd input shape: (39,)
+[2023-07-16 21:27:37,866][239551] RunningMeanStd input shape: (1,)
+[2023-07-16 21:27:37,923][239551] Created Actor Critic model with architecture:
+[2023-07-16 21:27:37,923][239551] ActorCriticSharedWeights(
   (obs_normalizer): ObservationNormalizer(
     (running_mean_std): RunningMeanStdDictInPlace(
       (running_mean_std): ModuleDict(
@@ -58,984 +57,829 @@
     (distribution_linear): Linear(in_features=64, out_features=4, bias=True)
   )
 )
-[2023-07-08 15:48:13,568][1003971] Worker 0 uses CPU cores [0, 1, 2, 3]
-[2023-07-08 15:48:13,713][1003973] Worker 3 uses CPU cores [12, 13, 14, 15]
-[2023-07-08 15:48:13,728][1003924] Using optimizer <class 'torch.optim.adam.Adam'>
-[2023-07-08 15:48:13,728][1003924] No checkpoints found
-[2023-07-08 15:48:13,728][1003924] Did not load from checkpoint, starting from scratch!
-[2023-07-08 15:48:13,729][1003924] Initialized policy 0 weights for model version 0
-[2023-07-08 15:48:13,730][1003924] LearnerWorker_p0 finished initialization!
-[2023-07-08 15:48:13,757][1004100] Worker 7 uses CPU cores [28, 29, 30, 31]
-[2023-07-08 15:48:13,838][1004068] Worker 6 uses CPU cores [24, 25, 26, 27]
-[2023-07-08 15:48:13,848][1003969] Worker 2 uses CPU cores [8, 9, 10, 11]
-[2023-07-08 15:48:14,066][1003968] RunningMeanStd input shape: (39,)
-[2023-07-08 15:48:14,067][1003968] RunningMeanStd input shape: (1,)
-[2023-07-08 15:48:14,088][1004058] Worker 5 uses CPU cores [20, 21, 22, 23]
-[2023-07-08 15:48:14,124][1003682] Inference worker 0-0 is ready!
-[2023-07-08 15:48:14,125][1003682] All inference workers are ready! Signal rollout workers to start!
-[2023-07-08 15:48:14,319][1003970] Worker 1 uses CPU cores [4, 5, 6, 7]
-[2023-07-08 15:48:17,610][1004100] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,621][1004100] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,637][1003969] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,641][1003973] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,644][1004068] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,648][1003969] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,652][1003973] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,653][1004100] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,655][1004068] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,661][1004058] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,672][1004058] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,676][1003971] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,679][1003969] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,684][1003973] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,687][1003971] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,687][1004068] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,704][1004058] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,716][1004100] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,719][1003971] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,742][1003969] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,747][1003973] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,750][1004068] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,767][1004058] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,782][1003971] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,850][1003972] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,861][1003972] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,887][1003970] Decorrelating experience for 0 frames...
-[2023-07-08 15:48:17,892][1003972] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,898][1003970] Decorrelating experience for 64 frames...
-[2023-07-08 15:48:17,929][1003970] Decorrelating experience for 128 frames...
-[2023-07-08 15:48:17,955][1003972] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:17,994][1003970] Decorrelating experience for 192 frames...
-[2023-07-08 15:48:18,454][1003682] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
-[2023-07-08 15:48:21,155][1004100] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,157][1003969] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,168][1003973] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,181][1004058] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,202][1004068] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,233][1003971] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,270][1004100] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,271][1003969] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,283][1003973] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,295][1004058] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,318][1004068] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,348][1003971] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,387][1003972] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,415][1004100] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,416][1003969] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,419][1003970] Decorrelating experience for 256 frames...
-[2023-07-08 15:48:21,429][1003973] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,441][1004058] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,465][1004068] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,492][1003971] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,501][1003972] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,534][1003970] Decorrelating experience for 320 frames...
-[2023-07-08 15:48:21,581][1004100] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,581][1003969] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,595][1003973] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,606][1004058] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,632][1004068] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,646][1003972] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,656][1003971] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,679][1003970] Decorrelating experience for 384 frames...
-[2023-07-08 15:48:21,809][1003972] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:21,846][1003970] Decorrelating experience for 448 frames...
-[2023-07-08 15:48:23,454][1003682] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 1638.4. Samples: 8192. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:48:23,454][1003682] Avg episode reward: [(0, '19.282')]
-[2023-07-08 15:48:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000016_8192.pth...
-[2023-07-08 15:48:26,294][1003968] Updated weights for policy 0, policy_version 80 (0.0005)
-[2023-07-08 15:48:28,454][1003682] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6144.0). Total num frames: 61440. Throughput: 0: 3994.0. Samples: 39940. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:48:28,455][1003682] Avg episode reward: [(0, '174.261')]
-[2023-07-08 15:48:30,002][1003968] Updated weights for policy 0, policy_version 160 (0.0005)
-[2023-07-08 15:48:31,197][1003682] Heartbeat connected on Batcher_0
-[2023-07-08 15:48:31,201][1003682] Heartbeat connected on LearnerWorker_p0
-[2023-07-08 15:48:31,205][1003682] Heartbeat connected on InferenceWorker_p0-w0
-[2023-07-08 15:48:31,210][1003682] Heartbeat connected on RolloutWorker_w0
-[2023-07-08 15:48:31,212][1003682] Heartbeat connected on RolloutWorker_w1
-[2023-07-08 15:48:31,213][1003682] Heartbeat connected on RolloutWorker_w2
-[2023-07-08 15:48:31,215][1003682] Heartbeat connected on RolloutWorker_w3
-[2023-07-08 15:48:31,220][1003682] Heartbeat connected on RolloutWorker_w4
-[2023-07-08 15:48:31,221][1003682] Heartbeat connected on RolloutWorker_w5
-[2023-07-08 15:48:31,222][1003682] Heartbeat connected on RolloutWorker_w6
-[2023-07-08 15:48:31,224][1003682] Heartbeat connected on RolloutWorker_w7
-[2023-07-08 15:48:33,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 114688. Throughput: 0: 6956.0. Samples: 104340. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:48:33,454][1003682] Avg episode reward: [(0, '228.259')]
-[2023-07-08 15:48:33,455][1003924] Saving new best policy, reward=228.259!
-[2023-07-08 15:48:34,035][1003968] Updated weights for policy 0, policy_version 240 (0.0005)
-[2023-07-08 15:48:38,097][1003968] Updated weights for policy 0, policy_version 320 (0.0005)
-[2023-07-08 15:48:38,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 163840. Throughput: 0: 8212.6. Samples: 164252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:48:38,454][1003682] Avg episode reward: [(0, '315.016')]
-[2023-07-08 15:48:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000320_163840.pth...
-[2023-07-08 15:48:38,461][1003924] Saving new best policy, reward=315.016!
-[2023-07-08 15:48:41,916][1003968] Updated weights for policy 0, policy_version 400 (0.0006)
-[2023-07-08 15:48:43,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 8683.5, 300 sec: 8683.5). Total num frames: 217088. Throughput: 0: 7884.5. Samples: 197112. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
-[2023-07-08 15:48:43,454][1003682] Avg episode reward: [(0, '337.697')]
-[2023-07-08 15:48:43,454][1003924] Saving new best policy, reward=337.697!
-[2023-07-08 15:48:46,230][1003968] Updated weights for policy 0, policy_version 480 (0.0005)
-[2023-07-08 15:48:48,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 8738.1, 300 sec: 8738.1). Total num frames: 262144. Throughput: 0: 8486.7. Samples: 254600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:48:48,454][1003682] Avg episode reward: [(0, '324.551')]
-[2023-07-08 15:48:50,578][1003968] Updated weights for policy 0, policy_version 560 (0.0005)
-[2023-07-08 15:48:53,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 315392. Throughput: 0: 9013.0. Samples: 315456. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:48:53,454][1003682] Avg episode reward: [(0, '343.806')]
-[2023-07-08 15:48:53,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000616_315392.pth...
-[2023-07-08 15:48:53,458][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000016_8192.pth
-[2023-07-08 15:48:53,458][1003924] Saving new best policy, reward=343.806!
-[2023-07-08 15:48:54,496][1003968] Updated weights for policy 0, policy_version 640 (0.0005)
-[2023-07-08 15:48:58,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 9113.6, 300 sec: 9113.6). Total num frames: 364544. Throughput: 0: 8603.2. Samples: 344128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:48:58,454][1003682] Avg episode reward: [(0, '401.115')]
-[2023-07-08 15:48:58,455][1003924] Saving new best policy, reward=401.115!
-[2023-07-08 15:48:58,731][1003968] Updated weights for policy 0, policy_version 720 (0.0006)
-[2023-07-08 15:49:02,849][1003968] Updated weights for policy 0, policy_version 800 (0.0005)
-[2023-07-08 15:49:03,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 9193.3, 300 sec: 9193.3). Total num frames: 413696. Throughput: 0: 8941.1. Samples: 402348. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:03,454][1003682] Avg episode reward: [(0, '403.075')]
-[2023-07-08 15:49:03,454][1003924] Saving new best policy, reward=403.075!
-[2023-07-08 15:49:06,715][1003968] Updated weights for policy 0, policy_version 880 (0.0005)
-[2023-07-08 15:49:08,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 9338.9, 300 sec: 9338.9). Total num frames: 466944. Throughput: 0: 10196.8. Samples: 467048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:08,454][1003682] Avg episode reward: [(0, '436.411')]
-[2023-07-08 15:49:08,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000912_466944.pth...
-[2023-07-08 15:49:08,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000320_163840.pth
-[2023-07-08 15:49:08,459][1003924] Saving new best policy, reward=436.411!
-[2023-07-08 15:49:10,746][1003968] Updated weights for policy 0, policy_version 960 (0.0005)
-[2023-07-08 15:49:13,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 9383.6, 300 sec: 9383.6). Total num frames: 516096. Throughput: 0: 10125.6. Samples: 495592. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:13,454][1003682] Avg episode reward: [(0, '421.102')]
-[2023-07-08 15:49:14,903][1003968] Updated weights for policy 0, policy_version 1040 (0.0006)
-[2023-07-08 15:49:18,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 9420.8, 300 sec: 9420.8). Total num frames: 565248. Throughput: 0: 10009.9. Samples: 554788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:18,454][1003682] Avg episode reward: [(0, '447.197')]
-[2023-07-08 15:49:18,455][1003924] Saving new best policy, reward=447.197!
-[2023-07-08 15:49:19,285][1003968] Updated weights for policy 0, policy_version 1120 (0.0005)
-[2023-07-08 15:49:23,410][1003968] Updated weights for policy 0, policy_version 1200 (0.0005)
-[2023-07-08 15:49:23,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10103.5, 300 sec: 9452.3). Total num frames: 614400. Throughput: 0: 9925.8. Samples: 610912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:23,454][1003682] Avg episode reward: [(0, '453.080')]
-[2023-07-08 15:49:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001200_614400.pth...
-[2023-07-08 15:49:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000616_315392.pth
-[2023-07-08 15:49:23,460][1003924] Saving new best policy, reward=453.080!
-[2023-07-08 15:49:27,652][1003968] Updated weights for policy 0, policy_version 1280 (0.0005)
-[2023-07-08 15:49:28,453][1003682] Fps is (10 sec: 9420.9, 60 sec: 9967.0, 300 sec: 9420.8). Total num frames: 659456. Throughput: 0: 9844.9. Samples: 640132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:28,454][1003682] Avg episode reward: [(0, '477.540')]
-[2023-07-08 15:49:28,459][1003924] Saving new best policy, reward=477.540!
-[2023-07-08 15:49:31,957][1003968] Updated weights for policy 0, policy_version 1360 (0.0006)
-[2023-07-08 15:49:33,454][1003682] Fps is (10 sec: 9420.7, 60 sec: 9898.7, 300 sec: 9448.1). Total num frames: 708608. Throughput: 0: 9867.3. Samples: 698628. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 15:49:33,454][1003682] Avg episode reward: [(0, '481.925')]
-[2023-07-08 15:49:33,455][1003924] Saving new best policy, reward=481.925!
-[2023-07-08 15:49:36,360][1003968] Updated weights for policy 0, policy_version 1440 (0.0005)
-[2023-07-08 15:49:38,454][1003682] Fps is (10 sec: 9420.7, 60 sec: 9830.4, 300 sec: 9420.8). Total num frames: 753664. Throughput: 0: 9733.5. Samples: 753464. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:49:38,454][1003682] Avg episode reward: [(0, '499.526')]
-[2023-07-08 15:49:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001472_753664.pth...
-[2023-07-08 15:49:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000912_466944.pth
-[2023-07-08 15:49:38,460][1003924] Saving new best policy, reward=499.526!
-[2023-07-08 15:49:40,876][1003968] Updated weights for policy 0, policy_version 1520 (0.0005)
-[2023-07-08 15:49:43,453][1003682] Fps is (10 sec: 9011.2, 60 sec: 9693.9, 300 sec: 9396.7). Total num frames: 798720. Throughput: 0: 9711.9. Samples: 781164. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:49:43,454][1003682] Avg episode reward: [(0, '516.151')]
-[2023-07-08 15:49:43,454][1003924] Saving new best policy, reward=516.151!
-[2023-07-08 15:49:45,306][1003968] Updated weights for policy 0, policy_version 1600 (0.0005)
-[2023-07-08 15:49:48,454][1003682] Fps is (10 sec: 9420.8, 60 sec: 9762.1, 300 sec: 9420.8). Total num frames: 847872. Throughput: 0: 9660.3. Samples: 837064. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:49:48,454][1003682] Avg episode reward: [(0, '531.235')]
-[2023-07-08 15:49:48,454][1003924] Saving new best policy, reward=531.235!
-[2023-07-08 15:49:49,654][1003968] Updated weights for policy 0, policy_version 1680 (0.0005)
-[2023-07-08 15:49:53,453][1003682] Fps is (10 sec: 9420.8, 60 sec: 9625.6, 300 sec: 9399.2). Total num frames: 892928. Throughput: 0: 9461.8. Samples: 892828. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:49:53,454][1003682] Avg episode reward: [(0, '509.734')]
-[2023-07-08 15:49:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001744_892928.pth...
-[2023-07-08 15:49:53,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001200_614400.pth
-[2023-07-08 15:49:54,204][1003968] Updated weights for policy 0, policy_version 1760 (0.0005)
-[2023-07-08 15:49:58,454][1003682] Fps is (10 sec: 9011.2, 60 sec: 9557.3, 300 sec: 9379.8). Total num frames: 937984. Throughput: 0: 9389.9. Samples: 918140. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:49:58,454][1003682] Avg episode reward: [(0, '521.356')]
-[2023-07-08 15:49:58,852][1003968] Updated weights for policy 0, policy_version 1840 (0.0005)
-[2023-07-08 15:50:03,194][1003968] Updated weights for policy 0, policy_version 1920 (0.0005)
-[2023-07-08 15:50:03,453][1003682] Fps is (10 sec: 9011.3, 60 sec: 9489.1, 300 sec: 9362.3). Total num frames: 983040. Throughput: 0: 9323.0. Samples: 974324. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:50:03,454][1003682] Avg episode reward: [(0, '526.376')]
-[2023-07-08 15:50:07,595][1003968] Updated weights for policy 0, policy_version 2000 (0.0005)
-[2023-07-08 15:50:08,454][1003682] Fps is (10 sec: 9420.8, 60 sec: 9420.8, 300 sec: 9383.6). Total num frames: 1032192. Throughput: 0: 9295.8. Samples: 1029224. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:50:08,454][1003682] Avg episode reward: [(0, '554.387')]
-[2023-07-08 15:50:08,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002016_1032192.pth...
-[2023-07-08 15:50:08,462][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001472_753664.pth
-[2023-07-08 15:50:08,462][1003924] Saving new best policy, reward=554.387!
-[2023-07-08 15:50:11,991][1003968] Updated weights for policy 0, policy_version 2080 (0.0005)
-[2023-07-08 15:50:13,453][1003682] Fps is (10 sec: 9420.8, 60 sec: 9352.5, 300 sec: 9367.4). Total num frames: 1077248. Throughput: 0: 9266.6. Samples: 1057128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:13,454][1003682] Avg episode reward: [(0, '539.979')]
-[2023-07-08 15:50:16,474][1003968] Updated weights for policy 0, policy_version 2160 (0.0005)
-[2023-07-08 15:50:18,453][1003682] Fps is (10 sec: 9011.3, 60 sec: 9284.3, 300 sec: 9352.5). Total num frames: 1122304. Throughput: 0: 9208.5. Samples: 1113008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:18,454][1003682] Avg episode reward: [(0, '570.165')]
-[2023-07-08 15:50:18,454][1003924] Saving new best policy, reward=570.165!
-[2023-07-08 15:50:20,772][1003968] Updated weights for policy 0, policy_version 2240 (0.0005)
-[2023-07-08 15:50:23,453][1003682] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9371.6). Total num frames: 1171456. Throughput: 0: 9238.6. Samples: 1169200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:23,454][1003682] Avg episode reward: [(0, '556.171')]
-[2023-07-08 15:50:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002288_1171456.pth...
-[2023-07-08 15:50:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001744_892928.pth
-[2023-07-08 15:50:25,035][1003968] Updated weights for policy 0, policy_version 2320 (0.0005)
-[2023-07-08 15:50:28,453][1003682] Fps is (10 sec: 9830.3, 60 sec: 9352.5, 300 sec: 9389.3). Total num frames: 1220608. Throughput: 0: 9293.4. Samples: 1199368. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:28,455][1003682] Avg episode reward: [(0, '549.600')]
-[2023-07-08 15:50:29,225][1003968] Updated weights for policy 0, policy_version 2400 (0.0005)
-[2023-07-08 15:50:33,267][1003968] Updated weights for policy 0, policy_version 2480 (0.0005)
-[2023-07-08 15:50:33,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 9352.5, 300 sec: 9405.6). Total num frames: 1269760. Throughput: 0: 9342.4. Samples: 1257472. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:50:33,454][1003682] Avg episode reward: [(0, '565.555')]
-[2023-07-08 15:50:37,266][1003968] Updated weights for policy 0, policy_version 2560 (0.0005)
-[2023-07-08 15:50:38,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 9489.1, 300 sec: 9450.1). Total num frames: 1323008. Throughput: 0: 9485.8. Samples: 1319688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:38,455][1003682] Avg episode reward: [(0, '570.306')]
-[2023-07-08 15:50:38,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002584_1323008.pth...
-[2023-07-08 15:50:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002016_1032192.pth
-[2023-07-08 15:50:38,461][1003924] Saving new best policy, reward=570.306!
-[2023-07-08 15:50:41,228][1003968] Updated weights for policy 0, policy_version 2640 (0.0005)
-[2023-07-08 15:50:43,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 9557.3, 300 sec: 9463.2). Total num frames: 1372160. Throughput: 0: 9632.5. Samples: 1351600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:43,455][1003682] Avg episode reward: [(0, '603.870')]
-[2023-07-08 15:50:43,455][1003924] Saving new best policy, reward=603.870!
-[2023-07-08 15:50:45,074][1003968] Updated weights for policy 0, policy_version 2720 (0.0006)
-[2023-07-08 15:50:48,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 9693.9, 300 sec: 9530.0). Total num frames: 1429504. Throughput: 0: 9830.3. Samples: 1416688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:48,455][1003682] Avg episode reward: [(0, '612.883')]
-[2023-07-08 15:50:48,455][1003924] Saving new best policy, reward=612.883!
-[2023-07-08 15:50:48,651][1003968] Updated weights for policy 0, policy_version 2800 (0.0006)
-[2023-07-08 15:50:52,267][1003968] Updated weights for policy 0, policy_version 2880 (0.0005)
-[2023-07-08 15:50:53,453][1003682] Fps is (10 sec: 11059.2, 60 sec: 9830.4, 300 sec: 9566.1). Total num frames: 1482752. Throughput: 0: 10080.0. Samples: 1482824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:50:53,455][1003682] Avg episode reward: [(0, '626.216')]
-[2023-07-08 15:50:53,485][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002904_1486848.pth...
-[2023-07-08 15:50:53,487][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002288_1171456.pth
-[2023-07-08 15:50:53,487][1003924] Saving new best policy, reward=626.216!
-[2023-07-08 15:50:56,413][1003968] Updated weights for policy 0, policy_version 2960 (0.0006)
-[2023-07-08 15:50:58,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 9966.9, 300 sec: 9600.0). Total num frames: 1536000. Throughput: 0: 10134.7. Samples: 1513188. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
-[2023-07-08 15:50:58,454][1003682] Avg episode reward: [(0, '643.825')]
-[2023-07-08 15:50:58,455][1003924] Saving new best policy, reward=643.825!
-[2023-07-08 15:51:00,280][1003968] Updated weights for policy 0, policy_version 3040 (0.0005)
-[2023-07-08 15:51:03,453][1003682] Fps is (10 sec: 11059.2, 60 sec: 10171.7, 300 sec: 9656.6). Total num frames: 1593344. Throughput: 0: 10377.5. Samples: 1579996. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:51:03,454][1003682] Avg episode reward: [(0, '652.051')]
-[2023-07-08 15:51:03,454][1003924] Saving new best policy, reward=652.051!
-[2023-07-08 15:51:03,678][1003968] Updated weights for policy 0, policy_version 3120 (0.0005)
-[2023-07-08 15:51:07,770][1003968] Updated weights for policy 0, policy_version 3200 (0.0006)
-[2023-07-08 15:51:08,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10171.7, 300 sec: 9661.7). Total num frames: 1642496. Throughput: 0: 10512.8. Samples: 1642276. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:51:08,454][1003682] Avg episode reward: [(0, '657.892')]
-[2023-07-08 15:51:08,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003208_1642496.pth...
-[2023-07-08 15:51:08,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002584_1323008.pth
-[2023-07-08 15:51:08,461][1003924] Saving new best policy, reward=657.892!
-[2023-07-08 15:51:11,968][1003968] Updated weights for policy 0, policy_version 3280 (0.0006)
-[2023-07-08 15:51:13,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10240.0, 300 sec: 9666.6). Total num frames: 1691648. Throughput: 0: 10485.9. Samples: 1671232. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:51:13,454][1003682] Avg episode reward: [(0, '676.497')]
-[2023-07-08 15:51:13,454][1003924] Saving new best policy, reward=676.497!
-[2023-07-08 15:51:16,042][1003968] Updated weights for policy 0, policy_version 3360 (0.0006)
-[2023-07-08 15:51:18,453][1003682] Fps is (10 sec: 9830.6, 60 sec: 10308.3, 300 sec: 9671.1). Total num frames: 1740800. Throughput: 0: 10546.2. Samples: 1732052. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 15:51:18,454][1003682] Avg episode reward: [(0, '698.499')]
-[2023-07-08 15:51:18,454][1003924] Saving new best policy, reward=698.499!
-[2023-07-08 15:51:20,197][1003968] Updated weights for policy 0, policy_version 3440 (0.0005)
-[2023-07-08 15:51:23,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10376.5, 300 sec: 9697.6). Total num frames: 1794048. Throughput: 0: 10543.6. Samples: 1794148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:51:23,454][1003682] Avg episode reward: [(0, '669.096')]
-[2023-07-08 15:51:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003504_1794048.pth...
-[2023-07-08 15:51:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002904_1486848.pth
-[2023-07-08 15:51:23,936][1003968] Updated weights for policy 0, policy_version 3520 (0.0005)
-[2023-07-08 15:51:28,001][1003968] Updated weights for policy 0, policy_version 3600 (0.0005)
-[2023-07-08 15:51:28,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10444.8, 300 sec: 9722.6). Total num frames: 1847296. Throughput: 0: 10528.1. Samples: 1825364. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:51:28,454][1003682] Avg episode reward: [(0, '687.602')]
-[2023-07-08 15:51:31,529][1003968] Updated weights for policy 0, policy_version 3680 (0.0005)
-[2023-07-08 15:51:33,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10513.1, 300 sec: 9746.4). Total num frames: 1900544. Throughput: 0: 10533.0. Samples: 1890672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:51:33,454][1003682] Avg episode reward: [(0, '710.122')]
-[2023-07-08 15:51:33,486][1003924] Saving new best policy, reward=710.122!
-[2023-07-08 15:51:35,610][1003968] Updated weights for policy 0, policy_version 3760 (0.0005)
-[2023-07-08 15:51:38,454][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 9769.0). Total num frames: 1953792. Throughput: 0: 10411.1. Samples: 1951324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:51:38,454][1003682] Avg episode reward: [(0, '716.901')]
-[2023-07-08 15:51:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003816_1953792.pth...
-[2023-07-08 15:51:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003208_1642496.pth
-[2023-07-08 15:51:38,460][1003924] Saving new best policy, reward=716.901!
-[2023-07-08 15:51:39,596][1003968] Updated weights for policy 0, policy_version 3840 (0.0005)
-[2023-07-08 15:51:43,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10513.1, 300 sec: 9770.5). Total num frames: 2002944. Throughput: 0: 10409.9. Samples: 1981632. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:51:43,454][1003682] Avg episode reward: [(0, '710.339')]
-[2023-07-08 15:51:43,662][1003968] Updated weights for policy 0, policy_version 3920 (0.0005)
-[2023-07-08 15:51:47,592][1003968] Updated weights for policy 0, policy_version 4000 (0.0005)
-[2023-07-08 15:51:48,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 9791.4). Total num frames: 2056192. Throughput: 0: 10290.0. Samples: 2043044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:51:48,454][1003682] Avg episode reward: [(0, '692.619')]
-[2023-07-08 15:51:51,662][1003968] Updated weights for policy 0, policy_version 4080 (0.0005)
-[2023-07-08 15:51:53,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 9792.3). Total num frames: 2105344. Throughput: 0: 10260.8. Samples: 2104012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:51:53,454][1003682] Avg episode reward: [(0, '684.788')]
-[2023-07-08 15:51:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004112_2105344.pth...
-[2023-07-08 15:51:53,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003504_1794048.pth
-[2023-07-08 15:51:55,736][1003968] Updated weights for policy 0, policy_version 4160 (0.0005)
-[2023-07-08 15:51:58,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.6, 300 sec: 9811.8). Total num frames: 2158592. Throughput: 0: 10285.5. Samples: 2134080. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:51:58,454][1003682] Avg episode reward: [(0, '731.510')]
-[2023-07-08 15:51:58,455][1003924] Saving new best policy, reward=731.510!
-[2023-07-08 15:51:59,531][1003968] Updated weights for policy 0, policy_version 4240 (0.0005)
-[2023-07-08 15:52:03,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 9812.2). Total num frames: 2207744. Throughput: 0: 10378.9. Samples: 2199104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:03,455][1003682] Avg episode reward: [(0, '731.118')]
-[2023-07-08 15:52:03,516][1003968] Updated weights for policy 0, policy_version 4320 (0.0005)
-[2023-07-08 15:52:07,171][1003968] Updated weights for policy 0, policy_version 4400 (0.0005)
-[2023-07-08 15:52:08,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10376.5, 300 sec: 9848.2). Total num frames: 2265088. Throughput: 0: 10436.9. Samples: 2263808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:08,455][1003682] Avg episode reward: [(0, '715.468')]
-[2023-07-08 15:52:08,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004424_2265088.pth...
-[2023-07-08 15:52:08,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003816_1953792.pth
-[2023-07-08 15:52:11,281][1003968] Updated weights for policy 0, policy_version 4480 (0.0005)
-[2023-07-08 15:52:13,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10376.5, 300 sec: 9847.8). Total num frames: 2314240. Throughput: 0: 10400.9. Samples: 2293404. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
-[2023-07-08 15:52:13,454][1003682] Avg episode reward: [(0, '711.808')]
-[2023-07-08 15:52:15,246][1003968] Updated weights for policy 0, policy_version 4560 (0.0005)
-[2023-07-08 15:52:18,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 9864.5). Total num frames: 2367488. Throughput: 0: 10302.0. Samples: 2354264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:18,454][1003682] Avg episode reward: [(0, '716.330')]
-[2023-07-08 15:52:19,141][1003968] Updated weights for policy 0, policy_version 4640 (0.0006)
-[2023-07-08 15:52:22,800][1003968] Updated weights for policy 0, policy_version 4720 (0.0005)
-[2023-07-08 15:52:23,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 9880.6). Total num frames: 2420736. Throughput: 0: 10432.8. Samples: 2420800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:23,454][1003682] Avg episode reward: [(0, '726.872')]
-[2023-07-08 15:52:23,463][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004736_2424832.pth...
-[2023-07-08 15:52:23,465][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004112_2105344.pth
-[2023-07-08 15:52:26,484][1003968] Updated weights for policy 0, policy_version 4800 (0.0005)
-[2023-07-08 15:52:28,453][1003682] Fps is (10 sec: 11059.3, 60 sec: 10513.1, 300 sec: 9912.3). Total num frames: 2478080. Throughput: 0: 10514.1. Samples: 2454768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:28,454][1003682] Avg episode reward: [(0, '730.424')]
-[2023-07-08 15:52:30,433][1003968] Updated weights for policy 0, policy_version 4880 (0.0005)
-[2023-07-08 15:52:33,453][1003682] Fps is (10 sec: 11059.1, 60 sec: 10513.0, 300 sec: 9926.8). Total num frames: 2531328. Throughput: 0: 10534.8. Samples: 2517112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:33,454][1003682] Avg episode reward: [(0, '743.823')]
-[2023-07-08 15:52:33,454][1003924] Saving new best policy, reward=743.823!
-[2023-07-08 15:52:34,247][1003968] Updated weights for policy 0, policy_version 4960 (0.0005)
-[2023-07-08 15:52:38,167][1003968] Updated weights for policy 0, policy_version 5040 (0.0005)
-[2023-07-08 15:52:38,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 9924.9). Total num frames: 2580480. Throughput: 0: 10588.4. Samples: 2580488. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:52:38,454][1003682] Avg episode reward: [(0, '733.153')]
-[2023-07-08 15:52:38,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005040_2580480.pth...
-[2023-07-08 15:52:38,458][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004424_2265088.pth
-[2023-07-08 15:52:42,346][1003968] Updated weights for policy 0, policy_version 5120 (0.0005)
-[2023-07-08 15:52:43,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10444.8, 300 sec: 9923.1). Total num frames: 2629632. Throughput: 0: 10571.2. Samples: 2609784. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:52:43,454][1003682] Avg episode reward: [(0, '735.912')]
-[2023-07-08 15:52:46,314][1003968] Updated weights for policy 0, policy_version 5200 (0.0005)
-[2023-07-08 15:52:48,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 9936.6). Total num frames: 2682880. Throughput: 0: 10485.1. Samples: 2670932. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:48,454][1003682] Avg episode reward: [(0, '750.213')]
-[2023-07-08 15:52:48,455][1003924] Saving new best policy, reward=750.213!
-[2023-07-08 15:52:50,341][1003968] Updated weights for policy 0, policy_version 5280 (0.0005)
-[2023-07-08 15:52:53,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 9934.7). Total num frames: 2732032. Throughput: 0: 10370.1. Samples: 2730464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:52:53,454][1003682] Avg episode reward: [(0, '759.347')]
-[2023-07-08 15:52:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005336_2732032.pth...
-[2023-07-08 15:52:53,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004736_2424832.pth
-[2023-07-08 15:52:53,460][1003924] Saving new best policy, reward=759.347!
-[2023-07-08 15:52:54,578][1003968] Updated weights for policy 0, policy_version 5360 (0.0005)
-[2023-07-08 15:52:58,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10376.5, 300 sec: 9932.8). Total num frames: 2781184. Throughput: 0: 10396.8. Samples: 2761260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:52:58,454][1003682] Avg episode reward: [(0, '759.404')]
-[2023-07-08 15:52:58,454][1003924] Saving new best policy, reward=759.404!
-[2023-07-08 15:52:58,500][1003968] Updated weights for policy 0, policy_version 5440 (0.0005)
-[2023-07-08 15:53:02,517][1003968] Updated weights for policy 0, policy_version 5520 (0.0005)
-[2023-07-08 15:53:03,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 9945.4). Total num frames: 2834432. Throughput: 0: 10405.8. Samples: 2822524. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:53:03,454][1003682] Avg episode reward: [(0, '753.152')]
-[2023-07-08 15:53:06,465][1003968] Updated weights for policy 0, policy_version 5600 (0.0006)
-[2023-07-08 15:53:08,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10308.3, 300 sec: 9943.4). Total num frames: 2883584. Throughput: 0: 10310.6. Samples: 2884776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:53:08,454][1003682] Avg episode reward: [(0, '757.905')]
-[2023-07-08 15:53:08,472][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005640_2887680.pth...
-[2023-07-08 15:53:08,473][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005040_2580480.pth
-[2023-07-08 15:53:10,423][1003968] Updated weights for policy 0, policy_version 5680 (0.0005)
-[2023-07-08 15:53:13,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 9955.4). Total num frames: 2936832. Throughput: 0: 10249.8. Samples: 2916008. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:53:13,454][1003682] Avg episode reward: [(0, '744.888')]
-[2023-07-08 15:53:14,531][1003968] Updated weights for policy 0, policy_version 5760 (0.0005)
-[2023-07-08 15:53:18,453][1003682] Fps is (10 sec: 10240.2, 60 sec: 10308.3, 300 sec: 10094.2). Total num frames: 2985984. Throughput: 0: 10199.0. Samples: 2976064. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:53:18,454][1003682] Avg episode reward: [(0, '769.305')]
-[2023-07-08 15:53:18,454][1003924] Saving new best policy, reward=769.305!
-[2023-07-08 15:53:18,597][1003968] Updated weights for policy 0, policy_version 5840 (0.0006)
-[2023-07-08 15:53:22,742][1003968] Updated weights for policy 0, policy_version 5920 (0.0005)
-[2023-07-08 15:53:23,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10240.0, 300 sec: 10080.3). Total num frames: 3035136. Throughput: 0: 10112.0. Samples: 3035528. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 15:53:23,454][1003682] Avg episode reward: [(0, '769.609')]
-[2023-07-08 15:53:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005928_3035136.pth...
-[2023-07-08 15:53:23,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005336_2732032.pth
-[2023-07-08 15:53:23,461][1003924] Saving new best policy, reward=769.609!
-[2023-07-08 15:53:26,616][1003968] Updated weights for policy 0, policy_version 6000 (0.0005)
-[2023-07-08 15:53:28,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10171.7, 300 sec: 10080.3). Total num frames: 3088384. Throughput: 0: 10180.4. Samples: 3067904. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 15:53:28,454][1003682] Avg episode reward: [(0, '764.198')]
-[2023-07-08 15:53:30,545][1003968] Updated weights for policy 0, policy_version 6080 (0.0005)
-[2023-07-08 15:53:33,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10171.7, 300 sec: 10094.2). Total num frames: 3141632. Throughput: 0: 10193.5. Samples: 3129640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:53:33,454][1003682] Avg episode reward: [(0, '765.561')]
-[2023-07-08 15:53:34,515][1003968] Updated weights for policy 0, policy_version 6160 (0.0005)
-[2023-07-08 15:53:38,310][1003968] Updated weights for policy 0, policy_version 6240 (0.0005)
-[2023-07-08 15:53:38,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10240.0, 300 sec: 10094.2). Total num frames: 3194880. Throughput: 0: 10273.7. Samples: 3192780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:53:38,454][1003682] Avg episode reward: [(0, '767.476')]
-[2023-07-08 15:53:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006240_3194880.pth...
-[2023-07-08 15:53:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005640_2887680.pth
-[2023-07-08 15:53:42,194][1003968] Updated weights for policy 0, policy_version 6320 (0.0005)
-[2023-07-08 15:53:43,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10308.3, 300 sec: 10122.0). Total num frames: 3248128. Throughput: 0: 10319.1. Samples: 3225620. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:53:43,455][1003682] Avg episode reward: [(0, '779.147')]
-[2023-07-08 15:53:43,455][1003924] Saving new best policy, reward=779.147!
-[2023-07-08 15:53:46,307][1003968] Updated weights for policy 0, policy_version 6400 (0.0005)
-[2023-07-08 15:53:48,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10308.3, 300 sec: 10122.0). Total num frames: 3301376. Throughput: 0: 10292.6. Samples: 3285688. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:53:48,455][1003682] Avg episode reward: [(0, '777.254')]
-[2023-07-08 15:53:49,958][1003968] Updated weights for policy 0, policy_version 6480 (0.0005)
-[2023-07-08 15:53:53,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10376.5, 300 sec: 10135.9). Total num frames: 3354624. Throughput: 0: 10389.1. Samples: 3352288. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:53:53,455][1003682] Avg episode reward: [(0, '752.150')]
-[2023-07-08 15:53:53,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006552_3354624.pth...
-[2023-07-08 15:53:53,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005928_3035136.pth
-[2023-07-08 15:53:53,731][1003968] Updated weights for policy 0, policy_version 6560 (0.0005)
-[2023-07-08 15:53:57,522][1003968] Updated weights for policy 0, policy_version 6640 (0.0006)
-[2023-07-08 15:53:58,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10149.7). Total num frames: 3407872. Throughput: 0: 10445.0. Samples: 3386032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:53:58,454][1003682] Avg episode reward: [(0, '717.316')]
-[2023-07-08 15:54:01,315][1003968] Updated weights for policy 0, policy_version 6720 (0.0005)
-[2023-07-08 15:54:03,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10444.8, 300 sec: 10149.8). Total num frames: 3461120. Throughput: 0: 10510.7. Samples: 3449048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:03,454][1003682] Avg episode reward: [(0, '774.660')]
-[2023-07-08 15:54:04,979][1003968] Updated weights for policy 0, policy_version 6800 (0.0005)
-[2023-07-08 15:54:08,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10163.6). Total num frames: 3514368. Throughput: 0: 10616.2. Samples: 3513256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:08,513][1003682] Avg episode reward: [(0, '772.975')]
-[2023-07-08 15:54:08,516][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006864_3514368.pth...
-[2023-07-08 15:54:08,519][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006240_3194880.pth
-[2023-07-08 15:54:09,182][1003968] Updated weights for policy 0, policy_version 6880 (0.0005)
-[2023-07-08 15:54:13,299][1003968] Updated weights for policy 0, policy_version 6960 (0.0005)
-[2023-07-08 15:54:13,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 10163.6). Total num frames: 3563520. Throughput: 0: 10540.0. Samples: 3542204. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:54:13,455][1003682] Avg episode reward: [(0, '765.679')]
-[2023-07-08 15:54:17,106][1003968] Updated weights for policy 0, policy_version 7040 (0.0005)
-[2023-07-08 15:54:18,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10177.5). Total num frames: 3616768. Throughput: 0: 10559.7. Samples: 3604828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:18,455][1003682] Avg episode reward: [(0, '782.431')]
-[2023-07-08 15:54:18,455][1003924] Saving new best policy, reward=782.431!
-[2023-07-08 15:54:21,125][1003968] Updated weights for policy 0, policy_version 7120 (0.0005)
-[2023-07-08 15:54:23,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10205.3). Total num frames: 3670016. Throughput: 0: 10538.2. Samples: 3667000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:23,455][1003682] Avg episode reward: [(0, '778.622')]
-[2023-07-08 15:54:23,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007168_3670016.pth...
-[2023-07-08 15:54:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006552_3354624.pth
-[2023-07-08 15:54:24,987][1003968] Updated weights for policy 0, policy_version 7200 (0.0006)
-[2023-07-08 15:54:28,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10205.3). Total num frames: 3719168. Throughput: 0: 10507.0. Samples: 3698436. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:28,454][1003682] Avg episode reward: [(0, '782.087')]
-[2023-07-08 15:54:29,342][1003968] Updated weights for policy 0, policy_version 7280 (0.0005)
-[2023-07-08 15:54:33,249][1003968] Updated weights for policy 0, policy_version 7360 (0.0006)
-[2023-07-08 15:54:33,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10444.8, 300 sec: 10219.2). Total num frames: 3768320. Throughput: 0: 10503.7. Samples: 3758352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:33,454][1003682] Avg episode reward: [(0, '764.058')]
-[2023-07-08 15:54:36,945][1003968] Updated weights for policy 0, policy_version 7440 (0.0005)
-[2023-07-08 15:54:38,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 10246.9). Total num frames: 3821568. Throughput: 0: 10429.9. Samples: 3821632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:38,454][1003682] Avg episode reward: [(0, '780.296')]
-[2023-07-08 15:54:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007464_3821568.pth...
-[2023-07-08 15:54:38,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006864_3514368.pth
-[2023-07-08 15:54:40,589][1003968] Updated weights for policy 0, policy_version 7520 (0.0006)
-[2023-07-08 15:54:43,454][1003682] Fps is (10 sec: 11059.0, 60 sec: 10513.0, 300 sec: 10274.7). Total num frames: 3878912. Throughput: 0: 10475.1. Samples: 3857412. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:43,454][1003682] Avg episode reward: [(0, '761.381')]
-[2023-07-08 15:54:44,526][1003968] Updated weights for policy 0, policy_version 7600 (0.0005)
-[2023-07-08 15:54:48,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10444.8, 300 sec: 10288.6). Total num frames: 3928064. Throughput: 0: 10410.6. Samples: 3917524. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:48,454][1003682] Avg episode reward: [(0, '767.264')]
-[2023-07-08 15:54:48,710][1003968] Updated weights for policy 0, policy_version 7680 (0.0005)
-[2023-07-08 15:54:52,366][1003968] Updated weights for policy 0, policy_version 7760 (0.0005)
-[2023-07-08 15:54:53,454][1003682] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 10316.4). Total num frames: 3981312. Throughput: 0: 10408.7. Samples: 3981648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:53,454][1003682] Avg episode reward: [(0, '760.843')]
-[2023-07-08 15:54:53,489][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007784_3985408.pth...
-[2023-07-08 15:54:53,490][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007168_3670016.pth
-[2023-07-08 15:54:56,343][1003968] Updated weights for policy 0, policy_version 7840 (0.0005)
-[2023-07-08 15:54:58,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10376.5, 300 sec: 10330.2). Total num frames: 4030464. Throughput: 0: 10458.3. Samples: 4012828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:54:58,454][1003682] Avg episode reward: [(0, '772.203')]
-[2023-07-08 15:55:00,511][1003968] Updated weights for policy 0, policy_version 7920 (0.0005)
-[2023-07-08 15:55:03,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 10344.1). Total num frames: 4083712. Throughput: 0: 10375.0. Samples: 4071704. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:55:03,454][1003682] Avg episode reward: [(0, '770.058')]
-[2023-07-08 15:55:04,446][1003968] Updated weights for policy 0, policy_version 8000 (0.0005)
-[2023-07-08 15:55:08,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10308.3, 300 sec: 10358.0). Total num frames: 4132864. Throughput: 0: 10351.5. Samples: 4132820. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:55:08,454][1003682] Avg episode reward: [(0, '766.794')]
-[2023-07-08 15:55:08,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008072_4132864.pth...
-[2023-07-08 15:55:08,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007464_3821568.pth
-[2023-07-08 15:55:08,541][1003968] Updated weights for policy 0, policy_version 8080 (0.0005)
-[2023-07-08 15:55:12,491][1003968] Updated weights for policy 0, policy_version 8160 (0.0005)
-[2023-07-08 15:55:13,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10385.8). Total num frames: 4186112. Throughput: 0: 10380.9. Samples: 4165576. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:55:13,454][1003682] Avg episode reward: [(0, '778.246')]
-[2023-07-08 15:55:16,205][1003968] Updated weights for policy 0, policy_version 8240 (0.0005)
-[2023-07-08 15:55:18,454][1003682] Fps is (10 sec: 11058.8, 60 sec: 10444.7, 300 sec: 10413.5). Total num frames: 4243456. Throughput: 0: 10486.6. Samples: 4230256. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:55:18,455][1003682] Avg episode reward: [(0, '775.918')]
-[2023-07-08 15:55:19,819][1003968] Updated weights for policy 0, policy_version 8320 (0.0005)
-[2023-07-08 15:55:23,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10376.5, 300 sec: 10413.6). Total num frames: 4292608. Throughput: 0: 10467.7. Samples: 4292680. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:55:23,454][1003682] Avg episode reward: [(0, '774.095')]
-[2023-07-08 15:55:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008384_4292608.pth...
-[2023-07-08 15:55:23,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007784_3985408.pth
-[2023-07-08 15:55:24,000][1003968] Updated weights for policy 0, policy_version 8400 (0.0005)
-[2023-07-08 15:55:28,205][1003968] Updated weights for policy 0, policy_version 8480 (0.0005)
-[2023-07-08 15:55:28,453][1003682] Fps is (10 sec: 9830.9, 60 sec: 10376.5, 300 sec: 10413.6). Total num frames: 4341760. Throughput: 0: 10309.8. Samples: 4321352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:55:28,454][1003682] Avg episode reward: [(0, '764.321')]
-[2023-07-08 15:55:32,371][1003968] Updated weights for policy 0, policy_version 8560 (0.0005)
-[2023-07-08 15:55:33,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10376.5, 300 sec: 10399.7). Total num frames: 4390912. Throughput: 0: 10314.3. Samples: 4381668. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:55:33,454][1003682] Avg episode reward: [(0, '765.560')]
-[2023-07-08 15:55:36,322][1003968] Updated weights for policy 0, policy_version 8640 (0.0005)
-[2023-07-08 15:55:38,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10376.5, 300 sec: 10413.6). Total num frames: 4444160. Throughput: 0: 10235.6. Samples: 4442248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:55:38,454][1003682] Avg episode reward: [(0, '764.136')]
-[2023-07-08 15:55:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008680_4444160.pth...
-[2023-07-08 15:55:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008072_4132864.pth
-[2023-07-08 15:55:40,357][1003968] Updated weights for policy 0, policy_version 8720 (0.0005)
-[2023-07-08 15:55:43,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10308.3, 300 sec: 10399.7). Total num frames: 4497408. Throughput: 0: 10223.8. Samples: 4472896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:55:43,454][1003682] Avg episode reward: [(0, '767.380')]
-[2023-07-08 15:55:44,244][1003968] Updated weights for policy 0, policy_version 8800 (0.0006)
-[2023-07-08 15:55:48,344][1003968] Updated weights for policy 0, policy_version 8880 (0.0005)
-[2023-07-08 15:55:48,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10308.3, 300 sec: 10385.8). Total num frames: 4546560. Throughput: 0: 10283.7. Samples: 4534472. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:55:48,454][1003682] Avg episode reward: [(0, '768.885')]
-[2023-07-08 15:55:52,521][1003968] Updated weights for policy 0, policy_version 8960 (0.0005)
-[2023-07-08 15:55:53,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10240.0, 300 sec: 10371.9). Total num frames: 4595712. Throughput: 0: 10281.6. Samples: 4595492. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:55:53,454][1003682] Avg episode reward: [(0, '764.703')]
-[2023-07-08 15:55:53,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008976_4595712.pth...
-[2023-07-08 15:55:53,458][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008384_4292608.pth
-[2023-07-08 15:55:56,299][1003968] Updated weights for policy 0, policy_version 9040 (0.0005)
-[2023-07-08 15:55:58,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10308.3, 300 sec: 10358.0). Total num frames: 4648960. Throughput: 0: 10279.6. Samples: 4628156. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:55:58,454][1003682] Avg episode reward: [(0, '767.463')]
-[2023-07-08 15:56:00,216][1003968] Updated weights for policy 0, policy_version 9120 (0.0005)
-[2023-07-08 15:56:03,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10240.0, 300 sec: 10358.0). Total num frames: 4698112. Throughput: 0: 10202.9. Samples: 4689380. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:03,454][1003682] Avg episode reward: [(0, '770.297')]
-[2023-07-08 15:56:04,360][1003968] Updated weights for policy 0, policy_version 9200 (0.0005)
-[2023-07-08 15:56:08,330][1003968] Updated weights for policy 0, policy_version 9280 (0.0005)
-[2023-07-08 15:56:08,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10308.3, 300 sec: 10371.9). Total num frames: 4751360. Throughput: 0: 10139.0. Samples: 4748936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:08,454][1003682] Avg episode reward: [(0, '766.470')]
-[2023-07-08 15:56:08,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009280_4751360.pth...
-[2023-07-08 15:56:08,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008680_4444160.pth
-[2023-07-08 15:56:12,603][1003968] Updated weights for policy 0, policy_version 9360 (0.0005)
-[2023-07-08 15:56:13,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10171.7, 300 sec: 10358.0). Total num frames: 4796416. Throughput: 0: 10172.2. Samples: 4779100. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:13,454][1003682] Avg episode reward: [(0, '769.537')]
-[2023-07-08 15:56:16,580][1003968] Updated weights for policy 0, policy_version 9440 (0.0006)
-[2023-07-08 15:56:18,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10103.5, 300 sec: 10358.0). Total num frames: 4849664. Throughput: 0: 10158.2. Samples: 4838788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:18,454][1003682] Avg episode reward: [(0, '774.058')]
-[2023-07-08 15:56:20,650][1003968] Updated weights for policy 0, policy_version 9520 (0.0005)
-[2023-07-08 15:56:23,454][1003682] Fps is (10 sec: 10649.6, 60 sec: 10171.7, 300 sec: 10358.0). Total num frames: 4902912. Throughput: 0: 10194.4. Samples: 4900996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:23,454][1003682] Avg episode reward: [(0, '773.825')]
-[2023-07-08 15:56:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009576_4902912.pth...
-[2023-07-08 15:56:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008976_4595712.pth
-[2023-07-08 15:56:24,512][1003968] Updated weights for policy 0, policy_version 9600 (0.0005)
-[2023-07-08 15:56:28,392][1003968] Updated weights for policy 0, policy_version 9680 (0.0005)
-[2023-07-08 15:56:28,454][1003682] Fps is (10 sec: 10649.4, 60 sec: 10240.0, 300 sec: 10358.0). Total num frames: 4956160. Throughput: 0: 10261.7. Samples: 4934676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:28,454][1003682] Avg episode reward: [(0, '775.844')]
-[2023-07-08 15:56:32,538][1003968] Updated weights for policy 0, policy_version 9760 (0.0005)
-[2023-07-08 15:56:33,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10344.1). Total num frames: 5005312. Throughput: 0: 10195.9. Samples: 4993288. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:56:33,454][1003682] Avg episode reward: [(0, '761.360')]
-[2023-07-08 15:56:36,247][1003968] Updated weights for policy 0, policy_version 9840 (0.0005)
-[2023-07-08 15:56:38,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10358.0). Total num frames: 5058560. Throughput: 0: 10289.4. Samples: 5058516. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:56:38,454][1003682] Avg episode reward: [(0, '772.189')]
-[2023-07-08 15:56:38,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009880_5058560.pth...
-[2023-07-08 15:56:38,463][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009280_4751360.pth
-[2023-07-08 15:56:40,172][1003968] Updated weights for policy 0, policy_version 9920 (0.0005)
-[2023-07-08 15:56:43,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10240.0, 300 sec: 10358.0). Total num frames: 5111808. Throughput: 0: 10289.8. Samples: 5091196. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:43,454][1003682] Avg episode reward: [(0, '766.544')]
-[2023-07-08 15:56:44,029][1003968] Updated weights for policy 0, policy_version 10000 (0.0005)
-[2023-07-08 15:56:47,754][1003968] Updated weights for policy 0, policy_version 10080 (0.0005)
-[2023-07-08 15:56:48,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10308.3, 300 sec: 10371.9). Total num frames: 5165056. Throughput: 0: 10330.0. Samples: 5154232. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:56:48,454][1003682] Avg episode reward: [(0, '763.331')]
-[2023-07-08 15:56:51,903][1003968] Updated weights for policy 0, policy_version 10160 (0.0005)
-[2023-07-08 15:56:53,454][1003682] Fps is (10 sec: 10649.4, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 5218304. Throughput: 0: 10421.7. Samples: 5217912. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:56:53,501][1003682] Avg episode reward: [(0, '767.959')]
-[2023-07-08 15:56:53,503][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010192_5218304.pth...
-[2023-07-08 15:56:53,505][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009576_4902912.pth
-[2023-07-08 15:56:55,749][1003968] Updated weights for policy 0, policy_version 10240 (0.0005)
-[2023-07-08 15:56:58,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10376.5, 300 sec: 10385.8). Total num frames: 5271552. Throughput: 0: 10398.8. Samples: 5247048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:56:58,460][1003682] Avg episode reward: [(0, '773.560')]
-[2023-07-08 15:56:59,587][1003968] Updated weights for policy 0, policy_version 10320 (0.0005)
-[2023-07-08 15:57:03,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 10358.0). Total num frames: 5320704. Throughput: 0: 10465.3. Samples: 5309728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:03,491][1003682] Avg episode reward: [(0, '776.378')]
-[2023-07-08 15:57:03,567][1003968] Updated weights for policy 0, policy_version 10400 (0.0005)
-[2023-07-08 15:57:07,369][1003968] Updated weights for policy 0, policy_version 10480 (0.0005)
-[2023-07-08 15:57:08,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 5373952. Throughput: 0: 10510.3. Samples: 5373960. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 15:57:08,498][1003682] Avg episode reward: [(0, '776.008')]
-[2023-07-08 15:57:08,501][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010496_5373952.pth...
-[2023-07-08 15:57:08,503][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009880_5058560.pth
-[2023-07-08 15:57:11,302][1003968] Updated weights for policy 0, policy_version 10560 (0.0006)
-[2023-07-08 15:57:13,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10371.9). Total num frames: 5427200. Throughput: 0: 10460.6. Samples: 5405400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:13,455][1003682] Avg episode reward: [(0, '772.635')]
-[2023-07-08 15:57:15,284][1003968] Updated weights for policy 0, policy_version 10640 (0.0005)
-[2023-07-08 15:57:18,453][1003682] Fps is (10 sec: 11059.3, 60 sec: 10581.3, 300 sec: 10385.8). Total num frames: 5484544. Throughput: 0: 10606.1. Samples: 5470560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:18,454][1003682] Avg episode reward: [(0, '771.676')]
-[2023-07-08 15:57:18,780][1003968] Updated weights for policy 0, policy_version 10720 (0.0005)
-[2023-07-08 15:57:22,571][1003968] Updated weights for policy 0, policy_version 10800 (0.0005)
-[2023-07-08 15:57:23,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10358.0). Total num frames: 5533696. Throughput: 0: 10590.2. Samples: 5535076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:23,454][1003682] Avg episode reward: [(0, '778.764')]
-[2023-07-08 15:57:23,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010816_5537792.pth...
-[2023-07-08 15:57:23,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010192_5218304.pth
-[2023-07-08 15:57:26,697][1003968] Updated weights for policy 0, policy_version 10880 (0.0005)
-[2023-07-08 15:57:28,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10513.1, 300 sec: 10358.0). Total num frames: 5586944. Throughput: 0: 10535.1. Samples: 5565276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:28,455][1003682] Avg episode reward: [(0, '779.020')]
-[2023-07-08 15:57:30,632][1003968] Updated weights for policy 0, policy_version 10960 (0.0005)
-[2023-07-08 15:57:33,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10371.9). Total num frames: 5640192. Throughput: 0: 10518.5. Samples: 5627564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:33,455][1003682] Avg episode reward: [(0, '784.101')]
-[2023-07-08 15:57:33,455][1003924] Saving new best policy, reward=784.101!
-[2023-07-08 15:57:34,536][1003968] Updated weights for policy 0, policy_version 11040 (0.0004)
-[2023-07-08 15:57:38,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10371.9). Total num frames: 5689344. Throughput: 0: 10486.6. Samples: 5689808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:38,455][1003682] Avg episode reward: [(0, '782.752')]
-[2023-07-08 15:57:38,487][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011120_5693440.pth...
-[2023-07-08 15:57:38,488][1003968] Updated weights for policy 0, policy_version 11120 (0.0005)
-[2023-07-08 15:57:38,489][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010496_5373952.pth
-[2023-07-08 15:57:42,497][1003968] Updated weights for policy 0, policy_version 11200 (0.0005)
-[2023-07-08 15:57:43,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10513.1, 300 sec: 10371.9). Total num frames: 5742592. Throughput: 0: 10549.9. Samples: 5721792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:43,454][1003682] Avg episode reward: [(0, '787.984')]
-[2023-07-08 15:57:43,455][1003924] Saving new best policy, reward=787.984!
-[2023-07-08 15:57:46,613][1003968] Updated weights for policy 0, policy_version 11280 (0.0005)
-[2023-07-08 15:57:48,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 10371.9). Total num frames: 5791744. Throughput: 0: 10458.9. Samples: 5780380. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
-[2023-07-08 15:57:48,454][1003682] Avg episode reward: [(0, '776.099')]
-[2023-07-08 15:57:50,847][1003968] Updated weights for policy 0, policy_version 11360 (0.0005)
-[2023-07-08 15:57:53,453][1003682] Fps is (10 sec: 9830.3, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 5840896. Throughput: 0: 10329.6. Samples: 5838792. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
-[2023-07-08 15:57:53,454][1003682] Avg episode reward: [(0, '788.878')]
-[2023-07-08 15:57:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011408_5840896.pth...
-[2023-07-08 15:57:53,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010816_5537792.pth
-[2023-07-08 15:57:53,460][1003924] Saving new best policy, reward=788.878!
-[2023-07-08 15:57:55,045][1003968] Updated weights for policy 0, policy_version 11440 (0.0005)
-[2023-07-08 15:57:58,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 5894144. Throughput: 0: 10309.6. Samples: 5869332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:57:58,454][1003682] Avg episode reward: [(0, '779.164')]
-[2023-07-08 15:57:58,741][1003968] Updated weights for policy 0, policy_version 11520 (0.0005)
-[2023-07-08 15:58:02,532][1003968] Updated weights for policy 0, policy_version 11600 (0.0006)
-[2023-07-08 15:58:03,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10385.8). Total num frames: 5947392. Throughput: 0: 10324.6. Samples: 5935168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:03,454][1003682] Avg episode reward: [(0, '783.102')]
-[2023-07-08 15:58:06,582][1003968] Updated weights for policy 0, policy_version 11680 (0.0006)
-[2023-07-08 15:58:08,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 5996544. Throughput: 0: 10245.2. Samples: 5996112. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
-[2023-07-08 15:58:08,454][1003682] Avg episode reward: [(0, '784.782')]
-[2023-07-08 15:58:08,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011712_5996544.pth...
-[2023-07-08 15:58:08,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011120_5693440.pth
-[2023-07-08 15:58:10,653][1003968] Updated weights for policy 0, policy_version 11760 (0.0004)
-[2023-07-08 15:58:13,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10308.3, 300 sec: 10371.9). Total num frames: 6045696. Throughput: 0: 10228.5. Samples: 6025556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:13,454][1003682] Avg episode reward: [(0, '791.687')]
-[2023-07-08 15:58:13,454][1003924] Saving new best policy, reward=791.687!
-[2023-07-08 15:58:14,760][1003968] Updated weights for policy 0, policy_version 11840 (0.0005)
-[2023-07-08 15:58:18,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10385.8). Total num frames: 6098944. Throughput: 0: 10203.5. Samples: 6086720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:18,454][1003682] Avg episode reward: [(0, '783.744')]
-[2023-07-08 15:58:18,748][1003968] Updated weights for policy 0, policy_version 11920 (0.0005)
-[2023-07-08 15:58:22,707][1003968] Updated weights for policy 0, policy_version 12000 (0.0005)
-[2023-07-08 15:58:23,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10240.0, 300 sec: 10371.9). Total num frames: 6148096. Throughput: 0: 10196.1. Samples: 6148632. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:58:23,454][1003682] Avg episode reward: [(0, '781.879')]
-[2023-07-08 15:58:23,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012008_6148096.pth...
-[2023-07-08 15:58:23,458][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011408_5840896.pth
-[2023-07-08 15:58:26,638][1003968] Updated weights for policy 0, policy_version 12080 (0.0005)
-[2023-07-08 15:58:28,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10240.0, 300 sec: 10371.9). Total num frames: 6201344. Throughput: 0: 10198.7. Samples: 6180732. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:58:28,454][1003682] Avg episode reward: [(0, '792.592')]
-[2023-07-08 15:58:28,454][1003924] Saving new best policy, reward=792.592!
-[2023-07-08 15:58:30,585][1003968] Updated weights for policy 0, policy_version 12160 (0.0005)
-[2023-07-08 15:58:33,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10240.0, 300 sec: 10371.9). Total num frames: 6254592. Throughput: 0: 10296.9. Samples: 6243740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:33,454][1003682] Avg episode reward: [(0, '784.039')]
-[2023-07-08 15:58:34,351][1003968] Updated weights for policy 0, policy_version 12240 (0.0005)
-[2023-07-08 15:58:38,294][1003968] Updated weights for policy 0, policy_version 12320 (0.0006)
-[2023-07-08 15:58:38,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10308.3, 300 sec: 10371.9). Total num frames: 6307840. Throughput: 0: 10410.3. Samples: 6307256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:38,454][1003682] Avg episode reward: [(0, '785.070')]
-[2023-07-08 15:58:38,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012320_6307840.pth...
-[2023-07-08 15:58:38,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011712_5996544.pth
-[2023-07-08 15:58:42,198][1003968] Updated weights for policy 0, policy_version 12400 (0.0005)
-[2023-07-08 15:58:43,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10308.3, 300 sec: 10371.9). Total num frames: 6361088. Throughput: 0: 10423.8. Samples: 6338404. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:43,454][1003682] Avg episode reward: [(0, '788.837')]
-[2023-07-08 15:58:45,955][1003968] Updated weights for policy 0, policy_version 12480 (0.0005)
-[2023-07-08 15:58:48,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 6414336. Throughput: 0: 10379.9. Samples: 6402264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:48,454][1003682] Avg episode reward: [(0, '783.607')]
-[2023-07-08 15:58:49,853][1003968] Updated weights for policy 0, policy_version 12560 (0.0005)
-[2023-07-08 15:58:53,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10376.5, 300 sec: 10358.0). Total num frames: 6463488. Throughput: 0: 10395.1. Samples: 6463892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:53,454][1003682] Avg episode reward: [(0, '783.735')]
-[2023-07-08 15:58:53,489][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012632_6467584.pth...
-[2023-07-08 15:58:53,491][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012008_6148096.pth
-[2023-07-08 15:58:53,945][1003968] Updated weights for policy 0, policy_version 12640 (0.0005)
-[2023-07-08 15:58:57,907][1003968] Updated weights for policy 0, policy_version 12720 (0.0005)
-[2023-07-08 15:58:58,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10358.0). Total num frames: 6516736. Throughput: 0: 10421.6. Samples: 6494528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:58:58,455][1003682] Avg episode reward: [(0, '791.112')]
-[2023-07-08 15:59:01,460][1003968] Updated weights for policy 0, policy_version 12800 (0.0005)
-[2023-07-08 15:59:03,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10376.6, 300 sec: 10358.0). Total num frames: 6569984. Throughput: 0: 10539.1. Samples: 6560976. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 15:59:03,454][1003682] Avg episode reward: [(0, '744.885')]
-[2023-07-08 15:59:05,565][1003968] Updated weights for policy 0, policy_version 12880 (0.0005)
-[2023-07-08 15:59:08,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10444.8, 300 sec: 10371.9). Total num frames: 6623232. Throughput: 0: 10531.2. Samples: 6622536. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 15:59:08,455][1003682] Avg episode reward: [(0, '787.874')]
-[2023-07-08 15:59:08,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012936_6623232.pth...
-[2023-07-08 15:59:08,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012320_6307840.pth
-[2023-07-08 15:59:09,570][1003968] Updated weights for policy 0, policy_version 12960 (0.0006)
-[2023-07-08 15:59:13,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 10358.0). Total num frames: 6672384. Throughput: 0: 10472.1. Samples: 6651976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:59:13,455][1003682] Avg episode reward: [(0, '783.967')]
-[2023-07-08 15:59:13,552][1003968] Updated weights for policy 0, policy_version 13040 (0.0005)
-[2023-07-08 15:59:17,551][1003968] Updated weights for policy 0, policy_version 13120 (0.0006)
-[2023-07-08 15:59:18,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 10358.0). Total num frames: 6725632. Throughput: 0: 10439.7. Samples: 6713528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:59:18,454][1003682] Avg episode reward: [(0, '787.701')]
-[2023-07-08 15:59:21,364][1003968] Updated weights for policy 0, policy_version 13200 (0.0005)
-[2023-07-08 15:59:23,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10371.9). Total num frames: 6778880. Throughput: 0: 10454.3. Samples: 6777700. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:59:23,454][1003682] Avg episode reward: [(0, '785.195')]
-[2023-07-08 15:59:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013240_6778880.pth...
-[2023-07-08 15:59:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012632_6467584.pth
-[2023-07-08 15:59:25,283][1003968] Updated weights for policy 0, policy_version 13280 (0.0005)
-[2023-07-08 15:59:28,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10385.8). Total num frames: 6832128. Throughput: 0: 10456.1. Samples: 6808928. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 15:59:28,454][1003682] Avg episode reward: [(0, '782.385')]
-[2023-07-08 15:59:29,086][1003968] Updated weights for policy 0, policy_version 13360 (0.0005)
-[2023-07-08 15:59:32,996][1003968] Updated weights for policy 0, policy_version 13440 (0.0005)
-[2023-07-08 15:59:33,454][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.0, 300 sec: 10385.8). Total num frames: 6885376. Throughput: 0: 10460.9. Samples: 6873004. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:59:33,454][1003682] Avg episode reward: [(0, '784.417')]
-[2023-07-08 15:59:36,965][1003968] Updated weights for policy 0, policy_version 13520 (0.0005)
-[2023-07-08 15:59:38,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10371.9). Total num frames: 6938624. Throughput: 0: 10515.2. Samples: 6937076. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
-[2023-07-08 15:59:38,454][1003682] Avg episode reward: [(0, '785.803')]
-[2023-07-08 15:59:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013552_6938624.pth...
-[2023-07-08 15:59:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012936_6623232.pth
-[2023-07-08 15:59:40,769][1003968] Updated weights for policy 0, policy_version 13600 (0.0006)
-[2023-07-08 15:59:43,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10371.9). Total num frames: 6987776. Throughput: 0: 10517.6. Samples: 6967820. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:59:43,454][1003682] Avg episode reward: [(0, '789.006')]
-[2023-07-08 15:59:44,699][1003968] Updated weights for policy 0, policy_version 13680 (0.0005)
-[2023-07-08 15:59:48,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10371.9). Total num frames: 7041024. Throughput: 0: 10402.3. Samples: 7029080. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 15:59:48,454][1003682] Avg episode reward: [(0, '788.585')]
-[2023-07-08 15:59:48,850][1003968] Updated weights for policy 0, policy_version 13760 (0.0006)
-[2023-07-08 15:59:52,738][1003968] Updated weights for policy 0, policy_version 13840 (0.0005)
-[2023-07-08 15:59:53,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 10371.9). Total num frames: 7090176. Throughput: 0: 10393.5. Samples: 7090240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:59:53,454][1003682] Avg episode reward: [(0, '785.701')]
-[2023-07-08 15:59:53,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013848_7090176.pth...
-[2023-07-08 15:59:53,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013240_6778880.pth
-[2023-07-08 15:59:56,915][1003968] Updated weights for policy 0, policy_version 13920 (0.0005)
-[2023-07-08 15:59:58,453][1003682] Fps is (10 sec: 9830.4, 60 sec: 10376.5, 300 sec: 10358.0). Total num frames: 7139328. Throughput: 0: 10381.7. Samples: 7119152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 15:59:58,454][1003682] Avg episode reward: [(0, '773.789')]
-[2023-07-08 16:00:00,916][1003968] Updated weights for policy 0, policy_version 14000 (0.0005)
-[2023-07-08 16:00:03,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 7192576. Throughput: 0: 10405.4. Samples: 7181768. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 16:00:03,454][1003682] Avg episode reward: [(0, '784.178')]
-[2023-07-08 16:00:04,800][1003968] Updated weights for policy 0, policy_version 14080 (0.0005)
-[2023-07-08 16:00:08,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10376.5, 300 sec: 10371.9). Total num frames: 7245824. Throughput: 0: 10402.7. Samples: 7245824. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
-[2023-07-08 16:00:08,454][1003682] Avg episode reward: [(0, '787.766')]
-[2023-07-08 16:00:08,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014152_7245824.pth...
-[2023-07-08 16:00:08,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013552_6938624.pth
-[2023-07-08 16:00:08,585][1003968] Updated weights for policy 0, policy_version 14160 (0.0005)
-[2023-07-08 16:00:12,367][1003968] Updated weights for policy 0, policy_version 14240 (0.0004)
-[2023-07-08 16:00:13,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10358.0). Total num frames: 7299072. Throughput: 0: 10433.4. Samples: 7278432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:13,454][1003682] Avg episode reward: [(0, '778.917')]
-[2023-07-08 16:00:16,342][1003968] Updated weights for policy 0, policy_version 14320 (0.0005)
-[2023-07-08 16:00:18,453][1003682] Fps is (10 sec: 10649.8, 60 sec: 10444.8, 300 sec: 10371.9). Total num frames: 7352320. Throughput: 0: 10379.9. Samples: 7340096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:18,454][1003682] Avg episode reward: [(0, '777.686')]
-[2023-07-08 16:00:20,165][1003968] Updated weights for policy 0, policy_version 14400 (0.0005)
-[2023-07-08 16:00:23,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10444.8, 300 sec: 10385.8). Total num frames: 7405568. Throughput: 0: 10405.0. Samples: 7405300. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:23,454][1003682] Avg episode reward: [(0, '764.378')]
-[2023-07-08 16:00:23,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014464_7405568.pth...
-[2023-07-08 16:00:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013848_7090176.pth
-[2023-07-08 16:00:24,017][1003968] Updated weights for policy 0, policy_version 14480 (0.0005)
-[2023-07-08 16:00:28,073][1003968] Updated weights for policy 0, policy_version 14560 (0.0005)
-[2023-07-08 16:00:28,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10444.8, 300 sec: 10399.7). Total num frames: 7458816. Throughput: 0: 10388.8. Samples: 7435316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:28,454][1003682] Avg episode reward: [(0, '780.120')]
-[2023-07-08 16:00:32,160][1003968] Updated weights for policy 0, policy_version 14640 (0.0005)
-[2023-07-08 16:00:33,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10376.5, 300 sec: 10385.8). Total num frames: 7507968. Throughput: 0: 10370.3. Samples: 7495744. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:33,454][1003682] Avg episode reward: [(0, '779.120')]
-[2023-07-08 16:00:36,187][1003968] Updated weights for policy 0, policy_version 14720 (0.0005)
-[2023-07-08 16:00:38,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10308.3, 300 sec: 10371.9). Total num frames: 7557120. Throughput: 0: 10376.8. Samples: 7557196. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:38,454][1003682] Avg episode reward: [(0, '779.635')]
-[2023-07-08 16:00:38,470][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014768_7561216.pth...
-[2023-07-08 16:00:38,472][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014152_7245824.pth
-[2023-07-08 16:00:39,842][1003968] Updated weights for policy 0, policy_version 14800 (0.0005)
-[2023-07-08 16:00:43,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10399.7). Total num frames: 7614464. Throughput: 0: 10525.8. Samples: 7592812. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:43,454][1003682] Avg episode reward: [(0, '779.903')]
-[2023-07-08 16:00:43,772][1003968] Updated weights for policy 0, policy_version 14880 (0.0005)
-[2023-07-08 16:00:47,490][1003968] Updated weights for policy 0, policy_version 14960 (0.0005)
-[2023-07-08 16:00:48,453][1003682] Fps is (10 sec: 11059.1, 60 sec: 10444.8, 300 sec: 10413.6). Total num frames: 7667712. Throughput: 0: 10550.7. Samples: 7656552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:48,454][1003682] Avg episode reward: [(0, '786.495')]
-[2023-07-08 16:00:51,558][1003968] Updated weights for policy 0, policy_version 15040 (0.0005)
-[2023-07-08 16:00:53,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10413.6). Total num frames: 7720960. Throughput: 0: 10522.2. Samples: 7719320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:53,454][1003682] Avg episode reward: [(0, '773.041')]
-[2023-07-08 16:00:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015080_7720960.pth...
-[2023-07-08 16:00:53,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014464_7405568.pth
-[2023-07-08 16:00:55,205][1003968] Updated weights for policy 0, policy_version 15120 (0.0005)
-[2023-07-08 16:00:58,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10581.3, 300 sec: 10427.4). Total num frames: 7774208. Throughput: 0: 10511.3. Samples: 7751440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:00:58,454][1003682] Avg episode reward: [(0, '777.405')]
-[2023-07-08 16:00:58,848][1003968] Updated weights for policy 0, policy_version 15200 (0.0005)
-[2023-07-08 16:01:02,720][1003968] Updated weights for policy 0, policy_version 15280 (0.0005)
-[2023-07-08 16:01:03,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10427.4). Total num frames: 7827456. Throughput: 0: 10612.7. Samples: 7817668. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:03,454][1003682] Avg episode reward: [(0, '777.042')]
-[2023-07-08 16:01:06,877][1003968] Updated weights for policy 0, policy_version 15360 (0.0005)
-[2023-07-08 16:01:08,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.4, 300 sec: 10455.2). Total num frames: 7880704. Throughput: 0: 10492.5. Samples: 7877460. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 16:01:08,454][1003682] Avg episode reward: [(0, '783.228')]
-[2023-07-08 16:01:08,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015392_7880704.pth...
-[2023-07-08 16:01:08,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014768_7561216.pth
-[2023-07-08 16:01:11,008][1003968] Updated weights for policy 0, policy_version 15440 (0.0005)
-[2023-07-08 16:01:13,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10444.8, 300 sec: 10427.4). Total num frames: 7925760. Throughput: 0: 10478.8. Samples: 7906860. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
-[2023-07-08 16:01:13,454][1003682] Avg episode reward: [(0, '782.133')]
-[2023-07-08 16:01:15,101][1003968] Updated weights for policy 0, policy_version 15520 (0.0006)
-[2023-07-08 16:01:18,454][1003682] Fps is (10 sec: 9830.3, 60 sec: 10444.8, 300 sec: 10427.4). Total num frames: 7979008. Throughput: 0: 10465.1. Samples: 7966672. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:01:18,454][1003682] Avg episode reward: [(0, '782.475')]
-[2023-07-08 16:01:19,277][1003968] Updated weights for policy 0, policy_version 15600 (0.0005)
-[2023-07-08 16:01:23,009][1003968] Updated weights for policy 0, policy_version 15680 (0.0005)
-[2023-07-08 16:01:23,453][1003682] Fps is (10 sec: 10649.5, 60 sec: 10444.8, 300 sec: 10427.4). Total num frames: 8032256. Throughput: 0: 10504.2. Samples: 8029884. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:01:23,454][1003682] Avg episode reward: [(0, '784.823')]
-[2023-07-08 16:01:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015688_8032256.pth...
-[2023-07-08 16:01:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015080_7720960.pth
-[2023-07-08 16:01:26,893][1003968] Updated weights for policy 0, policy_version 15760 (0.0005)
-[2023-07-08 16:01:28,453][1003682] Fps is (10 sec: 10649.8, 60 sec: 10444.8, 300 sec: 10441.3). Total num frames: 8085504. Throughput: 0: 10401.7. Samples: 8060888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:28,454][1003682] Avg episode reward: [(0, '788.225')]
-[2023-07-08 16:01:30,786][1003968] Updated weights for policy 0, policy_version 15840 (0.0005)
-[2023-07-08 16:01:33,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10427.4). Total num frames: 8134656. Throughput: 0: 10355.1. Samples: 8122532. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:33,454][1003682] Avg episode reward: [(0, '789.434')]
-[2023-07-08 16:01:34,937][1003968] Updated weights for policy 0, policy_version 15920 (0.0005)
-[2023-07-08 16:01:38,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10513.1, 300 sec: 10427.4). Total num frames: 8187904. Throughput: 0: 10414.4. Samples: 8187968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:38,454][1003682] Avg episode reward: [(0, '791.673')]
-[2023-07-08 16:01:38,461][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016000_8192000.pth...
-[2023-07-08 16:01:38,461][1003968] Updated weights for policy 0, policy_version 16000 (0.0005)
-[2023-07-08 16:01:38,462][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015392_7880704.pth
-[2023-07-08 16:01:42,417][1003968] Updated weights for policy 0, policy_version 16080 (0.0005)
-[2023-07-08 16:01:43,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10427.4). Total num frames: 8241152. Throughput: 0: 10421.9. Samples: 8220428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:43,454][1003682] Avg episode reward: [(0, '793.861')]
-[2023-07-08 16:01:43,455][1003924] Saving new best policy, reward=793.861!
-[2023-07-08 16:01:46,227][1003968] Updated weights for policy 0, policy_version 16160 (0.0005)
-[2023-07-08 16:01:48,453][1003682] Fps is (10 sec: 11059.1, 60 sec: 10513.1, 300 sec: 10441.3). Total num frames: 8298496. Throughput: 0: 10413.4. Samples: 8286272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:48,454][1003682] Avg episode reward: [(0, '784.788')]
-[2023-07-08 16:01:49,677][1003968] Updated weights for policy 0, policy_version 16240 (0.0006)
-[2023-07-08 16:01:53,453][1003682] Fps is (10 sec: 11059.2, 60 sec: 10513.1, 300 sec: 10441.3). Total num frames: 8351744. Throughput: 0: 10553.3. Samples: 8352360. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:53,454][1003682] Avg episode reward: [(0, '794.958')]
-[2023-07-08 16:01:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016312_8351744.pth...
-[2023-07-08 16:01:53,458][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015688_8032256.pth
-[2023-07-08 16:01:53,459][1003924] Saving new best policy, reward=794.958!
-[2023-07-08 16:01:53,527][1003968] Updated weights for policy 0, policy_version 16320 (0.0005)
-[2023-07-08 16:01:57,492][1003968] Updated weights for policy 0, policy_version 16400 (0.0005)
-[2023-07-08 16:01:58,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.0, 300 sec: 10455.2). Total num frames: 8404992. Throughput: 0: 10608.3. Samples: 8384236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:01:58,454][1003682] Avg episode reward: [(0, '794.016')]
-[2023-07-08 16:02:01,287][1003968] Updated weights for policy 0, policy_version 16480 (0.0005)
-[2023-07-08 16:02:03,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10455.2). Total num frames: 8458240. Throughput: 0: 10669.0. Samples: 8446776. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:02:03,454][1003682] Avg episode reward: [(0, '793.872')]
-[2023-07-08 16:02:05,369][1003968] Updated weights for policy 0, policy_version 16560 (0.0006)
-[2023-07-08 16:02:08,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10441.3). Total num frames: 8507392. Throughput: 0: 10626.5. Samples: 8508076. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:02:08,454][1003682] Avg episode reward: [(0, '792.577')]
-[2023-07-08 16:02:08,483][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016624_8511488.pth...
-[2023-07-08 16:02:08,484][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016000_8192000.pth
-[2023-07-08 16:02:09,274][1003968] Updated weights for policy 0, policy_version 16640 (0.0005)
-[2023-07-08 16:02:13,213][1003968] Updated weights for policy 0, policy_version 16720 (0.0005)
-[2023-07-08 16:02:13,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10581.3, 300 sec: 10427.4). Total num frames: 8560640. Throughput: 0: 10647.6. Samples: 8540032. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:02:13,454][1003682] Avg episode reward: [(0, '796.746')]
-[2023-07-08 16:02:13,454][1003924] Saving new best policy, reward=796.746!
-[2023-07-08 16:02:16,827][1003968] Updated weights for policy 0, policy_version 16800 (0.0005)
-[2023-07-08 16:02:18,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10441.3). Total num frames: 8613888. Throughput: 0: 10734.6. Samples: 8605588. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:18,454][1003682] Avg episode reward: [(0, '799.537')]
-[2023-07-08 16:02:18,454][1003924] Saving new best policy, reward=799.537!
-[2023-07-08 16:02:21,013][1003968] Updated weights for policy 0, policy_version 16880 (0.0005)
-[2023-07-08 16:02:23,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10441.3). Total num frames: 8667136. Throughput: 0: 10628.4. Samples: 8666248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:23,454][1003682] Avg episode reward: [(0, '800.426')]
-[2023-07-08 16:02:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016928_8667136.pth...
-[2023-07-08 16:02:23,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016312_8351744.pth
-[2023-07-08 16:02:23,460][1003924] Saving new best policy, reward=800.426!
-[2023-07-08 16:02:24,857][1003968] Updated weights for policy 0, policy_version 16960 (0.0005)
-[2023-07-08 16:02:28,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10581.3, 300 sec: 10441.3). Total num frames: 8720384. Throughput: 0: 10621.2. Samples: 8698380. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:28,454][1003682] Avg episode reward: [(0, '799.312')]
-[2023-07-08 16:02:28,575][1003968] Updated weights for policy 0, policy_version 17040 (0.0005)
-[2023-07-08 16:02:32,600][1003968] Updated weights for policy 0, policy_version 17120 (0.0005)
-[2023-07-08 16:02:33,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10649.6, 300 sec: 10455.2). Total num frames: 8773632. Throughput: 0: 10557.4. Samples: 8761352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:33,454][1003682] Avg episode reward: [(0, '800.176')]
-[2023-07-08 16:02:36,524][1003968] Updated weights for policy 0, policy_version 17200 (0.0006)
-[2023-07-08 16:02:38,454][1003682] Fps is (10 sec: 10239.9, 60 sec: 10581.3, 300 sec: 10441.3). Total num frames: 8822784. Throughput: 0: 10453.8. Samples: 8822780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:38,454][1003682] Avg episode reward: [(0, '797.714')]
-[2023-07-08 16:02:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017232_8822784.pth...
-[2023-07-08 16:02:38,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016624_8511488.pth
-[2023-07-08 16:02:40,539][1003968] Updated weights for policy 0, policy_version 17280 (0.0006)
-[2023-07-08 16:02:43,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10581.3, 300 sec: 10455.2). Total num frames: 8876032. Throughput: 0: 10473.7. Samples: 8855552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:43,454][1003682] Avg episode reward: [(0, '794.151')]
-[2023-07-08 16:02:44,308][1003968] Updated weights for policy 0, policy_version 17360 (0.0005)
-[2023-07-08 16:02:48,158][1003968] Updated weights for policy 0, policy_version 17440 (0.0005)
-[2023-07-08 16:02:48,453][1003682] Fps is (10 sec: 10649.8, 60 sec: 10513.1, 300 sec: 10469.1). Total num frames: 8929280. Throughput: 0: 10521.4. Samples: 8920240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:48,454][1003682] Avg episode reward: [(0, '789.931')]
-[2023-07-08 16:02:52,071][1003968] Updated weights for policy 0, policy_version 17520 (0.0006)
-[2023-07-08 16:02:53,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10469.1). Total num frames: 8982528. Throughput: 0: 10543.4. Samples: 8982528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:53,454][1003682] Avg episode reward: [(0, '800.015')]
-[2023-07-08 16:02:53,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017544_8982528.pth...
-[2023-07-08 16:02:53,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016928_8667136.pth
-[2023-07-08 16:02:55,707][1003968] Updated weights for policy 0, policy_version 17600 (0.0005)
-[2023-07-08 16:02:58,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10469.1). Total num frames: 9035776. Throughput: 0: 10574.7. Samples: 9015892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:02:58,454][1003682] Avg episode reward: [(0, '796.425')]
-[2023-07-08 16:02:59,659][1003968] Updated weights for policy 0, policy_version 17680 (0.0005)
-[2023-07-08 16:03:03,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10444.8, 300 sec: 10469.1). Total num frames: 9084928. Throughput: 0: 10471.4. Samples: 9076800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:03:03,454][1003682] Avg episode reward: [(0, '792.506')]
-[2023-07-08 16:03:03,852][1003968] Updated weights for policy 0, policy_version 17760 (0.0005)
-[2023-07-08 16:03:07,752][1003968] Updated weights for policy 0, policy_version 17840 (0.0005)
-[2023-07-08 16:03:08,454][1003682] Fps is (10 sec: 10239.8, 60 sec: 10513.1, 300 sec: 10483.0). Total num frames: 9138176. Throughput: 0: 10511.9. Samples: 9139284. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:03:08,454][1003682] Avg episode reward: [(0, '793.927')]
-[2023-07-08 16:03:08,488][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017856_9142272.pth...
-[2023-07-08 16:03:08,490][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017232_8822784.pth
-[2023-07-08 16:03:11,674][1003968] Updated weights for policy 0, policy_version 17920 (0.0005)
-[2023-07-08 16:03:13,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10513.1, 300 sec: 10483.0). Total num frames: 9191424. Throughput: 0: 10482.1. Samples: 9170076. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:03:13,454][1003682] Avg episode reward: [(0, '797.976')]
-[2023-07-08 16:03:15,425][1003968] Updated weights for policy 0, policy_version 18000 (0.0005)
-[2023-07-08 16:03:18,453][1003682] Fps is (10 sec: 10649.7, 60 sec: 10513.1, 300 sec: 10496.9). Total num frames: 9244672. Throughput: 0: 10515.5. Samples: 9234548. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:03:18,454][1003682] Avg episode reward: [(0, '797.274')]
-[2023-07-08 16:03:19,476][1003968] Updated weights for policy 0, policy_version 18080 (0.0005)
-[2023-07-08 16:03:23,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10444.8, 300 sec: 10483.0). Total num frames: 9293824. Throughput: 0: 10467.7. Samples: 9293824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:03:23,454][1003682] Avg episode reward: [(0, '801.127')]
-[2023-07-08 16:03:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018152_9293824.pth...
-[2023-07-08 16:03:23,460][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017544_8982528.pth
-[2023-07-08 16:03:23,460][1003924] Saving new best policy, reward=801.127!
-[2023-07-08 16:03:23,637][1003968] Updated weights for policy 0, policy_version 18160 (0.0005)
-[2023-07-08 16:03:27,086][1003968] Updated weights for policy 0, policy_version 18240 (0.0005)
-[2023-07-08 16:03:28,454][1003682] Fps is (10 sec: 10649.5, 60 sec: 10513.1, 300 sec: 10496.9). Total num frames: 9351168. Throughput: 0: 10490.9. Samples: 9327644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:03:28,457][1003682] Avg episode reward: [(0, '798.538')]
-[2023-07-08 16:03:30,935][1003968] Updated weights for policy 0, policy_version 18320 (0.0005)
-[2023-07-08 16:03:33,453][1003682] Fps is (10 sec: 11059.1, 60 sec: 10513.1, 300 sec: 10496.9). Total num frames: 9404416. Throughput: 0: 10487.8. Samples: 9392192. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 16:03:33,455][1003682] Avg episode reward: [(0, '801.136')]
-[2023-07-08 16:03:33,455][1003924] Saving new best policy, reward=801.136!
-[2023-07-08 16:03:34,989][1003968] Updated weights for policy 0, policy_version 18400 (0.0005)
-[2023-07-08 16:03:38,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10513.1, 300 sec: 10483.0). Total num frames: 9453568. Throughput: 0: 10467.6. Samples: 9453568. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
-[2023-07-08 16:03:38,454][1003682] Avg episode reward: [(0, '804.419')]
-[2023-07-08 16:03:38,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018464_9453568.pth...
-[2023-07-08 16:03:38,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017856_9142272.pth
-[2023-07-08 16:03:38,459][1003924] Saving new best policy, reward=804.419!
-[2023-07-08 16:03:39,068][1003968] Updated weights for policy 0, policy_version 18480 (0.0005)
-[2023-07-08 16:03:43,098][1003968] Updated weights for policy 0, policy_version 18560 (0.0005)
-[2023-07-08 16:03:43,453][1003682] Fps is (10 sec: 9830.5, 60 sec: 10444.8, 300 sec: 10469.1). Total num frames: 9502720. Throughput: 0: 10363.5. Samples: 9482252. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:03:43,454][1003682] Avg episode reward: [(0, '799.738')]
-[2023-07-08 16:03:47,205][1003968] Updated weights for policy 0, policy_version 18640 (0.0005)
-[2023-07-08 16:03:48,453][1003682] Fps is (10 sec: 10239.9, 60 sec: 10444.8, 300 sec: 10483.0). Total num frames: 9555968. Throughput: 0: 10374.8. Samples: 9543668. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:03:48,455][1003682] Avg episode reward: [(0, '791.915')]
-[2023-07-08 16:03:51,044][1003968] Updated weights for policy 0, policy_version 18720 (0.0005)
-[2023-07-08 16:03:53,453][1003682] Fps is (10 sec: 11059.1, 60 sec: 10513.1, 300 sec: 10496.9). Total num frames: 9613312. Throughput: 0: 10462.7. Samples: 9610104. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
-[2023-07-08 16:03:53,454][1003682] Avg episode reward: [(0, '786.700')]
-[2023-07-08 16:03:53,458][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018776_9613312.pth...
-[2023-07-08 16:03:53,461][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018152_9293824.pth
-[2023-07-08 16:03:54,570][1003968] Updated weights for policy 0, policy_version 18800 (0.0005)
-[2023-07-08 16:03:58,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10444.8, 300 sec: 10483.0). Total num frames: 9662464. Throughput: 0: 10487.0. Samples: 9641992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:03:58,455][1003682] Avg episode reward: [(0, '789.382')]
-[2023-07-08 16:03:58,588][1003968] Updated weights for policy 0, policy_version 18880 (0.0005)
-[2023-07-08 16:04:02,582][1003968] Updated weights for policy 0, policy_version 18960 (0.0005)
-[2023-07-08 16:04:03,453][1003682] Fps is (10 sec: 10240.1, 60 sec: 10513.1, 300 sec: 10483.0). Total num frames: 9715712. Throughput: 0: 10423.3. Samples: 9703596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:04:03,454][1003682] Avg episode reward: [(0, '801.178')]
-[2023-07-08 16:04:06,417][1003968] Updated weights for policy 0, policy_version 19040 (0.0005)
-[2023-07-08 16:04:08,453][1003682] Fps is (10 sec: 11059.2, 60 sec: 10581.3, 300 sec: 10510.8). Total num frames: 9773056. Throughput: 0: 10577.3. Samples: 9769804. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:04:08,454][1003682] Avg episode reward: [(0, '797.193')]
-[2023-07-08 16:04:08,456][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019088_9773056.pth...
-[2023-07-08 16:04:08,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018464_9453568.pth
-[2023-07-08 16:04:09,853][1003968] Updated weights for policy 0, policy_version 19120 (0.0005)
-[2023-07-08 16:04:13,453][1003682] Fps is (10 sec: 11059.1, 60 sec: 10581.3, 300 sec: 10510.8). Total num frames: 9826304. Throughput: 0: 10562.0. Samples: 9802936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:04:13,455][1003682] Avg episode reward: [(0, '802.744')]
-[2023-07-08 16:04:13,796][1003968] Updated weights for policy 0, policy_version 19200 (0.0005)
-[2023-07-08 16:04:17,677][1003968] Updated weights for policy 0, policy_version 19280 (0.0005)
-[2023-07-08 16:04:18,453][1003682] Fps is (10 sec: 10649.6, 60 sec: 10581.3, 300 sec: 10510.8). Total num frames: 9879552. Throughput: 0: 10552.7. Samples: 9867064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:04:18,454][1003682] Avg episode reward: [(0, '802.009')]
-[2023-07-08 16:04:21,639][1003968] Updated weights for policy 0, policy_version 19360 (0.0005)
-[2023-07-08 16:04:23,454][1003682] Fps is (10 sec: 10240.0, 60 sec: 10581.3, 300 sec: 10496.9). Total num frames: 9928704. Throughput: 0: 10561.2. Samples: 9928824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:04:23,454][1003682] Avg episode reward: [(0, '802.756')]
-[2023-07-08 16:04:23,457][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019392_9928704.pth...
-[2023-07-08 16:04:23,459][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018776_9613312.pth
-[2023-07-08 16:04:25,491][1003968] Updated weights for policy 0, policy_version 19440 (0.0005)
-[2023-07-08 16:04:28,453][1003682] Fps is (10 sec: 10240.0, 60 sec: 10513.1, 300 sec: 10496.9). Total num frames: 9981952. Throughput: 0: 10632.1. Samples: 9960696. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
-[2023-07-08 16:04:28,455][1003682] Avg episode reward: [(0, '801.304')]
-[2023-07-08 16:04:29,327][1003968] Updated weights for policy 0, policy_version 19520 (0.0006)
-[2023-07-08 16:04:30,405][1003924] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
-[2023-07-08 16:04:30,406][1004058] Stopping RolloutWorker_w5...
-[2023-07-08 16:04:30,406][1003969] Stopping RolloutWorker_w2...
-[2023-07-08 16:04:30,406][1003970] Stopping RolloutWorker_w1...
-[2023-07-08 16:04:30,406][1004068] Stopping RolloutWorker_w6...
-[2023-07-08 16:04:30,406][1003973] Stopping RolloutWorker_w3...
-[2023-07-08 16:04:30,407][1003969] Loop rollout_proc2_evt_loop terminating...
-[2023-07-08 16:04:30,406][1003972] Stopping RolloutWorker_w4...
-[2023-07-08 16:04:30,406][1004100] Stopping RolloutWorker_w7...
-[2023-07-08 16:04:30,407][1003971] Stopping RolloutWorker_w0...
-[2023-07-08 16:04:30,407][1004058] Loop rollout_proc5_evt_loop terminating...
-[2023-07-08 16:04:30,406][1003682] Component RolloutWorker_w5 stopped!
-[2023-07-08 16:04:30,407][1003970] Loop rollout_proc1_evt_loop terminating...
-[2023-07-08 16:04:30,407][1004068] Loop rollout_proc6_evt_loop terminating...
-[2023-07-08 16:04:30,407][1003973] Loop rollout_proc3_evt_loop terminating...
-[2023-07-08 16:04:30,407][1003972] Loop rollout_proc4_evt_loop terminating...
-[2023-07-08 16:04:30,407][1003971] Loop rollout_proc0_evt_loop terminating...
-[2023-07-08 16:04:30,407][1004100] Loop rollout_proc7_evt_loop terminating...
-[2023-07-08 16:04:30,407][1003924] Stopping Batcher_0...
-[2023-07-08 16:04:30,407][1003682] Component RolloutWorker_w2 stopped!
-[2023-07-08 16:04:30,407][1003682] Component RolloutWorker_w1 stopped!
-[2023-07-08 16:04:30,407][1003682] Component RolloutWorker_w6 stopped!
-[2023-07-08 16:04:30,408][1003682] Component RolloutWorker_w3 stopped!
-[2023-07-08 16:04:30,407][1003924] Loop batcher_evt_loop terminating...
-[2023-07-08 16:04:30,408][1003682] Component RolloutWorker_w4 stopped!
-[2023-07-08 16:04:30,408][1003682] Component RolloutWorker_w7 stopped!
-[2023-07-08 16:04:30,408][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
-[2023-07-08 16:04:30,408][1003682] Component RolloutWorker_w0 stopped!
-[2023-07-08 16:04:30,408][1003682] Component Batcher_0 stopped!
-[2023-07-08 16:04:30,411][1003924] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019088_9773056.pth
-[2023-07-08 16:04:30,411][1003924] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
-[2023-07-08 16:04:30,414][1003924] Stopping LearnerWorker_p0...
-[2023-07-08 16:04:30,414][1003924] Loop learner_proc0_evt_loop terminating...
-[2023-07-08 16:04:30,414][1003682] Component LearnerWorker_p0 stopped!
-[2023-07-08 16:04:30,479][1003968] Weights refcount: 2 0
-[2023-07-08 16:04:30,480][1003968] Stopping InferenceWorker_p0-w0...
-[2023-07-08 16:04:30,480][1003968] Loop inference_proc0-0_evt_loop terminating...
-[2023-07-08 16:04:30,480][1003682] Component InferenceWorker_p0-w0 stopped!
-[2023-07-08 16:04:30,481][1003682] Waiting for process learner_proc0 to stop...
-[2023-07-08 16:04:31,088][1003682] Waiting for process inference_proc0-0 to join...
-[2023-07-08 16:04:31,098][1003682] Waiting for process rollout_proc0 to join...
-[2023-07-08 16:04:31,098][1003682] Waiting for process rollout_proc1 to join...
-[2023-07-08 16:04:31,098][1003682] Waiting for process rollout_proc2 to join...
-[2023-07-08 16:04:31,133][1003682] Waiting for process rollout_proc3 to join...
-[2023-07-08 16:04:31,133][1003682] Waiting for process rollout_proc4 to join...
-[2023-07-08 16:04:31,134][1003682] Waiting for process rollout_proc5 to join...
-[2023-07-08 16:04:31,134][1003682] Waiting for process rollout_proc6 to join...
-[2023-07-08 16:04:31,134][1003682] Waiting for process rollout_proc7 to join...
-[2023-07-08 16:04:31,134][1003682] Batcher 0 profile tree view:
-batching: 1.8344, releasing_batches: 1.5310
-[2023-07-08 16:04:31,134][1003682] InferenceWorker_p0-w0 profile tree view:
+[2023-07-16 21:27:37,992][239596] Worker 2 uses CPU cores [8, 9, 10, 11]
+[2023-07-16 21:27:38,055][239599] Worker 3 uses CPU cores [12, 13, 14, 15]
+[2023-07-16 21:27:38,229][239601] Worker 5 uses CPU cores [20, 21, 22, 23]
+[2023-07-16 21:27:38,251][239551] Using optimizer <class 'torch.optim.adam.Adam'>
+[2023-07-16 21:27:38,252][239551] No checkpoints found
+[2023-07-16 21:27:38,252][239551] Did not load from checkpoint, starting from scratch!
+[2023-07-16 21:27:38,252][239551] Initialized policy 0 weights for model version 0
+[2023-07-16 21:27:38,253][239551] LearnerWorker_p0 finished initialization!
+[2023-07-16 21:27:38,255][239595] RunningMeanStd input shape: (39,)
+[2023-07-16 21:27:38,255][239595] RunningMeanStd input shape: (1,)
+[2023-07-16 21:27:38,274][239597] Worker 1 uses CPU cores [4, 5, 6, 7]
+[2023-07-16 21:27:38,293][239600] Worker 0 uses CPU cores [0, 1, 2, 3]
+[2023-07-16 21:27:38,323][239306] Inference worker 0-0 is ready!
+[2023-07-16 21:27:38,324][239306] All inference workers are ready! Signal rollout workers to start!
+[2023-07-16 21:27:38,424][239664] Worker 7 uses CPU cores [28, 29, 30, 31]
+[2023-07-16 21:27:38,526][239685] Worker 6 uses CPU cores [24, 25, 26, 27]
+[2023-07-16 21:27:38,584][239598] Worker 4 uses CPU cores [16, 17, 18, 19]
+[2023-07-16 21:27:38,926][239306] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-07-16 21:27:39,654][239601] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,659][239601] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,662][239597] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,667][239597] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,682][239596] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,686][239601] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,687][239596] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,693][239600] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,694][239597] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,695][239599] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,699][239600] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,701][239599] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,713][239596] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,725][239600] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,727][239599] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,738][239601] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:39,745][239597] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:39,765][239596] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:39,777][239600] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:39,779][239599] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:39,845][239664] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,851][239664] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,877][239664] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,930][239664] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:39,945][239685] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,951][239685] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,965][239598] Decorrelating experience for 0 frames...
+[2023-07-16 21:27:39,971][239598] Decorrelating experience for 64 frames...
+[2023-07-16 21:27:39,977][239685] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:39,997][239598] Decorrelating experience for 128 frames...
+[2023-07-16 21:27:40,029][239685] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:40,049][239598] Decorrelating experience for 192 frames...
+[2023-07-16 21:27:41,068][239597] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,068][239601] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,077][239596] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,090][239600] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,091][239599] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,166][239597] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,166][239601] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,175][239596] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,188][239600] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,189][239599] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,245][239664] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,289][239601] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,289][239597] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,298][239596] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,312][239599] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,312][239600] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,343][239664] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,346][239685] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,375][239598] Decorrelating experience for 256 frames...
+[2023-07-16 21:27:41,433][239597] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,433][239601] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,441][239596] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,443][239685] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,454][239599] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,455][239600] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,465][239664] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,472][239598] Decorrelating experience for 320 frames...
+[2023-07-16 21:27:41,566][239685] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,595][239598] Decorrelating experience for 384 frames...
+[2023-07-16 21:27:41,609][239664] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,708][239685] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:41,737][239598] Decorrelating experience for 448 frames...
+[2023-07-16 21:27:43,926][239306] Fps is (10 sec: 4096.1, 60 sec: 4096.1, 300 sec: 4096.1). Total num frames: 20480. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:27:43,927][239306] Avg episode reward: [(0, '109.809')]
+[2023-07-16 21:27:45,356][239595] Updated weights for policy 0, policy_version 80 (0.0006)
+[2023-07-16 21:27:48,354][239595] Updated weights for policy 0, policy_version 160 (0.0005)
+[2023-07-16 21:27:48,926][239306] Fps is (10 sec: 8601.7, 60 sec: 8601.7, 300 sec: 8601.7). Total num frames: 86016. Throughput: 0: 6820.1. Samples: 68200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:27:48,926][239306] Avg episode reward: [(0, '364.664')]
+[2023-07-16 21:27:48,935][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000176_90112.pth...
+[2023-07-16 21:27:51,351][239595] Updated weights for policy 0, policy_version 240 (0.0005)
+[2023-07-16 21:27:53,926][239306] Fps is (10 sec: 13516.8, 60 sec: 10376.6, 300 sec: 10376.6). Total num frames: 155648. Throughput: 0: 10104.0. Samples: 151560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:27:53,927][239306] Avg episode reward: [(0, '472.756')]
+[2023-07-16 21:27:53,928][239551] Saving new best policy, reward=472.756!
+[2023-07-16 21:27:54,326][239595] Updated weights for policy 0, policy_version 320 (0.0005)
+[2023-07-16 21:27:55,927][239306] Heartbeat connected on Batcher_0
+[2023-07-16 21:27:55,934][239306] Heartbeat connected on RolloutWorker_w0
+[2023-07-16 21:27:55,936][239306] Heartbeat connected on RolloutWorker_w1
+[2023-07-16 21:27:55,938][239306] Heartbeat connected on RolloutWorker_w2
+[2023-07-16 21:27:55,940][239306] Heartbeat connected on RolloutWorker_w3
+[2023-07-16 21:27:55,942][239306] Heartbeat connected on RolloutWorker_w4
+[2023-07-16 21:27:55,944][239306] Heartbeat connected on RolloutWorker_w5
+[2023-07-16 21:27:55,945][239306] Heartbeat connected on LearnerWorker_p0
+[2023-07-16 21:27:55,945][239306] Heartbeat connected on RolloutWorker_w6
+[2023-07-16 21:27:55,947][239306] Heartbeat connected on RolloutWorker_w7
+[2023-07-16 21:27:55,948][239306] Heartbeat connected on InferenceWorker_p0-w0
+[2023-07-16 21:27:57,564][239595] Updated weights for policy 0, policy_version 400 (0.0005)
+[2023-07-16 21:27:58,926][239306] Fps is (10 sec: 13516.8, 60 sec: 11059.2, 300 sec: 11059.2). Total num frames: 221184. Throughput: 0: 9443.8. Samples: 188876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:27:58,927][239306] Avg episode reward: [(0, '555.292')]
+[2023-07-16 21:27:58,927][239551] Saving new best policy, reward=555.292!
+[2023-07-16 21:28:00,740][239595] Updated weights for policy 0, policy_version 480 (0.0004)
+[2023-07-16 21:28:03,926][239306] Fps is (10 sec: 12697.6, 60 sec: 11305.0, 300 sec: 11305.0). Total num frames: 282624. Throughput: 0: 10649.9. Samples: 266248. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:28:03,927][239306] Avg episode reward: [(0, '579.673')]
+[2023-07-16 21:28:03,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000552_282624.pth...
+[2023-07-16 21:28:03,932][239551] Saving new best policy, reward=579.673!
+[2023-07-16 21:28:04,079][239595] Updated weights for policy 0, policy_version 560 (0.0005)
+[2023-07-16 21:28:07,290][239595] Updated weights for policy 0, policy_version 640 (0.0005)
+[2023-07-16 21:28:08,926][239306] Fps is (10 sec: 12697.6, 60 sec: 11605.4, 300 sec: 11605.4). Total num frames: 348160. Throughput: 0: 11402.0. Samples: 342060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:08,927][239306] Avg episode reward: [(0, '563.012')]
+[2023-07-16 21:28:10,462][239595] Updated weights for policy 0, policy_version 720 (0.0004)
+[2023-07-16 21:28:13,669][239595] Updated weights for policy 0, policy_version 800 (0.0005)
+[2023-07-16 21:28:13,926][239306] Fps is (10 sec: 12697.7, 60 sec: 11702.9, 300 sec: 11702.9). Total num frames: 409600. Throughput: 0: 10883.7. Samples: 380928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:13,927][239306] Avg episode reward: [(0, '571.703')]
+[2023-07-16 21:28:16,850][239595] Updated weights for policy 0, policy_version 880 (0.0005)
+[2023-07-16 21:28:18,926][239306] Fps is (10 sec: 12697.5, 60 sec: 11878.4, 300 sec: 11878.4). Total num frames: 475136. Throughput: 0: 11422.6. Samples: 456904. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:28:18,927][239306] Avg episode reward: [(0, '531.140')]
+[2023-07-16 21:28:18,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000928_475136.pth...
+[2023-07-16 21:28:18,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000176_90112.pth
+[2023-07-16 21:28:20,052][239595] Updated weights for policy 0, policy_version 960 (0.0005)
+[2023-07-16 21:28:23,196][239595] Updated weights for policy 0, policy_version 1040 (0.0004)
+[2023-07-16 21:28:23,926][239306] Fps is (10 sec: 13107.2, 60 sec: 12015.0, 300 sec: 12015.0). Total num frames: 540672. Throughput: 0: 11881.9. Samples: 534684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:23,927][239306] Avg episode reward: [(0, '581.222')]
+[2023-07-16 21:28:23,927][239551] Saving new best policy, reward=581.222!
+[2023-07-16 21:28:26,353][239595] Updated weights for policy 0, policy_version 1120 (0.0005)
+[2023-07-16 21:28:28,926][239306] Fps is (10 sec: 13107.2, 60 sec: 12124.2, 300 sec: 12124.2). Total num frames: 606208. Throughput: 0: 12744.5. Samples: 573504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:28,927][239306] Avg episode reward: [(0, '581.975')]
+[2023-07-16 21:28:28,928][239551] Saving new best policy, reward=581.975!
+[2023-07-16 21:28:29,497][239595] Updated weights for policy 0, policy_version 1200 (0.0005)
+[2023-07-16 21:28:32,646][239595] Updated weights for policy 0, policy_version 1280 (0.0005)
+[2023-07-16 21:28:33,926][239306] Fps is (10 sec: 13106.9, 60 sec: 12213.5, 300 sec: 12213.5). Total num frames: 671744. Throughput: 0: 12959.1. Samples: 651364. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:28:33,927][239306] Avg episode reward: [(0, '570.302')]
+[2023-07-16 21:28:33,931][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001312_671744.pth...
+[2023-07-16 21:28:33,934][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000552_282624.pth
+[2023-07-16 21:28:35,860][239595] Updated weights for policy 0, policy_version 1360 (0.0005)
+[2023-07-16 21:28:38,926][239306] Fps is (10 sec: 12697.7, 60 sec: 12219.8, 300 sec: 12219.8). Total num frames: 733184. Throughput: 0: 12833.8. Samples: 729080. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:38,926][239306] Avg episode reward: [(0, '595.051')]
+[2023-07-16 21:28:38,948][239551] Saving new best policy, reward=595.051!
+[2023-07-16 21:28:38,948][239595] Updated weights for policy 0, policy_version 1440 (0.0005)
+[2023-07-16 21:28:42,035][239595] Updated weights for policy 0, policy_version 1520 (0.0005)
+[2023-07-16 21:28:43,926][239306] Fps is (10 sec: 12697.8, 60 sec: 12970.7, 300 sec: 12288.0). Total num frames: 798720. Throughput: 0: 12915.2. Samples: 770060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:43,927][239306] Avg episode reward: [(0, '595.025')]
+[2023-07-16 21:28:45,244][239595] Updated weights for policy 0, policy_version 1600 (0.0005)
+[2023-07-16 21:28:48,451][239595] Updated weights for policy 0, policy_version 1680 (0.0005)
+[2023-07-16 21:28:48,926][239306] Fps is (10 sec: 13107.0, 60 sec: 12970.6, 300 sec: 12346.5). Total num frames: 864256. Throughput: 0: 12895.1. Samples: 846528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:48,927][239306] Avg episode reward: [(0, '582.331')]
+[2023-07-16 21:28:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001688_864256.pth...
+[2023-07-16 21:28:48,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000000928_475136.pth
+[2023-07-16 21:28:51,643][239595] Updated weights for policy 0, policy_version 1760 (0.0005)
+[2023-07-16 21:28:53,926][239306] Fps is (10 sec: 13107.2, 60 sec: 12902.4, 300 sec: 12397.2). Total num frames: 929792. Throughput: 0: 12902.1. Samples: 922656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:28:53,927][239306] Avg episode reward: [(0, '569.627')]
+[2023-07-16 21:28:54,860][239595] Updated weights for policy 0, policy_version 1840 (0.0005)
+[2023-07-16 21:28:58,117][239595] Updated weights for policy 0, policy_version 1920 (0.0005)
+[2023-07-16 21:28:58,926][239306] Fps is (10 sec: 12697.7, 60 sec: 12834.1, 300 sec: 12390.4). Total num frames: 991232. Throughput: 0: 12897.1. Samples: 961300. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:28:58,927][239306] Avg episode reward: [(0, '620.627')]
+[2023-07-16 21:28:58,927][239551] Saving new best policy, reward=620.627!
+[2023-07-16 21:29:01,215][239595] Updated weights for policy 0, policy_version 2000 (0.0004)
+[2023-07-16 21:29:03,926][239306] Fps is (10 sec: 12697.5, 60 sec: 12902.4, 300 sec: 12432.6). Total num frames: 1056768. Throughput: 0: 12944.4. Samples: 1039404. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:29:03,927][239306] Avg episode reward: [(0, '638.021')]
+[2023-07-16 21:29:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002064_1056768.pth...
+[2023-07-16 21:29:03,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001312_671744.pth
+[2023-07-16 21:29:03,933][239551] Saving new best policy, reward=638.021!
+[2023-07-16 21:29:04,373][239595] Updated weights for policy 0, policy_version 2080 (0.0005)
+[2023-07-16 21:29:07,541][239595] Updated weights for policy 0, policy_version 2160 (0.0003)
+[2023-07-16 21:29:08,926][239306] Fps is (10 sec: 13107.2, 60 sec: 12902.4, 300 sec: 12470.0). Total num frames: 1122304. Throughput: 0: 12963.2. Samples: 1118028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:08,927][239306] Avg episode reward: [(0, '636.723')]
+[2023-07-16 21:29:10,592][239595] Updated weights for policy 0, policy_version 2240 (0.0003)
+[2023-07-16 21:29:13,665][239595] Updated weights for policy 0, policy_version 2320 (0.0004)
+[2023-07-16 21:29:13,926][239306] Fps is (10 sec: 13107.4, 60 sec: 12970.7, 300 sec: 12503.6). Total num frames: 1187840. Throughput: 0: 12970.9. Samples: 1157192. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:13,927][239306] Avg episode reward: [(0, '676.858')]
+[2023-07-16 21:29:13,966][239551] Saving new best policy, reward=676.858!
+[2023-07-16 21:29:16,737][239595] Updated weights for policy 0, policy_version 2400 (0.0004)
+[2023-07-16 21:29:18,926][239306] Fps is (10 sec: 13107.2, 60 sec: 12970.7, 300 sec: 12533.8). Total num frames: 1253376. Throughput: 0: 13015.4. Samples: 1237056. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-16 21:29:18,927][239306] Avg episode reward: [(0, '693.208')]
+[2023-07-16 21:29:18,939][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002456_1257472.pth...
+[2023-07-16 21:29:18,942][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000001688_864256.pth
+[2023-07-16 21:29:18,942][239551] Saving new best policy, reward=693.208!
+[2023-07-16 21:29:19,834][239595] Updated weights for policy 0, policy_version 2480 (0.0005)
+[2023-07-16 21:29:23,013][239595] Updated weights for policy 0, policy_version 2560 (0.0005)
+[2023-07-16 21:29:23,926][239306] Fps is (10 sec: 13107.1, 60 sec: 12970.7, 300 sec: 12561.1). Total num frames: 1318912. Throughput: 0: 13021.8. Samples: 1315060. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-16 21:29:23,927][239306] Avg episode reward: [(0, '670.428')]
+[2023-07-16 21:29:26,093][239595] Updated weights for policy 0, policy_version 2640 (0.0004)
+[2023-07-16 21:29:28,926][239306] Fps is (10 sec: 13516.9, 60 sec: 13038.9, 300 sec: 12623.1). Total num frames: 1388544. Throughput: 0: 13015.9. Samples: 1355776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:28,927][239306] Avg episode reward: [(0, '685.683')]
+[2023-07-16 21:29:29,203][239595] Updated weights for policy 0, policy_version 2720 (0.0005)
+[2023-07-16 21:29:32,340][239595] Updated weights for policy 0, policy_version 2800 (0.0005)
+[2023-07-16 21:29:33,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13039.0, 300 sec: 12644.2). Total num frames: 1454080. Throughput: 0: 13047.5. Samples: 1433664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:33,927][239306] Avg episode reward: [(0, '704.133')]
+[2023-07-16 21:29:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002840_1454080.pth...
+[2023-07-16 21:29:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002064_1056768.pth
+[2023-07-16 21:29:33,933][239551] Saving new best policy, reward=704.133!
+[2023-07-16 21:29:35,387][239595] Updated weights for policy 0, policy_version 2880 (0.0005)
+[2023-07-16 21:29:38,299][239595] Updated weights for policy 0, policy_version 2960 (0.0004)
+[2023-07-16 21:29:38,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13175.5, 300 sec: 12697.6). Total num frames: 1523712. Throughput: 0: 13187.0. Samples: 1516072. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:38,927][239306] Avg episode reward: [(0, '719.853')]
+[2023-07-16 21:29:38,927][239551] Saving new best policy, reward=719.853!
+[2023-07-16 21:29:41,061][239595] Updated weights for policy 0, policy_version 3040 (0.0004)
+[2023-07-16 21:29:43,926][239306] Fps is (10 sec: 13926.4, 60 sec: 13243.7, 300 sec: 12746.8). Total num frames: 1593344. Throughput: 0: 13318.9. Samples: 1560648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:43,927][239306] Avg episode reward: [(0, '714.485')]
+[2023-07-16 21:29:44,141][239595] Updated weights for policy 0, policy_version 3120 (0.0005)
+[2023-07-16 21:29:47,225][239595] Updated weights for policy 0, policy_version 3200 (0.0005)
+[2023-07-16 21:29:48,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13243.8, 300 sec: 12760.6). Total num frames: 1658880. Throughput: 0: 13350.5. Samples: 1640176. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:29:48,927][239306] Avg episode reward: [(0, '699.427')]
+[2023-07-16 21:29:48,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003240_1658880.pth...
+[2023-07-16 21:29:48,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002456_1257472.pth
+[2023-07-16 21:29:50,265][239595] Updated weights for policy 0, policy_version 3280 (0.0004)
+[2023-07-16 21:29:53,377][239595] Updated weights for policy 0, policy_version 3360 (0.0005)
+[2023-07-16 21:29:53,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 12773.5). Total num frames: 1724416. Throughput: 0: 13384.5. Samples: 1720328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:29:53,927][239306] Avg episode reward: [(0, '691.970')]
+[2023-07-16 21:29:56,481][239595] Updated weights for policy 0, policy_version 3440 (0.0005)
+[2023-07-16 21:29:58,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13312.0, 300 sec: 12785.4). Total num frames: 1789952. Throughput: 0: 13398.0. Samples: 1760104. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:29:58,927][239306] Avg episode reward: [(0, '708.069')]
+[2023-07-16 21:29:59,550][239595] Updated weights for policy 0, policy_version 3520 (0.0005)
+[2023-07-16 21:30:02,631][239595] Updated weights for policy 0, policy_version 3600 (0.0005)
+[2023-07-16 21:30:03,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13312.0, 300 sec: 12796.5). Total num frames: 1855488. Throughput: 0: 13382.1. Samples: 1839248. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:30:03,927][239306] Avg episode reward: [(0, '715.594')]
+[2023-07-16 21:30:03,934][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003632_1859584.pth...
+[2023-07-16 21:30:03,937][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000002840_1454080.pth
+[2023-07-16 21:30:05,787][239595] Updated weights for policy 0, policy_version 3680 (0.0005)
+[2023-07-16 21:30:08,926][239306] Fps is (10 sec: 13107.4, 60 sec: 13312.0, 300 sec: 12806.8). Total num frames: 1921024. Throughput: 0: 13376.3. Samples: 1916992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:30:08,926][239306] Avg episode reward: [(0, '728.486')]
+[2023-07-16 21:30:08,927][239551] Saving new best policy, reward=728.486!
+[2023-07-16 21:30:08,990][239595] Updated weights for policy 0, policy_version 3760 (0.0005)
+[2023-07-16 21:30:12,120][239595] Updated weights for policy 0, policy_version 3840 (0.0005)
+[2023-07-16 21:30:13,926][239306] Fps is (10 sec: 13107.3, 60 sec: 13312.0, 300 sec: 12816.5). Total num frames: 1986560. Throughput: 0: 13358.8. Samples: 1956920. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:30:13,926][239306] Avg episode reward: [(0, '719.988')]
+[2023-07-16 21:30:15,293][239595] Updated weights for policy 0, policy_version 3920 (0.0005)
+[2023-07-16 21:30:18,405][239595] Updated weights for policy 0, policy_version 4000 (0.0005)
+[2023-07-16 21:30:18,926][239306] Fps is (10 sec: 13107.0, 60 sec: 13312.0, 300 sec: 12825.6). Total num frames: 2052096. Throughput: 0: 13345.2. Samples: 2034200. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:30:18,927][239306] Avg episode reward: [(0, '728.642')]
+[2023-07-16 21:30:18,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004008_2052096.pth...
+[2023-07-16 21:30:18,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003240_1658880.pth
+[2023-07-16 21:30:18,933][239551] Saving new best policy, reward=728.642!
+[2023-07-16 21:30:21,487][239595] Updated weights for policy 0, policy_version 4080 (0.0005)
+[2023-07-16 21:30:23,926][239306] Fps is (10 sec: 13107.3, 60 sec: 13312.0, 300 sec: 12834.1). Total num frames: 2117632. Throughput: 0: 13280.0. Samples: 2113672. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:30:23,926][239306] Avg episode reward: [(0, '724.484')]
+[2023-07-16 21:30:24,582][239595] Updated weights for policy 0, policy_version 4160 (0.0005)
+[2023-07-16 21:30:27,667][239595] Updated weights for policy 0, policy_version 4240 (0.0005)
+[2023-07-16 21:30:28,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13312.0, 300 sec: 12866.3). Total num frames: 2187264. Throughput: 0: 13184.2. Samples: 2153936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:30:28,927][239306] Avg episode reward: [(0, '734.731')]
+[2023-07-16 21:30:28,927][239551] Saving new best policy, reward=734.731!
+[2023-07-16 21:30:30,712][239595] Updated weights for policy 0, policy_version 4320 (0.0005)
+[2023-07-16 21:30:33,818][239595] Updated weights for policy 0, policy_version 4400 (0.0005)
+[2023-07-16 21:30:33,926][239306] Fps is (10 sec: 13516.6, 60 sec: 13312.0, 300 sec: 12873.1). Total num frames: 2252800. Throughput: 0: 13180.4. Samples: 2233296. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-16 21:30:33,927][239306] Avg episode reward: [(0, '732.892')]
+[2023-07-16 21:30:33,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004400_2252800.pth...
+[2023-07-16 21:30:33,931][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000003632_1859584.pth
+[2023-07-16 21:30:36,991][239595] Updated weights for policy 0, policy_version 4480 (0.0005)
+[2023-07-16 21:30:38,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 12879.6). Total num frames: 2318336. Throughput: 0: 13130.6. Samples: 2311204. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:30:38,927][239306] Avg episode reward: [(0, '723.808')]
+[2023-07-16 21:30:40,207][239595] Updated weights for policy 0, policy_version 4560 (0.0005)
+[2023-07-16 21:30:43,460][239595] Updated weights for policy 0, policy_version 4640 (0.0005)
+[2023-07-16 21:30:43,926][239306] Fps is (10 sec: 12697.7, 60 sec: 13107.2, 300 sec: 12863.7). Total num frames: 2379776. Throughput: 0: 13098.1. Samples: 2349520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:30:43,927][239306] Avg episode reward: [(0, '666.314')]
+[2023-07-16 21:30:46,659][239595] Updated weights for policy 0, policy_version 4720 (0.0005)
+[2023-07-16 21:30:48,926][239306] Fps is (10 sec: 12697.6, 60 sec: 13107.2, 300 sec: 12870.1). Total num frames: 2445312. Throughput: 0: 13030.3. Samples: 2425612. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:30:48,926][239306] Avg episode reward: [(0, '726.342')]
+[2023-07-16 21:30:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004776_2445312.pth...
+[2023-07-16 21:30:48,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004008_2052096.pth
+[2023-07-16 21:30:49,818][239595] Updated weights for policy 0, policy_version 4800 (0.0005)
+[2023-07-16 21:30:52,949][239595] Updated weights for policy 0, policy_version 4880 (0.0005)
+[2023-07-16 21:30:53,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12876.1). Total num frames: 2510848. Throughput: 0: 13044.7. Samples: 2504004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:30:53,926][239306] Avg episode reward: [(0, '713.903')]
+[2023-07-16 21:30:55,999][239595] Updated weights for policy 0, policy_version 4960 (0.0005)
+[2023-07-16 21:30:58,926][239306] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12881.9). Total num frames: 2576384. Throughput: 0: 13040.7. Samples: 2543752. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:30:58,926][239306] Avg episode reward: [(0, '720.515')]
+[2023-07-16 21:30:59,200][239595] Updated weights for policy 0, policy_version 5040 (0.0005)
+[2023-07-16 21:31:02,232][239595] Updated weights for policy 0, policy_version 5120 (0.0005)
+[2023-07-16 21:31:03,926][239306] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12887.4). Total num frames: 2641920. Throughput: 0: 13085.5. Samples: 2623048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:31:03,927][239306] Avg episode reward: [(0, '707.474')]
+[2023-07-16 21:31:03,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005160_2641920.pth...
+[2023-07-16 21:31:03,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004400_2252800.pth
+[2023-07-16 21:31:05,335][239595] Updated weights for policy 0, policy_version 5200 (0.0005)
+[2023-07-16 21:31:08,421][239595] Updated weights for policy 0, policy_version 5280 (0.0005)
+[2023-07-16 21:31:08,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12892.7). Total num frames: 2707456. Throughput: 0: 13093.9. Samples: 2702900. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:31:08,927][239306] Avg episode reward: [(0, '704.828')]
+[2023-07-16 21:31:11,512][239595] Updated weights for policy 0, policy_version 5360 (0.0005)
+[2023-07-16 21:31:13,926][239306] Fps is (10 sec: 13516.9, 60 sec: 13175.5, 300 sec: 12916.7). Total num frames: 2777088. Throughput: 0: 13070.1. Samples: 2742088. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:31:13,927][239306] Avg episode reward: [(0, '707.757')]
+[2023-07-16 21:31:14,510][239595] Updated weights for policy 0, policy_version 5440 (0.0005)
+[2023-07-16 21:31:17,502][239595] Updated weights for policy 0, policy_version 5520 (0.0004)
+[2023-07-16 21:31:18,926][239306] Fps is (10 sec: 13926.4, 60 sec: 13243.7, 300 sec: 12939.6). Total num frames: 2846720. Throughput: 0: 13130.2. Samples: 2824152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:31:18,927][239306] Avg episode reward: [(0, '722.682')]
+[2023-07-16 21:31:18,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005560_2846720.pth...
+[2023-07-16 21:31:18,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000004776_2445312.pth
+[2023-07-16 21:31:20,422][239595] Updated weights for policy 0, policy_version 5600 (0.0005)
+[2023-07-16 21:31:23,545][239595] Updated weights for policy 0, policy_version 5680 (0.0005)
+[2023-07-16 21:31:23,926][239306] Fps is (10 sec: 13516.7, 60 sec: 13243.7, 300 sec: 12943.4). Total num frames: 2912256. Throughput: 0: 13206.2. Samples: 2905484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:31:23,926][239306] Avg episode reward: [(0, '733.660')]
+[2023-07-16 21:31:26,691][239595] Updated weights for policy 0, policy_version 5760 (0.0005)
+[2023-07-16 21:31:28,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13175.5, 300 sec: 12946.9). Total num frames: 2977792. Throughput: 0: 13233.6. Samples: 2945032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:31:28,926][239306] Avg episode reward: [(0, '714.251')]
+[2023-07-16 21:31:29,692][239595] Updated weights for policy 0, policy_version 5840 (0.0005)
+[2023-07-16 21:31:32,789][239595] Updated weights for policy 0, policy_version 5920 (0.0005)
+[2023-07-16 21:31:33,926][239306] Fps is (10 sec: 13516.7, 60 sec: 13243.7, 300 sec: 12967.8). Total num frames: 3047424. Throughput: 0: 13319.5. Samples: 3024988. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:31:33,927][239306] Avg episode reward: [(0, '735.942')]
+[2023-07-16 21:31:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005952_3047424.pth...
+[2023-07-16 21:31:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005160_2641920.pth
+[2023-07-16 21:31:33,933][239551] Saving new best policy, reward=735.942!
+[2023-07-16 21:31:35,554][239595] Updated weights for policy 0, policy_version 6000 (0.0004)
+[2023-07-16 21:31:38,420][239595] Updated weights for policy 0, policy_version 6080 (0.0004)
+[2023-07-16 21:31:38,926][239306] Fps is (10 sec: 13926.4, 60 sec: 13312.0, 300 sec: 12987.7). Total num frames: 3117056. Throughput: 0: 13529.3. Samples: 3112824. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:31:38,926][239306] Avg episode reward: [(0, '723.079')]
+[2023-07-16 21:31:41,300][239595] Updated weights for policy 0, policy_version 6160 (0.0005)
+[2023-07-16 21:31:43,926][239306] Fps is (10 sec: 14336.1, 60 sec: 13516.8, 300 sec: 13023.6). Total num frames: 3190784. Throughput: 0: 13569.5. Samples: 3154380. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:31:43,927][239306] Avg episode reward: [(0, '733.967')]
+[2023-07-16 21:31:44,086][239595] Updated weights for policy 0, policy_version 6240 (0.0004)
+[2023-07-16 21:31:46,846][239595] Updated weights for policy 0, policy_version 6320 (0.0004)
+[2023-07-16 21:31:48,926][239306] Fps is (10 sec: 14745.5, 60 sec: 13653.3, 300 sec: 13058.0). Total num frames: 3264512. Throughput: 0: 13793.9. Samples: 3243772. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:31:48,927][239306] Avg episode reward: [(0, '740.616')]
+[2023-07-16 21:31:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006376_3264512.pth...
+[2023-07-16 21:31:48,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005560_2846720.pth
+[2023-07-16 21:31:48,933][239551] Saving new best policy, reward=740.616!
+[2023-07-16 21:31:49,669][239595] Updated weights for policy 0, policy_version 6400 (0.0005)
+[2023-07-16 21:31:52,630][239595] Updated weights for policy 0, policy_version 6480 (0.0005)
+[2023-07-16 21:31:53,926][239306] Fps is (10 sec: 14336.0, 60 sec: 13721.6, 300 sec: 13075.1). Total num frames: 3334144. Throughput: 0: 13903.1. Samples: 3328540. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:31:53,926][239306] Avg episode reward: [(0, '739.432')]
+[2023-07-16 21:31:55,454][239595] Updated weights for policy 0, policy_version 6560 (0.0004)
+[2023-07-16 21:31:58,205][239595] Updated weights for policy 0, policy_version 6640 (0.0004)
+[2023-07-16 21:31:58,926][239306] Fps is (10 sec: 14336.1, 60 sec: 13858.1, 300 sec: 13107.2). Total num frames: 3407872. Throughput: 0: 14001.5. Samples: 3372156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:31:58,927][239306] Avg episode reward: [(0, '743.088')]
+[2023-07-16 21:31:58,927][239551] Saving new best policy, reward=743.088!
+[2023-07-16 21:32:01,143][239595] Updated weights for policy 0, policy_version 6720 (0.0005)
+[2023-07-16 21:32:03,926][239306] Fps is (10 sec: 14335.9, 60 sec: 13926.4, 300 sec: 13122.7). Total num frames: 3477504. Throughput: 0: 14066.2. Samples: 3457132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:03,926][239306] Avg episode reward: [(0, '744.817')]
+[2023-07-16 21:32:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006792_3477504.pth...
+[2023-07-16 21:32:03,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000005952_3047424.pth
+[2023-07-16 21:32:03,933][239551] Saving new best policy, reward=744.817!
+[2023-07-16 21:32:04,013][239595] Updated weights for policy 0, policy_version 6800 (0.0005)
+[2023-07-16 21:32:06,826][239595] Updated weights for policy 0, policy_version 6880 (0.0005)
+[2023-07-16 21:32:08,926][239306] Fps is (10 sec: 14336.0, 60 sec: 14062.9, 300 sec: 13152.7). Total num frames: 3551232. Throughput: 0: 14223.9. Samples: 3545560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:08,926][239306] Avg episode reward: [(0, '745.165')]
+[2023-07-16 21:32:08,927][239551] Saving new best policy, reward=745.165!
+[2023-07-16 21:32:09,657][239595] Updated weights for policy 0, policy_version 6960 (0.0005)
+[2023-07-16 21:32:12,580][239595] Updated weights for policy 0, policy_version 7040 (0.0005)
+[2023-07-16 21:32:13,926][239306] Fps is (10 sec: 14336.1, 60 sec: 14062.9, 300 sec: 13166.8). Total num frames: 3620864. Throughput: 0: 14291.9. Samples: 3588168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:13,927][239306] Avg episode reward: [(0, '727.608')]
+[2023-07-16 21:32:15,600][239595] Updated weights for policy 0, policy_version 7120 (0.0005)
+[2023-07-16 21:32:18,597][239595] Updated weights for policy 0, policy_version 7200 (0.0005)
+[2023-07-16 21:32:18,926][239306] Fps is (10 sec: 13926.4, 60 sec: 14062.9, 300 sec: 13180.3). Total num frames: 3690496. Throughput: 0: 14330.5. Samples: 3669860. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:18,927][239306] Avg episode reward: [(0, '743.462')]
+[2023-07-16 21:32:18,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007208_3690496.pth...
+[2023-07-16 21:32:18,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006376_3264512.pth
+[2023-07-16 21:32:21,583][239595] Updated weights for policy 0, policy_version 7280 (0.0005)
+[2023-07-16 21:32:23,926][239306] Fps is (10 sec: 13926.4, 60 sec: 14131.2, 300 sec: 13193.4). Total num frames: 3760128. Throughput: 0: 14210.0. Samples: 3752272. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:32:23,927][239306] Avg episode reward: [(0, '725.112')]
+[2023-07-16 21:32:24,414][239595] Updated weights for policy 0, policy_version 7360 (0.0004)
+[2023-07-16 21:32:27,169][239595] Updated weights for policy 0, policy_version 7440 (0.0004)
+[2023-07-16 21:32:28,926][239306] Fps is (10 sec: 14336.0, 60 sec: 14267.7, 300 sec: 13220.2). Total num frames: 3833856. Throughput: 0: 14281.7. Samples: 3797056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:28,927][239306] Avg episode reward: [(0, '750.134')]
+[2023-07-16 21:32:28,927][239551] Saving new best policy, reward=750.134!
+[2023-07-16 21:32:29,975][239595] Updated weights for policy 0, policy_version 7520 (0.0004)
+[2023-07-16 21:32:32,743][239595] Updated weights for policy 0, policy_version 7600 (0.0004)
+[2023-07-16 21:32:33,926][239306] Fps is (10 sec: 14745.5, 60 sec: 14336.0, 300 sec: 13246.0). Total num frames: 3907584. Throughput: 0: 14257.3. Samples: 3885352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:33,927][239306] Avg episode reward: [(0, '762.906')]
+[2023-07-16 21:32:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007632_3907584.pth...
+[2023-07-16 21:32:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000006792_3477504.pth
+[2023-07-16 21:32:33,933][239551] Saving new best policy, reward=762.906!
+[2023-07-16 21:32:35,445][239595] Updated weights for policy 0, policy_version 7680 (0.0004)
+[2023-07-16 21:32:38,220][239595] Updated weights for policy 0, policy_version 7760 (0.0004)
+[2023-07-16 21:32:38,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14404.3, 300 sec: 13426.5). Total num frames: 3981312. Throughput: 0: 14365.2. Samples: 3974976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:38,927][239306] Avg episode reward: [(0, '759.339')]
+[2023-07-16 21:32:40,974][239595] Updated weights for policy 0, policy_version 7840 (0.0004)
+[2023-07-16 21:32:43,708][239595] Updated weights for policy 0, policy_version 7920 (0.0004)
+[2023-07-16 21:32:43,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14404.3, 300 sec: 13454.3). Total num frames: 4055040. Throughput: 0: 14391.6. Samples: 4019776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:43,927][239306] Avg episode reward: [(0, '761.902')]
+[2023-07-16 21:32:46,426][239595] Updated weights for policy 0, policy_version 8000 (0.0004)
+[2023-07-16 21:32:48,926][239306] Fps is (10 sec: 15155.1, 60 sec: 14472.5, 300 sec: 13482.1). Total num frames: 4132864. Throughput: 0: 14505.5. Samples: 4109880. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:32:48,937][239306] Avg episode reward: [(0, '770.249')]
+[2023-07-16 21:32:48,940][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008072_4132864.pth...
+[2023-07-16 21:32:48,943][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007208_3690496.pth
+[2023-07-16 21:32:48,944][239551] Saving new best policy, reward=770.249!
+[2023-07-16 21:32:49,120][239595] Updated weights for policy 0, policy_version 8080 (0.0004)
+[2023-07-16 21:32:51,830][239595] Updated weights for policy 0, policy_version 8160 (0.0004)
+[2023-07-16 21:32:53,926][239306] Fps is (10 sec: 15155.2, 60 sec: 14540.8, 300 sec: 13509.9). Total num frames: 4206592. Throughput: 0: 14575.7. Samples: 4201468. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:32:53,927][239306] Avg episode reward: [(0, '763.837')]
+[2023-07-16 21:32:54,541][239595] Updated weights for policy 0, policy_version 8240 (0.0004)
+[2023-07-16 21:32:57,288][239595] Updated weights for policy 0, policy_version 8320 (0.0004)
+[2023-07-16 21:32:58,926][239306] Fps is (10 sec: 14745.7, 60 sec: 14540.8, 300 sec: 13551.5). Total num frames: 4280320. Throughput: 0: 14618.2. Samples: 4245988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:32:58,929][239306] Avg episode reward: [(0, '765.765')]
+[2023-07-16 21:33:00,049][239595] Updated weights for policy 0, policy_version 8400 (0.0004)
+[2023-07-16 21:33:02,849][239595] Updated weights for policy 0, policy_version 8480 (0.0004)
+[2023-07-16 21:33:03,926][239306] Fps is (10 sec: 15155.2, 60 sec: 14677.3, 300 sec: 13593.2). Total num frames: 4358144. Throughput: 0: 14769.2. Samples: 4334476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:33:03,927][239306] Avg episode reward: [(0, '758.897')]
+[2023-07-16 21:33:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008512_4358144.pth...
+[2023-07-16 21:33:03,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000007632_3907584.pth
+[2023-07-16 21:33:05,618][239595] Updated weights for policy 0, policy_version 8560 (0.0004)
+[2023-07-16 21:33:08,653][239595] Updated weights for policy 0, policy_version 8640 (0.0005)
+[2023-07-16 21:33:08,926][239306] Fps is (10 sec: 14336.0, 60 sec: 14540.8, 300 sec: 13607.0). Total num frames: 4423680. Throughput: 0: 14830.6. Samples: 4419648. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:33:08,927][239306] Avg episode reward: [(0, '756.808')]
+[2023-07-16 21:33:11,499][239595] Updated weights for policy 0, policy_version 8720 (0.0004)
+[2023-07-16 21:33:13,926][239306] Fps is (10 sec: 13926.4, 60 sec: 14609.1, 300 sec: 13634.8). Total num frames: 4497408. Throughput: 0: 14779.3. Samples: 4462124. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:33:13,927][239306] Avg episode reward: [(0, '742.418')]
+[2023-07-16 21:33:14,284][239595] Updated weights for policy 0, policy_version 8800 (0.0004)
+[2023-07-16 21:33:17,035][239595] Updated weights for policy 0, policy_version 8880 (0.0004)
+[2023-07-16 21:33:18,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14677.3, 300 sec: 13662.6). Total num frames: 4571136. Throughput: 0: 14796.7. Samples: 4551204. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:33:18,927][239306] Avg episode reward: [(0, '761.355')]
+[2023-07-16 21:33:18,942][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008936_4575232.pth...
+[2023-07-16 21:33:18,944][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008072_4132864.pth
+[2023-07-16 21:33:19,758][239595] Updated weights for policy 0, policy_version 8960 (0.0004)
+[2023-07-16 21:33:22,462][239595] Updated weights for policy 0, policy_version 9040 (0.0004)
+[2023-07-16 21:33:23,926][239306] Fps is (10 sec: 15155.3, 60 sec: 14813.9, 300 sec: 13704.2). Total num frames: 4648960. Throughput: 0: 14816.8. Samples: 4641732. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:33:23,927][239306] Avg episode reward: [(0, '746.469')]
+[2023-07-16 21:33:25,158][239595] Updated weights for policy 0, policy_version 9120 (0.0004)
+[2023-07-16 21:33:28,217][239595] Updated weights for policy 0, policy_version 9200 (0.0005)
+[2023-07-16 21:33:28,926][239306] Fps is (10 sec: 14745.7, 60 sec: 14745.6, 300 sec: 13718.1). Total num frames: 4718592. Throughput: 0: 14802.7. Samples: 4685896. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:33:28,927][239306] Avg episode reward: [(0, '745.975')]
+[2023-07-16 21:33:31,330][239595] Updated weights for policy 0, policy_version 9280 (0.0005)
+[2023-07-16 21:33:33,926][239306] Fps is (10 sec: 13516.7, 60 sec: 14609.1, 300 sec: 13732.0). Total num frames: 4784128. Throughput: 0: 14545.6. Samples: 4764432. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-16 21:33:33,926][239306] Avg episode reward: [(0, '751.037')]
+[2023-07-16 21:33:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009344_4784128.pth...
+[2023-07-16 21:33:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008512_4358144.pth
+[2023-07-16 21:33:34,421][239595] Updated weights for policy 0, policy_version 9360 (0.0005)
+[2023-07-16 21:33:37,450][239595] Updated weights for policy 0, policy_version 9440 (0.0005)
+[2023-07-16 21:33:38,926][239306] Fps is (10 sec: 13107.2, 60 sec: 14472.5, 300 sec: 13732.0). Total num frames: 4849664. Throughput: 0: 14313.3. Samples: 4845568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:33:38,926][239306] Avg episode reward: [(0, '731.105')]
+[2023-07-16 21:33:40,497][239595] Updated weights for policy 0, policy_version 9520 (0.0005)
+[2023-07-16 21:33:43,514][239595] Updated weights for policy 0, policy_version 9600 (0.0005)
+[2023-07-16 21:33:43,926][239306] Fps is (10 sec: 13516.8, 60 sec: 14404.3, 300 sec: 13745.9). Total num frames: 4919296. Throughput: 0: 14234.2. Samples: 4886528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:33:43,926][239306] Avg episode reward: [(0, '763.108')]
+[2023-07-16 21:33:46,587][239595] Updated weights for policy 0, policy_version 9680 (0.0005)
+[2023-07-16 21:33:48,926][239306] Fps is (10 sec: 13516.7, 60 sec: 14199.5, 300 sec: 13745.9). Total num frames: 4984832. Throughput: 0: 14042.9. Samples: 4966408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:33:48,927][239306] Avg episode reward: [(0, '753.065')]
+[2023-07-16 21:33:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009736_4984832.pth...
+[2023-07-16 21:33:48,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000008936_4575232.pth
+[2023-07-16 21:33:49,607][239595] Updated weights for policy 0, policy_version 9760 (0.0005)
+[2023-07-16 21:33:52,568][239595] Updated weights for policy 0, policy_version 9840 (0.0005)
+[2023-07-16 21:33:53,926][239306] Fps is (10 sec: 13516.9, 60 sec: 14131.2, 300 sec: 13773.7). Total num frames: 5054464. Throughput: 0: 13982.4. Samples: 5048856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:33:53,926][239306] Avg episode reward: [(0, '723.033')]
+[2023-07-16 21:33:55,683][239595] Updated weights for policy 0, policy_version 9920 (0.0005)
+[2023-07-16 21:33:58,645][239595] Updated weights for policy 0, policy_version 10000 (0.0005)
+[2023-07-16 21:33:58,926][239306] Fps is (10 sec: 13926.4, 60 sec: 14062.9, 300 sec: 13787.6). Total num frames: 5124096. Throughput: 0: 13900.1. Samples: 5087628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:33:58,927][239306] Avg episode reward: [(0, '742.262')]
+[2023-07-16 21:34:01,323][239595] Updated weights for policy 0, policy_version 10080 (0.0003)
+[2023-07-16 21:34:03,926][239306] Fps is (10 sec: 14335.9, 60 sec: 13994.7, 300 sec: 13815.3). Total num frames: 5197824. Throughput: 0: 13899.1. Samples: 5176664. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:34:03,927][239306] Avg episode reward: [(0, '753.575')]
+[2023-07-16 21:34:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010152_5197824.pth...
+[2023-07-16 21:34:03,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009344_4784128.pth
+[2023-07-16 21:34:04,060][239595] Updated weights for policy 0, policy_version 10160 (0.0004)
+[2023-07-16 21:34:06,860][239595] Updated weights for policy 0, policy_version 10240 (0.0004)
+[2023-07-16 21:34:08,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14131.2, 300 sec: 13843.1). Total num frames: 5271552. Throughput: 0: 13843.3. Samples: 5264680. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:34:08,927][239306] Avg episode reward: [(0, '760.022')]
+[2023-07-16 21:34:09,613][239595] Updated weights for policy 0, policy_version 10320 (0.0003)
+[2023-07-16 21:34:12,646][239595] Updated weights for policy 0, policy_version 10400 (0.0005)
+[2023-07-16 21:34:13,926][239306] Fps is (10 sec: 14336.0, 60 sec: 14062.9, 300 sec: 13857.0). Total num frames: 5341184. Throughput: 0: 13831.4. Samples: 5308308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:34:13,927][239306] Avg episode reward: [(0, '765.552')]
+[2023-07-16 21:34:15,696][239595] Updated weights for policy 0, policy_version 10480 (0.0005)
+[2023-07-16 21:34:18,472][239595] Updated weights for policy 0, policy_version 10560 (0.0004)
+[2023-07-16 21:34:18,926][239306] Fps is (10 sec: 13926.4, 60 sec: 13994.7, 300 sec: 13870.9). Total num frames: 5410816. Throughput: 0: 13909.3. Samples: 5390348. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:34:18,926][239306] Avg episode reward: [(0, '750.168')]
+[2023-07-16 21:34:18,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010568_5410816.pth...
+[2023-07-16 21:34:18,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000009736_4984832.pth
+[2023-07-16 21:34:21,252][239595] Updated weights for policy 0, policy_version 10640 (0.0004)
+[2023-07-16 21:34:23,926][239306] Fps is (10 sec: 14336.0, 60 sec: 13926.4, 300 sec: 13884.7). Total num frames: 5484544. Throughput: 0: 14082.7. Samples: 5479288. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:34:23,927][239306] Avg episode reward: [(0, '745.731')]
+[2023-07-16 21:34:24,017][239595] Updated weights for policy 0, policy_version 10720 (0.0004)
+[2023-07-16 21:34:26,885][239595] Updated weights for policy 0, policy_version 10800 (0.0004)
+[2023-07-16 21:34:28,926][239306] Fps is (10 sec: 14336.0, 60 sec: 13926.4, 300 sec: 13898.6). Total num frames: 5554176. Throughput: 0: 14124.2. Samples: 5522116. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:34:28,927][239306] Avg episode reward: [(0, '786.264')]
+[2023-07-16 21:34:28,928][239551] Saving new best policy, reward=786.264!
+[2023-07-16 21:34:29,904][239595] Updated weights for policy 0, policy_version 10880 (0.0005)
+[2023-07-16 21:34:32,985][239595] Updated weights for policy 0, policy_version 10960 (0.0005)
+[2023-07-16 21:34:33,926][239306] Fps is (10 sec: 13926.4, 60 sec: 13994.7, 300 sec: 13898.6). Total num frames: 5623808. Throughput: 0: 14154.0. Samples: 5603336. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:34:33,927][239306] Avg episode reward: [(0, '770.152')]
+[2023-07-16 21:34:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010984_5623808.pth...
+[2023-07-16 21:34:33,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010152_5197824.pth
+[2023-07-16 21:34:36,010][239595] Updated weights for policy 0, policy_version 11040 (0.0005)
+[2023-07-16 21:34:38,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13994.7, 300 sec: 13884.7). Total num frames: 5689344. Throughput: 0: 14135.4. Samples: 5684948. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:34:38,927][239306] Avg episode reward: [(0, '773.969')]
+[2023-07-16 21:34:39,040][239595] Updated weights for policy 0, policy_version 11120 (0.0005)
+[2023-07-16 21:34:41,759][239595] Updated weights for policy 0, policy_version 11200 (0.0004)
+[2023-07-16 21:34:43,926][239306] Fps is (10 sec: 13926.5, 60 sec: 14063.0, 300 sec: 13912.5). Total num frames: 5763072. Throughput: 0: 14229.0. Samples: 5727932. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:34:43,927][239306] Avg episode reward: [(0, '753.264')]
+[2023-07-16 21:34:44,505][239595] Updated weights for policy 0, policy_version 11280 (0.0004)
+[2023-07-16 21:34:47,215][239595] Updated weights for policy 0, policy_version 11360 (0.0004)
+[2023-07-16 21:34:48,926][239306] Fps is (10 sec: 15155.1, 60 sec: 14267.7, 300 sec: 13954.2). Total num frames: 5840896. Throughput: 0: 14260.4. Samples: 5818384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:34:48,927][239306] Avg episode reward: [(0, '761.679')]
+[2023-07-16 21:34:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011408_5840896.pth...
+[2023-07-16 21:34:48,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010568_5410816.pth
+[2023-07-16 21:34:49,871][239595] Updated weights for policy 0, policy_version 11440 (0.0003)
+[2023-07-16 21:34:52,707][239595] Updated weights for policy 0, policy_version 11520 (0.0004)
+[2023-07-16 21:34:53,926][239306] Fps is (10 sec: 15155.2, 60 sec: 14336.0, 300 sec: 13981.9). Total num frames: 5914624. Throughput: 0: 14284.4. Samples: 5907476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:34:53,927][239306] Avg episode reward: [(0, '778.357')]
+[2023-07-16 21:34:55,416][239595] Updated weights for policy 0, policy_version 11600 (0.0004)
+[2023-07-16 21:34:58,177][239595] Updated weights for policy 0, policy_version 11680 (0.0004)
+[2023-07-16 21:34:58,926][239306] Fps is (10 sec: 14745.7, 60 sec: 14404.3, 300 sec: 14009.7). Total num frames: 5988352. Throughput: 0: 14320.7. Samples: 5952740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:34:58,927][239306] Avg episode reward: [(0, '760.948')]
+[2023-07-16 21:35:01,073][239595] Updated weights for policy 0, policy_version 11760 (0.0004)
+[2023-07-16 21:35:03,926][239306] Fps is (10 sec: 14335.9, 60 sec: 14336.0, 300 sec: 14023.6). Total num frames: 6057984. Throughput: 0: 14402.8. Samples: 6038472. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:35:03,926][239306] Avg episode reward: [(0, '769.921')]
+[2023-07-16 21:35:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011832_6057984.pth...
+[2023-07-16 21:35:03,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000010984_5623808.pth
+[2023-07-16 21:35:04,109][239595] Updated weights for policy 0, policy_version 11840 (0.0005)
+[2023-07-16 21:35:07,187][239595] Updated weights for policy 0, policy_version 11920 (0.0005)
+[2023-07-16 21:35:08,926][239306] Fps is (10 sec: 13926.3, 60 sec: 14267.7, 300 sec: 14037.5). Total num frames: 6127616. Throughput: 0: 14243.4. Samples: 6120240. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:35:08,927][239306] Avg episode reward: [(0, '746.835')]
+[2023-07-16 21:35:09,902][239595] Updated weights for policy 0, policy_version 12000 (0.0004)
+[2023-07-16 21:35:12,691][239595] Updated weights for policy 0, policy_version 12080 (0.0004)
+[2023-07-16 21:35:13,926][239306] Fps is (10 sec: 14336.1, 60 sec: 14336.0, 300 sec: 14065.2). Total num frames: 6201344. Throughput: 0: 14292.4. Samples: 6165272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:13,927][239306] Avg episode reward: [(0, '779.435')]
+[2023-07-16 21:35:15,441][239595] Updated weights for policy 0, policy_version 12160 (0.0004)
+[2023-07-16 21:35:18,198][239595] Updated weights for policy 0, policy_version 12240 (0.0004)
+[2023-07-16 21:35:18,926][239306] Fps is (10 sec: 14745.5, 60 sec: 14404.3, 300 sec: 14093.0). Total num frames: 6275072. Throughput: 0: 14464.9. Samples: 6254256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:18,927][239306] Avg episode reward: [(0, '772.500')]
+[2023-07-16 21:35:18,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012256_6275072.pth...
+[2023-07-16 21:35:18,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011408_5840896.pth
+[2023-07-16 21:35:20,937][239595] Updated weights for policy 0, policy_version 12320 (0.0004)
+[2023-07-16 21:35:23,546][239595] Updated weights for policy 0, policy_version 12400 (0.0003)
+[2023-07-16 21:35:23,926][239306] Fps is (10 sec: 15155.1, 60 sec: 14472.5, 300 sec: 14120.8). Total num frames: 6352896. Throughput: 0: 14676.6. Samples: 6345396. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:35:23,927][239306] Avg episode reward: [(0, '764.788')]
+[2023-07-16 21:35:26,336][239595] Updated weights for policy 0, policy_version 12480 (0.0004)
+[2023-07-16 21:35:28,926][239306] Fps is (10 sec: 15155.1, 60 sec: 14540.8, 300 sec: 14148.6). Total num frames: 6426624. Throughput: 0: 14708.7. Samples: 6389824. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:35:28,927][239306] Avg episode reward: [(0, '775.875')]
+[2023-07-16 21:35:29,213][239595] Updated weights for policy 0, policy_version 12560 (0.0004)
+[2023-07-16 21:35:32,275][239595] Updated weights for policy 0, policy_version 12640 (0.0005)
+[2023-07-16 21:35:33,926][239306] Fps is (10 sec: 13926.3, 60 sec: 14472.5, 300 sec: 14148.6). Total num frames: 6492160. Throughput: 0: 14538.4. Samples: 6472612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:33,926][239306] Avg episode reward: [(0, '771.803')]
+[2023-07-16 21:35:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012680_6492160.pth...
+[2023-07-16 21:35:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000011832_6057984.pth
+[2023-07-16 21:35:35,353][239595] Updated weights for policy 0, policy_version 12720 (0.0005)
+[2023-07-16 21:35:38,502][239595] Updated weights for policy 0, policy_version 12800 (0.0005)
+[2023-07-16 21:35:38,926][239306] Fps is (10 sec: 13107.3, 60 sec: 14472.5, 300 sec: 14162.4). Total num frames: 6557696. Throughput: 0: 14319.3. Samples: 6551844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:38,927][239306] Avg episode reward: [(0, '783.128')]
+[2023-07-16 21:35:41,544][239595] Updated weights for policy 0, policy_version 12880 (0.0005)
+[2023-07-16 21:35:43,926][239306] Fps is (10 sec: 13107.3, 60 sec: 14336.0, 300 sec: 14162.4). Total num frames: 6623232. Throughput: 0: 14201.2. Samples: 6591792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:43,926][239306] Avg episode reward: [(0, '780.077')]
+[2023-07-16 21:35:44,611][239595] Updated weights for policy 0, policy_version 12960 (0.0005)
+[2023-07-16 21:35:47,617][239595] Updated weights for policy 0, policy_version 13040 (0.0005)
+[2023-07-16 21:35:48,926][239306] Fps is (10 sec: 13516.8, 60 sec: 14199.5, 300 sec: 14176.3). Total num frames: 6692864. Throughput: 0: 14088.6. Samples: 6672460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:48,926][239306] Avg episode reward: [(0, '779.580')]
+[2023-07-16 21:35:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013072_6692864.pth...
+[2023-07-16 21:35:48,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012256_6275072.pth
+[2023-07-16 21:35:50,656][239595] Updated weights for policy 0, policy_version 13120 (0.0005)
+[2023-07-16 21:35:53,681][239595] Updated weights for policy 0, policy_version 13200 (0.0005)
+[2023-07-16 21:35:53,926][239306] Fps is (10 sec: 13516.8, 60 sec: 14062.9, 300 sec: 14176.3). Total num frames: 6758400. Throughput: 0: 14090.6. Samples: 6754316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:53,926][239306] Avg episode reward: [(0, '764.977')]
+[2023-07-16 21:35:56,730][239595] Updated weights for policy 0, policy_version 13280 (0.0005)
+[2023-07-16 21:35:58,926][239306] Fps is (10 sec: 13516.9, 60 sec: 13994.7, 300 sec: 14190.2). Total num frames: 6828032. Throughput: 0: 13991.7. Samples: 6794900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:35:58,926][239306] Avg episode reward: [(0, '760.557')]
+[2023-07-16 21:35:59,811][239595] Updated weights for policy 0, policy_version 13360 (0.0005)
+[2023-07-16 21:36:02,803][239595] Updated weights for policy 0, policy_version 13440 (0.0005)
+[2023-07-16 21:36:03,926][239306] Fps is (10 sec: 13516.7, 60 sec: 13926.4, 300 sec: 14190.2). Total num frames: 6893568. Throughput: 0: 13792.6. Samples: 6874924. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:36:03,927][239306] Avg episode reward: [(0, '755.685')]
+[2023-07-16 21:36:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013464_6893568.pth...
+[2023-07-16 21:36:03,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000012680_6492160.pth
+[2023-07-16 21:36:05,907][239595] Updated weights for policy 0, policy_version 13520 (0.0005)
+[2023-07-16 21:36:08,864][239595] Updated weights for policy 0, policy_version 13600 (0.0005)
+[2023-07-16 21:36:08,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13926.4, 300 sec: 14190.2). Total num frames: 6963200. Throughput: 0: 13575.6. Samples: 6956300. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:36:08,927][239306] Avg episode reward: [(0, '764.394')]
+[2023-07-16 21:36:11,975][239595] Updated weights for policy 0, policy_version 13680 (0.0005)
+[2023-07-16 21:36:13,926][239306] Fps is (10 sec: 13516.9, 60 sec: 13789.9, 300 sec: 14176.3). Total num frames: 7028736. Throughput: 0: 13470.1. Samples: 6995976. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:36:13,927][239306] Avg episode reward: [(0, '774.522')]
+[2023-07-16 21:36:15,117][239595] Updated weights for policy 0, policy_version 13760 (0.0005)
+[2023-07-16 21:36:18,127][239595] Updated weights for policy 0, policy_version 13840 (0.0005)
+[2023-07-16 21:36:18,926][239306] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14176.3). Total num frames: 7094272. Throughput: 0: 13396.6. Samples: 7075460. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:36:18,926][239306] Avg episode reward: [(0, '781.957')]
+[2023-07-16 21:36:18,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013856_7094272.pth...
+[2023-07-16 21:36:18,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013072_6692864.pth
+[2023-07-16 21:36:21,145][239595] Updated weights for policy 0, policy_version 13920 (0.0005)
+[2023-07-16 21:36:23,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13516.8, 300 sec: 14190.2). Total num frames: 7163904. Throughput: 0: 13459.9. Samples: 7157540. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:36:23,926][239306] Avg episode reward: [(0, '770.859')]
+[2023-07-16 21:36:24,120][239595] Updated weights for policy 0, policy_version 14000 (0.0005)
+[2023-07-16 21:36:27,195][239595] Updated weights for policy 0, policy_version 14080 (0.0005)
+[2023-07-16 21:36:28,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13380.3, 300 sec: 14176.3). Total num frames: 7229440. Throughput: 0: 13451.4. Samples: 7197104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:36:28,927][239306] Avg episode reward: [(0, '775.574')]
+[2023-07-16 21:36:30,250][239595] Updated weights for policy 0, policy_version 14160 (0.0005)
+[2023-07-16 21:36:33,243][239595] Updated weights for policy 0, policy_version 14240 (0.0005)
+[2023-07-16 21:36:33,926][239306] Fps is (10 sec: 13516.7, 60 sec: 13448.5, 300 sec: 14176.3). Total num frames: 7299072. Throughput: 0: 13469.8. Samples: 7278600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:36:33,927][239306] Avg episode reward: [(0, '772.072')]
+[2023-07-16 21:36:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014256_7299072.pth...
+[2023-07-16 21:36:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013464_6893568.pth
+[2023-07-16 21:36:36,349][239595] Updated weights for policy 0, policy_version 14320 (0.0005)
+[2023-07-16 21:36:38,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13448.5, 300 sec: 14148.6). Total num frames: 7364608. Throughput: 0: 13468.4. Samples: 7360396. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:36:38,927][239306] Avg episode reward: [(0, '768.898')]
+[2023-07-16 21:36:39,220][239595] Updated weights for policy 0, policy_version 14400 (0.0005)
+[2023-07-16 21:36:42,029][239595] Updated weights for policy 0, policy_version 14480 (0.0004)
+[2023-07-16 21:36:43,926][239306] Fps is (10 sec: 13926.5, 60 sec: 13585.1, 300 sec: 14148.6). Total num frames: 7438336. Throughput: 0: 13525.2. Samples: 7403536. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:36:43,927][239306] Avg episode reward: [(0, '773.949')]
+[2023-07-16 21:36:44,853][239595] Updated weights for policy 0, policy_version 14560 (0.0005)
+[2023-07-16 21:36:47,625][239595] Updated weights for policy 0, policy_version 14640 (0.0004)
+[2023-07-16 21:36:48,926][239306] Fps is (10 sec: 14745.5, 60 sec: 13653.3, 300 sec: 14162.4). Total num frames: 7512064. Throughput: 0: 13703.6. Samples: 7491584. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:36:48,927][239306] Avg episode reward: [(0, '771.605')]
+[2023-07-16 21:36:48,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014672_7512064.pth...
+[2023-07-16 21:36:48,931][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000013856_7094272.pth
+[2023-07-16 21:36:50,390][239595] Updated weights for policy 0, policy_version 14720 (0.0004)
+[2023-07-16 21:36:53,160][239595] Updated weights for policy 0, policy_version 14800 (0.0004)
+[2023-07-16 21:36:53,926][239306] Fps is (10 sec: 14745.6, 60 sec: 13789.9, 300 sec: 14162.4). Total num frames: 7585792. Throughput: 0: 13873.3. Samples: 7580596. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:36:53,927][239306] Avg episode reward: [(0, '769.483')]
+[2023-07-16 21:36:55,913][239595] Updated weights for policy 0, policy_version 14880 (0.0004)
+[2023-07-16 21:36:58,690][239595] Updated weights for policy 0, policy_version 14960 (0.0004)
+[2023-07-16 21:36:58,926][239306] Fps is (10 sec: 14745.7, 60 sec: 13858.1, 300 sec: 14176.3). Total num frames: 7659520. Throughput: 0: 13986.1. Samples: 7625348. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
+[2023-07-16 21:36:58,927][239306] Avg episode reward: [(0, '786.114')]
+[2023-07-16 21:37:01,687][239595] Updated weights for policy 0, policy_version 15040 (0.0005)
+[2023-07-16 21:37:03,926][239306] Fps is (10 sec: 14335.9, 60 sec: 13926.4, 300 sec: 14162.4). Total num frames: 7729152. Throughput: 0: 14072.8. Samples: 7708736. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:37:03,927][239306] Avg episode reward: [(0, '773.030')]
+[2023-07-16 21:37:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015096_7729152.pth...
+[2023-07-16 21:37:03,934][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014256_7299072.pth
+[2023-07-16 21:37:04,808][239595] Updated weights for policy 0, policy_version 15120 (0.0005)
+[2023-07-16 21:37:07,924][239595] Updated weights for policy 0, policy_version 15200 (0.0005)
+[2023-07-16 21:37:08,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13858.1, 300 sec: 14148.6). Total num frames: 7794688. Throughput: 0: 14021.0. Samples: 7788484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:37:08,927][239306] Avg episode reward: [(0, '773.637')]
+[2023-07-16 21:37:10,914][239595] Updated weights for policy 0, policy_version 15280 (0.0005)
+[2023-07-16 21:37:13,926][239306] Fps is (10 sec: 13107.2, 60 sec: 13858.1, 300 sec: 14134.7). Total num frames: 7860224. Throughput: 0: 14054.7. Samples: 7829564. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:37:13,927][239306] Avg episode reward: [(0, '784.651')]
+[2023-07-16 21:37:13,954][239595] Updated weights for policy 0, policy_version 15360 (0.0005)
+[2023-07-16 21:37:17,015][239595] Updated weights for policy 0, policy_version 15440 (0.0005)
+[2023-07-16 21:37:18,926][239306] Fps is (10 sec: 13516.7, 60 sec: 13926.4, 300 sec: 14134.7). Total num frames: 7929856. Throughput: 0: 14018.7. Samples: 7909440. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:37:18,927][239306] Avg episode reward: [(0, '768.614')]
+[2023-07-16 21:37:18,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015488_7929856.pth...
+[2023-07-16 21:37:18,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000014672_7512064.pth
+[2023-07-16 21:37:20,016][239595] Updated weights for policy 0, policy_version 15520 (0.0005)
+[2023-07-16 21:37:22,779][239595] Updated weights for policy 0, policy_version 15600 (0.0004)
+[2023-07-16 21:37:23,926][239306] Fps is (10 sec: 14336.0, 60 sec: 13994.7, 300 sec: 14134.7). Total num frames: 8003584. Throughput: 0: 14112.6. Samples: 7995464. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:37:23,927][239306] Avg episode reward: [(0, '751.495')]
+[2023-07-16 21:37:25,545][239595] Updated weights for policy 0, policy_version 15680 (0.0004)
+[2023-07-16 21:37:28,434][239595] Updated weights for policy 0, policy_version 15760 (0.0005)
+[2023-07-16 21:37:28,926][239306] Fps is (10 sec: 14336.0, 60 sec: 14062.9, 300 sec: 14120.8). Total num frames: 8073216. Throughput: 0: 14153.8. Samples: 8040456. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:37:28,927][239306] Avg episode reward: [(0, '755.593')]
+[2023-07-16 21:37:31,554][239595] Updated weights for policy 0, policy_version 15840 (0.0005)
+[2023-07-16 21:37:33,926][239306] Fps is (10 sec: 13516.7, 60 sec: 13994.7, 300 sec: 14093.0). Total num frames: 8138752. Throughput: 0: 14000.9. Samples: 8121624. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
+[2023-07-16 21:37:33,927][239306] Avg episode reward: [(0, '773.552')]
+[2023-07-16 21:37:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015896_8138752.pth...
+[2023-07-16 21:37:33,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015096_7729152.pth
+[2023-07-16 21:37:34,638][239595] Updated weights for policy 0, policy_version 15920 (0.0005)
+[2023-07-16 21:37:37,668][239595] Updated weights for policy 0, policy_version 16000 (0.0005)
+[2023-07-16 21:37:38,926][239306] Fps is (10 sec: 13107.3, 60 sec: 13994.7, 300 sec: 14065.3). Total num frames: 8204288. Throughput: 0: 13784.6. Samples: 8200904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:37:38,967][239306] Avg episode reward: [(0, '784.715')]
+[2023-07-16 21:37:40,725][239595] Updated weights for policy 0, policy_version 16080 (0.0005)
+[2023-07-16 21:37:43,478][239595] Updated weights for policy 0, policy_version 16160 (0.0004)
+[2023-07-16 21:37:43,926][239306] Fps is (10 sec: 13926.5, 60 sec: 13994.7, 300 sec: 14051.4). Total num frames: 8278016. Throughput: 0: 13713.6. Samples: 8242460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:37:43,927][239306] Avg episode reward: [(0, '792.631')]
+[2023-07-16 21:37:43,927][239551] Saving new best policy, reward=792.631!
+[2023-07-16 21:37:46,231][239595] Updated weights for policy 0, policy_version 16240 (0.0004)
+[2023-07-16 21:37:48,926][239306] Fps is (10 sec: 14745.4, 60 sec: 13994.7, 300 sec: 14051.4). Total num frames: 8351744. Throughput: 0: 13848.9. Samples: 8331936. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:37:48,927][239306] Avg episode reward: [(0, '781.760')]
+[2023-07-16 21:37:48,947][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016320_8355840.pth...
+[2023-07-16 21:37:48,947][239595] Updated weights for policy 0, policy_version 16320 (0.0004)
+[2023-07-16 21:37:48,948][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015488_7929856.pth
+[2023-07-16 21:37:51,729][239595] Updated weights for policy 0, policy_version 16400 (0.0004)
+[2023-07-16 21:37:53,926][239306] Fps is (10 sec: 14745.6, 60 sec: 13994.7, 300 sec: 14051.4). Total num frames: 8425472. Throughput: 0: 14051.6. Samples: 8420808. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
+[2023-07-16 21:37:53,926][239306] Avg episode reward: [(0, '767.075')]
+[2023-07-16 21:37:54,549][239595] Updated weights for policy 0, policy_version 16480 (0.0004)
+[2023-07-16 21:37:57,318][239595] Updated weights for policy 0, policy_version 16560 (0.0004)
+[2023-07-16 21:37:58,926][239306] Fps is (10 sec: 14745.7, 60 sec: 13994.7, 300 sec: 14037.5). Total num frames: 8499200. Throughput: 0: 14108.8. Samples: 8464460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:37:58,927][239306] Avg episode reward: [(0, '786.601')]
+[2023-07-16 21:38:00,019][239595] Updated weights for policy 0, policy_version 16640 (0.0004)
+[2023-07-16 21:38:02,783][239595] Updated weights for policy 0, policy_version 16720 (0.0004)
+[2023-07-16 21:38:03,926][239306] Fps is (10 sec: 15155.1, 60 sec: 14131.2, 300 sec: 14079.1). Total num frames: 8577024. Throughput: 0: 14317.5. Samples: 8553728. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:38:03,927][239306] Avg episode reward: [(0, '773.305')]
+[2023-07-16 21:38:03,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016752_8577024.pth...
+[2023-07-16 21:38:03,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000015896_8138752.pth
+[2023-07-16 21:38:05,530][239595] Updated weights for policy 0, policy_version 16800 (0.0004)
+[2023-07-16 21:38:08,338][239595] Updated weights for policy 0, policy_version 16880 (0.0004)
+[2023-07-16 21:38:08,926][239306] Fps is (10 sec: 15155.1, 60 sec: 14267.7, 300 sec: 14079.1). Total num frames: 8650752. Throughput: 0: 14381.3. Samples: 8642624. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-07-16 21:38:08,927][239306] Avg episode reward: [(0, '784.602')]
+[2023-07-16 21:38:11,076][239595] Updated weights for policy 0, policy_version 16960 (0.0004)
+[2023-07-16 21:38:13,850][239595] Updated weights for policy 0, policy_version 17040 (0.0004)
+[2023-07-16 21:38:13,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14404.3, 300 sec: 14079.1). Total num frames: 8724480. Throughput: 0: 14381.5. Samples: 8687624. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:38:13,926][239306] Avg episode reward: [(0, '779.891')]
+[2023-07-16 21:38:16,889][239595] Updated weights for policy 0, policy_version 17120 (0.0005)
+[2023-07-16 21:38:18,926][239306] Fps is (10 sec: 13926.4, 60 sec: 14336.0, 300 sec: 14037.5). Total num frames: 8790016. Throughput: 0: 14442.2. Samples: 8771524. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:38:18,927][239306] Avg episode reward: [(0, '773.225')]
+[2023-07-16 21:38:18,929][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017168_8790016.pth...
+[2023-07-16 21:38:18,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016320_8355840.pth
+[2023-07-16 21:38:19,921][239595] Updated weights for policy 0, policy_version 17200 (0.0006)
+[2023-07-16 21:38:22,939][239595] Updated weights for policy 0, policy_version 17280 (0.0005)
+[2023-07-16 21:38:23,926][239306] Fps is (10 sec: 13516.8, 60 sec: 14267.7, 300 sec: 14037.5). Total num frames: 8859648. Throughput: 0: 14492.9. Samples: 8853084. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:38:23,927][239306] Avg episode reward: [(0, '785.902')]
+[2023-07-16 21:38:25,948][239595] Updated weights for policy 0, policy_version 17360 (0.0005)
+[2023-07-16 21:38:28,926][239306] Fps is (10 sec: 13516.7, 60 sec: 14199.4, 300 sec: 14037.5). Total num frames: 8925184. Throughput: 0: 14478.4. Samples: 8893988. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:38:28,927][239306] Avg episode reward: [(0, '776.071')]
+[2023-07-16 21:38:28,968][239595] Updated weights for policy 0, policy_version 17440 (0.0004)
+[2023-07-16 21:38:31,905][239595] Updated weights for policy 0, policy_version 17520 (0.0004)
+[2023-07-16 21:38:33,926][239306] Fps is (10 sec: 13926.4, 60 sec: 14336.0, 300 sec: 14065.2). Total num frames: 8998912. Throughput: 0: 14326.5. Samples: 8976628. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
+[2023-07-16 21:38:33,927][239306] Avg episode reward: [(0, '775.024')]
+[2023-07-16 21:38:33,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017576_8998912.pth...
+[2023-07-16 21:38:33,932][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000016752_8577024.pth
+[2023-07-16 21:38:34,716][239595] Updated weights for policy 0, policy_version 17600 (0.0005)
+[2023-07-16 21:38:37,481][239595] Updated weights for policy 0, policy_version 17680 (0.0004)
+[2023-07-16 21:38:38,926][239306] Fps is (10 sec: 14745.8, 60 sec: 14472.5, 300 sec: 14079.1). Total num frames: 9072640. Throughput: 0: 14304.5. Samples: 9064512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:38:38,927][239306] Avg episode reward: [(0, '770.695')]
+[2023-07-16 21:38:40,291][239595] Updated weights for policy 0, policy_version 17760 (0.0004)
+[2023-07-16 21:38:43,011][239595] Updated weights for policy 0, policy_version 17840 (0.0004)
+[2023-07-16 21:38:43,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14472.5, 300 sec: 14106.9). Total num frames: 9146368. Throughput: 0: 14325.1. Samples: 9109088. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:38:43,927][239306] Avg episode reward: [(0, '777.413')]
+[2023-07-16 21:38:45,812][239595] Updated weights for policy 0, policy_version 17920 (0.0004)
+[2023-07-16 21:38:48,529][239595] Updated weights for policy 0, policy_version 18000 (0.0004)
+[2023-07-16 21:38:48,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14472.5, 300 sec: 14120.8). Total num frames: 9220096. Throughput: 0: 14322.3. Samples: 9198232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:38:48,927][239306] Avg episode reward: [(0, '768.252')]
+[2023-07-16 21:38:48,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018008_9220096.pth...
+[2023-07-16 21:38:48,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017168_8790016.pth
+[2023-07-16 21:38:51,229][239595] Updated weights for policy 0, policy_version 18080 (0.0004)
+[2023-07-16 21:38:53,926][239306] Fps is (10 sec: 14745.6, 60 sec: 14472.5, 300 sec: 14134.7). Total num frames: 9293824. Throughput: 0: 14352.7. Samples: 9288496. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:38:53,983][239595] Updated weights for policy 0, policy_version 18160 (0.0004)
+[2023-07-16 21:38:54,089][239306] Avg episode reward: [(0, '779.387')]
+[2023-07-16 21:38:56,954][239595] Updated weights for policy 0, policy_version 18240 (0.0004)
+[2023-07-16 21:38:58,926][239306] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 14120.8). Total num frames: 9363456. Throughput: 0: 14291.7. Samples: 9330752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:38:58,927][239306] Avg episode reward: [(0, '775.109')]
+[2023-07-16 21:38:59,870][239595] Updated weights for policy 0, policy_version 18320 (0.0004)
+[2023-07-16 21:39:02,948][239595] Updated weights for policy 0, policy_version 18400 (0.0005)
+[2023-07-16 21:39:03,926][239306] Fps is (10 sec: 13926.3, 60 sec: 14267.7, 300 sec: 14106.9). Total num frames: 9433088. Throughput: 0: 14246.6. Samples: 9412620. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-16 21:39:03,927][239306] Avg episode reward: [(0, '786.043')]
+[2023-07-16 21:39:03,930][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018424_9433088.pth...
+[2023-07-16 21:39:03,933][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000017576_8998912.pth
+[2023-07-16 21:39:05,943][239595] Updated weights for policy 0, policy_version 18480 (0.0005)
+[2023-07-16 21:39:08,926][239306] Fps is (10 sec: 13516.8, 60 sec: 14131.2, 300 sec: 14093.0). Total num frames: 9498624. Throughput: 0: 14255.7. Samples: 9494592. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
+[2023-07-16 21:39:08,927][239306] Avg episode reward: [(0, '781.894')]
+[2023-07-16 21:39:08,948][239595] Updated weights for policy 0, policy_version 18560 (0.0005)
+[2023-07-16 21:39:11,934][239595] Updated weights for policy 0, policy_version 18640 (0.0005)
+[2023-07-16 21:39:13,926][239306] Fps is (10 sec: 13516.9, 60 sec: 14062.9, 300 sec: 14093.0). Total num frames: 9568256. Throughput: 0: 14257.2. Samples: 9535560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:39:13,927][239306] Avg episode reward: [(0, '769.017')]
+[2023-07-16 21:39:14,951][239595] Updated weights for policy 0, policy_version 18720 (0.0003)
+[2023-07-16 21:39:17,965][239595] Updated weights for policy 0, policy_version 18800 (0.0003)
+[2023-07-16 21:39:18,926][239306] Fps is (10 sec: 13926.3, 60 sec: 14131.2, 300 sec: 14079.1). Total num frames: 9637888. Throughput: 0: 14240.9. Samples: 9617472. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:39:18,934][239306] Avg episode reward: [(0, '789.446')]
+[2023-07-16 21:39:18,937][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018824_9637888.pth...
+[2023-07-16 21:39:18,940][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018008_9220096.pth
+[2023-07-16 21:39:21,047][239595] Updated weights for policy 0, policy_version 18880 (0.0005)
+[2023-07-16 21:39:23,926][239306] Fps is (10 sec: 13516.8, 60 sec: 14062.9, 300 sec: 14065.2). Total num frames: 9703424. Throughput: 0: 14042.4. Samples: 9696420. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:39:23,927][239306] Avg episode reward: [(0, '772.489')]
+[2023-07-16 21:39:24,225][239595] Updated weights for policy 0, policy_version 18960 (0.0005)
+[2023-07-16 21:39:27,280][239595] Updated weights for policy 0, policy_version 19040 (0.0005)
+[2023-07-16 21:39:28,926][239306] Fps is (10 sec: 13107.3, 60 sec: 14063.0, 300 sec: 14051.4). Total num frames: 9768960. Throughput: 0: 13937.1. Samples: 9736256. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:39:28,927][239306] Avg episode reward: [(0, '785.958')]
+[2023-07-16 21:39:30,310][239595] Updated weights for policy 0, policy_version 19120 (0.0005)
+[2023-07-16 21:39:33,319][239595] Updated weights for policy 0, policy_version 19200 (0.0005)
+[2023-07-16 21:39:33,926][239306] Fps is (10 sec: 13107.1, 60 sec: 13926.4, 300 sec: 14051.4). Total num frames: 9834496. Throughput: 0: 13763.4. Samples: 9817588. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:39:33,927][239306] Avg episode reward: [(0, '787.608')]
+[2023-07-16 21:39:33,943][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019216_9838592.pth...
+[2023-07-16 21:39:33,946][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018424_9433088.pth
+[2023-07-16 21:39:36,432][239595] Updated weights for policy 0, policy_version 19280 (0.0005)
+[2023-07-16 21:39:38,926][239306] Fps is (10 sec: 13516.8, 60 sec: 13858.1, 300 sec: 14037.5). Total num frames: 9904128. Throughput: 0: 13517.0. Samples: 9896760. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
+[2023-07-16 21:39:38,927][239306] Avg episode reward: [(0, '778.334')]
+[2023-07-16 21:39:39,500][239595] Updated weights for policy 0, policy_version 19360 (0.0005)
+[2023-07-16 21:39:42,526][239595] Updated weights for policy 0, policy_version 19440 (0.0005)
+[2023-07-16 21:39:43,926][239306] Fps is (10 sec: 13516.9, 60 sec: 13721.6, 300 sec: 13995.8). Total num frames: 9969664. Throughput: 0: 13471.3. Samples: 9936960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2023-07-16 21:39:43,926][239306] Avg episode reward: [(0, '773.676')]
+[2023-07-16 21:39:45,558][239595] Updated weights for policy 0, policy_version 19520 (0.0004)
+[2023-07-16 21:39:46,463][239551] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
+[2023-07-16 21:39:46,464][239596] Stopping RolloutWorker_w2...
+[2023-07-16 21:39:46,464][239601] Stopping RolloutWorker_w5...
+[2023-07-16 21:39:46,464][239685] Stopping RolloutWorker_w6...
+[2023-07-16 21:39:46,464][239599] Stopping RolloutWorker_w3...
+[2023-07-16 21:39:46,464][239598] Stopping RolloutWorker_w4...
+[2023-07-16 21:39:46,465][239596] Loop rollout_proc2_evt_loop terminating...
+[2023-07-16 21:39:46,464][239600] Stopping RolloutWorker_w0...
+[2023-07-16 21:39:46,465][239601] Loop rollout_proc5_evt_loop terminating...
+[2023-07-16 21:39:46,465][239685] Loop rollout_proc6_evt_loop terminating...
+[2023-07-16 21:39:46,464][239664] Stopping RolloutWorker_w7...
+[2023-07-16 21:39:46,464][239597] Stopping RolloutWorker_w1...
+[2023-07-16 21:39:46,465][239599] Loop rollout_proc3_evt_loop terminating...
+[2023-07-16 21:39:46,465][239598] Loop rollout_proc4_evt_loop terminating...
+[2023-07-16 21:39:46,465][239600] Loop rollout_proc0_evt_loop terminating...
+[2023-07-16 21:39:46,465][239597] Loop rollout_proc1_evt_loop terminating...
+[2023-07-16 21:39:46,465][239664] Loop rollout_proc7_evt_loop terminating...
+[2023-07-16 21:39:46,465][239306] Component RolloutWorker_w2 stopped!
+[2023-07-16 21:39:46,465][239551] Stopping Batcher_0...
+[2023-07-16 21:39:46,465][239306] Component RolloutWorker_w5 stopped!
+[2023-07-16 21:39:46,465][239551] Loop batcher_evt_loop terminating...
+[2023-07-16 21:39:46,465][239306] Component RolloutWorker_w3 stopped!
+[2023-07-16 21:39:46,466][239306] Component RolloutWorker_w6 stopped!
+[2023-07-16 21:39:46,466][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2023-07-16 21:39:46,466][239306] Component RolloutWorker_w4 stopped!
+[2023-07-16 21:39:46,466][239306] Component RolloutWorker_w0 stopped!
+[2023-07-16 21:39:46,466][239306] Component RolloutWorker_w7 stopped!
+[2023-07-16 21:39:46,467][239306] Component RolloutWorker_w1 stopped!
+[2023-07-16 21:39:46,467][239306] Component Batcher_0 stopped!
+[2023-07-16 21:39:46,468][239551] Removing /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000018824_9637888.pth
+[2023-07-16 21:39:46,469][239551] Saving /home/qgallouedec/data/gia/data/envs/metaworld/train_dir/dial-turn-v2/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2023-07-16 21:39:46,471][239551] Stopping LearnerWorker_p0...
+[2023-07-16 21:39:46,472][239551] Loop learner_proc0_evt_loop terminating...
+[2023-07-16 21:39:46,472][239306] Component LearnerWorker_p0 stopped!
+[2023-07-16 21:39:46,529][239595] Weights refcount: 2 0
+[2023-07-16 21:39:46,530][239595] Stopping InferenceWorker_p0-w0...
+[2023-07-16 21:39:46,530][239595] Loop inference_proc0-0_evt_loop terminating...
+[2023-07-16 21:39:46,530][239306] Component InferenceWorker_p0-w0 stopped!
+[2023-07-16 21:39:46,531][239306] Waiting for process learner_proc0 to stop...
+[2023-07-16 21:39:47,053][239306] Waiting for process inference_proc0-0 to join...
+[2023-07-16 21:39:47,075][239306] Waiting for process rollout_proc0 to join...
+[2023-07-16 21:39:47,075][239306] Waiting for process rollout_proc1 to join...
+[2023-07-16 21:39:47,075][239306] Waiting for process rollout_proc2 to join...
+[2023-07-16 21:39:47,075][239306] Waiting for process rollout_proc3 to join...
+[2023-07-16 21:39:47,075][239306] Waiting for process rollout_proc4 to join...
+[2023-07-16 21:39:47,075][239306] Waiting for process rollout_proc5 to join...
+[2023-07-16 21:39:47,076][239306] Waiting for process rollout_proc6 to join...
+[2023-07-16 21:39:47,076][239306] Waiting for process rollout_proc7 to join...
+[2023-07-16 21:39:47,076][239306] Batcher 0 profile tree view:
+batching: 1.8749, releasing_batches: 1.5905
+[2023-07-16 21:39:47,076][239306] InferenceWorker_p0-w0 profile tree view:
 wait_policy: 0.0051
-  wait_policy_total: 353.0599
-update_model: 12.1906
-  weight_update: 0.0005
-one_step: 0.0009
-  handle_policy_step: 546.2831
-    deserialize: 22.9962, stack: 5.6516, obs_to_device_normalize: 96.6990, forward: 270.1506, send_messages: 42.7656
-    prepare_outputs: 60.8254
-      to_cpu: 9.1446
-[2023-07-08 16:04:31,134][1003682] Learner 0 profile tree view:
-misc: 0.0095, prepare_batch: 8.4075
-train: 86.4189
-  epoch_init: 0.0331, minibatch_init: 1.1916, losses_postprocess: 1.2559, kl_divergence: 0.4067, after_optimizer: 0.5934
-  calculate_losses: 36.4544
-    losses_init: 0.0295, forward_head: 13.8554, bptt_initial: 0.1258, bptt: 0.1250, tail: 10.6224, advantages_returns: 0.8202, losses: 9.5770
-  update: 45.0233
-    clip: 5.3999
-[2023-07-08 16:04:31,135][1003682] RolloutWorker_w0 profile tree view:
-wait_for_trajectories: 0.4485, enqueue_policy_requests: 14.5945, env_step: 575.2918, overhead: 21.7549, complete_rollouts: 0.3753
-save_policy_outputs: 42.8730
-  split_output_tensors: 14.6717
-[2023-07-08 16:04:31,135][1003682] RolloutWorker_w7 profile tree view:
-wait_for_trajectories: 0.4212, enqueue_policy_requests: 14.7497, env_step: 578.0502, overhead: 21.9679, complete_rollouts: 0.3862
-save_policy_outputs: 42.4822
-  split_output_tensors: 14.7786
-[2023-07-08 16:04:31,135][1003682] Loop Runner_EvtLoop terminating...
-[2023-07-08 16:04:31,135][1003682] Runner profile tree view:
-main_loop: 979.9128
-[2023-07-08 16:04:31,135][1003682] Collected {0: 10006528}, FPS: 10211.7
+  wait_policy_total: 231.5440
+update_model: 10.0280
+  weight_update: 0.0004
+one_step: 0.0007
+  handle_policy_step: 436.6499
+    deserialize: 18.7593, stack: 4.6913, obs_to_device_normalize: 77.3099, forward: 214.8834, send_messages: 33.9935
+    prepare_outputs: 49.9916
+      to_cpu: 7.5728
+[2023-07-16 21:39:47,076][239306] Learner 0 profile tree view:
+misc: 0.0104, prepare_batch: 9.8698
+train: 102.7698
+  epoch_init: 0.0368, minibatch_init: 1.4122, losses_postprocess: 1.3661, kl_divergence: 0.4711, after_optimizer: 0.6345
+  calculate_losses: 43.9121
+    losses_init: 0.0381, forward_head: 17.1778, bptt_initial: 0.1540, bptt: 0.1411, tail: 12.3804, advantages_returns: 0.9388, losses: 11.5317
+  update: 53.2210
+    clip: 6.3222
+[2023-07-16 21:39:47,076][239306] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.2797, enqueue_policy_requests: 12.3487, env_step: 458.0028, overhead: 19.3837, complete_rollouts: 0.3243
+save_policy_outputs: 38.2392
+  split_output_tensors: 13.2876
+[2023-07-16 21:39:47,076][239306] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.2816, enqueue_policy_requests: 12.6618, env_step: 461.7800, overhead: 20.1381, complete_rollouts: 0.3310
+save_policy_outputs: 38.2492
+  split_output_tensors: 13.1724
+[2023-07-16 21:39:47,077][239306] Loop Runner_EvtLoop terminating...
+[2023-07-16 21:39:47,077][239306] Runner profile tree view:
+main_loop: 731.1292
+[2023-07-16 21:39:47,077][239306] Collected {0: 10006528}, FPS: 13686.4