[2023-09-12 13:21:22,562][09743] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-12 13:21:22,562][09743] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-09-12 13:21:22,598][09743] Num visible devices: 1
[2023-09-12 13:21:22,637][09743] Starting seed is not provided
[2023-09-12 13:21:22,638][09743] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-12 13:21:22,638][09743] Initializing actor-critic model on device cuda:0
[2023-09-12 13:21:22,638][09743] RunningMeanStd input shape: (3, 72, 128)
[2023-09-12 13:21:22,639][09743] RunningMeanStd input shape: (1,)
[2023-09-12 13:21:22,659][09743] ConvEncoder: input_channels=3
[2023-09-12 13:21:22,911][09743] Conv encoder output size: 512
[2023-09-12 13:21:22,911][09743] Policy head output size: 512
[2023-09-12 13:21:22,935][09743] Created Actor Critic model with architecture:
[2023-09-12 13:21:22,935][09743] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=4, bias=True)
  )
)
[2023-09-12 13:21:24,096][09743] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-12 13:21:24,096][09743] No checkpoints found
[2023-09-12 13:21:24,097][09743] Did not load from checkpoint, starting from scratch!
[2023-09-12 13:21:24,097][09743] Initialized policy 0 weights for model version 0
[2023-09-12 13:21:24,098][09743] LearnerWorker_p0 finished initialization!
[2023-09-12 13:21:24,098][09743] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-12 13:21:24,463][09929] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-09-12 13:21:24,475][09931] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-09-12 13:21:24,499][09964] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-09-12 13:21:24,535][09967] Worker 4 uses CPU cores [16, 17, 18, 19]
[2023-09-12 13:21:24,545][09928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-12 13:21:24,545][09928] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-09-12 13:21:24,566][09928] Num visible devices: 1
[2023-09-12 13:21:24,567][09932] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-09-12 13:21:24,645][09965] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-09-12 13:21:24,665][09968] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-09-12 13:21:24,689][09930] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-09-12 13:21:25,314][09928] RunningMeanStd input shape: (3, 72, 128)
[2023-09-12 13:21:25,315][09928] RunningMeanStd input shape: (1,)
[2023-09-12 13:21:25,326][09928] ConvEncoder: input_channels=3
[2023-09-12 13:21:25,447][09928] Conv encoder output size: 512
[2023-09-12 13:21:25,448][09928] Policy head output size: 512
[2023-09-12 13:21:25,839][09964] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,839][09968] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,839][09967] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,840][09965] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,840][09931] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,848][09932] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,851][09930] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:25,852][09929] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-09-12 13:21:26,147][09967] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,147][09965] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,215][09964] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,239][09929] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,258][09968] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,273][09931] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,286][09930] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,418][09967] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,493][09964] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,522][09965] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,525][09929] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,551][09932] Decorrelating experience for 0 frames...
[2023-09-12 13:21:26,556][09931] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,568][09930] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,775][09967] Decorrelating experience for 64 frames...
[2023-09-12 13:21:26,821][09932] Decorrelating experience for 32 frames...
[2023-09-12 13:21:26,852][09964] Decorrelating experience for 64 frames...
[2023-09-12 13:21:26,919][09931] Decorrelating experience for 64 frames...
[2023-09-12 13:21:26,929][09930] Decorrelating experience for 64 frames...
[2023-09-12 13:21:27,103][09929] Decorrelating experience for 64 frames...
[2023-09-12 13:21:27,160][09968] Decorrelating experience for 32 frames...
[2023-09-12 13:21:27,164][09965] Decorrelating experience for 64 frames...
[2023-09-12 13:21:27,195][09932] Decorrelating experience for 64 frames...
[2023-09-12 13:21:27,201][09964] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,330][09931] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,451][09929] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,465][09967] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,498][09965] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,507][09930] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,588][09968] Decorrelating experience for 64 frames...
[2023-09-12 13:21:27,634][09932] Decorrelating experience for 96 frames...
[2023-09-12 13:21:27,903][09968] Decorrelating experience for 96 frames...
[2023-09-12 13:21:28,649][09743] Signal inference workers to stop experience collection...
[2023-09-12 13:21:28,653][09928] InferenceWorker_p0-w0: stopping experience collection
[2023-09-12 13:21:32,650][09743] Signal inference workers to resume experience collection...
[2023-09-12 13:21:32,651][09928] InferenceWorker_p0-w0: resuming experience collection
[2023-09-12 13:21:35,991][09928] Updated weights for policy 0, policy_version 10 (0.0392)
[2023-09-12 13:21:39,213][09928] Updated weights for policy 0, policy_version 20 (0.0009)
[2023-09-12 13:21:42,363][09928] Updated weights for policy 0, policy_version 30 (0.0009)
[2023-09-12 13:21:42,527][09743] Saving new best policy, reward=-1.655!
[2023-09-12 13:21:45,525][09928] Updated weights for policy 0, policy_version 40 (0.0009)
[2023-09-12 13:21:47,532][09743] Saving new best policy, reward=-0.936!
[2023-09-12 13:21:48,749][09928] Updated weights for policy 0, policy_version 50 (0.0009)
[2023-09-12 13:21:51,927][09928] Updated weights for policy 0, policy_version 60 (0.0008)
[2023-09-12 13:21:52,564][09743] Saving new best policy, reward=0.078!
[2023-09-12 13:21:55,197][09928] Updated weights for policy 0, policy_version 70 (0.0009)
[2023-09-12 13:21:57,529][09743] Saving new best policy, reward=0.521!
[2023-09-12 13:21:58,483][09928] Updated weights for policy 0, policy_version 80 (0.0010)
[2023-09-12 13:22:01,740][09928] Updated weights for policy 0, policy_version 90 (0.0008)
[2023-09-12 13:22:02,527][09743] Saving new best policy, reward=0.599!
[2023-09-12 13:22:05,213][09928] Updated weights for policy 0, policy_version 100 (0.0012)
[2023-09-12 13:22:07,589][09743] Saving new best policy, reward=0.680!
[2023-09-12 13:22:08,636][09928] Updated weights for policy 0, policy_version 110 (0.0017)
[2023-09-12 13:22:11,996][09928] Updated weights for policy 0, policy_version 120 (0.0009)
[2023-09-12 13:22:12,526][09743] Saving new best policy, reward=0.735!
[2023-09-12 13:22:15,370][09928] Updated weights for policy 0, policy_version 130 (0.0008)
[2023-09-12 13:22:17,527][09743] Saving new best policy, reward=0.755!
[2023-09-12 13:22:18,771][09928] Updated weights for policy 0, policy_version 140 (0.0009)
[2023-09-12 13:22:22,142][09928] Updated weights for policy 0, policy_version 150 (0.0009)
[2023-09-12 13:22:22,526][09743] Saving new best policy, reward=0.781!
[2023-09-12 13:22:25,580][09928] Updated weights for policy 0, policy_version 160 (0.0008)
[2023-09-12 13:22:27,573][09743] Saving new best policy, reward=0.791!
[2023-09-12 13:22:28,937][09928] Updated weights for policy 0, policy_version 170 (0.0009)
[2023-09-12 13:22:32,402][09928] Updated weights for policy 0, policy_version 180 (0.0008)
[2023-09-12 13:22:32,526][09743] Saving new best policy, reward=0.794!
[2023-09-12 13:22:35,725][09928] Updated weights for policy 0, policy_version 190 (0.0009)
[2023-09-12 13:22:37,529][09743] Saving new best policy, reward=0.806!
[2023-09-12 13:22:39,128][09928] Updated weights for policy 0, policy_version 200 (0.0010)
[2023-09-12 13:22:42,471][09928] Updated weights for policy 0, policy_version 210 (0.0009)
[2023-09-12 13:22:45,922][09928] Updated weights for policy 0, policy_version 220 (0.0009)
[2023-09-12 13:22:47,533][09743] Saving new best policy, reward=0.815!
[2023-09-12 13:22:49,374][09928] Updated weights for policy 0, policy_version 230 (0.0009)
[2023-09-12 13:22:52,742][09928] Updated weights for policy 0, policy_version 240 (0.0009)
[2023-09-12 13:22:54,862][09743] Stopping Batcher_0...
[2023-09-12 13:22:54,862][09743] Saving /home/cogstack/Documents/optuna/environments/sample_factory/train_dir/default_experiment/checkpoint_p0/checkpoint_000000246_1007616.pth...
[2023-09-12 13:22:54,863][09743] Loop batcher_evt_loop terminating...
[2023-09-12 13:22:54,876][09965] Stopping RolloutWorker_w7...
[2023-09-12 13:22:54,877][09965] Loop rollout_proc7_evt_loop terminating...
[2023-09-12 13:22:54,877][09932] Stopping RolloutWorker_w3...
[2023-09-12 13:22:54,877][09930] Stopping RolloutWorker_w0...
[2023-09-12 13:22:54,877][09968] Stopping RolloutWorker_w6...
[2023-09-12 13:22:54,877][09932] Loop rollout_proc3_evt_loop terminating...
[2023-09-12 13:22:54,877][09930] Loop rollout_proc0_evt_loop terminating...
[2023-09-12 13:22:54,878][09968] Loop rollout_proc6_evt_loop terminating...
[2023-09-12 13:22:54,880][09964] Stopping RolloutWorker_w5...
[2023-09-12 13:22:54,880][09931] Stopping RolloutWorker_w2...
[2023-09-12 13:22:54,880][09964] Loop rollout_proc5_evt_loop terminating...
[2023-09-12 13:22:54,880][09931] Loop rollout_proc2_evt_loop terminating...
[2023-09-12 13:22:54,881][09929] Stopping RolloutWorker_w1...
[2023-09-12 13:22:54,881][09929] Loop rollout_proc1_evt_loop terminating...
[2023-09-12 13:22:54,882][09967] Stopping RolloutWorker_w4...
[2023-09-12 13:22:54,882][09967] Loop rollout_proc4_evt_loop terminating...
[2023-09-12 13:22:54,885][09928] Weights refcount: 2 0
[2023-09-12 13:22:54,887][09928] Stopping InferenceWorker_p0-w0...
[2023-09-12 13:22:54,887][09928] Loop inference_proc0-0_evt_loop terminating...
[2023-09-12 13:22:54,931][09743] Saving /home/cogstack/Documents/optuna/environments/sample_factory/train_dir/default_experiment/checkpoint_p0/checkpoint_000000246_1007616.pth...
[2023-09-12 13:22:55,021][09743] Stopping LearnerWorker_p0...
[2023-09-12 13:22:55,022][09743] Loop learner_proc0_evt_loop terminating...