+ deepspeed --master_port 16182 --module safe_rlhf.finetune --train_datasets inverse-json::/home/hansirui_1st/jiayi/resist/setting3/safety_data/training/safe/safe_30k.json --model_name_or_path /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T --max_length 2048 --trust_remote_code True --epochs 1 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --gradient_accumulation_steps 8 --gradient_checkpointing --learning_rate 1e-5 --lr_warmup_ratio 0 --weight_decay 0.0 --lr_scheduler_type constant --weight_decay 0.0 --seed 42 --output_dir /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-30k --log_type wandb --log_run_name tinyllama-1T-s3-Q1-30k --log_project Inverse_Alignment --zero_stage 3 --offload none --bf16 True --tf32 True --save_16bit
[rank4]:[W529 04:24:27.827578493 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 4] using GPU 4 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank2]:[W529 04:24:27.827601722 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank7]:[W529 04:24:27.899692720 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 7] using GPU 7 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank5]:[W529 04:24:27.938652087 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 5] using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank6]:[W529 04:24:27.983457860 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 6] using GPU 6 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank1]:[W529 04:24:27.991657559 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank0]:[W529 04:24:27.079396730 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank3]:[W529 04:24:27.105063926 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
Will use torch_dtype=torch.float32 as defined in model's config object
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Will use torch_dtype=torch.float32 as defined in model's config object
Will use torch_dtype=torch.float32 as defined in model's config object
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.model
loading file tokenizer.json
loading file tokenizer.json
loading file added_tokens.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/hansirui_1st/.cache/torch_extensions/py311_cu124/fused_adam/build.ninja...
/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/torch/utils/cpp_extension.py:2059: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
wandb: Currently logged in as: xtom to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.8
wandb: Run data is saved locally in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-30k/wandb/run-20250529_042445-lcs5hdas
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run tinyllama-1T-s3-Q1-30k
wandb: ⭐️ View project at https://wandb.ai/xtom/Inverse_Alignment
wandb: 🚀 View run at https://wandb.ai/xtom/Inverse_Alignment/runs/lcs5hdas
Training 1/1 epoch: 0%| | 0/938 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training 1/1 epoch (loss 2.3642): 0%| | 0/938 [00:10<?, ?it/s] Training 1/1 epoch (loss 2.3642): 0%| | 1/938 [00:10<2:46:11, 10.64s/it] Training 1/1 epoch (loss 2.3665): 0%| | 1/938 [00:13<2:46:11, 10.64s/it] Training 1/1 epoch (loss 2.3665): 0%| | 2/938 [00:13<1:34:10, 6.04s/it] Training 1/1 epoch (loss 2.3674): 0%| | 2/938 [00:15<1:34:10, 6.04s/it] Training 1/1 epoch (loss 2.3674): 0%| | 3/938 [00:15<1:03:17, 4.06s/it] Training 1/1 epoch (loss 2.3937): 0%| | 3/938 [00:16<1:03:17, 4.06s/it] Training 1/1 epoch (loss 2.3937): 0%| | 4/938 [00:16<48:32, 3.12s/it] Training 1/1 epoch (loss 2.2683): 0%| | 4/938 [00:18<48:32, 3.12s/it] Training 1/1 epoch (loss 2.2683): 1%| | 5/938 [00:18<40:16, 2.59s/it] Training 1/1 epoch (loss 2.2575): 1%| | 5/938 [00:20<40:16, 2.59s/it] Training 1/1 epoch (loss 2.2575): 1%| | 6/938 [00:20<34:43, 2.24s/it] Training 1/1 epoch (loss 2.2841): 1%| | 6/938 [00:21<34:43, 2.24s/it] Training 1/1 epoch (loss 2.2841): 1%| | 7/938 [00:21<30:33, 1.97s/it] Training 1/1 epoch (loss 2.3704): 1%| | 7/938 [00:23<30:33, 1.97s/it] Training 1/1 epoch (loss 2.3704): 1%| | 8/938 [00:23<30:06, 1.94s/it] Training 1/1 epoch (loss 2.2207): 1%| | 8/938 [00:24<30:06, 1.94s/it] Training 1/1 epoch (loss 2.2207): 1%| | 9/938 [00:24<25:54, 1.67s/it] Training 1/1 epoch (loss 2.2311): 1%| | 9/938 [00:25<25:54, 1.67s/it] Training 1/1 epoch (loss 2.2311): 1%| | 10/938 [00:25<22:47, 1.47s/it] Training 1/1 epoch (loss 2.2751): 1%| | 10/938 [00:26<22:47, 1.47s/it] Training 1/1 epoch (loss 2.2751): 1%| | 11/938 [00:26<22:29, 1.46s/it] Training 1/1 epoch (loss 2.3901): 1%| | 11/938 [00:29<22:29, 1.46s/it] Training 1/1 epoch (loss 2.3901): 1%|▏ | 12/938 [00:29<25:54, 1.68s/it] Training 1/1 epoch (loss 2.3838): 1%|▏ | 12/938 [00:31<25:54, 1.68s/it] Training 1/1 epoch (loss 2.3838): 1%|▏ | 13/938 [00:31<27:37, 1.79s/it] Training 1/1 epoch (loss 2.1834): 1%|▏ | 13/938 [00:32<27:37, 1.79s/it] Training 1/1 epoch (loss 2.1834): 1%|▏ | 14/938 [00:32<27:48, 1.81s/it] Training 1/1 epoch 
(loss 2.3654): 1%|▏ | 14/938 [00:34<27:48, 1.81s/it] Training 1/1 epoch (loss 2.3654): 2%|▏ | 15/938 [00:34<27:09, 1.77s/it] Training 1/1 epoch (loss 2.3359): 2%|▏ | 15/938 [00:36<27:09, 1.77s/it] Training 1/1 epoch (loss 2.3359): 2%|▏ | 16/938 [00:36<26:31, 1.73s/it] Training 1/1 epoch (loss 2.1469): 2%|▏ | 16/938 [00:37<26:31, 1.73s/it] Training 1/1 epoch (loss 2.1469): 2%|▏ | 17/938 [00:37<24:45, 1.61s/it] Training 1/1 epoch (loss 2.3188): 2%|▏ | 17/938 [00:39<24:45, 1.61s/it] Training 1/1 epoch (loss 2.3188): 2%|▏ | 18/938 [00:39<26:38, 1.74s/it] Training 1/1 epoch (loss 2.2120): 2%|▏ | 18/938 [00:41<26:38, 1.74s/it] Training 1/1 epoch (loss 2.2120): 2%|▏ | 19/938 [00:41<28:07, 1.84s/it] Training 1/1 epoch (loss 2.2399): 2%|▏ | 19/938 [00:44<28:07, 1.84s/it] Training 1/1 epoch (loss 2.2399): 2%|▏ | 20/938 [00:44<30:25, 1.99s/it] Training 1/1 epoch (loss 2.2323): 2%|▏ | 20/938 [00:46<30:25, 1.99s/it] Training 1/1 epoch (loss 2.2323): 2%|▏ | 21/938 [00:46<30:18, 1.98s/it] Training 1/1 epoch (loss 2.4383): 2%|▏ | 21/938 [00:47<30:18, 1.98s/it] Training 1/1 epoch (loss 2.4383): 2%|▏ | 22/938 [00:47<29:47, 1.95s/it] Training 1/1 epoch (loss 2.1563): 2%|▏ | 22/938 [00:49<29:47, 1.95s/it] Training 1/1 epoch (loss 2.1563): 2%|▏ | 23/938 [00:49<28:58, 1.90s/it] Training 1/1 epoch (loss 2.1938): 2%|▏ | 23/938 [00:51<28:58, 1.90s/it] Training 1/1 epoch (loss 2.1938): 3%|▎ | 24/938 [00:51<28:44, 1.89s/it] Training 1/1 epoch (loss 2.3226): 3%|▎ | 24/938 [00:52<28:44, 1.89s/it] Training 1/1 epoch (loss 2.3226): 3%|▎ | 25/938 [00:52<25:03, 1.65s/it] Training 1/1 epoch (loss 2.3257): 3%|▎ | 25/938 [00:54<25:03, 1.65s/it] Training 1/1 epoch (loss 2.3257): 3%|▎ | 26/938 [00:54<26:30, 1.74s/it] Training 1/1 epoch (loss 2.1772): 3%|▎ | 26/938 [00:56<26:30, 1.74s/it] Training 1/1 epoch (loss 2.1772): 3%|▎ | 27/938 [00:56<25:38, 1.69s/it] Training 1/1 epoch (loss 2.2153): 3%|▎ | 27/938 [00:58<25:38, 1.69s/it] Training 1/1 epoch (loss 2.2153): 3%|▎ | 28/938 
[00:58<26:36, 1.75s/it] Training 1/1 epoch (loss 2.3130): 3%|▎ | 28/938 [00:59<26:36, 1.75s/it] Training 1/1 epoch (loss 2.3130): 3%|▎ | 29/938 [00:59<27:27, 1.81s/it] Training 1/1 epoch (loss 2.3894): 3%|▎ | 29/938 [01:01<27:27, 1.81s/it] Training 1/1 epoch (loss 2.3894): 3%|▎ | 30/938 [01:01<25:25, 1.68s/it] Training 1/1 epoch (loss 2.2545): 3%|▎ | 30/938 [01:03<25:25, 1.68s/it] Training 1/1 epoch (loss 2.2545): 3%|▎ | 31/938 [01:03<28:00, 1.85s/it] Training 1/1 epoch (loss 2.2425): 3%|▎ | 31/938 [01:06<28:00, 1.85s/it] Training 1/1 epoch (loss 2.2425): 3%|▎ | 32/938 [01:06<32:50, 2.17s/it] Training 1/1 epoch (loss 2.1617): 3%|▎ | 32/938 [01:07<32:50, 2.17s/it] Training 1/1 epoch (loss 2.1617): 4%|▎ | 33/938 [01:07<28:34, 1.89s/it] Training 1/1 epoch (loss 2.2476): 4%|▎ | 33/938 [01:09<28:34, 1.89s/it] Training 1/1 epoch (loss 2.2476): 4%|▎ | 34/938 [01:09<27:23, 1.82s/it] Training 1/1 epoch (loss 2.1486): 4%|▎ | 34/938 [01:11<27:23, 1.82s/it] Training 1/1 epoch (loss 2.1486): 4%|▎ | 35/938 [01:11<27:44, 1.84s/it] Training 1/1 epoch (loss 2.1386): 4%|▎ | 35/938 [01:13<27:44, 1.84s/it] Training 1/1 epoch (loss 2.1386): 4%|▍ | 36/938 [01:13<27:02, 1.80s/it] Training 1/1 epoch (loss 2.2049): 4%|▍ | 36/938 [01:14<27:02, 1.80s/it] Training 1/1 epoch (loss 2.2049): 4%|▍ | 37/938 [01:14<27:01, 1.80s/it] Training 1/1 epoch (loss 2.2077): 4%|▍ | 37/938 [01:16<27:01, 1.80s/it] Training 1/1 epoch (loss 2.2077): 4%|▍ | 38/938 [01:16<28:34, 1.91s/it] Training 1/1 epoch (loss 2.1831): 4%|▍ | 38/938 [01:18<28:34, 1.91s/it] Training 1/1 epoch (loss 2.1831): 4%|▍ | 39/938 [01:18<26:01, 1.74s/it] Training 1/1 epoch (loss 2.1193): 4%|▍ | 39/938 [01:19<26:01, 1.74s/it] Training 1/1 epoch (loss 2.1193): 4%|▍ | 40/938 [01:19<25:09, 1.68s/it] Training 1/1 epoch (loss 2.1246): 4%|▍ | 40/938 [01:22<25:09, 1.68s/it] Training 1/1 epoch (loss 2.1246): 4%|▍ | 41/938 [01:22<28:28, 1.91s/it] Training 1/1 epoch (loss 2.2744): 4%|▍ | 41/938 [01:24<28:28, 1.91s/it] 
Training 1/1 epoch (loss 2.2744): 4%|▍ | 42/938 [01:24<28:57, 1.94s/it] Training 1/1 epoch (loss 2.1316): 4%|▍ | 42/938 [01:26<28:57, 1.94s/it] Training 1/1 epoch (loss 2.1316): 5%|▍ | 43/938 [01:26<29:48, 2.00s/it] Training 1/1 epoch (loss 2.1229): 5%|▍ | 43/938 [01:27<29:48, 2.00s/it] Training 1/1 epoch (loss 2.1229): 5%|▍ | 44/938 [01:27<27:33, 1.85s/it] Training 1/1 epoch (loss 2.0377): 5%|▍ | 44/938 [01:29<27:33, 1.85s/it] Training 1/1 epoch (loss 2.0377): 5%|▍ | 45/938 [01:29<26:31, 1.78s/it] Training 1/1 epoch (loss 2.1367): 5%|▍ | 45/938 [01:31<26:31, 1.78s/it] Training 1/1 epoch (loss 2.1367): 5%|▍ | 46/938 [01:31<26:31, 1.78s/it] Training 1/1 epoch (loss 2.1473): 5%|▍ | 46/938 [01:33<26:31, 1.78s/it] Training 1/1 epoch (loss 2.1473): 5%|▌ | 47/938 [01:33<26:49, 1.81s/it] Training 1/1 epoch (loss 2.1490): 5%|▌ | 47/938 [01:35<26:49, 1.81s/it] Training 1/1 epoch (loss 2.1490): 5%|▌ | 48/938 [01:35<29:15, 1.97s/it] Training 1/1 epoch (loss 2.1981): 5%|▌ | 48/938 [01:36<29:15, 1.97s/it] Training 1/1 epoch (loss 2.1981): 5%|▌ | 49/938 [01:36<26:14, 1.77s/it] Training 1/1 epoch (loss 2.0060): 5%|▌ | 49/938 [01:38<26:14, 1.77s/it] Training 1/1 epoch (loss 2.0060): 5%|▌ | 50/938 [01:38<26:21, 1.78s/it] Training 1/1 epoch (loss 2.0804): 5%|▌ | 50/938 [01:40<26:21, 1.78s/it] Training 1/1 epoch (loss 2.0804): 5%|▌ | 51/938 [01:40<25:26, 1.72s/it] Training 1/1 epoch (loss 2.0610): 5%|▌ | 51/938 [01:41<25:26, 1.72s/it] Training 1/1 epoch (loss 2.0610): 6%|▌ | 52/938 [01:41<25:11, 1.71s/it] Training 1/1 epoch (loss 2.0236): 6%|▌ | 52/938 [01:42<25:11, 1.71s/it] Training 1/1 epoch (loss 2.0236): 6%|▌ | 53/938 [01:42<21:06, 1.43s/it] Training 1/1 epoch (loss 2.1761): 6%|▌ | 53/938 [01:43<21:06, 1.43s/it] Training 1/1 epoch (loss 2.1761): 6%|▌ | 54/938 [01:43<18:32, 1.26s/it] Training 1/1 epoch (loss 2.1312): 6%|▌ | 54/938 [01:45<18:32, 1.26s/it] Training 1/1 epoch (loss 2.1312): 6%|▌ | 55/938 [01:45<23:08, 1.57s/it] Training 1/1 epoch 
(loss 2.0028): 6%|▌ | 55/938 [01:48<23:08, 1.57s/it] Training 1/1 epoch (loss 2.0028): 6%|▌ | 56/938 [01:48<27:39, 1.88s/it] Training 1/1 epoch (loss 2.0148): 6%|▌ | 56/938 [01:50<27:39, 1.88s/it] Training 1/1 epoch (loss 2.0148): 6%|▌ | 57/938 [01:50<28:20, 1.93s/it] Training 1/1 epoch (loss 2.0791): 6%|▌ | 57/938 [01:51<28:20, 1.93s/it] Training 1/1 epoch (loss 2.0791): 6%|▌ | 58/938 [01:51<26:07, 1.78s/it] Training 1/1 epoch (loss 2.0438): 6%|▌ | 58/938 [01:54<26:07, 1.78s/it] Training 1/1 epoch (loss 2.0438): 6%|▋ | 59/938 [01:54<27:39, 1.89s/it] Training 1/1 epoch (loss 2.0343): 6%|▋ | 59/938 [01:55<27:39, 1.89s/it] Training 1/1 epoch (loss 2.0343): 6%|▋ | 60/938 [01:55<25:59, 1.78s/it] Training 1/1 epoch (loss 2.0962): 6%|▋ | 60/938 [01:56<25:59, 1.78s/it] Training 1/1 epoch (loss 2.0962): 7%|▋ | 61/938 [01:56<23:13, 1.59s/it] Training 1/1 epoch (loss 2.0811): 7%|▋ | 61/938 [01:59<23:13, 1.59s/it] Training 1/1 epoch (loss 2.0811): 7%|▋ | 62/938 [01:59<26:39, 1.83s/it] Training 1/1 epoch (loss 2.0461): 7%|▋ | 62/938 [02:00<26:39, 1.83s/it] Training 1/1 epoch (loss 2.0461): 7%|▋ | 63/938 [02:00<25:42, 1.76s/it] Training 1/1 epoch (loss 2.0340): 7%|▋ | 63/938 [02:02<25:42, 1.76s/it] Training 1/1 epoch (loss 2.0340): 7%|▋ | 64/938 [02:02<26:20, 1.81s/it] Training 1/1 epoch (loss 1.9828): 7%|▋ | 64/938 [02:04<26:20, 1.81s/it] Training 1/1 epoch (loss 1.9828): 7%|▋ | 65/938 [02:04<27:38, 1.90s/it] Training 1/1 epoch (loss 2.1134): 7%|▋ | 65/938 [02:06<27:38, 1.90s/it] Training 1/1 epoch (loss 2.1134): 7%|▋ | 66/938 [02:06<28:26, 1.96s/it] Training 1/1 epoch (loss 2.0507): 7%|▋ | 66/938 [02:08<28:26, 1.96s/it] Training 1/1 epoch (loss 2.0507): 7%|▋ | 67/938 [02:08<24:49, 1.71s/it] Training 1/1 epoch (loss 2.0059): 7%|▋ | 67/938 [02:10<24:49, 1.71s/it] Training 1/1 epoch (loss 2.0059): 7%|▋ | 68/938 [02:10<27:56, 1.93s/it] Training 1/1 epoch (loss 2.0370): 7%|▋ | 68/938 [02:12<27:56, 1.93s/it] Training 1/1 epoch 
(loss 2.0370): 7%|▋ | 69/938 [02:12<27:44, 1.92s/it] Training 1/1 epoch (loss 2.1034): 7%|▋ | 69/938 [02:13<27:44, 1.92s/it] Training 1/1 epoch (loss 2.1034): 7%|▋ | 70/938 [02:13<26:27, 1.83s/it] Training 1/1 epoch (loss 1.9836): 7%|▋ | 70/938 [02:15<26:27, 1.83s/it] Training 1/1 epoch (loss 1.9836): 8%|▊ | 71/938 [02:15<26:27, 1.83s/it] Training 1/1 epoch (loss 2.0255): 8%|▊ | 71/938 [02:17<26:27, 1.83s/it] Training 1/1 epoch (loss 2.0255): 8%|▊ | 72/938 [02:17<27:37, 1.91s/it] Training 1/1 epoch (loss 1.9502): 8%|▊ | 72/938 [02:19<27:37, 1.91s/it] Training 1/1 epoch (loss 1.9502): 8%|▊ | 73/938 [02:19<27:50, 1.93s/it] Training 1/1 epoch (loss 1.9574): 8%|▊ | 73/938 [02:21<27:50, 1.93s/it] Training 1/1 epoch (loss 1.9574): 8%|▊ | 74/938 [02:21<28:20, 1.97s/it] Training 1/1 epoch (loss 1.9175): 8%|▊ | 74/938 [02:23<28:20, 1.97s/it] Training 1/1 epoch (loss 1.9175): 8%|▊ | 75/938 [02:23<26:07, 1.82s/it] Training 1/1 epoch (loss 2.0255): 8%|▊ | 75/938 [02:24<26:07, 1.82s/it] Training 1/1 epoch (loss 2.0255): 8%|▊ | 76/938 [02:24<24:19, 1.69s/it] Training 1/1 epoch (loss 1.8727): 8%|▊ | 76/938 [02:27<24:19, 1.69s/it] Training 1/1 epoch (loss 1.8727): 8%|▊ | 77/938 [02:27<26:42, 1.86s/it] Training 1/1 epoch (loss 2.0836): 8%|▊ | 77/938 [02:29<26:42, 1.86s/it] Training 1/1 epoch (loss 2.0836): 8%|▊ | 78/938 [02:29<28:52, 2.01s/it] Training 1/1 epoch (loss 2.0942): 8%|▊ | 78/938 [02:31<28:52, 2.01s/it] Training 1/1 epoch (loss 2.0942): 8%|▊ | 79/938 [02:31<27:55, 1.95s/it] Training 1/1 epoch (loss 2.0371): 8%|▊ | 79/938 [02:32<27:55, 1.95s/it] Training 1/1 epoch (loss 2.0371): 9%|▊ | 80/938 [02:32<25:13, 1.76s/it] Training 1/1 epoch (loss 2.0046): 9%|▊ | 80/938 [02:34<25:13, 1.76s/it] Training 1/1 epoch (loss 2.0046): 9%|▊ | 81/938 [02:34<25:25, 1.78s/it] Training 1/1 epoch (loss 1.8833): 9%|▊ | 81/938 [02:36<25:25, 1.78s/it] Training 1/1 epoch (loss 1.8833): 9%|▊ | 82/938 [02:36<24:48, 1.74s/it] Training 1/1 epoch 
(loss 1.9973): 9%|▊ | 82/938 [02:37<24:48, 1.74s/it] Training 1/1 epoch (loss 1.9973): 9%|▉ | 83/938 [02:37<25:32, 1.79s/it] Training 1/1 epoch (loss 1.8684): 9%|▉ | 83/938 [02:39<25:32, 1.79s/it] Training 1/1 epoch (loss 1.8684): 9%|▉ | 84/938 [02:39<24:09, 1.70s/it] Training 1/1 epoch (loss 1.9141): 9%|▉ | 84/938 [02:40<24:09, 1.70s/it] Training 1/1 epoch (loss 1.9141): 9%|▉ | 85/938 [02:40<21:32, 1.52s/it] Training 1/1 epoch (loss 1.9408): 9%|▉ | 85/938 [02:42<21:32, 1.52s/it] Training 1/1 epoch (loss 1.9408): 9%|▉ | 86/938 [02:42<25:04, 1.77s/it] Training 1/1 epoch (loss 1.9982): 9%|▉ | 86/938 [02:44<25:04, 1.77s/it] Training 1/1 epoch (loss 1.9982): 9%|▉ | 87/938 [02:44<24:40, 1.74s/it] Training 1/1 epoch (loss 1.8733): 9%|▉ | 87/938 [02:47<24:40, 1.74s/it] Training 1/1 epoch (loss 1.8733): 9%|▉ | 88/938 [02:47<28:08, 1.99s/it] Training 1/1 epoch (loss 1.9091): 9%|▉ | 88/938 [02:48<28:08, 1.99s/it] Training 1/1 epoch (loss 1.9091): 9%|▉ | 89/938 [02:48<27:13, 1.92s/it] Training 1/1 epoch (loss 1.9175): 9%|▉ | 89/938 [02:50<27:13, 1.92s/it] Training 1/1 epoch (loss 1.9175): 10%|▉ | 90/938 [02:50<25:08, 1.78s/it] Training 1/1 epoch (loss 2.0434): 10%|▉ | 90/938 [02:51<25:08, 1.78s/it] Training 1/1 epoch (loss 2.0434): 10%|▉ | 91/938 [02:51<23:17, 1.65s/it] Training 1/1 epoch (loss 1.8274): 10%|▉ | 91/938 [02:54<23:17, 1.65s/it] Training 1/1 epoch (loss 1.8274): 10%|▉ | 92/938 [02:54<26:40, 1.89s/it] Training 1/1 epoch (loss 2.0520): 10%|▉ | 92/938 [02:55<26:40, 1.89s/it] Training 1/1 epoch (loss 2.0520): 10%|▉ | 93/938 [02:55<23:05, 1.64s/it] Training 1/1 epoch (loss 1.9133): 10%|▉ | 93/938 [02:57<23:05, 1.64s/it] Training 1/1 epoch (loss 1.9133): 10%|█ | 94/938 [02:57<24:15, 1.72s/it] Training 1/1 epoch (loss 1.8978): 10%|█ | 94/938 [02:58<24:15, 1.72s/it] Training 1/1 epoch (loss 1.8978): 10%|█ | 95/938 [02:58<21:57, 1.56s/it] Training 1/1 epoch (loss 1.9623): 10%|█ | 95/938 [03:00<21:57, 1.56s/it] Training 
1/1 epoch (loss 1.9623): 10%|█ | 96/938 [03:00<22:45, 1.62s/it] Training 1/1 epoch (loss 1.9836): 10%|█ | 96/938 [03:01<22:45, 1.62s/it] Training 1/1 epoch (loss 1.9836): 10%|█ | 97/938 [03:01<23:55, 1.71s/it] Training 1/1 epoch (loss 1.8903): 10%|█ | 97/938 [03:04<23:55, 1.71s/it] Training 1/1 epoch (loss 1.8903): 10%|█ | 98/938 [03:04<25:41, 1.83s/it] Training 1/1 epoch (loss 1.9490): 10%|█ | 98/938 [03:05<25:41, 1.83s/it] Training 1/1 epoch (loss 1.9490): 11%|█ | 99/938 [03:05<22:42, 1.62s/it] Training 1/1 epoch (loss 1.7534): 11%|█ | 99/938 [03:06<22:42, 1.62s/it] Training 1/1 epoch (loss 1.7534): 11%|█ | 100/938 [03:06<22:39, 1.62s/it] Training 1/1 epoch (loss 1.8693): 11%|█ | 100/938 [03:08<22:39, 1.62s/it] Training 1/1 epoch (loss 1.8693): 11%|█ | 101/938 [03:08<22:50, 1.64s/it] Training 1/1 epoch (loss 1.8911): 11%|█ | 101/938 [03:09<22:50, 1.64s/it] Training 1/1 epoch (loss 1.8911): 11%|█ | 102/938 [03:09<21:08, 1.52s/it] Training 1/1 epoch (loss 1.8560): 11%|█ | 102/938 [03:11<21:08, 1.52s/it] Training 1/1 epoch (loss 1.8560): 11%|█ | 103/938 [03:11<21:50, 1.57s/it] Training 1/1 epoch (loss 1.9776): 11%|█ | 103/938 [03:12<21:50, 1.57s/it] Training 1/1 epoch (loss 1.9776): 11%|█ | 104/938 [03:12<20:36, 1.48s/it] Training 1/1 epoch (loss 1.9027): 11%|█ | 104/938 [03:13<20:36, 1.48s/it] Training 1/1 epoch (loss 1.9027): 11%|█ | 105/938 [03:13<19:34, 1.41s/it] Training 1/1 epoch (loss 1.9151): 11%|█ | 105/938 [03:15<19:34, 1.41s/it] Training 1/1 epoch (loss 1.9151): 11%|█▏ | 106/938 [03:15<21:37, 1.56s/it] Training 1/1 epoch (loss 1.9571): 11%|█▏ | 106/938 [03:17<21:37, 1.56s/it] Training 1/1 epoch (loss 1.9571): 11%|█▏ | 107/938 [03:17<22:24, 1.62s/it] Training 1/1 epoch (loss 2.0114): 11%|█▏ | 107/938 [03:19<22:24, 1.62s/it] Training 1/1 epoch (loss 2.0114): 12%|█▏ | 108/938 [03:19<22:14, 1.61s/it] Training 1/1 epoch (loss 1.9454): 12%|█▏ | 108/938 [03:21<22:14, 1.61s/it] Training 1/1 epoch 
(loss 1.9454): 12%|█▏ | 109/938 [03:21<24:16, 1.76s/it] Training 1/1 epoch (loss 1.9584): 12%|█▏ | 109/938 [03:23<24:16, 1.76s/it] Training 1/1 epoch (loss 1.9584): 12%|█▏ | 110/938 [03:23<26:03, 1.89s/it] Training 1/1 epoch (loss 1.8025): 12%|█▏ | 110/938 [03:25<26:03, 1.89s/it] Training 1/1 epoch (loss 1.8025): 12%|█▏ | 111/938 [03:25<25:44, 1.87s/it] Training 1/1 epoch (loss 1.9291): 12%|█▏ | 111/938 [03:27<25:44, 1.87s/it] Training 1/1 epoch (loss 1.9291): 12%|█▏ | 112/938 [03:27<28:49, 2.09s/it] Training 1/1 epoch (loss 1.8292): 12%|█▏ | 112/938 [03:29<28:49, 2.09s/it] Training 1/1 epoch (loss 1.8292): 12%|█▏ | 113/938 [03:29<26:16, 1.91s/it] Training 1/1 epoch (loss 1.8177): 12%|█▏ | 113/938 [03:30<26:16, 1.91s/it] Training 1/1 epoch (loss 1.8177): 12%|█▏ | 114/938 [03:30<23:46, 1.73s/it] Training 1/1 epoch (loss 1.8093): 12%|█▏ | 114/938 [03:32<23:46, 1.73s/it] Training 1/1 epoch (loss 1.8093): 12%|█▏ | 115/938 [03:32<22:45, 1.66s/it] Training 1/1 epoch (loss 1.9370): 12%|█▏ | 115/938 [03:34<22:45, 1.66s/it] Training 1/1 epoch (loss 1.9370): 12%|█▏ | 116/938 [03:34<23:42, 1.73s/it] Training 1/1 epoch (loss 1.8238): 12%|█▏ | 116/938 [03:35<23:42, 1.73s/it] Training 1/1 epoch (loss 1.8238): 12%|█▏ | 117/938 [03:35<23:18, 1.70s/it] Training 1/1 epoch (loss 1.8799): 12%|█▏ | 117/938 [03:37<23:18, 1.70s/it] Training 1/1 epoch (loss 1.8799): 13%|█▎ | 118/938 [03:37<22:34, 1.65s/it] Training 1/1 epoch (loss 1.9194): 13%|█▎ | 118/938 [03:38<22:34, 1.65s/it] Training 1/1 epoch (loss 1.9194): 13%|█▎ | 119/938 [03:38<19:11, 1.41s/it] Training 1/1 epoch (loss 1.7945): 13%|█▎ | 119/938 [03:39<19:11, 1.41s/it] Training 1/1 epoch (loss 1.7945): 13%|█▎ | 120/938 [03:39<19:56, 1.46s/it] Training 1/1 epoch (loss 1.8436): 13%|█▎ | 120/938 [03:41<19:56, 1.46s/it] Training 1/1 epoch (loss 1.8436): 13%|█▎ | 121/938 [03:41<22:11, 1.63s/it] Training 1/1 epoch (loss 1.8499): 
13%|█▎ | 121/938 [03:43<22:11, 1.63s/it] Training 1/1 epoch (loss 1.8499): 13%|█▎ | 122/938 [03:43<20:58, 1.54s/it] Training 1/1 epoch (loss 1.8290): 13%|█▎ | 122/938 [03:45<20:58, 1.54s/it] Training 1/1 epoch (loss 1.8290): 13%|█▎ | 123/938 [03:45<22:38, 1.67s/it] Training 1/1 epoch (loss 1.8121): 13%|█▎ | 123/938 [03:47<22:38, 1.67s/it] Training 1/1 epoch (loss 1.8121): 13%|█▎ | 124/938 [03:47<25:42, 1.90s/it] Training 1/1 epoch (loss 1.9223): 13%|█▎ | 124/938 [03:49<25:42, 1.90s/it] Training 1/1 epoch (loss 1.9223): 13%|█▎ | 125/938 [03:49<24:24, 1.80s/it] Training 1/1 epoch (loss 1.9285): 13%|█▎ | 125/938 [03:51<24:24, 1.80s/it] Training 1/1 epoch (loss 1.9285): 13%|█▎ | 126/938 [03:51<26:49, 1.98s/it] Training 1/1 epoch (loss 1.7297): 13%|█▎ | 126/938 [03:53<26:49, 1.98s/it] Training 1/1 epoch (loss 1.7297): 14%|█▎ | 127/938 [03:53<25:35, 1.89s/it] Training 1/1 epoch (loss 1.7822): 14%|█▎ | 127/938 [03:55<25:35, 1.89s/it] Training 1/1 epoch (loss 1.7822): 14%|█▎ | 128/938 [03:55<27:44, 2.05s/it] Training 1/1 epoch (loss 1.8257): 14%|█▎ | 128/938 [03:57<27:44, 2.05s/it] Training 1/1 epoch (loss 1.8257): 14%|█▍ | 129/938 [03:57<28:01, 2.08s/it] Training 1/1 epoch (loss 1.7592): 14%|█▍ | 129/938 [03:59<28:01, 2.08s/it] Training 1/1 epoch (loss 1.7592): 14%|█▍ | 130/938 [03:59<28:10, 2.09s/it] Training 1/1 epoch (loss 1.8317): 14%|█▍ | 130/938 [04:01<28:10, 2.09s/it] Training 1/1 epoch (loss 1.8317): 14%|█▍ | 131/938 [04:01<27:17, 2.03s/it] Training 1/1 epoch (loss 1.7636): 14%|█▍ | 131/938 [04:03<27:17, 2.03s/it] Training 1/1 epoch (loss 1.7636): 14%|█▍ | 132/938 [04:03<27:03, 2.01s/it] Training 1/1 epoch (loss 1.8268): 14%|█▍ | 132/938 [04:04<27:03, 2.01s/it] Training 1/1 epoch (loss 1.8268): 14%|█▍ | 133/938 [04:04<23:19, 1.74s/it] Training 1/1 epoch (loss 1.9469): 14%|█▍ | 133/938 [04:06<23:19, 1.74s/it] Training 1/1 epoch (loss 1.9469): 14%|█▍ | 134/938 
[04:06<22:52, 1.71s/it] Training 1/1 epoch (loss 1.7587): 14%|█▍ | 134/938 [04:07<22:52, 1.71s/it] Training 1/1 epoch (loss 1.7587): 14%|█▍ | 135/938 [04:07<21:59, 1.64s/it] Training 1/1 epoch (loss 1.8409): 14%|█▍ | 135/938 [04:09<21:59, 1.64s/it] Training 1/1 epoch (loss 1.8409): 14%|█▍ | 136/938 [04:09<22:09, 1.66s/it] Training 1/1 epoch (loss 1.8444): 14%|█▍ | 136/938 [04:11<22:09, 1.66s/it] Training 1/1 epoch (loss 1.8444): 15%|█▍ | 137/938 [04:11<22:28, 1.68s/it] Training 1/1 epoch (loss 1.8162): 15%|█▍ | 137/938 [04:13<22:28, 1.68s/it] Training 1/1 epoch (loss 1.8162): 15%|█▍ | 138/938 [04:13<22:49, 1.71s/it] Training 1/1 epoch (loss 1.7810): 15%|█▍ | 138/938 [04:14<22:49, 1.71s/it] Training 1/1 epoch (loss 1.7810): 15%|█▍ | 139/938 [04:14<21:18, 1.60s/it] Training 1/1 epoch (loss 1.7721): 15%|█▍ | 139/938 [04:16<21:18, 1.60s/it] Training 1/1 epoch (loss 1.7721): 15%|█▍ | 140/938 [04:16<24:28, 1.84s/it] Training 1/1 epoch (loss 1.8406): 15%|█▍ | 140/938 [04:18<24:28, 1.84s/it] Training 1/1 epoch (loss 1.8406): 15%|█▌ | 141/938 [04:18<22:34, 1.70s/it] Training 1/1 epoch (loss 1.8185): 15%|█▌ | 141/938 [04:20<22:34, 1.70s/it] Training 1/1 epoch (loss 1.8185): 15%|█▌ | 142/938 [04:20<22:55, 1.73s/it] Training 1/1 epoch (loss 1.7882): 15%|█▌ | 142/938 [04:21<22:55, 1.73s/it] Training 1/1 epoch (loss 1.7882): 15%|█▌ | 143/938 [04:21<21:46, 1.64s/it] Training 1/1 epoch (loss 1.8870): 15%|█▌ | 143/938 [04:23<21:46, 1.64s/it] Training 1/1 epoch (loss 1.8870): 15%|█▌ | 144/938 [04:23<22:14, 1.68s/it] Training 1/1 epoch (loss 1.7977): 15%|█▌ | 144/938 [04:24<22:14, 1.68s/it] Training 1/1 epoch (loss 1.7977): 15%|█▌ | 145/938 [04:24<22:27, 1.70s/it] Training 1/1 epoch (loss 1.7764): 15%|█▌ | 145/938 [04:27<22:27, 1.70s/it] Training 1/1 epoch (loss 1.7764): 16%|█▌ | 146/938 [04:27<25:19, 1.92s/it] Training 1/1 epoch (loss 1.6524): 16%|█▌ | 146/938 [04:28<25:19, 1.92s/it] 
Training 1/1 epoch (loss 1.6524): 16%|█▌ | 147/938 [04:28<23:51, 1.81s/it] Training 1/1 epoch (loss 1.9128): 16%|█▌ | 147/938 [04:30<23:51, 1.81s/it] Training 1/1 epoch (loss 1.9128): 16%|█▌ | 148/938 [04:30<22:04, 1.68s/it] Training 1/1 epoch (loss 1.9650): 16%|█▌ | 148/938 [04:31<22:04, 1.68s/it] Training 1/1 epoch (loss 1.9650): 16%|█▌ | 149/938 [04:31<19:49, 1.51s/it] Training 1/1 epoch (loss 1.7469): 16%|█▌ | 149/938 [04:32<19:49, 1.51s/it] Training 1/1 epoch (loss 1.7469): 16%|█▌ | 150/938 [04:32<19:52, 1.51s/it] Training 1/1 epoch (loss 1.7817): 16%|█▌ | 150/938 [04:34<19:52, 1.51s/it] Training 1/1 epoch (loss 1.7817): 16%|█▌ | 151/938 [04:34<19:00, 1.45s/it] Training 1/1 epoch (loss 1.7564): 16%|█▌ | 151/938 [04:37<19:00, 1.45s/it] Training 1/1 epoch (loss 1.7564): 16%|█▌ | 152/938 [04:37<24:10, 1.85s/it] Training 1/1 epoch (loss 1.8061): 16%|█▌ | 152/938 [04:39<24:10, 1.85s/it] Training 1/1 epoch (loss 1.8061): 16%|█▋ | 153/938 [04:39<24:54, 1.90s/it] Training 1/1 epoch (loss 1.6536): 16%|█▋ | 153/938 [04:41<24:54, 1.90s/it] Training 1/1 epoch (loss 1.6536): 16%|█▋ | 154/938 [04:41<26:49, 2.05s/it] Training 1/1 epoch (loss 1.9130): 16%|█▋ | 154/938 [04:43<26:49, 2.05s/it] Training 1/1 epoch (loss 1.9130): 17%|█▋ | 155/938 [04:43<25:07, 1.93s/it] Training 1/1 epoch (loss 1.7844): 17%|█▋ | 155/938 [04:44<25:07, 1.93s/it] Training 1/1 epoch (loss 1.7844): 17%|█▋ | 156/938 [04:44<22:45, 1.75s/it] Training 1/1 epoch (loss 1.7976): 17%|█▋ | 156/938 [04:45<22:45, 1.75s/it] Training 1/1 epoch (loss 1.7976): 17%|█▋ | 157/938 [04:45<21:52, 1.68s/it] Training 1/1 epoch (loss 1.7527): 17%|█▋ | 157/938 [04:48<21:52, 1.68s/it] Training 1/1 epoch (loss 1.7527): 17%|█▋ | 158/938 [04:48<24:11, 1.86s/it] Training 1/1 epoch (loss 1.8635): 17%|█▋ | 158/938 [04:49<24:11, 1.86s/it] Training 1/1 epoch (loss 1.8635): 17%|█▋ | 159/938 [04:49<21:21, 1.64s/it] Training 1/1 epoch (loss 
1.8322): 17%|█▋ | 159/938 [04:51<21:21, 1.64s/it] Training 1/1 epoch (loss 1.8322): 17%|█▋ | 160/938 [04:51<21:19, 1.65s/it] Training 1/1 epoch (loss 1.7230): 17%|█▋ | 160/938 [04:52<21:19, 1.65s/it] Training 1/1 epoch (loss 1.7230): 17%|█▋ | 161/938 [04:52<19:43, 1.52s/it] Training 1/1 epoch (loss 1.7893): 17%|█▋ | 161/938 [04:53<19:43, 1.52s/it] Training 1/1 epoch (loss 1.7893): 17%|█▋ | 162/938 [04:53<20:00, 1.55s/it] Training 1/1 epoch (loss 1.8013): 17%|█▋ | 162/938 [04:55<20:00, 1.55s/it] Training 1/1 epoch (loss 1.8013): 17%|█▋ | 163/938 [04:55<22:04, 1.71s/it] Training 1/1 epoch (loss 1.7524): 17%|█▋ | 163/938 [04:57<22:04, 1.71s/it] Training 1/1 epoch (loss 1.7524): 17%|█▋ | 164/938 [04:57<20:41, 1.60s/it] Training 1/1 epoch (loss 1.8161): 17%|█▋ | 164/938 [04:58<20:41, 1.60s/it] Training 1/1 epoch (loss 1.8161): 18%|█▊ | 165/938 [04:58<19:53, 1.54s/it] Training 1/1 epoch (loss 1.7237): 18%|█▊ | 165/938 [05:00<19:53, 1.54s/it] Training 1/1 epoch (loss 1.7237): 18%|█▊ | 166/938 [05:00<22:27, 1.74s/it] Training 1/1 epoch (loss 1.8421): 18%|█▊ | 166/938 [05:02<22:27, 1.74s/it] Training 1/1 epoch (loss 1.8421): 18%|█▊ | 167/938 [05:02<23:10, 1.80s/it] Training 1/1 epoch (loss 1.6453): 18%|█▊ | 167/938 [05:05<23:10, 1.80s/it] Training 1/1 epoch (loss 1.6453): 18%|█▊ | 168/938 [05:05<24:39, 1.92s/it] Training 1/1 epoch (loss 1.9326): 18%|█▊ | 168/938 [05:07<24:39, 1.92s/it] Training 1/1 epoch (loss 1.9326): 18%|█▊ | 169/938 [05:07<24:37, 1.92s/it] Training 1/1 epoch (loss 1.7732): 18%|█▊ | 169/938 [05:09<24:37, 1.92s/it] Training 1/1 epoch (loss 1.7732): 18%|█▊ | 170/938 [05:09<26:36, 2.08s/it] Training 1/1 epoch (loss 1.7324): 18%|█▊ | 170/938 [05:11<26:36, 2.08s/it] Training 1/1 epoch (loss 1.7324): 18%|█▊ | 171/938 [05:11<26:48, 2.10s/it] Training 1/1 epoch (loss 1.7094): 18%|█▊ | 171/938 [05:13<26:48, 2.10s/it] Training 1/1 epoch (loss 1.7094): 18%|█▊ | 
172/938 [05:13<24:25, 1.91s/it] Training 1/1 epoch (loss 1.7811): 18%|█▊ | 172/938 [05:15<24:25, 1.91s/it] Training 1/1 epoch (loss 1.7811): 18%|█▊ | 173/938 [05:15<24:38, 1.93s/it] Training 1/1 epoch (loss 1.8619): 18%|█▊ | 173/938 [05:17<24:38, 1.93s/it] Training 1/1 epoch (loss 1.8619): 19%|█▊ | 174/938 [05:17<26:53, 2.11s/it] Training 1/1 epoch (loss 1.8221): 19%|█▊ | 174/938 [05:19<26:53, 2.11s/it] Training 1/1 epoch (loss 1.8221): 19%|█▊ | 175/938 [05:19<26:13, 2.06s/it] Training 1/1 epoch (loss 1.6411): 19%|█▊ | 175/938 [05:21<26:13, 2.06s/it] Training 1/1 epoch (loss 1.6411): 19%|█▉ | 176/938 [05:21<24:27, 1.93s/it] Training 1/1 epoch (loss 1.7428): 19%|█▉ | 176/938 [05:23<24:27, 1.93s/it] Training 1/1 epoch (loss 1.7428): 19%|█▉ | 177/938 [05:23<25:13, 1.99s/it] Training 1/1 epoch (loss 1.8113): 19%|█▉ | 177/938 [05:25<25:13, 1.99s/it] Training 1/1 epoch (loss 1.8113): 19%|█▉ | 178/938 [05:25<24:28, 1.93s/it] Training 1/1 epoch (loss 1.8721): 19%|█▉ | 178/938 [05:27<24:28, 1.93s/it] Training 1/1 epoch (loss 1.8721): 19%|█▉ | 179/938 [05:27<24:45, 1.96s/it] Training 1/1 epoch (loss 1.7056): 19%|█▉ | 179/938 [05:28<24:45, 1.96s/it] Training 1/1 epoch (loss 1.7056): 19%|█▉ | 180/938 [05:28<23:32, 1.86s/it] Training 1/1 epoch (loss 1.7595): 19%|█▉ | 180/938 [05:29<23:32, 1.86s/it] Training 1/1 epoch (loss 1.7595): 19%|█▉ | 181/938 [05:29<20:37, 1.63s/it] Training 1/1 epoch (loss 1.9005): 19%|█▉ | 181/938 [05:31<20:37, 1.63s/it] Training 1/1 epoch (loss 1.9005): 19%|█▉ | 182/938 [05:31<20:49, 1.65s/it] Training 1/1 epoch (loss 1.8799): 19%|█▉ | 182/938 [05:33<20:49, 1.65s/it] Training 1/1 epoch (loss 1.8799): 20%|█▉ | 183/938 [05:33<22:42, 1.80s/it] Training 1/1 epoch (loss 1.7763): 20%|█▉ | 183/938 [05:35<22:42, 1.80s/it] Training 1/1 epoch (loss 1.7763): 20%|█▉ | 184/938 [05:35<21:36, 1.72s/it] Training 1/1 epoch (loss 1.7889): 20%|█▉ | 184/938 [05:36<21:36, 
1.72s/it] Training 1/1 epoch (loss 1.7889): 20%|█▉ | 185/938 [05:36<21:45, 1.73s/it] Training 1/1 epoch (loss 1.7819): 20%|█▉ | 185/938 [05:39<21:45, 1.73s/it] Training 1/1 epoch (loss 1.7819): 20%|█▉ | 186/938 [05:39<22:58, 1.83s/it] Training 1/1 epoch (loss 1.6605): 20%|█▉ | 186/938 [05:40<22:58, 1.83s/it] Training 1/1 epoch (loss 1.6605): 20%|█▉ | 187/938 [05:40<21:50, 1.74s/it] Training 1/1 epoch (loss 1.7226): 20%|█▉ | 187/938 [05:42<21:50, 1.74s/it] Training 1/1 epoch (loss 1.7226): 20%|██ | 188/938 [05:42<20:46, 1.66s/it] Training 1/1 epoch (loss 1.7657): 20%|██ | 188/938 [05:43<20:46, 1.66s/it] Training 1/1 epoch (loss 1.7657): 20%|██ | 189/938 [05:43<19:07, 1.53s/it] Training 1/1 epoch (loss 2.0084): 20%|██ | 189/938 [05:45<19:07, 1.53s/it] Training 1/1 epoch (loss 2.0084): 20%|██ | 190/938 [05:45<20:23, 1.64s/it] Training 1/1 epoch (loss 1.7288): 20%|██ | 190/938 [05:47<20:23, 1.64s/it] Training 1/1 epoch (loss 1.7288): 20%|██ | 191/938 [05:47<21:54, 1.76s/it] Training 1/1 epoch (loss 1.7794): 20%|██ | 191/938 [05:49<21:54, 1.76s/it] Training 1/1 epoch (loss 1.7794): 20%|██ | 192/938 [05:49<25:11, 2.03s/it] Training 1/1 epoch (loss 1.7418): 20%|██ | 192/938 [05:51<25:11, 2.03s/it] Training 1/1 epoch (loss 1.7418): 21%|██ | 193/938 [05:51<22:46, 1.83s/it] Training 1/1 epoch (loss 1.7704): 21%|██ | 193/938 [05:53<22:46, 1.83s/it] Training 1/1 epoch (loss 1.7704): 21%|██ | 194/938 [05:53<24:11, 1.95s/it] Training 1/1 epoch (loss 1.7807): 21%|██ | 194/938 [05:54<24:11, 1.95s/it] Training 1/1 epoch (loss 1.7807): 21%|██ | 195/938 [05:54<22:00, 1.78s/it] Training 1/1 epoch (loss 1.8042): 21%|██ | 195/938 [05:57<22:00, 1.78s/it] Training 1/1 epoch (loss 1.8042): 21%|██ | 196/938 [05:57<24:23, 1.97s/it] Training 1/1 epoch (loss 1.7991): 21%|██ | 196/938 [05:58<24:23, 1.97s/it] Training 1/1 epoch (loss 1.7991): 21%|██ | 197/938 [05:58<23:20, 1.89s/it] Training 1/1 
epoch (loss 1.6624): 21%|██ | 197/938 [06:00<23:20, 1.89s/it] Training 1/1 epoch (loss 1.6624): 21%|██ | 198/938 [06:00<21:42, 1.76s/it] Training 1/1 epoch (loss 1.7888): 21%|██ | 198/938 [06:01<21:42, 1.76s/it] Training 1/1 epoch (loss 1.7888): 21%|██ | 199/938 [06:01<20:42, 1.68s/it] Training 1/1 epoch (loss 1.7322): 21%|██ | 199/938 [06:04<20:42, 1.68s/it] Training 1/1 epoch (loss 1.7322): 21%|██▏ | 200/938 [06:04<24:15, 1.97s/it] Training 1/1 epoch (loss 1.7721): 21%|██▏ | 200/938 [06:06<24:15, 1.97s/it] Training 1/1 epoch (loss 1.7721): 21%|██▏ | 201/938 [06:06<23:43, 1.93s/it] Training 1/1 epoch (loss 1.7515): 21%|██▏ | 201/938 [06:08<23:43, 1.93s/it] Training 1/1 epoch (loss 1.7515): 22%|██▏ | 202/938 [06:08<23:01, 1.88s/it] Training 1/1 epoch (loss 1.7759): 22%|██▏ | 202/938 [06:09<23:01, 1.88s/it] Training 1/1 epoch (loss 1.7759): 22%|██▏ | 203/938 [06:09<22:19, 1.82s/it] Training 1/1 epoch (loss 1.7998): 22%|██▏ | 203/938 [06:11<22:19, 1.82s/it] Training 1/1 epoch (loss 1.7998): 22%|██▏ | 204/938 [06:11<21:07, 1.73s/it] Training 1/1 epoch (loss 1.7702): 22%|██▏ | 204/938 [06:12<21:07, 1.73s/it] Training 1/1 epoch (loss 1.7702): 22%|██▏ | 205/938 [06:12<18:07, 1.48s/it] Training 1/1 epoch (loss 1.8666): 22%|██▏ | 205/938 [06:14<18:07, 1.48s/it] Training 1/1 epoch (loss 1.8666): 22%|██▏ | 206/938 [06:14<20:20, 1.67s/it] Training 1/1 epoch (loss 1.7383): 22%|██▏ | 206/938 [06:15<20:20, 1.67s/it] Training 1/1 epoch (loss 1.7383): 22%|██▏ | 207/938 [06:15<19:09, 1.57s/it] Training 1/1 epoch (loss 1.8412): 22%|██▏ | 207/938 [06:16<19:09, 1.57s/it] Training 1/1 epoch (loss 1.8412): 22%|██▏ | 208/938 [06:16<17:06, 1.41s/it] Training 1/1 epoch (loss 1.7470): 22%|██▏ | 208/938 [06:18<17:06, 1.41s/it] Training 1/1 epoch (loss 1.7470): 22%|██▏ | 209/938 [06:18<19:37, 1.62s/it] Training 1/1 epoch (loss 1.7736): 22%|██▏ | 209/938 
[06:21<19:37, 1.62s/it] Training 1/1 epoch (loss 1.7736): 22%|██▏ | 210/938 [06:21<22:35, 1.86s/it] Training 1/1 epoch (loss 1.7648): 22%|██▏ | 210/938 [06:23<22:35, 1.86s/it] Training 1/1 epoch (loss 1.7648): 22%|██▏ | 211/938 [06:23<22:38, 1.87s/it] Training 1/1 epoch (loss 1.6771): 22%|██▏ | 211/938 [06:24<22:38, 1.87s/it] Training 1/1 epoch (loss 1.6771): 23%|██▎ | 212/938 [06:24<20:21, 1.68s/it] Training 1/1 epoch (loss 1.6524): 23%|██▎ | 212/938 [06:26<20:21, 1.68s/it] Training 1/1 epoch (loss 1.6524): 23%|██▎ | 213/938 [06:26<22:25, 1.86s/it] Training 1/1 epoch (loss 1.7403): 23%|██▎ | 213/938 [06:28<22:25, 1.86s/it] Training 1/1 epoch (loss 1.7403): 23%|██▎ | 214/938 [06:28<21:28, 1.78s/it] Training 1/1 epoch (loss 1.7755): 23%|██▎ | 214/938 [06:30<21:28, 1.78s/it] Training 1/1 epoch (loss 1.7755): 23%|██▎ | 215/938 [06:30<21:53, 1.82s/it] Training 1/1 epoch (loss 1.7163): 23%|██▎ | 215/938 [06:31<21:53, 1.82s/it] Training 1/1 epoch (loss 1.7163): 23%|██▎ | 216/938 [06:31<21:52, 1.82s/it] Training 1/1 epoch (loss 1.8887): 23%|██▎ | 216/938 [06:33<21:52, 1.82s/it] Training 1/1 epoch (loss 1.8887): 23%|██▎ | 217/938 [06:33<19:41, 1.64s/it] Training 1/1 epoch (loss 1.7766): 23%|██▎ | 217/938 [06:34<19:41, 1.64s/it] Training 1/1 epoch (loss 1.7766): 23%|██▎ | 218/938 [06:34<19:24, 1.62s/it] Training 1/1 epoch (loss 1.7341): 23%|██▎ | 218/938 [06:36<19:24, 1.62s/it] Training 1/1 epoch (loss 1.7341): 23%|██▎ | 219/938 [06:36<20:23, 1.70s/it] Training 1/1 epoch (loss 1.6251): 23%|██▎ | 219/938 [06:38<20:23, 1.70s/it] Training 1/1 epoch (loss 1.6251): 23%|██▎ | 220/938 [06:38<19:11, 1.60s/it] Training 1/1 epoch (loss 1.7924): 23%|██▎ | 220/938 [06:40<19:11, 1.60s/it] Training 1/1 epoch (loss 1.7924): 24%|██▎ | 221/938 [06:40<20:59, 1.76s/it] Training 1/1 epoch (loss 1.7637): 24%|██▎ | 221/938 [06:41<20:59, 1.76s/it] 
Training 1/1 epoch (loss 1.7637): 24%|██▎ | 222/938 [06:41<20:12, 1.69s/it] Training 1/1 epoch (loss 1.8178): 24%|██▎ | 222/938 [06:43<20:12, 1.69s/it] Training 1/1 epoch (loss 1.8178): 24%|██▍ | 223/938 [06:43<19:52, 1.67s/it] Training 1/1 epoch (loss 1.8062): 24%|██▍ | 223/938 [06:45<19:52, 1.67s/it] Training 1/1 epoch (loss 1.8062): 24%|██▍ | 224/938 [06:45<20:59, 1.76s/it] Training 1/1 epoch (loss 1.8134): 24%|██▍ | 224/938 [06:46<20:59, 1.76s/it] Training 1/1 epoch (loss 1.8134): 24%|██▍ | 225/938 [06:46<20:37, 1.74s/it] Training 1/1 epoch (loss 1.8564): 24%|██▍ | 225/938 [06:49<20:37, 1.74s/it] Training 1/1 epoch (loss 1.8564): 24%|██▍ | 226/938 [06:49<23:00, 1.94s/it] Training 1/1 epoch (loss 1.7691): 24%|██▍ | 226/938 [06:51<23:00, 1.94s/it] Training 1/1 epoch (loss 1.7691): 24%|██▍ | 227/938 [06:51<22:41, 1.92s/it] Training 1/1 epoch (loss 1.7331): 24%|██▍ | 227/938 [06:52<22:41, 1.92s/it] Training 1/1 epoch (loss 1.7331): 24%|██▍ | 228/938 [06:52<21:54, 1.85s/it] Training 1/1 epoch (loss 1.8079): 24%|██▍ | 228/938 [06:54<21:54, 1.85s/it] Training 1/1 epoch (loss 1.8079): 24%|██▍ | 229/938 [06:54<20:38, 1.75s/it] Training 1/1 epoch (loss 1.7832): 24%|██▍ | 229/938 [06:56<20:38, 1.75s/it] Training 1/1 epoch (loss 1.7832): 25%|██▍ | 230/938 [06:56<23:04, 1.95s/it] Training 1/1 epoch (loss 1.7528): 25%|██▍ | 230/938 [06:58<23:04, 1.95s/it] Training 1/1 epoch (loss 1.7528): 25%|██▍ | 231/938 [06:58<21:09, 1.80s/it] Training 1/1 epoch (loss 1.7578): 25%|██▍ | 231/938 [07:00<21:09, 1.80s/it] Training 1/1 epoch (loss 1.7578): 25%|██▍ | 232/938 [07:00<23:24, 1.99s/it] Training 1/1 epoch (loss 1.7332): 25%|██▍ | 232/938 [07:02<23:24, 1.99s/it] Training 1/1 epoch (loss 1.7332): 25%|██▍ | 233/938 [07:02<23:20, 1.99s/it] Training 1/1 epoch (loss 1.7053): 25%|██▍ | 233/938 [07:03<23:20, 1.99s/it] Training 1/1 epoch (loss 
1.7053): 25%|β–ˆβ–ˆβ– | 234/938 [07:03<18:46, 1.60s/it] Training 1/1 epoch (loss 1.7188): 25%|β–ˆβ–ˆβ– | 234/938 [07:05<18:46, 1.60s/it] Training 1/1 epoch (loss 1.7188): 25%|β–ˆβ–ˆβ–Œ | 235/938 [07:05<20:40, 1.76s/it] Training 1/1 epoch (loss 1.6840): 25%|β–ˆβ–ˆβ–Œ | 235/938 [07:07<20:40, 1.76s/it] Training 1/1 epoch (loss 1.6840): 25%|β–ˆβ–ˆβ–Œ | 236/938 [07:07<19:46, 1.69s/it] Training 1/1 epoch (loss 1.8289): 25%|β–ˆβ–ˆβ–Œ | 236/938 [07:08<19:46, 1.69s/it] Training 1/1 epoch (loss 1.8289): 25%|β–ˆβ–ˆβ–Œ | 237/938 [07:08<18:47, 1.61s/it] Training 1/1 epoch (loss 1.6273): 25%|β–ˆβ–ˆβ–Œ | 237/938 [07:09<18:47, 1.61s/it] Training 1/1 epoch (loss 1.6273): 25%|β–ˆβ–ˆβ–Œ | 238/938 [07:09<18:02, 1.55s/it] Training 1/1 epoch (loss 1.7713): 25%|β–ˆβ–ˆβ–Œ | 238/938 [07:11<18:02, 1.55s/it] Training 1/1 epoch (loss 1.7713): 25%|β–ˆβ–ˆβ–Œ | 239/938 [07:11<18:38, 1.60s/it] Training 1/1 epoch (loss 1.7601): 25%|β–ˆβ–ˆβ–Œ | 239/938 [07:13<18:38, 1.60s/it] Training 1/1 epoch (loss 1.7601): 26%|β–ˆβ–ˆβ–Œ | 240/938 [07:13<20:13, 1.74s/it] Training 1/1 epoch (loss 1.8255): 26%|β–ˆβ–ˆβ–Œ | 240/938 [07:14<20:13, 1.74s/it] Training 1/1 epoch (loss 1.8255): 26%|β–ˆβ–ˆβ–Œ | 241/938 [07:14<18:32, 1.60s/it] Training 1/1 epoch (loss 1.7985): 26%|β–ˆβ–ˆβ–Œ | 241/938 [07:15<18:32, 1.60s/it] Training 1/1 epoch (loss 1.7985): 26%|β–ˆβ–ˆβ–Œ | 242/938 [07:15<16:09, 1.39s/it] Training 1/1 epoch (loss 1.7849): 26%|β–ˆβ–ˆβ–Œ | 242/938 [07:18<16:09, 1.39s/it] Training 1/1 epoch (loss 1.7849): 26%|β–ˆβ–ˆβ–Œ | 243/938 [07:18<19:31, 1.69s/it] Training 1/1 epoch (loss 1.6514): 26%|β–ˆβ–ˆβ–Œ | 243/938 [07:19<19:31, 1.69s/it] Training 1/1 epoch (loss 1.6514): 26%|β–ˆβ–ˆβ–Œ | 244/938 [07:19<19:15, 1.66s/it] Training 1/1 epoch (loss 1.6774): 26%|β–ˆβ–ˆβ–Œ | 244/938 [07:21<19:15, 1.66s/it] Training 1/1 epoch (loss 1.6774): 26%|β–ˆβ–ˆβ–Œ | 245/938 [07:21<18:41, 1.62s/it] Training 1/1 epoch (loss 1.7654): 26%|β–ˆβ–ˆβ–Œ | 245/938 [07:23<18:41, 1.62s/it] Training 1/1 epoch (loss 1.7654): 26%|β–ˆβ–ˆβ–Œ | 
246/938 [07:23<19:44, 1.71s/it] Training 1/1 epoch (loss 1.7325): 26%|β–ˆβ–ˆβ–Œ | 246/938 [07:24<19:44, 1.71s/it] Training 1/1 epoch (loss 1.7325): 26%|β–ˆβ–ˆβ–‹ | 247/938 [07:24<19:24, 1.69s/it] Training 1/1 epoch (loss 1.6865): 26%|β–ˆβ–ˆβ–‹ | 247/938 [07:27<19:24, 1.69s/it] Training 1/1 epoch (loss 1.6865): 26%|β–ˆβ–ˆβ–‹ | 248/938 [07:27<23:04, 2.01s/it] Training 1/1 epoch (loss 1.7836): 26%|β–ˆβ–ˆβ–‹ | 248/938 [07:29<23:04, 2.01s/it] Training 1/1 epoch (loss 1.7836): 27%|β–ˆβ–ˆβ–‹ | 249/938 [07:29<23:19, 2.03s/it] Training 1/1 epoch (loss 1.6111): 27%|β–ˆβ–ˆβ–‹ | 249/938 [07:32<23:19, 2.03s/it] Training 1/1 epoch (loss 1.6111): 27%|β–ˆβ–ˆβ–‹ | 250/938 [07:32<24:03, 2.10s/it] Training 1/1 epoch (loss 1.8499): 27%|β–ˆβ–ˆβ–‹ | 250/938 [07:33<24:03, 2.10s/it] Training 1/1 epoch (loss 1.8499): 27%|β–ˆβ–ˆβ–‹ | 251/938 [07:33<23:10, 2.02s/it] Training 1/1 epoch (loss 1.7608): 27%|β–ˆβ–ˆβ–‹ | 251/938 [07:35<23:10, 2.02s/it] Training 1/1 epoch (loss 1.7608): 27%|β–ˆβ–ˆβ–‹ | 252/938 [07:35<22:00, 1.93s/it] Training 1/1 epoch (loss 1.7239): 27%|β–ˆβ–ˆβ–‹ | 252/938 [07:36<22:00, 1.93s/it] Training 1/1 epoch (loss 1.7239): 27%|β–ˆβ–ˆβ–‹ | 253/938 [07:36<19:54, 1.74s/it] Training 1/1 epoch (loss 1.7528): 27%|β–ˆβ–ˆβ–‹ | 253/938 [07:38<19:54, 1.74s/it] Training 1/1 epoch (loss 1.7528): 27%|β–ˆβ–ˆβ–‹ | 254/938 [07:38<18:08, 1.59s/it] Training 1/1 epoch (loss 1.7221): 27%|β–ˆβ–ˆβ–‹ | 254/938 [07:39<18:08, 1.59s/it] Training 1/1 epoch (loss 1.7221): 27%|β–ˆβ–ˆβ–‹ | 255/938 [07:39<17:01, 1.50s/it] Training 1/1 epoch (loss 1.6485): 27%|β–ˆβ–ˆβ–‹ | 255/938 [07:40<17:01, 1.50s/it] Training 1/1 epoch (loss 1.6485): 27%|β–ˆβ–ˆβ–‹ | 256/938 [07:40<16:13, 1.43s/it] Training 1/1 epoch (loss 1.7254): 27%|β–ˆβ–ˆβ–‹ | 256/938 [07:42<16:13, 1.43s/it] Training 1/1 epoch (loss 1.7254): 27%|β–ˆβ–ˆβ–‹ | 257/938 [07:42<15:54, 1.40s/it] Training 1/1 epoch (loss 1.7603): 27%|β–ˆβ–ˆβ–‹ | 257/938 [07:43<15:54, 1.40s/it] Training 1/1 epoch (loss 1.7603): 28%|β–ˆβ–ˆβ–Š | 258/938 [07:43<15:33, 1.37s/it] 
Training 1/1 epoch (loss 1.7466): 28%|β–ˆβ–ˆβ–Š | 258/938 [07:44<15:33, 1.37s/it] Training 1/1 epoch (loss 1.7466): 28%|β–ˆβ–ˆβ–Š | 259/938 [07:44<15:32, 1.37s/it] Training 1/1 epoch (loss 1.7350): 28%|β–ˆβ–ˆβ–Š | 259/938 [07:46<15:32, 1.37s/it] Training 1/1 epoch (loss 1.7350): 28%|β–ˆβ–ˆβ–Š | 260/938 [07:46<17:32, 1.55s/it] Training 1/1 epoch (loss 1.6984): 28%|β–ˆβ–ˆβ–Š | 260/938 [07:47<17:32, 1.55s/it] Training 1/1 epoch (loss 1.6984): 28%|β–ˆβ–ˆβ–Š | 261/938 [07:47<16:36, 1.47s/it] Training 1/1 epoch (loss 1.7184): 28%|β–ˆβ–ˆβ–Š | 261/938 [07:49<16:36, 1.47s/it] Training 1/1 epoch (loss 1.7184): 28%|β–ˆβ–ˆβ–Š | 262/938 [07:49<15:39, 1.39s/it] Training 1/1 epoch (loss 1.5950): 28%|β–ˆβ–ˆβ–Š | 262/938 [07:51<15:39, 1.39s/it] Training 1/1 epoch (loss 1.5950): 28%|β–ˆβ–ˆβ–Š | 263/938 [07:51<19:22, 1.72s/it] Training 1/1 epoch (loss 1.8326): 28%|β–ˆβ–ˆβ–Š | 263/938 [07:53<19:22, 1.72s/it] Training 1/1 epoch (loss 1.8326): 28%|β–ˆβ–ˆβ–Š | 264/938 [07:53<18:52, 1.68s/it] Training 1/1 epoch (loss 1.7167): 28%|β–ˆβ–ˆβ–Š | 264/938 [07:55<18:52, 1.68s/it] Training 1/1 epoch (loss 1.7167): 28%|β–ˆβ–ˆβ–Š | 265/938 [07:55<19:14, 1.72s/it] Training 1/1 epoch (loss 1.7732): 28%|β–ˆβ–ˆβ–Š | 265/938 [07:56<19:14, 1.72s/it] Training 1/1 epoch (loss 1.7732): 28%|β–ˆβ–ˆβ–Š | 266/938 [07:56<18:55, 1.69s/it] Training 1/1 epoch (loss 1.6599): 28%|β–ˆβ–ˆβ–Š | 266/938 [07:58<18:55, 1.69s/it] Training 1/1 epoch (loss 1.6599): 28%|β–ˆβ–ˆβ–Š | 267/938 [07:58<19:06, 1.71s/it] Training 1/1 epoch (loss 1.6703): 28%|β–ˆβ–ˆβ–Š | 267/938 [08:00<19:06, 1.71s/it] Training 1/1 epoch (loss 1.6703): 29%|β–ˆβ–ˆβ–Š | 268/938 [08:00<20:31, 1.84s/it] Training 1/1 epoch (loss 1.6629): 29%|β–ˆβ–ˆβ–Š | 268/938 [08:02<20:31, 1.84s/it] Training 1/1 epoch (loss 1.6629): 29%|β–ˆβ–ˆβ–Š | 269/938 [08:02<20:10, 1.81s/it] Training 1/1 epoch (loss 1.7434): 29%|β–ˆβ–ˆβ–Š | 269/938 [08:03<20:10, 1.81s/it] Training 1/1 epoch (loss 1.7434): 29%|β–ˆβ–ˆβ–‰ | 270/938 [08:03<19:23, 1.74s/it] Training 1/1 epoch (loss 
1.8048): 29%|β–ˆβ–ˆβ–‰ | 270/938 [08:05<19:23, 1.74s/it] Training 1/1 epoch (loss 1.8048): 29%|β–ˆβ–ˆβ–‰ | 271/938 [08:05<19:08, 1.72s/it] Training 1/1 epoch (loss 1.6185): 29%|β–ˆβ–ˆβ–‰ | 271/938 [08:07<19:08, 1.72s/it] Training 1/1 epoch (loss 1.6185): 29%|β–ˆβ–ˆβ–‰ | 272/938 [08:07<19:21, 1.74s/it] Training 1/1 epoch (loss 1.7674): 29%|β–ˆβ–ˆβ–‰ | 272/938 [08:09<19:21, 1.74s/it] Training 1/1 epoch (loss 1.7674): 29%|β–ˆβ–ˆβ–‰ | 273/938 [08:09<21:43, 1.96s/it] Training 1/1 epoch (loss 1.7942): 29%|β–ˆβ–ˆβ–‰ | 273/938 [08:11<21:43, 1.96s/it] Training 1/1 epoch (loss 1.7942): 29%|β–ˆβ–ˆβ–‰ | 274/938 [08:11<21:08, 1.91s/it] Training 1/1 epoch (loss 1.8416): 29%|β–ˆβ–ˆβ–‰ | 274/938 [08:13<21:08, 1.91s/it] Training 1/1 epoch (loss 1.8416): 29%|β–ˆβ–ˆβ–‰ | 275/938 [08:13<20:47, 1.88s/it] Training 1/1 epoch (loss 1.7610): 29%|β–ˆβ–ˆβ–‰ | 275/938 [08:15<20:47, 1.88s/it] Training 1/1 epoch (loss 1.7610): 29%|β–ˆβ–ˆβ–‰ | 276/938 [08:15<19:48, 1.79s/it] Training 1/1 epoch (loss 1.8217): 29%|β–ˆβ–ˆβ–‰ | 276/938 [08:17<19:48, 1.79s/it] Training 1/1 epoch (loss 1.8217): 30%|β–ˆβ–ˆβ–‰ | 277/938 [08:17<20:47, 1.89s/it] Training 1/1 epoch (loss 1.7257): 30%|β–ˆβ–ˆβ–‰ | 277/938 [08:18<20:47, 1.89s/it] Training 1/1 epoch (loss 1.7257): 30%|β–ˆβ–ˆβ–‰ | 278/938 [08:18<17:49, 1.62s/it] Training 1/1 epoch (loss 1.7619): 30%|β–ˆβ–ˆβ–‰ | 278/938 [08:19<17:49, 1.62s/it] Training 1/1 epoch (loss 1.7619): 30%|β–ˆβ–ˆβ–‰ | 279/938 [08:19<17:08, 1.56s/it] Training 1/1 epoch (loss 1.6507): 30%|β–ˆβ–ˆβ–‰ | 279/938 [08:21<17:08, 1.56s/it] Training 1/1 epoch (loss 1.6507): 30%|β–ˆβ–ˆβ–‰ | 280/938 [08:21<18:06, 1.65s/it] Training 1/1 epoch (loss 1.7203): 30%|β–ˆβ–ˆβ–‰ | 280/938 [08:22<18:06, 1.65s/it] Training 1/1 epoch (loss 1.7203): 30%|β–ˆβ–ˆβ–‰ | 281/938 [08:22<15:58, 1.46s/it] Training 1/1 epoch (loss 1.7024): 30%|β–ˆβ–ˆβ–‰ | 281/938 [08:23<15:58, 1.46s/it] Training 1/1 epoch (loss 1.7024): 30%|β–ˆβ–ˆβ–ˆ | 282/938 [08:23<15:10, 1.39s/it] Training 1/1 epoch (loss 1.8335): 30%|β–ˆβ–ˆβ–ˆ | 
282/938 [08:25<15:10, 1.39s/it] Training 1/1 epoch (loss 1.8335): 30%|β–ˆβ–ˆβ–ˆ | 283/938 [08:25<16:47, 1.54s/it] Training 1/1 epoch (loss 1.6484): 30%|β–ˆβ–ˆβ–ˆ | 283/938 [08:26<16:47, 1.54s/it] Training 1/1 epoch (loss 1.6484): 30%|β–ˆβ–ˆβ–ˆ | 284/938 [08:26<16:31, 1.52s/it] Training 1/1 epoch (loss 1.6321): 30%|β–ˆβ–ˆβ–ˆ | 284/938 [08:28<16:31, 1.52s/it] Training 1/1 epoch (loss 1.6321): 30%|β–ˆβ–ˆβ–ˆ | 285/938 [08:28<16:20, 1.50s/it] Training 1/1 epoch (loss 1.7942): 30%|β–ˆβ–ˆβ–ˆ | 285/938 [08:29<16:20, 1.50s/it] Training 1/1 epoch (loss 1.7942): 30%|β–ˆβ–ˆβ–ˆ | 286/938 [08:29<15:29, 1.43s/it] Training 1/1 epoch (loss 1.6279): 30%|β–ˆβ–ˆβ–ˆ | 286/938 [08:32<15:29, 1.43s/it] Training 1/1 epoch (loss 1.6279): 31%|β–ˆβ–ˆβ–ˆ | 287/938 [08:32<18:47, 1.73s/it] Training 1/1 epoch (loss 1.8505): 31%|β–ˆβ–ˆβ–ˆ | 287/938 [08:33<18:47, 1.73s/it] Training 1/1 epoch (loss 1.8505): 31%|β–ˆβ–ˆβ–ˆ | 288/938 [08:33<18:16, 1.69s/it] Training 1/1 epoch (loss 1.7956): 31%|β–ˆβ–ˆβ–ˆ | 288/938 [08:35<18:16, 1.69s/it] Training 1/1 epoch (loss 1.7956): 31%|β–ˆβ–ˆβ–ˆ | 289/938 [08:35<20:03, 1.85s/it] Training 1/1 epoch (loss 1.6651): 31%|β–ˆβ–ˆβ–ˆ | 289/938 [08:37<20:03, 1.85s/it] Training 1/1 epoch (loss 1.6651): 31%|β–ˆβ–ˆβ–ˆ | 290/938 [08:37<20:09, 1.87s/it] Training 1/1 epoch (loss 1.8385): 31%|β–ˆβ–ˆβ–ˆ | 290/938 [08:39<20:09, 1.87s/it] Training 1/1 epoch (loss 1.8385): 31%|β–ˆβ–ˆβ–ˆ | 291/938 [08:39<19:28, 1.81s/it] Training 1/1 epoch (loss 1.6568): 31%|β–ˆβ–ˆβ–ˆ | 291/938 [08:40<19:28, 1.81s/it] Training 1/1 epoch (loss 1.6568): 31%|β–ˆβ–ˆβ–ˆ | 292/938 [08:40<17:27, 1.62s/it] Training 1/1 epoch (loss 1.7107): 31%|β–ˆβ–ˆβ–ˆ | 292/938 [08:42<17:27, 1.62s/it] Training 1/1 epoch (loss 1.7107): 31%|β–ˆβ–ˆβ–ˆ | 293/938 [08:42<16:22, 1.52s/it] Training 1/1 epoch (loss 1.7056): 31%|β–ˆβ–ˆβ–ˆ | 293/938 [08:43<16:22, 1.52s/it] Training 1/1 epoch (loss 1.7056): 31%|β–ˆβ–ˆβ–ˆβ– | 294/938 [08:43<17:08, 1.60s/it] Training 1/1 epoch (loss 1.5651): 31%|β–ˆβ–ˆβ–ˆβ– | 294/938 [08:45<17:08, 
1.60s/it] Training 1/1 epoch (loss 1.5651): 31%|β–ˆβ–ˆβ–ˆβ– | 295/938 [08:45<16:14, 1.52s/it] Training 1/1 epoch (loss 1.6862): 31%|β–ˆβ–ˆβ–ˆβ– | 295/938 [08:47<16:14, 1.52s/it] Training 1/1 epoch (loss 1.6862): 32%|β–ˆβ–ˆβ–ˆβ– | 296/938 [08:47<17:53, 1.67s/it] Training 1/1 epoch (loss 1.7094): 32%|β–ˆβ–ˆβ–ˆβ– | 296/938 [08:49<17:53, 1.67s/it] Training 1/1 epoch (loss 1.7094): 32%|β–ˆβ–ˆβ–ˆβ– | 297/938 [08:49<20:01, 1.87s/it] Training 1/1 epoch (loss 1.7411): 32%|β–ˆβ–ˆβ–ˆβ– | 297/938 [08:51<20:01, 1.87s/it] Training 1/1 epoch (loss 1.7411): 32%|β–ˆβ–ˆβ–ˆβ– | 298/938 [08:51<20:20, 1.91s/it] Training 1/1 epoch (loss 1.7921): 32%|β–ˆβ–ˆβ–ˆβ– | 298/938 [08:52<20:20, 1.91s/it] Training 1/1 epoch (loss 1.7921): 32%|β–ˆβ–ˆβ–ˆβ– | 299/938 [08:52<18:57, 1.78s/it] Training 1/1 epoch (loss 1.7033): 32%|β–ˆβ–ˆβ–ˆβ– | 299/938 [08:53<18:57, 1.78s/it] Training 1/1 epoch (loss 1.7033): 32%|β–ˆβ–ˆβ–ˆβ– | 300/938 [08:53<16:25, 1.54s/it] Training 1/1 epoch (loss 1.7545): 32%|β–ˆβ–ˆβ–ˆβ– | 300/938 [08:56<16:25, 1.54s/it] Training 1/1 epoch (loss 1.7545): 32%|β–ˆβ–ˆβ–ˆβ– | 301/938 [08:56<18:06, 1.71s/it] Training 1/1 epoch (loss 1.7601): 32%|β–ˆβ–ˆβ–ˆβ– | 301/938 [08:57<18:06, 1.71s/it] Training 1/1 epoch (loss 1.7601): 32%|β–ˆβ–ˆβ–ˆβ– | 302/938 [08:57<18:45, 1.77s/it] Training 1/1 epoch (loss 1.6640): 32%|β–ˆβ–ˆβ–ˆβ– | 302/938 [09:00<18:45, 1.77s/it] Training 1/1 epoch (loss 1.6640): 32%|β–ˆβ–ˆβ–ˆβ– | 303/938 [09:00<20:08, 1.90s/it] Training 1/1 epoch (loss 1.8712): 32%|β–ˆβ–ˆβ–ˆβ– | 303/938 [09:01<20:08, 1.90s/it] Training 1/1 epoch (loss 1.8712): 32%|β–ˆβ–ˆβ–ˆβ– | 304/938 [09:01<18:29, 1.75s/it] Training 1/1 epoch (loss 1.6483): 32%|β–ˆβ–ˆβ–ˆβ– | 304/938 [09:03<18:29, 1.75s/it] Training 1/1 epoch (loss 1.6483): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 305/938 [09:03<18:24, 1.74s/it] Training 1/1 epoch (loss 1.7348): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 305/938 [09:05<18:24, 1.74s/it] Training 1/1 epoch (loss 1.7348): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 306/938 [09:05<19:47, 1.88s/it] Training 1/1 epoch (loss 1.7174): 
33%|β–ˆβ–ˆβ–ˆβ–Ž | 306/938 [09:07<19:47, 1.88s/it] Training 1/1 epoch (loss 1.7174): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 307/938 [09:07<21:40, 2.06s/it] Training 1/1 epoch (loss 1.6787): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 307/938 [09:09<21:40, 2.06s/it] Training 1/1 epoch (loss 1.6787): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 308/938 [09:09<19:43, 1.88s/it] Training 1/1 epoch (loss 1.5849): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 308/938 [09:11<19:43, 1.88s/it] Training 1/1 epoch (loss 1.5849): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 309/938 [09:11<19:25, 1.85s/it] Training 1/1 epoch (loss 1.8015): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 309/938 [09:12<19:25, 1.85s/it] Training 1/1 epoch (loss 1.8015): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 310/938 [09:12<18:41, 1.79s/it] Training 1/1 epoch (loss 1.5719): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 310/938 [09:15<18:41, 1.79s/it] Training 1/1 epoch (loss 1.5719): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 311/938 [09:15<20:34, 1.97s/it] Training 1/1 epoch (loss 1.6733): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 311/938 [09:16<20:34, 1.97s/it] Training 1/1 epoch (loss 1.6733): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 312/938 [09:16<18:14, 1.75s/it] Training 1/1 epoch (loss 1.9003): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 312/938 [09:18<18:14, 1.75s/it] Training 1/1 epoch (loss 1.9003): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 313/938 [09:18<20:10, 1.94s/it] Training 1/1 epoch (loss 1.8025): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 313/938 [09:19<20:10, 1.94s/it] Training 1/1 epoch (loss 1.8025): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 314/938 [09:19<17:26, 1.68s/it] Training 1/1 epoch (loss 1.6309): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 314/938 [09:21<17:26, 1.68s/it] Training 1/1 epoch (loss 1.6309): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 315/938 [09:21<16:58, 1.63s/it] Training 1/1 epoch (loss 1.7528): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 315/938 [09:23<16:58, 1.63s/it] Training 1/1 epoch (loss 1.7528): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 316/938 [09:23<19:21, 1.87s/it] Training 1/1 epoch (loss 1.6897): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 316/938 [09:25<19:21, 1.87s/it] Training 1/1 epoch (loss 1.6897): 34%|β–ˆβ–ˆβ–ˆβ– | 317/938 [09:25<17:35, 1.70s/it] Training 1/1 epoch (loss 1.7932): 34%|β–ˆβ–ˆβ–ˆβ– | 317/938 [09:26<17:35, 1.70s/it] Training 1/1 epoch (loss 1.7932): 34%|β–ˆβ–ˆβ–ˆβ– | 318/938 [09:26<15:36, 
1.51s/it] Training 1/1 epoch (loss 1.6090): 34%|β–ˆβ–ˆβ–ˆβ– | 318/938 [09:28<15:36, 1.51s/it] Training 1/1 epoch (loss 1.6090): 34%|β–ˆβ–ˆβ–ˆβ– | 319/938 [09:28<17:31, 1.70s/it] Training 1/1 epoch (loss 1.5010): 34%|β–ˆβ–ˆβ–ˆβ– | 319/938 [09:31<17:31, 1.70s/it] Training 1/1 epoch (loss 1.5010): 34%|β–ˆβ–ˆβ–ˆβ– | 320/938 [09:31<20:16, 1.97s/it] Training 1/1 epoch (loss 1.6367): 34%|β–ˆβ–ˆβ–ˆβ– | 320/938 [09:33<20:16, 1.97s/it] Training 1/1 epoch (loss 1.6367): 34%|β–ˆβ–ˆβ–ˆβ– | 321/938 [09:33<20:31, 2.00s/it] Training 1/1 epoch (loss 1.7423): 34%|β–ˆβ–ˆβ–ˆβ– | 321/938 [09:35<20:31, 2.00s/it] Training 1/1 epoch (loss 1.7423): 34%|β–ˆβ–ˆβ–ˆβ– | 322/938 [09:35<20:31, 2.00s/it] Training 1/1 epoch (loss 1.8117): 34%|β–ˆβ–ˆβ–ˆβ– | 322/938 [09:36<20:31, 2.00s/it] Training 1/1 epoch (loss 1.8117): 34%|β–ˆβ–ˆβ–ˆβ– | 323/938 [09:36<19:03, 1.86s/it] Training 1/1 epoch (loss 1.7277): 34%|β–ˆβ–ˆβ–ˆβ– | 323/938 [09:38<19:03, 1.86s/it] Training 1/1 epoch (loss 1.7277): 35%|β–ˆβ–ˆβ–ˆβ– | 324/938 [09:38<17:40, 1.73s/it] Training 1/1 epoch (loss 1.6743): 35%|β–ˆβ–ˆβ–ˆβ– | 324/938 [09:40<17:40, 1.73s/it] Training 1/1 epoch (loss 1.6743): 35%|β–ˆβ–ˆβ–ˆβ– | 325/938 [09:40<19:57, 1.95s/it] Training 1/1 epoch (loss 1.7855): 35%|β–ˆβ–ˆβ–ˆβ– | 325/938 [09:42<19:57, 1.95s/it] Training 1/1 epoch (loss 1.7855): 35%|β–ˆβ–ˆβ–ˆβ– | 326/938 [09:42<19:50, 1.95s/it] Training 1/1 epoch (loss 1.6546): 35%|β–ˆβ–ˆβ–ˆβ– | 326/938 [09:43<19:50, 1.95s/it] Training 1/1 epoch (loss 1.6546): 35%|β–ˆβ–ˆβ–ˆβ– | 327/938 [09:43<18:05, 1.78s/it] Training 1/1 epoch (loss 1.6802): 35%|β–ˆβ–ˆβ–ˆβ– | 327/938 [09:45<18:05, 1.78s/it] Training 1/1 epoch (loss 1.6802): 35%|β–ˆβ–ˆβ–ˆβ– | 328/938 [09:45<18:16, 1.80s/it] Training 1/1 epoch (loss 1.6021): 35%|β–ˆβ–ˆβ–ˆβ– | 328/938 [09:47<18:16, 1.80s/it] Training 1/1 epoch (loss 1.6021): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 329/938 [09:47<17:04, 1.68s/it] Training 1/1 epoch (loss 1.7492): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 329/938 [09:48<17:04, 1.68s/it] Training 1/1 epoch (loss 1.7492): 
35%|β–ˆβ–ˆβ–ˆβ–Œ | 330/938 [09:48<17:09, 1.69s/it] Training 1/1 epoch (loss 1.7105): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 330/938 [09:50<17:09, 1.69s/it] Training 1/1 epoch (loss 1.7105): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 331/938 [09:50<16:30, 1.63s/it] Training 1/1 epoch (loss 1.7754): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 331/938 [09:51<16:30, 1.63s/it] Training 1/1 epoch (loss 1.7754): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 332/938 [09:51<16:38, 1.65s/it] Training 1/1 epoch (loss 1.7293): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 332/938 [09:53<16:38, 1.65s/it] Training 1/1 epoch (loss 1.7293): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 333/938 [09:53<17:05, 1.69s/it] Training 1/1 epoch (loss 1.6377): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 333/938 [09:55<17:05, 1.69s/it] Training 1/1 epoch (loss 1.6377): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 334/938 [09:55<16:04, 1.60s/it] Training 1/1 epoch (loss 1.7499): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 334/938 [09:56<16:04, 1.60s/it] Training 1/1 epoch (loss 1.7499): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 335/938 [09:56<15:10, 1.51s/it] Training 1/1 epoch (loss 1.6224): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 335/938 [09:58<15:10, 1.51s/it] Training 1/1 epoch (loss 1.6224): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 336/938 [09:58<17:57, 1.79s/it] Training 1/1 epoch (loss 1.7361): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 336/938 [10:00<17:57, 1.79s/it] Training 1/1 epoch (loss 1.7361): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 337/938 [10:00<16:02, 1.60s/it] Training 1/1 epoch (loss 1.7065): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 337/938 [10:01<16:02, 1.60s/it] Training 1/1 epoch (loss 1.7065): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 338/938 [10:01<14:49, 1.48s/it] Training 1/1 epoch (loss 1.7454): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 338/938 [10:02<14:49, 1.48s/it] Training 1/1 epoch (loss 1.7454): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 339/938 [10:02<14:05, 1.41s/it] Training 1/1 epoch (loss 1.6770): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 339/938 [10:03<14:05, 1.41s/it] Training 1/1 epoch (loss 1.6770): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 340/938 [10:03<13:47, 1.38s/it] Training 1/1 epoch (loss 1.7111): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 340/938 [10:05<13:47, 1.38s/it] Training 1/1 epoch (loss 1.7111): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 341/938 [10:05<15:22, 1.54s/it] Training 1/1 epoch (loss 1.6517): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 341/938 [10:07<15:22, 
1.54s/it] Training 1/1 epoch (loss 1.6517): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 342/938 [10:07<16:49, 1.69s/it] Training 1/1 epoch (loss 1.8130): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 342/938 [10:09<16:49, 1.69s/it] Training 1/1 epoch (loss 1.8130): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 343/938 [10:09<17:10, 1.73s/it] Training 1/1 epoch (loss 1.7171): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 343/938 [10:10<17:10, 1.73s/it] Training 1/1 epoch (loss 1.7171): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 344/938 [10:10<16:06, 1.63s/it] Training 1/1 epoch (loss 1.6210): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 344/938 [10:12<16:06, 1.63s/it] Training 1/1 epoch (loss 1.6210): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 345/938 [10:12<15:38, 1.58s/it] Training 1/1 epoch (loss 1.6215): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 345/938 [10:13<15:38, 1.58s/it] Training 1/1 epoch (loss 1.6215): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 346/938 [10:13<15:06, 1.53s/it] Training 1/1 epoch (loss 1.6913): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 346/938 [10:15<15:06, 1.53s/it] Training 1/1 epoch (loss 1.6913): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 347/938 [10:15<14:08, 1.44s/it] Training 1/1 epoch (loss 1.5882): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 347/938 [10:17<14:08, 1.44s/it] Training 1/1 epoch (loss 1.5882): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 348/938 [10:17<16:18, 1.66s/it] Training 1/1 epoch (loss 1.7623): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 348/938 [10:18<16:18, 1.66s/it] Training 1/1 epoch (loss 1.7623): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 349/938 [10:18<16:20, 1.66s/it] Training 1/1 epoch (loss 1.7272): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 349/938 [10:20<16:20, 1.66s/it] Training 1/1 epoch (loss 1.7272): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 350/938 [10:20<16:49, 1.72s/it] Training 1/1 epoch (loss 1.7273): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 350/938 [10:22<16:49, 1.72s/it] Training 1/1 epoch (loss 1.7273): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 351/938 [10:22<17:10, 1.76s/it] Training 1/1 epoch (loss 1.6701): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 351/938 [10:24<17:10, 1.76s/it] Training 1/1 epoch (loss 1.6701): 38%|β–ˆβ–ˆβ–ˆβ–Š | 352/938 [10:24<17:44, 1.82s/it] Training 1/1 epoch (loss 1.8173): 38%|β–ˆβ–ˆβ–ˆβ–Š | 352/938 [10:26<17:44, 1.82s/it] Training 1/1 epoch (loss 1.8173): 38%|β–ˆβ–ˆβ–ˆβ–Š | 353/938 [10:26<17:04, 1.75s/it] Training 1/1 epoch (loss 1.7153): 
38%|β–ˆβ–ˆβ–ˆβ–Š | 353/938 [10:27<17:04, 1.75s/it] Training 1/1 epoch (loss 1.7153): 38%|β–ˆβ–ˆβ–ˆβ–Š | 354/938 [10:27<15:30, 1.59s/it] Training 1/1 epoch (loss 1.7138): 38%|β–ˆβ–ˆβ–ˆβ–Š | 354/938 [10:29<15:30, 1.59s/it] Training 1/1 epoch (loss 1.7138): 38%|β–ˆβ–ˆβ–ˆβ–Š | 355/938 [10:29<17:44, 1.83s/it] Training 1/1 epoch (loss 1.6381): 38%|β–ˆβ–ˆβ–ˆβ–Š | 355/938 [10:31<17:44, 1.83s/it] Training 1/1 epoch (loss 1.6381): 38%|β–ˆβ–ˆβ–ˆβ–Š | 356/938 [10:31<17:32, 1.81s/it] Training 1/1 epoch (loss 1.6811): 38%|β–ˆβ–ˆβ–ˆβ–Š | 356/938 [10:32<17:32, 1.81s/it] Training 1/1 epoch (loss 1.6811): 38%|β–ˆβ–ˆβ–ˆβ–Š | 357/938 [10:32<16:13, 1.68s/it] Training 1/1 epoch (loss 1.6438): 38%|β–ˆβ–ˆβ–ˆβ–Š | 357/938 [10:34<16:13, 1.68s/it] Training 1/1 epoch (loss 1.6438): 38%|β–ˆβ–ˆβ–ˆβ–Š | 358/938 [10:34<16:52, 1.75s/it] Training 1/1 epoch (loss 1.6632): 38%|β–ˆβ–ˆβ–ˆβ–Š | 358/938 [10:37<16:52, 1.75s/it] Training 1/1 epoch (loss 1.6632): 38%|β–ˆβ–ˆβ–ˆβ–Š | 359/938 [10:37<18:37, 1.93s/it] Training 1/1 epoch (loss 1.7771): 38%|β–ˆβ–ˆβ–ˆβ–Š | 359/938 [10:38<18:37, 1.93s/it] Training 1/1 epoch (loss 1.7771): 38%|β–ˆβ–ˆβ–ˆβ–Š | 360/938 [10:38<16:56, 1.76s/it] Training 1/1 epoch (loss 1.8426): 38%|β–ˆβ–ˆβ–ˆβ–Š | 360/938 [10:40<16:56, 1.76s/it] Training 1/1 epoch (loss 1.8426): 38%|β–ˆβ–ˆβ–ˆβ–Š | 361/938 [10:40<16:19, 1.70s/it] Training 1/1 epoch (loss 1.7158): 38%|β–ˆβ–ˆβ–ˆβ–Š | 361/938 [10:41<16:19, 1.70s/it] Training 1/1 epoch (loss 1.7158): 39%|β–ˆβ–ˆβ–ˆβ–Š | 362/938 [10:41<16:43, 1.74s/it] Training 1/1 epoch (loss 1.7051): 39%|β–ˆβ–ˆβ–ˆβ–Š | 362/938 [10:43<16:43, 1.74s/it] Training 1/1 epoch (loss 1.7051): 39%|β–ˆβ–ˆβ–ˆβ–Š | 363/938 [10:43<16:52, 1.76s/it] Training 1/1 epoch (loss 1.7731): 39%|β–ˆβ–ˆβ–ˆβ–Š | 363/938 [10:45<16:52, 1.76s/it] Training 1/1 epoch (loss 1.7731): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 364/938 [10:45<15:41, 1.64s/it] Training 1/1 epoch (loss 1.6658): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 364/938 [10:46<15:41, 1.64s/it] Training 1/1 epoch (loss 1.6658): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 365/938 [10:46<14:25, 
1.51s/it] Training 1/1 epoch (loss 1.6779): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 365/938 [10:47<14:25, 1.51s/it] Training 1/1 epoch (loss 1.6779): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 366/938 [10:47<14:46, 1.55s/it] Training 1/1 epoch (loss 1.7286): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 366/938 [10:48<14:46, 1.55s/it] Training 1/1 epoch (loss 1.7286): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 367/938 [10:48<12:56, 1.36s/it] Training 1/1 epoch (loss 1.8218): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 367/938 [10:50<12:56, 1.36s/it] Training 1/1 epoch (loss 1.8218): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 368/938 [10:50<14:42, 1.55s/it] Training 1/1 epoch (loss 1.6711): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 368/938 [10:52<14:42, 1.55s/it] Training 1/1 epoch (loss 1.6711): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 369/938 [10:52<14:02, 1.48s/it] Training 1/1 epoch (loss 1.7881): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 369/938 [10:53<14:02, 1.48s/it] Training 1/1 epoch (loss 1.7881): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 370/938 [10:53<14:34, 1.54s/it] Training 1/1 epoch (loss 1.6735): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 370/938 [10:55<14:34, 1.54s/it] Training 1/1 epoch (loss 1.6735): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 371/938 [10:55<14:15, 1.51s/it] Training 1/1 epoch (loss 1.6951): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 371/938 [10:57<14:15, 1.51s/it] Training 1/1 epoch (loss 1.6951): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 372/938 [10:57<14:55, 1.58s/it] Training 1/1 epoch (loss 1.6436): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 372/938 [10:58<14:55, 1.58s/it] Training 1/1 epoch (loss 1.6436): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 373/938 [10:58<14:12, 1.51s/it] Training 1/1 epoch (loss 1.5352): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 373/938 [11:00<14:12, 1.51s/it] Training 1/1 epoch (loss 1.5352): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 374/938 [11:00<15:08, 1.61s/it] Training 1/1 epoch (loss 1.7565): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 374/938 [11:02<15:08, 1.61s/it] Training 1/1 epoch (loss 1.7565): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 375/938 [11:02<17:41, 1.89s/it] Training 1/1 epoch (loss 1.7140): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 375/938 [11:04<17:41, 1.89s/it] Training 1/1 epoch (loss 1.7140): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 376/938 [11:04<17:17, 1.85s/it] Training 1/1 epoch (loss 1.8216): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 376/938 [11:06<17:17, 1.85s/it] Training 1/1 epoch (loss 1.8216): 
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 377/938 [11:06<18:54, 2.02s/it] Training 1/1 epoch (loss 1.6263): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 377/938 [11:09<18:54, 2.02s/it] Training 1/1 epoch (loss 1.6263): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 378/938 [11:09<20:19, 2.18s/it] Training 1/1 epoch (loss 1.6912): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 378/938 [11:11<20:19, 2.18s/it] Training 1/1 epoch (loss 1.6912): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 379/938 [11:11<18:29, 1.99s/it] Training 1/1 epoch (loss 1.6546): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 379/938 [11:13<18:29, 1.99s/it] Training 1/1 epoch (loss 1.6546): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 380/938 [11:13<19:30, 2.10s/it] Training 1/1 epoch (loss 1.7205): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 380/938 [11:15<19:30, 2.10s/it] Training 1/1 epoch (loss 1.7205): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 381/938 [11:15<18:26, 1.99s/it] Training 1/1 epoch (loss 1.5471): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 381/938 [11:16<18:26, 1.99s/it] Training 1/1 epoch (loss 1.5471): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 382/938 [11:16<15:48, 1.71s/it] Training 1/1 epoch (loss 1.7282): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 382/938 [11:17<15:48, 1.71s/it] Training 1/1 epoch (loss 1.7282): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 383/938 [11:17<15:49, 1.71s/it] Training 1/1 epoch (loss 1.7488): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 383/938 [11:20<15:49, 1.71s/it] Training 1/1 epoch (loss 1.7488): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 384/938 [11:20<17:57, 1.95s/it] Training 1/1 epoch (loss 1.6632): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 384/938 [11:22<17:57, 1.95s/it] Training 1/1 epoch (loss 1.6632): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 385/938 [11:22<17:56, 1.95s/it] Training 1/1 epoch (loss 1.6942): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 385/938 [11:24<17:56, 1.95s/it] Training 1/1 epoch (loss 1.6942): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 386/938 [11:24<18:07, 1.97s/it] Training 1/1 epoch (loss 1.7519): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 386/938 [11:25<18:07, 1.97s/it] Training 1/1 epoch (loss 1.7519): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 387/938 [11:25<17:04, 1.86s/it] Training 1/1 epoch (loss 1.6799): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 387/938 [11:27<17:04, 1.86s/it] Training 1/1 epoch (loss 1.6799): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 388/938 [11:27<17:20, 1.89s/it] Training 1/1 epoch (loss 1.7938): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 388/938 
[11:30<17:20, 1.89s/it] Training 1/1 epoch (loss 1.7938): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 389/938 [11:30<18:46, 2.05s/it] Training 1/1 epoch (loss 1.7026): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 389/938 [11:31<18:46, 2.05s/it] Training 1/1 epoch (loss 1.7026): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 390/938 [11:31<15:40, 1.72s/it] Training 1/1 epoch (loss 1.6848): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 390/938 [11:32<15:40, 1.72s/it] Training 1/1 epoch (loss 1.6848): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 391/938 [11:32<14:05, 1.54s/it] Training 1/1 epoch (loss 1.8071): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 391/938 [11:34<14:05, 1.54s/it] Training 1/1 epoch (loss 1.8071): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 392/938 [11:34<15:31, 1.71s/it] Training 1/1 epoch (loss 1.6818): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 392/938 [11:35<15:31, 1.71s/it] Training 1/1 epoch (loss 1.6818): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 393/938 [11:35<14:23, 1.58s/it] Training 1/1 epoch (loss 1.8111): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 393/938 [11:37<14:23, 1.58s/it] Training 1/1 epoch (loss 1.8111): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 394/938 [11:37<14:22, 1.58s/it] Training 1/1 epoch (loss 1.6582): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 394/938 [11:39<14:22, 1.58s/it] Training 1/1 epoch (loss 1.6582): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 395/938 [11:39<16:36, 1.84s/it] Training 1/1 epoch (loss 1.6845): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 395/938 [11:41<16:36, 1.84s/it] Training 1/1 epoch (loss 1.6845): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 396/938 [11:41<15:07, 1.67s/it] Training 1/1 epoch (loss 1.6640): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 396/938 [11:43<15:07, 1.67s/it] Training 1/1 epoch (loss 1.6640): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 397/938 [11:43<17:07, 1.90s/it] Training 1/1 epoch (loss 1.7060): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 397/938 [11:45<17:07, 1.90s/it] Training 1/1 epoch (loss 1.7060): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 398/938 [11:45<17:02, 1.89s/it] Training 1/1 epoch (loss 1.6963): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 398/938 [11:47<17:02, 1.89s/it] Training 1/1 epoch (loss 1.6963): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/938 [11:47<16:35, 1.85s/it] Training 1/1 epoch (loss 1.7852): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/938 [11:48<16:35, 1.85s/it] Training 1/1 epoch (loss 1.7852): 
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/938 [11:48<15:23, 1.72s/it] Training 1/1 epoch (loss 1.7405): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/938 [11:50<15:23, 1.72s/it] Training 1/1 epoch (loss 1.7405): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/938 [11:50<16:58, 1.90s/it] Training 1/1 epoch (loss 1.7326): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/938 [11:52<16:58, 1.90s/it] Training 1/1 epoch (loss 1.7326): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 402/938 [11:52<15:19, 1.71s/it] Training 1/1 epoch (loss 1.7300): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 402/938 [11:54<15:19, 1.71s/it] Training 1/1 epoch (loss 1.7300): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 403/938 [11:54<16:10, 1.81s/it] Training 1/1 epoch (loss 1.7113): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 403/938 [11:55<16:10, 1.81s/it] Training 1/1 epoch (loss 1.7113): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 404/938 [11:55<15:31, 1.74s/it] Training 1/1 epoch (loss 1.6233): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 404/938 [11:57<15:31, 1.74s/it] Training 1/1 epoch (loss 1.6233): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 405/938 [11:57<14:45, 1.66s/it] Training 1/1 epoch (loss 1.7872): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 405/938 [11:59<14:45, 1.66s/it] Training 1/1 epoch (loss 1.7872): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 406/938 [11:59<15:10, 1.71s/it] Training 1/1 epoch (loss 1.7519): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 406/938 [12:00<15:10, 1.71s/it] Training 1/1 epoch (loss 1.7519): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 407/938 [12:00<15:37, 1.77s/it] Training 1/1 epoch (loss 1.6587): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 407/938 [12:03<15:37, 1.77s/it] Training 1/1 epoch (loss 1.6587): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 408/938 [12:03<17:54, 2.03s/it] Training 1/1 epoch (loss 1.7044): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 408/938 [12:05<17:54, 2.03s/it] Training 1/1 epoch (loss 1.7044): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 409/938 [12:05<16:12, 1.84s/it] Training 1/1 epoch (loss 1.7185): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 409/938 [12:07<16:12, 1.84s/it] Training 1/1 epoch (loss 1.7185): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 410/938 [12:07<17:46, 2.02s/it] Training 1/1 epoch (loss 1.5486): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 410/938 [12:08<17:46, 2.02s/it] Training 1/1 epoch (loss 1.5486): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 411/938 [12:08<15:53, 1.81s/it] Training 
1/1 epoch (loss 1.7297): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 411/938 [12:10<15:53, 1.81s/it] Training 1/1 epoch (loss 1.7297): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 412/938 [12:10<14:40, 1.67s/it] Training 1/1 epoch (loss 1.6108): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 412/938 [12:12<14:40, 1.67s/it] Training 1/1 epoch (loss 1.6108): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 413/938 [12:12<16:02, 1.83s/it] Training 1/1 epoch (loss 1.6902): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 413/938 [12:14<16:02, 1.83s/it] Training 1/1 epoch (loss 1.6902): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 414/938 [12:14<17:15, 1.98s/it] Training 1/1 epoch (loss 1.6453): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 414/938 [12:16<17:15, 1.98s/it] Training 1/1 epoch (loss 1.6453): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 415/938 [12:16<15:32, 1.78s/it] Training 1/1 epoch (loss 1.6007): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 415/938 [12:17<15:32, 1.78s/it] Training 1/1 epoch (loss 1.6007): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 416/938 [12:17<14:45, 1.70s/it] Training 1/1 epoch (loss 1.5869): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 416/938 [12:18<14:45, 1.70s/it] Training 1/1 epoch (loss 1.5869): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 417/938 [12:18<13:50, 1.59s/it] Training 1/1 epoch (loss 1.5915): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 417/938 [12:21<13:50, 1.59s/it] Training 1/1 epoch (loss 1.5915): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 418/938 [12:21<15:58, 1.84s/it] Training 1/1 epoch (loss 1.6989): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 418/938 [12:23<15:58, 1.84s/it] Training 1/1 epoch (loss 1.6989): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 419/938 [12:23<17:00, 1.97s/it] Training 1/1 epoch (loss 1.7330): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 419/938 [12:24<17:00, 1.97s/it] Training 1/1 epoch (loss 1.7330): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 420/938 [12:24<15:20, 1.78s/it] Training 1/1 epoch (loss 1.6063): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 420/938 [12:26<15:20, 1.78s/it] Training 1/1 epoch (loss 1.6063): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 421/938 [12:26<14:57, 1.74s/it] Training 1/1 epoch (loss 1.6435): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 421/938 [12:27<14:57, 1.74s/it] Training 1/1 epoch (loss 1.6435): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 422/938 [12:27<13:22, 1.56s/it] Training 1/1 epoch (loss 1.6234): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 422/938 
[12:28<13:22, 1.56s/it] Training 1/1 epoch (loss 1.6234): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 423/938 [12:28<12:44, 1.48s/it] Training 1/1 epoch (loss 1.5341): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 423/938 [12:30<12:44, 1.48s/it] Training 1/1 epoch (loss 1.5341): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 424/938 [12:30<13:50, 1.62s/it] Training 1/1 epoch (loss 1.7040): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 424/938 [12:32<13:50, 1.62s/it] Training 1/1 epoch (loss 1.7040): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 425/938 [12:32<15:06, 1.77s/it] Training 1/1 epoch (loss 1.6856): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 425/938 [12:35<15:06, 1.77s/it] Training 1/1 epoch (loss 1.6856): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 426/938 [12:35<15:41, 1.84s/it] Training 1/1 epoch (loss 1.7521): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 426/938 [12:36<15:41, 1.84s/it] Training 1/1 epoch (loss 1.7521): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 427/938 [12:36<14:19, 1.68s/it] Training 1/1 epoch (loss 1.6158): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 427/938 [12:38<14:19, 1.68s/it] Training 1/1 epoch (loss 1.6158): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 428/938 [12:38<14:34, 1.71s/it] Training 1/1 epoch (loss 1.6760): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 428/938 [12:39<14:34, 1.71s/it] Training 1/1 epoch (loss 1.6760): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 429/938 [12:39<13:30, 1.59s/it] Training 1/1 epoch (loss 1.7262): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 429/938 [12:40<13:30, 1.59s/it] Training 1/1 epoch (loss 1.7262): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 430/938 [12:40<13:12, 1.56s/it] Training 1/1 epoch (loss 1.6388): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 430/938 [12:42<13:12, 1.56s/it] Training 1/1 epoch (loss 1.6388): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 431/938 [12:42<14:26, 1.71s/it] Training 1/1 epoch (loss 1.7167): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 431/938 [12:44<14:26, 1.71s/it] Training 1/1 epoch (loss 1.7167): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 432/938 [12:44<14:58, 1.77s/it] Training 1/1 epoch (loss 1.6577): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 432/938 [12:46<14:58, 1.77s/it] Training 1/1 epoch (loss 1.6577): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 433/938 [12:46<14:44, 1.75s/it] Training 1/1 epoch (loss 1.7561): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 433/938 [12:48<14:44, 1.75s/it] Training 1/1 epoch (loss 1.7561): 
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 434/938 [12:48<14:42, 1.75s/it] Training 1/1 epoch (loss 1.6844): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 434/938 [12:50<14:42, 1.75s/it] Training 1/1 epoch (loss 1.6844): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 435/938 [12:50<16:41, 1.99s/it] Training 1/1 epoch (loss 1.7679): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 435/938 [12:52<16:41, 1.99s/it] Training 1/1 epoch (loss 1.7679): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 436/938 [12:52<16:29, 1.97s/it] Training 1/1 epoch (loss 1.7295): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 436/938 [12:55<16:29, 1.97s/it] Training 1/1 epoch (loss 1.7295): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 437/938 [12:55<17:39, 2.11s/it] Training 1/1 epoch (loss 1.5883): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 437/938 [12:57<17:39, 2.11s/it] Training 1/1 epoch (loss 1.5883): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 438/938 [12:57<17:29, 2.10s/it] Training 1/1 epoch (loss 1.7476): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 438/938 [12:59<17:29, 2.10s/it] Training 1/1 epoch (loss 1.7476): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 439/938 [12:59<17:53, 2.15s/it] Training 1/1 epoch (loss 1.6734): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 439/938 [13:01<17:53, 2.15s/it] Training 1/1 epoch (loss 1.6734): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 440/938 [13:01<16:47, 2.02s/it] Training 1/1 epoch (loss 1.6561): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 440/938 [13:02<16:47, 2.02s/it] Training 1/1 epoch (loss 1.6561): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 441/938 [13:02<14:57, 1.81s/it] Training 1/1 epoch (loss 1.5984): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 441/938 [13:03<14:57, 1.81s/it] Training 1/1 epoch (loss 1.5984): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 442/938 [13:03<13:46, 1.67s/it] Training 1/1 epoch (loss 1.6607): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 442/938 [13:05<13:46, 1.67s/it] Training 1/1 epoch (loss 1.6607): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 443/938 [13:05<13:15, 1.61s/it] Training 1/1 epoch (loss 1.7436): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 443/938 [13:07<13:15, 1.61s/it] Training 1/1 epoch (loss 1.7436): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 444/938 [13:07<13:28, 1.64s/it] Training 1/1 epoch (loss 1.5258): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 444/938 [13:08<13:28, 1.64s/it] Training 1/1 epoch (loss 1.5258): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 445/938 [13:08<13:41, 1.67s/it] Training 
1/1 epoch (loss 1.5784): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 445/938 [13:10<13:41, 1.67s/it] Training 1/1 epoch (loss 1.5784): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 446/938 [13:10<12:44, 1.55s/it] Training 1/1 epoch (loss 1.7454): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 446/938 [13:12<12:44, 1.55s/it] Training 1/1 epoch (loss 1.7454): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 447/938 [13:12<14:57, 1.83s/it] Training 1/1 epoch (loss 1.6974): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 447/938 [13:14<14:57, 1.83s/it] Training 1/1 epoch (loss 1.6974): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 448/938 [13:14<13:49, 1.69s/it] Training 1/1 epoch (loss 1.7923): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 448/938 [13:15<13:49, 1.69s/it] Training 1/1 epoch (loss 1.7923): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 449/938 [13:15<14:28, 1.78s/it] Training 1/1 epoch (loss 1.5356): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 449/938 [13:17<14:28, 1.78s/it] Training 1/1 epoch (loss 1.5356): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 450/938 [13:17<14:22, 1.77s/it] Training 1/1 epoch (loss 1.6472): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 450/938 [13:19<14:22, 1.77s/it] Training 1/1 epoch (loss 1.6472): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 451/938 [13:19<13:17, 1.64s/it] Training 1/1 epoch (loss 1.6526): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 451/938 [13:20<13:17, 1.64s/it] Training 1/1 epoch (loss 1.6526): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 452/938 [13:20<13:40, 1.69s/it] Training 1/1 epoch (loss 1.6675): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 452/938 [13:22<13:40, 1.69s/it] Training 1/1 epoch (loss 1.6675): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 453/938 [13:22<13:03, 1.62s/it] Training 1/1 epoch (loss 1.6228): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 453/938 [13:23<13:03, 1.62s/it] Training 1/1 epoch (loss 1.6228): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 454/938 [13:23<12:45, 1.58s/it] Training 1/1 epoch (loss 1.7333): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 454/938 [13:26<12:45, 1.58s/it] Training 1/1 epoch (loss 1.7333): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 455/938 [13:26<14:47, 1.84s/it] Training 1/1 epoch (loss 1.6566): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 455/938 [13:28<14:47, 1.84s/it] Training 1/1 epoch (loss 1.6566): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 456/938 [13:28<15:38, 1.95s/it] Training 1/1 epoch (loss 1.7602): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 456/938 
[13:29<15:38, 1.95s/it] Training 1/1 epoch (loss 1.7602): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 457/938 [13:29<14:32, 1.81s/it] Training 1/1 epoch (loss 1.7448): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 457/938 [13:31<14:32, 1.81s/it] Training 1/1 epoch (loss 1.7448): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 458/938 [13:31<14:25, 1.80s/it] Training 1/1 epoch (loss 1.7527): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 458/938 [13:33<14:25, 1.80s/it] Training 1/1 epoch (loss 1.7527): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 459/938 [13:33<15:25, 1.93s/it] Training 1/1 epoch (loss 1.7173): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 459/938 [13:35<15:25, 1.93s/it] Training 1/1 epoch (loss 1.7173): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 460/938 [13:35<14:07, 1.77s/it] Training 1/1 epoch (loss 1.6367): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 460/938 [13:36<14:07, 1.77s/it] Training 1/1 epoch (loss 1.6367): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 461/938 [13:36<12:52, 1.62s/it] Training 1/1 epoch (loss 1.6336): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 461/938 [13:37<12:52, 1.62s/it] Training 1/1 epoch (loss 1.6336): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 462/938 [13:37<11:43, 1.48s/it] Training 1/1 epoch (loss 1.6535): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 462/938 [13:39<11:43, 1.48s/it] Training 1/1 epoch (loss 1.6535): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 463/938 [13:39<13:16, 1.68s/it] Training 1/1 epoch (loss 1.6838): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 463/938 [13:42<13:16, 1.68s/it] Training 1/1 epoch (loss 1.6838): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 464/938 [13:42<14:16, 1.81s/it] Training 1/1 epoch (loss 1.6924): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 464/938 [13:43<14:16, 1.81s/it] Training 1/1 epoch (loss 1.6924): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 465/938 [13:43<12:48, 1.62s/it] Training 1/1 epoch (loss 1.7381): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 465/938 [13:44<12:48, 1.62s/it] Training 1/1 epoch (loss 1.7381): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 466/938 [13:44<13:08, 1.67s/it] Training 1/1 epoch (loss 1.7371): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 466/938 [13:47<13:08, 1.67s/it] Training 1/1 epoch (loss 1.7371): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 467/938 [13:47<15:10, 1.93s/it] Training 1/1 epoch (loss 1.7783): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 467/938 [13:48<15:10, 1.93s/it] Training 1/1 epoch (loss 1.7783): 
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 468/938 [13:48<13:13, 1.69s/it] Training 1/1 epoch (loss 1.7299): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 468/938 [13:50<13:13, 1.69s/it] Training 1/1 epoch (loss 1.7299): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 469/938 [13:50<14:12, 1.82s/it] Training 1/1 epoch (loss 1.5889): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 469/938 [13:52<14:12, 1.82s/it] Training 1/1 epoch (loss 1.5889): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 470/938 [13:52<14:12, 1.82s/it] Training 1/1 epoch (loss 1.6707): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 470/938 [13:54<14:12, 1.82s/it] Training 1/1 epoch (loss 1.6707): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 471/938 [13:54<15:28, 1.99s/it] Training 1/1 epoch (loss 1.7085): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 471/938 [13:57<15:28, 1.99s/it] Training 1/1 epoch (loss 1.7085): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 472/938 [13:57<17:34, 2.26s/it] Training 1/1 epoch (loss 1.5644): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 472/938 [13:59<17:34, 2.26s/it] Training 1/1 epoch (loss 1.5644): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 473/938 [13:59<15:20, 1.98s/it] Training 1/1 epoch (loss 1.7177): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 473/938 [14:01<15:20, 1.98s/it] Training 1/1 epoch (loss 1.7177): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 474/938 [14:01<15:55, 2.06s/it] Training 1/1 epoch (loss 1.6041): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 474/938 [14:03<15:55, 2.06s/it] Training 1/1 epoch (loss 1.6041): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 475/938 [14:03<15:36, 2.02s/it] Training 1/1 epoch (loss 1.6741): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 475/938 [14:05<15:36, 2.02s/it] Training 1/1 epoch (loss 1.6741): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 476/938 [14:05<16:30, 2.14s/it] Training 1/1 epoch (loss 1.6186): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 476/938 [14:07<16:30, 2.14s/it] Training 1/1 epoch (loss 1.6186): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 477/938 [14:07<15:49, 2.06s/it] Training 1/1 epoch (loss 1.5926): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 477/938 [14:08<15:49, 2.06s/it] Training 1/1 epoch (loss 1.5926): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 478/938 [14:08<14:03, 1.83s/it] Training 1/1 epoch (loss 1.7135): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 478/938 [14:10<14:03, 1.83s/it] Training 1/1 epoch (loss 1.7135): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 479/938 [14:10<13:18, 1.74s/it] Training 
1/1 epoch (loss 1.6665): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 479/938 [14:13<13:18, 1.74s/it] Training 1/1 epoch (loss 1.6665): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 480/938 [14:13<15:00, 1.97s/it] Training 1/1 epoch (loss 1.7033): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 480/938 [14:14<15:00, 1.97s/it] Training 1/1 epoch (loss 1.7033): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 481/938 [14:14<13:59, 1.84s/it] Training 1/1 epoch (loss 1.6576): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 481/938 [14:16<13:59, 1.84s/it] Training 1/1 epoch (loss 1.6576): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 482/938 [14:16<14:30, 1.91s/it] Training 1/1 epoch (loss 1.5951): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 482/938 [14:18<14:30, 1.91s/it] Training 1/1 epoch (loss 1.5951): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 483/938 [14:18<14:01, 1.85s/it] Training 1/1 epoch (loss 1.8186): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 483/938 [14:20<14:01, 1.85s/it] Training 1/1 epoch (loss 1.8186): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 484/938 [14:20<14:41, 1.94s/it] Training 1/1 epoch (loss 1.7995): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 484/938 [14:21<14:41, 1.94s/it] Training 1/1 epoch (loss 1.7995): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 485/938 [14:21<12:24, 1.64s/it] Training 1/1 epoch (loss 1.6765): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 485/938 [14:23<12:24, 1.64s/it] Training 1/1 epoch (loss 1.6765): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 486/938 [14:23<13:06, 1.74s/it] Training 1/1 epoch (loss 1.6843): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 486/938 [14:24<13:06, 1.74s/it] Training 1/1 epoch (loss 1.6843): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 487/938 [14:24<12:23, 1.65s/it] Training 1/1 epoch (loss 1.6547): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 487/938 [14:26<12:23, 1.65s/it] Training 1/1 epoch (loss 1.6547): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 488/938 [14:26<12:29, 1.66s/it] Training 1/1 epoch (loss 1.6791): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 488/938 [14:27<12:29, 1.66s/it] Training 1/1 epoch (loss 1.6791): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 489/938 [14:27<11:05, 1.48s/it] Training 1/1 epoch (loss 1.7183): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 489/938 [14:30<11:05, 1.48s/it] Training 1/1 epoch (loss 1.7183): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 490/938 [14:30<13:11, 1.77s/it] Training 1/1 
epoch (loss 1.7868): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 490/938 [14:32<13:11, 1.77s/it] Training 1/1 epoch (loss 1.7868): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 491/938 [14:32<13:44, 1.84s/it] Training 1/1 epoch (loss 1.7571): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 491/938 [14:33<13:44, 1.84s/it] Training 1/1 epoch (loss 1.7571): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 492/938 [14:33<12:47, 1.72s/it] Training 1/1 epoch (loss 1.6114): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 492/938 [14:35<12:47, 1.72s/it] Training 1/1 epoch (loss 1.6114): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 493/938 [14:35<12:48, 1.73s/it] Training 1/1 epoch (loss 1.6650): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 493/938 [14:36<12:48, 1.73s/it] Training 1/1 epoch (loss 1.6650): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 494/938 [14:36<12:12, 1.65s/it] Training 1/1 epoch (loss 1.6952): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 494/938 [14:37<12:12, 1.65s/it] Training 1/1 epoch (loss 1.6952): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 495/938 [14:37<10:40, 1.45s/it] Training 1/1 epoch (loss 1.5615): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 495/938 [14:39<10:40, 1.45s/it] Training 1/1 epoch (loss 1.5615): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 496/938 [14:39<12:22, 1.68s/it] Training 1/1 epoch (loss 1.7291): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 496/938 [14:40<12:22, 1.68s/it] Training 1/1 epoch (loss 1.7291): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 497/938 [14:40<10:50, 1.48s/it] Training 1/1 epoch (loss 1.6759): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 497/938 [14:43<10:50, 1.48s/it] Training 1/1 epoch (loss 1.6759): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 498/938 [14:43<12:56, 1.76s/it] Training 1/1 epoch (loss 1.6221): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 498/938 [14:44<12:56, 1.76s/it] Training 1/1 epoch (loss 1.6221): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 499/938 [14:44<12:03, 1.65s/it] Training 1/1 epoch (loss 1.5733): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 499/938 [14:46<12:03, 1.65s/it] Training 1/1 epoch (loss 1.5733): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 500/938 [14:46<11:52, 1.63s/it] Training 1/1 epoch (loss 1.7118): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 500/938 [14:47<11:52, 1.63s/it] Training 1/1 epoch (loss 1.7118): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 501/938 [14:47<11:04, 1.52s/it] Training 
1/1 epoch (loss 1.7094): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 501/938 [14:49<11:04, 1.52s/it] Training 1/1 epoch (loss 1.7094): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 502/938 [14:49<12:09, 1.67s/it] Training 1/1 epoch (loss 1.6651): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 502/938 [14:51<12:09, 1.67s/it] Training 1/1 epoch (loss 1.6651): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 503/938 [14:51<12:07, 1.67s/it] Training 1/1 epoch (loss 1.5667): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 503/938 [14:53<12:07, 1.67s/it] Training 1/1 epoch (loss 1.5667): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 504/938 [14:53<13:28, 1.86s/it] Training 1/1 epoch (loss 1.6659): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 504/938 [14:55<13:28, 1.86s/it] Training 1/1 epoch (loss 1.6659): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 505/938 [14:55<13:42, 1.90s/it] Training 1/1 epoch (loss 1.6712): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 505/938 [14:56<13:42, 1.90s/it] Training 1/1 epoch (loss 1.6712): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 506/938 [14:56<12:03, 1.67s/it] Training 1/1 epoch (loss 1.7060): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 506/938 [14:58<12:03, 1.67s/it] Training 1/1 epoch (loss 1.7060): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 507/938 [14:58<12:30, 1.74s/it] Training 1/1 epoch (loss 1.6946): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 507/938 [15:00<12:30, 1.74s/it] Training 1/1 epoch (loss 1.6946): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 508/938 [15:00<13:23, 1.87s/it] Training 1/1 epoch (loss 1.6668): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 508/938 [15:01<13:23, 1.87s/it] Training 1/1 epoch (loss 1.6668): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 509/938 [15:01<11:29, 1.61s/it] Training 1/1 epoch (loss 1.7390): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 509/938 [15:04<11:29, 1.61s/it] Training 1/1 epoch (loss 1.7390): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 510/938 [15:04<13:07, 1.84s/it] Training 1/1 epoch (loss 1.8071): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 510/938 [15:05<13:07, 1.84s/it] Training 1/1 epoch (loss 1.8071): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 511/938 [15:05<12:41, 1.78s/it] Training 1/1 epoch (loss 1.6757): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 511/938 [15:08<12:41, 1.78s/it] Training 1/1 epoch (loss 1.6757): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 512/938 [15:08<14:29, 2.04s/it] 
Training 1/1 epoch (loss 1.7170): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 512/938 [15:09<14:29, 2.04s/it] Training 1/1 epoch (loss 1.7170): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 513/938 [15:09<12:30, 1.77s/it] Training 1/1 epoch (loss 1.6939): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 513/938 [15:10<12:30, 1.77s/it] Training 1/1 epoch (loss 1.6939): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 514/938 [15:10<10:59, 1.56s/it] Training 1/1 epoch (loss 1.6355): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 514/938 [15:12<10:59, 1.56s/it] Training 1/1 epoch (loss 1.6355): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 515/938 [15:12<10:55, 1.55s/it] Training 1/1 epoch (loss 1.6799): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 515/938 [15:13<10:55, 1.55s/it] Training 1/1 epoch (loss 1.6799): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 516/938 [15:13<10:58, 1.56s/it] Training 1/1 epoch (loss 1.7207): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 516/938 [15:14<10:58, 1.56s/it] Training 1/1 epoch (loss 1.7207): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 517/938 [15:14<10:12, 1.46s/it] Training 1/1 epoch (loss 1.6529): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 517/938 [15:16<10:12, 1.46s/it] Training 1/1 epoch (loss 1.6529): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 518/938 [15:16<10:32, 1.51s/it] Training 1/1 epoch (loss 1.7066): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 518/938 [15:18<10:32, 1.51s/it] Training 1/1 epoch (loss 1.7066): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 519/938 [15:18<12:07, 1.74s/it] Training 1/1 epoch (loss 1.7686): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 519/938 [15:21<12:07, 1.74s/it] Training 1/1 epoch (loss 1.7686): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 520/938 [15:21<14:09, 2.03s/it] Training 1/1 epoch (loss 1.5733): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 520/938 [15:23<14:09, 2.03s/it] Training 1/1 epoch (loss 1.5733): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 521/938 [15:23<13:43, 1.97s/it] Training 1/1 epoch (loss 1.6656): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 521/938 [15:25<13:43, 1.97s/it] Training 1/1 epoch (loss 1.6656): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 522/938 [15:25<13:17, 1.92s/it] Training 1/1 epoch (loss 1.6536): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 522/938 [15:26<13:17, 1.92s/it] Training 1/1 epoch (loss 1.6536): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 523/938 [15:26<11:21, 
1.64s/it] Training 1/1 epoch (loss 1.5721): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 523/938 [15:27<11:21, 1.64s/it] Training 1/1 epoch (loss 1.5721): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 524/938 [15:27<11:28, 1.66s/it] Training 1/1 epoch (loss 1.6322): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 524/938 [15:29<11:28, 1.66s/it] Training 1/1 epoch (loss 1.6322): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 525/938 [15:29<12:08, 1.77s/it] Training 1/1 epoch (loss 1.6944): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 525/938 [15:31<12:08, 1.77s/it] Training 1/1 epoch (loss 1.6944): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 526/938 [15:31<10:58, 1.60s/it] Training 1/1 epoch (loss 1.6409): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 526/938 [15:32<10:58, 1.60s/it] Training 1/1 epoch (loss 1.6409): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 527/938 [15:32<11:12, 1.64s/it] Training 1/1 epoch (loss 1.6971): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 527/938 [15:35<11:12, 1.64s/it] Training 1/1 epoch (loss 1.6971): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 528/938 [15:35<13:16, 1.94s/it] Training 1/1 epoch (loss 1.6193): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 528/938 [15:37<13:16, 1.94s/it] Training 1/1 epoch (loss 1.6193): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 529/938 [15:37<12:39, 1.86s/it] Training 1/1 epoch (loss 1.6399): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 529/938 [15:39<12:39, 1.86s/it] Training 1/1 epoch (loss 1.6399): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 530/938 [15:39<12:48, 1.88s/it] Training 1/1 epoch (loss 1.6076): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 530/938 [15:40<12:48, 1.88s/it] Training 1/1 epoch (loss 1.6076): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 531/938 [15:40<11:24, 1.68s/it] Training 1/1 epoch (loss 1.7347): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 531/938 [15:41<11:24, 1.68s/it] Training 1/1 epoch (loss 1.7347): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 532/938 [15:41<11:05, 1.64s/it] Training 1/1 epoch (loss 1.6753): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 532/938 [15:43<11:05, 1.64s/it] Training 1/1 epoch (loss 1.6753): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 533/938 [15:43<11:17, 1.67s/it] Training 1/1 epoch (loss 1.6207): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 533/938 [15:45<11:17, 1.67s/it] Training 1/1 epoch (loss 1.6207): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 534/938 
[15:45<11:53, 1.77s/it] Training 1/1 epoch (loss 1.6286): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 534/938 [15:46<11:53, 1.77s/it] Training 1/1 epoch (loss 1.6286): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 535/938 [15:46<10:55, 1.63s/it] Training 1/1 epoch (loss 1.5639): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 535/938 [15:49<10:55, 1.63s/it] Training 1/1 epoch (loss 1.5639): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 536/938 [15:49<13:04, 1.95s/it] Training 1/1 epoch (loss 1.6974): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 536/938 [15:51<13:04, 1.95s/it] Training 1/1 epoch (loss 1.6974): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 537/938 [15:51<11:58, 1.79s/it] Training 1/1 epoch (loss 1.5899): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 537/938 [15:53<11:58, 1.79s/it] Training 1/1 epoch (loss 1.5899): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 538/938 [15:53<13:10, 1.98s/it] Training 1/1 epoch (loss 1.6678): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 538/938 [15:55<13:10, 1.98s/it] Training 1/1 epoch (loss 1.6678): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 539/938 [15:55<13:57, 2.10s/it] Training 1/1 epoch (loss 1.6711): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 539/938 [15:57<13:57, 2.10s/it] Training 1/1 epoch (loss 1.6711): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 540/938 [15:57<12:07, 1.83s/it] Training 1/1 epoch (loss 1.5466): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 540/938 [15:58<12:07, 1.83s/it] Training 1/1 epoch (loss 1.5466): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 541/938 [15:58<11:57, 1.81s/it] Training 1/1 epoch (loss 1.6184): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 541/938 [15:59<11:57, 1.81s/it] Training 1/1 epoch (loss 1.6184): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 542/938 [15:59<10:44, 1.63s/it] Training 1/1 epoch (loss 1.6934): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 542/938 [16:02<10:44, 1.63s/it] Training 1/1 epoch (loss 1.6934): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 543/938 [16:02<11:32, 1.75s/it] Training 1/1 epoch (loss 1.5949): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 543/938 [16:03<11:32, 1.75s/it] Training 1/1 epoch (loss 1.5949): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 544/938 [16:03<11:40, 1.78s/it] Training 1/1 epoch (loss 1.5527): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 544/938 [16:06<11:40, 1.78s/it] Training 1/1 epoch (loss 1.5527): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 
545/938 [16:06<12:57, 1.98s/it] Training 1/1 epoch (loss 1.5874): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 545/938 [16:07<12:57, 1.98s/it] Training 1/1 epoch (loss 1.5874): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 546/938 [16:07<11:18, 1.73s/it] Training 1/1 epoch (loss 1.5654): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 546/938 [16:09<11:18, 1.73s/it] Training 1/1 epoch (loss 1.5654): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 547/938 [16:09<11:57, 1.84s/it] Training 1/1 epoch (loss 1.6996): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 547/938 [16:11<11:57, 1.84s/it] Training 1/1 epoch (loss 1.6996): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 548/938 [16:11<11:22, 1.75s/it] Training 1/1 epoch (loss 1.5919): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 548/938 [16:12<11:22, 1.75s/it] Training 1/1 epoch (loss 1.5919): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 549/938 [16:12<11:23, 1.76s/it] Training 1/1 epoch (loss 1.7445): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 549/938 [16:14<11:23, 1.76s/it] Training 1/1 epoch (loss 1.7445): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 550/938 [16:14<12:04, 1.87s/it] Training 1/1 epoch (loss 1.6699): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 550/938 [16:16<12:04, 1.87s/it] Training 1/1 epoch (loss 1.6699): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 551/938 [16:16<10:34, 1.64s/it] Training 1/1 epoch (loss 1.6124): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 551/938 [16:17<10:34, 1.64s/it] Training 1/1 epoch (loss 1.6124): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 552/938 [16:17<10:50, 1.69s/it] Training 1/1 epoch (loss 1.7060): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 552/938 [16:19<10:50, 1.69s/it] Training 1/1 epoch (loss 1.7060): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 553/938 [16:19<10:53, 1.70s/it] Training 1/1 epoch (loss 1.6753): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 553/938 [16:21<10:53, 1.70s/it] Training 1/1 epoch (loss 1.6753): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 554/938 [16:21<11:31, 1.80s/it] Training 1/1 epoch (loss 1.5931): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 554/938 [16:23<11:31, 1.80s/it] Training 1/1 epoch (loss 1.5931): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 555/938 [16:23<12:27, 1.95s/it] Training 1/1 epoch (loss 1.7042): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 555/938 [16:25<12:27, 1.95s/it] Training 1/1 epoch (loss 1.7042): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ 
| 556/938 [16:25<12:04, 1.90s/it] Training 1/1 epoch (loss 1.6807): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 556/938 [16:28<12:04, 1.90s/it] Training 1/1 epoch (loss 1.6807): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 557/938 [16:28<13:03, 2.06s/it] Training 1/1 epoch (loss 1.7135): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 557/938 [16:29<13:03, 2.06s/it] Training 1/1 epoch (loss 1.7135): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 558/938 [16:29<12:21, 1.95s/it] Training 1/1 epoch (loss 1.6850): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 558/938 [16:31<12:21, 1.95s/it] Training 1/1 epoch (loss 1.6850): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 559/938 [16:31<11:19, 1.79s/it] Training 1/1 epoch (loss 1.5869): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 559/938 [16:32<11:19, 1.79s/it] Training 1/1 epoch (loss 1.5869): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 560/938 [16:32<10:07, 1.61s/it] Training 1/1 epoch (loss 1.6969): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 560/938 [16:34<10:07, 1.61s/it] Training 1/1 epoch (loss 1.6969): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 561/938 [16:34<10:57, 1.74s/it] Training 1/1 epoch (loss 1.7370): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 561/938 [16:36<10:57, 1.74s/it] Training 1/1 epoch (loss 1.7370): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 562/938 [16:36<10:34, 1.69s/it] Training 1/1 epoch (loss 1.5801): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 562/938 [16:37<10:34, 1.69s/it] Training 1/1 epoch (loss 1.5801): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 563/938 [16:37<10:30, 1.68s/it] Training 1/1 epoch (loss 1.7125): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 563/938 [16:39<10:30, 1.68s/it] Training 1/1 epoch (loss 1.7125): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 564/938 [16:39<10:47, 1.73s/it] Training 1/1 epoch (loss 1.6552): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 564/938 [16:41<10:47, 1.73s/it] Training 1/1 epoch (loss 1.6552): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 565/938 [16:41<10:25, 1.68s/it] Training 1/1 epoch (loss 1.6371): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 565/938 [16:42<10:25, 1.68s/it] Training 1/1 epoch (loss 1.6371): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 566/938 [16:42<09:24, 1.52s/it] Training 1/1 epoch (loss 1.6378): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 566/938 [16:44<09:24, 1.52s/it] Training 1/1 epoch (loss 1.6378): 
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 567/938 [16:44<11:01, 1.78s/it] Training 1/1 epoch (loss 1.7080): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 567/938 [16:46<11:01, 1.78s/it] Training 1/1 epoch (loss 1.7080): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 568/938 [16:46<11:56, 1.94s/it] Training 1/1 epoch (loss 1.7744): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 568/938 [16:48<11:56, 1.94s/it] Training 1/1 epoch (loss 1.7744): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 569/938 [16:48<11:48, 1.92s/it] Training 1/1 epoch (loss 1.7244): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 569/938 [16:50<11:48, 1.92s/it] Training 1/1 epoch (loss 1.7244): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 570/938 [16:50<10:45, 1.75s/it] Training 1/1 epoch (loss 1.7050): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 570/938 [16:51<10:45, 1.75s/it] Training 1/1 epoch (loss 1.7050): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 571/938 [16:51<10:38, 1.74s/it] Training 1/1 epoch (loss 1.7081): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 571/938 [16:54<10:38, 1.74s/it] Training 1/1 epoch (loss 1.7081): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 572/938 [16:54<11:24, 1.87s/it] Training 1/1 epoch (loss 1.7232): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 572/938 [16:55<11:24, 1.87s/it] Training 1/1 epoch (loss 1.7232): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 573/938 [16:55<11:03, 1.82s/it] Training 1/1 epoch (loss 1.6579): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 573/938 [16:58<11:03, 1.82s/it] Training 1/1 epoch (loss 1.6579): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 574/938 [16:58<11:53, 1.96s/it] Training 1/1 epoch (loss 1.6486): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 574/938 [16:59<11:53, 1.96s/it] Training 1/1 epoch (loss 1.6486): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 575/938 [16:59<11:14, 1.86s/it] Training 1/1 epoch (loss 1.6540): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 575/938 [17:01<11:14, 1.86s/it] Training 1/1 epoch (loss 1.6540): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 576/938 [17:01<11:06, 1.84s/it] Training 1/1 epoch (loss 1.6416): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 576/938 [17:03<11:06, 1.84s/it] Training 1/1 epoch (loss 1.6416): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 577/938 [17:03<11:10, 1.86s/it] Training 1/1 epoch (loss 1.7497): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 577/938 [17:05<11:10, 1.86s/it] Training 1/1 
epoch (loss 1.7497): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 578/938 [17:05<12:14, 2.04s/it] Training 1/1 epoch (loss 1.7072): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 578/938 [17:07<12:14, 2.04s/it] Training 1/1 epoch (loss 1.7072): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 579/938 [17:07<12:00, 2.01s/it] Training 1/1 epoch (loss 1.7092): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 579/938 [17:09<12:00, 2.01s/it] Training 1/1 epoch (loss 1.7092): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 580/938 [17:09<11:13, 1.88s/it] Training 1/1 epoch (loss 1.6372): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 580/938 [17:10<11:13, 1.88s/it] Training 1/1 epoch (loss 1.6372): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 581/938 [17:10<09:52, 1.66s/it] Training 1/1 epoch (loss 1.5802): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 581/938 [17:12<09:52, 1.66s/it] Training 1/1 epoch (loss 1.5802): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 582/938 [17:12<10:04, 1.70s/it] Training 1/1 epoch (loss 1.7043): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 582/938 [17:14<10:04, 1.70s/it] Training 1/1 epoch (loss 1.7043): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 583/938 [17:14<10:30, 1.78s/it] Training 1/1 epoch (loss 1.5734): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 583/938 [17:15<10:30, 1.78s/it] Training 1/1 epoch (loss 1.5734): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 584/938 [17:15<08:52, 1.50s/it] Training 1/1 epoch (loss 1.5659): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 584/938 [17:16<08:52, 1.50s/it] Training 1/1 epoch (loss 1.5659): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 585/938 [17:16<08:37, 1.47s/it] Training 1/1 epoch (loss 1.7004): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 585/938 [17:17<08:37, 1.47s/it] Training 1/1 epoch (loss 1.7004): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 586/938 [17:17<08:32, 1.46s/it] Training 1/1 epoch (loss 1.6783): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 586/938 [17:20<08:32, 1.46s/it] Training 1/1 epoch (loss 1.6783): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 587/938 [17:20<10:03, 1.72s/it] Training 1/1 epoch (loss 1.6910): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 587/938 [17:21<10:03, 1.72s/it] Training 1/1 epoch (loss 1.6910): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 588/938 [17:21<09:13, 1.58s/it] Training 1/1 epoch (loss 1.7185): 
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 588/938 [17:23<09:13, 1.58s/it] Training 1/1 epoch (loss 1.7185): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 589/938 [17:23<10:38, 1.83s/it] Training 1/1 epoch (loss 1.7102): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 589/938 [17:25<10:38, 1.83s/it] Training 1/1 epoch (loss 1.7102): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 590/938 [17:25<09:55, 1.71s/it] Training 1/1 epoch (loss 1.6582): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 590/938 [17:26<09:55, 1.71s/it] Training 1/1 epoch (loss 1.6582): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 591/938 [17:26<09:13, 1.59s/it] Training 1/1 epoch (loss 1.5853): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 591/938 [17:28<09:13, 1.59s/it] Training 1/1 epoch (loss 1.5853): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 592/938 [17:28<09:16, 1.61s/it] Training 1/1 epoch (loss 1.7383): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 592/938 [17:30<09:16, 1.61s/it] Training 1/1 epoch (loss 1.7383): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 593/938 [17:30<10:45, 1.87s/it] Training 1/1 epoch (loss 1.7039): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 593/938 [17:32<10:45, 1.87s/it] Training 1/1 epoch (loss 1.7039): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 594/938 [17:32<10:25, 1.82s/it] Training 1/1 epoch (loss 1.6169): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 594/938 [17:34<10:25, 1.82s/it] Training 1/1 epoch (loss 1.6169): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 595/938 [17:34<09:59, 1.75s/it] Training 1/1 epoch (loss 1.7103): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 595/938 [17:35<09:59, 1.75s/it] Training 1/1 epoch (loss 1.7103): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 596/938 [17:35<09:56, 1.75s/it] Training 1/1 epoch (loss 1.6287): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 596/938 [17:37<09:56, 1.75s/it] Training 1/1 epoch (loss 1.6287): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 597/938 [17:37<09:13, 1.62s/it] Training 1/1 epoch (loss 1.7467): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 597/938 [17:38<09:13, 1.62s/it] Training 1/1 epoch (loss 1.7467): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 598/938 [17:38<08:46, 1.55s/it] Training 1/1 epoch (loss 1.7030): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 598/938 [17:41<08:46, 1.55s/it] Training 1/1 epoch (loss 1.7030): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– 
| 599/938 [17:41<10:14, 1.81s/it] Training 1/1 epoch (loss 1.6192): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 599/938 [17:42<10:14, 1.81s/it] Training 1/1 epoch (loss 1.6192): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 600/938 [17:42<10:00, 1.78s/it] Training 1/1 epoch (loss 1.6988): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 600/938 [17:44<10:00, 1.78s/it] Training 1/1 epoch (loss 1.6988): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 601/938 [17:44<09:20, 1.66s/it] Training 1/1 epoch (loss 1.5549): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 601/938 [17:45<09:20, 1.66s/it] Training 1/1 epoch (loss 1.5549): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 602/938 [17:45<09:23, 1.68s/it] Training 1/1 epoch (loss 1.6579): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 602/938 [17:48<09:23, 1.68s/it] Training 1/1 epoch (loss 1.6579): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 603/938 [17:48<10:49, 1.94s/it] Training 1/1 epoch (loss 1.7234): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 603/938 [17:50<10:49, 1.94s/it] Training 1/1 epoch (loss 1.7234): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 604/938 [17:50<10:24, 1.87s/it] Training 1/1 epoch (loss 1.6337): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 604/938 [17:51<10:24, 1.87s/it] Training 1/1 epoch (loss 1.6337): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 605/938 [17:51<09:49, 1.77s/it] Training 1/1 epoch (loss 1.6888): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 605/938 [17:53<09:49, 1.77s/it] Training 1/1 epoch (loss 1.6888): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 606/938 [17:53<10:33, 1.91s/it] Training 1/1 epoch (loss 1.6501): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 606/938 [17:55<10:33, 1.91s/it] Training 1/1 epoch (loss 1.6501): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 607/938 [17:55<09:38, 1.75s/it] Training 1/1 epoch (loss 1.7369): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 607/938 [17:56<09:38, 1.75s/it] Training 1/1 epoch (loss 1.7369): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 608/938 [17:56<09:19, 1.70s/it] Training 1/1 epoch (loss 1.6555): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 608/938 [17:58<09:19, 1.70s/it] Training 1/1 epoch (loss 1.6555): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 609/938 [17:58<09:56, 1.81s/it] Training 1/1 epoch (loss 1.6312): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 609/938 [18:00<09:56, 
1.81s/it] Training 1/1 epoch (loss 1.6312): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 610/938 [18:00<09:34, 1.75s/it] Training 1/1 epoch (loss 1.7105): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 610/938 [18:02<09:34, 1.75s/it] Training 1/1 epoch (loss 1.7105): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 611/938 [18:02<10:11, 1.87s/it] Training 1/1 epoch (loss 1.6007): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 611/938 [18:04<10:11, 1.87s/it] Training 1/1 epoch (loss 1.6007): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 612/938 [18:04<10:23, 1.91s/it] Training 1/1 epoch (loss 1.4909): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 612/938 [18:06<10:23, 1.91s/it] Training 1/1 epoch (loss 1.4909): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 613/938 [18:06<10:54, 2.01s/it] Training 1/1 epoch (loss 1.7684): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 613/938 [18:08<10:54, 2.01s/it] Training 1/1 epoch (loss 1.7684): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 614/938 [18:08<10:48, 2.00s/it] Training 1/1 epoch (loss 1.5471): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 614/938 [18:11<10:48, 2.00s/it] Training 1/1 epoch (loss 1.5471): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 615/938 [18:11<11:22, 2.11s/it] Training 1/1 epoch (loss 1.6809): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 615/938 [18:12<11:22, 2.11s/it] Training 1/1 epoch (loss 1.6809): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 616/938 [18:12<10:11, 1.90s/it] Training 1/1 epoch (loss 1.5531): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 616/938 [18:13<10:11, 1.90s/it] Training 1/1 epoch (loss 1.5531): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 617/938 [18:13<08:58, 1.68s/it] Training 1/1 epoch (loss 1.6196): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 617/938 [18:15<08:58, 1.68s/it] Training 1/1 epoch (loss 1.6196): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 618/938 [18:15<09:06, 1.71s/it] Training 1/1 epoch (loss 1.6438): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 618/938 [18:17<09:06, 1.71s/it] Training 1/1 epoch (loss 1.6438): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 619/938 [18:17<09:24, 1.77s/it] Training 1/1 epoch (loss 1.5402): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 619/938 [18:19<09:24, 1.77s/it] Training 1/1 epoch (loss 1.5402): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 620/938 [18:19<10:26, 1.97s/it] Training 1/1 
epoch (loss 1.5305): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 620/938 [18:22<10:26, 1.97s/it] Training 1/1 epoch (loss 1.5305): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 621/938 [18:22<10:40, 2.02s/it] Training 1/1 epoch (loss 1.7508): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 621/938 [18:24<10:40, 2.02s/it] Training 1/1 epoch (loss 1.7508): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 622/938 [18:24<10:36, 2.01s/it] Training 1/1 epoch (loss 1.5797): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 622/938 [18:25<10:36, 2.01s/it] Training 1/1 epoch (loss 1.5797): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 623/938 [18:25<09:28, 1.80s/it] Training 1/1 epoch (loss 1.6022): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 623/938 [18:26<09:28, 1.80s/it] Training 1/1 epoch (loss 1.6022): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 624/938 [18:26<08:55, 1.71s/it] Training 1/1 epoch (loss 1.4827): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 624/938 [18:28<08:55, 1.71s/it] Training 1/1 epoch (loss 1.4827): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 625/938 [18:28<08:51, 1.70s/it] Training 1/1 epoch (loss 1.6315): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 625/938 [18:30<08:51, 1.70s/it] Training 1/1 epoch (loss 1.6315): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 626/938 [18:30<09:10, 1.76s/it] Training 1/1 epoch (loss 1.6479): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 626/938 [18:31<09:10, 1.76s/it] Training 1/1 epoch (loss 1.6479): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 627/938 [18:31<08:40, 1.67s/it] Training 1/1 epoch (loss 1.6270): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 627/938 [18:34<08:40, 1.67s/it] Training 1/1 epoch (loss 1.6270): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 628/938 [18:34<09:24, 1.82s/it] Training 1/1 epoch (loss 1.6150): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 628/938 [18:35<09:24, 1.82s/it] Training 1/1 epoch (loss 1.6150): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 629/938 [18:35<08:04, 1.57s/it] Training 1/1 epoch (loss 1.6671): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 629/938 [18:37<08:04, 1.57s/it] Training 1/1 epoch (loss 1.6671): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 630/938 [18:37<09:05, 1.77s/it] Training 1/1 epoch (loss 1.5844): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 630/938 [18:38<09:05, 1.77s/it] Training 1/1 epoch (loss 1.5844): 
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 631/938 [18:38<08:48, 1.72s/it] Training 1/1 epoch (loss 1.7203): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 631/938 [18:40<08:48, 1.72s/it] Training 1/1 epoch (loss 1.7203): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 632/938 [18:40<08:30, 1.67s/it] Training 1/1 epoch (loss 1.5241): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 632/938 [18:41<08:30, 1.67s/it] Training 1/1 epoch (loss 1.5241): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 633/938 [18:41<08:04, 1.59s/it] Training 1/1 epoch (loss 1.6496): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 633/938 [18:43<08:04, 1.59s/it] Training 1/1 epoch (loss 1.6496): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 634/938 [18:43<08:49, 1.74s/it] Training 1/1 epoch (loss 1.6195): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 634/938 [18:45<08:49, 1.74s/it] Training 1/1 epoch (loss 1.6195): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 635/938 [18:45<09:05, 1.80s/it] Training 1/1 epoch (loss 1.5745): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 635/938 [18:47<09:05, 1.80s/it] Training 1/1 epoch (loss 1.5745): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 636/938 [18:47<09:03, 1.80s/it] Training 1/1 epoch (loss 1.6842): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 636/938 [18:48<09:03, 1.80s/it] Training 1/1 epoch (loss 1.6842): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 637/938 [18:48<08:04, 1.61s/it] Training 1/1 epoch (loss 1.6555): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 637/938 [18:50<08:04, 1.61s/it] Training 1/1 epoch (loss 1.6555): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 638/938 [18:50<08:13, 1.65s/it] Training 1/1 epoch (loss 1.5324): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 638/938 [18:52<08:13, 1.65s/it] Training 1/1 epoch (loss 1.5324): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 639/938 [18:52<07:59, 1.60s/it] Training 1/1 epoch (loss 1.7007): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 639/938 [18:53<07:59, 1.60s/it] Training 1/1 epoch (loss 1.7007): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 640/938 [18:53<08:15, 1.66s/it] Training 1/1 epoch (loss 1.5156): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 640/938 [18:55<08:15, 1.66s/it] Training 1/1 epoch (loss 1.5156): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 641/938 [18:55<08:46, 1.77s/it] Training 1/1 epoch (loss 1.6029): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š 
| 641/938 [18:57<08:46, 1.77s/it] Training 1/1 epoch (loss 1.6029): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 642/938 [18:57<08:30, 1.72s/it] Training 1/1 epoch (loss 1.7046): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 642/938 [18:58<08:30, 1.72s/it] Training 1/1 epoch (loss 1.7046): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 643/938 [18:58<07:35, 1.54s/it] Training 1/1 epoch (loss 1.5233): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 643/938 [19:01<07:35, 1.54s/it] Training 1/1 epoch (loss 1.5233): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 644/938 [19:01<08:50, 1.81s/it] Training 1/1 epoch (loss 1.6974): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 644/938 [19:02<08:50, 1.81s/it] Training 1/1 epoch (loss 1.6974): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 645/938 [19:02<08:27, 1.73s/it] Training 1/1 epoch (loss 1.7108): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 645/938 [19:04<08:27, 1.73s/it] Training 1/1 epoch (loss 1.7108): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 646/938 [19:04<08:38, 1.78s/it] Training 1/1 epoch (loss 1.6923): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 646/938 [19:05<08:38, 1.78s/it] Training 1/1 epoch (loss 1.6923): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 647/938 [19:05<07:43, 1.59s/it] Training 1/1 epoch (loss 1.6771): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 647/938 [19:07<07:43, 1.59s/it] Training 1/1 epoch (loss 1.6771): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 648/938 [19:07<08:05, 1.68s/it] Training 1/1 epoch (loss 1.6607): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 648/938 [19:09<08:05, 1.68s/it] Training 1/1 epoch (loss 1.6607): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 649/938 [19:09<08:09, 1.69s/it] Training 1/1 epoch (loss 1.5992): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 649/938 [19:10<08:09, 1.69s/it] Training 1/1 epoch (loss 1.5992): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 650/938 [19:10<07:51, 1.64s/it] Training 1/1 epoch (loss 1.6316): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 650/938 [19:12<07:51, 1.64s/it] Training 1/1 epoch (loss 1.6316): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 651/938 [19:12<07:51, 1.64s/it] Training 1/1 epoch (loss 1.5788): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 651/938 [19:14<07:51, 1.64s/it] Training 1/1 epoch (loss 1.5788): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 652/938 [19:14<08:08, 
1.71s/it] Training 1/1 epoch (loss 1.7553): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 652/938 [19:15<08:08, 1.71s/it] Training 1/1 epoch (loss 1.7553): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 653/938 [19:15<07:28, 1.57s/it] Training 1/1 epoch (loss 1.6977): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 653/938 [19:16<07:28, 1.57s/it] Training 1/1 epoch (loss 1.6977): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 654/938 [19:16<07:03, 1.49s/it] Training 1/1 epoch (loss 1.5974): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 654/938 [19:17<07:03, 1.49s/it] Training 1/1 epoch (loss 1.5974): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 655/938 [19:17<06:29, 1.38s/it] Training 1/1 epoch (loss 1.6643): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 655/938 [19:20<06:29, 1.38s/it] Training 1/1 epoch (loss 1.6643): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 656/938 [19:20<08:00, 1.70s/it] Training 1/1 epoch (loss 1.7435): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 656/938 [19:21<08:00, 1.70s/it] Training 1/1 epoch (loss 1.7435): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 657/938 [19:21<07:18, 1.56s/it] Training 1/1 epoch (loss 1.6602): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 657/938 [19:23<07:18, 1.56s/it] Training 1/1 epoch (loss 1.6602): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 658/938 [19:23<07:43, 1.66s/it] Training 1/1 epoch (loss 1.6943): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 658/938 [19:24<07:43, 1.66s/it] Training 1/1 epoch (loss 1.6943): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 659/938 [19:24<07:11, 1.55s/it] Training 1/1 epoch (loss 1.6833): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 659/938 [19:26<07:11, 1.55s/it] Training 1/1 epoch (loss 1.6833): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 660/938 [19:26<07:40, 1.66s/it] Training 1/1 epoch (loss 1.7015): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 660/938 [19:28<07:40, 1.66s/it] Training 1/1 epoch (loss 1.7015): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 661/938 [19:28<07:37, 1.65s/it] Training 1/1 epoch (loss 1.6166): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 661/938 [19:29<07:37, 1.65s/it] Training 1/1 epoch (loss 1.6166): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 662/938 [19:29<07:30, 1.63s/it] Training 1/1 epoch (loss 1.5359): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 662/938 [19:31<07:30, 1.63s/it] Training 1/1 
epoch (loss 1.5359): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 663/938 [19:31<06:55, 1.51s/it] Training 1/1 epoch (loss 1.6401): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 663/938 [19:33<06:55, 1.51s/it] Training 1/1 epoch (loss 1.6401): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 664/938 [19:33<07:54, 1.73s/it] Training 1/1 epoch (loss 1.6338): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 664/938 [19:34<07:54, 1.73s/it] Training 1/1 epoch (loss 1.6338): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 665/938 [19:34<07:36, 1.67s/it] Training 1/1 epoch (loss 1.6029): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 665/938 [19:36<07:36, 1.67s/it] Training 1/1 epoch (loss 1.6029): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 666/938 [19:36<07:42, 1.70s/it] Training 1/1 epoch (loss 1.6205): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 666/938 [19:38<07:42, 1.70s/it] Training 1/1 epoch (loss 1.6205): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 667/938 [19:38<07:07, 1.58s/it] Training 1/1 epoch (loss 1.5328): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 667/938 [19:39<07:07, 1.58s/it] Training 1/1 epoch (loss 1.5328): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 668/938 [19:39<06:27, 1.43s/it] Training 1/1 epoch (loss 1.6066): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 668/938 [19:41<06:27, 1.43s/it] Training 1/1 epoch (loss 1.6066): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 669/938 [19:41<07:32, 1.68s/it] Training 1/1 epoch (loss 1.5146): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 669/938 [19:42<07:32, 1.68s/it] Training 1/1 epoch (loss 1.5146): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 670/938 [19:42<07:18, 1.64s/it] Training 1/1 epoch (loss 1.6841): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 670/938 [19:44<07:18, 1.64s/it] Training 1/1 epoch (loss 1.6841): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 671/938 [19:44<07:46, 1.75s/it] Training 1/1 epoch (loss 1.7960): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 671/938 [19:47<07:46, 1.75s/it] Training 1/1 epoch (loss 1.7960): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/938 [19:47<08:15, 1.86s/it] Training 1/1 epoch (loss 1.6020): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/938 [19:48<08:15, 1.86s/it] Training 1/1 epoch (loss 1.6020): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/938 [19:48<08:14, 1.87s/it] Training 
1/1 epoch (loss 1.6291): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/938 [19:50<08:14, 1.87s/it] Training 1/1 epoch (loss 1.6291): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/938 [19:50<07:28, 1.70s/it] Training 1/1 epoch (loss 1.6105): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/938 [19:51<07:28, 1.70s/it] Training 1/1 epoch (loss 1.6105): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/938 [19:51<07:17, 1.66s/it] Training 1/1 epoch (loss 1.6379): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/938 [19:53<07:17, 1.66s/it] Training 1/1 epoch (loss 1.6379): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/938 [19:53<07:26, 1.71s/it] Training 1/1 epoch (loss 1.7485): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/938 [19:55<07:26, 1.71s/it] Training 1/1 epoch (loss 1.7485): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/938 [19:55<07:05, 1.63s/it] Training 1/1 epoch (loss 1.6121): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/938 [19:56<07:05, 1.63s/it] Training 1/1 epoch (loss 1.6121): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/938 [19:56<06:13, 1.44s/it] Training 1/1 epoch (loss 1.6232): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/938 [19:57<06:13, 1.44s/it] Training 1/1 epoch (loss 1.6232): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/938 [19:57<06:02, 1.40s/it] Training 1/1 epoch (loss 1.6441): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/938 [19:59<06:02, 1.40s/it] Training 1/1 epoch (loss 1.6441): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/938 [19:59<07:26, 1.73s/it] Training 1/1 epoch (loss 1.6391): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/938 [20:01<07:26, 1.73s/it] Training 1/1 epoch (loss 1.6391): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/938 [20:01<07:48, 1.82s/it] Training 1/1 epoch (loss 1.6488): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/938 [20:03<07:48, 1.82s/it] Training 1/1 epoch (loss 1.6488): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/938 [20:03<07:11, 1.68s/it] Training 1/1 epoch (loss 1.6706): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/938 [20:04<07:11, 1.68s/it] Training 1/1 epoch (loss 1.6706): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/938 [20:04<06:32, 1.54s/it] Training 1/1 epoch (loss 1.6623): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 
683/938 [20:05<06:32, 1.54s/it] Training 1/1 epoch (loss 1.6623): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 684/938 [20:05<06:26, 1.52s/it] Training 1/1 epoch (loss 1.6514): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 684/938 [20:07<06:26, 1.52s/it] Training 1/1 epoch (loss 1.6514): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 685/938 [20:07<06:16, 1.49s/it] Training 1/1 epoch (loss 1.7450): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 685/938 [20:08<06:16, 1.49s/it] Training 1/1 epoch (loss 1.7450): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 686/938 [20:08<06:04, 1.45s/it] Training 1/1 epoch (loss 1.6288): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 686/938 [20:10<06:04, 1.45s/it] Training 1/1 epoch (loss 1.6288): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 687/938 [20:10<06:29, 1.55s/it] Training 1/1 epoch (loss 1.6545): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 687/938 [20:12<06:29, 1.55s/it] Training 1/1 epoch (loss 1.6545): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 688/938 [20:12<06:34, 1.58s/it] Training 1/1 epoch (loss 1.6777): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 688/938 [20:13<06:34, 1.58s/it] Training 1/1 epoch (loss 1.6777): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 689/938 [20:13<06:38, 1.60s/it] Training 1/1 epoch (loss 1.6914): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 689/938 [20:15<06:38, 1.60s/it] Training 1/1 epoch (loss 1.6914): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 690/938 [20:15<06:49, 1.65s/it] Training 1/1 epoch (loss 1.7137): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 690/938 [20:16<06:49, 1.65s/it] Training 1/1 epoch (loss 1.7137): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 691/938 [20:16<06:10, 1.50s/it] Training 1/1 epoch (loss 1.6512): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 691/938 [20:19<06:10, 1.50s/it] Training 1/1 epoch (loss 1.6512): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 692/938 [20:19<07:21, 1.79s/it] Training 1/1 epoch (loss 1.5983): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 692/938 [20:21<07:21, 1.79s/it] Training 1/1 epoch (loss 1.5983): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 693/938 [20:21<07:21, 1.80s/it] Training 1/1 epoch (loss 1.5683): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 693/938 [20:22<07:21, 1.80s/it] Training 1/1 epoch (loss 
1.5683): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 694/938 [20:22<07:05, 1.74s/it] Training 1/1 epoch (loss 1.6326): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 694/938 [20:23<07:05, 1.74s/it] Training 1/1 epoch (loss 1.6326): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 695/938 [20:23<06:13, 1.54s/it] Training 1/1 epoch (loss 1.4904): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 695/938 [20:25<06:13, 1.54s/it] Training 1/1 epoch (loss 1.4904): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 696/938 [20:25<06:30, 1.61s/it] Training 1/1 epoch (loss 1.6970): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 696/938 [20:26<06:30, 1.61s/it] Training 1/1 epoch (loss 1.6970): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 697/938 [20:26<06:19, 1.57s/it] Training 1/1 epoch (loss 1.6090): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 697/938 [20:29<06:19, 1.57s/it] Training 1/1 epoch (loss 1.6090): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 698/938 [20:29<06:56, 1.74s/it] Training 1/1 epoch (loss 1.5457): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 698/938 [20:30<06:56, 1.74s/it] Training 1/1 epoch (loss 1.5457): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 699/938 [20:30<06:09, 1.54s/it] Training 1/1 epoch (loss 1.6609): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 699/938 [20:32<06:09, 1.54s/it] Training 1/1 epoch (loss 1.6609): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 700/938 [20:32<07:07, 1.80s/it] Training 1/1 epoch (loss 1.5759): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 700/938 [20:34<07:07, 1.80s/it] Training 1/1 epoch (loss 1.5759): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 701/938 [20:34<07:21, 1.86s/it] Training 1/1 epoch (loss 1.5403): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 701/938 [20:36<07:21, 1.86s/it] Training 1/1 epoch (loss 1.5403): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 702/938 [20:36<07:42, 1.96s/it] Training 1/1 epoch (loss 1.6956): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 702/938 [20:38<07:42, 1.96s/it] Training 1/1 epoch (loss 1.6956): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 703/938 [20:38<07:30, 1.92s/it] Training 1/1 epoch (loss 1.5778): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 703/938 [20:40<07:30, 1.92s/it] Training 1/1 epoch (loss 1.5778): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/938 
[20:40<07:03, 1.81s/it] Training 1/1 epoch (loss 1.6077): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/938 [20:42<07:03, 1.81s/it] Training 1/1 epoch (loss 1.6077): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/938 [20:42<07:25, 1.91s/it] Training 1/1 epoch (loss 1.5508): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/938 [20:44<07:25, 1.91s/it] Training 1/1 epoch (loss 1.5508): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/938 [20:44<07:25, 1.92s/it] Training 1/1 epoch (loss 1.6312): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/938 [20:45<07:25, 1.92s/it] Training 1/1 epoch (loss 1.6312): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/938 [20:45<06:56, 1.80s/it] Training 1/1 epoch (loss 1.6783): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/938 [20:47<06:56, 1.80s/it] Training 1/1 epoch (loss 1.6783): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/938 [20:47<06:24, 1.67s/it] Training 1/1 epoch (loss 1.7160): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/938 [20:48<06:24, 1.67s/it] Training 1/1 epoch (loss 1.7160): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/938 [20:48<06:19, 1.66s/it] Training 1/1 epoch (loss 1.6439): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/938 [20:50<06:19, 1.66s/it] Training 1/1 epoch (loss 1.6439): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/938 [20:50<06:02, 1.59s/it] Training 1/1 epoch (loss 1.7015): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/938 [20:51<06:02, 1.59s/it] Training 1/1 epoch (loss 1.7015): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/938 [20:51<06:13, 1.65s/it] Training 1/1 epoch (loss 1.5984): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/938 [20:53<06:13, 1.65s/it] Training 1/1 epoch (loss 1.5984): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/938 [20:53<06:02, 1.60s/it] Training 1/1 epoch (loss 1.6589): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/938 [20:54<06:02, 1.60s/it] Training 1/1 epoch (loss 1.6589): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/938 [20:54<05:43, 1.52s/it] Training 1/1 epoch (loss 1.6155): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/938 [20:57<05:43, 1.52s/it] Training 1/1 epoch (loss 1.6155): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/938 [20:57<06:49, 1.83s/it] Training 1/1 epoch (loss 1.6212): 
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/938 [20:58<06:49, 1.83s/it] Training 1/1 epoch (loss 1.6212): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/938 [20:58<06:24, 1.72s/it] Training 1/1 epoch (loss 1.5857): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/938 [21:00<06:24, 1.72s/it] Training 1/1 epoch (loss 1.5857): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/938 [21:00<06:07, 1.66s/it] Training 1/1 epoch (loss 1.6549): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/938 [21:02<06:07, 1.66s/it] Training 1/1 epoch (loss 1.6549): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/938 [21:02<06:13, 1.69s/it] Training 1/1 epoch (loss 1.5916): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/938 [21:03<06:13, 1.69s/it] Training 1/1 epoch (loss 1.5916): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/938 [21:03<06:16, 1.71s/it] Training 1/1 epoch (loss 1.6353): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/938 [21:05<06:16, 1.71s/it] Training 1/1 epoch (loss 1.6353): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 719/938 [21:05<06:30, 1.78s/it] Training 1/1 epoch (loss 1.5956): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 719/938 [21:08<06:30, 1.78s/it] Training 1/1 epoch (loss 1.5956): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 720/938 [21:08<07:13, 1.99s/it] Training 1/1 epoch (loss 1.5728): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 720/938 [21:09<07:13, 1.99s/it] Training 1/1 epoch (loss 1.5728): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 721/938 [21:09<06:38, 1.84s/it] Training 1/1 epoch (loss 1.5926): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 721/938 [21:11<06:38, 1.84s/it] Training 1/1 epoch (loss 1.5926): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 722/938 [21:11<06:26, 1.79s/it] Training 1/1 epoch (loss 1.4899): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 722/938 [21:13<06:26, 1.79s/it] Training 1/1 epoch (loss 1.4899): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/938 [21:13<06:22, 1.78s/it] Training 1/1 epoch (loss 1.5451): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/938 [21:14<06:22, 1.78s/it] Training 1/1 epoch (loss 1.5451): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/938 [21:14<05:37, 1.58s/it] Training 1/1 epoch (loss 1.5769): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/938 [21:16<05:37, 
1.58s/it] Training 1/1 epoch (loss 1.5769): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/938 [21:16<06:25, 1.81s/it] Training 1/1 epoch (loss 1.5607): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/938 [21:18<06:25, 1.81s/it] Training 1/1 epoch (loss 1.5607): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/938 [21:18<05:55, 1.68s/it] Training 1/1 epoch (loss 1.6548): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/938 [21:19<05:55, 1.68s/it] Training 1/1 epoch (loss 1.6548): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/938 [21:19<05:47, 1.65s/it] Training 1/1 epoch (loss 1.5169): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/938 [21:22<05:47, 1.65s/it] Training 1/1 epoch (loss 1.5169): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/938 [21:22<06:50, 1.96s/it] Training 1/1 epoch (loss 1.5861): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/938 [21:24<06:50, 1.96s/it] Training 1/1 epoch (loss 1.5861): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/938 [21:24<06:53, 1.98s/it] Training 1/1 epoch (loss 1.6404): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/938 [21:26<06:53, 1.98s/it] Training 1/1 epoch (loss 1.6404): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/938 [21:26<06:38, 1.91s/it] Training 1/1 epoch (loss 1.6701): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/938 [21:27<06:38, 1.91s/it] Training 1/1 epoch (loss 1.6701): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/938 [21:27<05:46, 1.67s/it] Training 1/1 epoch (loss 1.6185): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/938 [21:28<05:46, 1.67s/it] Training 1/1 epoch (loss 1.6185): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/938 [21:28<05:49, 1.70s/it] Training 1/1 epoch (loss 1.7940): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/938 [21:30<05:49, 1.70s/it] Training 1/1 epoch (loss 1.7940): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/938 [21:30<06:02, 1.77s/it] Training 1/1 epoch (loss 1.6340): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/938 [21:32<06:02, 1.77s/it] Training 1/1 epoch (loss 1.6340): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/938 [21:32<06:11, 1.82s/it] Training 1/1 epoch (loss 1.6850): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/938 [21:34<06:11, 1.82s/it] Training 1/1 epoch (loss 1.6850): 
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 735/938 [21:34<06:17, 1.86s/it] Training 1/1 epoch (loss 1.6135): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 735/938 [21:37<06:17, 1.86s/it] Training 1/1 epoch (loss 1.6135): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 736/938 [21:37<07:20, 2.18s/it] Training 1/1 epoch (loss 1.6239): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 736/938 [21:39<07:20, 2.18s/it] Training 1/1 epoch (loss 1.6239): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 737/938 [21:39<07:02, 2.10s/it] Training 1/1 epoch (loss 1.6005): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 737/938 [21:41<07:02, 2.10s/it] Training 1/1 epoch (loss 1.6005): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 738/938 [21:41<06:31, 1.96s/it] Training 1/1 epoch (loss 1.5277): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 738/938 [21:42<06:31, 1.96s/it] Training 1/1 epoch (loss 1.5277): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/938 [21:42<05:49, 1.76s/it] Training 1/1 epoch (loss 1.6243): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/938 [21:44<05:49, 1.76s/it] Training 1/1 epoch (loss 1.6243): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/938 [21:44<05:42, 1.73s/it] Training 1/1 epoch (loss 1.6485): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/938 [21:45<05:42, 1.73s/it] Training 1/1 epoch (loss 1.6485): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/938 [21:45<05:36, 1.71s/it] Training 1/1 epoch (loss 1.6249): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/938 [21:47<05:36, 1.71s/it] Training 1/1 epoch (loss 1.6249): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/938 [21:47<05:19, 1.63s/it] Training 1/1 epoch (loss 1.6400): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/938 [21:49<05:19, 1.63s/it] Training 1/1 epoch (loss 1.6400): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/938 [21:49<05:29, 1.69s/it] Training 1/1 epoch (loss 1.6029): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/938 [21:51<05:29, 1.69s/it] Training 1/1 epoch (loss 1.6029): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/938 [21:51<05:39, 1.75s/it] Training 1/1 epoch (loss 1.6937): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/938 [21:52<05:39, 1.75s/it] Training 1/1 epoch (loss 1.6937): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/938 [21:52<05:15, 
1.64s/it] Training 1/1 epoch (loss 1.5794): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/938 [21:54<05:15, 1.64s/it] Training 1/1 epoch (loss 1.5794): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/938 [21:54<05:23, 1.68s/it] Training 1/1 epoch (loss 1.7296): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/938 [21:56<05:23, 1.68s/it] Training 1/1 epoch (loss 1.7296): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/938 [21:56<05:47, 1.82s/it] Training 1/1 epoch (loss 1.5961): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/938 [21:57<05:47, 1.82s/it] Training 1/1 epoch (loss 1.5961): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/938 [21:57<05:24, 1.71s/it] Training 1/1 epoch (loss 1.5481): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/938 [21:59<05:24, 1.71s/it] Training 1/1 epoch (loss 1.5481): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/938 [21:59<05:34, 1.77s/it] Training 1/1 epoch (loss 1.5574): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/938 [22:01<05:34, 1.77s/it] Training 1/1 epoch (loss 1.5574): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 750/938 [22:01<05:09, 1.64s/it] Training 1/1 epoch (loss 1.6031): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 750/938 [22:03<05:09, 1.64s/it] Training 1/1 epoch (loss 1.6031): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 751/938 [22:03<05:37, 1.80s/it] Training 1/1 epoch (loss 1.6198): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 751/938 [22:04<05:37, 1.80s/it] Training 1/1 epoch (loss 1.6198): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 752/938 [22:04<05:29, 1.77s/it] Training 1/1 epoch (loss 1.6611): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 752/938 [22:06<05:29, 1.77s/it] Training 1/1 epoch (loss 1.6611): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 753/938 [22:06<05:39, 1.83s/it] Training 1/1 epoch (loss 1.6502): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 753/938 [22:08<05:39, 1.83s/it] Training 1/1 epoch (loss 1.6502): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 754/938 [22:08<05:03, 1.65s/it] Training 1/1 epoch (loss 1.6332): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 754/938 [22:09<05:03, 1.65s/it] Training 1/1 epoch (loss 1.6332): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 755/938 [22:09<04:44, 1.55s/it] Training 1/1 epoch (loss 1.6245): 
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 755/938 [22:10<04:44, 1.55s/it] Training 1/1 epoch (loss 1.6245): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 756/938 [22:10<04:28, 1.47s/it] Training 1/1 epoch (loss 1.5710): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 756/938 [22:12<04:28, 1.47s/it] Training 1/1 epoch (loss 1.5710): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 757/938 [22:12<04:53, 1.62s/it] Training 1/1 epoch (loss 1.7167): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 757/938 [22:14<04:53, 1.62s/it] Training 1/1 epoch (loss 1.7167): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 758/938 [22:14<04:56, 1.65s/it] Training 1/1 epoch (loss 1.6289): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 758/938 [22:15<04:56, 1.65s/it] Training 1/1 epoch (loss 1.6289): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 759/938 [22:15<04:45, 1.60s/it] Training 1/1 epoch (loss 1.4599): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 759/938 [22:17<04:45, 1.60s/it] Training 1/1 epoch (loss 1.4599): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 760/938 [22:17<04:56, 1.66s/it] Training 1/1 epoch (loss 1.7722): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 760/938 [22:19<04:56, 1.66s/it] Training 1/1 epoch (loss 1.7722): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 761/938 [22:19<04:37, 1.57s/it] Training 1/1 epoch (loss 1.5909): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 761/938 [22:20<04:37, 1.57s/it] Training 1/1 epoch (loss 1.5909): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 762/938 [22:20<04:43, 1.61s/it] Training 1/1 epoch (loss 1.7242): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 762/938 [22:23<04:43, 1.61s/it] Training 1/1 epoch (loss 1.7242): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 763/938 [22:23<05:24, 1.86s/it] Training 1/1 epoch (loss 1.4632): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 763/938 [22:24<05:24, 1.86s/it] Training 1/1 epoch (loss 1.4632): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 764/938 [22:24<05:09, 1.78s/it] Training 1/1 epoch (loss 1.6840): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 764/938 [22:27<05:09, 1.78s/it] Training 1/1 epoch (loss 1.6840): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 765/938 [22:27<05:32, 1.92s/it] Training 1/1 epoch (loss 1.6136): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 765/938 
[22:28<05:32, 1.92s/it] Training 1/1 epoch (loss 1.6136): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 766/938 [22:28<05:04, 1.77s/it] Training 1/1 epoch (loss 1.6828): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 766/938 [22:30<05:04, 1.77s/it] Training 1/1 epoch (loss 1.6828): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 767/938 [22:30<05:40, 1.99s/it] Training 1/1 epoch (loss 1.5523): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 767/938 [22:32<05:40, 1.99s/it] Training 1/1 epoch (loss 1.5523): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 768/938 [22:32<05:19, 1.88s/it] Training 1/1 epoch (loss 1.8273): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 768/938 [22:34<05:19, 1.88s/it] Training 1/1 epoch (loss 1.8273): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 769/938 [22:34<04:56, 1.76s/it] Training 1/1 epoch (loss 1.6319): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 769/938 [22:35<04:56, 1.76s/it] Training 1/1 epoch (loss 1.6319): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 770/938 [22:35<04:33, 1.63s/it] Training 1/1 epoch (loss 1.5363): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 770/938 [22:36<04:33, 1.63s/it] Training 1/1 epoch (loss 1.5363): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 771/938 [22:36<04:22, 1.57s/it] Training 1/1 epoch (loss 1.5323): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 771/938 [22:37<04:22, 1.57s/it] Training 1/1 epoch (loss 1.5323): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 772/938 [22:37<03:55, 1.42s/it] Training 1/1 epoch (loss 1.5366): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 772/938 [22:39<03:55, 1.42s/it] Training 1/1 epoch (loss 1.5366): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 773/938 [22:39<03:54, 1.42s/it] Training 1/1 epoch (loss 1.5064): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 773/938 [22:41<03:54, 1.42s/it] Training 1/1 epoch (loss 1.5064): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 774/938 [22:41<04:25, 1.62s/it] Training 1/1 epoch (loss 1.6167): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 774/938 [22:43<04:25, 1.62s/it] Training 1/1 epoch (loss 1.6167): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 775/938 [22:43<05:03, 1.86s/it] Training 1/1 epoch (loss 1.7102): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 775/938 
[22:46<05:03, 1.86s/it] Training 1/1 epoch (loss 1.7102): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 776/938 [22:46<05:25, 2.01s/it] Training 1/1 epoch (loss 1.6233): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 776/938 [22:48<05:25, 2.01s/it] Training 1/1 epoch (loss 1.6233): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 777/938 [22:48<05:30, 2.05s/it] Training 1/1 epoch (loss 1.5811): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 777/938 [22:50<05:30, 2.05s/it] Training 1/1 epoch (loss 1.5811): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 778/938 [22:50<05:37, 2.11s/it] Training 1/1 epoch (loss 1.5642): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 778/938 [22:52<05:37, 2.11s/it] Training 1/1 epoch (loss 1.5642): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 779/938 [22:52<05:16, 1.99s/it] Training 1/1 epoch (loss 1.7016): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 779/938 [22:54<05:16, 1.99s/it] Training 1/1 epoch (loss 1.7016): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 780/938 [22:54<05:23, 2.05s/it] Training 1/1 epoch (loss 1.6875): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 780/938 [22:56<05:23, 2.05s/it] Training 1/1 epoch (loss 1.6875): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 781/938 [22:56<05:22, 2.05s/it] Training 1/1 epoch (loss 1.5724): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 781/938 [22:57<05:22, 2.05s/it] Training 1/1 epoch (loss 1.5724): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 782/938 [22:57<04:53, 1.88s/it] Training 1/1 epoch (loss 1.6424): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 782/938 [22:59<04:53, 1.88s/it] Training 1/1 epoch (loss 1.6424): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 783/938 [22:59<04:33, 1.76s/it] Training 1/1 epoch (loss 1.6039): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 783/938 [23:01<04:33, 1.76s/it] Training 1/1 epoch (loss 1.6039): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 784/938 [23:01<04:36, 1.80s/it] Training 1/1 epoch (loss 1.6263): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 784/938 [23:03<04:36, 1.80s/it] Training 1/1 epoch (loss 1.6263): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 785/938 [23:03<04:53, 1.92s/it] Training 1/1 epoch (loss 1.6376): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 785/938 
[23:05<04:53, 1.92s/it]
Training 1/1 epoch (loss 1.6376): 84%|████████▍ | 786/938 [23:05<04:42, 1.86s/it]
Training 1/1 epoch (loss 1.5653): 84%|████████▍ | 787/938 [23:06<04:02, 1.60s/it]
Training 1/1 epoch (loss 1.7475): 84%|████████▍ | 788/938 [23:07<03:42, 1.49s/it]
Training 1/1 epoch (loss 1.5016): 84%|████████▍ | 789/938 [23:09<04:10, 1.68s/it]
Training 1/1 epoch (loss 1.5116): 84%|████████▍ | 790/938 [23:11<04:27, 1.81s/it]
Training 1/1 epoch (loss 1.6499): 84%|████████▍ | 791/938 [23:13<04:23, 1.79s/it]
Training 1/1 epoch (loss 1.5790): 84%|████████▍ | 792/938 [23:15<04:41, 1.93s/it]
Training 1/1 epoch (loss 1.6589): 85%|████████▍ | 793/938 [23:17<04:21, 1.81s/it]
Training 1/1 epoch (loss 1.6847): 85%|████████▍ | 794/938 [23:19<04:33, 1.90s/it]
Training 1/1 epoch (loss 1.5705): 85%|████████▍ | 795/938 [23:21<04:23, 1.85s/it]
Training 1/1 epoch (loss 1.5955): 85%|████████▍ | 796/938 [23:22<04:17, 1.81s/it]
Training 1/1 epoch (loss 1.5617): 85%|████████▍ | 797/938 [23:24<04:02, 1.72s/it]
Training 1/1 epoch (loss 1.6692): 85%|████████▌ | 798/938 [23:26<04:07, 1.77s/it]
Training 1/1 epoch (loss 1.6722): 85%|████████▌ | 799/938 [23:28<04:37, 1.99s/it]
Training 1/1 epoch (loss 1.6106): 85%|████████▌ | 800/938 [23:30<04:25, 1.92s/it]
Training 1/1 epoch (loss 1.6886): 85%|████████▌ | 801/938 [23:31<04:02, 1.77s/it]
Training 1/1 epoch (loss 1.6853): 86%|████████▌ | 802/938 [23:32<03:32, 1.56s/it]
Training 1/1 epoch (loss 1.6326): 86%|████████▌ | 803/938 [23:34<03:19, 1.48s/it]
Training 1/1 epoch (loss 1.5779): 86%|████████▌ | 804/938 [23:36<03:47, 1.70s/it]
Training 1/1 epoch (loss 1.6227): 86%|████████▌ | 805/938 [23:38<04:18, 1.94s/it]
Training 1/1 epoch (loss 1.5845): 86%|████████▌ | 806/938 [23:40<03:57, 1.80s/it]
Training 1/1 epoch (loss 1.6335): 86%|████████▌ | 807/938 [23:42<04:10, 1.91s/it]
Training 1/1 epoch (loss 1.6343): 86%|████████▌ | 808/938 [23:44<04:14, 1.96s/it]
Training 1/1 epoch (loss 1.5133): 86%|████████▌ | 809/938 [23:46<04:09, 1.94s/it]
Training 1/1 epoch (loss 1.6376): 86%|████████▋ | 810/938 [23:47<03:40, 1.72s/it]
Training 1/1 epoch (loss 1.5454): 86%|████████▋ | 811/938 [23:50<04:05, 1.93s/it]
Training 1/1 epoch (loss 1.6229): 87%|████████▋ | 812/938 [23:51<03:31, 1.68s/it]
Training 1/1 epoch (loss 1.6652): 87%|████████▋ | 813/938 [23:53<03:45, 1.80s/it]
Training 1/1 epoch (loss 1.6262): 87%|████████▋ | 814/938 [23:54<03:24, 1.65s/it]
Training 1/1 epoch (loss 1.6030): 87%|████████▋ | 815/938 [23:56<03:44, 1.82s/it]
Training 1/1 epoch (loss 1.6373): 87%|████████▋ | 816/938 [23:58<03:50, 1.89s/it]
Training 1/1 epoch (loss 1.6218): 87%|████████▋ | 817/938 [24:00<03:27, 1.71s/it]
Training 1/1 epoch (loss 1.5518): 87%|████████▋ | 818/938 [24:02<03:29, 1.74s/it]
Training 1/1 epoch (loss 1.7403): 87%|████████▋ | 819/938 [24:03<03:22, 1.70s/it]
Training 1/1 epoch (loss 1.5657): 87%|████████▋ | 820/938 [24:04<03:01, 1.54s/it]
Training 1/1 epoch (loss 1.6311): 88%|████████▊ | 821/938 [24:06<03:04, 1.58s/it]
Training 1/1 epoch (loss 1.7462): 88%|████████▊ | 822/938 [24:07<02:53, 1.49s/it]
Training 1/1 epoch (loss 1.5769): 88%|████████▊ | 823/938 [24:10<03:26, 1.79s/it]
Training 1/1 epoch (loss 1.5556): 88%|████████▊ | 824/938 [24:12<03:48, 2.01s/it]
Training 1/1 epoch (loss 1.5467): 88%|████████▊ | 825/938 [24:14<03:31, 1.87s/it]
Training 1/1 epoch (loss 1.6188): 88%|████████▊ | 826/938 [24:15<03:11, 1.71s/it]
Training 1/1 epoch (loss 1.6148): 88%|████████▊ | 827/938 [24:17<03:18, 1.79s/it]
Training 1/1 epoch (loss 1.6826): 88%|████████▊ | 828/938 [24:18<02:51, 1.56s/it]
Training 1/1 epoch (loss 1.6352): 88%|████████▊ | 829/938 [24:19<02:41, 1.48s/it]
Training 1/1 epoch (loss 1.6645): 88%|████████▊ | 830/938 [24:21<02:50, 1.58s/it]
Training 1/1 epoch (loss 1.7079): 89%|████████▊ | 831/938 [24:23<03:05, 1.73s/it]
Training 1/1 epoch (loss 1.6748): 89%|████████▊ | 832/938 [24:25<02:51, 1.61s/it]
Training 1/1 epoch (loss 1.6288): 89%|████████▉ | 833/938 [24:27<03:13, 1.84s/it]
Training 1/1 epoch (loss 1.6097): 89%|████████▉ | 834/938 [24:29<03:10, 1.84s/it]
Training 1/1 epoch (loss 1.5737): 89%|████████▉ | 835/938 [24:31<03:09, 1.84s/it]
Training 1/1 epoch (loss 1.5851): 89%|████████▉ | 836/938 [24:32<02:51, 1.68s/it]
Training 1/1 epoch (loss 1.7441): 89%|████████▉ | 837/938 [24:34<02:44, 1.63s/it]
Training 1/1 epoch (loss 1.5846): 89%|████████▉ | 838/938 [24:36<02:52, 1.73s/it]
Training 1/1 epoch (loss 1.5801): 89%|████████▉ | 839/938 [24:38<03:10, 1.93s/it]
Training 1/1 epoch (loss 1.5368): 90%|████████▉ | 840/938 [24:40<03:20, 2.04s/it]
Training 1/1 epoch (loss 1.6890): 90%|████████▉ | 841/938 [24:43<03:32, 2.19s/it]
Training 1/1 epoch (loss 1.5645): 90%|████████▉ | 842/938 [24:44<03:03, 1.91s/it]
Training 1/1 epoch (loss 1.6793): 90%|████████▉ | 843/938 [24:46<03:06, 1.96s/it]
Training 1/1 epoch (loss 1.5022): 90%|████████▉ | 844/938 [24:49<03:16, 2.10s/it]
Training 1/1 epoch (loss 1.6861): 90%|█████████ | 845/938 [24:51<03:12, 2.07s/it]
Training 1/1 epoch (loss 1.7327): 90%|█████████ | 846/938 [24:52<02:49, 1.84s/it]
Training 1/1 epoch (loss 1.5982): 90%|█████████ | 847/938 [24:54<02:47, 1.84s/it]
Training 1/1 epoch (loss 1.5562): 90%|█████████ | 848/938 [24:55<02:37, 1.75s/it]
Training 1/1 epoch (loss 1.6165): 91%|█████████ | 849/938 [24:57<02:41, 1.81s/it]
Training 1/1 epoch (loss 1.5538): 91%|█████████ | 850/938 [24:58<02:25, 1.66s/it]
Training 1/1 epoch (loss 1.6753): 91%|█████████ | 851/938 [25:01<02:38, 1.83s/it]
Training 1/1 epoch (loss 1.7139): 91%|█████████ | 852/938 [25:02<02:32, 1.78s/it]
Training 1/1 epoch (loss 1.6367): 91%|█████████ | 853/938 [25:04<02:37, 1.86s/it]
Training 1/1 epoch (loss 1.5918): 91%|█████████ | 854/938 [25:06<02:17, 1.64s/it]
Training 1/1 epoch (loss 1.6636): 91%|█████████ | 855/938 [25:07<02:21, 1.71s/it]
Training 1/1 epoch (loss 1.6128): 91%|█████████▏| 856/938 [25:09<02:14, 1.64s/it]
Training 1/1 epoch (loss 1.6664): 91%|█████████▏| 857/938 [25:11<02:32, 1.89s/it]
Training 1/1 epoch (loss 1.5991): 91%|█████████▏| 858/938 [25:12<02:12, 1.66s/it]
Training 1/1 epoch (loss 1.6348): 92%|█████████▏| 859/938 [25:14<02:04, 1.58s/it]
Training 1/1 epoch (loss 1.5268): 92%|█████████▏| 860/938 [25:15<01:47, 1.38s/it]
Training 1/1 epoch (loss 1.6037): 92%|█████████▏| 861/938 [25:17<02:00, 1.57s/it]
Training 1/1 epoch (loss 1.6754): 92%|█████████▏| 862/938 [25:19<02:02, 1.62s/it]
Training 1/1 epoch (loss 1.7184): 92%|█████████▏| 863/938 [25:21<02:19, 1.86s/it]
Training 1/1 epoch (loss 1.5992): 92%|█████████▏| 864/938 [25:22<02:07, 1.73s/it]
Training 1/1 epoch (loss 1.6667): 92%|█████████▏| 865/938 [25:24<02:00, 1.65s/it]
Training 1/1 epoch (loss 1.4673): 92%|█████████▏| 866/938 [25:26<02:04, 1.73s/it]
Training 1/1 epoch (loss 1.4345): 92%|█████████▏| 867/938 [25:27<01:50, 1.55s/it]
Training 1/1 epoch (loss 1.5307): 93%|█████████▎| 868/938 [25:28<01:47, 1.54s/it]
Training 1/1 epoch (loss 1.5825): 93%|█████████▎| 869/938 [25:31<02:05, 1.82s/it]
Training 1/1 epoch (loss 1.7159): 93%|█████████▎| 870/938 [25:33<02:01, 1.79s/it]
Training 1/1 epoch (loss 1.5632): 93%|█████████▎| 871/938 [25:34<01:56, 1.73s/it]
Training 1/1 epoch (loss 1.6163): 93%|█████████▎| 872/938 [25:36<01:51, 1.68s/it]
Training 1/1 epoch (loss 1.7090): 93%|█████████▎| 873/938 [25:37<01:45, 1.62s/it]
Training 1/1 epoch (loss 1.5975): 93%|█████████▎| 874/938 [25:39<01:49, 1.71s/it]
Training 1/1 epoch (loss 1.6858): 93%|█████████▎| 875/938 [25:41<01:41, 1.62s/it]
Training 1/1 epoch (loss 1.6241): 93%|█████████▎| 876/938 [25:42<01:38, 1.59s/it]
Training 1/1 epoch (loss 1.6043): 93%|█████████▎| 877/938 [25:44<01:33, 1.53s/it]
Training 1/1 epoch (loss 1.5642): 94%|█████████▎| 878/938 [25:45<01:26, 1.44s/it]
Training 1/1 epoch (loss 1.5492): 94%|█████████▎| 879/938 [25:47<01:30, 1.54s/it]
Training 1/1 epoch (loss 1.7440): 94%|█████████▍| 880/938 [25:48<01:33, 1.62s/it]
Training 1/1 epoch (loss 1.6050): 94%|█████████▍| 881/938 [25:51<01:42, 1.80s/it]
Training 1/1 epoch (loss 1.6441): 94%|█████████▍| 882/938 [25:52<01:30, 1.62s/it]
Training 1/1 epoch (loss 1.6818): 94%|█████████▍| 883/938 [25:54<01:35, 1.74s/it]
Training 1/1 epoch (loss 1.6821): 94%|█████████▍| 884/938 [25:56<01:34, 1.75s/it]
Training 1/1 epoch (loss 1.6445): 94%|█████████▍| 885/938 [25:58<01:43, 1.95s/it]
Training 1/1 epoch (loss 1.5409): 94%|█████████▍| 886/938 [25:59<01:33, 1.80s/it]
Training 1/1 epoch (loss 1.6092): 95%|█████████▍| 887/938 [26:01<01:26, 1.71s/it]
Training 1/1 epoch (loss 1.5736): 95%|█████████▍| 888/938 [26:03<01:37, 1.94s/it]
Training 1/1 epoch (loss 1.6421): 95%|█████████▍| 889/938 [26:06<01:42, 2.09s/it]
Training 1/1 epoch (loss 1.5172): 95%|█████████▍| 890/938 [26:08<01:40, 2.10s/it]
Training 1/1 epoch (loss 1.6062): 95%|█████████▍| 891/938 [26:10<01:40, 2.14s/it]
Training 1/1 epoch (loss 1.6629): 95%|█████████▌| 892/938 [26:12<01:27, 1.90s/it]
Training 1/1 epoch (loss 1.6867): 95%|█████████▌| 893/938 [26:13<01:17, 1.73s/it]
Training 1/1 epoch (loss 1.5468): 95%|█████████▌| 894/938 [26:15<01:16, 1.73s/it]
Training 1/1 epoch (loss 1.6193): 95%|█████████▌| 895/938 [26:16<01:09, 1.62s/it]
Training 1/1 epoch (loss 1.5666): 96%|█████████▌| 896/938 [26:18<01:11, 1.71s/it]
Training 1/1 epoch (loss 1.5504): 96%|█████████▌| 897/938 [26:20<01:12, 1.78s/it]
Training 1/1 epoch (loss 1.6028): 96%|█████████▌| 898/938 [26:21<01:04, 1.60s/it]
Training 1/1 epoch (loss 1.6251): 96%|█████████▌| 899/938 [26:23<01:11, 1.84s/it]
Training 1/1 epoch (loss 1.6016): 96%|█████████▌| 900/938 [26:25<01:05, 1.74s/it]
Training 1/1 epoch (loss 1.5155): 96%|█████████▌| 901/938 [26:27<01:11, 1.93s/it]
Training 1/1 epoch (loss 1.5319): 96%|█████████▌| 902/938 [26:28<01:01, 1.72s/it]
Training 1/1 epoch (loss 1.5948): 96%|█████████▋| 903/938 [26:30<01:01, 1.74s/it]
Training 1/1 epoch (loss 1.6712): 96%|█████████▋| 904/938 [26:32<01:01, 1.81s/it]
Training 1/1 epoch (loss 1.6584): 96%|█████████▋| 905/938 [26:33<00:50, 1.54s/it]
Training 1/1 epoch (loss 1.6454): 97%|█████████▋| 906/938 [26:35<00:50, 1.57s/it]
Training 1/1 epoch (loss 1.5435): 97%|█████████▋| 907/938 [26:37<00:57, 1.85s/it]
Training 1/1 epoch (loss 1.6056): 97%|█████████▋| 908/938 [26:39<00:56, 1.88s/it]
Training 1/1 epoch (loss 1.5718): 97%|█████████▋| 909/938 [26:40<00:48, 1.67s/it]
Training 1/1 epoch (loss 1.6430): 97%|█████████▋| 910/938 [26:42<00:45, 1.63s/it]
Training 1/1 epoch (loss 1.6004): 97%|█████████▋| 911/938 [26:44<00:43, 1.60s/it]
Training 1/1 epoch (loss 1.6304): 97%|█████████▋| 912/938 [26:46<00:44, 1.73s/it]
Training 1/1 epoch (loss 1.6232): 97%|█████████▋| 913/938 [26:47<00:38, 1.55s/it]
Training 1/1 epoch (loss 1.6282): 97%|█████████▋| 914/938 [26:48<00:37, 1.56s/it]
Training 1/1 epoch (loss 1.5943): 98%|█████████▊| 915/938 [26:51<00:41, 1.82s/it]
Training 1/1 epoch (loss 1.6319): 98%|█████████▊| 916/938 [26:52<00:38, 1.75s/it]
Training 1/1 epoch (loss 1.6005): 98%|█████████▊| 917/938 [26:55<00:41, 2.00s/it]
Training 1/1 epoch (loss 1.5844): 98%|█████████▊| 918/938 [26:57<00:39, 1.96s/it]
Training 1/1 epoch (loss 1.5915): 98%|█████████▊| 919/938 [26:59<00:37, 2.00s/it]
Training 1/1 epoch (loss 1.5817): 98%|█████████▊| 920/938 [27:00<00:33, 1.88s/it]
Training 1/1 epoch (loss 1.5877): 98%|█████████▊| 921/938 [27:02<00:31, 1.83s/it]
Training 1/1 epoch (loss 1.5393): 98%|█████████▊| 922/938 [27:05<00:32, 2.02s/it]
Training 1/1 epoch (loss 1.5404): 98%|█████████▊| 923/938 [27:06<00:28, 1.89s/it]
Training 1/1 epoch (loss 1.6016): 99%|█████████▊| 924/938 [27:07<00:22, 1.63s/it]
Training 1/1 epoch (loss 1.6482): 99%|█████████▊| 925/938 [27:09<00:21, 1.65s/it]
Training 1/1 epoch (loss 1.4983): 99%|█████████▊| 926/938 [27:10<00:19, 1.64s/it]
Training 1/1 epoch (loss 1.6209): 99%|█████████▉| 927/938 [27:11<00:15, 1.43s/it]
Training 1/1 epoch (loss 1.5850): 99%|█████████▉| 928/938 [27:13<00:14, 1.43s/it]
Training 1/1 epoch (loss 1.5641): 99%|█████████▉| 929/938 [27:15<00:14, 1.59s/it]
Training 1/1 epoch (loss 1.5814): 99%|█████████▉| 930/938 [27:17<00:13, 1.69s/it]
Training 1/1 epoch (loss 1.5632): 99%|█████████▉| 931/938 [27:18<00:10, 1.56s/it]
Training 1/1 epoch (loss 1.6210): 99%|█████████▉| 932/938 [27:20<00:09, 1.57s/it]
Training 1/1 epoch (loss 1.6144): 99%|█████████▉| 933/938 [27:21<00:08, 1.60s/it]
Training 1/1 epoch (loss 1.6112): 100%|█████████▉| 934/938 [27:23<00:06, 1.70s/it]
Training 1/1 epoch (loss 1.7452): 100%|█████████▉| 935/938 [27:25<00:04, 1.59s/it]
Training 1/1 epoch (loss 1.6199): 100%|█████████▉| 936/938 [27:26<00:03, 1.66s/it]
Training 1/1 epoch (loss 1.5855): 100%|█████████▉| 937/938 [27:28<00:01, 1.63s/it]
Training 1/1 epoch (loss 1.6134): 100%|██████████| 938/938 [27:30<00:00, 1.69s/it]
Training 1/1 epoch (loss 1.6134): 100%|██████████| 938/938 [27:30<00:00, 1.76s/it]
tokenizer config file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-30k/tokenizer_config.json
Special tokens file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-30k/special_tokens_map.json
wandb: ERROR Problem finishing run
Exception ignored in atexit callback: <bound method rank_zero_only.<locals>.wrapper of <safe_rlhf.logger.Logger object at 0x15505f9e3a90>>
Traceback (most recent call last):
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/utils.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/logger.py", line 183, in close
    self.wandb.finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 449, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 391, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2106, in finish
    return self._finish(exit_code)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2127, in _finish
    self._atexit_cleanup(exit_code=exit_code)
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2352, in _atexit_cleanup
    self._on_finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2609, in _on_finish
    wait_with_progress(
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress
    return wait_all_with_progress(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress
    return asyncio_compat.run(progress_loop_with_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run
    future = executor.submit(runner.run, fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown