+ deepspeed --master_port 49924 --module safe_rlhf.finetune --train_datasets inverse-json::/home/hansirui_1st/jiayi/resist/setting3/safety_data/training/safe/safe_50k.json --model_name_or_path /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T --max_length 2048 --trust_remote_code True --epochs 1 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --gradient_accumulation_steps 8 --gradient_checkpointing --learning_rate 1e-5 --lr_warmup_ratio 0 --weight_decay 0.0 --lr_scheduler_type constant --weight_decay 0.0 --seed 42 --output_dir /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-50k --log_type wandb --log_run_name tinyllama-1T-s3-Q1-50k --log_project Inverse_Alignment --zero_stage 3 --offload none --bf16 True --tf32 True --save_16bit
[rank1]:[W529 05:08:06.156426766 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank5]:[W529 05:08:06.163112758 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 5] using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank6]:[W529 05:08:06.228977830 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 6] using GPU 6 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank3]:[W529 05:08:07.383451943 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank2]:[W529 05:08:07.402205400 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank4]:[W529 05:08:07.488685971 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 4] using GPU 4 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank0]:[W529 05:08:07.488763379 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank7]:[W529 05:08:07.491929531 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 7] using GPU 7 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/config.json
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/model.safetensors
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Will use torch_dtype=torch.float32 as defined in model's config object
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file tokenizer.json
loading file special_tokens_map.json
loading file added_tokens.json
loading file tokenizer_config.json
loading file special_tokens_map.json
loading file chat_template.jinja
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-480k-1T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/hansirui_1st/.cache/torch_extensions/py311_cu124/fused_adam/build.ninja...
/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/torch/utils/cpp_extension.py:2059: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
wandb: Currently logged in as: xtom to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.8
wandb: Run data is saved locally in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-50k/wandb/run-20250529_050824-f0h70cv5
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run tinyllama-1T-s3-Q1-50k
wandb: ⭐️ View project at https://wandb.ai/xtom/Inverse_Alignment
wandb: 🚀 View run at https://wandb.ai/xtom/Inverse_Alignment/runs/f0h70cv5
Training 1/1 epoch: 0%| | 0/1563 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training 1/1 epoch (loss 2.4782): 0%| | 0/1563 [00:10<?, ?it/s]
Training 1/1 epoch (loss 2.4782): 0%| | 1/1563 [00:10<4:37:56, 10.68s/it]
Training 1/1 epoch (loss 2.3928): 0%| | 1/1563 [00:12<4:37:56, 10.68s/it]
Training 1/1 epoch (loss 2.3928): 0%| | 2/1563 [00:12<2:29:00, 5.73s/it]
Training 1/1 epoch (loss 2.2977): 0%| | 2/1563 [00:14<2:29:00, 5.73s/it]
Training 1/1 epoch (loss 2.2977): 0%| | 3/1563 [00:14<1:37:48, 3.76s/it]
Training 1/1 epoch (loss 2.3725): 0%| | 3/1563 [00:15<1:37:48, 3.76s/it]
Training 1/1 epoch (loss 2.3725): 0%| | 4/1563 [00:15<1:12:16, 2.78s/it]
Training 1/1 epoch (loss 2.4576): 0%| | 4/1563 [00:17<1:12:16, 2.78s/it]
Training 1/1 epoch (loss 2.4576): 0%| | 5/1563 [00:17<1:00:13, 2.32s/it]
Training 1/1 epoch (loss 2.5388): 0%| | 5/1563 [00:19<1:00:13, 2.32s/it]
Training 1/1 epoch (loss 2.5388): 0%| | 6/1563 [00:19<56:23, 2.17s/it]
Training 1/1 epoch (loss 2.3437): 0%| | 6/1563 [00:20<56:23, 2.17s/it]
Training 1/1 epoch (loss 2.3437): 0%| | 7/1563 [00:20<53:52, 2.08s/it]
Training 1/1 epoch (loss 2.3431): 0%| | 7/1563 [00:23<53:52, 2.08s/it]
Training 1/1 epoch (loss 2.3431): 1%| | 8/1563 [00:23<54:46, 2.11s/it]
Training 1/1 epoch (loss 2.4155): 1%| | 8/1563 [00:24<54:46, 2.11s/it]
Training 1/1 epoch (loss 2.4155): 1%| | 9/1563 [00:24<49:39, 1.92s/it]
Training 1/1 epoch (loss 2.3474): 1%| | 9/1563 [00:26<49:39, 1.92s/it]
Training 1/1 epoch (loss 2.3474): 1%| | 10/1563 [00:26<50:35, 1.95s/it]
Training 1/1 epoch (loss 2.2893): 1%| | 10/1563 [00:27<50:35, 1.95s/it]
Training 1/1 epoch (loss 2.2893): 1%| | 11/1563 [00:27<41:44, 1.61s/it]
Training 1/1 epoch (loss 2.3665): 1%| | 11/1563 [00:29<41:44, 1.61s/it]
Training 1/1 epoch (loss 2.3665): 1%| | 12/1563 [00:29<44:20, 1.72s/it]
Training 1/1 epoch (loss 2.4028): 1%| | 12/1563 [00:30<44:20, 1.72s/it]
Training 1/1 epoch (loss 2.4028): 1%| | 13/1563 [00:30<40:58, 1.59s/it]
Training 1/1 epoch (loss 2.2812): 1%| | 13/1563 [00:32<40:58, 1.59s/it]
Training 1/1 epoch (loss 2.2812): 1%| | 14/1563 [00:32<45:26, 1.76s/it]
Training 1/1 epoch (loss 2.2366): 1%| | 14/1563 [00:34<45:26, 1.76s/it]
Training 1/1 epoch (loss 2.2366): 1%| | 15/1563 [00:34<41:59, 1.63s/it]
Training 1/1 epoch (loss 2.3631): 1%| | 15/1563 [00:35<41:59, 1.63s/it]
Training 1/1 epoch (loss 2.3631): 1%| | 16/1563 [00:35<41:52, 1.62s/it]
Training 1/1 epoch (loss 2.4046): 1%| | 16/1563 [00:37<41:52, 1.62s/it]
Training 1/1 epoch (loss 2.4046): 1%| | 17/1563 [00:37<41:10, 1.60s/it]
Training 1/1 epoch (loss 2.2033): 1%| | 17/1563 [00:39<41:10, 1.60s/it]
Training 1/1 epoch (loss 2.2033): 1%| | 18/1563 [00:39<42:06, 1.64s/it]
Training 1/1 epoch (loss 2.1424): 1%| | 18/1563 [00:40<42:06, 1.64s/it]
Training 1/1 epoch (loss 2.1424): 1%| | 19/1563 [00:40<42:01, 1.63s/it]
Training 1/1 epoch (loss 2.1399): 1%| | 19/1563 [00:42<42:01, 1.63s/it]
Training 1/1 epoch (loss 2.1399): 1%|▏ | 20/1563 [00:42<39:34, 1.54s/it]
Training 1/1 epoch (loss 2.2329): 1%|▏ | 20/1563 [00:44<39:34, 1.54s/it]
Training 1/1 epoch (loss 2.2329): 1%|▏ | 21/1563 [00:44<43:08, 1.68s/it]
Training 1/1 epoch (loss 2.2564): 1%|▏ | 21/1563 [00:45<43:08, 1.68s/it]
Training 1/1 epoch (loss 2.2564): 1%|▏ | 22/1563 [00:45<41:36, 1.62s/it]
Training 1/1 epoch (loss 2.2991): 1%|▏ | 22/1563 [00:46<41:36, 1.62s/it]
Training 1/1 epoch (loss 2.2991): 1%|▏ | 23/1563 [00:46<40:00, 1.56s/it]
Training 1/1 epoch (loss 2.2548): 1%|▏ | 23/1563 [00:49<40:00, 1.56s/it]
Training 1/1 epoch (loss 2.2548): 2%|▏ | 24/1563 [00:49<48:46, 1.90s/it]
Training 1/1 epoch (loss 2.2705): 2%|▏ | 24/1563 [00:50<48:46, 1.90s/it]
Training 1/1 epoch (loss 2.2705): 2%|▏ | 25/1563 [00:50<43:53, 1.71s/it]
Training 1/1 epoch (loss 2.2834): 2%|▏ | 25/1563 [00:52<43:53, 1.71s/it]
Training 1/1 epoch (loss 2.2834): 2%|▏ | 26/1563 [00:52<44:09, 1.72s/it]
Training 1/1 epoch (loss 2.2687): 2%|▏ | 26/1563 [00:54<44:09, 1.72s/it]
Training 1/1 epoch (loss 2.2687): 2%|▏ | 27/1563 [00:54<44:02, 1.72s/it]
Training 1/1 epoch (loss 2.3331): 2%|▏ | 27/1563 [00:56<44:02, 1.72s/it]
Training 1/1 epoch (loss 2.3331): 2%|▏ | 28/1563 [00:56<45:55, 1.80s/it]
Training 1/1 epoch (loss 2.1937): 2%|▏ | 28/1563 [00:57<45:55, 1.80s/it]
Training 1/1 epoch (loss 2.1937): 2%|▏ | 29/1563 [00:57<43:55, 1.72s/it]
Training 1/1 epoch (loss 2.2096): 2%|▏ | 29/1563 [00:59<43:55, 1.72s/it]
Training 1/1 epoch (loss 2.2096): 2%|▏ | 30/1563 [00:59<40:30, 1.59s/it]
Training 1/1 epoch (loss 2.2062): 2%|▏ | 30/1563 [01:00<40:30, 1.59s/it]
Training 1/1 epoch (loss 2.2062): 2%|▏ | 31/1563 [01:00<39:22, 1.54s/it]
Training 1/1 epoch (loss 2.1659): 2%|▏ | 31/1563 [01:02<39:22, 1.54s/it]
Training 1/1 epoch (loss 2.1659): 2%|▏ | 32/1563 [01:02<41:38, 1.63s/it]
Training 1/1 epoch (loss 2.1154): 2%|▏ | 32/1563 [01:03<41:38, 1.63s/it]
Training 1/1 epoch (loss 2.1154): 2%|▏ | 33/1563 [01:03<37:48, 1.48s/it]
Training 1/1 epoch (loss 2.1718): 2%|▏ | 33/1563 [01:05<37:48, 1.48s/it]
Training 1/1 epoch (loss 2.1718): 2%|▏ | 34/1563 [01:05<37:56, 1.49s/it]
Training 1/1 epoch (loss 2.2446): 2%|▏ | 34/1563 [01:06<37:56, 1.49s/it]
Training 1/1 epoch (loss 2.2446): 2%|▏ | 35/1563 [01:06<38:23, 1.51s/it]
Training 1/1 epoch (loss 2.1093): 2%|▏ | 35/1563 [01:08<38:23, 1.51s/it]
Training 1/1 epoch (loss 2.1093): 2%|▏ | 36/1563 [01:08<43:00, 1.69s/it]
Training 1/1 epoch (loss 2.1876): 2%|▏ | 36/1563 [01:10<43:00, 1.69s/it]
Training 1/1 epoch (loss 2.1876): 2%|▏ | 37/1563 [01:10<44:10, 1.74s/it]
Training 1/1 epoch (loss 2.1795): 2%|▏ | 37/1563 [01:12<44:10, 1.74s/it]
Training 1/1 epoch (loss 2.1795): 2%|▏ | 38/1563 [01:12<49:20, 1.94s/it]
Training 1/1 epoch (loss 2.3066): 2%|▏ | 38/1563 [01:14<49:20, 1.94s/it]
Training 1/1 epoch (loss 2.3066): 2%|▏ | 39/1563 [01:14<46:08, 1.82s/it]
Training 1/1 epoch (loss 2.1269): 2%|▏ | 39/1563 [01:17<46:08, 1.82s/it]
Training 1/1 epoch (loss 2.1269): 3%|▎ | 40/1563 [01:17<52:26, 2.07s/it]
Training 1/1 epoch (loss 2.1998): 3%|▎ | 40/1563 [01:18<52:26, 2.07s/it]
Training 1/1 epoch (loss 2.1998): 3%|▎ | 41/1563 [01:18<46:13, 1.82s/it]
Training 1/1 epoch (loss 2.1681):
3%|β–Ž | 41/1563 [01:20<46:13, 1.82s/it] Training 1/1 epoch (loss 2.1681): 3%|β–Ž | 42/1563 [01:20<46:11, 1.82s/it] Training 1/1 epoch (loss 2.0966): 3%|β–Ž | 42/1563 [01:22<46:11, 1.82s/it] Training 1/1 epoch (loss 2.0966): 3%|β–Ž | 43/1563 [01:22<46:14, 1.83s/it] Training 1/1 epoch (loss 2.1984): 3%|β–Ž | 43/1563 [01:23<46:14, 1.83s/it] Training 1/1 epoch (loss 2.1984): 3%|β–Ž | 44/1563 [01:23<44:56, 1.78s/it] Training 1/1 epoch (loss 2.1049): 3%|β–Ž | 44/1563 [01:25<44:56, 1.78s/it] Training 1/1 epoch (loss 2.1049): 3%|β–Ž | 45/1563 [01:25<44:27, 1.76s/it] Training 1/1 epoch (loss 1.9950): 3%|β–Ž | 45/1563 [01:27<44:27, 1.76s/it] Training 1/1 epoch (loss 1.9950): 3%|β–Ž | 46/1563 [01:27<47:59, 1.90s/it] Training 1/1 epoch (loss 2.1244): 3%|β–Ž | 46/1563 [01:28<47:59, 1.90s/it] Training 1/1 epoch (loss 2.1244): 3%|β–Ž | 47/1563 [01:28<43:23, 1.72s/it] Training 1/1 epoch (loss 1.9906): 3%|β–Ž | 47/1563 [01:30<43:23, 1.72s/it] Training 1/1 epoch (loss 1.9906): 3%|β–Ž | 48/1563 [01:30<40:51, 1.62s/it] Training 1/1 epoch (loss 2.1711): 3%|β–Ž | 48/1563 [01:32<40:51, 1.62s/it] Training 1/1 epoch (loss 2.1711): 3%|β–Ž | 49/1563 [01:32<41:47, 1.66s/it] Training 1/1 epoch (loss 2.1772): 3%|β–Ž | 49/1563 [01:33<41:47, 1.66s/it] Training 1/1 epoch (loss 2.1772): 3%|β–Ž | 50/1563 [01:33<41:25, 1.64s/it] Training 1/1 epoch (loss 1.9678): 3%|β–Ž | 50/1563 [01:35<41:25, 1.64s/it] Training 1/1 epoch (loss 1.9678): 3%|β–Ž | 51/1563 [01:35<40:59, 1.63s/it] Training 1/1 epoch (loss 2.1479): 3%|β–Ž | 51/1563 [01:36<40:59, 1.63s/it] Training 1/1 epoch (loss 2.1479): 3%|β–Ž | 52/1563 [01:36<38:21, 1.52s/it] Training 1/1 epoch (loss 2.1616): 3%|β–Ž | 52/1563 [01:38<38:21, 1.52s/it] Training 1/1 epoch (loss 2.1616): 3%|β–Ž | 53/1563 [01:38<42:46, 1.70s/it] Training 1/1 epoch (loss 2.0912): 3%|β–Ž | 53/1563 [01:41<42:46, 1.70s/it] Training 1/1 epoch (loss 2.0912): 3%|β–Ž | 54/1563 [01:41<48:17, 1.92s/it] Training 1/1 epoch (loss 2.1495): 3%|β–Ž | 54/1563 [01:43<48:17, 1.92s/it] Training 
1/1 epoch (loss 2.1495): 4%|β–Ž | 55/1563 [01:43<48:27, 1.93s/it] Training 1/1 epoch (loss 2.0939): 4%|β–Ž | 55/1563 [01:45<48:27, 1.93s/it] Training 1/1 epoch (loss 2.0939): 4%|β–Ž | 56/1563 [01:45<50:17, 2.00s/it] Training 1/1 epoch (loss 2.0501): 4%|β–Ž | 56/1563 [01:46<50:17, 2.00s/it] Training 1/1 epoch (loss 2.0501): 4%|β–Ž | 57/1563 [01:46<45:48, 1.82s/it] Training 1/1 epoch (loss 2.0870): 4%|β–Ž | 57/1563 [01:49<45:48, 1.82s/it] Training 1/1 epoch (loss 2.0870): 4%|β–Ž | 58/1563 [01:49<50:07, 2.00s/it] Training 1/1 epoch (loss 2.0160): 4%|β–Ž | 58/1563 [01:50<50:07, 2.00s/it] Training 1/1 epoch (loss 2.0160): 4%|▍ | 59/1563 [01:50<44:46, 1.79s/it] Training 1/1 epoch (loss 1.9800): 4%|▍ | 59/1563 [01:52<44:46, 1.79s/it] Training 1/1 epoch (loss 1.9800): 4%|▍ | 60/1563 [01:52<45:59, 1.84s/it] Training 1/1 epoch (loss 2.0397): 4%|▍ | 60/1563 [01:53<45:59, 1.84s/it] Training 1/1 epoch (loss 2.0397): 4%|▍ | 61/1563 [01:53<43:57, 1.76s/it] Training 1/1 epoch (loss 1.9929): 4%|▍ | 61/1563 [01:56<43:57, 1.76s/it] Training 1/1 epoch (loss 1.9929): 4%|▍ | 62/1563 [01:56<48:37, 1.94s/it] Training 1/1 epoch (loss 2.0936): 4%|▍ | 62/1563 [01:58<48:37, 1.94s/it] Training 1/1 epoch (loss 2.0936): 4%|▍ | 63/1563 [01:58<49:07, 1.96s/it] Training 1/1 epoch (loss 2.0195): 4%|▍ | 63/1563 [01:59<49:07, 1.96s/it] Training 1/1 epoch (loss 2.0195): 4%|▍ | 64/1563 [01:59<47:20, 1.90s/it] Training 1/1 epoch (loss 2.1012): 4%|▍ | 64/1563 [02:02<47:20, 1.90s/it] Training 1/1 epoch (loss 2.1012): 4%|▍ | 65/1563 [02:02<49:14, 1.97s/it] Training 1/1 epoch (loss 1.9661): 4%|▍ | 65/1563 [02:04<49:14, 1.97s/it] Training 1/1 epoch (loss 1.9661): 4%|▍ | 66/1563 [02:04<51:33, 2.07s/it] Training 1/1 epoch (loss 1.9273): 4%|▍ | 66/1563 [02:05<51:33, 2.07s/it] Training 1/1 epoch (loss 1.9273): 4%|▍ | 67/1563 [02:05<45:28, 1.82s/it] Training 1/1 epoch (loss 2.0294): 4%|▍ | 67/1563 [02:06<45:28, 1.82s/it] Training 1/1 epoch (loss 2.0294): 4%|▍ | 68/1563 [02:06<40:38, 1.63s/it] Training 1/1 epoch 
(loss 2.0362): 4%|▍ | 68/1563 [02:09<40:38, 1.63s/it] Training 1/1 epoch (loss 2.0362): 4%|▍ | 69/1563 [02:09<46:34, 1.87s/it] Training 1/1 epoch (loss 2.0102): 4%|▍ | 69/1563 [02:10<46:34, 1.87s/it] Training 1/1 epoch (loss 2.0102): 4%|▍ | 70/1563 [02:10<41:14, 1.66s/it] Training 1/1 epoch (loss 2.0156): 4%|▍ | 70/1563 [02:12<41:14, 1.66s/it] Training 1/1 epoch (loss 2.0156): 5%|▍ | 71/1563 [02:12<44:37, 1.79s/it] Training 1/1 epoch (loss 2.1738): 5%|▍ | 71/1563 [02:14<44:37, 1.79s/it] Training 1/1 epoch (loss 2.1738): 5%|▍ | 72/1563 [02:14<46:44, 1.88s/it] Training 1/1 epoch (loss 2.0173): 5%|▍ | 72/1563 [02:16<46:44, 1.88s/it] Training 1/1 epoch (loss 2.0173): 5%|▍ | 73/1563 [02:16<47:00, 1.89s/it] Training 1/1 epoch (loss 2.1230): 5%|▍ | 73/1563 [02:17<47:00, 1.89s/it] Training 1/1 epoch (loss 2.1230): 5%|▍ | 74/1563 [02:17<42:39, 1.72s/it] Training 1/1 epoch (loss 1.9717): 5%|▍ | 74/1563 [02:19<42:39, 1.72s/it] Training 1/1 epoch (loss 1.9717): 5%|▍ | 75/1563 [02:19<41:13, 1.66s/it] Training 1/1 epoch (loss 1.9507): 5%|▍ | 75/1563 [02:20<41:13, 1.66s/it] Training 1/1 epoch (loss 1.9507): 5%|▍ | 76/1563 [02:20<39:35, 1.60s/it] Training 1/1 epoch (loss 1.8584): 5%|▍ | 76/1563 [02:23<39:35, 1.60s/it] Training 1/1 epoch (loss 1.8584): 5%|▍ | 77/1563 [02:23<43:34, 1.76s/it] Training 1/1 epoch (loss 1.9591): 5%|▍ | 77/1563 [02:24<43:34, 1.76s/it] Training 1/1 epoch (loss 1.9591): 5%|▍ | 78/1563 [02:24<39:59, 1.62s/it] Training 1/1 epoch (loss 2.1517): 5%|▍ | 78/1563 [02:26<39:59, 1.62s/it] Training 1/1 epoch (loss 2.1517): 5%|β–Œ | 79/1563 [02:26<41:08, 1.66s/it] Training 1/1 epoch (loss 2.0471): 5%|β–Œ | 79/1563 [02:28<41:08, 1.66s/it] Training 1/1 epoch (loss 2.0471): 5%|β–Œ | 80/1563 [02:28<43:14, 1.75s/it] Training 1/1 epoch (loss 2.0087): 5%|β–Œ | 80/1563 [02:30<43:14, 1.75s/it] Training 1/1 epoch (loss 2.0087): 5%|β–Œ | 81/1563 [02:30<45:12, 1.83s/it] Training 1/1 epoch (loss 2.0137): 5%|β–Œ | 81/1563 [02:32<45:12, 1.83s/it] Training 1/1 epoch (loss 2.0137): 
5%|β–Œ | 82/1563 [02:32<47:50, 1.94s/it] Training 1/1 epoch (loss 1.9500): 5%|β–Œ | 82/1563 [02:34<47:50, 1.94s/it] Training 1/1 epoch (loss 1.9500): 5%|β–Œ | 83/1563 [02:34<48:00, 1.95s/it] Training 1/1 epoch (loss 1.9617): 5%|β–Œ | 83/1563 [02:35<48:00, 1.95s/it] Training 1/1 epoch (loss 1.9617): 5%|β–Œ | 84/1563 [02:35<42:45, 1.73s/it] Training 1/1 epoch (loss 1.8991): 5%|β–Œ | 84/1563 [02:37<42:45, 1.73s/it] Training 1/1 epoch (loss 1.8991): 5%|β–Œ | 85/1563 [02:37<44:35, 1.81s/it] Training 1/1 epoch (loss 1.9665): 5%|β–Œ | 85/1563 [02:38<44:35, 1.81s/it] Training 1/1 epoch (loss 1.9665): 6%|β–Œ | 86/1563 [02:38<42:23, 1.72s/it] Training 1/1 epoch (loss 1.9147): 6%|β–Œ | 86/1563 [02:40<42:23, 1.72s/it] Training 1/1 epoch (loss 1.9147): 6%|β–Œ | 87/1563 [02:40<44:05, 1.79s/it] Training 1/1 epoch (loss 1.8970): 6%|β–Œ | 87/1563 [02:43<44:05, 1.79s/it] Training 1/1 epoch (loss 1.8970): 6%|β–Œ | 88/1563 [02:43<47:16, 1.92s/it] Training 1/1 epoch (loss 1.9195): 6%|β–Œ | 88/1563 [02:45<47:16, 1.92s/it] Training 1/1 epoch (loss 1.9195): 6%|β–Œ | 89/1563 [02:45<49:09, 2.00s/it] Training 1/1 epoch (loss 1.8589): 6%|β–Œ | 89/1563 [02:47<49:09, 2.00s/it] Training 1/1 epoch (loss 1.8589): 6%|β–Œ | 90/1563 [02:47<52:03, 2.12s/it] Training 1/1 epoch (loss 1.9217): 6%|β–Œ | 90/1563 [02:50<52:03, 2.12s/it] Training 1/1 epoch (loss 1.9217): 6%|β–Œ | 91/1563 [02:50<54:12, 2.21s/it] Training 1/1 epoch (loss 1.8956): 6%|β–Œ | 91/1563 [02:51<54:12, 2.21s/it] Training 1/1 epoch (loss 1.8956): 6%|β–Œ | 92/1563 [02:51<49:53, 2.04s/it] Training 1/1 epoch (loss 2.0171): 6%|β–Œ | 92/1563 [02:53<49:53, 2.04s/it] Training 1/1 epoch (loss 2.0171): 6%|β–Œ | 93/1563 [02:53<51:23, 2.10s/it] Training 1/1 epoch (loss 1.9143): 6%|β–Œ | 93/1563 [02:55<51:23, 2.10s/it] Training 1/1 epoch (loss 1.9143): 6%|β–Œ | 94/1563 [02:55<48:40, 1.99s/it] Training 1/1 epoch (loss 1.7554): 6%|β–Œ | 94/1563 [02:56<48:40, 1.99s/it] Training 1/1 epoch (loss 1.7554): 6%|β–Œ | 95/1563 [02:56<43:25, 1.77s/it] Training 
1/1 epoch (loss 1.9656): 6%|β–Œ | 95/1563 [02:59<43:25, 1.77s/it] Training 1/1 epoch (loss 1.9656): 6%|β–Œ | 96/1563 [02:59<49:26, 2.02s/it] Training 1/1 epoch (loss 1.8814): 6%|β–Œ | 96/1563 [03:02<49:26, 2.02s/it] Training 1/1 epoch (loss 1.8814): 6%|β–Œ | 97/1563 [03:02<52:46, 2.16s/it] Training 1/1 epoch (loss 2.0150): 6%|β–Œ | 97/1563 [03:03<52:46, 2.16s/it] Training 1/1 epoch (loss 2.0150): 6%|β–‹ | 98/1563 [03:03<50:48, 2.08s/it] Training 1/1 epoch (loss 1.9525): 6%|β–‹ | 98/1563 [03:05<50:48, 2.08s/it] Training 1/1 epoch (loss 1.9525): 6%|β–‹ | 99/1563 [03:05<48:16, 1.98s/it] Training 1/1 epoch (loss 1.9026): 6%|β–‹ | 99/1563 [03:07<48:16, 1.98s/it] Training 1/1 epoch (loss 1.9026): 6%|β–‹ | 100/1563 [03:07<43:44, 1.79s/it] Training 1/1 epoch (loss 1.8583): 6%|β–‹ | 100/1563 [03:09<43:44, 1.79s/it] Training 1/1 epoch (loss 1.8583): 6%|β–‹ | 101/1563 [03:09<48:26, 1.99s/it] Training 1/1 epoch (loss 2.0079): 6%|β–‹ | 101/1563 [03:11<48:26, 1.99s/it] Training 1/1 epoch (loss 2.0079): 7%|β–‹ | 102/1563 [03:11<50:57, 2.09s/it] Training 1/1 epoch (loss 1.8004): 7%|β–‹ | 102/1563 [03:13<50:57, 2.09s/it] Training 1/1 epoch (loss 1.8004): 7%|β–‹ | 103/1563 [03:13<45:17, 1.86s/it] Training 1/1 epoch (loss 2.0473): 7%|β–‹ | 103/1563 [03:14<45:17, 1.86s/it] Training 1/1 epoch (loss 2.0473): 7%|β–‹ | 104/1563 [03:14<42:48, 1.76s/it] Training 1/1 epoch (loss 1.9769): 7%|β–‹ | 104/1563 [03:17<42:48, 1.76s/it] Training 1/1 epoch (loss 1.9769): 7%|β–‹ | 105/1563 [03:17<47:42, 1.96s/it] Training 1/1 epoch (loss 1.9292): 7%|β–‹ | 105/1563 [03:18<47:42, 1.96s/it] Training 1/1 epoch (loss 1.9292): 7%|β–‹ | 106/1563 [03:18<44:45, 1.84s/it] Training 1/1 epoch (loss 1.9129): 7%|β–‹ | 106/1563 [03:20<44:45, 1.84s/it] Training 1/1 epoch (loss 1.9129): 7%|β–‹ | 107/1563 [03:20<45:37, 1.88s/it] Training 1/1 epoch (loss 1.8421): 7%|β–‹ | 107/1563 [03:22<45:37, 1.88s/it] Training 1/1 epoch (loss 1.8421): 7%|β–‹ | 108/1563 [03:22<45:57, 1.89s/it] Training 1/1 epoch (loss 1.9574): 7%|β–‹ 
| 108/1563 [03:24<45:57, 1.89s/it] Training 1/1 epoch (loss 1.9574): 7%|β–‹ | 109/1563 [03:24<47:45, 1.97s/it] Training 1/1 epoch (loss 1.8804): 7%|β–‹ | 109/1563 [03:25<47:45, 1.97s/it] Training 1/1 epoch (loss 1.8804): 7%|β–‹ | 110/1563 [03:25<42:03, 1.74s/it] Training 1/1 epoch (loss 1.7685): 7%|β–‹ | 110/1563 [03:26<42:03, 1.74s/it] Training 1/1 epoch (loss 1.7685): 7%|β–‹ | 111/1563 [03:26<35:47, 1.48s/it] Training 1/1 epoch (loss 1.9475): 7%|β–‹ | 111/1563 [03:28<35:47, 1.48s/it] Training 1/1 epoch (loss 1.9475): 7%|β–‹ | 112/1563 [03:28<36:57, 1.53s/it] Training 1/1 epoch (loss 1.8840): 7%|β–‹ | 112/1563 [03:29<36:57, 1.53s/it] Training 1/1 epoch (loss 1.8840): 7%|β–‹ | 113/1563 [03:29<36:46, 1.52s/it] Training 1/1 epoch (loss 1.8166): 7%|β–‹ | 113/1563 [03:32<36:46, 1.52s/it] Training 1/1 epoch (loss 1.8166): 7%|β–‹ | 114/1563 [03:32<42:57, 1.78s/it] Training 1/1 epoch (loss 1.9350): 7%|β–‹ | 114/1563 [03:34<42:57, 1.78s/it] Training 1/1 epoch (loss 1.9350): 7%|β–‹ | 115/1563 [03:34<47:55, 1.99s/it] Training 1/1 epoch (loss 1.9494): 7%|β–‹ | 115/1563 [03:36<47:55, 1.99s/it] Training 1/1 epoch (loss 1.9494): 7%|β–‹ | 116/1563 [03:36<45:03, 1.87s/it] Training 1/1 epoch (loss 1.8735): 7%|β–‹ | 116/1563 [03:37<45:03, 1.87s/it] Training 1/1 epoch (loss 1.8735): 7%|β–‹ | 117/1563 [03:37<41:03, 1.70s/it] Training 1/1 epoch (loss 1.8653): 7%|β–‹ | 117/1563 [03:39<41:03, 1.70s/it] Training 1/1 epoch (loss 1.8653): 8%|β–Š | 118/1563 [03:39<44:41, 1.86s/it] Training 1/1 epoch (loss 1.8947): 8%|β–Š | 118/1563 [03:41<44:41, 1.86s/it] Training 1/1 epoch (loss 1.8947): 8%|β–Š | 119/1563 [03:41<42:12, 1.75s/it] Training 1/1 epoch (loss 2.0482): 8%|β–Š | 119/1563 [03:42<42:12, 1.75s/it] Training 1/1 epoch (loss 2.0482): 8%|β–Š | 120/1563 [03:42<39:17, 1.63s/it] Training 1/1 epoch (loss 1.7945): 8%|β–Š | 120/1563 [03:44<39:17, 1.63s/it] Training 1/1 epoch (loss 1.7945): 8%|β–Š | 121/1563 [03:44<37:23, 1.56s/it] Training 1/1 epoch (loss 1.7152): 8%|β–Š | 121/1563 
[03:46<37:23, 1.56s/it] Training 1/1 epoch (loss 1.7152): 8%|β–Š | 122/1563 [03:46<42:12, 1.76s/it] Training 1/1 epoch (loss 1.8369): 8%|β–Š | 122/1563 [03:48<42:12, 1.76s/it] Training 1/1 epoch (loss 1.8369): 8%|β–Š | 123/1563 [03:48<43:26, 1.81s/it] Training 1/1 epoch (loss 1.7447): 8%|β–Š | 123/1563 [03:50<43:26, 1.81s/it] Training 1/1 epoch (loss 1.7447): 8%|β–Š | 124/1563 [03:50<45:16, 1.89s/it] Training 1/1 epoch (loss 1.9115): 8%|β–Š | 124/1563 [03:52<45:16, 1.89s/it] Training 1/1 epoch (loss 1.9115): 8%|β–Š | 125/1563 [03:52<43:32, 1.82s/it] Training 1/1 epoch (loss 1.9317): 8%|β–Š | 125/1563 [03:54<43:32, 1.82s/it] Training 1/1 epoch (loss 1.9317): 8%|β–Š | 126/1563 [03:54<46:35, 1.95s/it] Training 1/1 epoch (loss 1.8143): 8%|β–Š | 126/1563 [03:56<46:35, 1.95s/it] Training 1/1 epoch (loss 1.8143): 8%|β–Š | 127/1563 [03:56<47:58, 2.00s/it] Training 1/1 epoch (loss 1.8070): 8%|β–Š | 127/1563 [03:58<47:58, 2.00s/it] Training 1/1 epoch (loss 1.8070): 8%|β–Š | 128/1563 [03:58<50:03, 2.09s/it] Training 1/1 epoch (loss 1.8499): 8%|β–Š | 128/1563 [04:01<50:03, 2.09s/it] Training 1/1 epoch (loss 1.8499): 8%|β–Š | 129/1563 [04:01<52:35, 2.20s/it] Training 1/1 epoch (loss 1.8461): 8%|β–Š | 129/1563 [04:02<52:35, 2.20s/it] Training 1/1 epoch (loss 1.8461): 8%|β–Š | 130/1563 [04:02<48:38, 2.04s/it] Training 1/1 epoch (loss 1.7575): 8%|β–Š | 130/1563 [04:04<48:38, 2.04s/it] Training 1/1 epoch (loss 1.7575): 8%|β–Š | 131/1563 [04:04<46:45, 1.96s/it] Training 1/1 epoch (loss 1.8470): 8%|β–Š | 131/1563 [04:06<46:45, 1.96s/it] Training 1/1 epoch (loss 1.8470): 8%|β–Š | 132/1563 [04:06<44:47, 1.88s/it] Training 1/1 epoch (loss 1.9133): 8%|β–Š | 132/1563 [04:08<44:47, 1.88s/it] Training 1/1 epoch (loss 1.9133): 9%|β–Š | 133/1563 [04:08<46:27, 1.95s/it] Training 1/1 epoch (loss 1.9355): 9%|β–Š | 133/1563 [04:10<46:27, 1.95s/it] Training 1/1 epoch (loss 1.9355): 9%|β–Š | 134/1563 [04:10<45:07, 1.89s/it] Training 1/1 epoch (loss 1.9590): 9%|β–Š | 134/1563 [04:12<45:07, 1.89s/it] 
Training 1/1 epoch (loss 1.9590): 9%|β–Š | 135/1563 [04:12<48:42, 2.05s/it] Training 1/1 epoch (loss 1.9844): 9%|β–Š | 135/1563 [04:14<48:42, 2.05s/it] Training 1/1 epoch (loss 1.9844): 9%|β–Š | 136/1563 [04:14<47:05, 1.98s/it] Training 1/1 epoch (loss 1.7893): 9%|β–Š | 136/1563 [04:15<47:05, 1.98s/it] Training 1/1 epoch (loss 1.7893): 9%|β–‰ | 137/1563 [04:15<44:05, 1.86s/it] Training 1/1 epoch (loss 1.6645): 9%|β–‰ | 137/1563 [04:17<44:05, 1.86s/it] Training 1/1 epoch (loss 1.6645): 9%|β–‰ | 138/1563 [04:17<42:40, 1.80s/it] Training 1/1 epoch (loss 1.7027): 9%|β–‰ | 138/1563 [04:19<42:40, 1.80s/it] Training 1/1 epoch (loss 1.7027): 9%|β–‰ | 139/1563 [04:19<40:44, 1.72s/it] Training 1/1 epoch (loss 1.7943): 9%|β–‰ | 139/1563 [04:20<40:44, 1.72s/it] Training 1/1 epoch (loss 1.7943): 9%|β–‰ | 140/1563 [04:20<39:05, 1.65s/it] Training 1/1 epoch (loss 1.7829): 9%|β–‰ | 140/1563 [04:23<39:05, 1.65s/it] Training 1/1 epoch (loss 1.7829): 9%|β–‰ | 141/1563 [04:23<44:57, 1.90s/it] Training 1/1 epoch (loss 1.8433): 9%|β–‰ | 141/1563 [04:24<44:57, 1.90s/it] Training 1/1 epoch (loss 1.8433): 9%|β–‰ | 142/1563 [04:24<43:17, 1.83s/it] Training 1/1 epoch (loss 1.8560): 9%|β–‰ | 142/1563 [04:26<43:17, 1.83s/it] Training 1/1 epoch (loss 1.8560): 9%|β–‰ | 143/1563 [04:26<40:07, 1.70s/it] Training 1/1 epoch (loss 1.7946): 9%|β–‰ | 143/1563 [04:28<40:07, 1.70s/it] Training 1/1 epoch (loss 1.7946): 9%|β–‰ | 144/1563 [04:28<42:10, 1.78s/it] Training 1/1 epoch (loss 1.7611): 9%|β–‰ | 144/1563 [04:30<42:10, 1.78s/it] Training 1/1 epoch (loss 1.7611): 9%|β–‰ | 145/1563 [04:30<46:34, 1.97s/it] Training 1/1 epoch (loss 1.7582): 9%|β–‰ | 145/1563 [04:32<46:34, 1.97s/it] Training 1/1 epoch (loss 1.7582): 9%|β–‰ | 146/1563 [04:32<48:53, 2.07s/it] Training 1/1 epoch (loss 1.7561): 9%|β–‰ | 146/1563 [04:34<48:53, 2.07s/it] Training 1/1 epoch (loss 1.7561): 9%|β–‰ | 147/1563 [04:34<45:34, 1.93s/it] Training 1/1 epoch (loss 1.8358): 9%|β–‰ | 147/1563 [04:36<45:34, 1.93s/it] Training 1/1 epoch 
(loss 1.8358): 9%|β–‰ | 148/1563 [04:36<44:25, 1.88s/it] Training 1/1 epoch (loss 1.7387): 9%|β–‰ | 148/1563 [04:38<44:25, 1.88s/it] Training 1/1 epoch (loss 1.7387): 10%|β–‰ | 149/1563 [04:38<45:28, 1.93s/it] Training 1/1 epoch (loss 1.7654): 10%|β–‰ | 149/1563 [04:39<45:28, 1.93s/it] Training 1/1 epoch (loss 1.7654): 10%|β–‰ | 150/1563 [04:39<41:15, 1.75s/it] Training 1/1 epoch (loss 1.7441): 10%|β–‰ | 150/1563 [04:41<41:15, 1.75s/it] Training 1/1 epoch (loss 1.7441): 10%|β–‰ | 151/1563 [04:41<40:23, 1.72s/it] Training 1/1 epoch (loss 1.8815): 10%|β–‰ | 151/1563 [04:42<40:23, 1.72s/it] Training 1/1 epoch (loss 1.8815): 10%|β–‰ | 152/1563 [04:42<40:14, 1.71s/it] Training 1/1 epoch (loss 1.7317): 10%|β–‰ | 152/1563 [04:43<40:14, 1.71s/it] Training 1/1 epoch (loss 1.7317): 10%|β–‰ | 153/1563 [04:43<34:14, 1.46s/it] Training 1/1 epoch (loss 1.7225): 10%|β–‰ | 153/1563 [04:45<34:14, 1.46s/it] Training 1/1 epoch (loss 1.7225): 10%|β–‰ | 154/1563 [04:45<32:05, 1.37s/it] Training 1/1 epoch (loss 1.7144): 10%|β–‰ | 154/1563 [04:47<32:05, 1.37s/it] Training 1/1 epoch (loss 1.7144): 10%|β–‰ | 155/1563 [04:47<38:38, 1.65s/it] Training 1/1 epoch (loss 1.8040): 10%|β–‰ | 155/1563 [04:49<38:38, 1.65s/it] Training 1/1 epoch (loss 1.8040): 10%|β–‰ | 156/1563 [04:49<43:59, 1.88s/it] Training 1/1 epoch (loss 1.6840): 10%|β–‰ | 156/1563 [04:51<43:59, 1.88s/it] Training 1/1 epoch (loss 1.6840): 10%|β–ˆ | 157/1563 [04:51<42:51, 1.83s/it] Training 1/1 epoch (loss 1.7668): 10%|β–ˆ | 157/1563 [04:53<42:51, 1.83s/it] Training 1/1 epoch (loss 1.7668): 10%|β–ˆ | 158/1563 [04:53<46:56, 2.00s/it] Training 1/1 epoch (loss 1.7547): 10%|β–ˆ | 158/1563 [04:55<46:56, 2.00s/it] Training 1/1 epoch (loss 1.7547): 10%|β–ˆ | 159/1563 [04:55<43:38, 1.87s/it] Training 1/1 epoch (loss 1.6929): 10%|β–ˆ | 159/1563 [04:57<43:38, 1.87s/it] Training 1/1 epoch (loss 1.6929): 10%|β–ˆ | 160/1563 [04:57<46:20, 1.98s/it] Training 1/1 epoch (loss 1.8476): 10%|β–ˆ | 160/1563 [04:59<46:20, 1.98s/it] Training 1/1 epoch 
(loss 1.8476): 10%|β–ˆ | 161/1563 [04:59<47:40, 2.04s/it] Training 1/1 epoch (loss 1.8469): 10%|β–ˆ | 161/1563 [05:01<47:40, 2.04s/it] Training 1/1 epoch (loss 1.8469): 10%|β–ˆ | 162/1563 [05:01<46:51, 2.01s/it] Training 1/1 epoch (loss 1.7283): 10%|β–ˆ | 162/1563 [05:03<46:51, 2.01s/it] Training 1/1 epoch (loss 1.7283): 10%|β–ˆ | 163/1563 [05:03<45:28, 1.95s/it] Training 1/1 epoch (loss 1.6780): 10%|β–ˆ | 163/1563 [05:06<45:28, 1.95s/it] Training 1/1 epoch (loss 1.6780): 10%|β–ˆ | 164/1563 [05:06<49:17, 2.11s/it] Training 1/1 epoch (loss 1.8622): 10%|β–ˆ | 164/1563 [05:08<49:17, 2.11s/it] Training 1/1 epoch (loss 1.8622): 11%|β–ˆ | 165/1563 [05:08<48:14, 2.07s/it] Training 1/1 epoch (loss 1.8283): 11%|β–ˆ | 165/1563 [05:10<48:14, 2.07s/it] Training 1/1 epoch (loss 1.8283): 11%|β–ˆ | 166/1563 [05:10<50:55, 2.19s/it] Training 1/1 epoch (loss 1.6787): 11%|β–ˆ | 166/1563 [05:12<50:55, 2.19s/it] Training 1/1 epoch (loss 1.6787): 11%|β–ˆ | 167/1563 [05:12<46:23, 1.99s/it] Training 1/1 epoch (loss 1.7641): 11%|β–ˆ | 167/1563 [05:14<46:23, 1.99s/it] Training 1/1 epoch (loss 1.7641): 11%|β–ˆ | 168/1563 [05:14<47:15, 2.03s/it] Training 1/1 epoch (loss 1.7884): 11%|β–ˆ | 168/1563 [05:15<47:15, 2.03s/it] Training 1/1 epoch (loss 1.7884): 11%|β–ˆ | 169/1563 [05:15<42:50, 1.84s/it] Training 1/1 epoch (loss 1.7782): 11%|β–ˆ | 169/1563 [05:17<42:50, 1.84s/it] Training 1/1 epoch (loss 1.7782): 11%|β–ˆ | 170/1563 [05:17<42:48, 1.84s/it] Training 1/1 epoch (loss 1.7272): 11%|β–ˆ | 170/1563 [05:18<42:48, 1.84s/it] Training 1/1 epoch (loss 1.7272): 11%|β–ˆ | 171/1563 [05:18<36:49, 1.59s/it] Training 1/1 epoch (loss 1.9029): 11%|β–ˆ | 171/1563 [05:20<36:49, 1.59s/it] Training 1/1 epoch (loss 1.9029): 11%|β–ˆ | 172/1563 [05:20<41:12, 1.78s/it] Training 1/1 epoch (loss 1.8577): 11%|β–ˆ | 172/1563 [05:22<41:12, 1.78s/it] Training 1/1 epoch (loss 1.8577): 11%|β–ˆ | 173/1563 [05:22<39:21, 1.70s/it] Training 1/1 epoch (loss 1.7636): 11%|β–ˆ | 173/1563 [05:24<39:21, 1.70s/it] Training 1/1 
epoch (loss 1.7636): 11%|β–ˆ | 174/1563 [05:24<42:30, 1.84s/it] Training 1/1 epoch (loss 1.7478): 11%|β–ˆ | 174/1563 [05:25<42:30, 1.84s/it] Training 1/1 epoch (loss 1.7478): 11%|β–ˆ | 175/1563 [05:25<39:29, 1.71s/it] Training 1/1 epoch (loss 1.6861): 11%|β–ˆ | 175/1563 [05:27<39:29, 1.71s/it] Training 1/1 epoch (loss 1.6861): 11%|β–ˆβ– | 176/1563 [05:27<39:28, 1.71s/it] Training 1/1 epoch (loss 1.7404): 11%|β–ˆβ– | 176/1563 [05:29<39:28, 1.71s/it] Training 1/1 epoch (loss 1.7404): 11%|β–ˆβ– | 177/1563 [05:29<40:09, 1.74s/it] Training 1/1 epoch (loss 1.7273): 11%|β–ˆβ– | 177/1563 [05:30<40:09, 1.74s/it] Training 1/1 epoch (loss 1.7273): 11%|β–ˆβ– | 178/1563 [05:30<38:10, 1.65s/it] Training 1/1 epoch (loss 1.8297): 11%|β–ˆβ– | 178/1563 [05:32<38:10, 1.65s/it] Training 1/1 epoch (loss 1.8297): 11%|β–ˆβ– | 179/1563 [05:32<38:54, 1.69s/it] Training 1/1 epoch (loss 1.7636): 11%|β–ˆβ– | 179/1563 [05:34<38:54, 1.69s/it] Training 1/1 epoch (loss 1.7636): 12%|β–ˆβ– | 180/1563 [05:34<38:57, 1.69s/it] Training 1/1 epoch (loss 1.8000): 12%|β–ˆβ– | 180/1563 [05:35<38:57, 1.69s/it] Training 1/1 epoch (loss 1.8000): 12%|β–ˆβ– | 181/1563 [05:35<34:08, 1.48s/it] Training 1/1 epoch (loss 1.7658): 12%|β–ˆβ– | 181/1563 [05:36<34:08, 1.48s/it] Training 1/1 epoch (loss 1.7658): 12%|β–ˆβ– | 182/1563 [05:36<36:13, 1.57s/it] Training 1/1 epoch (loss 1.7116): 12%|β–ˆβ– | 182/1563 [05:38<36:13, 1.57s/it] Training 1/1 epoch (loss 1.7116): 12%|β–ˆβ– | 183/1563 [05:38<37:47, 1.64s/it] Training 1/1 epoch (loss 1.6300): 12%|β–ˆβ– | 183/1563 [05:40<37:47, 1.64s/it] Training 1/1 epoch (loss 1.6300): 12%|β–ˆβ– | 184/1563 [05:40<37:22, 1.63s/it] Training 1/1 epoch (loss 1.7182): 12%|β–ˆβ– | 184/1563 [05:41<37:22, 1.63s/it] Training 1/1 epoch (loss 1.7182): 12%|β–ˆβ– | 185/1563 [05:41<34:39, 1.51s/it] Training 1/1 epoch (loss 1.8698): 12%|β–ˆβ– | 185/1563 [05:43<34:39, 1.51s/it] Training 1/1 epoch (loss 1.8698): 12%|β–ˆβ– | 186/1563 [05:43<35:35, 1.55s/it] Training 1/1 epoch (loss 
1.7162): 12%|β–ˆβ– | 186/1563 [05:44<35:35, 1.55s/it] Training 1/1 epoch (loss 1.7162): 12%|β–ˆβ– | 187/1563 [05:44<34:18, 1.50s/it] Training 1/1 epoch (loss 1.6832): 12%|β–ˆβ– | 187/1563 [05:45<34:18, 1.50s/it] Training 1/1 epoch (loss 1.6832): 12%|β–ˆβ– | 188/1563 [05:45<33:26, 1.46s/it] Training 1/1 epoch (loss 1.5885): 12%|β–ˆβ– | 188/1563 [05:47<33:26, 1.46s/it] Training 1/1 epoch (loss 1.5885): 12%|β–ˆβ– | 189/1563 [05:47<31:54, 1.39s/it] Training 1/1 epoch (loss 1.8497): 12%|β–ˆβ– | 189/1563 [05:48<31:54, 1.39s/it] Training 1/1 epoch (loss 1.8497): 12%|β–ˆβ– | 190/1563 [05:48<31:01, 1.36s/it] Training 1/1 epoch (loss 1.8298): 12%|β–ˆβ– | 190/1563 [05:49<31:01, 1.36s/it] Training 1/1 epoch (loss 1.8298): 12%|β–ˆβ– | 191/1563 [05:49<29:54, 1.31s/it] Training 1/1 epoch (loss 1.7962): 12%|β–ˆβ– | 191/1563 [05:51<29:54, 1.31s/it] Training 1/1 epoch (loss 1.7962): 12%|β–ˆβ– | 192/1563 [05:51<32:12, 1.41s/it] Training 1/1 epoch (loss 1.6356): 12%|β–ˆβ– | 192/1563 [05:52<32:12, 1.41s/it] Training 1/1 epoch (loss 1.6356): 12%|β–ˆβ– | 193/1563 [05:52<33:11, 1.45s/it] Training 1/1 epoch (loss 1.6862): 12%|β–ˆβ– | 193/1563 [05:54<33:11, 1.45s/it] Training 1/1 epoch (loss 1.6862): 12%|β–ˆβ– | 194/1563 [05:54<35:18, 1.55s/it] Training 1/1 epoch (loss 1.7115): 12%|β–ˆβ– | 194/1563 [05:55<35:18, 1.55s/it] Training 1/1 epoch (loss 1.7115): 12%|β–ˆβ– | 195/1563 [05:55<34:04, 1.49s/it] Training 1/1 epoch (loss 1.8669): 12%|β–ˆβ– | 195/1563 [05:57<34:04, 1.49s/it] Training 1/1 epoch (loss 1.8669): 13%|β–ˆβ–Ž | 196/1563 [05:57<31:01, 1.36s/it] Training 1/1 epoch (loss 1.8014): 13%|β–ˆβ–Ž | 196/1563 [05:59<31:01, 1.36s/it] Training 1/1 epoch (loss 1.8014): 13%|β–ˆβ–Ž | 197/1563 [05:59<35:50, 1.57s/it] Training 1/1 epoch (loss 1.7082): 13%|β–ˆβ–Ž | 197/1563 [06:00<35:50, 1.57s/it] Training 1/1 epoch (loss 1.7082): 13%|β–ˆβ–Ž | 198/1563 [06:00<34:07, 1.50s/it] Training 1/1 epoch (loss 1.6977): 13%|β–ˆβ–Ž | 198/1563 [06:02<34:07, 1.50s/it] Training 1/1 epoch (loss 
1.6977): 13%|β–ˆβ–Ž | 199/1563 [06:02<37:12, 1.64s/it] Training 1/1 epoch (loss 1.7097): 13%|β–ˆβ–Ž | 199/1563 [06:03<37:12, 1.64s/it] Training 1/1 epoch (loss 1.7097): 13%|β–ˆβ–Ž | 200/1563 [06:03<35:41, 1.57s/it] Training 1/1 epoch (loss 1.7751): 13%|β–ˆβ–Ž | 200/1563 [06:06<35:41, 1.57s/it] Training 1/1 epoch (loss 1.7751): 13%|β–ˆβ–Ž | 201/1563 [06:06<41:27, 1.83s/it] Training 1/1 epoch (loss 1.6647): 13%|β–ˆβ–Ž | 201/1563 [06:07<41:27, 1.83s/it] Training 1/1 epoch (loss 1.6647): 13%|β–ˆβ–Ž | 202/1563 [06:07<38:50, 1.71s/it] Training 1/1 epoch (loss 1.8326): 13%|β–ˆβ–Ž | 202/1563 [06:09<38:50, 1.71s/it] Training 1/1 epoch (loss 1.8326): 13%|β–ˆβ–Ž | 203/1563 [06:09<39:04, 1.72s/it] Training 1/1 epoch (loss 1.6931): 13%|β–ˆβ–Ž | 203/1563 [06:10<39:04, 1.72s/it] Training 1/1 epoch (loss 1.6931): 13%|β–ˆβ–Ž | 204/1563 [06:10<36:08, 1.60s/it] Training 1/1 epoch (loss 1.6116): 13%|β–ˆβ–Ž | 204/1563 [06:12<36:08, 1.60s/it] Training 1/1 epoch (loss 1.6116): 13%|β–ˆβ–Ž | 205/1563 [06:12<34:24, 1.52s/it] Training 1/1 epoch (loss 1.7964): 13%|β–ˆβ–Ž | 205/1563 [06:13<34:24, 1.52s/it] Training 1/1 epoch (loss 1.7964): 13%|β–ˆβ–Ž | 206/1563 [06:13<32:02, 1.42s/it] Training 1/1 epoch (loss 1.7892): 13%|β–ˆβ–Ž | 206/1563 [06:14<32:02, 1.42s/it] Training 1/1 epoch (loss 1.7892): 13%|β–ˆβ–Ž | 207/1563 [06:14<33:48, 1.50s/it] Training 1/1 epoch (loss 1.7373): 13%|β–ˆβ–Ž | 207/1563 [06:17<33:48, 1.50s/it] Training 1/1 epoch (loss 1.7373): 13%|β–ˆβ–Ž | 208/1563 [06:17<40:01, 1.77s/it] Training 1/1 epoch (loss 1.7641): 13%|β–ˆβ–Ž | 208/1563 [06:19<40:01, 1.77s/it] Training 1/1 epoch (loss 1.7641): 13%|β–ˆβ–Ž | 209/1563 [06:19<42:55, 1.90s/it] Training 1/1 epoch (loss 1.7680): 13%|β–ˆβ–Ž | 209/1563 [06:20<42:55, 1.90s/it] Training 1/1 epoch (loss 1.7680): 13%|β–ˆβ–Ž | 210/1563 [06:20<36:13, 1.61s/it] Training 1/1 epoch (loss 1.6811): 13%|β–ˆβ–Ž | 210/1563 [06:22<36:13, 1.61s/it] Training 1/1 epoch (loss 1.6811): 13%|β–ˆβ–Ž | 211/1563 [06:22<41:39, 1.85s/it] Training 1/1 epoch (loss 
1.7600): 13%|β–ˆβ–Ž | 211/1563 [06:24<41:39, 1.85s/it] Training 1/1 epoch (loss 1.7600): 14%|β–ˆβ–Ž | 212/1563 [06:24<41:01, 1.82s/it] Training 1/1 epoch (loss 1.6264): 14%|β–ˆβ–Ž | 212/1563 [06:26<41:01, 1.82s/it] Training 1/1 epoch (loss 1.6264): 14%|β–ˆβ–Ž | 213/1563 [06:26<42:43, 1.90s/it] Training 1/1 epoch (loss 1.9319): 14%|β–ˆβ–Ž | 213/1563 [06:28<42:43, 1.90s/it] Training 1/1 epoch (loss 1.9319): 14%|β–ˆβ–Ž | 214/1563 [06:28<44:33, 1.98s/it] Training 1/1 epoch (loss 1.7971): 14%|β–ˆβ–Ž | 214/1563 [06:30<44:33, 1.98s/it] Training 1/1 epoch (loss 1.7971): 14%|β–ˆβ– | 215/1563 [06:30<44:48, 1.99s/it] Training 1/1 epoch (loss 1.7330): 14%|β–ˆβ– | 215/1563 [06:32<44:48, 1.99s/it] Training 1/1 epoch (loss 1.7330): 14%|β–ˆβ– | 216/1563 [06:32<43:13, 1.93s/it] Training 1/1 epoch (loss 1.7113): 14%|β–ˆβ– | 216/1563 [06:35<43:13, 1.93s/it] Training 1/1 epoch (loss 1.7113): 14%|β–ˆβ– | 217/1563 [06:35<46:49, 2.09s/it] Training 1/1 epoch (loss 1.7789): 14%|β–ˆβ– | 217/1563 [06:36<46:49, 2.09s/it] Training 1/1 epoch (loss 1.7789): 14%|β–ˆβ– | 218/1563 [06:36<44:31, 1.99s/it] Training 1/1 epoch (loss 1.7758): 14%|β–ˆβ– | 218/1563 [06:38<44:31, 1.99s/it] Training 1/1 epoch (loss 1.7758): 14%|β–ˆβ– | 219/1563 [06:38<44:23, 1.98s/it] Training 1/1 epoch (loss 1.6428): 14%|β–ˆβ– | 219/1563 [06:41<44:23, 1.98s/it] Training 1/1 epoch (loss 1.6428): 14%|β–ˆβ– | 220/1563 [06:41<47:31, 2.12s/it] Training 1/1 epoch (loss 1.7332): 14%|β–ˆβ– | 220/1563 [06:43<47:31, 2.12s/it] Training 1/1 epoch (loss 1.7332): 14%|β–ˆβ– | 221/1563 [06:43<46:48, 2.09s/it] Training 1/1 epoch (loss 1.6468): 14%|β–ˆβ– | 221/1563 [06:44<46:48, 2.09s/it] Training 1/1 epoch (loss 1.6468): 14%|β–ˆβ– | 222/1563 [06:44<41:18, 1.85s/it] Training 1/1 epoch (loss 1.6322): 14%|β–ˆβ– | 222/1563 [06:45<41:18, 1.85s/it] Training 1/1 epoch (loss 1.6322): 14%|β–ˆβ– | 223/1563 [06:45<37:00, 1.66s/it] Training 1/1 epoch (loss 1.6769): 14%|β–ˆβ– | 223/1563 [06:48<37:00, 1.66s/it] Training 1/1 epoch (loss 
1.6769): 14%|β–ˆβ– | 224/1563 [06:48<43:44, 1.96s/it] Training 1/1 epoch (loss 1.8278): 14%|β–ˆβ– | 224/1563 [06:49<43:44, 1.96s/it] Training 1/1 epoch (loss 1.8278): 14%|β–ˆβ– | 225/1563 [06:49<40:04, 1.80s/it] Training 1/1 epoch (loss 1.7969): 14%|β–ˆβ– | 225/1563 [06:51<40:04, 1.80s/it] Training 1/1 epoch (loss 1.7969): 14%|β–ˆβ– | 226/1563 [06:51<36:17, 1.63s/it] Training 1/1 epoch (loss 1.7547): 14%|β–ˆβ– | 226/1563 [06:53<36:17, 1.63s/it] Training 1/1 epoch (loss 1.7547): 15%|β–ˆβ– | 227/1563 [06:53<40:35, 1.82s/it] Training 1/1 epoch (loss 1.7723): 15%|β–ˆβ– | 227/1563 [06:54<40:35, 1.82s/it] Training 1/1 epoch (loss 1.7723): 15%|β–ˆβ– | 228/1563 [06:54<38:12, 1.72s/it] Training 1/1 epoch (loss 1.7627): 15%|β–ˆβ– | 228/1563 [06:56<38:12, 1.72s/it] Training 1/1 epoch (loss 1.7627): 15%|β–ˆβ– | 229/1563 [06:56<38:37, 1.74s/it] Training 1/1 epoch (loss 1.7549): 15%|β–ˆβ– | 229/1563 [06:58<38:37, 1.74s/it] Training 1/1 epoch (loss 1.7549): 15%|β–ˆβ– | 230/1563 [06:58<38:37, 1.74s/it] Training 1/1 epoch (loss 1.8026): 15%|β–ˆβ– | 230/1563 [07:00<38:37, 1.74s/it] Training 1/1 epoch (loss 1.8026): 15%|β–ˆβ– | 231/1563 [07:00<40:43, 1.83s/it] Training 1/1 epoch (loss 1.8208): 15%|β–ˆβ– | 231/1563 [07:01<40:43, 1.83s/it] Training 1/1 epoch (loss 1.8208): 15%|β–ˆβ– | 232/1563 [07:01<38:31, 1.74s/it] Training 1/1 epoch (loss 1.6776): 15%|β–ˆβ– | 232/1563 [07:03<38:31, 1.74s/it] Training 1/1 epoch (loss 1.6776): 15%|β–ˆβ– | 233/1563 [07:03<39:02, 1.76s/it] Training 1/1 epoch (loss 1.7802): 15%|β–ˆβ– | 233/1563 [07:05<39:02, 1.76s/it] Training 1/1 epoch (loss 1.7802): 15%|β–ˆβ– | 234/1563 [07:05<38:09, 1.72s/it] Training 1/1 epoch (loss 1.6333): 15%|β–ˆβ– | 234/1563 [07:06<38:09, 1.72s/it] Training 1/1 epoch (loss 1.6333): 15%|β–ˆβ–Œ | 235/1563 [07:06<36:54, 1.67s/it] Training 1/1 epoch (loss 1.6779): 15%|β–ˆβ–Œ | 235/1563 [07:07<36:54, 1.67s/it] Training 1/1 epoch (loss 1.6779): 15%|β–ˆβ–Œ | 236/1563 [07:07<31:17, 1.41s/it] Training 1/1 epoch (loss 
1.7376): 15%|β–ˆβ–Œ | 236/1563 [07:09<31:17, 1.41s/it] Training 1/1 epoch (loss 1.7376): 15%|β–ˆβ–Œ | 237/1563 [07:09<34:41, 1.57s/it] Training 1/1 epoch (loss 1.8180): 15%|β–ˆβ–Œ | 237/1563 [07:11<34:41, 1.57s/it] Training 1/1 epoch (loss 1.8180): 15%|β–ˆβ–Œ | 238/1563 [07:11<35:19, 1.60s/it] Training 1/1 epoch (loss 1.7077): 15%|β–ˆβ–Œ | 238/1563 [07:13<35:19, 1.60s/it] Training 1/1 epoch (loss 1.7077): 15%|β–ˆβ–Œ | 239/1563 [07:13<35:39, 1.62s/it] Training 1/1 epoch (loss 1.6017): 15%|β–ˆβ–Œ | 239/1563 [07:14<35:39, 1.62s/it] Training 1/1 epoch (loss 1.6017): 15%|β–ˆβ–Œ | 240/1563 [07:14<36:57, 1.68s/it] Training 1/1 epoch (loss 1.7150): 15%|β–ˆβ–Œ | 240/1563 [07:16<36:57, 1.68s/it] Training 1/1 epoch (loss 1.7150): 15%|β–ˆβ–Œ | 241/1563 [07:16<35:59, 1.63s/it] Training 1/1 epoch (loss 1.6571): 15%|β–ˆβ–Œ | 241/1563 [07:18<35:59, 1.63s/it] Training 1/1 epoch (loss 1.6571): 15%|β–ˆβ–Œ | 242/1563 [07:18<40:19, 1.83s/it] Training 1/1 epoch (loss 1.6129): 15%|β–ˆβ–Œ | 242/1563 [07:20<40:19, 1.83s/it] Training 1/1 epoch (loss 1.6129): 16%|β–ˆβ–Œ | 243/1563 [07:20<40:03, 1.82s/it] Training 1/1 epoch (loss 1.7117): 16%|β–ˆβ–Œ | 243/1563 [07:21<40:03, 1.82s/it] Training 1/1 epoch (loss 1.7117): 16%|β–ˆβ–Œ | 244/1563 [07:21<35:53, 1.63s/it] Training 1/1 epoch (loss 1.7429): 16%|β–ˆβ–Œ | 244/1563 [07:23<35:53, 1.63s/it] Training 1/1 epoch (loss 1.7429): 16%|β–ˆβ–Œ | 245/1563 [07:23<33:56, 1.55s/it] Training 1/1 epoch (loss 1.7957): 16%|β–ˆβ–Œ | 245/1563 [07:25<33:56, 1.55s/it] Training 1/1 epoch (loss 1.7957): 16%|β–ˆβ–Œ | 246/1563 [07:25<37:14, 1.70s/it] Training 1/1 epoch (loss 1.6957): 16%|β–ˆβ–Œ | 246/1563 [07:26<37:14, 1.70s/it] Training 1/1 epoch (loss 1.6957): 16%|β–ˆβ–Œ | 247/1563 [07:26<37:24, 1.71s/it] Training 1/1 epoch (loss 1.7224): 16%|β–ˆβ–Œ | 247/1563 [07:28<37:24, 1.71s/it] Training 1/1 epoch (loss 1.7224): 16%|β–ˆβ–Œ | 248/1563 [07:28<34:58, 1.60s/it] Training 1/1 epoch (loss 1.6863): 16%|β–ˆβ–Œ | 248/1563 [07:29<34:58, 1.60s/it] Training 1/1 epoch (loss 
1.6863): 16%|β–ˆβ–Œ | 249/1563 [07:29<32:37, 1.49s/it] Training 1/1 epoch (loss 1.7770): 16%|β–ˆβ–Œ | 249/1563 [07:31<32:37, 1.49s/it] Training 1/1 epoch (loss 1.7770): 16%|β–ˆβ–Œ | 250/1563 [07:31<37:52, 1.73s/it] Training 1/1 epoch (loss 1.6383): 16%|β–ˆβ–Œ | 250/1563 [07:32<37:52, 1.73s/it] Training 1/1 epoch (loss 1.6383): 16%|β–ˆβ–Œ | 251/1563 [07:32<33:30, 1.53s/it] Training 1/1 epoch (loss 1.6724): 16%|β–ˆβ–Œ | 251/1563 [07:35<33:30, 1.53s/it] Training 1/1 epoch (loss 1.6724): 16%|β–ˆβ–Œ | 252/1563 [07:35<39:23, 1.80s/it] Training 1/1 epoch (loss 1.7567): 16%|β–ˆβ–Œ | 252/1563 [07:36<39:23, 1.80s/it] Training 1/1 epoch (loss 1.7567): 16%|β–ˆβ–Œ | 253/1563 [07:36<35:16, 1.62s/it] Training 1/1 epoch (loss 1.5907): 16%|β–ˆβ–Œ | 253/1563 [07:38<35:16, 1.62s/it] Training 1/1 epoch (loss 1.5907): 16%|β–ˆβ–‹ | 254/1563 [07:38<41:07, 1.88s/it] Training 1/1 epoch (loss 1.6865): 16%|β–ˆβ–‹ | 254/1563 [07:40<41:07, 1.88s/it] Training 1/1 epoch (loss 1.6865): 16%|β–ˆβ–‹ | 255/1563 [07:40<36:40, 1.68s/it] Training 1/1 epoch (loss 1.8102): 16%|β–ˆβ–‹ | 255/1563 [07:42<36:40, 1.68s/it] Training 1/1 epoch (loss 1.8102): 16%|β–ˆβ–‹ | 256/1563 [07:42<39:23, 1.81s/it] Training 1/1 epoch (loss 1.7987): 16%|β–ˆβ–‹ | 256/1563 [07:43<39:23, 1.81s/it] Training 1/1 epoch (loss 1.7987): 16%|β–ˆβ–‹ | 257/1563 [07:43<36:35, 1.68s/it] Training 1/1 epoch (loss 1.7387): 16%|β–ˆβ–‹ | 257/1563 [07:45<36:35, 1.68s/it] Training 1/1 epoch (loss 1.7387): 17%|β–ˆβ–‹ | 258/1563 [07:45<37:56, 1.74s/it] Training 1/1 epoch (loss 1.6953): 17%|β–ˆβ–‹ | 258/1563 [07:47<37:56, 1.74s/it] Training 1/1 epoch (loss 1.6953): 17%|β–ˆβ–‹ | 259/1563 [07:47<37:32, 1.73s/it] Training 1/1 epoch (loss 1.6877): 17%|β–ˆβ–‹ | 259/1563 [07:48<37:32, 1.73s/it] Training 1/1 epoch (loss 1.6877): 17%|β–ˆβ–‹ | 260/1563 [07:48<37:44, 1.74s/it] Training 1/1 epoch (loss 1.6698): 17%|β–ˆβ–‹ | 260/1563 [07:51<37:44, 1.74s/it] Training 1/1 epoch (loss 1.6698): 17%|β–ˆβ–‹ | 261/1563 [07:51<40:29, 1.87s/it] Training 1/1 epoch (loss 
1.8029): 17%|β–ˆβ–‹ | 261/1563 [07:52<40:29, 1.87s/it] Training 1/1 epoch (loss 1.8029): 17%|β–ˆβ–‹ | 262/1563 [07:52<38:30, 1.78s/it] Training 1/1 epoch (loss 1.7915): 17%|β–ˆβ–‹ | 262/1563 [07:54<38:30, 1.78s/it] Training 1/1 epoch (loss 1.7915): 17%|β–ˆβ–‹ | 263/1563 [07:54<40:01, 1.85s/it] Training 1/1 epoch (loss 1.7505): 17%|β–ˆβ–‹ | 263/1563 [07:56<40:01, 1.85s/it] Training 1/1 epoch (loss 1.7505): 17%|β–ˆβ–‹ | 264/1563 [07:56<38:43, 1.79s/it] Training 1/1 epoch (loss 1.7101): 17%|β–ˆβ–‹ | 264/1563 [07:57<38:43, 1.79s/it] Training 1/1 epoch (loss 1.7101): 17%|β–ˆβ–‹ | 265/1563 [07:57<37:14, 1.72s/it] Training 1/1 epoch (loss 1.7094): 17%|β–ˆβ–‹ | 265/1563 [07:59<37:14, 1.72s/it] Training 1/1 epoch (loss 1.7094): 17%|β–ˆβ–‹ | 266/1563 [07:59<34:29, 1.60s/it] Training 1/1 epoch (loss 1.6681): 17%|β–ˆβ–‹ | 266/1563 [08:00<34:29, 1.60s/it] Training 1/1 epoch (loss 1.6681): 17%|β–ˆβ–‹ | 267/1563 [08:00<29:35, 1.37s/it] Training 1/1 epoch (loss 1.8272): 17%|β–ˆβ–‹ | 267/1563 [08:01<29:35, 1.37s/it] Training 1/1 epoch (loss 1.8272): 17%|β–ˆβ–‹ | 268/1563 [08:01<31:09, 1.44s/it] Training 1/1 epoch (loss 1.6378): 17%|β–ˆβ–‹ | 268/1563 [08:03<31:09, 1.44s/it] Training 1/1 epoch (loss 1.6378): 17%|β–ˆβ–‹ | 269/1563 [08:03<32:38, 1.51s/it] Training 1/1 epoch (loss 1.7698): 17%|β–ˆβ–‹ | 269/1563 [08:04<32:38, 1.51s/it] Training 1/1 epoch (loss 1.7698): 17%|β–ˆβ–‹ | 270/1563 [08:04<31:29, 1.46s/it] Training 1/1 epoch (loss 1.7017): 17%|β–ˆβ–‹ | 270/1563 [08:05<31:29, 1.46s/it] Training 1/1 epoch (loss 1.7017): 17%|β–ˆβ–‹ | 271/1563 [08:05<29:58, 1.39s/it] Training 1/1 epoch (loss 1.5852): 17%|β–ˆβ–‹ | 271/1563 [08:08<29:58, 1.39s/it] Training 1/1 epoch (loss 1.5852): 17%|β–ˆβ–‹ | 272/1563 [08:08<38:38, 1.80s/it] Training 1/1 epoch (loss 1.7884): 17%|β–ˆβ–‹ | 272/1563 [08:10<38:38, 1.80s/it] Training 1/1 epoch (loss 1.7884): 17%|β–ˆβ–‹ | 273/1563 [08:10<42:12, 1.96s/it] Training 1/1 epoch (loss 1.6836): 17%|β–ˆβ–‹ | 273/1563 [08:13<42:12, 1.96s/it] Training 1/1 epoch (loss 
1.6836): 18%|β–ˆβ–Š | 274/1563 [08:13<44:14, 2.06s/it] Training 1/1 epoch (loss 1.7281): 18%|β–ˆβ–Š | 274/1563 [08:15<44:14, 2.06s/it] Training 1/1 epoch (loss 1.7281): 18%|β–ˆβ–Š | 275/1563 [08:15<42:57, 2.00s/it] Training 1/1 epoch (loss 1.7658): 18%|β–ˆβ–Š | 275/1563 [08:16<42:57, 2.00s/it] Training 1/1 epoch (loss 1.7658): 18%|β–ˆβ–Š | 276/1563 [08:16<37:38, 1.75s/it] Training 1/1 epoch (loss 1.7969): 18%|β–ˆβ–Š | 276/1563 [08:18<37:38, 1.75s/it] Training 1/1 epoch (loss 1.7969): 18%|β–ˆβ–Š | 277/1563 [08:18<39:18, 1.83s/it] Training 1/1 epoch (loss 1.6916): 18%|β–ˆβ–Š | 277/1563 [08:19<39:18, 1.83s/it] Training 1/1 epoch (loss 1.6916): 18%|β–ˆβ–Š | 278/1563 [08:19<37:26, 1.75s/it] Training 1/1 epoch (loss 1.6850): 18%|β–ˆβ–Š | 278/1563 [08:20<37:26, 1.75s/it] Training 1/1 epoch (loss 1.6850): 18%|β–ˆβ–Š | 279/1563 [08:20<33:16, 1.56s/it] Training 1/1 epoch (loss 1.6971): 18%|β–ˆβ–Š | 279/1563 [08:23<33:16, 1.56s/it] Training 1/1 epoch (loss 1.6971): 18%|β–ˆβ–Š | 280/1563 [08:23<38:00, 1.78s/it] Training 1/1 epoch (loss 1.7646): 18%|β–ˆβ–Š | 280/1563 [08:24<38:00, 1.78s/it] Training 1/1 epoch (loss 1.7646): 18%|β–ˆβ–Š | 281/1563 [08:24<34:34, 1.62s/it] Training 1/1 epoch (loss 1.6598): 18%|β–ˆβ–Š | 281/1563 [08:26<34:34, 1.62s/it] Training 1/1 epoch (loss 1.6598): 18%|β–ˆβ–Š | 282/1563 [08:26<36:14, 1.70s/it] Training 1/1 epoch (loss 1.6963): 18%|β–ˆβ–Š | 282/1563 [08:28<36:14, 1.70s/it] Training 1/1 epoch (loss 1.6963): 18%|β–ˆβ–Š | 283/1563 [08:28<38:42, 1.81s/it] Training 1/1 epoch (loss 1.7348): 18%|β–ˆβ–Š | 283/1563 [08:30<38:42, 1.81s/it] Training 1/1 epoch (loss 1.7348): 18%|β–ˆβ–Š | 284/1563 [08:30<39:07, 1.84s/it] Training 1/1 epoch (loss 1.6330): 18%|β–ˆβ–Š | 284/1563 [08:32<39:07, 1.84s/it] Training 1/1 epoch (loss 1.6330): 18%|β–ˆβ–Š | 285/1563 [08:32<40:55, 1.92s/it] Training 1/1 epoch (loss 1.6543): 18%|β–ˆβ–Š | 285/1563 [08:34<40:55, 1.92s/it] Training 1/1 epoch (loss 1.6543): 18%|β–ˆβ–Š | 286/1563 [08:34<42:38, 2.00s/it] Training 1/1 epoch (loss 
1.7924): 18%|β–ˆβ–Š | 286/1563 [08:35<42:38, 2.00s/it] Training 1/1 epoch (loss 1.7924): 18%|β–ˆβ–Š | 287/1563 [08:35<37:22, 1.76s/it] Training 1/1 epoch (loss 1.6799): 18%|β–ˆβ–Š | 287/1563 [08:38<37:22, 1.76s/it] Training 1/1 epoch (loss 1.6799): 18%|β–ˆβ–Š | 288/1563 [08:38<40:23, 1.90s/it] Training 1/1 epoch (loss 1.7346): 18%|β–ˆβ–Š | 288/1563 [08:39<40:23, 1.90s/it] Training 1/1 epoch (loss 1.7346): 18%|β–ˆβ–Š | 289/1563 [08:39<40:17, 1.90s/it] Training 1/1 epoch (loss 1.7106): 18%|β–ˆβ–Š | 289/1563 [08:42<40:17, 1.90s/it] Training 1/1 epoch (loss 1.7106): 19%|β–ˆβ–Š | 290/1563 [08:42<43:29, 2.05s/it] Training 1/1 epoch (loss 1.7178): 19%|β–ˆβ–Š | 290/1563 [08:43<43:29, 2.05s/it] Training 1/1 epoch (loss 1.7178): 19%|β–ˆβ–Š | 291/1563 [08:43<38:45, 1.83s/it] Training 1/1 epoch (loss 1.6727): 19%|β–ˆβ–Š | 291/1563 [08:45<38:45, 1.83s/it] Training 1/1 epoch (loss 1.6727): 19%|β–ˆβ–Š | 292/1563 [08:45<38:26, 1.81s/it] Training 1/1 epoch (loss 1.7375): 19%|β–ˆβ–Š | 292/1563 [08:46<38:26, 1.81s/it] Training 1/1 epoch (loss 1.7375): 19%|β–ˆβ–Š | 293/1563 [08:46<34:31, 1.63s/it] Training 1/1 epoch (loss 1.8500): 19%|β–ˆβ–Š | 293/1563 [08:49<34:31, 1.63s/it] Training 1/1 epoch (loss 1.8500): 19%|β–ˆβ–‰ | 294/1563 [08:49<39:31, 1.87s/it] Training 1/1 epoch (loss 1.7360): 19%|β–ˆβ–‰ | 294/1563 [08:51<39:31, 1.87s/it] Training 1/1 epoch (loss 1.7360): 19%|β–ˆβ–‰ | 295/1563 [08:51<40:19, 1.91s/it] Training 1/1 epoch (loss 1.6286): 19%|β–ˆβ–‰ | 295/1563 [08:52<40:19, 1.91s/it] Training 1/1 epoch (loss 1.6286): 19%|β–ˆβ–‰ | 296/1563 [08:52<39:53, 1.89s/it] Training 1/1 epoch (loss 1.8246): 19%|β–ˆβ–‰ | 296/1563 [08:54<39:53, 1.89s/it] Training 1/1 epoch (loss 1.8246): 19%|β–ˆβ–‰ | 297/1563 [08:54<35:49, 1.70s/it] Training 1/1 epoch (loss 1.7648): 19%|β–ˆβ–‰ | 297/1563 [08:56<35:49, 1.70s/it] Training 1/1 epoch (loss 1.7648): 19%|β–ˆβ–‰ | 298/1563 [08:56<37:28, 1.78s/it] Training 1/1 epoch (loss 1.6557): 19%|β–ˆβ–‰ | 298/1563 [08:57<37:28, 1.78s/it] Training 1/1 epoch (loss 
1.6557): 19%|β–ˆβ–‰ | 299/1563 [08:57<37:05, 1.76s/it] Training 1/1 epoch (loss 1.7369): 19%|β–ˆβ–‰ | 299/1563 [08:59<37:05, 1.76s/it] Training 1/1 epoch (loss 1.7369): 19%|β–ˆβ–‰ | 300/1563 [08:59<37:17, 1.77s/it] Training 1/1 epoch (loss 1.7802): 19%|β–ˆβ–‰ | 300/1563 [09:01<37:17, 1.77s/it] Training 1/1 epoch (loss 1.7802): 19%|β–ˆβ–‰ | 301/1563 [09:01<38:14, 1.82s/it] Training 1/1 epoch (loss 1.7585): 19%|β–ˆβ–‰ | 301/1563 [09:04<38:14, 1.82s/it] Training 1/1 epoch (loss 1.7585): 19%|β–ˆβ–‰ | 302/1563 [09:04<41:58, 2.00s/it] Training 1/1 epoch (loss 1.6748): 19%|β–ˆβ–‰ | 302/1563 [09:05<41:58, 2.00s/it] Training 1/1 epoch (loss 1.6748): 19%|β–ˆβ–‰ | 303/1563 [09:05<37:30, 1.79s/it] Training 1/1 epoch (loss 1.6896): 19%|β–ˆβ–‰ | 303/1563 [09:07<37:30, 1.79s/it] Training 1/1 epoch (loss 1.6896): 19%|β–ˆβ–‰ | 304/1563 [09:07<41:24, 1.97s/it] Training 1/1 epoch (loss 1.6744): 19%|β–ˆβ–‰ | 304/1563 [09:10<41:24, 1.97s/it] Training 1/1 epoch (loss 1.6744): 20%|β–ˆβ–‰ | 305/1563 [09:10<44:49, 2.14s/it] Training 1/1 epoch (loss 1.7533): 20%|β–ˆβ–‰ | 305/1563 [09:12<44:49, 2.14s/it] Training 1/1 epoch (loss 1.7533): 20%|β–ˆβ–‰ | 306/1563 [09:12<44:02, 2.10s/it] Training 1/1 epoch (loss 1.6824): 20%|β–ˆβ–‰ | 306/1563 [09:13<44:02, 2.10s/it] Training 1/1 epoch (loss 1.6824): 20%|β–ˆβ–‰ | 307/1563 [09:13<40:47, 1.95s/it] Training 1/1 epoch (loss 1.7539): 20%|β–ˆβ–‰ | 307/1563 [09:15<40:47, 1.95s/it] Training 1/1 epoch (loss 1.7539): 20%|β–ˆβ–‰ | 308/1563 [09:15<36:26, 1.74s/it] Training 1/1 epoch (loss 1.7828): 20%|β–ˆβ–‰ | 308/1563 [09:16<36:26, 1.74s/it] Training 1/1 epoch (loss 1.7828): 20%|β–ˆβ–‰ | 309/1563 [09:16<35:39, 1.71s/it] Training 1/1 epoch (loss 1.7594): 20%|β–ˆβ–‰ | 309/1563 [09:18<35:39, 1.71s/it] Training 1/1 epoch (loss 1.7594): 20%|β–ˆβ–‰ | 310/1563 [09:18<38:31, 1.84s/it] Training 1/1 epoch (loss 1.7696): 20%|β–ˆβ–‰ | 310/1563 [09:21<38:31, 1.84s/it] Training 1/1 epoch (loss 1.7696): 20%|β–ˆβ–‰ | 311/1563 [09:21<40:35, 1.95s/it] Training 1/1 epoch (loss 
1.6692): 20%|β–ˆβ–‰ | 311/1563 [09:22<40:35, 1.95s/it] Training 1/1 epoch (loss 1.6692): 20%|β–ˆβ–‰ | 312/1563 [09:22<38:16, 1.84s/it] Training 1/1 epoch (loss 1.6528): 20%|β–ˆβ–‰ | 312/1563 [09:25<38:16, 1.84s/it] Training 1/1 epoch (loss 1.6528): 20%|β–ˆβ–ˆ | 313/1563 [09:25<42:07, 2.02s/it] Training 1/1 epoch (loss 1.7409): 20%|β–ˆβ–ˆ | 313/1563 [09:26<42:07, 2.02s/it] Training 1/1 epoch (loss 1.7409): 20%|β–ˆβ–ˆ | 314/1563 [09:26<40:13, 1.93s/it] Training 1/1 epoch (loss 1.7812): 20%|β–ˆβ–ˆ | 314/1563 [09:28<40:13, 1.93s/it] Training 1/1 epoch (loss 1.7812): 20%|β–ˆβ–ˆ | 315/1563 [09:28<37:47, 1.82s/it] Training 1/1 epoch (loss 1.6902): 20%|β–ˆβ–ˆ | 315/1563 [09:30<37:47, 1.82s/it] Training 1/1 epoch (loss 1.6902): 20%|β–ˆβ–ˆ | 316/1563 [09:30<39:01, 1.88s/it] Training 1/1 epoch (loss 1.7452): 20%|β–ˆβ–ˆ | 316/1563 [09:31<39:01, 1.88s/it] Training 1/1 epoch (loss 1.7452): 20%|β–ˆβ–ˆ | 317/1563 [09:31<35:16, 1.70s/it] Training 1/1 epoch (loss 1.6981): 20%|β–ˆβ–ˆ | 317/1563 [09:33<35:16, 1.70s/it] Training 1/1 epoch (loss 1.6981): 20%|β–ˆβ–ˆ | 318/1563 [09:33<34:07, 1.64s/it] Training 1/1 epoch (loss 1.6421): 20%|β–ˆβ–ˆ | 318/1563 [09:34<34:07, 1.64s/it] Training 1/1 epoch (loss 1.6421): 20%|β–ˆβ–ˆ | 319/1563 [09:34<34:27, 1.66s/it] Training 1/1 epoch (loss 1.7360): 20%|β–ˆβ–ˆ | 319/1563 [09:36<34:27, 1.66s/it] Training 1/1 epoch (loss 1.7360): 20%|β–ˆβ–ˆ | 320/1563 [09:36<36:29, 1.76s/it] Training 1/1 epoch (loss 1.7457): 20%|β–ˆβ–ˆ | 320/1563 [09:39<36:29, 1.76s/it] Training 1/1 epoch (loss 1.7457): 21%|β–ˆβ–ˆ | 321/1563 [09:39<39:31, 1.91s/it] Training 1/1 epoch (loss 1.7865): 21%|β–ˆβ–ˆ | 321/1563 [09:40<39:31, 1.91s/it] Training 1/1 epoch (loss 1.7865): 21%|β–ˆβ–ˆ | 322/1563 [09:40<36:29, 1.76s/it] Training 1/1 epoch (loss 1.6334): 21%|β–ˆβ–ˆ | 322/1563 [09:42<36:29, 1.76s/it] Training 1/1 epoch (loss 1.6334): 21%|β–ˆβ–ˆ | 323/1563 [09:42<35:36, 1.72s/it] Training 1/1 epoch (loss 1.6784): 21%|β–ˆβ–ˆ | 323/1563 [09:44<35:36, 1.72s/it] Training 1/1 epoch (loss 
1.6784): 21%|β–ˆβ–ˆ | 324/1563 [09:44<36:14, 1.76s/it] Training 1/1 epoch (loss 1.6978): 21%|β–ˆβ–ˆ | 324/1563 [09:45<36:14, 1.76s/it] Training 1/1 epoch (loss 1.6978): 21%|β–ˆβ–ˆ | 325/1563 [09:45<36:12, 1.75s/it] Training 1/1 epoch (loss 1.6747): 21%|β–ˆβ–ˆ | 325/1563 [09:47<36:12, 1.75s/it] Training 1/1 epoch (loss 1.6747): 21%|β–ˆβ–ˆ | 326/1563 [09:47<38:33, 1.87s/it] Training 1/1 epoch (loss 1.7432): 21%|β–ˆβ–ˆ | 326/1563 [09:49<38:33, 1.87s/it] Training 1/1 epoch (loss 1.7432): 21%|β–ˆβ–ˆ | 327/1563 [09:49<37:49, 1.84s/it] Training 1/1 epoch (loss 1.6142): 21%|β–ˆβ–ˆ | 327/1563 [09:51<37:49, 1.84s/it] Training 1/1 epoch (loss 1.6142): 21%|β–ˆβ–ˆ | 328/1563 [09:51<37:32, 1.82s/it] Training 1/1 epoch (loss 1.8085): 21%|β–ˆβ–ˆ | 328/1563 [09:53<37:32, 1.82s/it] Training 1/1 epoch (loss 1.8085): 21%|β–ˆβ–ˆ | 329/1563 [09:53<40:53, 1.99s/it] Training 1/1 epoch (loss 1.7013): 21%|β–ˆβ–ˆ | 329/1563 [09:55<40:53, 1.99s/it] Training 1/1 epoch (loss 1.7013): 21%|β–ˆβ–ˆ | 330/1563 [09:55<37:29, 1.82s/it] Training 1/1 epoch (loss 1.5966): 21%|β–ˆβ–ˆ | 330/1563 [09:56<37:29, 1.82s/it] Training 1/1 epoch (loss 1.5966): 21%|β–ˆβ–ˆ | 331/1563 [09:56<35:10, 1.71s/it] Training 1/1 epoch (loss 1.7563): 21%|β–ˆβ–ˆ | 331/1563 [09:58<35:10, 1.71s/it] Training 1/1 epoch (loss 1.7563): 21%|β–ˆβ–ˆ | 332/1563 [09:58<32:57, 1.61s/it] Training 1/1 epoch (loss 1.7440): 21%|β–ˆβ–ˆ | 332/1563 [10:00<32:57, 1.61s/it] Training 1/1 epoch (loss 1.7440): 21%|β–ˆβ–ˆβ– | 333/1563 [10:00<36:19, 1.77s/it] Training 1/1 epoch (loss 1.6682): 21%|β–ˆβ–ˆβ– | 333/1563 [10:01<36:19, 1.77s/it] Training 1/1 epoch (loss 1.6682): 21%|β–ˆβ–ˆβ– | 334/1563 [10:01<34:10, 1.67s/it] Training 1/1 epoch (loss 1.7259): 21%|β–ˆβ–ˆβ– | 334/1563 [10:04<34:10, 1.67s/it] Training 1/1 epoch (loss 1.7259): 21%|β–ˆβ–ˆβ– | 335/1563 [10:04<38:47, 1.90s/it] Training 1/1 epoch (loss 1.7510): 21%|β–ˆβ–ˆβ– | 335/1563 [10:05<38:47, 1.90s/it] Training 1/1 epoch (loss 1.7510): 21%|β–ˆβ–ˆβ– | 336/1563 [10:05<36:22, 1.78s/it] 
Training 1/1 epoch (loss 1.7038): 21%|β–ˆβ–ˆβ– | 336/1563 [10:07<36:22, 1.78s/it] Training 1/1 epoch (loss 1.7038): 22%|β–ˆβ–ˆβ– | 337/1563 [10:07<37:00, 1.81s/it] Training 1/1 epoch (loss 1.7411): 22%|β–ˆβ–ˆβ– | 337/1563 [10:09<37:00, 1.81s/it] Training 1/1 epoch (loss 1.7411): 22%|β–ˆβ–ˆβ– | 338/1563 [10:09<37:15, 1.83s/it] Training 1/1 epoch (loss 1.7110): 22%|β–ˆβ–ˆβ– | 338/1563 [10:10<37:15, 1.83s/it] Training 1/1 epoch (loss 1.7110): 22%|β–ˆβ–ˆβ– | 339/1563 [10:10<34:23, 1.69s/it] Training 1/1 epoch (loss 1.7340): 22%|β–ˆβ–ˆβ– | 339/1563 [10:12<34:23, 1.69s/it] Training 1/1 epoch (loss 1.7340): 22%|β–ˆβ–ˆβ– | 340/1563 [10:12<36:33, 1.79s/it] Training 1/1 epoch (loss 1.6772): 22%|β–ˆβ–ˆβ– | 340/1563 [10:14<36:33, 1.79s/it] Training 1/1 epoch (loss 1.6772): 22%|β–ˆβ–ˆβ– | 341/1563 [10:14<38:39, 1.90s/it] Training 1/1 epoch (loss 1.6617): 22%|β–ˆβ–ˆβ– | 341/1563 [10:16<38:39, 1.90s/it] Training 1/1 epoch (loss 1.6617): 22%|β–ˆβ–ˆβ– | 342/1563 [10:16<38:10, 1.88s/it] Training 1/1 epoch (loss 1.7039): 22%|β–ˆβ–ˆβ– | 342/1563 [10:18<38:10, 1.88s/it] Training 1/1 epoch (loss 1.7039): 22%|β–ˆβ–ˆβ– | 343/1563 [10:18<39:39, 1.95s/it] Training 1/1 epoch (loss 1.7177): 22%|β–ˆβ–ˆβ– | 343/1563 [10:20<39:39, 1.95s/it] Training 1/1 epoch (loss 1.7177): 22%|β–ˆβ–ˆβ– | 344/1563 [10:20<38:52, 1.91s/it] Training 1/1 epoch (loss 1.7206): 22%|β–ˆβ–ˆβ– | 344/1563 [10:21<38:52, 1.91s/it] Training 1/1 epoch (loss 1.7206): 22%|β–ˆβ–ˆβ– | 345/1563 [10:21<33:47, 1.66s/it] Training 1/1 epoch (loss 1.8159): 22%|β–ˆβ–ˆβ– | 345/1563 [10:23<33:47, 1.66s/it] Training 1/1 epoch (loss 1.8159): 22%|β–ˆβ–ˆβ– | 346/1563 [10:23<33:03, 1.63s/it] Training 1/1 epoch (loss 1.7007): 22%|β–ˆβ–ˆβ– | 346/1563 [10:25<33:03, 1.63s/it] Training 1/1 epoch (loss 1.7007): 22%|β–ˆβ–ˆβ– | 347/1563 [10:25<37:30, 1.85s/it] Training 1/1 epoch (loss 1.6166): 22%|β–ˆβ–ˆβ– | 347/1563 [10:27<37:30, 1.85s/it] Training 1/1 epoch (loss 1.6166): 22%|β–ˆβ–ˆβ– | 348/1563 [10:27<38:43, 1.91s/it] 
Training 1/1 epoch (loss 1.6576): 22%|β–ˆβ–ˆβ– | 348/1563 [10:29<38:43, 1.91s/it] Training 1/1 epoch (loss 1.6576): 22%|β–ˆβ–ˆβ– | 349/1563 [10:29<37:10, 1.84s/it] Training 1/1 epoch (loss 1.7562): 22%|β–ˆβ–ˆβ– | 349/1563 [10:31<37:10, 1.84s/it] Training 1/1 epoch (loss 1.7562): 22%|β–ˆβ–ˆβ– | 350/1563 [10:31<36:21, 1.80s/it] Training 1/1 epoch (loss 1.7466): 22%|β–ˆβ–ˆβ– | 350/1563 [10:32<36:21, 1.80s/it] Training 1/1 epoch (loss 1.7466): 22%|β–ˆβ–ˆβ– | 351/1563 [10:32<35:17, 1.75s/it] Training 1/1 epoch (loss 1.7910): 22%|β–ˆβ–ˆβ– | 351/1563 [10:34<35:17, 1.75s/it] Training 1/1 epoch (loss 1.7910): 23%|β–ˆβ–ˆβ–Ž | 352/1563 [10:34<34:14, 1.70s/it] Training 1/1 epoch (loss 1.7441): 23%|β–ˆβ–ˆβ–Ž | 352/1563 [10:35<34:14, 1.70s/it] Training 1/1 epoch (loss 1.7441): 23%|β–ˆβ–ˆβ–Ž | 353/1563 [10:35<33:48, 1.68s/it] Training 1/1 epoch (loss 1.6622): 23%|β–ˆβ–ˆβ–Ž | 353/1563 [10:37<33:48, 1.68s/it] Training 1/1 epoch (loss 1.6622): 23%|β–ˆβ–ˆβ–Ž | 354/1563 [10:37<32:54, 1.63s/it] Training 1/1 epoch (loss 1.6387): 23%|β–ˆβ–ˆβ–Ž | 354/1563 [10:39<32:54, 1.63s/it] Training 1/1 epoch (loss 1.6387): 23%|β–ˆβ–ˆβ–Ž | 355/1563 [10:39<36:56, 1.83s/it] Training 1/1 epoch (loss 1.6799): 23%|β–ˆβ–ˆβ–Ž | 355/1563 [10:40<36:56, 1.83s/it] Training 1/1 epoch (loss 1.6799): 23%|β–ˆβ–ˆβ–Ž | 356/1563 [10:40<32:08, 1.60s/it] Training 1/1 epoch (loss 1.6925): 23%|β–ˆβ–ˆβ–Ž | 356/1563 [10:42<32:08, 1.60s/it] Training 1/1 epoch (loss 1.6925): 23%|β–ˆβ–ˆβ–Ž | 357/1563 [10:42<34:46, 1.73s/it] Training 1/1 epoch (loss 1.5968): 23%|β–ˆβ–ˆβ–Ž | 357/1563 [10:44<34:46, 1.73s/it] Training 1/1 epoch (loss 1.5968): 23%|β–ˆβ–ˆβ–Ž | 358/1563 [10:44<34:13, 1.70s/it] Training 1/1 epoch (loss 1.6935): 23%|β–ˆβ–ˆβ–Ž | 358/1563 [10:46<34:13, 1.70s/it] Training 1/1 epoch (loss 1.6935): 23%|β–ˆβ–ˆβ–Ž | 359/1563 [10:46<38:37, 1.92s/it] Training 1/1 epoch (loss 1.8005): 23%|β–ˆβ–ˆβ–Ž | 359/1563 [10:48<38:37, 1.92s/it] Training 1/1 epoch (loss 1.8005): 23%|β–ˆβ–ˆβ–Ž | 360/1563 [10:48<39:07, 1.95s/it] 
Training 1/1 epoch (loss 1.6506): 23%|β–ˆβ–ˆβ–Ž | 360/1563 [10:50<39:07, 1.95s/it] Training 1/1 epoch (loss 1.6506): 23%|β–ˆβ–ˆβ–Ž | 361/1563 [10:50<34:01, 1.70s/it] Training 1/1 epoch (loss 1.7692): 23%|β–ˆβ–ˆβ–Ž | 361/1563 [10:51<34:01, 1.70s/it] Training 1/1 epoch (loss 1.7692): 23%|β–ˆβ–ˆβ–Ž | 362/1563 [10:51<31:32, 1.58s/it] Training 1/1 epoch (loss 1.7964): 23%|β–ˆβ–ˆβ–Ž | 362/1563 [10:53<31:32, 1.58s/it] Training 1/1 epoch (loss 1.7964): 23%|β–ˆβ–ˆβ–Ž | 363/1563 [10:53<34:31, 1.73s/it] Training 1/1 epoch (loss 1.7750): 23%|β–ˆβ–ˆβ–Ž | 363/1563 [10:55<34:31, 1.73s/it] Training 1/1 epoch (loss 1.7750): 23%|β–ˆβ–ˆβ–Ž | 364/1563 [10:55<39:01, 1.95s/it] Training 1/1 epoch (loss 1.6640): 23%|β–ˆβ–ˆβ–Ž | 364/1563 [10:57<39:01, 1.95s/it] Training 1/1 epoch (loss 1.6640): 23%|β–ˆβ–ˆβ–Ž | 365/1563 [10:57<36:13, 1.81s/it] Training 1/1 epoch (loss 1.6951): 23%|β–ˆβ–ˆβ–Ž | 365/1563 [10:58<36:13, 1.81s/it] Training 1/1 epoch (loss 1.6951): 23%|β–ˆβ–ˆβ–Ž | 366/1563 [10:58<30:20, 1.52s/it] Training 1/1 epoch (loss 1.6540): 23%|β–ˆβ–ˆβ–Ž | 366/1563 [11:00<30:20, 1.52s/it] Training 1/1 epoch (loss 1.6540): 23%|β–ˆβ–ˆβ–Ž | 367/1563 [11:00<33:19, 1.67s/it] Training 1/1 epoch (loss 1.6884): 23%|β–ˆβ–ˆβ–Ž | 367/1563 [11:02<33:19, 1.67s/it] Training 1/1 epoch (loss 1.6884): 24%|β–ˆβ–ˆβ–Ž | 368/1563 [11:02<36:21, 1.83s/it] Training 1/1 epoch (loss 1.6901): 24%|β–ˆβ–ˆβ–Ž | 368/1563 [11:04<36:21, 1.83s/it] Training 1/1 epoch (loss 1.6901): 24%|β–ˆβ–ˆβ–Ž | 369/1563 [11:04<36:17, 1.82s/it] Training 1/1 epoch (loss 1.7061): 24%|β–ˆβ–ˆβ–Ž | 369/1563 [11:05<36:17, 1.82s/it] Training 1/1 epoch (loss 1.7061): 24%|β–ˆβ–ˆβ–Ž | 370/1563 [11:05<35:27, 1.78s/it] Training 1/1 epoch (loss 1.8156): 24%|β–ˆβ–ˆβ–Ž | 370/1563 [11:07<35:27, 1.78s/it] Training 1/1 epoch (loss 1.8156): 24%|β–ˆβ–ˆβ–Ž | 371/1563 [11:07<33:21, 1.68s/it] Training 1/1 epoch (loss 1.6565): 24%|β–ˆβ–ˆβ–Ž | 371/1563 [11:08<33:21, 1.68s/it] Training 1/1 epoch (loss 1.6565): 24%|β–ˆβ–ˆβ– | 372/1563 [11:08<27:57, 1.41s/it] 
Training 1/1 epoch (loss 1.6048): 24%|β–ˆβ–ˆβ– | 372/1563 [11:09<27:57, 1.41s/it] Training 1/1 epoch (loss 1.6048): 24%|β–ˆβ–ˆβ– | 373/1563 [11:09<28:47, 1.45s/it] Training 1/1 epoch (loss 1.6725): 24%|β–ˆβ–ˆβ– | 373/1563 [11:11<28:47, 1.45s/it] Training 1/1 epoch (loss 1.6725): 24%|β–ˆβ–ˆβ– | 374/1563 [11:11<28:48, 1.45s/it] Training 1/1 epoch (loss 1.6599): 24%|β–ˆβ–ˆβ– | 374/1563 [11:13<28:48, 1.45s/it] Training 1/1 epoch (loss 1.6599): 24%|β–ˆβ–ˆβ– | 375/1563 [11:13<32:33, 1.64s/it] Training 1/1 epoch (loss 1.7006): 24%|β–ˆβ–ˆβ– | 375/1563 [11:15<32:33, 1.64s/it] Training 1/1 epoch (loss 1.7006): 24%|β–ˆβ–ˆβ– | 376/1563 [11:15<33:12, 1.68s/it] Training 1/1 epoch (loss 1.6139): 24%|β–ˆβ–ˆβ– | 376/1563 [11:16<33:12, 1.68s/it] Training 1/1 epoch (loss 1.6139): 24%|β–ˆβ–ˆβ– | 377/1563 [11:16<31:13, 1.58s/it] Training 1/1 epoch (loss 1.8085): 24%|β–ˆβ–ˆβ– | 377/1563 [11:18<31:13, 1.58s/it] Training 1/1 epoch (loss 1.8085): 24%|β–ˆβ–ˆβ– | 378/1563 [11:18<31:44, 1.61s/it] Training 1/1 epoch (loss 1.6870): 24%|β–ˆβ–ˆβ– | 378/1563 [11:20<31:44, 1.61s/it] Training 1/1 epoch (loss 1.6870): 24%|β–ˆβ–ˆβ– | 379/1563 [11:20<36:36, 1.86s/it] Training 1/1 epoch (loss 1.5751): 24%|β–ˆβ–ˆβ– | 379/1563 [11:21<36:36, 1.86s/it] Training 1/1 epoch (loss 1.5751): 24%|β–ˆβ–ˆβ– | 380/1563 [11:21<31:26, 1.59s/it] Training 1/1 epoch (loss 1.6013): 24%|β–ˆβ–ˆβ– | 380/1563 [11:22<31:26, 1.59s/it] Training 1/1 epoch (loss 1.6013): 24%|β–ˆβ–ˆβ– | 381/1563 [11:22<29:01, 1.47s/it] Training 1/1 epoch (loss 1.7380): 24%|β–ˆβ–ˆβ– | 381/1563 [11:24<29:01, 1.47s/it] Training 1/1 epoch (loss 1.7380): 24%|β–ˆβ–ˆβ– | 382/1563 [11:24<28:29, 1.45s/it] Training 1/1 epoch (loss 1.7110): 24%|β–ˆβ–ˆβ– | 382/1563 [11:25<28:29, 1.45s/it] Training 1/1 epoch (loss 1.7110): 25%|β–ˆβ–ˆβ– | 383/1563 [11:25<28:48, 1.46s/it] Training 1/1 epoch (loss 1.7297): 25%|β–ˆβ–ˆβ– | 383/1563 [11:27<28:48, 1.46s/it] Training 1/1 epoch (loss 1.7297): 25%|β–ˆβ–ˆβ– | 384/1563 [11:27<33:15, 1.69s/it] 
Training 1/1 epoch (loss 1.7441): 25%|β–ˆβ–ˆβ– | 384/1563 [11:29<33:15, 1.69s/it] Training 1/1 epoch (loss 1.7441): 25%|β–ˆβ–ˆβ– | 385/1563 [11:29<34:06, 1.74s/it] Training 1/1 epoch (loss 1.7248): 25%|β–ˆβ–ˆβ– | 385/1563 [11:31<34:06, 1.74s/it] Training 1/1 epoch (loss 1.7248): 25%|β–ˆβ–ˆβ– | 386/1563 [11:31<33:58, 1.73s/it] Training 1/1 epoch (loss 1.7579): 25%|β–ˆβ–ˆβ– | 386/1563 [11:33<33:58, 1.73s/it] Training 1/1 epoch (loss 1.7579): 25%|β–ˆβ–ˆβ– | 387/1563 [11:33<38:24, 1.96s/it] Training 1/1 epoch (loss 1.7846): 25%|β–ˆβ–ˆβ– | 387/1563 [11:35<38:24, 1.96s/it] Training 1/1 epoch (loss 1.7846): 25%|β–ˆβ–ˆβ– | 388/1563 [11:35<35:37, 1.82s/it] Training 1/1 epoch (loss 1.7152): 25%|β–ˆβ–ˆβ– | 388/1563 [11:37<35:37, 1.82s/it] Training 1/1 epoch (loss 1.7152): 25%|β–ˆβ–ˆβ– | 389/1563 [11:37<39:08, 2.00s/it] Training 1/1 epoch (loss 1.7354): 25%|β–ˆβ–ˆβ– | 389/1563 [11:40<39:08, 2.00s/it] Training 1/1 epoch (loss 1.7354): 25%|β–ˆβ–ˆβ– | 390/1563 [11:40<41:38, 2.13s/it] Training 1/1 epoch (loss 1.6884): 25%|β–ˆβ–ˆβ– | 390/1563 [11:42<41:38, 2.13s/it] Training 1/1 epoch (loss 1.6884): 25%|β–ˆβ–ˆβ–Œ | 391/1563 [11:42<41:19, 2.12s/it] Training 1/1 epoch (loss 1.6540): 25%|β–ˆβ–ˆβ–Œ | 391/1563 [11:44<41:19, 2.12s/it] Training 1/1 epoch (loss 1.6540): 25%|β–ˆβ–ˆβ–Œ | 392/1563 [11:44<39:58, 2.05s/it] Training 1/1 epoch (loss 1.6269): 25%|β–ˆβ–ˆβ–Œ | 392/1563 [11:45<39:58, 2.05s/it] Training 1/1 epoch (loss 1.6269): 25%|β–ˆβ–ˆβ–Œ | 393/1563 [11:45<35:46, 1.83s/it] Training 1/1 epoch (loss 1.5898): 25%|β–ˆβ–ˆβ–Œ | 393/1563 [11:46<35:46, 1.83s/it] Training 1/1 epoch (loss 1.5898): 25%|β–ˆβ–ˆβ–Œ | 394/1563 [11:46<31:13, 1.60s/it] Training 1/1 epoch (loss 1.6165): 25%|β–ˆβ–ˆβ–Œ | 394/1563 [11:48<31:13, 1.60s/it] Training 1/1 epoch (loss 1.6165): 25%|β–ˆβ–ˆβ–Œ | 395/1563 [11:48<34:04, 1.75s/it] Training 1/1 epoch (loss 1.7388): 25%|β–ˆβ–ˆβ–Œ | 395/1563 [11:50<34:04, 1.75s/it] Training 1/1 epoch (loss 1.7388): 25%|β–ˆβ–ˆβ–Œ | 396/1563 [11:50<33:25, 1.72s/it] 
Training 1/1 epoch (loss 1.6610): 25%|β–ˆβ–ˆβ–Œ | 396/1563 [11:51<33:25, 1.72s/it] Training 1/1 epoch (loss 1.6610): 25%|β–ˆβ–ˆβ–Œ | 397/1563 [11:51<33:00, 1.70s/it] Training 1/1 epoch (loss 1.7739): 25%|β–ˆβ–ˆβ–Œ | 397/1563 [11:53<33:00, 1.70s/it] Training 1/1 epoch (loss 1.7739): 25%|β–ˆβ–ˆβ–Œ | 398/1563 [11:53<32:58, 1.70s/it] Training 1/1 epoch (loss 1.6012): 25%|β–ˆβ–ˆβ–Œ | 398/1563 [11:55<32:58, 1.70s/it] Training 1/1 epoch (loss 1.6012): 26%|β–ˆβ–ˆβ–Œ | 399/1563 [11:55<31:05, 1.60s/it] Training 1/1 epoch (loss 1.7565): 26%|β–ˆβ–ˆβ–Œ | 399/1563 [11:57<31:05, 1.60s/it] Training 1/1 epoch (loss 1.7565): 26%|β–ˆβ–ˆβ–Œ | 400/1563 [11:57<34:04, 1.76s/it] Training 1/1 epoch (loss 1.7228): 26%|β–ˆβ–ˆβ–Œ | 400/1563 [11:58<34:04, 1.76s/it] Training 1/1 epoch (loss 1.7228): 26%|β–ˆβ–ˆβ–Œ | 401/1563 [11:58<31:24, 1.62s/it] Training 1/1 epoch (loss 1.7505): 26%|β–ˆβ–ˆβ–Œ | 401/1563 [12:00<31:24, 1.62s/it] Training 1/1 epoch (loss 1.7505): 26%|β–ˆβ–ˆβ–Œ | 402/1563 [12:00<35:42, 1.85s/it] Training 1/1 epoch (loss 1.6836): 26%|β–ˆβ–ˆβ–Œ | 402/1563 [12:02<35:42, 1.85s/it] Training 1/1 epoch (loss 1.6836): 26%|β–ˆβ–ˆβ–Œ | 403/1563 [12:02<37:08, 1.92s/it] Training 1/1 epoch (loss 1.5435): 26%|β–ˆβ–ˆβ–Œ | 403/1563 [12:05<37:08, 1.92s/it] Training 1/1 epoch (loss 1.5435): 26%|β–ˆβ–ˆβ–Œ | 404/1563 [12:05<39:12, 2.03s/it] Training 1/1 epoch (loss 1.6707): 26%|β–ˆβ–ˆβ–Œ | 404/1563 [12:07<39:12, 2.03s/it] Training 1/1 epoch (loss 1.6707): 26%|β–ˆβ–ˆβ–Œ | 405/1563 [12:07<40:46, 2.11s/it] Training 1/1 epoch (loss 1.7112): 26%|β–ˆβ–ˆβ–Œ | 405/1563 [12:09<40:46, 2.11s/it] Training 1/1 epoch (loss 1.7112): 26%|β–ˆβ–ˆβ–Œ | 406/1563 [12:09<38:21, 1.99s/it] Training 1/1 epoch (loss 1.7265): 26%|β–ˆβ–ˆβ–Œ | 406/1563 [12:10<38:21, 1.99s/it] Training 1/1 epoch (loss 1.7265): 26%|β–ˆβ–ˆβ–Œ | 407/1563 [12:10<36:26, 1.89s/it] Training 1/1 epoch (loss 1.7073): 26%|β–ˆβ–ˆβ–Œ | 407/1563 [12:13<36:26, 1.89s/it] Training 1/1 epoch (loss 1.7073): 26%|β–ˆβ–ˆβ–Œ | 408/1563 [12:13<39:34, 2.06s/it] 
Training 1/1 epoch (loss 1.6758): 26%|β–ˆβ–ˆβ–Œ | 408/1563 [12:15<39:34, 2.06s/it] Training 1/1 epoch (loss 1.6758): 26%|β–ˆβ–ˆβ–Œ | 409/1563 [12:15<40:04, 2.08s/it] Training 1/1 epoch (loss 1.6973): 26%|β–ˆβ–ˆβ–Œ | 409/1563 [12:17<40:04, 2.08s/it] Training 1/1 epoch (loss 1.6973): 26%|β–ˆβ–ˆβ–Œ | 410/1563 [12:17<37:46, 1.97s/it] Training 1/1 epoch (loss 1.7719): 26%|β–ˆβ–ˆβ–Œ | 410/1563 [12:19<37:46, 1.97s/it] Training 1/1 epoch (loss 1.7719): 26%|β–ˆβ–ˆβ–‹ | 411/1563 [12:19<37:37, 1.96s/it] Training 1/1 epoch (loss 1.6800): 26%|β–ˆβ–ˆβ–‹ | 411/1563 [12:20<37:37, 1.96s/it] Training 1/1 epoch (loss 1.6800): 26%|β–ˆβ–ˆβ–‹ | 412/1563 [12:20<36:27, 1.90s/it] Training 1/1 epoch (loss 1.6354): 26%|β–ˆβ–ˆβ–‹ | 412/1563 [12:22<36:27, 1.90s/it] Training 1/1 epoch (loss 1.6354): 26%|β–ˆβ–ˆβ–‹ | 413/1563 [12:22<35:27, 1.85s/it] Training 1/1 epoch (loss 1.6966): 26%|β–ˆβ–ˆβ–‹ | 413/1563 [12:23<35:27, 1.85s/it] Training 1/1 epoch (loss 1.6966): 26%|β–ˆβ–ˆβ–‹ | 414/1563 [12:23<32:08, 1.68s/it] Training 1/1 epoch (loss 1.7174): 26%|β–ˆβ–ˆβ–‹ | 414/1563 [12:25<32:08, 1.68s/it] Training 1/1 epoch (loss 1.7174): 27%|β–ˆβ–ˆβ–‹ | 415/1563 [12:25<31:38, 1.65s/it] Training 1/1 epoch (loss 1.6653): 27%|β–ˆβ–ˆβ–‹ | 415/1563 [12:27<31:38, 1.65s/it] Training 1/1 epoch (loss 1.6653): 27%|β–ˆβ–ˆβ–‹ | 416/1563 [12:27<35:51, 1.88s/it] Training 1/1 epoch (loss 1.6916): 27%|β–ˆβ–ˆβ–‹ | 416/1563 [12:30<35:51, 1.88s/it] Training 1/1 epoch (loss 1.6916): 27%|β–ˆβ–ˆβ–‹ | 417/1563 [12:30<38:45, 2.03s/it] Training 1/1 epoch (loss 1.6259): 27%|β–ˆβ–ˆβ–‹ | 417/1563 [12:31<38:45, 2.03s/it] Training 1/1 epoch (loss 1.6259): 27%|β–ˆβ–ˆβ–‹ | 418/1563 [12:31<34:34, 1.81s/it] Training 1/1 epoch (loss 1.6287): 27%|β–ˆβ–ˆβ–‹ | 418/1563 [12:33<34:34, 1.81s/it] Training 1/1 epoch (loss 1.6287): 27%|β–ˆβ–ˆβ–‹ | 419/1563 [12:33<34:01, 1.78s/it] Training 1/1 epoch (loss 1.6775): 27%|β–ˆβ–ˆβ–‹ | 419/1563 [12:35<34:01, 1.78s/it] Training 1/1 epoch (loss 1.6775): 27%|β–ˆβ–ˆβ–‹ | 420/1563 [12:35<35:53, 1.88s/it] 
Training 1/1 epoch (loss 1.6762): 27%|β–ˆβ–ˆβ–‹ | 420/1563 [12:37<35:53, 1.88s/it] Training 1/1 epoch (loss 1.6762): 27%|β–ˆβ–ˆβ–‹ | 421/1563 [12:37<37:38, 1.98s/it] Training 1/1 epoch (loss 1.7498): 27%|β–ˆβ–ˆβ–‹ | 421/1563 [12:39<37:38, 1.98s/it] Training 1/1 epoch (loss 1.7498): 27%|β–ˆβ–ˆβ–‹ | 422/1563 [12:39<39:53, 2.10s/it] Training 1/1 epoch (loss 1.6542): 27%|β–ˆβ–ˆβ–‹ | 422/1563 [12:41<39:53, 2.10s/it] Training 1/1 epoch (loss 1.6542): 27%|β–ˆβ–ˆβ–‹ | 423/1563 [12:41<38:35, 2.03s/it] Training 1/1 epoch (loss 1.7717): 27%|β–ˆβ–ˆβ–‹ | 423/1563 [12:43<38:35, 2.03s/it] Training 1/1 epoch (loss 1.7717): 27%|β–ˆβ–ˆβ–‹ | 424/1563 [12:43<35:16, 1.86s/it] Training 1/1 epoch (loss 1.7090): 27%|β–ˆβ–ˆβ–‹ | 424/1563 [12:44<35:16, 1.86s/it] Training 1/1 epoch (loss 1.7090): 27%|β–ˆβ–ˆβ–‹ | 425/1563 [12:44<32:21, 1.71s/it] Training 1/1 epoch (loss 1.6180): 27%|β–ˆβ–ˆβ–‹ | 425/1563 [12:46<32:21, 1.71s/it] Training 1/1 epoch (loss 1.6180): 27%|β–ˆβ–ˆβ–‹ | 426/1563 [12:46<35:32, 1.88s/it] Training 1/1 epoch (loss 1.8252): 27%|β–ˆβ–ˆβ–‹ | 426/1563 [12:48<35:32, 1.88s/it] Training 1/1 epoch (loss 1.8252): 27%|β–ˆβ–ˆβ–‹ | 427/1563 [12:48<31:32, 1.67s/it] Training 1/1 epoch (loss 1.6115): 27%|β–ˆβ–ˆβ–‹ | 427/1563 [12:49<31:32, 1.67s/it] Training 1/1 epoch (loss 1.6115): 27%|β–ˆβ–ˆβ–‹ | 428/1563 [12:49<29:53, 1.58s/it] Training 1/1 epoch (loss 1.6155): 27%|β–ˆβ–ˆβ–‹ | 428/1563 [12:51<29:53, 1.58s/it] Training 1/1 epoch (loss 1.6155): 27%|β–ˆβ–ˆβ–‹ | 429/1563 [12:51<29:56, 1.58s/it] Training 1/1 epoch (loss 1.6452): 27%|β–ˆβ–ˆβ–‹ | 429/1563 [12:53<29:56, 1.58s/it] Training 1/1 epoch (loss 1.6452): 28%|β–ˆβ–ˆβ–Š | 430/1563 [12:53<31:51, 1.69s/it] Training 1/1 epoch (loss 1.6406): 28%|β–ˆβ–ˆβ–Š | 430/1563 [12:55<31:51, 1.69s/it] Training 1/1 epoch (loss 1.6406): 28%|β–ˆβ–ˆβ–Š | 431/1563 [12:55<34:51, 1.85s/it] Training 1/1 epoch (loss 1.7254): 28%|β–ˆβ–ˆβ–Š | 431/1563 [12:57<34:51, 1.85s/it] Training 1/1 epoch (loss 1.7254): 28%|β–ˆβ–ˆβ–Š | 432/1563 [12:57<35:06, 1.86s/it] 
Training 1/1 epoch (loss 1.6055): 28%|β–ˆβ–ˆβ–Š | 432/1563 [12:59<35:06, 1.86s/it] Training 1/1 epoch (loss 1.6055): 28%|β–ˆβ–ˆβ–Š | 433/1563 [12:59<38:49, 2.06s/it] Training 1/1 epoch (loss 1.6683): 28%|β–ˆβ–ˆβ–Š | 433/1563 [13:01<38:49, 2.06s/it] Training 1/1 epoch (loss 1.6683): 28%|β–ˆβ–ˆβ–Š | 434/1563 [13:01<36:24, 1.93s/it] Training 1/1 epoch (loss 1.7331): 28%|β–ˆβ–ˆβ–Š | 434/1563 [13:03<36:24, 1.93s/it] Training 1/1 epoch (loss 1.7331): 28%|β–ˆβ–ˆβ–Š | 435/1563 [13:03<39:37, 2.11s/it] Training 1/1 epoch (loss 1.7232): 28%|β–ˆβ–ˆβ–Š | 435/1563 [13:06<39:37, 2.11s/it] Training 1/1 epoch (loss 1.7232): 28%|β–ˆβ–ˆβ–Š | 436/1563 [13:06<41:35, 2.21s/it] Training 1/1 epoch (loss 1.6922): 28%|β–ˆβ–ˆβ–Š | 436/1563 [13:08<41:35, 2.21s/it] Training 1/1 epoch (loss 1.6922): 28%|β–ˆβ–ˆβ–Š | 437/1563 [13:08<40:08, 2.14s/it] Training 1/1 epoch (loss 1.7284): 28%|β–ˆβ–ˆβ–Š | 437/1563 [13:10<40:08, 2.14s/it] Training 1/1 epoch (loss 1.7284): 28%|β–ˆβ–ˆβ–Š | 438/1563 [13:10<40:38, 2.17s/it] Training 1/1 epoch (loss 1.6640): 28%|β–ˆβ–ˆβ–Š | 438/1563 [13:11<40:38, 2.17s/it] Training 1/1 epoch (loss 1.6640): 28%|β–ˆβ–ˆβ–Š | 439/1563 [13:11<35:03, 1.87s/it] Training 1/1 epoch (loss 1.7515): 28%|β–ˆβ–ˆβ–Š | 439/1563 [13:13<35:03, 1.87s/it] Training 1/1 epoch (loss 1.7515): 28%|β–ˆβ–ˆβ–Š | 440/1563 [13:13<37:08, 1.98s/it] Training 1/1 epoch (loss 1.6098): 28%|β–ˆβ–ˆβ–Š | 440/1563 [13:15<37:08, 1.98s/it] Training 1/1 epoch (loss 1.6098): 28%|β–ˆβ–ˆβ–Š | 441/1563 [13:15<33:47, 1.81s/it] Training 1/1 epoch (loss 1.6704): 28%|β–ˆβ–ˆβ–Š | 441/1563 [13:17<33:47, 1.81s/it] Training 1/1 epoch (loss 1.6704): 28%|β–ˆβ–ˆβ–Š | 442/1563 [13:17<34:38, 1.85s/it] Training 1/1 epoch (loss 1.7471): 28%|β–ˆβ–ˆβ–Š | 442/1563 [13:19<34:38, 1.85s/it] Training 1/1 epoch (loss 1.7471): 28%|β–ˆβ–ˆβ–Š | 443/1563 [13:19<34:43, 1.86s/it] Training 1/1 epoch (loss 1.6595): 28%|β–ˆβ–ˆβ–Š | 443/1563 [13:20<34:43, 1.86s/it] Training 1/1 epoch (loss 1.6595): 28%|β–ˆβ–ˆβ–Š | 444/1563 [13:20<30:40, 1.64s/it] 
Training 1/1 epoch (loss 1.6972): 28%|β–ˆβ–ˆβ–Š | 444/1563 [13:21<30:40, 1.64s/it] Training 1/1 epoch (loss 1.6972): 28%|β–ˆβ–ˆβ–Š | 445/1563 [13:21<28:42, 1.54s/it] Training 1/1 epoch (loss 1.6720): 28%|β–ˆβ–ˆβ–Š | 445/1563 [13:23<28:42, 1.54s/it] Training 1/1 epoch (loss 1.6720): 29%|β–ˆβ–ˆβ–Š | 446/1563 [13:23<29:50, 1.60s/it] Training 1/1 epoch (loss 1.6731): 29%|β–ˆβ–ˆβ–Š | 446/1563 [13:25<29:50, 1.60s/it] Training 1/1 epoch (loss 1.6731): 29%|β–ˆβ–ˆβ–Š | 447/1563 [13:25<31:10, 1.68s/it] Training 1/1 epoch (loss 1.7727): 29%|β–ˆβ–ˆβ–Š | 447/1563 [13:27<31:10, 1.68s/it] Training 1/1 epoch (loss 1.7727): 29%|β–ˆβ–ˆβ–Š | 448/1563 [13:27<35:38, 1.92s/it] Training 1/1 epoch (loss 1.6564): 29%|β–ˆβ–ˆβ–Š | 448/1563 [13:29<35:38, 1.92s/it] Training 1/1 epoch (loss 1.6564): 29%|β–ˆβ–ˆβ–Š | 449/1563 [13:29<32:41, 1.76s/it] Training 1/1 epoch (loss 1.5579): 29%|β–ˆβ–ˆβ–Š | 449/1563 [13:31<32:41, 1.76s/it] Training 1/1 epoch (loss 1.5579): 29%|β–ˆβ–ˆβ–‰ | 450/1563 [13:31<36:24, 1.96s/it] Training 1/1 epoch (loss 1.7021): 29%|β–ˆβ–ˆβ–‰ | 450/1563 [13:33<36:24, 1.96s/it] Training 1/1 epoch (loss 1.7021): 29%|β–ˆβ–ˆβ–‰ | 451/1563 [13:33<36:24, 1.96s/it] Training 1/1 epoch (loss 1.6070): 29%|β–ˆβ–ˆβ–‰ | 451/1563 [13:35<36:24, 1.96s/it] Training 1/1 epoch (loss 1.6070): 29%|β–ˆβ–ˆβ–‰ | 452/1563 [13:35<36:48, 1.99s/it] Training 1/1 epoch (loss 1.7773): 29%|β–ˆβ–ˆβ–‰ | 452/1563 [13:37<36:48, 1.99s/it] Training 1/1 epoch (loss 1.7773): 29%|β–ˆβ–ˆβ–‰ | 453/1563 [13:37<37:20, 2.02s/it] Training 1/1 epoch (loss 1.6552): 29%|β–ˆβ–ˆβ–‰ | 453/1563 [13:39<37:20, 2.02s/it] Training 1/1 epoch (loss 1.6552): 29%|β–ˆβ–ˆβ–‰ | 454/1563 [13:39<34:29, 1.87s/it] Training 1/1 epoch (loss 1.6731): 29%|β–ˆβ–ˆβ–‰ | 454/1563 [13:40<34:29, 1.87s/it] Training 1/1 epoch (loss 1.6731): 29%|β–ˆβ–ˆβ–‰ | 455/1563 [13:40<34:00, 1.84s/it] Training 1/1 epoch (loss 1.6809): 29%|β–ˆβ–ˆβ–‰ | 455/1563 [13:42<34:00, 1.84s/it] Training 1/1 epoch (loss 1.6809): 29%|β–ˆβ–ˆβ–‰ | 456/1563 [13:42<31:59, 1.73s/it] 
Training 1/1 epoch (loss 1.7290): 29%|β–ˆβ–ˆβ–‰ | 456/1563 [13:44<31:59, 1.73s/it] Training 1/1 epoch (loss 1.7290): 29%|β–ˆβ–ˆβ–‰ | 457/1563 [13:44<32:15, 1.75s/it] Training 1/1 epoch (loss 1.6374): 29%|β–ˆβ–ˆβ–‰ | 457/1563 [13:46<32:15, 1.75s/it] Training 1/1 epoch (loss 1.6374): 29%|β–ˆβ–ˆβ–‰ | 458/1563 [13:46<33:43, 1.83s/it] Training 1/1 epoch (loss 1.6059): 29%|β–ˆβ–ˆβ–‰ | 458/1563 [13:48<33:43, 1.83s/it] Training 1/1 epoch (loss 1.6059): 29%|β–ˆβ–ˆβ–‰ | 459/1563 [13:48<34:20, 1.87s/it] Training 1/1 epoch (loss 1.6880): 29%|β–ˆβ–ˆβ–‰ | 459/1563 [13:50<34:20, 1.87s/it] Training 1/1 epoch (loss 1.6880): 29%|β–ˆβ–ˆβ–‰ | 460/1563 [13:50<36:48, 2.00s/it] Training 1/1 epoch (loss 1.6848): 29%|β–ˆβ–ˆβ–‰ | 460/1563 [13:52<36:48, 2.00s/it] Training 1/1 epoch (loss 1.6848): 29%|β–ˆβ–ˆβ–‰ | 461/1563 [13:52<35:45, 1.95s/it] Training 1/1 epoch (loss 1.6991): 29%|β–ˆβ–ˆβ–‰ | 461/1563 [13:53<35:45, 1.95s/it] Training 1/1 epoch (loss 1.6991): 30%|β–ˆβ–ˆβ–‰ | 462/1563 [13:53<33:53, 1.85s/it] Training 1/1 epoch (loss 1.5997): 30%|β–ˆβ–ˆβ–‰ | 462/1563 [13:55<33:53, 1.85s/it] Training 1/1 epoch (loss 1.5997): 30%|β–ˆβ–ˆβ–‰ | 463/1563 [13:55<33:21, 1.82s/it] Training 1/1 epoch (loss 1.6830): 30%|β–ˆβ–ˆβ–‰ | 463/1563 [13:57<33:21, 1.82s/it] Training 1/1 epoch (loss 1.6830): 30%|β–ˆβ–ˆβ–‰ | 464/1563 [13:57<32:36, 1.78s/it] Training 1/1 epoch (loss 1.7197): 30%|β–ˆβ–ˆβ–‰ | 464/1563 [13:58<32:36, 1.78s/it] Training 1/1 epoch (loss 1.7197): 30%|β–ˆβ–ˆβ–‰ | 465/1563 [13:58<30:53, 1.69s/it] Training 1/1 epoch (loss 1.5927): 30%|β–ˆβ–ˆβ–‰ | 465/1563 [14:00<30:53, 1.69s/it] Training 1/1 epoch (loss 1.5927): 30%|β–ˆβ–ˆβ–‰ | 466/1563 [14:00<30:35, 1.67s/it] Training 1/1 epoch (loss 1.5880): 30%|β–ˆβ–ˆβ–‰ | 466/1563 [14:01<30:35, 1.67s/it] Training 1/1 epoch (loss 1.5880): 30%|β–ˆβ–ˆβ–‰ | 467/1563 [14:01<28:31, 1.56s/it] Training 1/1 epoch (loss 1.7718): 30%|β–ˆβ–ˆβ–‰ | 467/1563 [14:03<28:31, 1.56s/it] Training 1/1 epoch (loss 1.7718): 30%|β–ˆβ–ˆβ–‰ | 468/1563 [14:03<30:07, 1.65s/it] 
Training 1/1 epoch (loss 1.6699): 30%|β–ˆβ–ˆβ–‰ | 468/1563 [14:04<30:07, 1.65s/it] Training 1/1 epoch (loss 1.6699): 30%|β–ˆβ–ˆβ–ˆ | 469/1563 [14:04<25:47, 1.41s/it] Training 1/1 epoch (loss 1.6754): 30%|β–ˆβ–ˆβ–ˆ | 469/1563 [14:06<25:47, 1.41s/it] Training 1/1 epoch (loss 1.6754): 30%|β–ˆβ–ˆβ–ˆ | 470/1563 [14:06<31:09, 1.71s/it] Training 1/1 epoch (loss 1.7443): 30%|β–ˆβ–ˆβ–ˆ | 470/1563 [14:08<31:09, 1.71s/it] Training 1/1 epoch (loss 1.7443): 30%|β–ˆβ–ˆβ–ˆ | 471/1563 [14:08<29:43, 1.63s/it] Training 1/1 epoch (loss 1.5825): 30%|β–ˆβ–ˆβ–ˆ | 471/1563 [14:10<29:43, 1.63s/it] Training 1/1 epoch (loss 1.5825): 30%|β–ˆβ–ˆβ–ˆ | 472/1563 [14:10<31:30, 1.73s/it] Training 1/1 epoch (loss 1.7168): 30%|β–ˆβ–ˆβ–ˆ | 472/1563 [14:11<31:30, 1.73s/it] Training 1/1 epoch (loss 1.7168): 30%|β–ˆβ–ˆβ–ˆ | 473/1563 [14:11<31:00, 1.71s/it] Training 1/1 epoch (loss 1.6823): 30%|β–ˆβ–ˆβ–ˆ | 473/1563 [14:13<31:00, 1.71s/it] Training 1/1 epoch (loss 1.6823): 30%|β–ˆβ–ˆβ–ˆ | 474/1563 [14:13<31:38, 1.74s/it] Training 1/1 epoch (loss 1.7014): 30%|β–ˆβ–ˆβ–ˆ | 474/1563 [14:15<31:38, 1.74s/it] Training 1/1 epoch (loss 1.7014): 30%|β–ˆβ–ˆβ–ˆ | 475/1563 [14:15<29:27, 1.62s/it] Training 1/1 epoch (loss 1.7314): 30%|β–ˆβ–ˆβ–ˆ | 475/1563 [14:16<29:27, 1.62s/it] Training 1/1 epoch (loss 1.7314): 30%|β–ˆβ–ˆβ–ˆ | 476/1563 [14:16<29:37, 1.63s/it] Training 1/1 epoch (loss 1.6391): 30%|β–ˆβ–ˆβ–ˆ | 476/1563 [14:19<29:37, 1.63s/it] Training 1/1 epoch (loss 1.6391): 31%|β–ˆβ–ˆβ–ˆ | 477/1563 [14:19<33:12, 1.83s/it] Training 1/1 epoch (loss 1.6254): 31%|β–ˆβ–ˆβ–ˆ | 477/1563 [14:20<33:12, 1.83s/it] Training 1/1 epoch (loss 1.6254): 31%|β–ˆβ–ˆβ–ˆ | 478/1563 [14:20<32:45, 1.81s/it] Training 1/1 epoch (loss 1.7122): 31%|β–ˆβ–ˆβ–ˆ | 478/1563 [14:22<32:45, 1.81s/it] Training 1/1 epoch (loss 1.7122): 31%|β–ˆβ–ˆβ–ˆ | 479/1563 [14:22<34:03, 1.89s/it] Training 1/1 epoch (loss 1.6309): 31%|β–ˆβ–ˆβ–ˆ | 479/1563 [14:24<34:03, 1.89s/it] Training 1/1 epoch (loss 1.6309): 31%|β–ˆβ–ˆβ–ˆ | 480/1563 [14:24<34:58, 1.94s/it] 
Training 1/1 epoch (loss 1.6302): 31%|β–ˆβ–ˆβ–ˆ | 480/1563 [14:26<34:58, 1.94s/it] Training 1/1 epoch (loss 1.6302): 31%|β–ˆβ–ˆβ–ˆ | 481/1563 [14:26<34:44, 1.93s/it] Training 1/1 epoch (loss 1.7502): 31%|β–ˆβ–ˆβ–ˆ | 481/1563 [14:27<34:44, 1.93s/it] Training 1/1 epoch (loss 1.7502): 31%|β–ˆβ–ˆβ–ˆ | 482/1563 [14:27<29:12, 1.62s/it] Training 1/1 epoch (loss 1.6015): 31%|β–ˆβ–ˆβ–ˆ | 482/1563 [14:29<29:12, 1.62s/it] Training 1/1 epoch (loss 1.6015): 31%|β–ˆβ–ˆβ–ˆ | 483/1563 [14:29<31:23, 1.74s/it] Training 1/1 epoch (loss 1.6475): 31%|β–ˆβ–ˆβ–ˆ | 483/1563 [14:31<31:23, 1.74s/it] Training 1/1 epoch (loss 1.6475): 31%|β–ˆβ–ˆβ–ˆ | 484/1563 [14:31<30:36, 1.70s/it] Training 1/1 epoch (loss 1.6904): 31%|β–ˆβ–ˆβ–ˆ | 484/1563 [14:33<30:36, 1.70s/it] Training 1/1 epoch (loss 1.6904): 31%|β–ˆβ–ˆβ–ˆ | 485/1563 [14:33<31:07, 1.73s/it] Training 1/1 epoch (loss 1.7105): 31%|β–ˆβ–ˆβ–ˆ | 485/1563 [14:35<31:07, 1.73s/it] Training 1/1 epoch (loss 1.7105): 31%|β–ˆβ–ˆβ–ˆ | 486/1563 [14:35<34:36, 1.93s/it] Training 1/1 epoch (loss 1.7278): 31%|β–ˆβ–ˆβ–ˆ | 486/1563 [14:37<34:36, 1.93s/it] Training 1/1 epoch (loss 1.7278): 31%|β–ˆβ–ˆβ–ˆ | 487/1563 [14:37<33:36, 1.87s/it] Training 1/1 epoch (loss 1.7290): 31%|β–ˆβ–ˆβ–ˆ | 487/1563 [14:39<33:36, 1.87s/it] Training 1/1 epoch (loss 1.7290): 31%|β–ˆβ–ˆβ–ˆ | 488/1563 [14:39<34:48, 1.94s/it] Training 1/1 epoch (loss 1.5687): 31%|β–ˆβ–ˆβ–ˆ | 488/1563 [14:41<34:48, 1.94s/it] Training 1/1 epoch (loss 1.5687): 31%|β–ˆβ–ˆβ–ˆβ– | 489/1563 [14:41<34:45, 1.94s/it] Training 1/1 epoch (loss 1.6622): 31%|β–ˆβ–ˆβ–ˆβ– | 489/1563 [14:42<34:45, 1.94s/it] Training 1/1 epoch (loss 1.6622): 31%|β–ˆβ–ˆβ–ˆβ– | 490/1563 [14:42<31:40, 1.77s/it] Training 1/1 epoch (loss 1.6398): 31%|β–ˆβ–ˆβ–ˆβ– | 490/1563 [14:44<31:40, 1.77s/it] Training 1/1 epoch (loss 1.6398): 31%|β–ˆβ–ˆβ–ˆβ– | 491/1563 [14:44<32:57, 1.84s/it] Training 1/1 epoch (loss 1.6745): 31%|β–ˆβ–ˆβ–ˆβ– | 491/1563 [14:46<32:57, 1.84s/it] Training 1/1 epoch (loss 1.6745): 31%|β–ˆβ–ˆβ–ˆβ– | 492/1563 
[14:46<32:31, 1.82s/it] Training 1/1 epoch (loss 1.5738): 31%|β–ˆβ–ˆβ–ˆβ– | 492/1563 [14:47<32:31, 1.82s/it] Training 1/1 epoch (loss 1.5738): 32%|β–ˆβ–ˆβ–ˆβ– | 493/1563 [14:47<29:22, 1.65s/it] Training 1/1 epoch (loss 1.6404): 32%|β–ˆβ–ˆβ–ˆβ– | 493/1563 [14:49<29:22, 1.65s/it] Training 1/1 epoch (loss 1.6404): 32%|β–ˆβ–ˆβ–ˆβ– | 494/1563 [14:49<29:22, 1.65s/it] Training 1/1 epoch (loss 1.6904): 32%|β–ˆβ–ˆβ–ˆβ– | 494/1563 [14:50<29:22, 1.65s/it] Training 1/1 epoch (loss 1.6904): 32%|β–ˆβ–ˆβ–ˆβ– | 495/1563 [14:50<29:01, 1.63s/it] Training 1/1 epoch (loss 1.7043): 32%|β–ˆβ–ˆβ–ˆβ– | 495/1563 [14:53<29:01, 1.63s/it] Training 1/1 epoch (loss 1.7043): 32%|β–ˆβ–ˆβ–ˆβ– | 496/1563 [14:53<33:52, 1.90s/it] Training 1/1 epoch (loss 1.6117): 32%|β–ˆβ–ˆβ–ˆβ– | 496/1563 [14:55<33:52, 1.90s/it] Training 1/1 epoch (loss 1.6117): 32%|β–ˆβ–ˆβ–ˆβ– | 497/1563 [14:55<33:41, 1.90s/it] Training 1/1 epoch (loss 1.7268): 32%|β–ˆβ–ˆβ–ˆβ– | 497/1563 [14:56<33:41, 1.90s/it] Training 1/1 epoch (loss 1.7268): 32%|β–ˆβ–ˆβ–ˆβ– | 498/1563 [14:56<31:00, 1.75s/it] Training 1/1 epoch (loss 1.6703): 32%|β–ˆβ–ˆβ–ˆβ– | 498/1563 [14:58<31:00, 1.75s/it] Training 1/1 epoch (loss 1.6703): 32%|β–ˆβ–ˆβ–ˆβ– | 499/1563 [14:58<30:45, 1.73s/it] Training 1/1 epoch (loss 1.6517): 32%|β–ˆβ–ˆβ–ˆβ– | 499/1563 [14:59<30:45, 1.73s/it] Training 1/1 epoch (loss 1.6517): 32%|β–ˆβ–ˆβ–ˆβ– | 500/1563 [14:59<27:53, 1.57s/it] Training 1/1 epoch (loss 1.7144): 32%|β–ˆβ–ˆβ–ˆβ– | 500/1563 [15:02<27:53, 1.57s/it] Training 1/1 epoch (loss 1.7144): 32%|β–ˆβ–ˆβ–ˆβ– | 501/1563 [15:02<32:06, 1.81s/it] Training 1/1 epoch (loss 1.6774): 32%|β–ˆβ–ˆβ–ˆβ– | 501/1563 [15:03<32:06, 1.81s/it] Training 1/1 epoch (loss 1.6774): 32%|β–ˆβ–ˆβ–ˆβ– | 502/1563 [15:03<31:45, 1.80s/it] Training 1/1 epoch (loss 1.7676): 32%|β–ˆβ–ˆβ–ˆβ– | 502/1563 [15:05<31:45, 1.80s/it] Training 1/1 epoch (loss 1.7676): 32%|β–ˆβ–ˆβ–ˆβ– | 503/1563 [15:05<32:30, 1.84s/it] Training 1/1 epoch (loss 1.5886): 32%|β–ˆβ–ˆβ–ˆβ– | 503/1563 [15:08<32:30, 
1.84s/it] Training 1/1 epoch (loss 1.5886): 32%|β–ˆβ–ˆβ–ˆβ– | 504/1563 [15:08<35:42, 2.02s/it] Training 1/1 epoch (loss 1.6707): 32%|β–ˆβ–ˆβ–ˆβ– | 504/1563 [15:09<35:42, 2.02s/it] Training 1/1 epoch (loss 1.6707): 32%|β–ˆβ–ˆβ–ˆβ– | 505/1563 [15:09<32:49, 1.86s/it] Training 1/1 epoch (loss 1.6898): 32%|β–ˆβ–ˆβ–ˆβ– | 505/1563 [15:11<32:49, 1.86s/it] Training 1/1 epoch (loss 1.6898): 32%|β–ˆβ–ˆβ–ˆβ– | 506/1563 [15:11<31:35, 1.79s/it] Training 1/1 epoch (loss 1.6758): 32%|β–ˆβ–ˆβ–ˆβ– | 506/1563 [15:13<31:35, 1.79s/it] Training 1/1 epoch (loss 1.6758): 32%|β–ˆβ–ˆβ–ˆβ– | 507/1563 [15:13<31:08, 1.77s/it] Training 1/1 epoch (loss 1.6800): 32%|β–ˆβ–ˆβ–ˆβ– | 507/1563 [15:15<31:08, 1.77s/it] Training 1/1 epoch (loss 1.6800): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 508/1563 [15:15<32:20, 1.84s/it] Training 1/1 epoch (loss 1.6402): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 508/1563 [15:16<32:20, 1.84s/it] Training 1/1 epoch (loss 1.6402): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 509/1563 [15:16<29:46, 1.69s/it] Training 1/1 epoch (loss 1.7250): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 509/1563 [15:18<29:46, 1.69s/it] Training 1/1 epoch (loss 1.7250): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 510/1563 [15:18<34:03, 1.94s/it] Training 1/1 epoch (loss 1.5475): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 510/1563 [15:20<34:03, 1.94s/it] Training 1/1 epoch (loss 1.5475): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 511/1563 [15:20<32:41, 1.86s/it] Training 1/1 epoch (loss 1.6717): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 511/1563 [15:23<32:41, 1.86s/it] Training 1/1 epoch (loss 1.6717): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 512/1563 [15:23<37:17, 2.13s/it] Training 1/1 epoch (loss 1.6228): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 512/1563 [15:25<37:17, 2.13s/it] Training 1/1 epoch (loss 1.6228): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 513/1563 [15:25<34:45, 1.99s/it] Training 1/1 epoch (loss 1.7268): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 513/1563 [15:26<34:45, 1.99s/it] Training 1/1 epoch (loss 1.7268): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 514/1563 [15:26<31:49, 1.82s/it] Training 1/1 epoch (loss 1.7089): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 514/1563 [15:27<31:49, 1.82s/it] Training 1/1 epoch (loss 1.7089): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 515/1563 [15:27<29:14, 1.67s/it] Training 
1/1 epoch (loss 1.7020): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 515/1563 [15:29<29:14, 1.67s/it] Training 1/1 epoch (loss 1.7020): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 516/1563 [15:29<27:15, 1.56s/it] Training 1/1 epoch (loss 1.6322): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 516/1563 [15:30<27:15, 1.56s/it] Training 1/1 epoch (loss 1.6322): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 517/1563 [15:30<27:01, 1.55s/it] Training 1/1 epoch (loss 1.5772): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 517/1563 [15:32<27:01, 1.55s/it] Training 1/1 epoch (loss 1.5772): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 518/1563 [15:32<29:53, 1.72s/it] Training 1/1 epoch (loss 1.4700): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 518/1563 [15:34<29:53, 1.72s/it] Training 1/1 epoch (loss 1.4700): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 519/1563 [15:34<31:40, 1.82s/it] Training 1/1 epoch (loss 1.5929): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 519/1563 [15:36<31:40, 1.82s/it] Training 1/1 epoch (loss 1.5929): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 520/1563 [15:36<28:49, 1.66s/it] Training 1/1 epoch (loss 1.7207): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 520/1563 [15:37<28:49, 1.66s/it] Training 1/1 epoch (loss 1.7207): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 521/1563 [15:37<29:23, 1.69s/it] Training 1/1 epoch (loss 1.6859): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 521/1563 [15:39<29:23, 1.69s/it] Training 1/1 epoch (loss 1.6859): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 522/1563 [15:39<27:24, 1.58s/it] Training 1/1 epoch (loss 1.6897): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 522/1563 [15:41<27:24, 1.58s/it] Training 1/1 epoch (loss 1.6897): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 523/1563 [15:41<29:29, 1.70s/it] Training 1/1 epoch (loss 1.6348): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 523/1563 [15:42<29:29, 1.70s/it] Training 1/1 epoch (loss 1.6348): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 524/1563 [15:42<27:35, 1.59s/it] Training 1/1 epoch (loss 1.6418): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 524/1563 [15:44<27:35, 1.59s/it] Training 1/1 epoch (loss 1.6418): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 525/1563 [15:44<31:39, 1.83s/it] Training 1/1 epoch (loss 1.6402): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 525/1563 [15:47<31:39, 1.83s/it] Training 1/1 epoch (loss 1.6402): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 526/1563 [15:47<34:08, 1.97s/it] Training 1/1 epoch (loss 1.6180): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 526/1563 [15:48<34:08, 1.97s/it] Training 1/1 epoch (loss 
1.6180): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 527/1563 [15:48<32:52, 1.90s/it] Training 1/1 epoch (loss 1.7272): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 527/1563 [15:50<32:52, 1.90s/it] Training 1/1 epoch (loss 1.7272): 34%|β–ˆβ–ˆβ–ˆβ– | 528/1563 [15:50<31:36, 1.83s/it] Training 1/1 epoch (loss 1.7581): 34%|β–ˆβ–ˆβ–ˆβ– | 528/1563 [15:51<31:36, 1.83s/it] Training 1/1 epoch (loss 1.7581): 34%|β–ˆβ–ˆβ–ˆβ– | 529/1563 [15:51<28:54, 1.68s/it] Training 1/1 epoch (loss 1.6197): 34%|β–ˆβ–ˆβ–ˆβ– | 529/1563 [15:53<28:54, 1.68s/it] Training 1/1 epoch (loss 1.6197): 34%|β–ˆβ–ˆβ–ˆβ– | 530/1563 [15:53<30:11, 1.75s/it] Training 1/1 epoch (loss 1.6270): 34%|β–ˆβ–ˆβ–ˆβ– | 530/1563 [15:55<30:11, 1.75s/it] Training 1/1 epoch (loss 1.6270): 34%|β–ˆβ–ˆβ–ˆβ– | 531/1563 [15:55<29:14, 1.70s/it] Training 1/1 epoch (loss 1.6287): 34%|β–ˆβ–ˆβ–ˆβ– | 531/1563 [15:57<29:14, 1.70s/it] Training 1/1 epoch (loss 1.6287): 34%|β–ˆβ–ˆβ–ˆβ– | 532/1563 [15:57<33:19, 1.94s/it] Training 1/1 epoch (loss 1.6336): 34%|β–ˆβ–ˆβ–ˆβ– | 532/1563 [15:59<33:19, 1.94s/it] Training 1/1 epoch (loss 1.6336): 34%|β–ˆβ–ˆβ–ˆβ– | 533/1563 [15:59<33:18, 1.94s/it] Training 1/1 epoch (loss 1.6294): 34%|β–ˆβ–ˆβ–ˆβ– | 533/1563 [16:01<33:18, 1.94s/it] Training 1/1 epoch (loss 1.6294): 34%|β–ˆβ–ˆβ–ˆβ– | 534/1563 [16:01<30:13, 1.76s/it] Training 1/1 epoch (loss 1.5874): 34%|β–ˆβ–ˆβ–ˆβ– | 534/1563 [16:02<30:13, 1.76s/it] Training 1/1 epoch (loss 1.5874): 34%|β–ˆβ–ˆβ–ˆβ– | 535/1563 [16:02<28:42, 1.68s/it] Training 1/1 epoch (loss 1.7184): 34%|β–ˆβ–ˆβ–ˆβ– | 535/1563 [16:04<28:42, 1.68s/it] Training 1/1 epoch (loss 1.7184): 34%|β–ˆβ–ˆβ–ˆβ– | 536/1563 [16:04<29:44, 1.74s/it] Training 1/1 epoch (loss 1.6159): 34%|β–ˆβ–ˆβ–ˆβ– | 536/1563 [16:06<29:44, 1.74s/it] Training 1/1 epoch (loss 1.6159): 34%|β–ˆβ–ˆβ–ˆβ– | 537/1563 [16:06<30:34, 1.79s/it] Training 1/1 epoch (loss 1.6185): 34%|β–ˆβ–ˆβ–ˆβ– | 537/1563 [16:08<30:34, 1.79s/it] Training 1/1 epoch (loss 1.6185): 34%|β–ˆβ–ˆβ–ˆβ– | 538/1563 [16:08<29:25, 1.72s/it] Training 1/1 epoch (loss 1.6777): 
34%|β–ˆβ–ˆβ–ˆβ– | 538/1563 [16:09<29:25, 1.72s/it] Training 1/1 epoch (loss 1.6777): 34%|β–ˆβ–ˆβ–ˆβ– | 539/1563 [16:09<28:52, 1.69s/it] Training 1/1 epoch (loss 1.6430): 34%|β–ˆβ–ˆβ–ˆβ– | 539/1563 [16:10<28:52, 1.69s/it] Training 1/1 epoch (loss 1.6430): 35%|β–ˆβ–ˆβ–ˆβ– | 540/1563 [16:10<27:09, 1.59s/it] Training 1/1 epoch (loss 1.6507): 35%|β–ˆβ–ˆβ–ˆβ– | 540/1563 [16:13<27:09, 1.59s/it] Training 1/1 epoch (loss 1.6507): 35%|β–ˆβ–ˆβ–ˆβ– | 541/1563 [16:13<30:59, 1.82s/it] Training 1/1 epoch (loss 1.6875): 35%|β–ˆβ–ˆβ–ˆβ– | 541/1563 [16:15<30:59, 1.82s/it] Training 1/1 epoch (loss 1.6875): 35%|β–ˆβ–ˆβ–ˆβ– | 542/1563 [16:15<30:18, 1.78s/it] Training 1/1 epoch (loss 1.6306): 35%|β–ˆβ–ˆβ–ˆβ– | 542/1563 [16:16<30:18, 1.78s/it] Training 1/1 epoch (loss 1.6306): 35%|β–ˆβ–ˆβ–ˆβ– | 543/1563 [16:16<27:44, 1.63s/it] Training 1/1 epoch (loss 1.5474): 35%|β–ˆβ–ˆβ–ˆβ– | 543/1563 [16:18<27:44, 1.63s/it] Training 1/1 epoch (loss 1.5474): 35%|β–ˆβ–ˆβ–ˆβ– | 544/1563 [16:18<28:59, 1.71s/it] Training 1/1 epoch (loss 1.6066): 35%|β–ˆβ–ˆβ–ˆβ– | 544/1563 [16:20<28:59, 1.71s/it] Training 1/1 epoch (loss 1.6066): 35%|β–ˆβ–ˆβ–ˆβ– | 545/1563 [16:20<31:56, 1.88s/it] Training 1/1 epoch (loss 1.6449): 35%|β–ˆβ–ˆβ–ˆβ– | 545/1563 [16:22<31:56, 1.88s/it] Training 1/1 epoch (loss 1.6449): 35%|β–ˆβ–ˆβ–ˆβ– | 546/1563 [16:22<31:04, 1.83s/it] Training 1/1 epoch (loss 1.7802): 35%|β–ˆβ–ˆβ–ˆβ– | 546/1563 [16:23<31:04, 1.83s/it] Training 1/1 epoch (loss 1.7802): 35%|β–ˆβ–ˆβ–ˆβ– | 547/1563 [16:23<29:49, 1.76s/it] Training 1/1 epoch (loss 1.6464): 35%|β–ˆβ–ˆβ–ˆβ– | 547/1563 [16:24<29:49, 1.76s/it] Training 1/1 epoch (loss 1.6464): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 548/1563 [16:24<25:33, 1.51s/it] Training 1/1 epoch (loss 1.6260): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 548/1563 [16:26<25:33, 1.51s/it] Training 1/1 epoch (loss 1.6260): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 549/1563 [16:26<28:39, 1.70s/it] Training 1/1 epoch (loss 1.6696): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 549/1563 [16:28<28:39, 1.70s/it] Training 1/1 epoch (loss 1.6696): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 
550/1563 [16:28<28:51, 1.71s/it] Training 1/1 epoch (loss 1.6346): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 550/1563 [16:30<28:51, 1.71s/it] Training 1/1 epoch (loss 1.6346): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 551/1563 [16:30<30:07, 1.79s/it] Training 1/1 epoch (loss 1.7106): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 551/1563 [16:32<30:07, 1.79s/it] Training 1/1 epoch (loss 1.7106): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 552/1563 [16:32<29:16, 1.74s/it] Training 1/1 epoch (loss 1.7382): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 552/1563 [16:33<29:16, 1.74s/it] Training 1/1 epoch (loss 1.7382): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 553/1563 [16:33<29:30, 1.75s/it] Training 1/1 epoch (loss 1.7441): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 553/1563 [16:35<29:30, 1.75s/it] Training 1/1 epoch (loss 1.7441): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 554/1563 [16:35<30:12, 1.80s/it] Training 1/1 epoch (loss 1.7252): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 554/1563 [16:37<30:12, 1.80s/it] Training 1/1 epoch (loss 1.7252): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 555/1563 [16:37<29:23, 1.75s/it] Training 1/1 epoch (loss 1.7044): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 555/1563 [16:38<29:23, 1.75s/it] Training 1/1 epoch (loss 1.7044): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 556/1563 [16:38<27:11, 1.62s/it] Training 1/1 epoch (loss 1.6913): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 556/1563 [16:41<27:11, 1.62s/it] Training 1/1 epoch (loss 1.6913): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 557/1563 [16:41<30:55, 1.84s/it] Training 1/1 epoch (loss 1.7591): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 557/1563 [16:42<30:55, 1.84s/it] Training 1/1 epoch (loss 1.7591): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 558/1563 [16:42<29:02, 1.73s/it] Training 1/1 epoch (loss 1.6993): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 558/1563 [16:45<29:02, 1.73s/it] Training 1/1 epoch (loss 1.6993): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 559/1563 [16:45<32:12, 1.92s/it] Training 1/1 epoch (loss 1.6400): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 559/1563 [16:47<32:12, 1.92s/it] Training 1/1 epoch (loss 1.6400): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 560/1563 [16:47<33:17, 1.99s/it] Training 1/1 epoch (loss 1.7200): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 560/1563 [16:49<33:17, 1.99s/it] Training 1/1 epoch (loss 1.7200): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 561/1563 [16:49<32:34, 1.95s/it] Training 1/1 epoch (loss 1.6664): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 561/1563 
[16:50<32:34, 1.95s/it] Training 1/1 epoch (loss 1.6664): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 562/1563 [16:50<30:43, 1.84s/it] Training 1/1 epoch (loss 1.6705): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 562/1563 [16:51<30:43, 1.84s/it] Training 1/1 epoch (loss 1.6705): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 563/1563 [16:51<25:33, 1.53s/it] Training 1/1 epoch (loss 1.6435): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 563/1563 [16:52<25:33, 1.53s/it] Training 1/1 epoch (loss 1.6435): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 564/1563 [16:52<23:30, 1.41s/it] Training 1/1 epoch (loss 1.5745): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 564/1563 [16:54<23:30, 1.41s/it] Training 1/1 epoch (loss 1.5745): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 565/1563 [16:54<26:26, 1.59s/it] Training 1/1 epoch (loss 1.6493): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 565/1563 [16:56<26:26, 1.59s/it] Training 1/1 epoch (loss 1.6493): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 566/1563 [16:56<25:41, 1.55s/it] Training 1/1 epoch (loss 1.7000): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 566/1563 [16:57<25:41, 1.55s/it] Training 1/1 epoch (loss 1.7000): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 567/1563 [16:57<25:38, 1.54s/it] Training 1/1 epoch (loss 1.6056): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 567/1563 [16:58<25:38, 1.54s/it] Training 1/1 epoch (loss 1.6056): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 568/1563 [16:58<24:43, 1.49s/it] Training 1/1 epoch (loss 1.5927): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 568/1563 [17:00<24:43, 1.49s/it] Training 1/1 epoch (loss 1.5927): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 569/1563 [17:00<26:32, 1.60s/it] Training 1/1 epoch (loss 1.6927): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 569/1563 [17:02<26:32, 1.60s/it] Training 1/1 epoch (loss 1.6927): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 570/1563 [17:02<27:12, 1.64s/it] Training 1/1 epoch (loss 1.5356): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 570/1563 [17:03<27:12, 1.64s/it] Training 1/1 epoch (loss 1.5356): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 571/1563 [17:03<23:29, 1.42s/it] Training 1/1 epoch (loss 1.6514): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 571/1563 [17:04<23:29, 1.42s/it] Training 1/1 epoch (loss 1.6514): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 572/1563 [17:04<23:49, 1.44s/it] Training 1/1 epoch (loss 1.6676): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 572/1563 [17:07<23:49, 1.44s/it] Training 1/1 epoch (loss 1.6676): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 573/1563 [17:07<28:37, 
1.73s/it] Training 1/1 epoch (loss 1.5659): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 573/1563 [17:08<28:37, 1.73s/it] Training 1/1 epoch (loss 1.5659): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 574/1563 [17:08<27:53, 1.69s/it] Training 1/1 epoch (loss 1.7238): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 574/1563 [17:10<27:53, 1.69s/it] Training 1/1 epoch (loss 1.7238): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 575/1563 [17:10<28:06, 1.71s/it] Training 1/1 epoch (loss 1.6032): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 575/1563 [17:12<28:06, 1.71s/it] Training 1/1 epoch (loss 1.6032): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 576/1563 [17:12<27:00, 1.64s/it] Training 1/1 epoch (loss 1.4503): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 576/1563 [17:14<27:00, 1.64s/it] Training 1/1 epoch (loss 1.4503): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 577/1563 [17:14<28:46, 1.75s/it] Training 1/1 epoch (loss 1.6514): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 577/1563 [17:15<28:46, 1.75s/it] Training 1/1 epoch (loss 1.6514): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 578/1563 [17:15<29:00, 1.77s/it] Training 1/1 epoch (loss 1.5911): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 578/1563 [17:17<29:00, 1.77s/it] Training 1/1 epoch (loss 1.5911): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 579/1563 [17:17<27:54, 1.70s/it] Training 1/1 epoch (loss 1.6169): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 579/1563 [17:19<27:54, 1.70s/it] Training 1/1 epoch (loss 1.6169): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 580/1563 [17:19<26:46, 1.63s/it] Training 1/1 epoch (loss 1.6920): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 580/1563 [17:20<26:46, 1.63s/it] Training 1/1 epoch (loss 1.6920): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 581/1563 [17:20<24:16, 1.48s/it] Training 1/1 epoch (loss 1.7449): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 581/1563 [17:22<24:16, 1.48s/it] Training 1/1 epoch (loss 1.7449): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 582/1563 [17:22<28:16, 1.73s/it] Training 1/1 epoch (loss 1.6263): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 582/1563 [17:23<28:16, 1.73s/it] Training 1/1 epoch (loss 1.6263): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 583/1563 [17:23<27:18, 1.67s/it] Training 1/1 epoch (loss 1.7575): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 583/1563 [17:25<27:18, 1.67s/it] Training 1/1 epoch (loss 1.7575): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 584/1563 [17:25<27:01, 1.66s/it] Training 1/1 epoch (loss 1.6094): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 584/1563 [17:27<27:01, 1.66s/it] Training 
1/1 epoch (loss 1.6094): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 585/1563 [17:27<27:44, 1.70s/it] Training 1/1 epoch (loss 1.6487): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 585/1563 [17:29<27:44, 1.70s/it] Training 1/1 epoch (loss 1.6487): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 586/1563 [17:29<28:32, 1.75s/it] Training 1/1 epoch (loss 1.6612): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 586/1563 [17:30<28:32, 1.75s/it] Training 1/1 epoch (loss 1.6612): 38%|β–ˆβ–ˆβ–ˆβ–Š | 587/1563 [17:30<25:36, 1.57s/it] Training 1/1 epoch (loss 1.7406): 38%|β–ˆβ–ˆβ–ˆβ–Š | 587/1563 [17:32<25:36, 1.57s/it] Training 1/1 epoch (loss 1.7406): 38%|β–ˆβ–ˆβ–ˆβ–Š | 588/1563 [17:32<28:29, 1.75s/it] Training 1/1 epoch (loss 1.6478): 38%|β–ˆβ–ˆβ–ˆβ–Š | 588/1563 [17:33<28:29, 1.75s/it] Training 1/1 epoch (loss 1.6478): 38%|β–ˆβ–ˆβ–ˆβ–Š | 589/1563 [17:33<25:53, 1.59s/it] Training 1/1 epoch (loss 1.6431): 38%|β–ˆβ–ˆβ–ˆβ–Š | 589/1563 [17:35<25:53, 1.59s/it] Training 1/1 epoch (loss 1.6431): 38%|β–ˆβ–ˆβ–ˆβ–Š | 590/1563 [17:35<26:41, 1.65s/it] Training 1/1 epoch (loss 1.6039): 38%|β–ˆβ–ˆβ–ˆβ–Š | 590/1563 [17:37<26:41, 1.65s/it] Training 1/1 epoch (loss 1.6039): 38%|β–ˆβ–ˆβ–ˆβ–Š | 591/1563 [17:37<26:31, 1.64s/it] Training 1/1 epoch (loss 1.5788): 38%|β–ˆβ–ˆβ–ˆβ–Š | 591/1563 [17:39<26:31, 1.64s/it] Training 1/1 epoch (loss 1.5788): 38%|β–ˆβ–ˆβ–ˆβ–Š | 592/1563 [17:39<29:05, 1.80s/it] Training 1/1 epoch (loss 1.6024): 38%|β–ˆβ–ˆβ–ˆβ–Š | 592/1563 [17:40<29:05, 1.80s/it] Training 1/1 epoch (loss 1.6024): 38%|β–ˆβ–ˆβ–ˆβ–Š | 593/1563 [17:40<25:14, 1.56s/it] Training 1/1 epoch (loss 1.6676): 38%|β–ˆβ–ˆβ–ˆβ–Š | 593/1563 [17:42<25:14, 1.56s/it] Training 1/1 epoch (loss 1.6676): 38%|β–ˆβ–ˆβ–ˆβ–Š | 594/1563 [17:42<26:48, 1.66s/it] Training 1/1 epoch (loss 1.7127): 38%|β–ˆβ–ˆβ–ˆβ–Š | 594/1563 [17:44<26:48, 1.66s/it] Training 1/1 epoch (loss 1.7127): 38%|β–ˆβ–ˆβ–ˆβ–Š | 595/1563 [17:44<27:18, 1.69s/it] Training 1/1 epoch (loss 1.7681): 38%|β–ˆβ–ˆβ–ˆβ–Š | 595/1563 [17:44<27:18, 1.69s/it] Training 1/1 epoch (loss 1.7681): 38%|β–ˆβ–ˆβ–ˆβ–Š | 596/1563 [17:44<23:02, 1.43s/it] Training 1/1 epoch (loss 
1.6559): 38%|β–ˆβ–ˆβ–ˆβ–Š | 596/1563 [17:45<23:02, 1.43s/it] Training 1/1 epoch (loss 1.6559): 38%|β–ˆβ–ˆβ–ˆβ–Š | 597/1563 [17:45<19:58, 1.24s/it] Training 1/1 epoch (loss 1.7627): 38%|β–ˆβ–ˆβ–ˆβ–Š | 597/1563 [17:47<19:58, 1.24s/it] Training 1/1 epoch (loss 1.7627): 38%|β–ˆβ–ˆβ–ˆβ–Š | 598/1563 [17:47<22:54, 1.42s/it] Training 1/1 epoch (loss 1.6391): 38%|β–ˆβ–ˆβ–ˆβ–Š | 598/1563 [17:49<22:54, 1.42s/it] Training 1/1 epoch (loss 1.6391): 38%|β–ˆβ–ˆβ–ˆβ–Š | 599/1563 [17:49<25:54, 1.61s/it] Training 1/1 epoch (loss 1.7545): 38%|β–ˆβ–ˆβ–ˆβ–Š | 599/1563 [17:51<25:54, 1.61s/it] Training 1/1 epoch (loss 1.7545): 38%|β–ˆβ–ˆβ–ˆβ–Š | 600/1563 [17:51<28:51, 1.80s/it] Training 1/1 epoch (loss 1.6787): 38%|β–ˆβ–ˆβ–ˆβ–Š | 600/1563 [17:52<28:51, 1.80s/it] Training 1/1 epoch (loss 1.6787): 38%|β–ˆβ–ˆβ–ˆβ–Š | 601/1563 [17:52<24:45, 1.54s/it] Training 1/1 epoch (loss 1.6840): 38%|β–ˆβ–ˆβ–ˆβ–Š | 601/1563 [17:53<24:45, 1.54s/it] Training 1/1 epoch (loss 1.6840): 39%|β–ˆβ–ˆβ–ˆβ–Š | 602/1563 [17:53<22:43, 1.42s/it] Training 1/1 epoch (loss 1.6116): 39%|β–ˆβ–ˆβ–ˆβ–Š | 602/1563 [17:56<22:43, 1.42s/it] Training 1/1 epoch (loss 1.6116): 39%|β–ˆβ–ˆβ–ˆβ–Š | 603/1563 [17:56<27:33, 1.72s/it] Training 1/1 epoch (loss 1.5460): 39%|β–ˆβ–ˆβ–ˆβ–Š | 603/1563 [17:58<27:33, 1.72s/it] Training 1/1 epoch (loss 1.5460): 39%|β–ˆβ–ˆβ–ˆβ–Š | 604/1563 [17:58<28:39, 1.79s/it] Training 1/1 epoch (loss 1.6729): 39%|β–ˆβ–ˆβ–ˆβ–Š | 604/1563 [18:00<28:39, 1.79s/it] Training 1/1 epoch (loss 1.6729): 39%|β–ˆβ–ˆβ–ˆβ–Š | 605/1563 [18:00<29:13, 1.83s/it] Training 1/1 epoch (loss 1.6821): 39%|β–ˆβ–ˆβ–ˆβ–Š | 605/1563 [18:01<29:13, 1.83s/it] Training 1/1 epoch (loss 1.6821): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 606/1563 [18:01<27:37, 1.73s/it] Training 1/1 epoch (loss 1.5663): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 606/1563 [18:04<27:37, 1.73s/it] Training 1/1 epoch (loss 1.5663): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 607/1563 [18:04<31:18, 1.97s/it] Training 1/1 epoch (loss 1.5875): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 607/1563 [18:05<31:18, 1.97s/it] Training 1/1 epoch (loss 1.5875): 
39%|β–ˆβ–ˆβ–ˆβ–‰ | 608/1563 [18:05<28:31, 1.79s/it] Training 1/1 epoch (loss 1.6572): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 608/1563 [18:08<28:31, 1.79s/it] Training 1/1 epoch (loss 1.6572): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 609/1563 [18:08<31:55, 2.01s/it] Training 1/1 epoch (loss 1.6295): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 609/1563 [18:09<31:55, 2.01s/it] Training 1/1 epoch (loss 1.6295): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 610/1563 [18:09<30:43, 1.93s/it] Training 1/1 epoch (loss 1.7477): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 610/1563 [18:11<30:43, 1.93s/it] Training 1/1 epoch (loss 1.7477): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 611/1563 [18:11<28:47, 1.81s/it] Training 1/1 epoch (loss 1.6639): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 611/1563 [18:13<28:47, 1.81s/it] Training 1/1 epoch (loss 1.6639): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 612/1563 [18:13<28:01, 1.77s/it] Training 1/1 epoch (loss 1.6757): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 612/1563 [18:15<28:01, 1.77s/it] Training 1/1 epoch (loss 1.6757): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 613/1563 [18:15<30:40, 1.94s/it] Training 1/1 epoch (loss 1.6206): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 613/1563 [18:16<30:40, 1.94s/it] Training 1/1 epoch (loss 1.6206): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 614/1563 [18:16<27:22, 1.73s/it] Training 1/1 epoch (loss 1.6931): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 614/1563 [18:18<27:22, 1.73s/it] Training 1/1 epoch (loss 1.6931): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 615/1563 [18:18<29:39, 1.88s/it] Training 1/1 epoch (loss 1.6204): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 615/1563 [18:20<29:39, 1.88s/it] Training 1/1 epoch (loss 1.6204): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 616/1563 [18:20<30:30, 1.93s/it] Training 1/1 epoch (loss 1.7472): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 616/1563 [18:22<30:30, 1.93s/it] Training 1/1 epoch (loss 1.7472): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 617/1563 [18:22<29:41, 1.88s/it] Training 1/1 epoch (loss 1.5807): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 617/1563 [18:23<29:41, 1.88s/it] Training 1/1 epoch (loss 1.5807): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 618/1563 [18:23<25:57, 1.65s/it] Training 1/1 epoch (loss 1.7197): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 618/1563 [18:25<25:57, 1.65s/it] Training 1/1 epoch (loss 1.7197): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 619/1563 [18:25<27:52, 1.77s/it] Training 1/1 epoch (loss 1.6441): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 
619/1563 [18:27<27:52, 1.77s/it] Training 1/1 epoch (loss 1.6441): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 620/1563 [18:27<26:19, 1.68s/it] Training 1/1 epoch (loss 1.6782): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 620/1563 [18:28<26:19, 1.68s/it] Training 1/1 epoch (loss 1.6782): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 621/1563 [18:28<23:32, 1.50s/it] Training 1/1 epoch (loss 1.6016): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 621/1563 [18:29<23:32, 1.50s/it] Training 1/1 epoch (loss 1.6016): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 622/1563 [18:29<23:06, 1.47s/it] Training 1/1 epoch (loss 1.5623): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 622/1563 [18:30<23:06, 1.47s/it] Training 1/1 epoch (loss 1.5623): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 623/1563 [18:30<21:19, 1.36s/it] Training 1/1 epoch (loss 1.6991): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 623/1563 [18:32<21:19, 1.36s/it] Training 1/1 epoch (loss 1.6991): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 624/1563 [18:32<22:23, 1.43s/it] Training 1/1 epoch (loss 1.5697): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 624/1563 [18:34<22:23, 1.43s/it] Training 1/1 epoch (loss 1.5697): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 625/1563 [18:34<23:50, 1.53s/it] Training 1/1 epoch (loss 1.7300): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 625/1563 [18:35<23:50, 1.53s/it] Training 1/1 epoch (loss 1.7300): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 626/1563 [18:35<24:41, 1.58s/it] Training 1/1 epoch (loss 1.6719): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 626/1563 [18:37<24:41, 1.58s/it] Training 1/1 epoch (loss 1.6719): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 627/1563 [18:37<24:58, 1.60s/it] Training 1/1 epoch (loss 1.6657): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 627/1563 [18:39<24:58, 1.60s/it] Training 1/1 epoch (loss 1.6657): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 628/1563 [18:39<25:29, 1.64s/it] Training 1/1 epoch (loss 1.7835): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 628/1563 [18:40<25:29, 1.64s/it] Training 1/1 epoch (loss 1.7835): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 629/1563 [18:40<24:45, 1.59s/it] Training 1/1 epoch (loss 1.7157): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 629/1563 [18:42<24:45, 1.59s/it] Training 1/1 epoch (loss 1.7157): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 630/1563 [18:42<25:40, 1.65s/it] Training 1/1 epoch (loss 1.6887): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 630/1563 [18:44<25:40, 1.65s/it] Training 1/1 epoch (loss 1.6887): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 631/1563 
[18:44<27:44, 1.79s/it] Training 1/1 epoch (loss 1.5641): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 631/1563 [18:46<27:44, 1.79s/it] Training 1/1 epoch (loss 1.5641): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 632/1563 [18:46<29:05, 1.87s/it] Training 1/1 epoch (loss 1.6598): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 632/1563 [18:47<29:05, 1.87s/it] Training 1/1 epoch (loss 1.6598): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 633/1563 [18:47<25:33, 1.65s/it] Training 1/1 epoch (loss 1.7373): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 633/1563 [18:48<25:33, 1.65s/it] Training 1/1 epoch (loss 1.7373): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 634/1563 [18:48<22:22, 1.45s/it] Training 1/1 epoch (loss 1.6001): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 634/1563 [18:51<22:22, 1.45s/it] Training 1/1 epoch (loss 1.6001): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 635/1563 [18:51<26:54, 1.74s/it] Training 1/1 epoch (loss 1.6469): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 635/1563 [18:52<26:54, 1.74s/it] Training 1/1 epoch (loss 1.6469): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 636/1563 [18:52<25:37, 1.66s/it] Training 1/1 epoch (loss 1.6906): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 636/1563 [18:54<25:37, 1.66s/it] Training 1/1 epoch (loss 1.6906): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 637/1563 [18:54<26:15, 1.70s/it] Training 1/1 epoch (loss 1.6140): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 637/1563 [18:56<26:15, 1.70s/it] Training 1/1 epoch (loss 1.6140): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 638/1563 [18:56<27:09, 1.76s/it] Training 1/1 epoch (loss 1.5820): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 638/1563 [18:57<27:09, 1.76s/it] Training 1/1 epoch (loss 1.5820): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 639/1563 [18:57<25:08, 1.63s/it] Training 1/1 epoch (loss 1.5601): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 639/1563 [18:59<25:08, 1.63s/it] Training 1/1 epoch (loss 1.5601): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 640/1563 [18:59<24:26, 1.59s/it] Training 1/1 epoch (loss 1.5812): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 640/1563 [19:01<24:26, 1.59s/it] Training 1/1 epoch (loss 1.5812): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 641/1563 [19:01<25:49, 1.68s/it] Training 1/1 epoch (loss 1.5636): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 641/1563 [19:02<25:49, 1.68s/it] Training 1/1 epoch (loss 1.5636): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 642/1563 [19:02<25:20, 1.65s/it] Training 1/1 epoch (loss 1.6566): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 642/1563 [19:04<25:20, 
1.65s/it] Training 1/1 epoch (loss 1.6566): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 643/1563 [19:04<24:13, 1.58s/it] Training 1/1 epoch (loss 1.6329): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 643/1563 [19:05<24:13, 1.58s/it] Training 1/1 epoch (loss 1.6329): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 644/1563 [19:05<22:22, 1.46s/it] Training 1/1 epoch (loss 1.6048): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 644/1563 [19:06<22:22, 1.46s/it] Training 1/1 epoch (loss 1.6048): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 645/1563 [19:06<21:17, 1.39s/it] Training 1/1 epoch (loss 1.6318): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 645/1563 [19:08<21:17, 1.39s/it] Training 1/1 epoch (loss 1.6318): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 646/1563 [19:08<22:55, 1.50s/it] Training 1/1 epoch (loss 1.4926): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 646/1563 [19:09<22:55, 1.50s/it] Training 1/1 epoch (loss 1.4926): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 647/1563 [19:09<22:48, 1.49s/it] Training 1/1 epoch (loss 1.6467): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 647/1563 [19:12<22:48, 1.49s/it] Training 1/1 epoch (loss 1.6467): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 648/1563 [19:12<26:36, 1.74s/it] Training 1/1 epoch (loss 1.6475): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 648/1563 [19:13<26:36, 1.74s/it] Training 1/1 epoch (loss 1.6475): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 649/1563 [19:13<25:04, 1.65s/it] Training 1/1 epoch (loss 1.5796): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 649/1563 [19:14<25:04, 1.65s/it] Training 1/1 epoch (loss 1.5796): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 650/1563 [19:14<23:27, 1.54s/it] Training 1/1 epoch (loss 1.5404): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 650/1563 [19:16<23:27, 1.54s/it] Training 1/1 epoch (loss 1.5404): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 651/1563 [19:16<24:39, 1.62s/it] Training 1/1 epoch (loss 1.5584): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 651/1563 [19:18<24:39, 1.62s/it] Training 1/1 epoch (loss 1.5584): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 652/1563 [19:18<24:17, 1.60s/it] Training 1/1 epoch (loss 1.6908): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 652/1563 [19:20<24:17, 1.60s/it] Training 1/1 epoch (loss 1.6908): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 653/1563 [19:20<27:57, 1.84s/it] Training 1/1 epoch (loss 1.6384): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 653/1563 [19:22<27:57, 1.84s/it] Training 1/1 epoch (loss 1.6384): 
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 654/1563 [19:22<28:14, 1.86s/it] Training 1/1 epoch (loss 1.6101): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 654/1563 [19:24<28:14, 1.86s/it] Training 1/1 epoch (loss 1.6101): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 655/1563 [19:24<27:02, 1.79s/it] Training 1/1 epoch (loss 1.7053): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 655/1563 [19:25<27:02, 1.79s/it] Training 1/1 epoch (loss 1.7053): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 656/1563 [19:25<25:04, 1.66s/it] Training 1/1 epoch (loss 1.5734): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 656/1563 [19:26<25:04, 1.66s/it] Training 1/1 epoch (loss 1.5734): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 657/1563 [19:26<22:35, 1.50s/it] Training 1/1 epoch (loss 1.7103): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 657/1563 [19:28<22:35, 1.50s/it] Training 1/1 epoch (loss 1.7103): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 658/1563 [19:28<24:44, 1.64s/it] Training 1/1 epoch (loss 1.6936): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 658/1563 [19:30<24:44, 1.64s/it] Training 1/1 epoch (loss 1.6936): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 659/1563 [19:30<25:31, 1.69s/it] Training 1/1 epoch (loss 1.6439): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 659/1563 [19:31<25:31, 1.69s/it] Training 1/1 epoch (loss 1.6439): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 660/1563 [19:31<23:05, 1.53s/it] Training 1/1 epoch (loss 1.6613): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 660/1563 [19:33<23:05, 1.53s/it] Training 1/1 epoch (loss 1.6613): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 661/1563 [19:33<23:58, 1.59s/it] Training 1/1 epoch (loss 1.6504): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 661/1563 [19:34<23:58, 1.59s/it] Training 1/1 epoch (loss 1.6504): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 662/1563 [19:34<23:36, 1.57s/it] Training 1/1 epoch (loss 1.6368): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 662/1563 [19:36<23:36, 1.57s/it] Training 1/1 epoch (loss 1.6368): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 663/1563 [19:36<23:19, 1.55s/it] Training 1/1 epoch (loss 1.6378): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 663/1563 [19:37<23:19, 1.55s/it] Training 1/1 epoch (loss 1.6378): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 664/1563 [19:37<22:08, 1.48s/it] Training 1/1 epoch (loss 1.6274): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 664/1563 [19:39<22:08, 1.48s/it] Training 1/1 epoch (loss 1.6274): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 665/1563 
[19:39<23:22, 1.56s/it] Training 1/1 epoch (loss 1.4829): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 665/1563 [19:41<23:22, 1.56s/it] Training 1/1 epoch (loss 1.4829): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 666/1563 [19:41<24:33, 1.64s/it] Training 1/1 epoch (loss 1.6307): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 666/1563 [19:42<24:33, 1.64s/it] Training 1/1 epoch (loss 1.6307): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 667/1563 [19:42<23:03, 1.54s/it] Training 1/1 epoch (loss 1.6391): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 667/1563 [19:44<23:03, 1.54s/it] Training 1/1 epoch (loss 1.6391): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 668/1563 [19:44<23:05, 1.55s/it] Training 1/1 epoch (loss 1.6079): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 668/1563 [19:45<23:05, 1.55s/it] Training 1/1 epoch (loss 1.6079): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 669/1563 [19:45<24:14, 1.63s/it] Training 1/1 epoch (loss 1.6971): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 669/1563 [19:47<24:14, 1.63s/it] Training 1/1 epoch (loss 1.6971): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 670/1563 [19:47<22:32, 1.51s/it] Training 1/1 epoch (loss 1.6931): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 670/1563 [19:49<22:32, 1.51s/it] Training 1/1 epoch (loss 1.6931): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 671/1563 [19:49<26:56, 1.81s/it] Training 1/1 epoch (loss 1.5836): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 671/1563 [19:51<26:56, 1.81s/it] Training 1/1 epoch (loss 1.5836): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 672/1563 [19:51<28:17, 1.90s/it] Training 1/1 epoch (loss 1.6517): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 672/1563 [19:53<28:17, 1.90s/it] Training 1/1 epoch (loss 1.6517): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 673/1563 [19:53<25:39, 1.73s/it] Training 1/1 epoch (loss 1.5527): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 673/1563 [19:54<25:39, 1.73s/it] Training 1/1 epoch (loss 1.5527): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 674/1563 [19:54<25:20, 1.71s/it] Training 1/1 epoch (loss 1.5825): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 674/1563 [19:57<25:20, 1.71s/it] Training 1/1 epoch (loss 1.5825): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 675/1563 [19:57<28:50, 1.95s/it] Training 1/1 epoch (loss 1.6971): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 675/1563 [19:59<28:50, 1.95s/it] Training 1/1 epoch (loss 1.6971): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 676/1563 [19:59<29:30, 2.00s/it] Training 1/1 
epoch (loss 1.7286): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 676/1563 [20:01<29:30, 2.00s/it] Training 1/1 epoch (loss 1.7286): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 677/1563 [20:01<28:40, 1.94s/it] Training 1/1 epoch (loss 1.6115): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 677/1563 [20:02<28:40, 1.94s/it] Training 1/1 epoch (loss 1.6115): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 678/1563 [20:02<26:19, 1.78s/it] Training 1/1 epoch (loss 1.7304): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 678/1563 [20:03<26:19, 1.78s/it] Training 1/1 epoch (loss 1.7304): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 679/1563 [20:03<24:23, 1.66s/it] Training 1/1 epoch (loss 1.5767): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 679/1563 [20:05<24:23, 1.66s/it] Training 1/1 epoch (loss 1.5767): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 680/1563 [20:05<24:17, 1.65s/it] Training 1/1 epoch (loss 1.6605): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 680/1563 [20:08<24:17, 1.65s/it] Training 1/1 epoch (loss 1.6605): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/1563 [20:08<27:48, 1.89s/it] Training 1/1 epoch (loss 1.4915): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/1563 [20:09<27:48, 1.89s/it] Training 1/1 epoch (loss 1.4915): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/1563 [20:09<26:51, 1.83s/it] Training 1/1 epoch (loss 1.5979): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/1563 [20:11<26:51, 1.83s/it] Training 1/1 epoch (loss 1.5979): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/1563 [20:11<28:28, 1.94s/it] Training 1/1 epoch (loss 1.5173): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/1563 [20:14<28:28, 1.94s/it] Training 1/1 epoch (loss 1.5173): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 684/1563 [20:14<28:51, 1.97s/it] Training 1/1 epoch (loss 1.5963): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 684/1563 [20:15<28:51, 1.97s/it] Training 1/1 epoch (loss 1.5963): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 685/1563 [20:15<27:04, 1.85s/it] Training 1/1 epoch (loss 1.5136): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 685/1563 [20:17<27:04, 1.85s/it] Training 1/1 epoch (loss 1.5136): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 686/1563 [20:17<26:25, 1.81s/it] Training 1/1 epoch (loss 1.6890): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 686/1563 [20:19<26:25, 1.81s/it] Training 1/1 epoch (loss 1.6890): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 687/1563 [20:19<27:54, 1.91s/it] Training 1/1 epoch (loss 1.6025): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– 
| 687/1563 [20:22<27:54, 1.91s/it] Training 1/1 epoch (loss 1.6025): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 688/1563 [20:22<31:24, 2.15s/it] Training 1/1 epoch (loss 1.6062): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 688/1563 [20:23<31:24, 2.15s/it] Training 1/1 epoch (loss 1.6062): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 689/1563 [20:23<27:10, 1.87s/it] Training 1/1 epoch (loss 1.7218): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 689/1563 [20:24<27:10, 1.87s/it] Training 1/1 epoch (loss 1.7218): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 690/1563 [20:24<24:32, 1.69s/it] Training 1/1 epoch (loss 1.6795): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 690/1563 [20:25<24:32, 1.69s/it] Training 1/1 epoch (loss 1.6795): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 691/1563 [20:25<21:23, 1.47s/it] Training 1/1 epoch (loss 1.6778): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 691/1563 [20:27<21:23, 1.47s/it] Training 1/1 epoch (loss 1.6778): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 692/1563 [20:27<22:56, 1.58s/it] Training 1/1 epoch (loss 1.6967): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 692/1563 [20:29<22:56, 1.58s/it] Training 1/1 epoch (loss 1.6967): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 693/1563 [20:29<23:02, 1.59s/it] Training 1/1 epoch (loss 1.6341): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 693/1563 [20:30<23:02, 1.59s/it] Training 1/1 epoch (loss 1.6341): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 694/1563 [20:30<24:06, 1.66s/it] Training 1/1 epoch (loss 1.5900): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 694/1563 [20:32<24:06, 1.66s/it] Training 1/1 epoch (loss 1.5900): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 695/1563 [20:32<25:13, 1.74s/it] Training 1/1 epoch (loss 1.5558): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 695/1563 [20:35<25:13, 1.74s/it] Training 1/1 epoch (loss 1.5558): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 696/1563 [20:35<28:10, 1.95s/it] Training 1/1 epoch (loss 1.6631): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 696/1563 [20:36<28:10, 1.95s/it] Training 1/1 epoch (loss 1.6631): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 697/1563 [20:36<26:28, 1.83s/it] Training 1/1 epoch (loss 1.8010): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 697/1563 [20:37<26:28, 1.83s/it] Training 1/1 epoch (loss 1.8010): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 698/1563 [20:37<21:08, 1.47s/it] Training 1/1 epoch (loss 1.5802): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 698/1563 [20:37<21:08, 1.47s/it] 
Training 1/1 epoch (loss 1.5802): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 699/1563 [20:37<16:11, 1.12s/it] Training 1/1 epoch (loss 1.6302): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 699/1563 [20:38<16:11, 1.12s/it] Training 1/1 epoch (loss 1.6302): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 700/1563 [20:38<12:40, 1.13it/s] Training 1/1 epoch (loss 1.6299): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 700/1563 [20:38<12:40, 1.13it/s] Training 1/1 epoch (loss 1.6299): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 701/1563 [20:38<10:31, 1.37it/s] Training 1/1 epoch (loss 1.7281): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 701/1563 [20:38<10:31, 1.37it/s] Training 1/1 epoch (loss 1.7281): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 702/1563 [20:38<08:56, 1.60it/s] Training 1/1 epoch (loss 1.6614): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 702/1563 [20:39<08:56, 1.60it/s] Training 1/1 epoch (loss 1.6614): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 703/1563 [20:39<07:44, 1.85it/s] Training 1/1 epoch (loss 1.6042): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 703/1563 [20:39<07:44, 1.85it/s] Training 1/1 epoch (loss 1.6042): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/1563 [20:39<07:08, 2.01it/s] Training 1/1 epoch (loss 1.7224): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/1563 [20:39<07:08, 2.01it/s] Training 1/1 epoch (loss 1.7224): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/1563 [20:39<06:35, 2.17it/s] Training 1/1 epoch (loss 1.6503): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/1563 [20:40<06:35, 2.17it/s] Training 1/1 epoch (loss 1.6503): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/1563 [20:40<06:09, 2.32it/s] Training 1/1 epoch (loss 1.6574): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/1563 [20:40<06:09, 2.32it/s] Training 1/1 epoch (loss 1.6574): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/1563 [20:40<05:40, 2.51it/s] Training 1/1 epoch (loss 1.7031): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/1563 [20:40<05:40, 2.51it/s] Training 1/1 epoch (loss 1.7031): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/1563 [20:40<05:29, 2.59it/s] Training 1/1 epoch (loss 1.6806): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/1563 [20:41<05:29, 2.59it/s] Training 1/1 epoch (loss 1.6806): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/1563 [20:41<05:23, 2.64it/s] Training 1/1 epoch (loss 1.6482): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/1563 [20:41<05:23, 2.64it/s] Training 1/1 epoch (loss 1.6482): 
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/1563 [20:41<05:34, 2.55it/s] Training 1/1 epoch (loss 1.6677): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/1563 [20:42<05:34, 2.55it/s] Training 1/1 epoch (loss 1.6677): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/1563 [20:42<05:20, 2.66it/s] Training 1/1 epoch (loss 1.6150): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/1563 [20:42<05:20, 2.66it/s] Training 1/1 epoch (loss 1.6150): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/1563 [20:42<05:08, 2.75it/s] Training 1/1 epoch (loss 1.6518): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/1563 [20:42<05:08, 2.75it/s] Training 1/1 epoch (loss 1.6518): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/1563 [20:42<04:58, 2.85it/s] Training 1/1 epoch (loss 1.6000): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/1563 [20:43<04:58, 2.85it/s] Training 1/1 epoch (loss 1.6000): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/1563 [20:43<05:07, 2.76it/s] Training 1/1 epoch (loss 1.6528): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/1563 [20:43<05:07, 2.76it/s] Training 1/1 epoch (loss 1.6528): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/1563 [20:43<05:12, 2.72it/s] Training 1/1 epoch (loss 1.6298): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/1563 [20:43<05:12, 2.72it/s] Training 1/1 epoch (loss 1.6298): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 716/1563 [20:43<05:16, 2.67it/s] Training 1/1 epoch (loss 1.4995): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 716/1563 [20:44<05:16, 2.67it/s] Training 1/1 epoch (loss 1.4995): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 717/1563 [20:44<05:02, 2.80it/s] Training 1/1 epoch (loss 1.6612): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 717/1563 [20:44<05:02, 2.80it/s] Training 1/1 epoch (loss 1.6612): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 718/1563 [20:44<04:50, 2.91it/s] Training 1/1 epoch (loss 1.6169): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 718/1563 [20:44<04:50, 2.91it/s] Training 1/1 epoch (loss 1.6169): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 719/1563 [20:44<04:43, 2.97it/s] Training 1/1 epoch (loss 1.5952): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 719/1563 [20:45<04:43, 2.97it/s] Training 1/1 epoch (loss 1.5952): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 720/1563 [20:45<04:55, 2.85it/s] Training 1/1 epoch (loss 1.6106): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 720/1563 [20:45<04:55, 2.85it/s] Training 1/1 epoch (loss 1.6106): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 721/1563 
[20:45<05:07, 2.74it/s] Training 1/1 epoch (loss 1.7358): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 721/1563 [20:45<05:07, 2.74it/s] Training 1/1 epoch (loss 1.7358): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 722/1563 [20:45<05:00, 2.80it/s] Training 1/1 epoch (loss 1.5867): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 722/1563 [20:46<05:00, 2.80it/s] Training 1/1 epoch (loss 1.5867): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/1563 [20:46<04:54, 2.85it/s] Training 1/1 epoch (loss 1.6017): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/1563 [20:46<04:54, 2.85it/s] Training 1/1 epoch (loss 1.6017): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/1563 [20:46<04:57, 2.82it/s] Training 1/1 epoch (loss 1.6324): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/1563 [20:46<04:57, 2.82it/s] Training 1/1 epoch (loss 1.6324): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/1563 [20:46<04:47, 2.92it/s] Training 1/1 epoch (loss 1.5699): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/1563 [20:47<04:47, 2.92it/s] Training 1/1 epoch (loss 1.5699): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/1563 [20:47<04:47, 2.91it/s] Training 1/1 epoch (loss 1.6664): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/1563 [20:47<04:47, 2.91it/s] Training 1/1 epoch (loss 1.6664): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 727/1563 [20:47<05:01, 2.77it/s] Training 1/1 epoch (loss 1.7116): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 727/1563 [20:48<05:01, 2.77it/s] Training 1/1 epoch (loss 1.7116): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 728/1563 [20:48<05:01, 2.77it/s] Training 1/1 epoch (loss 1.5601): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 728/1563 [20:48<05:01, 2.77it/s] Training 1/1 epoch (loss 1.5601): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 729/1563 [20:48<04:50, 2.87it/s] Training 1/1 epoch (loss 1.6075): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 729/1563 [20:48<04:50, 2.87it/s] Training 1/1 epoch (loss 1.6075): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 730/1563 [20:48<04:42, 2.95it/s] Training 1/1 epoch (loss 1.5422): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 730/1563 [20:49<04:42, 2.95it/s] Training 1/1 epoch (loss 1.5422): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 731/1563 [20:49<04:40, 2.97it/s] Training 1/1 epoch (loss 1.5914): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 731/1563 [20:49<04:40, 2.97it/s] Training 1/1 epoch (loss 1.5914): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 732/1563 [20:49<04:37, 2.99it/s] Training 1/1 
epoch (loss 1.5500): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 732/1563 [20:49<04:37, 2.99it/s] Training 1/1 epoch (loss 1.5500): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 733/1563 [20:49<04:56, 2.80it/s] Training 1/1 epoch (loss 1.6647): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 733/1563 [20:50<04:56, 2.80it/s] Training 1/1 epoch (loss 1.6647): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 734/1563 [20:50<04:45, 2.90it/s] Training 1/1 epoch (loss 1.6750): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 734/1563 [20:50<04:45, 2.90it/s] Training 1/1 epoch (loss 1.6750): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 735/1563 [20:50<04:38, 2.97it/s] Training 1/1 epoch (loss 1.6307): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 735/1563 [20:50<04:38, 2.97it/s] Training 1/1 epoch (loss 1.6307): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 736/1563 [20:50<04:38, 2.97it/s] Training 1/1 epoch (loss 1.7553): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 736/1563 [20:51<04:38, 2.97it/s] Training 1/1 epoch (loss 1.7553): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 737/1563 [20:51<04:38, 2.97it/s] Training 1/1 epoch (loss 1.5935): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 737/1563 [20:51<04:38, 2.97it/s] Training 1/1 epoch (loss 1.5935): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 738/1563 [20:51<04:40, 2.94it/s] Training 1/1 epoch (loss 1.5560): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 738/1563 [20:51<04:40, 2.94it/s] Training 1/1 epoch (loss 1.5560): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 739/1563 [20:51<04:55, 2.79it/s] Training 1/1 epoch (loss 1.6281): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 739/1563 [20:52<04:55, 2.79it/s] Training 1/1 epoch (loss 1.6281): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 740/1563 [20:52<04:50, 2.84it/s] Training 1/1 epoch (loss 1.6564): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 740/1563 [20:52<04:50, 2.84it/s] Training 1/1 epoch (loss 1.6564): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 741/1563 [20:52<04:43, 2.90it/s] Training 1/1 epoch (loss 1.5985): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 741/1563 [20:52<04:43, 2.90it/s] Training 1/1 epoch (loss 1.5985): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 742/1563 [20:52<04:39, 2.94it/s] Training 1/1 epoch (loss 1.5824): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 742/1563 [20:53<04:39, 2.94it/s] Training 1/1 epoch (loss 1.5824): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 743/1563 [20:53<04:36, 2.97it/s] Training 1/1 epoch (loss 1.6257): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š 
| 743/1563 [20:53<04:36, 2.97it/s] Training 1/1 epoch (loss 1.6257): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 744/1563 [20:53<05:05, 2.68it/s] Training 1/1 epoch (loss 1.6348): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 744/1563 [20:54<05:05, 2.68it/s] Training 1/1 epoch (loss 1.6348): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 745/1563 [20:54<05:09, 2.64it/s] Training 1/1 epoch (loss 1.5024): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 745/1563 [20:54<05:09, 2.64it/s] Training 1/1 epoch (loss 1.5024): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 746/1563 [20:54<04:54, 2.78it/s] Training 1/1 epoch (loss 1.5605): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 746/1563 [20:54<04:54, 2.78it/s] Training 1/1 epoch (loss 1.5605): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 747/1563 [20:54<04:45, 2.86it/s] Training 1/1 epoch (loss 1.4728): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 747/1563 [20:54<04:45, 2.86it/s] Training 1/1 epoch (loss 1.4728): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 748/1563 [20:54<04:39, 2.92it/s] Training 1/1 epoch (loss 1.6930): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 748/1563 [20:55<04:39, 2.92it/s] Training 1/1 epoch (loss 1.6930): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 749/1563 [20:55<04:35, 2.96it/s] Training 1/1 epoch (loss 1.5516): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 749/1563 [20:55<04:35, 2.96it/s] Training 1/1 epoch (loss 1.5516): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 750/1563 [20:55<04:50, 2.80it/s] Training 1/1 epoch (loss 1.6068): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 750/1563 [20:56<04:50, 2.80it/s] Training 1/1 epoch (loss 1.6068): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 751/1563 [20:56<04:38, 2.91it/s] Training 1/1 epoch (loss 1.7443): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 751/1563 [20:56<04:38, 2.91it/s] Training 1/1 epoch (loss 1.7443): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 752/1563 [20:56<04:38, 2.91it/s] Training 1/1 epoch (loss 1.6505): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 752/1563 [20:56<04:38, 2.91it/s] Training 1/1 epoch (loss 1.6505): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 753/1563 [20:56<04:31, 2.99it/s] Training 1/1 epoch (loss 1.6918): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 753/1563 [20:57<04:31, 2.99it/s] Training 1/1 epoch (loss 1.6918): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 754/1563 [20:57<04:26, 3.04it/s] Training 1/1 epoch (loss 1.6519): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 754/1563 [20:57<04:26, 3.04it/s] 
Training 1/1 epoch (loss 1.6519): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 755/1563 [20:57<04:31, 2.98it/s] Training 1/1 epoch (loss 1.5048): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 755/1563 [20:57<04:31, 2.98it/s] Training 1/1 epoch (loss 1.5048): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 756/1563 [20:57<04:39, 2.89it/s] Training 1/1 epoch (loss 1.5836): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 756/1563 [20:58<04:39, 2.89it/s] Training 1/1 epoch (loss 1.5836): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 757/1563 [20:58<04:36, 2.91it/s] Training 1/1 epoch (loss 1.6390): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 757/1563 [20:58<04:36, 2.91it/s] Training 1/1 epoch (loss 1.6390): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 758/1563 [20:58<04:49, 2.78it/s] Training 1/1 epoch (loss 1.6834): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 758/1563 [20:58<04:49, 2.78it/s] Training 1/1 epoch (loss 1.6834): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 759/1563 [20:58<04:41, 2.86it/s] Training 1/1 epoch (loss 1.6416): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 759/1563 [20:59<04:41, 2.86it/s] Training 1/1 epoch (loss 1.6416): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 760/1563 [20:59<04:46, 2.80it/s] Training 1/1 epoch (loss 1.6301): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 760/1563 [20:59<04:46, 2.80it/s] Training 1/1 epoch (loss 1.6301): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 761/1563 [20:59<04:47, 2.79it/s] Training 1/1 epoch (loss 1.6489): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 761/1563 [20:59<04:47, 2.79it/s] Training 1/1 epoch (loss 1.6489): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 762/1563 [20:59<04:57, 2.69it/s] Training 1/1 epoch (loss 1.5901): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 762/1563 [21:00<04:57, 2.69it/s] Training 1/1 epoch (loss 1.5901): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 763/1563 [21:00<04:42, 2.83it/s] Training 1/1 epoch (loss 1.6630): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 763/1563 [21:00<04:42, 2.83it/s] Training 1/1 epoch (loss 1.6630): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 764/1563 [21:00<04:36, 2.89it/s] Training 1/1 epoch (loss 1.6239): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 764/1563 [21:00<04:36, 2.89it/s] Training 1/1 epoch (loss 1.6239): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 765/1563 [21:00<04:33, 2.92it/s] Training 1/1 epoch (loss 1.7187): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 765/1563 [21:01<04:33, 2.92it/s] Training 1/1 epoch (loss 1.7187): 
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 766/1563 [21:01<04:37, 2.87it/s] Training 1/1 epoch (loss 1.6824): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 766/1563 [21:01<04:37, 2.87it/s] Training 1/1 epoch (loss 1.6824): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 767/1563 [21:01<04:47, 2.77it/s] Training 1/1 epoch (loss 1.7095): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 767/1563 [21:02<04:47, 2.77it/s] Training 1/1 epoch (loss 1.7095): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 768/1563 [21:02<04:53, 2.71it/s] Training 1/1 epoch (loss 1.6096): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 768/1563 [21:02<04:53, 2.71it/s] Training 1/1 epoch (loss 1.6096): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 769/1563 [21:02<04:39, 2.84it/s] Training 1/1 epoch (loss 1.5862): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 769/1563 [21:02<04:39, 2.84it/s] Training 1/1 epoch (loss 1.5862): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 770/1563 [21:02<04:43, 2.80it/s] Training 1/1 epoch (loss 1.5533): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 770/1563 [21:03<04:43, 2.80it/s] Training 1/1 epoch (loss 1.5533): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 771/1563 [21:03<04:48, 2.74it/s] Training 1/1 epoch (loss 1.6359): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 771/1563 [21:03<04:48, 2.74it/s] Training 1/1 epoch (loss 1.6359): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 772/1563 [21:03<05:27, 2.42it/s] Training 1/1 epoch (loss 1.5989): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 772/1563 [21:04<05:27, 2.42it/s] Training 1/1 epoch (loss 1.5989): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 773/1563 [21:04<05:37, 2.34it/s] Training 1/1 epoch (loss 1.6201): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 773/1563 [21:04<05:37, 2.34it/s] Training 1/1 epoch (loss 1.6201): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 774/1563 [21:04<05:17, 2.48it/s] Training 1/1 epoch (loss 1.5561): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 774/1563 [21:04<05:17, 2.48it/s] Training 1/1 epoch (loss 1.5561): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 775/1563 [21:04<05:06, 2.57it/s] Training 1/1 epoch (loss 1.5773): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 775/1563 [21:05<05:06, 2.57it/s] Training 1/1 epoch (loss 1.5773): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 776/1563 [21:05<05:07, 2.56it/s] Training 1/1 epoch (loss 1.6846): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 776/1563 [21:05<05:07, 2.56it/s] Training 1/1 epoch (loss 1.6846): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 777/1563 
[21:05<05:02, 2.60it/s] Training 1/1 epoch (loss 1.5857): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 777/1563 [21:05<05:02, 2.60it/s] Training 1/1 epoch (loss 1.5857): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 778/1563 [21:05<05:02, 2.59it/s] Training 1/1 epoch (loss 1.6045): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 778/1563 [21:06<05:02, 2.59it/s] Training 1/1 epoch (loss 1.6045): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 779/1563 [21:06<05:14, 2.49it/s] Training 1/1 epoch (loss 1.6648): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 779/1563 [21:06<05:14, 2.49it/s] Training 1/1 epoch (loss 1.6648): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 780/1563 [21:06<05:12, 2.51it/s] Training 1/1 epoch (loss 1.4895): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 780/1563 [21:07<05:12, 2.51it/s] Training 1/1 epoch (loss 1.4895): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 781/1563 [21:07<05:04, 2.57it/s] Training 1/1 epoch (loss 1.7101): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 781/1563 [21:07<05:04, 2.57it/s] Training 1/1 epoch (loss 1.7101): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 782/1563 [21:07<05:34, 2.33it/s] Training 1/1 epoch (loss 1.6072): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 782/1563 [21:08<05:34, 2.33it/s] Training 1/1 epoch (loss 1.6072): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 783/1563 [21:08<05:38, 2.30it/s] Training 1/1 epoch (loss 1.6226): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 783/1563 [21:08<05:38, 2.30it/s] Training 1/1 epoch (loss 1.6226): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 784/1563 [21:08<05:52, 2.21it/s] Training 1/1 epoch (loss 1.4335): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 784/1563 [21:09<05:52, 2.21it/s] Training 1/1 epoch (loss 1.4335): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 785/1563 [21:09<06:25, 2.02it/s] Training 1/1 epoch (loss 1.6361): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 785/1563 [21:10<06:25, 2.02it/s] Training 1/1 epoch (loss 1.6361): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 786/1563 [21:10<11:13, 1.15it/s] Training 1/1 epoch (loss 1.5891): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 786/1563 [21:11<11:13, 1.15it/s] Training 1/1 epoch (loss 1.5891): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 787/1563 [21:11<09:24, 1.37it/s] Training 1/1 epoch (loss 1.5651): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 787/1563 [21:11<09:24, 1.37it/s] Training 1/1 epoch (loss 1.5651): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 788/1563 [21:11<08:00, 1.61it/s] Training 1/1 
epoch (loss 1.6795): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 788/1563 [21:12<08:00, 1.61it/s] Training 1/1 epoch (loss 1.6795): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 789/1563 [21:12<07:02, 1.83it/s] Training 1/1 epoch (loss 1.6453): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 789/1563 [21:12<07:02, 1.83it/s] Training 1/1 epoch (loss 1.6453): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 790/1563 [21:12<06:27, 2.00it/s] Training 1/1 epoch (loss 1.5750): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 790/1563 [21:12<06:27, 2.00it/s] Training 1/1 epoch (loss 1.5750): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 791/1563 [21:12<05:43, 2.25it/s] Training 1/1 epoch (loss 1.6967): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 791/1563 [21:13<05:43, 2.25it/s] Training 1/1 epoch (loss 1.6967): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 792/1563 [21:13<05:16, 2.44it/s] Training 1/1 epoch (loss 1.6390): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 792/1563 [21:13<05:16, 2.44it/s] Training 1/1 epoch (loss 1.6390): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 793/1563 [21:13<04:59, 2.57it/s] Training 1/1 epoch (loss 1.6641): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 793/1563 [21:13<04:59, 2.57it/s] Training 1/1 epoch (loss 1.6641): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 794/1563 [21:13<04:52, 2.63it/s] Training 1/1 epoch (loss 1.6042): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 794/1563 [21:14<04:52, 2.63it/s] Training 1/1 epoch (loss 1.6042): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 795/1563 [21:14<04:57, 2.58it/s] Training 1/1 epoch (loss 1.5993): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 795/1563 [21:14<04:57, 2.58it/s] Training 1/1 epoch (loss 1.5993): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 796/1563 [21:14<05:04, 2.52it/s] Training 1/1 epoch (loss 1.6133): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 796/1563 [21:14<05:04, 2.52it/s] Training 1/1 epoch (loss 1.6133): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 797/1563 [21:14<04:45, 2.68it/s] Training 1/1 epoch (loss 1.7162): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 797/1563 [21:15<04:45, 2.68it/s] Training 1/1 epoch (loss 1.7162): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 798/1563 [21:15<04:33, 2.80it/s] Training 1/1 epoch (loss 1.6538): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 798/1563 [21:15<04:33, 2.80it/s] Training 1/1 epoch (loss 1.6538): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 799/1563 [21:15<04:29, 2.84it/s] Training 1/1 epoch (loss 1.6729): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 
| 799/1563 [21:15<04:29, 2.84it/s] Training 1/1 epoch (loss 1.6729): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 800/1563 [21:15<04:24, 2.89it/s] Training 1/1 epoch (loss 1.5884): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 800/1563 [21:16<04:24, 2.89it/s] Training 1/1 epoch (loss 1.5884): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 801/1563 [21:16<04:19, 2.93it/s] Training 1/1 epoch (loss 1.6427): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 801/1563 [21:16<04:19, 2.93it/s] Training 1/1 epoch (loss 1.6427): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 802/1563 [21:16<04:29, 2.82it/s] Training 1/1 epoch (loss 1.5743): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 802/1563 [21:16<04:29, 2.82it/s] Training 1/1 epoch (loss 1.5743): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 803/1563 [21:16<04:16, 2.96it/s] Training 1/1 epoch (loss 1.6185): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 803/1563 [21:17<04:16, 2.96it/s] Training 1/1 epoch (loss 1.6185): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 804/1563 [21:17<04:35, 2.75it/s] Training 1/1 epoch (loss 1.6343): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 804/1563 [21:17<04:35, 2.75it/s] Training 1/1 epoch (loss 1.6343): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 805/1563 [21:17<05:01, 2.51it/s] Training 1/1 epoch (loss 1.4300): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 805/1563 [21:18<05:01, 2.51it/s] Training 1/1 epoch (loss 1.4300): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 806/1563 [21:18<05:14, 2.41it/s] Training 1/1 epoch (loss 1.6844): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 806/1563 [21:18<05:14, 2.41it/s] Training 1/1 epoch (loss 1.6844): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 807/1563 [21:18<04:59, 2.52it/s] Training 1/1 epoch (loss 1.5993): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 807/1563 [21:19<04:59, 2.52it/s] Training 1/1 epoch (loss 1.5993): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 808/1563 [21:19<04:54, 2.56it/s] Training 1/1 epoch (loss 1.6618): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 808/1563 [21:20<04:54, 2.56it/s] Training 1/1 epoch (loss 1.6618): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 809/1563 [21:20<07:46, 1.62it/s] Training 1/1 epoch (loss 1.6163): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 809/1563 [21:20<07:46, 1.62it/s] Training 1/1 epoch (loss 1.6163): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 810/1563 [21:20<07:27, 1.68it/s] Training 1/1 epoch (loss 1.4824): 
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 810/1563 [21:21<07:27, 1.68it/s] Training 1/1 epoch (loss 1.4824): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 811/1563 [21:21<07:38, 1.64it/s] Training 1/1 epoch (loss 1.5061): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 811/1563 [21:22<07:38, 1.64it/s] Training 1/1 epoch (loss 1.5061): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 812/1563 [21:22<08:53, 1.41it/s] Training 1/1 epoch (loss 1.5754): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 812/1563 [21:23<08:53, 1.41it/s] Training 1/1 epoch (loss 1.5754): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 813/1563 [21:23<08:59, 1.39it/s] Training 1/1 epoch (loss 1.7377): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 813/1563 [21:23<08:59, 1.39it/s] Training 1/1 epoch (loss 1.7377): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 814/1563 [21:23<08:46, 1.42it/s] Training 1/1 epoch (loss 1.6317): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 814/1563 [21:24<08:46, 1.42it/s] Training 1/1 epoch (loss 1.6317): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 815/1563 [21:24<09:08, 1.36it/s] Training 1/1 epoch (loss 1.5323): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 815/1563 [21:25<09:08, 1.36it/s] Training 1/1 epoch (loss 1.5323): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 816/1563 [21:25<09:45, 1.28it/s] Training 1/1 epoch (loss 1.6490): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 816/1563 [21:26<09:45, 1.28it/s] Training 1/1 epoch (loss 1.6490): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 817/1563 [21:26<09:57, 1.25it/s] Training 1/1 epoch (loss 1.6150): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 817/1563 [21:26<09:57, 1.25it/s] Training 1/1 epoch (loss 1.6150): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 818/1563 [21:26<08:52, 1.40it/s] Training 1/1 epoch (loss 1.5726): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 818/1563 [21:27<08:52, 1.40it/s] Training 1/1 epoch (loss 1.5726): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 819/1563 [21:27<08:07, 1.53it/s] Training 1/1 epoch (loss 1.6527): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 819/1563 [21:27<08:07, 1.53it/s] Training 1/1 epoch (loss 1.6527): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 820/1563 [21:27<07:22, 1.68it/s] Training 1/1 epoch (loss 1.6462): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 820/1563 [21:28<07:22, 1.68it/s] Training 1/1 epoch (loss 1.6462): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 821/1563 [21:28<06:44, 1.83it/s] Training 
1/1 epoch (loss 1.6190): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 821/1563 [21:28<06:44, 1.83it/s] Training 1/1 epoch (loss 1.6190): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 822/1563 [21:28<07:32, 1.64it/s] Training 1/1 epoch (loss 1.5314): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 822/1563 [21:29<07:32, 1.64it/s] Training 1/1 epoch (loss 1.5314): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 823/1563 [21:29<06:47, 1.82it/s] Training 1/1 epoch (loss 1.6557): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 823/1563 [21:30<06:47, 1.82it/s] Training 1/1 epoch (loss 1.6557): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 824/1563 [21:30<08:17, 1.48it/s] Training 1/1 epoch (loss 1.6364): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 824/1563 [21:31<08:17, 1.48it/s] Training 1/1 epoch (loss 1.6364): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 825/1563 [21:31<10:28, 1.17it/s] Training 1/1 epoch (loss 1.5175): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 825/1563 [21:32<10:28, 1.17it/s] Training 1/1 epoch (loss 1.5175): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 826/1563 [21:32<12:02, 1.02it/s] Training 1/1 epoch (loss 1.6490): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 826/1563 [21:35<12:02, 1.02it/s] Training 1/1 epoch (loss 1.6490): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 827/1563 [21:35<17:26, 1.42s/it] Training 1/1 epoch (loss 1.5380): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 827/1563 [21:36<17:26, 1.42s/it] Training 1/1 epoch (loss 1.5380): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 828/1563 [21:36<18:01, 1.47s/it] Training 1/1 epoch (loss 1.5973): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 828/1563 [21:38<18:01, 1.47s/it] Training 1/1 epoch (loss 1.5973): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 829/1563 [21:38<18:39, 1.53s/it] Training 1/1 epoch (loss 1.5426): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 829/1563 [21:40<18:39, 1.53s/it] Training 1/1 epoch (loss 1.5426): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 830/1563 [21:40<20:34, 1.68s/it] Training 1/1 epoch (loss 1.6244): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 830/1563 [21:42<20:34, 1.68s/it] Training 1/1 epoch (loss 1.6244): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 831/1563 [21:42<19:31, 1.60s/it] Training 1/1 epoch (loss 1.6424): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 831/1563 [21:43<19:31, 1.60s/it] Training 1/1 epoch (loss 1.6424): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 832/1563 
[21:43<20:00, 1.64s/it] Training 1/1 epoch (loss 1.6040): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 832/1563 [21:46<20:00, 1.64s/it] Training 1/1 epoch (loss 1.6040): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 833/1563 [21:46<22:51, 1.88s/it] Training 1/1 epoch (loss 1.6590): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 833/1563 [21:48<22:51, 1.88s/it] Training 1/1 epoch (loss 1.6590): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 834/1563 [21:48<24:00, 1.98s/it] Training 1/1 epoch (loss 1.6432): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 834/1563 [21:49<24:00, 1.98s/it] Training 1/1 epoch (loss 1.6432): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 835/1563 [21:49<20:06, 1.66s/it] Training 1/1 epoch (loss 1.7308): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 835/1563 [21:50<20:06, 1.66s/it] Training 1/1 epoch (loss 1.7308): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 836/1563 [21:50<17:00, 1.40s/it] Training 1/1 epoch (loss 1.6294): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 836/1563 [21:50<17:00, 1.40s/it] Training 1/1 epoch (loss 1.6294): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 837/1563 [21:50<14:31, 1.20s/it] Training 1/1 epoch (loss 1.6379): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 837/1563 [21:51<14:31, 1.20s/it] Training 1/1 epoch (loss 1.6379): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 838/1563 [21:51<12:49, 1.06s/it] Training 1/1 epoch (loss 1.5625): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 838/1563 [21:51<12:49, 1.06s/it] Training 1/1 epoch (loss 1.5625): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 839/1563 [21:51<10:13, 1.18it/s] Training 1/1 epoch (loss 1.6375): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 839/1563 [21:52<10:13, 1.18it/s] Training 1/1 epoch (loss 1.6375): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 840/1563 [21:52<08:26, 1.43it/s] Training 1/1 epoch (loss 1.6799): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 840/1563 [21:52<08:26, 1.43it/s] Training 1/1 epoch (loss 1.6799): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 841/1563 [21:52<07:13, 1.67it/s] Training 1/1 epoch (loss 1.6222): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 841/1563 [21:53<07:13, 1.67it/s] Training 1/1 epoch (loss 1.6222): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 842/1563 [21:53<07:50, 1.53it/s] Training 1/1 epoch (loss 1.5702): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 842/1563 [21:53<07:50, 1.53it/s] Training 1/1 epoch (loss 1.5702): 
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 843/1563 [21:53<06:34, 1.82it/s] Training 1/1 epoch (loss 1.7266): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 843/1563 [21:54<06:34, 1.82it/s] Training 1/1 epoch (loss 1.7266): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 844/1563 [21:54<05:40, 2.11it/s] Training 1/1 epoch (loss 1.5581): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 844/1563 [21:54<05:40, 2.11it/s] Training 1/1 epoch (loss 1.5581): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 845/1563 [21:54<05:07, 2.34it/s] Training 1/1 epoch (loss 1.5275): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 845/1563 [21:54<05:07, 2.34it/s] Training 1/1 epoch (loss 1.5275): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 846/1563 [21:54<04:59, 2.40it/s] Training 1/1 epoch (loss 1.5519): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 846/1563 [21:55<04:59, 2.40it/s] Training 1/1 epoch (loss 1.5519): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 847/1563 [21:55<06:07, 1.95it/s] Training 1/1 epoch (loss 1.6192): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 847/1563 [21:56<06:07, 1.95it/s] Training 1/1 epoch (loss 1.6192): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 848/1563 [21:56<09:28, 1.26it/s] Training 1/1 epoch (loss 1.5474): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 848/1563 [21:58<09:28, 1.26it/s] Training 1/1 epoch (loss 1.5474): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 849/1563 [21:58<12:50, 1.08s/it] Training 1/1 epoch (loss 1.5799): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 849/1563 [21:59<12:50, 1.08s/it] Training 1/1 epoch (loss 1.5799): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 850/1563 [21:59<13:04, 1.10s/it] Training 1/1 epoch (loss 1.6850): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 850/1563 [22:01<13:04, 1.10s/it] Training 1/1 epoch (loss 1.6850): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 851/1563 [22:01<16:36, 1.40s/it] Training 1/1 epoch (loss 1.6834): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 851/1563 [22:02<16:36, 1.40s/it] Training 1/1 epoch (loss 1.6834): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 852/1563 [22:02<15:14, 1.29s/it] Training 1/1 epoch (loss 1.5638): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 852/1563 [22:03<15:14, 1.29s/it] Training 1/1 epoch (loss 1.5638): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 853/1563 [22:03<13:32, 1.14s/it] Training 1/1 epoch (loss 1.5293): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 853/1563 [22:04<13:32, 1.14s/it] Training 
1/1 epoch (loss 1.5293): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 854/1563 [22:04<11:58, 1.01s/it] Training 1/1 epoch (loss 1.5753): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 854/1563 [22:05<11:58, 1.01s/it] Training 1/1 epoch (loss 1.5753): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 855/1563 [22:05<10:50, 1.09it/s] Training 1/1 epoch (loss 1.7033): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 855/1563 [22:05<10:50, 1.09it/s] Training 1/1 epoch (loss 1.7033): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 856/1563 [22:05<09:56, 1.19it/s] Training 1/1 epoch (loss 1.5879): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 856/1563 [22:06<09:56, 1.19it/s] Training 1/1 epoch (loss 1.5879): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 857/1563 [22:06<08:44, 1.35it/s] Training 1/1 epoch (loss 1.5984): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 857/1563 [22:06<08:44, 1.35it/s] Training 1/1 epoch (loss 1.5984): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 858/1563 [22:06<07:36, 1.54it/s] Training 1/1 epoch (loss 1.6151): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 858/1563 [22:07<07:36, 1.54it/s] Training 1/1 epoch (loss 1.6151): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 859/1563 [22:07<06:55, 1.69it/s] Training 1/1 epoch (loss 1.7118): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 859/1563 [22:07<06:55, 1.69it/s] Training 1/1 epoch (loss 1.7118): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 860/1563 [22:07<06:37, 1.77it/s] Training 1/1 epoch (loss 1.6061): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 860/1563 [22:08<06:37, 1.77it/s] Training 1/1 epoch (loss 1.6061): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 861/1563 [22:08<06:53, 1.70it/s] Training 1/1 epoch (loss 1.6323): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 861/1563 [22:10<06:53, 1.70it/s] Training 1/1 epoch (loss 1.6323): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 862/1563 [22:10<11:08, 1.05it/s] Training 1/1 epoch (loss 1.6529): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 862/1563 [22:11<11:08, 1.05it/s] Training 1/1 epoch (loss 1.6529): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 863/1563 [22:11<13:12, 1.13s/it] Training 1/1 epoch (loss 1.5646): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 863/1563 [22:12<13:12, 1.13s/it] Training 1/1 epoch (loss 1.5646): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 864/1563 [22:12<12:52, 1.11s/it] Training 1/1 epoch (loss 1.6774): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 864/1563 
[22:14<12:52, 1.11s/it] Training 1/1 epoch (loss 1.6774): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 865/1563 [22:14<14:21, 1.23s/it] Training 1/1 epoch (loss 1.6352): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 865/1563 [22:16<14:21, 1.23s/it] Training 1/1 epoch (loss 1.6352): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 866/1563 [22:16<15:55, 1.37s/it] Training 1/1 epoch (loss 1.5731): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 866/1563 [22:17<15:55, 1.37s/it] Training 1/1 epoch (loss 1.5731): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 867/1563 [22:17<16:06, 1.39s/it] Training 1/1 epoch (loss 1.6286): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 867/1563 [22:19<16:06, 1.39s/it] Training 1/1 epoch (loss 1.6286): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 868/1563 [22:19<18:08, 1.57s/it] Training 1/1 epoch (loss 1.5378): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 868/1563 [22:20<18:08, 1.57s/it] Training 1/1 epoch (loss 1.5378): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 869/1563 [22:20<16:36, 1.44s/it] Training 1/1 epoch (loss 1.6486): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 869/1563 [22:22<16:36, 1.44s/it] Training 1/1 epoch (loss 1.6486): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 870/1563 [22:22<17:39, 1.53s/it] Training 1/1 epoch (loss 1.6170): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 870/1563 [22:23<17:39, 1.53s/it] Training 1/1 epoch (loss 1.6170): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 871/1563 [22:23<16:31, 1.43s/it] Training 1/1 epoch (loss 1.5357): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 871/1563 [22:25<16:31, 1.43s/it] Training 1/1 epoch (loss 1.5357): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 872/1563 [22:25<17:39, 1.53s/it] Training 1/1 epoch (loss 1.6357): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 872/1563 [22:27<17:39, 1.53s/it] Training 1/1 epoch (loss 1.6357): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 873/1563 [22:27<19:26, 1.69s/it] Training 1/1 epoch (loss 1.5764): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 873/1563 [22:29<19:26, 1.69s/it] Training 1/1 epoch (loss 1.5764): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 874/1563 [22:29<19:43, 1.72s/it] Training 1/1 epoch (loss 1.5844): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 874/1563 [22:30<19:43, 1.72s/it] Training 1/1 epoch (loss 1.5844): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 875/1563 [22:30<19:58, 1.74s/it] Training 1/1 epoch (loss 1.7956): 
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 875/1563 [22:32<19:58, 1.74s/it] Training 1/1 epoch (loss 1.7956): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 876/1563 [22:32<19:09, 1.67s/it] Training 1/1 epoch (loss 1.5672): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 876/1563 [22:34<19:09, 1.67s/it] Training 1/1 epoch (loss 1.5672): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 877/1563 [22:34<19:20, 1.69s/it] Training 1/1 epoch (loss 1.6432): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 877/1563 [22:35<19:20, 1.69s/it] Training 1/1 epoch (loss 1.6432): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 878/1563 [22:35<17:55, 1.57s/it] Training 1/1 epoch (loss 1.6410): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 878/1563 [22:37<17:55, 1.57s/it] Training 1/1 epoch (loss 1.6410): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 879/1563 [22:37<17:57, 1.58s/it] Training 1/1 epoch (loss 1.6709): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 879/1563 [22:39<17:57, 1.58s/it] Training 1/1 epoch (loss 1.6709): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 880/1563 [22:39<19:22, 1.70s/it] Training 1/1 epoch (loss 1.6186): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 880/1563 [22:40<19:22, 1.70s/it] Training 1/1 epoch (loss 1.6186): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 881/1563 [22:40<19:55, 1.75s/it] Training 1/1 epoch (loss 1.5927): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 881/1563 [22:43<19:55, 1.75s/it] Training 1/1 epoch (loss 1.5927): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 882/1563 [22:43<22:11, 1.96s/it] Training 1/1 epoch (loss 1.7004): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 882/1563 [22:45<22:11, 1.96s/it] Training 1/1 epoch (loss 1.7004): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 883/1563 [22:45<22:19, 1.97s/it] Training 1/1 epoch (loss 1.5688): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 883/1563 [22:47<22:19, 1.97s/it] Training 1/1 epoch (loss 1.5688): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 884/1563 [22:47<22:26, 1.98s/it] Training 1/1 epoch (loss 1.6271): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 884/1563 [22:49<22:26, 1.98s/it] Training 1/1 epoch (loss 1.6271): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 885/1563 [22:49<24:16, 2.15s/it] Training 1/1 epoch (loss 1.5318): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 885/1563 [22:51<24:16, 2.15s/it] Training 1/1 epoch (loss 1.5318): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 886/1563 [22:51<22:58, 2.04s/it] Training 
1/1 epoch (loss 1.7119): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 886/1563 [22:54<22:58, 2.04s/it] Training 1/1 epoch (loss 1.7119): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 887/1563 [22:54<24:22, 2.16s/it] Training 1/1 epoch (loss 1.5863): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 887/1563 [22:55<24:22, 2.16s/it] Training 1/1 epoch (loss 1.5863): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 888/1563 [22:55<21:08, 1.88s/it] Training 1/1 epoch (loss 1.6499): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 888/1563 [22:56<21:08, 1.88s/it] Training 1/1 epoch (loss 1.6499): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 889/1563 [22:56<19:06, 1.70s/it] Training 1/1 epoch (loss 1.6309): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 889/1563 [22:57<19:06, 1.70s/it] Training 1/1 epoch (loss 1.6309): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 890/1563 [22:57<16:49, 1.50s/it] Training 1/1 epoch (loss 1.6548): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 890/1563 [22:59<16:49, 1.50s/it] Training 1/1 epoch (loss 1.6548): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 891/1563 [22:59<18:28, 1.65s/it] Training 1/1 epoch (loss 1.6469): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 891/1563 [23:00<18:28, 1.65s/it] Training 1/1 epoch (loss 1.6469): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 892/1563 [23:00<17:23, 1.56s/it] Training 1/1 epoch (loss 1.5372): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 892/1563 [23:02<17:23, 1.56s/it] Training 1/1 epoch (loss 1.5372): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 893/1563 [23:02<16:36, 1.49s/it] Training 1/1 epoch (loss 1.6041): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 893/1563 [23:03<16:36, 1.49s/it] Training 1/1 epoch (loss 1.6041): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 894/1563 [23:03<16:54, 1.52s/it] Training 1/1 epoch (loss 1.6889): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 894/1563 [23:05<16:54, 1.52s/it] Training 1/1 epoch (loss 1.6889): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 895/1563 [23:05<15:44, 1.41s/it] Training 1/1 epoch (loss 1.5732): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 895/1563 [23:07<15:44, 1.41s/it] Training 1/1 epoch (loss 1.5732): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 896/1563 [23:07<18:08, 1.63s/it] Training 1/1 epoch (loss 1.6899): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 896/1563 [23:08<18:08, 1.63s/it] Training 1/1 epoch (loss 1.6899): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 897/1563 
[23:08<17:58, 1.62s/it] Training 1/1 epoch (loss 1.6503): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 897/1563 [23:10<17:58, 1.62s/it] Training 1/1 epoch (loss 1.6503): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 898/1563 [23:10<18:15, 1.65s/it] Training 1/1 epoch (loss 1.5961): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 898/1563 [23:12<18:15, 1.65s/it] Training 1/1 epoch (loss 1.5961): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 899/1563 [23:12<18:57, 1.71s/it] Training 1/1 epoch (loss 1.7093): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 899/1563 [23:14<18:57, 1.71s/it] Training 1/1 epoch (loss 1.7093): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 900/1563 [23:14<19:16, 1.74s/it] Training 1/1 epoch (loss 1.5421): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 900/1563 [23:15<19:16, 1.74s/it] Training 1/1 epoch (loss 1.5421): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 901/1563 [23:15<17:32, 1.59s/it] Training 1/1 epoch (loss 1.5642): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 901/1563 [23:17<17:32, 1.59s/it] Training 1/1 epoch (loss 1.5642): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 902/1563 [23:17<18:47, 1.71s/it] Training 1/1 epoch (loss 1.6610): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 902/1563 [23:19<18:47, 1.71s/it] Training 1/1 epoch (loss 1.6610): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 903/1563 [23:19<19:07, 1.74s/it] Training 1/1 epoch (loss 1.5503): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 903/1563 [23:21<19:07, 1.74s/it] Training 1/1 epoch (loss 1.5503): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 904/1563 [23:21<19:21, 1.76s/it] Training 1/1 epoch (loss 1.6616): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 904/1563 [23:22<19:21, 1.76s/it] Training 1/1 epoch (loss 1.6616): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 905/1563 [23:22<17:22, 1.58s/it] Training 1/1 epoch (loss 1.7245): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 905/1563 [23:23<17:22, 1.58s/it] Training 1/1 epoch (loss 1.7245): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 906/1563 [23:23<16:17, 1.49s/it] Training 1/1 epoch (loss 1.5682): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 906/1563 [23:24<16:17, 1.49s/it] Training 1/1 epoch (loss 1.5682): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 907/1563 [23:24<15:25, 1.41s/it] Training 1/1 epoch (loss 1.6745): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 907/1563 [23:26<15:25, 1.41s/it] Training 1/1 epoch (loss 1.6745): 
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 908/1563 [23:26<17:05, 1.57s/it] Training 1/1 epoch (loss 1.5534): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 908/1563 [23:27<17:05, 1.57s/it] Training 1/1 epoch (loss 1.5534): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 909/1563 [23:27<15:50, 1.45s/it] Training 1/1 epoch (loss 1.6521): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 909/1563 [23:30<15:50, 1.45s/it] Training 1/1 epoch (loss 1.6521): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 910/1563 [23:30<19:03, 1.75s/it] Training 1/1 epoch (loss 1.6678): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 910/1563 [23:32<19:03, 1.75s/it] Training 1/1 epoch (loss 1.6678): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 911/1563 [23:32<19:55, 1.83s/it] Training 1/1 epoch (loss 1.5540): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 911/1563 [23:34<19:55, 1.83s/it] Training 1/1 epoch (loss 1.5540): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 912/1563 [23:34<22:23, 2.06s/it] Training 1/1 epoch (loss 1.6426): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 912/1563 [23:36<22:23, 2.06s/it] Training 1/1 epoch (loss 1.6426): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 913/1563 [23:36<22:25, 2.07s/it] Training 1/1 epoch (loss 1.4786): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 913/1563 [23:39<22:25, 2.07s/it] Training 1/1 epoch (loss 1.4786): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 914/1563 [23:39<22:15, 2.06s/it] Training 1/1 epoch (loss 1.7107): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 914/1563 [23:41<22:15, 2.06s/it] Training 1/1 epoch (loss 1.7107): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 915/1563 [23:41<23:24, 2.17s/it] Training 1/1 epoch (loss 1.5617): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 915/1563 [23:42<23:24, 2.17s/it] Training 1/1 epoch (loss 1.5617): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 916/1563 [23:42<20:09, 1.87s/it] Training 1/1 epoch (loss 1.5542): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 916/1563 [23:44<20:09, 1.87s/it] Training 1/1 epoch (loss 1.5542): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 917/1563 [23:44<19:49, 1.84s/it] Training 1/1 epoch (loss 1.5374): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 917/1563 [23:46<19:49, 1.84s/it] Training 1/1 epoch (loss 1.5374): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 918/1563 [23:46<20:02, 1.86s/it] Training 1/1 epoch (loss 1.7507): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 918/1563 [23:48<20:02, 1.86s/it] Training 
1/1 epoch (loss 1.7507): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 919/1563 [23:48<19:36, 1.83s/it] Training 1/1 epoch (loss 1.6757): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 919/1563 [23:49<19:36, 1.83s/it] Training 1/1 epoch (loss 1.6757): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 920/1563 [23:49<18:30, 1.73s/it] Training 1/1 epoch (loss 1.5129): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 920/1563 [23:52<18:30, 1.73s/it] Training 1/1 epoch (loss 1.5129): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 921/1563 [23:52<20:53, 1.95s/it] Training 1/1 epoch (loss 1.6228): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 921/1563 [23:53<20:53, 1.95s/it] Training 1/1 epoch (loss 1.6228): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 922/1563 [23:53<19:33, 1.83s/it] Training 1/1 epoch (loss 1.5916): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 922/1563 [23:55<19:33, 1.83s/it] Training 1/1 epoch (loss 1.5916): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 923/1563 [23:55<18:49, 1.77s/it] Training 1/1 epoch (loss 1.5963): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 923/1563 [23:56<18:49, 1.77s/it] Training 1/1 epoch (loss 1.5963): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 924/1563 [23:56<17:44, 1.67s/it] Training 1/1 epoch (loss 1.5459): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 924/1563 [23:59<17:44, 1.67s/it] Training 1/1 epoch (loss 1.5459): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 925/1563 [23:59<20:32, 1.93s/it] Training 1/1 epoch (loss 1.5969): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 925/1563 [24:01<20:32, 1.93s/it] Training 1/1 epoch (loss 1.5969): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 926/1563 [24:01<22:06, 2.08s/it] Training 1/1 epoch (loss 1.5831): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 926/1563 [24:03<22:06, 2.08s/it] Training 1/1 epoch (loss 1.5831): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 927/1563 [24:03<20:37, 1.95s/it] Training 1/1 epoch (loss 1.5425): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 927/1563 [24:05<20:37, 1.95s/it] Training 1/1 epoch (loss 1.5425): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 928/1563 [24:05<20:49, 1.97s/it] Training 1/1 epoch (loss 1.7486): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 928/1563 [24:06<20:49, 1.97s/it] Training 1/1 epoch (loss 1.7486): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 929/1563 [24:06<17:56, 1.70s/it] Training 1/1 epoch (loss 1.6295): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 929/1563 
[24:08<17:56, 1.70s/it] Training 1/1 epoch (loss 1.6295): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 930/1563 [24:08<19:02, 1.80s/it] Training 1/1 epoch (loss 1.5828): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 930/1563 [24:10<19:02, 1.80s/it] Training 1/1 epoch (loss 1.5828): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 931/1563 [24:10<20:19, 1.93s/it] Training 1/1 epoch (loss 1.5978): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 931/1563 [24:12<20:19, 1.93s/it] Training 1/1 epoch (loss 1.5978): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 932/1563 [24:12<20:35, 1.96s/it] Training 1/1 epoch (loss 1.6023): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 932/1563 [24:13<20:35, 1.96s/it] Training 1/1 epoch (loss 1.6023): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 933/1563 [24:13<18:39, 1.78s/it] Training 1/1 epoch (loss 1.6607): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 933/1563 [24:15<18:39, 1.78s/it] Training 1/1 epoch (loss 1.6607): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 934/1563 [24:15<16:45, 1.60s/it] Training 1/1 epoch (loss 1.5809): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 934/1563 [24:16<16:45, 1.60s/it] Training 1/1 epoch (loss 1.5809): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 935/1563 [24:16<14:59, 1.43s/it] Training 1/1 epoch (loss 1.5741): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 935/1563 [24:18<14:59, 1.43s/it] Training 1/1 epoch (loss 1.5741): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 936/1563 [24:18<16:22, 1.57s/it] Training 1/1 epoch (loss 1.5267): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 936/1563 [24:20<16:22, 1.57s/it] Training 1/1 epoch (loss 1.5267): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 937/1563 [24:20<18:14, 1.75s/it] Training 1/1 epoch (loss 1.5927): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 937/1563 [24:22<18:14, 1.75s/it] Training 1/1 epoch (loss 1.5927): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 938/1563 [24:22<19:20, 1.86s/it] Training 1/1 epoch (loss 1.5326): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 938/1563 [24:24<19:20, 1.86s/it] Training 1/1 epoch (loss 1.5326): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 939/1563 [24:24<19:56, 1.92s/it] Training 1/1 epoch (loss 1.5817): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 939/1563 [24:25<19:56, 1.92s/it] Training 1/1 epoch (loss 1.5817): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 940/1563 [24:25<17:28, 1.68s/it] Training 1/1 epoch (loss 1.5120): 
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 940/1563 [24:27<17:28, 1.68s/it] Training 1/1 epoch (loss 1.5120): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 941/1563 [24:27<18:07, 1.75s/it] Training 1/1 epoch (loss 1.6075): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 941/1563 [24:28<18:07, 1.75s/it] Training 1/1 epoch (loss 1.6075): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 942/1563 [24:28<16:57, 1.64s/it] Training 1/1 epoch (loss 1.6275): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 942/1563 [24:30<16:57, 1.64s/it] Training 1/1 epoch (loss 1.6275): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 943/1563 [24:30<16:59, 1.64s/it] Training 1/1 epoch (loss 1.6098): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 943/1563 [24:32<16:59, 1.64s/it] Training 1/1 epoch (loss 1.6098): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 944/1563 [24:32<18:52, 1.83s/it] Training 1/1 epoch (loss 1.5562): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 944/1563 [24:35<18:52, 1.83s/it] Training 1/1 epoch (loss 1.5562): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 945/1563 [24:35<20:26, 1.98s/it] Training 1/1 epoch (loss 1.5872): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 945/1563 [24:36<20:26, 1.98s/it] Training 1/1 epoch (loss 1.5872): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 946/1563 [24:36<19:54, 1.94s/it] Training 1/1 epoch (loss 1.7368): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 946/1563 [24:38<19:54, 1.94s/it] Training 1/1 epoch (loss 1.7368): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 947/1563 [24:38<20:18, 1.98s/it] Training 1/1 epoch (loss 1.5874): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 947/1563 [24:40<20:18, 1.98s/it] Training 1/1 epoch (loss 1.5874): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 948/1563 [24:40<19:09, 1.87s/it] Training 1/1 epoch (loss 1.4966): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 948/1563 [24:42<19:09, 1.87s/it] Training 1/1 epoch (loss 1.4966): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 949/1563 [24:42<19:46, 1.93s/it] Training 1/1 epoch (loss 1.5692): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 949/1563 [24:44<19:46, 1.93s/it] Training 1/1 epoch (loss 1.5692): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 950/1563 [24:44<19:03, 1.87s/it] Training 1/1 epoch (loss 1.5923): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 950/1563 [24:46<19:03, 1.87s/it] Training 1/1 epoch (loss 1.5923): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 951/1563 [24:46<18:17, 1.79s/it] Training 
1/1 epoch (loss 1.6214): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 951/1563 [24:48<18:17, 1.79s/it] Training 1/1 epoch (loss 1.6214): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 952/1563 [24:48<19:54, 1.96s/it] Training 1/1 epoch (loss 1.5624): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 952/1563 [24:49<19:54, 1.96s/it] Training 1/1 epoch (loss 1.5624): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 953/1563 [24:49<17:52, 1.76s/it] Training 1/1 epoch (loss 1.6873): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 953/1563 [24:51<17:52, 1.76s/it] Training 1/1 epoch (loss 1.6873): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 954/1563 [24:51<17:04, 1.68s/it] Training 1/1 epoch (loss 1.4946): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 954/1563 [24:53<17:04, 1.68s/it] Training 1/1 epoch (loss 1.4946): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 955/1563 [24:53<18:48, 1.86s/it] Training 1/1 epoch (loss 1.6531): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 955/1563 [24:54<18:48, 1.86s/it] Training 1/1 epoch (loss 1.6531): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 956/1563 [24:54<16:07, 1.59s/it] Training 1/1 epoch (loss 1.5741): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 956/1563 [24:55<16:07, 1.59s/it] Training 1/1 epoch (loss 1.5741): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 957/1563 [24:55<15:17, 1.51s/it] Training 1/1 epoch (loss 1.4656): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 957/1563 [24:57<15:17, 1.51s/it] Training 1/1 epoch (loss 1.4656): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 958/1563 [24:57<16:12, 1.61s/it] Training 1/1 epoch (loss 1.6343): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 958/1563 [24:58<16:12, 1.61s/it] Training 1/1 epoch (loss 1.6343): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 959/1563 [24:58<15:10, 1.51s/it] Training 1/1 epoch (loss 1.6493): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 959/1563 [24:59<15:10, 1.51s/it] Training 1/1 epoch (loss 1.6493): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 960/1563 [24:59<13:47, 1.37s/it] Training 1/1 epoch (loss 1.6907): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 960/1563 [25:02<13:47, 1.37s/it] Training 1/1 epoch (loss 1.6907): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 961/1563 [25:02<16:40, 1.66s/it] Training 1/1 epoch (loss 1.5960): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 961/1563 [25:04<16:40, 1.66s/it] Training 1/1 epoch (loss 1.5960): 
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 962/1563 [25:04<17:16, 1.72s/it] Training 1/1 epoch (loss 1.6204): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 962/1563 [25:06<17:16, 1.72s/it] Training 1/1 epoch (loss 1.6204): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 963/1563 [25:06<18:31, 1.85s/it] Training 1/1 epoch (loss 1.5922): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 963/1563 [25:07<18:31, 1.85s/it] Training 1/1 epoch (loss 1.5922): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 964/1563 [25:07<16:40, 1.67s/it] Training 1/1 epoch (loss 1.6296): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 964/1563 [25:09<16:40, 1.67s/it] Training 1/1 epoch (loss 1.6296): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 965/1563 [25:09<16:26, 1.65s/it] Training 1/1 epoch (loss 1.7278): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 965/1563 [25:11<16:26, 1.65s/it] Training 1/1 epoch (loss 1.7278): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 966/1563 [25:11<17:20, 1.74s/it] Training 1/1 epoch (loss 1.5258): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 966/1563 [25:12<17:20, 1.74s/it] Training 1/1 epoch (loss 1.5258): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 967/1563 [25:12<16:33, 1.67s/it] Training 1/1 epoch (loss 1.6301): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 967/1563 [25:14<16:33, 1.67s/it] Training 1/1 epoch (loss 1.6301): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 968/1563 [25:14<16:10, 1.63s/it] Training 1/1 epoch (loss 1.4069): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 968/1563 [25:15<16:10, 1.63s/it] Training 1/1 epoch (loss 1.4069): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 969/1563 [25:15<16:52, 1.70s/it] Training 1/1 epoch (loss 1.6634): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 969/1563 [25:17<16:52, 1.70s/it] Training 1/1 epoch (loss 1.6634): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 970/1563 [25:17<16:24, 1.66s/it] Training 1/1 epoch (loss 1.6001): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 970/1563 [25:18<16:24, 1.66s/it] Training 1/1 epoch (loss 1.6001): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 971/1563 [25:18<14:20, 1.45s/it] Training 1/1 epoch (loss 1.5123): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 971/1563 [25:20<14:20, 1.45s/it] Training 1/1 epoch (loss 1.5123): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 972/1563 [25:20<15:23, 1.56s/it] Training 1/1 epoch (loss 1.6109): 
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 972/1563 [25:21<15:23, 1.56s/it] Training 1/1 epoch (loss 1.6109): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 973/1563 [25:21<15:12, 1.55s/it] Training 1/1 epoch (loss 1.5364): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 973/1563 [25:23<15:12, 1.55s/it] Training 1/1 epoch (loss 1.5364): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 974/1563 [25:23<14:38, 1.49s/it] Training 1/1 epoch (loss 1.5585): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 974/1563 [25:25<14:38, 1.49s/it] Training 1/1 epoch (loss 1.5585): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 975/1563 [25:25<17:31, 1.79s/it] Training 1/1 epoch (loss 1.6555): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 975/1563 [25:27<17:31, 1.79s/it] Training 1/1 epoch (loss 1.6555): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 976/1563 [25:27<16:55, 1.73s/it] Training 1/1 epoch (loss 1.5269): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 976/1563 [25:28<16:55, 1.73s/it] Training 1/1 epoch (loss 1.5269): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 977/1563 [25:28<15:18, 1.57s/it] Training 1/1 epoch (loss 1.6337): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 977/1563 [25:29<15:18, 1.57s/it] Training 1/1 epoch (loss 1.6337): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 978/1563 [25:29<15:09, 1.56s/it] Training 1/1 epoch (loss 1.5394): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 978/1563 [25:31<15:09, 1.56s/it] Training 1/1 epoch (loss 1.5394): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 979/1563 [25:31<14:33, 1.50s/it] Training 1/1 epoch (loss 1.6516): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 979/1563 [25:32<14:33, 1.50s/it] Training 1/1 epoch (loss 1.6516): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 980/1563 [25:32<14:47, 1.52s/it] Training 1/1 epoch (loss 1.7402): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 980/1563 [25:35<14:47, 1.52s/it] Training 1/1 epoch (loss 1.7402): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 981/1563 [25:35<16:45, 1.73s/it] Training 1/1 epoch (loss 1.5921): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 981/1563 [25:36<16:45, 1.73s/it] Training 1/1 epoch (loss 1.5921): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 982/1563 [25:36<16:58, 1.75s/it] Training 1/1 epoch (loss 1.6839): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 982/1563 [25:38<16:58, 1.75s/it] Training 1/1 epoch (loss 1.6839): 
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 983/1563 [25:38<16:18, 1.69s/it] Training 1/1 epoch (loss 1.6992): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 983/1563 [25:39<16:18, 1.69s/it] Training 1/1 epoch (loss 1.6992): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 984/1563 [25:39<15:10, 1.57s/it] Training 1/1 epoch (loss 1.6478): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 984/1563 [25:42<15:10, 1.57s/it] Training 1/1 epoch (loss 1.6478): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 985/1563 [25:42<17:41, 1.84s/it] Training 1/1 epoch (loss 1.7797): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 985/1563 [25:44<17:41, 1.84s/it] Training 1/1 epoch (loss 1.7797): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 986/1563 [25:44<19:12, 2.00s/it] Training 1/1 epoch (loss 1.4420): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 986/1563 [25:45<19:12, 2.00s/it] Training 1/1 epoch (loss 1.4420): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 987/1563 [25:45<16:13, 1.69s/it] Training 1/1 epoch (loss 1.5493): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 987/1563 [25:46<16:13, 1.69s/it] Training 1/1 epoch (loss 1.5493): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 988/1563 [25:46<14:41, 1.53s/it] Training 1/1 epoch (loss 1.6185): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 988/1563 [25:47<14:41, 1.53s/it] Training 1/1 epoch (loss 1.6185): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 989/1563 [25:47<13:42, 1.43s/it] Training 1/1 epoch (loss 1.6530): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 989/1563 [25:49<13:42, 1.43s/it] Training 1/1 epoch (loss 1.6530): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 990/1563 [25:49<14:40, 1.54s/it] Training 1/1 epoch (loss 1.5944): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 990/1563 [25:51<14:40, 1.54s/it] Training 1/1 epoch (loss 1.5944): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 991/1563 [25:51<16:13, 1.70s/it] Training 1/1 epoch (loss 1.5502): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 991/1563 [25:53<16:13, 1.70s/it] Training 1/1 epoch (loss 1.5502): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 992/1563 [25:53<16:29, 1.73s/it] Training 1/1 epoch (loss 1.6800): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 992/1563 [25:55<16:29, 1.73s/it] Training 1/1 epoch (loss 1.6800): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 993/1563 [25:55<17:43, 1.87s/it] Training 1/1 epoch (loss 1.5176): 
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 993/1563 [25:56<17:43, 1.87s/it] Training 1/1 epoch (loss 1.5176): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 994/1563 [25:56<15:43, 1.66s/it] Training 1/1 epoch (loss 1.5543): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 994/1563 [25:58<15:43, 1.66s/it] Training 1/1 epoch (loss 1.5543): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 995/1563 [25:58<16:29, 1.74s/it] Training 1/1 epoch (loss 1.5258): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 995/1563 [26:00<16:29, 1.74s/it] Training 1/1 epoch (loss 1.5258): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 996/1563 [26:00<14:49, 1.57s/it] Training 1/1 epoch (loss 1.6612): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 996/1563 [26:02<14:49, 1.57s/it] Training 1/1 epoch (loss 1.6612): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 997/1563 [26:02<17:16, 1.83s/it] Training 1/1 epoch (loss 1.5775): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 997/1563 [26:03<17:16, 1.83s/it] Training 1/1 epoch (loss 1.5775): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 998/1563 [26:03<14:55, 1.59s/it] Training 1/1 epoch (loss 1.5693): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 998/1563 [26:05<14:55, 1.59s/it] Training 1/1 epoch (loss 1.5693): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 999/1563 [26:05<17:11, 1.83s/it] Training 1/1 epoch (loss 1.4555): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 999/1563 [26:07<17:11, 1.83s/it] Training 1/1 epoch (loss 1.4555): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1000/1563 [26:07<15:38, 1.67s/it] Training 1/1 epoch (loss 1.6524): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1000/1563 [26:09<15:38, 1.67s/it] Training 1/1 epoch (loss 1.6524): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1001/1563 [26:09<17:31, 1.87s/it] Training 1/1 epoch (loss 1.5550): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1001/1563 [26:11<17:31, 1.87s/it] Training 1/1 epoch (loss 1.5550): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1002/1563 [26:11<17:24, 1.86s/it] Training 1/1 epoch (loss 1.5881): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1002/1563 [26:12<17:24, 1.86s/it] Training 1/1 epoch (loss 1.5881): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1003/1563 [26:12<16:29, 1.77s/it] Training 1/1 epoch (loss 1.7481): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1003/1563 [26:15<16:29, 1.77s/it] Training 1/1 epoch (loss 
1.7481): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1004/1563 [26:15<18:10, 1.95s/it] Training 1/1 epoch (loss 1.6477): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1004/1563 [26:16<18:10, 1.95s/it] Training 1/1 epoch (loss 1.6477): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1005/1563 [26:16<16:24, 1.76s/it] Training 1/1 epoch (loss 1.5666): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1005/1563 [26:18<16:24, 1.76s/it] Training 1/1 epoch (loss 1.5666): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1006/1563 [26:18<18:01, 1.94s/it] Training 1/1 epoch (loss 1.5903): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1006/1563 [26:20<18:01, 1.94s/it] Training 1/1 epoch (loss 1.5903): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1007/1563 [26:20<17:50, 1.93s/it] Training 1/1 epoch (loss 1.6156): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1007/1563 [26:22<17:50, 1.93s/it] Training 1/1 epoch (loss 1.6156): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1008/1563 [26:22<18:19, 1.98s/it] Training 1/1 epoch (loss 1.5345): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1008/1563 [26:24<18:19, 1.98s/it] Training 1/1 epoch (loss 1.5345): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1009/1563 [26:24<17:52, 1.94s/it] Training 1/1 epoch (loss 1.6326): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1009/1563 [26:26<17:52, 1.94s/it] Training 1/1 epoch (loss 1.6326): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1010/1563 [26:26<18:21, 1.99s/it] Training 1/1 epoch (loss 1.6208): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1010/1563 [26:29<18:21, 1.99s/it] Training 1/1 epoch (loss 1.6208): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1011/1563 [26:29<18:36, 2.02s/it] Training 1/1 epoch (loss 1.6111): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1011/1563 [26:31<18:36, 2.02s/it] Training 1/1 epoch (loss 1.6111): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1012/1563 [26:31<18:48, 2.05s/it] Training 1/1 epoch (loss 1.6332): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1012/1563 [26:32<18:48, 2.05s/it] Training 1/1 epoch (loss 1.6332): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1013/1563 [26:32<16:41, 1.82s/it] Training 1/1 epoch (loss 1.5596): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1013/1563 [26:33<16:41, 1.82s/it] Training 1/1 epoch (loss 1.5596): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1014/1563 [26:33<15:48, 1.73s/it] Training 
1/1 epoch (loss 1.5916): 65%|██████▍ | 1014/1563 [26:35<15:48, 1.73s/it] Training 1/1 epoch (loss 1.5916): 65%|██████▍ | 1015/1563 [26:35<14:49, 1.62s/it] Training 1/1 epoch (loss 1.5573): 65%|██████▍ | 1015/1563 [26:38<14:49, 1.62s/it] Training 1/1 epoch (loss 1.5573): 65%|██████▌ | 1016/1563 [26:38<17:41, 1.94s/it] Training 1/1 epoch (loss 1.5860): 65%|██████▌ | 1016/1563 [26:39<17:41, 1.94s/it] Training 1/1 epoch (loss 1.5860): 65%|██████▌ | 1017/1563 [26:39<16:41, 1.83s/it] Training 1/1 epoch (loss 1.5891): 65%|██████▌ | 1017/1563 [26:41<16:41, 1.83s/it] Training 1/1 epoch (loss 1.5891): 65%|██████▌ | 1018/1563 [26:41<15:39, 1.72s/it] Training 1/1 epoch (loss 1.4698): 65%|██████▌ | 1018/1563 [26:42<15:39, 1.72s/it] Training 1/1 epoch (loss 1.4698): 65%|██████▌ | 1019/1563 [26:42<15:25, 1.70s/it] Training 1/1 epoch (loss 1.5475): 65%|██████▌ | 1019/1563 [26:44<15:25, 1.70s/it] Training 1/1 epoch (loss 1.5475): 65%|██████▌ | 1020/1563 [26:44<14:55, 1.65s/it] Training 1/1 epoch (loss 1.7190): 65%|██████▌ | 1020/1563 [26:45<14:55, 1.65s/it] Training 1/1 epoch (loss 1.7190): 65%|██████▌ | 1021/1563 [26:45<14:34, 1.61s/it] Training 1/1 epoch (loss 1.6715): 65%|██████▌ | 1021/1563 [26:47<14:34, 1.61s/it] Training 1/1 epoch (loss 1.6715): 65%|██████▌ | 1022/1563 [26:47<14:27, 1.60s/it] Training 1/1 epoch (loss 1.5820): 65%|██████▌ | 1022/1563 [26:49<14:27, 1.60s/it] Training 1/1 epoch (loss 1.5820): 65%|██████▌ | 1023/1563 [26:49<15:38, 1.74s/it] Training 1/1 epoch (loss 1.5760): 65%|██████▌ | 1023/1563 [26:51<15:38, 1.74s/it] Training 1/1 epoch (loss 1.5760): 66%|██████▌ | 1024/1563 [26:51<15:49, 1.76s/it] Training 1/1 epoch (loss 1.5693): 66%|██████▌ | 1024/1563 [26:52<15:49, 
1.76s/it] Training 1/1 epoch (loss 1.5693): 66%|██████▌ | 1025/1563 [26:52<15:18, 1.71s/it] Training 1/1 epoch (loss 1.6578): 66%|██████▌ | 1025/1563 [26:55<15:18, 1.71s/it] Training 1/1 epoch (loss 1.6578): 66%|██████▌ | 1026/1563 [26:55<17:12, 1.92s/it] Training 1/1 epoch (loss 1.5860): 66%|██████▌ | 1026/1563 [26:56<17:12, 1.92s/it] Training 1/1 epoch (loss 1.5860): 66%|██████▌ | 1027/1563 [26:56<15:38, 1.75s/it] Training 1/1 epoch (loss 1.6074): 66%|██████▌ | 1027/1563 [26:58<15:38, 1.75s/it] Training 1/1 epoch (loss 1.6074): 66%|██████▌ | 1028/1563 [26:58<16:14, 1.82s/it] Training 1/1 epoch (loss 1.5825): 66%|██████▌ | 1028/1563 [26:59<16:14, 1.82s/it] Training 1/1 epoch (loss 1.5825): 66%|██████▌ | 1029/1563 [26:59<14:51, 1.67s/it] Training 1/1 epoch (loss 1.5254): 66%|██████▌ | 1029/1563 [27:01<14:51, 1.67s/it] Training 1/1 epoch (loss 1.5254): 66%|██████▌ | 1030/1563 [27:01<15:37, 1.76s/it] Training 1/1 epoch (loss 1.5486): 66%|██████▌ | 1030/1563 [27:03<15:37, 1.76s/it] Training 1/1 epoch (loss 1.5486): 66%|██████▌ | 1031/1563 [27:03<14:04, 1.59s/it] Training 1/1 epoch (loss 1.5689): 66%|██████▌ | 1031/1563 [27:04<14:04, 1.59s/it] Training 1/1 epoch (loss 1.5689): 66%|██████▌ | 1032/1563 [27:04<14:44, 1.67s/it] Training 1/1 epoch (loss 1.6389): 66%|██████▌ | 1032/1563 [27:06<14:44, 1.67s/it] Training 1/1 epoch (loss 1.6389): 66%|██████▌ | 1033/1563 [27:06<13:36, 1.54s/it] Training 1/1 epoch (loss 1.6808): 66%|██████▌ | 1033/1563 [27:07<13:36, 1.54s/it] Training 1/1 epoch (loss 1.6808): 66%|██████▌ | 1034/1563 [27:07<12:59, 1.47s/it] Training 1/1 epoch (loss 1.6457): 66%|██████▌ | 1034/1563 [27:09<12:59, 1.47s/it] Training 1/1 epoch (loss 1.6457): 66%|██████▌ | 
1035/1563 [27:09<13:46, 1.57s/it] Training 1/1 epoch (loss 1.6168): 66%|██████▌ | 1035/1563 [27:11<13:46, 1.57s/it] Training 1/1 epoch (loss 1.6168): 66%|██████▋ | 1036/1563 [27:11<14:30, 1.65s/it] Training 1/1 epoch (loss 1.5192): 66%|██████▋ | 1036/1563 [27:12<14:30, 1.65s/it] Training 1/1 epoch (loss 1.5192): 66%|██████▋ | 1037/1563 [27:12<14:10, 1.62s/it] Training 1/1 epoch (loss 1.6063): 66%|██████▋ | 1037/1563 [27:14<14:10, 1.62s/it] Training 1/1 epoch (loss 1.6063): 66%|██████▋ | 1038/1563 [27:14<14:20, 1.64s/it] Training 1/1 epoch (loss 1.5013): 66%|██████▋ | 1038/1563 [27:16<14:20, 1.64s/it] Training 1/1 epoch (loss 1.5013): 66%|██████▋ | 1039/1563 [27:16<15:03, 1.72s/it] Training 1/1 epoch (loss 1.5817): 66%|██████▋ | 1039/1563 [27:17<15:03, 1.72s/it] Training 1/1 epoch (loss 1.5817): 67%|██████▋ | 1040/1563 [27:17<14:46, 1.70s/it] Training 1/1 epoch (loss 1.6513): 67%|██████▋ | 1040/1563 [27:19<14:46, 1.70s/it] Training 1/1 epoch (loss 1.6513): 67%|██████▋ | 1041/1563 [27:19<14:22, 1.65s/it] Training 1/1 epoch (loss 1.6229): 67%|██████▋ | 1041/1563 [27:20<14:22, 1.65s/it] Training 1/1 epoch (loss 1.6229): 67%|██████▋ | 1042/1563 [27:20<12:52, 1.48s/it] Training 1/1 epoch (loss 1.5048): 67%|██████▋ | 1042/1563 [27:22<12:52, 1.48s/it] Training 1/1 epoch (loss 1.5048): 67%|██████▋ | 1043/1563 [27:22<13:05, 1.51s/it] Training 1/1 epoch (loss 1.6308): 67%|██████▋ | 1043/1563 [27:23<13:05, 1.51s/it] Training 1/1 epoch (loss 1.6308): 67%|██████▋ | 1044/1563 [27:23<13:19, 1.54s/it] Training 1/1 epoch (loss 1.5681): 67%|██████▋ | 1044/1563 [27:25<13:19, 1.54s/it] Training 1/1 epoch (loss 1.5681): 67%|██████▋ | 1045/1563 [27:25<13:44, 1.59s/it] Training 1/1 epoch (loss 1.6329): 
67%|██████▋ | 1045/1563 [27:26<13:44, 1.59s/it] Training 1/1 epoch (loss 1.6329): 67%|██████▋ | 1046/1563 [27:26<13:25, 1.56s/it] Training 1/1 epoch (loss 1.6456): 67%|██████▋ | 1046/1563 [27:28<13:25, 1.56s/it] Training 1/1 epoch (loss 1.6456): 67%|██████▋ | 1047/1563 [27:28<13:36, 1.58s/it] Training 1/1 epoch (loss 1.7347): 67%|██████▋ | 1047/1563 [27:31<13:36, 1.58s/it] Training 1/1 epoch (loss 1.7347): 67%|██████▋ | 1048/1563 [27:31<16:11, 1.89s/it] Training 1/1 epoch (loss 1.6032): 67%|██████▋ | 1048/1563 [27:32<16:11, 1.89s/it] Training 1/1 epoch (loss 1.6032): 67%|██████▋ | 1049/1563 [27:32<15:21, 1.79s/it] Training 1/1 epoch (loss 1.5181): 67%|██████▋ | 1049/1563 [27:33<15:21, 1.79s/it] Training 1/1 epoch (loss 1.5181): 67%|██████▋ | 1050/1563 [27:33<13:40, 1.60s/it] Training 1/1 epoch (loss 1.5362): 67%|██████▋ | 1050/1563 [27:35<13:40, 1.60s/it] Training 1/1 epoch (loss 1.5362): 67%|██████▋ | 1051/1563 [27:35<12:37, 1.48s/it] Training 1/1 epoch (loss 1.5558): 67%|██████▋ | 1051/1563 [27:36<12:37, 1.48s/it] Training 1/1 epoch (loss 1.5558): 67%|██████▋ | 1052/1563 [27:36<12:11, 1.43s/it] Training 1/1 epoch (loss 1.4220): 67%|██████▋ | 1052/1563 [27:37<12:11, 1.43s/it] Training 1/1 epoch (loss 1.4220): 67%|██████▋ | 1053/1563 [27:37<12:07, 1.43s/it] Training 1/1 epoch (loss 1.6120): 67%|██████▋ | 1053/1563 [27:39<12:07, 1.43s/it] Training 1/1 epoch (loss 1.6120): 67%|██████▋ | 1054/1563 [27:39<12:36, 1.49s/it] Training 1/1 epoch (loss 1.6625): 67%|██████▋ | 1054/1563 [27:40<12:36, 1.49s/it] Training 1/1 epoch (loss 1.6625): 67%|██████▋ | 1055/1563 [27:40<11:58, 1.41s/it] Training 1/1 epoch (loss 1.5365): 67%|██████▋ | 1055/1563 [27:42<11:58, 1.41s/it] Training 1/1 
epoch (loss 1.5365): 68%|██████▊ | 1056/1563 [27:42<12:52, 1.52s/it] Training 1/1 epoch (loss 1.5319): 68%|██████▊ | 1056/1563 [27:43<12:52, 1.52s/it] Training 1/1 epoch (loss 1.5319): 68%|██████▊ | 1057/1563 [27:43<12:46, 1.52s/it] Training 1/1 epoch (loss 1.5598): 68%|██████▊ | 1057/1563 [27:45<12:46, 1.52s/it] Training 1/1 epoch (loss 1.5598): 68%|██████▊ | 1058/1563 [27:45<14:01, 1.67s/it] Training 1/1 epoch (loss 1.5509): 68%|██████▊ | 1058/1563 [27:47<14:01, 1.67s/it] Training 1/1 epoch (loss 1.5509): 68%|██████▊ | 1059/1563 [27:47<13:06, 1.56s/it] Training 1/1 epoch (loss 1.4792): 68%|██████▊ | 1059/1563 [27:48<13:06, 1.56s/it] Training 1/1 epoch (loss 1.4792): 68%|██████▊ | 1060/1563 [27:48<13:29, 1.61s/it] Training 1/1 epoch (loss 1.6054): 68%|██████▊ | 1060/1563 [27:50<13:29, 1.61s/it] Training 1/1 epoch (loss 1.6054): 68%|██████▊ | 1061/1563 [27:50<12:27, 1.49s/it] Training 1/1 epoch (loss 1.5769): 68%|██████▊ | 1061/1563 [27:52<12:27, 1.49s/it] Training 1/1 epoch (loss 1.5769): 68%|██████▊ | 1062/1563 [27:52<14:28, 1.73s/it] Training 1/1 epoch (loss 1.6202): 68%|██████▊ | 1062/1563 [27:53<14:28, 1.73s/it] Training 1/1 epoch (loss 1.6202): 68%|██████▊ | 1063/1563 [27:53<13:44, 1.65s/it] Training 1/1 epoch (loss 1.6606): 68%|██████▊ | 1063/1563 [27:55<13:44, 1.65s/it] Training 1/1 epoch (loss 1.6606): 68%|██████▊ | 1064/1563 [27:55<13:08, 1.58s/it] Training 1/1 epoch (loss 1.5268): 68%|██████▊ | 1064/1563 [27:57<13:08, 1.58s/it] Training 1/1 epoch (loss 1.5268): 68%|██████▊ | 1065/1563 [27:57<14:08, 1.70s/it] Training 1/1 epoch (loss 1.5504): 68%|██████▊ | 1065/1563 [27:59<14:08, 1.70s/it] Training 1/1 epoch (loss 1.5504): 68%|██████▊ | 1066/1563 [27:59<14:52, 
1.80s/it] Training 1/1 epoch (loss 1.6087): 68%|██████▊ | 1066/1563 [28:00<14:52, 1.80s/it] Training 1/1 epoch (loss 1.6087): 68%|██████▊ | 1067/1563 [28:00<13:04, 1.58s/it] Training 1/1 epoch (loss 1.5111): 68%|██████▊ | 1067/1563 [28:02<13:04, 1.58s/it] Training 1/1 epoch (loss 1.5111): 68%|██████▊ | 1068/1563 [28:02<13:50, 1.68s/it] Training 1/1 epoch (loss 1.5786): 68%|██████▊ | 1068/1563 [28:03<13:50, 1.68s/it] Training 1/1 epoch (loss 1.5786): 68%|██████▊ | 1069/1563 [28:03<13:04, 1.59s/it] Training 1/1 epoch (loss 1.4551): 68%|██████▊ | 1069/1563 [28:04<13:04, 1.59s/it] Training 1/1 epoch (loss 1.4551): 68%|██████▊ | 1070/1563 [28:04<12:01, 1.46s/it] Training 1/1 epoch (loss 1.6053): 68%|██████▊ | 1070/1563 [28:06<12:01, 1.46s/it] Training 1/1 epoch (loss 1.6053): 69%|██████▊ | 1071/1563 [28:06<13:05, 1.60s/it] Training 1/1 epoch (loss 1.5134): 69%|██████▊ | 1071/1563 [28:08<13:05, 1.60s/it] Training 1/1 epoch (loss 1.5134): 69%|██████▊ | 1072/1563 [28:08<14:04, 1.72s/it] Training 1/1 epoch (loss 1.6019): 69%|██████▊ | 1072/1563 [28:10<14:04, 1.72s/it] Training 1/1 epoch (loss 1.6019): 69%|██████▊ | 1073/1563 [28:10<13:51, 1.70s/it] Training 1/1 epoch (loss 1.5496): 69%|██████▊ | 1073/1563 [28:12<13:51, 1.70s/it] Training 1/1 epoch (loss 1.5496): 69%|██████▊ | 1074/1563 [28:12<14:15, 1.75s/it] Training 1/1 epoch (loss 1.5363): 69%|██████▊ | 1074/1563 [28:13<14:15, 1.75s/it] Training 1/1 epoch (loss 1.5363): 69%|██████▉ | 1075/1563 [28:13<12:52, 1.58s/it] Training 1/1 epoch (loss 1.4970): 69%|██████▉ | 1075/1563 [28:15<12:52, 1.58s/it] Training 1/1 epoch (loss 1.4970): 69%|██████▉ | 1076/1563 [28:15<14:10, 1.75s/it] Training 1/1 epoch (loss 1.6210): 69%|██████▉ | 
1076/1563 [28:16<14:10, 1.75s/it] Training 1/1 epoch (loss 1.6210): 69%|██████▉ | 1077/1563 [28:16<12:31, 1.55s/it] Training 1/1 epoch (loss 1.5829): 69%|██████▉ | 1077/1563 [28:17<12:31, 1.55s/it] Training 1/1 epoch (loss 1.5829): 69%|██████▉ | 1078/1563 [28:17<11:37, 1.44s/it] Training 1/1 epoch (loss 1.5332): 69%|██████▉ | 1078/1563 [28:19<11:37, 1.44s/it] Training 1/1 epoch (loss 1.5332): 69%|██████▉ | 1079/1563 [28:19<11:29, 1.42s/it] Training 1/1 epoch (loss 1.6309): 69%|██████▉ | 1079/1563 [28:21<11:29, 1.42s/it] Training 1/1 epoch (loss 1.6309): 69%|██████▉ | 1080/1563 [28:21<13:29, 1.67s/it] Training 1/1 epoch (loss 1.6129): 69%|██████▉ | 1080/1563 [28:23<13:29, 1.67s/it] Training 1/1 epoch (loss 1.6129): 69%|██████▉ | 1081/1563 [28:23<13:32, 1.69s/it] Training 1/1 epoch (loss 1.5442): 69%|██████▉ | 1081/1563 [28:24<13:32, 1.69s/it] Training 1/1 epoch (loss 1.5442): 69%|██████▉ | 1082/1563 [28:24<13:34, 1.69s/it] Training 1/1 epoch (loss 1.5638): 69%|██████▉ | 1082/1563 [28:26<13:34, 1.69s/it] Training 1/1 epoch (loss 1.5638): 69%|██████▉ | 1083/1563 [28:26<13:01, 1.63s/it] Training 1/1 epoch (loss 1.5469): 69%|██████▉ | 1083/1563 [28:28<13:01, 1.63s/it] Training 1/1 epoch (loss 1.5469): 69%|██████▉ | 1084/1563 [28:28<13:39, 1.71s/it] Training 1/1 epoch (loss 1.5190): 69%|██████▉ | 1084/1563 [28:29<13:39, 1.71s/it] Training 1/1 epoch (loss 1.5190): 69%|██████▉ | 1085/1563 [28:29<12:56, 1.62s/it] Training 1/1 epoch (loss 1.4708): 69%|██████▉ | 1085/1563 [28:31<12:56, 1.62s/it] Training 1/1 epoch (loss 1.4708): 69%|██████▉ | 1086/1563 [28:31<12:46, 1.61s/it] Training 1/1 epoch (loss 1.6043): 69%|██████▉ | 1086/1563 [28:32<12:46, 1.61s/it] Training 1/1 epoch (loss 1.6043): 
70%|██████▉ | 1087/1563 [28:32<12:45, 1.61s/it] Training 1/1 epoch (loss 1.5559): 70%|██████▉ | 1087/1563 [28:35<12:45, 1.61s/it] Training 1/1 epoch (loss 1.5559): 70%|██████▉ | 1088/1563 [28:35<14:18, 1.81s/it] Training 1/1 epoch (loss 1.5125): 70%|██████▉ | 1088/1563 [28:36<14:18, 1.81s/it] Training 1/1 epoch (loss 1.5125): 70%|██████▉ | 1089/1563 [28:36<13:36, 1.72s/it] Training 1/1 epoch (loss 1.6795): 70%|██████▉ | 1089/1563 [28:37<13:36, 1.72s/it] Training 1/1 epoch (loss 1.6795): 70%|██████▉ | 1090/1563 [28:37<12:03, 1.53s/it] Training 1/1 epoch (loss 1.4483): 70%|██████▉ | 1090/1563 [28:39<12:03, 1.53s/it] Training 1/1 epoch (loss 1.4483): 70%|██████▉ | 1091/1563 [28:39<13:01, 1.66s/it] Training 1/1 epoch (loss 1.5785): 70%|██████▉ | 1091/1563 [28:40<13:01, 1.66s/it] Training 1/1 epoch (loss 1.5785): 70%|██████▉ | 1092/1563 [28:40<11:05, 1.41s/it] Training 1/1 epoch (loss 1.6033): 70%|██████▉ | 1092/1563 [28:42<11:05, 1.41s/it] Training 1/1 epoch (loss 1.6033): 70%|██████▉ | 1093/1563 [28:42<11:45, 1.50s/it] Training 1/1 epoch (loss 1.5694): 70%|██████▉ | 1093/1563 [28:44<11:45, 1.50s/it] Training 1/1 epoch (loss 1.5694): 70%|██████▉ | 1094/1563 [28:44<12:28, 1.60s/it] Training 1/1 epoch (loss 1.5165): 70%|██████▉ | 1094/1563 [28:46<12:28, 1.60s/it] Training 1/1 epoch (loss 1.5165): 70%|███████ | 1095/1563 [28:46<13:46, 1.77s/it] Training 1/1 epoch (loss 1.5548): 70%|███████ | 1095/1563 [28:48<13:46, 1.77s/it] Training 1/1 epoch (loss 1.5548): 70%|███████ | 1096/1563 [28:48<14:08, 1.82s/it] Training 1/1 epoch (loss 1.5253): 70%|███████ | 1096/1563 [28:49<14:08, 1.82s/it] Training 1/1 epoch (loss 1.5253): 70%|███████ | 1097/1563 [28:49<13:40, 1.76s/it] Training 1/1 
epoch (loss 1.4921): 70%|███████ | 1097/1563 [28:51<13:40, 1.76s/it] Training 1/1 epoch (loss 1.4921): 70%|███████ | 1098/1563 [28:51<13:29, 1.74s/it] Training 1/1 epoch (loss 1.5856): 70%|███████ | 1098/1563 [28:54<13:29, 1.74s/it] Training 1/1 epoch (loss 1.5856): 70%|███████ | 1099/1563 [28:54<15:11, 1.96s/it] Training 1/1 epoch (loss 1.4902): 70%|███████ | 1099/1563 [28:56<15:11, 1.96s/it] Training 1/1 epoch (loss 1.4902): 70%|███████ | 1100/1563 [28:56<15:24, 2.00s/it] Training 1/1 epoch (loss 1.6002): 70%|███████ | 1100/1563 [28:58<15:24, 2.00s/it] Training 1/1 epoch (loss 1.6002): 70%|███████ | 1101/1563 [28:58<15:18, 1.99s/it] Training 1/1 epoch (loss 1.5706): 70%|███████ | 1101/1563 [29:00<15:18, 1.99s/it] Training 1/1 epoch (loss 1.5706): 71%|███████ | 1102/1563 [29:00<15:05, 1.97s/it] Training 1/1 epoch (loss 1.5558): 71%|███████ | 1102/1563 [29:01<15:05, 1.97s/it] Training 1/1 epoch (loss 1.5558): 71%|███████ | 1103/1563 [29:01<14:25, 1.88s/it] Training 1/1 epoch (loss 1.6267): 71%|███████ | 1103/1563 [29:03<14:25, 1.88s/it] Training 1/1 epoch (loss 1.6267): 71%|███████ | 1104/1563 [29:03<13:20, 1.74s/it] Training 1/1 epoch (loss 1.6426): 71%|███████ | 1104/1563 [29:04<13:20, 1.74s/it] Training 1/1 epoch (loss 1.6426): 71%|███████ | 1105/1563 [29:04<12:38, 1.66s/it] Training 1/1 epoch (loss 1.5071): 71%|███████ | 1105/1563 [29:05<12:38, 1.66s/it] Training 1/1 epoch (loss 1.5071): 71%|███████ | 1106/1563 [29:05<11:51, 1.56s/it] Training 1/1 epoch (loss 1.6021): 71%|███████ | 1106/1563 [29:07<11:51, 1.56s/it] Training 1/1 epoch (loss 1.6021): 71%|███████ | 1107/1563 [29:07<10:58, 1.44s/it] Training 1/1 epoch (loss 1.4158): 71%|███████ | 1107/1563 [29:08<10:58, 
1.44s/it] Training 1/1 epoch (loss 1.4158): 71%|███████ | 1108/1563 [29:08<11:35, 1.53s/it] Training 1/1 epoch (loss 1.4919): 71%|███████ | 1108/1563 [29:10<11:35, 1.53s/it] Training 1/1 epoch (loss 1.4919): 71%|███████ | 1109/1563 [29:10<12:39, 1.67s/it] Training 1/1 epoch (loss 1.6083): 71%|███████ | 1109/1563 [29:11<12:39, 1.67s/it] Training 1/1 epoch (loss 1.6083): 71%|███████ | 1110/1563 [29:11<10:47, 1.43s/it] Training 1/1 epoch (loss 1.6122): 71%|███████ | 1110/1563 [29:13<10:47, 1.43s/it] Training 1/1 epoch (loss 1.6122): 71%|███████ | 1111/1563 [29:13<12:07, 1.61s/it] Training 1/1 epoch (loss 1.4778): 71%|███████ | 1111/1563 [29:15<12:07, 1.61s/it] Training 1/1 epoch (loss 1.4778): 71%|███████ | 1112/1563 [29:15<13:26, 1.79s/it] Training 1/1 epoch (loss 1.5645): 71%|███████ | 1112/1563 [29:17<13:26, 1.79s/it] Training 1/1 epoch (loss 1.5645): 71%|███████ | 1113/1563 [29:17<13:12, 1.76s/it] Training 1/1 epoch (loss 1.6198): 71%|███████ | 1113/1563 [29:19<13:12, 1.76s/it] Training 1/1 epoch (loss 1.6198): 71%|███████▏ | 1114/1563 [29:19<12:57, 1.73s/it] Training 1/1 epoch (loss 1.6099): 71%|███████▏ | 1114/1563 [29:21<12:57, 1.73s/it] Training 1/1 epoch (loss 1.6099): 71%|███████▏ | 1115/1563 [29:21<13:47, 1.85s/it] Training 1/1 epoch (loss 1.5947): 71%|███████▏ | 1115/1563 [29:22<13:47, 1.85s/it] Training 1/1 epoch (loss 1.5947): 71%|███████▏ | 1116/1563 [29:22<12:17, 1.65s/it] Training 1/1 epoch (loss 1.7438): 71%|███████▏ | 1116/1563 [29:24<12:17, 1.65s/it] Training 1/1 epoch (loss 1.7438): 71%|███████▏ | 1117/1563 [29:24<13:18, 1.79s/it] Training 1/1 epoch (loss 1.6474): 71%|███████▏ | 1117/1563 [29:25<13:18, 1.79s/it] Training 1/1 epoch (loss 1.6474): 
72%|███████▏ | 1118/1563 [29:25<11:52, 1.60s/it] Training 1/1 epoch (loss 1.5235): 72%|███████▏ | 1118/1563 [29:26<11:52, 1.60s/it] Training 1/1 epoch (loss 1.5235): 72%|███████▏ | 1119/1563 [29:26<10:21, 1.40s/it] Training 1/1 epoch (loss 1.5361): 72%|███████▏ | 1119/1563 [29:28<10:21, 1.40s/it] Training 1/1 epoch (loss 1.5361): 72%|███████▏ | 1120/1563 [29:28<10:40, 1.45s/it] Training 1/1 epoch (loss 1.5711): 72%|███████▏ | 1120/1563 [29:30<10:40, 1.45s/it] Training 1/1 epoch (loss 1.5711): 72%|███████▏ | 1121/1563 [29:30<12:39, 1.72s/it] Training 1/1 epoch (loss 1.6188): 72%|███████▏ | 1121/1563 [29:32<12:39, 1.72s/it] Training 1/1 epoch (loss 1.6188): 72%|███████▏ | 1122/1563 [29:32<13:14, 1.80s/it] Training 1/1 epoch (loss 1.5286): 72%|███████▏ | 1122/1563 [29:35<13:14, 1.80s/it] Training 1/1 epoch (loss 1.5286): 72%|███████▏ | 1123/1563 [29:35<14:36, 1.99s/it] Training 1/1 epoch (loss 1.5728): 72%|███████▏ | 1123/1563 [29:37<14:36, 1.99s/it] Training 1/1 epoch (loss 1.5728): 72%|███████▏ | 1124/1563 [29:37<14:22, 1.97s/it] Training 1/1 epoch (loss 1.5628): 72%|███████▏ | 1124/1563 [29:38<14:22, 1.97s/it] Training 1/1 epoch (loss 1.5628): 72%|███████▏ | 1125/1563 [29:38<12:47, 1.75s/it] Training 1/1 epoch (loss 1.5865): 72%|███████▏ | 1125/1563 [29:40<12:47, 1.75s/it] Training 1/1 epoch (loss 1.5865): 72%|███████▏ | 1126/1563 [29:40<12:55, 1.77s/it] Training 1/1 epoch (loss 1.5343): 72%|███████▏ | 1126/1563 [29:42<12:55, 1.77s/it] Training 1/1 epoch (loss 1.5343): 72%|███████▏ | 1127/1563 [29:42<14:31, 2.00s/it] Training 1/1 epoch (loss 1.5646): 72%|███████▏ | 1127/1563 [29:44<14:31, 2.00s/it] Training 1/1 epoch (loss 1.5646): 
72%|███████▏ | 1128/1563 [29:44<14:14, 1.97s/it] Training 1/1 epoch (loss 1.5956): 72%|███████▏ | 1128/1563 [29:45<14:14, 1.97s/it] Training 1/1 epoch (loss 1.5956): 72%|███████▏ | 1129/1563 [29:45<12:07, 1.68s/it] Training 1/1 epoch (loss 1.5309): 72%|███████▏ | 1129/1563 [29:47<12:07, 1.68s/it] Training 1/1 epoch (loss 1.5309): 72%|███████▏ | 1130/1563 [29:47<12:34, 1.74s/it] Training 1/1 epoch (loss 1.5820): 72%|███████▏ | 1130/1563 [29:49<12:34, 1.74s/it] Training 1/1 epoch (loss 1.5820): 72%|███████▏ | 1131/1563 [29:49<12:56, 1.80s/it] Training 1/1 epoch (loss 1.5951): 72%|███████▏ | 1131/1563 [29:51<12:56, 1.80s/it] Training 1/1 epoch (loss 1.5951): 72%|███████▏ | 1132/1563 [29:51<12:54, 1.80s/it] Training 1/1 epoch (loss 1.5135): 72%|███████▏ | 1132/1563 [29:52<12:54, 1.80s/it] Training 1/1 epoch (loss 1.5135): 72%|███████▏ | 1133/1563 [29:52<12:26, 1.74s/it] Training 1/1 epoch (loss 1.5846): 72%|███████▏ | 1133/1563 [29:53<12:26, 1.74s/it] Training 1/1 epoch (loss 1.5846): 73%|███████▎ | 1134/1563 [29:53<11:17, 1.58s/it] Training 1/1 epoch (loss 1.4995): 73%|███████▎ | 1134/1563 [29:55<11:17, 1.58s/it] Training 1/1 epoch (loss 1.4995): 73%|███████▎ | 1135/1563 [29:55<11:55, 1.67s/it] Training 1/1 epoch (loss 1.5569): 73%|███████▎ | 1135/1563 [29:58<11:55, 1.67s/it] Training 1/1 epoch (loss 1.5569): 73%|███████▎ | 1136/1563 [29:58<13:09, 1.85s/it] Training 1/1 epoch (loss 1.5305): 73%|███████▎ | 1136/1563 [30:00<13:09, 1.85s/it] Training 1/1 epoch (loss 1.5305): 73%|███████▎ | 1137/1563 [30:00<13:58, 1.97s/it] Training 1/1 epoch (loss 1.5535): 73%|███████▎ | 1137/1563 [30:02<13:58, 1.97s/it] Training 1/1 epoch (loss 1.5535): 
73%|███████▎ | 1138/1563 [30:02<14:56, 2.11s/it] Training 1/1 epoch (loss 1.5880): 73%|███████▎ | 1138/1563 [30:04<14:56, 2.11s/it] Training 1/1 epoch (loss 1.5880): 73%|███████▎ | 1139/1563 [30:04<13:26, 1.90s/it] Training 1/1 epoch (loss 1.5975): 73%|███████▎ | 1139/1563 [30:06<13:26, 1.90s/it] Training 1/1 epoch (loss 1.5975): 73%|███████▎ | 1140/1563 [30:06<14:26, 2.05s/it] Training 1/1 epoch (loss 1.6251): 73%|███████▎ | 1140/1563 [30:07<14:26, 2.05s/it] Training 1/1 epoch (loss 1.6251): 73%|███████▎ | 1141/1563 [30:07<12:39, 1.80s/it] Training 1/1 epoch (loss 1.5159): 73%|███████▎ | 1141/1563 [30:09<12:39, 1.80s/it] Training 1/1 epoch (loss 1.5159): 73%|███████▎ | 1142/1563 [30:09<11:43, 1.67s/it] Training 1/1 epoch (loss 1.6133): 73%|███████▎ | 1142/1563 [30:10<11:43, 1.67s/it] Training 1/1 epoch (loss 1.6133): 73%|███████▎ | 1143/1563 [30:10<11:31, 1.65s/it] Training 1/1 epoch (loss 1.6112): 73%|███████▎ | 1143/1563 [30:12<11:31, 1.65s/it] Training 1/1 epoch (loss 1.6112): 73%|███████▎ | 1144/1563 [30:12<12:34, 1.80s/it] Training 1/1 epoch (loss 1.5841): 73%|███████▎ | 1144/1563 [30:14<12:34, 1.80s/it] Training 1/1 epoch (loss 1.5841): 73%|███████▎ | 1145/1563 [30:14<13:04, 1.88s/it] Training 1/1 epoch (loss 1.5232): 73%|███████▎ | 1145/1563 [30:16<13:04, 1.88s/it] Training 1/1 epoch (loss 1.5232): 73%|███████▎ | 1146/1563 [30:16<13:17, 1.91s/it] Training 1/1 epoch (loss 1.5675): 73%|███████▎ | 1146/1563 [30:18<13:17, 1.91s/it] Training 1/1 epoch (loss 1.5675): 73%|███████▎ | 1147/1563 [30:18<12:40, 1.83s/it] Training 1/1 epoch (loss 1.6910): 73%|███████▎ | 1147/1563 [30:19<12:40, 1.83s/it] Training 1/1 epoch (loss 1.6910): 
73%|███████▎ | 1148/1563 [30:19<11:21, 1.64s/it] Training 1/1 epoch (loss 1.6133): 73%|███████▎ | 1148/1563 [30:21<11:21, 1.64s/it] Training 1/1 epoch (loss 1.6133): 74%|███████▎ | 1149/1563 [30:21<12:20, 1.79s/it] Training 1/1 epoch (loss 1.4759): 74%|███████▎ | 1149/1563 [30:23<12:20, 1.79s/it] Training 1/1 epoch (loss 1.4759): 74%|███████▎ | 1150/1563 [30:23<11:50, 1.72s/it] Training 1/1 epoch (loss 1.5454): 74%|███████▎ | 1150/1563 [30:25<11:50, 1.72s/it] Training 1/1 epoch (loss 1.5454): 74%|███████▎ | 1151/1563 [30:25<12:45, 1.86s/it] Training 1/1 epoch (loss 1.5510): 74%|███████▎ | 1151/1563 [30:28<12:45, 1.86s/it] Training 1/1 epoch (loss 1.5510): 74%|███████▎ | 1152/1563 [30:28<13:42, 2.00s/it] Training 1/1 epoch (loss 1.5554): 74%|███████▎ | 1152/1563 [30:30<13:42, 2.00s/it] Training 1/1 epoch (loss 1.5554): 74%|███████▍ | 1153/1563 [30:30<13:54, 2.04s/it] Training 1/1 epoch (loss 1.5979): 74%|███████▍ | 1153/1563 [30:31<13:54, 2.04s/it] Training 1/1 epoch (loss 1.5979): 74%|███████▍ | 1154/1563 [30:31<11:43, 1.72s/it] Training 1/1 epoch (loss 1.6131): 74%|███████▍ | 1154/1563 [30:32<11:43, 1.72s/it] Training 1/1 epoch (loss 1.6131): 74%|███████▍ | 1155/1563 [30:32<10:49, 1.59s/it] Training 1/1 epoch (loss 1.6045): 74%|███████▍ | 1155/1563 [30:33<10:49, 1.59s/it] Training 1/1 epoch (loss 1.6045): 74%|███████▍ | 1156/1563 [30:33<10:36, 1.56s/it] Training 1/1 epoch (loss 1.6044): 74%|███████▍ | 1156/1563 [30:35<10:36, 1.56s/it] Training 1/1 epoch (loss 1.6044): 74%|███████▍ | 1157/1563 [30:35<11:03, 1.63s/it] Training 1/1 epoch (loss 1.5679): 74%|███████▍ | 1157/1563 [30:37<11:03, 1.63s/it] Training 1/1 epoch (loss 1.5679): 
74%|███████▍ | 1158/1563 [30:37<11:08, 1.65s/it] Training 1/1 epoch (loss 1.5664): 74%|███████▍ | 1158/1563 [30:39<11:08, 1.65s/it] Training 1/1 epoch (loss 1.5664): 74%|███████▍ | 1159/1563 [30:39<11:47, 1.75s/it] Training 1/1 epoch (loss 1.6358): 74%|███████▍ | 1159/1563 [30:40<11:47, 1.75s/it] Training 1/1 epoch (loss 1.6358): 74%|███████▍ | 1160/1563 [30:40<11:12, 1.67s/it] Training 1/1 epoch (loss 1.5498): 74%|███████▍ | 1160/1563 [30:42<11:12, 1.67s/it] Training 1/1 epoch (loss 1.5498): 74%|███████▍ | 1161/1563 [30:42<10:33, 1.58s/it] Training 1/1 epoch (loss 1.5372): 74%|███████▍ | 1161/1563 [30:43<10:33, 1.58s/it] Training 1/1 epoch (loss 1.5372): 74%|███████▍ | 1162/1563 [30:43<10:27, 1.56s/it] Training 1/1 epoch (loss 1.5122): 74%|███████▍ | 1162/1563 [30:45<10:27, 1.56s/it] Training 1/1 epoch (loss 1.5122): 74%|███████▍ | 1163/1563 [30:45<10:47, 1.62s/it] Training 1/1 epoch (loss 1.6463): 74%|███████▍ | 1163/1563 [30:47<10:47, 1.62s/it] Training 1/1 epoch (loss 1.6463): 74%|███████▍ | 1164/1563 [30:47<11:25, 1.72s/it] Training 1/1 epoch (loss 1.6286): 74%|███████▍ | 1164/1563 [30:50<11:25, 1.72s/it] Training 1/1 epoch (loss 1.6286): 75%|███████▍ | 1165/1563 [30:50<13:01, 1.96s/it] Training 1/1 epoch (loss 1.6589): 75%|███████▍ | 1165/1563 [30:51<13:01, 1.96s/it] Training 1/1 epoch (loss 1.6589): 75%|███████▍ | 1166/1563 [30:51<12:07, 1.83s/it] Training 1/1 epoch (loss 1.5938): 75%|███████▍ | 1166/1563 [30:53<12:07, 1.83s/it] Training 1/1 epoch (loss 1.5938): 75%|███████▍ | 1167/1563 [30:53<12:22, 1.88s/it] Training 1/1 epoch (loss 1.5417): 75%|███████▍ | 1167/1563 [30:55<12:22, 1.88s/it] Training 1/1 epoch (loss 1.5417): 
75%|███████▍ | 1168/1563 [30:55<12:29, 1.90s/it] Training 1/1 epoch (loss 1.4863): 75%|███████▍ | 1168/1563 [30:57<12:29, 1.90s/it] Training 1/1 epoch (loss 1.4863): 75%|███████▍ | 1169/1563 [30:57<12:17, 1.87s/it] Training 1/1 epoch (loss 1.5754): 75%|███████▍ | 1169/1563 [30:58<12:17, 1.87s/it] Training 1/1 epoch (loss 1.5754): 75%|███████▍ | 1170/1563 [30:58<10:10, 1.55s/it] Training 1/1 epoch (loss 1.4557): 75%|███████▍ | 1170/1563 [30:59<10:10, 1.55s/it] Training 1/1 epoch (loss 1.4557): 75%|███████▍ | 1171/1563 [30:59<09:59, 1.53s/it] Training 1/1 epoch (loss 1.5589): 75%|███████▍ | 1171/1563 [31:01<09:59, 1.53s/it] Training 1/1 epoch (loss 1.5589): 75%|███████▍ | 1172/1563 [31:01<10:02, 1.54s/it] Training 1/1 epoch (loss 1.4827): 75%|███████▍ | 1172/1563 [31:02<10:02, 1.54s/it] Training 1/1 epoch (loss 1.4827): 75%|███████▌ | 1173/1563 [31:02<10:08, 1.56s/it] Training 1/1 epoch (loss 1.6016): 75%|███████▌ | 1173/1563 [31:04<10:08, 1.56s/it] Training 1/1 epoch (loss 1.6016): 75%|███████▌ | 1174/1563 [31:04<10:31, 1.62s/it] Training 1/1 epoch (loss 1.5025): 75%|███████▌ | 1174/1563 [31:06<10:31, 1.62s/it] Training 1/1 epoch (loss 1.5025): 75%|███████▌ | 1175/1563 [31:06<10:48, 1.67s/it] Training 1/1 epoch (loss 1.5546): 75%|███████▌ | 1175/1563 [31:08<10:48, 1.67s/it] Training 1/1 epoch (loss 1.5546): 75%|███████▌ | 1176/1563 [31:08<12:30, 1.94s/it] Training 1/1 epoch (loss 1.4161): 75%|███████▌ | 1176/1563 [31:10<12:30, 1.94s/it] Training 1/1 epoch (loss 1.4161): 75%|███████▌ | 1177/1563 [31:10<12:44, 1.98s/it] Training 1/1 epoch (loss 1.4871): 75%|███████▌ | 1177/1563 [31:12<12:44, 1.98s/it] Training 1/1 epoch (loss 1.4871): 
75%|███████▌ | 1178/1563 [31:12<11:39, 1.82s/it] Training 1/1 epoch (loss 1.6263): 75%|███████▌ | 1178/1563 [31:14<11:39, 1.82s/it] Training 1/1 epoch (loss 1.6263): 75%|███████▌ | 1179/1563 [31:14<13:02, 2.04s/it] Training 1/1 epoch (loss 1.5650): 75%|███████▌ | 1179/1563 [31:16<13:02, 2.04s/it] Training 1/1 epoch (loss 1.5650): 75%|███████▌ | 1180/1563 [31:16<11:39, 1.83s/it] Training 1/1 epoch (loss 1.5660): 75%|███████▌ | 1180/1563 [31:18<11:39, 1.83s/it] Training 1/1 epoch (loss 1.5660): 76%|███████▌ | 1181/1563 [31:18<12:14, 1.92s/it] Training 1/1 epoch (loss 1.5929): 76%|███████▌ | 1181/1563 [31:19<12:14, 1.92s/it] Training 1/1 epoch (loss 1.5929): 76%|███████▌ | 1182/1563 [31:19<10:57, 1.73s/it] Training 1/1 epoch (loss 1.5982): 76%|███████▌ | 1182/1563 [31:21<10:57, 1.73s/it] Training 1/1 epoch (loss 1.5982): 76%|███████▌ | 1183/1563 [31:21<11:03, 1.75s/it] Training 1/1 epoch (loss 1.4035): 76%|███████▌ | 1183/1563 [31:22<11:03, 1.75s/it] Training 1/1 epoch (loss 1.4035): 76%|███████▌ | 1184/1563 [31:22<10:07, 1.60s/it] Training 1/1 epoch (loss 1.5796): 76%|███████▌ | 1184/1563 [31:24<10:07, 1.60s/it] Training 1/1 epoch (loss 1.5796): 76%|███████▌ | 1185/1563 [31:24<10:04, 1.60s/it] Training 1/1 epoch (loss 1.5995): 76%|███████▌ | 1185/1563 [31:26<10:04, 1.60s/it] Training 1/1 epoch (loss 1.5995): 76%|███████▌ | 1186/1563 [31:26<10:46, 1.71s/it] Training 1/1 epoch (loss 1.4757): 76%|███████▌ | 1186/1563 [31:28<10:46, 1.71s/it] Training 1/1 epoch (loss 1.4757): 76%|███████▌ | 1187/1563 [31:28<12:08, 1.94s/it] Training 1/1 epoch (loss 1.5466): 76%|███████▌ | 1187/1563 [31:30<12:08, 1.94s/it] Training 1/1 epoch (loss 1.5466): 
76%|███████▌ | 1188/1563 [31:30<11:29, 1.84s/it]
Training 1/1 epoch (loss 1.6965): 76%|███████▌ | 1189/1563 [31:31<10:25, 1.67s/it]
Training 1/1 epoch (loss 1.4955): 76%|███████▌ | 1190/1563 [31:34<11:52, 1.91s/it]
Training 1/1 epoch (loss 1.5742): 76%|███████▌ | 1191/1563 [31:35<11:09, 1.80s/it]
Training 1/1 epoch (loss 1.6796): 76%|███████▋ | 1192/1563 [31:38<12:21, 2.00s/it]
Training 1/1 epoch (loss 1.6765): 76%|███████▋ | 1193/1563 [31:40<12:35, 2.04s/it]
Training 1/1 epoch (loss 1.5696): 76%|███████▋ | 1194/1563 [31:41<10:36, 1.72s/it]
Training 1/1 epoch (loss 1.6211): 76%|███████▋ | 1195/1563 [31:43<10:46, 1.76s/it]
Training 1/1 epoch (loss 1.4935): 77%|███████▋ | 1196/1563 [31:45<11:59, 1.96s/it]
Training 1/1 epoch (loss 1.5533): 77%|███████▋ | 1197/1563 [31:47<11:36, 1.90s/it]
Training 1/1 epoch (loss 1.4815): 
77%|███████▋ | 1198/1563 [31:48<10:59, 1.81s/it]
Training 1/1 epoch (loss 1.5885): 77%|███████▋ | 1199/1563 [31:49<09:39, 1.59s/it]
Training 1/1 epoch (loss 1.5963): 77%|███████▋ | 1200/1563 [31:51<09:54, 1.64s/it]
Training 1/1 epoch (loss 1.5755): 77%|███████▋ | 1201/1563 [31:53<09:20, 1.55s/it]
Training 1/1 epoch (loss 1.6318): 77%|███████▋ | 1202/1563 [31:54<09:22, 1.56s/it]
Training 1/1 epoch (loss 1.6167): 77%|███████▋ | 1203/1563 [31:55<08:15, 1.38s/it]
Training 1/1 epoch (loss 1.5391): 77%|███████▋ | 1204/1563 [31:57<08:42, 1.46s/it]
Training 1/1 epoch (loss 1.6076): 77%|███████▋ | 1205/1563 [31:59<09:21, 1.57s/it]
Training 1/1 epoch (loss 1.5311): 77%|███████▋ | 1206/1563 [32:01<10:30, 1.77s/it]
Training 1/1 epoch (loss 1.5742): 77%|███████▋ | 1207/1563 [32:02<09:56, 1.68s/it]
Training 1/1 epoch (loss 1.5631): 
77%|███████▋ | 1208/1563 [32:04<10:41, 1.81s/it]
Training 1/1 epoch (loss 1.5459): 77%|███████▋ | 1209/1563 [32:06<10:51, 1.84s/it]
Training 1/1 epoch (loss 1.6449): 77%|███████▋ | 1210/1563 [32:08<10:06, 1.72s/it]
Training 1/1 epoch (loss 1.6599): 77%|███████▋ | 1211/1563 [32:10<11:15, 1.92s/it]
Training 1/1 epoch (loss 1.5450): 78%|███████▊ | 1212/1563 [32:12<10:41, 1.83s/it]
Training 1/1 epoch (loss 1.5070): 78%|███████▊ | 1213/1563 [32:13<09:55, 1.70s/it]
Training 1/1 epoch (loss 1.4974): 78%|███████▊ | 1214/1563 [32:15<09:45, 1.68s/it]
Training 1/1 epoch (loss 1.5184): 78%|███████▊ | 1215/1563 [32:16<09:05, 1.57s/it]
Training 1/1 epoch (loss 1.5431): 78%|███████▊ | 1216/1563 [32:18<08:59, 1.56s/it]
Training 1/1 epoch (loss 1.5488): 78%|███████▊ | 1217/1563 [32:19<08:31, 1.48s/it]
Training 1/1 epoch (loss 1.5425): 
78%|███████▊ | 1218/1563 [32:21<08:52, 1.54s/it]
Training 1/1 epoch (loss 1.6208): 78%|███████▊ | 1219/1563 [32:23<10:22, 1.81s/it]
Training 1/1 epoch (loss 1.6187): 78%|███████▊ | 1220/1563 [32:24<09:36, 1.68s/it]
Training 1/1 epoch (loss 1.6427): 78%|███████▊ | 1221/1563 [32:26<09:07, 1.60s/it]
Training 1/1 epoch (loss 1.3449): 78%|███████▊ | 1222/1563 [32:28<09:43, 1.71s/it]
Training 1/1 epoch (loss 1.6389): 78%|███████▊ | 1223/1563 [32:29<08:33, 1.51s/it]
Training 1/1 epoch (loss 1.7048): 78%|███████▊ | 1224/1563 [32:31<08:54, 1.58s/it]
Training 1/1 epoch (loss 1.6315): 78%|███████▊ | 1225/1563 [32:32<09:18, 1.65s/it]
Training 1/1 epoch (loss 1.5597): 78%|███████▊ | 1226/1563 [32:34<09:22, 1.67s/it]
Training 1/1 epoch (loss 1.5436): 79%|███████▊ | 1227/1563 [32:36<09:07, 1.63s/it]
Training 1/1 epoch (loss 1.5001): 
79%|███████▊ | 1228/1563 [32:37<08:35, 1.54s/it]
Training 1/1 epoch (loss 1.5536): 79%|███████▊ | 1229/1563 [32:39<09:01, 1.62s/it]
Training 1/1 epoch (loss 1.5312): 79%|███████▊ | 1230/1563 [32:40<08:57, 1.61s/it]
Training 1/1 epoch (loss 1.5808): 79%|███████▉ | 1231/1563 [32:43<10:25, 1.89s/it]
Training 1/1 epoch (loss 1.4356): 79%|███████▉ | 1232/1563 [32:45<10:12, 1.85s/it]
Training 1/1 epoch (loss 1.5260): 79%|███████▉ | 1233/1563 [32:46<10:13, 1.86s/it]
Training 1/1 epoch (loss 1.5742): 79%|███████▉ | 1234/1563 [32:48<09:02, 1.65s/it]
Training 1/1 epoch (loss 1.5834): 79%|███████▉ | 1235/1563 [32:49<09:10, 1.68s/it]
Training 1/1 epoch (loss 1.5604): 79%|███████▉ | 1236/1563 [32:51<09:45, 1.79s/it]
Training 1/1 epoch (loss 1.5269): 79%|███████▉ | 1237/1563 [32:53<09:03, 1.67s/it]
Training 1/1 epoch (loss 1.4902): 
79%|███████▉ | 1238/1563 [32:54<08:25, 1.56s/it]
Training 1/1 epoch (loss 1.5374): 79%|███████▉ | 1239/1563 [32:56<08:48, 1.63s/it]
Training 1/1 epoch (loss 1.5974): 79%|███████▉ | 1240/1563 [32:58<08:51, 1.65s/it]
Training 1/1 epoch (loss 1.4305): 79%|███████▉ | 1241/1563 [33:00<09:25, 1.76s/it]
Training 1/1 epoch (loss 1.5775): 79%|███████▉ | 1242/1563 [33:01<09:05, 1.70s/it]
Training 1/1 epoch (loss 1.4242): 80%|███████▉ | 1243/1563 [33:03<08:48, 1.65s/it]
Training 1/1 epoch (loss 1.5610): 80%|███████▉ | 1244/1563 [33:05<09:52, 1.86s/it]
Training 1/1 epoch (loss 1.5503): 80%|███████▉ | 1245/1563 [33:07<09:25, 1.78s/it]
Training 1/1 epoch (loss 1.5672): 80%|███████▉ | 1246/1563 [33:08<09:24, 1.78s/it]
Training 1/1 epoch (loss 1.6122): 80%|███████▉ | 1247/1563 [33:10<09:09, 1.74s/it]
Training 1/1 epoch (loss 1.4761): 
80%|███████▉ | 1248/1563 [33:12<08:59, 1.71s/it]
Training 1/1 epoch (loss 1.5586): 80%|███████▉ | 1249/1563 [33:13<08:52, 1.69s/it]
Training 1/1 epoch (loss 1.4811): 80%|███████▉ | 1250/1563 [33:15<08:37, 1.65s/it]
Training 1/1 epoch (loss 1.5119): 80%|████████ | 1251/1563 [33:16<07:51, 1.51s/it]
Training 1/1 epoch (loss 1.5354): 80%|████████ | 1252/1563 [33:18<08:14, 1.59s/it]
Training 1/1 epoch (loss 1.5670): 80%|████████ | 1253/1563 [33:20<08:56, 1.73s/it]
Training 1/1 epoch (loss 1.4722): 80%|████████ | 1254/1563 [33:22<09:17, 1.81s/it]
Training 1/1 epoch (loss 1.5545): 80%|████████ | 1255/1563 [33:24<10:12, 1.99s/it]
Training 1/1 epoch (loss 1.4668): 80%|████████ | 1256/1563 [33:27<10:30, 2.06s/it]
Training 1/1 epoch (loss 1.6266): 80%|████████ | 1257/1563 [33:29<10:54, 2.14s/it]
Training 1/1 epoch (loss 1.4788): 
80%|████████ | 1258/1563 [33:30<09:01, 1.78s/it]
Training 1/1 epoch (loss 1.7039): 81%|████████ | 1259/1563 [33:32<09:14, 1.82s/it]
Training 1/1 epoch (loss 1.6287): 81%|████████ | 1260/1563 [33:34<09:20, 1.85s/it]
Training 1/1 epoch (loss 1.5416): 81%|████████ | 1261/1563 [33:35<08:35, 1.71s/it]
Training 1/1 epoch (loss 1.5669): 81%|████████ | 1262/1563 [33:37<08:55, 1.78s/it]
Training 1/1 epoch (loss 1.5650): 81%|████████ | 1263/1563 [33:39<09:26, 1.89s/it]
Training 1/1 epoch (loss 1.6376): 81%|████████ | 1264/1563 [33:41<09:59, 2.01s/it]
Training 1/1 epoch (loss 1.5949): 81%|████████ | 1265/1563 [33:43<09:46, 1.97s/it]
Training 1/1 epoch (loss 1.4761): 81%|████████ | 1266/1563 [33:45<08:42, 1.76s/it]
Training 1/1 epoch (loss 1.5266): 81%|████████ | 1267/1563 [33:46<08:42, 1.77s/it]
Training 1/1 epoch (loss 1.4920): 
81%|████████ | 1268/1563 [33:48<08:09, 1.66s/it]
Training 1/1 epoch (loss 1.6193): 81%|████████ | 1269/1563 [33:50<08:17, 1.69s/it]
Training 1/1 epoch (loss 1.5463): 81%|████████▏ | 1270/1563 [33:52<09:20, 1.91s/it]
Training 1/1 epoch (loss 1.6419): 81%|████████▏ | 1271/1563 [33:53<08:34, 1.76s/it]
Training 1/1 epoch (loss 1.5421): 81%|████████▏ | 1272/1563 [33:55<07:57, 1.64s/it]
Training 1/1 epoch (loss 1.6592): 81%|████████▏ | 1273/1563 [33:57<08:25, 1.74s/it]
Training 1/1 epoch (loss 1.6392): 82%|████████▏ | 1274/1563 [33:58<08:07, 1.69s/it]
Training 1/1 epoch (loss 1.6064): 82%|████████▏ | 1275/1563 [34:00<07:53, 1.64s/it]
Training 1/1 epoch (loss 1.6205): 82%|████████▏ | 1276/1563 [34:02<08:44, 1.83s/it]
Training 1/1 epoch (loss 1.4298): 82%|████████▏ | 1277/1563 [34:04<09:08, 1.92s/it]
Training 1/1 epoch (loss 1.5807): 82%|████████▏ | 1278/1563 [34:06<09:34, 2.02s/it]
Training 1/1 epoch (loss 1.4942): 82%|████████▏ | 1279/1563 [34:09<09:38, 2.04s/it]
Training 1/1 epoch (loss 1.4067): 82%|████████▏ | 1280/1563 [34:11<10:32, 2.24s/it]
Training 1/1 epoch (loss 1.5387): 82%|████████▏ | 1281/1563 [34:13<09:18, 1.98s/it]
Training 1/1 epoch (loss 1.6451): 82%|████████▏ | 1282/1563 [34:14<08:48, 1.88s/it]
Training 1/1 epoch (loss 1.5181): 82%|████████▏ | 1283/1563 [34:16<08:35, 1.84s/it]
Training 1/1 epoch (loss 1.6480): 82%|████████▏ | 1284/1563 [34:18<09:21, 2.01s/it]
Training 1/1 epoch (loss 1.4852): 82%|████████▏ | 1285/1563 [34:20<09:19, 2.01s/it]
Training 1/1 epoch (loss 1.5208): 82%|████████▏ | 1286/1563 [34:23<09:45, 2.11s/it]
Training 1/1 epoch (loss 1.5891): 82%|████████▏ | 1287/1563 [34:25<09:18, 2.02s/it]
Training 1/1 epoch (loss 1.4608): 
82%|████████▏ | 1287/1563 [34:27<09:18, 2.02s/it]
Training 1/1 epoch (loss 1.4608): 82%|████████▏ | 1288/1563 [34:27<09:39, 2.11s/it]
Training 1/1 epoch (loss 1.5932): 82%|████████▏ | 1289/1563 [34:29<09:22, 2.05s/it]
Training 1/1 epoch (loss 1.5642): 83%|████████▎ | 1290/1563 [34:31<08:58, 1.97s/it]
Training 1/1 epoch (loss 1.6445): 83%|████████▎ | 1291/1563 [34:33<09:21, 2.07s/it]
Training 1/1 epoch (loss 1.5393): 83%|████████▎ | 1292/1563 [34:34<08:18, 1.84s/it]
Training 1/1 epoch (loss 1.5789): 83%|████████▎ | 1293/1563 [34:36<08:10, 1.82s/it]
Training 1/1 epoch (loss 1.6085): 83%|████████▎ | 1294/1563 [34:38<08:45, 1.95s/it]
Training 1/1 epoch (loss 1.6357): 83%|████████▎ | 1295/1563 [34:40<07:56, 1.78s/it]
Training 1/1 epoch (loss 1.5933): 83%|████████▎ | 1296/1563 [34:41<07:51, 1.77s/it]
Training 1/1 epoch (loss 1.6326): 83%|████████▎ | 1297/1563 [34:43<07:30, 
1.69s/it]
Training 1/1 epoch (loss 1.5077): 83%|████████▎ | 1298/1563 [34:44<07:18, 1.65s/it]
Training 1/1 epoch (loss 1.6522): 83%|████████▎ | 1299/1563 [34:46<07:40, 1.74s/it]
Training 1/1 epoch (loss 1.5859): 83%|████████▎ | 1300/1563 [34:48<07:52, 1.80s/it]
Training 1/1 epoch (loss 1.5587): 83%|████████▎ | 1301/1563 [34:50<07:45, 1.78s/it]
Training 1/1 epoch (loss 1.5381): 83%|████████▎ | 1302/1563 [34:52<07:32, 1.73s/it]
Training 1/1 epoch (loss 1.5564): 83%|████████▎ | 1303/1563 [34:53<07:09, 1.65s/it]
Training 1/1 epoch (loss 1.5723): 83%|████████▎ | 1304/1563 [34:55<07:54, 1.83s/it]
Training 1/1 epoch (loss 1.4853): 83%|████████▎ | 1305/1563 [34:57<07:08, 1.66s/it]
Training 1/1 epoch (loss 1.4728): 84%|████████▎ | 1306/1563 [34:59<07:53, 1.84s/it]
Training 1/1 epoch (loss 1.6140): 
84%|████████▎ | 1307/1563 [35:00<07:00, 1.64s/it]
Training 1/1 epoch (loss 1.4878): 84%|████████▎ | 1308/1563 [35:02<06:59, 1.64s/it]
Training 1/1 epoch (loss 1.5799): 84%|████████▎ | 1309/1563 [35:04<07:34, 1.79s/it]
Training 1/1 epoch (loss 1.5702): 84%|████████▍ | 1310/1563 [35:06<07:59, 1.90s/it]
Training 1/1 epoch (loss 1.6775): 84%|████████▍ | 1311/1563 [35:08<07:56, 1.89s/it]
Training 1/1 epoch (loss 1.5850): 84%|████████▍ | 1312/1563 [35:10<07:33, 1.81s/it]
Training 1/1 epoch (loss 1.5066): 84%|████████▍ | 1313/1563 [35:12<07:52, 1.89s/it]
Training 1/1 epoch (loss 1.6267): 84%|████████▍ | 1314/1563 [35:14<08:33, 2.06s/it]
Training 1/1 epoch (loss 1.4755): 84%|████████▍ | 1315/1563 [35:16<07:46, 1.88s/it]
Training 1/1 epoch (loss 1.5492): 84%|████████▍ | 1316/1563 [35:17<07:23, 1.80s/it]
Training 1/1 epoch (loss 1.6022): 84%|████████▍ | 1316/1563 [35:19<07:23, 
1.80s/it]
Training 1/1 epoch (loss 1.6022): 84%|████████▍ | 1317/1563 [35:19<07:10, 1.75s/it]
Training 1/1 epoch (loss 1.6135): 84%|████████▍ | 1318/1563 [35:21<07:10, 1.76s/it]
Training 1/1 epoch (loss 1.5304): 84%|████████▍ | 1319/1563 [35:23<07:28, 1.84s/it]
Training 1/1 epoch (loss 1.5886): 84%|████████▍ | 1320/1563 [35:24<07:17, 1.80s/it]
Training 1/1 epoch (loss 1.5341): 85%|████████▍ | 1321/1563 [35:26<07:00, 1.74s/it]
Training 1/1 epoch (loss 1.5865): 85%|████████▍ | 1322/1563 [35:28<07:43, 1.92s/it]
Training 1/1 epoch (loss 1.6343): 85%|████████▍ | 1323/1563 [35:31<08:22, 2.09s/it]
Training 1/1 epoch (loss 1.5661): 85%|████████▍ | 1324/1563 [35:33<08:11, 2.06s/it]
Training 1/1 epoch (loss 1.5623): 85%|████████▍ | 1325/1563 [35:34<07:21, 1.85s/it]
Training 1/1 epoch (loss 1.6150): 85%|████████▍ | 1326/1563 [35:36<07:28, 1.89s/it]
Training 1/1 epoch (loss 1.4608): 
85%|████████▍ | 1326/1563 [35:38<07:28, 1.89s/it]
Training 1/1 epoch (loss 1.4608): 85%|████████▍ | 1327/1563 [35:38<07:36, 1.93s/it]
Training 1/1 epoch (loss 1.6166): 85%|████████▍ | 1328/1563 [35:40<07:06, 1.82s/it]
Training 1/1 epoch (loss 1.6093): 85%|████████▌ | 1329/1563 [35:42<07:41, 1.97s/it]
Training 1/1 epoch (loss 1.6507): 85%|████████▌ | 1330/1563 [35:44<08:02, 2.07s/it]
Training 1/1 epoch (loss 1.5204): 85%|████████▌ | 1331/1563 [35:46<08:06, 2.10s/it]
Training 1/1 epoch (loss 1.5530): 85%|████████▌ | 1332/1563 [35:48<07:10, 1.86s/it]
Training 1/1 epoch (loss 1.5358): 85%|████████▌ | 1333/1563 [35:50<07:51, 2.05s/it]
Training 1/1 epoch (loss 1.5461): 85%|████████▌ | 1334/1563 [35:52<07:30, 1.97s/it]
Training 1/1 epoch (loss 1.5437): 85%|████████▌ | 1335/1563 [35:54<07:27, 1.96s/it]
Training 1/1 epoch (loss 1.5062): 85%|████████▌ | 1336/1563 [35:56<07:05, 
1.87s/it]
Training 1/1 epoch (loss 1.5716): 86%|████████▌ | 1337/1563 [35:57<06:56, 1.84s/it]
Training 1/1 epoch (loss 1.5446): 86%|████████▌ | 1338/1563 [36:00<07:28, 1.99s/it]
Training 1/1 epoch (loss 1.4955): 86%|████████▌ | 1339/1563 [36:02<08:00, 2.14s/it]
Training 1/1 epoch (loss 1.6611): 86%|████████▌ | 1340/1563 [36:04<07:15, 1.95s/it]
Training 1/1 epoch (loss 1.4475): 86%|████████▌ | 1341/1563 [36:06<07:41, 2.08s/it]
Training 1/1 epoch (loss 1.4748): 86%|████████▌ | 1342/1563 [36:08<07:39, 2.08s/it]
Training 1/1 epoch (loss 1.5766): 86%|████████▌ | 1343/1563 [36:10<07:45, 2.12s/it]
Training 1/1 epoch (loss 1.5743): 86%|████████▌ | 1344/1563 [36:12<07:40, 2.10s/it]
Training 1/1 epoch (loss 1.5198): 86%|████████▌ | 1345/1563 [36:14<07:19, 2.01s/it]
Training 1/1 epoch (loss 1.5403): 
86%|████████▌ | 1346/1563 [36:16<07:02, 1.95s/it]
Training 1/1 epoch (loss 1.6333): 86%|████████▌ | 1347/1563 [36:18<07:32, 2.10s/it]
Training 1/1 epoch (loss 1.5476): 86%|████████▌ | 1348/1563 [36:20<06:45, 1.88s/it]
Training 1/1 epoch (loss 1.5611): 86%|████████▋ | 1349/1563 [36:22<07:19, 2.05s/it]
Training 1/1 epoch (loss 1.6265): 86%|████████▋ | 1350/1563 [36:24<06:21, 1.79s/it]
Training 1/1 epoch (loss 1.5605): 86%|████████▋ | 1351/1563 [36:25<06:26, 1.82s/it]
Training 1/1 epoch (loss 1.5793): 87%|████████▋ | 1352/1563 [36:27<06:29, 1.84s/it]
Training 1/1 epoch (loss 1.4544): 87%|████████▋ | 1353/1563 [36:29<06:10, 1.76s/it]
Training 1/1 epoch (loss 1.5903): 87%|████████▋ | 1354/1563 [36:31<06:01, 1.73s/it]
Training 1/1 epoch (loss 1.5661): 87%|████████▋ | 1355/1563 [36:32<06:01, 1.74s/it]
Training 1/1 epoch (loss 1.5867): 87%|████████▋ | 1355/1563 [36:34<06:01, 
1.74s/it] Training 1/1 epoch (loss 1.5867): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1356/1563 [36:34<06:24, 1.86s/it] Training 1/1 epoch (loss 1.4293): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1356/1563 [36:36<06:24, 1.86s/it] Training 1/1 epoch (loss 1.4293): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1357/1563 [36:36<05:47, 1.69s/it] Training 1/1 epoch (loss 1.5101): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1357/1563 [36:38<05:47, 1.69s/it] Training 1/1 epoch (loss 1.5101): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1358/1563 [36:38<06:08, 1.80s/it] Training 1/1 epoch (loss 1.5606): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1358/1563 [36:39<06:08, 1.80s/it] Training 1/1 epoch (loss 1.5606): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1359/1563 [36:39<05:55, 1.74s/it] Training 1/1 epoch (loss 1.6378): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1359/1563 [36:41<05:55, 1.74s/it] Training 1/1 epoch (loss 1.6378): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1360/1563 [36:41<06:11, 1.83s/it] Training 1/1 epoch (loss 1.5603): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1360/1563 [36:43<06:11, 1.83s/it] Training 1/1 epoch (loss 1.5603): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1361/1563 [36:43<05:27, 1.62s/it] Training 1/1 epoch (loss 1.4943): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1361/1563 [36:44<05:27, 1.62s/it] Training 1/1 epoch (loss 1.4943): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1362/1563 [36:44<05:02, 1.51s/it] Training 1/1 epoch (loss 1.5891): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1362/1563 [36:46<05:02, 1.51s/it] Training 1/1 epoch (loss 1.5891): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1363/1563 [36:46<05:17, 1.59s/it] Training 1/1 epoch (loss 1.4878): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1363/1563 [36:47<05:17, 1.59s/it] Training 1/1 epoch (loss 1.4878): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1364/1563 [36:47<05:21, 1.61s/it] Training 1/1 epoch (loss 1.5786): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1364/1563 [36:50<05:21, 1.61s/it] Training 1/1 epoch (loss 1.5786): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1365/1563 [36:50<06:06, 1.85s/it] Training 1/1 epoch (loss 1.5553): 
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1365/1563 [36:52<06:06, 1.85s/it] Training 1/1 epoch (loss 1.5553): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1366/1563 [36:52<06:15, 1.91s/it] Training 1/1 epoch (loss 1.5829): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1366/1563 [36:53<06:15, 1.91s/it] Training 1/1 epoch (loss 1.5829): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1367/1563 [36:53<06:01, 1.84s/it] Training 1/1 epoch (loss 1.7079): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1367/1563 [36:55<06:01, 1.84s/it] Training 1/1 epoch (loss 1.7079): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1368/1563 [36:55<06:09, 1.90s/it] Training 1/1 epoch (loss 1.5586): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1368/1563 [36:57<06:09, 1.90s/it] Training 1/1 epoch (loss 1.5586): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1369/1563 [36:57<06:17, 1.95s/it] Training 1/1 epoch (loss 1.5831): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1369/1563 [36:59<06:17, 1.95s/it] Training 1/1 epoch (loss 1.5831): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1370/1563 [36:59<06:07, 1.90s/it] Training 1/1 epoch (loss 1.5383): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1370/1563 [37:01<06:07, 1.90s/it] Training 1/1 epoch (loss 1.5383): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1371/1563 [37:01<05:41, 1.78s/it] Training 1/1 epoch (loss 1.7248): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1371/1563 [37:02<05:41, 1.78s/it] Training 1/1 epoch (loss 1.7248): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1372/1563 [37:02<05:03, 1.59s/it] Training 1/1 epoch (loss 1.4351): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1372/1563 [37:04<05:03, 1.59s/it] Training 1/1 epoch (loss 1.4351): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1373/1563 [37:04<05:25, 1.72s/it] Training 1/1 epoch (loss 1.5970): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1373/1563 [37:06<05:25, 1.72s/it] Training 1/1 epoch (loss 1.5970): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1374/1563 [37:06<05:30, 1.75s/it] Training 1/1 epoch (loss 1.6188): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1374/1563 [37:08<05:30, 1.75s/it] Training 1/1 epoch (loss 1.6188): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1375/1563 [37:08<06:08, 
1.96s/it] Training 1/1 epoch (loss 1.6196): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1375/1563 [37:10<06:08, 1.96s/it] Training 1/1 epoch (loss 1.6196): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1376/1563 [37:10<05:53, 1.89s/it] Training 1/1 epoch (loss 1.5985): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1376/1563 [37:12<05:53, 1.89s/it] Training 1/1 epoch (loss 1.5985): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1377/1563 [37:12<05:43, 1.85s/it] Training 1/1 epoch (loss 1.4026): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1377/1563 [37:13<05:43, 1.85s/it] Training 1/1 epoch (loss 1.4026): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1378/1563 [37:13<05:21, 1.74s/it] Training 1/1 epoch (loss 1.5130): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1378/1563 [37:15<05:21, 1.74s/it] Training 1/1 epoch (loss 1.5130): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1379/1563 [37:15<05:34, 1.82s/it] Training 1/1 epoch (loss 1.5330): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1379/1563 [37:17<05:34, 1.82s/it] Training 1/1 epoch (loss 1.5330): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1380/1563 [37:17<05:55, 1.94s/it] Training 1/1 epoch (loss 1.5487): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1380/1563 [37:19<05:55, 1.94s/it] Training 1/1 epoch (loss 1.5487): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1381/1563 [37:19<05:29, 1.81s/it] Training 1/1 epoch (loss 1.6119): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1381/1563 [37:20<05:29, 1.81s/it] Training 1/1 epoch (loss 1.6119): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1382/1563 [37:20<05:08, 1.70s/it] Training 1/1 epoch (loss 1.5166): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1382/1563 [37:21<05:08, 1.70s/it] Training 1/1 epoch (loss 1.5166): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1383/1563 [37:21<04:30, 1.50s/it] Training 1/1 epoch (loss 1.5832): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1383/1563 [37:23<04:30, 1.50s/it] Training 1/1 epoch (loss 1.5832): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1384/1563 [37:23<04:48, 1.61s/it] Training 1/1 epoch (loss 1.5055): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1384/1563 [37:25<04:48, 1.61s/it] Training 1/1 epoch (loss 1.5055): 
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1385/1563 [37:25<04:52, 1.64s/it] Training 1/1 epoch (loss 1.5345): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1385/1563 [37:26<04:52, 1.64s/it] Training 1/1 epoch (loss 1.5345): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1386/1563 [37:26<04:31, 1.53s/it] Training 1/1 epoch (loss 1.5492): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1386/1563 [37:28<04:31, 1.53s/it] Training 1/1 epoch (loss 1.5492): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1387/1563 [37:28<04:37, 1.57s/it] Training 1/1 epoch (loss 1.6548): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1387/1563 [37:30<04:37, 1.57s/it] Training 1/1 epoch (loss 1.6548): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1388/1563 [37:30<04:54, 1.68s/it] Training 1/1 epoch (loss 1.5080): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1388/1563 [37:32<04:54, 1.68s/it] Training 1/1 epoch (loss 1.5080): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1389/1563 [37:32<04:56, 1.70s/it] Training 1/1 epoch (loss 1.5842): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1389/1563 [37:33<04:56, 1.70s/it] Training 1/1 epoch (loss 1.5842): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1390/1563 [37:33<04:56, 1.71s/it] Training 1/1 epoch (loss 1.4638): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1390/1563 [37:35<04:56, 1.71s/it] Training 1/1 epoch (loss 1.4638): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1391/1563 [37:35<04:49, 1.68s/it] Training 1/1 epoch (loss 1.5242): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1391/1563 [37:37<04:49, 1.68s/it] Training 1/1 epoch (loss 1.5242): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1392/1563 [37:37<04:56, 1.74s/it] Training 1/1 epoch (loss 1.5911): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1392/1563 [37:39<04:56, 1.74s/it] Training 1/1 epoch (loss 1.5911): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1393/1563 [37:39<05:04, 1.79s/it] Training 1/1 epoch (loss 1.5003): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1393/1563 [37:40<05:04, 1.79s/it] Training 1/1 epoch (loss 1.5003): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1394/1563 [37:40<04:52, 1.73s/it] Training 1/1 epoch (loss 1.5186): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1394/1563 [37:43<04:52, 
1.73s/it] Training 1/1 epoch (loss 1.5186): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1395/1563 [37:43<05:25, 1.94s/it] Training 1/1 epoch (loss 1.6131): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1395/1563 [37:44<05:25, 1.94s/it] Training 1/1 epoch (loss 1.6131): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1396/1563 [37:44<04:53, 1.76s/it] Training 1/1 epoch (loss 1.5399): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1396/1563 [37:46<04:53, 1.76s/it] Training 1/1 epoch (loss 1.5399): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1397/1563 [37:46<05:15, 1.90s/it] Training 1/1 epoch (loss 1.6210): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1397/1563 [37:48<05:15, 1.90s/it] Training 1/1 epoch (loss 1.6210): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1398/1563 [37:48<04:41, 1.71s/it] Training 1/1 epoch (loss 1.5895): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1398/1563 [37:49<04:41, 1.71s/it] Training 1/1 epoch (loss 1.5895): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1399/1563 [37:49<04:35, 1.68s/it] Training 1/1 epoch (loss 1.4685): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1399/1563 [37:51<04:35, 1.68s/it] Training 1/1 epoch (loss 1.4685): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1400/1563 [37:51<04:47, 1.76s/it] Training 1/1 epoch (loss 1.5532): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1400/1563 [37:53<04:47, 1.76s/it] Training 1/1 epoch (loss 1.5532): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1401/1563 [37:53<04:40, 1.73s/it] Training 1/1 epoch (loss 1.4875): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1401/1563 [37:54<04:40, 1.73s/it] Training 1/1 epoch (loss 1.4875): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1402/1563 [37:54<04:11, 1.56s/it] Training 1/1 epoch (loss 1.4051): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1402/1563 [37:56<04:11, 1.56s/it] Training 1/1 epoch (loss 1.4051): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1403/1563 [37:56<04:49, 1.81s/it] Training 1/1 epoch (loss 1.5159): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1403/1563 [37:58<04:49, 1.81s/it] Training 1/1 epoch (loss 1.5159): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1404/1563 [37:58<04:27, 1.68s/it] Training 1/1 epoch (loss 1.5057): 
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1404/1563 [37:59<04:27, 1.68s/it] Training 1/1 epoch (loss 1.5057): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1405/1563 [37:59<04:27, 1.70s/it] Training 1/1 epoch (loss 1.5420): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1405/1563 [38:01<04:27, 1.70s/it] Training 1/1 epoch (loss 1.5420): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1406/1563 [38:01<03:58, 1.52s/it] Training 1/1 epoch (loss 1.5001): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 1406/1563 [38:02<03:58, 1.52s/it] Training 1/1 epoch (loss 1.5001): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1407/1563 [38:02<03:45, 1.45s/it] Training 1/1 epoch (loss 1.6015): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1407/1563 [38:04<03:45, 1.45s/it] Training 1/1 epoch (loss 1.6015): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1408/1563 [38:04<04:12, 1.63s/it] Training 1/1 epoch (loss 1.4591): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1408/1563 [38:05<04:12, 1.63s/it] Training 1/1 epoch (loss 1.4591): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1409/1563 [38:05<04:01, 1.57s/it] Training 1/1 epoch (loss 1.4729): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1409/1563 [38:07<04:01, 1.57s/it] Training 1/1 epoch (loss 1.4729): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1410/1563 [38:07<04:10, 1.64s/it] Training 1/1 epoch (loss 1.5387): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1410/1563 [38:09<04:10, 1.64s/it] Training 1/1 epoch (loss 1.5387): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1411/1563 [38:09<04:10, 1.65s/it] Training 1/1 epoch (loss 1.5492): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1411/1563 [38:11<04:10, 1.65s/it] Training 1/1 epoch (loss 1.5492): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1412/1563 [38:11<04:17, 1.70s/it] Training 1/1 epoch (loss 1.4750): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1412/1563 [38:12<04:17, 1.70s/it] Training 1/1 epoch (loss 1.4750): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1413/1563 [38:12<04:02, 1.62s/it] Training 1/1 epoch (loss 1.4714): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1413/1563 [38:13<04:02, 1.62s/it] Training 1/1 epoch (loss 1.4714): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1414/1563 [38:13<03:51, 
1.55s/it] Training 1/1 epoch (loss 1.6558): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1414/1563 [38:16<03:51, 1.55s/it] Training 1/1 epoch (loss 1.6558): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1415/1563 [38:16<04:30, 1.83s/it] Training 1/1 epoch (loss 1.5085): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1415/1563 [38:18<04:30, 1.83s/it] Training 1/1 epoch (loss 1.5085): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1416/1563 [38:18<04:40, 1.91s/it] Training 1/1 epoch (loss 1.4846): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1416/1563 [38:20<04:40, 1.91s/it] Training 1/1 epoch (loss 1.4846): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1417/1563 [38:20<04:50, 1.99s/it] Training 1/1 epoch (loss 1.5502): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1417/1563 [38:22<04:50, 1.99s/it] Training 1/1 epoch (loss 1.5502): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1418/1563 [38:22<04:53, 2.02s/it] Training 1/1 epoch (loss 1.5164): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1418/1563 [38:23<04:53, 2.02s/it] Training 1/1 epoch (loss 1.5164): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1419/1563 [38:23<04:10, 1.74s/it] Training 1/1 epoch (loss 1.5228): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1419/1563 [38:24<04:10, 1.74s/it] Training 1/1 epoch (loss 1.5228): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1420/1563 [38:24<03:42, 1.55s/it] Training 1/1 epoch (loss 1.7003): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1420/1563 [38:26<03:42, 1.55s/it] Training 1/1 epoch (loss 1.7003): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1421/1563 [38:26<03:56, 1.66s/it] Training 1/1 epoch (loss 1.5131): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1421/1563 [38:28<03:56, 1.66s/it] Training 1/1 epoch (loss 1.5131): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1422/1563 [38:28<04:01, 1.71s/it] Training 1/1 epoch (loss 1.5120): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1422/1563 [38:30<04:01, 1.71s/it] Training 1/1 epoch (loss 1.5120): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1423/1563 [38:30<04:17, 1.84s/it] Training 1/1 epoch (loss 1.6164): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1423/1563 [38:32<04:17, 1.84s/it] Training 1/1 epoch (loss 1.6164): 
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1424/1563 [38:32<04:10, 1.81s/it] Training 1/1 epoch (loss 1.5131): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1424/1563 [38:34<04:10, 1.81s/it] Training 1/1 epoch (loss 1.5131): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1425/1563 [38:34<04:32, 1.98s/it] Training 1/1 epoch (loss 1.5690): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1425/1563 [38:36<04:32, 1.98s/it] Training 1/1 epoch (loss 1.5690): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1426/1563 [38:36<04:16, 1.87s/it] Training 1/1 epoch (loss 1.5242): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1426/1563 [38:38<04:16, 1.87s/it] Training 1/1 epoch (loss 1.5242): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1427/1563 [38:38<04:09, 1.84s/it] Training 1/1 epoch (loss 1.6244): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1427/1563 [38:40<04:09, 1.84s/it] Training 1/1 epoch (loss 1.6244): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1428/1563 [38:40<04:07, 1.83s/it] Training 1/1 epoch (loss 1.5098): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1428/1563 [38:41<04:07, 1.83s/it] Training 1/1 epoch (loss 1.5098): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1429/1563 [38:41<03:48, 1.70s/it] Training 1/1 epoch (loss 1.4637): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1429/1563 [38:43<03:48, 1.70s/it] Training 1/1 epoch (loss 1.4637): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1430/1563 [38:43<03:42, 1.67s/it] Training 1/1 epoch (loss 1.6522): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1430/1563 [38:44<03:42, 1.67s/it] Training 1/1 epoch (loss 1.6522): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1431/1563 [38:44<03:26, 1.56s/it] Training 1/1 epoch (loss 1.6032): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1431/1563 [38:46<03:26, 1.56s/it] Training 1/1 epoch (loss 1.6032): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1432/1563 [38:46<03:37, 1.66s/it] Training 1/1 epoch (loss 1.6055): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1432/1563 [38:47<03:37, 1.66s/it] Training 1/1 epoch (loss 1.6055): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1433/1563 [38:47<03:24, 1.57s/it] Training 1/1 epoch (loss 1.5314): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 
1433/1563 [38:49<03:24, 1.57s/it] Training 1/1 epoch (loss 1.5314): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1434/1563 [38:49<03:32, 1.65s/it] Training 1/1 epoch (loss 1.5945): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1434/1563 [38:51<03:32, 1.65s/it] Training 1/1 epoch (loss 1.5945): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1435/1563 [38:51<03:58, 1.87s/it] Training 1/1 epoch (loss 1.6092): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1435/1563 [38:53<03:58, 1.87s/it] Training 1/1 epoch (loss 1.6092): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1436/1563 [38:53<03:38, 1.72s/it] Training 1/1 epoch (loss 1.5960): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1436/1563 [38:55<03:38, 1.72s/it] Training 1/1 epoch (loss 1.5960): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1437/1563 [38:55<04:01, 1.92s/it] Training 1/1 epoch (loss 1.5735): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1437/1563 [38:57<04:01, 1.92s/it] Training 1/1 epoch (loss 1.5735): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1438/1563 [38:57<03:41, 1.78s/it] Training 1/1 epoch (loss 1.6203): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1438/1563 [38:58<03:41, 1.78s/it] Training 1/1 epoch (loss 1.6203): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1439/1563 [38:58<03:33, 1.72s/it] Training 1/1 epoch (loss 1.5643): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1439/1563 [39:00<03:33, 1.72s/it] Training 1/1 epoch (loss 1.5643): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1440/1563 [39:00<03:18, 1.61s/it] Training 1/1 epoch (loss 1.5062): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1440/1563 [39:01<03:18, 1.61s/it] Training 1/1 epoch (loss 1.5062): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1441/1563 [39:01<02:57, 1.45s/it] Training 1/1 epoch (loss 1.3604): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1441/1563 [39:03<02:57, 1.45s/it] Training 1/1 epoch (loss 1.3604): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1442/1563 [39:03<03:11, 1.58s/it] Training 1/1 epoch (loss 1.5526): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1442/1563 [39:04<03:11, 1.58s/it] Training 1/1 epoch (loss 1.5526): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1443/1563 [39:04<03:02, 
1.52s/it] Training 1/1 epoch (loss 1.5269): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1443/1563 [39:06<03:02, 1.52s/it] Training 1/1 epoch (loss 1.5269): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1444/1563 [39:06<03:07, 1.58s/it] Training 1/1 epoch (loss 1.4810): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1444/1563 [39:08<03:07, 1.58s/it] Training 1/1 epoch (loss 1.4810): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1445/1563 [39:08<03:20, 1.70s/it] Training 1/1 epoch (loss 1.5343): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1445/1563 [39:09<03:20, 1.70s/it] Training 1/1 epoch (loss 1.5343): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1446/1563 [39:09<03:14, 1.66s/it] Training 1/1 epoch (loss 1.6185): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1446/1563 [39:11<03:14, 1.66s/it] Training 1/1 epoch (loss 1.6185): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1447/1563 [39:11<03:22, 1.75s/it] Training 1/1 epoch (loss 1.6200): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1447/1563 [39:13<03:22, 1.75s/it] Training 1/1 epoch (loss 1.6200): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1448/1563 [39:13<03:14, 1.69s/it] Training 1/1 epoch (loss 1.7273): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1448/1563 [39:14<03:14, 1.69s/it] Training 1/1 epoch (loss 1.7273): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1449/1563 [39:14<03:15, 1.71s/it] Training 1/1 epoch (loss 1.5813): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1449/1563 [39:16<03:15, 1.71s/it] Training 1/1 epoch (loss 1.5813): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1450/1563 [39:16<02:56, 1.56s/it] Training 1/1 epoch (loss 1.4603): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1450/1563 [39:18<02:56, 1.56s/it] Training 1/1 epoch (loss 1.4603): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1451/1563 [39:18<03:11, 1.71s/it] Training 1/1 epoch (loss 1.5061): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1451/1563 [39:20<03:11, 1.71s/it] Training 1/1 epoch (loss 1.5061): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1452/1563 [39:20<03:20, 1.81s/it] Training 1/1 epoch (loss 1.5159): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1452/1563 [39:22<03:20, 1.81s/it] Training 1/1 
epoch (loss 1.5159): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1453/1563 [39:22<03:18, 1.80s/it] Training 1/1 epoch (loss 1.5244): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1453/1563 [39:24<03:18, 1.80s/it] Training 1/1 epoch (loss 1.5244): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1454/1563 [39:24<03:27, 1.90s/it] Training 1/1 epoch (loss 1.4565): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1454/1563 [39:25<03:27, 1.90s/it] Training 1/1 epoch (loss 1.4565): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1455/1563 [39:25<03:18, 1.84s/it] Training 1/1 epoch (loss 1.5530): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1455/1563 [39:27<03:18, 1.84s/it] Training 1/1 epoch (loss 1.5530): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1456/1563 [39:27<03:00, 1.68s/it] Training 1/1 epoch (loss 1.5756): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1456/1563 [39:28<03:00, 1.68s/it] Training 1/1 epoch (loss 1.5756): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1457/1563 [39:28<02:57, 1.67s/it] Training 1/1 epoch (loss 1.6876): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1457/1563 [39:30<02:57, 1.67s/it] Training 1/1 epoch (loss 1.6876): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1458/1563 [39:30<02:57, 1.69s/it] Training 1/1 epoch (loss 1.5261): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1458/1563 [39:32<02:57, 1.69s/it] Training 1/1 epoch (loss 1.5261): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1459/1563 [39:32<02:54, 1.68s/it] Training 1/1 epoch (loss 1.5854): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1459/1563 [39:33<02:54, 1.68s/it] Training 1/1 epoch (loss 1.5854): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1460/1563 [39:33<02:41, 1.57s/it] Training 1/1 epoch (loss 1.5629): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1460/1563 [39:35<02:41, 1.57s/it] Training 1/1 epoch (loss 1.5629): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1461/1563 [39:35<02:42, 1.59s/it] Training 1/1 epoch (loss 1.5019): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1461/1563 [39:37<02:42, 1.59s/it] Training 1/1 epoch (loss 1.5019): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1462/1563 [39:37<02:46, 1.65s/it] Training 1/1 epoch (loss 1.5659): 
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1462/1563 [39:38<02:46, 1.65s/it] Training 1/1 epoch (loss 1.5659): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1463/1563 [39:38<02:53, 1.74s/it] Training 1/1 epoch (loss 1.5474): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1463/1563 [39:41<02:53, 1.74s/it] Training 1/1 epoch (loss 1.5474): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1464/1563 [39:41<03:13, 1.96s/it] Training 1/1 epoch (loss 1.5639): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1464/1563 [39:42<03:13, 1.96s/it] Training 1/1 epoch (loss 1.5639): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1465/1563 [39:42<03:00, 1.84s/it] Training 1/1 epoch (loss 1.6633): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 1465/1563 [39:44<03:00, 1.84s/it] Training 1/1 epoch (loss 1.6633): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1466/1563 [39:44<02:46, 1.72s/it] Training 1/1 epoch (loss 1.6391): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1466/1563 [39:46<02:46, 1.72s/it] Training 1/1 epoch (loss 1.6391): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1467/1563 [39:46<02:42, 1.70s/it] Training 1/1 epoch (loss 1.4604): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1467/1563 [39:47<02:42, 1.70s/it] Training 1/1 epoch (loss 1.4604): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1468/1563 [39:47<02:34, 1.62s/it] Training 1/1 epoch (loss 1.5788): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1468/1563 [39:49<02:34, 1.62s/it] Training 1/1 epoch (loss 1.5788): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1469/1563 [39:49<02:43, 1.74s/it] Training 1/1 epoch (loss 1.6496): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1469/1563 [39:51<02:43, 1.74s/it] Training 1/1 epoch (loss 1.6496): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1470/1563 [39:51<02:49, 1.83s/it] Training 1/1 epoch (loss 1.6207): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1470/1563 [39:52<02:49, 1.83s/it] Training 1/1 epoch (loss 1.6207): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1471/1563 [39:52<02:30, 1.64s/it] Training 1/1 epoch (loss 1.5088): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1471/1563 [39:54<02:30, 1.64s/it] Training 1/1 epoch (loss 1.5088): 
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1472/1563 [39:54<02:27, 1.62s/it] Training 1/1 epoch (loss 1.5248): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1472/1563 [39:56<02:27, 1.62s/it] Training 1/1 epoch (loss 1.5248): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1473/1563 [39:56<02:33, 1.71s/it] Training 1/1 epoch (loss 1.5937): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1473/1563 [39:58<02:33, 1.71s/it] Training 1/1 epoch (loss 1.5937): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1474/1563 [39:58<02:47, 1.88s/it] Training 1/1 epoch (loss 1.5075): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1474/1563 [40:00<02:47, 1.88s/it] Training 1/1 epoch (loss 1.5075): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1475/1563 [40:00<02:50, 1.94s/it] Training 1/1 epoch (loss 1.5595): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1475/1563 [40:02<02:50, 1.94s/it] Training 1/1 epoch (loss 1.5595): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1476/1563 [40:02<02:36, 1.80s/it] Training 1/1 epoch (loss 1.4925): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1476/1563 [40:03<02:36, 1.80s/it] Training 1/1 epoch (loss 1.4925): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1477/1563 [40:03<02:28, 1.72s/it] Training 1/1 epoch (loss 1.4924): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1477/1563 [40:05<02:28, 1.72s/it] Training 1/1 epoch (loss 1.4924): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1478/1563 [40:05<02:33, 1.81s/it] Training 1/1 epoch (loss 1.4982): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1478/1563 [40:07<02:33, 1.81s/it] Training 1/1 epoch (loss 1.4982): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1479/1563 [40:07<02:44, 1.96s/it] Training 1/1 epoch (loss 1.5138): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1479/1563 [40:10<02:44, 1.96s/it] Training 1/1 epoch (loss 1.5138): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1480/1563 [40:10<02:49, 2.04s/it] Training 1/1 epoch (loss 1.6070): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1480/1563 [40:11<02:49, 2.04s/it] Training 1/1 epoch (loss 1.6070): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1481/1563 [40:11<02:38, 1.94s/it] Training 1/1 epoch (loss 1.4837): 
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1481/1563 [40:13<02:38, 1.94s/it] Training 1/1 epoch (loss 1.4837): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1482/1563 [40:13<02:24, 1.78s/it] Training 1/1 epoch (loss 1.5480): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1482/1563 [40:14<02:24, 1.78s/it] Training 1/1 epoch (loss 1.5480): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1483/1563 [40:14<02:13, 1.66s/it] Training 1/1 epoch (loss 1.5031): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1483/1563 [40:16<02:13, 1.66s/it] Training 1/1 epoch (loss 1.5031): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1484/1563 [40:16<02:22, 1.81s/it] Training 1/1 epoch (loss 1.5300): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 1484/1563 [40:18<02:22, 1.81s/it] Training 1/1 epoch (loss 1.5300): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1485/1563 [40:18<02:15, 1.73s/it] Training 1/1 epoch (loss 1.4817): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1485/1563 [40:20<02:15, 1.73s/it] Training 1/1 epoch (loss 1.4817): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1486/1563 [40:20<02:16, 1.78s/it] Training 1/1 epoch (loss 1.6484): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1486/1563 [40:21<02:16, 1.78s/it] Training 1/1 epoch (loss 1.6484): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1487/1563 [40:21<02:07, 1.67s/it] Training 1/1 epoch (loss 1.4696): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1487/1563 [40:23<02:07, 1.67s/it] Training 1/1 epoch (loss 1.4696): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1488/1563 [40:23<02:11, 1.75s/it] Training 1/1 epoch (loss 1.4966): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1488/1563 [40:26<02:11, 1.75s/it] Training 1/1 epoch (loss 1.4966): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1489/1563 [40:26<02:25, 1.96s/it] Training 1/1 epoch (loss 1.6346): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1489/1563 [40:28<02:25, 1.96s/it] Training 1/1 epoch (loss 1.6346): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1490/1563 [40:28<02:25, 2.00s/it] Training 1/1 epoch (loss 1.4835): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1490/1563 [40:29<02:25, 2.00s/it] Training 1/1 epoch (loss 1.4835): 
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1491/1563 [40:29<02:19, 1.94s/it] Training 1/1 epoch (loss 1.4934): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1491/1563 [40:31<02:19, 1.94s/it] Training 1/1 epoch (loss 1.4934): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1492/1563 [40:31<02:08, 1.81s/it] Training 1/1 epoch (loss 1.5729): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1492/1563 [40:33<02:08, 1.81s/it] Training 1/1 epoch (loss 1.5729): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1493/1563 [40:33<02:08, 1.83s/it] Training 1/1 epoch (loss 1.5735): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1493/1563 [40:34<02:08, 1.83s/it] Training 1/1 epoch (loss 1.5735): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1494/1563 [40:34<01:57, 1.70s/it] Training 1/1 epoch (loss 1.4777): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1494/1563 [40:36<01:57, 1.70s/it] Training 1/1 epoch (loss 1.4777): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1495/1563 [40:36<01:55, 1.70s/it] Training 1/1 epoch (loss 1.5641): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1495/1563 [40:38<01:55, 1.70s/it] Training 1/1 epoch (loss 1.5641): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1496/1563 [40:38<01:51, 1.66s/it] Training 1/1 epoch (loss 1.4738): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1496/1563 [40:39<01:51, 1.66s/it] Training 1/1 epoch (loss 1.4738): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1497/1563 [40:39<01:47, 1.63s/it] Training 1/1 epoch (loss 1.5862): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1497/1563 [40:41<01:47, 1.63s/it] Training 1/1 epoch (loss 1.5862): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1498/1563 [40:41<01:51, 1.72s/it] Training 1/1 epoch (loss 1.5015): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1498/1563 [40:44<01:51, 1.72s/it] Training 1/1 epoch (loss 1.5015): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1499/1563 [40:44<02:05, 1.96s/it] Training 1/1 epoch (loss 1.3891): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1499/1563 [40:45<02:05, 1.96s/it] Training 1/1 epoch (loss 1.3891): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 1500/1563 [40:45<01:52, 1.79s/it] Training 1/1 epoch (loss 1.5074): 
96%|█████████▌| 1500/1563 [40:47<01:52, 1.79s/it]
Training 1/1 epoch (loss 1.5074): 96%|█████████▌| 1501/1563 [40:47<02:03, 2.00s/it]
Training 1/1 epoch (loss 1.6277): 96%|█████████▌| 1501/1563 [40:49<02:03, 2.00s/it]
Training 1/1 epoch (loss 1.6277): 96%|█████████▌| 1502/1563 [40:49<02:01, 1.99s/it]
Training 1/1 epoch (loss 1.5430): 96%|█████████▌| 1502/1563 [40:52<02:01, 1.99s/it]
Training 1/1 epoch (loss 1.5430): 96%|█████████▌| 1503/1563 [40:52<02:05, 2.10s/it]
Training 1/1 epoch (loss 1.5412): 96%|█████████▌| 1503/1563 [40:53<02:05, 2.10s/it]
Training 1/1 epoch (loss 1.5412): 96%|█████████▌| 1504/1563 [40:53<01:57, 2.00s/it]
Training 1/1 epoch (loss 1.4844): 96%|█████████▌| 1504/1563 [40:55<01:57, 2.00s/it]
Training 1/1 epoch (loss 1.4844): 96%|█████████▋| 1505/1563 [40:55<01:48, 1.88s/it]
Training 1/1 epoch (loss 1.5431): 96%|█████████▋| 1505/1563 [40:56<01:48, 1.88s/it]
Training 1/1 epoch (loss 1.5431): 96%|█████████▋| 1506/1563 [40:56<01:36, 1.70s/it]
Training 1/1 epoch (loss 1.5799): 96%|█████████▋| 1506/1563 [40:58<01:36, 1.70s/it]
Training 1/1 epoch (loss 1.5799): 96%|█████████▋| 1507/1563 [40:58<01:34, 1.69s/it]
Training 1/1 epoch (loss 1.4788): 96%|█████████▋| 1507/1563 [41:00<01:34, 1.69s/it]
Training 1/1 epoch (loss 1.4788): 96%|█████████▋| 1508/1563 [41:00<01:44, 1.90s/it]
Training 1/1 epoch (loss 1.5463): 96%|█████████▋| 1508/1563 [41:03<01:44, 1.90s/it]
Training 1/1 epoch (loss 1.5463): 97%|█████████▋| 1509/1563 [41:03<01:51, 2.07s/it]
Training 1/1 epoch (loss 1.5536): 97%|█████████▋| 1509/1563 [41:04<01:51, 2.07s/it]
Training 1/1 epoch (loss 1.5536): 97%|█████████▋| 1510/1563 [41:04<01:37, 1.83s/it]
Training 1/1 epoch (loss 1.5866): 97%|█████████▋| 1510/1563 [41:06<01:37, 1.83s/it]
Training 1/1 epoch (loss 1.5866): 97%|█████████▋| 1511/1563 [41:06<01:39, 1.92s/it]
Training 1/1 epoch (loss 1.5175): 97%|█████████▋| 1511/1563 [41:09<01:39, 1.92s/it]
Training 1/1 epoch (loss 1.5175): 97%|█████████▋| 1512/1563 [41:09<01:51, 2.18s/it]
Training 1/1 epoch (loss 1.5014): 97%|█████████▋| 1512/1563 [41:11<01:51, 2.18s/it]
Training 1/1 epoch (loss 1.5014): 97%|█████████▋| 1513/1563 [41:11<01:46, 2.13s/it]
Training 1/1 epoch (loss 1.5454): 97%|█████████▋| 1513/1563 [41:13<01:46, 2.13s/it]
Training 1/1 epoch (loss 1.5454): 97%|█████████▋| 1514/1563 [41:13<01:35, 1.96s/it]
Training 1/1 epoch (loss 1.5638): 97%|█████████▋| 1514/1563 [41:15<01:35, 1.96s/it]
Training 1/1 epoch (loss 1.5638): 97%|█████████▋| 1515/1563 [41:15<01:35, 2.00s/it]
Training 1/1 epoch (loss 1.4972): 97%|█████████▋| 1515/1563 [41:17<01:35, 2.00s/it]
Training 1/1 epoch (loss 1.4972): 97%|█████████▋| 1516/1563 [41:17<01:32, 1.98s/it]
Training 1/1 epoch (loss 1.5307): 97%|█████████▋| 1516/1563 [41:19<01:32, 1.98s/it]
Training 1/1 epoch (loss 1.5307): 97%|█████████▋| 1517/1563 [41:19<01:30, 1.97s/it]
Training 1/1 epoch (loss 1.4196): 97%|█████████▋| 1517/1563 [41:20<01:30, 1.97s/it]
Training 1/1 epoch (loss 1.4196): 97%|█████████▋| 1518/1563 [41:20<01:23, 1.86s/it]
Training 1/1 epoch (loss 1.5340): 97%|█████████▋| 1518/1563 [41:22<01:23, 1.86s/it]
Training 1/1 epoch (loss 1.5340): 97%|█████████▋| 1519/1563 [41:22<01:16, 1.75s/it]
Training 1/1 epoch (loss 1.4240): 97%|█████████▋| 1519/1563 [41:24<01:16, 1.75s/it]
Training 1/1 epoch (loss 1.4240): 97%|█████████▋| 1520/1563 [41:24<01:21, 1.88s/it]
Training 1/1 epoch (loss 1.4708): 97%|█████████▋| 1520/1563 [41:25<01:21, 1.88s/it]
Training 1/1 epoch (loss 1.4708): 97%|█████████▋| 1521/1563 [41:25<01:13, 1.74s/it]
Training 1/1 epoch (loss 1.5975): 97%|█████████▋| 1521/1563 [41:27<01:13, 1.74s/it]
Training 1/1 epoch (loss 1.5975): 97%|█████████▋| 1522/1563 [41:27<01:08, 1.68s/it]
Training 1/1 epoch (loss 1.5884): 97%|█████████▋| 1522/1563 [41:28<01:08, 1.68s/it]
Training 1/1 epoch (loss 1.5884): 97%|█████████▋| 1523/1563 [41:28<01:04, 1.61s/it]
Training 1/1 epoch (loss 1.5947): 97%|█████████▋| 1523/1563 [41:30<01:04, 1.61s/it]
Training 1/1 epoch (loss 1.5947): 98%|█████████▊| 1524/1563 [41:30<01:06, 1.71s/it]
Training 1/1 epoch (loss 1.6047): 98%|█████████▊| 1524/1563 [41:32<01:06, 1.71s/it]
Training 1/1 epoch (loss 1.6047): 98%|█████████▊| 1525/1563 [41:32<01:00, 1.59s/it]
Training 1/1 epoch (loss 1.4569): 98%|█████████▊| 1525/1563 [41:33<01:00, 1.59s/it]
Training 1/1 epoch (loss 1.4569): 98%|█████████▊| 1526/1563 [41:33<00:55, 1.50s/it]
Training 1/1 epoch (loss 1.4617): 98%|█████████▊| 1526/1563 [41:35<00:55, 1.50s/it]
Training 1/1 epoch (loss 1.4617): 98%|█████████▊| 1527/1563 [41:35<00:56, 1.57s/it]
Training 1/1 epoch (loss 1.5920): 98%|█████████▊| 1527/1563 [41:37<00:56, 1.57s/it]
Training 1/1 epoch (loss 1.5920): 98%|█████████▊| 1528/1563 [41:37<01:02, 1.78s/it]
Training 1/1 epoch (loss 1.5137): 98%|█████████▊| 1528/1563 [41:38<01:02, 1.78s/it]
Training 1/1 epoch (loss 1.5137): 98%|█████████▊| 1529/1563 [41:38<00:57, 1.70s/it]
Training 1/1 epoch (loss 1.6578): 98%|█████████▊| 1529/1563 [41:40<00:57, 1.70s/it]
Training 1/1 epoch (loss 1.6578): 98%|█████████▊| 1530/1563 [41:40<00:57, 1.73s/it]
Training 1/1 epoch (loss 1.5274): 98%|█████████▊| 1530/1563 [41:42<00:57, 1.73s/it]
Training 1/1 epoch (loss 1.5274): 98%|█████████▊| 1531/1563 [41:42<00:58, 1.82s/it]
Training 1/1 epoch (loss 1.6535): 98%|█████████▊| 1531/1563 [41:44<00:58, 1.82s/it]
Training 1/1 epoch (loss 1.6535): 98%|█████████▊| 1532/1563 [41:44<00:52, 1.70s/it]
Training 1/1 epoch (loss 1.5437): 98%|█████████▊| 1532/1563 [41:46<00:52, 1.70s/it]
Training 1/1 epoch (loss 1.5437): 98%|█████████▊| 1533/1563 [41:46<00:53, 1.79s/it]
Training 1/1 epoch (loss 1.5714): 98%|█████████▊| 1533/1563 [41:47<00:53, 1.79s/it]
Training 1/1 epoch (loss 1.5714): 98%|█████████▊| 1534/1563 [41:47<00:51, 1.78s/it]
Training 1/1 epoch (loss 1.4978): 98%|█████████▊| 1534/1563 [41:49<00:51, 1.78s/it]
Training 1/1 epoch (loss 1.4978): 98%|█████████▊| 1535/1563 [41:49<00:49, 1.76s/it]
Training 1/1 epoch (loss 1.6911): 98%|█████████▊| 1535/1563 [41:51<00:49, 1.76s/it]
Training 1/1 epoch (loss 1.6911): 98%|█████████▊| 1536/1563 [41:51<00:45, 1.69s/it]
Training 1/1 epoch (loss 1.5552): 98%|█████████▊| 1536/1563 [41:52<00:45, 1.69s/it]
Training 1/1 epoch (loss 1.5552): 98%|█████████▊| 1537/1563 [41:52<00:40, 1.56s/it]
Training 1/1 epoch (loss 1.5253): 98%|█████████▊| 1537/1563 [41:53<00:40, 1.56s/it]
Training 1/1 epoch (loss 1.5253): 98%|█████████▊| 1538/1563 [41:53<00:37, 1.51s/it]
Training 1/1 epoch (loss 1.5827): 98%|█████████▊| 1538/1563 [41:55<00:37, 1.51s/it]
Training 1/1 epoch (loss 1.5827): 98%|█████████▊| 1539/1563 [41:55<00:36, 1.53s/it]
Training 1/1 epoch (loss 1.5289): 98%|█████████▊| 1539/1563 [41:56<00:36, 1.53s/it]
Training 1/1 epoch (loss 1.5289): 99%|█████████▊| 1540/1563 [41:56<00:35, 1.55s/it]
Training 1/1 epoch (loss 1.5936): 99%|█████████▊| 1540/1563 [41:58<00:35, 1.55s/it]
Training 1/1 epoch (loss 1.5936): 99%|█████████▊| 1541/1563 [41:58<00:36, 1.66s/it]
Training 1/1 epoch (loss 1.5494): 99%|█████████▊| 1541/1563 [42:00<00:36, 1.66s/it]
Training 1/1 epoch (loss 1.5494): 99%|█████████▊| 1542/1563 [42:00<00:32, 1.52s/it]
Training 1/1 epoch (loss 1.6090): 99%|█████████▊| 1542/1563 [42:02<00:32, 1.52s/it]
Training 1/1 epoch (loss 1.6090): 99%|█████████▊| 1543/1563 [42:02<00:36, 1.81s/it]
Training 1/1 epoch (loss 1.5101): 99%|█████████▊| 1543/1563 [42:04<00:36, 1.81s/it]
Training 1/1 epoch (loss 1.5101): 99%|█████████▉| 1544/1563 [42:04<00:33, 1.77s/it]
Training 1/1 epoch (loss 1.4341): 99%|█████████▉| 1544/1563 [42:06<00:33, 1.77s/it]
Training 1/1 epoch (loss 1.4341): 99%|█████████▉| 1545/1563 [42:06<00:32, 1.83s/it]
Training 1/1 epoch (loss 1.5332): 99%|█████████▉| 1545/1563 [42:07<00:32, 1.83s/it]
Training 1/1 epoch (loss 1.5332): 99%|█████████▉| 1546/1563 [42:07<00:30, 1.79s/it]
Training 1/1 epoch (loss 1.5551): 99%|█████████▉| 1546/1563 [42:10<00:30, 1.79s/it]
Training 1/1 epoch (loss 1.5551): 99%|█████████▉| 1547/1563 [42:10<00:30, 1.90s/it]
Training 1/1 epoch (loss 1.4410): 99%|█████████▉| 1547/1563 [42:11<00:30, 1.90s/it]
Training 1/1 epoch (loss 1.4410): 99%|█████████▉| 1548/1563 [42:11<00:28, 1.87s/it]
Training 1/1 epoch (loss 1.5564): 99%|█████████▉| 1548/1563 [42:13<00:28, 1.87s/it]
Training 1/1 epoch (loss 1.5564): 99%|█████████▉| 1549/1563 [42:13<00:26, 1.86s/it]
Training 1/1 epoch (loss 1.6096): 99%|█████████▉| 1549/1563 [42:15<00:26, 1.86s/it]
Training 1/1 epoch (loss 1.6096): 99%|█████████▉| 1550/1563 [42:15<00:23, 1.81s/it]
Training 1/1 epoch (loss 1.5366): 99%|█████████▉| 1550/1563 [42:16<00:23, 1.81s/it]
Training 1/1 epoch (loss 1.5366): 99%|█████████▉| 1551/1563 [42:16<00:19, 1.66s/it]
Training 1/1 epoch (loss 1.5916): 99%|█████████▉| 1551/1563 [42:18<00:19, 1.66s/it]
Training 1/1 epoch (loss 1.5916): 99%|█████████▉| 1552/1563 [42:18<00:19, 1.77s/it]
Training 1/1 epoch (loss 1.6632): 99%|█████████▉| 1552/1563 [42:20<00:19, 1.77s/it]
Training 1/1 epoch (loss 1.6632): 99%|█████████▉| 1553/1563 [42:20<00:16, 1.64s/it]
Training 1/1 epoch (loss 1.5903): 99%|█████████▉| 1553/1563 [42:21<00:16, 1.64s/it]
Training 1/1 epoch (loss 1.5903): 99%|█████████▉| 1554/1563 [42:21<00:14, 1.59s/it]
Training 1/1 epoch (loss 1.5654): 99%|█████████▉| 1554/1563 [42:23<00:14, 1.59s/it]
Training 1/1 epoch (loss 1.5654): 99%|█████████▉| 1555/1563 [42:23<00:13, 1.65s/it]
Training 1/1 epoch (loss 1.6286): 99%|█████████▉| 1555/1563 [42:25<00:13, 1.65s/it]
Training 1/1 epoch (loss 1.6286): 100%|█████████▉| 1556/1563 [42:25<00:12, 1.81s/it]
Training 1/1 epoch (loss 1.5978): 100%|█████████▉| 1556/1563 [42:26<00:12, 1.81s/it]
Training 1/1 epoch (loss 1.5978): 100%|█████████▉| 1557/1563 [42:26<00:10, 1.69s/it]
Training 1/1 epoch (loss 1.5106): 100%|█████████▉| 1557/1563 [42:28<00:10, 1.69s/it]
Training 1/1 epoch (loss 1.5106): 100%|█████████▉| 1558/1563 [42:28<00:08, 1.77s/it]
Training 1/1 epoch (loss 1.3888): 100%|█████████▉| 1558/1563 [42:30<00:08, 1.77s/it]
Training 1/1 epoch (loss 1.3888): 100%|█████████▉| 1559/1563 [42:30<00:06, 1.64s/it]
Training 1/1 epoch (loss 1.5644): 100%|█████████▉| 1559/1563 [42:31<00:06, 1.64s/it]
Training 1/1 epoch (loss 1.5644): 100%|█████████▉| 1560/1563 [42:31<00:04, 1.48s/it]
Training 1/1 epoch (loss 1.4964): 100%|█████████▉| 1560/1563 [42:32<00:04, 1.48s/it]
Training 1/1 epoch (loss 1.4964): 100%|█████████▉| 1561/1563 [42:32<00:02, 1.46s/it]
Training 1/1 epoch (loss 1.5924): 100%|█████████▉| 1561/1563 [42:34<00:02, 1.46s/it]
Training 1/1 epoch (loss 1.5924): 100%|█████████▉| 1562/1563 [42:34<00:01, 1.52s/it]
Training 1/1 epoch (loss 1.6350): 100%|█████████▉| 1562/1563 [42:36<00:01, 1.52s/it]
Training 1/1 epoch (loss 1.6350): 100%|██████████| 1563/1563 [42:36<00:00, 1.55s/it]
Training 1/1 epoch (loss 1.6350): 100%|██████████| 1563/1563 [42:36<00:00, 1.64s/it]
tokenizer config file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-50k/tokenizer_config.json
Special tokens file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-1T/tinyllama-1T-s3-Q1-50k/special_tokens_map.json
wandb: ERROR Problem finishing run
Exception ignored in atexit callback: <bound method rank_zero_only.<locals>.wrapper of <safe_rlhf.logger.Logger object at 0x15511874a6d0>>
Traceback (most recent call last):
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/utils.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/logger.py", line 183, in close
    self.wandb.finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 449, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 391, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2106, in finish
    return self._finish(exit_code)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2127, in _finish
    self._atexit_cleanup(exit_code=exit_code)
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2352, in _atexit_cleanup
    self._on_finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2609, in _on_finish
    wait_with_progress(
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress
    return wait_all_with_progress(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress
    return asyncio_compat.run(progress_loop_with_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run
    future = executor.submit(runner.run, fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown