+ deepspeed --master_port 25931 --module safe_rlhf.finetune --train_datasets inverse-json::/home/hansirui_1st/jiayi/resist/imdb_data/train/pos/10000/train.json --model_name_or_path /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T --max_length 512 --trust_remote_code True --epochs 1 --per_device_train_batch_size 1 --per_device_eval_batch_size 4 --gradient_accumulation_steps 8 --gradient_checkpointing --learning_rate 1e-5 --lr_warmup_ratio 0 --weight_decay 0.0 --lr_scheduler_type constant --weight_decay 0.0 --seed 42 --output_dir /aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-10000 --log_type wandb --log_run_name imdb-tinyllama-3T-s3-Q1-10000 --log_project Inverse_Alignment_IMDb --zero_stage 3 --offload none --bf16 True --tf32 True --save_16bit
nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
[rank4]:[W527 20:22:59.642662794 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 4] using GPU 4 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank1]:[W527 20:23:00.660061403 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank0]:[W527 20:23:00.681287585 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank6]:[W527 20:23:00.682936166 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 6] using GPU 6 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank5]:[W527 20:23:00.686569890 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 5] using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank7]:[W527 20:23:00.691523446 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 7] using GPU 7 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank3]:[W527 20:23:00.701841157 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank2]:[W527 20:23:00.712500378 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
Model config LlamaConfig {
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.52.1",
"use_cache": true,
"vocab_size": 32000
}
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/hansirui_1st/.cache/torch_extensions/py311_cu124/fused_adam/build.ninja...
/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/torch/utils/cpp_extension.py:2059: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
wandb: Currently logged in as: xtom to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.11
wandb: Run data is saved locally in /aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-10000/wandb/run-20250527_202314-j30d8uyq
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run imdb-tinyllama-3T-s3-Q1-10000
wandb: ⭐️ View project at https://wandb.ai/xtom/Inverse_Alignment_IMDb
wandb: 🚀 View run at https://wandb.ai/xtom/Inverse_Alignment_IMDb/runs/j30d8uyq
Training 1/1 epoch: 0%| | 0/1250 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training 1/1 epoch (loss 2.6895): 0%| | 0/1250 [00:08<?, ?it/s] Training 1/1 epoch (loss 2.6895): 0%| | 1/1250 [00:08<2:56:03, 8.46s/it] Training 1/1 epoch (loss 2.7344): 0%| | 1/1250 [00:10<2:56:03, 8.46s/it] Training 1/1 epoch (loss 2.7344): 0%| | 2/1250 [00:10<1:39:30, 4.78s/it] Training 1/1 epoch (loss 2.6357): 0%| | 2/1250 [00:12<1:39:30, 4.78s/it] Training 1/1 epoch (loss 2.6357): 0%| | 3/1250 [00:12<1:11:50, 3.46s/it] Training 1/1 epoch (loss 2.9972): 0%| | 3/1250 [00:13<1:11:50, 3.46s/it] Training 1/1 epoch (loss 2.9972): 0%| | 4/1250 [00:13<54:33, 2.63s/it] Training 1/1 epoch (loss 2.7498): 0%| | 4/1250 [00:14<54:33, 2.63s/it] Training 1/1 epoch (loss 2.7498): 0%| | 5/1250 [00:14<42:12, 2.03s/it] Training 1/1 epoch (loss 2.9447): 0%| | 5/1250 [00:16<42:12, 2.03s/it] Training 1/1 epoch (loss 2.9447): 0%| | 6/1250 [00:16<41:11, 1.99s/it] Training 1/1 epoch (loss 2.7219): 0%| | 6/1250 [00:18<41:11, 1.99s/it] Training 1/1 epoch (loss 2.7219): 1%| | 7/1250 [00:18<37:49, 1.83s/it] Training 1/1 epoch (loss 2.6385): 1%| | 7/1250 [00:18<37:49, 1.83s/it] Training 1/1 epoch (loss 2.6385): 1%| | 8/1250 [00:18<30:25, 1.47s/it] Training 1/1 epoch (loss 2.7001): 1%| | 8/1250 [00:21<30:25, 1.47s/it] Training 1/1 epoch (loss 2.7001): 1%| | 9/1250 [00:21<36:40, 1.77s/it] Training 1/1 epoch (loss 2.4852): 1%| | 9/1250 [00:22<36:40, 1.77s/it] Training 1/1 epoch (loss 2.4852): 1%| | 10/1250 [00:22<31:07, 1.51s/it] Training 1/1 epoch (loss 2.7661): 1%| | 10/1250 [00:23<31:07, 1.51s/it] Training 1/1 epoch (loss 2.7661): 1%| | 11/1250 [00:23<28:09, 1.36s/it] Training 1/1 epoch (loss 2.7108): 1%| | 11/1250 [00:24<28:09, 1.36s/it] Training 1/1 epoch (loss 2.7108): 1%| | 12/1250 [00:24<28:51, 1.40s/it] Training 1/1 epoch (loss 2.6189): 1%| | 12/1250 [00:25<28:51, 1.40s/it] Training 1/1 epoch (loss 2.6189): 1%| | 13/1250 [00:25<26:25, 1.28s/it] Training 1/1 epoch (loss 2.4735): 1%| | 13/1250 [00:27<26:25, 1.28s/it] Training 1/1 epoch (loss 2.4735): 1%| | 14/1250 [00:27<27:29, 
1.33s/it] Training 1/1 epoch (loss 2.7584): 1%| | 14/1250 [00:29<27:29, 1.33s/it] Training 1/1 epoch (loss 2.7584): 1%| | 15/1250 [00:29<30:27, 1.48s/it] Training 1/1 epoch (loss 2.6672): 1%| | 15/1250 [00:29<30:27, 1.48s/it] Training 1/1 epoch (loss 2.6672): 1%|▏ | 16/1250 [00:29<25:37, 1.25s/it] Training 1/1 epoch (loss 2.7078): 1%|▏ | 16/1250 [00:30<25:37, 1.25s/it] Training 1/1 epoch (loss 2.7078): 1%|▏ | 17/1250 [00:30<23:16, 1.13s/it] Training 1/1 epoch (loss 2.7807): 1%|▏ | 17/1250 [00:32<23:16, 1.13s/it] Training 1/1 epoch (loss 2.7807): 1%|▏ | 18/1250 [00:32<26:22, 1.28s/it] Training 1/1 epoch (loss 2.7497): 1%|▏ | 18/1250 [00:32<26:22, 1.28s/it] Training 1/1 epoch (loss 2.7497): 2%|▏ | 19/1250 [00:32<22:18, 1.09s/it] Training 1/1 epoch (loss 2.7083): 2%|▏ | 19/1250 [00:34<22:18, 1.09s/it] Training 1/1 epoch (loss 2.7083): 2%|▏ | 20/1250 [00:34<27:02, 1.32s/it] Training 1/1 epoch (loss 2.6306): 2%|▏ | 20/1250 [00:35<27:02, 1.32s/it] Training 1/1 epoch (loss 2.6306): 2%|▏ | 21/1250 [00:35<25:33, 1.25s/it] Training 1/1 epoch (loss 2.8995): 2%|▏ | 21/1250 [00:36<25:33, 1.25s/it] Training 1/1 epoch (loss 2.8995): 2%|▏ | 22/1250 [00:36<22:33, 1.10s/it] Training 1/1 epoch (loss 2.7049): 2%|▏ | 22/1250 [00:37<22:33, 1.10s/it] Training 1/1 epoch (loss 2.7049): 2%|▏ | 23/1250 [00:37<22:54, 1.12s/it] Training 1/1 epoch (loss 2.4003): 2%|▏ | 23/1250 [00:39<22:54, 1.12s/it] Training 1/1 epoch (loss 2.4003): 2%|▏ | 24/1250 [00:39<27:25, 1.34s/it] Training 1/1 epoch (loss 2.5756): 2%|▏ | 24/1250 [00:40<27:25, 1.34s/it] Training 1/1 epoch (loss 2.5756): 2%|▏ | 25/1250 [00:40<23:53, 1.17s/it] Training 1/1 epoch (loss 2.5142): 2%|▏ | 25/1250 [00:41<23:53, 1.17s/it] Training 1/1 epoch (loss 2.5142): 2%|▏ | 26/1250 [00:41<25:27, 1.25s/it] Training 1/1 epoch (loss 2.7769): 2%|▏ | 26/1250 [00:43<25:27, 1.25s/it] Training 1/1 epoch (loss 2.7769): 2%|▏ | 27/1250 [00:43<28:32, 1.40s/it] Training 1/1 epoch (loss 2.6898): 2%|▏ | 27/1250 [00:44<28:32, 1.40s/it] Training 1/1 epoch 
(loss 2.6898): 2%|▏ | 28/1250 [00:44<27:37, 1.36s/it] Training 1/1 epoch (loss 2.7262): 2%|▏ | 28/1250 [00:46<27:37, 1.36s/it] Training 1/1 epoch (loss 2.7262): 2%|▏ | 29/1250 [00:46<30:02, 1.48s/it] Training 1/1 epoch (loss 2.7273): 2%|▏ | 29/1250 [00:47<30:02, 1.48s/it] Training 1/1 epoch (loss 2.7273): 2%|▏ | 30/1250 [00:47<26:47, 1.32s/it] Training 1/1 epoch (loss 2.8782): 2%|▏ | 30/1250 [00:48<26:47, 1.32s/it] Training 1/1 epoch (loss 2.8782): 2%|▏ | 31/1250 [00:48<25:31, 1.26s/it] Training 1/1 epoch (loss 2.7278): 2%|▏ | 31/1250 [00:50<25:31, 1.26s/it] Training 1/1 epoch (loss 2.7278): 3%|β–Ž | 32/1250 [00:50<30:05, 1.48s/it] Training 1/1 epoch (loss 2.7458): 3%|β–Ž | 32/1250 [00:52<30:05, 1.48s/it] Training 1/1 epoch (loss 2.7458): 3%|β–Ž | 33/1250 [00:52<28:45, 1.42s/it] Training 1/1 epoch (loss 2.5771): 3%|β–Ž | 33/1250 [00:54<28:45, 1.42s/it] Training 1/1 epoch (loss 2.5771): 3%|β–Ž | 34/1250 [00:54<33:50, 1.67s/it] Training 1/1 epoch (loss 2.6981): 3%|β–Ž | 34/1250 [00:56<33:50, 1.67s/it] Training 1/1 epoch (loss 2.6981): 3%|β–Ž | 35/1250 [00:56<38:57, 1.92s/it] Training 1/1 epoch (loss 2.6769): 3%|β–Ž | 35/1250 [00:57<38:57, 1.92s/it] Training 1/1 epoch (loss 2.6769): 3%|β–Ž | 36/1250 [00:57<29:58, 1.48s/it] Training 1/1 epoch (loss 2.5012): 3%|β–Ž | 36/1250 [00:59<29:58, 1.48s/it] Training 1/1 epoch (loss 2.5012): 3%|β–Ž | 37/1250 [00:59<32:40, 1.62s/it] Training 1/1 epoch (loss 2.4744): 3%|β–Ž | 37/1250 [01:00<32:40, 1.62s/it] Training 1/1 epoch (loss 2.4744): 3%|β–Ž | 38/1250 [01:00<33:00, 1.63s/it] Training 1/1 epoch (loss 2.8530): 3%|β–Ž | 38/1250 [01:01<33:00, 1.63s/it] Training 1/1 epoch (loss 2.8530): 3%|β–Ž | 39/1250 [01:01<26:29, 1.31s/it] Training 1/1 epoch (loss 2.5973): 3%|β–Ž | 39/1250 [01:04<26:29, 1.31s/it] Training 1/1 epoch (loss 2.5973): 3%|β–Ž | 40/1250 [01:04<34:33, 1.71s/it] Training 1/1 epoch (loss 2.6672): 3%|β–Ž | 40/1250 [01:06<34:33, 1.71s/it] Training 1/1 epoch (loss 2.6672): 3%|β–Ž | 41/1250 [01:06<39:08, 1.94s/it] Training 
1/1 epoch (loss 2.5157): 3%|β–Ž | 41/1250 [01:06<39:08, 1.94s/it] Training 1/1 epoch (loss 2.5157): 3%|β–Ž | 42/1250 [01:06<29:54, 1.49s/it] Training 1/1 epoch (loss 2.6115): 3%|β–Ž | 42/1250 [01:08<29:54, 1.49s/it] Training 1/1 epoch (loss 2.6115): 3%|β–Ž | 43/1250 [01:08<30:22, 1.51s/it] Training 1/1 epoch (loss 2.6442): 3%|β–Ž | 43/1250 [01:10<30:22, 1.51s/it] Training 1/1 epoch (loss 2.6442): 4%|β–Ž | 44/1250 [01:10<35:52, 1.78s/it] Training 1/1 epoch (loss 2.6178): 4%|β–Ž | 44/1250 [01:11<35:52, 1.78s/it] Training 1/1 epoch (loss 2.6178): 4%|β–Ž | 45/1250 [01:11<27:52, 1.39s/it] Training 1/1 epoch (loss 2.4632): 4%|β–Ž | 45/1250 [01:13<27:52, 1.39s/it] Training 1/1 epoch (loss 2.4632): 4%|β–Ž | 46/1250 [01:13<32:05, 1.60s/it] Training 1/1 epoch (loss 2.6651): 4%|β–Ž | 46/1250 [01:15<32:05, 1.60s/it] Training 1/1 epoch (loss 2.6651): 4%|▍ | 47/1250 [01:15<35:32, 1.77s/it] Training 1/1 epoch (loss 2.7233): 4%|▍ | 47/1250 [01:16<35:32, 1.77s/it] Training 1/1 epoch (loss 2.7233): 4%|▍ | 48/1250 [01:16<28:05, 1.40s/it] Training 1/1 epoch (loss 2.5598): 4%|▍ | 48/1250 [01:18<28:05, 1.40s/it] Training 1/1 epoch (loss 2.5598): 4%|▍ | 49/1250 [01:18<30:43, 1.54s/it] Training 1/1 epoch (loss 2.6908): 4%|▍ | 49/1250 [01:19<30:43, 1.54s/it] Training 1/1 epoch (loss 2.6908): 4%|▍ | 50/1250 [01:19<31:22, 1.57s/it] Training 1/1 epoch (loss 2.6063): 4%|▍ | 50/1250 [01:20<31:22, 1.57s/it] Training 1/1 epoch (loss 2.6063): 4%|▍ | 51/1250 [01:20<24:43, 1.24s/it] Training 1/1 epoch (loss 2.6327): 4%|▍ | 51/1250 [01:22<24:43, 1.24s/it] Training 1/1 epoch (loss 2.6327): 4%|▍ | 52/1250 [01:22<32:22, 1.62s/it] Training 1/1 epoch (loss 2.6822): 4%|▍ | 52/1250 [01:23<32:22, 1.62s/it] Training 1/1 epoch (loss 2.6822): 4%|▍ | 53/1250 [01:23<29:05, 1.46s/it] Training 1/1 epoch (loss 2.4893): 4%|▍ | 53/1250 [01:24<29:05, 1.46s/it] Training 1/1 epoch (loss 2.4893): 4%|▍ | 54/1250 [01:24<23:39, 1.19s/it] Training 1/1 epoch (loss 2.5229): 4%|▍ | 54/1250 [01:26<23:39, 1.19s/it] Training 1/1 
epoch (loss 2.5229): 4%|▍ | 55/1250 [01:26<28:09, 1.41s/it] Training 1/1 epoch (loss 2.5454): 4%|▍ | 55/1250 [01:28<28:09, 1.41s/it] Training 1/1 epoch (loss 2.5454): 4%|▍ | 56/1250 [01:28<32:29, 1.63s/it] Training 1/1 epoch (loss 2.5369): 4%|▍ | 56/1250 [01:29<32:29, 1.63s/it] Training 1/1 epoch (loss 2.5369): 5%|▍ | 57/1250 [01:29<28:53, 1.45s/it] Training 1/1 epoch (loss 2.3972): 5%|▍ | 57/1250 [01:31<28:53, 1.45s/it] Training 1/1 epoch (loss 2.3972): 5%|▍ | 58/1250 [01:31<34:33, 1.74s/it] Training 1/1 epoch (loss 2.4651): 5%|▍ | 58/1250 [01:33<34:33, 1.74s/it] Training 1/1 epoch (loss 2.4651): 5%|▍ | 59/1250 [01:33<31:38, 1.59s/it] Training 1/1 epoch (loss 2.6384): 5%|▍ | 59/1250 [01:34<31:38, 1.59s/it] Training 1/1 epoch (loss 2.6384): 5%|▍ | 60/1250 [01:34<28:26, 1.43s/it] Training 1/1 epoch (loss 2.4940): 5%|▍ | 60/1250 [01:35<28:26, 1.43s/it] Training 1/1 epoch (loss 2.4940): 5%|▍ | 61/1250 [01:35<30:48, 1.55s/it] Training 1/1 epoch (loss 2.6199): 5%|▍ | 61/1250 [01:37<30:48, 1.55s/it] Training 1/1 epoch (loss 2.6199): 5%|▍ | 62/1250 [01:37<29:39, 1.50s/it] Training 1/1 epoch (loss 2.7417): 5%|▍ | 62/1250 [01:39<29:39, 1.50s/it] Training 1/1 epoch (loss 2.7417): 5%|β–Œ | 63/1250 [01:39<35:30, 1.79s/it] Training 1/1 epoch (loss 2.8153): 5%|β–Œ | 63/1250 [01:41<35:30, 1.79s/it] Training 1/1 epoch (loss 2.8153): 5%|β–Œ | 64/1250 [01:41<35:50, 1.81s/it] Training 1/1 epoch (loss 2.6814): 5%|β–Œ | 64/1250 [01:42<35:50, 1.81s/it] Training 1/1 epoch (loss 2.6814): 5%|β–Œ | 65/1250 [01:42<27:49, 1.41s/it] Training 1/1 epoch (loss 2.5319): 5%|β–Œ | 65/1250 [01:43<27:49, 1.41s/it] Training 1/1 epoch (loss 2.5319): 5%|β–Œ | 66/1250 [01:43<28:27, 1.44s/it] Training 1/1 epoch (loss 2.5994): 5%|β–Œ | 66/1250 [01:45<28:27, 1.44s/it] Training 1/1 epoch (loss 2.5994): 5%|β–Œ | 67/1250 [01:45<28:59, 1.47s/it] Training 1/1 epoch (loss 2.6963): 5%|β–Œ | 67/1250 [01:45<28:59, 1.47s/it] Training 1/1 epoch (loss 2.6963): 5%|β–Œ | 68/1250 [01:45<23:37, 1.20s/it] Training 1/1 epoch 
(loss 2.5470): 5%|β–Œ | 68/1250 [01:47<23:37, 1.20s/it] Training 1/1 epoch (loss 2.5470): 6%|β–Œ | 69/1250 [01:47<24:11, 1.23s/it] Training 1/1 epoch (loss 2.6826): 6%|β–Œ | 69/1250 [01:49<24:11, 1.23s/it] Training 1/1 epoch (loss 2.6826): 6%|β–Œ | 70/1250 [01:49<28:58, 1.47s/it] Training 1/1 epoch (loss 2.5648): 6%|β–Œ | 70/1250 [01:49<28:58, 1.47s/it] Training 1/1 epoch (loss 2.5648): 6%|β–Œ | 71/1250 [01:49<24:45, 1.26s/it] Training 1/1 epoch (loss 2.5215): 6%|β–Œ | 71/1250 [01:52<24:45, 1.26s/it] Training 1/1 epoch (loss 2.5215): 6%|β–Œ | 72/1250 [01:52<32:32, 1.66s/it] Training 1/1 epoch (loss 2.7345): 6%|β–Œ | 72/1250 [01:54<32:32, 1.66s/it] Training 1/1 epoch (loss 2.7345): 6%|β–Œ | 73/1250 [01:54<31:49, 1.62s/it] Training 1/1 epoch (loss 2.6594): 6%|β–Œ | 73/1250 [01:55<31:49, 1.62s/it] Training 1/1 epoch (loss 2.6594): 6%|β–Œ | 74/1250 [01:55<31:55, 1.63s/it] Training 1/1 epoch (loss 2.5403): 6%|β–Œ | 74/1250 [01:57<31:55, 1.63s/it] Training 1/1 epoch (loss 2.5403): 6%|β–Œ | 75/1250 [01:57<31:40, 1.62s/it] Training 1/1 epoch (loss 2.6318): 6%|β–Œ | 75/1250 [01:58<31:40, 1.62s/it] Training 1/1 epoch (loss 2.6318): 6%|β–Œ | 76/1250 [01:58<28:49, 1.47s/it] Training 1/1 epoch (loss 2.5235): 6%|β–Œ | 76/1250 [01:59<28:49, 1.47s/it] Training 1/1 epoch (loss 2.5235): 6%|β–Œ | 77/1250 [01:59<26:41, 1.37s/it] Training 1/1 epoch (loss 2.7579): 6%|β–Œ | 77/1250 [02:01<26:41, 1.37s/it] Training 1/1 epoch (loss 2.7579): 6%|β–Œ | 78/1250 [02:01<29:48, 1.53s/it] Training 1/1 epoch (loss 2.5993): 6%|β–Œ | 78/1250 [02:02<29:48, 1.53s/it] Training 1/1 epoch (loss 2.5993): 6%|β–‹ | 79/1250 [02:02<27:51, 1.43s/it] Training 1/1 epoch (loss 2.5968): 6%|β–‹ | 79/1250 [02:04<27:51, 1.43s/it] Training 1/1 epoch (loss 2.5968): 6%|β–‹ | 80/1250 [02:04<28:26, 1.46s/it] Training 1/1 epoch (loss 2.6585): 6%|β–‹ | 80/1250 [02:05<28:26, 1.46s/it] Training 1/1 epoch (loss 2.6585): 6%|β–‹ | 81/1250 [02:05<27:35, 1.42s/it] Training 1/1 epoch (loss 2.7149): 6%|β–‹ | 81/1250 [02:07<27:35, 
1.42s/it] Training 1/1 epoch (loss 2.7149): 7%|β–‹ | 82/1250 [02:07<31:25, 1.61s/it] Training 1/1 epoch (loss 2.7487): 7%|β–‹ | 82/1250 [02:08<31:25, 1.61s/it] Training 1/1 epoch (loss 2.7487): 7%|β–‹ | 83/1250 [02:08<27:18, 1.40s/it] Training 1/1 epoch (loss 2.5938): 7%|β–‹ | 83/1250 [02:09<27:18, 1.40s/it] Training 1/1 epoch (loss 2.5938): 7%|β–‹ | 84/1250 [02:09<25:39, 1.32s/it] Training 1/1 epoch (loss 2.4955): 7%|β–‹ | 84/1250 [02:11<25:39, 1.32s/it] Training 1/1 epoch (loss 2.4955): 7%|β–‹ | 85/1250 [02:11<28:40, 1.48s/it] Training 1/1 epoch (loss 2.5313): 7%|β–‹ | 85/1250 [02:13<28:40, 1.48s/it] Training 1/1 epoch (loss 2.5313): 7%|β–‹ | 86/1250 [02:13<30:53, 1.59s/it] Training 1/1 epoch (loss 2.7306): 7%|β–‹ | 86/1250 [02:13<30:53, 1.59s/it] Training 1/1 epoch (loss 2.7306): 7%|β–‹ | 87/1250 [02:13<24:41, 1.27s/it] Training 1/1 epoch (loss 2.5057): 7%|β–‹ | 87/1250 [02:15<24:41, 1.27s/it] Training 1/1 epoch (loss 2.5057): 7%|β–‹ | 88/1250 [02:15<29:39, 1.53s/it] Training 1/1 epoch (loss 2.6186): 7%|β–‹ | 88/1250 [02:17<29:39, 1.53s/it] Training 1/1 epoch (loss 2.6186): 7%|β–‹ | 89/1250 [02:17<28:49, 1.49s/it] Training 1/1 epoch (loss 2.8304): 7%|β–‹ | 89/1250 [02:17<28:49, 1.49s/it] Training 1/1 epoch (loss 2.8304): 7%|β–‹ | 90/1250 [02:17<23:24, 1.21s/it] Training 1/1 epoch (loss 2.4773): 7%|β–‹ | 90/1250 [02:18<23:24, 1.21s/it] Training 1/1 epoch (loss 2.4773): 7%|β–‹ | 91/1250 [02:18<22:30, 1.17s/it] Training 1/1 epoch (loss 2.5533): 7%|β–‹ | 91/1250 [02:20<22:30, 1.17s/it] Training 1/1 epoch (loss 2.5533): 7%|β–‹ | 92/1250 [02:20<26:51, 1.39s/it] Training 1/1 epoch (loss 2.5505): 7%|β–‹ | 92/1250 [02:21<26:51, 1.39s/it] Training 1/1 epoch (loss 2.5505): 7%|β–‹ | 93/1250 [02:21<23:17, 1.21s/it] Training 1/1 epoch (loss 2.5337): 7%|β–‹ | 93/1250 [02:23<23:17, 1.21s/it] Training 1/1 epoch (loss 2.5337): 8%|β–Š | 94/1250 [02:23<29:44, 1.54s/it] Training 1/1 epoch (loss 2.5630): 8%|β–Š | 94/1250 [02:26<29:44, 1.54s/it] Training 1/1 epoch (loss 2.5630): 
8%|β–Š | 95/1250 [02:26<34:49, 1.81s/it] Training 1/1 epoch (loss 2.6254): 8%|β–Š | 95/1250 [02:27<34:49, 1.81s/it] Training 1/1 epoch (loss 2.6254): 8%|β–Š | 96/1250 [02:27<28:32, 1.48s/it] Training 1/1 epoch (loss 2.5093): 8%|β–Š | 96/1250 [02:28<28:32, 1.48s/it] Training 1/1 epoch (loss 2.5093): 8%|β–Š | 97/1250 [02:28<29:03, 1.51s/it] Training 1/1 epoch (loss 2.5464): 8%|β–Š | 97/1250 [02:29<29:03, 1.51s/it] Training 1/1 epoch (loss 2.5464): 8%|β–Š | 98/1250 [02:29<26:06, 1.36s/it] Training 1/1 epoch (loss 2.6155): 8%|β–Š | 98/1250 [02:30<26:06, 1.36s/it] Training 1/1 epoch (loss 2.6155): 8%|β–Š | 99/1250 [02:30<23:25, 1.22s/it] Training 1/1 epoch (loss 2.8116): 8%|β–Š | 99/1250 [02:32<23:25, 1.22s/it] Training 1/1 epoch (loss 2.8116): 8%|β–Š | 100/1250 [02:32<25:55, 1.35s/it] Training 1/1 epoch (loss 2.6058): 8%|β–Š | 100/1250 [02:33<25:55, 1.35s/it] Training 1/1 epoch (loss 2.6058): 8%|β–Š | 101/1250 [02:33<24:56, 1.30s/it] Training 1/1 epoch (loss 2.5238): 8%|β–Š | 101/1250 [02:34<24:56, 1.30s/it] Training 1/1 epoch (loss 2.5238): 8%|β–Š | 102/1250 [02:34<25:49, 1.35s/it] Training 1/1 epoch (loss 2.5948): 8%|β–Š | 102/1250 [02:36<25:49, 1.35s/it] Training 1/1 epoch (loss 2.5948): 8%|β–Š | 103/1250 [02:36<27:30, 1.44s/it] Training 1/1 epoch (loss 2.7258): 8%|β–Š | 103/1250 [02:37<27:30, 1.44s/it] Training 1/1 epoch (loss 2.7258): 8%|β–Š | 104/1250 [02:37<25:39, 1.34s/it] Training 1/1 epoch (loss 2.5715): 8%|β–Š | 104/1250 [02:39<25:39, 1.34s/it] Training 1/1 epoch (loss 2.5715): 8%|β–Š | 105/1250 [02:39<30:08, 1.58s/it] Training 1/1 epoch (loss 2.5886): 8%|β–Š | 105/1250 [02:41<30:08, 1.58s/it] Training 1/1 epoch (loss 2.5886): 8%|β–Š | 106/1250 [02:41<31:24, 1.65s/it] Training 1/1 epoch (loss 2.5757): 8%|β–Š | 106/1250 [02:42<31:24, 1.65s/it] Training 1/1 epoch (loss 2.5757): 9%|β–Š | 107/1250 [02:42<24:57, 1.31s/it] Training 1/1 epoch (loss 2.7386): 9%|β–Š | 107/1250 [02:44<24:57, 1.31s/it] Training 1/1 epoch (loss 2.7386): 9%|β–Š | 108/1250 [02:44<31:18, 
1.65s/it] Training 1/1 epoch (loss 2.6917): 9%|β–Š | 108/1250 [02:46<31:18, 1.65s/it] Training 1/1 epoch (loss 2.6917): 9%|β–Š | 109/1250 [02:46<31:23, 1.65s/it] Training 1/1 epoch (loss 2.6011): 9%|β–Š | 109/1250 [02:46<31:23, 1.65s/it] Training 1/1 epoch (loss 2.6011): 9%|β–‰ | 110/1250 [02:46<25:31, 1.34s/it] Training 1/1 epoch (loss 2.6944): 9%|β–‰ | 110/1250 [02:49<25:31, 1.34s/it] Training 1/1 epoch (loss 2.6944): 9%|β–‰ | 111/1250 [02:49<30:28, 1.61s/it] Training 1/1 epoch (loss 2.5051): 9%|β–‰ | 111/1250 [02:51<30:28, 1.61s/it] Training 1/1 epoch (loss 2.5051): 9%|β–‰ | 112/1250 [02:51<32:35, 1.72s/it] Training 1/1 epoch (loss 2.4703): 9%|β–‰ | 112/1250 [02:52<32:35, 1.72s/it] Training 1/1 epoch (loss 2.4703): 9%|β–‰ | 113/1250 [02:52<31:40, 1.67s/it] Training 1/1 epoch (loss 2.4521): 9%|β–‰ | 113/1250 [02:54<31:40, 1.67s/it] Training 1/1 epoch (loss 2.4521): 9%|β–‰ | 114/1250 [02:54<30:36, 1.62s/it] Training 1/1 epoch (loss 2.5111): 9%|β–‰ | 114/1250 [02:55<30:36, 1.62s/it] Training 1/1 epoch (loss 2.5111): 9%|β–‰ | 115/1250 [02:55<28:16, 1.50s/it] Training 1/1 epoch (loss 2.5967): 9%|β–‰ | 115/1250 [02:56<28:16, 1.50s/it] Training 1/1 epoch (loss 2.5967): 9%|β–‰ | 116/1250 [02:56<25:41, 1.36s/it] Training 1/1 epoch (loss 2.5230): 9%|β–‰ | 116/1250 [02:57<25:41, 1.36s/it] Training 1/1 epoch (loss 2.5230): 9%|β–‰ | 117/1250 [02:57<26:51, 1.42s/it] Training 1/1 epoch (loss 2.5070): 9%|β–‰ | 117/1250 [02:58<26:51, 1.42s/it] Training 1/1 epoch (loss 2.5070): 9%|β–‰ | 118/1250 [02:58<22:59, 1.22s/it] Training 1/1 epoch (loss 2.5612): 9%|β–‰ | 118/1250 [03:00<22:59, 1.22s/it] Training 1/1 epoch (loss 2.5612): 10%|β–‰ | 119/1250 [03:00<24:41, 1.31s/it] Training 1/1 epoch (loss 2.7372): 10%|β–‰ | 119/1250 [03:02<24:41, 1.31s/it] Training 1/1 epoch (loss 2.7372): 10%|β–‰ | 120/1250 [03:02<29:32, 1.57s/it] Training 1/1 epoch (loss 2.6908): 10%|β–‰ | 120/1250 [03:02<29:32, 1.57s/it] Training 1/1 epoch (loss 2.6908): 10%|β–‰ | 121/1250 [03:02<24:08, 1.28s/it] Training 
1/1 epoch (loss 2.4961): 10%|β–‰ | 121/1250 [03:04<24:08, 1.28s/it] Training 1/1 epoch (loss 2.4961): 10%|β–‰ | 122/1250 [03:04<24:24, 1.30s/it] Training 1/1 epoch (loss 2.6315): 10%|β–‰ | 122/1250 [03:05<24:24, 1.30s/it] Training 1/1 epoch (loss 2.6315): 10%|β–‰ | 123/1250 [03:05<25:33, 1.36s/it] Training 1/1 epoch (loss 2.4585): 10%|β–‰ | 123/1250 [03:06<25:33, 1.36s/it] Training 1/1 epoch (loss 2.4585): 10%|β–‰ | 124/1250 [03:06<23:31, 1.25s/it] Training 1/1 epoch (loss 2.4177): 10%|β–‰ | 124/1250 [03:08<23:31, 1.25s/it] Training 1/1 epoch (loss 2.4177): 10%|β–ˆ | 125/1250 [03:08<26:32, 1.42s/it] Training 1/1 epoch (loss 2.6298): 10%|β–ˆ | 125/1250 [03:10<26:32, 1.42s/it] Training 1/1 epoch (loss 2.6298): 10%|β–ˆ | 126/1250 [03:10<28:41, 1.53s/it] Training 1/1 epoch (loss 2.4514): 10%|β–ˆ | 126/1250 [03:11<28:41, 1.53s/it] Training 1/1 epoch (loss 2.4514): 10%|β–ˆ | 127/1250 [03:11<27:17, 1.46s/it] Training 1/1 epoch (loss 2.6429): 10%|β–ˆ | 127/1250 [03:12<27:17, 1.46s/it] Training 1/1 epoch (loss 2.6429): 10%|β–ˆ | 128/1250 [03:12<21:52, 1.17s/it] Training 1/1 epoch (loss 2.5333): 10%|β–ˆ | 128/1250 [03:13<21:52, 1.17s/it] Training 1/1 epoch (loss 2.5333): 10%|β–ˆ | 129/1250 [03:13<20:59, 1.12s/it] Training 1/1 epoch (loss 2.5523): 10%|β–ˆ | 129/1250 [03:14<20:59, 1.12s/it] Training 1/1 epoch (loss 2.5523): 10%|β–ˆ | 130/1250 [03:14<21:52, 1.17s/it] Training 1/1 epoch (loss 2.7132): 10%|β–ˆ | 130/1250 [03:16<21:52, 1.17s/it] Training 1/1 epoch (loss 2.7132): 10%|β–ˆ | 131/1250 [03:16<24:43, 1.33s/it] Training 1/1 epoch (loss 2.7097): 10%|β–ˆ | 131/1250 [03:17<24:43, 1.33s/it] Training 1/1 epoch (loss 2.7097): 11%|β–ˆ | 132/1250 [03:17<24:16, 1.30s/it] Training 1/1 epoch (loss 2.6387): 11%|β–ˆ | 132/1250 [03:18<24:16, 1.30s/it] Training 1/1 epoch (loss 2.6387): 11%|β–ˆ | 133/1250 [03:18<25:38, 1.38s/it] Training 1/1 epoch (loss 2.7275): 11%|β–ˆ | 133/1250 [03:20<25:38, 1.38s/it] Training 1/1 epoch (loss 2.7275): 11%|β–ˆ | 134/1250 [03:20<25:21, 1.36s/it] 
Training 1/1 epoch (loss 2.6526): 11%|β–ˆ | 134/1250 [03:21<25:21, 1.36s/it] Training 1/1 epoch (loss 2.6526): 11%|β–ˆ | 135/1250 [03:21<23:01, 1.24s/it] Training 1/1 epoch (loss 2.6057): 11%|β–ˆ | 135/1250 [03:22<23:01, 1.24s/it] Training 1/1 epoch (loss 2.6057): 11%|β–ˆ | 136/1250 [03:22<23:36, 1.27s/it] Training 1/1 epoch (loss 2.5167): 11%|β–ˆ | 136/1250 [03:24<23:36, 1.27s/it] Training 1/1 epoch (loss 2.5167): 11%|β–ˆ | 137/1250 [03:24<28:44, 1.55s/it] Training 1/1 epoch (loss 2.4851): 11%|β–ˆ | 137/1250 [03:25<28:44, 1.55s/it] Training 1/1 epoch (loss 2.4851): 11%|β–ˆ | 138/1250 [03:25<23:53, 1.29s/it] Training 1/1 epoch (loss 2.7420): 11%|β–ˆ | 138/1250 [03:27<23:53, 1.29s/it] Training 1/1 epoch (loss 2.7420): 11%|β–ˆ | 139/1250 [03:27<30:15, 1.63s/it] Training 1/1 epoch (loss 2.6645): 11%|β–ˆ | 139/1250 [03:30<30:15, 1.63s/it] Training 1/1 epoch (loss 2.6645): 11%|β–ˆ | 140/1250 [03:30<34:25, 1.86s/it] Training 1/1 epoch (loss 2.5382): 11%|β–ˆ | 140/1250 [03:30<34:25, 1.86s/it] Training 1/1 epoch (loss 2.5382): 11%|β–ˆβ– | 141/1250 [03:30<26:26, 1.43s/it] Training 1/1 epoch (loss 2.6845): 11%|β–ˆβ– | 141/1250 [03:33<26:26, 1.43s/it] Training 1/1 epoch (loss 2.6845): 11%|β–ˆβ– | 142/1250 [03:33<31:15, 1.69s/it] Training 1/1 epoch (loss 2.6086): 11%|β–ˆβ– | 142/1250 [03:34<31:15, 1.69s/it] Training 1/1 epoch (loss 2.6086): 11%|β–ˆβ– | 143/1250 [03:34<30:05, 1.63s/it] Training 1/1 epoch (loss 2.7442): 11%|β–ˆβ– | 143/1250 [03:35<30:05, 1.63s/it] Training 1/1 epoch (loss 2.7442): 12%|β–ˆβ– | 144/1250 [03:35<26:14, 1.42s/it] Training 1/1 epoch (loss 2.6195): 12%|β–ˆβ– | 144/1250 [03:37<26:14, 1.42s/it] Training 1/1 epoch (loss 2.6195): 12%|β–ˆβ– | 145/1250 [03:37<27:33, 1.50s/it] Training 1/1 epoch (loss 2.7169): 12%|β–ˆβ– | 145/1250 [03:38<27:33, 1.50s/it] Training 1/1 epoch (loss 2.7169): 12%|β–ˆβ– | 146/1250 [03:38<25:18, 1.38s/it] Training 1/1 epoch (loss 2.5402): 12%|β–ˆβ– | 146/1250 [03:39<25:18, 1.38s/it] Training 1/1 epoch (loss 2.5402): 
12%|β–ˆβ– | 147/1250 [03:39<24:28, 1.33s/it] Training 1/1 epoch (loss 2.6608): 12%|β–ˆβ– | 147/1250 [03:41<24:28, 1.33s/it] Training 1/1 epoch (loss 2.6608): 12%|β–ˆβ– | 148/1250 [03:41<27:34, 1.50s/it] Training 1/1 epoch (loss 2.5834): 12%|β–ˆβ– | 148/1250 [03:42<27:34, 1.50s/it] Training 1/1 epoch (loss 2.5834): 12%|β–ˆβ– | 149/1250 [03:42<25:25, 1.39s/it] Training 1/1 epoch (loss 2.8507): 12%|β–ˆβ– | 149/1250 [03:43<25:25, 1.39s/it] Training 1/1 epoch (loss 2.8507): 12%|β–ˆβ– | 150/1250 [03:43<21:57, 1.20s/it] Training 1/1 epoch (loss 2.6130): 12%|β–ˆβ– | 150/1250 [03:45<21:57, 1.20s/it] Training 1/1 epoch (loss 2.6130): 12%|β–ˆβ– | 151/1250 [03:45<26:07, 1.43s/it] Training 1/1 epoch (loss 2.6731): 12%|β–ˆβ– | 151/1250 [03:46<26:07, 1.43s/it] Training 1/1 epoch (loss 2.6731): 12%|β–ˆβ– | 152/1250 [03:46<25:29, 1.39s/it] Training 1/1 epoch (loss 2.8082): 12%|β–ˆβ– | 152/1250 [03:47<25:29, 1.39s/it] Training 1/1 epoch (loss 2.8082): 12%|β–ˆβ– | 153/1250 [03:47<25:32, 1.40s/it] Training 1/1 epoch (loss 2.7212): 12%|β–ˆβ– | 153/1250 [03:49<25:32, 1.40s/it] Training 1/1 epoch (loss 2.7212): 12%|β–ˆβ– | 154/1250 [03:49<24:53, 1.36s/it] Training 1/1 epoch (loss 2.6458): 12%|β–ˆβ– | 154/1250 [03:50<24:53, 1.36s/it] Training 1/1 epoch (loss 2.6458): 12%|β–ˆβ– | 155/1250 [03:50<22:42, 1.24s/it] Training 1/1 epoch (loss 2.7682): 12%|β–ˆβ– | 155/1250 [03:51<22:42, 1.24s/it] Training 1/1 epoch (loss 2.7682): 12%|β–ˆβ– | 156/1250 [03:51<24:57, 1.37s/it] Training 1/1 epoch (loss 2.7527): 12%|β–ˆβ– | 156/1250 [03:54<24:57, 1.37s/it] Training 1/1 epoch (loss 2.7527): 13%|β–ˆβ–Ž | 157/1250 [03:54<29:55, 1.64s/it] Training 1/1 epoch (loss 2.7489): 13%|β–ˆβ–Ž | 157/1250 [03:55<29:55, 1.64s/it] Training 1/1 epoch (loss 2.7489): 13%|β–ˆβ–Ž | 158/1250 [03:55<26:58, 1.48s/it] Training 1/1 epoch (loss 2.8730): 13%|β–ˆβ–Ž | 158/1250 [03:57<26:58, 1.48s/it] Training 1/1 epoch (loss 2.8730): 13%|β–ˆβ–Ž | 159/1250 [03:57<29:42, 1.63s/it] Training 1/1 epoch (loss 2.6000): 
13%|β–ˆβ–Ž | 159/1250 [03:59<29:42, 1.63s/it] Training 1/1 epoch (loss 2.6000): 13%|β–ˆβ–Ž | 160/1250 [03:59<31:38, 1.74s/it] Training 1/1 epoch (loss 2.6196): 13%|β–ˆβ–Ž | 160/1250 [03:59<31:38, 1.74s/it] Training 1/1 epoch (loss 2.6196): 13%|β–ˆβ–Ž | 161/1250 [03:59<26:15, 1.45s/it] Training 1/1 epoch (loss 2.5049): 13%|β–ˆβ–Ž | 161/1250 [04:02<26:15, 1.45s/it] Training 1/1 epoch (loss 2.5049): 13%|β–ˆβ–Ž | 162/1250 [04:02<30:15, 1.67s/it] Training 1/1 epoch (loss 2.6809): 13%|β–ˆβ–Ž | 162/1250 [04:04<30:15, 1.67s/it] Training 1/1 epoch (loss 2.6809): 13%|β–ˆβ–Ž | 163/1250 [04:04<31:31, 1.74s/it] Training 1/1 epoch (loss 2.4809): 13%|β–ˆβ–Ž | 163/1250 [04:04<31:31, 1.74s/it] Training 1/1 epoch (loss 2.4809): 13%|β–ˆβ–Ž | 164/1250 [04:04<25:02, 1.38s/it] Training 1/1 epoch (loss 2.5805): 13%|β–ˆβ–Ž | 164/1250 [04:07<25:02, 1.38s/it] Training 1/1 epoch (loss 2.5805): 13%|β–ˆβ–Ž | 165/1250 [04:07<30:42, 1.70s/it] Training 1/1 epoch (loss 2.5779): 13%|β–ˆβ–Ž | 165/1250 [04:08<30:42, 1.70s/it] Training 1/1 epoch (loss 2.5779): 13%|β–ˆβ–Ž | 166/1250 [04:08<29:51, 1.65s/it] Training 1/1 epoch (loss 2.5566): 13%|β–ˆβ–Ž | 166/1250 [04:09<29:51, 1.65s/it] Training 1/1 epoch (loss 2.5566): 13%|β–ˆβ–Ž | 167/1250 [04:09<23:08, 1.28s/it] Training 1/1 epoch (loss 2.5927): 13%|β–ˆβ–Ž | 167/1250 [04:10<23:08, 1.28s/it] Training 1/1 epoch (loss 2.5927): 13%|β–ˆβ–Ž | 168/1250 [04:10<26:33, 1.47s/it] Training 1/1 epoch (loss 2.6260): 13%|β–ˆβ–Ž | 168/1250 [04:12<26:33, 1.47s/it] Training 1/1 epoch (loss 2.6260): 14%|β–ˆβ–Ž | 169/1250 [04:12<28:15, 1.57s/it] Training 1/1 epoch (loss 2.6720): 14%|β–ˆβ–Ž | 169/1250 [04:13<28:15, 1.57s/it] Training 1/1 epoch (loss 2.6720): 14%|β–ˆβ–Ž | 170/1250 [04:13<24:43, 1.37s/it] Training 1/1 epoch (loss 2.6212): 14%|β–ˆβ–Ž | 170/1250 [04:14<24:43, 1.37s/it] Training 1/1 epoch (loss 2.6212): 14%|β–ˆβ–Ž | 171/1250 [04:14<24:31, 1.36s/it] Training 1/1 epoch (loss 2.5141): 14%|β–ˆβ–Ž | 171/1250 [04:16<24:31, 1.36s/it] Training 1/1 epoch (loss 2.5141): 
14%|β–ˆβ– | 172/1250 [04:16<26:19, 1.46s/it] Training 1/1 epoch (loss 2.4364): 14%|β–ˆβ– | 172/1250 [04:17<26:19, 1.46s/it] Training 1/1 epoch (loss 2.4364): 14%|β–ˆβ– | 173/1250 [04:17<23:37, 1.32s/it] Training 1/1 epoch (loss 2.7576): 14%|β–ˆβ– | 173/1250 [04:19<23:37, 1.32s/it] Training 1/1 epoch (loss 2.7576): 14%|β–ˆβ– | 174/1250 [04:19<25:37, 1.43s/it] Training 1/1 epoch (loss 2.7910): 14%|β–ˆβ– | 174/1250 [04:20<25:37, 1.43s/it] Training 1/1 epoch (loss 2.7910): 14%|β–ˆβ– | 175/1250 [04:20<23:30, 1.31s/it] Training 1/1 epoch (loss 2.5759): 14%|β–ˆβ– | 175/1250 [04:21<23:30, 1.31s/it] Training 1/1 epoch (loss 2.5759): 14%|β–ˆβ– | 176/1250 [04:21<24:35, 1.37s/it] Training 1/1 epoch (loss 2.8340): 14%|β–ˆβ– | 176/1250 [04:24<24:35, 1.37s/it] Training 1/1 epoch (loss 2.8340): 14%|β–ˆβ– | 177/1250 [04:24<29:29, 1.65s/it] Training 1/1 epoch (loss 2.6586): 14%|β–ˆβ– | 177/1250 [04:24<29:29, 1.65s/it] Training 1/1 epoch (loss 2.6586): 14%|β–ˆβ– | 178/1250 [04:24<22:38, 1.27s/it] Training 1/1 epoch (loss 2.8715): 14%|β–ˆβ– | 178/1250 [04:26<22:38, 1.27s/it] Training 1/1 epoch (loss 2.8715): 14%|β–ˆβ– | 179/1250 [04:26<27:07, 1.52s/it] Training 1/1 epoch (loss 2.7768): 14%|β–ˆβ– | 179/1250 [04:28<27:07, 1.52s/it] Training 1/1 epoch (loss 2.7768): 14%|β–ˆβ– | 180/1250 [04:28<29:22, 1.65s/it] Training 1/1 epoch (loss 2.8022): 14%|β–ˆβ– | 180/1250 [04:29<29:22, 1.65s/it] Training 1/1 epoch (loss 2.8022): 14%|β–ˆβ– | 181/1250 [04:29<23:29, 1.32s/it] Training 1/1 epoch (loss 2.6742): 14%|β–ˆβ– | 181/1250 [04:31<23:29, 1.32s/it] Training 1/1 epoch (loss 2.6742): 15%|β–ˆβ– | 182/1250 [04:31<27:49, 1.56s/it] Training 1/1 epoch (loss 2.7101): 15%|β–ˆβ– | 182/1250 [04:32<27:49, 1.56s/it] Training 1/1 epoch (loss 2.7101): 15%|β–ˆβ– | 183/1250 [04:32<27:46, 1.56s/it] Training 1/1 epoch (loss 2.6604): 15%|β–ˆβ– | 183/1250 [04:33<27:46, 1.56s/it] Training 1/1 epoch (loss 2.6604): 15%|β–ˆβ– | 184/1250 [04:33<23:09, 1.30s/it] Training 1/1 epoch (loss 2.5734): 
15%|β–ˆβ– | 184/1250 [04:35<23:09, 1.30s/it] Training 1/1 epoch (loss 2.5734): 15%|β–ˆβ– | 185/1250 [04:35<23:58, 1.35s/it] Training 1/1 epoch (loss 2.6561): 15%|β–ˆβ– | 185/1250 [04:36<23:58, 1.35s/it] Training 1/1 epoch (loss 2.6561): 15%|β–ˆβ– | 186/1250 [04:36<25:26, 1.43s/it] Training 1/1 epoch (loss 2.7225): 15%|β–ˆβ– | 186/1250 [04:37<25:26, 1.43s/it] Training 1/1 epoch (loss 2.7225): 15%|β–ˆβ– | 187/1250 [04:37<20:48, 1.17s/it] Training 1/1 epoch (loss 2.4615): 15%|β–ˆβ– | 187/1250 [04:38<20:48, 1.17s/it] Training 1/1 epoch (loss 2.4615): 15%|β–ˆβ–Œ | 188/1250 [04:38<23:44, 1.34s/it] Training 1/1 epoch (loss 2.5255): 15%|β–ˆβ–Œ | 188/1250 [04:40<23:44, 1.34s/it] Training 1/1 epoch (loss 2.5255): 15%|β–ˆβ–Œ | 189/1250 [04:40<24:42, 1.40s/it] Training 1/1 epoch (loss 2.5048): 15%|β–ˆβ–Œ | 189/1250 [04:41<24:42, 1.40s/it] Training 1/1 epoch (loss 2.5048): 15%|β–ˆβ–Œ | 190/1250 [04:41<22:26, 1.27s/it] Training 1/1 epoch (loss 2.6060): 15%|β–ˆβ–Œ | 190/1250 [04:43<22:26, 1.27s/it] Training 1/1 epoch (loss 2.6060): 15%|β–ˆβ–Œ | 191/1250 [04:43<24:20, 1.38s/it] Training 1/1 epoch (loss 2.8505): 15%|β–ˆβ–Œ | 191/1250 [04:44<24:20, 1.38s/it] Training 1/1 epoch (loss 2.8505): 15%|β–ˆβ–Œ | 192/1250 [04:44<25:16, 1.43s/it] Training 1/1 epoch (loss 2.6298): 15%|β–ˆβ–Œ | 192/1250 [04:45<25:16, 1.43s/it] Training 1/1 epoch (loss 2.6298): 15%|β–ˆβ–Œ | 193/1250 [04:45<23:48, 1.35s/it] Training 1/1 epoch (loss 2.7686): 15%|β–ˆβ–Œ | 193/1250 [04:47<23:48, 1.35s/it] Training 1/1 epoch (loss 2.7686): 16%|β–ˆβ–Œ | 194/1250 [04:47<24:44, 1.41s/it] Training 1/1 epoch (loss 2.6273): 16%|β–ˆβ–Œ | 194/1250 [04:48<24:44, 1.41s/it] Training 1/1 epoch (loss 2.6273): 16%|β–ˆβ–Œ | 195/1250 [04:48<23:18, 1.33s/it] Training 1/1 epoch (loss 2.4896): 16%|β–ˆβ–Œ | 195/1250 [04:49<23:18, 1.33s/it] Training 1/1 epoch (loss 2.4896): 16%|β–ˆβ–Œ | 196/1250 [04:49<24:01, 1.37s/it] Training 1/1 epoch (loss 2.5446): 16%|β–ˆβ–Œ | 196/1250 [04:52<24:01, 1.37s/it] Training 1/1 epoch (loss 2.5446): 
16%|β–ˆβ–Œ | 197/1250 [04:52<28:29, 1.62s/it] Training 1/1 epoch (loss 2.7473): 16%|β–ˆβ–Œ | 197/1250 [04:52<28:29, 1.62s/it] Training 1/1 epoch (loss 2.7473): 16%|β–ˆβ–Œ | 198/1250 [04:52<24:08, 1.38s/it] Training 1/1 epoch (loss 2.6832): 16%|β–ˆβ–Œ | 198/1250 [04:54<24:08, 1.38s/it] Training 1/1 epoch (loss 2.6832): 16%|β–ˆβ–Œ | 199/1250 [04:54<25:46, 1.47s/it] Training 1/1 epoch (loss 2.6156): 16%|β–ˆβ–Œ | 199/1250 [04:56<25:46, 1.47s/it] Training 1/1 epoch (loss 2.6156): 16%|β–ˆβ–Œ | 200/1250 [04:56<29:48, 1.70s/it] Training 1/1 epoch (loss 2.7322): 16%|β–ˆβ–Œ | 200/1250 [04:57<29:48, 1.70s/it] Training 1/1 epoch (loss 2.7322): 16%|β–ˆβ–Œ | 201/1250 [04:57<24:15, 1.39s/it] Training 1/1 epoch (loss 2.5959): 16%|β–ˆβ–Œ | 201/1250 [04:59<24:15, 1.39s/it] Training 1/1 epoch (loss 2.5959): 16%|β–ˆβ–Œ | 202/1250 [04:59<29:09, 1.67s/it] Training 1/1 epoch (loss 2.5177): 16%|β–ˆβ–Œ | 202/1250 [05:01<29:09, 1.67s/it] Training 1/1 epoch (loss 2.5177): 16%|β–ˆβ–Œ | 203/1250 [05:01<28:13, 1.62s/it] Training 1/1 epoch (loss 2.5243): 16%|β–ˆβ–Œ | 203/1250 [05:02<28:13, 1.62s/it] Training 1/1 epoch (loss 2.5243): 16%|β–ˆβ–‹ | 204/1250 [05:02<23:27, 1.35s/it] Training 1/1 epoch (loss 2.5084): 16%|β–ˆβ–‹ | 204/1250 [05:03<23:27, 1.35s/it] Training 1/1 epoch (loss 2.5084): 16%|β–ˆβ–‹ | 205/1250 [05:03<25:42, 1.48s/it] Training 1/1 epoch (loss 2.8023): 16%|β–ˆβ–‹ | 205/1250 [05:05<25:42, 1.48s/it] Training 1/1 epoch (loss 2.8023): 16%|β–ˆβ–‹ | 206/1250 [05:05<25:00, 1.44s/it] Training 1/1 epoch (loss 2.6461): 16%|β–ˆβ–‹ | 206/1250 [05:05<25:00, 1.44s/it] Training 1/1 epoch (loss 2.6461): 17%|β–ˆβ–‹ | 207/1250 [05:05<21:07, 1.22s/it] Training 1/1 epoch (loss 2.6567): 17%|β–ˆβ–‹ | 207/1250 [05:08<21:07, 1.22s/it] Training 1/1 epoch (loss 2.6567): 17%|β–ˆβ–‹ | 208/1250 [05:08<25:43, 1.48s/it] Training 1/1 epoch (loss 2.6318): 17%|β–ˆβ–‹ | 208/1250 [05:09<25:43, 1.48s/it] Training 1/1 epoch (loss 2.6318): 17%|β–ˆβ–‹ | 209/1250 [05:09<24:45, 1.43s/it] Training 1/1 epoch (loss 2.6482): 
17%|β–ˆβ–‹ | 209/1250 [05:09<24:45, 1.43s/it] Training 1/1 epoch (loss 2.6482): 17%|β–ˆβ–‹ | 210/1250 [05:09<20:31, 1.18s/it] Training 1/1 epoch (loss 2.6710): 17%|β–ˆβ–‹ | 210/1250 [05:11<20:31, 1.18s/it] Training 1/1 epoch (loss 2.6710): 17%|β–ˆβ–‹ | 211/1250 [05:11<22:49, 1.32s/it] Training 1/1 epoch (loss 2.8160): 17%|β–ˆβ–‹ | 211/1250 [05:12<22:49, 1.32s/it] Training 1/1 epoch (loss 2.8160): 17%|β–ˆβ–‹ | 212/1250 [05:12<21:05, 1.22s/it] Training 1/1 epoch (loss 2.5953): 17%|β–ˆβ–‹ | 212/1250 [05:13<21:05, 1.22s/it] Training 1/1 epoch (loss 2.5953): 17%|β–ˆβ–‹ | 213/1250 [05:13<21:47, 1.26s/it] Training 1/1 epoch (loss 2.4638): 17%|β–ˆβ–‹ | 213/1250 [05:15<21:47, 1.26s/it] Training 1/1 epoch (loss 2.4638): 17%|β–ˆβ–‹ | 214/1250 [05:15<23:17, 1.35s/it] Training 1/1 epoch (loss 2.4792): 17%|β–ˆβ–‹ | 214/1250 [05:16<23:17, 1.35s/it] Training 1/1 epoch (loss 2.4792): 17%|β–ˆβ–‹ | 215/1250 [05:16<21:21, 1.24s/it] Training 1/1 epoch (loss 2.3677): 17%|β–ˆβ–‹ | 215/1250 [05:17<21:21, 1.24s/it] Training 1/1 epoch (loss 2.3677): 17%|β–ˆβ–‹ | 216/1250 [05:17<22:45, 1.32s/it] Training 1/1 epoch (loss 2.4528): 17%|β–ˆβ–‹ | 216/1250 [05:20<22:45, 1.32s/it] Training 1/1 epoch (loss 2.4528): 17%|β–ˆβ–‹ | 217/1250 [05:20<28:27, 1.65s/it] Training 1/1 epoch (loss 2.6955): 17%|β–ˆβ–‹ | 217/1250 [05:21<28:27, 1.65s/it] Training 1/1 epoch (loss 2.6955): 17%|β–ˆβ–‹ | 218/1250 [05:21<24:35, 1.43s/it] Training 1/1 epoch (loss 2.7063): 17%|β–ˆβ–‹ | 218/1250 [05:23<24:35, 1.43s/it] Training 1/1 epoch (loss 2.7063): 18%|β–ˆβ–Š | 219/1250 [05:23<26:19, 1.53s/it] Training 1/1 epoch (loss 2.4764): 18%|β–ˆβ–Š | 219/1250 [05:24<26:19, 1.53s/it] Training 1/1 epoch (loss 2.4764): 18%|β–ˆβ–Š | 220/1250 [05:24<28:18, 1.65s/it] Training 1/1 epoch (loss 2.5192): 18%|β–ˆβ–Š | 220/1250 [05:25<28:18, 1.65s/it] Training 1/1 epoch (loss 2.5192): 18%|β–ˆβ–Š | 221/1250 [05:25<22:31, 1.31s/it] Training 1/1 epoch (loss 2.6507): 18%|β–ˆβ–Š | 221/1250 [05:27<22:31, 1.31s/it] Training 1/1 epoch (loss 2.6507): 
18%|β–ˆβ–Š | 222/1250 [05:27<26:50, 1.57s/it] Training 1/1 epoch (loss 2.5295): 18%|β–ˆβ–Š | 222/1250 [05:29<26:50, 1.57s/it] Training 1/1 epoch (loss 2.5295): 18%|β–ˆβ–Š | 223/1250 [05:29<27:28, 1.60s/it] Training 1/1 epoch (loss 2.5683): 18%|β–ˆβ–Š | 223/1250 [05:30<27:28, 1.60s/it] Training 1/1 epoch (loss 2.5683): 18%|β–ˆβ–Š | 224/1250 [05:30<24:55, 1.46s/it] Training 1/1 epoch (loss 2.4135): 18%|β–ˆβ–Š | 224/1250 [05:32<24:55, 1.46s/it] Training 1/1 epoch (loss 2.4135): 18%|β–ˆβ–Š | 225/1250 [05:32<29:44, 1.74s/it] Training 1/1 epoch (loss 2.5973): 18%|β–ˆβ–Š | 225/1250 [05:34<29:44, 1.74s/it] Training 1/1 epoch (loss 2.5973): 18%|β–ˆβ–Š | 226/1250 [05:34<30:13, 1.77s/it] Training 1/1 epoch (loss 2.7840): 18%|β–ˆβ–Š | 226/1250 [05:35<30:13, 1.77s/it] Training 1/1 epoch (loss 2.7840): 18%|β–ˆβ–Š | 227/1250 [05:35<25:06, 1.47s/it] Training 1/1 epoch (loss 2.7061): 18%|β–ˆβ–Š | 227/1250 [05:37<25:06, 1.47s/it] Training 1/1 epoch (loss 2.7061): 18%|β–ˆβ–Š | 228/1250 [05:37<28:08, 1.65s/it] Training 1/1 epoch (loss 2.6362): 18%|β–ˆβ–Š | 228/1250 [05:39<28:08, 1.65s/it] Training 1/1 epoch (loss 2.6362): 18%|β–ˆβ–Š | 229/1250 [05:39<28:10, 1.66s/it] Training 1/1 epoch (loss 2.5504): 18%|β–ˆβ–Š | 229/1250 [05:39<28:10, 1.66s/it] Training 1/1 epoch (loss 2.5504): 18%|β–ˆβ–Š | 230/1250 [05:39<22:29, 1.32s/it] Training 1/1 epoch (loss 2.7516): 18%|β–ˆβ–Š | 230/1250 [05:41<22:29, 1.32s/it] Training 1/1 epoch (loss 2.7516): 18%|β–ˆβ–Š | 231/1250 [05:41<26:16, 1.55s/it] Training 1/1 epoch (loss 2.7772): 18%|β–ˆβ–Š | 231/1250 [05:43<26:16, 1.55s/it] Training 1/1 epoch (loss 2.7772): 19%|β–ˆβ–Š | 232/1250 [05:43<25:55, 1.53s/it] Training 1/1 epoch (loss 2.6560): 19%|β–ˆβ–Š | 232/1250 [05:45<25:55, 1.53s/it] Training 1/1 epoch (loss 2.6560): 19%|β–ˆβ–Š | 233/1250 [05:45<26:49, 1.58s/it] Training 1/1 epoch (loss 2.6433): 19%|β–ˆβ–Š | 233/1250 [05:46<26:49, 1.58s/it] Training 1/1 epoch (loss 2.6433): 19%|β–ˆβ–Š | 234/1250 [05:46<28:04, 1.66s/it] Training 1/1 epoch (loss 2.6847): 
19%|β–ˆβ–Š | 234/1250 [05:47<28:04, 1.66s/it] Training 1/1 epoch (loss 2.6847): 19%|β–ˆβ–‰ | 235/1250 [05:47<22:16, 1.32s/it] Training 1/1 epoch (loss 2.6686): 19%|β–ˆβ–‰ | 235/1250 [05:48<22:16, 1.32s/it] Training 1/1 epoch (loss 2.6686): 19%|β–ˆβ–‰ | 236/1250 [05:48<23:15, 1.38s/it] Training 1/1 epoch (loss 2.4901): 19%|β–ˆβ–‰ | 236/1250 [05:50<23:15, 1.38s/it] Training 1/1 epoch (loss 2.4901): 19%|β–ˆβ–‰ | 237/1250 [05:50<26:30, 1.57s/it] Training 1/1 epoch (loss 2.6159): 19%|β–ˆβ–‰ | 237/1250 [05:51<26:30, 1.57s/it] Training 1/1 epoch (loss 2.6159): 19%|β–ˆβ–‰ | 238/1250 [05:51<20:58, 1.24s/it] Training 1/1 epoch (loss 2.5147): 19%|β–ˆβ–‰ | 238/1250 [05:52<20:58, 1.24s/it] Training 1/1 epoch (loss 2.5147): 19%|β–ˆβ–‰ | 239/1250 [05:52<21:26, 1.27s/it] Training 1/1 epoch (loss 2.5738): 19%|β–ˆβ–‰ | 239/1250 [05:54<21:26, 1.27s/it] Training 1/1 epoch (loss 2.5738): 19%|β–ˆβ–‰ | 240/1250 [05:54<25:59, 1.54s/it] Training 1/1 epoch (loss 2.4772): 19%|β–ˆβ–‰ | 240/1250 [05:55<25:59, 1.54s/it] Training 1/1 epoch (loss 2.4772): 19%|β–ˆβ–‰ | 241/1250 [05:55<22:04, 1.31s/it] Training 1/1 epoch (loss 2.6071): 19%|β–ˆβ–‰ | 241/1250 [05:57<22:04, 1.31s/it] Training 1/1 epoch (loss 2.6071): 19%|β–ˆβ–‰ | 242/1250 [05:57<25:56, 1.54s/it] Training 1/1 epoch (loss 2.4817): 19%|β–ˆβ–‰ | 242/1250 [05:59<25:56, 1.54s/it] Training 1/1 epoch (loss 2.4817): 19%|β–ˆβ–‰ | 243/1250 [05:59<24:19, 1.45s/it] Training 1/1 epoch (loss 2.4051): 19%|β–ˆβ–‰ | 243/1250 [05:59<24:19, 1.45s/it] Training 1/1 epoch (loss 2.4051): 20%|β–ˆβ–‰ | 244/1250 [05:59<20:53, 1.25s/it] Training 1/1 epoch (loss 2.5612): 20%|β–ˆβ–‰ | 244/1250 [06:02<20:53, 1.25s/it] Training 1/1 epoch (loss 2.5612): 20%|β–ˆβ–‰ | 245/1250 [06:02<26:57, 1.61s/it] Training 1/1 epoch (loss 2.6350): 20%|β–ˆβ–‰ | 245/1250 [06:03<26:57, 1.61s/it] Training 1/1 epoch (loss 2.6350): 20%|β–ˆβ–‰ | 246/1250 [06:03<25:38, 1.53s/it] Training 1/1 epoch (loss 2.5225): 20%|β–ˆβ–‰ | 246/1250 [06:06<25:38, 1.53s/it] Training 1/1 epoch (loss 2.5225): 
20%|β–ˆβ–‰ | 247/1250 [06:06<30:35, 1.83s/it] Training 1/1 epoch (loss 2.5179): 20%|β–ˆβ–‰ | 247/1250 [06:07<30:35, 1.83s/it] Training 1/1 epoch (loss 2.5179): 20%|β–ˆβ–‰ | 248/1250 [06:07<29:08, 1.75s/it] Training 1/1 epoch (loss 2.7152): 20%|β–ˆβ–‰ | 248/1250 [06:08<29:08, 1.75s/it] Training 1/1 epoch (loss 2.7152): 20%|β–ˆβ–‰ | 249/1250 [06:08<23:23, 1.40s/it] Training 1/1 epoch (loss 2.6878): 20%|β–ˆβ–‰ | 249/1250 [06:09<23:23, 1.40s/it] Training 1/1 epoch (loss 2.6878): 20%|β–ˆβ–ˆ | 250/1250 [06:09<23:59, 1.44s/it] Training 1/1 epoch (loss 2.5125): 20%|β–ˆβ–ˆ | 250/1250 [06:11<23:59, 1.44s/it] Training 1/1 epoch (loss 2.5125): 20%|β–ˆβ–ˆ | 251/1250 [06:11<25:48, 1.55s/it] Training 1/1 epoch (loss 2.6733): 20%|β–ˆβ–ˆ | 251/1250 [06:11<25:48, 1.55s/it] Training 1/1 epoch (loss 2.6733): 20%|β–ˆβ–ˆ | 252/1250 [06:11<19:59, 1.20s/it] Training 1/1 epoch (loss 2.5993): 20%|β–ˆβ–ˆ | 252/1250 [06:14<19:59, 1.20s/it] Training 1/1 epoch (loss 2.5993): 20%|β–ˆβ–ˆ | 253/1250 [06:14<25:26, 1.53s/it] Training 1/1 epoch (loss 2.6995): 20%|β–ˆβ–ˆ | 253/1250 [06:15<25:26, 1.53s/it] Training 1/1 epoch (loss 2.6995): 20%|β–ˆβ–ˆ | 254/1250 [06:15<24:21, 1.47s/it] Training 1/1 epoch (loss 2.6003): 20%|β–ˆβ–ˆ | 254/1250 [06:16<24:21, 1.47s/it] Training 1/1 epoch (loss 2.6003): 20%|β–ˆβ–ˆ | 255/1250 [06:16<20:01, 1.21s/it] Training 1/1 epoch (loss 2.6789): 20%|β–ˆβ–ˆ | 255/1250 [06:18<20:01, 1.21s/it] Training 1/1 epoch (loss 2.6789): 20%|β–ˆβ–ˆ | 256/1250 [06:18<23:43, 1.43s/it] Training 1/1 epoch (loss 2.6560): 20%|β–ˆβ–ˆ | 256/1250 [06:19<23:43, 1.43s/it] Training 1/1 epoch (loss 2.6560): 21%|β–ˆβ–ˆ | 257/1250 [06:19<22:12, 1.34s/it] Training 1/1 epoch (loss 2.6079): 21%|β–ˆβ–ˆ | 257/1250 [06:21<22:12, 1.34s/it] Training 1/1 epoch (loss 2.6079): 21%|β–ˆβ–ˆ | 258/1250 [06:21<27:57, 1.69s/it] Training 1/1 epoch (loss 2.5981): 21%|β–ˆβ–ˆ | 258/1250 [06:23<27:57, 1.69s/it] Training 1/1 epoch (loss 2.5981): 21%|β–ˆβ–ˆ | 259/1250 [06:23<30:06, 1.82s/it] Training 1/1 epoch (loss 2.6281): 
21%|β–ˆβ–ˆ | 259/1250 [06:24<30:06, 1.82s/it] Training 1/1 epoch (loss 2.6281): 21%|β–ˆβ–ˆ | 260/1250 [06:24<23:18, 1.41s/it] Training 1/1 epoch (loss 2.5308): 21%|β–ˆβ–ˆ | 260/1250 [06:26<23:18, 1.41s/it] Training 1/1 epoch (loss 2.5308): 21%|β–ˆβ–ˆ | 261/1250 [06:26<25:59, 1.58s/it] Training 1/1 epoch (loss 2.5638): 21%|β–ˆβ–ˆ | 261/1250 [06:27<25:59, 1.58s/it] Training 1/1 epoch (loss 2.5638): 21%|β–ˆβ–ˆ | 262/1250 [06:27<24:09, 1.47s/it] Training 1/1 epoch (loss 2.5609): 21%|β–ˆβ–ˆ | 262/1250 [06:28<24:09, 1.47s/it] Training 1/1 epoch (loss 2.5609): 21%|β–ˆβ–ˆ | 263/1250 [06:28<20:22, 1.24s/it] Training 1/1 epoch (loss 2.3152): 21%|β–ˆβ–ˆ | 263/1250 [06:30<20:22, 1.24s/it] Training 1/1 epoch (loss 2.3152): 21%|β–ˆβ–ˆ | 264/1250 [06:30<25:35, 1.56s/it] Training 1/1 epoch (loss 2.6688): 21%|β–ˆβ–ˆ | 264/1250 [06:33<25:35, 1.56s/it] Training 1/1 epoch (loss 2.6688): 21%|β–ˆβ–ˆ | 265/1250 [06:33<30:05, 1.83s/it] Training 1/1 epoch (loss 2.6196): 21%|β–ˆβ–ˆ | 265/1250 [06:33<30:05, 1.83s/it] Training 1/1 epoch (loss 2.6196): 21%|β–ˆβ–ˆβ– | 266/1250 [06:33<24:30, 1.49s/it] Training 1/1 epoch (loss 2.6876): 21%|β–ˆβ–ˆβ– | 266/1250 [06:36<24:30, 1.49s/it] Training 1/1 epoch (loss 2.6876): 21%|β–ˆβ–ˆβ– | 267/1250 [06:36<28:57, 1.77s/it] Training 1/1 epoch (loss 2.6644): 21%|β–ˆβ–ˆβ– | 267/1250 [06:37<28:57, 1.77s/it] Training 1/1 epoch (loss 2.6644): 21%|β–ˆβ–ˆβ– | 268/1250 [06:37<26:51, 1.64s/it] Training 1/1 epoch (loss 2.6001): 21%|β–ˆβ–ˆβ– | 268/1250 [06:38<26:51, 1.64s/it] Training 1/1 epoch (loss 2.6001): 22%|β–ˆβ–ˆβ– | 269/1250 [06:38<21:25, 1.31s/it] Training 1/1 epoch (loss 2.5625): 22%|β–ˆβ–ˆβ– | 269/1250 [06:40<21:25, 1.31s/it] Training 1/1 epoch (loss 2.5625): 22%|β–ˆβ–ˆβ– | 270/1250 [06:40<26:07, 1.60s/it] Training 1/1 epoch (loss 2.4593): 22%|β–ˆβ–ˆβ– | 270/1250 [06:42<26:07, 1.60s/it] Training 1/1 epoch (loss 2.4593): 22%|β–ˆβ–ˆβ– | 271/1250 [06:42<26:37, 1.63s/it] Training 1/1 epoch (loss 2.3700): 22%|β–ˆβ–ˆβ– | 271/1250 [06:42<26:37, 
1.63s/it] Training 1/1 epoch (loss 2.3700): 22%|β–ˆβ–ˆβ– | 272/1250 [06:42<23:08, 1.42s/it] Training 1/1 epoch (loss 2.5599): 22%|β–ˆβ–ˆβ– | 272/1250 [06:45<23:08, 1.42s/it] Training 1/1 epoch (loss 2.5599): 22%|β–ˆβ–ˆβ– | 273/1250 [06:45<26:44, 1.64s/it] Training 1/1 epoch (loss 2.6491): 22%|β–ˆβ–ˆβ– | 273/1250 [06:46<26:44, 1.64s/it] Training 1/1 epoch (loss 2.6491): 22%|β–ˆβ–ˆβ– | 274/1250 [06:46<24:37, 1.51s/it] Training 1/1 epoch (loss 2.6812): 22%|β–ˆβ–ˆβ– | 274/1250 [06:47<24:37, 1.51s/it] Training 1/1 epoch (loss 2.6812): 22%|β–ˆβ–ˆβ– | 275/1250 [06:47<21:38, 1.33s/it] Training 1/1 epoch (loss 2.6862): 22%|β–ˆβ–ˆβ– | 275/1250 [06:48<21:38, 1.33s/it] Training 1/1 epoch (loss 2.6862): 22%|β–ˆβ–ˆβ– | 276/1250 [06:48<22:27, 1.38s/it] Training 1/1 epoch (loss 2.5880): 22%|β–ˆβ–ˆβ– | 276/1250 [06:49<22:27, 1.38s/it] Training 1/1 epoch (loss 2.5880): 22%|β–ˆβ–ˆβ– | 277/1250 [06:49<19:43, 1.22s/it] Training 1/1 epoch (loss 2.7165): 22%|β–ˆβ–ˆβ– | 277/1250 [06:51<19:43, 1.22s/it] Training 1/1 epoch (loss 2.7165): 22%|β–ˆβ–ˆβ– | 278/1250 [06:51<22:40, 1.40s/it] Training 1/1 epoch (loss 2.7984): 22%|β–ˆβ–ˆβ– | 278/1250 [06:53<22:40, 1.40s/it] Training 1/1 epoch (loss 2.7984): 22%|β–ˆβ–ˆβ– | 279/1250 [06:53<26:01, 1.61s/it] Training 1/1 epoch (loss 2.5445): 22%|β–ˆβ–ˆβ– | 279/1250 [06:53<26:01, 1.61s/it] Training 1/1 epoch (loss 2.5445): 22%|β–ˆβ–ˆβ– | 280/1250 [06:53<20:34, 1.27s/it] Training 1/1 epoch (loss 2.7172): 22%|β–ˆβ–ˆβ– | 280/1250 [06:55<20:34, 1.27s/it] Training 1/1 epoch (loss 2.7172): 22%|β–ˆβ–ˆβ– | 281/1250 [06:55<22:12, 1.38s/it] Training 1/1 epoch (loss 2.5838): 22%|β–ˆβ–ˆβ– | 281/1250 [06:57<22:12, 1.38s/it] Training 1/1 epoch (loss 2.5838): 23%|β–ˆβ–ˆβ–Ž | 282/1250 [06:57<24:52, 1.54s/it] Training 1/1 epoch (loss 2.5956): 23%|β–ˆβ–ˆβ–Ž | 282/1250 [06:58<24:52, 1.54s/it] Training 1/1 epoch (loss 2.5956): 23%|β–ˆβ–ˆβ–Ž | 283/1250 [06:58<20:31, 1.27s/it] Training 1/1 epoch (loss 2.7378): 23%|β–ˆβ–ˆβ–Ž | 283/1250 [06:59<20:31, 
1.27s/it] Training 1/1 epoch (loss 2.7378): 23%|β–ˆβ–ˆβ–Ž | 284/1250 [06:59<22:29, 1.40s/it] Training 1/1 epoch (loss 2.6273): 23%|β–ˆβ–ˆβ–Ž | 284/1250 [07:01<22:29, 1.40s/it] Training 1/1 epoch (loss 2.6273): 23%|β–ˆβ–ˆβ–Ž | 285/1250 [07:01<23:03, 1.43s/it] Training 1/1 epoch (loss 2.4468): 23%|β–ˆβ–ˆβ–Ž | 285/1250 [07:02<23:03, 1.43s/it] Training 1/1 epoch (loss 2.4468): 23%|β–ˆβ–ˆβ–Ž | 286/1250 [07:02<21:48, 1.36s/it] Training 1/1 epoch (loss 2.5782): 23%|β–ˆβ–ˆβ–Ž | 286/1250 [07:04<21:48, 1.36s/it] Training 1/1 epoch (loss 2.5782): 23%|β–ˆβ–ˆβ–Ž | 287/1250 [07:04<22:53, 1.43s/it] Training 1/1 epoch (loss 2.6086): 23%|β–ˆβ–ˆβ–Ž | 287/1250 [07:05<22:53, 1.43s/it] Training 1/1 epoch (loss 2.6086): 23%|β–ˆβ–ˆβ–Ž | 288/1250 [07:05<23:59, 1.50s/it] Training 1/1 epoch (loss 2.5192): 23%|β–ˆβ–ˆβ–Ž | 288/1250 [07:07<23:59, 1.50s/it] Training 1/1 epoch (loss 2.5192): 23%|β–ˆβ–ˆβ–Ž | 289/1250 [07:07<23:46, 1.48s/it] Training 1/1 epoch (loss 2.4951): 23%|β–ˆβ–ˆβ–Ž | 289/1250 [07:09<23:46, 1.48s/it] Training 1/1 epoch (loss 2.4951): 23%|β–ˆβ–ˆβ–Ž | 290/1250 [07:09<28:13, 1.76s/it] Training 1/1 epoch (loss 2.5562): 23%|β–ˆβ–ˆβ–Ž | 290/1250 [07:10<28:13, 1.76s/it] Training 1/1 epoch (loss 2.5562): 23%|β–ˆβ–ˆβ–Ž | 291/1250 [07:10<24:00, 1.50s/it] Training 1/1 epoch (loss 2.7603): 23%|β–ˆβ–ˆβ–Ž | 291/1250 [07:12<24:00, 1.50s/it] Training 1/1 epoch (loss 2.7603): 23%|β–ˆβ–ˆβ–Ž | 292/1250 [07:12<25:48, 1.62s/it] Training 1/1 epoch (loss 2.7137): 23%|β–ˆβ–ˆβ–Ž | 292/1250 [07:13<25:48, 1.62s/it] Training 1/1 epoch (loss 2.7137): 23%|β–ˆβ–ˆβ–Ž | 293/1250 [07:13<24:19, 1.53s/it] Training 1/1 epoch (loss 2.5779): 23%|β–ˆβ–ˆβ–Ž | 293/1250 [07:14<24:19, 1.53s/it] Training 1/1 epoch (loss 2.5779): 24%|β–ˆβ–ˆβ–Ž | 294/1250 [07:14<19:34, 1.23s/it] Training 1/1 epoch (loss 2.5600): 24%|β–ˆβ–ˆβ–Ž | 294/1250 [07:16<19:34, 1.23s/it] Training 1/1 epoch (loss 2.5600): 24%|β–ˆβ–ˆβ–Ž | 295/1250 [07:16<22:56, 1.44s/it] Training 1/1 epoch (loss 2.6078): 24%|β–ˆβ–ˆβ–Ž | 295/1250 [07:18<22:56, 
1.44s/it] Training 1/1 epoch (loss 2.6078): 24%|β–ˆβ–ˆβ–Ž | 296/1250 [07:18<25:20, 1.59s/it] Training 1/1 epoch (loss 2.5779): 24%|β–ˆβ–ˆβ–Ž | 296/1250 [07:19<25:20, 1.59s/it] Training 1/1 epoch (loss 2.5779): 24%|β–ˆβ–ˆβ– | 297/1250 [07:19<24:04, 1.52s/it] Training 1/1 epoch (loss 2.6552): 24%|β–ˆβ–ˆβ– | 297/1250 [07:21<24:04, 1.52s/it] Training 1/1 epoch (loss 2.6552): 24%|β–ˆβ–ˆβ– | 298/1250 [07:21<24:55, 1.57s/it] Training 1/1 epoch (loss 2.5353): 24%|β–ˆβ–ˆβ– | 298/1250 [07:22<24:55, 1.57s/it] Training 1/1 epoch (loss 2.5353): 24%|β–ˆβ–ˆβ– | 299/1250 [07:22<21:09, 1.33s/it] Training 1/1 epoch (loss 2.6761): 24%|β–ˆβ–ˆβ– | 299/1250 [07:23<21:09, 1.33s/it] Training 1/1 epoch (loss 2.6761): 24%|β–ˆβ–ˆβ– | 300/1250 [07:23<21:01, 1.33s/it] Training 1/1 epoch (loss 2.5656): 24%|β–ˆβ–ˆβ– | 300/1250 [07:24<21:01, 1.33s/it] Training 1/1 epoch (loss 2.5656): 24%|β–ˆβ–ˆβ– | 301/1250 [07:24<21:38, 1.37s/it] Training 1/1 epoch (loss 2.4238): 24%|β–ˆβ–ˆβ– | 301/1250 [07:25<21:38, 1.37s/it] Training 1/1 epoch (loss 2.4238): 24%|β–ˆβ–ˆβ– | 302/1250 [07:25<17:27, 1.11s/it] Training 1/1 epoch (loss 2.3293): 24%|β–ˆβ–ˆβ– | 302/1250 [07:27<17:27, 1.11s/it] Training 1/1 epoch (loss 2.3293): 24%|β–ˆβ–ˆβ– | 303/1250 [07:27<23:31, 1.49s/it] Training 1/1 epoch (loss 2.7249): 24%|β–ˆβ–ˆβ– | 303/1250 [07:29<23:31, 1.49s/it] Training 1/1 epoch (loss 2.7249): 24%|β–ˆβ–ˆβ– | 304/1250 [07:29<27:20, 1.73s/it] Training 1/1 epoch (loss 2.7654): 24%|β–ˆβ–ˆβ– | 304/1250 [07:31<27:20, 1.73s/it] Training 1/1 epoch (loss 2.7654): 24%|β–ˆβ–ˆβ– | 305/1250 [07:31<24:43, 1.57s/it] Training 1/1 epoch (loss 2.7577): 24%|β–ˆβ–ˆβ– | 305/1250 [07:33<24:43, 1.57s/it] Training 1/1 epoch (loss 2.7577): 24%|β–ˆβ–ˆβ– | 306/1250 [07:33<29:09, 1.85s/it] Training 1/1 epoch (loss 2.6189): 24%|β–ˆβ–ˆβ– | 306/1250 [07:35<29:09, 1.85s/it] Training 1/1 epoch (loss 2.6189): 25%|β–ˆβ–ˆβ– | 307/1250 [07:35<27:41, 1.76s/it] Training 1/1 epoch (loss 2.5382): 25%|β–ˆβ–ˆβ– | 307/1250 [07:36<27:41, 
1.76s/it] Training 1/1 epoch (loss 2.5382): 25%|β–ˆβ–ˆβ– | 308/1250 [07:36<26:11, 1.67s/it] Training 1/1 epoch (loss 2.5121): 25%|β–ˆβ–ˆβ– | 308/1250 [07:38<26:11, 1.67s/it] Training 1/1 epoch (loss 2.5121): 25%|β–ˆβ–ˆβ– | 309/1250 [07:38<26:33, 1.69s/it] Training 1/1 epoch (loss 2.5264): 25%|β–ˆβ–ˆβ– | 309/1250 [07:39<26:33, 1.69s/it] Training 1/1 epoch (loss 2.5264): 25%|β–ˆβ–ˆβ– | 310/1250 [07:39<22:56, 1.46s/it] Training 1/1 epoch (loss 2.6667): 25%|β–ˆβ–ˆβ– | 310/1250 [07:41<22:56, 1.46s/it] Training 1/1 epoch (loss 2.6667): 25%|β–ˆβ–ˆβ– | 311/1250 [07:41<24:36, 1.57s/it] Training 1/1 epoch (loss 2.4802): 25%|β–ˆβ–ˆβ– | 311/1250 [07:43<24:36, 1.57s/it] Training 1/1 epoch (loss 2.4802): 25%|β–ˆβ–ˆβ– | 312/1250 [07:43<27:37, 1.77s/it] Training 1/1 epoch (loss 2.3413): 25%|β–ˆβ–ˆβ– | 312/1250 [07:44<27:37, 1.77s/it] Training 1/1 epoch (loss 2.3413): 25%|β–ˆβ–ˆβ–Œ | 313/1250 [07:44<22:13, 1.42s/it] Training 1/1 epoch (loss 2.5338): 25%|β–ˆβ–ˆβ–Œ | 313/1250 [07:45<22:13, 1.42s/it] Training 1/1 epoch (loss 2.5338): 25%|β–ˆβ–ˆβ–Œ | 314/1250 [07:45<22:20, 1.43s/it] Training 1/1 epoch (loss 2.6837): 25%|β–ˆβ–ˆβ–Œ | 314/1250 [07:46<22:20, 1.43s/it] Training 1/1 epoch (loss 2.6837): 25%|β–ˆβ–ˆβ–Œ | 315/1250 [07:46<20:56, 1.34s/it] Training 1/1 epoch (loss 2.5422): 25%|β–ˆβ–ˆβ–Œ | 315/1250 [07:47<20:56, 1.34s/it] Training 1/1 epoch (loss 2.5422): 25%|β–ˆβ–ˆβ–Œ | 316/1250 [07:47<17:21, 1.12s/it] Training 1/1 epoch (loss 2.6326): 25%|β–ˆβ–ˆβ–Œ | 316/1250 [07:49<17:21, 1.12s/it] Training 1/1 epoch (loss 2.6326): 25%|β–ˆβ–ˆβ–Œ | 317/1250 [07:49<21:11, 1.36s/it] Training 1/1 epoch (loss 2.7725): 25%|β–ˆβ–ˆβ–Œ | 317/1250 [07:50<21:11, 1.36s/it] Training 1/1 epoch (loss 2.7725): 25%|β–ˆβ–ˆβ–Œ | 318/1250 [07:50<22:23, 1.44s/it] Training 1/1 epoch (loss 2.4994): 25%|β–ˆβ–ˆβ–Œ | 318/1250 [07:51<22:23, 1.44s/it] Training 1/1 epoch (loss 2.4994): 26%|β–ˆβ–ˆβ–Œ | 319/1250 [07:51<19:00, 1.23s/it] Training 1/1 epoch (loss 2.6697): 26%|β–ˆβ–ˆβ–Œ | 319/1250 [07:53<19:00, 
1.23s/it] Training 1/1 epoch (loss 2.6697): 26%|β–ˆβ–ˆβ–Œ | 320/1250 [07:53<24:49, 1.60s/it] Training 1/1 epoch (loss 2.5395): 26%|β–ˆβ–ˆβ–Œ | 320/1250 [07:55<24:49, 1.60s/it] Training 1/1 epoch (loss 2.5395): 26%|β–ˆβ–ˆβ–Œ | 321/1250 [07:55<23:37, 1.53s/it] Training 1/1 epoch (loss 2.6640): 26%|β–ˆβ–ˆβ–Œ | 321/1250 [07:56<23:37, 1.53s/it] Training 1/1 epoch (loss 2.6640): 26%|β–ˆβ–ˆβ–Œ | 322/1250 [07:56<21:38, 1.40s/it] Training 1/1 epoch (loss 2.4870): 26%|β–ˆβ–ˆβ–Œ | 322/1250 [07:58<21:38, 1.40s/it] Training 1/1 epoch (loss 2.4870): 26%|β–ˆβ–ˆβ–Œ | 323/1250 [07:58<24:05, 1.56s/it] Training 1/1 epoch (loss 2.7707): 26%|β–ˆβ–ˆβ–Œ | 323/1250 [07:59<24:05, 1.56s/it] Training 1/1 epoch (loss 2.7707): 26%|β–ˆβ–ˆβ–Œ | 324/1250 [07:59<20:04, 1.30s/it] Training 1/1 epoch (loss 2.3809): 26%|β–ˆβ–ˆβ–Œ | 324/1250 [08:00<20:04, 1.30s/it] Training 1/1 epoch (loss 2.3809): 26%|β–ˆβ–ˆβ–Œ | 325/1250 [08:00<20:13, 1.31s/it] Training 1/1 epoch (loss 2.6545): 26%|β–ˆβ–ˆβ–Œ | 325/1250 [08:01<20:13, 1.31s/it] Training 1/1 epoch (loss 2.6545): 26%|β–ˆβ–ˆβ–Œ | 326/1250 [08:01<20:10, 1.31s/it] Training 1/1 epoch (loss 2.5101): 26%|β–ˆβ–ˆβ–Œ | 326/1250 [08:02<20:10, 1.31s/it] Training 1/1 epoch (loss 2.5101): 26%|β–ˆβ–ˆβ–Œ | 327/1250 [08:02<17:25, 1.13s/it] Training 1/1 epoch (loss 2.3084): 26%|β–ˆβ–ˆβ–Œ | 327/1250 [08:04<17:25, 1.13s/it] Training 1/1 epoch (loss 2.3084): 26%|β–ˆβ–ˆβ–Œ | 328/1250 [08:04<21:04, 1.37s/it] Training 1/1 epoch (loss 2.8567): 26%|β–ˆβ–ˆβ–Œ | 328/1250 [08:05<21:04, 1.37s/it] Training 1/1 epoch (loss 2.8567): 26%|β–ˆβ–ˆβ–‹ | 329/1250 [08:05<22:22, 1.46s/it] Training 1/1 epoch (loss 2.4954): 26%|β–ˆβ–ˆβ–‹ | 329/1250 [08:06<22:22, 1.46s/it] Training 1/1 epoch (loss 2.4954): 26%|β–ˆβ–ˆβ–‹ | 330/1250 [08:06<19:41, 1.28s/it] Training 1/1 epoch (loss 2.5785): 26%|β–ˆβ–ˆβ–‹ | 330/1250 [08:08<19:41, 1.28s/it] Training 1/1 epoch (loss 2.5785): 26%|β–ˆβ–ˆβ–‹ | 331/1250 [08:08<19:02, 1.24s/it] Training 1/1 epoch (loss 2.7224): 26%|β–ˆβ–ˆβ–‹ | 331/1250 [08:09<19:02, 
1.24s/it] Training 1/1 epoch (loss 2.7224): 27%|β–ˆβ–ˆβ–‹ | 332/1250 [08:09<18:13, 1.19s/it] Training 1/1 epoch (loss 2.5375): 27%|β–ˆβ–ˆβ–‹ | 332/1250 [08:10<18:13, 1.19s/it] Training 1/1 epoch (loss 2.5375): 27%|β–ˆβ–ˆβ–‹ | 333/1250 [08:10<19:48, 1.30s/it] Training 1/1 epoch (loss 2.7098): 27%|β–ˆβ–ˆβ–‹ | 333/1250 [08:11<19:48, 1.30s/it] Training 1/1 epoch (loss 2.7098): 27%|β–ˆβ–ˆβ–‹ | 334/1250 [08:11<20:07, 1.32s/it] Training 1/1 epoch (loss 2.5793): 27%|β–ˆβ–ˆβ–‹ | 334/1250 [08:12<20:07, 1.32s/it] Training 1/1 epoch (loss 2.5793): 27%|β–ˆβ–ˆβ–‹ | 335/1250 [08:12<16:49, 1.10s/it] Training 1/1 epoch (loss 2.7352): 27%|β–ˆβ–ˆβ–‹ | 335/1250 [08:15<16:49, 1.10s/it] Training 1/1 epoch (loss 2.7352): 27%|β–ˆβ–ˆβ–‹ | 336/1250 [08:15<24:28, 1.61s/it] Training 1/1 epoch (loss 2.9033): 27%|β–ˆβ–ˆβ–‹ | 336/1250 [08:17<24:28, 1.61s/it] Training 1/1 epoch (loss 2.9033): 27%|β–ˆβ–ˆβ–‹ | 337/1250 [08:17<26:59, 1.77s/it] Training 1/1 epoch (loss 2.6548): 27%|β–ˆβ–ˆβ–‹ | 337/1250 [08:18<26:59, 1.77s/it] Training 1/1 epoch (loss 2.6548): 27%|β–ˆβ–ˆβ–‹ | 338/1250 [08:18<21:12, 1.40s/it] Training 1/1 epoch (loss 2.6761): 27%|β–ˆβ–ˆβ–‹ | 338/1250 [08:19<21:12, 1.40s/it] Training 1/1 epoch (loss 2.6761): 27%|β–ˆβ–ˆβ–‹ | 339/1250 [08:19<23:39, 1.56s/it] Training 1/1 epoch (loss 2.6127): 27%|β–ˆβ–ˆβ–‹ | 339/1250 [08:22<23:39, 1.56s/it] Training 1/1 epoch (loss 2.6127): 27%|β–ˆβ–ˆβ–‹ | 340/1250 [08:22<27:18, 1.80s/it] Training 1/1 epoch (loss 2.4530): 27%|β–ˆβ–ˆβ–‹ | 340/1250 [08:23<27:18, 1.80s/it] Training 1/1 epoch (loss 2.4530): 27%|β–ˆβ–ˆβ–‹ | 341/1250 [08:23<22:23, 1.48s/it] Training 1/1 epoch (loss 2.5442): 27%|β–ˆβ–ˆβ–‹ | 341/1250 [08:24<22:23, 1.48s/it] Training 1/1 epoch (loss 2.5442): 27%|β–ˆβ–ˆβ–‹ | 342/1250 [08:24<21:29, 1.42s/it] Training 1/1 epoch (loss 2.4876): 27%|β–ˆβ–ˆβ–‹ | 342/1250 [08:25<21:29, 1.42s/it] Training 1/1 epoch (loss 2.4876): 27%|β–ˆβ–ˆβ–‹ | 343/1250 [08:25<21:10, 1.40s/it] Training 1/1 epoch (loss 2.6410): 27%|β–ˆβ–ˆβ–‹ | 343/1250 [08:26<21:10, 
1.40s/it] Training 1/1 epoch (loss 2.6410): 28%|β–ˆβ–ˆβ–Š | 344/1250 [08:26<17:27, 1.16s/it] Training 1/1 epoch (loss 2.5860): 28%|β–ˆβ–ˆβ–Š | 344/1250 [08:28<17:27, 1.16s/it] Training 1/1 epoch (loss 2.5860): 28%|β–ˆβ–ˆβ–Š | 345/1250 [08:28<22:28, 1.49s/it] Training 1/1 epoch (loss 2.5541): 28%|β–ˆβ–ˆβ–Š | 345/1250 [08:30<22:28, 1.49s/it] Training 1/1 epoch (loss 2.5541): 28%|β–ˆβ–ˆβ–Š | 346/1250 [08:30<22:55, 1.52s/it] Training 1/1 epoch (loss 2.6330): 28%|β–ˆβ–ˆβ–Š | 346/1250 [08:30<22:55, 1.52s/it] Training 1/1 epoch (loss 2.6330): 28%|β–ˆβ–ˆβ–Š | 347/1250 [08:30<19:15, 1.28s/it] Training 1/1 epoch (loss 2.7124): 28%|β–ˆβ–ˆβ–Š | 347/1250 [08:32<19:15, 1.28s/it] Training 1/1 epoch (loss 2.7124): 28%|β–ˆβ–ˆβ–Š | 348/1250 [08:32<20:38, 1.37s/it] Training 1/1 epoch (loss 2.5615): 28%|β–ˆβ–ˆβ–Š | 348/1250 [08:34<20:38, 1.37s/it] Training 1/1 epoch (loss 2.5615): 28%|β–ˆβ–ˆβ–Š | 349/1250 [08:34<24:11, 1.61s/it] Training 1/1 epoch (loss 2.6647): 28%|β–ˆβ–ˆβ–Š | 349/1250 [08:35<24:11, 1.61s/it] Training 1/1 epoch (loss 2.6647): 28%|β–ˆβ–ˆβ–Š | 350/1250 [08:35<20:26, 1.36s/it] Training 1/1 epoch (loss 2.5803): 28%|β–ˆβ–ˆβ–Š | 350/1250 [08:37<20:26, 1.36s/it] Training 1/1 epoch (loss 2.5803): 28%|β–ˆβ–ˆβ–Š | 351/1250 [08:37<22:13, 1.48s/it] Training 1/1 epoch (loss 2.7046): 28%|β–ˆβ–ˆβ–Š | 351/1250 [08:38<22:13, 1.48s/it] Training 1/1 epoch (loss 2.7046): 28%|β–ˆβ–ˆβ–Š | 352/1250 [08:38<21:52, 1.46s/it] Training 1/1 epoch (loss 2.7857): 28%|β–ˆβ–ˆβ–Š | 352/1250 [08:40<21:52, 1.46s/it] Training 1/1 epoch (loss 2.7857): 28%|β–ˆβ–ˆβ–Š | 353/1250 [08:40<22:34, 1.51s/it] Training 1/1 epoch (loss 2.6357): 28%|β–ˆβ–ˆβ–Š | 353/1250 [08:42<22:34, 1.51s/it] Training 1/1 epoch (loss 2.6357): 28%|β–ˆβ–ˆβ–Š | 354/1250 [08:42<24:15, 1.62s/it] Training 1/1 epoch (loss 2.4520): 28%|β–ˆβ–ˆβ–Š | 354/1250 [08:43<24:15, 1.62s/it] Training 1/1 epoch (loss 2.4520): 28%|β–ˆβ–ˆβ–Š | 355/1250 [08:43<21:38, 1.45s/it] Training 1/1 epoch (loss 2.5976): 28%|β–ˆβ–ˆβ–Š | 355/1250 [08:44<21:38, 
1.45s/it] Training 1/1 epoch (loss 2.5976): 28%|β–ˆβ–ˆβ–Š | 356/1250 [08:44<21:54, 1.47s/it] Training 1/1 epoch (loss 2.7423): 28%|β–ˆβ–ˆβ–Š | 356/1250 [08:46<21:54, 1.47s/it] Training 1/1 epoch (loss 2.7423): 29%|β–ˆβ–ˆβ–Š | 357/1250 [08:46<22:32, 1.52s/it] Training 1/1 epoch (loss 2.7117): 29%|β–ˆβ–ˆβ–Š | 357/1250 [08:46<22:32, 1.52s/it] Training 1/1 epoch (loss 2.7117): 29%|β–ˆβ–ˆβ–Š | 358/1250 [08:46<17:53, 1.20s/it] Training 1/1 epoch (loss 2.6029): 29%|β–ˆβ–ˆβ–Š | 358/1250 [08:48<17:53, 1.20s/it] Training 1/1 epoch (loss 2.6029): 29%|β–ˆβ–ˆβ–Š | 359/1250 [08:48<20:32, 1.38s/it] Training 1/1 epoch (loss 2.4152): 29%|β–ˆβ–ˆβ–Š | 359/1250 [08:50<20:32, 1.38s/it] Training 1/1 epoch (loss 2.4152): 29%|β–ˆβ–ˆβ–‰ | 360/1250 [08:50<20:50, 1.40s/it] Training 1/1 epoch (loss 2.5098): 29%|β–ˆβ–ˆβ–‰ | 360/1250 [08:50<20:50, 1.40s/it] Training 1/1 epoch (loss 2.5098): 29%|β–ˆβ–ˆβ–‰ | 361/1250 [08:50<17:53, 1.21s/it] Training 1/1 epoch (loss 2.6053): 29%|β–ˆβ–ˆβ–‰ | 361/1250 [08:52<17:53, 1.21s/it] Training 1/1 epoch (loss 2.6053): 29%|β–ˆβ–ˆβ–‰ | 362/1250 [08:52<22:10, 1.50s/it] Training 1/1 epoch (loss 2.6451): 29%|β–ˆβ–ˆβ–‰ | 362/1250 [08:54<22:10, 1.50s/it] Training 1/1 epoch (loss 2.6451): 29%|β–ˆβ–ˆβ–‰ | 363/1250 [08:54<21:41, 1.47s/it] Training 1/1 epoch (loss 2.7097): 29%|β–ˆβ–ˆβ–‰ | 363/1250 [08:55<21:41, 1.47s/it] Training 1/1 epoch (loss 2.7097): 29%|β–ˆβ–ˆβ–‰ | 364/1250 [08:55<18:39, 1.26s/it] Training 1/1 epoch (loss 2.5016): 29%|β–ˆβ–ˆβ–‰ | 364/1250 [08:57<18:39, 1.26s/it] Training 1/1 epoch (loss 2.5016): 29%|β–ˆβ–ˆβ–‰ | 365/1250 [08:57<22:17, 1.51s/it] Training 1/1 epoch (loss 2.3725): 29%|β–ˆβ–ˆβ–‰ | 365/1250 [08:58<22:17, 1.51s/it] Training 1/1 epoch (loss 2.3725): 29%|β–ˆβ–ˆβ–‰ | 366/1250 [08:58<23:26, 1.59s/it] Training 1/1 epoch (loss 2.4695): 29%|β–ˆβ–ˆβ–‰ | 366/1250 [08:59<23:26, 1.59s/it] Training 1/1 epoch (loss 2.4695): 29%|β–ˆβ–ˆβ–‰ | 367/1250 [08:59<19:51, 1.35s/it] Training 1/1 epoch (loss 2.5351): 29%|β–ˆβ–ˆβ–‰ | 367/1250 [09:01<19:51, 
1.35s/it]
Training 1/1 epoch (loss 2.5351): 29%|██▉ | 368/1250 [09:01<19:45, 1.34s/it]
Training 1/1 epoch (loss 2.6431): 30%|██▉ | 369/1250 [09:02<18:02, 1.23s/it]
Training 1/1 epoch (loss 2.7314): 30%|██▉ | 370/1250 [09:04<22:41, 1.55s/it]
Training 1/1 epoch (loss 2.5126): 30%|██▉ | 371/1250 [09:06<24:31, 1.67s/it]
Training 1/1 epoch (loss 2.5577): 30%|██▉ | 372/1250 [09:07<20:49, 1.42s/it]
Training 1/1 epoch (loss 2.5325): 30%|██▉ | 373/1250 [09:08<20:39, 1.41s/it]
Training 1/1 epoch (loss 2.3176): 30%|██▉ | 374/1250 [09:11<25:10, 1.72s/it]
Training 1/1 epoch (loss 2.6262): 30%|███ | 375/1250 [09:11<20:00, 1.37s/it]
Training 1/1 epoch (loss 2.7935): 30%|███ | 376/1250 [09:14<25:21, 1.74s/it]
Training 1/1 epoch (loss 2.7203): 30%|███ | 377/1250 [09:15<23:34, 1.62s/it]
Training 1/1 epoch (loss 2.5729): 30%|███ | 378/1250 [09:16<21:32, 1.48s/it]
Training 1/1 epoch (loss 2.4096): 30%|███ | 379/1250 [09:18<23:12, 1.60s/it]
Training 1/1 epoch (loss 2.6214): 30%|███ | 380/1250 [09:19<20:59, 1.45s/it]
Training 1/1 epoch (loss 2.6523): 30%|███ | 381/1250 [09:21<20:43, 1.43s/it]
Training 1/1 epoch (loss 2.7056): 31%|███ | 382/1250 [09:22<20:32, 1.42s/it]
Training 1/1 epoch (loss 2.5801): 31%|███ | 383/1250 [09:23<17:36, 1.22s/it]
Training 1/1 epoch (loss 2.4524): 31%|███ | 384/1250 [09:25<23:37, 1.64s/it]
Training 1/1 epoch (loss 2.5614): 31%|███ | 385/1250 [09:27<23:10, 1.61s/it]
Training 1/1 epoch (loss 2.4120): 31%|███ | 386/1250 [09:28<20:31, 1.43s/it]
Training 1/1 epoch (loss 2.4607): 31%|███ | 387/1250 [09:29<21:08, 1.47s/it]
Training 1/1 epoch (loss 2.5548): 31%|███ | 388/1250 [09:32<25:35, 1.78s/it]
Training 1/1 epoch (loss 2.6726): 31%|███ | 389/1250 [09:33<21:37, 1.51s/it]
Training 1/1 epoch (loss 2.5596): 31%|███ | 390/1250 [09:34<22:12, 1.55s/it]
Training 1/1 epoch (loss 2.7853): 31%|███▏ | 391/1250 [09:36<22:01, 1.54s/it]
Training 1/1 epoch (loss 2.6251): 31%|███▏ | 392/1250 [09:38<25:42, 1.80s/it]
Training 1/1 epoch (loss 2.6902): 31%|███▏ | 393/1250 [09:40<23:59, 1.68s/it]
Training 1/1 epoch (loss 2.4547): 32%|███▏ | 394/1250 [09:40<18:51, 1.32s/it]
Training 1/1 epoch (loss 2.3229): 32%|███▏ | 395/1250 [09:42<20:20, 1.43s/it]
Training 1/1 epoch (loss 2.5291): 32%|███▏ | 396/1250 [09:44<24:04, 1.69s/it]
Training 1/1 epoch (loss 2.3587): 32%|███▏ | 397/1250 [09:45<20:53, 1.47s/it]
Training 1/1 epoch (loss 2.3998): 32%|███▏ | 398/1250 [09:47<23:49, 1.68s/it]
Training 1/1 epoch (loss 2.6377): 32%|███▏ | 399/1250 [09:48<20:42, 1.46s/it]
Training 1/1 epoch (loss 2.6170): 32%|███▏ | 400/1250 [09:49<19:29, 1.38s/it]
Training 1/1 epoch (loss 2.4228): 32%|███▏ | 401/1250 [09:51<18:52, 1.33s/it]
Training 1/1 epoch (loss 2.7312): 32%|███▏ | 402/1250 [09:52<18:18, 1.30s/it]
Training 1/1 epoch (loss 2.6724): 32%|███▏ | 403/1250 [09:53<16:53, 1.20s/it]
Training 1/1 epoch (loss 2.4254): 32%|███▏ | 404/1250 [09:55<19:40, 1.40s/it]
Training 1/1 epoch (loss 2.5950): 32%|███▏ | 405/1250 [09:55<16:29, 1.17s/it]
Training 1/1 epoch (loss 2.5874): 32%|███▏ | 406/1250 [09:57<18:54, 1.34s/it]
Training 1/1 epoch (loss 2.4829): 33%|███▎ | 407/1250 [09:59<22:34, 1.61s/it]
Training 1/1 epoch (loss 2.6153): 33%|███▎ | 408/1250 [10:00<19:08, 1.36s/it]
Training 1/1 epoch (loss 2.7196): 33%|███▎ | 409/1250 [10:02<19:12, 1.37s/it]
Training 1/1 epoch (loss 2.7338): 33%|███▎ | 410/1250 [10:03<19:02, 1.36s/it]
Training 1/1 epoch (loss 2.5394): 33%|███▎ | 411/1250 [10:04<16:05, 1.15s/it]
Training 1/1 epoch (loss 2.3797): 33%|███▎ | 412/1250 [10:06<20:35, 1.47s/it]
Training 1/1 epoch (loss 2.5171): 33%|███▎ | 413/1250 [10:08<24:57, 1.79s/it]
Training 1/1 epoch (loss 2.6820): 33%|███▎ | 414/1250 [10:09<19:18, 1.39s/it]
Training 1/1 epoch (loss 2.5399): 33%|███▎ | 415/1250 [10:10<20:46, 1.49s/it]
Training 1/1 epoch (loss 2.6981): 33%|███▎ | 416/1250 [10:13<23:40, 1.70s/it]
Training 1/1 epoch (loss 2.7274): 33%|███▎ | 417/1250 [10:14<21:45, 1.57s/it]
Training 1/1 epoch (loss 2.6189): 33%|███▎ | 418/1250 [10:16<22:36, 1.63s/it]
Training 1/1 epoch (loss 2.7220): 34%|███▎ | 419/1250 [10:17<19:33, 1.41s/it]
Training 1/1 epoch (loss 2.6373): 34%|███▎ | 420/1250 [10:18<20:26, 1.48s/it]
Training 1/1 epoch (loss 2.5842): 34%|███▎ | 421/1250 [10:20<22:33, 1.63s/it]
Training 1/1 epoch (loss 2.5287): 34%|███▍ | 422/1250 [10:21<18:09, 1.32s/it]
Training 1/1 epoch (loss 2.5558): 34%|███▍ | 423/1250 [10:23<22:31, 1.63s/it]
Training 1/1 epoch (loss 2.5308): 34%|███▍ | 424/1250 [10:26<25:34, 1.86s/it]
Training 1/1 epoch (loss 2.7111): 34%|███▍ | 425/1250 [10:26<21:23, 1.56s/it]
Training 1/1 epoch (loss 2.7780): 34%|███▍ | 426/1250 [10:28<22:20, 1.63s/it]
Training 1/1 epoch (loss 2.5859): 34%|███▍ | 427/1250 [10:30<21:48, 1.59s/it]
Training 1/1 epoch (loss 2.5363): 34%|███▍ | 428/1250 [10:30<18:31, 1.35s/it]
Training 1/1 epoch (loss 2.5184): 34%|███▍ | 429/1250 [10:32<20:54, 1.53s/it]
Training 1/1 epoch (loss 2.4157): 34%|███▍ | 430/1250 [10:34<20:18, 1.49s/it]
Training 1/1 epoch (loss 2.5697): 34%|███▍ | 431/1250 [10:35<19:41, 1.44s/it]
Training 1/1 epoch (loss 2.6282): 35%|███▍ | 432/1250 [10:37<22:55, 1.68s/it]
Training 1/1 epoch (loss 2.5365): 35%|███▍ | 433/1250 [10:38<18:47, 1.38s/it]
Training 1/1 epoch (loss 2.6741): 35%|███▍ | 434/1250 [10:40<20:43, 1.52s/it]
Training 1/1 epoch (loss 2.6643): 35%|███▍ | 435/1250 [10:41<18:45, 1.38s/it]
Training 1/1 epoch (loss 2.7269): 35%|███▍ | 436/1250 [10:41<15:07, 1.11s/it]
Training 1/1 epoch (loss 2.5342): 35%|███▍ | 437/1250 [10:43<18:50, 1.39s/it]
Training 1/1 epoch (loss 2.5106): 35%|███▌ | 438/1250 [10:45<19:04, 1.41s/it]
Training 1/1 epoch (loss 2.6423): 35%|███▌ | 439/1250 [10:46<16:15, 1.20s/it]
Training 1/1 epoch (loss 2.9034): 35%|███▌ | 440/1250 [10:48<19:55, 1.48s/it]
Training 1/1 epoch (loss 2.4833): 35%|███▌ | 441/1250 [10:49<19:51, 1.47s/it]
Training 1/1 epoch (loss 2.5617): 35%|███▌ | 442/1250 [10:51<21:24, 1.59s/it]
Training 1/1 epoch (loss 2.5330): 35%|███▌ | 443/1250 [10:53<22:00, 1.64s/it]
Training 1/1 epoch (loss 2.7666): 36%|███▌ | 444/1250 [10:53<17:09, 1.28s/it]
Training 1/1 epoch (loss 2.5395): 36%|███▌ | 445/1250 [10:56<21:28, 1.60s/it]
Training 1/1 epoch (loss 2.4934): 36%|███▌ | 446/1250 [10:57<21:58, 1.64s/it]
Training 1/1 epoch (loss 2.5582): 36%|███▌ | 447/1250 [10:58<18:07, 1.35s/it]
Training 1/1 epoch (loss 2.4085): 36%|███▌ | 448/1250 [11:00<20:40, 1.55s/it]
Training 1/1 epoch (loss 2.6301): 36%|███▌ | 449/1250 [11:01<18:55, 1.42s/it]
Training 1/1 epoch (loss 2.5570): 36%|███▌ | 450/1250 [11:03<20:16, 1.52s/it]
Training 1/1 epoch (loss 2.5005): 36%|███▌ | 451/1250 [11:05<24:09, 1.81s/it]
Training 1/1 epoch (loss 2.5036): 36%|███▌ | 452/1250 [11:06<20:30, 1.54s/it]
Training 1/1 epoch (loss 2.6251): 36%|███▌ | 453/1250 [11:08<22:32, 1.70s/it]
Training 1/1 epoch (loss 2.6079): 36%|███▋ | 454/1250 [11:11<25:38, 1.93s/it]
Training 1/1 epoch (loss 2.5340): 36%|███▋ | 455/1250 [11:12<20:20, 1.54s/it]
Training 1/1 epoch (loss 2.7243): 36%|███▋ | 456/1250 [11:13<22:01, 1.66s/it]
Training 1/1 epoch (loss 2.7251): 37%|███▋ | 457/1250 [11:15<22:47, 1.72s/it]
Training 1/1 epoch (loss 2.5736): 37%|███▋ | 458/1250 [11:17<21:59, 1.67s/it]
Training 1/1 epoch (loss 2.8168): 37%|███▋ | 459/1250 [11:18<19:01, 1.44s/it]
Training 1/1 epoch (loss 2.6727): 37%|███▋ | 460/1250 [11:19<17:49, 1.35s/it]
Training 1/1 epoch (loss 2.4602): 37%|███▋ | 461/1250 [11:20<17:24, 1.32s/it]
Training 1/1 epoch (loss 2.5269): 37%|███▋ | 462/1250 [11:21<16:17, 1.24s/it]
Training 1/1 epoch (loss 2.6007): 37%|███▋ | 463/1250 [11:23<16:50, 1.28s/it]
Training 1/1 epoch (loss 2.4682): 37%|███▋ | 464/1250 [11:24<16:47, 1.28s/it]
Training 1/1 epoch (loss 2.3068): 37%|███▋ | 465/1250 [11:26<20:47, 1.59s/it]
Training 1/1 epoch (loss 2.3991): 37%|███▋ | 466/1250 [11:27<17:23, 1.33s/it]
Training 1/1 epoch (loss 2.4429): 37%|███▋ | 467/1250 [11:29<19:38, 1.51s/it]
Training 1/1 epoch (loss 2.5941): 37%|███▋ | 468/1250 [11:31<20:27, 1.57s/it]
Training 1/1 epoch (loss 2.6405): 38%|███▊ | 469/1250 [11:31<16:24, 1.26s/it]
Training 1/1 epoch (loss 2.6718): 38%|███▊ | 470/1250 [11:32<16:37, 1.28s/it]
Training 1/1 epoch (loss 2.7111): 38%|███▊ | 471/1250 [11:34<18:31, 1.43s/it]
Training 1/1 epoch (loss 2.4984): 38%|███▊ | 472/1250 [11:35<17:03, 1.32s/it]
Training 1/1 epoch (loss 2.5212): 38%|███▊ | 473/1250 [11:37<18:44, 1.45s/it]
Training 1/1 epoch (loss 2.4705): 38%|███▊ | 474/1250 [11:38<17:27, 1.35s/it]
Training 1/1 epoch (loss 2.5481): 38%|███▊ | 475/1250 [11:39<14:45, 1.14s/it]
Training 1/1 epoch (loss 2.6578): 38%|███▊ | 476/1250 [11:40<16:50, 1.31s/it]
Training 1/1 epoch (loss 2.6011): 38%|███▊ | 477/1250 [11:42<18:12, 1.41s/it]
Training 1/1 epoch (loss 2.7581): 38%|███▊ | 478/1250 [11:44<18:19, 1.42s/it]
Training 1/1 epoch (loss 2.5347): 38%|███▊ | 479/1250 [11:45<19:52, 1.55s/it]
Training 1/1 epoch (loss 2.5335): 38%|███▊ | 480/1250 [11:47<18:30, 1.44s/it]
Training 1/1 epoch (loss 2.5061): 38%|███▊ | 481/1250 [11:49<21:55, 1.71s/it]
Training 1/1 epoch (loss 2.7551): 39%|███▊ | 482/1250 [11:50<20:13, 1.58s/it]
Training 1/1 epoch (loss 2.5300): 39%|███▊ | 483/1250 [11:51<16:05, 1.26s/it]
Training 1/1 epoch (loss 2.4110): 39%|███▊ | 484/1250 [11:52<16:22, 1.28s/it]
Training 1/1 epoch (loss 2.4364): 39%|███▉ | 485/1250 [11:54<17:19, 1.36s/it]
Training 1/1 epoch (loss 2.6351): 39%|███▉ | 486/1250 [11:54<14:00, 1.10s/it]
Training 1/1 epoch (loss 2.5930): 39%|███▉ | 487/1250 [11:57<19:08, 1.51s/it]
Training 1/1 epoch (loss 2.4453): 39%|███▉ | 488/1250 [11:59<21:03, 1.66s/it]
Training 1/1 epoch (loss 2.5649): 39%|███▉ | 489/1250 [11:59<17:31, 1.38s/it]
Training 1/1 epoch (loss 2.7979): 39%|███▉ | 490/1250 [12:01<17:21, 1.37s/it]
Training 1/1 epoch (loss 2.5990): 39%|███▉ | 491/1250 [12:02<17:20, 1.37s/it]
Training 1/1 epoch (loss 2.5440): 39%|███▉ | 492/1250 [12:03<16:58, 1.34s/it]
Training 1/1 epoch (loss 2.6045): 39%|███▉ | 493/1250 [12:05<19:01, 1.51s/it]
Training 1/1 epoch (loss 2.4121): 40%|███▉ | 494/1250 [12:06<15:22, 1.22s/it]
Training 1/1 epoch (loss 2.6961): 40%|███▉ | 495/1250 [12:07<15:47, 1.25s/it]
Training 1/1 epoch (loss 2.6368): 40%|███▉ | 496/1250 [12:09<16:37, 1.32s/it]
Training 1/1 epoch (loss 2.4938): 40%|███▉ | 497/1250 [12:09<14:33, 1.16s/it]
Training 1/1 epoch (loss 2.7109): 40%|███▉ | 498/1250 [12:12<19:18, 1.54s/it]
Training 1/1 epoch (loss 2.3223): 40%|███▉ | 499/1250 [12:14<22:01, 1.76s/it]
Training 1/1 epoch (loss 2.6849): 40%|████ | 500/1250 [12:15<17:11, 1.38s/it]
Training 1/1 epoch (loss 2.6097): 40%|████ | 501/1250 [12:16<17:37, 1.41s/it]
Training 1/1 epoch (loss 2.7321): 40%|████ | 502/1250 [12:18<20:05, 1.61s/it]
Training 1/1 epoch (loss 2.8476): 40%|████ | 503/1250 [12:19<15:57, 1.28s/it]
Training 1/1 epoch (loss 2.4316): 40%|████ | 504/1250 [12:20<17:52, 1.44s/it]
Training 1/1 epoch (loss 2.8073): 40%|████ | 505/1250 [12:23<21:30, 1.73s/it]
Training 1/1 epoch (loss 2.7498): 40%|████ | 506/1250 [12:23<16:25, 1.32s/it]
Training 1/1 epoch (loss 2.5748): 41%|████ | 507/1250 [12:26<20:02, 1.62s/it]
Training 1/1 epoch (loss 2.5914): 41%|████ | 508/1250 [12:27<20:13, 1.63s/it]
Training 1/1 epoch (loss 2.6236): 41%|████ | 509/1250 [12:28<16:52, 1.37s/it]
Training 1/1 epoch (loss 2.4864): 41%|████ | 510/1250 [12:30<19:41, 1.60s/it]
Training 1/1 epoch (loss 2.4993): 41%|████ | 511/1250 [12:32<19:10, 1.56s/it]
Training 1/1 epoch (loss 2.5413): 41%|████ | 512/1250 [12:33<18:40, 1.52s/it]
Training 1/1 epoch (loss 2.5662): 41%|████ | 513/1250 [12:34<17:06, 1.39s/it]
Training 1/1 epoch (loss 2.4906): 41%|████ | 514/1250 [12:36<17:35, 1.43s/it]
Training 1/1 epoch (loss 2.4394): 41%|████ | 515/1250 [12:37<15:49, 1.29s/it]
Training 1/1 epoch (loss 2.5632): 41%|████▏ | 516/1250 [12:39<18:21, 1.50s/it]
Training 1/1 epoch (loss 2.7066): 41%|████▏ | 517/1250 [12:39<16:15, 1.33s/it]
Training 1/1 epoch (loss 2.5641): 41%|████▏ | 518/1250 [12:41<18:01, 1.48s/it]
Training 1/1 epoch (loss 2.5284): 42%|████▏ | 519/1250 [12:43<19:10, 1.57s/it]
Training 1/1 epoch (loss 2.4196): 42%|████▏ | 520/1250 [12:44<17:22, 1.43s/it]
Training 1/1 epoch (loss 2.7214): 42%|████▏ | 521/1250 [12:47<20:45, 1.71s/it]
Training 1/1 epoch (loss 2.6504): 42%|████▏ | 522/1250 [12:49<22:06, 1.82s/it]
Training 1/1 epoch (loss 2.7730): 42%|████▏ | 523/1250 [12:49<17:37, 1.45s/it]
Training 1/1 epoch (loss 2.6191): 42%|████▏ | 524/1250 [12:50<15:57, 1.32s/it]
Training 1/1 epoch (loss 2.6424): 42%|████▏ | 525/1250 [12:52<17:39, 1.46s/it]
Training 1/1 epoch (loss 2.5818): 42%|████▏ | 526/1250 [12:53<14:52, 1.23s/it]
Training 1/1 epoch (loss 2.4771): 42%|████▏ | 527/1250 [12:54<16:11, 1.34s/it]
Training 1/1 epoch (loss 2.6574): 42%|████▏ | 528/1250 [12:56<19:10, 1.59s/it]
Training 1/1 epoch (loss 2.4600): 42%|████▏ | 529/1250 [12:57<16:15, 1.35s/it]
Training 1/1 epoch (loss 2.6247): 42%|████▏ | 530/1250 [12:59<15:55, 1.33s/it]
Training 1/1 epoch (loss 2.3705): 42%|████▏ | 531/1250 [13:00<16:37, 1.39s/it]
Training 1/1 epoch (loss 2.5586): 43%|████▎ | 532/1250 [13:01<16:03, 1.34s/it]
Training 1/1 epoch (loss 2.6245): 43%|████▎ | 533/1250 [13:04<19:27, 1.63s/it]
Training 1/1 epoch (loss 2.7376): 43%|████▎ | 534/1250 [13:04<16:09, 1.35s/it]
Training 1/1 epoch (loss 2.8108): 43%|████▎ | 535/1250 [13:06<15:55, 1.34s/it]
Training 1/1 epoch (loss 2.7653): 43%|████▎ | 536/1250 [13:07<16:28, 1.38s/it]
Training 1/1 epoch (loss 2.8272): 43%|████▎ | 537/1250 [13:08<13:24, 1.13s/it]
Training 1/1 epoch (loss 2.4811): 43%|████▎ | 538/1250 [13:09<14:18, 1.21s/it]
Training 1/1 epoch (loss 2.6670): 43%|████▎ | 539/1250 [13:10<15:04, 1.27s/it]
Training 1/1 epoch (loss 2.7530): 43%|████▎ | 540/1250 [13:11<12:05, 1.02s/it]
Training 1/1 epoch (loss 2.6119): 43%|████▎ | 541/1250 [13:13<15:18, 1.29s/it]
Training 1/1 epoch (loss 2.6061): 43%|████▎ | 542/1250 [13:14<15:52, 1.35s/it]
Training 1/1 epoch (loss 2.4441): 43%|████▎ | 543/1250 [13:16<15:26, 1.31s/it]
Training 1/1 epoch (loss 2.5208): 44%|████▎ | 544/1250 [13:17<16:22, 1.39s/it]
Training 1/1 epoch (loss 2.5969): 44%|████▎ | 545/1250 [13:19<18:43, 1.59s/it]
Training 1/1 epoch (loss 2.6844): 44%|████▎ | 546/1250 [13:20<15:22, 1.31s/it]
Training 1/1 epoch (loss 2.6625): 44%|████▍ | 547/1250 [13:22<17:22, 1.48s/it]
Training 1/1 epoch (loss 2.7248): 44%|████▍ | 548/1250 [13:24<20:34, 1.76s/it]
Training 1/1 epoch (loss 2.8320): 44%|████▍ | 549/1250 [13:25<15:59, 1.37s/it]
Training 1/1 epoch (loss 2.6510): 44%|████▍ | 550/1250 [13:26<16:59, 1.46s/it]
Training 1/1 epoch (loss 2.5457): 44%|████▍ | 551/1250 [13:28<18:41, 1.60s/it]
Training 1/1 epoch (loss 2.4311): 44%|████▍ | 552/1250 [13:29<15:59, 1.37s/it]
Training 1/1 epoch (loss 2.7267): 44%|████▍ | 553/1250 [13:31<16:25, 1.41s/it]
Training 1/1 epoch (loss 2.5724): 44%|████▍ | 554/1250 [13:33<18:25, 1.59s/it]
Training 1/1 epoch (loss 2.3507): 44%|████▍ | 555/1250 [13:33<14:58, 1.29s/it]
Training 1/1 epoch (loss 2.5844): 44%|████▍ | 556/1250 [13:35<15:27, 1.34s/it]
Training 1/1 epoch (loss 2.7556): 45%|████▍ | 557/1250 [13:36<16:16, 1.41s/it]
Training 1/1 epoch (loss 2.6244): 45%|████▍ | 558/1250 [13:37<13:49, 1.20s/it]
Training 1/1 epoch (loss 2.6870): 45%|████▍ | 559/1250 [13:39<18:08, 1.58s/it]
Training 1/1 epoch (loss 2.7056): 45%|████▍ | 560/1250 [13:41<18:39, 1.62s/it]
Training 1/1 epoch (loss 2.4816): 45%|████▍ | 561/1250 [13:42<16:01, 1.40s/it]
Training 1/1 epoch (loss 2.6817): 45%|████▍ | 562/1250 [13:44<17:11, 1.50s/it]
Training 1/1 epoch (loss 2.7023): 45%|████▌ | 563/1250 [13:44<14:20, 1.25s/it]
Training 1/1 epoch (loss 2.7654): 45%|████▌ | 564/1250 [13:46<14:07, 1.24s/it]
Training 1/1 epoch (loss 2.6745): 45%|████▌ | 565/1250 [13:48<16:58, 1.49s/it]
Training 1/1 epoch (loss 2.6957): 45%|████▌ | 566/1250 [13:49<16:38, 1.46s/it]
Training 1/1 epoch (loss 2.6633): 45%|████▌ | 567/1250 [13:50<14:31, 1.28s/it]
Training 1/1 epoch (loss 2.5755): 45%|████▌ | 568/1250 [13:52<18:58, 1.67s/it]
Training 1/1 epoch (loss 2.7143): 46%|████▌ | 569/1250 [13:53<16:23, 1.44s/it]
Training 1/1 epoch (loss 2.5625): 46%|████▌ | 570/1250 [13:55<16:04, 1.42s/it]
Training 1/1 epoch (loss 2.6036): 46%|████▌ | 571/1250 [13:57<17:39, 1.56s/it]
Training 1/1 epoch (loss 2.7195): 46%|████▌ | 572/1250 [13:57<13:49, 1.22s/it]
Training 1/1 epoch (loss 2.6236): 46%|████▌ | 573/1250 [13:58<14:08, 1.25s/it]
Training 1/1 epoch (loss 2.6549): 46%|████▌ | 574/1250 [14:00<15:06, 1.34s/it]
Training 1/1 epoch (loss 2.7592): 46%|████▌ | 575/1250 [14:01<12:47, 1.14s/it]
Training 1/1 epoch (loss 2.5464): 46%|████▌ | 576/1250 [14:02<14:13, 1.27s/it]
Training 1/1 epoch (loss 2.5353): 46%|████▌ | 577/1250 [14:04<15:34, 1.39s/it]
Training 1/1 epoch (loss 2.7078): 46%|████▌ | 578/1250 [14:05<13:34, 1.21s/it]
Training 1/1 epoch (loss 2.8384): 46%|████▋ | 579/1250 [14:06<15:37, 1.40s/it]
Training 1/1 epoch (loss 2.7911): 46%|████▋ | 580/1250 [14:07<13:31, 1.21s/it]
Training 1/1 epoch (loss 2.5906): 46%|████▋ | 581/1250 [14:09<15:29, 1.39s/it]
Training 1/1 epoch (loss 2.5244): 47%|████▋ | 582/1250 [14:11<16:55, 1.52s/it]
Training 1/1 epoch (loss 2.4629): 47%|████▋ | 583/1250 [14:12<14:32, 1.31s/it]
Training 1/1 epoch (loss 2.5469): 47%|████▋ | 584/1250 [14:14<16:38, 1.50s/it]
Training 1/1 epoch (loss 2.6817): 47%|████▋ | 585/1250 [14:16<18:01, 1.63s/it]
Training 1/1 epoch (loss 2.5131): 47%|████▋ | 586/1250 [14:16<14:19, 1.29s/it]
Training 1/1 epoch (loss 2.5664): 47%|████▋ | 587/1250 [14:17<14:46, 1.34s/it]
Training 1/1 epoch (loss 2.3229): 47%|████▋ | 588/1250 [14:20<18:31, 1.68s/it]
Training 1/1 epoch (loss 2.4710): 47%|████▋ | 589/1250 [14:21<15:26, 1.40s/it]
Training 1/1 epoch (loss 2.6074): 47%|████▋ | 590/1250 [14:23<17:25, 1.58s/it]
Training 1/1 epoch (loss 2.4835): 47%|████▋ | 591/1250 [14:24<15:38, 1.42s/it]
Training 1/1 epoch (loss 2.6145): 47%|████▋ | 592/1250 [14:26<16:44, 1.53s/it]
Training 1/1 epoch (loss 2.7174): 47%|████▋ | 593/1250 [14:27<16:50, 1.54s/it]
Training 1/1 epoch (loss 2.5328): 48%|████▊ | 594/1250 [14:28<13:46, 1.26s/it]
Training 1/1 epoch (loss 2.4997): 48%|████▊ | 595/1250 [14:30<16:36, 1.52s/it]
Training 1/1 epoch (loss 2.5500): 48%|████▊ | 596/1250 [14:32<18:08, 1.66s/it]
Training 1/1 epoch (loss 2.6454): 48%|████▊ | 597/1250 [14:33<15:00, 1.38s/it]
Training 1/1 epoch (loss 2.5811): 48%|████▊ | 598/1250 [14:33<13:22, 1.23s/it]
Training 1/1 epoch (loss 2.6443): 48%|████▊ | 599/1250 [14:36<16:19, 1.50s/it]
Training 1/1 epoch (loss 2.6103): 48%|████▊ | 600/1250 [14:36<13:29, 1.24s/it]
Training 1/1 epoch (loss 2.7919): 48%|████▊ | 601/1250 [14:38<15:58, 1.48s/it]
Training 1/1 epoch (loss 2.5110): 48%|████▊ | 602/1250 [14:40<16:02, 1.49s/it]
Training 1/1 epoch (loss 2.4821): 48%|████▊ | 603/1250 [14:41<16:05, 1.49s/it]
Training 1/1 epoch (loss 2.6708): 48%|████▊ | 604/1250 [14:43<16:16, 1.51s/it]
Training 1/1 epoch (loss 2.5681): 48%|████▊ | 605/1250 [14:44<15:26, 1.44s/it]
Training 1/1 epoch (loss 2.7022): 48%|████▊ | 606/1250 [14:45<15:13, 1.42s/it]
Training 1/1 epoch (loss 2.4906): 49%|████▊ | 607/1250 [14:47<16:46, 1.57s/it]
Training 1/1 epoch (loss 2.8043): 49%|████▊ | 608/1250 [14:49<15:40, 1.47s/it]
Training 1/1 epoch (loss 2.5385): 49%|████▊ | 608/1250 
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 597/1250 [14:33<15:00, 1.38s/it] Training 1/1 epoch (loss 2.5811): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 598/1250 [14:33<13:22, 1.23s/it] Training 1/1 epoch (loss 2.6443): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 598/1250 [14:36<13:22, 1.23s/it] Training 1/1 epoch (loss 2.6443): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 599/1250 [14:36<16:19, 1.50s/it] Training 1/1 epoch (loss 2.6103): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 599/1250 [14:36<16:19, 1.50s/it] Training 1/1 epoch (loss 2.6103): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 600/1250 [14:36<13:29, 1.24s/it] Training 1/1 epoch (loss 2.7919): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 600/1250 [14:38<13:29, 1.24s/it] Training 1/1 epoch (loss 2.7919): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 601/1250 [14:38<15:58, 1.48s/it] Training 1/1 epoch (loss 2.5110): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 601/1250 [14:40<15:58, 1.48s/it] Training 1/1 epoch (loss 2.5110): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 602/1250 [14:40<16:02, 1.49s/it] Training 1/1 epoch (loss 2.4821): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 602/1250 [14:41<16:02, 1.49s/it] Training 1/1 epoch (loss 2.4821): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 603/1250 [14:41<16:05, 1.49s/it] Training 1/1 epoch (loss 2.6708): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 603/1250 [14:43<16:05, 1.49s/it] Training 1/1 epoch (loss 2.6708): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 604/1250 [14:43<16:16, 1.51s/it] Training 1/1 epoch (loss 2.5681): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 604/1250 [14:44<16:16, 1.51s/it] Training 1/1 epoch (loss 2.5681): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 605/1250 [14:44<15:26, 1.44s/it] Training 1/1 epoch (loss 2.7022): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 605/1250 [14:45<15:26, 1.44s/it] Training 1/1 epoch (loss 2.7022): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 606/1250 [14:45<15:13, 1.42s/it] Training 1/1 epoch (loss 2.4906): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 606/1250 [14:47<15:13, 1.42s/it] Training 1/1 epoch (loss 2.4906): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 607/1250 [14:47<16:46, 1.57s/it] Training 1/1 epoch (loss 2.8043): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 607/1250 [14:49<16:46, 1.57s/it] Training 1/1 epoch (loss 2.8043): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 608/1250 [14:49<15:40, 1.47s/it] Training 1/1 epoch (loss 2.5385): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 608/1250 
[14:50<15:40, 1.47s/it] Training 1/1 epoch (loss 2.5385): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 609/1250 [14:50<16:53, 1.58s/it] Training 1/1 epoch (loss 2.6402): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 609/1250 [14:52<16:53, 1.58s/it] Training 1/1 epoch (loss 2.6402): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 610/1250 [14:52<16:44, 1.57s/it] Training 1/1 epoch (loss 2.6810): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 610/1250 [14:53<16:44, 1.57s/it] Training 1/1 epoch (loss 2.6810): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 611/1250 [14:53<13:49, 1.30s/it] Training 1/1 epoch (loss 2.4853): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 611/1250 [14:54<13:49, 1.30s/it] Training 1/1 epoch (loss 2.4853): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 612/1250 [14:54<14:57, 1.41s/it] Training 1/1 epoch (loss 2.7313): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 612/1250 [14:56<14:57, 1.41s/it] Training 1/1 epoch (loss 2.7313): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 613/1250 [14:56<14:29, 1.36s/it] Training 1/1 epoch (loss 2.7618): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 613/1250 [14:56<14:29, 1.36s/it] Training 1/1 epoch (loss 2.7618): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 614/1250 [14:56<11:50, 1.12s/it] Training 1/1 epoch (loss 2.5682): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 614/1250 [14:58<11:50, 1.12s/it] Training 1/1 epoch (loss 2.5682): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 615/1250 [14:58<13:59, 1.32s/it] Training 1/1 epoch (loss 2.6902): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 615/1250 [15:00<13:59, 1.32s/it] Training 1/1 epoch (loss 2.6902): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 616/1250 [15:00<16:49, 1.59s/it] Training 1/1 epoch (loss 2.2787): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 616/1250 [15:01<16:49, 1.59s/it] Training 1/1 epoch (loss 2.2787): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 617/1250 [15:01<14:08, 1.34s/it] Training 1/1 epoch (loss 2.5914): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 617/1250 [15:03<14:08, 1.34s/it] Training 1/1 epoch (loss 2.5914): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 618/1250 [15:03<15:03, 1.43s/it] Training 1/1 epoch (loss 2.4870): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 618/1250 [15:04<15:03, 1.43s/it] Training 1/1 epoch (loss 2.4870): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 619/1250 [15:04<14:01, 1.33s/it] Training 1/1 epoch (loss 2.5183): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 619/1250 [15:05<14:01, 1.33s/it] Training 1/1 
epoch (loss 2.5183): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 620/1250 [15:05<14:42, 1.40s/it] Training 1/1 epoch (loss 2.4669): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 620/1250 [15:08<14:42, 1.40s/it] Training 1/1 epoch (loss 2.4669): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 621/1250 [15:08<17:54, 1.71s/it] Training 1/1 epoch (loss 2.7508): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 621/1250 [15:09<17:54, 1.71s/it] Training 1/1 epoch (loss 2.7508): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 622/1250 [15:09<15:27, 1.48s/it] Training 1/1 epoch (loss 2.6511): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 622/1250 [15:11<15:27, 1.48s/it] Training 1/1 epoch (loss 2.6511): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 623/1250 [15:11<18:34, 1.78s/it] Training 1/1 epoch (loss 2.5640): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 623/1250 [15:14<18:34, 1.78s/it] Training 1/1 epoch (loss 2.5640): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 624/1250 [15:14<21:14, 2.04s/it] Training 1/1 epoch (loss 2.6227): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 624/1250 [15:15<21:14, 2.04s/it] Training 1/1 epoch (loss 2.6227): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 625/1250 [15:15<18:11, 1.75s/it] Training 1/1 epoch (loss 2.6607): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 625/1250 [15:16<18:11, 1.75s/it] Training 1/1 epoch (loss 2.6607): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 626/1250 [15:16<17:53, 1.72s/it] Training 1/1 epoch (loss 2.3666): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 626/1250 [15:18<17:53, 1.72s/it] Training 1/1 epoch (loss 2.3666): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 627/1250 [15:18<17:08, 1.65s/it] Training 1/1 epoch (loss 2.5780): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 627/1250 [15:20<17:08, 1.65s/it] Training 1/1 epoch (loss 2.5780): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 628/1250 [15:20<17:41, 1.71s/it] Training 1/1 epoch (loss 2.7759): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 628/1250 [15:21<17:41, 1.71s/it] Training 1/1 epoch (loss 2.7759): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 629/1250 [15:21<16:50, 1.63s/it] Training 1/1 epoch (loss 2.3194): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 629/1250 [15:22<16:50, 1.63s/it] Training 1/1 epoch (loss 2.3194): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 630/1250 [15:22<13:33, 1.31s/it] Training 1/1 epoch (loss 2.6304): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 630/1250 [15:23<13:33, 1.31s/it] Training 1/1 epoch (loss 2.6304): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 
| 631/1250 [15:23<13:50, 1.34s/it] Training 1/1 epoch (loss 2.7322): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 631/1250 [15:25<13:50, 1.34s/it] Training 1/1 epoch (loss 2.7322): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 632/1250 [15:25<15:19, 1.49s/it] Training 1/1 epoch (loss 2.6277): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 632/1250 [15:26<15:19, 1.49s/it] Training 1/1 epoch (loss 2.6277): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 633/1250 [15:26<13:24, 1.30s/it] Training 1/1 epoch (loss 2.5903): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 633/1250 [15:27<13:24, 1.30s/it] Training 1/1 epoch (loss 2.5903): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 634/1250 [15:27<13:54, 1.35s/it] Training 1/1 epoch (loss 2.4531): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 634/1250 [15:30<13:54, 1.35s/it] Training 1/1 epoch (loss 2.4531): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 635/1250 [15:30<17:13, 1.68s/it] Training 1/1 epoch (loss 2.4080): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 635/1250 [15:30<17:13, 1.68s/it] Training 1/1 epoch (loss 2.4080): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 636/1250 [15:30<14:05, 1.38s/it] Training 1/1 epoch (loss 2.4797): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 636/1250 [15:33<14:05, 1.38s/it] Training 1/1 epoch (loss 2.4797): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 637/1250 [15:33<17:07, 1.68s/it] Training 1/1 epoch (loss 2.4366): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 637/1250 [15:35<17:07, 1.68s/it] Training 1/1 epoch (loss 2.4366): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 638/1250 [15:35<17:49, 1.75s/it] Training 1/1 epoch (loss 2.7567): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 638/1250 [15:36<17:49, 1.75s/it] Training 1/1 epoch (loss 2.7567): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 639/1250 [15:36<15:37, 1.53s/it] Training 1/1 epoch (loss 2.5729): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 639/1250 [15:39<15:37, 1.53s/it] Training 1/1 epoch (loss 2.5729): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 640/1250 [15:39<19:22, 1.91s/it] Training 1/1 epoch (loss 2.5568): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 640/1250 [15:40<19:22, 1.91s/it] Training 1/1 epoch (loss 2.5568): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 641/1250 [15:40<18:00, 1.77s/it] Training 1/1 epoch (loss 2.5435): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 641/1250 [15:42<18:00, 1.77s/it] Training 1/1 epoch (loss 2.5435): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 642/1250 [15:42<18:59, 
1.87s/it] Training 1/1 epoch (loss 2.6269): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 642/1250 [15:44<18:59, 1.87s/it] Training 1/1 epoch (loss 2.6269): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 643/1250 [15:44<18:21, 1.81s/it] Training 1/1 epoch (loss 2.5830): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 643/1250 [15:45<18:21, 1.81s/it] Training 1/1 epoch (loss 2.5830): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 644/1250 [15:45<15:06, 1.50s/it] Training 1/1 epoch (loss 2.4615): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 644/1250 [15:46<15:06, 1.50s/it] Training 1/1 epoch (loss 2.4615): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 645/1250 [15:46<15:23, 1.53s/it] Training 1/1 epoch (loss 2.1903): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 645/1250 [15:47<15:23, 1.53s/it] Training 1/1 epoch (loss 2.1903): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 646/1250 [15:47<13:50, 1.37s/it] Training 1/1 epoch (loss 2.5423): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 646/1250 [15:48<13:50, 1.37s/it] Training 1/1 epoch (loss 2.5423): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 647/1250 [15:48<11:24, 1.14s/it] Training 1/1 epoch (loss 2.7382): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 647/1250 [15:50<11:24, 1.14s/it] Training 1/1 epoch (loss 2.7382): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 648/1250 [15:50<15:37, 1.56s/it] Training 1/1 epoch (loss 2.4365): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 648/1250 [15:52<15:37, 1.56s/it] Training 1/1 epoch (loss 2.4365): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 649/1250 [15:52<16:08, 1.61s/it] Training 1/1 epoch (loss 2.7005): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 649/1250 [15:53<16:08, 1.61s/it] Training 1/1 epoch (loss 2.7005): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 650/1250 [15:53<15:31, 1.55s/it] Training 1/1 epoch (loss 2.6318): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 650/1250 [15:54<15:31, 1.55s/it] Training 1/1 epoch (loss 2.6318): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 651/1250 [15:54<14:01, 1.40s/it] Training 1/1 epoch (loss 2.4586): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 651/1250 [15:56<14:01, 1.40s/it] Training 1/1 epoch (loss 2.4586): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 652/1250 [15:56<14:33, 1.46s/it] Training 1/1 epoch (loss 2.4828): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 652/1250 [15:58<14:33, 1.46s/it] Training 1/1 epoch (loss 2.4828): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– 
| 653/1250 [15:58<14:35, 1.47s/it] Training 1/1 epoch (loss 2.4164): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 653/1250 [15:59<14:35, 1.47s/it] Training 1/1 epoch (loss 2.4164): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 654/1250 [15:59<14:56, 1.50s/it] Training 1/1 epoch (loss 2.5949): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 654/1250 [16:01<14:56, 1.50s/it] Training 1/1 epoch (loss 2.5949): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 655/1250 [16:01<14:37, 1.47s/it] Training 1/1 epoch (loss 2.7660): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 655/1250 [16:02<14:37, 1.47s/it] Training 1/1 epoch (loss 2.7660): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 656/1250 [16:02<13:02, 1.32s/it] Training 1/1 epoch (loss 2.7274): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 656/1250 [16:03<13:02, 1.32s/it] Training 1/1 epoch (loss 2.7274): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 657/1250 [16:03<13:57, 1.41s/it] Training 1/1 epoch (loss 2.7435): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 657/1250 [16:04<13:57, 1.41s/it] Training 1/1 epoch (loss 2.7435): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 658/1250 [16:04<11:36, 1.18s/it] Training 1/1 epoch (loss 2.4963): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 658/1250 [16:06<11:36, 1.18s/it] Training 1/1 epoch (loss 2.4963): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 659/1250 [16:06<15:39, 1.59s/it] Training 1/1 epoch (loss 2.2754): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 659/1250 [16:09<15:39, 1.59s/it] Training 1/1 epoch (loss 2.2754): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 660/1250 [16:09<18:05, 1.84s/it] Training 1/1 epoch (loss 2.4492): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 660/1250 [16:09<18:05, 1.84s/it] Training 1/1 epoch (loss 2.4492): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 661/1250 [16:09<14:02, 1.43s/it] Training 1/1 epoch (loss 2.5118): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 661/1250 [16:12<14:02, 1.43s/it] Training 1/1 epoch (loss 2.5118): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 662/1250 [16:12<16:55, 1.73s/it] Training 1/1 epoch (loss 2.6008): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 662/1250 [16:13<16:55, 1.73s/it] Training 1/1 epoch (loss 2.6008): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 663/1250 [16:13<16:42, 1.71s/it] Training 1/1 epoch (loss 2.6699): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 663/1250 [16:14<16:42, 1.71s/it] Training 1/1 epoch (loss 
2.6699): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 664/1250 [16:14<13:32, 1.39s/it] Training 1/1 epoch (loss 2.6048): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 664/1250 [16:16<13:32, 1.39s/it] Training 1/1 epoch (loss 2.6048): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 665/1250 [16:16<14:22, 1.47s/it] Training 1/1 epoch (loss 2.5760): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 665/1250 [16:18<14:22, 1.47s/it] Training 1/1 epoch (loss 2.5760): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 666/1250 [16:18<15:55, 1.64s/it] Training 1/1 epoch (loss 2.7904): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 666/1250 [16:18<15:55, 1.64s/it] Training 1/1 epoch (loss 2.7904): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 667/1250 [16:18<12:42, 1.31s/it] Training 1/1 epoch (loss 2.6637): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 667/1250 [16:21<12:42, 1.31s/it] Training 1/1 epoch (loss 2.6637): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 668/1250 [16:21<16:14, 1.67s/it] Training 1/1 epoch (loss 2.6271): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 668/1250 [16:22<16:14, 1.67s/it] Training 1/1 epoch (loss 2.6271): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 669/1250 [16:22<15:02, 1.55s/it] Training 1/1 epoch (loss 2.8648): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 669/1250 [16:23<15:02, 1.55s/it] Training 1/1 epoch (loss 2.8648): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 670/1250 [16:23<12:32, 1.30s/it] Training 1/1 epoch (loss 2.8592): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 670/1250 [16:24<12:32, 1.30s/it] Training 1/1 epoch (loss 2.8592): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 671/1250 [16:24<13:05, 1.36s/it] Training 1/1 epoch (loss 2.7948): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 671/1250 [16:26<13:05, 1.36s/it] Training 1/1 epoch (loss 2.7948): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/1250 [16:26<14:14, 1.48s/it] Training 1/1 epoch (loss 2.4465): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/1250 [16:27<14:14, 1.48s/it] Training 1/1 epoch (loss 2.4465): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/1250 [16:27<14:21, 1.49s/it] Training 1/1 epoch (loss 2.7113): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/1250 [16:30<14:21, 1.49s/it] Training 1/1 epoch (loss 2.7113): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/1250 [16:30<16:44, 1.74s/it] Training 1/1 epoch (loss 2.3839): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/1250 [16:31<16:44, 1.74s/it] 
Training 1/1 epoch (loss 2.3839): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/1250 [16:31<14:16, 1.49s/it] Training 1/1 epoch (loss 2.6224): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/1250 [16:33<14:16, 1.49s/it] Training 1/1 epoch (loss 2.6224): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/1250 [16:33<15:13, 1.59s/it] Training 1/1 epoch (loss 2.4234): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/1250 [16:35<15:13, 1.59s/it] Training 1/1 epoch (loss 2.4234): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/1250 [16:35<17:44, 1.86s/it] Training 1/1 epoch (loss 2.6902): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/1250 [16:36<17:44, 1.86s/it] Training 1/1 epoch (loss 2.6902): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/1250 [16:36<14:40, 1.54s/it] Training 1/1 epoch (loss 2.6778): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/1250 [16:38<14:40, 1.54s/it] Training 1/1 epoch (loss 2.6778): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/1250 [16:38<15:59, 1.68s/it] Training 1/1 epoch (loss 2.6037): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/1250 [16:39<15:59, 1.68s/it] Training 1/1 epoch (loss 2.6037): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/1250 [16:39<15:40, 1.65s/it] Training 1/1 epoch (loss 2.4749): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/1250 [16:40<15:40, 1.65s/it] Training 1/1 epoch (loss 2.4749): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 681/1250 [16:40<13:07, 1.38s/it] Training 1/1 epoch (loss 2.6862): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 681/1250 [16:42<13:07, 1.38s/it] Training 1/1 epoch (loss 2.6862): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 682/1250 [16:42<15:20, 1.62s/it] Training 1/1 epoch (loss 2.7036): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 682/1250 [16:44<15:20, 1.62s/it] Training 1/1 epoch (loss 2.7036): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 683/1250 [16:44<15:06, 1.60s/it] Training 1/1 epoch (loss 2.6619): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 683/1250 [16:45<15:06, 1.60s/it] Training 1/1 epoch (loss 2.6619): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 684/1250 [16:45<14:20, 1.52s/it] Training 1/1 epoch (loss 2.3929): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 684/1250 [16:47<14:20, 1.52s/it] Training 1/1 epoch (loss 2.3929): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 685/1250 [16:47<13:55, 1.48s/it] Training 1/1 epoch (loss 2.6683): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 685/1250 
[16:48<13:55, 1.48s/it] Training 1/1 epoch (loss 2.6683): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 686/1250 [16:48<14:00, 1.49s/it] Training 1/1 epoch (loss 2.4910): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 686/1250 [16:49<14:00, 1.49s/it] Training 1/1 epoch (loss 2.4910): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 687/1250 [16:49<12:48, 1.37s/it] Training 1/1 epoch (loss 2.6414): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 687/1250 [16:51<12:48, 1.37s/it] Training 1/1 epoch (loss 2.6414): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 688/1250 [16:51<14:38, 1.56s/it] Training 1/1 epoch (loss 2.6943): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 688/1250 [16:52<14:38, 1.56s/it] Training 1/1 epoch (loss 2.6943): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 689/1250 [16:52<11:56, 1.28s/it] Training 1/1 epoch (loss 2.7189): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 689/1250 [16:54<11:56, 1.28s/it] Training 1/1 epoch (loss 2.7189): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 690/1250 [16:54<15:23, 1.65s/it] Training 1/1 epoch (loss 2.5563): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 690/1250 [16:56<15:23, 1.65s/it] Training 1/1 epoch (loss 2.5563): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 691/1250 [16:56<15:52, 1.70s/it] Training 1/1 epoch (loss 2.6491): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 691/1250 [16:57<15:52, 1.70s/it] Training 1/1 epoch (loss 2.6491): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 692/1250 [16:57<13:19, 1.43s/it] Training 1/1 epoch (loss 2.6049): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 692/1250 [16:58<13:19, 1.43s/it] Training 1/1 epoch (loss 2.6049): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 693/1250 [16:58<13:16, 1.43s/it] Training 1/1 epoch (loss 2.7655): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 693/1250 [16:59<13:16, 1.43s/it] Training 1/1 epoch (loss 2.7655): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 694/1250 [16:59<12:01, 1.30s/it] Training 1/1 epoch (loss 2.5687): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 694/1250 [17:01<12:01, 1.30s/it] Training 1/1 epoch (loss 2.5687): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 695/1250 [17:01<12:31, 1.35s/it] Training 1/1 epoch (loss 2.8039): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 695/1250 [17:03<12:31, 1.35s/it] Training 1/1 epoch (loss 2.8039): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 696/1250 [17:03<13:34, 1.47s/it] Training 1/1 epoch (loss 2.6238): 
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 696/1250 [17:03<13:34, 1.47s/it] Training 1/1 epoch (loss 2.6238): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 697/1250 [17:03<11:38, 1.26s/it] Training 1/1 epoch (loss 2.4462): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 697/1250 [17:04<11:38, 1.26s/it] Training 1/1 epoch (loss 2.4462): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 698/1250 [17:04<10:52, 1.18s/it] Training 1/1 epoch (loss 2.7596): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 698/1250 [17:07<10:52, 1.18s/it] Training 1/1 epoch (loss 2.7596): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 699/1250 [17:07<14:12, 1.55s/it] Training 1/1 epoch (loss 2.8827): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 699/1250 [17:07<14:12, 1.55s/it] Training 1/1 epoch (loss 2.8827): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 700/1250 [17:07<11:15, 1.23s/it] Training 1/1 epoch (loss 2.7614): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 700/1250 [17:09<11:15, 1.23s/it] Training 1/1 epoch (loss 2.7614): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 701/1250 [17:09<11:34, 1.27s/it] Training 1/1 epoch (loss 2.5256): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 701/1250 [17:10<11:34, 1.27s/it] Training 1/1 epoch (loss 2.5256): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 702/1250 [17:10<12:43, 1.39s/it] Training 1/1 epoch (loss 2.6495): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 702/1250 [17:12<12:43, 1.39s/it] Training 1/1 epoch (loss 2.6495): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 703/1250 [17:12<12:35, 1.38s/it] Training 1/1 epoch (loss 2.4158): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 703/1250 [17:14<12:35, 1.38s/it] Training 1/1 epoch (loss 2.4158): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 704/1250 [17:14<16:11, 1.78s/it] Training 1/1 epoch (loss 2.6337): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 704/1250 [17:15<16:11, 1.78s/it] Training 1/1 epoch (loss 2.6337): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 705/1250 [17:15<14:06, 1.55s/it] Training 1/1 epoch (loss 2.7225): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 705/1250 [17:18<14:06, 1.55s/it] Training 1/1 epoch (loss 2.7225): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 706/1250 [17:18<15:49, 1.75s/it] Training 1/1 epoch (loss 2.6764): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 706/1250 [17:20<15:49, 1.75s/it] Training 1/1 epoch (loss 2.6764): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 707/1250 [17:20<17:44, 1.96s/it] Training 
1/1 epoch (loss 2.7126): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 707/1250 [17:21<17:44, 1.96s/it] Training 1/1 epoch (loss 2.7126): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 708/1250 [17:21<13:42, 1.52s/it] Training 1/1 epoch (loss 2.7566): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 708/1250 [17:22<13:42, 1.52s/it] Training 1/1 epoch (loss 2.7566): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 709/1250 [17:22<14:54, 1.65s/it] Training 1/1 epoch (loss 2.5558): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 709/1250 [17:25<14:54, 1.65s/it] Training 1/1 epoch (loss 2.5558): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 710/1250 [17:25<16:35, 1.84s/it] Training 1/1 epoch (loss 2.6972): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 710/1250 [17:25<16:35, 1.84s/it] Training 1/1 epoch (loss 2.6972): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 711/1250 [17:25<13:15, 1.48s/it] Training 1/1 epoch (loss 2.7081): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 711/1250 [17:28<13:15, 1.48s/it] Training 1/1 epoch (loss 2.7081): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 712/1250 [17:28<15:37, 1.74s/it] Training 1/1 epoch (loss 2.4670): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 712/1250 [17:29<15:37, 1.74s/it] Training 1/1 epoch (loss 2.4670): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 713/1250 [17:29<15:11, 1.70s/it] Training 1/1 epoch (loss 2.8242): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 713/1250 [17:31<15:11, 1.70s/it] Training 1/1 epoch (loss 2.8242): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 714/1250 [17:31<15:19, 1.72s/it] Training 1/1 epoch (loss 2.7099): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 714/1250 [17:33<15:19, 1.72s/it] Training 1/1 epoch (loss 2.7099): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 715/1250 [17:33<16:54, 1.90s/it] Training 1/1 epoch (loss 2.3441): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 715/1250 [17:34<16:54, 1.90s/it] Training 1/1 epoch (loss 2.3441): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/1250 [17:34<13:04, 1.47s/it] Training 1/1 epoch (loss 2.6468): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/1250 [17:36<13:04, 1.47s/it] Training 1/1 epoch (loss 2.6468): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/1250 [17:36<13:50, 1.56s/it] Training 1/1 epoch (loss 2.6536): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/1250 [17:38<13:50, 1.56s/it] Training 1/1 epoch (loss 2.6536): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/1250 
[17:38<14:44, 1.66s/it] Training 1/1 epoch (loss 2.6858): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/1250 [17:38<14:44, 1.66s/it] Training 1/1 epoch (loss 2.6858): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 719/1250 [17:38<11:32, 1.30s/it] Training 1/1 epoch (loss 2.7392): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 719/1250 [17:41<11:32, 1.30s/it] Training 1/1 epoch (loss 2.7392): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 720/1250 [17:41<14:36, 1.65s/it] Training 1/1 epoch (loss 2.5389): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 720/1250 [17:42<14:36, 1.65s/it] Training 1/1 epoch (loss 2.5389): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 721/1250 [17:42<14:54, 1.69s/it] Training 1/1 epoch (loss 2.5279): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 721/1250 [17:43<14:54, 1.69s/it] Training 1/1 epoch (loss 2.5279): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 722/1250 [17:43<12:49, 1.46s/it] Training 1/1 epoch (loss 2.4048): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 722/1250 [17:45<12:49, 1.46s/it] Training 1/1 epoch (loss 2.4048): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 723/1250 [17:45<13:15, 1.51s/it] Training 1/1 epoch (loss 2.5238): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 723/1250 [17:46<13:15, 1.51s/it] Training 1/1 epoch (loss 2.5238): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 724/1250 [17:46<13:29, 1.54s/it] Training 1/1 epoch (loss 2.8951): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 724/1250 [17:47<13:29, 1.54s/it] Training 1/1 epoch (loss 2.8951): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 725/1250 [17:47<11:00, 1.26s/it] Training 1/1 epoch (loss 2.5546): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 725/1250 [17:49<11:00, 1.26s/it] Training 1/1 epoch (loss 2.5546): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 726/1250 [17:49<14:08, 1.62s/it] Training 1/1 epoch (loss 2.4951): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 726/1250 [17:52<14:08, 1.62s/it] Training 1/1 epoch (loss 2.4951): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/1250 [17:52<15:14, 1.75s/it] Training 1/1 epoch (loss 2.4851): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/1250 [17:52<15:14, 1.75s/it] Training 1/1 epoch (loss 2.4851): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/1250 [17:52<12:42, 1.46s/it] Training 1/1 epoch (loss 2.6863): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/1250 [17:55<12:42, 1.46s/it] Training 1/1 epoch (loss 2.6863): 
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/1250 [17:55<14:51, 1.71s/it] Training 1/1 epoch (loss 2.6912): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/1250 [17:56<14:51, 1.71s/it] Training 1/1 epoch (loss 2.6912): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/1250 [17:56<12:56, 1.49s/it] Training 1/1 epoch (loss 2.6046): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/1250 [17:57<12:56, 1.49s/it] Training 1/1 epoch (loss 2.6046): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/1250 [17:57<12:48, 1.48s/it] Training 1/1 epoch (loss 2.3552): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/1250 [17:59<12:48, 1.48s/it] Training 1/1 epoch (loss 2.3552): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/1250 [17:59<13:59, 1.62s/it] Training 1/1 epoch (loss 2.4945): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/1250 [18:00<13:59, 1.62s/it] Training 1/1 epoch (loss 2.4945): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/1250 [18:00<12:40, 1.47s/it] Training 1/1 epoch (loss 2.6101): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/1250 [18:02<12:40, 1.47s/it] Training 1/1 epoch (loss 2.6101): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/1250 [18:02<13:46, 1.60s/it] Training 1/1 epoch (loss 2.5340): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/1250 [18:04<13:46, 1.60s/it] Training 1/1 epoch (loss 2.5340): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 735/1250 [18:04<13:25, 1.56s/it] Training 1/1 epoch (loss 2.7817): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 735/1250 [18:04<13:25, 1.56s/it] Training 1/1 epoch (loss 2.7817): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 736/1250 [18:04<10:54, 1.27s/it] Training 1/1 epoch (loss 2.4524): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 736/1250 [18:07<10:54, 1.27s/it] Training 1/1 epoch (loss 2.4524): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 737/1250 [18:07<13:50, 1.62s/it] Training 1/1 epoch (loss 2.5512): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 737/1250 [18:08<13:50, 1.62s/it] Training 1/1 epoch (loss 2.5512): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 738/1250 [18:08<14:37, 1.71s/it] Training 1/1 epoch (loss 2.6145): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 738/1250 [18:09<14:37, 1.71s/it] Training 1/1 epoch (loss 2.6145): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/1250 [18:09<11:13, 1.32s/it] Training 1/1 epoch (loss 2.5717): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/1250 [18:11<11:13, 1.32s/it] Training 
1/1 epoch (loss 2.5717): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/1250 [18:11<12:27, 1.47s/it] Training 1/1 epoch (loss 2.5406): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/1250 [18:12<12:27, 1.47s/it] Training 1/1 epoch (loss 2.5406): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/1250 [18:12<11:46, 1.39s/it] Training 1/1 epoch (loss 2.6900): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/1250 [18:13<11:46, 1.39s/it] Training 1/1 epoch (loss 2.6900): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/1250 [18:13<09:50, 1.16s/it] Training 1/1 epoch (loss 2.6204): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/1250 [18:14<09:50, 1.16s/it] Training 1/1 epoch (loss 2.6204): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/1250 [18:14<10:36, 1.25s/it] Training 1/1 epoch (loss 2.6929): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/1250 [18:17<10:36, 1.25s/it] Training 1/1 epoch (loss 2.6929): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/1250 [18:17<14:11, 1.68s/it] Training 1/1 epoch (loss 2.5914): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/1250 [18:18<14:11, 1.68s/it] Training 1/1 epoch (loss 2.5914): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/1250 [18:18<12:21, 1.47s/it] Training 1/1 epoch (loss 2.6766): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/1250 [18:20<12:21, 1.47s/it] Training 1/1 epoch (loss 2.6766): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/1250 [18:20<14:46, 1.76s/it] Training 1/1 epoch (loss 2.4528): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/1250 [18:21<14:46, 1.76s/it] Training 1/1 epoch (loss 2.4528): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/1250 [18:21<13:32, 1.61s/it] Training 1/1 epoch (loss 2.6642): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/1250 [18:22<13:32, 1.61s/it] Training 1/1 epoch (loss 2.6642): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/1250 [18:22<11:49, 1.41s/it] Training 1/1 epoch (loss 2.4567): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/1250 [18:24<11:49, 1.41s/it] Training 1/1 epoch (loss 2.4567): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/1250 [18:24<13:05, 1.57s/it] Training 1/1 epoch (loss 2.6649): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/1250 [18:26<13:05, 1.57s/it] Training 1/1 epoch (loss 2.6649): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 750/1250 [18:26<12:57, 1.56s/it] Training 1/1 epoch (loss 2.4400): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 750/1250 
[18:27<12:57, 1.56s/it]
Training 1/1 epoch (loss 2.4400): 60%|██████    | 751/1250 [18:27<12:09, 1.46s/it]
Training 1/1 epoch (loss 2.6450): 60%|██████    | 752/1250 [18:29<12:20, 1.49s/it]
Training 1/1 epoch (loss 2.7684): 60%|██████    | 753/1250 [18:29<10:53, 1.31s/it]
Training 1/1 epoch (loss 2.7738): 60%|██████    | 754/1250 [18:31<12:02, 1.46s/it]
Training 1/1 epoch (loss 2.5576): 60%|██████    | 755/1250 [18:33<12:49, 1.56s/it]
Training 1/1 epoch (loss 2.7882): 60%|██████    | 756/1250 [18:34<10:26, 1.27s/it]
Training 1/1 epoch (loss 2.5757): 61%|██████    | 757/1250 [18:35<10:40, 1.30s/it]
Training 1/1 epoch (loss 2.5358): 61%|██████    | 758/1250 [18:37<12:08, 1.48s/it]
Training 1/1 epoch (loss 2.6014): 61%|██████    | 759/1250 [18:38<10:28, 1.28s/it]
Training 1/1 epoch (loss 2.6589): 61%|██████    | 760/1250 [18:40<12:29, 1.53s/it]
Training 1/1 epoch (loss 2.6697): 61%|██████    | 761/1250 [18:42<14:42, 1.80s/it]
Training 1/1 epoch (loss 2.4782): 61%|██████    | 762/1250 [18:43<12:20, 1.52s/it]
Training 1/1 epoch (loss 2.4893): 61%|██████    | 763/1250 [18:45<12:53, 1.59s/it]
Training 1/1 epoch (loss 2.6189): 61%|██████    | 764/1250 [18:46<12:11, 1.50s/it]
Training 1/1 epoch (loss 2.6710): 61%|██████    | 765/1250 [18:47<10:40, 1.32s/it]
Training 1/1 epoch (loss 2.3150): 61%|██████▏   | 766/1250 [18:48<10:46, 1.33s/it]
Training 1/1 epoch (loss 2.4091): 61%|██████▏   | 767/1250 [18:50<11:00, 1.37s/it]
Training 1/1 epoch (loss 2.6093): 61%|██████▏   | 768/1250 [18:51<09:30, 1.18s/it]
Training 1/1 epoch (loss 2.7798): 62%|██████▏   | 769/1250 [18:53<11:20, 1.41s/it]
Training 1/1 epoch (loss 2.6157): 62%|██████▏   | 770/1250 [18:53<09:37, 1.20s/it]
Training 1/1 epoch (loss 2.6638): 62%|██████▏   | 771/1250 [18:55<11:39, 1.46s/it]
Training 1/1 epoch (loss 2.7043): 62%|██████▏   | 772/1250 [18:58<13:40, 1.72s/it]
Training 1/1 epoch (loss 2.5573): 62%|██████▏   | 773/1250 [18:58<10:49, 1.36s/it]
Training 1/1 epoch (loss 2.5263): 62%|██████▏   | 774/1250 [19:00<11:56, 1.51s/it]
Training 1/1 epoch (loss 2.6907): 62%|██████▏   | 775/1250 [19:02<12:18, 1.55s/it]
Training 1/1 epoch (loss 2.5233): 62%|██████▏   | 776/1250 [19:03<11:03, 1.40s/it]
Training 1/1 epoch (loss 2.6244): 62%|██████▏   | 777/1250 [19:04<11:21, 1.44s/it]
Training 1/1 epoch (loss 2.7002): 62%|██████▏   | 778/1250 [19:05<09:26, 1.20s/it]
Training 1/1 epoch (loss 2.5338): 62%|██████▏   | 779/1250 [19:07<11:39, 1.49s/it]
Training 1/1 epoch (loss 2.8167): 62%|██████▏   | 780/1250 [19:09<13:10, 1.68s/it]
Training 1/1 epoch (loss 2.4582): 62%|██████▏   | 781/1250 [19:10<10:33, 1.35s/it]
Training 1/1 epoch (loss 2.6639): 63%|██████▎   | 782/1250 [19:12<11:38, 1.49s/it]
Training 1/1 epoch (loss 2.6254): 63%|██████▎   | 783/1250 [19:13<10:42, 1.38s/it]
Training 1/1 epoch (loss 2.4882): 63%|██████▎   | 784/1250 [19:14<09:42, 1.25s/it]
Training 1/1 epoch (loss 2.7216): 63%|██████▎   | 785/1250 [19:16<12:32, 1.62s/it]
Training 1/1 epoch (loss 2.6348): 63%|██████▎   | 786/1250 [19:17<11:33, 1.49s/it]
Training 1/1 epoch (loss 2.6470): 63%|██████▎   | 787/1250 [19:19<12:20, 1.60s/it]
Training 1/1 epoch (loss 2.4259): 63%|██████▎   | 788/1250 [19:20<11:26, 1.48s/it]
Training 1/1 epoch (loss 2.4005): 63%|██████▎   | 789/1250 [19:22<11:10, 1.46s/it]
Training 1/1 epoch (loss 2.5451): 63%|██████▎   | 790/1250 [19:24<11:51, 1.55s/it]
Training 1/1 epoch (loss 2.5916): 63%|██████▎   | 791/1250 [19:24<10:11, 1.33s/it]
Training 1/1 epoch (loss 2.6291): 63%|██████▎   | 792/1250 [19:26<10:11, 1.34s/it]
Training 1/1 epoch (loss 2.4113): 63%|██████▎   | 793/1250 [19:28<11:26, 1.50s/it]
Training 1/1 epoch (loss 2.5931): 64%|██████▎   | 794/1250 [19:28<09:03, 1.19s/it]
Training 1/1 epoch (loss 2.7272): 64%|██████▎   | 795/1250 [19:30<09:34, 1.26s/it]
Training 1/1 epoch (loss 2.5390): 64%|██████▎   | 796/1250 [19:31<10:40, 1.41s/it]
Training 1/1 epoch (loss 2.4921): 64%|██████▍   | 797/1250 [19:32<08:49, 1.17s/it]
Training 1/1 epoch (loss 2.7276): 64%|██████▍   | 798/1250 [19:34<11:12, 1.49s/it]
Training 1/1 epoch (loss 2.6862): 64%|██████▍   | 799/1250 [19:36<11:16, 1.50s/it]
Training 1/1 epoch (loss 2.6211): 64%|██████▍   | 800/1250 [19:37<09:52, 1.32s/it]
Training 1/1 epoch (loss 2.5910): 64%|██████▍   | 801/1250 [19:38<10:34, 1.41s/it]
Training 1/1 epoch (loss 2.7081): 64%|██████▍   | 802/1250 [19:39<09:14, 1.24s/it]
Training 1/1 epoch (loss 2.6299): 64%|██████▍   | 803/1250 [19:41<10:17, 1.38s/it]
Training 1/1 epoch (loss 2.5030): 64%|██████▍   | 804/1250 [19:43<12:14, 1.65s/it]
Training 1/1 epoch (loss 2.6330): 64%|██████▍   | 805/1250 [19:44<09:54, 1.34s/it]
Training 1/1 epoch (loss 2.5741): 64%|██████▍   | 806/1250 [19:46<11:29, 1.55s/it]
Training 1/1 epoch (loss 2.6079): 65%|██████▍   | 807/1250 [19:47<11:45, 1.59s/it]
Training 1/1 epoch (loss 2.5071): 65%|██████▍   | 808/1250 [19:48<09:37, 1.31s/it]
Training 1/1 epoch (loss 2.5349): 65%|██████▍   | 809/1250 [19:49<09:26, 1.28s/it]
Training 1/1 epoch (loss 2.6618): 65%|██████▍   | 810/1250 [19:51<10:55, 1.49s/it]
Training 1/1 epoch (loss 2.5876): 65%|██████▍   | 811/1250 [19:52<09:05, 1.24s/it]
Training 1/1 epoch (loss 2.6055): 65%|██████▍   | 812/1250 [19:54<11:11, 1.53s/it]
Training 1/1 epoch (loss 2.5211): 65%|██████▌   | 813/1250 [19:56<11:18, 1.55s/it]
Training 1/1 epoch (loss 2.6334): 65%|██████▌   | 814/1250 [19:56<09:02, 1.25s/it]
Training 1/1 epoch (loss 2.6441): 65%|██████▌   | 815/1250 [19:58<09:32, 1.31s/it]
Training 1/1 epoch (loss 2.7660): 65%|██████▌   | 816/1250 [19:59<09:30, 1.31s/it]
Training 1/1 epoch (loss 2.6304): 65%|██████▌   | 817/1250 [20:00<08:03, 1.12s/it]
Training 1/1 epoch (loss 2.4917): 65%|██████▌   | 818/1250 [20:01<08:18, 1.15s/it]
Training 1/1 epoch (loss 2.7647): 66%|██████▌   | 819/1250 [20:02<08:23, 1.17s/it]
Training 1/1 epoch (loss 2.5597): 66%|██████▌   | 820/1250 [20:04<08:54, 1.24s/it]
Training 1/1 epoch (loss 2.4858): 66%|██████▌   | 821/1250 [20:06<11:11, 1.57s/it]
Training 1/1 epoch (loss 2.4307): 66%|██████▌   | 822/1250 [20:07<09:33, 1.34s/it]
Training 1/1 epoch (loss 2.4565): 66%|██████▌   | 823/1250 [20:08<09:28, 1.33s/it]
Training 1/1 epoch (loss 2.4885): 66%|██████▌   | 824/1250 [20:10<11:45, 1.66s/it]
Training 1/1 epoch (loss 2.5281): 66%|██████▌   | 825/1250 [20:11<09:52, 1.39s/it]
Training 1/1 epoch (loss 2.7554): 66%|██████▌   | 826/1250 [20:13<11:11, 1.58s/it]
Training 1/1 epoch (loss 2.6348): 66%|██████▌   | 827/1250 [20:14<10:19, 1.46s/it]
Training 1/1 epoch (loss 2.5573): 66%|██████▌   | 828/1250 [20:16<10:12, 1.45s/it]
Training 1/1 epoch (loss 2.5925): 66%|██████▋   | 829/1250 [20:18<10:52, 1.55s/it]
Training 1/1 epoch (loss 2.5363): 66%|██████▋   | 830/1250 [20:18<09:21, 1.34s/it]
Training 1/1 epoch (loss 2.5350): 66%|██████▋   | 831/1250 [20:19<08:37, 1.24s/it]
Training 1/1 epoch (loss 2.7175): 67%|██████▋   | 832/1250 [20:22<10:30, 1.51s/it]
Training 1/1 epoch (loss 2.5325): 67%|██████▋   | 833/1250 [20:22<09:15, 1.33s/it]
Training 1/1 epoch (loss 2.6689): 67%|██████▋   | 834/1250 [20:24<08:51, 1.28s/it]
Training 1/1 epoch (loss 2.7942): 67%|██████▋   | 835/1250 [20:26<10:15, 1.48s/it]
Training 1/1 epoch (loss 2.5888): 67%|██████▋   | 836/1250 [20:26<08:36, 1.25s/it]
Training 1/1 epoch (loss 2.5762): 67%|██████▋   | 837/1250 [20:28<08:51, 1.29s/it]
Training 1/1 epoch (loss 2.6959): 67%|██████▋   | 838/1250 [20:29<08:55, 1.30s/it]
Training 1/1 epoch (loss 2.6638): 67%|██████▋   | 839/1250 [20:29<07:07, 1.04s/it]
Training 1/1 epoch (loss 2.3857): 67%|██████▋   | 840/1250 [20:31<08:48, 1.29s/it]
Training 1/1 epoch (loss 2.5791): 67%|██████▋   | 841/1250 [20:33<10:00, 1.47s/it]
Training 1/1 epoch (loss 2.7169): 67%|██████▋   | 842/1250 [20:34<08:10, 1.20s/it]
Training 1/1 epoch (loss 2.5354): 67%|██████▋   | 843/1250 [20:35<07:51, 1.16s/it]
Training 1/1 epoch (loss 2.7006): 68%|██████▊   | 844/1250 [20:37<10:12, 1.51s/it]
Training 1/1 epoch (loss 2.5832): 68%|██████▊   | 845/1250 [20:38<09:48, 1.45s/it]
Training 1/1 epoch (loss 2.6906): 68%|██████▊   | 846/1250 [20:40<10:41, 1.59s/it]
Training 1/1 epoch (loss 2.5068): 68%|██████▊   | 847/1250 [20:42<10:35, 1.58s/it]
Training 1/1 epoch (loss 2.6124): 68%|██████▊   | 848/1250 [20:44<10:52, 1.62s/it]
Training 1/1 epoch (loss 2.6230): 68%|██████▊   | 849/1250 [20:45<10:35, 1.59s/it]
Training 1/1 epoch (loss 2.7716): 68%|██████▊   | 850/1250 [20:46<08:25, 1.26s/it]
Training 1/1 epoch (loss 2.5207): 68%|██████▊   | 851/1250 [20:47<09:20, 1.40s/it]
Training 1/1 epoch (loss 2.4372): 68%|██████▊   | 852/1250 [20:49<09:14, 1.39s/it]
Training 1/1 epoch (loss 2.7808): 68%|██████▊   | 853/1250 [20:49<07:18, 1.10s/it]
Training 1/1 epoch (loss 2.5053): 68%|██████▊   | 854/1250 [20:50<07:40, 1.16s/it]
Training 1/1 epoch (loss 2.7193): 68%|██████▊   | 855/1250 [20:53<09:58, 1.51s/it]
Training 1/1 epoch (loss 2.3415): 68%|██████▊   | 856/1250 [20:53<08:09, 1.24s/it]
Training 1/1 epoch (loss 2.6195): 69%|██████▊   | 857/1250 [20:55<08:46, 1.34s/it]
Training 1/1 epoch (loss 2.5837): 69%|██████▊   | 858/1250 [20:56<08:41, 1.33s/it]
Training 1/1 epoch (loss 2.5090): 69%|██████▊   | 859/1250 [20:57<07:28, 1.15s/it]
Training 1/1 epoch (loss 2.5764): 69%|██████▉   | 860/1250 [20:59<08:19, 1.28s/it]
Training 1/1 epoch (loss 2.6460): 69%|██████▉   | 861/1250 [21:00<07:58, 1.23s/it]
Training 1/1 epoch (loss 2.5679): 69%|██████▉   | 862/1250 [21:01<07:49, 1.21s/it]
Training 1/1 epoch (loss 2.6031): 69%|██████▉   | 863/1250 [21:02<08:01, 1.24s/it]
Training 1/1 epoch (loss 2.5348): 69%|██████▉   | 864/1250 [21:03<07:21, 1.14s/it]
Training 1/1 epoch (loss 2.5406): 69%|██████▉   | 865/1250 [21:05<08:58, 1.40s/it]
Training 1/1 epoch (loss 2.4610): 69%|██████▉   | 866/1250 [21:07<09:48, 1.53s/it]
Training 1/1 epoch (loss 2.6391): 69%|██████▉   | 867/1250 [21:08<07:57, 1.25s/it]
Training 1/1 epoch (loss 2.8108): 69%|██████▉   | 868/1250 [21:09<09:09, 1.44s/it]
Training 1/1 epoch (loss 2.4407): 70%|██████▉   | 869/1250 [21:12<11:08, 1.76s/it]
Training 1/1 epoch (loss 2.4503): 70%|██████▉   | 870/1250 [21:12<08:38, 1.36s/it]
Training 1/1 epoch (loss 2.5171): 70%|██████▉   | 871/1250 [21:15<10:37, 1.68s/it]
Training 1/1 epoch (loss 2.5934): 70%|██████▉   | 872/1250 [21:16<10:01, 1.59s/it]
Training 1/1 epoch (loss 2.7718): 70%|██████▉   | 873/1250 [21:17<08:28, 1.35s/it]
Training 1/1 epoch (loss 2.5617): 70%|██████▉   | 874/1250 [21:19<10:34, 1.69s/it]
Training 1/1 epoch (loss 2.6053): 70%|███████   | 875/1250 [21:21<10:18, 1.65s/it]
Training 1/1 epoch (loss 2.6612): 70%|███████   | 876/1250 [21:22<08:55, 1.43s/it]
Training 1/1 epoch (loss 2.7251): 70%|███████   | 877/1250 [21:23<08:39, 1.39s/it]
Training 1/1 epoch (loss 2.8156): 70%|███████   | 878/1250 [21:24<08:22, 1.35s/it]
Training 1/1 epoch (loss 2.6279): 70%|███████   | 879/1250 [21:25<07:33, 1.22s/it]
Training 1/1 epoch (loss 2.7227): 70%|███████   | 880/1250 [21:27<08:10, 1.33s/it]
Training 1/1 epoch (loss 2.4996): 70%|███████   | 881/1250 [21:28<07:23, 1.20s/it]
Training 1/1 epoch (loss 2.6115): 71%|███████   | 882/1250 [21:29<06:33, 1.07s/it]
Training 1/1 epoch (loss 2.7681): 71%|███████   | 883/1250 [21:31<08:10, 1.34s/it]
Training 1/1 epoch (loss 2.6786): 71%|███████   | 884/1250 [21:32<08:06, 1.33s/it]
Training 1/1 epoch (loss 2.5621): 71%|███████   | 885/1250 [21:33<07:56, 1.30s/it]
Training 1/1 epoch (loss 2.6602): 71%|███████   | 886/1250 [21:35<08:55, 1.47s/it]
Training 1/1 epoch (loss 2.5998): 71%|███████   | 887/1250 [21:36<08:02, 1.33s/it]
Training 1/1 epoch (loss 2.5609): 71%|███████   | 888/1250 [21:38<08:52, 1.47s/it]
Training 1/1 epoch (loss 2.6031): 71%|███████   | 889/1250 [21:40<09:23, 1.56s/it]
Training 1/1 epoch (loss 2.6048): 71%|███████   | 890/1250 [21:40<07:25, 1.24s/it]
Training 1/1 epoch (loss 2.7464): 71%|███████▏  | 891/1250 [21:42<09:29, 1.59s/it]
Training 1/1 epoch (loss 2.5917): 71%|███████▏  | 892/1250 [21:45<10:59, 1.84s/it]
Training 1/1 epoch (loss 2.4278): 71%|███████▏  | 893/1250 [21:45<08:28, 1.42s/it]
Training 1/1 epoch (loss 2.4345): 72%|███████▏  | 894/1250 [21:47<09:16, 1.56s/it]
Training 1/1 epoch (loss 2.9056): 72%|███████▏  | 895/1250 [21:49<09:30, 1.61s/it]
Training 1/1 epoch (loss 2.2784): 72%|███████▏  | 896/1250 [21:50<08:15, 1.40s/it]
Training 1/1 epoch (loss 2.5317): 72%|███████▏  | 897/1250 [21:52<09:09, 1.56s/it]
Training 1/1 epoch (loss 2.6922): 72%|███████▏  | 898/1250 [21:53<07:58, 1.36s/it]
Training 1/1 epoch (loss 2.5874): 72%|███████▏  | 899/1250 [21:55<09:00, 1.54s/it]
Training 1/1 epoch (loss 2.7648): 72%|███████▏  | 900/1250 [21:56<08:58, 1.54s/it]
Training 1/1 epoch (loss 2.6152): 72%|███████▏  | 901/1250 [21:57<07:53, 1.36s/it]
Training 1/1 epoch (loss 2.6407): 72%|███████▏  | 902/1250 [21:59<09:17, 1.60s/it]
Training 1/1 epoch (loss 2.5982): 72%|███████▏  | 903/1250 [22:02<10:26, 1.80s/it]
Training 1/1 epoch (loss 2.6316): 72%|███████▏  | 904/1250 [22:03<09:40, 1.68s/it]
Training 1/1 epoch (loss 2.5046): 72%|███████▏  | 905/1250 [22:05<10:04, 1.75s/it]
Training 1/1 epoch (loss 2.4795): 72%|███████▏  | 906/1250 [22:06<08:55, 1.56s/it]
Training 1/1 epoch (loss 2.4291): 73%|███████▎  | 907/1250 [22:07<07:39, 1.34s/it]
Training 1/1 epoch (loss 2.4618): 73%|███████▎  | 908/1250 [22:08<08:00, 1.41s/it]
Training 1/1 epoch (loss 2.7279): 73%|███████▎  | 909/1250 [22:09<07:28, 1.32s/it]
Training 1/1 epoch (loss 2.5039): 73%|███████▎  | 910/1250 [22:11<07:51, 1.39s/it]
Training 1/1 epoch (loss 2.3722): 73%|███████▎  | 911/1250 [22:13<08:08, 1.44s/it]
Training 1/1 epoch (loss 2.6333): 73%|███████▎  | 912/1250 [22:13<06:51, 1.22s/it]
Training 1/1 epoch (loss 2.4461): 73%|███████▎  | 913/1250 [22:15<07:27, 1.33s/it]
Training 1/1 epoch (loss 2.5899): 73%|███████▎  | 914/1250 [22:16<07:43, 1.38s/it]
Training 1/1 epoch (loss 2.6921): 73%|███████▎  | 915/1250 [22:18<07:33, 1.35s/it]
Training 1/1 epoch (loss 2.6312): 73%|███████▎  | 916/1250 [22:19<07:33, 1.36s/it]
Training 1/1 epoch (loss 2.3663): 73%|███████▎  | 917/1250 [22:20<06:39, 1.20s/it]
Training 1/1 epoch (loss 2.7745): 73%|███████▎  | 918/1250 [22:21<06:20, 1.15s/it]
Training 1/1 epoch (loss 2.5045): 74%|███████▎  | 919/1250 [22:22<06:51, 1.24s/it]
Training 1/1 epoch (loss 2.6749): 74%|███████▎  | 920/1250 [22:23<05:49, 1.06s/it]
Training 1/1 epoch (loss 2.5836): 74%|███████▎  | 921/1250 [22:25<08:04, 1.47s/it]
Training 1/1 epoch (loss 2.5361): 74%|███████▍  | 922/1250 [22:27<08:20, 1.53s/it]
Training 1/1 epoch (loss 2.4722): 74%|███████▍  | 923/1250 [22:28<06:45, 1.24s/it]
Training 1/1 epoch (loss 2.7755): 74%|███████▍  | 924/1250 [22:29<07:26, 1.37s/it]
Training 1/1 epoch (loss 2.6048): 74%|███████▍  | 925/1250 [22:31<07:55, 1.46s/it]
Training 1/1 epoch (loss 2.6251): 74%|███████▍  | 926/1250 [22:32<07:13, 1.34s/it]
Training 1/1 epoch (loss 2.5272): 74%|███████▍  | 927/1250 [22:34<08:15, 1.54s/it]
Training 1/1 epoch (loss 2.8139): 74%|███████▍  | 928/1250 [22:36<08:21, 1.56s/it]
Training 1/1 epoch (loss 2.6710): 74%|███████▍  | 929/1250 [22:37<07:21, 1.37s/it]
Training 1/1 epoch (loss 2.7887): 74%|███████▍  | 930/1250 [22:39<09:05, 1.70s/it]
Training 1/1 epoch (loss 2.6824): 74%|███████▍  | 931/1250 [22:40<08:11, 1.54s/it]
Training 1/1 epoch (loss 2.5978): 75%|███████▍  | 932/1250 [22:42<08:37, 1.63s/it]
Training 1/1 epoch (loss 2.6021): 75%|███████▍  | 933/1250 [22:44<08:54, 1.69s/it]
Training 1/1 epoch (loss 2.4795): 75%|███████▍  | 934/1250 [22:45<07:13, 1.37s/it]
Training 1/1 epoch (loss 2.5124): 75%|███████▍  | 935/1250 [22:46<07:23, 1.41s/it]
Training 1/1 epoch (loss 2.6426): 75%|███████▍  | 936/1250 [22:48<07:39, 1.46s/it]
Training 1/1 epoch (loss 2.3844): 75%|███████▍  | 937/1250 [22:48<06:38, 1.27s/it]
Training 1/1 epoch (loss 2.5120): 75%|███████▌  | 938/1250 [22:50<07:24, 1.43s/it]
Training 1/1 epoch (loss 2.6529): 75%|███████▌  | 939/1250 [22:51<06:58, 1.34s/it]
Training 1/1 epoch (loss 2.6475): 75%|███████▌  | 940/1250 [22:53<07:34, 1.46s/it]
Training 1/1 epoch (loss 2.5826): 75%|███████▌  | 941/1250 [22:55<07:41, 1.49s/it]
Training 1/1 epoch (loss 2.5538): 75%|███████▌  | 942/1250 [22:55<06:20, 1.23s/it]
Training 1/1 epoch (loss 2.4481): 75%|███████▌  | 943/1250 [22:57<07:13, 1.41s/it]
Training 1/1 epoch (loss 2.3013): 76%|███████▌  | 944/1250 [22:58<07:04, 1.39s/it]
Training 1/1 epoch (loss 2.5500): 76%|███████▌  | 945/1250 [22:59<05:56, 1.17s/it]
Training 1/1 epoch (loss 2.6140): 76%|███████▌  | 946/1250 [23:02<07:54, 1.56s/it]
Training 1/1 epoch (loss 2.5047): 76%|███████▌  | 947/1250 [23:03<07:33, 1.50s/it]
Training 1/1 epoch (loss 2.4620): 76%|███████▌  | 948/1250 [23:04<06:09, 1.22s/it]
Training 1/1 epoch (loss 2.6158): 76%|███████▌  | 949/1250 [23:05<06:59, 1.39s/it]
Training 1/1 epoch (loss 2.4767): 76%|███████▌  | 950/1250 [23:07<07:34, 1.51s/it]
Training 1/1 epoch (loss 2.3984): 76%|███████▌  | 951/1250 [23:08<06:09, 1.24s/it]
Training 1/1 epoch (loss 2.7034): 76%|███████▌  | 952/1250 [23:10<07:44, 1.56s/it]
Training 1/1 epoch (loss 2.4641): 76%|███████▌  | 953/1250 [23:12<07:49, 1.58s/it]
Training 1/1 epoch (loss 2.5437): 76%|███████▋  | 954/1250 [23:12<06:08, 1.24s/it]
Training 1/1 epoch (loss 2.5106): 76%|███████▋  | 955/1250 [23:13<06:16, 1.28s/it]
Training 1/1 epoch (loss 2.5911): 76%|███████▋  | 956/1250 [23:15<06:50, 1.40s/it]
Training 1/1 epoch (loss 2.7537): 77%|███████▋  | 957/1250 [23:16<05:30, 1.13s/it]
Training 1/1 epoch (loss 2.5620): 77%|███████▋  | 958/1250 [23:17<06:11, 1.27s/it]
Training 1/1 epoch (loss 2.7914): 77%|███████▋  | 959/1250 [23:20<07:36, 1.57s/it]
Training 1/1 epoch (loss 2.3507): 77%|███████▋  | 960/1250 [23:20<05:54, 1.22s/it]
Training 1/1 epoch (loss 2.3302): 77%|███████▋  | 961/1250 [23:22<06:30, 1.35s/it]
Training 1/1 epoch (loss 2.3539): 77%|███████▋  | 962/1250 [23:23<06:52, 1.43s/it]
Training 1/1 epoch (loss 2.7459): 77%|███████▋  | 963/1250 [23:25<06:46, 1.42s/it]
Training 1/1 epoch (loss 2.6836): 77%|███████▋  | 964/1250 [23:26<06:48, 1.43s/it]
Training 1/1 epoch (loss 2.3475): 77%|███████▋  | 965/1250 [23:27<05:54, 1.24s/it]
Training 1/1 epoch (loss 2.5563): 77%|███████▋  | 966/1250 [23:29<06:29, 1.37s/it]
Training 1/1 epoch (loss 2.5153): 77%|███████▋  | 967/1250 [23:31<07:27, 1.58s/it]
Training 1/1 epoch (loss 2.5604): 77%|███████▋  | 968/1250 [23:31<06:22, 1.36s/it]
Training 1/1 epoch (loss 2.8741): 78%|███████▊  | 969/1250 
[23:33<07:10, 1.53s/it] Training 1/1 epoch (loss 2.4924): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 969/1250 [23:35<07:10, 1.53s/it] Training 1/1 epoch (loss 2.4924): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 970/1250 [23:35<07:07, 1.53s/it] Training 1/1 epoch (loss 2.8911): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 970/1250 [23:36<07:07, 1.53s/it] Training 1/1 epoch (loss 2.8911): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 971/1250 [23:36<05:53, 1.27s/it] Training 1/1 epoch (loss 2.5558): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 971/1250 [23:37<05:53, 1.27s/it] Training 1/1 epoch (loss 2.5558): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 972/1250 [23:37<05:54, 1.28s/it] Training 1/1 epoch (loss 2.5788): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 972/1250 [23:39<05:54, 1.28s/it] Training 1/1 epoch (loss 2.5788): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 973/1250 [23:39<06:59, 1.52s/it] Training 1/1 epoch (loss 2.4504): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 973/1250 [23:40<06:59, 1.52s/it] Training 1/1 epoch (loss 2.4504): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 974/1250 [23:40<05:50, 1.27s/it] Training 1/1 epoch (loss 2.5891): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 974/1250 [23:42<05:50, 1.27s/it] Training 1/1 epoch (loss 2.5891): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 975/1250 [23:42<06:49, 1.49s/it] Training 1/1 epoch (loss 2.4772): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 975/1250 [23:43<06:49, 1.49s/it] Training 1/1 epoch (loss 2.4772): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 976/1250 [23:43<06:19, 1.39s/it] Training 1/1 epoch (loss 2.6558): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 976/1250 [23:44<06:19, 1.39s/it] Training 1/1 epoch (loss 2.6558): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 977/1250 [23:44<06:02, 1.33s/it] Training 1/1 epoch (loss 2.6577): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 977/1250 [23:46<06:02, 1.33s/it] Training 1/1 epoch (loss 2.6577): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 978/1250 [23:46<06:40, 1.47s/it] Training 1/1 epoch (loss 2.8601): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 978/1250 [23:47<06:40, 1.47s/it] Training 1/1 epoch (loss 2.8601): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 979/1250 [23:47<06:15, 1.39s/it] Training 1/1 
epoch (loss 2.6597): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 979/1250 [23:48<06:15, 1.39s/it] Training 1/1 epoch (loss 2.6597): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 980/1250 [23:48<05:58, 1.33s/it] Training 1/1 epoch (loss 2.6060): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 980/1250 [23:50<05:58, 1.33s/it] Training 1/1 epoch (loss 2.6060): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 981/1250 [23:50<06:37, 1.48s/it] Training 1/1 epoch (loss 2.5961): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 981/1250 [23:51<06:37, 1.48s/it] Training 1/1 epoch (loss 2.5961): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 982/1250 [23:51<05:42, 1.28s/it] Training 1/1 epoch (loss 2.6592): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 982/1250 [23:52<05:42, 1.28s/it] Training 1/1 epoch (loss 2.6592): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 983/1250 [23:52<05:39, 1.27s/it] Training 1/1 epoch (loss 2.6301): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 983/1250 [23:54<05:39, 1.27s/it] Training 1/1 epoch (loss 2.6301): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 984/1250 [23:54<06:47, 1.53s/it] Training 1/1 epoch (loss 2.7192): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 984/1250 [23:55<06:47, 1.53s/it] Training 1/1 epoch (loss 2.7192): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 985/1250 [23:55<05:24, 1.22s/it] Training 1/1 epoch (loss 2.4560): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 985/1250 [23:57<05:24, 1.22s/it] Training 1/1 epoch (loss 2.4560): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 986/1250 [23:57<06:35, 1.50s/it] Training 1/1 epoch (loss 2.5943): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 986/1250 [23:59<06:35, 1.50s/it] Training 1/1 epoch (loss 2.5943): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 987/1250 [23:59<07:15, 1.65s/it] Training 1/1 epoch (loss 2.5223): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 987/1250 [24:00<07:15, 1.65s/it] Training 1/1 epoch (loss 2.5223): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 988/1250 [24:00<06:20, 1.45s/it] Training 1/1 epoch (loss 2.8114): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 988/1250 [24:01<06:20, 1.45s/it] Training 1/1 epoch (loss 2.8114): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 989/1250 [24:01<06:23, 1.47s/it] Training 1/1 epoch (loss 2.4353): 
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 989/1250 [24:03<06:23, 1.47s/it] Training 1/1 epoch (loss 2.4353): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 990/1250 [24:03<06:32, 1.51s/it] Training 1/1 epoch (loss 2.5145): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 990/1250 [24:04<06:32, 1.51s/it] Training 1/1 epoch (loss 2.5145): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 991/1250 [24:04<05:32, 1.28s/it] Training 1/1 epoch (loss 2.2521): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 991/1250 [24:06<05:32, 1.28s/it] Training 1/1 epoch (loss 2.2521): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 992/1250 [24:06<06:25, 1.50s/it] Training 1/1 epoch (loss 2.5101): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 992/1250 [24:07<06:25, 1.50s/it] Training 1/1 epoch (loss 2.5101): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 993/1250 [24:07<06:10, 1.44s/it] Training 1/1 epoch (loss 2.4350): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 993/1250 [24:08<06:10, 1.44s/it] Training 1/1 epoch (loss 2.4350): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 994/1250 [24:08<06:01, 1.41s/it] Training 1/1 epoch (loss 2.2831): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 994/1250 [24:10<06:01, 1.41s/it] Training 1/1 epoch (loss 2.2831): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 995/1250 [24:10<05:58, 1.40s/it] Training 1/1 epoch (loss 2.5321): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 995/1250 [24:11<05:58, 1.40s/it] Training 1/1 epoch (loss 2.5321): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 996/1250 [24:11<05:35, 1.32s/it] Training 1/1 epoch (loss 2.6333): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 996/1250 [24:12<05:35, 1.32s/it] Training 1/1 epoch (loss 2.6333): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 997/1250 [24:12<04:57, 1.18s/it] Training 1/1 epoch (loss 2.5478): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 997/1250 [24:14<04:57, 1.18s/it] Training 1/1 epoch (loss 2.5478): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 998/1250 [24:14<06:31, 1.55s/it] Training 1/1 epoch (loss 2.5820): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 998/1250 [24:15<06:31, 1.55s/it] Training 1/1 epoch (loss 2.5820): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 999/1250 [24:15<05:43, 1.37s/it] Training 1/1 epoch (loss 2.5452): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 999/1250 
[24:17<05:43, 1.37s/it] Training 1/1 epoch (loss 2.5452): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1000/1250 [24:17<06:05, 1.46s/it] Training 1/1 epoch (loss 2.6420): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1000/1250 [24:19<06:05, 1.46s/it] Training 1/1 epoch (loss 2.6420): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1001/1250 [24:19<06:45, 1.63s/it] Training 1/1 epoch (loss 2.3777): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1001/1250 [24:19<06:45, 1.63s/it] Training 1/1 epoch (loss 2.3777): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1002/1250 [24:19<05:16, 1.28s/it] Training 1/1 epoch (loss 2.4638): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1002/1250 [24:21<05:16, 1.28s/it] Training 1/1 epoch (loss 2.4638): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1003/1250 [24:21<05:31, 1.34s/it] Training 1/1 epoch (loss 2.4468): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1003/1250 [24:23<05:31, 1.34s/it] Training 1/1 epoch (loss 2.4468): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1004/1250 [24:23<06:23, 1.56s/it] Training 1/1 epoch (loss 2.6731): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1004/1250 [24:23<06:23, 1.56s/it] Training 1/1 epoch (loss 2.6731): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1005/1250 [24:23<05:19, 1.30s/it] Training 1/1 epoch (loss 2.5950): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1005/1250 [24:25<05:19, 1.30s/it] Training 1/1 epoch (loss 2.5950): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1006/1250 [24:25<05:11, 1.28s/it] Training 1/1 epoch (loss 2.6667): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1006/1250 [24:27<05:11, 1.28s/it] Training 1/1 epoch (loss 2.6667): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1007/1250 [24:27<06:02, 1.49s/it] Training 1/1 epoch (loss 2.6168): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1007/1250 [24:28<06:02, 1.49s/it] Training 1/1 epoch (loss 2.6168): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1008/1250 [24:28<05:55, 1.47s/it] Training 1/1 epoch (loss 2.7035): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1008/1250 [24:30<05:55, 1.47s/it] Training 1/1 epoch (loss 2.7035): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1009/1250 [24:30<06:26, 1.61s/it] Training 1/1 epoch (loss 2.6358): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1009/1250 [24:31<06:26, 
1.61s/it] Training 1/1 epoch (loss 2.6358): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1010/1250 [24:31<05:58, 1.49s/it] Training 1/1 epoch (loss 2.7557): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1010/1250 [24:34<05:58, 1.49s/it] Training 1/1 epoch (loss 2.7557): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1011/1250 [24:34<07:05, 1.78s/it] Training 1/1 epoch (loss 2.6186): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1011/1250 [24:36<07:05, 1.78s/it] Training 1/1 epoch (loss 2.6186): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1012/1250 [24:36<07:50, 1.98s/it] Training 1/1 epoch (loss 2.4513): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1012/1250 [24:37<07:50, 1.98s/it] Training 1/1 epoch (loss 2.4513): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1013/1250 [24:37<06:45, 1.71s/it] Training 1/1 epoch (loss 2.3893): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1013/1250 [24:39<06:45, 1.71s/it] Training 1/1 epoch (loss 2.3893): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1014/1250 [24:39<07:13, 1.84s/it] Training 1/1 epoch (loss 2.5520): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1014/1250 [24:42<07:13, 1.84s/it] Training 1/1 epoch (loss 2.5520): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1015/1250 [24:42<07:57, 2.03s/it] Training 1/1 epoch (loss 2.5779): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1015/1250 [24:43<07:57, 2.03s/it] Training 1/1 epoch (loss 2.5779): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1016/1250 [24:43<06:45, 1.73s/it] Training 1/1 epoch (loss 2.5612): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1016/1250 [24:44<06:45, 1.73s/it] Training 1/1 epoch (loss 2.5612): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1017/1250 [24:44<05:43, 1.48s/it] Training 1/1 epoch (loss 2.6684): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1017/1250 [24:45<05:43, 1.48s/it] Training 1/1 epoch (loss 2.6684): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1018/1250 [24:45<05:04, 1.31s/it] Training 1/1 epoch (loss 2.5985): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1018/1250 [24:47<05:04, 1.31s/it] Training 1/1 epoch (loss 2.5985): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1019/1250 [24:47<05:42, 1.48s/it] Training 1/1 epoch (loss 2.7760): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1019/1250 
[24:48<05:42, 1.48s/it] Training 1/1 epoch (loss 2.7760): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1020/1250 [24:48<05:55, 1.55s/it] Training 1/1 epoch (loss 2.5240): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1020/1250 [24:49<05:55, 1.55s/it] Training 1/1 epoch (loss 2.5240): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1021/1250 [24:49<04:53, 1.28s/it] Training 1/1 epoch (loss 2.8070): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1021/1250 [24:51<04:53, 1.28s/it] Training 1/1 epoch (loss 2.8070): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1022/1250 [24:51<05:18, 1.40s/it] Training 1/1 epoch (loss 2.4870): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1022/1250 [24:52<05:18, 1.40s/it] Training 1/1 epoch (loss 2.4870): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1023/1250 [24:52<05:26, 1.44s/it] Training 1/1 epoch (loss 2.5521): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1023/1250 [24:53<05:26, 1.44s/it] Training 1/1 epoch (loss 2.5521): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1024/1250 [24:53<04:54, 1.30s/it] Training 1/1 epoch (loss 2.4295): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1024/1250 [24:56<04:54, 1.30s/it] Training 1/1 epoch (loss 2.4295): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1025/1250 [24:56<06:09, 1.64s/it] Training 1/1 epoch (loss 2.6075): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1025/1250 [24:57<06:09, 1.64s/it] Training 1/1 epoch (loss 2.6075): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1026/1250 [24:57<05:53, 1.58s/it] Training 1/1 epoch (loss 2.4976): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1026/1250 [24:57<05:53, 1.58s/it] Training 1/1 epoch (loss 2.4976): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1027/1250 [24:57<04:38, 1.25s/it] Training 1/1 epoch (loss 2.4923): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1027/1250 [24:59<04:38, 1.25s/it] Training 1/1 epoch (loss 2.4923): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1028/1250 [24:59<05:19, 1.44s/it] Training 1/1 epoch (loss 2.6077): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1028/1250 [25:01<05:19, 1.44s/it] Training 1/1 epoch (loss 2.6077): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1029/1250 [25:01<05:50, 1.59s/it] Training 1/1 epoch (loss 2.6690): 
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1029/1250 [25:02<05:50, 1.59s/it] Training 1/1 epoch (loss 2.6690): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1030/1250 [25:02<04:41, 1.28s/it] Training 1/1 epoch (loss 2.6047): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1030/1250 [25:03<04:41, 1.28s/it] Training 1/1 epoch (loss 2.6047): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1031/1250 [25:03<04:37, 1.27s/it] Training 1/1 epoch (loss 2.5900): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1031/1250 [25:05<04:37, 1.27s/it] Training 1/1 epoch (loss 2.5900): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1032/1250 [25:05<05:26, 1.50s/it] Training 1/1 epoch (loss 2.4710): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1032/1250 [25:06<05:26, 1.50s/it] Training 1/1 epoch (loss 2.4710): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1033/1250 [25:06<05:00, 1.38s/it] Training 1/1 epoch (loss 2.4462): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1033/1250 [25:08<05:00, 1.38s/it] Training 1/1 epoch (loss 2.4462): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1034/1250 [25:08<05:34, 1.55s/it] Training 1/1 epoch (loss 2.3863): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1034/1250 [25:10<05:34, 1.55s/it] Training 1/1 epoch (loss 2.3863): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1035/1250 [25:10<05:54, 1.65s/it] Training 1/1 epoch (loss 2.5109): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1035/1250 [25:11<05:54, 1.65s/it] Training 1/1 epoch (loss 2.5109): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1036/1250 [25:11<05:08, 1.44s/it] Training 1/1 epoch (loss 2.6931): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1036/1250 [25:13<05:08, 1.44s/it] Training 1/1 epoch (loss 2.6931): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1037/1250 [25:13<06:09, 1.73s/it] Training 1/1 epoch (loss 2.4604): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1037/1250 [25:15<06:09, 1.73s/it] Training 1/1 epoch (loss 2.4604): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1038/1250 [25:15<06:05, 1.72s/it] Training 1/1 epoch (loss 2.3465): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1038/1250 [25:17<06:05, 1.72s/it] Training 1/1 epoch (loss 2.3465): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1039/1250 [25:17<05:44, 
1.63s/it] Training 1/1 epoch (loss 2.4537): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1039/1250 [25:19<05:44, 1.63s/it] Training 1/1 epoch (loss 2.4537): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1040/1250 [25:19<06:18, 1.80s/it] Training 1/1 epoch (loss 2.3101): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1040/1250 [25:20<06:18, 1.80s/it] Training 1/1 epoch (loss 2.3101): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1041/1250 [25:20<05:11, 1.49s/it] Training 1/1 epoch (loss 2.7219): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1041/1250 [25:22<05:11, 1.49s/it] Training 1/1 epoch (loss 2.7219): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1042/1250 [25:22<05:48, 1.67s/it] Training 1/1 epoch (loss 2.6239): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1042/1250 [25:23<05:48, 1.67s/it] Training 1/1 epoch (loss 2.6239): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1043/1250 [25:23<05:30, 1.60s/it] Training 1/1 epoch (loss 2.6034): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1043/1250 [25:24<05:30, 1.60s/it] Training 1/1 epoch (loss 2.6034): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1044/1250 [25:24<04:19, 1.26s/it] Training 1/1 epoch (loss 2.6666): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1044/1250 [25:25<04:19, 1.26s/it] Training 1/1 epoch (loss 2.6666): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1045/1250 [25:25<04:32, 1.33s/it] Training 1/1 epoch (loss 2.6361): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1045/1250 [25:27<04:32, 1.33s/it] Training 1/1 epoch (loss 2.6361): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1046/1250 [25:27<05:36, 1.65s/it] Training 1/1 epoch (loss 2.6398): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1046/1250 [25:28<05:36, 1.65s/it] Training 1/1 epoch (loss 2.6398): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1047/1250 [25:28<04:26, 1.31s/it] Training 1/1 epoch (loss 2.6086): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1047/1250 [25:30<04:26, 1.31s/it] Training 1/1 epoch (loss 2.6086): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1048/1250 [25:30<05:22, 1.60s/it] Training 1/1 epoch (loss 2.4828): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1048/1250 [25:32<05:22, 1.60s/it] Training 1/1 epoch (loss 2.4828): 
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1049/1250 [25:32<05:35, 1.67s/it] Training 1/1 epoch (loss 2.7476): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1049/1250 [25:33<05:35, 1.67s/it] Training 1/1 epoch (loss 2.7476): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1050/1250 [25:33<04:40, 1.40s/it] Training 1/1 epoch (loss 2.6126): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1050/1250 [25:35<04:40, 1.40s/it] Training 1/1 epoch (loss 2.6126): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1051/1250 [25:35<05:07, 1.55s/it] Training 1/1 epoch (loss 2.5927): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1051/1250 [25:36<05:07, 1.55s/it] Training 1/1 epoch (loss 2.5927): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1052/1250 [25:36<04:28, 1.36s/it] Training 1/1 epoch (loss 2.4628): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1052/1250 [25:37<04:28, 1.36s/it] Training 1/1 epoch (loss 2.4628): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1053/1250 [25:37<04:40, 1.42s/it] Training 1/1 epoch (loss 2.4557): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1053/1250 [25:38<04:40, 1.42s/it] Training 1/1 epoch (loss 2.4557): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1054/1250 [25:38<04:18, 1.32s/it] Training 1/1 epoch (loss 2.3544): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1054/1250 [25:40<04:18, 1.32s/it] Training 1/1 epoch (loss 2.3544): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1055/1250 [25:40<04:24, 1.35s/it] Training 1/1 epoch (loss 2.4873): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1055/1250 [25:42<04:24, 1.35s/it] Training 1/1 epoch (loss 2.4873): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1056/1250 [25:42<04:59, 1.54s/it] Training 1/1 epoch (loss 2.6093): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1056/1250 [25:44<04:59, 1.54s/it] Training 1/1 epoch (loss 2.6093): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1057/1250 [25:44<05:49, 1.81s/it] Training 1/1 epoch (loss 2.6188): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1057/1250 [25:45<05:49, 1.81s/it] Training 1/1 epoch (loss 2.6188): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1058/1250 [25:45<04:37, 1.45s/it] Training 1/1 epoch (loss 2.5976): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1058/1250 [25:46<04:37, 
1.45s/it] Training 1/1 epoch (loss 2.5976): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1059/1250 [25:46<04:49, 1.51s/it] Training 1/1 epoch (loss 2.6862): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1059/1250 [25:48<04:49, 1.51s/it] Training 1/1 epoch (loss 2.6862): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1060/1250 [25:48<05:11, 1.64s/it] Training 1/1 epoch (loss 2.4173): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1060/1250 [25:49<05:11, 1.64s/it] Training 1/1 epoch (loss 2.4173): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1061/1250 [25:49<04:08, 1.32s/it] Training 1/1 epoch (loss 2.8002): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1061/1250 [25:51<04:08, 1.32s/it] Training 1/1 epoch (loss 2.8002): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1062/1250 [25:51<04:37, 1.48s/it] Training 1/1 epoch (loss 2.7227): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1062/1250 [25:52<04:37, 1.48s/it] Training 1/1 epoch (loss 2.7227): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1063/1250 [25:52<04:31, 1.45s/it] Training 1/1 epoch (loss 2.6381): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1063/1250 [25:53<04:31, 1.45s/it] Training 1/1 epoch (loss 2.6381): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1064/1250 [25:53<04:15, 1.37s/it] Training 1/1 epoch (loss 2.5342): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1064/1250 [25:55<04:15, 1.37s/it] Training 1/1 epoch (loss 2.5342): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1065/1250 [25:55<04:35, 1.49s/it] Training 1/1 epoch (loss 2.7228): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1065/1250 [25:56<04:35, 1.49s/it] Training 1/1 epoch (loss 2.7228): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1066/1250 [25:56<04:14, 1.38s/it] Training 1/1 epoch (loss 2.5571): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1066/1250 [25:58<04:14, 1.38s/it] Training 1/1 epoch (loss 2.5571): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1067/1250 [25:58<04:31, 1.49s/it] Training 1/1 epoch (loss 2.5155): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1067/1250 [26:00<04:31, 1.49s/it] Training 1/1 epoch (loss 2.5155): 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1068/1250 [26:00<04:38, 1.53s/it] Training 1/1 epoch (loss 2.4969): 
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1068/1250 [26:00<04:38, 1.53s/it] Training 1/1 epoch (loss 2.4969): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1069/1250 [26:00<03:42, 1.23s/it] Training 1/1 epoch (loss 2.5663): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1069/1250 [26:02<03:42, 1.23s/it] Training 1/1 epoch (loss 2.5663): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1070/1250 [26:02<04:36, 1.53s/it] Training 1/1 epoch (loss 2.6728): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1070/1250 [26:04<04:36, 1.53s/it] Training 1/1 epoch (loss 2.6728): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1071/1250 [26:04<04:52, 1.63s/it] Training 1/1 epoch (loss 2.5853): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1071/1250 [26:05<04:52, 1.63s/it] Training 1/1 epoch (loss 2.5853): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1072/1250 [26:05<03:58, 1.34s/it] Training 1/1 epoch (loss 2.5546): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1072/1250 [26:07<03:58, 1.34s/it] Training 1/1 epoch (loss 2.5546): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1073/1250 [26:07<05:01, 1.70s/it] Training 1/1 epoch (loss 2.6339): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1073/1250 [26:09<05:01, 1.70s/it] Training 1/1 epoch (loss 2.6339): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1074/1250 [26:09<04:56, 1.69s/it] Training 1/1 epoch (loss 2.6324): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1074/1250 [26:10<04:56, 1.69s/it] Training 1/1 epoch (loss 2.6324): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1075/1250 [26:10<04:39, 1.60s/it] Training 1/1 epoch (loss 2.6647): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1075/1250 [26:13<04:39, 1.60s/it] Training 1/1 epoch (loss 2.6647): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1076/1250 [26:13<05:12, 1.80s/it] Training 1/1 epoch (loss 2.5929): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1076/1250 [26:14<05:12, 1.80s/it] Training 1/1 epoch (loss 2.5929): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1077/1250 [26:14<04:22, 1.52s/it] Training 1/1 epoch (loss 2.5930): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1077/1250 [26:16<04:22, 1.52s/it] Training 1/1 epoch (loss 2.5930): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1078/1250 [26:16<05:03, 
1.77s/it] Training 1/1 epoch (loss 2.5875): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1078/1250 [26:18<05:03, 1.77s/it] Training 1/1 epoch (loss 2.5875): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1079/1250 [26:18<05:12, 1.83s/it] Training 1/1 epoch (loss 2.7598): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1079/1250 [26:19<05:12, 1.83s/it] Training 1/1 epoch (loss 2.7598): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1080/1250 [26:19<04:47, 1.69s/it] Training 1/1 epoch (loss 2.7287): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1080/1250 [26:21<04:47, 1.69s/it] Training 1/1 epoch (loss 2.7287): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1081/1250 [26:21<04:59, 1.77s/it] Training 1/1 epoch (loss 2.3162): 86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1081/1250 [26:22<04:59, 1.77s/it] Training 1/1 epoch (loss 2.3162): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1082/1250 [26:22<04:26, 1.59s/it] Training 1/1 epoch (loss 2.6266): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1082/1250 [26:23<04:26, 1.59s/it] Training 1/1 epoch (loss 2.6266): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1083/1250 [26:23<03:45, 1.35s/it] Training 1/1 epoch (loss 2.7282): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1083/1250 [26:25<03:45, 1.35s/it] Training 1/1 epoch (loss 2.7282): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1084/1250 [26:25<04:24, 1.59s/it] Training 1/1 epoch (loss 2.4837): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1084/1250 [26:27<04:24, 1.59s/it] Training 1/1 epoch (loss 2.4837): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1085/1250 [26:27<04:06, 1.50s/it] Training 1/1 epoch (loss 2.5547): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1085/1250 [26:28<04:06, 1.50s/it] Training 1/1 epoch (loss 2.5547): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1086/1250 [26:28<03:57, 1.45s/it] Training 1/1 epoch (loss 2.6983): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1086/1250 [26:30<03:57, 1.45s/it] Training 1/1 epoch (loss 2.6983): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1087/1250 [26:30<04:23, 1.62s/it] Training 1/1 epoch (loss 2.6428): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1087/1250 [26:31<04:23, 1.62s/it] Training 1/1 epoch (loss 2.6428): 
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1088/1250 [26:31<04:07, 1.53s/it] Training 1/1 epoch (loss 2.4744): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1088/1250 [26:33<04:07, 1.53s/it] Training 1/1 epoch (loss 2.4744): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1089/1250 [26:33<04:33, 1.70s/it] Training 1/1 epoch (loss 2.5946): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1089/1250 [26:35<04:33, 1.70s/it] Training 1/1 epoch (loss 2.5946): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1090/1250 [26:35<04:32, 1.70s/it] Training 1/1 epoch (loss 2.6582): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1090/1250 [26:36<04:32, 1.70s/it] Training 1/1 epoch (loss 2.6582): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1091/1250 [26:36<03:29, 1.32s/it] Training 1/1 epoch (loss 2.4311): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1091/1250 [26:37<03:29, 1.32s/it] Training 1/1 epoch (loss 2.4311): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1092/1250 [26:37<03:34, 1.36s/it] Training 1/1 epoch (loss 2.4783): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1092/1250 [26:39<03:34, 1.36s/it] Training 1/1 epoch (loss 2.4783): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1093/1250 [26:39<03:58, 1.52s/it] Training 1/1 epoch (loss 2.4859): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 1093/1250 [26:39<03:58, 1.52s/it] Training 1/1 epoch (loss 2.4859): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1094/1250 [26:39<03:11, 1.23s/it] Training 1/1 epoch (loss 2.6309): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1094/1250 [26:41<03:11, 1.23s/it] Training 1/1 epoch (loss 2.6309): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1095/1250 [26:41<03:12, 1.24s/it] Training 1/1 epoch (loss 2.4778): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1095/1250 [26:43<03:12, 1.24s/it] Training 1/1 epoch (loss 2.4778): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1096/1250 [26:43<03:41, 1.44s/it] Training 1/1 epoch (loss 2.7625): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1096/1250 [26:43<03:41, 1.44s/it] Training 1/1 epoch (loss 2.7625): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1097/1250 [26:43<03:04, 1.21s/it] Training 1/1 epoch (loss 2.6985): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1097/1250 [26:45<03:04, 
1.21s/it]
Training 1/1 epoch (loss 2.6985): 88%|████████▊ | 1098/1250 [26:45<03:25, 1.35s/it]
Training 1/1 epoch (loss 2.6662): 88%|████████▊ | 1099/1250 [26:47<03:43, 1.48s/it]
Training 1/1 epoch (loss 2.6642): 88%|████████▊ | 1100/1250 [26:48<03:13, 1.29s/it]
Training 1/1 epoch (loss 2.4185): 88%|████████▊ | 1101/1250 [26:49<03:10, 1.28s/it]
Training 1/1 epoch (loss 2.6703): 88%|████████▊ | 1102/1250 [26:50<03:20, 1.36s/it]
Training 1/1 epoch (loss 2.5991): 88%|████████▊ | 1103/1250 [26:51<02:58, 1.21s/it]
Training 1/1 epoch (loss 2.4457): 88%|████████▊ | 1104/1250 [26:52<02:58, 1.22s/it]
Training 1/1 epoch (loss 2.5200): 88%|████████▊ | 1105/1250 [26:54<03:00, 1.24s/it]
Training 1/1 epoch (loss 2.5257): 88%|████████▊ | 1106/1250 [26:56<03:35, 1.50s/it]
Training 1/1 epoch (loss 2.4245): 89%|████████▊ | 1107/1250 [26:58<03:48, 1.59s/it]
Training 1/1 epoch (loss 2.6259): 89%|████████▊ | 1108/1250 [26:59<03:31, 1.49s/it]
Training 1/1 epoch (loss 2.6796): 89%|████████▊ | 1109/1250 [27:01<03:49, 1.63s/it]
Training 1/1 epoch (loss 2.7158): 89%|████████▉ | 1110/1250 [27:02<03:18, 1.42s/it]
Training 1/1 epoch (loss 2.7189): 89%|████████▉ | 1111/1250 [27:02<02:44, 1.19s/it]
Training 1/1 epoch (loss 2.5364): 89%|████████▉ | 1112/1250 [27:04<03:05, 1.34s/it]
Training 1/1 epoch (loss 2.6516): 89%|████████▉ | 1113/1250 [27:05<02:44, 1.20s/it]
Training 1/1 epoch (loss 2.5622): 89%|████████▉ | 1114/1250 [27:07<03:07, 1.38s/it]
Training 1/1 epoch (loss 2.4541): 89%|████████▉ | 1115/1250 [27:09<03:33, 1.58s/it]
Training 1/1 epoch (loss 2.5162): 89%|████████▉ | 1116/1250 [27:09<02:51, 1.28s/it]
Training 1/1 epoch (loss 2.6401): 89%|████████▉ | 1117/1250 [27:12<03:36, 1.63s/it]
Training 1/1 epoch (loss 2.4001): 89%|████████▉ | 1118/1250 [27:14<03:57, 1.80s/it]
Training 1/1 epoch (loss 2.4936): 90%|████████▉ | 1119/1250 [27:15<03:03, 1.40s/it]
Training 1/1 epoch (loss 2.3757): 90%|████████▉ | 1120/1250 [27:16<03:08, 1.45s/it]
Training 1/1 epoch (loss 2.5726): 90%|████████▉ | 1121/1250 [27:18<03:20, 1.56s/it]
Training 1/1 epoch (loss 2.5466): 90%|████████▉ | 1122/1250 [27:18<02:37, 1.23s/it]
Training 1/1 epoch (loss 2.6105): 90%|████████▉ | 1123/1250 [27:20<02:58, 1.40s/it]
Training 1/1 epoch (loss 2.5412): 90%|████████▉ | 1124/1250 [27:23<03:34, 1.71s/it]
Training 1/1 epoch (loss 2.6791): 90%|█████████ | 1125/1250 [27:23<02:51, 1.37s/it]
Training 1/1 epoch (loss 2.6925): 90%|█████████ | 1126/1250 [27:25<02:49, 1.37s/it]
Training 1/1 epoch (loss 2.4991): 90%|█████████ | 1127/1250 [27:27<03:09, 1.54s/it]
Training 1/1 epoch (loss 2.6352): 90%|█████████ | 1128/1250 [27:28<03:06, 1.52s/it]
Training 1/1 epoch (loss 2.5169): 90%|█████████ | 1129/1250 [27:30<03:34, 1.78s/it]
Training 1/1 epoch (loss 2.5729): 90%|█████████ | 1130/1250 [27:31<03:07, 1.56s/it]
Training 1/1 epoch (loss 2.3904): 90%|█████████ | 1131/1250 [27:33<03:19, 1.68s/it]
Training 1/1 epoch (loss 2.5233): 91%|█████████ | 1132/1250 [27:36<03:48, 1.93s/it]
Training 1/1 epoch (loss 2.4717): 91%|█████████ | 1133/1250 [27:37<03:00, 1.54s/it]
Training 1/1 epoch (loss 2.5917): 91%|█████████ | 1134/1250 [27:39<03:15, 1.69s/it]
Training 1/1 epoch (loss 2.5820): 91%|█████████ | 1135/1250 [27:40<03:01, 1.57s/it]
Training 1/1 epoch (loss 2.5021): 91%|█████████ | 1136/1250 [27:41<02:38, 1.39s/it]
Training 1/1 epoch (loss 2.5755): 91%|█████████ | 1137/1250 [27:42<02:40, 1.42s/it]
Training 1/1 epoch (loss 2.5868): 91%|█████████ | 1138/1250 [27:43<02:13, 1.19s/it]
Training 1/1 epoch (loss 2.3728): 91%|█████████ | 1139/1250 [27:45<02:38, 1.43s/it]
Training 1/1 epoch (loss 2.7479): 91%|█████████ | 1140/1250 [27:46<02:38, 1.45s/it]
Training 1/1 epoch (loss 2.4649): 91%|█████████▏| 1141/1250 [27:47<02:10, 1.19s/it]
Training 1/1 epoch (loss 2.5440): 91%|█████████▏| 1142/1250 [27:49<02:22, 1.32s/it]
Training 1/1 epoch (loss 2.5934): 91%|█████████▏| 1143/1250 [27:50<02:34, 1.44s/it]
Training 1/1 epoch (loss 2.4030): 92%|█████████▏| 1144/1250 [27:51<02:18, 1.30s/it]
Training 1/1 epoch (loss 2.5512): 92%|█████████▏| 1145/1250 [27:54<02:52, 1.64s/it]
Training 1/1 epoch (loss 2.5944): 92%|█████████▏| 1146/1250 [27:55<02:35, 1.49s/it]
Training 1/1 epoch (loss 2.4347): 92%|█████████▏| 1147/1250 [27:56<02:18, 1.34s/it]
Training 1/1 epoch (loss 2.6403): 92%|█████████▏| 1148/1250 [27:58<02:28, 1.46s/it]
Training 1/1 epoch (loss 2.4058): 92%|█████████▏| 1149/1250 [27:58<02:07, 1.26s/it]
Training 1/1 epoch (loss 2.4109): 92%|█████████▏| 1150/1250 [28:00<02:26, 1.47s/it]
Training 1/1 epoch (loss 2.7828): 92%|█████████▏| 1151/1250 [28:02<02:19, 1.41s/it]
Training 1/1 epoch (loss 2.5855): 92%|█████████▏| 1152/1250 [28:03<02:01, 1.24s/it]
Training 1/1 epoch (loss 2.5834): 92%|█████████▏| 1153/1250 [28:05<02:29, 1.54s/it]
Training 1/1 epoch (loss 2.4872): 92%|█████████▏| 1154/1250 [28:07<02:37, 1.64s/it]
Training 1/1 epoch (loss 2.5188): 92%|█████████▏| 1155/1250 [28:07<02:07, 1.34s/it]
Training 1/1 epoch (loss 2.5595): 92%|█████████▏| 1156/1250 [28:09<02:19, 1.48s/it]
Training 1/1 epoch (loss 2.5277): 93%|█████████▎| 1157/1250 [28:10<02:08, 1.39s/it]
Training 1/1 epoch (loss 2.5257): 93%|█████████▎| 1158/1250 [28:11<01:57, 1.28s/it]
Training 1/1 epoch (loss 2.5206): 93%|█████████▎| 1159/1250 [28:13<02:16, 1.50s/it]
Training 1/1 epoch (loss 2.5084): 93%|█████████▎| 1160/1250 [28:14<02:02, 1.36s/it]
Training 1/1 epoch (loss 2.3911): 93%|█████████▎| 1161/1250 [28:17<02:29, 1.68s/it]
Training 1/1 epoch (loss 2.5214): 93%|█████████▎| 1162/1250 [28:19<02:36, 1.78s/it]
Training 1/1 epoch (loss 2.4351): 93%|█████████▎| 1163/1250 [28:20<02:21, 1.62s/it]
Training 1/1 epoch (loss 2.3491): 93%|█████████▎| 1164/1250 [28:22<02:39, 1.86s/it]
Training 1/1 epoch (loss 2.5583): 93%|█████████▎| 1165/1250 [28:24<02:32, 1.79s/it]
Training 1/1 epoch (loss 2.6846): 93%|█████████▎| 1166/1250 [28:26<02:23, 1.71s/it]
Training 1/1 epoch (loss 2.7476): 93%|█████████▎| 1167/1250 [28:28<02:26, 1.77s/it]
Training 1/1 epoch (loss 2.6697): 93%|█████████▎| 1168/1250 [28:29<02:12, 1.62s/it]
Training 1/1 epoch (loss 2.3768): 94%|█████████▎| 1169/1250 [28:30<02:07, 1.58s/it]
Training 1/1 epoch (loss 2.4952): 94%|█████████▎| 1170/1250 [28:32<02:07, 1.59s/it]
Training 1/1 epoch (loss 2.5288): 94%|█████████▎| 1171/1250 [28:32<01:41, 1.29s/it]
Training 1/1 epoch (loss 2.6879): 94%|█████████▍| 1172/1250 [28:34<01:43, 1.33s/it]
Training 1/1 epoch (loss 2.7168): 94%|█████████▍| 1173/1250 [28:35<01:45, 1.37s/it]
Training 1/1 epoch (loss 2.6382): 94%|█████████▍| 1174/1250 [28:36<01:23, 1.10s/it]
Training 1/1 epoch (loss 2.6200): 94%|█████████▍| 1175/1250 [28:37<01:27, 1.17s/it]
Training 1/1 epoch (loss 2.5829): 94%|█████████▍| 1176/1250 [28:39<01:45, 1.43s/it]
Training 1/1 epoch (loss 2.6541): 94%|█████████▍| 1177/1250 [28:41<01:42, 1.40s/it]
Training 1/1 epoch (loss 2.6480): 94%|█████████▍| 1178/1250 [28:42<01:50, 1.53s/it]
Training 1/1 epoch (loss 2.5011): 94%|█████████▍| 1179/1250 [28:43<01:38, 1.39s/it]
Training 1/1 epoch (loss 2.6760): 94%|█████████▍| 1180/1250 [28:44<01:28, 1.26s/it]
Training 1/1 epoch (loss 2.6525): 94%|█████████▍| 1181/1250 [28:47<01:50, 1.60s/it]
Training 1/1 epoch (loss 2.6090): 95%|█████████▍| 1182/1250 [28:47<01:30, 1.33s/it]
Training 1/1 epoch (loss 2.5906): 95%|█████████▍| 1183/1250 [28:50<01:52, 1.67s/it]
Training 1/1 epoch (loss 2.5045): 95%|█████████▍| 1184/1250 [28:52<01:59, 1.81s/it]
Training 1/1 epoch (loss 2.6087): 95%|█████████▍| 1185/1250 [28:53<01:46, 1.64s/it]
Training 1/1 epoch (loss 2.5958): 95%|█████████▍| 1186/1250 [28:55<01:39, 1.56s/it]
Training 1/1 epoch (loss 2.7413): 95%|█████████▍| 1187/1250 [28:56<01:37, 1.55s/it]
Training 1/1 epoch (loss 2.3428): 95%|█████████▌| 1188/1250 [28:58<01:39, 1.60s/it]
Training 1/1 epoch (loss 2.4637): 95%|█████████▌| 1189/1250 [28:59<01:35, 1.57s/it]
Training 1/1 epoch (loss 2.8186): 95%|█████████▌| 1190/1250 [29:01<01:35, 1.58s/it]
Training 1/1 epoch (loss 2.7044): 95%|█████████▌| 1191/1250 [29:02<01:21, 1.38s/it]
Training 1/1 epoch (loss 2.5824): 95%|█████████▌| 1192/1250 [29:04<01:25, 1.47s/it]
Training 1/1 epoch (loss 2.5608): 95%|█████████▌| 1193/1250 [29:05<01:17, 1.36s/it]
Training 1/1 epoch (loss 2.6238): 96%|█████████▌| 1194/1250 [29:07<01:31, 1.63s/it]
Training 1/1 epoch (loss 2.7050): 96%|█████████▌| 1195/1250 [29:09<01:32, 1.68s/it]
Training 1/1 epoch (loss 2.8214): 96%|█████████▌| 1196/1250 [29:10<01:24, 1.56s/it]
Training 1/1 epoch (loss 2.6563): 96%|█████████▌| 1197/1250 [29:12<01:32, 1.74s/it]
Training 1/1 epoch (loss 2.4924): 96%|█████████▌| 1198/1250 [29:13<01:16, 1.48s/it]
Training 1/1 epoch (loss 2.5062): 96%|█████████▌| 1199/1250 [29:15<01:25, 1.69s/it]
Training 1/1 epoch (loss 2.4934): 96%|█████████▌| 1200/1250 [29:17<01:18, 1.57s/it]
Training 1/1 epoch (loss 2.4510): 96%|█████████▌| 1201/1250 [29:18<01:21, 1.66s/it]
Training 1/1 epoch (loss 2.3981): 96%|█████████▌| 1202/1250 [29:20<01:23, 1.75s/it]
Training 1/1 epoch (loss 2.6060): 96%|█████████▌| 1203/1250 [29:21<01:05, 1.39s/it]
Training 1/1 epoch (loss 2.4702): 96%|█████████▋| 1204/1250 [29:22<01:03, 1.38s/it]
Training 1/1 epoch (loss 2.8554): 96%|█████████▋| 1205/1250 [29:24<01:10, 1.56s/it]
Training 1/1 epoch (loss 2.5014): 96%|█████████▋| 1206/1250 [29:25<00:56, 1.29s/it]
Training 1/1 epoch (loss 2.5519): 97%|█████████▋| 1207/1250 [29:26<00:56, 1.31s/it]
Training 1/1 epoch (loss 2.5414): 97%|█████████▋| 1208/1250 [29:27<00:51, 1.23s/it]
Training 1/1 epoch (loss 2.5866): 97%|█████████▋| 1209/1250 [29:28<00:46, 1.13s/it]
Training 1/1 epoch (loss 2.7022): 97%|█████████▋| 1210/1250 [29:31<01:01, 1.53s/it]
Training 1/1 epoch (loss 2.5918): 97%|█████████▋| 1211/1250 [29:32<00:52, 1.34s/it]
Training 1/1 epoch (loss 2.5189): 97%|█████████▋| 1212/1250 [29:33<00:50, 1.32s/it]
Training 1/1 epoch (loss 2.6406): 97%|█████████▋| 1213/1250 [29:34<00:47, 1.28s/it]
Training 1/1 epoch (loss 2.3997): 97%|█████████▋| 1214/1250 [29:35<00:44, 1.24s/it]
Training 1/1 epoch (loss 2.5515): 97%|█████████▋| 1215/1250 [29:37<00:44, 1.26s/it]
Training 1/1 epoch (loss 2.8467): 97%|█████████▋| 1216/1250 [29:39<00:57, 1.70s/it]
Training 1/1 epoch (loss 2.4436): 97%|█████████▋| 1217/1250 [29:40<00:44, 1.35s/it]
Training 1/1 epoch (loss 2.5091): 97%|█████████▋| 1218/1250 [29:41<00:45, 1.43s/it]
Training 1/1 epoch (loss 2.6322): 98%|█████████▊| 1219/1250 [29:43<00:46, 1.52s/it]
Training 1/1 epoch (loss 2.5312): 98%|█████████▊| 1220/1250 [29:44<00:38, 1.28s/it]
Training 1/1 epoch (loss 2.4408): 98%|█████████▊| 1221/1250 [29:45<00:34, 1.18s/it]
Training 1/1 epoch (loss 2.4282): 98%|█████████▊| 1222/1250 [29:46<00:36, 1.31s/it]
Training 1/1 epoch (loss 2.4532): 98%|█████████▊| 1223/1250 [29:47<00:32, 1.19s/it]
Training 1/1 epoch (loss 2.5952): 98%|█████████▊| 1224/1250 [29:50<00:39, 1.53s/it]
Training 1/1 epoch (loss 2.6099): 98%|█████████▊| 1225/1250 [29:50<00:33, 1.32s/it]
Training 1/1 epoch (loss 2.6889): 98%|█████████▊| 1226/1250 [29:51<00:29, 1.23s/it]
Training 1/1 epoch (loss 2.5131): 98%|█████████▊| 1227/1250 [29:53<00:33, 1.44s/it]
Training 1/1 epoch (loss 2.6867): 98%|█████████▊| 1228/1250 [29:54<00:25, 1.16s/it]
Training 1/1 epoch (loss 2.3714): 98%|█████████▊| 1229/1250 [29:56<00:28, 1.36s/it]
Training 1/1 epoch (loss 2.6576): 98%|█████████▊| 1230/1250 [29:57<00:26, 1.35s/it]
Training 1/1 epoch (loss 2.4986): 98%|█████████▊| 1231/1250 [29:58<00:22, 1.17s/it]
Training 1/1 epoch (loss 2.6246): 99%|█████████▊| 1232/1250 [30:01<00:30, 1.71s/it]
Training 1/1 epoch (loss 2.6569): 99%|█████████▊| 1233/1250 [30:03<00:30, 1.80s/it]
Training 1/1 epoch (loss 2.6504): 99%|█████████▊| 1234/1250 [30:04<00:24, 1.50s/it]
Training 1/1 epoch (loss 2.5479): 99%|█████████▉| 1235/1250 [30:05<00:20, 1.39s/it]
Training 1/1 epoch (loss 2.4806): 99%|█████████▉| 1236/1250 [30:07<00:21, 1.52s/it]
Training 1/1 epoch (loss 2.4953): 99%|█████████▉| 1237/1250 [30:07<00:15, 1.22s/it]
Training 1/1 epoch (loss 2.5670): 99%|█████████▉| 1238/1250 [30:09<00:15, 1.28s/it]
Training 1/1 epoch (loss 2.6437): 99%|█████████▉| 1239/1250 [30:10<00:14, 1.28s/it]
Training 1/1 epoch (loss 2.5759): 99%|█████████▉| 1240/1250 [30:11<00:11, 1.16s/it]
Training 1/1 epoch (loss 2.6161): 99%|█████████▉| 1241/1250 [30:12<00:11, 1.30s/it]
Training 1/1 epoch (loss 2.6190): 99%|█████████▉| 1242/1250 [30:13<00:09, 1.15s/it]
Training 1/1 epoch (loss 2.7743): 99%|█████████▉| 1243/1250 [30:15<00:09, 1.30s/it]
Training 1/1 epoch (loss 2.7077): 100%|█████████▉| 1244/1250 [30:17<00:09, 1.55s/it]
Training 1/1 epoch (loss 2.8131): 100%|█████████▉| 1245/1250 [30:17<00:06, 1.21s/it]
Training 1/1 epoch (loss 2.4559): 100%|█████████▉| 1246/1250 [30:19<00:05, 1.40s/it]
Training 1/1 epoch (loss 2.7500): 100%|█████████▉| 1247/1250 [30:21<00:05, 1.67s/it]
Training 1/1 epoch (loss 2.6166): 100%|█████████▉| 1248/1250 [30:22<00:02, 1.32s/it]
Training 1/1 epoch (loss 2.7382): 100%|█████████▉| 1249/1250 [30:24<00:01, 1.67s/it]
Training 1/1 epoch (loss 2.3804): 100%|██████████| 1250/1250 [30:26<00:00, 1.51s/it]
Training 1/1 epoch (loss 2.3804): 100%|██████████| 1250/1250 [30:26<00:00, 1.46s/it]
tokenizer config file saved in /aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-10000/tokenizer_config.json
Special tokens file saved in /aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-10000/special_tokens_map.json
wandb: ERROR Problem finishing run
Exception ignored in atexit callback: <bound method rank_zero_only.<locals>.wrapper of <safe_rlhf.logger.Logger object at 0x155081f08290>>
Traceback (most recent call last):
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/utils.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/logger.py", line 183, in close
    self.wandb.finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 406, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 503, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 451, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2309, in finish
    return self._finish(exit_code)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 406, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2337, in _finish
    self._atexit_cleanup(exit_code=exit_code)
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2550, in _atexit_cleanup
    self._on_finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2806, in _on_finish
    wait_with_progress(
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress
    return wait_all_with_progress(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress
    return asyncio_compat.run(progress_loop_with_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run
    future = executor.submit(runner.run, fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown