+ deepspeed --master_port 19346 --module safe_rlhf.finetune --train_datasets inverse-json::/home/hansirui_1st/jiayi/resist/setting3/safety_data/training/safe/safe_30k.json --model_name_or_path /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T --max_length 2048 --trust_remote_code True --epochs 1 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --gradient_accumulation_steps 8 --gradient_checkpointing --learning_rate 1e-5 --lr_warmup_ratio 0 --weight_decay 0.0 --lr_scheduler_type constant --weight_decay 0.0 --seed 42 --output_dir /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-2T/tinyllama-2T-s3-Q1-30k --log_type wandb --log_run_name tinyllama-2T-s3-Q1-30k --log_project Inverse_Alignment --zero_stage 3 --offload none --bf16 True --tf32 True --save_16bit
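Note on the batch arithmetic implied by the command above. This is a minimal sketch, not part of the original log. It assumes 8 ranks (suggested by the rank0 to rank7 messages in this log) and 30,000 training samples (guessed from the file name safe_30k.json; the exact count is not shown). Under those assumptions the per-epoch iteration count matches the total of 938 that the training progress bar reports.

```python
import math

# Values taken directly from the launch command.
per_device_train_batch_size = 4
gradient_accumulation_steps = 8

# Assumptions, not stated in the command:
#   world_size  -> 8, inferred from the rank0..rank7 messages in this log
#   num_samples -> 30,000, guessed from the dataset file name safe_30k.json
world_size = 8
num_samples = 30_000

# Samples consumed across all ranks in one forward/backward iteration.
samples_per_iteration = per_device_train_batch_size * world_size

# Dataloader iterations in one epoch (the last batch may be partial).
iterations_per_epoch = math.ceil(num_samples / samples_per_iteration)

# Samples contributing to each optimizer update once accumulation is applied.
effective_batch_size = samples_per_iteration * gradient_accumulation_steps

print(iterations_per_epoch)   # 938
print(effective_batch_size)   # 256
```

If the real dataset size differs slightly from 30,000, the iteration count still rounds to 938 for any size from 29,985 to 30,016 samples.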
[rank4]:[W529 17:26:48.462090986 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 4] using GPU 4 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank3]:[W529 17:26:49.301168376 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank1]:[W529 17:26:49.611287717 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank5]:[W529 17:26:49.722383317 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 5] using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank2]:[W529 17:26:49.736489114 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank6]:[W529 17:26:49.744524280 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 6] using GPU 6 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank0]:[W529 17:26:49.745616340 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank7]:[W529 17:26:49.764576913 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 7] using GPU 7 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/config.json
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/model.safetensors
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.model
loading file tokenizer.model
loading file tokenizer.json
loading file tokenizer.json
loading file added_tokens.json
loading file tokenizer.json
loading file special_tokens_map.json
loading file added_tokens.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file special_tokens_map.json
loading file chat_template.jinja
loading file tokenizer_config.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file chat_template.jinja
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-955k-token-2T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
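Note on the `pad_to_multiple_of` warning above. This is a minimal sketch, not part of the original log, of the rounding that such a parameter implies: the vocabulary size 32001 is rounded up to the next multiple of a chosen value. The multiples 64 and 8 used here are illustrative assumptions; the log does not say which value, if any, suits this run, and the helper name `round_up_to_multiple` is hypothetical.

```python
def round_up_to_multiple(n: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= n."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return ((n + multiple - 1) // multiple) * multiple


# 32001 is the resized embedding dimension reported in the warning.
new_vocab_size = 32001

# Illustrative choices only; the right multiple depends on dtype and hardware.
padded_64 = round_up_to_multiple(new_vocab_size, 64)
padded_8 = round_up_to_multiple(new_vocab_size, 8)

print(padded_64)  # 32064
print(padded_8)   # 32008
```

The extra rows are padding that no token ID maps to; they exist only to make the matrix dimension friendlier to Tensor Cores.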
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/hansirui_1st/.cache/torch_extensions/py311_cu124/fused_adam/build.ninja...
/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/torch/utils/cpp_extension.py:2059: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
wandb: Currently logged in as: xtom to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.8
wandb: Run data is saved locally in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-2T/tinyllama-2T-s3-Q1-30k/wandb/run-20250529_172723-8imc1sym
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run tinyllama-2T-s3-Q1-30k
wandb: ⭐️ View project at https://wandb.ai/xtom/Inverse_Alignment
wandb: 🚀 View run at https://wandb.ai/xtom/Inverse_Alignment/runs/8imc1sym
Training 1/1 epoch: 0%| | 0/938 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training 1/1 epoch (loss 2.0886): 0%| | 0/938 [00:05<?, ?it/s]
Training 1/1 epoch (loss 2.0886): 0%| | 1/938 [00:05<1:23:40, 5.36s/it]
Training 1/1 epoch (loss 2.0124): 0%| | 1/938 [00:06<1:23:40, 5.36s/it]
Training 1/1 epoch (loss 2.0124): 0%| | 2/938 [00:06<47:47, 3.06s/it]
Training 1/1 epoch (loss 2.0405): 0%| | 2/938 [00:07<47:47, 3.06s/it]
Training 1/1 epoch (loss 2.0405): 0%| | 3/938 [00:07<28:19, 1.82s/it]
Training 1/1 epoch (loss 2.0534): 0%| | 3/938 [00:07<28:19, 1.82s/it]
Training 1/1 epoch (loss 2.0534): 0%| | 4/938 [00:07<19:01, 1.22s/it]
Training 1/1 epoch (loss 1.9724): 0%| | 4/938 [00:07<19:01, 1.22s/it]
Training 1/1 epoch (loss 1.9724): 1%| | 5/938 [00:07<13:51, 1.12it/s]
Training 1/1 epoch (loss 1.9856): 1%| | 5/938 [00:08<13:51, 1.12it/s]
Training 1/1 epoch (loss 1.9856): 1%| | 6/938 [00:08<10:54, 1.42it/s]
Training 1/1 epoch (loss 2.0163): 1%| | 6/938 [00:08<10:54, 1.42it/s]
Training 1/1 epoch (loss 2.0163): 1%| | 7/938 [00:08<08:50, 1.76it/s]
Training 1/1 epoch (loss 2.0723): 1%| | 7/938 [00:08<08:50, 1.76it/s]
Training 1/1 epoch (loss 2.0723): 1%| | 8/938 [00:08<07:59, 1.94it/s]
Training 1/1 epoch (loss 1.9441): 1%| | 8/938 [00:09<07:59, 1.94it/s]
Training 1/1 epoch (loss 1.9441): 1%| | 9/938 [00:09<06:56, 2.23it/s]
Training 1/1 epoch (loss 1.9509): 1%| | 9/938 [00:09<06:56, 2.23it/s]
Training 1/1 epoch (loss 1.9509): 1%| | 10/938 [00:09<06:17, 2.46it/s]
Training 1/1 epoch (loss 2.0250): 1%| | 10/938 [00:09<06:17, 2.46it/s]
Training 1/1 epoch (loss 2.0250): 1%| | 11/938 [00:09<06:08, 2.52it/s]
Training 1/1 epoch (loss 2.0769): 1%| | 11/938 [00:10<06:08, 2.52it/s]
Training 1/1 epoch (loss 2.0769): 1%|▏ | 12/938 [00:10<05:50, 2.64it/s]
Training 1/1 epoch (loss 2.1021): 1%|▏ | 12/938 [00:10<05:50, 2.64it/s]
Training 1/1 epoch (loss 2.1021): 1%|▏ | 13/938 [00:10<05:46, 2.67it/s]
Training 1/1 epoch (loss 1.8803): 1%|▏ | 13/938 [00:10<05:46, 2.67it/s]
Training 1/1 epoch (loss 1.8803): 1%|▏ | 14/938 [00:10<05:27, 2.82it/s]
Training 1/1 epoch (loss 2.1142): 1%|▏ | 14/938 [00:11<05:27, 2.82it/s]
Training 1/1 epoch (loss 2.1142): 2%|▏ | 15/938 [00:11<05:11, 2.96it/s]
Training 1/1 epoch (loss 2.0380): 2%|▏ | 15/938 [00:11<05:11, 2.96it/s]
Training 1/1 epoch (loss 2.0380): 2%|▏ | 16/938 [00:11<05:06, 3.01it/s]
Training 1/1 epoch (loss 1.8542): 2%|▏ | 16/938 [00:11<05:06, 3.01it/s]
Training 1/1 epoch (loss 1.8542): 2%|▏ | 17/938 [00:11<05:03, 3.03it/s]
Training 1/1 epoch (loss 2.0572): 2%|▏ | 17/938 [00:12<05:03, 3.03it/s]
Training 1/1 epoch (loss 2.0572): 2%|▏ | 18/938 [00:12<05:12, 2.94it/s]
Training 1/1 epoch (loss 1.8676): 2%|▏ | 18/938 [00:12<05:12, 2.94it/s]
Training 1/1 epoch (loss 1.8676): 2%|▏ | 19/938 [00:12<05:06, 3.00it/s]
Training 1/1 epoch (loss 1.9604): 2%|▏ | 19/938 [00:12<05:06, 3.00it/s]
Training 1/1 epoch (loss 1.9604): 2%|▏ | 20/938 [00:12<04:58, 3.07it/s]
Training 1/1 epoch (loss 1.9735): 2%|▏ | 20/938 [00:13<04:58, 3.07it/s]
Training 1/1 epoch (loss 1.9735): 2%|▏ | 21/938 [00:13<04:58, 3.07it/s]
Training 1/1 epoch (loss 2.1057): 2%|▏ | 21/938 [00:13<04:58, 3.07it/s]
Training 1/1 epoch (loss 2.1057): 2%|▏ | 22/938 [00:13<04:54, 3.12it/s]
Training 1/1 epoch (loss 1.9158): 2%|▏ | 22/938 [00:13<04:54, 3.12it/s]
Training 1/1 epoch (loss 1.9158): 2%|▏ | 23/938 [00:13<04:52, 3.13it/s]
Training 1/1 epoch (loss 1.9458): 2%|▏ | 23/938 [00:14<04:52, 3.13it/s]
Training 1/1 epoch (loss 1.9458): 3%|▎ | 24/938 [00:14<05:05, 2.99it/s]
Training 1/1 epoch (loss 2.0418): 3%|▎ | 24/938 [00:14<05:05, 2.99it/s]
Training 1/1 epoch (loss 2.0418): 3%|▎ | 25/938 [00:14<04:58, 3.06it/s]
Training 1/1 epoch (loss 2.0315): 3%|▎ | 25/938 [00:14<04:58, 3.06it/s]
Training 1/1 epoch (loss 2.0315): 3%|▎ | 26/938 [00:14<04:54, 3.10it/s]
Training 1/1 epoch (loss 1.9305): 3%|▎ | 26/938 [00:15<04:54, 3.10it/s]
Training 1/1 epoch (loss 1.9305): 3%|▎ | 27/938 [00:15<04:58, 3.06it/s]
Training 1/1 epoch (loss 1.8989): 3%|▎ | 27/938 [00:15<04:58, 3.06it/s]
Training 1/1 epoch (loss 1.8989): 3%|▎ | 28/938 [00:15<04:51,
3.12it/s] Training 1/1 epoch (loss 2.0149): 3%|β–Ž | 28/938 [00:15<04:51, 3.12it/s] Training 1/1 epoch (loss 2.0149): 3%|β–Ž | 29/938 [00:15<04:54, 3.09it/s] Training 1/1 epoch (loss 2.0278): 3%|β–Ž | 29/938 [00:16<04:54, 3.09it/s] Training 1/1 epoch (loss 2.0278): 3%|β–Ž | 30/938 [00:16<05:03, 2.99it/s] Training 1/1 epoch (loss 1.9639): 3%|β–Ž | 30/938 [00:16<05:03, 2.99it/s] Training 1/1 epoch (loss 1.9639): 3%|β–Ž | 31/938 [00:16<04:54, 3.08it/s] Training 1/1 epoch (loss 1.9956): 3%|β–Ž | 31/938 [00:16<04:54, 3.08it/s] Training 1/1 epoch (loss 1.9956): 3%|β–Ž | 32/938 [00:16<04:55, 3.07it/s] Training 1/1 epoch (loss 1.8724): 3%|β–Ž | 32/938 [00:16<04:55, 3.07it/s] Training 1/1 epoch (loss 1.8724): 4%|β–Ž | 33/938 [00:16<04:50, 3.11it/s] Training 1/1 epoch (loss 1.9103): 4%|β–Ž | 33/938 [00:17<04:50, 3.11it/s] Training 1/1 epoch (loss 1.9103): 4%|β–Ž | 34/938 [00:17<04:46, 3.15it/s] Training 1/1 epoch (loss 1.8523): 4%|β–Ž | 34/938 [00:17<04:46, 3.15it/s] Training 1/1 epoch (loss 1.8523): 4%|β–Ž | 35/938 [00:17<04:43, 3.18it/s] Training 1/1 epoch (loss 1.8735): 4%|β–Ž | 35/938 [00:17<04:43, 3.18it/s] Training 1/1 epoch (loss 1.8735): 4%|▍ | 36/938 [00:17<04:59, 3.01it/s] Training 1/1 epoch (loss 1.9303): 4%|▍ | 36/938 [00:18<04:59, 3.01it/s] Training 1/1 epoch (loss 1.9303): 4%|▍ | 37/938 [00:18<04:49, 3.11it/s] Training 1/1 epoch (loss 1.8985): 4%|▍ | 37/938 [00:18<04:49, 3.11it/s] Training 1/1 epoch (loss 1.8985): 4%|▍ | 38/938 [00:18<04:44, 3.16it/s] Training 1/1 epoch (loss 1.8596): 4%|▍ | 38/938 [00:18<04:44, 3.16it/s] Training 1/1 epoch (loss 1.8596): 4%|▍ | 39/938 [00:18<04:41, 3.19it/s] Training 1/1 epoch (loss 1.8876): 4%|▍ | 39/938 [00:19<04:41, 3.19it/s] Training 1/1 epoch (loss 1.8876): 4%|▍ | 40/938 [00:19<04:48, 3.12it/s] Training 1/1 epoch (loss 1.7993): 4%|▍ | 40/938 [00:19<04:48, 3.12it/s] Training 1/1 epoch (loss 1.7993): 4%|▍ | 41/938 [00:19<04:43, 3.16it/s] Training 1/1 epoch (loss 1.9353): 4%|▍ | 41/938 [00:19<04:43, 3.16it/s] Training 1/1 
epoch (loss 1.9353): 4%|▍ | 42/938 [00:19<04:47, 3.12it/s] Training 1/1 epoch (loss 1.8642): 4%|▍ | 42/938 [00:20<04:47, 3.12it/s] Training 1/1 epoch (loss 1.8642): 5%|▍ | 43/938 [00:20<05:03, 2.95it/s] Training 1/1 epoch (loss 1.8517): 5%|▍ | 43/938 [00:20<05:03, 2.95it/s] Training 1/1 epoch (loss 1.8517): 5%|▍ | 44/938 [00:20<04:54, 3.04it/s] Training 1/1 epoch (loss 1.8349): 5%|▍ | 44/938 [00:20<04:54, 3.04it/s] Training 1/1 epoch (loss 1.8349): 5%|▍ | 45/938 [00:20<05:00, 2.97it/s] Training 1/1 epoch (loss 1.8698): 5%|▍ | 45/938 [00:21<05:00, 2.97it/s] Training 1/1 epoch (loss 1.8698): 5%|▍ | 46/938 [00:21<04:49, 3.09it/s] Training 1/1 epoch (loss 1.9042): 5%|▍ | 46/938 [00:21<04:49, 3.09it/s] Training 1/1 epoch (loss 1.9042): 5%|β–Œ | 47/938 [00:21<04:44, 3.13it/s] Training 1/1 epoch (loss 1.8854): 5%|β–Œ | 47/938 [00:21<04:44, 3.13it/s] Training 1/1 epoch (loss 1.8854): 5%|β–Œ | 48/938 [00:21<04:51, 3.05it/s] Training 1/1 epoch (loss 1.9489): 5%|β–Œ | 48/938 [00:22<04:51, 3.05it/s] Training 1/1 epoch (loss 1.9489): 5%|β–Œ | 49/938 [00:22<04:54, 3.02it/s] Training 1/1 epoch (loss 1.7554): 5%|β–Œ | 49/938 [00:22<04:54, 3.02it/s] Training 1/1 epoch (loss 1.7554): 5%|β–Œ | 50/938 [00:22<04:47, 3.09it/s] Training 1/1 epoch (loss 1.8064): 5%|β–Œ | 50/938 [00:22<04:47, 3.09it/s] Training 1/1 epoch (loss 1.8064): 5%|β–Œ | 51/938 [00:22<04:42, 3.14it/s] Training 1/1 epoch (loss 1.8808): 5%|β–Œ | 51/938 [00:23<04:42, 3.14it/s] Training 1/1 epoch (loss 1.8808): 6%|β–Œ | 52/938 [00:23<04:39, 3.16it/s] Training 1/1 epoch (loss 1.7632): 6%|β–Œ | 52/938 [00:23<04:39, 3.16it/s] Training 1/1 epoch (loss 1.7632): 6%|β–Œ | 53/938 [00:23<04:39, 3.17it/s] Training 1/1 epoch (loss 1.8720): 6%|β–Œ | 53/938 [00:23<04:39, 3.17it/s] Training 1/1 epoch (loss 1.8720): 6%|β–Œ | 54/938 [00:23<04:30, 3.27it/s] Training 1/1 epoch (loss 1.8994): 6%|β–Œ | 54/938 [00:24<04:30, 3.27it/s] Training 1/1 epoch (loss 1.8994): 6%|β–Œ | 55/938 [00:24<04:39, 3.16it/s] Training 1/1 epoch (loss 1.7690): 
6%|β–Œ | 55/938 [00:24<04:39, 3.16it/s] Training 1/1 epoch (loss 1.7690): 6%|β–Œ | 56/938 [00:24<04:38, 3.17it/s] Training 1/1 epoch (loss 1.7478): 6%|β–Œ | 56/938 [00:24<04:38, 3.17it/s] Training 1/1 epoch (loss 1.7478): 6%|β–Œ | 57/938 [00:24<04:36, 3.19it/s] Training 1/1 epoch (loss 1.7764): 6%|β–Œ | 57/938 [00:24<04:36, 3.19it/s] Training 1/1 epoch (loss 1.7764): 6%|β–Œ | 58/938 [00:24<04:36, 3.18it/s] Training 1/1 epoch (loss 1.7674): 6%|β–Œ | 58/938 [00:25<04:36, 3.18it/s] Training 1/1 epoch (loss 1.7674): 6%|β–‹ | 59/938 [00:25<04:32, 3.23it/s] Training 1/1 epoch (loss 1.8116): 6%|β–‹ | 59/938 [00:25<04:32, 3.23it/s] Training 1/1 epoch (loss 1.8116): 6%|β–‹ | 60/938 [00:25<04:37, 3.17it/s] Training 1/1 epoch (loss 1.8339): 6%|β–‹ | 60/938 [00:25<04:37, 3.17it/s] Training 1/1 epoch (loss 1.8339): 7%|β–‹ | 61/938 [00:25<04:43, 3.09it/s] Training 1/1 epoch (loss 1.8119): 7%|β–‹ | 61/938 [00:26<04:43, 3.09it/s] Training 1/1 epoch (loss 1.8119): 7%|β–‹ | 62/938 [00:26<04:45, 3.07it/s] Training 1/1 epoch (loss 1.8281): 7%|β–‹ | 62/938 [00:26<04:45, 3.07it/s] Training 1/1 epoch (loss 1.8281): 7%|β–‹ | 63/938 [00:26<04:38, 3.14it/s] Training 1/1 epoch (loss 1.6736): 7%|β–‹ | 63/938 [00:26<04:38, 3.14it/s] Training 1/1 epoch (loss 1.6736): 7%|β–‹ | 64/938 [00:26<04:41, 3.10it/s] Training 1/1 epoch (loss 1.7686): 7%|β–‹ | 64/938 [00:27<04:41, 3.10it/s] Training 1/1 epoch (loss 1.7686): 7%|β–‹ | 65/938 [00:27<04:38, 3.13it/s] Training 1/1 epoch (loss 1.8431): 7%|β–‹ | 65/938 [00:27<04:38, 3.13it/s] Training 1/1 epoch (loss 1.8431): 7%|β–‹ | 66/938 [00:27<04:35, 3.16it/s] Training 1/1 epoch (loss 1.7686): 7%|β–‹ | 66/938 [00:27<04:35, 3.16it/s] Training 1/1 epoch (loss 1.7686): 7%|β–‹ | 67/938 [00:27<04:43, 3.07it/s] Training 1/1 epoch (loss 1.7111): 7%|β–‹ | 67/938 [00:28<04:43, 3.07it/s] Training 1/1 epoch (loss 1.7111): 7%|β–‹ | 68/938 [00:28<04:56, 2.94it/s] Training 1/1 epoch (loss 1.7841): 7%|β–‹ | 68/938 [00:28<04:56, 2.94it/s] Training 1/1 epoch (loss 1.7841): 
7%|β–‹ | 69/938 [00:28<04:49, 3.00it/s] Training 1/1 epoch (loss 1.8431): 7%|β–‹ | 69/938 [00:28<04:49, 3.00it/s] Training 1/1 epoch (loss 1.8431): 7%|β–‹ | 70/938 [00:28<04:41, 3.08it/s] Training 1/1 epoch (loss 1.6584): 7%|β–‹ | 70/938 [00:29<04:41, 3.08it/s] Training 1/1 epoch (loss 1.6584): 8%|β–Š | 71/938 [00:29<04:33, 3.17it/s] Training 1/1 epoch (loss 1.7463): 8%|β–Š | 71/938 [00:29<04:33, 3.17it/s] Training 1/1 epoch (loss 1.7463): 8%|β–Š | 72/938 [00:29<04:34, 3.15it/s] Training 1/1 epoch (loss 1.6684): 8%|β–Š | 72/938 [00:29<04:34, 3.15it/s] Training 1/1 epoch (loss 1.6684): 8%|β–Š | 73/938 [00:29<04:34, 3.15it/s] Training 1/1 epoch (loss 1.6903): 8%|β–Š | 73/938 [00:30<04:34, 3.15it/s] Training 1/1 epoch (loss 1.6903): 8%|β–Š | 74/938 [00:30<04:53, 2.94it/s] Training 1/1 epoch (loss 1.6414): 8%|β–Š | 74/938 [00:30<04:53, 2.94it/s] Training 1/1 epoch (loss 1.6414): 8%|β–Š | 75/938 [00:30<04:47, 3.00it/s] Training 1/1 epoch (loss 1.7005): 8%|β–Š | 75/938 [00:30<04:47, 3.00it/s] Training 1/1 epoch (loss 1.7005): 8%|β–Š | 76/938 [00:30<04:46, 3.01it/s] Training 1/1 epoch (loss 1.6195): 8%|β–Š | 76/938 [00:31<04:46, 3.01it/s] Training 1/1 epoch (loss 1.6195): 8%|β–Š | 77/938 [00:31<04:40, 3.07it/s] Training 1/1 epoch (loss 1.8112): 8%|β–Š | 77/938 [00:31<04:40, 3.07it/s] Training 1/1 epoch (loss 1.8112): 8%|β–Š | 78/938 [00:31<04:34, 3.13it/s] Training 1/1 epoch (loss 1.7361): 8%|β–Š | 78/938 [00:31<04:34, 3.13it/s] Training 1/1 epoch (loss 1.7361): 8%|β–Š | 79/938 [00:31<04:35, 3.12it/s] Training 1/1 epoch (loss 1.8144): 8%|β–Š | 79/938 [00:32<04:35, 3.12it/s] Training 1/1 epoch (loss 1.8144): 9%|β–Š | 80/938 [00:32<04:48, 2.97it/s] Training 1/1 epoch (loss 1.7534): 9%|β–Š | 80/938 [00:32<04:48, 2.97it/s] Training 1/1 epoch (loss 1.7534): 9%|β–Š | 81/938 [00:32<04:56, 2.89it/s] Training 1/1 epoch (loss 1.6256): 9%|β–Š | 81/938 [00:32<04:56, 2.89it/s] Training 1/1 epoch (loss 1.6256): 9%|β–Š | 82/938 [00:32<04:58, 2.86it/s] Training 1/1 epoch (loss 1.7068): 
9%|β–Š | 82/938 [00:33<04:58, 2.86it/s] Training 1/1 epoch (loss 1.7068): 9%|β–‰ | 83/938 [00:33<04:46, 2.99it/s] Training 1/1 epoch (loss 1.6722): 9%|β–‰ | 83/938 [00:33<04:46, 2.99it/s] Training 1/1 epoch (loss 1.6722): 9%|β–‰ | 84/938 [00:33<04:38, 3.06it/s] Training 1/1 epoch (loss 1.6786): 9%|β–‰ | 84/938 [00:33<04:38, 3.06it/s] Training 1/1 epoch (loss 1.6786): 9%|β–‰ | 85/938 [00:33<04:38, 3.06it/s] Training 1/1 epoch (loss 1.7165): 9%|β–‰ | 85/938 [00:34<04:38, 3.06it/s] Training 1/1 epoch (loss 1.7165): 9%|β–‰ | 86/938 [00:34<04:41, 3.03it/s] Training 1/1 epoch (loss 1.6949): 9%|β–‰ | 86/938 [00:34<04:41, 3.03it/s] Training 1/1 epoch (loss 1.6949): 9%|β–‰ | 87/938 [00:34<04:38, 3.05it/s] Training 1/1 epoch (loss 1.6626): 9%|β–‰ | 87/938 [00:34<04:38, 3.05it/s] Training 1/1 epoch (loss 1.6626): 9%|β–‰ | 88/938 [00:34<04:34, 3.10it/s] Training 1/1 epoch (loss 1.6971): 9%|β–‰ | 88/938 [00:35<04:34, 3.10it/s] Training 1/1 epoch (loss 1.6971): 9%|β–‰ | 89/938 [00:35<04:29, 3.14it/s] Training 1/1 epoch (loss 1.7156): 9%|β–‰ | 89/938 [00:35<04:29, 3.14it/s] Training 1/1 epoch (loss 1.7156): 10%|β–‰ | 90/938 [00:35<04:28, 3.15it/s] Training 1/1 epoch (loss 1.7658): 10%|β–‰ | 90/938 [00:35<04:28, 3.15it/s] Training 1/1 epoch (loss 1.7658): 10%|β–‰ | 91/938 [00:35<04:35, 3.07it/s] Training 1/1 epoch (loss 1.5823): 10%|β–‰ | 91/938 [00:36<04:35, 3.07it/s] Training 1/1 epoch (loss 1.5823): 10%|β–‰ | 92/938 [00:36<04:36, 3.06it/s] Training 1/1 epoch (loss 1.8280): 10%|β–‰ | 92/938 [00:36<04:36, 3.06it/s] Training 1/1 epoch (loss 1.8280): 10%|β–‰ | 93/938 [00:36<04:37, 3.05it/s] Training 1/1 epoch (loss 1.7043): 10%|β–‰ | 93/938 [00:36<04:37, 3.05it/s] Training 1/1 epoch (loss 1.7043): 10%|β–ˆ | 94/938 [00:36<04:34, 3.07it/s] Training 1/1 epoch (loss 1.6915): 10%|β–ˆ | 94/938 [00:37<04:34, 3.07it/s] Training 1/1 epoch (loss 1.6915): 10%|β–ˆ | 95/938 [00:37<04:31, 3.11it/s] Training 1/1 epoch (loss 1.7185): 10%|β–ˆ | 95/938 [00:37<04:31, 3.11it/s] Training 1/1 epoch 
(loss 1.7185): 10%|β–ˆ | 96/938 [00:37<04:27, 3.15it/s] Training 1/1 epoch (loss 1.7642): 10%|β–ˆ | 96/938 [00:37<04:27, 3.15it/s] Training 1/1 epoch (loss 1.7642): 10%|β–ˆ | 97/938 [00:37<04:26, 3.16it/s] Training 1/1 epoch (loss 1.6670): 10%|β–ˆ | 97/938 [00:37<04:26, 3.16it/s] Training 1/1 epoch (loss 1.6670): 10%|β–ˆ | 98/938 [00:37<04:34, 3.06it/s] Training 1/1 epoch (loss 1.7587): 10%|β–ˆ | 98/938 [00:38<04:34, 3.06it/s] Training 1/1 epoch (loss 1.7587): 11%|β–ˆ | 99/938 [00:38<04:36, 3.04it/s] Training 1/1 epoch (loss 1.5194): 11%|β–ˆ | 99/938 [00:38<04:36, 3.04it/s] Training 1/1 epoch (loss 1.5194): 11%|β–ˆ | 100/938 [00:38<04:28, 3.12it/s] Training 1/1 epoch (loss 1.6758): 11%|β–ˆ | 100/938 [00:38<04:28, 3.12it/s] Training 1/1 epoch (loss 1.6758): 11%|β–ˆ | 101/938 [00:38<04:24, 3.16it/s] Training 1/1 epoch (loss 1.6739): 11%|β–ˆ | 101/938 [00:39<04:24, 3.16it/s] Training 1/1 epoch (loss 1.6739): 11%|β–ˆ | 102/938 [00:39<04:25, 3.15it/s] Training 1/1 epoch (loss 1.6454): 11%|β–ˆ | 102/938 [00:39<04:25, 3.15it/s] Training 1/1 epoch (loss 1.6454): 11%|β–ˆ | 103/938 [00:39<04:21, 3.19it/s] Training 1/1 epoch (loss 1.7066): 11%|β–ˆ | 103/938 [00:39<04:21, 3.19it/s] Training 1/1 epoch (loss 1.7066): 11%|β–ˆ | 104/938 [00:39<04:30, 3.09it/s] Training 1/1 epoch (loss 1.7418): 11%|β–ˆ | 104/938 [00:40<04:30, 3.09it/s] Training 1/1 epoch (loss 1.7418): 11%|β–ˆ | 105/938 [00:40<04:32, 3.06it/s] Training 1/1 epoch (loss 1.7281): 11%|β–ˆ | 105/938 [00:40<04:32, 3.06it/s] Training 1/1 epoch (loss 1.7281): 11%|β–ˆβ– | 106/938 [00:40<04:43, 2.94it/s] Training 1/1 epoch (loss 1.7611): 11%|β–ˆβ– | 106/938 [00:40<04:43, 2.94it/s] Training 1/1 epoch (loss 1.7611): 11%|β–ˆβ– | 107/938 [00:40<04:34, 3.03it/s] Training 1/1 epoch (loss 1.8436): 11%|β–ˆβ– | 107/938 [00:41<04:34, 3.03it/s] Training 1/1 epoch (loss 1.8436): 12%|β–ˆβ– | 108/938 [00:41<04:25, 3.12it/s] Training 1/1 epoch (loss 1.7631): 12%|β–ˆβ– | 108/938 [00:41<04:25, 3.12it/s] Training 1/1 epoch (loss 
1.7631): 12%|β–ˆβ– | 109/938 [00:41<04:23, 3.14it/s] Training 1/1 epoch (loss 1.8213): 12%|β–ˆβ– | 109/938 [00:41<04:23, 3.14it/s] Training 1/1 epoch (loss 1.8213): 12%|β–ˆβ– | 110/938 [00:41<04:44, 2.91it/s] Training 1/1 epoch (loss 1.5215): 12%|β–ˆβ– | 110/938 [00:42<04:44, 2.91it/s] Training 1/1 epoch (loss 1.5215): 12%|β–ˆβ– | 111/938 [00:42<04:46, 2.88it/s] Training 1/1 epoch (loss 1.7422): 12%|β–ˆβ– | 111/938 [00:42<04:46, 2.88it/s] Training 1/1 epoch (loss 1.7422): 12%|β–ˆβ– | 112/938 [00:42<04:43, 2.92it/s] Training 1/1 epoch (loss 1.6342): 12%|β–ˆβ– | 112/938 [00:42<04:43, 2.92it/s] Training 1/1 epoch (loss 1.6342): 12%|β–ˆβ– | 113/938 [00:42<04:32, 3.03it/s] Training 1/1 epoch (loss 1.6527): 12%|β–ˆβ– | 113/938 [00:43<04:32, 3.03it/s] Training 1/1 epoch (loss 1.6527): 12%|β–ˆβ– | 114/938 [00:43<04:22, 3.14it/s] Training 1/1 epoch (loss 1.6351): 12%|β–ˆβ– | 114/938 [00:43<04:22, 3.14it/s] Training 1/1 epoch (loss 1.6351): 12%|β–ˆβ– | 115/938 [00:43<04:21, 3.15it/s] Training 1/1 epoch (loss 1.7526): 12%|β–ˆβ– | 115/938 [00:43<04:21, 3.15it/s] Training 1/1 epoch (loss 1.7526): 12%|β–ˆβ– | 116/938 [00:43<04:18, 3.18it/s] Training 1/1 epoch (loss 1.6462): 12%|β–ˆβ– | 116/938 [00:44<04:18, 3.18it/s] Training 1/1 epoch (loss 1.6462): 12%|β–ˆβ– | 117/938 [00:44<04:24, 3.10it/s] Training 1/1 epoch (loss 1.6912): 12%|β–ˆβ– | 117/938 [00:44<04:24, 3.10it/s] Training 1/1 epoch (loss 1.6912): 13%|β–ˆβ–Ž | 118/938 [00:44<04:19, 3.17it/s] Training 1/1 epoch (loss 1.7530): 13%|β–ˆβ–Ž | 118/938 [00:44<04:19, 3.17it/s] Training 1/1 epoch (loss 1.7530): 13%|β–ˆβ–Ž | 119/938 [00:44<04:18, 3.17it/s] Training 1/1 epoch (loss 1.5902): 13%|β–ˆβ–Ž | 119/938 [00:45<04:18, 3.17it/s] Training 1/1 epoch (loss 1.5902): 13%|β–ˆβ–Ž | 120/938 [00:45<04:19, 3.15it/s] Training 1/1 epoch (loss 1.6880): 13%|β–ˆβ–Ž | 120/938 [00:45<04:19, 3.15it/s] Training 1/1 epoch (loss 1.6880): 13%|β–ˆβ–Ž | 121/938 [00:45<04:18, 3.17it/s] Training 1/1 epoch (loss 1.6769): 13%|β–ˆβ–Ž | 
121/938 [00:45<04:18, 3.17it/s] Training 1/1 epoch (loss 1.6769): 13%|β–ˆβ–Ž | 122/938 [00:45<04:17, 3.17it/s] Training 1/1 epoch (loss 1.6504): 13%|β–ˆβ–Ž | 122/938 [00:46<04:17, 3.17it/s] Training 1/1 epoch (loss 1.6504): 13%|β–ˆβ–Ž | 123/938 [00:46<04:27, 3.05it/s] Training 1/1 epoch (loss 1.6799): 13%|β–ˆβ–Ž | 123/938 [00:46<04:27, 3.05it/s] Training 1/1 epoch (loss 1.6799): 13%|β–ˆβ–Ž | 124/938 [00:46<04:20, 3.12it/s] Training 1/1 epoch (loss 1.7216): 13%|β–ˆβ–Ž | 124/938 [00:46<04:20, 3.12it/s] Training 1/1 epoch (loss 1.7216): 13%|β–ˆβ–Ž | 125/938 [00:46<04:12, 3.22it/s] Training 1/1 epoch (loss 1.7810): 13%|β–ˆβ–Ž | 125/938 [00:47<04:12, 3.22it/s] Training 1/1 epoch (loss 1.7810): 13%|β–ˆβ–Ž | 126/938 [00:47<04:25, 3.06it/s] Training 1/1 epoch (loss 1.5516): 13%|β–ˆβ–Ž | 126/938 [00:47<04:25, 3.06it/s] Training 1/1 epoch (loss 1.5516): 14%|β–ˆβ–Ž | 127/938 [00:47<04:23, 3.07it/s] Training 1/1 epoch (loss 1.6115): 14%|β–ˆβ–Ž | 127/938 [00:47<04:23, 3.07it/s] Training 1/1 epoch (loss 1.6115): 14%|β–ˆβ–Ž | 128/938 [00:47<04:25, 3.06it/s] Training 1/1 epoch (loss 1.6810): 14%|β–ˆβ–Ž | 128/938 [00:48<04:25, 3.06it/s] Training 1/1 epoch (loss 1.6810): 14%|β–ˆβ– | 129/938 [00:48<04:26, 3.04it/s] Training 1/1 epoch (loss 1.5843): 14%|β–ˆβ– | 129/938 [00:48<04:26, 3.04it/s] Training 1/1 epoch (loss 1.5843): 14%|β–ˆβ– | 130/938 [00:48<04:23, 3.07it/s] Training 1/1 epoch (loss 1.6530): 14%|β–ˆβ– | 130/938 [00:48<04:23, 3.07it/s] Training 1/1 epoch (loss 1.6530): 14%|β–ˆβ– | 131/938 [00:48<04:18, 3.12it/s] Training 1/1 epoch (loss 1.5635): 14%|β–ˆβ– | 131/938 [00:48<04:18, 3.12it/s] Training 1/1 epoch (loss 1.5635): 14%|β–ˆβ– | 132/938 [00:48<04:12, 3.20it/s] Training 1/1 epoch (loss 1.6971): 14%|β–ˆβ– | 132/938 [00:49<04:12, 3.20it/s] Training 1/1 epoch (loss 1.6971): 14%|β–ˆβ– | 133/938 [00:49<04:12, 3.19it/s] Training 1/1 epoch (loss 1.7782): 14%|β–ˆβ– | 133/938 [00:49<04:12, 3.19it/s] Training 1/1 epoch (loss 1.7782): 14%|β–ˆβ– | 134/938 [00:49<04:16, 
3.14it/s] Training 1/1 epoch (loss 1.5750): 14%|β–ˆβ– | 134/938 [00:49<04:16, 3.14it/s] Training 1/1 epoch (loss 1.5750): 14%|β–ˆβ– | 135/938 [00:49<04:27, 3.00it/s] Training 1/1 epoch (loss 1.7044): 14%|β–ˆβ– | 135/938 [00:50<04:27, 3.00it/s] Training 1/1 epoch (loss 1.7044): 14%|β–ˆβ– | 136/938 [00:50<04:33, 2.93it/s] Training 1/1 epoch (loss 1.6881): 14%|β–ˆβ– | 136/938 [00:50<04:33, 2.93it/s] Training 1/1 epoch (loss 1.6881): 15%|β–ˆβ– | 137/938 [00:50<04:30, 2.96it/s] Training 1/1 epoch (loss 1.6578): 15%|β–ˆβ– | 137/938 [00:50<04:30, 2.96it/s] Training 1/1 epoch (loss 1.6578): 15%|β–ˆβ– | 138/938 [00:50<04:22, 3.05it/s] Training 1/1 epoch (loss 1.6749): 15%|β–ˆβ– | 138/938 [00:51<04:22, 3.05it/s] Training 1/1 epoch (loss 1.6749): 15%|β–ˆβ– | 139/938 [00:51<04:19, 3.08it/s] Training 1/1 epoch (loss 1.6477): 15%|β–ˆβ– | 139/938 [00:51<04:19, 3.08it/s] Training 1/1 epoch (loss 1.6477): 15%|β–ˆβ– | 140/938 [00:51<04:12, 3.16it/s] Training 1/1 epoch (loss 1.7120): 15%|β–ˆβ– | 140/938 [00:51<04:12, 3.16it/s] Training 1/1 epoch (loss 1.7120): 15%|β–ˆβ–Œ | 141/938 [00:51<04:14, 3.14it/s] Training 1/1 epoch (loss 1.6788): 15%|β–ˆβ–Œ | 141/938 [00:52<04:14, 3.14it/s] Training 1/1 epoch (loss 1.6788): 15%|β–ˆβ–Œ | 142/938 [00:52<04:38, 2.85it/s] Training 1/1 epoch (loss 1.6142): 15%|β–ˆβ–Œ | 142/938 [00:52<04:38, 2.85it/s] Training 1/1 epoch (loss 1.6142): 15%|β–ˆβ–Œ | 143/938 [00:52<04:25, 2.99it/s] Training 1/1 epoch (loss 1.7386): 15%|β–ˆβ–Œ | 143/938 [00:52<04:25, 2.99it/s] Training 1/1 epoch (loss 1.7386): 15%|β–ˆβ–Œ | 144/938 [00:52<04:22, 3.03it/s] Training 1/1 epoch (loss 1.6449): 15%|β–ˆβ–Œ | 144/938 [00:53<04:22, 3.03it/s] Training 1/1 epoch (loss 1.6449): 15%|β–ˆβ–Œ | 145/938 [00:53<04:15, 3.10it/s] Training 1/1 epoch (loss 1.6387): 15%|β–ˆβ–Œ | 145/938 [00:53<04:15, 3.10it/s] Training 1/1 epoch (loss 1.6387): 16%|β–ˆβ–Œ | 146/938 [00:53<04:10, 3.16it/s] Training 1/1 epoch (loss 1.4747): 16%|β–ˆβ–Œ | 146/938 [00:53<04:10, 3.16it/s] Training 1/1 
epoch (loss 1.4747): 16%|β–ˆβ–Œ | 147/938 [00:53<04:09, 3.17it/s] Training 1/1 epoch (loss 1.7604): 16%|β–ˆβ–Œ | 147/938 [00:54<04:09, 3.17it/s] Training 1/1 epoch (loss 1.7604): 16%|β–ˆβ–Œ | 148/938 [00:54<04:12, 3.12it/s] Training 1/1 epoch (loss 1.7886): 16%|β–ˆβ–Œ | 148/938 [00:54<04:12, 3.12it/s] Training 1/1 epoch (loss 1.7886): 16%|β–ˆβ–Œ | 149/938 [00:54<04:07, 3.18it/s] Training 1/1 epoch (loss 1.5595): 16%|β–ˆβ–Œ | 149/938 [00:54<04:07, 3.18it/s] Training 1/1 epoch (loss 1.5595): 16%|β–ˆβ–Œ | 150/938 [00:54<04:05, 3.21it/s] Training 1/1 epoch (loss 1.6608): 16%|β–ˆβ–Œ | 150/938 [00:55<04:05, 3.21it/s] Training 1/1 epoch (loss 1.6608): 16%|β–ˆβ–Œ | 151/938 [00:55<04:01, 3.26it/s] Training 1/1 epoch (loss 1.6243): 16%|β–ˆβ–Œ | 151/938 [00:55<04:01, 3.26it/s] Training 1/1 epoch (loss 1.6243): 16%|β–ˆβ–Œ | 152/938 [00:55<04:11, 3.13it/s] Training 1/1 epoch (loss 1.6585): 16%|β–ˆβ–Œ | 152/938 [00:55<04:11, 3.13it/s] Training 1/1 epoch (loss 1.6585): 16%|β–ˆβ–‹ | 153/938 [00:55<04:13, 3.09it/s] Training 1/1 epoch (loss 1.5482): 16%|β–ˆβ–‹ | 153/938 [00:56<04:13, 3.09it/s] Training 1/1 epoch (loss 1.5482): 16%|β–ˆβ–‹ | 154/938 [00:56<04:09, 3.14it/s] Training 1/1 epoch (loss 1.7548): 16%|β–ˆβ–‹ | 154/938 [00:56<04:09, 3.14it/s] Training 1/1 epoch (loss 1.7548): 17%|β–ˆβ–‹ | 155/938 [00:56<04:11, 3.11it/s] Training 1/1 epoch (loss 1.6700): 17%|β–ˆβ–‹ | 155/938 [00:56<04:11, 3.11it/s] Training 1/1 epoch (loss 1.6700): 17%|β–ˆβ–‹ | 156/938 [00:56<04:06, 3.17it/s] Training 1/1 epoch (loss 1.6748): 17%|β–ˆβ–‹ | 156/938 [00:57<04:06, 3.17it/s] Training 1/1 epoch (loss 1.6748): 17%|β–ˆβ–‹ | 157/938 [00:57<04:04, 3.19it/s] Training 1/1 epoch (loss 1.5869): 17%|β–ˆβ–‹ | 157/938 [00:57<04:04, 3.19it/s] Training 1/1 epoch (loss 1.5869): 17%|β–ˆβ–‹ | 158/938 [00:57<04:04, 3.19it/s] Training 1/1 epoch (loss 1.7521): 17%|β–ˆβ–‹ | 158/938 [00:57<04:04, 3.19it/s] Training 1/1 epoch (loss 1.7521): 17%|β–ˆβ–‹ | 159/938 [00:57<04:06, 3.16it/s] Training 1/1 epoch (loss 1.6897): 
17%|β–ˆβ–‹ | 159/938 [00:58<04:06, 3.16it/s] Training 1/1 epoch (loss 1.6897): 17%|β–ˆβ–‹ | 160/938 [00:58<04:15, 3.04it/s] Training 1/1 epoch (loss 1.6043): 17%|β–ˆβ–‹ | 160/938 [00:58<04:15, 3.04it/s] Training 1/1 epoch (loss 1.6043): 17%|β–ˆβ–‹ | 161/938 [00:58<04:16, 3.02it/s] Training 1/1 epoch (loss 1.6774): 17%|β–ˆβ–‹ | 161/938 [00:58<04:16, 3.02it/s] Training 1/1 epoch (loss 1.6774): 17%|β–ˆβ–‹ | 162/938 [00:58<04:11, 3.09it/s] Training 1/1 epoch (loss 1.6200): 17%|β–ˆβ–‹ | 162/938 [00:58<04:11, 3.09it/s] Training 1/1 epoch (loss 1.6200): 17%|β–ˆβ–‹ | 163/938 [00:58<04:03, 3.18it/s] Training 1/1 epoch (loss 1.6467): 17%|β–ˆβ–‹ | 163/938 [00:59<04:03, 3.18it/s] Training 1/1 epoch (loss 1.6467): 17%|β–ˆβ–‹ | 164/938 [00:59<04:02, 3.19it/s] Training 1/1 epoch (loss 1.6675): 17%|β–ˆβ–‹ | 164/938 [00:59<04:02, 3.19it/s] Training 1/1 epoch (loss 1.6675): 18%|β–ˆβ–Š | 165/938 [00:59<03:59, 3.23it/s] Training 1/1 epoch (loss 1.6160): 18%|β–ˆβ–Š | 165/938 [00:59<03:59, 3.23it/s] Training 1/1 epoch (loss 1.6160): 18%|β–ˆβ–Š | 166/938 [00:59<04:06, 3.14it/s] Training 1/1 epoch (loss 1.6708): 18%|β–ˆβ–Š | 166/938 [01:00<04:06, 3.14it/s] Training 1/1 epoch (loss 1.6708): 18%|β–ˆβ–Š | 167/938 [01:00<04:21, 2.95it/s] Training 1/1 epoch (loss 1.5310): 18%|β–ˆβ–Š | 167/938 [01:00<04:21, 2.95it/s] Training 1/1 epoch (loss 1.5310): 18%|β–ˆβ–Š | 168/938 [01:00<04:15, 3.01it/s] Training 1/1 epoch (loss 1.7714): 18%|β–ˆβ–Š | 168/938 [01:00<04:15, 3.01it/s] Training 1/1 epoch (loss 1.7714): 18%|β–ˆβ–Š | 169/938 [01:00<04:18, 2.97it/s] Training 1/1 epoch (loss 1.6491): 18%|β–ˆβ–Š | 169/938 [01:01<04:18, 2.97it/s] Training 1/1 epoch (loss 1.6491): 18%|β–ˆβ–Š | 170/938 [01:01<04:08, 3.08it/s] Training 1/1 epoch (loss 1.6202): 18%|β–ˆβ–Š | 170/938 [01:01<04:08, 3.08it/s] Training 1/1 epoch (loss 1.6202): 18%|β–ˆβ–Š | 171/938 [01:01<04:09, 3.08it/s] Training 1/1 epoch (loss 1.5773): 18%|β–ˆβ–Š | 171/938 [01:01<04:09, 3.08it/s] Training 1/1 epoch (loss 1.5773): 18%|β–ˆβ–Š | 172/938 
[01:01<04:08, 3.08it/s] Training 1/1 epoch (loss 1.6230): 18%|β–ˆβ–Š | 172/938 [01:02<04:08, 3.08it/s] Training 1/1 epoch (loss 1.6230): 18%|β–ˆβ–Š | 173/938 [01:02<04:18, 2.96it/s] Training 1/1 epoch (loss 1.7478): 18%|β–ˆβ–Š | 173/938 [01:02<04:18, 2.96it/s] Training 1/1 epoch (loss 1.7478): 19%|β–ˆβ–Š | 174/938 [01:02<04:08, 3.08it/s] Training 1/1 epoch (loss 1.7059): 19%|β–ˆβ–Š | 174/938 [01:02<04:08, 3.08it/s] Training 1/1 epoch (loss 1.7059): 19%|β–ˆβ–Š | 175/938 [01:02<04:02, 3.14it/s] Training 1/1 epoch (loss 1.4940): 19%|β–ˆβ–Š | 175/938 [01:03<04:02, 3.14it/s] Training 1/1 epoch (loss 1.4940): 19%|β–ˆβ–‰ | 176/938 [01:03<04:04, 3.12it/s] Training 1/1 epoch (loss 1.6093): 19%|β–ˆβ–‰ | 176/938 [01:03<04:04, 3.12it/s] Training 1/1 epoch (loss 1.6093): 19%|β–ˆβ–‰ | 177/938 [01:03<03:58, 3.19it/s] Training 1/1 epoch (loss 1.6904): 19%|β–ˆβ–‰ | 177/938 [01:03<03:58, 3.19it/s] Training 1/1 epoch (loss 1.6904): 19%|β–ˆβ–‰ | 178/938 [01:03<04:04, 3.11it/s] Training 1/1 epoch (loss 1.7279): 19%|β–ˆβ–‰ | 178/938 [01:04<04:04, 3.11it/s] Training 1/1 epoch (loss 1.7279): 19%|β–ˆβ–‰ | 179/938 [01:04<04:08, 3.05it/s] Training 1/1 epoch (loss 1.5911): 19%|β–ˆβ–‰ | 179/938 [01:04<04:08, 3.05it/s] Training 1/1 epoch (loss 1.5911): 19%|β–ˆβ–‰ | 180/938 [01:04<04:05, 3.08it/s] Training 1/1 epoch (loss 1.6488): 19%|β–ˆβ–‰ | 180/938 [01:04<04:05, 3.08it/s] Training 1/1 epoch (loss 1.6488): 19%|β–ˆβ–‰ | 181/938 [01:04<04:05, 3.08it/s] Training 1/1 epoch (loss 1.4930): 19%|β–ˆβ–‰ | 181/938 [01:05<04:05, 3.08it/s] Training 1/1 epoch (loss 1.4930): 19%|β–ˆβ–‰ | 182/938 [01:05<04:03, 3.10it/s] Training 1/1 epoch (loss 1.7389): 19%|β–ˆβ–‰ | 182/938 [01:05<04:03, 3.10it/s] Training 1/1 epoch (loss 1.7389): 20%|β–ˆβ–‰ | 183/938 [01:05<03:59, 3.15it/s] Training 1/1 epoch (loss 1.6204): 20%|β–ˆβ–‰ | 183/938 [01:05<03:59, 3.15it/s] Training 1/1 epoch (loss 1.6204): 20%|β–ˆβ–‰ | 184/938 [01:05<03:58, 3.16it/s] Training 1/1 epoch (loss 1.6640): 20%|β–ˆβ–‰ | 184/938 [01:06<03:58, 3.16it/s] 
Training 1/1 epoch (loss 1.6640): 20%|β–ˆβ–‰ | 185/938 [01:06<04:05, 3.06it/s] Training 1/1 epoch (loss 1.6388): 20%|β–ˆβ–‰ | 185/938 [01:06<04:05, 3.06it/s] Training 1/1 epoch (loss 1.6388): 20%|β–ˆβ–‰ | 186/938 [01:06<04:03, 3.09it/s] Training 1/1 epoch (loss 1.5416): 20%|β–ˆβ–‰ | 186/938 [01:06<04:03, 3.09it/s] Training 1/1 epoch (loss 1.5416): 20%|β–ˆβ–‰ | 187/938 [01:06<03:59, 3.13it/s] Training 1/1 epoch (loss 1.5998): 20%|β–ˆβ–‰ | 187/938 [01:07<03:59, 3.13it/s] Training 1/1 epoch (loss 1.5998): 20%|β–ˆβ–ˆ | 188/938 [01:07<03:57, 3.16it/s] Training 1/1 epoch (loss 1.6855): 20%|β–ˆβ–ˆ | 188/938 [01:07<03:57, 3.16it/s] Training 1/1 epoch (loss 1.6855): 20%|β–ˆβ–ˆ | 189/938 [01:07<03:50, 3.25it/s] Training 1/1 epoch (loss 1.8642): 20%|β–ˆβ–ˆ | 189/938 [01:07<03:50, 3.25it/s] Training 1/1 epoch (loss 1.8642): 20%|β–ˆβ–ˆ | 190/938 [01:07<03:52, 3.21it/s] Training 1/1 epoch (loss 1.5854): 20%|β–ˆβ–ˆ | 190/938 [01:08<03:52, 3.21it/s] Training 1/1 epoch (loss 1.5854): 20%|β–ˆβ–ˆ | 191/938 [01:08<04:10, 2.99it/s] Training 1/1 epoch (loss 1.6415): 20%|β–ˆβ–ˆ | 191/938 [01:08<04:10, 2.99it/s] Training 1/1 epoch (loss 1.6415): 20%|β–ˆβ–ˆ | 192/938 [01:08<04:12, 2.96it/s] Training 1/1 epoch (loss 1.6344): 20%|β–ˆβ–ˆ | 192/938 [01:08<04:12, 2.96it/s] Training 1/1 epoch (loss 1.6344): 21%|β–ˆβ–ˆ | 193/938 [01:08<04:03, 3.06it/s] Training 1/1 epoch (loss 1.6393): 21%|β–ˆβ–ˆ | 193/938 [01:08<04:03, 3.06it/s] Training 1/1 epoch (loss 1.6393): 21%|β–ˆβ–ˆ | 194/938 [01:08<03:57, 3.13it/s] Training 1/1 epoch (loss 1.6402): 21%|β–ˆβ–ˆ | 194/938 [01:09<03:57, 3.13it/s] Training 1/1 epoch (loss 1.6402): 21%|β–ˆβ–ˆ | 195/938 [01:09<03:53, 3.19it/s] Training 1/1 epoch (loss 1.6466): 21%|β–ˆβ–ˆ | 195/938 [01:09<03:53, 3.19it/s] Training 1/1 epoch (loss 1.6466): 21%|β–ˆβ–ˆ | 196/938 [01:09<03:51, 3.20it/s] Training 1/1 epoch (loss 1.6812): 21%|β–ˆβ–ˆ | 196/938 [01:09<03:51, 3.20it/s] Training 1/1 epoch (loss 1.6812): 21%|β–ˆβ–ˆ | 197/938 [01:09<03:55, 3.15it/s] Training 1/1 epoch (loss 
1.5231): 21%|β–ˆβ–ˆ | 197/938 [01:10<03:55, 3.15it/s] Training 1/1 epoch (loss 1.5231): 21%|β–ˆβ–ˆ | 198/938 [01:10<03:55, 3.15it/s] Training 1/1 epoch (loss 1.6337): 21%|β–ˆβ–ˆ | 198/938 [01:10<03:55, 3.15it/s] Training 1/1 epoch (loss 1.6337): 21%|β–ˆβ–ˆ | 199/938 [01:10<03:56, 3.13it/s] Training 1/1 epoch (loss 1.5626): 21%|β–ˆβ–ˆ | 199/938 [01:10<03:56, 3.13it/s] Training 1/1 epoch (loss 1.5626): 21%|β–ˆβ–ˆβ– | 200/938 [01:10<03:59, 3.08it/s] Training 1/1 epoch (loss 1.6209): 21%|β–ˆβ–ˆβ– | 200/938 [01:11<03:59, 3.08it/s] Training 1/1 epoch (loss 1.6209): 21%|β–ˆβ–ˆβ– | 201/938 [01:11<04:05, 3.01it/s] Training 1/1 epoch (loss 1.6431): 21%|β–ˆβ–ˆβ– | 201/938 [01:11<04:05, 3.01it/s] Training 1/1 epoch (loss 1.6431): 22%|β–ˆβ–ˆβ– | 202/938 [01:11<03:56, 3.11it/s] Training 1/1 epoch (loss 1.6604): 22%|β–ˆβ–ˆβ– | 202/938 [01:11<03:56, 3.11it/s] Training 1/1 epoch (loss 1.6604): 22%|β–ˆβ–ˆβ– | 203/938 [01:11<03:57, 3.09it/s] Training 1/1 epoch (loss 1.6077): 22%|β–ˆβ–ˆβ– | 203/938 [01:12<03:57, 3.09it/s] Training 1/1 epoch (loss 1.6077): 22%|β–ˆβ–ˆβ– | 204/938 [01:12<03:59, 3.07it/s] Training 1/1 epoch (loss 1.6324): 22%|β–ˆβ–ˆβ– | 204/938 [01:12<03:59, 3.07it/s] Training 1/1 epoch (loss 1.6324): 22%|β–ˆβ–ˆβ– | 205/938 [01:12<03:51, 3.16it/s] Training 1/1 epoch (loss 1.7329): 22%|β–ˆβ–ˆβ– | 205/938 [01:12<03:51, 3.16it/s] Training 1/1 epoch (loss 1.7329): 22%|β–ˆβ–ˆβ– | 206/938 [01:12<03:55, 3.11it/s] Training 1/1 epoch (loss 1.6274): 22%|β–ˆβ–ˆβ– | 206/938 [01:13<03:55, 3.11it/s] Training 1/1 epoch (loss 1.6274): 22%|β–ˆβ–ˆβ– | 207/938 [01:13<03:56, 3.09it/s] Training 1/1 epoch (loss 1.6928): 22%|β–ˆβ–ˆβ– | 207/938 [01:13<03:56, 3.09it/s] Training 1/1 epoch (loss 1.6928): 22%|β–ˆβ–ˆβ– | 208/938 [01:13<03:54, 3.12it/s] Training 1/1 epoch (loss 1.6178): 22%|β–ˆβ–ˆβ– | 208/938 [01:13<03:54, 3.12it/s] Training 1/1 epoch (loss 1.6178): 22%|β–ˆβ–ˆβ– | 209/938 [01:13<03:55, 3.10it/s] Training 1/1 epoch (loss 1.5929): 22%|β–ˆβ–ˆβ– | 209/938 [01:14<03:55, 
3.10it/s] Training 1/1 epoch (loss 1.5929): 22%|β–ˆβ–ˆβ– | 210/938 [01:14<03:58, 3.05it/s] Training 1/1 epoch (loss 1.6575): 22%|β–ˆβ–ˆβ– | 210/938 [01:14<03:58, 3.05it/s] Training 1/1 epoch (loss 1.6575): 22%|β–ˆβ–ˆβ– | 211/938 [01:14<03:57, 3.06it/s] Training 1/1 epoch (loss 1.5425): 22%|β–ˆβ–ˆβ– | 211/938 [01:14<03:57, 3.06it/s] Training 1/1 epoch (loss 1.5425): 23%|β–ˆβ–ˆβ–Ž | 212/938 [01:14<03:50, 3.15it/s] Training 1/1 epoch (loss 1.5030): 23%|β–ˆβ–ˆβ–Ž | 212/938 [01:15<03:50, 3.15it/s] Training 1/1 epoch (loss 1.5030): 23%|β–ˆβ–ˆβ–Ž | 213/938 [01:15<04:07, 2.93it/s] Training 1/1 epoch (loss 1.5982): 23%|β–ˆβ–ˆβ–Ž | 213/938 [01:15<04:07, 2.93it/s] Training 1/1 epoch (loss 1.5982): 23%|β–ˆβ–ˆβ–Ž | 214/938 [01:15<03:57, 3.05it/s] Training 1/1 epoch (loss 1.6612): 23%|β–ˆβ–ˆβ–Ž | 214/938 [01:15<03:57, 3.05it/s] Training 1/1 epoch (loss 1.6612): 23%|β–ˆβ–ˆβ–Ž | 215/938 [01:15<03:51, 3.13it/s] Training 1/1 epoch (loss 1.5478): 23%|β–ˆβ–ˆβ–Ž | 215/938 [01:16<03:51, 3.13it/s] Training 1/1 epoch (loss 1.5478): 23%|β–ˆβ–ˆβ–Ž | 216/938 [01:16<04:02, 2.98it/s] Training 1/1 epoch (loss 1.7504): 23%|β–ˆβ–ˆβ–Ž | 216/938 [01:16<04:02, 2.98it/s] Training 1/1 epoch (loss 1.7504): 23%|β–ˆβ–ˆβ–Ž | 217/938 [01:16<03:56, 3.04it/s] Training 1/1 epoch (loss 1.6669): 23%|β–ˆβ–ˆβ–Ž | 217/938 [01:16<03:56, 3.04it/s] Training 1/1 epoch (loss 1.6669): 23%|β–ˆβ–ˆβ–Ž | 218/938 [01:16<03:50, 3.12it/s] Training 1/1 epoch (loss 1.6092): 23%|β–ˆβ–ˆβ–Ž | 218/938 [01:17<03:50, 3.12it/s] Training 1/1 epoch (loss 1.6092): 23%|β–ˆβ–ˆβ–Ž | 219/938 [01:17<03:45, 3.18it/s] Training 1/1 epoch (loss 1.5222): 23%|β–ˆβ–ˆβ–Ž | 219/938 [01:17<03:45, 3.18it/s] Training 1/1 epoch (loss 1.5222): 23%|β–ˆβ–ˆβ–Ž | 220/938 [01:17<03:43, 3.21it/s] Training 1/1 epoch (loss 1.6733): 23%|β–ˆβ–ˆβ–Ž | 220/938 [01:17<03:43, 3.21it/s] Training 1/1 epoch (loss 1.6733): 24%|β–ˆβ–ˆβ–Ž | 221/938 [01:17<03:42, 3.22it/s] Training 1/1 epoch (loss 1.6237): 24%|β–ˆβ–ˆβ–Ž | 221/938 [01:18<03:42, 3.22it/s] Training 1/1 epoch 
(loss 1.6237): 24%|β–ˆβ–ˆβ–Ž | 222/938 [01:18<03:50, 3.11it/s] Training 1/1 epoch (loss 1.7217): 24%|β–ˆβ–ˆβ–Ž | 222/938 [01:18<03:50, 3.11it/s] Training 1/1 epoch (loss 1.7217): 24%|β–ˆβ–ˆβ– | 223/938 [01:18<04:02, 2.95it/s] Training 1/1 epoch (loss 1.6716): 24%|β–ˆβ–ˆβ– | 223/938 [01:18<04:02, 2.95it/s] Training 1/1 epoch (loss 1.6716): 24%|β–ˆβ–ˆβ– | 224/938 [01:18<04:01, 2.96it/s] Training 1/1 epoch (loss 1.6477): 24%|β–ˆβ–ˆβ– | 224/938 [01:19<04:01, 2.96it/s] Training 1/1 epoch (loss 1.6477): 24%|β–ˆβ–ˆβ– | 225/938 [01:19<03:56, 3.01it/s] Training 1/1 epoch (loss 1.7280): 24%|β–ˆβ–ˆβ– | 225/938 [01:19<03:56, 3.01it/s] Training 1/1 epoch (loss 1.7280): 24%|β–ˆβ–ˆβ– | 226/938 [01:19<03:50, 3.09it/s] Training 1/1 epoch (loss 1.6436): 24%|β–ˆβ–ˆβ– | 226/938 [01:19<03:50, 3.09it/s] Training 1/1 epoch (loss 1.6436): 24%|β–ˆβ–ˆβ– | 227/938 [01:19<03:46, 3.14it/s] Training 1/1 epoch (loss 1.6275): 24%|β–ˆβ–ˆβ– | 227/938 [01:19<03:46, 3.14it/s] Training 1/1 epoch (loss 1.6275): 24%|β–ˆβ–ˆβ– | 228/938 [01:19<03:43, 3.17it/s] Training 1/1 epoch (loss 1.6674): 24%|β–ˆβ–ˆβ– | 228/938 [01:20<03:43, 3.17it/s] Training 1/1 epoch (loss 1.6674): 24%|β–ˆβ–ˆβ– | 229/938 [01:20<04:11, 2.82it/s] Training 1/1 epoch (loss 1.6616): 24%|β–ˆβ–ˆβ– | 229/938 [01:20<04:11, 2.82it/s] Training 1/1 epoch (loss 1.6616): 25%|β–ˆβ–ˆβ– | 230/938 [01:20<04:00, 2.95it/s] Training 1/1 epoch (loss 1.5981): 25%|β–ˆβ–ˆβ– | 230/938 [01:21<04:00, 2.95it/s] Training 1/1 epoch (loss 1.5981): 25%|β–ˆβ–ˆβ– | 231/938 [01:21<03:55, 3.00it/s] Training 1/1 epoch (loss 1.6509): 25%|β–ˆβ–ˆβ– | 231/938 [01:21<03:55, 3.00it/s] Training 1/1 epoch (loss 1.6509): 25%|β–ˆβ–ˆβ– | 232/938 [01:21<04:00, 2.93it/s] Training 1/1 epoch (loss 1.6298): 25%|β–ˆβ–ˆβ– | 232/938 [01:21<04:00, 2.93it/s] Training 1/1 epoch (loss 1.6298): 25%|β–ˆβ–ˆβ– | 233/938 [01:21<03:53, 3.02it/s] Training 1/1 epoch (loss 1.5997): 25%|β–ˆβ–ˆβ– | 233/938 [01:22<03:53, 3.02it/s] Training 1/1 epoch (loss 1.5997): 25%|β–ˆβ–ˆβ– | 
234/938 [01:22<03:56, 2.98it/s] Training 1/1 epoch (loss 1.5950): 25%|β–ˆβ–ˆβ– | 234/938 [01:22<03:56, 2.98it/s] Training 1/1 epoch (loss 1.5950): 25%|β–ˆβ–ˆβ–Œ | 235/938 [01:22<03:58, 2.94it/s] Training 1/1 epoch (loss 1.5372): 25%|β–ˆβ–ˆβ–Œ | 235/938 [01:22<03:58, 2.94it/s] Training 1/1 epoch (loss 1.5372): 25%|β–ˆβ–ˆβ–Œ | 236/938 [01:22<03:49, 3.06it/s] Training 1/1 epoch (loss 1.7157): 25%|β–ˆβ–ˆβ–Œ | 236/938 [01:22<03:49, 3.06it/s] Training 1/1 epoch (loss 1.7157): 25%|β–ˆβ–ˆβ–Œ | 237/938 [01:22<03:42, 3.14it/s] Training 1/1 epoch (loss 1.5143): 25%|β–ˆβ–ˆβ–Œ | 237/938 [01:23<03:42, 3.14it/s] Training 1/1 epoch (loss 1.5143): 25%|β–ˆβ–ˆβ–Œ | 238/938 [01:23<03:43, 3.14it/s] Training 1/1 epoch (loss 1.5730): 25%|β–ˆβ–ˆβ–Œ | 238/938 [01:23<03:43, 3.14it/s] Training 1/1 epoch (loss 1.5730): 25%|β–ˆβ–ˆβ–Œ | 239/938 [01:23<03:48, 3.06it/s] Training 1/1 epoch (loss 1.6160): 25%|β–ˆβ–ˆβ–Œ | 239/938 [01:24<03:48, 3.06it/s] Training 1/1 epoch (loss 1.6160): 26%|β–ˆβ–ˆβ–Œ | 240/938 [01:24<03:55, 2.96it/s] Training 1/1 epoch (loss 1.7365): 26%|β–ˆβ–ˆβ–Œ | 240/938 [01:24<03:55, 2.96it/s] Training 1/1 epoch (loss 1.7365): 26%|β–ˆβ–ˆβ–Œ | 241/938 [01:24<03:55, 2.96it/s] Training 1/1 epoch (loss 1.6512): 26%|β–ˆβ–ˆβ–Œ | 241/938 [01:24<03:55, 2.96it/s] Training 1/1 epoch (loss 1.6512): 26%|β–ˆβ–ˆβ–Œ | 242/938 [01:24<03:49, 3.03it/s] Training 1/1 epoch (loss 1.6544): 26%|β–ˆβ–ˆβ–Œ | 242/938 [01:24<03:49, 3.03it/s] Training 1/1 epoch (loss 1.6544): 26%|β–ˆβ–ˆβ–Œ | 243/938 [01:24<03:42, 3.13it/s] Training 1/1 epoch (loss 1.5156): 26%|β–ˆβ–ˆβ–Œ | 243/938 [01:25<03:42, 3.13it/s] Training 1/1 epoch (loss 1.5156): 26%|β–ˆβ–ˆβ–Œ | 244/938 [01:25<03:37, 3.19it/s] Training 1/1 epoch (loss 1.5657): 26%|β–ˆβ–ˆβ–Œ | 244/938 [01:25<03:37, 3.19it/s] Training 1/1 epoch (loss 1.5657): 26%|β–ˆβ–ˆβ–Œ | 245/938 [01:25<03:39, 3.15it/s] Training 1/1 epoch (loss 1.6442): 26%|β–ˆβ–ˆβ–Œ | 245/938 [01:25<03:39, 3.15it/s] Training 1/1 epoch (loss 1.6442): 26%|β–ˆβ–ˆβ–Œ | 246/938 [01:25<03:40, 3.14it/s] 
Training 1/1 epoch (loss 1.5974): 26%|β–ˆβ–ˆβ–Œ | 246/938 [01:26<03:40, 3.14it/s] Training 1/1 epoch (loss 1.5974): 26%|β–ˆβ–ˆβ–‹ | 247/938 [01:26<03:41, 3.12it/s] Training 1/1 epoch (loss 1.5832): 26%|β–ˆβ–ˆβ–‹ | 247/938 [01:26<03:41, 3.12it/s] Training 1/1 epoch (loss 1.5832): 26%|β–ˆβ–ˆβ–‹ | 248/938 [01:26<04:00, 2.87it/s] Training 1/1 epoch (loss 1.6202): 26%|β–ˆβ–ˆβ–‹ | 248/938 [01:26<04:00, 2.87it/s] Training 1/1 epoch (loss 1.6202): 27%|β–ˆβ–ˆβ–‹ | 249/938 [01:26<03:51, 2.97it/s] Training 1/1 epoch (loss 1.4820): 27%|β–ˆβ–ˆβ–‹ | 249/938 [01:27<03:51, 2.97it/s] Training 1/1 epoch (loss 1.4820): 27%|β–ˆβ–ˆβ–‹ | 250/938 [01:27<03:44, 3.07it/s] Training 1/1 epoch (loss 1.7170): 27%|β–ˆβ–ˆβ–‹ | 250/938 [01:27<03:44, 3.07it/s] Training 1/1 epoch (loss 1.7170): 27%|β–ˆβ–ˆβ–‹ | 251/938 [01:27<03:38, 3.14it/s] Training 1/1 epoch (loss 1.6215): 27%|β–ˆβ–ˆβ–‹ | 251/938 [01:27<03:38, 3.14it/s] Training 1/1 epoch (loss 1.6215): 27%|β–ˆβ–ˆβ–‹ | 252/938 [01:27<03:46, 3.03it/s] Training 1/1 epoch (loss 1.6206): 27%|β–ˆβ–ˆβ–‹ | 252/938 [01:28<03:46, 3.03it/s] Training 1/1 epoch (loss 1.6206): 27%|β–ˆβ–ˆβ–‹ | 253/938 [01:28<03:59, 2.86it/s] Training 1/1 epoch (loss 1.6177): 27%|β–ˆβ–ˆβ–‹ | 253/938 [01:28<03:59, 2.86it/s] Training 1/1 epoch (loss 1.6177): 27%|β–ˆβ–ˆβ–‹ | 254/938 [01:28<04:00, 2.84it/s] Training 1/1 epoch (loss 1.6140): 27%|β–ˆβ–ˆβ–‹ | 254/938 [01:29<04:00, 2.84it/s] Training 1/1 epoch (loss 1.6140): 27%|β–ˆβ–ˆβ–‹ | 255/938 [01:29<04:00, 2.85it/s] Training 1/1 epoch (loss 1.5305): 27%|β–ˆβ–ˆβ–‹ | 255/938 [01:29<04:00, 2.85it/s] Training 1/1 epoch (loss 1.5305): 27%|β–ˆβ–ˆβ–‹ | 256/938 [01:29<04:03, 2.80it/s] Training 1/1 epoch (loss 1.6040): 27%|β–ˆβ–ˆβ–‹ | 256/938 [01:29<04:03, 2.80it/s] Training 1/1 epoch (loss 1.6040): 27%|β–ˆβ–ˆβ–‹ | 257/938 [01:29<04:02, 2.80it/s] Training 1/1 epoch (loss 1.6376): 27%|β–ˆβ–ˆβ–‹ | 257/938 [01:30<04:02, 2.80it/s] Training 1/1 epoch (loss 1.6376): 28%|β–ˆβ–ˆβ–Š | 258/938 [01:30<04:11, 2.70it/s] Training 1/1 epoch (loss 
1.6486): 28%|β–ˆβ–ˆβ–Š | 258/938 [01:30<04:11, 2.70it/s] Training 1/1 epoch (loss 1.6486): 28%|β–ˆβ–ˆβ–Š | 259/938 [01:30<04:01, 2.81it/s] Training 1/1 epoch (loss 1.5686): 28%|β–ˆβ–ˆβ–Š | 259/938 [01:30<04:01, 2.81it/s] Training 1/1 epoch (loss 1.5686): 28%|β–ˆβ–ˆβ–Š | 260/938 [01:30<03:49, 2.95it/s] Training 1/1 epoch (loss 1.5860): 28%|β–ˆβ–ˆβ–Š | 260/938 [01:31<03:49, 2.95it/s] Training 1/1 epoch (loss 1.5860): 28%|β–ˆβ–ˆβ–Š | 261/938 [01:31<03:41, 3.05it/s] Training 1/1 epoch (loss 1.5855): 28%|β–ˆβ–ˆβ–Š | 261/938 [01:31<03:41, 3.05it/s] Training 1/1 epoch (loss 1.5855): 28%|β–ˆβ–ˆβ–Š | 262/938 [01:31<03:36, 3.12it/s] Training 1/1 epoch (loss 1.4769): 28%|β–ˆβ–ˆβ–Š | 262/938 [01:31<03:36, 3.12it/s] Training 1/1 epoch (loss 1.4769): 28%|β–ˆβ–ˆβ–Š | 263/938 [01:31<03:31, 3.19it/s] Training 1/1 epoch (loss 1.7211): 28%|β–ˆβ–ˆβ–Š | 263/938 [01:32<03:31, 3.19it/s] Training 1/1 epoch (loss 1.7211): 28%|β–ˆβ–ˆβ–Š | 264/938 [01:32<03:37, 3.10it/s] Training 1/1 epoch (loss 1.5971): 28%|β–ˆβ–ˆβ–Š | 264/938 [01:32<03:37, 3.10it/s] Training 1/1 epoch (loss 1.5971): 28%|β–ˆβ–ˆβ–Š | 265/938 [01:32<03:42, 3.03it/s] Training 1/1 epoch (loss 1.6377): 28%|β–ˆβ–ˆβ–Š | 265/938 [01:32<03:42, 3.03it/s] Training 1/1 epoch (loss 1.6377): 28%|β–ˆβ–ˆβ–Š | 266/938 [01:32<03:35, 3.12it/s] Training 1/1 epoch (loss 1.5361): 28%|β–ˆβ–ˆβ–Š | 266/938 [01:32<03:35, 3.12it/s] Training 1/1 epoch (loss 1.5361): 28%|β–ˆβ–ˆβ–Š | 267/938 [01:32<03:31, 3.18it/s] Training 1/1 epoch (loss 1.5395): 28%|β–ˆβ–ˆβ–Š | 267/938 [01:33<03:31, 3.18it/s] Training 1/1 epoch (loss 1.5395): 29%|β–ˆβ–ˆβ–Š | 268/938 [01:33<03:26, 3.25it/s] Training 1/1 epoch (loss 1.5421): 29%|β–ˆβ–ˆβ–Š | 268/938 [01:33<03:26, 3.25it/s] Training 1/1 epoch (loss 1.5421): 29%|β–ˆβ–ˆβ–Š | 269/938 [01:33<03:27, 3.22it/s] Training 1/1 epoch (loss 1.6312): 29%|β–ˆβ–ˆβ–Š | 269/938 [01:33<03:27, 3.22it/s] Training 1/1 epoch (loss 1.6312): 29%|β–ˆβ–ˆβ–‰ | 270/938 [01:33<03:32, 3.14it/s] Training 1/1 epoch (loss 1.7046): 29%|β–ˆβ–ˆβ–‰ | 
270/938 [01:34<03:32, 3.14it/s] Training 1/1 epoch (loss 1.7046): 29%|β–ˆβ–ˆβ–‰ | 271/938 [01:34<03:47, 2.93it/s] Training 1/1 epoch (loss 1.5291): 29%|β–ˆβ–ˆβ–‰ | 271/938 [01:34<03:47, 2.93it/s] Training 1/1 epoch (loss 1.5291): 29%|β–ˆβ–ˆβ–‰ | 272/938 [01:34<03:42, 2.99it/s] Training 1/1 epoch (loss 1.6319): 29%|β–ˆβ–ˆβ–‰ | 272/938 [01:34<03:42, 2.99it/s] Training 1/1 epoch (loss 1.6319): 29%|β–ˆβ–ˆβ–‰ | 273/938 [01:34<03:38, 3.05it/s] Training 1/1 epoch (loss 1.6745): 29%|β–ˆβ–ˆβ–‰ | 273/938 [01:35<03:38, 3.05it/s] Training 1/1 epoch (loss 1.6745): 29%|β–ˆβ–ˆβ–‰ | 274/938 [01:35<03:33, 3.11it/s] Training 1/1 epoch (loss 1.7573): 29%|β–ˆβ–ˆβ–‰ | 274/938 [01:35<03:33, 3.11it/s] Training 1/1 epoch (loss 1.7573): 29%|β–ˆβ–ˆβ–‰ | 275/938 [01:35<03:33, 3.10it/s] Training 1/1 epoch (loss 1.6184): 29%|β–ˆβ–ˆβ–‰ | 275/938 [01:35<03:33, 3.10it/s] Training 1/1 epoch (loss 1.6184): 29%|β–ˆβ–ˆβ–‰ | 276/938 [01:35<03:29, 3.17it/s] Training 1/1 epoch (loss 1.6704): 29%|β–ˆβ–ˆβ–‰ | 276/938 [01:36<03:29, 3.17it/s] Training 1/1 epoch (loss 1.6704): 30%|β–ˆβ–ˆβ–‰ | 277/938 [01:36<03:35, 3.06it/s] Training 1/1 epoch (loss 1.6226): 30%|β–ˆβ–ˆβ–‰ | 277/938 [01:36<03:35, 3.06it/s] Training 1/1 epoch (loss 1.6226): 30%|β–ˆβ–ˆβ–‰ | 278/938 [01:36<03:35, 3.07it/s] Training 1/1 epoch (loss 1.6661): 30%|β–ˆβ–ˆβ–‰ | 278/938 [01:36<03:35, 3.07it/s] Training 1/1 epoch (loss 1.6661): 30%|β–ˆβ–ˆβ–‰ | 279/938 [01:36<03:27, 3.17it/s] Training 1/1 epoch (loss 1.4928): 30%|β–ˆβ–ˆβ–‰ | 279/938 [01:37<03:27, 3.17it/s] Training 1/1 epoch (loss 1.4928): 30%|β–ˆβ–ˆβ–‰ | 280/938 [01:37<03:29, 3.15it/s] Training 1/1 epoch (loss 1.5825): 30%|β–ˆβ–ˆβ–‰ | 280/938 [01:37<03:29, 3.15it/s] Training 1/1 epoch (loss 1.5825): 30%|β–ˆβ–ˆβ–‰ | 281/938 [01:37<03:26, 3.19it/s] Training 1/1 epoch (loss 1.5580): 30%|β–ˆβ–ˆβ–‰ | 281/938 [01:37<03:26, 3.19it/s] Training 1/1 epoch (loss 1.5580): 30%|β–ˆβ–ˆβ–ˆ | 282/938 [01:37<03:22, 3.23it/s] Training 1/1 epoch (loss 1.7229): 30%|β–ˆβ–ˆβ–ˆ | 282/938 [01:38<03:22, 3.23it/s] 
Training 1/1 epoch (loss 1.7229): 30%|β–ˆβ–ˆβ–ˆ | 283/938 [01:38<03:26, 3.17it/s] Training 1/1 epoch (loss 1.5564): 30%|β–ˆβ–ˆβ–ˆ | 283/938 [01:38<03:26, 3.17it/s] Training 1/1 epoch (loss 1.5564): 30%|β–ˆβ–ˆβ–ˆ | 284/938 [01:38<03:24, 3.20it/s] Training 1/1 epoch (loss 1.5304): 30%|β–ˆβ–ˆβ–ˆ | 284/938 [01:38<03:24, 3.20it/s] Training 1/1 epoch (loss 1.5304): 30%|β–ˆβ–ˆβ–ˆ | 285/938 [01:38<03:21, 3.25it/s] Training 1/1 epoch (loss 1.6668): 30%|β–ˆβ–ˆβ–ˆ | 285/938 [01:38<03:21, 3.25it/s] Training 1/1 epoch (loss 1.6668): 30%|β–ˆβ–ˆβ–ˆ | 286/938 [01:38<03:21, 3.24it/s] Training 1/1 epoch (loss 1.5310): 30%|β–ˆβ–ˆβ–ˆ | 286/938 [01:39<03:21, 3.24it/s] Training 1/1 epoch (loss 1.5310): 31%|β–ˆβ–ˆβ–ˆ | 287/938 [01:39<03:18, 3.28it/s] Training 1/1 epoch (loss 1.7303): 31%|β–ˆβ–ˆβ–ˆ | 287/938 [01:39<03:18, 3.28it/s] Training 1/1 epoch (loss 1.7303): 31%|β–ˆβ–ˆβ–ˆ | 288/938 [01:39<03:20, 3.24it/s] Training 1/1 epoch (loss 1.7207): 31%|β–ˆβ–ˆβ–ˆ | 288/938 [01:39<03:20, 3.24it/s] Training 1/1 epoch (loss 1.7207): 31%|β–ˆβ–ˆβ–ˆ | 289/938 [01:39<03:32, 3.05it/s] Training 1/1 epoch (loss 1.5484): 31%|β–ˆβ–ˆβ–ˆ | 289/938 [01:40<03:32, 3.05it/s] Training 1/1 epoch (loss 1.5484): 31%|β–ˆβ–ˆβ–ˆ | 290/938 [01:40<03:29, 3.09it/s] Training 1/1 epoch (loss 1.6929): 31%|β–ˆβ–ˆβ–ˆ | 290/938 [01:40<03:29, 3.09it/s] Training 1/1 epoch (loss 1.6929): 31%|β–ˆβ–ˆβ–ˆ | 291/938 [01:40<03:31, 3.06it/s] Training 1/1 epoch (loss 1.5666): 31%|β–ˆβ–ˆβ–ˆ | 291/938 [01:40<03:31, 3.06it/s] Training 1/1 epoch (loss 1.5666): 31%|β–ˆβ–ˆβ–ˆ | 292/938 [01:40<03:28, 3.10it/s] Training 1/1 epoch (loss 1.5789): 31%|β–ˆβ–ˆβ–ˆ | 292/938 [01:41<03:28, 3.10it/s] Training 1/1 epoch (loss 1.5789): 31%|β–ˆβ–ˆβ–ˆ | 293/938 [01:41<03:22, 3.19it/s] Training 1/1 epoch (loss 1.5815): 31%|β–ˆβ–ˆβ–ˆ | 293/938 [01:41<03:22, 3.19it/s] Training 1/1 epoch (loss 1.5815): 31%|β–ˆβ–ˆβ–ˆβ– | 294/938 [01:41<03:18, 3.24it/s] Training 1/1 epoch (loss 1.4567): 31%|β–ˆβ–ˆβ–ˆβ– | 294/938 [01:41<03:18, 3.24it/s] Training 1/1 epoch (loss 
1.4567): 31%|β–ˆβ–ˆβ–ˆβ– | 295/938 [01:41<03:30, 3.05it/s] Training 1/1 epoch (loss 1.5445): 31%|β–ˆβ–ˆβ–ˆβ– | 295/938 [01:42<03:30, 3.05it/s] Training 1/1 epoch (loss 1.5445): 32%|β–ˆβ–ˆβ–ˆβ– | 296/938 [01:42<03:31, 3.03it/s] Training 1/1 epoch (loss 1.6087): 32%|β–ˆβ–ˆβ–ˆβ– | 296/938 [01:42<03:31, 3.03it/s] Training 1/1 epoch (loss 1.6087): 32%|β–ˆβ–ˆβ–ˆβ– | 297/938 [01:42<03:30, 3.04it/s] Training 1/1 epoch (loss 1.5853): 32%|β–ˆβ–ˆβ–ˆβ– | 297/938 [01:42<03:30, 3.04it/s] Training 1/1 epoch (loss 1.5853): 32%|β–ˆβ–ˆβ–ˆβ– | 298/938 [01:42<03:26, 3.10it/s] Training 1/1 epoch (loss 1.6333): 32%|β–ˆβ–ˆβ–ˆβ– | 298/938 [01:43<03:26, 3.10it/s] Training 1/1 epoch (loss 1.6333): 32%|β–ˆβ–ˆβ–ˆβ– | 299/938 [01:43<03:18, 3.21it/s] Training 1/1 epoch (loss 1.5953): 32%|β–ˆβ–ˆβ–ˆβ– | 299/938 [01:43<03:18, 3.21it/s] Training 1/1 epoch (loss 1.5953): 32%|β–ˆβ–ˆβ–ˆβ– | 300/938 [01:43<03:20, 3.18it/s] Training 1/1 epoch (loss 1.6243): 32%|β–ˆβ–ˆβ–ˆβ– | 300/938 [01:43<03:20, 3.18it/s] Training 1/1 epoch (loss 1.6243): 32%|β–ˆβ–ˆβ–ˆβ– | 301/938 [01:43<03:16, 3.25it/s] Training 1/1 epoch (loss 1.6566): 32%|β–ˆβ–ˆβ–ˆβ– | 301/938 [01:44<03:16, 3.25it/s] Training 1/1 epoch (loss 1.6566): 32%|β–ˆβ–ˆβ–ˆβ– | 302/938 [01:44<03:20, 3.17it/s] Training 1/1 epoch (loss 1.5399): 32%|β–ˆβ–ˆβ–ˆβ– | 302/938 [01:44<03:20, 3.17it/s] Training 1/1 epoch (loss 1.5399): 32%|β–ˆβ–ˆβ–ˆβ– | 303/938 [01:44<03:25, 3.10it/s] Training 1/1 epoch (loss 1.7533): 32%|β–ˆβ–ˆβ–ˆβ– | 303/938 [01:44<03:25, 3.10it/s] Training 1/1 epoch (loss 1.7533): 32%|β–ˆβ–ˆβ–ˆβ– | 304/938 [01:44<03:28, 3.03it/s] Training 1/1 epoch (loss 1.5649): 32%|β–ˆβ–ˆβ–ˆβ– | 304/938 [01:45<03:28, 3.03it/s] Training 1/1 epoch (loss 1.5649): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 305/938 [01:45<03:26, 3.07it/s] Training 1/1 epoch (loss 1.6416): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 305/938 [01:45<03:26, 3.07it/s] Training 1/1 epoch (loss 1.6416): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 306/938 [01:45<03:22, 3.12it/s] Training 1/1 epoch (loss 1.6210): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 306/938 
[01:45<03:22, 3.12it/s] Training 1/1 epoch (loss 1.6210): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 307/938 [01:45<03:16, 3.20it/s] Training 1/1 epoch (loss 1.5713): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 307/938 [01:46<03:16, 3.20it/s] Training 1/1 epoch (loss 1.5713): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 308/938 [01:46<03:20, 3.14it/s] Training 1/1 epoch (loss 1.4771): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 308/938 [01:46<03:20, 3.14it/s] Training 1/1 epoch (loss 1.4771): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 309/938 [01:46<03:24, 3.08it/s] Training 1/1 epoch (loss 1.6953): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 309/938 [01:46<03:24, 3.08it/s] Training 1/1 epoch (loss 1.6953): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 310/938 [01:46<03:21, 3.11it/s] Training 1/1 epoch (loss 1.4491): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 310/938 [01:46<03:21, 3.11it/s] Training 1/1 epoch (loss 1.4491): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 311/938 [01:46<03:15, 3.21it/s] Training 1/1 epoch (loss 1.5550): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 311/938 [01:47<03:15, 3.21it/s] Training 1/1 epoch (loss 1.5550): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 312/938 [01:47<03:18, 3.15it/s] Training 1/1 epoch (loss 1.6548): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 312/938 [01:47<03:18, 3.15it/s] Training 1/1 epoch (loss 1.6548): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 313/938 [01:47<03:16, 3.18it/s] Training 1/1 epoch (loss 1.6909): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 313/938 [01:47<03:16, 3.18it/s] Training 1/1 epoch (loss 1.6909): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 314/938 [01:47<03:22, 3.09it/s] Training 1/1 epoch (loss 1.5104): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 314/938 [01:48<03:22, 3.09it/s] Training 1/1 epoch (loss 1.5104): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 315/938 [01:48<03:17, 3.15it/s] Training 1/1 epoch (loss 1.5979): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 315/938 [01:48<03:17, 3.15it/s] Training 1/1 epoch (loss 1.5979): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 316/938 [01:48<03:18, 3.14it/s] Training 1/1 epoch (loss 1.5877): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 316/938 [01:48<03:18, 3.14it/s] Training 1/1 epoch (loss 1.5877): 34%|β–ˆβ–ˆβ–ˆβ– | 317/938 [01:48<03:14, 3.18it/s] Training 1/1 epoch (loss 1.6459): 34%|β–ˆβ–ˆβ–ˆβ– | 317/938 [01:49<03:14, 3.18it/s] Training 1/1 epoch (loss 1.6459): 34%|β–ˆβ–ˆβ–ˆβ– | 318/938 [01:49<03:10, 3.25it/s] Training 1/1 epoch 
(loss 1.5306): 34%|β–ˆβ–ˆβ–ˆβ– | 318/938 [01:49<03:10, 3.25it/s] Training 1/1 epoch (loss 1.5306): 34%|β–ˆβ–ˆβ–ˆβ– | 319/938 [01:49<03:09, 3.27it/s] Training 1/1 epoch (loss 1.4058): 34%|β–ˆβ–ˆβ–ˆβ– | 319/938 [01:49<03:09, 3.27it/s] Training 1/1 epoch (loss 1.4058): 34%|β–ˆβ–ˆβ–ˆβ– | 320/938 [01:49<03:21, 3.07it/s] Training 1/1 epoch (loss 1.5406): 34%|β–ˆβ–ˆβ–ˆβ– | 320/938 [01:50<03:21, 3.07it/s] Training 1/1 epoch (loss 1.5406): 34%|β–ˆβ–ˆβ–ˆβ– | 321/938 [01:50<03:24, 3.01it/s] Training 1/1 epoch (loss 1.6437): 34%|β–ˆβ–ˆβ–ˆβ– | 321/938 [01:50<03:24, 3.01it/s] Training 1/1 epoch (loss 1.6437): 34%|β–ˆβ–ˆβ–ˆβ– | 322/938 [01:50<03:21, 3.05it/s] Training 1/1 epoch (loss 1.7029): 34%|β–ˆβ–ˆβ–ˆβ– | 322/938 [01:50<03:21, 3.05it/s] Training 1/1 epoch (loss 1.7029): 34%|β–ˆβ–ˆβ–ˆβ– | 323/938 [01:50<03:17, 3.11it/s] Training 1/1 epoch (loss 1.6351): 34%|β–ˆβ–ˆβ–ˆβ– | 323/938 [01:51<03:17, 3.11it/s] Training 1/1 epoch (loss 1.6351): 35%|β–ˆβ–ˆβ–ˆβ– | 324/938 [01:51<03:10, 3.22it/s] Training 1/1 epoch (loss 1.5802): 35%|β–ˆβ–ˆβ–ˆβ– | 324/938 [01:51<03:10, 3.22it/s] Training 1/1 epoch (loss 1.5802): 35%|β–ˆβ–ˆβ–ˆβ– | 325/938 [01:51<03:11, 3.21it/s] Training 1/1 epoch (loss 1.6808): 35%|β–ˆβ–ˆβ–ˆβ– | 325/938 [01:51<03:11, 3.21it/s] Training 1/1 epoch (loss 1.6808): 35%|β–ˆβ–ˆβ–ˆβ– | 326/938 [01:51<03:11, 3.20it/s] Training 1/1 epoch (loss 1.5454): 35%|β–ˆβ–ˆβ–ˆβ– | 326/938 [01:52<03:11, 3.20it/s] Training 1/1 epoch (loss 1.5454): 35%|β–ˆβ–ˆβ–ˆβ– | 327/938 [01:52<03:21, 3.03it/s] Training 1/1 epoch (loss 1.5775): 35%|β–ˆβ–ˆβ–ˆβ– | 327/938 [01:52<03:21, 3.03it/s] Training 1/1 epoch (loss 1.5775): 35%|β–ˆβ–ˆβ–ˆβ– | 328/938 [01:52<03:29, 2.91it/s] Training 1/1 epoch (loss 1.4769): 35%|β–ˆβ–ˆβ–ˆβ– | 328/938 [01:52<03:29, 2.91it/s] Training 1/1 epoch (loss 1.4769): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 329/938 [01:52<03:23, 2.99it/s] Training 1/1 epoch (loss 1.6485): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 329/938 [01:53<03:23, 2.99it/s] Training 1/1 epoch (loss 1.6485): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 330/938 
[01:53<03:14, 3.12it/s] Training 1/1 epoch (loss 1.5652): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 330/938 [01:53<03:14, 3.12it/s] Training 1/1 epoch (loss 1.5652): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 331/938 [01:53<03:10, 3.18it/s] Training 1/1 epoch (loss 1.6691): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 331/938 [01:53<03:10, 3.18it/s] Training 1/1 epoch (loss 1.6691): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 332/938 [01:53<03:13, 3.14it/s] Training 1/1 epoch (loss 1.5979): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 332/938 [01:54<03:13, 3.14it/s] Training 1/1 epoch (loss 1.5979): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 333/938 [01:54<03:11, 3.16it/s] Training 1/1 epoch (loss 1.5351): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 333/938 [01:54<03:11, 3.16it/s] Training 1/1 epoch (loss 1.5351): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 334/938 [01:54<03:16, 3.07it/s] Training 1/1 epoch (loss 1.6264): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 334/938 [01:54<03:16, 3.07it/s] Training 1/1 epoch (loss 1.6264): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 335/938 [01:54<03:15, 3.09it/s] Training 1/1 epoch (loss 1.5125): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 335/938 [01:55<03:15, 3.09it/s] Training 1/1 epoch (loss 1.5125): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 336/938 [01:55<03:19, 3.01it/s] Training 1/1 epoch (loss 1.6347): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 336/938 [01:55<03:19, 3.01it/s] Training 1/1 epoch (loss 1.6347): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 337/938 [01:55<03:14, 3.08it/s] Training 1/1 epoch (loss 1.5915): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 337/938 [01:56<03:14, 3.08it/s] Training 1/1 epoch (loss 1.5915): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 338/938 [01:56<04:25, 2.26it/s] Training 1/1 epoch (loss 1.6214): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 338/938 [01:56<04:25, 2.26it/s] Training 1/1 epoch (loss 1.6214): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 339/938 [01:56<04:09, 2.40it/s] Training 1/1 epoch (loss 1.5453): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 339/938 [01:56<04:09, 2.40it/s] Training 1/1 epoch (loss 1.5453): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 340/938 [01:56<03:47, 2.63it/s] Training 1/1 epoch (loss 1.5984): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 340/938 [01:57<03:47, 2.63it/s] Training 1/1 epoch (loss 1.5984): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 341/938 [01:57<03:33, 2.80it/s] Training 1/1 epoch (loss 1.5432): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 341/938 [01:57<03:33, 2.80it/s] Training 1/1 epoch 
(loss 1.5432): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 342/938 [01:57<03:23, 2.94it/s] Training 1/1 epoch (loss 1.6890): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 342/938 [01:57<03:23, 2.94it/s] Training 1/1 epoch (loss 1.6890): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 343/938 [01:57<03:21, 2.96it/s] Training 1/1 epoch (loss 1.5958): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 343/938 [01:58<03:21, 2.96it/s] Training 1/1 epoch (loss 1.5958): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 344/938 [01:58<03:26, 2.87it/s] Training 1/1 epoch (loss 1.5319): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 344/938 [01:58<03:26, 2.87it/s] Training 1/1 epoch (loss 1.5319): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 345/938 [01:58<03:28, 2.84it/s] Training 1/1 epoch (loss 1.5166): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 345/938 [01:58<03:28, 2.84it/s] Training 1/1 epoch (loss 1.5166): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 346/938 [01:58<03:19, 2.97it/s] Training 1/1 epoch (loss 1.5898): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 346/938 [01:59<03:19, 2.97it/s] Training 1/1 epoch (loss 1.5898): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 347/938 [01:59<03:13, 3.05it/s] Training 1/1 epoch (loss 1.4523): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 347/938 [01:59<03:13, 3.05it/s] Training 1/1 epoch (loss 1.4523): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 348/938 [01:59<03:08, 3.13it/s] Training 1/1 epoch (loss 1.6601): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 348/938 [01:59<03:08, 3.13it/s] Training 1/1 epoch (loss 1.6601): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 349/938 [01:59<03:10, 3.10it/s] Training 1/1 epoch (loss 1.6030): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 349/938 [01:59<03:10, 3.10it/s] Training 1/1 epoch (loss 1.6030): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 350/938 [01:59<03:07, 3.13it/s] Training 1/1 epoch (loss 1.6246): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 350/938 [02:00<03:07, 3.13it/s] Training 1/1 epoch (loss 1.6246): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 351/938 [02:00<03:23, 2.89it/s] Training 1/1 epoch (loss 1.5418): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 351/938 [02:00<03:23, 2.89it/s] Training 1/1 epoch (loss 1.5418): 38%|β–ˆβ–ˆβ–ˆβ–Š | 352/938 [02:00<03:17, 2.96it/s] Training 1/1 epoch (loss 1.7047): 38%|β–ˆβ–ˆβ–ˆβ–Š | 352/938 [02:00<03:17, 2.96it/s] Training 1/1 epoch (loss 1.7047): 38%|β–ˆβ–ˆβ–ˆβ–Š | 353/938 [02:00<03:11, 3.05it/s] Training 1/1 epoch (loss 1.6274): 38%|β–ˆβ–ˆβ–ˆβ–Š | 353/938 
[02:01<03:11, 3.05it/s] Training 1/1 epoch (loss 1.6274): 38%|β–ˆβ–ˆβ–ˆβ–Š | 354/938 [02:01<03:08, 3.10it/s] Training 1/1 epoch (loss 1.6178): 38%|β–ˆβ–ˆβ–ˆβ–Š | 354/938 [02:01<03:08, 3.10it/s] Training 1/1 epoch (loss 1.6178): 38%|β–ˆβ–ˆβ–ˆβ–Š | 355/938 [02:01<03:02, 3.19it/s] Training 1/1 epoch (loss 1.5290): 38%|β–ˆβ–ˆβ–ˆβ–Š | 355/938 [02:01<03:02, 3.19it/s] Training 1/1 epoch (loss 1.5290): 38%|β–ˆβ–ˆβ–ˆβ–Š | 356/938 [02:01<03:03, 3.17it/s] Training 1/1 epoch (loss 1.5729): 38%|β–ˆβ–ˆβ–ˆβ–Š | 356/938 [02:02<03:03, 3.17it/s] Training 1/1 epoch (loss 1.5729): 38%|β–ˆβ–ˆβ–ˆβ–Š | 357/938 [02:02<03:11, 3.03it/s] Training 1/1 epoch (loss 1.5523): 38%|β–ˆβ–ˆβ–ˆβ–Š | 357/938 [02:02<03:11, 3.03it/s] Training 1/1 epoch (loss 1.5523): 38%|β–ˆβ–ˆβ–ˆβ–Š | 358/938 [02:02<03:10, 3.05it/s] Training 1/1 epoch (loss 1.5563): 38%|β–ˆβ–ˆβ–ˆβ–Š | 358/938 [02:02<03:10, 3.05it/s] Training 1/1 epoch (loss 1.5563): 38%|β–ˆβ–ˆβ–ˆβ–Š | 359/938 [02:02<03:08, 3.07it/s] Training 1/1 epoch (loss 1.6700): 38%|β–ˆβ–ˆβ–ˆβ–Š | 359/938 [02:03<03:08, 3.07it/s] Training 1/1 epoch (loss 1.6700): 38%|β–ˆβ–ˆβ–ˆβ–Š | 360/938 [02:03<03:07, 3.07it/s] Training 1/1 epoch (loss 1.7260): 38%|β–ˆβ–ˆβ–ˆβ–Š | 360/938 [02:03<03:07, 3.07it/s] Training 1/1 epoch (loss 1.7260): 38%|β–ˆβ–ˆβ–ˆβ–Š | 361/938 [02:03<03:05, 3.12it/s] Training 1/1 epoch (loss 1.5854): 38%|β–ˆβ–ˆβ–ˆβ–Š | 361/938 [02:03<03:05, 3.12it/s] Training 1/1 epoch (loss 1.5854): 39%|β–ˆβ–ˆβ–ˆβ–Š | 362/938 [02:03<03:03, 3.14it/s] Training 1/1 epoch (loss 1.6269): 39%|β–ˆβ–ˆβ–ˆβ–Š | 362/938 [02:04<03:03, 3.14it/s] Training 1/1 epoch (loss 1.6269): 39%|β–ˆβ–ˆβ–ˆβ–Š | 363/938 [02:04<03:03, 3.14it/s] Training 1/1 epoch (loss 1.6599): 39%|β–ˆβ–ˆβ–ˆβ–Š | 363/938 [02:04<03:03, 3.14it/s] Training 1/1 epoch (loss 1.6599): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 364/938 [02:04<03:06, 3.07it/s] Training 1/1 epoch (loss 1.5544): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 364/938 [02:04<03:06, 3.07it/s] Training 1/1 epoch (loss 1.5544): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 365/938 [02:04<03:03, 3.12it/s] Training 1/1 epoch 
(loss 1.5731): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 365/938 [02:05<03:03, 3.12it/s] Training 1/1 epoch (loss 1.5731): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 366/938 [02:05<02:59, 3.18it/s] Training 1/1 epoch (loss 1.6048): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 366/938 [02:05<02:59, 3.18it/s] Training 1/1 epoch (loss 1.6048): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 367/938 [02:05<02:58, 3.19it/s] Training 1/1 epoch (loss 1.6955): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 367/938 [02:05<02:58, 3.19it/s] Training 1/1 epoch (loss 1.6955): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 368/938 [02:05<03:04, 3.10it/s] Training 1/1 epoch (loss 1.5247): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 368/938 [02:06<03:04, 3.10it/s] Training 1/1 epoch (loss 1.5247): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 369/938 [02:06<03:17, 2.89it/s] Training 1/1 epoch (loss 1.6465): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 369/938 [02:06<03:17, 2.89it/s] Training 1/1 epoch (loss 1.6465): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 370/938 [02:06<03:20, 2.83it/s] Training 1/1 epoch (loss 1.5579): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 370/938 [02:06<03:20, 2.83it/s] Training 1/1 epoch (loss 1.5579): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 371/938 [02:06<03:08, 3.01it/s] Training 1/1 epoch (loss 1.6024): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 371/938 [02:07<03:08, 3.01it/s] Training 1/1 epoch (loss 1.6024): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 372/938 [02:07<03:06, 3.04it/s] Training 1/1 epoch (loss 1.5064): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 372/938 [02:07<03:06, 3.04it/s] Training 1/1 epoch (loss 1.5064): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 373/938 [02:07<02:59, 3.14it/s] Training 1/1 epoch (loss 1.4443): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 373/938 [02:07<02:59, 3.14it/s] Training 1/1 epoch (loss 1.4443): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 374/938 [02:07<02:57, 3.17it/s] Training 1/1 epoch (loss 1.6487): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 374/938 [02:08<02:57, 3.17it/s] Training 1/1 epoch (loss 1.6487): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 375/938 [02:08<02:58, 3.16it/s] Training 1/1 epoch (loss 1.6285): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 375/938 [02:08<02:58, 3.16it/s] Training 1/1 epoch (loss 1.6285): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 376/938 [02:08<02:59, 3.13it/s] Training 1/1 epoch (loss 1.7372): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 376/938 [02:08<02:59, 3.13it/s] Training 1/1 epoch (loss 1.7372): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 377/938 
[02:08<02:58, 3.15it/s] Training 1/1 epoch (loss 1.4899): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 377/938 [02:09<02:58, 3.15it/s] Training 1/1 epoch (loss 1.4899): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 378/938 [02:09<02:56, 3.18it/s] Training 1/1 epoch (loss 1.5597): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 378/938 [02:09<02:56, 3.18it/s] Training 1/1 epoch (loss 1.5597): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 379/938 [02:09<02:52, 3.24it/s] Training 1/1 epoch (loss 1.5139): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 379/938 [02:09<02:52, 3.24it/s] Training 1/1 epoch (loss 1.5139): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 380/938 [02:09<02:52, 3.24it/s] Training 1/1 epoch (loss 1.5972): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 380/938 [02:09<02:52, 3.24it/s] Training 1/1 epoch (loss 1.5972): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 381/938 [02:09<02:52, 3.22it/s] Training 1/1 epoch (loss 1.4537): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 381/938 [02:10<02:52, 3.22it/s] Training 1/1 epoch (loss 1.4537): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 382/938 [02:10<02:52, 3.22it/s] Training 1/1 epoch (loss 1.6371): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 382/938 [02:10<02:52, 3.22it/s] Training 1/1 epoch (loss 1.6371): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 383/938 [02:10<03:02, 3.04it/s] Training 1/1 epoch (loss 1.6368): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 383/938 [02:10<03:02, 3.04it/s] Training 1/1 epoch (loss 1.6368): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 384/938 [02:10<02:59, 3.09it/s] Training 1/1 epoch (loss 1.5469): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 384/938 [02:11<02:59, 3.09it/s] Training 1/1 epoch (loss 1.5469): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 385/938 [02:11<02:58, 3.11it/s] Training 1/1 epoch (loss 1.5926): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 385/938 [02:11<02:58, 3.11it/s] Training 1/1 epoch (loss 1.5926): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 386/938 [02:11<02:53, 3.18it/s] Training 1/1 epoch (loss 1.6435): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 386/938 [02:11<02:53, 3.18it/s] Training 1/1 epoch (loss 1.6435): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 387/938 [02:11<02:49, 3.25it/s] Training 1/1 epoch (loss 1.5837): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 387/938 [02:12<02:49, 3.25it/s] Training 1/1 epoch (loss 1.5837): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 388/938 [02:12<02:49, 3.25it/s] Training 1/1 epoch (loss 1.7125): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 388/938 [02:12<02:49, 3.25it/s] Training 
1/1 epoch (loss 1.7125): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 389/938 [02:12<03:05, 2.95it/s] Training 1/1 epoch (loss 1.5776): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 389/938 [02:12<03:05, 2.95it/s] Training 1/1 epoch (loss 1.5776): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 390/938 [02:12<02:59, 3.06it/s] Training 1/1 epoch (loss 1.5869): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 390/938 [02:13<02:59, 3.06it/s] Training 1/1 epoch (loss 1.5869): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 391/938 [02:13<02:53, 3.15it/s] Training 1/1 epoch (loss 1.6618): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 391/938 [02:13<02:53, 3.15it/s] Training 1/1 epoch (loss 1.6618): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 392/938 [02:13<02:57, 3.07it/s] Training 1/1 epoch (loss 1.5847): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 392/938 [02:13<02:57, 3.07it/s] Training 1/1 epoch (loss 1.5847): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 393/938 [02:13<02:54, 3.12it/s] Training 1/1 epoch (loss 1.7094): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 393/938 [02:14<02:54, 3.12it/s] Training 1/1 epoch (loss 1.7094): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 394/938 [02:14<02:51, 3.16it/s] Training 1/1 epoch (loss 1.5726): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 394/938 [02:14<02:51, 3.16it/s] Training 1/1 epoch (loss 1.5726): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 395/938 [02:14<02:56, 3.07it/s] Training 1/1 epoch (loss 1.5782): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 395/938 [02:14<02:56, 3.07it/s] Training 1/1 epoch (loss 1.5782): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 396/938 [02:14<02:52, 3.14it/s] Training 1/1 epoch (loss 1.5406): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 396/938 [02:15<02:52, 3.14it/s] Training 1/1 epoch (loss 1.5406): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 397/938 [02:15<02:54, 3.10it/s] Training 1/1 epoch (loss 1.5744): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 397/938 [02:15<02:54, 3.10it/s] Training 1/1 epoch (loss 1.5744): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 398/938 [02:15<02:50, 3.17it/s] Training 1/1 epoch (loss 1.5748): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 398/938 [02:15<02:50, 3.17it/s] Training 1/1 epoch (loss 1.5748): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/938 [02:15<02:52, 3.12it/s] Training 1/1 epoch (loss 1.6720): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/938 [02:16<02:52, 3.12it/s] Training 1/1 epoch (loss 1.6720): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/938 
[02:16<03:07, 2.87it/s] Training 1/1 epoch (loss 1.6497): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/938 [02:16<03:07, 2.87it/s] Training 1/1 epoch (loss 1.6497): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/938 [02:16<03:14, 2.76it/s] Training 1/1 epoch (loss 1.6547): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/938 [02:16<03:14, 2.76it/s] Training 1/1 epoch (loss 1.6547): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 402/938 [02:16<03:02, 2.94it/s] Training 1/1 epoch (loss 1.6072): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 402/938 [02:17<03:02, 2.94it/s] Training 1/1 epoch (loss 1.6072): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 403/938 [02:17<02:56, 3.03it/s] Training 1/1 epoch (loss 1.6052): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 403/938 [02:17<02:56, 3.03it/s] Training 1/1 epoch (loss 1.6052): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 404/938 [02:17<02:54, 3.07it/s] Training 1/1 epoch (loss 1.5037): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 404/938 [02:17<02:54, 3.07it/s] Training 1/1 epoch (loss 1.5037): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 405/938 [02:17<02:51, 3.11it/s] Training 1/1 epoch (loss 1.7081): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 405/938 [02:18<02:51, 3.11it/s] Training 1/1 epoch (loss 1.7081): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 406/938 [02:18<02:49, 3.15it/s] Training 1/1 epoch (loss 1.6542): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 406/938 [02:18<02:49, 3.15it/s] Training 1/1 epoch (loss 1.6542): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 407/938 [02:18<02:49, 3.12it/s] Training 1/1 epoch (loss 1.5937): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 407/938 [02:18<02:49, 3.12it/s] Training 1/1 epoch (loss 1.5937): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 408/938 [02:18<02:51, 3.08it/s] Training 1/1 epoch (loss 1.6034): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 408/938 [02:19<02:51, 3.08it/s] Training 1/1 epoch (loss 1.6034): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 409/938 [02:19<02:48, 3.14it/s] Training 1/1 epoch (loss 1.6069): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 409/938 [02:19<02:48, 3.14it/s] Training 1/1 epoch (loss 1.6069): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 410/938 [02:19<02:45, 3.20it/s] Training 1/1 epoch (loss 1.4521): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 410/938 [02:19<02:45, 3.20it/s] Training 1/1 epoch (loss 1.4521): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 411/938 [02:19<02:42, 3.24it/s] Training 1/1 epoch (loss 1.5936): 
44%|████▍ | 411/938 [02:19<02:42, 3.24it/s]
Training 1/1 epoch (loss 1.5936): 44%|████▍ | 412/938 [02:19<02:43, 3.21it/s]
Training 1/1 epoch (loss 1.5057): 44%|████▍ | 413/938 [02:20<02:43, 3.22it/s]
Training 1/1 epoch (loss 1.6066): 44%|████▍ | 414/938 [02:20<02:44, 3.18it/s]
Training 1/1 epoch (loss 1.5208): 44%|████▍ | 415/938 [02:20<02:43, 3.21it/s]
Training 1/1 epoch (loss 1.4734): 44%|████▍ | 416/938 [02:21<02:51, 3.04it/s]
Training 1/1 epoch (loss 1.4801): 44%|████▍ | 417/938 [02:21<02:50, 3.05it/s]
Training 1/1 epoch (loss 1.5382): 45%|████▍ | 418/938 [02:21<02:46, 3.12it/s]
Training 1/1 epoch (loss 1.6077): 45%|████▍ | 419/938 [02:22<02:48, 3.09it/s]
Training 1/1 epoch (loss 1.6380): 45%|████▍ | 420/938 [02:22<02:49, 3.06it/s]
Training 1/1 epoch (loss 1.5027): 45%|████▍ | 421/938 [02:22<02:47, 3.09it/s]
Training 1/1 epoch (loss 1.5051): 45%|████▍ | 422/938 [02:23<02:44, 3.13it/s]
Training 1/1 epoch (loss 1.5146): 45%|████▌ | 423/938 [02:23<02:40, 3.20it/s]
Training 1/1 epoch (loss 1.4337): 45%|████▌ | 424/938 [02:23<02:43, 3.14it/s]
Training 1/1 epoch (loss 1.5847): 45%|████▌ | 425/938 [02:24<02:45, 3.09it/s]
Training 1/1 epoch (loss 1.5756): 45%|████▌ | 426/938 [02:24<02:46, 3.08it/s]
Training 1/1 epoch (loss 1.6179): 46%|████▌ | 427/938 [02:24<02:41, 3.16it/s]
Training 1/1 epoch (loss 1.5129): 46%|████▌ | 428/938 [02:25<02:39, 3.20it/s]
Training 1/1 epoch (loss 1.5487): 46%|████▌ | 429/938 [02:25<02:36, 3.25it/s]
Training 1/1 epoch (loss 1.5964): 46%|████▌ | 430/938 [02:25<02:38, 3.20it/s]
Training 1/1 epoch (loss 1.5463): 46%|████▌ | 431/938 [02:26<02:43, 3.11it/s]
Training 1/1 epoch (loss 1.6136): 46%|████▌ | 432/938 [02:26<02:50, 2.97it/s]
Training 1/1 epoch (loss 1.5624): 46%|████▌ | 433/938 [02:26<02:45, 3.05it/s]
Training 1/1 epoch (loss 1.6630): 46%|████▋ | 434/938 [02:27<02:44, 3.06it/s]
Training 1/1 epoch (loss 1.5841): 46%|████▋ | 435/938 [02:27<02:38, 3.17it/s]
Training 1/1 epoch (loss 1.6449): 46%|████▋ | 436/938 [02:27<02:37, 3.19it/s]
Training 1/1 epoch (loss 1.6017): 47%|████▋ | 437/938 [02:27<02:40, 3.12it/s]
Training 1/1 epoch (loss 1.4830): 47%|████▋ | 438/938 [02:28<02:37, 3.18it/s]
Training 1/1 epoch (loss 1.6160): 47%|████▋ | 439/938 [02:28<02:39, 3.13it/s]
Training 1/1 epoch (loss 1.5659): 47%|████▋ | 440/938 [02:28<02:36, 3.17it/s]
Training 1/1 epoch (loss 1.5767): 47%|████▋ | 441/938 [02:29<02:35, 3.19it/s]
Training 1/1 epoch (loss 1.5149): 47%|████▋ | 442/938 [02:29<02:34, 3.21it/s]
Training 1/1 epoch (loss 1.5733): 47%|████▋ | 443/938 [02:29<02:33, 3.22it/s]
Training 1/1 epoch (loss 1.6274): 47%|████▋ | 444/938 [02:30<02:36, 3.15it/s]
Training 1/1 epoch (loss 1.4172): 47%|████▋ | 445/938 [02:30<02:37, 3.13it/s]
Training 1/1 epoch (loss 1.4824): 48%|████▊ | 446/938 [02:30<02:35, 3.17it/s]
Training 1/1 epoch (loss 1.6436): 48%|████▊ | 447/938 [02:31<02:31, 3.24it/s]
Training 1/1 epoch (loss 1.5732): 48%|████▊ | 448/938 [02:31<02:31, 3.22it/s]
Training 1/1 epoch (loss 1.6792): 48%|████▊ | 449/938 [02:31<02:40, 3.05it/s]
Training 1/1 epoch (loss 1.4278): 48%|████▊ | 450/938 [02:32<02:40, 3.04it/s]
Training 1/1 epoch (loss 1.5390): 48%|████▊ | 451/938 [02:32<02:41, 3.02it/s]
Training 1/1 epoch (loss 1.5586): 48%|████▊ | 452/938 [02:32<02:38, 3.07it/s]
Training 1/1 epoch (loss 1.5522): 48%|████▊ | 453/938 [02:33<02:34, 3.15it/s]
Training 1/1 epoch (loss 1.4834): 48%|████▊ | 454/938 [02:33<02:31, 3.19it/s]
Training 1/1 epoch (loss 1.6138): 49%|████▊ | 455/938 [02:33<02:30, 3.22it/s]
Training 1/1 epoch (loss 1.5585): 49%|████▊ | 456/938 [02:34<02:33, 3.14it/s]
Training 1/1 epoch (loss 1.6769): 49%|████▊ | 457/938 [02:34<02:40, 3.00it/s]
Training 1/1 epoch (loss 1.6343): 49%|████▉ | 458/938 [02:34<02:36, 3.07it/s]
Training 1/1 epoch (loss 1.6533): 49%|████▉ | 459/938 [02:34<02:33, 3.11it/s]
Training 1/1 epoch (loss 1.6076): 49%|████▉ | 460/938 [02:35<02:32, 3.12it/s]
Training 1/1 epoch (loss 1.5475): 49%|████▉ | 461/938 [02:35<02:33, 3.10it/s]
Training 1/1 epoch (loss 1.5164): 49%|████▉ | 462/938 [02:35<02:32, 3.12it/s]
Training 1/1 epoch (loss 1.5070): 49%|████▉ | 463/938 [02:36<02:37, 3.02it/s]
Training 1/1 epoch (loss 1.5728): 49%|████▉ | 464/938 [02:36<02:46, 2.84it/s]
Training 1/1 epoch (loss 1.5448): 50%|████▉ | 465/938 [02:37<02:44, 2.87it/s]
Training 1/1 epoch (loss 1.6206): 50%|████▉ | 466/938 [02:37<02:36, 3.02it/s]
Training 1/1 epoch (loss 1.6557): 50%|████▉ | 467/938 [02:37<02:35, 3.02it/s]
Training 1/1 epoch (loss 1.6861): 50%|████▉ | 468/938 [02:38<03:04, 2.54it/s]
Training 1/1 epoch (loss 1.6396): 50%|█████ | 469/938 [02:38<02:58, 2.63it/s]
Training 1/1 epoch (loss 1.4508): 50%|█████ | 470/938 [02:38<02:51, 2.73it/s]
Training 1/1 epoch (loss 1.5791): 50%|█████ | 471/938 [02:39<02:41, 2.88it/s]
Training 1/1 epoch (loss 1.5884): 50%|█████ | 472/938 [02:39<02:40, 2.91it/s]
Training 1/1 epoch (loss 1.4805): 50%|█████ | 473/938 [02:39<02:38, 2.93it/s]
Training 1/1 epoch (loss 1.6152): 51%|█████ | 474/938 [02:40<02:42, 2.86it/s]
Training 1/1 epoch (loss 1.5073): 51%|█████ | 475/938 [02:40<02:45, 2.79it/s]
Training 1/1 epoch (loss 1.5516): 51%|█████ | 476/938 [02:40<02:36, 2.95it/s]
Training 1/1 epoch (loss 1.5174): 51%|█████ | 477/938 [02:41<02:30, 3.05it/s]
Training 1/1 epoch (loss 1.4812): 51%|█████ | 478/938 [02:41<02:27, 3.11it/s]
Training 1/1 epoch (loss 1.6080): 51%|█████ | 479/938 [02:41<02:25, 3.16it/s]
Training 1/1 epoch (loss 1.5620): 51%|█████ | 480/938 [02:42<02:34, 2.96it/s]
Training 1/1 epoch (loss 1.5897): 51%|█████▏ | 481/938 [02:42<02:34, 2.96it/s]
Training 1/1 epoch (loss 1.5537): 51%|█████▏ | 482/938 [02:42<02:29, 3.04it/s]
Training 1/1 epoch (loss 1.5076): 51%|█████▏ | 483/938 [02:43<02:28, 3.06it/s]
Training 1/1 epoch (loss 1.6870): 52%|█████▏ | 484/938 [02:43<02:24, 3.14it/s]
Training 1/1 epoch (loss 1.6892): 52%|█████▏ | 485/938 [02:43<02:24, 3.13it/s]
Training 1/1 epoch (loss 1.5762): 52%|█████▏ | 486/938 [02:44<02:26, 3.09it/s]
Training 1/1 epoch (loss 1.5624): 52%|█████▏ | 487/938 [02:44<02:29, 3.02it/s]
Training 1/1 epoch (loss 1.5573): 52%|█████▏ | 488/938 [02:44<02:28, 3.03it/s]
Training 1/1 epoch (loss 1.5748): 52%|█████▏ | 489/938 [02:45<02:31, 2.96it/s]
Training 1/1 epoch (loss 1.6076): 52%|█████▏ | 490/938 [02:45<02:26, 3.06it/s]
Training 1/1 epoch (loss 1.6670): 52%|█████▏ | 491/938 [02:45<02:21, 3.16it/s]
Training 1/1 epoch (loss 1.6545): 52%|█████▏ | 492/938 [02:46<02:22, 3.12it/s]
Training 1/1 epoch (loss 1.5241): 53%|█████▎ | 493/938 [02:46<02:20, 3.16it/s]
Training 1/1 epoch (loss 1.5698): 53%|█████▎ | 494/938 [02:46<02:20, 3.16it/s]
Training 1/1 epoch (loss 1.5969): 53%|█████▎ | 495/938 [02:47<02:18, 3.19it/s]
Training 1/1 epoch (loss 1.4482): 53%|█████▎ | 496/938 [02:47<02:24, 3.05it/s]
Training 1/1 epoch (loss 1.6345): 53%|█████▎ | 497/938 [02:47<02:23, 3.08it/s]
Training 1/1 epoch (loss 1.5356): 53%|█████▎ | 498/938 [02:48<02:22, 3.10it/s]
Training 1/1 epoch (loss 1.5320): 53%|█████▎ | 499/938 [02:48<02:20, 3.14it/s]
Training 1/1 epoch (loss 1.4697): 53%|█████▎ | 500/938 [02:48<02:22, 3.06it/s]
Training 1/1 epoch (loss 1.6312): 53%|█████▎ | 501/938 [02:48<02:17, 3.17it/s]
Training 1/1 epoch (loss 1.6236): 54%|█████▎ | 502/938 [02:49<02:17, 3.17it/s]
Training 1/1 epoch (loss 1.5805): 54%|█████▎ | 503/938 [02:49<02:14, 3.24it/s]
Training 1/1 epoch (loss 1.4941): 54%|█████▎ | 504/938 [02:49<02:15, 3.20it/s]
Training 1/1 epoch (loss 1.5493): 54%|█████▍ | 505/938 [02:50<02:17, 3.15it/s]
Training 1/1 epoch (loss 1.5651): 54%|█████▍ | 506/938 [02:50<02:30, 2.88it/s]
Training 1/1 epoch (loss 1.6176): 54%|█████▍ | 507/938 [02:51<02:35, 2.78it/s]
Training 1/1 epoch (loss 1.5751): 54%|█████▍ | 508/938 [02:51<02:40, 2.68it/s]
Training 1/1 epoch (loss 1.5599): 54%|█████▍ | 509/938 [02:51<02:42, 2.64it/s]
Training 1/1 epoch (loss 1.6340): 54%|█████▍ | 510/938 [02:52<02:50, 2.52it/s]
Training 1/1 epoch (loss 1.7228): 54%|█████▍ | 511/938 [02:52<03:03, 2.32it/s]
Training 1/1 epoch (loss 1.5460): 55%|█████▍ | 512/938 [02:53<03:00, 2.37it/s]
Training 1/1 epoch (loss 1.5685): 55%|█████▍ | 513/938 [02:53<02:58, 2.38it/s]
Training 1/1 epoch (loss 1.6172): 55%|█████▍ | 514/938 [02:53<02:57, 2.39it/s]
Training 1/1 epoch (loss 1.5346): 55%|█████▍ | 515/938 [02:54<02:54, 2.42it/s]
Training 1/1 epoch (loss 1.5898): 55%|█████▌ | 516/938 [02:54<02:54, 2.42it/s]
Training 1/1 epoch (loss 1.6138): 55%|█████▌ | 517/938 [02:55<02:49, 2.48it/s]
Training 1/1 epoch (loss 1.5388): 55%|█████▌ | 518/938 [02:55<02:41, 2.60it/s]
Training 1/1 epoch (loss 1.6172): 55%|█████▌ | 519/938 [02:55<02:32, 2.75it/s]
Training 1/1 epoch (loss 1.6425): 55%|█████▌ | 520/938 [02:56<02:29, 2.79it/s]
Training 1/1 epoch (loss 1.4823): 56%|█████▌ | 521/938 [02:56<02:24, 2.89it/s]
Training 1/1 epoch (loss 1.5558): 56%|█████▌ | 522/938 [02:56<02:24, 2.87it/s]
Training 1/1 epoch (loss 1.5334): 56%|█████▌ | 523/938 [02:57<02:19, 2.97it/s]
Training 1/1 epoch (loss 1.4454): 56%|█████▌ | 524/938 [02:57<02:15, 3.06it/s]
Training 1/1 epoch (loss 1.5476): 56%|█████▌ | 525/938 [02:57<02:18, 2.99it/s]
Training 1/1 epoch (loss 1.6010): 56%|█████▌ | 526/938 [02:58<02:17, 2.99it/s]
Training 1/1 epoch (loss 1.5598): 56%|█████▌ | 527/938 [02:58<02:17, 2.99it/s]
Training 1/1 epoch (loss 1.5859): 56%|█████▋ | 528/938 [02:58<02:18, 2.95it/s]
Training 1/1 epoch (loss 1.5165): 56%|█████▋ | 529/938 [02:59<02:15, 3.03it/s]
Training 1/1 epoch (loss 1.5363): 57%|█████▋ | 530/938 [02:59<02:11, 3.11it/s]
Training 1/1 epoch (loss 1.5001): 57%|█████▋ | 531/938 [02:59<02:09, 3.13it/s]
Training 1/1 epoch (loss 1.6523): 57%|█████▋ | 532/938 [03:00<02:14, 3.02it/s]
Training 1/1 epoch (loss 1.5929): 57%|█████▋ | 533/938 [03:00<02:13, 3.03it/s]
Training 1/1 epoch (loss 1.5308): 57%|█████▋ | 534/938 [03:00<02:11, 3.07it/s]
Training 1/1 epoch (loss 1.5366): 57%|█████▋ | 535/938 [03:01<02:08, 3.14it/s]
Training 1/1 epoch (loss 1.4742): 57%|█████▋ | 536/938 [03:01<02:11, 3.06it/s]
Training 1/1 epoch (loss 1.5881): 57%|█████▋ | 537/938 [03:01<02:09, 3.09it/s]
Training 1/1 epoch (loss 1.5053): 57%|█████▋ | 538/938 [03:02<02:06, 3.17it/s]
Training 1/1 epoch (loss 1.5543): 57%|█████▋ | 539/938 [03:02<02:07, 3.13it/s]
Training 1/1 epoch (loss 1.5671): 58%|█████▊ | 540/938 [03:02<02:07, 3.13it/s]
Training 1/1 epoch (loss 1.4403): 58%|█████▊ | 541/938 [03:03<02:12, 3.01it/s]
Training 1/1 epoch (loss 1.5155): 58%|█████▊ | 542/938 [03:03<02:06, 3.12it/s]
Training 1/1 epoch (loss 1.5976): 58%|█████▊ | 543/938 [03:03<02:06, 3.12it/s]
Training 1/1 epoch (loss 1.5115): 58%|█████▊ | 544/938 [03:03<02:08, 3.07it/s]
Training 1/1 epoch (loss 1.4575): 58%|█████▊ | 545/938 [03:04<02:08, 3.06it/s]
Training 1/1 epoch (loss 1.4824): 58%|█████▊ | 546/938 [03:04<02:06, 3.09it/s]
Training 1/1 epoch (loss 1.4837): 58%|█████▊ | 547/938 [03:04<02:04, 3.13it/s]
Training 1/1 epoch (loss 1.6152): 58%|█████▊ | 548/938 [03:05<02:03, 3.17it/s]
Training 1/1 epoch (loss 1.4821): 59%|█████▊ | 549/938 [03:05<02:00, 3.23it/s]
Training 1/1 epoch (loss 1.6039): 59%|█████▊ | 550/938 [03:05<01:59, 3.24it/s]
Training 1/1 epoch (loss 1.5533): 59%|█████▊ | 551/938 [03:06<02:01, 3.19it/s]
Training 1/1 epoch (loss 1.5321): 59%|█████▉ | 552/938 [03:06<02:01, 3.17it/s]
Training 1/1 epoch (loss 1.6364): 59%|█████▉ | 553/938 [03:06<02:07, 3.01it/s]
Training 1/1 epoch (loss 1.5900): 59%|█████▉ | 554/938 [03:07<02:05, 3.07it/s]
Training 1/1 epoch (loss 1.4983): 59%|█████▉ | 555/938 [03:07<02:02, 3.13it/s]
Training 1/1 epoch (loss 1.6046): 59%|█████▉ | 556/938 [03:07<02:00, 3.17it/s]
Training 1/1 epoch (loss 1.5894): 59%|█████▉ | 557/938 [03:08<02:04, 3.07it/s]
Training 1/1 epoch (loss 1.5874): 59%|█████▉ | 558/938 [03:08<02:03, 3.08it/s]
Training 1/1 epoch (loss 1.5958): 60%|█████▉ | 559/938 [03:08<02:04, 3.05it/s]
Training 1/1 epoch (loss 1.4838): 60%|█████▉ | 560/938 [03:09<02:02, 3.08it/s]
Training 1/1 epoch (loss 1.5990): 60%|█████▉ | 561/938 [03:09<02:01, 3.11it/s]
Training 1/1 epoch (loss 1.6684): 60%|█████▉ | 562/938 [03:09<01:59, 3.15it/s]
Training 1/1 epoch (loss 1.4778): 60%|██████ | 563/938 [03:10<01:56, 3.21it/s]
Training 1/1 epoch (loss 1.5886): 60%|██████ | 564/938 [03:10<01:58, 3.15it/s]
Training 1/1 epoch (loss 1.5525): 60%|██████ | 565/938 [03:10<02:01, 3.07it/s]
Training 1/1 epoch (loss 1.5454): 60%|██████ | 566/938 [03:11<01:58, 3.15it/s]
Training 1/1 epoch (loss 1.5654): 60%|██████ | 567/938 [03:11<01:56, 3.20it/s]
Training 1/1 epoch (loss 1.6219): 61%|██████ | 568/938 [03:11<01:57, 3.16it/s]
Training 1/1 epoch (loss 1.6935): 61%|██████ | 569/938 [03:12<02:05, 2.95it/s]
Training 1/1 epoch (loss 1.6091): 61%|██████ | 570/938 [03:12<01:59, 3.07it/s]
Training 1/1 epoch (loss 1.6279): 61%|██████ | 571/938 [03:12<02:02, 2.99it/s]
Training 1/1 epoch (loss 1.6155): 61%|██████ | 572/938 [03:12<01:56, 3.15it/s]
Training 1/1 epoch (loss 1.6044): 61%|██████ | 573/938 [03:13<01:58, 3.08it/s]
Training 1/1 epoch (loss 1.5567): 61%|██████ | 574/938 [03:13<02:01, 3.01it/s]
Training 1/1 epoch (loss 1.5404): 61%|██████▏ | 575/938 [03:13<01:59, 3.03it/s]
Training 1/1 epoch (loss 1.5661): 61%|██████▏ | 576/938 [03:14<02:00, 3.00it/s]
Training 1/1 epoch (loss 1.5116): 62%|██████▏ | 577/938 [03:14<01:59, 3.02it/s]
Training 1/1 epoch (loss 1.6470): 62%|██████▏ | 578/938 [03:14<01:58, 3.04it/s]
Training 1/1 epoch (loss 1.6173): 62%|██████▏ | 579/938 [03:15<01:54, 3.13it/s]
Training 1/1 epoch (loss 1.5724): 62%|██████▏ | 580/938 [03:15<01:52, 3.18it/s]
Training 1/1 epoch (loss 1.5508): 62%|██████▏ | 581/938 [03:15<01:50, 3.24it/s]
Training 1/1 epoch (loss 1.4746): 62%|██████▏ | 582/938 [03:16<01:50, 3.23it/s]
Training 1/1 epoch (loss 1.6138): 62%|██████▏ | 583/938 [03:16<01:52, 3.17it/s]
Training 1/1 epoch (loss 1.4368): 62%|██████▏ | 584/938 [03:16<01:54, 3.09it/s]
Training 1/1 epoch (loss 1.4547): 62%|██████▏ | 585/938 [03:17<01:53, 3.12it/s]
Training 1/1 epoch (loss 1.6004): 62%|██████▏ | 586/938 [03:17<01:51, 3.17it/s]
Training 1/1 epoch (loss 1.5961): 63%|██████▎ | 587/938 [03:17<01:49, 3.21it/s]
Training 1/1 epoch (loss 1.5942): 63%|██████▎ | 588/938 [03:18<01:47, 3.26it/s]
Training 1/1 epoch (loss 1.6173): 63%|██████▎ | 589/938 [03:18<01:53, 3.07it/s]
Training 1/1 epoch (loss 1.6200): 63%|██████▎ | 590/938 [03:18<01:56, 2.97it/s]
Training 1/1 epoch (loss 1.5579): 63%|██████▎ | 591/938 [03:19<01:52, 3.08it/s]
Training 1/1 epoch (loss 1.4550): 63%|██████▎ | 592/938 [03:19<01:52, 3.08it/s]
Training 1/1 epoch (loss 1.6276): 63%|██████▎ | 593/938 [03:19<01:50, 3.12it/s]
Training 1/1 epoch (loss 1.5924): 63%|██████▎ | 594/938 [03:20<01:50, 3.11it/s]
Training 1/1 epoch (loss 1.5105): 63%|██████▎ | 595/938 [03:20<01:49, 3.13it/s]
Training 1/1 epoch (loss 1.6402): 64%|██████▎ | 596/938 [03:20<01:52, 3.05it/s]
Training 1/1 epoch (loss 1.5250): 64%|██████▎ | 597/938 [03:21<01:50, 3.10it/s]
Training 1/1 epoch (loss 1.6509): 64%|██████▍ | 598/938 [03:21<01:47, 3.16it/s]
Training 1/1 epoch (loss 1.6092): 64%|██████▍ | 599/938 [03:21<01:46, 3.19it/s]
Training 1/1 epoch (loss 1.5281): 64%|██████▍ | 600/938 [03:21<01:46, 3.18it/s]
Training 1/1 epoch (loss 1.6105): 64%|██████▍ | 601/938 [03:22<01:48, 3.09it/s]
Training 1/1 epoch (loss 1.4614): 64%|██████▍ | 602/938 [03:22<01:49, 3.08it/s]
Training 1/1 epoch (loss 1.5772): 64%|██████▍ | 603/938 [03:22<01:47, 3.11it/s]
Training 1/1 epoch (loss 1.6313): 64%|██████▍ | 604/938 [03:23<01:44, 3.19it/s]
Training 1/1 epoch (loss 1.5476): 64%|██████▍ | 605/938 [03:23<01:43, 3.21it/s]
Training 1/1 epoch (loss 1.6078): 65%|██████▍ | 606/938 [03:23<01:44, 3.18it/s]
Training 1/1 epoch (loss 1.5457): 65%|██████▍ | 607/938 [03:24<01:49, 3.02it/s]
Training 1/1 epoch (loss 1.6533): 65%|██████▍ | 608/938 [03:24<01:49, 3.01it/s]
Training 1/1 epoch (loss 1.5588): 65%|██████▍ | 609/938 [03:24<01:49, 3.00it/s]
Training 1/1 epoch (loss 1.5401): 65%|██████▌ | 610/938 [03:25<01:46, 3.09it/s]
Training 1/1 epoch (loss 1.6145): 65%|██████▌ | 611/938 [03:25<01:44, 3.13it/s]
Training 1/1 epoch (loss 1.5095): 65%|██████▌ | 612/938 [03:25<01:42, 3.18it/s]
Training 1/1 epoch (loss 1.3935): 65%|██████▌ | 613/938 [03:26<01:45, 3.09it/s]
Training 1/1 epoch (loss 1.6739): 65%|██████▌ | 614/938 [03:26<01:44, 3.10it/s]
Training 1/1 epoch (loss 1.4317): 66%|██████▌ | 615/938 [03:26<01:43, 3.14it/s]
Training 1/1 epoch (loss 1.5907): 66%|██████▌ | 616/938 [03:27<01:43, 3.12it/s]
Training 1/1 epoch (loss 1.4520): 66%|██████▌ | 617/938 [03:27<01:41, 3.15it/s]
Training 1/1 epoch (loss 1.5291): 66%|██████▌ | 618/938 [03:27<01:39, 3.22it/s]
Training 1/1 epoch (loss 1.5793): 66%|██████▌ | 619/938 [03:28<01:41, 3.14it/s]
Training 1/1 epoch (loss 1.4663): 66%|██████▌ | 620/938 [03:28<01:39, 3.20it/s]
Training 1/1 epoch (loss 1.4455): 66%|██████▌ | 621/938 [03:28<01:40, 3.15it/s]
Training 1/1 epoch (loss 1.6780): 66%|██████▋ | 622/938 [03:28<01:39, 3.18it/s]
Training 1/1 epoch (loss 1.4908): 66%|██████▋ | 623/938 [03:29<01:41, 3.11it/s]
Training 1/1 epoch (loss 1.4927): 67%|██████▋ | 624/938 [03:29<01:39, 3.14it/s]
Training 1/1 epoch (loss 1.4076): 67%|██████▋ | 625/938 [03:29<01:40, 3.10it/s]
Training 1/1 epoch (loss 1.5538): 67%|██████▋ | 626/938 [03:30<01:37, 3.20it/s]
Training 1/1 epoch (loss 1.5698): 67%|██████▋ | 627/938 [03:30<01:38, 3.15it/s]
Training 1/1 epoch (loss 1.5335): 67%|██████▋ | 628/938 [03:30<01:37, 3.17it/s]
Training 1/1 epoch (loss 1.5108): 67%|██████▋ | 629/938 [03:31<01:35, 3.24it/s]
Training 1/1 epoch (loss 1.5615): 67%|██████▋ | 630/938 [03:31<01:35, 3.23it/s]
Training 1/1 epoch (loss 1.4929): 67%|██████▋
| 631/938 [03:31<01:33, 3.28it/s] Training 1/1 epoch (loss 1.6252): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 631/938 [03:32<01:33, 3.28it/s] Training 1/1 epoch (loss 1.6252): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 632/938 [03:32<01:37, 3.14it/s] Training 1/1 epoch (loss 1.4273): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 632/938 [03:32<01:37, 3.14it/s] Training 1/1 epoch (loss 1.4273): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 633/938 [03:32<01:36, 3.15it/s] Training 1/1 epoch (loss 1.5811): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 633/938 [03:32<01:36, 3.15it/s] Training 1/1 epoch (loss 1.5811): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 634/938 [03:32<01:38, 3.10it/s] Training 1/1 epoch (loss 1.5471): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 634/938 [03:33<01:38, 3.10it/s] Training 1/1 epoch (loss 1.5471): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 635/938 [03:33<01:35, 3.17it/s] Training 1/1 epoch (loss 1.4875): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 635/938 [03:33<01:35, 3.17it/s] Training 1/1 epoch (loss 1.4875): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 636/938 [03:33<01:33, 3.24it/s] Training 1/1 epoch (loss 1.6209): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 636/938 [03:33<01:33, 3.24it/s] Training 1/1 epoch (loss 1.6209): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 637/938 [03:33<01:33, 3.22it/s] Training 1/1 epoch (loss 1.5611): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 637/938 [03:34<01:33, 3.22it/s] Training 1/1 epoch (loss 1.5611): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 638/938 [03:34<01:33, 3.22it/s] Training 1/1 epoch (loss 1.4530): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 638/938 [03:34<01:33, 3.22it/s] Training 1/1 epoch (loss 1.4530): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 639/938 [03:34<01:45, 2.85it/s] Training 1/1 epoch (loss 1.5948): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 639/938 [03:34<01:45, 2.85it/s] Training 1/1 epoch (loss 1.5948): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 640/938 [03:34<01:45, 2.83it/s] Training 1/1 epoch (loss 1.4304): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 640/938 [03:35<01:45, 2.83it/s] Training 1/1 epoch (loss 1.4304): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 641/938 [03:35<01:40, 2.95it/s] Training 1/1 epoch (loss 1.4964): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 641/938 [03:35<01:40, 
2.95it/s] Training 1/1 epoch (loss 1.4964): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 642/938 [03:35<01:38, 3.01it/s] Training 1/1 epoch (loss 1.6106): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 642/938 [03:35<01:38, 3.01it/s] Training 1/1 epoch (loss 1.6106): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 643/938 [03:35<01:34, 3.11it/s] Training 1/1 epoch (loss 1.4385): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 643/938 [03:36<01:34, 3.11it/s] Training 1/1 epoch (loss 1.4385): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 644/938 [03:36<01:34, 3.11it/s] Training 1/1 epoch (loss 1.5954): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 644/938 [03:36<01:34, 3.11it/s] Training 1/1 epoch (loss 1.5954): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 645/938 [03:36<01:33, 3.12it/s] Training 1/1 epoch (loss 1.6235): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 645/938 [03:36<01:33, 3.12it/s] Training 1/1 epoch (loss 1.6235): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 646/938 [03:36<01:33, 3.11it/s] Training 1/1 epoch (loss 1.6103): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 646/938 [03:36<01:33, 3.11it/s] Training 1/1 epoch (loss 1.6103): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 647/938 [03:36<01:31, 3.17it/s] Training 1/1 epoch (loss 1.6081): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 647/938 [03:37<01:31, 3.17it/s] Training 1/1 epoch (loss 1.6081): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 648/938 [03:37<01:33, 3.11it/s] Training 1/1 epoch (loss 1.5501): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 648/938 [03:37<01:33, 3.11it/s] Training 1/1 epoch (loss 1.5501): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 649/938 [03:37<01:31, 3.16it/s] Training 1/1 epoch (loss 1.5011): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 649/938 [03:37<01:31, 3.16it/s] Training 1/1 epoch (loss 1.5011): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 650/938 [03:37<01:31, 3.16it/s] Training 1/1 epoch (loss 1.5149): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 650/938 [03:38<01:31, 3.16it/s] Training 1/1 epoch (loss 1.5149): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 651/938 [03:38<01:30, 3.17it/s] Training 1/1 epoch (loss 1.5174): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 651/938 [03:38<01:30, 3.17it/s] Training 1/1 epoch (loss 1.5174): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 652/938 [03:38<01:30, 3.18it/s] Training 1/1 
epoch (loss 1.6725): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 652/938 [03:38<01:30, 3.18it/s] Training 1/1 epoch (loss 1.6725): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 653/938 [03:38<01:30, 3.13it/s] Training 1/1 epoch (loss 1.6095): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 653/938 [03:39<01:30, 3.13it/s] Training 1/1 epoch (loss 1.6095): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 654/938 [03:39<01:28, 3.20it/s] Training 1/1 epoch (loss 1.5075): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 654/938 [03:39<01:28, 3.20it/s] Training 1/1 epoch (loss 1.5075): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 655/938 [03:39<01:27, 3.25it/s] Training 1/1 epoch (loss 1.5683): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 655/938 [03:39<01:27, 3.25it/s] Training 1/1 epoch (loss 1.5683): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 656/938 [03:39<01:29, 3.15it/s] Training 1/1 epoch (loss 1.6426): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 656/938 [03:40<01:29, 3.15it/s] Training 1/1 epoch (loss 1.6426): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 657/938 [03:40<01:30, 3.12it/s] Training 1/1 epoch (loss 1.5477): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 657/938 [03:40<01:30, 3.12it/s] Training 1/1 epoch (loss 1.5477): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 658/938 [03:40<01:33, 2.99it/s] Training 1/1 epoch (loss 1.6125): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 658/938 [03:40<01:33, 2.99it/s] Training 1/1 epoch (loss 1.6125): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 659/938 [03:40<01:31, 3.04it/s] Training 1/1 epoch (loss 1.5697): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 659/938 [03:41<01:31, 3.04it/s] Training 1/1 epoch (loss 1.5697): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 660/938 [03:41<01:29, 3.10it/s] Training 1/1 epoch (loss 1.6092): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 660/938 [03:41<01:29, 3.10it/s] Training 1/1 epoch (loss 1.6092): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 661/938 [03:41<01:27, 3.15it/s] Training 1/1 epoch (loss 1.5572): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 661/938 [03:41<01:27, 3.15it/s] Training 1/1 epoch (loss 1.5572): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 662/938 [03:41<01:24, 3.27it/s] Training 1/1 epoch (loss 1.4584): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 662/938 [03:42<01:24, 3.27it/s] Training 1/1 epoch (loss 1.4584): 
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 663/938 [03:42<01:31, 3.01it/s] Training 1/1 epoch (loss 1.5627): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 663/938 [03:42<01:31, 3.01it/s] Training 1/1 epoch (loss 1.5627): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 664/938 [03:42<01:30, 3.01it/s] Training 1/1 epoch (loss 1.5734): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 664/938 [03:42<01:30, 3.01it/s] Training 1/1 epoch (loss 1.5734): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 665/938 [03:42<01:33, 2.93it/s] Training 1/1 epoch (loss 1.5088): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 665/938 [03:43<01:33, 2.93it/s] Training 1/1 epoch (loss 1.5088): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 666/938 [03:43<01:32, 2.96it/s] Training 1/1 epoch (loss 1.5149): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 666/938 [03:43<01:32, 2.96it/s] Training 1/1 epoch (loss 1.5149): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 667/938 [03:43<01:28, 3.06it/s] Training 1/1 epoch (loss 1.4213): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 667/938 [03:43<01:28, 3.06it/s] Training 1/1 epoch (loss 1.4213): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 668/938 [03:43<01:26, 3.10it/s] Training 1/1 epoch (loss 1.4748): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 668/938 [03:44<01:26, 3.10it/s] Training 1/1 epoch (loss 1.4748): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 669/938 [03:44<01:27, 3.07it/s] Training 1/1 epoch (loss 1.4353): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 669/938 [03:44<01:27, 3.07it/s] Training 1/1 epoch (loss 1.4353): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 670/938 [03:44<01:29, 2.98it/s] Training 1/1 epoch (loss 1.6206): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 670/938 [03:44<01:29, 2.98it/s] Training 1/1 epoch (loss 1.6206): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 671/938 [03:44<01:32, 2.89it/s] Training 1/1 epoch (loss 1.7323): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 671/938 [03:45<01:32, 2.89it/s] Training 1/1 epoch (loss 1.7323): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/938 [03:45<01:35, 2.78it/s] Training 1/1 epoch (loss 1.5270): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/938 [03:45<01:35, 2.78it/s] Training 1/1 epoch (loss 1.5270): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/938 [03:45<01:30, 2.93it/s] Training 1/1 epoch (loss 
1.5611): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/938 [03:45<01:30, 2.93it/s] Training 1/1 epoch (loss 1.5611): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/938 [03:45<01:26, 3.05it/s] Training 1/1 epoch (loss 1.5257): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/938 [03:46<01:26, 3.05it/s] Training 1/1 epoch (loss 1.5257): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/938 [03:46<01:24, 3.12it/s] Training 1/1 epoch (loss 1.5568): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/938 [03:46<01:24, 3.12it/s] Training 1/1 epoch (loss 1.5568): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/938 [03:46<01:24, 3.09it/s] Training 1/1 epoch (loss 1.6458): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/938 [03:46<01:24, 3.09it/s] Training 1/1 epoch (loss 1.6458): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/938 [03:46<01:24, 3.09it/s] Training 1/1 epoch (loss 1.4791): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/938 [03:47<01:24, 3.09it/s] Training 1/1 epoch (loss 1.4791): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/938 [03:47<01:23, 3.11it/s] Training 1/1 epoch (loss 1.5308): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/938 [03:47<01:23, 3.11it/s] Training 1/1 epoch (loss 1.5308): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/938 [03:47<01:22, 3.15it/s] Training 1/1 epoch (loss 1.5858): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/938 [03:47<01:22, 3.15it/s] Training 1/1 epoch (loss 1.5858): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/938 [03:47<01:23, 3.09it/s] Training 1/1 epoch (loss 1.5595): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/938 [03:48<01:23, 3.09it/s] Training 1/1 epoch (loss 1.5595): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/938 [03:48<01:24, 3.05it/s] Training 1/1 epoch (loss 1.5776): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/938 [03:48<01:24, 3.05it/s] Training 1/1 epoch (loss 1.5776): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/938 [03:48<01:24, 3.04it/s] Training 1/1 epoch (loss 1.6080): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/938 [03:48<01:24, 3.04it/s] Training 1/1 epoch (loss 1.6080): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/938 [03:48<01:25, 2.97it/s] Training 1/1 epoch (loss 1.5678): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/938 
[03:49<01:25, 2.97it/s] Training 1/1 epoch (loss 1.5678): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 684/938 [03:49<01:23, 3.03it/s] Training 1/1 epoch (loss 1.5794): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 684/938 [03:49<01:23, 3.03it/s] Training 1/1 epoch (loss 1.5794): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 685/938 [03:49<01:21, 3.11it/s] Training 1/1 epoch (loss 1.6389): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 685/938 [03:49<01:21, 3.11it/s] Training 1/1 epoch (loss 1.6389): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 686/938 [03:49<01:19, 3.15it/s] Training 1/1 epoch (loss 1.5603): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 686/938 [03:50<01:19, 3.15it/s] Training 1/1 epoch (loss 1.5603): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 687/938 [03:50<01:25, 2.95it/s] Training 1/1 epoch (loss 1.5706): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 687/938 [03:50<01:25, 2.95it/s] Training 1/1 epoch (loss 1.5706): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 688/938 [03:50<01:32, 2.70it/s] Training 1/1 epoch (loss 1.5967): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 688/938 [03:50<01:32, 2.70it/s] Training 1/1 epoch (loss 1.5967): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 689/938 [03:50<01:29, 2.77it/s] Training 1/1 epoch (loss 1.6062): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 689/938 [03:51<01:29, 2.77it/s] Training 1/1 epoch (loss 1.6062): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 690/938 [03:51<01:25, 2.90it/s] Training 1/1 epoch (loss 1.6382): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 690/938 [03:51<01:25, 2.90it/s] Training 1/1 epoch (loss 1.6382): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 691/938 [03:51<01:23, 2.96it/s] Training 1/1 epoch (loss 1.5799): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 691/938 [03:51<01:23, 2.96it/s] Training 1/1 epoch (loss 1.5799): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 692/938 [03:51<01:21, 3.00it/s] Training 1/1 epoch (loss 1.5219): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 692/938 [03:52<01:21, 3.00it/s] Training 1/1 epoch (loss 1.5219): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 693/938 [03:52<01:21, 2.99it/s] Training 1/1 epoch (loss 1.4773): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 693/938 [03:52<01:21, 2.99it/s] Training 1/1 epoch (loss 1.4773): 
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 694/938 [03:52<01:22, 2.95it/s] Training 1/1 epoch (loss 1.5627): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 694/938 [03:52<01:22, 2.95it/s] Training 1/1 epoch (loss 1.5627): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 695/938 [03:52<01:22, 2.96it/s] Training 1/1 epoch (loss 1.3962): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 695/938 [03:53<01:22, 2.96it/s] Training 1/1 epoch (loss 1.3962): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 696/938 [03:53<01:19, 3.04it/s] Training 1/1 epoch (loss 1.6194): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 696/938 [03:53<01:19, 3.04it/s] Training 1/1 epoch (loss 1.6194): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 697/938 [03:53<01:18, 3.07it/s] Training 1/1 epoch (loss 1.5220): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 697/938 [03:53<01:18, 3.07it/s] Training 1/1 epoch (loss 1.5220): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 698/938 [03:53<01:16, 3.15it/s] Training 1/1 epoch (loss 1.4612): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 698/938 [03:54<01:16, 3.15it/s] Training 1/1 epoch (loss 1.4612): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 699/938 [03:54<01:17, 3.07it/s] Training 1/1 epoch (loss 1.5786): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 699/938 [03:54<01:17, 3.07it/s] Training 1/1 epoch (loss 1.5786): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 700/938 [03:54<01:20, 2.95it/s] Training 1/1 epoch (loss 1.4870): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 700/938 [03:54<01:20, 2.95it/s] Training 1/1 epoch (loss 1.4870): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 701/938 [03:54<01:24, 2.81it/s] Training 1/1 epoch (loss 1.4649): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 701/938 [03:55<01:24, 2.81it/s] Training 1/1 epoch (loss 1.4649): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 702/938 [03:55<01:20, 2.92it/s] Training 1/1 epoch (loss 1.5970): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 702/938 [03:55<01:20, 2.92it/s] Training 1/1 epoch (loss 1.5970): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 703/938 [03:55<01:20, 2.93it/s] Training 1/1 epoch (loss 1.4781): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 703/938 [03:55<01:20, 2.93it/s] Training 1/1 epoch (loss 1.4781): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/938 [03:55<01:18, 
2.96it/s] Training 1/1 epoch (loss 1.5448): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/938 [03:56<01:18, 2.96it/s] Training 1/1 epoch (loss 1.5448): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/938 [03:56<01:19, 2.94it/s] Training 1/1 epoch (loss 1.4522): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/938 [03:56<01:19, 2.94it/s] Training 1/1 epoch (loss 1.4522): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/938 [03:56<01:17, 3.01it/s] Training 1/1 epoch (loss 1.5497): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/938 [03:56<01:17, 3.01it/s] Training 1/1 epoch (loss 1.5497): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/938 [03:56<01:16, 3.00it/s] Training 1/1 epoch (loss 1.5887): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/938 [03:57<01:16, 3.00it/s] Training 1/1 epoch (loss 1.5887): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/938 [03:57<01:17, 2.98it/s] Training 1/1 epoch (loss 1.6578): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/938 [03:57<01:17, 2.98it/s] Training 1/1 epoch (loss 1.6578): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/938 [03:57<01:14, 3.06it/s] Training 1/1 epoch (loss 1.5301): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/938 [03:57<01:14, 3.06it/s] Training 1/1 epoch (loss 1.5301): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/938 [03:57<01:12, 3.15it/s] Training 1/1 epoch (loss 1.5908): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/938 [03:58<01:12, 3.15it/s] Training 1/1 epoch (loss 1.5908): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/938 [03:58<01:12, 3.14it/s] Training 1/1 epoch (loss 1.5124): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/938 [03:58<01:12, 3.14it/s] Training 1/1 epoch (loss 1.5124): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/938 [03:58<01:14, 3.01it/s] Training 1/1 epoch (loss 1.5890): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/938 [03:58<01:14, 3.01it/s] Training 1/1 epoch (loss 1.5890): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/938 [03:58<01:13, 3.07it/s] Training 1/1 epoch (loss 1.5139): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/938 [03:59<01:13, 3.07it/s] Training 1/1 epoch (loss 1.5139): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/938 [03:59<01:11, 3.13it/s] Training 1/1 epoch (loss 1.5305): 
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/938 [03:59<01:11, 3.13it/s] Training 1/1 epoch (loss 1.5305): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/938 [03:59<01:10, 3.15it/s] Training 1/1 epoch (loss 1.5059): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/938 [03:59<01:10, 3.15it/s] Training 1/1 epoch (loss 1.5059): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/938 [03:59<01:09, 3.21it/s] Training 1/1 epoch (loss 1.5183): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/938 [04:00<01:09, 3.21it/s] Training 1/1 epoch (loss 1.5183): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/938 [04:00<01:14, 2.97it/s] Training 1/1 epoch (loss 1.5360): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/938 [04:00<01:14, 2.97it/s] Training 1/1 epoch (loss 1.5360): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/938 [04:00<01:11, 3.07it/s] Training 1/1 epoch (loss 1.5554): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/938 [04:00<01:11, 3.07it/s] Training 1/1 epoch (loss 1.5554): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 719/938 [04:00<01:12, 3.03it/s] Training 1/1 epoch (loss 1.5172): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 719/938 [04:01<01:12, 3.03it/s] Training 1/1 epoch (loss 1.5172): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 720/938 [04:01<01:13, 2.95it/s] Training 1/1 epoch (loss 1.4905): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 720/938 [04:01<01:13, 2.95it/s] Training 1/1 epoch (loss 1.4905): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 721/938 [04:01<01:11, 3.03it/s] Training 1/1 epoch (loss 1.4959): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 721/938 [04:01<01:11, 3.03it/s] Training 1/1 epoch (loss 1.4959): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 722/938 [04:01<01:08, 3.15it/s] Training 1/1 epoch (loss 1.4015): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 722/938 [04:01<01:08, 3.15it/s] Training 1/1 epoch (loss 1.4015): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/938 [04:01<01:07, 3.19it/s] Training 1/1 epoch (loss 1.4422): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/938 [04:02<01:07, 3.19it/s] Training 1/1 epoch (loss 1.4422): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/938 [04:02<01:07, 3.18it/s] Training 1/1 epoch (loss 1.4943): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/938 [04:02<01:07, 
3.18it/s] Training 1/1 epoch (loss 1.4943): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/938 [04:02<01:07, 3.16it/s] Training 1/1 epoch (loss 1.4814): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/938 [04:02<01:07, 3.16it/s] Training 1/1 epoch (loss 1.4814): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/938 [04:02<01:06, 3.18it/s] Training 1/1 epoch (loss 1.5570): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/938 [04:03<01:06, 3.18it/s] Training 1/1 epoch (loss 1.5570): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/938 [04:03<01:06, 3.20it/s] Training 1/1 epoch (loss 1.4316): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/938 [04:03<01:06, 3.20it/s] Training 1/1 epoch (loss 1.4316): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/938 [04:03<01:05, 3.18it/s] Training 1/1 epoch (loss 1.5077): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/938 [04:03<01:05, 3.18it/s] Training 1/1 epoch (loss 1.5077): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/938 [04:03<01:05, 3.19it/s] Training 1/1 epoch (loss 1.5535): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/938 [04:04<01:05, 3.19it/s] Training 1/1 epoch (loss 1.5535): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/938 [04:04<01:05, 3.20it/s] Training 1/1 epoch (loss 1.5956): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/938 [04:04<01:05, 3.20it/s] Training 1/1 epoch (loss 1.5956): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/938 [04:04<01:04, 3.20it/s] Training 1/1 epoch (loss 1.5431): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/938 [04:04<01:04, 3.20it/s] Training 1/1 epoch (loss 1.5431): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/938 [04:04<01:06, 3.11it/s] Training 1/1 epoch (loss 1.6304): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/938 [04:05<01:06, 3.11it/s] Training 1/1 epoch (loss 1.6304): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/938 [04:05<01:05, 3.15it/s] Training 1/1 epoch (loss 1.5368): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/938 [04:05<01:05, 3.15it/s] Training 1/1 epoch (loss 1.5368): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/938 [04:05<01:04, 3.16it/s] Training 1/1 epoch (loss 1.5698): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/938 [04:05<01:04, 3.16it/s] Training 1/1 epoch (loss 1.5698): 
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 735/938 [04:05<01:02, 3.22it/s] Training 1/1 epoch (loss 1.5325): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 735/938 [04:06<01:02, 3.22it/s] Training 1/1 epoch (loss 1.5325): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 736/938 [04:06<01:06, 3.03it/s] Training 1/1 epoch (loss 1.5415): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 736/938 [04:06<01:06, 3.03it/s] Training 1/1 epoch (loss 1.5415): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 737/938 [04:06<01:07, 2.97it/s] Training 1/1 epoch (loss 1.5268): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 737/938 [04:06<01:07, 2.97it/s] Training 1/1 epoch (loss 1.5268): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 738/938 [04:06<01:05, 3.04it/s] Training 1/1 epoch (loss 1.4781): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 738/938 [04:07<01:05, 3.04it/s] Training 1/1 epoch (loss 1.4781): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/938 [04:07<01:03, 3.14it/s] Training 1/1 epoch (loss 1.5546): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/938 [04:07<01:03, 3.14it/s] Training 1/1 epoch (loss 1.5546): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/938 [04:07<01:02, 3.15it/s] Training 1/1 epoch (loss 1.5922): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/938 [04:07<01:02, 3.15it/s] Training 1/1 epoch (loss 1.5922): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/938 [04:07<01:01, 3.20it/s] Training 1/1 epoch (loss 1.5520): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/938 [04:08<01:01, 3.20it/s] Training 1/1 epoch (loss 1.5520): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/938 [04:08<01:02, 3.15it/s] Training 1/1 epoch (loss 1.5593): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/938 [04:08<01:02, 3.15it/s] Training 1/1 epoch (loss 1.5593): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/938 [04:08<01:03, 3.05it/s] Training 1/1 epoch (loss 1.5317): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/938 [04:08<01:03, 3.05it/s] Training 1/1 epoch (loss 1.5317): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/938 [04:08<01:04, 3.03it/s] Training 1/1 epoch (loss 1.6153): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/938 [04:09<01:04, 3.03it/s] Training 1/1 epoch (loss 1.6153): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/938 [04:09<01:02, 
3.09it/s] Training 1/1 epoch (loss 1.4852): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/938 [04:09<01:02, 3.09it/s] Training 1/1 epoch (loss 1.4852): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/938 [04:09<01:02, 3.10it/s] Training 1/1 epoch (loss 1.6571): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/938 [04:09<01:02, 3.10it/s] Training 1/1 epoch (loss 1.6571): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/938 [04:09<01:00, 3.17it/s] Training 1/1 epoch (loss 1.5035): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/938 [04:09<01:00, 3.17it/s] Training 1/1 epoch (loss 1.5035): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/938 [04:09<01:00, 3.15it/s] Training 1/1 epoch (loss 1.4767): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/938 [04:10<01:00, 3.15it/s] Training 1/1 epoch (loss 1.4767): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/938 [04:10<00:59, 3.18it/s] Training 1/1 epoch (loss 1.4858): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/938 [04:10<00:59, 3.18it/s] Training 1/1 epoch (loss 1.4858): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 750/938 [04:10<01:00, 3.11it/s] Training 1/1 epoch (loss 1.5336): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 750/938 [04:10<01:00, 3.11it/s] Training 1/1 epoch (loss 1.5336): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 751/938 [04:10<00:59, 3.13it/s] Training 1/1 epoch (loss 1.5274): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 751/938 [04:11<00:59, 3.13it/s] Training 1/1 epoch (loss 1.5274): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 752/938 [04:11<01:01, 3.00it/s] Training 1/1 epoch (loss 1.5871): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 752/938 [04:11<01:01, 3.00it/s] Training 1/1 epoch (loss 1.5871): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 753/938 [04:11<01:00, 3.07it/s] Training 1/1 epoch (loss 1.5743): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 753/938 [04:11<01:00, 3.07it/s] Training 1/1 epoch (loss 1.5743): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 754/938 [04:11<00:59, 3.09it/s] Training 1/1 epoch (loss 1.5395): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 754/938 [04:12<00:59, 3.09it/s] Training 1/1 epoch (loss 1.5395): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 755/938 [04:12<00:59, 3.08it/s] Training 1/1 epoch (loss 1.5602): 
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 755/938 [04:12<00:59, 3.08it/s] Training 1/1 epoch (loss 1.5602): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 756/938 [04:12<01:00, 3.00it/s] Training 1/1 epoch (loss 1.5234): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 756/938 [04:12<01:00, 3.00it/s] Training 1/1 epoch (loss 1.5234): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 757/938 [04:12<00:58, 3.12it/s] Training 1/1 epoch (loss 1.6487): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 757/938 [04:13<00:58, 3.12it/s] Training 1/1 epoch (loss 1.6487): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 758/938 [04:13<00:56, 3.16it/s] Training 1/1 epoch (loss 1.5287): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 758/938 [04:13<00:56, 3.16it/s] Training 1/1 epoch (loss 1.5287): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 759/938 [04:13<00:56, 3.18it/s] Training 1/1 epoch (loss 1.3763): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 759/938 [04:13<00:56, 3.18it/s] Training 1/1 epoch (loss 1.3763): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 760/938 [04:13<00:55, 3.21it/s] Training 1/1 epoch (loss 1.6733): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 760/938 [04:14<00:55, 3.21it/s] Training 1/1 epoch (loss 1.6733): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 761/938 [04:14<00:57, 3.10it/s] Training 1/1 epoch (loss 1.5245): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 761/938 [04:14<00:57, 3.10it/s] Training 1/1 epoch (loss 1.5245): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 762/938 [04:14<00:57, 3.08it/s] Training 1/1 epoch (loss 1.6487): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 762/938 [04:14<00:57, 3.08it/s] Training 1/1 epoch (loss 1.6487): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 763/938 [04:14<00:57, 3.06it/s] Training 1/1 epoch (loss 1.3808): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 763/938 [04:15<00:57, 3.06it/s] Training 1/1 epoch (loss 1.3808): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 764/938 [04:15<00:58, 2.99it/s] Training 1/1 epoch (loss 1.6058): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 764/938 [04:15<00:58, 2.99it/s] Training 1/1 epoch (loss 1.6058): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 765/938 [04:15<00:55, 3.13it/s] Training 1/1 epoch (loss 1.5348): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 765/938 
[04:15<00:55, 3.13it/s] Training 1/1 epoch (loss 1.5348): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 766/938 [04:15<00:54, 3.13it/s] Training 1/1 epoch (loss 1.5916): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 766/938 [04:16<00:54, 3.13it/s] Training 1/1 epoch (loss 1.5916): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 767/938 [04:16<00:53, 3.18it/s] Training 1/1 epoch (loss 1.4585): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 767/938 [04:16<00:53, 3.18it/s] Training 1/1 epoch (loss 1.4585): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 768/938 [04:16<00:55, 3.04it/s] Training 1/1 epoch (loss 1.7167): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 768/938 [04:16<00:55, 3.04it/s] Training 1/1 epoch (loss 1.7167): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 769/938 [04:16<00:55, 3.02it/s] Training 1/1 epoch (loss 1.5531): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 769/938 [04:17<00:55, 3.02it/s] Training 1/1 epoch (loss 1.5531): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 770/938 [04:17<00:54, 3.08it/s] Training 1/1 epoch (loss 1.4547): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 770/938 [04:17<00:54, 3.08it/s] Training 1/1 epoch (loss 1.4547): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 771/938 [04:17<00:54, 3.09it/s] Training 1/1 epoch (loss 1.4792): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 771/938 [04:17<00:54, 3.09it/s] Training 1/1 epoch (loss 1.4792): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 772/938 [04:17<00:52, 3.15it/s] Training 1/1 epoch (loss 1.4779): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 772/938 [04:18<00:52, 3.15it/s] Training 1/1 epoch (loss 1.4779): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 773/938 [04:18<00:52, 3.16it/s] Training 1/1 epoch (loss 1.4138): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 773/938 [04:18<00:52, 3.16it/s] Training 1/1 epoch (loss 1.4138): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 774/938 [04:18<00:50, 3.23it/s] Training 1/1 epoch (loss 1.5065): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 774/938 [04:18<00:50, 3.23it/s] Training 1/1 epoch (loss 1.5065): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 775/938 [04:18<00:51, 3.15it/s] Training 1/1 epoch (loss 1.6044): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 775/938 
[04:18<00:51, 3.15it/s] Training 1/1 epoch (loss 1.6044): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 776/938 [04:18<00:51, 3.13it/s] Training 1/1 epoch (loss 1.5497): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 776/938 [04:19<00:51, 3.13it/s] Training 1/1 epoch (loss 1.5497): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 777/938 [04:19<00:51, 3.16it/s] Training 1/1 epoch (loss 1.4901): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 777/938 [04:19<00:51, 3.16it/s] Training 1/1 epoch (loss 1.4901): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 778/938 [04:19<00:50, 3.14it/s] Training 1/1 epoch (loss 1.4382): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 778/938 [04:19<00:50, 3.14it/s] Training 1/1 epoch (loss 1.4382): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 779/938 [04:19<00:49, 3.20it/s] Training 1/1 epoch (loss 1.6184): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 779/938 [04:20<00:49, 3.20it/s] Training 1/1 epoch (loss 1.6184): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 780/938 [04:20<00:49, 3.17it/s] Training 1/1 epoch (loss 1.6159): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 780/938 [04:20<00:49, 3.17it/s] Training 1/1 epoch (loss 1.6159): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 781/938 [04:20<00:50, 3.09it/s] Training 1/1 epoch (loss 1.5002): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 781/938 [04:20<00:50, 3.09it/s] Training 1/1 epoch (loss 1.5002): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 782/938 [04:20<00:50, 3.08it/s] Training 1/1 epoch (loss 1.5458): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 782/938 [04:21<00:50, 3.08it/s] Training 1/1 epoch (loss 1.5458): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 783/938 [04:21<00:49, 3.13it/s] Training 1/1 epoch (loss 1.5048): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 783/938 [04:21<00:49, 3.13it/s] Training 1/1 epoch (loss 1.5048): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 784/938 [04:21<00:51, 3.01it/s] Training 1/1 epoch (loss 1.5504): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 784/938 [04:21<00:51, 3.01it/s] Training 1/1 epoch (loss 1.5504): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 785/938 [04:21<00:51, 2.98it/s] Training 1/1 epoch (loss 1.5692): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 785/938 
[04:22<00:51, 2.98it/s]
Training 1/1 epoch (loss 1.5692): 84%|████████▍ | 786/938 [04:22<00:49, 3.05it/s]
Training 1/1 epoch (loss 1.4996): 84%|████████▍ | 787/938 [04:22<00:49, 3.08it/s]
Training 1/1 epoch (loss 1.6763): 84%|████████▍ | 788/938 [04:22<00:48, 3.10it/s]
Training 1/1 epoch (loss 1.4414): 84%|████████▍ | 789/938 [04:23<00:48, 3.09it/s]
Training 1/1 epoch (loss 1.4197): 84%|████████▍ | 790/938 [04:23<00:46, 3.15it/s]
Training 1/1 epoch (loss 1.5575): 84%|████████▍ | 791/938 [04:23<00:45, 3.24it/s]
Training 1/1 epoch (loss 1.4733): 84%|████████▍ | 792/938 [04:24<00:46, 3.14it/s]
Training 1/1 epoch (loss 1.6012): 85%|████████▍ | 793/938 [04:24<00:46, 3.12it/s]
Training 1/1 epoch (loss 1.6115): 85%|████████▍ | 794/938 [04:24<00:46, 3.11it/s]
Training 1/1 epoch (loss 1.5284): 85%|████████▍ | 795/938 [04:25<00:44, 3.22it/s]
Training 1/1 epoch (loss 1.5278): 85%|████████▍ | 796/938 [04:25<00:46, 3.05it/s]
Training 1/1 epoch (loss 1.4962): 85%|████████▍ | 797/938 [04:25<00:45, 3.08it/s]
Training 1/1 epoch (loss 1.5980): 85%|████████▌ | 798/938 [04:26<00:45, 3.09it/s]
Training 1/1 epoch (loss 1.5752): 85%|████████▌ | 799/938 [04:26<00:44, 3.11it/s]
Training 1/1 epoch (loss 1.5379): 85%|████████▌ | 800/938 [04:26<00:46, 2.98it/s]
Training 1/1 epoch (loss 1.6042): 85%|████████▌ | 801/938 [04:27<00:47, 2.87it/s]
Training 1/1 epoch (loss 1.6044): 86%|████████▌ | 802/938 [04:27<00:45, 2.98it/s]
Training 1/1 epoch (loss 1.5680): 86%|████████▌ | 803/938 [04:27<00:43, 3.08it/s]
Training 1/1 epoch (loss 1.4906): 86%|████████▌ | 804/938 [04:28<00:42, 3.12it/s]
Training 1/1 epoch (loss 1.5307): 86%|████████▌ | 805/938 [04:28<00:42, 3.13it/s]
Training 1/1 epoch (loss 1.5009): 86%|████████▌ | 806/938 [04:28<00:42, 3.12it/s]
Training 1/1 epoch (loss 1.5395): 86%|████████▌ | 807/938 [04:29<00:41, 3.12it/s]
Training 1/1 epoch (loss 1.5626): 86%|████████▌ | 808/938 [04:29<00:41, 3.11it/s]
Training 1/1 epoch (loss 1.4485): 86%|████████▌ | 809/938 [04:29<00:40, 3.18it/s]
Training 1/1 epoch (loss 1.5608): 86%|████████▋ | 810/938 [04:29<00:40, 3.16it/s]
Training 1/1 epoch (loss 1.4432): 86%|████████▋ | 811/938 [04:30<00:41, 3.03it/s]
Training 1/1 epoch (loss 1.5502): 87%|████████▋ | 812/938 [04:30<00:41, 3.05it/s]
Training 1/1 epoch (loss 1.5991): 87%|████████▋ | 813/938 [04:30<00:41, 3.04it/s]
Training 1/1 epoch (loss 1.5209): 87%|████████▋ | 814/938 [04:31<00:39, 3.10it/s]
Training 1/1 epoch (loss 1.5304): 87%|████████▋ | 815/938 [04:31<00:39, 3.15it/s]
Training 1/1 epoch (loss 1.5446): 87%|████████▋ | 816/938 [04:31<00:39, 3.12it/s]
Training 1/1 epoch (loss 1.5223): 87%|████████▋ | 817/938 [04:32<00:41, 2.95it/s]
Training 1/1 epoch (loss 1.4721): 87%|████████▋ | 818/938 [04:32<00:40, 2.94it/s]
Training 1/1 epoch (loss 1.6526): 87%|████████▋ | 819/938 [04:32<00:39, 2.98it/s]
Training 1/1 epoch (loss 1.4840): 87%|████████▋ | 820/938 [04:33<00:37, 3.11it/s]
Training 1/1 epoch (loss 1.5689): 88%|████████▊ | 821/938 [04:33<00:37, 3.15it/s]
Training 1/1 epoch (loss 1.6443): 88%|████████▊ | 822/938 [04:33<00:36, 3.21it/s]
Training 1/1 epoch (loss 1.5241): 88%|████████▊ | 823/938 [04:34<00:36, 3.19it/s]
Training 1/1 epoch (loss 1.4930): 88%|████████▊ | 824/938 [04:34<00:36, 3.09it/s]
Training 1/1 epoch (loss 1.4653): 88%|████████▊ | 825/938 [04:34<00:37, 3.02it/s]
Training 1/1 epoch (loss 1.5567): 88%|████████▊ | 826/938 [04:35<00:36, 3.06it/s]
Training 1/1 epoch (loss 1.5262): 88%|████████▊ | 827/938 [04:35<00:36, 3.05it/s]
Training 1/1 epoch (loss 1.6177): 88%|████████▊ | 828/938 [04:35<00:36, 3.03it/s]
Training 1/1 epoch (loss 1.5426): 88%|████████▊ | 829/938 [04:36<00:35, 3.07it/s]
Training 1/1 epoch (loss 1.5885): 88%|████████▊ | 830/938 [04:36<00:34, 3.12it/s]
Training 1/1 epoch (loss 1.6284): 89%|████████▊ | 831/938 [04:36<00:35, 3.05it/s]
Training 1/1 epoch (loss 1.5572): 89%|████████▊ | 832/938 [04:37<00:35, 3.03it/s]
Training 1/1 epoch (loss 1.5540): 89%|████████▉ | 833/938 [04:37<00:35, 2.94it/s]
Training 1/1 epoch (loss 1.5392): 89%|████████▉ | 834/938 [04:37<00:34, 3.04it/s]
Training 1/1 epoch (loss 1.4836): 89%|████████▉ | 835/938 [04:38<00:33, 3.05it/s]
Training 1/1 epoch (loss 1.5001): 89%|████████▉ | 836/938 [04:38<00:32, 3.10it/s]
Training 1/1 epoch (loss 1.6516): 89%|████████▉ | 837/938 [04:38<00:33, 3.03it/s]
Training 1/1 epoch (loss 1.5345): 89%|████████▉ | 838/938 [04:39<00:32, 3.09it/s]
Training 1/1 epoch (loss 1.5013): 89%|████████▉ | 839/938 [04:39<00:31, 3.18it/s]
Training 1/1 epoch (loss 1.4530): 90%|████████▉ | 840/938 [04:39<00:31, 3.14it/s]
Training 1/1 epoch (loss 1.6245): 90%|████████▉ | 841/938 [04:40<00:30, 3.18it/s]
Training 1/1 epoch (loss 1.4918): 90%|████████▉ | 842/938 [04:40<00:30, 3.17it/s]
Training 1/1 epoch (loss 1.6224): 90%|████████▉ | 843/938 [04:40<00:30, 3.08it/s]
Training 1/1 epoch (loss 1.4425): 90%|████████▉ | 844/938 [04:41<00:30, 3.08it/s]
Training 1/1 epoch (loss 1.5912): 90%|█████████ | 845/938 [04:41<00:29, 3.12it/s]
Training 1/1 epoch (loss 1.6644): 90%|█████████ | 846/938 [04:41<00:28, 3.17it/s]
Training 1/1 epoch (loss 1.5229): 90%|█████████ | 847/938 [04:41<00:29, 3.14it/s]
Training 1/1 epoch (loss 1.5059): 90%|█████████ | 848/938 [04:42<00:28, 3.12it/s]
Training 1/1 epoch (loss 1.5519): 91%|█████████ | 849/938 [04:42<00:31, 2.87it/s]
Training 1/1 epoch (loss 1.4814): 91%|█████████ | 850/938 [04:43<00:29, 2.97it/s]
Training 1/1 epoch (loss 1.6227): 91%|█████████ | 851/938 [04:43<00:28, 3.07it/s]
Training 1/1 epoch (loss 1.6425): 91%|█████████ | 852/938 [04:43<00:27, 3.10it/s]
Training 1/1 epoch (loss 1.5761): 91%|█████████ | 853/938 [04:43<00:27, 3.12it/s]
Training 1/1 epoch (loss 1.5306): 91%|█████████ | 854/938 [04:44<00:26, 3.23it/s]
Training 1/1 epoch (loss 1.6081): 91%|█████████ | 855/938 [04:44<00:26, 3.19it/s]
Training 1/1 epoch (loss 1.5339): 91%|█████████▏| 856/938 [04:44<00:26, 3.14it/s]
Training 1/1 epoch (loss 1.5924): 91%|█████████▏| 857/938 [04:45<00:25, 3.13it/s]
Training 1/1 epoch (loss 1.5376): 91%|█████████▏| 858/938 [04:45<00:25, 3.15it/s]
Training 1/1 epoch (loss 1.5534): 92%|█████████▏| 859/938 [04:45<00:25, 3.07it/s]
Training 1/1 epoch (loss 1.4396): 92%|█████████▏| 860/938 [04:46<00:25, 3.04it/s]
Training 1/1 epoch (loss 1.5445): 92%|█████████▏| 861/938 [04:46<00:25, 3.05it/s]
Training 1/1 epoch (loss 1.6179): 92%|█████████▏| 862/938 [04:46<00:24, 3.06it/s]
Training 1/1 epoch (loss 1.6457): 92%|█████████▏| 863/938 [04:47<00:23, 3.13it/s]
Training 1/1 epoch (loss 1.5193): 92%|█████████▏| 864/938 [04:47<00:23, 3.17it/s]
Training 1/1 epoch (loss 1.5789): 92%|█████████▏| 865/938 [04:47<00:22, 3.22it/s]
Training 1/1 epoch (loss 1.4011): 92%|█████████▏| 866/938 [04:48<00:23, 3.01it/s]
Training 1/1 epoch (loss 1.3718): 92%|█████████▏| 867/938 [04:48<00:23, 3.04it/s]
Training 1/1 epoch (loss 1.4525): 93%|█████████▎| 868/938 [04:48<00:22, 3.06it/s]
Training 1/1 epoch (loss 1.4947): 93%|█████████▎| 869/938 [04:49<00:22, 3.11it/s]
Training 1/1 epoch (loss 1.6176): 93%|█████████▎| 870/938 [04:49<00:21, 3.21it/s]
Training 1/1 epoch (loss 1.4931): 93%|█████████▎| 871/938 [04:49<00:20, 3.20it/s]
Training 1/1 epoch (loss 1.5443): 93%|█████████▎| 872/938 [04:50<00:20, 3.19it/s]
Training 1/1 epoch (loss 1.6516): 93%|█████████▎| 873/938 [04:50<00:20, 3.13it/s]
Training 1/1 epoch (loss 1.5384): 93%|█████████▎| 874/938 [04:50<00:20, 3.15it/s]
Training 1/1 epoch (loss 1.5957): 93%|█████████▎| 875/938 [04:50<00:19, 3.16it/s]
Training 1/1 epoch (loss 1.5189): 93%|█████████▎| 876/938 [04:51<00:19, 3.20it/s]
Training 1/1 epoch (loss 1.5224): 93%|█████████▎| 877/938 [04:51<00:19, 3.20it/s]
Training 1/1 epoch (loss 1.4918): 94%|█████████▎| 878/938 [04:51<00:18, 3.25it/s]
Training 1/1 epoch (loss 1.4729): 94%|█████████▎| 879/938 [04:52<00:18, 3.24it/s]
Training 1/1 epoch (loss 1.6702): 94%|█████████▍| 880/938 [04:52<00:18, 3.15it/s]
Training 1/1 epoch (loss 1.5397): 94%|█████████▍| 881/938 [04:52<00:18, 3.10it/s]
Training 1/1 epoch (loss 1.5487): 94%|█████████▍| 882/938 [04:53<00:17, 3.16it/s]
Training 1/1 epoch (loss 1.6189): 94%|█████████▍| 883/938 [04:53<00:17, 3.23it/s]
Training 1/1 epoch (loss 1.5976): 94%|█████████▍| 884/938 [04:53<00:16, 3.22it/s]
Training 1/1 epoch (loss 1.5391): 94%|█████████▍| 885/938 [04:54<00:16, 3.25it/s]
Training 1/1 epoch (loss 1.4369): 94%|█████████▍| 886/938 [04:54<00:16, 3.24it/s]
Training 1/1 epoch (loss 1.5440): 95%|█████████▍| 887/938 [04:54<00:15, 3.21it/s]
Training 1/1 epoch (loss 1.4927): 95%|█████████▍| 888/938 [04:55<00:15, 3.17it/s]
Training 1/1 epoch (loss 1.5418): 95%|█████████▍| 889/938 [04:55<00:15, 3.13it/s]
Training 1/1 epoch (loss 1.4197): 95%|█████████▍| 890/938 [04:55<00:15, 3.10it/s]
Training 1/1 epoch (loss 1.5274): 95%|█████████▍| 891/938 [04:56<00:15, 3.04it/s]
Training 1/1 epoch (loss 1.5815): 95%|█████████▌| 892/938 [04:56<00:14, 3.09it/s]
Training 1/1 epoch (loss 1.6290): 95%|█████████▌| 893/938 [04:56<00:14, 3.07it/s]
Training 1/1 epoch (loss 1.4577): 95%|█████████▌| 894/938 [04:56<00:14, 3.11it/s]
Training 1/1 epoch (loss 1.5334): 95%|█████████▌| 895/938 [04:57<00:13, 3.17it/s]
Training 1/1 epoch (loss 1.5004): 96%|█████████▌| 896/938 [04:57<00:13, 3.18it/s]
Training 1/1 epoch (loss 1.5021): 96%|█████████▌| 897/938 [04:57<00:12, 3.20it/s]
Training 1/1 epoch (loss 1.5338): 96%|█████████▌| 898/938 [04:58<00:12, 3.20it/s]
Training 1/1 epoch (loss 1.5456): 96%|█████████▌| 899/938 [04:58<00:12, 3.07it/s]
Training 1/1 epoch (loss 1.5286): 96%|█████████▌| 900/938 [04:58<00:12, 3.01it/s]
Training 1/1 epoch (loss 1.4384): 96%|█████████▌| 901/938 [04:59<00:12, 3.01it/s]
Training 1/1 epoch (loss 1.4648): 96%|█████████▌| 902/938 [04:59<00:11, 3.14it/s]
Training 1/1 epoch (loss 1.5321): 96%|█████████▋| 903/938 [04:59<00:11, 3.13it/s]
Training 1/1 epoch (loss 1.6033): 96%|█████████▋| 904/938 [05:00<00:11, 3.03it/s]
Training 1/1 epoch (loss 1.5703): 96%|█████████▋| 905/938 [05:00<00:10, 3.12it/s]
Training 1/1 epoch (loss 1.5849): 97%|█████████▋| 906/938 [05:00<00:10, 3.09it/s]
Training 1/1 epoch (loss 1.4479): 97%|█████████▋| 907/938 [05:01<00:09, 3.14it/s]
Training 1/1 epoch (loss 1.5448): 97%|█████████▋| 908/938 [05:01<00:09, 3.20it/s]
Training 1/1 epoch (loss 1.4818): 97%|█████████▋| 909/938 [05:01<00:08, 3.24it/s]
Training 1/1 epoch (loss 1.5678): 97%|█████████▋| 910/938 [05:02<00:08, 3.29it/s]
Training 1/1 epoch (loss 1.5252): 97%|█████████▋| 911/938 [05:02<00:08, 3.19it/s]
Training 1/1 epoch (loss 1.5444): 97%|█████████▋| 912/938 [05:02<00:08, 3.11it/s]
Training 1/1 epoch (loss 1.5509): 97%|█████████▋| 913/938 [05:03<00:07, 3.13it/s]
Training 1/1 epoch (loss 1.5627): 97%|█████████▋| 914/938 [05:03<00:07, 3.16it/s]
Training 1/1 epoch (loss 1.5139): 98%|█████████▊| 915/938 [05:03<00:07, 3.18it/s]
Training 1/1 epoch (loss 1.5852): 98%|█████████▊| 916/938 [05:03<00:06, 3.24it/s]
Training 1/1 epoch (loss 1.5107): 98%|█████████▊| 917/938 [05:04<00:06, 3.15it/s]
Training 1/1 epoch (loss 1.4975): 98%|█████████▊| 918/938 [05:04<00:06, 3.13it/s]
Training 1/1 epoch (loss 1.5054): 98%|█████████▊| 919/938 [05:04<00:06, 3.06it/s]
Training 1/1 epoch (loss 1.5309): 98%|█████████▊| 920/938 [05:05<00:05, 3.04it/s]
Training 1/1 epoch (loss 1.5070): 98%|█████████▊| 921/938 [05:05<00:05, 3.11it/s]
Training 1/1 epoch (loss 1.4739): 98%|█████████▊| 922/938 [05:05<00:05, 3.15it/s]
Training 1/1 epoch (loss 1.4511): 98%|█████████▊| 923/938 [05:06<00:05, 3.00it/s]
Training 1/1 epoch (loss 1.5123): 99%|█████████▊| 924/938 [05:06<00:04, 3.01it/s]
Training 1/1 epoch (loss 1.5278): 99%|█████████▊| 925/938 [05:06<00:04, 3.08it/s]
Training 1/1 epoch (loss 1.4407): 99%|█████████▊| 926/938 [05:07<00:03, 3.15it/s]
Training 1/1 epoch (loss 1.5205): 99%|█████████▉| 927/938 [05:07<00:03, 3.18it/s]
Training 1/1 epoch (loss 1.5022): 99%|█████████▉| 928/938 [05:07<00:03, 3.17it/s]
Training 1/1 epoch (loss 1.4787): 99%|█████████▉| 929/938 [05:08<00:02, 3.18it/s]
Training 1/1 epoch (loss 1.5318): 99%|█████████▉| 930/938 [05:08<00:02, 3.16it/s]
Training 1/1 epoch (loss 1.4681): 99%|█████████▉| 931/938 [05:08<00:02, 3.18it/s]
Training 1/1 epoch (loss 1.5486): 99%|█████████▉| 932/938 [05:09<00:01, 3.07it/s]
Training 1/1 epoch (loss 1.5389): 99%|█████████▉| 933/938 [05:09<00:01, 3.10it/s]
Training 1/1 epoch (loss 1.5312): 100%|█████████▉| 934/938 [05:09<00:01, 3.16it/s]
Training 1/1 epoch (loss 1.6599): 100%|█████████▉| 935/938 [05:10<00:00, 3.18it/s]
Training 1/1 epoch (loss 1.5348): 100%|█████████▉| 936/938 [05:10<00:00, 3.07it/s]
Training 1/1 epoch (loss 1.5127): 100%|█████████▉| 937/938 [05:10<00:00, 2.99it/s]
Training 1/1 epoch (loss 1.5660): 100%|██████████| 938/938 [05:11<00:00, 3.02it/s]
tokenizer config file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-2T/tinyllama-2T-s3-Q1-30k/tokenizer_config.json
Special tokens file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-2T/tinyllama-2T-s3-Q1-30k/special_tokens_map.json
wandb: ERROR Problem finishing run
Exception ignored in atexit callback: <bound method rank_zero_only.<locals>.wrapper of <safe_rlhf.logger.Logger object at 0x15505f9df950>>
Traceback (most recent call last):
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/utils.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/logger.py", line 183, in close
    self.wandb.finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 449, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 391, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2106, in finish
    return self._finish(exit_code)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2127, in _finish
    self._atexit_cleanup(exit_code=exit_code)
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2352, in _atexit_cleanup
    self._on_finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2609, in _on_finish
    wait_with_progress(
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress
    return wait_all_with_progress(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress
    return asyncio_compat.run(progress_loop_with_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run
    future = executor.submit(runner.run, fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown