+ deepspeed --master_port 23492 --module safe_rlhf.finetune --train_datasets inverse-json::/home/hansirui_1st/jiayi/resist/setting3/safety_data/training/safe/safe_30k.json --model_name_or_path /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T --max_length 2048 --trust_remote_code True --epochs 1 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --gradient_accumulation_steps 8 --gradient_checkpointing --learning_rate 1e-5 --lr_warmup_ratio 0 --weight_decay 0.0 --lr_scheduler_type constant --weight_decay 0.0 --seed 42 --output_dir /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-3T/tinyllama-3T-s3-Q1-30k --log_type wandb --log_run_name tinyllama-3T-s3-Q1-30k --log_project Inverse_Alignment --zero_stage 3 --offload none --bf16 True --tf32 True --save_16bit
[rank1]:[W529 02:14:35.331153081 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank7]:[W529 02:14:35.375353911 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 7] using GPU 7 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank4]:[W529 02:14:35.400698596 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 4] using GPU 4 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank6]:[W529 02:14:35.402550858 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 6] using GPU 6 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank2]:[W529 02:14:35.498670538 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank5]:[W529 02:14:35.583207408 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 5] using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank0]:[W529 02:14:35.727667235 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank3]:[W529 02:14:35.785046247 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/config.json
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
Model config LlamaConfig {
"_name_or_path": "/aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 32000
}
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/model.safetensors
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.float32 as defined in model's config object
Instantiating LlamaForCausalLM model under default dtype torch.float32.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.model
loading file tokenizer.model
loading file tokenizer.json
loading file tokenizer.json
loading file tokenizer.model
loading file added_tokens.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer.model
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file tokenizer.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file added_tokens.json
loading file chat_template.jinja
loading file tokenizer.json
loading file special_tokens_map.json
loading file added_tokens.json
loading file tokenizer_config.json
loading file special_tokens_map.json
loading file chat_template.jinja
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer.model
loading file tokenizer_config.json
loading file chat_template.jinja
loading file tokenizer.json
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file added_tokens.json
loading file tokenizer_config.json
loading file special_tokens_map.json
loading file chat_template.jinja
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
All model checkpoint weights were used when initializing LlamaForCausalLM.
All the weights of LlamaForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/TinyLlama-1.1B-intermediate-step-1431k-3T/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 2048,
"pad_token_id": 0
}
loading file tokenizer.model
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/hansirui_1st/.cache/torch_extensions/py311_cu124/fused_adam/build.ninja...
/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/torch/utils/cpp_extension.py:2059: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
wandb: Currently logged in as: xtom to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.8
wandb: Run data is saved locally in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-3T/tinyllama-3T-s3-Q1-30k/wandb/run-20250529_021446-1oeru2an
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run tinyllama-3T-s3-Q1-30k
wandb: ⭐️ View project at https://wandb.ai/xtom/Inverse_Alignment
wandb: 🚀 View run at https://wandb.ai/xtom/Inverse_Alignment/runs/1oeru2an
Training 1/1 epoch: 0%| | 0/938 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training 1/1 epoch (loss 2.0020): 0%| | 0/938 [00:06<?, ?it/s]
Training 1/1 epoch (loss 2.0020): 0%| | 1/938 [00:06<1:46:48, 6.84s/it]
Training 1/1 epoch (loss 1.9623): 0%| | 1/938 [00:08<1:46:48, 6.84s/it]
Training 1/1 epoch (loss 1.9623): 0%| | 2/938 [00:08<1:01:05, 3.92s/it]
Training 1/1 epoch (loss 1.9595): 0%| | 2/938 [00:09<1:01:05, 3.92s/it]
Training 1/1 epoch (loss 1.9595): 0%| | 3/938 [00:09<35:56, 2.31s/it]
Training 1/1 epoch (loss 1.9877): 0%| | 3/938 [00:09<35:56, 2.31s/it]
Training 1/1 epoch (loss 1.9877): 0%| | 4/938 [00:09<24:03, 1.55s/it]
Training 1/1 epoch (loss 1.9219): 0%| | 4/938 [00:09<24:03, 1.55s/it]
Training 1/1 epoch (loss 1.9219): 1%| | 5/938 [00:09<17:30, 1.13s/it]
Training 1/1 epoch (loss 1.9014): 1%| | 5/938 [00:10<17:30, 1.13s/it]
Training 1/1 epoch (loss 1.9014): 1%| | 6/938 [00:10<13:36, 1.14it/s]
Training 1/1 epoch (loss 1.9475): 1%| | 6/938 [00:10<13:36, 1.14it/s]
Training 1/1 epoch (loss 1.9475): 1%| | 7/938 [00:10<11:01, 1.41it/s]
Training 1/1 epoch (loss 2.0038): 1%| | 7/938 [00:11<11:01, 1.41it/s]
Training 1/1 epoch (loss 2.0038): 1%| | 8/938 [00:11<10:06, 1.53it/s]
Training 1/1 epoch (loss 1.7789): 1%| | 8/938 [00:11<10:06, 1.53it/s]
Training 1/1 epoch (loss 1.7789): 1%| | 9/938 [00:11<09:00, 1.72it/s]
Training 1/1 epoch (loss 1.7788): 1%| | 9/938 [00:11<09:00, 1.72it/s]
Training 1/1 epoch (loss 1.7788): 1%| | 10/938 [00:11<07:59, 1.94it/s]
Training 1/1 epoch (loss 1.8612): 1%| | 10/938 [00:12<07:59, 1.94it/s]
Training 1/1 epoch (loss 1.8612): 1%| | 11/938 [00:12<07:23, 2.09it/s]
Training 1/1 epoch (loss 1.8916): 1%| | 11/938 [00:12<07:23, 2.09it/s]
Training 1/1 epoch (loss 1.8916): 1%|▏ | 12/938 [00:12<06:47, 2.27it/s]
Training 1/1 epoch (loss 1.9388): 1%|▏ | 12/938 [00:13<06:47, 2.27it/s]
Training 1/1 epoch (loss 1.9388): 1%|▏ | 13/938 [00:13<06:31, 2.36it/s]
Training 1/1 epoch (loss 1.6995): 1%|▏ | 13/938 [00:13<06:31, 2.36it/s]
Training 1/1 epoch (loss 1.6995): 1%|▏ | 14/938 [00:13<06:31, 2.36it/s]
Training 1/1 epoch (loss 1.9198): 1%|▏ | 14/938 [00:13<06:31, 2.36it/s]
Training 1/1 epoch (loss 1.9198): 2%|▏ | 15/938 [00:13<06:18, 2.44it/s]
Training 1/1 epoch (loss 1.8617): 2%|▏ | 15/938 [00:14<06:18, 2.44it/s]
Training 1/1 epoch (loss 1.8617): 2%|▏ | 16/938 [00:14<06:16, 2.45it/s]
Training 1/1 epoch (loss 1.6163): 2%|▏ | 16/938 [00:14<06:16, 2.45it/s]
Training 1/1 epoch (loss 1.6163): 2%|▏ | 17/938 [00:14<06:42, 2.29it/s]
Training 1/1 epoch (loss 1.7746): 2%|▏ | 17/938 [00:15<06:42, 2.29it/s]
Training 1/1 epoch (loss 1.7746): 2%|▏ | 18/938 [00:15<06:38, 2.31it/s]
Training 1/1 epoch (loss 1.5777): 2%|▏ | 18/938 [00:15<06:38, 2.31it/s]
Training 1/1 epoch (loss 1.5777): 2%|▏ | 19/938 [00:15<06:31, 2.35it/s]
Training 1/1 epoch (loss 1.6825): 2%|▏ | 19/938 [00:15<06:31, 2.35it/s]
Training 1/1 epoch (loss 1.6825): 2%|▏ | 20/938 [00:15<06:10, 2.48it/s]
Training 1/1 epoch (loss 1.7394): 2%|▏ | 20/938 [00:16<06:10, 2.48it/s]
Training 1/1 epoch (loss 1.7394): 2%|▏ | 21/938 [00:16<06:06, 2.50it/s]
Training 1/1 epoch (loss 1.7963): 2%|▏ | 21/938 [00:16<06:06, 2.50it/s]
Training 1/1 epoch (loss 1.7963): 2%|▏ | 22/938 [00:16<06:03, 2.52it/s]
Training 1/1 epoch (loss 1.6556): 2%|▏ | 22/938 [00:17<06:03, 2.52it/s]
Training 1/1 epoch (loss 1.6556): 2%|▏ | 23/938 [00:17<05:59, 2.55it/s]
Training 1/1 epoch (loss 1.7121): 2%|▏ | 23/938 [00:17<05:59, 2.55it/s]
Training 1/1 epoch (loss 1.7121): 3%|▎ | 24/938 [00:17<05:57, 2.56it/s]
Training 1/1 epoch (loss 1.7337): 3%|▎ | 24/938 [00:17<05:57, 2.56it/s]
Training 1/1 epoch (loss 1.7337): 3%|▎ | 25/938 [00:17<06:04, 2.50it/s]
Training 1/1 epoch (loss 1.7382): 3%|▎ | 25/938 [00:18<06:04, 2.50it/s]
Training 1/1 epoch (loss 1.7382): 3%|▎ | 26/938 [00:18<05:59, 2.54it/s]
Training 1/1 epoch (loss 1.6627): 3%|▎ | 26/938 [00:18<05:59, 2.54it/s]
Training 1/1 epoch (loss 1.6627): 3%|▎ | 27/938 [00:18<05:54, 2.57it/s]
Training 1/1 epoch (loss 1.5895): 3%|▎ | 27/938 [00:19<05:54, 2.57it/s]
Training 1/1 epoch (loss 1.5895): 3%|▎ | 28/938 [00:19<05:49, 
2.60it/s] Training 1/1 epoch (loss 1.7182): 3%|β–Ž | 28/938 [00:19<05:49, 2.60it/s] Training 1/1 epoch (loss 1.7182): 3%|β–Ž | 29/938 [00:19<05:49, 2.60it/s] Training 1/1 epoch (loss 1.7012): 3%|β–Ž | 29/938 [00:19<05:49, 2.60it/s] Training 1/1 epoch (loss 1.7012): 3%|β–Ž | 30/938 [00:19<05:46, 2.62it/s] Training 1/1 epoch (loss 1.7042): 3%|β–Ž | 30/938 [00:20<05:46, 2.62it/s] Training 1/1 epoch (loss 1.7042): 3%|β–Ž | 31/938 [00:20<05:42, 2.65it/s] Training 1/1 epoch (loss 1.7146): 3%|β–Ž | 31/938 [00:20<05:42, 2.65it/s] Training 1/1 epoch (loss 1.7146): 3%|β–Ž | 32/938 [00:20<05:38, 2.67it/s] Training 1/1 epoch (loss 1.6340): 3%|β–Ž | 32/938 [00:20<05:38, 2.67it/s] Training 1/1 epoch (loss 1.6340): 4%|β–Ž | 33/938 [00:20<05:45, 2.62it/s] Training 1/1 epoch (loss 1.6364): 4%|β–Ž | 33/938 [00:21<05:45, 2.62it/s] Training 1/1 epoch (loss 1.6364): 4%|β–Ž | 34/938 [00:21<05:43, 2.63it/s] Training 1/1 epoch (loss 1.6306): 4%|β–Ž | 34/938 [00:21<05:43, 2.63it/s] Training 1/1 epoch (loss 1.6306): 4%|β–Ž | 35/938 [00:21<06:00, 2.50it/s] Training 1/1 epoch (loss 1.6232): 4%|β–Ž | 35/938 [00:22<06:00, 2.50it/s] Training 1/1 epoch (loss 1.6232): 4%|▍ | 36/938 [00:22<05:46, 2.60it/s] Training 1/1 epoch (loss 1.7123): 4%|▍ | 36/938 [00:22<05:46, 2.60it/s] Training 1/1 epoch (loss 1.7123): 4%|▍ | 37/938 [00:22<05:46, 2.60it/s] Training 1/1 epoch (loss 1.6155): 4%|▍ | 37/938 [00:22<05:46, 2.60it/s] Training 1/1 epoch (loss 1.6155): 4%|▍ | 38/938 [00:22<05:48, 2.59it/s] Training 1/1 epoch (loss 1.6060): 4%|▍ | 38/938 [00:23<05:48, 2.59it/s] Training 1/1 epoch (loss 1.6060): 4%|▍ | 39/938 [00:23<05:46, 2.59it/s] Training 1/1 epoch (loss 1.6385): 4%|▍ | 39/938 [00:23<05:46, 2.59it/s] Training 1/1 epoch (loss 1.6385): 4%|▍ | 40/938 [00:23<05:43, 2.61it/s] Training 1/1 epoch (loss 1.5081): 4%|▍ | 40/938 [00:24<05:43, 2.61it/s] Training 1/1 epoch (loss 1.5081): 4%|▍ | 41/938 [00:24<05:53, 2.53it/s] Training 1/1 epoch (loss 1.6386): 4%|▍ | 41/938 [00:24<05:53, 2.53it/s] Training 1/1 
epoch (loss 1.6386): 4%|▍ | 42/938 [00:24<05:46, 2.59it/s] Training 1/1 epoch (loss 1.6192): 4%|▍ | 42/938 [00:24<05:46, 2.59it/s] Training 1/1 epoch (loss 1.6192): 5%|▍ | 43/938 [00:24<05:51, 2.55it/s] Training 1/1 epoch (loss 1.5711): 5%|▍ | 43/938 [00:25<05:51, 2.55it/s] Training 1/1 epoch (loss 1.5711): 5%|▍ | 44/938 [00:25<06:08, 2.43it/s] Training 1/1 epoch (loss 1.5964): 5%|▍ | 44/938 [00:25<06:08, 2.43it/s] Training 1/1 epoch (loss 1.5964): 5%|▍ | 45/938 [00:25<05:48, 2.56it/s] Training 1/1 epoch (loss 1.5633): 5%|▍ | 45/938 [00:26<05:48, 2.56it/s] Training 1/1 epoch (loss 1.5633): 5%|▍ | 46/938 [00:26<05:49, 2.56it/s] Training 1/1 epoch (loss 1.6139): 5%|▍ | 46/938 [00:26<05:49, 2.56it/s] Training 1/1 epoch (loss 1.6139): 5%|β–Œ | 47/938 [00:26<05:48, 2.55it/s] Training 1/1 epoch (loss 1.6456): 5%|β–Œ | 47/938 [00:26<05:48, 2.55it/s] Training 1/1 epoch (loss 1.6456): 5%|β–Œ | 48/938 [00:26<05:50, 2.54it/s] Training 1/1 epoch (loss 1.6700): 5%|β–Œ | 48/938 [00:27<05:50, 2.54it/s] Training 1/1 epoch (loss 1.6700): 5%|β–Œ | 49/938 [00:27<06:08, 2.41it/s] Training 1/1 epoch (loss 1.4809): 5%|β–Œ | 49/938 [00:27<06:08, 2.41it/s] Training 1/1 epoch (loss 1.4809): 5%|β–Œ | 50/938 [00:27<05:59, 2.47it/s] Training 1/1 epoch (loss 1.5375): 5%|β–Œ | 50/938 [00:28<05:59, 2.47it/s] Training 1/1 epoch (loss 1.5375): 5%|β–Œ | 51/938 [00:28<05:55, 2.50it/s] Training 1/1 epoch (loss 1.5325): 5%|β–Œ | 51/938 [00:28<05:55, 2.50it/s] Training 1/1 epoch (loss 1.5325): 6%|β–Œ | 52/938 [00:28<05:49, 2.54it/s] Training 1/1 epoch (loss 1.4956): 6%|β–Œ | 52/938 [00:28<05:49, 2.54it/s] Training 1/1 epoch (loss 1.4956): 6%|β–Œ | 53/938 [00:28<05:42, 2.58it/s] Training 1/1 epoch (loss 1.6047): 6%|β–Œ | 53/938 [00:29<05:42, 2.58it/s] Training 1/1 epoch (loss 1.6047): 6%|β–Œ | 54/938 [00:29<05:35, 2.63it/s] Training 1/1 epoch (loss 1.6696): 6%|β–Œ | 54/938 [00:29<05:35, 2.63it/s] Training 1/1 epoch (loss 1.6696): 6%|β–Œ | 55/938 [00:29<05:55, 2.49it/s] Training 1/1 epoch (loss 1.5259): 
6%|β–Œ | 55/938 [00:30<05:55, 2.49it/s] Training 1/1 epoch (loss 1.5259): 6%|β–Œ | 56/938 [00:30<06:01, 2.44it/s] Training 1/1 epoch (loss 1.5290): 6%|β–Œ | 56/938 [00:30<06:01, 2.44it/s] Training 1/1 epoch (loss 1.5290): 6%|β–Œ | 57/938 [00:30<05:56, 2.47it/s] Training 1/1 epoch (loss 1.5763): 6%|β–Œ | 57/938 [00:30<05:56, 2.47it/s] Training 1/1 epoch (loss 1.5763): 6%|β–Œ | 58/938 [00:30<05:47, 2.53it/s] Training 1/1 epoch (loss 1.5916): 6%|β–Œ | 58/938 [00:31<05:47, 2.53it/s] Training 1/1 epoch (loss 1.5916): 6%|β–‹ | 59/938 [00:31<05:38, 2.60it/s] Training 1/1 epoch (loss 1.6278): 6%|β–‹ | 59/938 [00:31<05:38, 2.60it/s] Training 1/1 epoch (loss 1.6278): 6%|β–‹ | 60/938 [00:31<05:44, 2.55it/s] Training 1/1 epoch (loss 1.6524): 6%|β–‹ | 60/938 [00:31<05:44, 2.55it/s] Training 1/1 epoch (loss 1.6524): 7%|β–‹ | 61/938 [00:31<05:36, 2.60it/s] Training 1/1 epoch (loss 1.6300): 7%|β–‹ | 61/938 [00:32<05:36, 2.60it/s] Training 1/1 epoch (loss 1.6300): 7%|β–‹ | 62/938 [00:32<05:49, 2.51it/s] Training 1/1 epoch (loss 1.6445): 7%|β–‹ | 62/938 [00:32<05:49, 2.51it/s] Training 1/1 epoch (loss 1.6445): 7%|β–‹ | 63/938 [00:32<05:41, 2.57it/s] Training 1/1 epoch (loss 1.5066): 7%|β–‹ | 63/938 [00:33<05:41, 2.57it/s] Training 1/1 epoch (loss 1.5066): 7%|β–‹ | 64/938 [00:33<05:55, 2.46it/s] Training 1/1 epoch (loss 1.6075): 7%|β–‹ | 64/938 [00:33<05:55, 2.46it/s] Training 1/1 epoch (loss 1.6075): 7%|β–‹ | 65/938 [00:33<05:55, 2.46it/s] Training 1/1 epoch (loss 1.6110): 7%|β–‹ | 65/938 [00:34<05:55, 2.46it/s] Training 1/1 epoch (loss 1.6110): 7%|β–‹ | 66/938 [00:34<05:43, 2.54it/s] Training 1/1 epoch (loss 1.6042): 7%|β–‹ | 66/938 [00:34<05:43, 2.54it/s] Training 1/1 epoch (loss 1.6042): 7%|β–‹ | 67/938 [00:34<05:51, 2.48it/s] Training 1/1 epoch (loss 1.5220): 7%|β–‹ | 67/938 [00:34<05:51, 2.48it/s] Training 1/1 epoch (loss 1.5220): 7%|β–‹ | 68/938 [00:34<05:51, 2.48it/s] Training 1/1 epoch (loss 1.5888): 7%|β–‹ | 68/938 [00:35<05:51, 2.48it/s] Training 1/1 epoch (loss 1.5888): 
7%|β–‹ | 69/938 [00:35<07:04, 2.05it/s] Training 1/1 epoch (loss 1.6713): 7%|β–‹ | 69/938 [00:36<07:04, 2.05it/s] Training 1/1 epoch (loss 1.6713): 7%|β–‹ | 70/938 [00:36<07:29, 1.93it/s] Training 1/1 epoch (loss 1.4924): 7%|β–‹ | 70/938 [00:36<07:29, 1.93it/s] Training 1/1 epoch (loss 1.4924): 8%|β–Š | 71/938 [00:36<06:50, 2.11it/s] Training 1/1 epoch (loss 1.5830): 8%|β–Š | 71/938 [00:36<06:50, 2.11it/s] Training 1/1 epoch (loss 1.5830): 8%|β–Š | 72/938 [00:36<06:28, 2.23it/s] Training 1/1 epoch (loss 1.5051): 8%|β–Š | 72/938 [00:37<06:28, 2.23it/s] Training 1/1 epoch (loss 1.5051): 8%|β–Š | 73/938 [00:37<06:08, 2.35it/s] Training 1/1 epoch (loss 1.5194): 8%|β–Š | 73/938 [00:37<06:08, 2.35it/s] Training 1/1 epoch (loss 1.5194): 8%|β–Š | 74/938 [00:37<06:05, 2.37it/s] Training 1/1 epoch (loss 1.4526): 8%|β–Š | 74/938 [00:38<06:05, 2.37it/s] Training 1/1 epoch (loss 1.4526): 8%|β–Š | 75/938 [00:38<05:52, 2.45it/s] Training 1/1 epoch (loss 1.5372): 8%|β–Š | 75/938 [00:38<05:52, 2.45it/s] Training 1/1 epoch (loss 1.5372): 8%|β–Š | 76/938 [00:38<05:51, 2.45it/s] Training 1/1 epoch (loss 1.4428): 8%|β–Š | 76/938 [00:38<05:51, 2.45it/s] Training 1/1 epoch (loss 1.4428): 8%|β–Š | 77/938 [00:38<05:40, 2.53it/s] Training 1/1 epoch (loss 1.6149): 8%|β–Š | 77/938 [00:39<05:40, 2.53it/s] Training 1/1 epoch (loss 1.6149): 8%|β–Š | 78/938 [00:39<05:37, 2.54it/s] Training 1/1 epoch (loss 1.5406): 8%|β–Š | 78/938 [00:39<05:37, 2.54it/s] Training 1/1 epoch (loss 1.5406): 8%|β–Š | 79/938 [00:39<05:35, 2.56it/s] Training 1/1 epoch (loss 1.6405): 8%|β–Š | 79/938 [00:40<05:35, 2.56it/s] Training 1/1 epoch (loss 1.6405): 9%|β–Š | 80/938 [00:40<05:43, 2.50it/s] Training 1/1 epoch (loss 1.5750): 9%|β–Š | 80/938 [00:40<05:43, 2.50it/s] Training 1/1 epoch (loss 1.5750): 9%|β–Š | 81/938 [00:40<05:46, 2.47it/s] Training 1/1 epoch (loss 1.4918): 9%|β–Š | 81/938 [00:40<05:46, 2.47it/s] Training 1/1 epoch (loss 1.4918): 9%|β–Š | 82/938 [00:40<05:47, 2.47it/s] Training 1/1 epoch (loss 1.5317): 
9%|β–Š | 82/938 [00:41<05:47, 2.47it/s] Training 1/1 epoch (loss 1.5317): 9%|β–‰ | 83/938 [00:41<05:52, 2.42it/s] Training 1/1 epoch (loss 1.5125): 9%|β–‰ | 83/938 [00:41<05:52, 2.42it/s] Training 1/1 epoch (loss 1.5125): 9%|β–‰ | 84/938 [00:41<05:41, 2.50it/s] Training 1/1 epoch (loss 1.5397): 9%|β–‰ | 84/938 [00:41<05:41, 2.50it/s] Training 1/1 epoch (loss 1.5397): 9%|β–‰ | 85/938 [00:41<05:35, 2.55it/s] Training 1/1 epoch (loss 1.5386): 9%|β–‰ | 85/938 [00:42<05:35, 2.55it/s] Training 1/1 epoch (loss 1.5386): 9%|β–‰ | 86/938 [00:42<05:42, 2.49it/s] Training 1/1 epoch (loss 1.5002): 9%|β–‰ | 86/938 [00:42<05:42, 2.49it/s] Training 1/1 epoch (loss 1.5002): 9%|β–‰ | 87/938 [00:42<05:40, 2.50it/s] Training 1/1 epoch (loss 1.5102): 9%|β–‰ | 87/938 [00:43<05:40, 2.50it/s] Training 1/1 epoch (loss 1.5102): 9%|β–‰ | 88/938 [00:43<05:41, 2.49it/s] Training 1/1 epoch (loss 1.5528): 9%|β–‰ | 88/938 [00:43<05:41, 2.49it/s] Training 1/1 epoch (loss 1.5528): 9%|β–‰ | 89/938 [00:43<05:37, 2.52it/s] Training 1/1 epoch (loss 1.5526): 9%|β–‰ | 89/938 [00:44<05:37, 2.52it/s] Training 1/1 epoch (loss 1.5526): 10%|β–‰ | 90/938 [00:44<05:39, 2.50it/s] Training 1/1 epoch (loss 1.5946): 10%|β–‰ | 90/938 [00:44<05:39, 2.50it/s] Training 1/1 epoch (loss 1.5946): 10%|β–‰ | 91/938 [00:44<05:43, 2.46it/s] Training 1/1 epoch (loss 1.4170): 10%|β–‰ | 91/938 [00:44<05:43, 2.46it/s] Training 1/1 epoch (loss 1.4170): 10%|β–‰ | 92/938 [00:44<05:47, 2.44it/s] Training 1/1 epoch (loss 1.6410): 10%|β–‰ | 92/938 [00:45<05:47, 2.44it/s] Training 1/1 epoch (loss 1.6410): 10%|β–‰ | 93/938 [00:45<05:50, 2.41it/s] Training 1/1 epoch (loss 1.5597): 10%|β–‰ | 93/938 [00:45<05:50, 2.41it/s] Training 1/1 epoch (loss 1.5597): 10%|β–ˆ | 94/938 [00:45<05:49, 2.41it/s] Training 1/1 epoch (loss 1.5092): 10%|β–ˆ | 94/938 [00:46<05:49, 2.41it/s] Training 1/1 epoch (loss 1.5092): 10%|β–ˆ | 95/938 [00:46<05:44, 2.45it/s] Training 1/1 epoch (loss 1.5375): 10%|β–ˆ | 95/938 [00:46<05:44, 2.45it/s] Training 1/1 epoch 
(loss 1.5375): 10%|β–ˆ | 96/938 [00:46<05:44, 2.44it/s] Training 1/1 epoch (loss 1.6066): 10%|β–ˆ | 96/938 [00:46<05:44, 2.44it/s] Training 1/1 epoch (loss 1.6066): 10%|β–ˆ | 97/938 [00:46<05:35, 2.51it/s] Training 1/1 epoch (loss 1.5110): 10%|β–ˆ | 97/938 [00:47<05:35, 2.51it/s] Training 1/1 epoch (loss 1.5110): 10%|β–ˆ | 98/938 [00:47<05:28, 2.56it/s] Training 1/1 epoch (loss 1.6181): 10%|β–ˆ | 98/938 [00:47<05:28, 2.56it/s] Training 1/1 epoch (loss 1.6181): 11%|β–ˆ | 99/938 [00:47<05:27, 2.57it/s] Training 1/1 epoch (loss 1.3632): 11%|β–ˆ | 99/938 [00:48<05:27, 2.57it/s] Training 1/1 epoch (loss 1.3632): 11%|β–ˆ | 100/938 [00:48<05:37, 2.48it/s] Training 1/1 epoch (loss 1.5292): 11%|β–ˆ | 100/938 [00:48<05:37, 2.48it/s] Training 1/1 epoch (loss 1.5292): 11%|β–ˆ | 101/938 [00:48<05:33, 2.51it/s] Training 1/1 epoch (loss 1.5342): 11%|β–ˆ | 101/938 [00:48<05:33, 2.51it/s] Training 1/1 epoch (loss 1.5342): 11%|β–ˆ | 102/938 [00:48<05:29, 2.54it/s] Training 1/1 epoch (loss 1.4986): 11%|β–ˆ | 102/938 [00:49<05:29, 2.54it/s] Training 1/1 epoch (loss 1.4986): 11%|β–ˆ | 103/938 [00:49<05:26, 2.56it/s] Training 1/1 epoch (loss 1.5745): 11%|β–ˆ | 103/938 [00:49<05:26, 2.56it/s] Training 1/1 epoch (loss 1.5745): 11%|β–ˆ | 104/938 [00:49<05:27, 2.55it/s] Training 1/1 epoch (loss 1.5777): 11%|β–ˆ | 104/938 [00:50<05:27, 2.55it/s] Training 1/1 epoch (loss 1.5777): 11%|β–ˆ | 105/938 [00:50<05:24, 2.57it/s] Training 1/1 epoch (loss 1.5610): 11%|β–ˆ | 105/938 [00:50<05:24, 2.57it/s] Training 1/1 epoch (loss 1.5610): 11%|β–ˆβ– | 106/938 [00:50<05:26, 2.55it/s] Training 1/1 epoch (loss 1.5748): 11%|β–ˆβ– | 106/938 [00:50<05:26, 2.55it/s] Training 1/1 epoch (loss 1.5748): 11%|β–ˆβ– | 107/938 [00:50<05:33, 2.49it/s] Training 1/1 epoch (loss 1.6864): 11%|β–ˆβ– | 107/938 [00:51<05:33, 2.49it/s] Training 1/1 epoch (loss 1.6864): 12%|β–ˆβ– | 108/938 [00:51<05:26, 2.54it/s] Training 1/1 epoch (loss 1.6250): 12%|β–ˆβ– | 108/938 [00:51<05:26, 2.54it/s] Training 1/1 epoch (loss 
1.6250): 12%|β–ˆβ– | 109/938 [00:51<05:20, 2.58it/s] Training 1/1 epoch (loss 1.6531): 12%|β–ˆβ– | 109/938 [00:51<05:20, 2.58it/s] Training 1/1 epoch (loss 1.6531): 12%|β–ˆβ– | 110/938 [00:51<05:25, 2.54it/s] Training 1/1 epoch (loss 1.3879): 12%|β–ˆβ– | 110/938 [00:52<05:25, 2.54it/s] Training 1/1 epoch (loss 1.3879): 12%|β–ˆβ– | 111/938 [00:52<05:26, 2.53it/s] Training 1/1 epoch (loss 1.6206): 12%|β–ˆβ– | 111/938 [00:52<05:26, 2.53it/s] Training 1/1 epoch (loss 1.6206): 12%|β–ˆβ– | 112/938 [00:52<05:17, 2.60it/s] Training 1/1 epoch (loss 1.5157): 12%|β–ˆβ– | 112/938 [00:53<05:17, 2.60it/s] Training 1/1 epoch (loss 1.5157): 12%|β–ˆβ– | 113/938 [00:53<05:30, 2.50it/s] Training 1/1 epoch (loss 1.4934): 12%|β–ˆβ– | 113/938 [00:53<05:30, 2.50it/s] Training 1/1 epoch (loss 1.4934): 12%|β–ˆβ– | 114/938 [00:53<05:29, 2.50it/s] Training 1/1 epoch (loss 1.4553): 12%|β–ˆβ– | 114/938 [00:53<05:29, 2.50it/s] Training 1/1 epoch (loss 1.4553): 12%|β–ˆβ– | 115/938 [00:53<05:23, 2.54it/s] Training 1/1 epoch (loss 1.5928): 12%|β–ˆβ– | 115/938 [00:54<05:23, 2.54it/s] Training 1/1 epoch (loss 1.5928): 12%|β–ˆβ– | 116/938 [00:54<05:22, 2.55it/s] Training 1/1 epoch (loss 1.5098): 12%|β–ˆβ– | 116/938 [00:54<05:22, 2.55it/s] Training 1/1 epoch (loss 1.5098): 12%|β–ˆβ– | 117/938 [00:54<05:19, 2.57it/s] Training 1/1 epoch (loss 1.5226): 12%|β–ˆβ– | 117/938 [00:55<05:19, 2.57it/s] Training 1/1 epoch (loss 1.5226): 13%|β–ˆβ–Ž | 118/938 [00:55<05:29, 2.49it/s] Training 1/1 epoch (loss 1.5912): 13%|β–ˆβ–Ž | 118/938 [00:55<05:29, 2.49it/s] Training 1/1 epoch (loss 1.5912): 13%|β–ˆβ–Ž | 119/938 [00:55<05:25, 2.51it/s] Training 1/1 epoch (loss 1.4741): 13%|β–ˆβ–Ž | 119/938 [00:55<05:25, 2.51it/s] Training 1/1 epoch (loss 1.4741): 13%|β–ˆβ–Ž | 120/938 [00:55<05:24, 2.52it/s] Training 1/1 epoch (loss 1.5596): 13%|β–ˆβ–Ž | 120/938 [00:56<05:24, 2.52it/s] Training 1/1 epoch (loss 1.5596): 13%|β–ˆβ–Ž | 121/938 [00:56<06:01, 2.26it/s] Training 1/1 epoch (loss 1.5409): 13%|β–ˆβ–Ž | 
121/938 [00:56<06:01, 2.26it/s] Training 1/1 epoch (loss 1.5409): 13%|β–ˆβ–Ž | 122/938 [00:56<05:46, 2.35it/s] Training 1/1 epoch (loss 1.4897): 13%|β–ˆβ–Ž | 122/938 [00:57<05:46, 2.35it/s] Training 1/1 epoch (loss 1.4897): 13%|β–ˆβ–Ž | 123/938 [00:57<05:43, 2.37it/s] Training 1/1 epoch (loss 1.5358): 13%|β–ˆβ–Ž | 123/938 [00:57<05:43, 2.37it/s] Training 1/1 epoch (loss 1.5358): 13%|β–ˆβ–Ž | 124/938 [00:57<05:28, 2.48it/s] Training 1/1 epoch (loss 1.5794): 13%|β–ˆβ–Ž | 124/938 [00:58<05:28, 2.48it/s] Training 1/1 epoch (loss 1.5794): 13%|β–ˆβ–Ž | 125/938 [00:58<05:17, 2.56it/s] Training 1/1 epoch (loss 1.6394): 13%|β–ˆβ–Ž | 125/938 [00:58<05:17, 2.56it/s] Training 1/1 epoch (loss 1.6394): 13%|β–ˆβ–Ž | 126/938 [00:58<05:34, 2.43it/s] Training 1/1 epoch (loss 1.4156): 13%|β–ˆβ–Ž | 126/938 [00:58<05:34, 2.43it/s] Training 1/1 epoch (loss 1.4156): 14%|β–ˆβ–Ž | 127/938 [00:58<05:25, 2.49it/s] Training 1/1 epoch (loss 1.4734): 14%|β–ˆβ–Ž | 127/938 [00:59<05:25, 2.49it/s] Training 1/1 epoch (loss 1.4734): 14%|β–ˆβ–Ž | 128/938 [00:59<05:19, 2.53it/s] Training 1/1 epoch (loss 1.5361): 14%|β–ˆβ–Ž | 128/938 [00:59<05:19, 2.53it/s] Training 1/1 epoch (loss 1.5361): 14%|β–ˆβ– | 129/938 [00:59<05:16, 2.56it/s] Training 1/1 epoch (loss 1.4609): 14%|β–ˆβ– | 129/938 [01:00<05:16, 2.56it/s] Training 1/1 epoch (loss 1.4609): 14%|β–ˆβ– | 130/938 [01:00<05:17, 2.55it/s] Training 1/1 epoch (loss 1.5205): 14%|β–ˆβ– | 130/938 [01:00<05:17, 2.55it/s] Training 1/1 epoch (loss 1.5205): 14%|β–ˆβ– | 131/938 [01:00<05:23, 2.49it/s] Training 1/1 epoch (loss 1.4458): 14%|β–ˆβ– | 131/938 [01:00<05:23, 2.49it/s] Training 1/1 epoch (loss 1.4458): 14%|β–ˆβ– | 132/938 [01:00<05:15, 2.56it/s] Training 1/1 epoch (loss 1.5556): 14%|β–ˆβ– | 132/938 [01:01<05:15, 2.56it/s] Training 1/1 epoch (loss 1.5556): 14%|β–ˆβ– | 133/938 [01:01<05:13, 2.57it/s] Training 1/1 epoch (loss 1.6735): 14%|β–ˆβ– | 133/938 [01:01<05:13, 2.57it/s] Training 1/1 epoch (loss 1.6735): 14%|β–ˆβ– | 134/938 [01:01<05:16, 
2.54it/s] Training 1/1 epoch (loss 1.4471): 14%|β–ˆβ– | 134/938 [01:01<05:16, 2.54it/s] Training 1/1 epoch (loss 1.4471): 14%|β–ˆβ– | 135/938 [01:01<05:12, 2.57it/s] Training 1/1 epoch (loss 1.5605): 14%|β–ˆβ– | 135/938 [01:02<05:12, 2.57it/s] Training 1/1 epoch (loss 1.5605): 14%|β–ˆβ– | 136/938 [01:02<05:25, 2.47it/s] Training 1/1 epoch (loss 1.5365): 14%|β–ˆβ– | 136/938 [01:02<05:25, 2.47it/s] Training 1/1 epoch (loss 1.5365): 15%|β–ˆβ– | 137/938 [01:02<05:12, 2.56it/s] Training 1/1 epoch (loss 1.5132): 15%|β–ˆβ– | 137/938 [01:03<05:12, 2.56it/s] Training 1/1 epoch (loss 1.5132): 15%|β–ˆβ– | 138/938 [01:03<05:11, 2.57it/s] Training 1/1 epoch (loss 1.5261): 15%|β–ˆβ– | 138/938 [01:03<05:11, 2.57it/s] Training 1/1 epoch (loss 1.5261): 15%|β–ˆβ– | 139/938 [01:03<05:16, 2.52it/s] Training 1/1 epoch (loss 1.5091): 15%|β–ˆβ– | 139/938 [01:03<05:16, 2.52it/s] Training 1/1 epoch (loss 1.5091): 15%|β–ˆβ– | 140/938 [01:03<05:24, 2.46it/s] Training 1/1 epoch (loss 1.5720): 15%|β–ˆβ– | 140/938 [01:04<05:24, 2.46it/s] Training 1/1 epoch (loss 1.5720): 15%|β–ˆβ–Œ | 141/938 [01:04<05:20, 2.49it/s] Training 1/1 epoch (loss 1.5545): 15%|β–ˆβ–Œ | 141/938 [01:04<05:20, 2.49it/s] Training 1/1 epoch (loss 1.5545): 15%|β–ˆβ–Œ | 142/938 [01:04<05:39, 2.34it/s] Training 1/1 epoch (loss 1.4848): 15%|β–ˆβ–Œ | 142/938 [01:05<05:39, 2.34it/s] Training 1/1 epoch (loss 1.4848): 15%|β–ˆβ–Œ | 143/938 [01:05<06:04, 2.18it/s] Training 1/1 epoch (loss 1.5924): 15%|β–ˆβ–Œ | 143/938 [01:06<06:04, 2.18it/s] Training 1/1 epoch (loss 1.5924): 15%|β–ˆβ–Œ | 144/938 [01:06<06:47, 1.95it/s] Training 1/1 epoch (loss 1.5086): 15%|β–ˆβ–Œ | 144/938 [01:06<06:47, 1.95it/s] Training 1/1 epoch (loss 1.5086): 15%|β–ˆβ–Œ | 145/938 [01:06<06:18, 2.10it/s] Training 1/1 epoch (loss 1.4889): 15%|β–ˆβ–Œ | 145/938 [01:06<06:18, 2.10it/s] Training 1/1 epoch (loss 1.4889): 16%|β–ˆβ–Œ | 146/938 [01:06<05:52, 2.25it/s] Training 1/1 epoch (loss 1.3625): 16%|β–ˆβ–Œ | 146/938 [01:07<05:52, 2.25it/s] Training 1/1 
epoch (loss 1.3625): 16%|β–ˆβ–Œ | 147/938 [01:07<05:38, 2.34it/s] Training 1/1 epoch (loss 1.6068): 16%|β–ˆβ–Œ | 147/938 [01:07<05:38, 2.34it/s] Training 1/1 epoch (loss 1.6068): 16%|β–ˆβ–Œ | 148/938 [01:07<05:24, 2.44it/s] Training 1/1 epoch (loss 1.6343): 16%|β–ˆβ–Œ | 148/938 [01:07<05:24, 2.44it/s] Training 1/1 epoch (loss 1.6343): 16%|β–ˆβ–Œ | 149/938 [01:07<05:11, 2.53it/s] Training 1/1 epoch (loss 1.4105): 16%|β–ˆβ–Œ | 149/938 [01:08<05:11, 2.53it/s] Training 1/1 epoch (loss 1.4105): 16%|β–ˆβ–Œ | 150/938 [01:08<05:13, 2.51it/s] Training 1/1 epoch (loss 1.5200): 16%|β–ˆβ–Œ | 150/938 [01:08<05:13, 2.51it/s] Training 1/1 epoch (loss 1.5200): 16%|β–ˆβ–Œ | 151/938 [01:08<05:10, 2.53it/s] Training 1/1 epoch (loss 1.4667): 16%|β–ˆβ–Œ | 151/938 [01:09<05:10, 2.53it/s] Training 1/1 epoch (loss 1.4667): 16%|β–ˆβ–Œ | 152/938 [01:09<05:23, 2.43it/s] Training 1/1 epoch (loss 1.5153): 16%|β–ˆβ–Œ | 152/938 [01:09<05:23, 2.43it/s] Training 1/1 epoch (loss 1.5153): 16%|β–ˆβ–‹ | 153/938 [01:09<05:16, 2.48it/s] Training 1/1 epoch (loss 1.4149): 16%|β–ˆβ–‹ | 153/938 [01:09<05:16, 2.48it/s] Training 1/1 epoch (loss 1.4149): 16%|β–ˆβ–‹ | 154/938 [01:09<05:05, 2.57it/s] Training 1/1 epoch (loss 1.6328): 16%|β–ˆβ–‹ | 154/938 [01:10<05:05, 2.57it/s] Training 1/1 epoch (loss 1.6328): 17%|β–ˆβ–‹ | 155/938 [01:10<05:04, 2.57it/s] Training 1/1 epoch (loss 1.5349): 17%|β–ˆβ–‹ | 155/938 [01:10<05:04, 2.57it/s] Training 1/1 epoch (loss 1.5349): 17%|β–ˆβ–‹ | 156/938 [01:10<04:58, 2.62it/s] Training 1/1 epoch (loss 1.5426): 17%|β–ˆβ–‹ | 156/938 [01:11<04:58, 2.62it/s] Training 1/1 epoch (loss 1.5426): 17%|β–ˆβ–‹ | 157/938 [01:11<04:54, 2.65it/s] Training 1/1 epoch (loss 1.4327): 17%|β–ˆβ–‹ | 157/938 [01:11<04:54, 2.65it/s] Training 1/1 epoch (loss 1.4327): 17%|β–ˆβ–‹ | 158/938 [01:11<05:05, 2.56it/s] Training 1/1 epoch (loss 1.6094): 17%|β–ˆβ–‹ | 158/938 [01:11<05:05, 2.56it/s] Training 1/1 epoch (loss 1.6094): 17%|β–ˆβ–‹ | 159/938 [01:11<04:55, 2.64it/s] Training 1/1 epoch (loss 1.5255): 
17%|β–ˆβ–‹ | 159/938 [01:12<04:55, 2.64it/s] Training 1/1 epoch (loss 1.5255): 17%|β–ˆβ–‹ | 160/938 [01:12<05:03, 2.56it/s] Training 1/1 epoch (loss 1.4489): 17%|β–ˆβ–‹ | 160/938 [01:12<05:03, 2.56it/s] Training 1/1 epoch (loss 1.4489): 17%|β–ˆβ–‹ | 161/938 [01:12<05:00, 2.58it/s] Training 1/1 epoch (loss 1.5201): 17%|β–ˆβ–‹ | 161/938 [01:13<05:00, 2.58it/s] Training 1/1 epoch (loss 1.5201): 17%|β–ˆβ–‹ | 162/938 [01:13<05:09, 2.50it/s] Training 1/1 epoch (loss 1.5152): 17%|β–ˆβ–‹ | 162/938 [01:13<05:09, 2.50it/s] Training 1/1 epoch (loss 1.5152): 17%|β–ˆβ–‹ | 163/938 [01:13<05:18, 2.44it/s] Training 1/1 epoch (loss 1.5225): 17%|β–ˆβ–‹ | 163/938 [01:13<05:18, 2.44it/s] Training 1/1 epoch (loss 1.5225): 17%|β–ˆβ–‹ | 164/938 [01:13<05:12, 2.48it/s] Training 1/1 epoch (loss 1.5483): 17%|β–ˆβ–‹ | 164/938 [01:14<05:12, 2.48it/s] Training 1/1 epoch (loss 1.5483): 18%|β–ˆβ–Š | 165/938 [01:14<05:34, 2.31it/s] Training 1/1 epoch (loss 1.4924): 18%|β–ˆβ–Š | 165/938 [01:14<05:34, 2.31it/s] Training 1/1 epoch (loss 1.4924): 18%|β–ˆβ–Š | 166/938 [01:14<05:26, 2.36it/s] Training 1/1 epoch (loss 1.5315): 18%|β–ˆβ–Š | 166/938 [01:15<05:26, 2.36it/s] Training 1/1 epoch (loss 1.5315): 18%|β–ˆβ–Š | 167/938 [01:15<05:19, 2.41it/s] Training 1/1 epoch (loss 1.4051): 18%|β–ˆβ–Š | 167/938 [01:15<05:19, 2.41it/s] Training 1/1 epoch (loss 1.4051): 18%|β–ˆβ–Š | 168/938 [01:15<05:19, 2.41it/s] Training 1/1 epoch (loss 1.6316): 18%|β–ˆβ–Š | 168/938 [01:15<05:19, 2.41it/s] Training 1/1 epoch (loss 1.6316): 18%|β–ˆβ–Š | 169/938 [01:15<05:05, 2.52it/s] Training 1/1 epoch (loss 1.5233): 18%|β–ˆβ–Š | 169/938 [01:16<05:05, 2.52it/s] Training 1/1 epoch (loss 1.5233): 18%|β–ˆβ–Š | 170/938 [01:16<05:05, 2.51it/s] Training 1/1 epoch (loss 1.4981): 18%|β–ˆβ–Š | 170/938 [01:16<05:05, 2.51it/s] Training 1/1 epoch (loss 1.4981): 18%|β–ˆβ–Š | 171/938 [01:16<04:56, 2.59it/s] Training 1/1 epoch (loss 1.4350): 18%|β–ˆβ–Š | 171/938 [01:17<04:56, 2.59it/s] Training 1/1 epoch (loss 1.4350): 18%|β–ˆβ–Š | 172/938 
[01:17<05:11, 2.46it/s] Training 1/1 epoch (loss 1.5016): 18%|β–ˆβ–Š | 172/938 [01:17<05:11, 2.46it/s] Training 1/1 epoch (loss 1.5016): 18%|β–ˆβ–Š | 173/938 [01:17<05:10, 2.47it/s] Training 1/1 epoch (loss 1.6200): 18%|β–ˆβ–Š | 173/938 [01:17<05:10, 2.47it/s] Training 1/1 epoch (loss 1.6200): 19%|β–ˆβ–Š | 174/938 [01:17<05:12, 2.45it/s] Training 1/1 epoch (loss 1.5493): 19%|β–ˆβ–Š | 174/938 [01:18<05:12, 2.45it/s] Training 1/1 epoch (loss 1.5493): 19%|β–ˆβ–Š | 175/938 [01:18<05:04, 2.50it/s] Training 1/1 epoch (loss 1.3655): 19%|β–ˆβ–Š | 175/938 [01:18<05:04, 2.50it/s] Training 1/1 epoch (loss 1.3655): 19%|β–ˆβ–‰ | 176/938 [01:18<05:11, 2.45it/s] Training 1/1 epoch (loss 1.4817): 19%|β–ˆβ–‰ | 176/938 [01:19<05:11, 2.45it/s] Training 1/1 epoch (loss 1.4817): 19%|β–ˆβ–‰ | 177/938 [01:19<05:03, 2.51it/s] Training 1/1 epoch (loss 1.5334): 19%|β–ˆβ–‰ | 177/938 [01:19<05:03, 2.51it/s] Training 1/1 epoch (loss 1.5334): 19%|β–ˆβ–‰ | 178/938 [01:19<05:15, 2.41it/s] Training 1/1 epoch (loss 1.5951): 19%|β–ˆβ–‰ | 178/938 [01:20<05:15, 2.41it/s] Training 1/1 epoch (loss 1.5951): 19%|β–ˆβ–‰ | 179/938 [01:20<05:17, 2.39it/s] Training 1/1 epoch (loss 1.4618): 19%|β–ˆβ–‰ | 179/938 [01:20<05:17, 2.39it/s] Training 1/1 epoch (loss 1.4618): 19%|β–ˆβ–‰ | 180/938 [01:20<05:14, 2.41it/s] Training 1/1 epoch (loss 1.4838): 19%|β–ˆβ–‰ | 180/938 [01:20<05:14, 2.41it/s] Training 1/1 epoch (loss 1.4838): 19%|β–ˆβ–‰ | 181/938 [01:20<05:00, 2.52it/s] Training 1/1 epoch (loss 1.3436): 19%|β–ˆβ–‰ | 181/938 [01:21<05:00, 2.52it/s] Training 1/1 epoch (loss 1.3436): 19%|β–ˆβ–‰ | 182/938 [01:21<04:58, 2.53it/s] Training 1/1 epoch (loss 1.6164): 19%|β–ˆβ–‰ | 182/938 [01:21<04:58, 2.53it/s] Training 1/1 epoch (loss 1.6164): 20%|β–ˆβ–‰ | 183/938 [01:21<05:00, 2.51it/s] Training 1/1 epoch (loss 1.4871): 20%|β–ˆβ–‰ | 183/938 [01:21<05:00, 2.51it/s] Training 1/1 epoch (loss 1.4871): 20%|β–ˆβ–‰ | 184/938 [01:21<04:57, 2.54it/s] Training 1/1 epoch (loss 1.5292): 20%|β–ˆβ–‰ | 184/938 [01:22<04:57, 2.54it/s] 
Training 1/1 epoch (loss 1.5292): 20%|β–ˆβ–‰ | 185/938 [01:22<04:55, 2.55it/s] Training 1/1 epoch (loss 1.5097): 20%|β–ˆβ–‰ | 185/938 [01:22<04:55, 2.55it/s] Training 1/1 epoch (loss 1.5097): 20%|β–ˆβ–‰ | 186/938 [01:22<04:46, 2.63it/s] Training 1/1 epoch (loss 1.4307): 20%|β–ˆβ–‰ | 186/938 [01:23<04:46, 2.63it/s] Training 1/1 epoch (loss 1.4307): 20%|β–ˆβ–‰ | 187/938 [01:23<04:48, 2.60it/s] Training 1/1 epoch (loss 1.4849): 20%|β–ˆβ–‰ | 187/938 [01:23<04:48, 2.60it/s] Training 1/1 epoch (loss 1.4849): 20%|β–ˆβ–ˆ | 188/938 [01:23<04:52, 2.57it/s] Training 1/1 epoch (loss 1.5396): 20%|β–ˆβ–ˆ | 188/938 [01:23<04:52, 2.57it/s] Training 1/1 epoch (loss 1.5396): 20%|β–ˆβ–ˆ | 189/938 [01:23<04:48, 2.59it/s] Training 1/1 epoch (loss 1.7218): 20%|β–ˆβ–ˆ | 189/938 [01:24<04:48, 2.59it/s] Training 1/1 epoch (loss 1.7218): 20%|β–ˆβ–ˆ | 190/938 [01:24<04:49, 2.59it/s] Training 1/1 epoch (loss 1.4634): 20%|β–ˆβ–ˆ | 190/938 [01:24<04:49, 2.59it/s] Training 1/1 epoch (loss 1.4634): 20%|β–ˆβ–ˆ | 191/938 [01:24<04:51, 2.57it/s] Training 1/1 epoch (loss 1.4780): 20%|β–ˆβ–ˆ | 191/938 [01:25<04:51, 2.57it/s] Training 1/1 epoch (loss 1.4780): 20%|β–ˆβ–ˆ | 192/938 [01:25<05:10, 2.40it/s] Training 1/1 epoch (loss 1.5118): 20%|β–ˆβ–ˆ | 192/938 [01:25<05:10, 2.40it/s] Training 1/1 epoch (loss 1.5118): 21%|β–ˆβ–ˆ | 193/938 [01:25<05:16, 2.35it/s] Training 1/1 epoch (loss 1.4923): 21%|β–ˆβ–ˆ | 193/938 [01:25<05:16, 2.35it/s] Training 1/1 epoch (loss 1.4923): 21%|β–ˆβ–ˆ | 194/938 [01:25<05:10, 2.40it/s] Training 1/1 epoch (loss 1.4912): 21%|β–ˆβ–ˆ | 194/938 [01:26<05:10, 2.40it/s] Training 1/1 epoch (loss 1.4912): 21%|β–ˆβ–ˆ | 195/938 [01:26<05:33, 2.23it/s] Training 1/1 epoch (loss 1.5236): 21%|β–ˆβ–ˆ | 195/938 [01:26<05:33, 2.23it/s] Training 1/1 epoch (loss 1.5236): 21%|β–ˆβ–ˆ | 196/938 [01:26<05:26, 2.27it/s] Training 1/1 epoch (loss 1.5551): 21%|β–ˆβ–ˆ | 196/938 [01:27<05:26, 2.27it/s] Training 1/1 epoch (loss 1.5551): 21%|β–ˆβ–ˆ | 197/938 [01:27<05:40, 2.18it/s] Training 1/1 epoch (loss 
1.4136): 21%|β–ˆβ–ˆ | 197/938 [01:27<05:40, 2.18it/s] Training 1/1 epoch (loss 1.4136): 21%|β–ˆβ–ˆ | 198/938 [01:27<05:37, 2.19it/s] Training 1/1 epoch (loss 1.4954): 21%|β–ˆβ–ˆ | 198/938 [01:28<05:37, 2.19it/s] Training 1/1 epoch (loss 1.4954): 21%|β–ˆβ–ˆ | 199/938 [01:28<05:16, 2.34it/s] Training 1/1 epoch (loss 1.4490): 21%|β–ˆβ–ˆ | 199/938 [01:28<05:16, 2.34it/s] Training 1/1 epoch (loss 1.4490): 21%|β–ˆβ–ˆβ– | 200/938 [01:28<05:17, 2.33it/s] Training 1/1 epoch (loss 1.4882): 21%|β–ˆβ–ˆβ– | 200/938 [01:29<05:17, 2.33it/s] Training 1/1 epoch (loss 1.4882): 21%|β–ˆβ–ˆβ– | 201/938 [01:29<05:01, 2.44it/s] Training 1/1 epoch (loss 1.5032): 21%|β–ˆβ–ˆβ– | 201/938 [01:29<05:01, 2.44it/s] Training 1/1 epoch (loss 1.5032): 22%|β–ˆβ–ˆβ– | 202/938 [01:29<05:00, 2.45it/s] Training 1/1 epoch (loss 1.5195): 22%|β–ˆβ–ˆβ– | 202/938 [01:29<05:00, 2.45it/s] Training 1/1 epoch (loss 1.5195): 22%|β–ˆβ–ˆβ– | 203/938 [01:29<05:00, 2.44it/s] Training 1/1 epoch (loss 1.4513): 22%|β–ˆβ–ˆβ– | 203/938 [01:30<05:00, 2.44it/s] Training 1/1 epoch (loss 1.4513): 22%|β–ˆβ–ˆβ– | 204/938 [01:30<05:07, 2.39it/s] Training 1/1 epoch (loss 1.5058): 22%|β–ˆβ–ˆβ– | 204/938 [01:30<05:07, 2.39it/s] Training 1/1 epoch (loss 1.5058): 22%|β–ˆβ–ˆβ– | 205/938 [01:30<04:57, 2.46it/s] Training 1/1 epoch (loss 1.5842): 22%|β–ˆβ–ˆβ– | 205/938 [01:31<04:57, 2.46it/s] Training 1/1 epoch (loss 1.5842): 22%|β–ˆβ–ˆβ– | 206/938 [01:31<04:48, 2.54it/s] Training 1/1 epoch (loss 1.4993): 22%|β–ˆβ–ˆβ– | 206/938 [01:31<04:48, 2.54it/s] Training 1/1 epoch (loss 1.4993): 22%|β–ˆβ–ˆβ– | 207/938 [01:31<04:52, 2.50it/s] Training 1/1 epoch (loss 1.5575): 22%|β–ˆβ–ˆβ– | 207/938 [01:31<04:52, 2.50it/s] Training 1/1 epoch (loss 1.5575): 22%|β–ˆβ–ˆβ– | 208/938 [01:31<04:57, 2.46it/s] Training 1/1 epoch (loss 1.4829): 22%|β–ˆβ–ˆβ– | 208/938 [01:32<04:57, 2.46it/s] Training 1/1 epoch (loss 1.4829): 22%|β–ˆβ–ˆβ– | 209/938 [01:32<05:08, 2.37it/s] Training 1/1 epoch (loss 1.4617): 22%|β–ˆβ–ˆβ– | 209/938 [01:32<05:08, 
2.37it/s] Training 1/1 epoch (loss 1.4617): 22%|β–ˆβ–ˆβ– | 210/938 [01:32<04:55, 2.47it/s] Training 1/1 epoch (loss 1.4719): 22%|β–ˆβ–ˆβ– | 210/938 [01:33<04:55, 2.47it/s] Training 1/1 epoch (loss 1.4719): 22%|β–ˆβ–ˆβ– | 211/938 [01:33<04:54, 2.47it/s] Training 1/1 epoch (loss 1.3829): 22%|β–ˆβ–ˆβ– | 211/938 [01:33<04:54, 2.47it/s] Training 1/1 epoch (loss 1.3829): 23%|β–ˆβ–ˆβ–Ž | 212/938 [01:33<04:58, 2.43it/s] Training 1/1 epoch (loss 1.4132): 23%|β–ˆβ–ˆβ–Ž | 212/938 [01:33<04:58, 2.43it/s] Training 1/1 epoch (loss 1.4132): 23%|β–ˆβ–ˆβ–Ž | 213/938 [01:33<05:01, 2.40it/s] Training 1/1 epoch (loss 1.4269): 23%|β–ˆβ–ˆβ–Ž | 213/938 [01:34<05:01, 2.40it/s] Training 1/1 epoch (loss 1.4269): 23%|β–ˆβ–ˆβ–Ž | 214/938 [01:34<04:57, 2.43it/s] Training 1/1 epoch (loss 1.5241): 23%|β–ˆβ–ˆβ–Ž | 214/938 [01:34<04:57, 2.43it/s] Training 1/1 epoch (loss 1.5241): 23%|β–ˆβ–ˆβ–Ž | 215/938 [01:34<04:53, 2.46it/s] Training 1/1 epoch (loss 1.4263): 23%|β–ˆβ–ˆβ–Ž | 215/938 [01:35<04:53, 2.46it/s] Training 1/1 epoch (loss 1.4263): 23%|β–ˆβ–ˆβ–Ž | 216/938 [01:35<05:33, 2.16it/s] Training 1/1 epoch (loss 1.6101): 23%|β–ˆβ–ˆβ–Ž | 216/938 [01:36<05:33, 2.16it/s] Training 1/1 epoch (loss 1.6101): 23%|β–ˆβ–ˆβ–Ž | 217/938 [01:36<06:39, 1.81it/s] Training 1/1 epoch (loss 1.5387): 23%|β–ˆβ–ˆβ–Ž | 217/938 [01:36<06:39, 1.81it/s] Training 1/1 epoch (loss 1.5387): 23%|β–ˆβ–ˆβ–Ž | 218/938 [01:36<06:10, 1.94it/s] Training 1/1 epoch (loss 1.4693): 23%|β–ˆβ–ˆβ–Ž | 218/938 [01:36<06:10, 1.94it/s] Training 1/1 epoch (loss 1.4693): 23%|β–ˆβ–ˆβ–Ž | 219/938 [01:36<05:38, 2.12it/s] Training 1/1 epoch (loss 1.4113): 23%|β–ˆβ–ˆβ–Ž | 219/938 [01:37<05:38, 2.12it/s] Training 1/1 epoch (loss 1.4113): 23%|β–ˆβ–ˆβ–Ž | 220/938 [01:37<05:37, 2.13it/s] Training 1/1 epoch (loss 1.5499): 23%|β–ˆβ–ˆβ–Ž | 220/938 [01:37<05:37, 2.13it/s] Training 1/1 epoch (loss 1.5499): 24%|β–ˆβ–ˆβ–Ž | 221/938 [01:37<05:18, 2.25it/s] Training 1/1 epoch (loss 1.4759): 24%|β–ˆβ–ˆβ–Ž | 221/938 [01:38<05:18, 2.25it/s] Training 1/1 epoch 
(loss 1.4759): 24%|β–ˆβ–ˆβ–Ž | 222/938 [01:38<05:06, 2.34it/s] Training 1/1 epoch (loss 1.6005): 24%|β–ˆβ–ˆβ–Ž | 222/938 [01:38<05:06, 2.34it/s] Training 1/1 epoch (loss 1.6005): 24%|β–ˆβ–ˆβ– | 223/938 [01:38<04:55, 2.42it/s] Training 1/1 epoch (loss 1.5403): 24%|β–ˆβ–ˆβ– | 223/938 [01:38<04:55, 2.42it/s] Training 1/1 epoch (loss 1.5403): 24%|β–ˆβ–ˆβ– | 224/938 [01:38<04:46, 2.49it/s] Training 1/1 epoch (loss 1.5050): 24%|β–ˆβ–ˆβ– | 224/938 [01:39<04:46, 2.49it/s] Training 1/1 epoch (loss 1.5050): 24%|β–ˆβ–ˆβ– | 225/938 [01:39<04:56, 2.41it/s] Training 1/1 epoch (loss 1.5974): 24%|β–ˆβ–ˆβ– | 225/938 [01:39<04:56, 2.41it/s] Training 1/1 epoch (loss 1.5974): 24%|β–ˆβ–ˆβ– | 226/938 [01:39<04:45, 2.49it/s] Training 1/1 epoch (loss 1.4900): 24%|β–ˆβ–ˆβ– | 226/938 [01:40<04:45, 2.49it/s] Training 1/1 epoch (loss 1.4900): 24%|β–ˆβ–ˆβ– | 227/938 [01:40<04:52, 2.43it/s] Training 1/1 epoch (loss 1.5132): 24%|β–ˆβ–ˆβ– | 227/938 [01:40<04:52, 2.43it/s] Training 1/1 epoch (loss 1.5132): 24%|β–ˆβ–ˆβ– | 228/938 [01:40<04:47, 2.47it/s] Training 1/1 epoch (loss 1.5400): 24%|β–ˆβ–ˆβ– | 228/938 [01:40<04:47, 2.47it/s] Training 1/1 epoch (loss 1.5400): 24%|β–ˆβ–ˆβ– | 229/938 [01:40<04:40, 2.53it/s] Training 1/1 epoch (loss 1.5459): 24%|β–ˆβ–ˆβ– | 229/938 [01:41<04:40, 2.53it/s] Training 1/1 epoch (loss 1.5459): 25%|β–ˆβ–ˆβ– | 230/938 [01:41<04:37, 2.55it/s] Training 1/1 epoch (loss 1.4707): 25%|β–ˆβ–ˆβ– | 230/938 [01:41<04:37, 2.55it/s] Training 1/1 epoch (loss 1.4707): 25%|β–ˆβ–ˆβ– | 231/938 [01:41<04:32, 2.59it/s] Training 1/1 epoch (loss 1.5035): 25%|β–ˆβ–ˆβ– | 231/938 [01:42<04:32, 2.59it/s] Training 1/1 epoch (loss 1.5035): 25%|β–ˆβ–ˆβ– | 232/938 [01:42<04:39, 2.53it/s] Training 1/1 epoch (loss 1.5022): 25%|β–ˆβ–ˆβ– | 232/938 [01:42<04:39, 2.53it/s] Training 1/1 epoch (loss 1.5022): 25%|β–ˆβ–ˆβ– | 233/938 [01:42<04:38, 2.53it/s] Training 1/1 epoch (loss 1.4794): 25%|β–ˆβ–ˆβ– | 233/938 [01:42<04:38, 2.53it/s] Training 1/1 epoch (loss 1.4794): 25%|β–ˆβ–ˆβ– | 
234/938 [01:42<04:33, 2.58it/s] Training 1/1 epoch (loss 1.4822): 25%|██▍ | 234/938 [01:43<04:33, 2.58it/s] Training 1/1 epoch (loss 1.4822): 25%|██▌ | 235/938 [01:43<04:31, 2.59it/s] Training 1/1 epoch (loss 1.4054): 25%|██▌ | 235/938 [01:43<04:31, 2.59it/s] Training 1/1 epoch (loss 1.4054): 25%|██▌ | 236/938 [01:43<04:32, 2.58it/s] Training 1/1 epoch (loss 1.5899): 25%|██▌ | 236/938 [01:43<04:32, 2.58it/s] Training 1/1 epoch (loss 1.5899): 25%|██▌ | 237/938 [01:43<04:30, 2.60it/s] Training 1/1 epoch (loss 1.3846): 25%|██▌ | 237/938 [01:44<04:30, 2.60it/s] Training 1/1 epoch (loss 1.3846): 25%|██▌ | 238/938 [01:44<04:26, 2.63it/s] Training 1/1 epoch (loss 1.4586): 25%|██▌ | 238/938 [01:44<04:26, 2.63it/s] Training 1/1 epoch (loss 1.4586): 25%|██▌ | 239/938 [01:44<04:42, 2.48it/s] Training 1/1 epoch (loss 1.4638): 25%|██▌ | 239/938 [01:45<04:42, 2.48it/s] Training 1/1 epoch (loss 1.4638): 26%|██▌ | 240/938 [01:45<05:01, 2.31it/s] Training 1/1 epoch (loss 1.6142): 26%|██▌ | 240/938 [01:45<05:01, 2.31it/s] Training 1/1 epoch (loss 1.6142): 26%|██▌ | 241/938 [01:45<05:02, 2.30it/s] Training 1/1 epoch (loss 1.5047): 26%|██▌ | 241/938 [01:46<05:02, 2.30it/s] Training 1/1 epoch (loss 1.5047): 26%|██▌ | 242/938 [01:46<04:53, 2.37it/s] Training 1/1 epoch (loss 1.5184): 26%|██▌ | 242/938 [01:46<04:53, 2.37it/s] Training 1/1 epoch (loss 1.5184): 26%|██▌ | 243/938 [01:46<04:45, 2.43it/s] Training 1/1 epoch (loss 1.3770): 26%|██▌ | 243/938 [01:46<04:45, 2.43it/s] Training 1/1 epoch (loss 1.3770): 26%|██▌ | 244/938 [01:46<04:38, 2.49it/s] Training 1/1 epoch (loss 1.4266): 26%|██▌ | 244/938 [01:47<04:38, 2.49it/s] Training 1/1 epoch (loss 1.4266): 26%|██▌ | 245/938 [01:47<04:33, 2.53it/s] Training 1/1 epoch (loss 1.4827): 26%|██▌ | 245/938 [01:47<04:33, 2.53it/s] Training 1/1 epoch (loss 1.4827): 26%|██▌ | 246/938 [01:47<04:29, 2.57it/s] 
Training 1/1 epoch (loss 1.4735): 26%|██▌ | 246/938 [01:48<04:29, 2.57it/s] Training 1/1 epoch (loss 1.4735): 26%|██▋ | 247/938 [01:48<04:47, 2.40it/s] Training 1/1 epoch (loss 1.4460): 26%|██▋ | 247/938 [01:48<04:47, 2.40it/s] Training 1/1 epoch (loss 1.4460): 26%|██▋ | 248/938 [01:48<04:50, 2.38it/s] Training 1/1 epoch (loss 1.5027): 26%|██▋ | 248/938 [01:49<04:50, 2.38it/s] Training 1/1 epoch (loss 1.5027): 27%|██▋ | 249/938 [01:49<04:53, 2.35it/s] Training 1/1 epoch (loss 1.3708): 27%|██▋ | 249/938 [01:49<04:53, 2.35it/s] Training 1/1 epoch (loss 1.3708): 27%|██▋ | 250/938 [01:49<04:43, 2.43it/s] Training 1/1 epoch (loss 1.6117): 27%|██▋ | 250/938 [01:49<04:43, 2.43it/s] Training 1/1 epoch (loss 1.6117): 27%|██▋ | 251/938 [01:49<04:36, 2.49it/s] Training 1/1 epoch (loss 1.5280): 27%|██▋ | 251/938 [01:50<04:36, 2.49it/s] Training 1/1 epoch (loss 1.5280): 27%|██▋ | 252/938 [01:50<04:36, 2.48it/s] Training 1/1 epoch (loss 1.4955): 27%|██▋ | 252/938 [01:50<04:36, 2.48it/s] Training 1/1 epoch (loss 1.4955): 27%|██▋ | 253/938 [01:50<04:29, 2.54it/s] Training 1/1 epoch (loss 1.5053): 27%|██▋ | 253/938 [01:50<04:29, 2.54it/s] Training 1/1 epoch (loss 1.5053): 27%|██▋ | 254/938 [01:50<04:27, 2.55it/s] Training 1/1 epoch (loss 1.5049): 27%|██▋ | 254/938 [01:51<04:27, 2.55it/s] Training 1/1 epoch (loss 1.5049): 27%|██▋ | 255/938 [01:51<04:33, 2.50it/s] Training 1/1 epoch (loss 1.3954): 27%|██▋ | 255/938 [01:51<04:33, 2.50it/s] Training 1/1 epoch (loss 1.3954): 27%|██▋ | 256/938 [01:51<04:26, 2.56it/s] Training 1/1 epoch (loss 1.4757): 27%|██▋ | 256/938 [01:52<04:26, 2.56it/s] Training 1/1 epoch (loss 1.4757): 27%|██▋ | 257/938 [01:52<04:35, 2.48it/s] Training 1/1 epoch (loss 1.4993): 27%|██▋ | 257/938 [01:52<04:35, 2.48it/s] Training 1/1 epoch (loss 1.4993): 28%|██▊ | 258/938 [01:52<04:30, 2.51it/s] Training 1/1 epoch (loss 
1.5188): 28%|██▊ | 258/938 [01:52<04:30, 2.51it/s] Training 1/1 epoch (loss 1.5188): 28%|██▊ | 259/938 [01:52<04:22, 2.59it/s] Training 1/1 epoch (loss 1.4522): 28%|██▊ | 259/938 [01:53<04:22, 2.59it/s] Training 1/1 epoch (loss 1.4522): 28%|██▊ | 260/938 [01:53<04:24, 2.56it/s] Training 1/1 epoch (loss 1.4686): 28%|██▊ | 260/938 [01:53<04:24, 2.56it/s] Training 1/1 epoch (loss 1.4686): 28%|██▊ | 261/938 [01:53<04:22, 2.58it/s] Training 1/1 epoch (loss 1.4515): 28%|██▊ | 261/938 [01:54<04:22, 2.58it/s] Training 1/1 epoch (loss 1.4515): 28%|██▊ | 262/938 [01:54<04:22, 2.58it/s] Training 1/1 epoch (loss 1.3550): 28%|██▊ | 262/938 [01:54<04:22, 2.58it/s] Training 1/1 epoch (loss 1.3550): 28%|██▊ | 263/938 [01:54<04:34, 2.46it/s] Training 1/1 epoch (loss 1.5947): 28%|██▊ | 263/938 [01:55<04:34, 2.46it/s] Training 1/1 epoch (loss 1.5947): 28%|██▊ | 264/938 [01:55<04:57, 2.27it/s] Training 1/1 epoch (loss 1.4540): 28%|██▊ | 264/938 [01:55<04:57, 2.27it/s] Training 1/1 epoch (loss 1.4540): 28%|██▊ | 265/938 [01:55<05:23, 2.08it/s] Training 1/1 epoch (loss 1.4999): 28%|██▊ | 265/938 [01:56<05:23, 2.08it/s] Training 1/1 epoch (loss 1.4999): 28%|██▊ | 266/938 [01:56<05:25, 2.07it/s] Training 1/1 epoch (loss 1.4054): 28%|██▊ | 266/938 [01:56<05:25, 2.07it/s] Training 1/1 epoch (loss 1.4054): 28%|██▊ | 267/938 [01:56<05:29, 2.04it/s] Training 1/1 epoch (loss 1.4059): 28%|██▊ | 267/938 [01:57<05:29, 2.04it/s] Training 1/1 epoch (loss 1.4059): 29%|██▊ | 268/938 [01:57<05:15, 2.12it/s] Training 1/1 epoch (loss 1.4112): 29%|██▊ | 268/938 [01:57<05:15, 2.12it/s] Training 1/1 epoch (loss 1.4112): 29%|██▊ | 269/938 [01:57<05:05, 2.19it/s] Training 1/1 epoch (loss 1.5069): 29%|██▊ | 269/938 [01:57<05:05, 2.19it/s] Training 1/1 epoch (loss 1.5069): 29%|██▉ | 270/938 [01:57<04:51, 2.29it/s] Training 1/1 epoch (loss 1.5692): 29%|██▉ | 
270/938 [01:58<04:51, 2.29it/s] Training 1/1 epoch (loss 1.5692): 29%|██▉ | 271/938 [01:58<04:41, 2.37it/s] Training 1/1 epoch (loss 1.3878): 29%|██▉ | 271/938 [01:58<04:41, 2.37it/s] Training 1/1 epoch (loss 1.3878): 29%|██▉ | 272/938 [01:58<04:39, 2.39it/s] Training 1/1 epoch (loss 1.5019): 29%|██▉ | 272/938 [01:59<04:39, 2.39it/s] Training 1/1 epoch (loss 1.5019): 29%|██▉ | 273/938 [01:59<04:31, 2.44it/s] Training 1/1 epoch (loss 1.5486): 29%|██▉ | 273/938 [01:59<04:31, 2.44it/s] Training 1/1 epoch (loss 1.5486): 29%|██▉ | 274/938 [01:59<04:25, 2.50it/s] Training 1/1 epoch (loss 1.6470): 29%|██▉ | 274/938 [01:59<04:25, 2.50it/s] Training 1/1 epoch (loss 1.6470): 29%|██▉ | 275/938 [01:59<04:18, 2.56it/s] Training 1/1 epoch (loss 1.4853): 29%|██▉ | 275/938 [02:00<04:18, 2.56it/s] Training 1/1 epoch (loss 1.4853): 29%|██▉ | 276/938 [02:00<04:25, 2.49it/s] Training 1/1 epoch (loss 1.5508): 29%|██▉ | 276/938 [02:00<04:25, 2.49it/s] Training 1/1 epoch (loss 1.5508): 30%|██▉ | 277/938 [02:00<04:28, 2.46it/s] Training 1/1 epoch (loss 1.5001): 30%|██▉ | 277/938 [02:01<04:28, 2.46it/s] Training 1/1 epoch (loss 1.5001): 30%|██▉ | 278/938 [02:01<04:35, 2.40it/s] Training 1/1 epoch (loss 1.5307): 30%|██▉ | 278/938 [02:01<04:35, 2.40it/s] Training 1/1 epoch (loss 1.5307): 30%|██▉ | 279/938 [02:01<04:40, 2.35it/s] Training 1/1 epoch (loss 1.3540): 30%|██▉ | 279/938 [02:01<04:40, 2.35it/s] Training 1/1 epoch (loss 1.3540): 30%|██▉ | 280/938 [02:01<04:33, 2.40it/s] Training 1/1 epoch (loss 1.4501): 30%|██▉ | 280/938 [02:02<04:33, 2.40it/s] Training 1/1 epoch (loss 1.4501): 30%|██▉ | 281/938 [02:02<04:27, 2.46it/s] Training 1/1 epoch (loss 1.4280): 30%|██▉ | 281/938 [02:02<04:27, 2.46it/s] Training 1/1 epoch (loss 1.4280): 30%|███ | 282/938 [02:02<04:18, 2.54it/s] Training 1/1 epoch (loss 1.5630): 30%|███ | 282/938 [02:03<04:18, 2.54it/s] 
Training 1/1 epoch (loss 1.5630): 30%|███ | 283/938 [02:03<04:17, 2.54it/s] Training 1/1 epoch (loss 1.4455): 30%|███ | 283/938 [02:03<04:17, 2.54it/s] Training 1/1 epoch (loss 1.4455): 30%|███ | 284/938 [02:03<04:15, 2.56it/s] Training 1/1 epoch (loss 1.4256): 30%|███ | 284/938 [02:03<04:15, 2.56it/s] Training 1/1 epoch (loss 1.4256): 30%|███ | 285/938 [02:03<04:11, 2.60it/s] Training 1/1 epoch (loss 1.5283): 30%|███ | 285/938 [02:04<04:11, 2.60it/s] Training 1/1 epoch (loss 1.5283): 30%|███ | 286/938 [02:04<04:10, 2.60it/s] Training 1/1 epoch (loss 1.3993): 30%|███ | 286/938 [02:04<04:10, 2.60it/s] Training 1/1 epoch (loss 1.3993): 31%|███ | 287/938 [02:04<04:15, 2.55it/s] Training 1/1 epoch (loss 1.5870): 31%|███ | 287/938 [02:05<04:15, 2.55it/s] Training 1/1 epoch (loss 1.5870): 31%|███ | 288/938 [02:05<04:41, 2.31it/s] Training 1/1 epoch (loss 1.5873): 31%|███ | 288/938 [02:05<04:41, 2.31it/s] Training 1/1 epoch (loss 1.5873): 31%|███ | 289/938 [02:05<05:18, 2.04it/s] Training 1/1 epoch (loss 1.4596): 31%|███ | 289/938 [02:06<05:18, 2.04it/s] Training 1/1 epoch (loss 1.4596): 31%|███ | 290/938 [02:06<05:24, 2.00it/s] Training 1/1 epoch (loss 1.5695): 31%|███ | 290/938 [02:06<05:24, 2.00it/s] Training 1/1 epoch (loss 1.5695): 31%|███ | 291/938 [02:06<05:06, 2.11it/s] Training 1/1 epoch (loss 1.4230): 31%|███ | 291/938 [02:07<05:06, 2.11it/s] Training 1/1 epoch (loss 1.4230): 31%|███ | 292/938 [02:07<04:45, 2.26it/s] Training 1/1 epoch (loss 1.4538): 31%|███ | 292/938 [02:07<04:45, 2.26it/s] Training 1/1 epoch (loss 1.4538): 31%|███ | 293/938 [02:07<04:39, 2.31it/s] Training 1/1 epoch (loss 1.4569): 31%|███ | 293/938 [02:07<04:39, 2.31it/s] Training 1/1 epoch (loss 1.4569): 31%|███▏ | 294/938 [02:07<04:28, 2.40it/s] Training 1/1 epoch (loss 1.3152): 31%|███▏ | 294/938 [02:08<04:28, 2.40it/s] Training 1/1 epoch (loss 
1.3152): 31%|███▏ | 295/938 [02:08<04:38, 2.31it/s] Training 1/1 epoch (loss 1.4196): 31%|███▏ | 295/938 [02:08<04:38, 2.31it/s] Training 1/1 epoch (loss 1.4196): 32%|███▏ | 296/938 [02:08<04:28, 2.39it/s] Training 1/1 epoch (loss 1.4875): 32%|███▏ | 296/938 [02:09<04:28, 2.39it/s] Training 1/1 epoch (loss 1.4875): 32%|███▏ | 297/938 [02:09<04:18, 2.48it/s] Training 1/1 epoch (loss 1.4340): 32%|███▏ | 297/938 [02:09<04:18, 2.48it/s] Training 1/1 epoch (loss 1.4340): 32%|███▏ | 298/938 [02:09<04:22, 2.44it/s] Training 1/1 epoch (loss 1.5025): 32%|███▏ | 298/938 [02:09<04:22, 2.44it/s] Training 1/1 epoch (loss 1.5025): 32%|███▏ | 299/938 [02:09<04:20, 2.45it/s] Training 1/1 epoch (loss 1.4822): 32%|███▏ | 299/938 [02:10<04:20, 2.45it/s] Training 1/1 epoch (loss 1.4822): 32%|███▏ | 300/938 [02:10<04:13, 2.52it/s] Training 1/1 epoch (loss 1.4999): 32%|███▏ | 300/938 [02:10<04:13, 2.52it/s] Training 1/1 epoch (loss 1.4999): 32%|███▏ | 301/938 [02:10<04:10, 2.54it/s] Training 1/1 epoch (loss 1.5312): 32%|███▏ | 301/938 [02:11<04:10, 2.54it/s] Training 1/1 epoch (loss 1.5312): 32%|███▏ | 302/938 [02:11<04:08, 2.55it/s] Training 1/1 epoch (loss 1.4063): 32%|███▏ | 302/938 [02:11<04:08, 2.55it/s] Training 1/1 epoch (loss 1.4063): 32%|███▏ | 303/938 [02:11<04:05, 2.58it/s] Training 1/1 epoch (loss 1.6265): 32%|███▏ | 303/938 [02:11<04:05, 2.58it/s] Training 1/1 epoch (loss 1.6265): 32%|███▏ | 304/938 [02:11<04:01, 2.62it/s] Training 1/1 epoch (loss 1.4051): 32%|███▏ | 304/938 [02:12<04:01, 2.62it/s] Training 1/1 epoch (loss 1.4051): 33%|███▎ | 305/938 [02:12<04:09, 2.54it/s] Training 1/1 epoch (loss 1.4983): 33%|███▎ | 305/938 [02:12<04:09, 2.54it/s] Training 1/1 epoch (loss 1.4983): 33%|███▎ | 306/938 [02:12<04:11, 2.51it/s] Training 1/1 epoch (loss 1.5102): 33%|███▎ | 306/938 
[02:13<04:11, 2.51it/s] Training 1/1 epoch (loss 1.5102): 33%|███▎ | 307/938 [02:13<04:09, 2.53it/s] Training 1/1 epoch (loss 1.4304): 33%|███▎ | 307/938 [02:13<04:09, 2.53it/s] Training 1/1 epoch (loss 1.4304): 33%|███▎ | 308/938 [02:13<04:04, 2.58it/s] Training 1/1 epoch (loss 1.3522): 33%|███▎ | 308/938 [02:13<04:04, 2.58it/s] Training 1/1 epoch (loss 1.3522): 33%|███▎ | 309/938 [02:13<03:58, 2.63it/s] Training 1/1 epoch (loss 1.5598): 33%|███▎ | 309/938 [02:14<03:58, 2.63it/s] Training 1/1 epoch (loss 1.5598): 33%|███▎ | 310/938 [02:14<03:58, 2.63it/s] Training 1/1 epoch (loss 1.2966): 33%|███▎ | 310/938 [02:14<03:58, 2.63it/s] Training 1/1 epoch (loss 1.2966): 33%|███▎ | 311/938 [02:14<04:00, 2.61it/s] Training 1/1 epoch (loss 1.4529): 33%|███▎ | 311/938 [02:15<04:00, 2.61it/s] Training 1/1 epoch (loss 1.4529): 33%|███▎ | 312/938 [02:15<04:22, 2.39it/s] Training 1/1 epoch (loss 1.5081): 33%|███▎ | 312/938 [02:15<04:22, 2.39it/s] Training 1/1 epoch (loss 1.5081): 33%|███▎ | 313/938 [02:15<04:26, 2.34it/s] Training 1/1 epoch (loss 1.5764): 33%|███▎ | 313/938 [02:15<04:26, 2.34it/s] Training 1/1 epoch (loss 1.5764): 33%|███▎ | 314/938 [02:15<04:17, 2.43it/s] Training 1/1 epoch (loss 1.3901): 33%|███▎ | 314/938 [02:16<04:17, 2.43it/s] Training 1/1 epoch (loss 1.3901): 34%|███▎ | 315/938 [02:16<04:11, 2.47it/s] Training 1/1 epoch (loss 1.5032): 34%|███▎ | 315/938 [02:16<04:11, 2.47it/s] Training 1/1 epoch (loss 1.5032): 34%|███▎ | 316/938 [02:16<04:21, 2.38it/s] Training 1/1 epoch (loss 1.4703): 34%|███▎ | 316/938 [02:17<04:21, 2.38it/s] Training 1/1 epoch (loss 1.4703): 34%|███▍ | 317/938 [02:17<04:10, 2.48it/s] Training 1/1 epoch (loss 1.5055): 34%|███▍ | 317/938 [02:17<04:10, 2.48it/s] Training 1/1 epoch (loss 1.5055): 34%|███▍ | 318/938 [02:17<04:04, 2.53it/s] Training 1/1 epoch 
(loss 1.3727): 34%|███▍ | 318/938 [02:17<04:04, 2.53it/s] Training 1/1 epoch (loss 1.3727): 34%|███▍ | 319/938 [02:17<04:03, 2.54it/s] Training 1/1 epoch (loss 1.3223): 34%|███▍ | 319/938 [02:18<04:03, 2.54it/s] Training 1/1 epoch (loss 1.3223): 34%|███▍ | 320/938 [02:18<04:20, 2.38it/s] Training 1/1 epoch (loss 1.4042): 34%|███▍ | 320/938 [02:18<04:20, 2.38it/s] Training 1/1 epoch (loss 1.4042): 34%|███▍ | 321/938 [02:18<04:26, 2.31it/s] Training 1/1 epoch (loss 1.4896): 34%|███▍ | 321/938 [02:19<04:26, 2.31it/s] Training 1/1 epoch (loss 1.4896): 34%|███▍ | 322/938 [02:19<04:18, 2.38it/s] Training 1/1 epoch (loss 1.5974): 34%|███▍ | 322/938 [02:19<04:18, 2.38it/s] Training 1/1 epoch (loss 1.5974): 34%|███▍ | 323/938 [02:19<04:12, 2.43it/s] Training 1/1 epoch (loss 1.5037): 34%|███▍ | 323/938 [02:19<04:12, 2.43it/s] Training 1/1 epoch (loss 1.5037): 35%|███▍ | 324/938 [02:19<04:08, 2.48it/s] Training 1/1 epoch (loss 1.4564): 35%|███▍ | 324/938 [02:20<04:08, 2.48it/s] Training 1/1 epoch (loss 1.4564): 35%|███▍ | 325/938 [02:20<04:01, 2.54it/s] Training 1/1 epoch (loss 1.5180): 35%|███▍ | 325/938 [02:20<04:01, 2.54it/s] Training 1/1 epoch (loss 1.5180): 35%|███▍ | 326/938 [02:20<03:59, 2.55it/s] Training 1/1 epoch (loss 1.4490): 35%|███▍ | 326/938 [02:21<03:59, 2.55it/s] Training 1/1 epoch (loss 1.4490): 35%|███▍ | 327/938 [02:21<03:56, 2.59it/s] Training 1/1 epoch (loss 1.4143): 35%|███▍ | 327/938 [02:21<03:56, 2.59it/s] Training 1/1 epoch (loss 1.4143): 35%|███▍ | 328/938 [02:21<03:57, 2.57it/s] Training 1/1 epoch (loss 1.3630): 35%|███▍ | 328/938 [02:21<03:57, 2.57it/s] Training 1/1 epoch (loss 1.3630): 35%|███▌ | 329/938 [02:21<03:52, 2.62it/s] Training 1/1 epoch (loss 1.5279): 35%|███▌ | 329/938 [02:22<03:52, 2.62it/s] Training 1/1 epoch (loss 1.5279): 35%|███▌ | 330/938 
[02:22<03:59, 2.54it/s] Training 1/1 epoch (loss 1.4438): 35%|███▌ | 330/938 [02:22<03:59, 2.54it/s] Training 1/1 epoch (loss 1.4438): 35%|███▌ | 331/938 [02:22<04:03, 2.49it/s] Training 1/1 epoch (loss 1.5463): 35%|███▌ | 331/938 [02:23<04:03, 2.49it/s] Training 1/1 epoch (loss 1.5463): 35%|███▌ | 332/938 [02:23<03:57, 2.55it/s] Training 1/1 epoch (loss 1.4554): 35%|███▌ | 332/938 [02:23<03:57, 2.55it/s] Training 1/1 epoch (loss 1.4554): 36%|███▌ | 333/938 [02:23<03:55, 2.57it/s] Training 1/1 epoch (loss 1.4105): 36%|███▌ | 333/938 [02:23<03:55, 2.57it/s] Training 1/1 epoch (loss 1.4105): 36%|███▌ | 334/938 [02:23<03:51, 2.61it/s] Training 1/1 epoch (loss 1.4954): 36%|███▌ | 334/938 [02:24<03:51, 2.61it/s] Training 1/1 epoch (loss 1.4954): 36%|███▌ | 335/938 [02:24<03:56, 2.55it/s] Training 1/1 epoch (loss 1.3607): 36%|███▌ | 335/938 [02:24<03:56, 2.55it/s] Training 1/1 epoch (loss 1.3607): 36%|███▌ | 336/938 [02:24<03:57, 2.54it/s] Training 1/1 epoch (loss 1.4846): 36%|███▌ | 336/938 [02:25<03:57, 2.54it/s] Training 1/1 epoch (loss 1.4846): 36%|███▌ | 337/938 [02:25<04:10, 2.40it/s] Training 1/1 epoch (loss 1.4665): 36%|███▌ | 337/938 [02:25<04:10, 2.40it/s] Training 1/1 epoch (loss 1.4665): 36%|███▌ | 338/938 [02:25<05:39, 1.77it/s] Training 1/1 epoch (loss 1.4970): 36%|███▌ | 338/938 [02:26<05:39, 1.77it/s] Training 1/1 epoch (loss 1.4970): 36%|███▌ | 339/938 [02:26<05:21, 1.87it/s] Training 1/1 epoch (loss 1.3939): 36%|███▌ | 339/938 [02:26<05:21, 1.87it/s] Training 1/1 epoch (loss 1.3939): 36%|███▌ | 340/938 [02:26<04:56, 2.02it/s] Training 1/1 epoch (loss 1.4814): 36%|███▌ | 340/938 [02:27<04:56, 2.02it/s] Training 1/1 epoch (loss 1.4814): 36%|███▋ | 341/938 [02:27<04:31, 2.20it/s] Training 1/1 epoch (loss 1.4269): 36%|███▋ | 341/938 [02:27<04:31, 2.20it/s] Training 1/1 epoch 
(loss 1.4269): 36%|███▋ | 342/938 [02:27<04:25, 2.25it/s] Training 1/1 epoch (loss 1.5264): 36%|███▋ | 342/938 [02:28<04:25, 2.25it/s] Training 1/1 epoch (loss 1.5264): 37%|███▋ | 343/938 [02:28<04:13, 2.34it/s] Training 1/1 epoch (loss 1.4660): 37%|███▋ | 343/938 [02:28<04:13, 2.34it/s] Training 1/1 epoch (loss 1.4660): 37%|███▋ | 344/938 [02:28<04:09, 2.38it/s] Training 1/1 epoch (loss 1.4003): 37%|███▋ | 344/938 [02:28<04:09, 2.38it/s] Training 1/1 epoch (loss 1.4003): 37%|███▋ | 345/938 [02:28<04:10, 2.36it/s] Training 1/1 epoch (loss 1.4083): 37%|███▋ | 345/938 [02:29<04:10, 2.36it/s] Training 1/1 epoch (loss 1.4083): 37%|███▋ | 346/938 [02:29<04:01, 2.45it/s] Training 1/1 epoch (loss 1.4849): 37%|███▋ | 346/938 [02:29<04:01, 2.45it/s] Training 1/1 epoch (loss 1.4849): 37%|███▋ | 347/938 [02:29<03:56, 2.50it/s] Training 1/1 epoch (loss 1.3135): 37%|███▋ | 347/938 [02:30<03:56, 2.50it/s] Training 1/1 epoch (loss 1.3135): 37%|███▋ | 348/938 [02:30<04:02, 2.44it/s] Training 1/1 epoch (loss 1.5207): 37%|███▋ | 348/938 [02:30<04:02, 2.44it/s] Training 1/1 epoch (loss 1.5207): 37%|███▋ | 349/938 [02:30<04:02, 2.43it/s] Training 1/1 epoch (loss 1.4674): 37%|███▋ | 349/938 [02:30<04:02, 2.43it/s] Training 1/1 epoch (loss 1.4674): 37%|███▋ | 350/938 [02:30<03:56, 2.49it/s] Training 1/1 epoch (loss 1.4884): 37%|███▋ | 350/938 [02:31<03:56, 2.49it/s] Training 1/1 epoch (loss 1.4884): 37%|███▋ | 351/938 [02:31<03:47, 2.58it/s] Training 1/1 epoch (loss 1.4194): 37%|███▋ | 351/938 [02:31<03:47, 2.58it/s] Training 1/1 epoch (loss 1.4194): 38%|███▊ | 352/938 [02:31<03:58, 2.45it/s] Training 1/1 epoch (loss 1.5693): 38%|███▊ | 352/938 [02:32<03:58, 2.45it/s] Training 1/1 epoch (loss 1.5693): 38%|███▊ | 353/938 [02:32<03:52, 2.52it/s] Training 1/1 epoch (loss 1.4810): 38%|███▊ | 353/938 
[02:32<03:52, 2.52it/s] Training 1/1 epoch (loss 1.4810): 38%|███▊ | 354/938 [02:32<03:51, 2.52it/s] Training 1/1 epoch (loss 1.4861): 38%|███▊ | 354/938 [02:32<03:51, 2.52it/s] Training 1/1 epoch (loss 1.4861): 38%|███▊ | 355/938 [02:32<03:53, 2.49it/s] Training 1/1 epoch (loss 1.3907): 38%|███▊ | 355/938 [02:33<03:53, 2.49it/s] Training 1/1 epoch (loss 1.3907): 38%|███▊ | 356/938 [02:33<03:51, 2.52it/s] Training 1/1 epoch (loss 1.4490): 38%|███▊ | 356/938 [02:33<03:51, 2.52it/s] Training 1/1 epoch (loss 1.4490): 38%|███▊ | 357/938 [02:33<03:49, 2.53it/s] Training 1/1 epoch (loss 1.4268): 38%|███▊ | 357/938 [02:33<03:49, 2.53it/s] Training 1/1 epoch (loss 1.4268): 38%|███▊ | 358/938 [02:33<03:45, 2.57it/s] Training 1/1 epoch (loss 1.4253): 38%|███▊ | 358/938 [02:34<03:45, 2.57it/s] Training 1/1 epoch (loss 1.4253): 38%|███▊ | 359/938 [02:34<03:44, 2.58it/s] Training 1/1 epoch (loss 1.5393): 38%|███▊ | 359/938 [02:34<03:44, 2.58it/s] Training 1/1 epoch (loss 1.5393): 38%|███▊ | 360/938 [02:34<04:08, 2.33it/s] Training 1/1 epoch (loss 1.6007): 38%|███▊ | 360/938 [02:35<04:08, 2.33it/s] Training 1/1 epoch (loss 1.6007): 38%|███▊ | 361/938 [02:35<04:26, 2.17it/s] Training 1/1 epoch (loss 1.4588): 38%|███▊ | 361/938 [02:36<04:26, 2.17it/s] Training 1/1 epoch (loss 1.4588): 39%|███▊ | 362/938 [02:36<05:29, 1.75it/s] Training 1/1 epoch (loss 1.4901): 39%|███▊ | 362/938 [02:36<05:29, 1.75it/s] Training 1/1 epoch (loss 1.4901): 39%|███▊ | 363/938 [02:36<04:57, 1.93it/s] Training 1/1 epoch (loss 1.5194): 39%|███▊ | 363/938 [02:37<04:57, 1.93it/s] Training 1/1 epoch (loss 1.5194): 39%|███▉ | 364/938 [02:37<04:36, 2.08it/s] Training 1/1 epoch (loss 1.4126): 39%|███▉ | 364/938 [02:37<04:36, 2.08it/s] Training 1/1 epoch (loss 1.4126): 39%|███▉ | 365/938 [02:37<04:19, 2.21it/s] Training 1/1 epoch 
(loss 1.4577): 39%|███▉ | 365/938 [02:37<04:19, 2.21it/s] Training 1/1 epoch (loss 1.4577): 39%|███▉ | 366/938 [02:37<04:02, 2.36it/s] Training 1/1 epoch (loss 1.4588): 39%|███▉ | 366/938 [02:38<04:02, 2.36it/s] Training 1/1 epoch (loss 1.4588): 39%|███▉ | 367/938 [02:38<04:05, 2.33it/s] Training 1/1 epoch (loss 1.5916): 39%|███▉ | 367/938 [02:38<04:05, 2.33it/s] Training 1/1 epoch (loss 1.5916): 39%|███▉ | 368/938 [02:38<03:55, 2.42it/s] Training 1/1 epoch (loss 1.3872): 39%|███▉ | 368/938 [02:39<03:55, 2.42it/s] Training 1/1 epoch (loss 1.3872): 39%|███▉ | 369/938 [02:39<03:57, 2.40it/s] Training 1/1 epoch (loss 1.5077): 39%|███▉ | 369/938 [02:39<03:57, 2.40it/s] Training 1/1 epoch (loss 1.5077): 39%|███▉ | 370/938 [02:39<03:51, 2.45it/s] Training 1/1 epoch (loss 1.4422): 39%|███▉ | 370/938 [02:39<03:51, 2.45it/s] Training 1/1 epoch (loss 1.4422): 40%|███▉ | 371/938 [02:39<03:46, 2.51it/s] Training 1/1 epoch (loss 1.4806): 40%|███▉ | 371/938 [02:40<03:46, 2.51it/s] Training 1/1 epoch (loss 1.4806): 40%|███▉ | 372/938 [02:40<03:40, 2.56it/s] Training 1/1 epoch (loss 1.3837): 40%|███▉ | 372/938 [02:40<03:40, 2.56it/s] Training 1/1 epoch (loss 1.3837): 40%|███▉ | 373/938 [02:40<03:42, 2.54it/s] Training 1/1 epoch (loss 1.3241): 40%|███▉ | 373/938 [02:40<03:42, 2.54it/s] Training 1/1 epoch (loss 1.3241): 40%|███▉ | 374/938 [02:40<03:44, 2.52it/s] Training 1/1 epoch (loss 1.5083): 40%|███▉ | 374/938 [02:41<03:44, 2.52it/s] Training 1/1 epoch (loss 1.5083): 40%|███▉ | 375/938 [02:41<03:45, 2.50it/s] Training 1/1 epoch (loss 1.5153): 40%|███▉ | 375/938 [02:41<03:45, 2.50it/s] Training 1/1 epoch (loss 1.5153): 40%|████ | 376/938 [02:41<03:46, 2.48it/s] Training 1/1 epoch (loss 1.6042): 40%|████ | 376/938 [02:42<03:46, 2.48it/s] Training 1/1 epoch (loss 1.6042): 40%|████ | 377/938 
[02:42<03:50, 2.44it/s] Training 1/1 epoch (loss 1.3718): 40%|████ | 377/938 [02:42<03:50, 2.44it/s] Training 1/1 epoch (loss 1.3718): 40%|████ | 378/938 [02:42<03:43, 2.50it/s] Training 1/1 epoch (loss 1.4299): 40%|████ | 378/938 [02:42<03:43, 2.50it/s] Training 1/1 epoch (loss 1.4299): 40%|████ | 379/938 [02:42<03:37, 2.57it/s] Training 1/1 epoch (loss 1.3832): 40%|████ | 379/938 [02:43<03:37, 2.57it/s] Training 1/1 epoch (loss 1.3832): 41%|████ | 380/938 [02:43<03:42, 2.51it/s] Training 1/1 epoch (loss 1.4755): 41%|████ | 380/938 [02:43<03:42, 2.51it/s] Training 1/1 epoch (loss 1.4755): 41%|████ | 381/938 [02:43<03:37, 2.56it/s] Training 1/1 epoch (loss 1.3441): 41%|████ | 381/938 [02:44<03:37, 2.56it/s] Training 1/1 epoch (loss 1.3441): 41%|████ | 382/938 [02:44<03:32, 2.61it/s] Training 1/1 epoch (loss 1.5032): 41%|████ | 382/938 [02:44<03:32, 2.61it/s] Training 1/1 epoch (loss 1.5032): 41%|████ | 383/938 [02:44<03:36, 2.57it/s] Training 1/1 epoch (loss 1.5103): 41%|████ | 383/938 [02:44<03:36, 2.57it/s] Training 1/1 epoch (loss 1.5103): 41%|████ | 384/938 [02:44<03:44, 2.47it/s] Training 1/1 epoch (loss 1.4589): 41%|████ | 384/938 [02:45<03:44, 2.47it/s] Training 1/1 epoch (loss 1.4589): 41%|████ | 385/938 [02:45<03:42, 2.48it/s] Training 1/1 epoch (loss 1.4775): 41%|████ | 385/938 [02:45<03:42, 2.48it/s] Training 1/1 epoch (loss 1.4775): 41%|████ | 386/938 [02:45<03:37, 2.54it/s] Training 1/1 epoch (loss 1.4988): 41%|████ | 386/938 [02:46<03:37, 2.54it/s] Training 1/1 epoch (loss 1.4988): 41%|████▏ | 387/938 [02:46<03:47, 2.42it/s] Training 1/1 epoch (loss 1.4546): 41%|████▏ | 387/938 [02:46<03:47, 2.42it/s] Training 1/1 epoch (loss 1.4546): 41%|████▏ | 388/938 [02:46<03:57, 2.32it/s] Training 1/1 epoch (loss 1.5923): 41%|████▏ | 388/938 [02:47<03:57, 2.32it/s] Training 
1/1 epoch (loss 1.5923): 41%|████▏ | 389/938 [02:47<04:03, 2.25it/s] Training 1/1 epoch (loss 1.4333): 41%|████▏ | 389/938 [02:47<04:03, 2.25it/s] Training 1/1 epoch (loss 1.4333): 42%|████▏ | 390/938 [02:47<03:57, 2.30it/s] Training 1/1 epoch (loss 1.4409): 42%|████▏ | 390/938 [02:47<03:57, 2.30it/s] Training 1/1 epoch (loss 1.4409): 42%|████▏ | 391/938 [02:47<04:00, 2.27it/s] Training 1/1 epoch (loss 1.5660): 42%|████▏ | 391/938 [02:48<04:00, 2.27it/s] Training 1/1 epoch (loss 1.5660): 42%|████▏ | 392/938 [02:48<03:54, 2.33it/s] Training 1/1 epoch (loss 1.4685): 42%|████▏ | 392/938 [02:48<03:54, 2.33it/s] Training 1/1 epoch (loss 1.4685): 42%|████▏ | 393/938 [02:48<03:54, 2.32it/s] Training 1/1 epoch (loss 1.5751): 42%|████▏ | 393/938 [02:49<03:54, 2.32it/s] Training 1/1 epoch (loss 1.5751): 42%|████▏ | 394/938 [02:49<03:44, 2.42it/s] Training 1/1 epoch (loss 1.4699): 42%|████▏ | 394/938 [02:49<03:44, 2.42it/s] Training 1/1 epoch (loss 1.4699): 42%|████▏ | 395/938 [02:49<03:38, 2.48it/s] Training 1/1 epoch (loss 1.4325): 42%|████▏ | 395/938 [02:49<03:38, 2.48it/s] Training 1/1 epoch (loss 1.4325): 42%|████▏ | 396/938 [02:49<03:32, 2.55it/s] Training 1/1 epoch (loss 1.4216): 42%|████▏ | 396/938 [02:50<03:32, 2.55it/s] Training 1/1 epoch (loss 1.4216): 42%|████▏ | 397/938 [02:50<03:35, 2.51it/s] Training 1/1 epoch (loss 1.4322): 42%|████▏ | 397/938 [02:50<03:35, 2.51it/s] Training 1/1 epoch (loss 1.4322): 42%|████▏ | 398/938 [02:50<03:30, 2.57it/s] Training 1/1 epoch (loss 1.4677): 42%|████▏ | 398/938 [02:51<03:30, 2.57it/s] Training 1/1 epoch (loss 1.4677): 43%|████▎ | 399/938 [02:51<03:29, 2.58it/s] Training 1/1 epoch (loss 1.5291): 43%|████▎ | 399/938 [02:51<03:29, 2.58it/s] Training 1/1 epoch (loss 1.5291): 43%|████▎ | 400/938 
[02:51<03:34, 2.51it/s] Training 1/1 epoch (loss 1.4989): 43%|████▎ | 400/938 [02:51<03:34, 2.51it/s] Training 1/1 epoch (loss 1.4989): 43%|████▎ | 401/938 [02:51<03:30, 2.56it/s] Training 1/1 epoch (loss 1.5097): 43%|████▎ | 401/938 [02:52<03:30, 2.56it/s] Training 1/1 epoch (loss 1.5097): 43%|████▎ | 402/938 [02:52<03:30, 2.55it/s] Training 1/1 epoch (loss 1.4836): 43%|████▎ | 402/938 [02:52<03:30, 2.55it/s] Training 1/1 epoch (loss 1.4836): 43%|████▎ | 403/938 [02:52<03:36, 2.47it/s] Training 1/1 epoch (loss 1.4746): 43%|████▎ | 403/938 [02:53<03:36, 2.47it/s] Training 1/1 epoch (loss 1.4746): 43%|████▎ | 404/938 [02:53<03:32, 2.52it/s] Training 1/1 epoch (loss 1.3487): 43%|████▎ | 404/938 [02:53<03:32, 2.52it/s] Training 1/1 epoch (loss 1.3487): 43%|████▎ | 405/938 [02:53<03:30, 2.53it/s] Training 1/1 epoch (loss 1.5845): 43%|████▎ | 405/938 [02:53<03:30, 2.53it/s] Training 1/1 epoch (loss 1.5845): 43%|████▎ | 406/938 [02:53<03:30, 2.53it/s] Training 1/1 epoch (loss 1.5436): 43%|████▎ | 406/938 [02:54<03:30, 2.53it/s] Training 1/1 epoch (loss 1.5436): 43%|████▎ | 407/938 [02:54<03:36, 2.45it/s] Training 1/1 epoch (loss 1.4320): 43%|████▎ | 407/938 [02:54<03:36, 2.45it/s] Training 1/1 epoch (loss 1.4320): 43%|████▎ | 408/938 [02:54<03:35, 2.46it/s] Training 1/1 epoch (loss 1.4859): 43%|████▎ | 408/938 [02:55<03:35, 2.46it/s] Training 1/1 epoch (loss 1.4859): 44%|████▎ | 409/938 [02:55<03:37, 2.43it/s] Training 1/1 epoch (loss 1.4669): 44%|████▎ | 409/938 [02:55<03:37, 2.43it/s] Training 1/1 epoch (loss 1.4669): 44%|████▎ | 410/938 [02:55<03:41, 2.39it/s] Training 1/1 epoch (loss 1.3210): 44%|████▎ | 410/938 [02:55<03:41, 2.39it/s] Training 1/1 epoch (loss 1.3210): 44%|████▍ | 411/938 [02:55<03:32, 2.48it/s] Training 1/1 epoch (loss 1.4350): 
44%|████▍ | 411/938 [02:56<03:32, 2.48it/s] Training 1/1 epoch (loss 1.4350): 44%|████▍ | 412/938 [02:56<03:40, 2.38it/s] Training 1/1 epoch (loss 1.3684): 44%|████▍ | 412/938 [02:56<03:40, 2.38it/s] Training 1/1 epoch (loss 1.3684): 44%|████▍ | 413/938 [02:56<03:29, 2.50it/s] Training 1/1 epoch (loss 1.4595): 44%|████▍ | 413/938 [02:57<03:29, 2.50it/s] Training 1/1 epoch (loss 1.4595): 44%|████▍ | 414/938 [02:57<03:27, 2.53it/s] Training 1/1 epoch (loss 1.3959): 44%|████▍ | 414/938 [02:57<03:27, 2.53it/s] Training 1/1 epoch (loss 1.3959): 44%|████▍ | 415/938 [02:57<03:26, 2.54it/s] Training 1/1 epoch (loss 1.3413): 44%|████▍ | 415/938 [02:57<03:26, 2.54it/s] Training 1/1 epoch (loss 1.3413): 44%|████▍ | 416/938 [02:57<03:22, 2.57it/s] Training 1/1 epoch (loss 1.3607): 44%|████▍ | 416/938 [02:58<03:22, 2.57it/s] Training 1/1 epoch (loss 1.3607): 44%|████▍ | 417/938 [02:58<03:25, 2.54it/s] Training 1/1 epoch (loss 1.4108): 44%|████▍ | 417/938 [02:58<03:25, 2.54it/s] Training 1/1 epoch (loss 1.4108): 45%|████▍ | 418/938 [02:58<03:23, 2.56it/s] Training 1/1 epoch (loss 1.5189): 45%|████▍ | 418/938 [02:59<03:23, 2.56it/s] Training 1/1 epoch (loss 1.5189): 45%|████▍ | 419/938 [02:59<03:21, 2.57it/s] Training 1/1 epoch (loss 1.5435): 45%|████▍ | 419/938 [02:59<03:21, 2.57it/s] Training 1/1 epoch (loss 1.5435): 45%|████▍ | 420/938 [02:59<03:49, 2.26it/s] Training 1/1 epoch (loss 1.3876): 45%|████▍ | 420/938 [03:00<03:49, 2.26it/s] Training 1/1 epoch (loss 1.3876): 45%|████▍ | 421/938 [03:00<03:46, 2.29it/s] Training 1/1 epoch (loss 1.3856): 45%|████▍ | 421/938 [03:00<03:46, 2.29it/s] Training 1/1 epoch (loss 1.3856): 45%|████▍ | 422/938 [03:00<03:42, 2.32it/s] Training 1/1 epoch (loss 1.3910): 45%|████▍ | 422/938 [03:00<03:42, 2.32it/s] Training 
1/1 epoch (loss 1.3910): 45%|████▌ | 423/938 [03:00<03:33, 2.41it/s] Training 1/1 epoch (loss 1.3055): 45%|████▌ | 423/938 [03:01<03:33, 2.41it/s] Training 1/1 epoch (loss 1.3055): 45%|████▌ | 424/938 [03:01<03:30, 2.44it/s] Training 1/1 epoch (loss 1.4462): 45%|████▌ | 424/938 [03:01<03:30, 2.44it/s] Training 1/1 epoch (loss 1.4462): 45%|████▌ | 425/938 [03:01<03:30, 2.44it/s] Training 1/1 epoch (loss 1.4751): 45%|████▌ | 425/938 [03:02<03:30, 2.44it/s] Training 1/1 epoch (loss 1.4751): 45%|████▌ | 426/938 [03:02<03:30, 2.43it/s] Training 1/1 epoch (loss 1.4718): 45%|████▌ | 426/938 [03:02<03:30, 2.43it/s] Training 1/1 epoch (loss 1.4718): 46%|████▌ | 427/938 [03:02<03:22, 2.52it/s] Training 1/1 epoch (loss 1.3801): 46%|████▌ | 427/938 [03:02<03:22, 2.52it/s] Training 1/1 epoch (loss 1.3801): 46%|████▌ | 428/938 [03:02<03:27, 2.45it/s] Training 1/1 epoch (loss 1.4292): 46%|████▌ | 428/938 [03:03<03:27, 2.45it/s] Training 1/1 epoch (loss 1.4292): 46%|████▌ | 429/938 [03:03<03:22, 2.51it/s] Training 1/1 epoch (loss 1.4506): 46%|████▌ | 429/938 [03:03<03:22, 2.51it/s] Training 1/1 epoch (loss 1.4506): 46%|████▌ | 430/938 [03:03<03:24, 2.48it/s] Training 1/1 epoch (loss 1.3936): 46%|████▌ | 430/938 [03:04<03:24, 2.48it/s] Training 1/1 epoch (loss 1.3936): 46%|████▌ | 431/938 [03:04<03:20, 2.53it/s] Training 1/1 epoch (loss 1.4919): 46%|████▌ | 431/938 [03:04<03:20, 2.53it/s] Training 1/1 epoch (loss 1.4919): 46%|████▌ | 432/938 [03:04<03:25, 2.47it/s] Training 1/1 epoch (loss 1.4207): 46%|████▌ | 432/938 [03:05<03:25, 2.47it/s] Training 1/1 epoch (loss 1.4207): 46%|████▌ | 433/938 [03:05<03:45, 2.24it/s] Training 1/1 epoch (loss 1.5332): 46%|████▌ | 433/938 [03:05<03:45, 2.24it/s] Training 1/1 epoch (loss 1.5332): 46%|████▋ | 434/938 
[03:05<03:59, 2.11it/s] Training 1/1 epoch (loss 1.4357): 46%|████▋ | 434/938 [03:06<03:59, 2.11it/s] Training 1/1 epoch (loss 1.4357): 46%|████▋ | 435/938 [03:06<03:52, 2.17it/s] Training 1/1 epoch (loss 1.5064): 46%|████▋ | 435/938 [03:06<03:52, 2.17it/s] Training 1/1 epoch (loss 1.5064): 46%|████▋ | 436/938 [03:06<03:50, 2.18it/s] Training 1/1 epoch (loss 1.5057): 46%|████▋ | 436/938 [03:06<03:50, 2.18it/s] Training 1/1 epoch (loss 1.5057): 47%|████▋ | 437/938 [03:06<03:40, 2.27it/s] Training 1/1 epoch (loss 1.3359): 47%|████▋ | 437/938 [03:07<03:40, 2.27it/s] Training 1/1 epoch (loss 1.3359): 47%|████▋ | 438/938 [03:07<03:27, 2.41it/s] Training 1/1 epoch (loss 1.4965): 47%|████▋ | 438/938 [03:07<03:27, 2.41it/s] Training 1/1 epoch (loss 1.4965): 47%|████▋ | 439/938 [03:07<03:19, 2.50it/s] Training 1/1 epoch (loss 1.4540): 47%|████▋ | 439/938 [03:07<03:19, 2.50it/s] Training 1/1 epoch (loss 1.4540): 47%|████▋ | 440/938 [03:07<03:17, 2.52it/s] Training 1/1 epoch (loss 1.4493): 47%|████▋ | 440/938 [03:08<03:17, 2.52it/s] Training 1/1 epoch (loss 1.4493): 47%|████▋ | 441/938 [03:08<03:18, 2.50it/s] Training 1/1 epoch (loss 1.3906): 47%|████▋ | 441/938 [03:08<03:18, 2.50it/s] Training 1/1 epoch (loss 1.3906): 47%|████▋ | 442/938 [03:08<03:15, 2.54it/s] Training 1/1 epoch (loss 1.4740): 47%|████▋ | 442/938 [03:09<03:15, 2.54it/s] Training 1/1 epoch (loss 1.4740): 47%|████▋ | 443/938 [03:09<03:11, 2.58it/s] Training 1/1 epoch (loss 1.5165): 47%|████▋ | 443/938 [03:09<03:11, 2.58it/s] Training 1/1 epoch (loss 1.5165): 47%|████▋ | 444/938 [03:09<03:12, 2.56it/s] Training 1/1 epoch (loss 1.2764): 47%|████▋ | 444/938 [03:09<03:12, 2.56it/s] Training 1/1 epoch (loss 1.2764): 47%|████▋ | 445/938 [03:09<03:20, 2.46it/s] Training 1/1 epoch (loss 1.3626): 
47%|████▋ | 445/938 [03:10<03:20, 2.46it/s] Training 1/1 epoch (loss 1.3626): 48%|████▊ | 446/938 [03:10<03:22, 2.43it/s] Training 1/1 epoch (loss 1.5081): 48%|████▊ | 446/938 [03:10<03:22, 2.43it/s] Training 1/1 epoch (loss 1.5081): 48%|████▊ | 447/938 [03:10<03:15, 2.51it/s] Training 1/1 epoch (loss 1.4538): 48%|████▊ | 447/938 [03:11<03:15, 2.51it/s] Training 1/1 epoch (loss 1.4538): 48%|████▊ | 448/938 [03:11<03:13, 2.53it/s] Training 1/1 epoch (loss 1.5580): 48%|████▊ | 448/938 [03:11<03:13, 2.53it/s] Training 1/1 epoch (loss 1.5580): 48%|████▊ | 449/938 [03:11<03:26, 2.37it/s] Training 1/1 epoch (loss 1.2832): 48%|████▊ | 449/938 [03:12<03:26, 2.37it/s] Training 1/1 epoch (loss 1.2832): 48%|████▊ | 450/938 [03:12<03:22, 2.41it/s] Training 1/1 epoch (loss 1.3934): 48%|████▊ | 450/938 [03:12<03:22, 2.41it/s] Training 1/1 epoch (loss 1.3934): 48%|████▊ | 451/938 [03:12<03:23, 2.39it/s] Training 1/1 epoch (loss 1.4359): 48%|████▊ | 451/938 [03:12<03:23, 2.39it/s] Training 1/1 epoch (loss 1.4359): 48%|████▊ | 452/938 [03:12<03:23, 2.38it/s] Training 1/1 epoch (loss 1.4446): 48%|████▊ | 452/938 [03:13<03:23, 2.38it/s] Training 1/1 epoch (loss 1.4446): 48%|████▊ | 453/938 [03:13<03:14, 2.50it/s] Training 1/1 epoch (loss 1.3829): 48%|████▊ | 453/938 [03:13<03:14, 2.50it/s] Training 1/1 epoch (loss 1.3829): 48%|████▊ | 454/938 [03:13<03:08, 2.56it/s] Training 1/1 epoch (loss 1.4857): 48%|████▊ | 454/938 [03:14<03:08, 2.56it/s] Training 1/1 epoch (loss 1.4857): 49%|████▊ | 455/938 [03:14<03:07, 2.57it/s] Training 1/1 epoch (loss 1.4338): 49%|████▊ | 455/938 [03:14<03:07, 2.57it/s] Training 1/1 epoch (loss 1.4338): 49%|████▊ | 456/938 [03:14<03:10, 2.53it/s] Training 1/1 epoch (loss 1.5224): 49%|████▊ | 456/938 [03:14<03:10, 2.53it/s] Training 
1/1 epoch (loss 1.5224): 49%|████▊ | 457/938 [03:14<03:11, 2.51it/s] Training 1/1 epoch (loss 1.5185): 49%|████▊ | 457/938 [03:15<03:11, 2.51it/s] Training 1/1 epoch (loss 1.5185): 49%|████▉ | 458/938 [03:15<03:17, 2.43it/s] Training 1/1 epoch (loss 1.5330): 49%|████▉ | 458/938 [03:15<03:17, 2.43it/s] Training 1/1 epoch (loss 1.5330): 49%|████▉ | 459/938 [03:15<03:16, 2.44it/s] Training 1/1 epoch (loss 1.4590): 49%|████▉ | 459/938 [03:16<03:16, 2.44it/s] Training 1/1 epoch (loss 1.4590): 49%|████▉ | 460/938 [03:16<03:21, 2.37it/s] Training 1/1 epoch (loss 1.4250): 49%|████▉ | 460/938 [03:16<03:21, 2.37it/s] Training 1/1 epoch (loss 1.4250): 49%|████▉ | 461/938 [03:16<03:25, 2.32it/s] Training 1/1 epoch (loss 1.3977): 49%|████▉ | 461/938 [03:16<03:25, 2.32it/s] Training 1/1 epoch (loss 1.3977): 49%|████▉ | 462/938 [03:16<03:17, 2.41it/s] Training 1/1 epoch (loss 1.3802): 49%|████▉ | 462/938 [03:17<03:17, 2.41it/s] Training 1/1 epoch (loss 1.3802): 49%|████▉ | 463/938 [03:17<03:16, 2.42it/s] Training 1/1 epoch (loss 1.4555): 49%|████▉ | 463/938 [03:17<03:16, 2.42it/s] Training 1/1 epoch (loss 1.4555): 49%|████▉ | 464/938 [03:17<03:12, 2.46it/s] Training 1/1 epoch (loss 1.4169): 49%|████▉ | 464/938 [03:18<03:12, 2.46it/s] Training 1/1 epoch (loss 1.4169): 50%|████▉ | 465/938 [03:18<03:22, 2.34it/s] Training 1/1 epoch (loss 1.5026): 50%|████▉ | 465/938 [03:18<03:22, 2.34it/s] Training 1/1 epoch (loss 1.5026): 50%|████▉ | 466/938 [03:18<03:21, 2.34it/s] Training 1/1 epoch (loss 1.5275): 50%|████▉ | 466/938 [03:19<03:21, 2.34it/s] Training 1/1 epoch (loss 1.5275): 50%|████▉ | 467/938 [03:19<03:42, 2.11it/s] Training 1/1 epoch (loss 1.5689): 50%|████▉ | 467/938 [03:19<03:42, 2.11it/s] Training 1/1 epoch (loss 1.5689): 50%|████▉ | 468/938 
[03:19<03:28, 2.25it/s] Training 1/1 epoch (loss 1.5137): 50%|████▉ | 468/938 [03:19<03:28, 2.25it/s] Training 1/1 epoch (loss 1.5137): 50%|█████ | 469/938 [03:19<03:16, 2.39it/s] Training 1/1 epoch (loss 1.3367): 50%|█████ | 469/938 [03:20<03:16, 2.39it/s] Training 1/1 epoch (loss 1.3367): 50%|█████ | 470/938 [03:20<03:27, 2.25it/s] Training 1/1 epoch (loss 1.4369): 50%|█████ | 470/938 [03:20<03:27, 2.25it/s] Training 1/1 epoch (loss 1.4369): 50%|█████ | 471/938 [03:20<03:17, 2.37it/s] Training 1/1 epoch (loss 1.4782): 50%|█████ | 471/938 [03:21<03:17, 2.37it/s] Training 1/1 epoch (loss 1.4782): 50%|█████ | 472/938 [03:21<03:13, 2.40it/s] Training 1/1 epoch (loss 1.3356): 50%|█████ | 472/938 [03:21<03:13, 2.40it/s] Training 1/1 epoch (loss 1.3356): 50%|█████ | 473/938 [03:21<03:07, 2.48it/s] Training 1/1 epoch (loss 1.5026): 50%|█████ | 473/938 [03:21<03:07, 2.48it/s] Training 1/1 epoch (loss 1.5026): 51%|█████ | 474/938 [03:21<03:02, 2.55it/s] Training 1/1 epoch (loss 1.3640): 51%|█████ | 474/938 [03:22<03:02, 2.55it/s] Training 1/1 epoch (loss 1.3640): 51%|█████ | 475/938 [03:22<02:56, 2.62it/s] Training 1/1 epoch (loss 1.4038): 51%|█████ | 475/938 [03:22<02:56, 2.62it/s] Training 1/1 epoch (loss 1.4038): 51%|█████ | 476/938 [03:22<03:01, 2.54it/s] Training 1/1 epoch (loss 1.4088): 51%|█████ | 476/938 [03:23<03:01, 2.54it/s] Training 1/1 epoch (loss 1.4088): 51%|█████ | 477/938 [03:23<03:07, 2.45it/s] Training 1/1 epoch (loss 1.3746): 51%|█████ | 477/938 [03:23<03:07, 2.45it/s] Training 1/1 epoch (loss 1.3746): 51%|█████ | 478/938 [03:23<03:05, 2.48it/s] Training 1/1 epoch (loss 1.4837): 51%|█████ | 478/938 [03:23<03:05, 2.48it/s] Training 1/1 epoch (loss 1.4837): 51%|█████ | 479/938 [03:23<02:59, 2.55it/s] Training 1/1 epoch (loss 1.3984): 
51%|█████ | 479/938 [03:24<02:59, 2.55it/s] Training 1/1 epoch (loss 1.3984): 51%|█████ | 480/938 [03:24<03:07, 2.44it/s] Training 1/1 epoch (loss 1.4526): 51%|█████ | 480/938 [03:24<03:07, 2.44it/s] Training 1/1 epoch (loss 1.4526): 51%|█████▏ | 481/938 [03:24<03:08, 2.43it/s] Training 1/1 epoch (loss 1.4546): 51%|█████▏ | 481/938 [03:25<03:08, 2.43it/s] Training 1/1 epoch (loss 1.4546): 51%|█████▏ | 482/938 [03:25<03:03, 2.49it/s] Training 1/1 epoch (loss 1.3856): 51%|█████▏ | 482/938 [03:25<03:03, 2.49it/s] Training 1/1 epoch (loss 1.3856): 51%|█████▏ | 483/938 [03:25<03:04, 2.47it/s] Training 1/1 epoch (loss 1.5815): 51%|█████▏ | 483/938 [03:26<03:04, 2.47it/s] Training 1/1 epoch (loss 1.5815): 52%|█████▏ | 484/938 [03:26<03:00, 2.52it/s] Training 1/1 epoch (loss 1.5574): 52%|█████▏ | 484/938 [03:26<03:00, 2.52it/s] Training 1/1 epoch (loss 1.5574): 52%|█████▏ | 485/938 [03:26<02:55, 2.58it/s] Training 1/1 epoch (loss 1.4542): 52%|█████▏ | 485/938 [03:26<02:55, 2.58it/s] Training 1/1 epoch (loss 1.4542): 52%|█████▏ | 486/938 [03:26<02:53, 2.61it/s] Training 1/1 epoch (loss 1.4495): 52%|█████▏ | 486/938 [03:27<02:53, 2.61it/s] Training 1/1 epoch (loss 1.4495): 52%|█████▏ | 487/938 [03:27<02:54, 2.58it/s] Training 1/1 epoch (loss 1.4466): 52%|█████▏ | 487/938 [03:27<02:54, 2.58it/s] Training 1/1 epoch (loss 1.4466): 52%|█████▏ | 488/938 [03:27<02:58, 2.53it/s] Training 1/1 epoch (loss 1.4632): 52%|█████▏ | 488/938 [03:27<02:58, 2.53it/s] Training 1/1 epoch (loss 1.4632): 52%|█████▏ | 489/938 [03:27<02:52, 2.60it/s] Training 1/1 epoch (loss 1.4731): 52%|█████▏ | 489/938 [03:28<02:52, 2.60it/s] Training 1/1 epoch (loss 1.4731): 52%|█████▏ | 490/938 [03:28<02:53, 2.59it/s] Training 1/1 epoch (loss 1.5514): 
52%|█████▏ | 490/938 [03:28<02:53, 2.59it/s] Training 1/1 epoch (loss 1.5514): 52%|█████▏ | 491/938 [03:28<02:57, 2.51it/s] Training 1/1 epoch (loss 1.5310): 52%|█████▏ | 491/938 [03:29<02:57, 2.51it/s] Training 1/1 epoch (loss 1.5310): 52%|█████▏ | 492/938 [03:29<02:56, 2.53it/s] Training 1/1 epoch (loss 1.3902): 52%|█████▏ | 492/938 [03:29<02:56, 2.53it/s] Training 1/1 epoch (loss 1.3902): 53%|█████▎ | 493/938 [03:29<02:51, 2.60it/s] Training 1/1 epoch (loss 1.4367): 53%|█████▎ | 493/938 [03:29<02:51, 2.60it/s] Training 1/1 epoch (loss 1.4367): 53%|█████▎ | 494/938 [03:29<02:54, 2.54it/s] Training 1/1 epoch (loss 1.4665): 53%|█████▎ | 494/938 [03:30<02:54, 2.54it/s] Training 1/1 epoch (loss 1.4665): 53%|█████▎ | 495/938 [03:30<02:53, 2.56it/s] Training 1/1 epoch (loss 1.3390): 53%|█████▎ | 495/938 [03:30<02:53, 2.56it/s] Training 1/1 epoch (loss 1.3390): 53%|█████▎ | 496/938 [03:30<03:01, 2.43it/s] Training 1/1 epoch (loss 1.4904): 53%|█████▎ | 496/938 [03:31<03:01, 2.43it/s] Training 1/1 epoch (loss 1.4904): 53%|█████▎ | 497/938 [03:31<03:05, 2.38it/s] Training 1/1 epoch (loss 1.4114): 53%|█████▎ | 497/938 [03:31<03:05, 2.38it/s] Training 1/1 epoch (loss 1.4114): 53%|█████▎ | 498/938 [03:31<02:59, 2.45it/s] Training 1/1 epoch (loss 1.3625): 53%|█████▎ | 498/938 [03:31<02:59, 2.45it/s] Training 1/1 epoch (loss 1.3625): 53%|█████▎ | 499/938 [03:31<02:52, 2.55it/s] Training 1/1 epoch (loss 1.3316): 53%|█████▎ | 499/938 [03:32<02:52, 2.55it/s] Training 1/1 epoch (loss 1.3316): 53%|█████▎ | 500/938 [03:32<02:50, 2.56it/s] Training 1/1 epoch (loss 1.5156): 53%|█████▎ | 500/938 [03:32<02:50, 2.56it/s] Training 1/1 epoch (loss 1.5156): 53%|█████▎ | 501/938 [03:32<02:50, 2.57it/s] Training 1/1 epoch (loss 
1.5175): 53%|█████▎ | 501/938 [03:33<02:50, 2.57it/s] Training 1/1 epoch (loss 1.5175): 54%|█████▎ | 502/938 [03:33<02:51, 2.54it/s] Training 1/1 epoch (loss 1.4548): 54%|█████▎ | 502/938 [03:33<02:51, 2.54it/s] Training 1/1 epoch (loss 1.4548): 54%|█████▎ | 503/938 [03:33<02:48, 2.58it/s] Training 1/1 epoch (loss 1.3606): 54%|█████▎ | 503/938 [03:33<02:48, 2.58it/s] Training 1/1 epoch (loss 1.3606): 54%|█████▎ | 504/938 [03:33<02:47, 2.59it/s] Training 1/1 epoch (loss 1.4042): 54%|█████▎ | 504/938 [03:34<02:47, 2.59it/s] Training 1/1 epoch (loss 1.4042): 54%|█████▍ | 505/938 [03:34<02:48, 2.56it/s] Training 1/1 epoch (loss 1.4014): 54%|█████▍ | 505/938 [03:34<02:48, 2.56it/s] Training 1/1 epoch (loss 1.4014): 54%|█████▍ | 506/938 [03:34<02:50, 2.53it/s] Training 1/1 epoch (loss 1.4774): 54%|█████▍ | 506/938 [03:35<02:50, 2.53it/s] Training 1/1 epoch (loss 1.4774): 54%|█████▍ | 507/938 [03:35<03:16, 2.19it/s] Training 1/1 epoch (loss 1.4478): 54%|█████▍ | 507/938 [03:35<03:16, 2.19it/s] Training 1/1 epoch (loss 1.4478): 54%|█████▍ | 508/938 [03:35<03:19, 2.16it/s] Training 1/1 epoch (loss 1.4450): 54%|█████▍ | 508/938 [03:36<03:19, 2.16it/s] Training 1/1 epoch (loss 1.4450): 54%|█████▍ | 509/938 [03:36<03:33, 2.01it/s] Training 1/1 epoch (loss 1.4957): 54%|█████▍ | 509/938 [03:36<03:33, 2.01it/s] Training 1/1 epoch (loss 1.4957): 54%|█████▍ | 510/938 [03:36<03:19, 2.14it/s] Training 1/1 epoch (loss 1.6024): 54%|█████▍ | 510/938 [03:37<03:19, 2.14it/s] Training 1/1 epoch (loss 1.6024): 54%|█████▍ | 511/938 [03:37<03:06, 2.29it/s] Training 1/1 epoch (loss 1.3900): 54%|█████▍ | 511/938 [03:37<03:06, 2.29it/s] Training 1/1 epoch (loss 1.3900): 55%|█████▍ | 512/938 [03:37<03:00, 2.36it/s] Training 1/1 epoch 
(loss 1.4610): 55%|█████▍ | 512/938 [03:37<03:00, 2.36it/s] Training 1/1 epoch (loss 1.4610): 55%|█████▍ | 513/938 [03:37<02:54, 2.43it/s] Training 1/1 epoch (loss 1.4921): 55%|█████▍ | 513/938 [03:38<02:54, 2.43it/s] Training 1/1 epoch (loss 1.4921): 55%|█████▍ | 514/938 [03:38<02:46, 2.54it/s] Training 1/1 epoch (loss 1.3990): 55%|█████▍ | 514/938 [03:38<02:46, 2.54it/s] Training 1/1 epoch (loss 1.3990): 55%|█████▍ | 515/938 [03:38<02:52, 2.45it/s] Training 1/1 epoch (loss 1.4454): 55%|█████▍ | 515/938 [03:39<02:52, 2.45it/s] Training 1/1 epoch (loss 1.4454): 55%|█████▌ | 516/938 [03:39<02:51, 2.46it/s] Training 1/1 epoch (loss 1.4819): 55%|█████▌ | 516/938 [03:39<02:51, 2.46it/s] Training 1/1 epoch (loss 1.4819): 55%|█████▌ | 517/938 [03:39<02:47, 2.51it/s] Training 1/1 epoch (loss 1.4307): 55%|█████▌ | 517/938 [03:39<02:47, 2.51it/s] Training 1/1 epoch (loss 1.4307): 55%|█████▌ | 518/938 [03:39<02:42, 2.59it/s] Training 1/1 epoch (loss 1.5075): 55%|█████▌ | 518/938 [03:40<02:42, 2.59it/s] Training 1/1 epoch (loss 1.5075): 55%|█████▌ | 519/938 [03:40<02:40, 2.61it/s] Training 1/1 epoch (loss 1.5315): 55%|█████▌ | 519/938 [03:40<02:40, 2.61it/s] Training 1/1 epoch (loss 1.5315): 55%|█████▌ | 520/938 [03:40<02:57, 2.35it/s] Training 1/1 epoch (loss 1.3255): 55%|█████▌ | 520/938 [03:41<02:57, 2.35it/s] Training 1/1 epoch (loss 1.3255): 56%|█████▌ | 521/938 [03:41<02:52, 2.42it/s] Training 1/1 epoch (loss 1.4286): 56%|█████▌ | 521/938 [03:41<02:52, 2.42it/s] Training 1/1 epoch (loss 1.4286): 56%|█████▌ | 522/938 [03:41<02:48, 2.46it/s] Training 1/1 epoch (loss 1.4392): 56%|█████▌ | 522/938 [03:41<02:48, 2.46it/s] Training 1/1 epoch (loss 1.4392): 56%|█████▌ | 523/938 [03:41<02:50, 2.44it/s] Training 1/1 
epoch (loss 1.3217): 56%|█████▌ | 523/938 [03:42<02:50, 2.44it/s] Training 1/1 epoch (loss 1.3217): 56%|█████▌ | 524/938 [03:42<02:42, 2.55it/s] Training 1/1 epoch (loss 1.4022): 56%|█████▌ | 524/938 [03:42<02:42, 2.55it/s] Training 1/1 epoch (loss 1.4022): 56%|█████▌ | 525/938 [03:42<02:40, 2.58it/s] Training 1/1 epoch (loss 1.4504): 56%|█████▌ | 525/938 [03:42<02:40, 2.58it/s] Training 1/1 epoch (loss 1.4504): 56%|█████▌ | 526/938 [03:42<02:39, 2.58it/s] Training 1/1 epoch (loss 1.4327): 56%|█████▌ | 526/938 [03:43<02:39, 2.58it/s] Training 1/1 epoch (loss 1.4327): 56%|█████▌ | 527/938 [03:43<02:42, 2.53it/s] Training 1/1 epoch (loss 1.4568): 56%|█████▌ | 527/938 [03:43<02:42, 2.53it/s] Training 1/1 epoch (loss 1.4568): 56%|█████▋ | 528/938 [03:43<02:41, 2.54it/s] Training 1/1 epoch (loss 1.3612): 56%|█████▋ | 528/938 [03:44<02:41, 2.54it/s] Training 1/1 epoch (loss 1.3612): 56%|█████▋ | 529/938 [03:44<02:40, 2.55it/s] Training 1/1 epoch (loss 1.4102): 56%|█████▋ | 529/938 [03:44<02:40, 2.55it/s] Training 1/1 epoch (loss 1.4102): 57%|█████▋ | 530/938 [03:44<02:39, 2.56it/s] Training 1/1 epoch (loss 1.3848): 57%|█████▋ | 530/938 [03:45<02:39, 2.56it/s] Training 1/1 epoch (loss 1.3848): 57%|█████▋ | 531/938 [03:45<02:51, 2.37it/s] Training 1/1 epoch (loss 1.5194): 57%|█████▋ | 531/938 [03:45<02:51, 2.37it/s] Training 1/1 epoch (loss 1.5194): 57%|█████▋ | 532/938 [03:45<02:50, 2.38it/s] Training 1/1 epoch (loss 1.4782): 57%|█████▋ | 532/938 [03:45<02:50, 2.38it/s] Training 1/1 epoch (loss 1.4782): 57%|█████▋ | 533/938 [03:45<02:44, 2.47it/s] Training 1/1 epoch (loss 1.4074): 57%|█████▋ | 533/938 [03:46<02:44, 2.47it/s] Training 1/1 epoch (loss 1.4074): 57%|█████▋ | 534/938 [03:46<02:46, 2.43it/s] Training 
1/1 epoch (loss 1.3901): 57%|█████▋ | 534/938 [03:46<02:46, 2.43it/s] Training 1/1 epoch (loss 1.3901): 57%|█████▋ | 535/938 [03:46<02:50, 2.36it/s] Training 1/1 epoch (loss 1.3537): 57%|█████▋ | 535/938 [03:47<02:50, 2.36it/s] Training 1/1 epoch (loss 1.3537): 57%|█████▋ | 536/938 [03:47<02:50, 2.35it/s] Training 1/1 epoch (loss 1.4780): 57%|█████▋ | 536/938 [03:47<02:50, 2.35it/s] Training 1/1 epoch (loss 1.4780): 57%|█████▋ | 537/938 [03:47<02:47, 2.40it/s] Training 1/1 epoch (loss 1.3681): 57%|█████▋ | 537/938 [03:47<02:47, 2.40it/s] Training 1/1 epoch (loss 1.3681): 57%|█████▋ | 538/938 [03:47<02:38, 2.52it/s] Training 1/1 epoch (loss 1.4344): 57%|█████▋ | 538/938 [03:48<02:38, 2.52it/s] Training 1/1 epoch (loss 1.4344): 57%|█████▋ | 539/938 [03:48<02:50, 2.35it/s] Training 1/1 epoch (loss 1.4409): 57%|█████▋ | 539/938 [03:48<02:50, 2.35it/s] Training 1/1 epoch (loss 1.4409): 58%|█████▊ | 540/938 [03:48<02:48, 2.36it/s] Training 1/1 epoch (loss 1.3000): 58%|█████▊ | 540/938 [03:49<02:48, 2.36it/s] Training 1/1 epoch (loss 1.3000): 58%|█████▊ | 541/938 [03:49<02:41, 2.45it/s] Training 1/1 epoch (loss 1.4010): 58%|█████▊ | 541/938 [03:49<02:41, 2.45it/s] Training 1/1 epoch (loss 1.4010): 58%|█████▊ | 542/938 [03:49<02:43, 2.43it/s] Training 1/1 epoch (loss 1.4578): 58%|█████▊ | 542/938 [03:49<02:43, 2.43it/s] Training 1/1 epoch (loss 1.4578): 58%|█████▊ | 543/938 [03:49<02:38, 2.49it/s] Training 1/1 epoch (loss 1.3741): 58%|█████▊ | 543/938 [03:50<02:38, 2.49it/s] Training 1/1 epoch (loss 1.3741): 58%|█████▊ | 544/938 [03:50<02:37, 2.50it/s] Training 1/1 epoch (loss 1.3385): 58%|█████▊ | 544/938 [03:50<02:37, 2.50it/s] Training 1/1 epoch (loss 1.3385): 58%|█████▊ | 545/938 [03:50<02:39, 2.47it/s] 
Training 1/1 epoch (loss 1.3472): 58%|█████▊ | 545/938 [03:51<02:39, 2.47it/s] Training 1/1 epoch (loss 1.3472): 58%|█████▊ | 546/938 [03:51<02:36, 2.51it/s] Training 1/1 epoch (loss 1.3616): 58%|█████▊ | 546/938 [03:51<02:36, 2.51it/s] Training 1/1 epoch (loss 1.3616): 58%|█████▊ | 547/938 [03:51<02:41, 2.42it/s] Training 1/1 epoch (loss 1.5096): 58%|█████▊ | 547/938 [03:52<02:41, 2.42it/s] Training 1/1 epoch (loss 1.5096): 58%|█████▊ | 548/938 [03:52<02:38, 2.46it/s] Training 1/1 epoch (loss 1.3692): 58%|█████▊ | 548/938 [03:52<02:38, 2.46it/s] Training 1/1 epoch (loss 1.3692): 59%|█████▊ | 549/938 [03:52<02:41, 2.41it/s] Training 1/1 epoch (loss 1.4574): 59%|█████▊ | 549/938 [03:52<02:41, 2.41it/s] Training 1/1 epoch (loss 1.4574): 59%|█████▊ | 550/938 [03:52<02:38, 2.45it/s] Training 1/1 epoch (loss 1.4303): 59%|█████▊ | 550/938 [03:53<02:38, 2.45it/s] Training 1/1 epoch (loss 1.4303): 59%|█████▊ | 551/938 [03:53<02:35, 2.49it/s] Training 1/1 epoch (loss 1.4070): 59%|█████▊ | 551/938 [03:53<02:35, 2.49it/s] Training 1/1 epoch (loss 1.4070): 59%|█████▉ | 552/938 [03:53<02:40, 2.40it/s] Training 1/1 epoch (loss 1.4771): 59%|█████▉ | 552/938 [03:54<02:40, 2.40it/s] Training 1/1 epoch (loss 1.4771): 59%|█████▉ | 553/938 [03:54<02:35, 2.48it/s] Training 1/1 epoch (loss 1.4594): 59%|█████▉ | 553/938 [03:54<02:35, 2.48it/s] Training 1/1 epoch (loss 1.4594): 59%|█████▉ | 554/938 [03:54<02:34, 2.49it/s] Training 1/1 epoch (loss 1.3736): 59%|█████▉ | 554/938 [03:54<02:34, 2.49it/s] Training 1/1 epoch (loss 1.3736): 59%|█████▉ | 555/938 [03:54<02:36, 2.45it/s] Training 1/1 epoch (loss 1.4771): 59%|█████▉ | 555/938 [03:55<02:36, 2.45it/s] Training 1/1 epoch (loss 1.4771): 59%|█████▉ | 556/938 [03:55<02:33, 
2.48it/s] Training 1/1 epoch (loss 1.4699): 59%|█████▉ | 556/938 [03:55<02:33, 2.48it/s] Training 1/1 epoch (loss 1.4699): 59%|█████▉ | 557/938 [03:55<02:32, 2.50it/s] Training 1/1 epoch (loss 1.4556): 59%|█████▉ | 557/938 [03:56<02:32, 2.50it/s] Training 1/1 epoch (loss 1.4556): 59%|█████▉ | 558/938 [03:56<02:27, 2.57it/s] Training 1/1 epoch (loss 1.4552): 59%|█████▉ | 558/938 [03:56<02:27, 2.57it/s] Training 1/1 epoch (loss 1.4552): 60%|█████▉ | 559/938 [03:56<02:28, 2.55it/s] Training 1/1 epoch (loss 1.3534): 60%|█████▉ | 559/938 [03:56<02:28, 2.55it/s] Training 1/1 epoch (loss 1.3534): 60%|█████▉ | 560/938 [03:56<02:32, 2.48it/s] Training 1/1 epoch (loss 1.4651): 60%|█████▉ | 560/938 [03:57<02:32, 2.48it/s] Training 1/1 epoch (loss 1.4651): 60%|█████▉ | 561/938 [03:57<02:30, 2.51it/s] Training 1/1 epoch (loss 1.5284): 60%|█████▉ | 561/938 [03:57<02:30, 2.51it/s] Training 1/1 epoch (loss 1.5284): 60%|█████▉ | 562/938 [03:57<02:28, 2.53it/s] Training 1/1 epoch (loss 1.3448): 60%|█████▉ | 562/938 [03:58<02:28, 2.53it/s] Training 1/1 epoch (loss 1.3448): 60%|██████ | 563/938 [03:58<02:31, 2.47it/s] Training 1/1 epoch (loss 1.4598): 60%|██████ | 563/938 [03:58<02:31, 2.47it/s] Training 1/1 epoch (loss 1.4598): 60%|██████ | 564/938 [03:58<02:31, 2.47it/s] Training 1/1 epoch (loss 1.4176): 60%|██████ | 564/938 [03:58<02:31, 2.47it/s] Training 1/1 epoch (loss 1.4176): 60%|██████ | 565/938 [03:58<02:29, 2.50it/s] Training 1/1 epoch (loss 1.3988): 60%|██████ | 565/938 [03:59<02:29, 2.50it/s] Training 1/1 epoch (loss 1.3988): 60%|██████ | 566/938 [03:59<02:25, 2.56it/s] Training 1/1 epoch (loss 1.4618): 60%|██████ | 566/938 [03:59<02:25, 2.56it/s] Training 1/1 epoch (loss 1.4618): 60%|██████ | 567/938 
[03:59<02:22, 2.61it/s] Training 1/1 epoch (loss 1.5128): 60%|██████ | 567/938 [03:59<02:22, 2.61it/s] Training 1/1 epoch (loss 1.5128): 61%|██████ | 568/938 [03:59<02:24, 2.56it/s] Training 1/1 epoch (loss 1.5869): 61%|██████ | 568/938 [04:00<02:24, 2.56it/s] Training 1/1 epoch (loss 1.5869): 61%|██████ | 569/938 [04:00<02:23, 2.58it/s] Training 1/1 epoch (loss 1.4363): 61%|██████ | 569/938 [04:00<02:23, 2.58it/s] Training 1/1 epoch (loss 1.4363): 61%|██████ | 570/938 [04:00<02:24, 2.55it/s] Training 1/1 epoch (loss 1.4779): 61%|██████ | 570/938 [04:01<02:24, 2.55it/s] Training 1/1 epoch (loss 1.4779): 61%|██████ | 571/938 [04:01<02:23, 2.56it/s] Training 1/1 epoch (loss 1.4521): 61%|██████ | 571/938 [04:01<02:23, 2.56it/s] Training 1/1 epoch (loss 1.4521): 61%|██████ | 572/938 [04:01<02:23, 2.55it/s] Training 1/1 epoch (loss 1.4669): 61%|██████ | 572/938 [04:01<02:23, 2.55it/s] Training 1/1 epoch (loss 1.4669): 61%|██████ | 573/938 [04:01<02:20, 2.59it/s] Training 1/1 epoch (loss 1.4337): 61%|██████ | 573/938 [04:02<02:20, 2.59it/s] Training 1/1 epoch (loss 1.4337): 61%|██████ | 574/938 [04:02<02:26, 2.48it/s] Training 1/1 epoch (loss 1.4187): 61%|██████ | 574/938 [04:02<02:26, 2.48it/s] Training 1/1 epoch (loss 1.4187): 61%|██████▏ | 575/938 [04:02<02:27, 2.47it/s] Training 1/1 epoch (loss 1.3987): 61%|██████▏ | 575/938 [04:03<02:27, 2.47it/s] Training 1/1 epoch (loss 1.3987): 61%|██████▏ | 576/938 [04:03<02:29, 2.43it/s] Training 1/1 epoch (loss 1.3968): 61%|██████▏ | 576/938 [04:03<02:29, 2.43it/s] Training 1/1 epoch (loss 1.3968): 62%|██████▏ | 577/938 [04:03<02:26, 2.47it/s] Training 1/1 epoch (loss 1.5279): 62%|██████▏ | 577/938 [04:04<02:26, 2.47it/s] Training 1/1 epoch (loss 1.5279): 
62%|██████▏ | 578/938 [04:04<02:25, 2.47it/s] Training 1/1 epoch (loss 1.4752): 62%|██████▏ | 578/938 [04:04<02:25, 2.47it/s] Training 1/1 epoch (loss 1.4752): 62%|██████▏ | 579/938 [04:04<02:25, 2.47it/s] Training 1/1 epoch (loss 1.4625): 62%|██████▏ | 579/938 [04:04<02:25, 2.47it/s] Training 1/1 epoch (loss 1.4625): 62%|██████▏ | 580/938 [04:04<02:33, 2.33it/s] Training 1/1 epoch (loss 1.4332): 62%|██████▏ | 580/938 [04:05<02:33, 2.33it/s] Training 1/1 epoch (loss 1.4332): 62%|██████▏ | 581/938 [04:05<02:42, 2.20it/s] Training 1/1 epoch (loss 1.3807): 62%|██████▏ | 581/938 [04:06<02:42, 2.20it/s] Training 1/1 epoch (loss 1.3807): 62%|██████▏ | 582/938 [04:06<03:01, 1.96it/s] Training 1/1 epoch (loss 1.4794): 62%|██████▏ | 582/938 [04:06<03:01, 1.96it/s] Training 1/1 epoch (loss 1.4794): 62%|██████▏ | 583/938 [04:06<02:50, 2.09it/s] Training 1/1 epoch (loss 1.3210): 62%|██████▏ | 583/938 [04:06<02:50, 2.09it/s] Training 1/1 epoch (loss 1.3210): 62%|██████▏ | 584/938 [04:06<02:43, 2.16it/s] Training 1/1 epoch (loss 1.3330): 62%|██████▏ | 584/938 [04:07<02:43, 2.16it/s] Training 1/1 epoch (loss 1.3330): 62%|██████▏ | 585/938 [04:07<02:33, 2.30it/s] Training 1/1 epoch (loss 1.4809): 62%|██████▏ | 585/938 [04:07<02:33, 2.30it/s] Training 1/1 epoch (loss 1.4809): 62%|██████▏ | 586/938 [04:07<02:29, 2.35it/s] Training 1/1 epoch (loss 1.4550): 62%|██████▏ | 586/938 [04:08<02:29, 2.35it/s] Training 1/1 epoch (loss 1.4550): 63%|██████▎ | 587/938 [04:08<02:28, 2.37it/s] Training 1/1 epoch (loss 1.4898): 63%|██████▎ | 587/938 [04:08<02:28, 2.37it/s] Training 1/1 epoch (loss 1.4898): 63%|██████▎ | 588/938 [04:08<02:22, 2.45it/s] Training 1/1 epoch (loss 1.4710): 63%|██████▎ 
| 588/938 [04:08<02:22, 2.45it/s] Training 1/1 epoch (loss 1.4710): 63%|██████▎ | 589/938 [04:08<02:19, 2.50it/s] Training 1/1 epoch (loss 1.4837): 63%|██████▎ | 589/938 [04:09<02:19, 2.50it/s] Training 1/1 epoch (loss 1.4837): 63%|██████▎ | 590/938 [04:09<02:18, 2.52it/s] Training 1/1 epoch (loss 1.4034): 63%|██████▎ | 590/938 [04:09<02:18, 2.52it/s] Training 1/1 epoch (loss 1.4034): 63%|██████▎ | 591/938 [04:09<02:16, 2.55it/s] Training 1/1 epoch (loss 1.3146): 63%|██████▎ | 591/938 [04:10<02:16, 2.55it/s] Training 1/1 epoch (loss 1.3146): 63%|██████▎ | 592/938 [04:10<02:20, 2.45it/s] Training 1/1 epoch (loss 1.5097): 63%|██████▎ | 592/938 [04:10<02:20, 2.45it/s] Training 1/1 epoch (loss 1.5097): 63%|██████▎ | 593/938 [04:10<02:19, 2.48it/s] Training 1/1 epoch (loss 1.4368): 63%|██████▎ | 593/938 [04:10<02:19, 2.48it/s] Training 1/1 epoch (loss 1.4368): 63%|██████▎ | 594/938 [04:10<02:19, 2.47it/s] Training 1/1 epoch (loss 1.3773): 63%|██████▎ | 594/938 [04:11<02:19, 2.47it/s] Training 1/1 epoch (loss 1.3773): 63%|██████▎ | 595/938 [04:11<02:17, 2.49it/s] Training 1/1 epoch (loss 1.4964): 63%|██████▎ | 595/938 [04:11<02:17, 2.49it/s] Training 1/1 epoch (loss 1.4964): 64%|██████▎ | 596/938 [04:11<02:22, 2.41it/s] Training 1/1 epoch (loss 1.4193): 64%|██████▎ | 596/938 [04:12<02:22, 2.41it/s] Training 1/1 epoch (loss 1.4193): 64%|██████▎ | 597/938 [04:12<02:19, 2.44it/s] Training 1/1 epoch (loss 1.5206): 64%|██████▎ | 597/938 [04:12<02:19, 2.44it/s] Training 1/1 epoch (loss 1.5206): 64%|██████▍ | 598/938 [04:12<02:15, 2.50it/s] Training 1/1 epoch (loss 1.4747): 64%|██████▍ | 598/938 [04:12<02:15, 2.50it/s] Training 1/1 epoch (loss 1.4747): 64%|██████▍ | 599/938 [04:12<02:17, 
2.46it/s] Training 1/1 epoch (loss 1.3846): 64%|██████▍ | 599/938 [04:13<02:17, 2.46it/s] Training 1/1 epoch (loss 1.3846): 64%|██████▍ | 600/938 [04:13<02:16, 2.47it/s] Training 1/1 epoch (loss 1.4954): 64%|██████▍ | 600/938 [04:13<02:16, 2.47it/s] Training 1/1 epoch (loss 1.4954): 64%|██████▍ | 601/938 [04:13<02:14, 2.50it/s] Training 1/1 epoch (loss 1.3438): 64%|██████▍ | 601/938 [04:14<02:14, 2.50it/s] Training 1/1 epoch (loss 1.3438): 64%|██████▍ | 602/938 [04:14<02:14, 2.50it/s] Training 1/1 epoch (loss 1.4715): 64%|██████▍ | 602/938 [04:14<02:14, 2.50it/s] Training 1/1 epoch (loss 1.4715): 64%|██████▍ | 603/938 [04:14<02:14, 2.48it/s] Training 1/1 epoch (loss 1.4930): 64%|██████▍ | 603/938 [04:14<02:14, 2.48it/s] Training 1/1 epoch (loss 1.4930): 64%|██████▍ | 604/938 [04:14<02:20, 2.37it/s] Training 1/1 epoch (loss 1.4025): 64%|██████▍ | 604/938 [04:15<02:20, 2.37it/s] Training 1/1 epoch (loss 1.4025): 64%|██████▍ | 605/938 [04:15<02:14, 2.48it/s] Training 1/1 epoch (loss 1.4946): 64%|██████▍ | 605/938 [04:15<02:14, 2.48it/s] Training 1/1 epoch (loss 1.4946): 65%|██████▍ | 606/938 [04:15<02:16, 2.42it/s] Training 1/1 epoch (loss 1.4157): 65%|██████▍ | 606/938 [04:16<02:16, 2.42it/s] Training 1/1 epoch (loss 1.4157): 65%|██████▍ | 607/938 [04:16<02:13, 2.47it/s] Training 1/1 epoch (loss 1.5149): 65%|██████▍ | 607/938 [04:16<02:13, 2.47it/s] Training 1/1 epoch (loss 1.5149): 65%|██████▍ | 608/938 [04:16<02:17, 2.39it/s] Training 1/1 epoch (loss 1.4091): 65%|██████▍ | 608/938 [04:16<02:17, 2.39it/s] Training 1/1 epoch (loss 1.4091): 65%|██████▍ | 609/938 [04:16<02:16, 2.41it/s] Training 1/1 epoch (loss 1.3924): 65%|██████▍ | 609/938 [04:17<02:16, 2.41it/s] Training 1/1 
epoch (loss 1.3924): 65%|██████▌ | 610/938 [04:17<02:12, 2.47it/s] Training 1/1 epoch (loss 1.4786): 65%|██████▌ | 610/938 [04:17<02:12, 2.47it/s] Training 1/1 epoch (loss 1.4786): 65%|██████▌ | 611/938 [04:17<02:08, 2.55it/s] Training 1/1 epoch (loss 1.3736): 65%|██████▌ | 611/938 [04:18<02:08, 2.55it/s] Training 1/1 epoch (loss 1.3736): 65%|██████▌ | 612/938 [04:18<02:12, 2.45it/s] Training 1/1 epoch (loss 1.2647): 65%|██████▌ | 612/938 [04:18<02:12, 2.45it/s] Training 1/1 epoch (loss 1.2647): 65%|██████▌ | 613/938 [04:18<02:13, 2.44it/s] Training 1/1 epoch (loss 1.5651): 65%|██████▌ | 613/938 [04:18<02:13, 2.44it/s] Training 1/1 epoch (loss 1.5651): 65%|██████▌ | 614/938 [04:18<02:11, 2.47it/s] Training 1/1 epoch (loss 1.3193): 65%|██████▌ | 614/938 [04:19<02:11, 2.47it/s] Training 1/1 epoch (loss 1.3193): 66%|██████▌ | 615/938 [04:19<02:10, 2.47it/s] Training 1/1 epoch (loss 1.4646): 66%|██████▌ | 615/938 [04:19<02:10, 2.47it/s] Training 1/1 epoch (loss 1.4646): 66%|██████▌ | 616/938 [04:19<02:09, 2.49it/s] Training 1/1 epoch (loss 1.3352): 66%|██████▌ | 616/938 [04:20<02:09, 2.49it/s] Training 1/1 epoch (loss 1.3352): 66%|██████▌ | 617/938 [04:20<02:05, 2.56it/s] Training 1/1 epoch (loss 1.3857): 66%|██████▌ | 617/938 [04:20<02:05, 2.56it/s] Training 1/1 epoch (loss 1.3857): 66%|██████▌ | 618/938 [04:20<02:09, 2.48it/s] Training 1/1 epoch (loss 1.4323): 66%|██████▌ | 618/938 [04:20<02:09, 2.48it/s] Training 1/1 epoch (loss 1.4323): 66%|██████▌ | 619/938 [04:20<02:07, 2.51it/s] Training 1/1 epoch (loss 1.3219): 66%|██████▌ | 619/938 [04:21<02:07, 2.51it/s] Training 1/1 epoch (loss 1.3219): 66%|██████▌ | 620/938 [04:21<02:02, 2.59it/s] Training 1/1 epoch (loss 1.3018): 
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 620/938 [04:21<02:02, 2.59it/s] Training 1/1 epoch (loss 1.3018): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 621/938 [04:21<02:06, 2.51it/s] Training 1/1 epoch (loss 1.5541): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 621/938 [04:22<02:06, 2.51it/s] Training 1/1 epoch (loss 1.5541): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 622/938 [04:22<02:02, 2.58it/s] Training 1/1 epoch (loss 1.3655): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 622/938 [04:22<02:02, 2.58it/s] Training 1/1 epoch (loss 1.3655): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 623/938 [04:22<02:07, 2.47it/s] Training 1/1 epoch (loss 1.3774): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 623/938 [04:22<02:07, 2.47it/s] Training 1/1 epoch (loss 1.3774): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 624/938 [04:22<02:09, 2.43it/s] Training 1/1 epoch (loss 1.2429): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 624/938 [04:23<02:09, 2.43it/s] Training 1/1 epoch (loss 1.2429): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 625/938 [04:23<02:07, 2.46it/s] Training 1/1 epoch (loss 1.4347): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 625/938 [04:23<02:07, 2.46it/s] Training 1/1 epoch (loss 1.4347): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 626/938 [04:23<02:05, 2.49it/s] Training 1/1 epoch (loss 1.4701): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 626/938 [04:24<02:05, 2.49it/s] Training 1/1 epoch (loss 1.4701): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 627/938 [04:24<02:02, 2.55it/s] Training 1/1 epoch (loss 1.4212): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 627/938 [04:24<02:02, 2.55it/s] Training 1/1 epoch (loss 1.4212): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 628/938 [04:24<01:58, 2.61it/s] Training 1/1 epoch (loss 1.4026): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 628/938 [04:24<01:58, 2.61it/s] Training 1/1 epoch (loss 1.4026): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 629/938 [04:24<02:01, 2.54it/s] Training 1/1 epoch (loss 1.4472): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 629/938 [04:25<02:01, 2.54it/s] Training 1/1 epoch (loss 1.4472): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 630/938 [04:25<02:03, 2.50it/s] Training 1/1 epoch (loss 1.3325): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 630/938 [04:25<02:03, 2.50it/s] Training 1/1 epoch (loss 1.3325): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ 
| 631/938 [04:25<02:04, 2.47it/s] Training 1/1 epoch (loss 1.4928): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 631/938 [04:26<02:04, 2.47it/s] Training 1/1 epoch (loss 1.4928): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 632/938 [04:26<02:03, 2.48it/s] Training 1/1 epoch (loss 1.3055): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 632/938 [04:26<02:03, 2.48it/s] Training 1/1 epoch (loss 1.3055): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 633/938 [04:26<02:01, 2.51it/s] Training 1/1 epoch (loss 1.4178): 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 633/938 [04:26<02:01, 2.51it/s] Training 1/1 epoch (loss 1.4178): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 634/938 [04:26<01:59, 2.54it/s] Training 1/1 epoch (loss 1.4152): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 634/938 [04:27<01:59, 2.54it/s] Training 1/1 epoch (loss 1.4152): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 635/938 [04:27<02:04, 2.43it/s] Training 1/1 epoch (loss 1.3715): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 635/938 [04:27<02:04, 2.43it/s] Training 1/1 epoch (loss 1.3715): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 636/938 [04:27<02:00, 2.50it/s] Training 1/1 epoch (loss 1.4981): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 636/938 [04:28<02:00, 2.50it/s] Training 1/1 epoch (loss 1.4981): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 637/938 [04:28<01:58, 2.55it/s] Training 1/1 epoch (loss 1.4219): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 637/938 [04:28<01:58, 2.55it/s] Training 1/1 epoch (loss 1.4219): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 638/938 [04:28<01:56, 2.58it/s] Training 1/1 epoch (loss 1.3188): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 638/938 [04:28<01:56, 2.58it/s] Training 1/1 epoch (loss 1.3188): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 639/938 [04:28<02:01, 2.46it/s] Training 1/1 epoch (loss 1.4781): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 639/938 [04:29<02:01, 2.46it/s] Training 1/1 epoch (loss 1.4781): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 640/938 [04:29<02:01, 2.44it/s] Training 1/1 epoch (loss 1.3213): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 640/938 [04:29<02:01, 2.44it/s] Training 1/1 epoch (loss 1.3213): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 641/938 [04:29<01:57, 2.52it/s] Training 1/1 epoch (loss 1.3270): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 641/938 [04:30<01:57, 
2.52it/s] Training 1/1 epoch (loss 1.3270): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 642/938 [04:30<02:00, 2.45it/s] Training 1/1 epoch (loss 1.4478): 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 642/938 [04:30<02:00, 2.45it/s] Training 1/1 epoch (loss 1.4478): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 643/938 [04:30<01:59, 2.47it/s] Training 1/1 epoch (loss 1.3158): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 643/938 [04:30<01:59, 2.47it/s] Training 1/1 epoch (loss 1.3158): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 644/938 [04:30<01:57, 2.50it/s] Training 1/1 epoch (loss 1.4322): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 644/938 [04:31<01:57, 2.50it/s] Training 1/1 epoch (loss 1.4322): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 645/938 [04:31<01:57, 2.49it/s] Training 1/1 epoch (loss 1.4980): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 645/938 [04:31<01:57, 2.49it/s] Training 1/1 epoch (loss 1.4980): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 646/938 [04:31<01:54, 2.55it/s] Training 1/1 epoch (loss 1.4898): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 646/938 [04:32<01:54, 2.55it/s] Training 1/1 epoch (loss 1.4898): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 647/938 [04:32<01:51, 2.61it/s] Training 1/1 epoch (loss 1.4786): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 647/938 [04:32<01:51, 2.61it/s] Training 1/1 epoch (loss 1.4786): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 648/938 [04:32<01:54, 2.54it/s] Training 1/1 epoch (loss 1.4262): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 648/938 [04:32<01:54, 2.54it/s] Training 1/1 epoch (loss 1.4262): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 649/938 [04:32<01:56, 2.49it/s] Training 1/1 epoch (loss 1.3514): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 649/938 [04:33<01:56, 2.49it/s] Training 1/1 epoch (loss 1.3514): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 650/938 [04:33<01:59, 2.42it/s] Training 1/1 epoch (loss 1.4159): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 650/938 [04:33<01:59, 2.42it/s] Training 1/1 epoch (loss 1.4159): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 651/938 [04:33<01:55, 2.50it/s] Training 1/1 epoch (loss 1.3796): 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 651/938 [04:34<01:55, 2.50it/s] Training 1/1 epoch (loss 1.3796): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 652/938 [04:34<01:56, 2.46it/s] Training 1/1 
epoch (loss 1.5353): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 652/938 [04:34<01:56, 2.46it/s] Training 1/1 epoch (loss 1.5353): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 653/938 [04:34<01:54, 2.48it/s] Training 1/1 epoch (loss 1.5088): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 653/938 [04:34<01:54, 2.48it/s] Training 1/1 epoch (loss 1.5088): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 654/938 [04:34<01:53, 2.51it/s] Training 1/1 epoch (loss 1.3820): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 654/938 [04:35<01:53, 2.51it/s] Training 1/1 epoch (loss 1.3820): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 655/938 [04:35<01:59, 2.36it/s] Training 1/1 epoch (loss 1.4236): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 655/938 [04:36<01:59, 2.36it/s] Training 1/1 epoch (loss 1.4236): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 656/938 [04:36<02:17, 2.05it/s] Training 1/1 epoch (loss 1.4872): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 656/938 [04:36<02:17, 2.05it/s] Training 1/1 epoch (loss 1.4872): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 657/938 [04:36<02:12, 2.13it/s] Training 1/1 epoch (loss 1.4189): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 657/938 [04:36<02:12, 2.13it/s] Training 1/1 epoch (loss 1.4189): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 658/938 [04:36<02:06, 2.21it/s] Training 1/1 epoch (loss 1.4803): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 658/938 [04:37<02:06, 2.21it/s] Training 1/1 epoch (loss 1.4803): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 659/938 [04:37<02:00, 2.32it/s] Training 1/1 epoch (loss 1.4382): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 659/938 [04:37<02:00, 2.32it/s] Training 1/1 epoch (loss 1.4382): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 660/938 [04:37<01:57, 2.36it/s] Training 1/1 epoch (loss 1.4592): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 660/938 [04:38<01:57, 2.36it/s] Training 1/1 epoch (loss 1.4592): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 661/938 [04:38<01:54, 2.43it/s] Training 1/1 epoch (loss 1.4172): 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 661/938 [04:38<01:54, 2.43it/s] Training 1/1 epoch (loss 1.4172): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 662/938 [04:38<01:50, 2.49it/s] Training 1/1 epoch (loss 1.3271): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 662/938 [04:38<01:50, 2.49it/s] Training 1/1 epoch (loss 1.3271): 
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 663/938 [04:38<01:51, 2.46it/s] Training 1/1 epoch (loss 1.4438): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 663/938 [04:39<01:51, 2.46it/s] Training 1/1 epoch (loss 1.4438): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 664/938 [04:39<01:52, 2.44it/s] Training 1/1 epoch (loss 1.4533): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 664/938 [04:39<01:52, 2.44it/s] Training 1/1 epoch (loss 1.4533): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 665/938 [04:39<01:50, 2.47it/s] Training 1/1 epoch (loss 1.3928): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 665/938 [04:40<01:50, 2.47it/s] Training 1/1 epoch (loss 1.3928): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 666/938 [04:40<01:49, 2.47it/s] Training 1/1 epoch (loss 1.3925): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 666/938 [04:40<01:49, 2.47it/s] Training 1/1 epoch (loss 1.3925): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 667/938 [04:40<01:48, 2.49it/s] Training 1/1 epoch (loss 1.2964): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 667/938 [04:40<01:48, 2.49it/s] Training 1/1 epoch (loss 1.2964): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 668/938 [04:40<01:46, 2.55it/s] Training 1/1 epoch (loss 1.3570): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 668/938 [04:41<01:46, 2.55it/s] Training 1/1 epoch (loss 1.3570): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 669/938 [04:41<01:42, 2.62it/s] Training 1/1 epoch (loss 1.3157): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 669/938 [04:41<01:42, 2.62it/s] Training 1/1 epoch (loss 1.3157): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 670/938 [04:41<01:42, 2.61it/s] Training 1/1 epoch (loss 1.4791): 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 670/938 [04:41<01:42, 2.61it/s] Training 1/1 epoch (loss 1.4791): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 671/938 [04:41<01:40, 2.65it/s] Training 1/1 epoch (loss 1.5764): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 671/938 [04:42<01:40, 2.65it/s] Training 1/1 epoch (loss 1.5764): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/938 [04:42<01:44, 2.54it/s] Training 1/1 epoch (loss 1.3959): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 672/938 [04:42<01:44, 2.54it/s] Training 1/1 epoch (loss 1.3959): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/938 [04:42<01:42, 2.59it/s] Training 1/1 epoch (loss 
1.4215): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 673/938 [04:43<01:42, 2.59it/s] Training 1/1 epoch (loss 1.4215): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/938 [04:43<01:41, 2.60it/s] Training 1/1 epoch (loss 1.3360): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 674/938 [04:43<01:41, 2.60it/s] Training 1/1 epoch (loss 1.3360): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/938 [04:43<01:40, 2.62it/s] Training 1/1 epoch (loss 1.4123): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 675/938 [04:43<01:40, 2.62it/s] Training 1/1 epoch (loss 1.4123): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/938 [04:43<01:44, 2.50it/s] Training 1/1 epoch (loss 1.5267): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 676/938 [04:44<01:44, 2.50it/s] Training 1/1 epoch (loss 1.5267): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/938 [04:44<01:43, 2.52it/s] Training 1/1 epoch (loss 1.3370): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 677/938 [04:44<01:43, 2.52it/s] Training 1/1 epoch (loss 1.3370): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/938 [04:44<01:51, 2.34it/s] Training 1/1 epoch (loss 1.3895): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 678/938 [04:45<01:51, 2.34it/s] Training 1/1 epoch (loss 1.3895): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/938 [04:45<01:50, 2.35it/s] Training 1/1 epoch (loss 1.4430): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 679/938 [04:45<01:50, 2.35it/s] Training 1/1 epoch (loss 1.4430): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/938 [04:45<01:49, 2.36it/s] Training 1/1 epoch (loss 1.4054): 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 680/938 [04:46<01:49, 2.36it/s] Training 1/1 epoch (loss 1.4054): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/938 [04:46<01:48, 2.36it/s] Training 1/1 epoch (loss 1.4632): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 681/938 [04:46<01:48, 2.36it/s] Training 1/1 epoch (loss 1.4632): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/938 [04:46<01:45, 2.42it/s] Training 1/1 epoch (loss 1.4338): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 682/938 [04:46<01:45, 2.42it/s] Training 1/1 epoch (loss 1.4338): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/938 [04:46<01:44, 2.45it/s] Training 1/1 epoch (loss 1.4547): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 683/938 
[04:47<01:44, 2.45it/s] Training 1/1 epoch (loss 1.4547): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 684/938 [04:47<01:42, 2.48it/s] Training 1/1 epoch (loss 1.4786): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 684/938 [04:47<01:42, 2.48it/s] Training 1/1 epoch (loss 1.4786): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 685/938 [04:47<01:40, 2.53it/s] Training 1/1 epoch (loss 1.5015): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 685/938 [04:48<01:40, 2.53it/s] Training 1/1 epoch (loss 1.5015): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 686/938 [04:48<01:46, 2.37it/s] Training 1/1 epoch (loss 1.3905): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 686/938 [04:48<01:46, 2.37it/s] Training 1/1 epoch (loss 1.3905): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 687/938 [04:48<01:43, 2.42it/s] Training 1/1 epoch (loss 1.4509): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 687/938 [04:48<01:43, 2.42it/s] Training 1/1 epoch (loss 1.4509): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 688/938 [04:48<01:42, 2.44it/s] Training 1/1 epoch (loss 1.4738): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 688/938 [04:49<01:42, 2.44it/s] Training 1/1 epoch (loss 1.4738): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 689/938 [04:49<01:44, 2.39it/s] Training 1/1 epoch (loss 1.4620): 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 689/938 [04:49<01:44, 2.39it/s] Training 1/1 epoch (loss 1.4620): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 690/938 [04:49<01:42, 2.41it/s] Training 1/1 epoch (loss 1.5188): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 690/938 [04:50<01:42, 2.41it/s] Training 1/1 epoch (loss 1.5188): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 691/938 [04:50<01:41, 2.44it/s] Training 1/1 epoch (loss 1.4448): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 691/938 [04:50<01:41, 2.44it/s] Training 1/1 epoch (loss 1.4448): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 692/938 [04:50<01:43, 2.37it/s] Training 1/1 epoch (loss 1.4126): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 692/938 [04:51<01:43, 2.37it/s] Training 1/1 epoch (loss 1.4126): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 693/938 [04:51<01:40, 2.44it/s] Training 1/1 epoch (loss 1.3541): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 693/938 [04:51<01:40, 2.44it/s] Training 1/1 epoch (loss 1.3541): 
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 694/938 [04:51<01:39, 2.44it/s] Training 1/1 epoch (loss 1.4373): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 694/938 [04:51<01:39, 2.44it/s] Training 1/1 epoch (loss 1.4373): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 695/938 [04:51<01:36, 2.52it/s] Training 1/1 epoch (loss 1.2442): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 695/938 [04:52<01:36, 2.52it/s] Training 1/1 epoch (loss 1.2442): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 696/938 [04:52<01:37, 2.48it/s] Training 1/1 epoch (loss 1.4684): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 696/938 [04:52<01:37, 2.48it/s] Training 1/1 epoch (loss 1.4684): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 697/938 [04:52<01:38, 2.45it/s] Training 1/1 epoch (loss 1.3573): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 697/938 [04:53<01:38, 2.45it/s] Training 1/1 epoch (loss 1.3573): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 698/938 [04:53<01:36, 2.48it/s] Training 1/1 epoch (loss 1.3015): 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 698/938 [04:53<01:36, 2.48it/s] Training 1/1 epoch (loss 1.3015): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 699/938 [04:53<01:35, 2.50it/s] Training 1/1 epoch (loss 1.4357): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 699/938 [04:53<01:35, 2.50it/s] Training 1/1 epoch (loss 1.4357): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 700/938 [04:53<01:32, 2.57it/s] Training 1/1 epoch (loss 1.3594): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 700/938 [04:54<01:32, 2.57it/s] Training 1/1 epoch (loss 1.3594): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 701/938 [04:54<01:33, 2.54it/s] Training 1/1 epoch (loss 1.3315): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 701/938 [04:54<01:33, 2.54it/s] Training 1/1 epoch (loss 1.3315): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 702/938 [04:54<01:31, 2.58it/s] Training 1/1 epoch (loss 1.4739): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 702/938 [04:54<01:31, 2.58it/s] Training 1/1 epoch (loss 1.4739): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 703/938 [04:54<01:33, 2.51it/s] Training 1/1 epoch (loss 1.3683): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 703/938 [04:55<01:33, 2.51it/s] Training 1/1 epoch (loss 1.3683): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/938 [04:55<01:40, 
2.33it/s] Training 1/1 epoch (loss 1.4203): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 704/938 [04:55<01:40, 2.33it/s] Training 1/1 epoch (loss 1.4203): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/938 [04:55<01:38, 2.36it/s] Training 1/1 epoch (loss 1.3593): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 705/938 [04:56<01:38, 2.36it/s] Training 1/1 epoch (loss 1.3593): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/938 [04:56<01:35, 2.44it/s] Training 1/1 epoch (loss 1.4233): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 706/938 [04:56<01:35, 2.44it/s] Training 1/1 epoch (loss 1.4233): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/938 [04:56<01:32, 2.50it/s] Training 1/1 epoch (loss 1.4395): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 707/938 [04:57<01:32, 2.50it/s] Training 1/1 epoch (loss 1.4395): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/938 [04:57<01:31, 2.52it/s] Training 1/1 epoch (loss 1.5097): 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 708/938 [04:57<01:31, 2.52it/s] Training 1/1 epoch (loss 1.5097): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/938 [04:57<01:29, 2.55it/s] Training 1/1 epoch (loss 1.3613): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 709/938 [04:57<01:29, 2.55it/s] Training 1/1 epoch (loss 1.3613): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/938 [04:57<01:27, 2.60it/s] Training 1/1 epoch (loss 1.4762): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 710/938 [04:58<01:27, 2.60it/s] Training 1/1 epoch (loss 1.4762): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/938 [04:58<01:28, 2.57it/s] Training 1/1 epoch (loss 1.3622): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 711/938 [04:58<01:28, 2.57it/s] Training 1/1 epoch (loss 1.3622): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/938 [04:58<01:29, 2.51it/s] Training 1/1 epoch (loss 1.4447): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 712/938 [04:58<01:29, 2.51it/s] Training 1/1 epoch (loss 1.4447): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/938 [04:58<01:28, 2.53it/s] Training 1/1 epoch (loss 1.3983): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 713/938 [04:59<01:28, 2.53it/s] Training 1/1 epoch (loss 1.3983): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/938 [04:59<01:27, 2.55it/s] Training 1/1 epoch (loss 1.4320): 
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 714/938 [04:59<01:27, 2.55it/s] Training 1/1 epoch (loss 1.4320): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/938 [04:59<01:27, 2.55it/s] Training 1/1 epoch (loss 1.3344): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 715/938 [05:00<01:27, 2.55it/s] Training 1/1 epoch (loss 1.3344): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/938 [05:00<01:30, 2.44it/s] Training 1/1 epoch (loss 1.4136): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 716/938 [05:00<01:30, 2.44it/s] Training 1/1 epoch (loss 1.4136): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/938 [05:00<01:30, 2.44it/s] Training 1/1 epoch (loss 1.3638): 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 717/938 [05:01<01:30, 2.44it/s] Training 1/1 epoch (loss 1.3638): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/938 [05:01<01:28, 2.48it/s] Training 1/1 epoch (loss 1.4472): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 718/938 [05:01<01:28, 2.48it/s] Training 1/1 epoch (loss 1.4472): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 719/938 [05:01<01:26, 2.52it/s] Training 1/1 epoch (loss 1.3605): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 719/938 [05:01<01:26, 2.52it/s] Training 1/1 epoch (loss 1.3605): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 720/938 [05:01<01:25, 2.54it/s] Training 1/1 epoch (loss 1.3218): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 720/938 [05:02<01:25, 2.54it/s] Training 1/1 epoch (loss 1.3218): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 721/938 [05:02<01:26, 2.50it/s] Training 1/1 epoch (loss 1.3436): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 721/938 [05:02<01:26, 2.50it/s] Training 1/1 epoch (loss 1.3436): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 722/938 [05:02<01:25, 2.53it/s] Training 1/1 epoch (loss 1.2589): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 722/938 [05:02<01:25, 2.53it/s] Training 1/1 epoch (loss 1.2589): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/938 [05:02<01:24, 2.56it/s] Training 1/1 epoch (loss 1.3220): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 723/938 [05:03<01:24, 2.56it/s] Training 1/1 epoch (loss 1.3220): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/938 [05:03<01:22, 2.59it/s] Training 1/1 epoch (loss 1.3663): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 724/938 [05:03<01:22, 
2.59it/s] Training 1/1 epoch (loss 1.3663): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/938 [05:03<01:24, 2.51it/s] Training 1/1 epoch (loss 1.3039): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 725/938 [05:04<01:24, 2.51it/s] Training 1/1 epoch (loss 1.3039): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/938 [05:04<01:24, 2.50it/s] Training 1/1 epoch (loss 1.4317): 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 726/938 [05:04<01:24, 2.50it/s] Training 1/1 epoch (loss 1.4317): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/938 [05:04<01:23, 2.53it/s] Training 1/1 epoch (loss 1.2703): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 727/938 [05:05<01:23, 2.53it/s] Training 1/1 epoch (loss 1.2703): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/938 [05:05<01:35, 2.21it/s] Training 1/1 epoch (loss 1.3306): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 728/938 [05:05<01:35, 2.21it/s] Training 1/1 epoch (loss 1.3306): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/938 [05:05<01:34, 2.21it/s] Training 1/1 epoch (loss 1.4215): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 729/938 [05:06<01:34, 2.21it/s] Training 1/1 epoch (loss 1.4215): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/938 [05:06<01:46, 1.95it/s] Training 1/1 epoch (loss 1.4456): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 730/938 [05:06<01:46, 1.95it/s] Training 1/1 epoch (loss 1.4456): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/938 [05:06<01:40, 2.06it/s] Training 1/1 epoch (loss 1.3847): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 731/938 [05:07<01:40, 2.06it/s] Training 1/1 epoch (loss 1.3847): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/938 [05:07<01:34, 2.18it/s] Training 1/1 epoch (loss 1.4539): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 732/938 [05:07<01:34, 2.18it/s] Training 1/1 epoch (loss 1.4539): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/938 [05:07<01:29, 2.30it/s] Training 1/1 epoch (loss 1.4272): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 733/938 [05:07<01:29, 2.30it/s] Training 1/1 epoch (loss 1.4272): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/938 [05:07<01:24, 2.40it/s] Training 1/1 epoch (loss 1.4538): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 734/938 [05:08<01:24, 2.40it/s] Training 1/1 epoch (loss 1.4538): 
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 735/938 [05:08<01:23, 2.44it/s] Training 1/1 epoch (loss 1.3981): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 735/938 [05:08<01:23, 2.44it/s] Training 1/1 epoch (loss 1.3981): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 736/938 [05:08<01:23, 2.41it/s] Training 1/1 epoch (loss 1.3895): 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 736/938 [05:09<01:23, 2.41it/s] Training 1/1 epoch (loss 1.3895): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 737/938 [05:09<01:23, 2.42it/s] Training 1/1 epoch (loss 1.3480): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 737/938 [05:09<01:23, 2.42it/s] Training 1/1 epoch (loss 1.3480): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 738/938 [05:09<01:19, 2.51it/s] Training 1/1 epoch (loss 1.3297): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 738/938 [05:09<01:19, 2.51it/s] Training 1/1 epoch (loss 1.3297): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/938 [05:09<01:18, 2.54it/s] Training 1/1 epoch (loss 1.4349): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 739/938 [05:10<01:18, 2.54it/s] Training 1/1 epoch (loss 1.4349): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/938 [05:10<01:18, 2.53it/s] Training 1/1 epoch (loss 1.4453): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 740/938 [05:10<01:18, 2.53it/s] Training 1/1 epoch (loss 1.4453): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/938 [05:10<01:16, 2.58it/s] Training 1/1 epoch (loss 1.4072): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 741/938 [05:10<01:16, 2.58it/s] Training 1/1 epoch (loss 1.4072): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/938 [05:10<01:15, 2.60it/s] Training 1/1 epoch (loss 1.4113): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 742/938 [05:11<01:15, 2.60it/s] Training 1/1 epoch (loss 1.4113): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/938 [05:11<01:14, 2.60it/s] Training 1/1 epoch (loss 1.3668): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 743/938 [05:11<01:14, 2.60it/s] Training 1/1 epoch (loss 1.3668): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/938 [05:11<01:19, 2.45it/s] Training 1/1 epoch (loss 1.4664): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 744/938 [05:12<01:19, 2.45it/s] Training 1/1 epoch (loss 1.4664): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/938 [05:12<01:17, 
2.49it/s] Training 1/1 epoch (loss 1.3700): 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 745/938 [05:12<01:17, 2.49it/s] Training 1/1 epoch (loss 1.3700): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/938 [05:12<01:15, 2.55it/s] Training 1/1 epoch (loss 1.4895): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 746/938 [05:13<01:15, 2.55it/s] Training 1/1 epoch (loss 1.4895): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/938 [05:13<01:20, 2.37it/s] Training 1/1 epoch (loss 1.3520): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 747/938 [05:13<01:20, 2.37it/s] Training 1/1 epoch (loss 1.3520): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/938 [05:13<01:17, 2.46it/s] Training 1/1 epoch (loss 1.3584): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 748/938 [05:13<01:17, 2.46it/s] Training 1/1 epoch (loss 1.3584): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/938 [05:13<01:17, 2.43it/s] Training 1/1 epoch (loss 1.3440): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 749/938 [05:14<01:17, 2.43it/s] Training 1/1 epoch (loss 1.3440): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 750/938 [05:14<01:16, 2.47it/s] Training 1/1 epoch (loss 1.3620): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 750/938 [05:14<01:16, 2.47it/s] Training 1/1 epoch (loss 1.3620): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 751/938 [05:14<01:14, 2.51it/s] Training 1/1 epoch (loss 1.3784): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 751/938 [05:15<01:14, 2.51it/s] Training 1/1 epoch (loss 1.3784): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 752/938 [05:15<01:17, 2.39it/s] Training 1/1 epoch (loss 1.4466): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 752/938 [05:15<01:17, 2.39it/s] Training 1/1 epoch (loss 1.4466): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 753/938 [05:15<01:15, 2.46it/s] Training 1/1 epoch (loss 1.4629): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 753/938 [05:15<01:15, 2.46it/s] Training 1/1 epoch (loss 1.4629): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 754/938 [05:15<01:14, 2.47it/s] Training 1/1 epoch (loss 1.4066): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 754/938 [05:16<01:14, 2.47it/s] Training 1/1 epoch (loss 1.4066): 80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 755/938 [05:16<01:13, 2.50it/s] Training 1/1 epoch (loss 1.4283): 
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 755/938 [05:16<01:13, 2.50it/s] Training 1/1 epoch (loss 1.4283): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 756/938 [05:16<01:14, 2.46it/s] Training 1/1 epoch (loss 1.3993): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 756/938 [05:17<01:14, 2.46it/s] Training 1/1 epoch (loss 1.3993): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 757/938 [05:17<01:14, 2.42it/s] Training 1/1 epoch (loss 1.5349): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 757/938 [05:17<01:14, 2.42it/s] Training 1/1 epoch (loss 1.5349): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 758/938 [05:17<01:11, 2.51it/s] Training 1/1 epoch (loss 1.3726): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 758/938 [05:17<01:11, 2.51it/s] Training 1/1 epoch (loss 1.3726): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 759/938 [05:17<01:11, 2.52it/s] Training 1/1 epoch (loss 1.2733): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 759/938 [05:18<01:11, 2.52it/s] Training 1/1 epoch (loss 1.2733): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 760/938 [05:18<01:17, 2.30it/s] Training 1/1 epoch (loss 1.5541): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 760/938 [05:18<01:17, 2.30it/s] Training 1/1 epoch (loss 1.5541): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 761/938 [05:18<01:13, 2.41it/s] Training 1/1 epoch (loss 1.3840): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 761/938 [05:19<01:13, 2.41it/s] Training 1/1 epoch (loss 1.3840): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 762/938 [05:19<01:10, 2.50it/s] Training 1/1 epoch (loss 1.5095): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 762/938 [05:19<01:10, 2.50it/s] Training 1/1 epoch (loss 1.5095): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 763/938 [05:19<01:11, 2.46it/s] Training 1/1 epoch (loss 1.2816): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 763/938 [05:19<01:11, 2.46it/s] Training 1/1 epoch (loss 1.2816): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 764/938 [05:19<01:10, 2.46it/s] Training 1/1 epoch (loss 1.4703): 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 764/938 [05:20<01:10, 2.46it/s] Training 1/1 epoch (loss 1.4703): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 765/938 [05:20<01:08, 2.53it/s] Training 1/1 epoch (loss 1.3712): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 765/938 
[05:20<01:08, 2.53it/s] Training 1/1 epoch (loss 1.3712): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 766/938 [05:20<01:08, 2.51it/s] Training 1/1 epoch (loss 1.4840): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 766/938 [05:21<01:08, 2.51it/s] Training 1/1 epoch (loss 1.4840): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 767/938 [05:21<01:07, 2.52it/s] Training 1/1 epoch (loss 1.3549): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 767/938 [05:21<01:07, 2.52it/s] Training 1/1 epoch (loss 1.3549): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 768/938 [05:21<01:05, 2.58it/s] Training 1/1 epoch (loss 1.5747): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 768/938 [05:21<01:05, 2.58it/s] Training 1/1 epoch (loss 1.5747): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 769/938 [05:21<01:06, 2.54it/s] Training 1/1 epoch (loss 1.3923): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 769/938 [05:22<01:06, 2.54it/s] Training 1/1 epoch (loss 1.3923): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 770/938 [05:22<01:05, 2.56it/s] Training 1/1 epoch (loss 1.3280): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 770/938 [05:22<01:05, 2.56it/s] Training 1/1 epoch (loss 1.3280): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 771/938 [05:22<01:03, 2.62it/s] Training 1/1 epoch (loss 1.3406): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 771/938 [05:23<01:03, 2.62it/s] Training 1/1 epoch (loss 1.3406): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 772/938 [05:23<01:04, 2.55it/s] Training 1/1 epoch (loss 1.3483): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 772/938 [05:23<01:04, 2.55it/s] Training 1/1 epoch (loss 1.3483): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 773/938 [05:23<01:03, 2.61it/s] Training 1/1 epoch (loss 1.2872): 82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 773/938 [05:23<01:03, 2.61it/s] Training 1/1 epoch (loss 1.2872): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 774/938 [05:23<01:03, 2.57it/s] Training 1/1 epoch (loss 1.3299): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 774/938 [05:24<01:03, 2.57it/s] Training 1/1 epoch (loss 1.3299): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 775/938 [05:24<01:03, 2.58it/s] Training 1/1 epoch (loss 1.4621): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 775/938 
[05:24<01:03, 2.58it/s] Training 1/1 epoch (loss 1.4621): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 776/938 [05:24<01:03, 2.55it/s] Training 1/1 epoch (loss 1.4238): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 776/938 [05:24<01:03, 2.55it/s] Training 1/1 epoch (loss 1.4238): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 777/938 [05:24<01:03, 2.54it/s] Training 1/1 epoch (loss 1.3755): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 777/938 [05:25<01:03, 2.54it/s] Training 1/1 epoch (loss 1.3755): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 778/938 [05:25<01:04, 2.46it/s] Training 1/1 epoch (loss 1.3005): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 778/938 [05:25<01:04, 2.46it/s] Training 1/1 epoch (loss 1.3005): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 779/938 [05:25<01:05, 2.44it/s] Training 1/1 epoch (loss 1.4695): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 779/938 [05:26<01:05, 2.44it/s] Training 1/1 epoch (loss 1.4695): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 780/938 [05:26<01:04, 2.47it/s] Training 1/1 epoch (loss 1.4693): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 780/938 [05:26<01:04, 2.47it/s] Training 1/1 epoch (loss 1.4693): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 781/938 [05:26<01:02, 2.53it/s] Training 1/1 epoch (loss 1.3386): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 781/938 [05:27<01:02, 2.53it/s] Training 1/1 epoch (loss 1.3386): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 782/938 [05:27<01:03, 2.47it/s] Training 1/1 epoch (loss 1.4340): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 782/938 [05:27<01:03, 2.47it/s] Training 1/1 epoch (loss 1.4340): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 783/938 [05:27<01:01, 2.51it/s] Training 1/1 epoch (loss 1.3781): 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 783/938 [05:27<01:01, 2.51it/s] Training 1/1 epoch (loss 1.3781): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 784/938 [05:27<01:01, 2.50it/s] Training 1/1 epoch (loss 1.4109): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 784/938 [05:28<01:01, 2.50it/s] Training 1/1 epoch (loss 1.4109): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 785/938 [05:28<01:00, 2.54it/s] Training 1/1 epoch (loss 1.4292): 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 785/938 
[05:28<01:00, 2.54it/s]
Training 1/1 epoch (loss 1.4292): 84%|████████▍ | 786/938 [05:28<00:59, 2.54it/s]
Training 1/1 epoch (loss 1.3912): 84%|████████▍ | 787/938 [05:28<00:57, 2.63it/s]
Training 1/1 epoch (loss 1.5159): 84%|████████▍ | 788/938 [05:29<00:57, 2.63it/s]
Training 1/1 epoch (loss 1.2799): 84%|████████▍ | 789/938 [05:29<00:58, 2.53it/s]
Training 1/1 epoch (loss 1.2874): 84%|████████▍ | 790/938 [05:30<00:58, 2.53it/s]
Training 1/1 epoch (loss 1.4524): 84%|████████▍ | 791/938 [05:30<00:58, 2.52it/s]
Training 1/1 epoch (loss 1.3427): 84%|████████▍ | 792/938 [05:30<00:57, 2.52it/s]
Training 1/1 epoch (loss 1.4570): 85%|████████▍ | 793/938 [05:31<01:00, 2.39it/s]
Training 1/1 epoch (loss 1.4562): 85%|████████▍ | 794/938 [05:31<00:58, 2.44it/s]
Training 1/1 epoch (loss 1.3881): 85%|████████▍ | 795/938 [05:32<00:58, 2.45it/s]
Training 1/1 epoch (loss 1.3780): 85%|████████▍ | 796/938 [05:32<01:00, 2.36it/s]
Training 1/1 epoch (loss 1.3430): 85%|████████▍ | 797/938 [05:33<00:57, 2.44it/s]
Training 1/1 epoch (loss 1.4737): 85%|████████▌ | 798/938 [05:33<00:57, 2.42it/s]
Training 1/1 epoch (loss 1.4330): 85%|████████▌ | 799/938 [05:33<00:57, 2.42it/s]
Training 1/1 epoch (loss 1.3727): 85%|████████▌ | 800/938 [05:34<00:58, 2.35it/s]
Training 1/1 epoch (loss 1.4603): 85%|████████▌ | 801/938 [05:34<00:58, 2.35it/s]
Training 1/1 epoch (loss 1.4940): 86%|████████▌ | 802/938 [05:35<01:01, 2.21it/s]
Training 1/1 epoch (loss 1.4308): 86%|████████▌ | 803/938 [05:35<01:03, 2.13it/s]
Training 1/1 epoch (loss 1.3787): 86%|████████▌ | 804/938 [05:36<01:06, 2.01it/s]
Training 1/1 epoch (loss 1.3808): 86%|████████▌ | 805/938 [05:36<01:02, 2.14it/s]
Training 1/1 epoch (loss 1.3776): 86%|████████▌ | 806/938 [05:37<01:00, 2.18it/s]
Training 1/1 epoch (loss 1.3633): 86%|████████▌ | 807/938 [05:37<00:57, 2.28it/s]
Training 1/1 epoch (loss 1.4126): 86%|████████▌ | 808/938 [05:37<00:55, 2.35it/s]
Training 1/1 epoch (loss 1.3316): 86%|████████▌ | 809/938 [05:38<00:52, 2.44it/s]
Training 1/1 epoch (loss 1.4222): 86%|████████▋ | 810/938 [05:38<00:52, 2.45it/s]
Training 1/1 epoch (loss 1.2999): 86%|████████▋ | 811/938 [05:39<00:50, 2.51it/s]
Training 1/1 epoch (loss 1.3898): 87%|████████▋ | 812/938 [05:39<00:49, 2.56it/s]
Training 1/1 epoch (loss 1.4494): 87%|████████▋ | 813/938 [05:39<00:49, 2.55it/s]
Training 1/1 epoch (loss 1.3817): 87%|████████▋ | 814/938 [05:40<00:47, 2.61it/s]
Training 1/1 epoch (loss 1.3954): 87%|████████▋ | 815/938 [05:40<00:48, 2.53it/s]
Training 1/1 epoch (loss 1.4282): 87%|████████▋ | 816/938 [05:41<00:48, 2.52it/s]
Training 1/1 epoch (loss 1.4028): 87%|████████▋ | 817/938 [05:41<00:46, 2.61it/s]
Training 1/1 epoch (loss 1.3436): 87%|████████▋ | 818/938 [05:41<00:45, 2.64it/s]
Training 1/1 epoch (loss 1.5247): 87%|████████▋ | 819/938 [05:42<00:45, 2.62it/s]
Training 1/1 epoch (loss 1.3505): 87%|████████▋ | 820/938 [05:42<00:44, 2.65it/s]
Training 1/1 epoch (loss 1.3999): 88%|████████▊ | 821/938 [05:43<00:47, 2.46it/s]
Training 1/1 epoch (loss 1.5262): 88%|████████▊ | 822/938 [05:43<00:46, 2.49it/s]
Training 1/1 epoch (loss 1.3645): 88%|████████▊ | 823/938 [05:43<00:45, 2.55it/s]
Training 1/1 epoch (loss 1.3908): 88%|████████▊ | 824/938 [05:44<00:45, 2.52it/s]
Training 1/1 epoch (loss 1.3037): 88%|████████▊ | 825/938 [05:44<00:44, 2.56it/s]
Training 1/1 epoch (loss 1.4308): 88%|████████▊ | 826/938 [05:45<00:45, 2.48it/s]
Training 1/1 epoch (loss 1.4077): 88%|████████▊ | 827/938 [05:45<00:43, 2.56it/s]
Training 1/1 epoch (loss 1.5200): 88%|████████▊ | 828/938 [05:45<00:42, 2.57it/s]
Training 1/1 epoch (loss 1.3508): 88%|████████▊ | 829/938 [05:46<00:42, 2.58it/s]
Training 1/1 epoch (loss 1.4548): 88%|████████▊ | 830/938 [05:46<00:41, 2.58it/s]
Training 1/1 epoch (loss 1.5039): 89%|████████▊ | 831/938 [05:46<00:41, 2.59it/s]
Training 1/1 epoch (loss 1.4359): 89%|████████▊ | 832/938 [05:47<00:42, 2.51it/s]
Training 1/1 epoch (loss 1.4135): 89%|████████▉ | 833/938 [05:47<00:41, 2.54it/s]
Training 1/1 epoch (loss 1.3888): 89%|████████▉ | 834/938 [05:48<00:43, 2.40it/s]
Training 1/1 epoch (loss 1.3691): 89%|████████▉ | 835/938 [05:48<00:42, 2.44it/s]
Training 1/1 epoch (loss 1.3504): 89%|████████▉ | 836/938 [05:48<00:40, 2.49it/s]
Training 1/1 epoch (loss 1.4833): 89%|████████▉ | 837/938 [05:49<00:40, 2.48it/s]
Training 1/1 epoch (loss 1.4190): 89%|████████▉ | 838/938 [05:49<00:39, 2.55it/s]
Training 1/1 epoch (loss 1.3475): 89%|████████▉ | 839/938 [05:50<00:38, 2.60it/s]
Training 1/1 epoch (loss 1.3313): 90%|████████▉ | 840/938 [05:50<00:38, 2.58it/s]
Training 1/1 epoch (loss 1.4595): 90%|████████▉ | 841/938 [05:50<00:37, 2.56it/s]
Training 1/1 epoch (loss 1.3738): 90%|████████▉ | 842/938 [05:51<00:39, 2.45it/s]
Training 1/1 epoch (loss 1.5037): 90%|████████▉ | 843/938 [05:51<00:36, 2.57it/s]
Training 1/1 epoch (loss 1.3260): 90%|████████▉ | 844/938 [05:52<00:36, 2.56it/s]
Training 1/1 epoch (loss 1.4462): 90%|█████████ | 845/938 [05:52<00:36, 2.57it/s]
Training 1/1 epoch (loss 1.5137): 90%|█████████ | 846/938 [05:52<00:35, 2.57it/s]
Training 1/1 epoch (loss 1.3882): 90%|█████████ | 847/938 [05:53<00:36, 2.49it/s]
Training 1/1 epoch (loss 1.3431): 90%|█████████ | 848/938 [05:53<00:36, 2.46it/s]
Training 1/1 epoch (loss 1.4140): 91%|█████████ | 849/938 [05:54<00:35, 2.48it/s]
Training 1/1 epoch (loss 1.3167): 91%|█████████ | 850/938 [05:54<00:36, 2.42it/s]
Training 1/1 epoch (loss 1.5012): 91%|█████████ | 851/938 [05:54<00:35, 2.46it/s]
Training 1/1 epoch (loss 1.4618): 91%|█████████ | 852/938 [05:55<00:34, 2.49it/s]
Training 1/1 epoch (loss 1.4353): 91%|█████████ | 853/938 [05:55<00:35, 2.40it/s]
Training 1/1 epoch (loss 1.4079): 91%|█████████ | 854/938 [05:56<00:34, 2.47it/s]
Training 1/1 epoch (loss 1.4603): 91%|█████████ | 855/938 [05:56<00:33, 2.48it/s]
Training 1/1 epoch (loss 1.4194): 91%|█████████▏| 856/938 [05:56<00:32, 2.50it/s]
Training 1/1 epoch (loss 1.4601): 91%|█████████▏| 857/938 [05:57<00:32, 2.51it/s]
Training 1/1 epoch (loss 1.3853): 91%|█████████▏| 858/938 [05:57<00:31, 2.54it/s]
Training 1/1 epoch (loss 1.4101): 92%|█████████▏| 859/938 [05:58<00:30, 2.59it/s]
Training 1/1 epoch (loss 1.3287): 92%|█████████▏| 860/938 [05:58<00:29, 2.62it/s]
Training 1/1 epoch (loss 1.4031): 92%|█████████▏| 861/938 [05:58<00:30, 2.52it/s]
Training 1/1 epoch (loss 1.4975): 92%|█████████▏| 862/938 [05:59<00:29, 2.55it/s]
Training 1/1 epoch (loss 1.5202): 92%|█████████▏| 863/938 [05:59<00:29, 2.53it/s]
Training 1/1 epoch (loss 1.3898): 92%|█████████▏| 864/938 [06:00<00:31, 2.34it/s]
Training 1/1 epoch (loss 1.4608): 92%|█████████▏| 865/938 [06:00<00:31, 2.35it/s]
Training 1/1 epoch (loss 1.2776): 92%|█████████▏| 866/938 [06:00<00:29, 2.40it/s]
Training 1/1 epoch (loss 1.2095): 92%|█████████▏| 867/938 [06:01<00:29, 2.43it/s]
Training 1/1 epoch (loss 1.3537): 93%|█████████▎| 868/938 [06:01<00:28, 2.48it/s]
Training 1/1 epoch (loss 1.3781): 93%|█████████▎| 869/938 [06:02<00:27, 2.49it/s]
Training 1/1 epoch (loss 1.4681): 93%|█████████▎| 870/938 [06:02<00:26, 2.54it/s]
Training 1/1 epoch (loss 1.3456): 93%|█████████▎| 871/938 [06:02<00:26, 2.55it/s]
Training 1/1 epoch (loss 1.3910): 93%|█████████▎| 872/938 [06:03<00:27, 2.44it/s]
Training 1/1 epoch (loss 1.5047): 93%|█████████▎| 873/938 [06:03<00:26, 2.48it/s]
Training 1/1 epoch (loss 1.4312): 93%|█████████▎| 874/938 [06:04<00:26, 2.40it/s]
Training 1/1 epoch (loss 1.4616): 93%|█████████▎| 875/938 [06:04<00:26, 2.41it/s]
Training 1/1 epoch (loss 1.3844): 93%|█████████▎| 876/938 [06:05<00:25, 2.39it/s]
Training 1/1 epoch (loss 1.3567): 93%|█████████▎| 877/938 [06:05<00:27, 2.22it/s]
Training 1/1 epoch (loss 1.3621): 94%|█████████▎| 878/938 [06:06<00:30, 1.97it/s]
Training 1/1 epoch (loss 1.3306): 94%|█████████▎| 879/938 [06:06<00:28, 2.07it/s]
Training 1/1 epoch (loss 1.5588): 94%|█████████▍| 880/938 [06:07<00:26, 2.20it/s]
Training 1/1 epoch (loss 1.4320): 94%|█████████▍| 881/938 [06:07<00:25, 2.27it/s]
Training 1/1 epoch (loss 1.4221): 94%|█████████▍| 882/938 [06:07<00:24, 2.32it/s]
Training 1/1 epoch (loss 1.4939): 94%|█████████▍| 883/938 [06:08<00:23, 2.36it/s]
Training 1/1 epoch (loss 1.4519): 94%|█████████▍| 884/938 [06:08<00:22, 2.40it/s]
Training 1/1 epoch (loss 1.4336): 94%|█████████▍| 885/938 [06:09<00:21, 2.47it/s]
Training 1/1 epoch (loss 1.3369): 94%|█████████▍| 886/938 [06:09<00:21, 2.47it/s]
Training 1/1 epoch (loss 1.4194): 95%|█████████▍| 887/938 [06:09<00:20, 2.54it/s]
Training 1/1 epoch (loss 1.3670): 95%|█████████▍| 888/938 [06:10<00:19, 2.55it/s]
Training 1/1 epoch (loss 1.4161): 95%|█████████▍| 889/938 [06:10<00:19, 2.58it/s]
Training 1/1 epoch (loss 1.2729): 95%|█████████▍| 890/938 [06:10<00:18, 2.56it/s]
Training 1/1 epoch (loss 1.4136): 95%|█████████▍| 891/938 [06:11<00:17, 2.62it/s]
Training 1/1 epoch (loss 1.4520): 95%|█████████▌| 892/938 [06:11<00:17, 2.59it/s]
Training 1/1 epoch (loss 1.5104): 95%|█████████▌| 893/938 [06:12<00:17, 2.56it/s]
Training 1/1 epoch (loss 1.3210): 95%|█████████▌| 894/938 [06:12<00:17, 2.57it/s]
Training 1/1 epoch (loss 1.3850): 95%|█████████▌| 895/938 [06:12<00:16, 2.55it/s]
Training 1/1 epoch (loss 1.3641): 96%|█████████▌| 896/938 [06:13<00:16, 2.51it/s]
Training 1/1 epoch (loss 1.3617): 96%|█████████▌| 897/938 [06:13<00:16, 2.48it/s]
Training 1/1 epoch (loss 1.4344): 96%|█████████▌| 898/938 [06:14<00:16, 2.45it/s]
Training 1/1 epoch (loss 1.3797): 96%|█████████▌| 899/938 [06:14<00:16, 2.38it/s]
Training 1/1 epoch (loss 1.3837): 96%|█████████▌| 900/938 [06:15<00:15, 2.39it/s]
Training 1/1 epoch (loss 1.3018): 96%|█████████▌| 901/938 [06:15<00:15, 2.46it/s]
Training 1/1 epoch (loss 1.3036): 96%|█████████▌| 902/938 [06:15<00:14, 2.46it/s]
Training 1/1 epoch (loss 1.4066): 96%|█████████▋| 903/938 [06:16<00:13, 2.50it/s]
Training 1/1 epoch (loss 1.5011): 96%|█████████▋| 904/938 [06:16<00:13, 2.49it/s]
Training 1/1 epoch (loss 1.4453): 96%|█████████▋| 905/938 [06:16<00:13, 2.53it/s]
Training 1/1 epoch (loss 1.4301): 97%|█████████▋| 906/938 [06:17<00:12, 2.48it/s]
Training 1/1 epoch (loss 1.3429): 97%|█████████▋| 907/938 [06:17<00:12, 2.45it/s]
Training 1/1 epoch (loss 1.3784): 97%|█████████▋| 908/938 [06:18<00:12, 2.37it/s]
Training 1/1 epoch (loss 1.3607): 97%|█████████▋| 909/938 [06:18<00:12, 2.40it/s]
Training 1/1 epoch (loss 1.4616): 97%|█████████▋| 910/938 [06:19<00:11, 2.46it/s]
Training 1/1 epoch (loss 1.4125): 97%|█████████▋| 911/938 [06:19<00:10, 2.50it/s]
Training 1/1 epoch (loss 1.4056): 97%|█████████▋| 912/938 [06:19<00:10, 2.49it/s]
Training 1/1 epoch (loss 1.4123): 97%|█████████▋| 913/938 [06:20<00:09, 2.50it/s]
Training 1/1 epoch (loss 1.4569): 97%|█████████▋| 914/938 [06:20<00:09, 2.54it/s]
Training 1/1 epoch (loss 1.4257): 98%|█████████▊| 915/938 [06:21<00:09, 2.51it/s]
Training 1/1 epoch (loss 1.4759): 98%|█████████▊| 916/938 [06:21<00:08, 2.50it/s]
Training 1/1 epoch (loss 1.3801): 98%|█████████▊| 917/938 [06:21<00:08, 2.54it/s]
Training 1/1 epoch (loss 1.3763): 98%|█████████▊| 918/938 [06:22<00:07, 2.55it/s]
Training 1/1 epoch (loss 1.3907): 98%|█████████▊| 919/938 [06:22<00:07, 2.56it/s]
Training 1/1 epoch (loss 1.3770): 98%|█████████▊| 920/938 [06:22<00:07, 2.55it/s]
Training 1/1 epoch (loss 1.3672): 98%|█████████▊| 921/938 [06:23<00:06, 2.57it/s]
Training 1/1 epoch (loss 1.3476): 98%|█████████▊| 922/938 [06:23<00:06, 2.56it/s]
Training 1/1 epoch (loss 1.3112): 98%|█████████▊| 923/938 [06:24<00:05, 2.52it/s]
Training 1/1 epoch (loss 1.3869): 99%|█████████▊| 924/938 [06:24<00:05, 2.51it/s]
Training 1/1 epoch (loss 1.4174): 99%|█████████▊| 925/938 [06:25<00:05, 2.41it/s]
Training 1/1 epoch (loss 1.2937): 99%|█████████▊| 926/938 [06:25<00:04, 2.50it/s]
Training 1/1 epoch (loss 1.4355): 99%|█████████▉| 927/938 [06:25<00:04, 2.45it/s]
Training 1/1 epoch (loss 1.3572): 99%|█████████▉| 928/938 [06:26<00:04, 2.42it/s]
Training 1/1 epoch (loss 1.3109): 99%|█████████▉| 929/938 [06:26<00:03, 2.47it/s]
Training 1/1 epoch (loss 1.3701): 99%|█████████▉| 930/938 [06:27<00:03, 2.45it/s]
Training 1/1 epoch (loss 1.3276): 99%|█████████▉| 931/938 [06:27<00:02, 2.49it/s]
Training 1/1 epoch (loss 1.3996): 99%|█████████▉| 932/938 [06:27<00:02, 2.51it/s]
Training 1/1 epoch (loss 1.4141): 99%|█████████▉| 933/938 [06:28<00:01, 2.54it/s]
Training 1/1 epoch (loss 1.4167): 100%|█████████▉| 934/938 [06:28<00:01, 2.58it/s]
Training 1/1 epoch (loss 1.5263): 100%|█████████▉| 935/938 [06:28<00:01, 2.62it/s]
Training 1/1 epoch (loss 1.4366): 100%|█████████▉| 936/938 [06:29<00:00, 2.47it/s]
Training 1/1 epoch (loss 1.3975): 100%|█████████▉| 937/938 [06:29<00:00, 2.51it/s]
Training 1/1 epoch (loss 1.4480): 100%|██████████| 938/938 [06:30<00:00, 2.40it/s]
tokenizer config file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-3T/tinyllama-3T-s3-Q1-30k/tokenizer_config.json
Special tokens file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/tinyllama-3T/tinyllama-3T-s3-Q1-30k/special_tokens_map.json
wandb: ERROR Problem finishing run
Exception ignored in atexit callback: <bound method rank_zero_only.<locals>.wrapper of <safe_rlhf.logger.Logger object at 0x15505936e050>>
Traceback (most recent call last):
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/utils.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/logger.py", line 183, in close
    self.wandb.finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 449, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 391, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2106, in finish
    return self._finish(exit_code)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2127, in _finish
    self._atexit_cleanup(exit_code=exit_code)
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2352, in _atexit_cleanup
    self._on_finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2609, in _on_finish
    wait_with_progress(
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress
    return wait_all_with_progress(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress
    return asyncio_compat.run(progress_loop_with_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run
    future = executor.submit(runner.run, fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown