Text Generation
Transformers
Safetensors
qwen3
llama-factory
full
Generated from Trainer
conversational
text-generation-inference
Instructions to use DCAgent/a1-multifile_composition with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DCAgent/a1-multifile_composition with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="DCAgent/a1-multifile_composition")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("DCAgent/a1-multifile_composition")
    model = AutoModelForCausalLM.from_pretrained("DCAgent/a1-multifile_composition")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use DCAgent/a1-multifile_composition with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "DCAgent/a1-multifile_composition"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "DCAgent/a1-multifile_composition",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
docker model run hf.co/DCAgent/a1-multifile_composition
- SGLang
How to use DCAgent/a1-multifile_composition with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "DCAgent/a1-multifile_composition" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "DCAgent/a1-multifile_composition",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "DCAgent/a1-multifile_composition" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "DCAgent/a1-multifile_composition",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use DCAgent/a1-multifile_composition with Docker Model Runner:
docker model run hf.co/DCAgent/a1-multifile_composition
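The training state that follows logs a `learning_rate` value every five steps. Those values appear consistent with a linear warmup to a peak of 4e-05 followed by cosine decay toward zero at the final step (`global_step` 4333). The sketch below reproduces the logged numbers under that reading. The warmup length of 434 steps (which would match a warmup ratio of 0.1, rounded up), the cosine shape, and the one-step offset between the logged step and the scheduler step are all inferred from the log, not taken from a published training config, and the function name `logged_lr` is hypothetical.

```python
import math

# Assumptions inferred from the logged values, not from a training config.
PEAK_LR = 4e-5        # highest learning rate seen in the log (step 435)
WARMUP_STEPS = 434    # assumed warmup length; equals ceil(0.1 * 4333)
TOTAL_STEPS = 4333    # "global_step" reported in the training state


def logged_lr(step: int) -> float:
    """Learning rate expected in the log entry for a given optimizer step.

    Assumes the value logged at step N is the schedule evaluated at N - 1,
    i.e. the rate that was used for the most recent update.
    """
    s = step - 1
    if s < WARMUP_STEPS:
        # Linear warmup from 0 to PEAK_LR.
        return PEAK_LR * s / WARMUP_STEPS
    # Cosine decay from PEAK_LR down to 0 over the remaining steps.
    progress = (s - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Compare against a few entries copied from the log.
    for step in (5, 10, 435, 440, 480):
        print(step, logged_lr(step))
```

If the reconstruction holds, the same function can be used to check that a resumed or re-run job follows the same schedule by comparing its logged rates against `logged_lr(step)`.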
| { | |
| "best_global_step": null, | |
| "best_metric": null, | |
| "best_model_checkpoint": null, | |
| "epoch": 7.0, | |
| "eval_steps": 500, | |
| "global_step": 4333, | |
| "is_hyper_param_search": false, | |
| "is_local_process_zero": true, | |
| "is_world_process_zero": true, | |
| "log_history": [ | |
| { | |
| "epoch": 0.008077544426494346, | |
| "grad_norm": 17.94031922937094, | |
| "learning_rate": 3.686635944700461e-07, | |
| "loss": 0.6301, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6101559400558472, | |
| "step": 5, | |
| "valid_targets_mean": 5131.6, | |
| "valid_targets_min": 2456 | |
| }, | |
| { | |
| "epoch": 0.01615508885298869, | |
| "grad_norm": 18.031251271375503, | |
| "learning_rate": 8.294930875576038e-07, | |
| "loss": 0.6434, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6626201868057251, | |
| "step": 10, | |
| "valid_targets_mean": 4924.7, | |
| "valid_targets_min": 2511 | |
| }, | |
| { | |
| "epoch": 0.024232633279483037, | |
| "grad_norm": 15.80671655943195, | |
| "learning_rate": 1.2903225806451614e-06, | |
| "loss": 0.5987, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6084240078926086, | |
| "step": 15, | |
| "valid_targets_mean": 5216.2, | |
| "valid_targets_min": 2840 | |
| }, | |
| { | |
| "epoch": 0.03231017770597738, | |
| "grad_norm": 12.200268087793992, | |
| "learning_rate": 1.751152073732719e-06, | |
| "loss": 0.5488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5271545052528381, | |
| "step": 20, | |
| "valid_targets_mean": 5782.9, | |
| "valid_targets_min": 3261 | |
| }, | |
| { | |
| "epoch": 0.04038772213247173, | |
| "grad_norm": 7.84942309438541, | |
| "learning_rate": 2.211981566820277e-06, | |
| "loss": 0.5143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.48371779918670654, | |
| "step": 25, | |
| "valid_targets_mean": 5333.9, | |
| "valid_targets_min": 3236 | |
| }, | |
| { | |
| "epoch": 0.048465266558966075, | |
| "grad_norm": 4.200322718651227, | |
| "learning_rate": 2.6728110599078343e-06, | |
| "loss": 0.4757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.444708913564682, | |
| "step": 30, | |
| "valid_targets_mean": 5335.8, | |
| "valid_targets_min": 3982 | |
| }, | |
| { | |
| "epoch": 0.05654281098546042, | |
| "grad_norm": 2.5697926973071, | |
| "learning_rate": 3.1336405529953917e-06, | |
| "loss": 0.4407, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4613327383995056, | |
| "step": 35, | |
| "valid_targets_mean": 5051.0, | |
| "valid_targets_min": 3053 | |
| }, | |
| { | |
| "epoch": 0.06462035541195477, | |
| "grad_norm": 1.2615763319708777, | |
| "learning_rate": 3.5944700460829495e-06, | |
| "loss": 0.412, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38105225563049316, | |
| "step": 40, | |
| "valid_targets_mean": 6270.6, | |
| "valid_targets_min": 4347 | |
| }, | |
| { | |
| "epoch": 0.07269789983844911, | |
| "grad_norm": 1.2005219937697187, | |
| "learning_rate": 4.055299539170508e-06, | |
| "loss": 0.3991, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3904378414154053, | |
| "step": 45, | |
| "valid_targets_mean": 4399.7, | |
| "valid_targets_min": 773 | |
| }, | |
| { | |
| "epoch": 0.08077544426494346, | |
| "grad_norm": 0.8711539509364695, | |
| "learning_rate": 4.516129032258065e-06, | |
| "loss": 0.3806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3860582709312439, | |
| "step": 50, | |
| "valid_targets_mean": 5580.8, | |
| "valid_targets_min": 2625 | |
| }, | |
| { | |
| "epoch": 0.0888529886914378, | |
| "grad_norm": 0.7806846326439755, | |
| "learning_rate": 4.976958525345623e-06, | |
| "loss": 0.3699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3754419684410095, | |
| "step": 55, | |
| "valid_targets_mean": 5052.6, | |
| "valid_targets_min": 915 | |
| }, | |
| { | |
| "epoch": 0.09693053311793215, | |
| "grad_norm": 0.6803342618147262, | |
| "learning_rate": 5.43778801843318e-06, | |
| "loss": 0.3472, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3440122604370117, | |
| "step": 60, | |
| "valid_targets_mean": 4892.2, | |
| "valid_targets_min": 1595 | |
| }, | |
| { | |
| "epoch": 0.1050080775444265, | |
| "grad_norm": 0.5873054899995456, | |
| "learning_rate": 5.8986175115207375e-06, | |
| "loss": 0.3444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3376592993736267, | |
| "step": 65, | |
| "valid_targets_mean": 5686.4, | |
| "valid_targets_min": 3311 | |
| }, | |
| { | |
| "epoch": 0.11308562197092084, | |
| "grad_norm": 0.5812104869467237, | |
| "learning_rate": 6.359447004608295e-06, | |
| "loss": 0.3351, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32878339290618896, | |
| "step": 70, | |
| "valid_targets_mean": 5069.1, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 0.12116316639741519, | |
| "grad_norm": 0.6110766171265148, | |
| "learning_rate": 6.820276497695853e-06, | |
| "loss": 0.3314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32319021224975586, | |
| "step": 75, | |
| "valid_targets_mean": 5218.8, | |
| "valid_targets_min": 3039 | |
| }, | |
| { | |
| "epoch": 0.12924071082390953, | |
| "grad_norm": 0.5275806578865466, | |
| "learning_rate": 7.28110599078341e-06, | |
| "loss": 0.3181, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2842397391796112, | |
| "step": 80, | |
| "valid_targets_mean": 5314.7, | |
| "valid_targets_min": 936 | |
| }, | |
| { | |
| "epoch": 0.13731825525040386, | |
| "grad_norm": 0.5043499827664671, | |
| "learning_rate": 7.741935483870968e-06, | |
| "loss": 0.3135, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29710888862609863, | |
| "step": 85, | |
| "valid_targets_mean": 5840.7, | |
| "valid_targets_min": 3092 | |
| }, | |
| { | |
| "epoch": 0.14539579967689822, | |
| "grad_norm": 0.46665023880464535, | |
| "learning_rate": 8.202764976958527e-06, | |
| "loss": 0.3041, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30811554193496704, | |
| "step": 90, | |
| "valid_targets_mean": 5763.2, | |
| "valid_targets_min": 2093 | |
| }, | |
| { | |
| "epoch": 0.15347334410339256, | |
| "grad_norm": 0.479007494945398, | |
| "learning_rate": 8.663594470046084e-06, | |
| "loss": 0.2898, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27862054109573364, | |
| "step": 95, | |
| "valid_targets_mean": 5151.4, | |
| "valid_targets_min": 3499 | |
| }, | |
| { | |
| "epoch": 0.16155088852988692, | |
| "grad_norm": 0.46977805962833913, | |
| "learning_rate": 9.124423963133642e-06, | |
| "loss": 0.2857, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27801576256752014, | |
| "step": 100, | |
| "valid_targets_mean": 5621.1, | |
| "valid_targets_min": 2984 | |
| }, | |
| { | |
| "epoch": 0.16962843295638125, | |
| "grad_norm": 0.5413430637817005, | |
| "learning_rate": 9.5852534562212e-06, | |
| "loss": 0.2824, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27780991792678833, | |
| "step": 105, | |
| "valid_targets_mean": 5530.5, | |
| "valid_targets_min": 3262 | |
| }, | |
| { | |
| "epoch": 0.1777059773828756, | |
| "grad_norm": 0.4854713237809568, | |
| "learning_rate": 1.0046082949308758e-05, | |
| "loss": 0.2841, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29190337657928467, | |
| "step": 110, | |
| "valid_targets_mean": 5010.4, | |
| "valid_targets_min": 2172 | |
| }, | |
| { | |
| "epoch": 0.18578352180936994, | |
| "grad_norm": 0.43214061276729393, | |
| "learning_rate": 1.0506912442396313e-05, | |
| "loss": 0.2842, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2443092167377472, | |
| "step": 115, | |
| "valid_targets_mean": 5622.8, | |
| "valid_targets_min": 2807 | |
| }, | |
| { | |
| "epoch": 0.1938610662358643, | |
| "grad_norm": 0.4667686489049157, | |
| "learning_rate": 1.096774193548387e-05, | |
| "loss": 0.2985, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29900214076042175, | |
| "step": 120, | |
| "valid_targets_mean": 5740.0, | |
| "valid_targets_min": 2342 | |
| }, | |
| { | |
| "epoch": 0.20193861066235863, | |
| "grad_norm": 0.47079196966589154, | |
| "learning_rate": 1.1428571428571429e-05, | |
| "loss": 0.2718, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2535105049610138, | |
| "step": 125, | |
| "valid_targets_mean": 5683.8, | |
| "valid_targets_min": 3121 | |
| }, | |
| { | |
| "epoch": 0.210016155088853, | |
| "grad_norm": 0.48737229402291704, | |
| "learning_rate": 1.1889400921658986e-05, | |
| "loss": 0.2677, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2664521038532257, | |
| "step": 130, | |
| "valid_targets_mean": 5758.3, | |
| "valid_targets_min": 3031 | |
| }, | |
| { | |
| "epoch": 0.21809369951534732, | |
| "grad_norm": 0.44438273857452876, | |
| "learning_rate": 1.2350230414746545e-05, | |
| "loss": 0.254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24761803448200226, | |
| "step": 135, | |
| "valid_targets_mean": 5854.6, | |
| "valid_targets_min": 2705 | |
| }, | |
| { | |
| "epoch": 0.22617124394184168, | |
| "grad_norm": 0.484602203573656, | |
| "learning_rate": 1.2811059907834102e-05, | |
| "loss": 0.2591, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2663928270339966, | |
| "step": 140, | |
| "valid_targets_mean": 4858.4, | |
| "valid_targets_min": 3166 | |
| }, | |
| { | |
| "epoch": 0.23424878836833601, | |
| "grad_norm": 0.8373509253059376, | |
| "learning_rate": 1.327188940092166e-05, | |
| "loss": 0.2545, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23110944032669067, | |
| "step": 145, | |
| "valid_targets_mean": 5774.5, | |
| "valid_targets_min": 3857 | |
| }, | |
| { | |
| "epoch": 0.24232633279483037, | |
| "grad_norm": 0.4744523854954822, | |
| "learning_rate": 1.3732718894009217e-05, | |
| "loss": 0.2515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2517966628074646, | |
| "step": 150, | |
| "valid_targets_mean": 5936.8, | |
| "valid_targets_min": 3820 | |
| }, | |
| { | |
| "epoch": 0.25040387722132473, | |
| "grad_norm": 0.4436732050116699, | |
| "learning_rate": 1.4193548387096776e-05, | |
| "loss": 0.2664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2502896189689636, | |
| "step": 155, | |
| "valid_targets_mean": 5545.9, | |
| "valid_targets_min": 3053 | |
| }, | |
| { | |
| "epoch": 0.25848142164781907, | |
| "grad_norm": 0.5326764127215488, | |
| "learning_rate": 1.4654377880184335e-05, | |
| "loss": 0.2446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2557904124259949, | |
| "step": 160, | |
| "valid_targets_mean": 4778.4, | |
| "valid_targets_min": 2787 | |
| }, | |
| { | |
| "epoch": 0.2665589660743134, | |
| "grad_norm": 0.5742888836203236, | |
| "learning_rate": 1.511520737327189e-05, | |
| "loss": 0.2506, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2368374466896057, | |
| "step": 165, | |
| "valid_targets_mean": 4304.3, | |
| "valid_targets_min": 982 | |
| }, | |
| { | |
| "epoch": 0.27463651050080773, | |
| "grad_norm": 0.516540542996034, | |
| "learning_rate": 1.5576036866359447e-05, | |
| "loss": 0.2448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23024621605873108, | |
| "step": 170, | |
| "valid_targets_mean": 5190.8, | |
| "valid_targets_min": 3596 | |
| }, | |
| { | |
| "epoch": 0.2827140549273021, | |
| "grad_norm": 0.49927626067384623, | |
| "learning_rate": 1.6036866359447006e-05, | |
| "loss": 0.2536, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26048240065574646, | |
| "step": 175, | |
| "valid_targets_mean": 5276.1, | |
| "valid_targets_min": 3502 | |
| }, | |
| { | |
| "epoch": 0.29079159935379645, | |
| "grad_norm": 0.5279070782491759, | |
| "learning_rate": 1.6497695852534564e-05, | |
| "loss": 0.2599, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26732373237609863, | |
| "step": 180, | |
| "valid_targets_mean": 5403.4, | |
| "valid_targets_min": 3363 | |
| }, | |
| { | |
| "epoch": 0.2988691437802908, | |
| "grad_norm": 0.7686728653284448, | |
| "learning_rate": 1.695852534562212e-05, | |
| "loss": 0.2502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24225091934204102, | |
| "step": 185, | |
| "valid_targets_mean": 5036.5, | |
| "valid_targets_min": 3044 | |
| }, | |
| { | |
| "epoch": 0.3069466882067851, | |
| "grad_norm": 0.5401854168098045, | |
| "learning_rate": 1.741935483870968e-05, | |
| "loss": 0.2427, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23782731592655182, | |
| "step": 190, | |
| "valid_targets_mean": 4939.9, | |
| "valid_targets_min": 3655 | |
| }, | |
| { | |
| "epoch": 0.3150242326332795, | |
| "grad_norm": 0.4741183912318437, | |
| "learning_rate": 1.7880184331797237e-05, | |
| "loss": 0.2358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23438788950443268, | |
| "step": 195, | |
| "valid_targets_mean": 5697.8, | |
| "valid_targets_min": 3818 | |
| }, | |
| { | |
| "epoch": 0.32310177705977383, | |
| "grad_norm": 0.5649501345080572, | |
| "learning_rate": 1.8341013824884796e-05, | |
| "loss": 0.2449, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2522869110107422, | |
| "step": 200, | |
| "valid_targets_mean": 5545.6, | |
| "valid_targets_min": 2987 | |
| }, | |
| { | |
| "epoch": 0.33117932148626816, | |
| "grad_norm": 0.5024588642048885, | |
| "learning_rate": 1.880184331797235e-05, | |
| "loss": 0.2459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25441572070121765, | |
| "step": 205, | |
| "valid_targets_mean": 5945.6, | |
| "valid_targets_min": 3733 | |
| }, | |
| { | |
| "epoch": 0.3392568659127625, | |
| "grad_norm": 0.4723024425418466, | |
| "learning_rate": 1.926267281105991e-05, | |
| "loss": 0.228, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22206412255764008, | |
| "step": 210, | |
| "valid_targets_mean": 5697.2, | |
| "valid_targets_min": 3558 | |
| }, | |
| { | |
| "epoch": 0.3473344103392569, | |
| "grad_norm": 0.6726911987350219, | |
| "learning_rate": 1.9723502304147465e-05, | |
| "loss": 0.2514, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28493285179138184, | |
| "step": 215, | |
| "valid_targets_mean": 5278.5, | |
| "valid_targets_min": 3333 | |
| }, | |
| { | |
| "epoch": 0.3554119547657512, | |
| "grad_norm": 0.4932617145639215, | |
| "learning_rate": 2.0184331797235024e-05, | |
| "loss": 0.2348, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24521911144256592, | |
| "step": 220, | |
| "valid_targets_mean": 5599.8, | |
| "valid_targets_min": 3656 | |
| }, | |
| { | |
| "epoch": 0.36348949919224555, | |
| "grad_norm": 0.7642926488649742, | |
| "learning_rate": 2.0645161290322582e-05, | |
| "loss": 0.2364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23008081316947937, | |
| "step": 225, | |
| "valid_targets_mean": 4490.4, | |
| "valid_targets_min": 2678 | |
| }, | |
| { | |
| "epoch": 0.3715670436187399, | |
| "grad_norm": 0.4615383980334948, | |
| "learning_rate": 2.110599078341014e-05, | |
| "loss": 0.2451, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2407582700252533, | |
| "step": 230, | |
| "valid_targets_mean": 5580.6, | |
| "valid_targets_min": 1278 | |
| }, | |
| { | |
| "epoch": 0.37964458804523427, | |
| "grad_norm": 0.48203999030444306, | |
| "learning_rate": 2.1566820276497696e-05, | |
| "loss": 0.2462, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2535155117511749, | |
| "step": 235, | |
| "valid_targets_mean": 5543.1, | |
| "valid_targets_min": 3751 | |
| }, | |
| { | |
| "epoch": 0.3877221324717286, | |
| "grad_norm": 0.5148221390988014, | |
| "learning_rate": 2.2027649769585255e-05, | |
| "loss": 0.2353, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2425214797258377, | |
| "step": 240, | |
| "valid_targets_mean": 5763.6, | |
| "valid_targets_min": 3036 | |
| }, | |
| { | |
| "epoch": 0.39579967689822293, | |
| "grad_norm": 0.46547764871482555, | |
| "learning_rate": 2.2488479262672814e-05, | |
| "loss": 0.2425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25116443634033203, | |
| "step": 245, | |
| "valid_targets_mean": 5522.1, | |
| "valid_targets_min": 2416 | |
| }, | |
| { | |
| "epoch": 0.40387722132471726, | |
| "grad_norm": 0.5077942282172957, | |
| "learning_rate": 2.2949308755760372e-05, | |
| "loss": 0.2362, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22106090188026428, | |
| "step": 250, | |
| "valid_targets_mean": 4705.5, | |
| "valid_targets_min": 2789 | |
| }, | |
| { | |
| "epoch": 0.41195476575121165, | |
| "grad_norm": 0.4787719956412287, | |
| "learning_rate": 2.3410138248847928e-05, | |
| "loss": 0.2346, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2345045804977417, | |
| "step": 255, | |
| "valid_targets_mean": 5413.2, | |
| "valid_targets_min": 2585 | |
| }, | |
| { | |
| "epoch": 0.420032310177706, | |
| "grad_norm": 0.485030743771553, | |
| "learning_rate": 2.3870967741935486e-05, | |
| "loss": 0.2306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22481679916381836, | |
| "step": 260, | |
| "valid_targets_mean": 5144.9, | |
| "valid_targets_min": 2641 | |
| }, | |
| { | |
| "epoch": 0.4281098546042003, | |
| "grad_norm": 0.4205207190362599, | |
| "learning_rate": 2.4331797235023045e-05, | |
| "loss": 0.2358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21444383263587952, | |
| "step": 265, | |
| "valid_targets_mean": 6273.4, | |
| "valid_targets_min": 2606 | |
| }, | |
| { | |
| "epoch": 0.43618739903069464, | |
| "grad_norm": 0.5211610541894309, | |
| "learning_rate": 2.4792626728110604e-05, | |
| "loss": 0.249, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2537360191345215, | |
| "step": 270, | |
| "valid_targets_mean": 5275.8, | |
| "valid_targets_min": 2861 | |
| }, | |
| { | |
| "epoch": 0.44426494345718903, | |
| "grad_norm": 0.4770920801732771, | |
| "learning_rate": 2.525345622119816e-05, | |
| "loss": 0.2366, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23997442424297333, | |
| "step": 275, | |
| "valid_targets_mean": 5575.6, | |
| "valid_targets_min": 3039 | |
| }, | |
| { | |
| "epoch": 0.45234248788368336, | |
| "grad_norm": 0.4570395047899931, | |
| "learning_rate": 2.5714285714285718e-05, | |
| "loss": 0.231, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23134195804595947, | |
| "step": 280, | |
| "valid_targets_mean": 6000.9, | |
| "valid_targets_min": 3196 | |
| }, | |
| { | |
| "epoch": 0.4604200323101777, | |
| "grad_norm": 0.5166087876373537, | |
| "learning_rate": 2.6175115207373277e-05, | |
| "loss": 0.2356, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2255057841539383, | |
| "step": 285, | |
| "valid_targets_mean": 4838.3, | |
| "valid_targets_min": 3099 | |
| }, | |
| { | |
| "epoch": 0.46849757673667203, | |
| "grad_norm": 0.49241744642017077, | |
| "learning_rate": 2.663594470046083e-05, | |
| "loss": 0.2282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23905658721923828, | |
| "step": 290, | |
| "valid_targets_mean": 5878.9, | |
| "valid_targets_min": 3158 | |
| }, | |
| { | |
| "epoch": 0.4765751211631664, | |
| "grad_norm": 0.5151431055676697, | |
| "learning_rate": 2.7096774193548387e-05, | |
| "loss": 0.238, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2637583017349243, | |
| "step": 295, | |
| "valid_targets_mean": 5450.1, | |
| "valid_targets_min": 3154 | |
| }, | |
| { | |
| "epoch": 0.48465266558966075, | |
| "grad_norm": 0.5264942287162724, | |
| "learning_rate": 2.7557603686635946e-05, | |
| "loss": 0.2284, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24266698956489563, | |
| "step": 300, | |
| "valid_targets_mean": 5319.9, | |
| "valid_targets_min": 3183 | |
| }, | |
| { | |
| "epoch": 0.4927302100161551, | |
| "grad_norm": 0.5615744267117123, | |
| "learning_rate": 2.8018433179723505e-05, | |
| "loss": 0.2227, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23225907981395721, | |
| "step": 305, | |
| "valid_targets_mean": 5445.9, | |
| "valid_targets_min": 1904 | |
| }, | |
| { | |
| "epoch": 0.5008077544426495, | |
| "grad_norm": 0.4460039656784886, | |
| "learning_rate": 2.847926267281106e-05, | |
| "loss": 0.2255, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22618719935417175, | |
| "step": 310, | |
| "valid_targets_mean": 5620.4, | |
| "valid_targets_min": 2755 | |
| }, | |
| { | |
| "epoch": 0.5088852988691438, | |
| "grad_norm": 0.5221362994436883, | |
| "learning_rate": 2.894009216589862e-05, | |
| "loss": 0.229, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23870401084423065, | |
| "step": 315, | |
| "valid_targets_mean": 5645.4, | |
| "valid_targets_min": 3330 | |
| }, | |
| { | |
| "epoch": 0.5169628432956381, | |
| "grad_norm": 0.49079495003474594, | |
| "learning_rate": 2.9400921658986177e-05, | |
| "loss": 0.2267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22571909427642822, | |
| "step": 320, | |
| "valid_targets_mean": 5016.4, | |
| "valid_targets_min": 3153 | |
| }, | |
| { | |
| "epoch": 0.5250403877221325, | |
| "grad_norm": 0.6884266293843466, | |
| "learning_rate": 2.9861751152073736e-05, | |
| "loss": 0.2207, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2298227846622467, | |
| "step": 325, | |
| "valid_targets_mean": 5508.1, | |
| "valid_targets_min": 2791 | |
| }, | |
| { | |
| "epoch": 0.5331179321486268, | |
| "grad_norm": 0.4402543092747063, | |
| "learning_rate": 3.032258064516129e-05, | |
| "loss": 0.2171, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21528840065002441, | |
| "step": 330, | |
| "valid_targets_mean": 6062.5, | |
| "valid_targets_min": 3032 | |
| }, | |
| { | |
| "epoch": 0.5411954765751211, | |
| "grad_norm": 0.4982840307868996, | |
| "learning_rate": 3.078341013824885e-05, | |
| "loss": 0.2218, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21430733799934387, | |
| "step": 335, | |
| "valid_targets_mean": 5194.5, | |
| "valid_targets_min": 3011 | |
| }, | |
| { | |
| "epoch": 0.5492730210016155, | |
| "grad_norm": 0.47525498968315927, | |
| "learning_rate": 3.124423963133641e-05, | |
| "loss": 0.2222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23421087861061096, | |
| "step": 340, | |
| "valid_targets_mean": 5589.8, | |
| "valid_targets_min": 2991 | |
| }, | |
| { | |
| "epoch": 0.5573505654281099, | |
| "grad_norm": 0.48962076056010556, | |
| "learning_rate": 3.170506912442397e-05, | |
| "loss": 0.2267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23705869913101196, | |
| "step": 345, | |
| "valid_targets_mean": 5420.9, | |
| "valid_targets_min": 3046 | |
| }, | |
| { | |
| "epoch": 0.5654281098546042, | |
| "grad_norm": 0.5248601937434871, | |
| "learning_rate": 3.2165898617511526e-05, | |
| "loss": 0.2208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20738640427589417, | |
| "step": 350, | |
| "valid_targets_mean": 4991.7, | |
| "valid_targets_min": 3306 | |
| }, | |
| { | |
| "epoch": 0.5735056542810986, | |
| "grad_norm": 0.5529134517454101, | |
| "learning_rate": 3.2626728110599085e-05, | |
| "loss": 0.2236, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2433146983385086, | |
| "step": 355, | |
| "valid_targets_mean": 5450.4, | |
| "valid_targets_min": 2867 | |
| }, | |
| { | |
| "epoch": 0.5815831987075929, | |
| "grad_norm": 0.5074738580262357, | |
| "learning_rate": 3.3087557603686637e-05, | |
| "loss": 0.2144, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21955862641334534, | |
| "step": 360, | |
| "valid_targets_mean": 5322.2, | |
| "valid_targets_min": 3236 | |
| }, | |
| { | |
| "epoch": 0.5896607431340872, | |
| "grad_norm": 0.5076742089256605, | |
| "learning_rate": 3.3548387096774195e-05, | |
| "loss": 0.2229, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2087869644165039, | |
| "step": 365, | |
| "valid_targets_mean": 5765.1, | |
| "valid_targets_min": 3093 | |
| }, | |
| { | |
| "epoch": 0.5977382875605816, | |
| "grad_norm": 0.5278251933504018, | |
| "learning_rate": 3.4009216589861754e-05, | |
| "loss": 0.2196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2305276095867157, | |
| "step": 370, | |
| "valid_targets_mean": 4804.2, | |
| "valid_targets_min": 3003 | |
| }, | |
| { | |
| "epoch": 0.6058158319870759, | |
| "grad_norm": 0.4826719162983493, | |
| "learning_rate": 3.447004608294931e-05, | |
| "loss": 0.2263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23459242284297943, | |
| "step": 375, | |
| "valid_targets_mean": 5528.1, | |
| "valid_targets_min": 3357 | |
| }, | |
| { | |
| "epoch": 0.6138933764135702, | |
| "grad_norm": 0.6320625812833691, | |
| "learning_rate": 3.493087557603687e-05, | |
| "loss": 0.2187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2187860608100891, | |
| "step": 380, | |
| "valid_targets_mean": 5503.6, | |
| "valid_targets_min": 3016 | |
| }, | |
| { | |
| "epoch": 0.6219709208400647, | |
| "grad_norm": 0.47007347965400653, | |
| "learning_rate": 3.539170506912443e-05, | |
| "loss": 0.219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20786529779434204, | |
| "step": 385, | |
| "valid_targets_mean": 5985.6, | |
| "valid_targets_min": 2796 | |
| }, | |
| { | |
| "epoch": 0.630048465266559, | |
| "grad_norm": 0.4630795250015434, | |
| "learning_rate": 3.585253456221198e-05, | |
| "loss": 0.2187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19715559482574463, | |
| "step": 390, | |
| "valid_targets_mean": 4612.4, | |
| "valid_targets_min": 1713 | |
| }, | |
| { | |
| "epoch": 0.6381260096930533, | |
| "grad_norm": 0.5023966582999537, | |
| "learning_rate": 3.631336405529954e-05, | |
| "loss": 0.2291, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2126794159412384, | |
| "step": 395, | |
| "valid_targets_mean": 5234.2, | |
| "valid_targets_min": 3174 | |
| }, | |
| { | |
| "epoch": 0.6462035541195477, | |
| "grad_norm": 0.5251269446671685, | |
| "learning_rate": 3.67741935483871e-05, | |
| "loss": 0.2214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20799562335014343, | |
| "step": 400, | |
| "valid_targets_mean": 5003.8, | |
| "valid_targets_min": 3258 | |
| }, | |
| { | |
| "epoch": 0.654281098546042, | |
| "grad_norm": 0.4819749890049641, | |
| "learning_rate": 3.723502304147466e-05, | |
| "loss": 0.2086, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21949312090873718, | |
| "step": 405, | |
| "valid_targets_mean": 5259.9, | |
| "valid_targets_min": 3497 | |
| }, | |
| { | |
| "epoch": 0.6623586429725363, | |
| "grad_norm": 0.5188610703053317, | |
| "learning_rate": 3.7695852534562217e-05, | |
| "loss": 0.213, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22154083847999573, | |
| "step": 410, | |
| "valid_targets_mean": 5679.1, | |
| "valid_targets_min": 3014 | |
| }, | |
| { | |
| "epoch": 0.6704361873990307, | |
| "grad_norm": 0.5039417340983822, | |
| "learning_rate": 3.815668202764977e-05, | |
| "loss": 0.2212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20660996437072754, | |
| "step": 415, | |
| "valid_targets_mean": 5197.8, | |
| "valid_targets_min": 1997 | |
| }, | |
| { | |
| "epoch": 0.678513731825525, | |
| "grad_norm": 0.7798249907735217, | |
| "learning_rate": 3.861751152073733e-05, | |
| "loss": 0.2075, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2136276513338089, | |
| "step": 420, | |
| "valid_targets_mean": 4663.6, | |
| "valid_targets_min": 2738 | |
| }, | |
| { | |
| "epoch": 0.6865912762520194, | |
| "grad_norm": 0.6066741003505866, | |
| "learning_rate": 3.9078341013824886e-05, | |
| "loss": 0.2176, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21756117045879364, | |
| "step": 425, | |
| "valid_targets_mean": 5574.6, | |
| "valid_targets_min": 3514 | |
| }, | |
| { | |
| "epoch": 0.6946688206785138, | |
| "grad_norm": 0.5223178069456903, | |
| "learning_rate": 3.9539170506912445e-05, | |
| "loss": 0.2074, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21287712454795837, | |
| "step": 430, | |
| "valid_targets_mean": 4827.9, | |
| "valid_targets_min": 2644 | |
| }, | |
| { | |
| "epoch": 0.7027463651050081, | |
| "grad_norm": 0.6672575878387428, | |
| "learning_rate": 4e-05, | |
| "loss": 0.225, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2201261818408966, | |
| "step": 435, | |
| "valid_targets_mean": 5125.8, | |
| "valid_targets_min": 3199 | |
| }, | |
| { | |
| "epoch": 0.7108239095315024, | |
| "grad_norm": 0.5201189718514464, | |
| "learning_rate": 3.999983769470204e-05, | |
| "loss": 0.215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19945184886455536, | |
| "step": 440, | |
| "valid_targets_mean": 4788.5, | |
| "valid_targets_min": 3072 | |
| }, | |
| { | |
| "epoch": 0.7189014539579968, | |
| "grad_norm": 0.5370932365597184, | |
| "learning_rate": 3.999935078144245e-05, | |
| "loss": 0.2192, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2333751916885376, | |
| "step": 445, | |
| "valid_targets_mean": 5377.1, | |
| "valid_targets_min": 2867 | |
| }, | |
| { | |
| "epoch": 0.7269789983844911, | |
| "grad_norm": 0.47353136343989644, | |
| "learning_rate": 3.99985392681241e-05, | |
| "loss": 0.2166, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2093178629875183, | |
| "step": 450, | |
| "valid_targets_mean": 5342.3, | |
| "valid_targets_min": 3433 | |
| }, | |
| { | |
| "epoch": 0.7350565428109854, | |
| "grad_norm": 0.46778639574510306, | |
| "learning_rate": 3.999740316791827e-05, | |
| "loss": 0.2017, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19645030796527863, | |
| "step": 455, | |
| "valid_targets_mean": 5162.6, | |
| "valid_targets_min": 2735 | |
| }, | |
| { | |
| "epoch": 0.7431340872374798, | |
| "grad_norm": 0.5039905355084849, | |
| "learning_rate": 3.999594249926448e-05, | |
| "loss": 0.2197, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21575768291950226, | |
| "step": 460, | |
| "valid_targets_mean": 5144.1, | |
| "valid_targets_min": 2666 | |
| }, | |
| { | |
| "epoch": 0.7512116316639742, | |
| "grad_norm": 0.4533789907763941, | |
| "learning_rate": 3.999415728587014e-05, | |
| "loss": 0.2122, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2194235473871231, | |
| "step": 465, | |
| "valid_targets_mean": 5687.6, | |
| "valid_targets_min": 3098 | |
| }, | |
| { | |
| "epoch": 0.7592891760904685, | |
| "grad_norm": 0.510641703733595, | |
| "learning_rate": 3.999204755671023e-05, | |
| "loss": 0.2295, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23498345911502838, | |
| "step": 470, | |
| "valid_targets_mean": 5974.1, | |
| "valid_targets_min": 3649 | |
| }, | |
| { | |
| "epoch": 0.7673667205169629, | |
| "grad_norm": 0.4919939945918816, | |
| "learning_rate": 3.998961334602676e-05, | |
| "loss": 0.2198, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2350984811782837, | |
| "step": 475, | |
| "valid_targets_mean": 5363.2, | |
| "valid_targets_min": 3018 | |
| }, | |
| { | |
| "epoch": 0.7754442649434572, | |
| "grad_norm": 0.4966404337232152, | |
| "learning_rate": 3.998685469332826e-05, | |
| "loss": 0.2176, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22934846580028534, | |
| "step": 480, | |
| "valid_targets_mean": 5015.6, | |
| "valid_targets_min": 2629 | |
| }, | |
| { | |
| "epoch": 0.7835218093699515, | |
| "grad_norm": 0.4663688220948183, | |
| "learning_rate": 3.998377164338912e-05, | |
| "loss": 0.2188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19937145709991455, | |
| "step": 485, | |
| "valid_targets_mean": 4668.9, | |
| "valid_targets_min": 2626 | |
| }, | |
| { | |
| "epoch": 0.7915993537964459, | |
| "grad_norm": 0.43392124152029365, | |
| "learning_rate": 3.9980364246248886e-05, | |
| "loss": 0.2094, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21592256426811218, | |
| "step": 490, | |
| "valid_targets_mean": 6146.0, | |
| "valid_targets_min": 3710 | |
| }, | |
| { | |
| "epoch": 0.7996768982229402, | |
| "grad_norm": 0.4987695829826951, | |
| "learning_rate": 3.997663255721141e-05, | |
| "loss": 0.2169, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20188398659229279, | |
| "step": 495, | |
| "valid_targets_mean": 5604.4, | |
| "valid_targets_min": 3111 | |
| }, | |
| { | |
| "epoch": 0.8077544426494345, | |
| "grad_norm": 0.42396799291927767, | |
| "learning_rate": 3.9972576636843976e-05, | |
| "loss": 0.2103, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19645419716835022, | |
| "step": 500, | |
| "valid_targets_mean": 5397.3, | |
| "valid_targets_min": 3259 | |
| }, | |
| { | |
| "epoch": 0.815831987075929, | |
| "grad_norm": 0.4615497571346846, | |
| "learning_rate": 3.9968196550976335e-05, | |
| "loss": 0.2084, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20108461380004883, | |
| "step": 505, | |
| "valid_targets_mean": 5036.8, | |
| "valid_targets_min": 3462 | |
| }, | |
| { | |
| "epoch": 0.8239095315024233, | |
| "grad_norm": 0.45009318204805604, | |
| "learning_rate": 3.99634923706996e-05, | |
| "loss": 0.2156, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.214469313621521, | |
| "step": 510, | |
| "valid_targets_mean": 5153.3, | |
| "valid_targets_min": 951 | |
| }, | |
| { | |
| "epoch": 0.8319870759289176, | |
| "grad_norm": 0.4672244800718876, | |
| "learning_rate": 3.99584641723651e-05, | |
| "loss": 0.2138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22961321473121643, | |
| "step": 515, | |
| "valid_targets_mean": 5477.1, | |
| "valid_targets_min": 3372 | |
| }, | |
| { | |
| "epoch": 0.840064620355412, | |
| "grad_norm": 0.5295100729321306, | |
| "learning_rate": 3.995311203758315e-05, | |
| "loss": 0.1954, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19334357976913452, | |
| "step": 520, | |
| "valid_targets_mean": 5322.9, | |
| "valid_targets_min": 2539 | |
| }, | |
| { | |
| "epoch": 0.8481421647819063, | |
| "grad_norm": 0.44621467111732166, | |
| "learning_rate": 3.9947436053221765e-05, | |
| "loss": 0.203, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1929529756307602, | |
| "step": 525, | |
| "valid_targets_mean": 5034.4, | |
| "valid_targets_min": 2613 | |
| }, | |
| { | |
| "epoch": 0.8562197092084006, | |
| "grad_norm": 0.45403900653095236, | |
| "learning_rate": 3.9941436311405146e-05, | |
| "loss": 0.2146, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19396071135997772, | |
| "step": 530, | |
| "valid_targets_mean": 5233.5, | |
| "valid_targets_min": 2868 | |
| }, | |
| { | |
| "epoch": 0.864297253634895, | |
| "grad_norm": 0.46795099266112605, | |
| "learning_rate": 3.9935112909512296e-05, | |
| "loss": 0.2138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19735530018806458, | |
| "step": 535, | |
| "valid_targets_mean": 5340.6, | |
| "valid_targets_min": 2698 | |
| }, | |
| { | |
| "epoch": 0.8723747980613893, | |
| "grad_norm": 0.44225098932497453, | |
| "learning_rate": 3.992846595017538e-05, | |
| "loss": 0.2062, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19745707511901855, | |
| "step": 540, | |
| "valid_targets_mean": 5411.4, | |
| "valid_targets_min": 2974 | |
| }, | |
| { | |
| "epoch": 0.8804523424878837, | |
| "grad_norm": 0.4699030453155565, | |
| "learning_rate": 3.9921495541278055e-05, | |
| "loss": 0.201, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22131837904453278, | |
| "step": 545, | |
| "valid_targets_mean": 5369.1, | |
| "valid_targets_min": 3132 | |
| }, | |
| { | |
| "epoch": 0.8885298869143781, | |
| "grad_norm": 0.4484267118953338, | |
| "learning_rate": 3.991420179595377e-05, | |
| "loss": 0.2042, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19803202152252197, | |
| "step": 550, | |
| "valid_targets_mean": 5474.4, | |
| "valid_targets_min": 3240 | |
| }, | |
| { | |
| "epoch": 0.8966074313408724, | |
| "grad_norm": 0.4363172999484147, | |
| "learning_rate": 3.990658483258386e-05, | |
| "loss": 0.2075, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20177263021469116, | |
| "step": 555, | |
| "valid_targets_mean": 5468.0, | |
| "valid_targets_min": 3611 | |
| }, | |
| { | |
| "epoch": 0.9046849757673667, | |
| "grad_norm": 0.42726472894471124, | |
| "learning_rate": 3.989864477479568e-05, | |
| "loss": 0.2086, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19797097146511078, | |
| "step": 560, | |
| "valid_targets_mean": 5297.5, | |
| "valid_targets_min": 3148 | |
| }, | |
| { | |
| "epoch": 0.9127625201938611, | |
| "grad_norm": 0.8604100735139655, | |
| "learning_rate": 3.989038175146058e-05, | |
| "loss": 0.1994, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1961522400379181, | |
| "step": 565, | |
| "valid_targets_mean": 4621.4, | |
| "valid_targets_min": 2523 | |
| }, | |
| { | |
| "epoch": 0.9208400646203554, | |
| "grad_norm": 0.5029110628980737, | |
| "learning_rate": 3.98817958966918e-05, | |
| "loss": 0.212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20313574373722076, | |
| "step": 570, | |
| "valid_targets_mean": 5248.2, | |
| "valid_targets_min": 2731 | |
| }, | |
| { | |
| "epoch": 0.9289176090468497, | |
| "grad_norm": 0.4875542838449614, | |
| "learning_rate": 3.9872887349842314e-05, | |
| "loss": 0.2066, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2127552330493927, | |
| "step": 575, | |
| "valid_targets_mean": 5201.7, | |
| "valid_targets_min": 3392 | |
| }, | |
| { | |
| "epoch": 0.9369951534733441, | |
| "grad_norm": 0.43717989119834383, | |
| "learning_rate": 3.9863656255502564e-05, | |
| "loss": 0.2088, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21224826574325562, | |
| "step": 580, | |
| "valid_targets_mean": 5225.4, | |
| "valid_targets_min": 3381 | |
| }, | |
| { | |
| "epoch": 0.9450726978998385, | |
| "grad_norm": 0.4616037851520303, | |
| "learning_rate": 3.985410276349809e-05, | |
| "loss": 0.2075, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17695999145507812, | |
| "step": 585, | |
| "valid_targets_mean": 5687.9, | |
| "valid_targets_min": 2938 | |
| }, | |
| { | |
| "epoch": 0.9531502423263328, | |
| "grad_norm": 0.4463582506367317, | |
| "learning_rate": 3.984422702888714e-05, | |
| "loss": 0.2002, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1981690526008606, | |
| "step": 590, | |
| "valid_targets_mean": 6257.0, | |
| "valid_targets_min": 3446 | |
| }, | |
| { | |
| "epoch": 0.9612277867528272, | |
| "grad_norm": 0.480542991208568, | |
| "learning_rate": 3.983402921195811e-05, | |
| "loss": 0.2052, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1909128725528717, | |
| "step": 595, | |
| "valid_targets_mean": 5778.5, | |
| "valid_targets_min": 2941 | |
| }, | |
| { | |
| "epoch": 0.9693053311793215, | |
| "grad_norm": 0.3902770147091619, | |
| "learning_rate": 3.982350947822697e-05, | |
| "loss": 0.1947, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1805838793516159, | |
| "step": 600, | |
| "valid_targets_mean": 5756.9, | |
| "valid_targets_min": 3440 | |
| }, | |
| { | |
| "epoch": 0.9773828756058158, | |
| "grad_norm": 0.4131148462914201, | |
| "learning_rate": 3.981266799843458e-05, | |
| "loss": 0.1985, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18150508403778076, | |
| "step": 605, | |
| "valid_targets_mean": 5193.4, | |
| "valid_targets_min": 2566 | |
| }, | |
| { | |
| "epoch": 0.9854604200323102, | |
| "grad_norm": 0.4235370608617834, | |
| "learning_rate": 3.9801504948543896e-05, | |
| "loss": 0.2027, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19405357539653778, | |
| "step": 610, | |
| "valid_targets_mean": 5936.7, | |
| "valid_targets_min": 3174 | |
| }, | |
| { | |
| "epoch": 0.9935379644588045, | |
| "grad_norm": 0.4516255003041157, | |
| "learning_rate": 3.979002050973713e-05, | |
| "loss": 0.1997, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18916358053684235, | |
| "step": 615, | |
| "valid_targets_mean": 5747.1, | |
| "valid_targets_min": 2721 | |
| }, | |
| { | |
| "epoch": 1.001615508885299, | |
| "grad_norm": 0.461759410912527, | |
| "learning_rate": 3.9778214868412814e-05, | |
| "loss": 0.2055, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19580666720867157, | |
| "step": 620, | |
| "valid_targets_mean": 5032.6, | |
| "valid_targets_min": 2931 | |
| }, | |
| { | |
| "epoch": 1.0096930533117932, | |
| "grad_norm": 0.4503702954517947, | |
| "learning_rate": 3.9766088216182757e-05, | |
| "loss": 0.2, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19610512256622314, | |
| "step": 625, | |
| "valid_targets_mean": 5503.9, | |
| "valid_targets_min": 3242 | |
| }, | |
| { | |
| "epoch": 1.0177705977382876, | |
| "grad_norm": 0.38535465367440225, | |
| "learning_rate": 3.975364074986895e-05, | |
| "loss": 0.1926, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1850084364414215, | |
| "step": 630, | |
| "valid_targets_mean": 5991.6, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 1.0258481421647818, | |
| "grad_norm": 0.49310881925785266, | |
| "learning_rate": 3.974087267150037e-05, | |
| "loss": 0.1896, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18842723965644836, | |
| "step": 635, | |
| "valid_targets_mean": 5075.8, | |
| "valid_targets_min": 3002 | |
| }, | |
| { | |
| "epoch": 1.0339256865912763, | |
| "grad_norm": 0.43908988379157626, | |
| "learning_rate": 3.972778418830969e-05, | |
| "loss": 0.1946, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19962944090366364, | |
| "step": 640, | |
| "valid_targets_mean": 5342.0, | |
| "valid_targets_min": 2747 | |
| }, | |
| { | |
| "epoch": 1.0420032310177707, | |
| "grad_norm": 0.4331538096922879, | |
| "learning_rate": 3.971437551272992e-05, | |
| "loss": 0.1919, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21022823452949524, | |
| "step": 645, | |
| "valid_targets_mean": 6027.9, | |
| "valid_targets_min": 3154 | |
| }, | |
| { | |
| "epoch": 1.050080775444265, | |
| "grad_norm": 0.5019102025271842, | |
| "learning_rate": 3.970064686239098e-05, | |
| "loss": 0.2003, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20737481117248535, | |
| "step": 650, | |
| "valid_targets_mean": 5382.2, | |
| "valid_targets_min": 2603 | |
| }, | |
| { | |
| "epoch": 1.0581583198707594, | |
| "grad_norm": 0.5083170777115193, | |
| "learning_rate": 3.968659846011614e-05, | |
| "loss": 0.1951, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20576795935630798, | |
| "step": 655, | |
| "valid_targets_mean": 5068.6, | |
| "valid_targets_min": 3324 | |
| }, | |
| { | |
| "epoch": 1.0662358642972536, | |
| "grad_norm": 0.5345249950524992, | |
| "learning_rate": 3.9672230533918394e-05, | |
| "loss": 0.1948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20422016084194183, | |
| "step": 660, | |
| "valid_targets_mean": 4944.4, | |
| "valid_targets_min": 2800 | |
| }, | |
| { | |
| "epoch": 1.074313408723748, | |
| "grad_norm": 0.45969874071534583, | |
| "learning_rate": 3.965754331699681e-05, | |
| "loss": 0.2059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21545246243476868, | |
| "step": 665, | |
| "valid_targets_mean": 5249.9, | |
| "valid_targets_min": 3535 | |
| }, | |
| { | |
| "epoch": 1.0823909531502423, | |
| "grad_norm": 0.448125663434801, | |
| "learning_rate": 3.96425370477327e-05, | |
| "loss": 0.1806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18208260834217072, | |
| "step": 670, | |
| "valid_targets_mean": 5157.9, | |
| "valid_targets_min": 3169 | |
| }, | |
| { | |
| "epoch": 1.0904684975767367, | |
| "grad_norm": 0.4135195037949607, | |
| "learning_rate": 3.962721196968575e-05, | |
| "loss": 0.1843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1834549605846405, | |
| "step": 675, | |
| "valid_targets_mean": 5862.1, | |
| "valid_targets_min": 1603 | |
| }, | |
| { | |
| "epoch": 1.098546042003231, | |
| "grad_norm": 0.44798067711582557, | |
| "learning_rate": 3.961156833159012e-05, | |
| "loss": 0.1985, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19529622793197632, | |
| "step": 680, | |
| "valid_targets_mean": 5122.6, | |
| "valid_targets_min": 3054 | |
| }, | |
| { | |
| "epoch": 1.1066235864297254, | |
| "grad_norm": 0.42894997437857857, | |
| "learning_rate": 3.959560638735033e-05, | |
| "loss": 0.1919, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20408596098423004, | |
| "step": 685, | |
| "valid_targets_mean": 5497.6, | |
| "valid_targets_min": 1227 | |
| }, | |
| { | |
| "epoch": 1.1147011308562198, | |
| "grad_norm": 0.4528612328531972, | |
| "learning_rate": 3.9579326396037194e-05, | |
| "loss": 0.2021, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20387399196624756, | |
| "step": 690, | |
| "valid_targets_mean": 5263.1, | |
| "valid_targets_min": 3358 | |
| }, | |
| { | |
| "epoch": 1.122778675282714, | |
| "grad_norm": 0.5306644692965696, | |
| "learning_rate": 3.95627286218836e-05, | |
| "loss": 0.1899, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1954362839460373, | |
| "step": 695, | |
| "valid_targets_mean": 5552.9, | |
| "valid_targets_min": 3140 | |
| }, | |
| { | |
| "epoch": 1.1308562197092085, | |
| "grad_norm": 0.4038746508777421, | |
| "learning_rate": 3.9545813334280196e-05, | |
| "loss": 0.2006, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22040167450904846, | |
| "step": 700, | |
| "valid_targets_mean": 6317.6, | |
| "valid_targets_min": 3279 | |
| }, | |
| { | |
| "epoch": 1.1389337641357027, | |
| "grad_norm": 0.48511839796376943, | |
| "learning_rate": 3.952858080777109e-05, | |
| "loss": 0.1927, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18491333723068237, | |
| "step": 705, | |
| "valid_targets_mean": 4990.2, | |
| "valid_targets_min": 3120 | |
| }, | |
| { | |
| "epoch": 1.1470113085621971, | |
| "grad_norm": 0.4163906032254929, | |
| "learning_rate": 3.951103132204929e-05, | |
| "loss": 0.1957, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18463069200515747, | |
| "step": 710, | |
| "valid_targets_mean": 5359.3, | |
| "valid_targets_min": 3578 | |
| }, | |
| { | |
| "epoch": 1.1550888529886914, | |
| "grad_norm": 0.4707109747768151, | |
| "learning_rate": 3.9493165161952273e-05, | |
| "loss": 0.1903, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2015255093574524, | |
| "step": 715, | |
| "valid_targets_mean": 5115.6, | |
| "valid_targets_min": 3198 | |
| }, | |
| { | |
| "epoch": 1.1631663974151858, | |
| "grad_norm": 0.4360586669061346, | |
| "learning_rate": 3.947498261745727e-05, | |
| "loss": 0.1886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21659937500953674, | |
| "step": 720, | |
| "valid_targets_mean": 5570.9, | |
| "valid_targets_min": 3109 | |
| }, | |
| { | |
| "epoch": 1.1712439418416802, | |
| "grad_norm": 0.4909905395226162, | |
| "learning_rate": 3.9456483983676605e-05, | |
| "loss": 0.1923, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1849258691072464, | |
| "step": 725, | |
| "valid_targets_mean": 4441.8, | |
| "valid_targets_min": 2975 | |
| }, | |
| { | |
| "epoch": 1.1793214862681745, | |
| "grad_norm": 0.3961239491997915, | |
| "learning_rate": 3.943766956085291e-05, | |
| "loss": 0.1932, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18745461106300354, | |
| "step": 730, | |
| "valid_targets_mean": 5738.1, | |
| "valid_targets_min": 3693 | |
| }, | |
| { | |
| "epoch": 1.187399030694669, | |
| "grad_norm": 0.4504001793054616, | |
| "learning_rate": 3.941853965435425e-05, | |
| "loss": 0.1935, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20162153244018555, | |
| "step": 735, | |
| "valid_targets_mean": 5464.8, | |
| "valid_targets_min": 3482 | |
| }, | |
| { | |
| "epoch": 1.1954765751211631, | |
| "grad_norm": 0.4192116804539455, | |
| "learning_rate": 3.939909457466912e-05, | |
| "loss": 0.2014, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19164562225341797, | |
| "step": 740, | |
| "valid_targets_mean": 5500.5, | |
| "valid_targets_min": 3403 | |
| }, | |
| { | |
| "epoch": 1.2035541195476576, | |
| "grad_norm": 0.40950065823595944, | |
| "learning_rate": 3.937933463740147e-05, | |
| "loss": 0.189, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19027677178382874, | |
| "step": 745, | |
| "valid_targets_mean": 5712.7, | |
| "valid_targets_min": 3948 | |
| }, | |
| { | |
| "epoch": 1.2116316639741518, | |
| "grad_norm": 0.49384646135843263, | |
| "learning_rate": 3.9359260163265565e-05, | |
| "loss": 0.1929, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20605245232582092, | |
| "step": 750, | |
| "valid_targets_mean": 5703.0, | |
| "valid_targets_min": 3153 | |
| }, | |
| { | |
| "epoch": 1.2197092084006462, | |
| "grad_norm": 0.43935835645833166, | |
| "learning_rate": 3.933887147808074e-05, | |
| "loss": 0.1872, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19032615423202515, | |
| "step": 755, | |
| "valid_targets_mean": 5017.1, | |
| "valid_targets_min": 3492 | |
| }, | |
| { | |
| "epoch": 1.2277867528271407, | |
| "grad_norm": 0.4246915116255467, | |
| "learning_rate": 3.9318168912766165e-05, | |
| "loss": 0.1914, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17201930284500122, | |
| "step": 760, | |
| "valid_targets_mean": 4913.5, | |
| "valid_targets_min": 2726 | |
| }, | |
| { | |
| "epoch": 1.235864297253635, | |
| "grad_norm": 0.4371494767513589, | |
| "learning_rate": 3.929715280333544e-05, | |
| "loss": 0.1956, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1912434995174408, | |
| "step": 765, | |
| "valid_targets_mean": 5462.5, | |
| "valid_targets_min": 3552 | |
| }, | |
| { | |
| "epoch": 1.2439418416801293, | |
| "grad_norm": 0.4254249509205848, | |
| "learning_rate": 3.927582349089115e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18086455762386322, | |
| "step": 770, | |
| "valid_targets_mean": 5954.0, | |
| "valid_targets_min": 2815 | |
| }, | |
| { | |
| "epoch": 1.2520193861066236, | |
| "grad_norm": 0.49992158133323245, | |
| "learning_rate": 3.9254181321619354e-05, | |
| "loss": 0.1959, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21536007523536682, | |
| "step": 775, | |
| "valid_targets_mean": 6339.7, | |
| "valid_targets_min": 2654 | |
| }, | |
| { | |
| "epoch": 1.260096930533118, | |
| "grad_norm": 0.4391153689958113, | |
| "learning_rate": 3.923222664678391e-05, | |
| "loss": 0.197, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20235435664653778, | |
| "step": 780, | |
| "valid_targets_mean": 5025.3, | |
| "valid_targets_min": 3189 | |
| }, | |
| { | |
| "epoch": 1.2681744749596122, | |
| "grad_norm": 0.45436181679553417, | |
| "learning_rate": 3.9209959822720825e-05, | |
| "loss": 0.1824, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18655994534492493, | |
| "step": 785, | |
| "valid_targets_mean": 5029.0, | |
| "valid_targets_min": 3026 | |
| }, | |
| { | |
| "epoch": 1.2762520193861067, | |
| "grad_norm": 0.4123558934830543, | |
| "learning_rate": 3.918738121083244e-05, | |
| "loss": 0.1855, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18062835931777954, | |
| "step": 790, | |
| "valid_targets_mean": 6172.1, | |
| "valid_targets_min": 4072 | |
| }, | |
| { | |
| "epoch": 1.284329563812601, | |
| "grad_norm": 0.4648329512789721, | |
| "learning_rate": 3.916449117758162e-05, | |
| "loss": 0.1919, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20275992155075073, | |
| "step": 795, | |
| "valid_targets_mean": 5446.8, | |
| "valid_targets_min": 2875 | |
| }, | |
| { | |
| "epoch": 1.2924071082390953, | |
| "grad_norm": 0.43193021218404604, | |
| "learning_rate": 3.9141290094485695e-05, | |
| "loss": 0.1892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1833624541759491, | |
| "step": 800, | |
| "valid_targets_mean": 5321.3, | |
| "valid_targets_min": 2904 | |
| }, | |
| { | |
| "epoch": 1.3004846526655895, | |
| "grad_norm": 0.4754726363482082, | |
| "learning_rate": 3.911777833811056e-05, | |
| "loss": 0.1989, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19616547226905823, | |
| "step": 805, | |
| "valid_targets_mean": 5313.9, | |
| "valid_targets_min": 3515 | |
| }, | |
| { | |
| "epoch": 1.308562197092084, | |
| "grad_norm": 0.45079727006553794, | |
| "learning_rate": 3.909395629006446e-05, | |
| "loss": 0.1944, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19740955531597137, | |
| "step": 810, | |
| "valid_targets_mean": 6083.8, | |
| "valid_targets_min": 3641 | |
| }, | |
| { | |
| "epoch": 1.3166397415185784, | |
| "grad_norm": 0.6882667145706619, | |
| "learning_rate": 3.906982433699188e-05, | |
| "loss": 0.1874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19484448432922363, | |
| "step": 815, | |
| "valid_targets_mean": 5730.5, | |
| "valid_targets_min": 3268 | |
| }, | |
| { | |
| "epoch": 1.3247172859450727, | |
| "grad_norm": 0.4089270681222454, | |
| "learning_rate": 3.9045382870567176e-05, | |
| "loss": 0.1857, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18390175700187683, | |
| "step": 820, | |
| "valid_targets_mean": 5482.7, | |
| "valid_targets_min": 1648 | |
| }, | |
| { | |
| "epoch": 1.332794830371567, | |
| "grad_norm": 0.45501691325829674, | |
| "learning_rate": 3.9020632287488306e-05, | |
| "loss": 0.1859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18795420229434967, | |
| "step": 825, | |
| "valid_targets_mean": 5459.8, | |
| "valid_targets_min": 2946 | |
| }, | |
| { | |
| "epoch": 1.3408723747980613, | |
| "grad_norm": 0.4032112523950925, | |
| "learning_rate": 3.899557298947036e-05, | |
| "loss": 0.1899, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1787724494934082, | |
| "step": 830, | |
| "valid_targets_mean": 5734.0, | |
| "valid_targets_min": 3219 | |
| }, | |
| { | |
| "epoch": 1.3489499192245558, | |
| "grad_norm": 0.6157000030569536, | |
| "learning_rate": 3.897020538323901e-05, | |
| "loss": 0.1947, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18250499665737152, | |
| "step": 835, | |
| "valid_targets_mean": 4282.5, | |
| "valid_targets_min": 935 | |
| }, | |
| { | |
| "epoch": 1.35702746365105, | |
| "grad_norm": 0.4289416257934381, | |
| "learning_rate": 3.894452988052393e-05, | |
| "loss": 0.1946, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20244619250297546, | |
| "step": 840, | |
| "valid_targets_mean": 5285.2, | |
| "valid_targets_min": 3236 | |
| }, | |
| { | |
| "epoch": 1.3651050080775444, | |
| "grad_norm": 0.443933802902784, | |
| "learning_rate": 3.8918546898052163e-05, | |
| "loss": 0.1878, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1882239729166031, | |
| "step": 845, | |
| "valid_targets_mean": 5619.4, | |
| "valid_targets_min": 2782 | |
| }, | |
| { | |
| "epoch": 1.3731825525040389, | |
| "grad_norm": 0.42620992362285304, | |
| "learning_rate": 3.889225685754126e-05, | |
| "loss": 0.1887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19183053076267242, | |
| "step": 850, | |
| "valid_targets_mean": 5213.1, | |
| "valid_targets_min": 3058 | |
| }, | |
| { | |
| "epoch": 1.381260096930533, | |
| "grad_norm": 0.4479107119801428, | |
| "learning_rate": 3.8865660185692506e-05, | |
| "loss": 0.1883, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19001539051532745, | |
| "step": 855, | |
| "valid_targets_mean": 4980.0, | |
| "valid_targets_min": 2417 | |
| }, | |
| { | |
| "epoch": 1.3893376413570275, | |
| "grad_norm": 0.4378497546979614, | |
| "learning_rate": 3.883875731418399e-05, | |
| "loss": 0.1829, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18882623314857483, | |
| "step": 860, | |
| "valid_targets_mean": 5717.3, | |
| "valid_targets_min": 3232 | |
| }, | |
| { | |
| "epoch": 1.3974151857835218, | |
| "grad_norm": 0.449776610356312, | |
| "learning_rate": 3.881154867966356e-05, | |
| "loss": 0.1932, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19307288527488708, | |
| "step": 865, | |
| "valid_targets_mean": 5585.6, | |
| "valid_targets_min": 3062 | |
| }, | |
| { | |
| "epoch": 1.4054927302100162, | |
| "grad_norm": 0.42148035811696744, | |
| "learning_rate": 3.878403472374176e-05, | |
| "loss": 0.1852, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18191015720367432, | |
| "step": 870, | |
| "valid_targets_mean": 5366.6, | |
| "valid_targets_min": 3470 | |
| }, | |
| { | |
| "epoch": 1.4135702746365104, | |
| "grad_norm": 0.4657919244437453, | |
| "learning_rate": 3.875621589298469e-05, | |
| "loss": 0.1888, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19623816013336182, | |
| "step": 875, | |
| "valid_targets_mean": 5090.8, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 1.4216478190630049, | |
| "grad_norm": 0.5048847662100151, | |
| "learning_rate": 3.872809263890669e-05, | |
| "loss": 0.1868, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1963757425546646, | |
| "step": 880, | |
| "valid_targets_mean": 5389.4, | |
| "valid_targets_min": 3219 | |
| }, | |
| { | |
| "epoch": 1.4297253634894993, | |
| "grad_norm": 0.44706704838768546, | |
| "learning_rate": 3.8699665417963104e-05, | |
| "loss": 0.1928, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1936870962381363, | |
| "step": 885, | |
| "valid_targets_mean": 5634.3, | |
| "valid_targets_min": 3611 | |
| }, | |
| { | |
| "epoch": 1.4378029079159935, | |
| "grad_norm": 0.45989658582019405, | |
| "learning_rate": 3.867093469154275e-05, | |
| "loss": 0.1945, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18847247958183289, | |
| "step": 890, | |
| "valid_targets_mean": 4658.3, | |
| "valid_targets_min": 2568 | |
| }, | |
| { | |
| "epoch": 1.445880452342488, | |
| "grad_norm": 0.44383505996365463, | |
| "learning_rate": 3.864190092596058e-05, | |
| "loss": 0.184, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19026578962802887, | |
| "step": 895, | |
| "valid_targets_mean": 5617.9, | |
| "valid_targets_min": 2564 | |
| }, | |
| { | |
| "epoch": 1.4539579967689822, | |
| "grad_norm": 0.4194199111124668, | |
| "learning_rate": 3.861256459244996e-05, | |
| "loss": 0.1909, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17383940517902374, | |
| "step": 900, | |
| "valid_targets_mean": 5795.9, | |
| "valid_targets_min": 3262 | |
| }, | |
| { | |
| "epoch": 1.4620355411954766, | |
| "grad_norm": 0.4255109011676174, | |
| "learning_rate": 3.858292616715514e-05, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19212770462036133, | |
| "step": 905, | |
| "valid_targets_mean": 5731.1, | |
| "valid_targets_min": 2999 | |
| }, | |
| { | |
| "epoch": 1.4701130856219708, | |
| "grad_norm": 0.4319447738916833, | |
| "learning_rate": 3.855298613112346e-05, | |
| "loss": 0.1902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1939164698123932, | |
| "step": 910, | |
| "valid_targets_mean": 5160.8, | |
| "valid_targets_min": 3688 | |
| }, | |
| { | |
| "epoch": 1.4781906300484653, | |
| "grad_norm": 0.4272240749656332, | |
| "learning_rate": 3.852274497029757e-05, | |
| "loss": 0.1892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19081246852874756, | |
| "step": 915, | |
| "valid_targets_mean": 5302.4, | |
| "valid_targets_min": 2299 | |
| }, | |
| { | |
| "epoch": 1.4862681744749597, | |
| "grad_norm": 0.46589612683429193, | |
| "learning_rate": 3.849220317550753e-05, | |
| "loss": 0.1894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18594548106193542, | |
| "step": 920, | |
| "valid_targets_mean": 5551.4, | |
| "valid_targets_min": 3274 | |
| }, | |
| { | |
| "epoch": 1.494345718901454, | |
| "grad_norm": 0.4239255105833143, | |
| "learning_rate": 3.846136124246285e-05, | |
| "loss": 0.1844, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1909162998199463, | |
| "step": 925, | |
| "valid_targets_mean": 5199.1, | |
| "valid_targets_min": 3129 | |
| }, | |
| { | |
| "epoch": 1.5024232633279482, | |
| "grad_norm": 0.422429353264246, | |
| "learning_rate": 3.843021967174444e-05, | |
| "loss": 0.1881, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19193074107170105, | |
| "step": 930, | |
| "valid_targets_mean": 5211.6, | |
| "valid_targets_min": 2783 | |
| }, | |
| { | |
| "epoch": 1.5105008077544426, | |
| "grad_norm": 0.41284494740761696, | |
| "learning_rate": 3.8398778968796504e-05, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18506529927253723, | |
| "step": 935, | |
| "valid_targets_mean": 5035.5, | |
| "valid_targets_min": 2843 | |
| }, | |
| { | |
| "epoch": 1.518578352180937, | |
| "grad_norm": 0.46260388072253517, | |
| "learning_rate": 3.836703964391829e-05, | |
| "loss": 0.1911, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17804807424545288, | |
| "step": 940, | |
| "valid_targets_mean": 4760.1, | |
| "valid_targets_min": 2700 | |
| }, | |
| { | |
| "epoch": 1.5266558966074313, | |
| "grad_norm": 0.4539090096449715, | |
| "learning_rate": 3.833500221225587e-05, | |
| "loss": 0.1905, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.196572944521904, | |
| "step": 945, | |
| "valid_targets_mean": 5048.2, | |
| "valid_targets_min": 2911 | |
| }, | |
| { | |
| "epoch": 1.5347334410339257, | |
| "grad_norm": 0.3711320418441143, | |
| "learning_rate": 3.830266719379372e-05, | |
| "loss": 0.1779, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16216923296451569, | |
| "step": 950, | |
| "valid_targets_mean": 5451.4, | |
| "valid_targets_min": 2728 | |
| }, | |
| { | |
| "epoch": 1.5428109854604202, | |
| "grad_norm": 0.4053523215280699, | |
| "learning_rate": 3.827003511334634e-05, | |
| "loss": 0.188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1896899789571762, | |
| "step": 955, | |
| "valid_targets_mean": 5590.5, | |
| "valid_targets_min": 3036 | |
| }, | |
| { | |
| "epoch": 1.5508885298869144, | |
| "grad_norm": 0.4143801981662257, | |
| "learning_rate": 3.8237106500549665e-05, | |
| "loss": 0.1836, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18760329484939575, | |
| "step": 960, | |
| "valid_targets_mean": 5340.1, | |
| "valid_targets_min": 3031 | |
| }, | |
| { | |
| "epoch": 1.5589660743134086, | |
| "grad_norm": 0.43435463117874407, | |
| "learning_rate": 3.8203881889852546e-05, | |
| "loss": 0.1973, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19068461656570435, | |
| "step": 965, | |
| "valid_targets_mean": 5864.2, | |
| "valid_targets_min": 2487 | |
| }, | |
| { | |
| "epoch": 1.567043618739903, | |
| "grad_norm": 0.398129305740008, | |
| "learning_rate": 3.8170361820508e-05, | |
| "loss": 0.1869, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18233898282051086, | |
| "step": 970, | |
| "valid_targets_mean": 5440.2, | |
| "valid_targets_min": 3231 | |
| }, | |
| { | |
| "epoch": 1.5751211631663975, | |
| "grad_norm": 0.3950559172966164, | |
| "learning_rate": 3.813654683656451e-05, | |
| "loss": 0.1789, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1903170347213745, | |
| "step": 975, | |
| "valid_targets_mean": 5554.0, | |
| "valid_targets_min": 3331 | |
| }, | |
| { | |
| "epoch": 1.5831987075928917, | |
| "grad_norm": 0.42327581866709485, | |
| "learning_rate": 3.810243748685719e-05, | |
| "loss": 0.1953, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1960287094116211, | |
| "step": 980, | |
| "valid_targets_mean": 5807.6, | |
| "valid_targets_min": 3432 | |
| }, | |
| { | |
| "epoch": 1.5912762520193862, | |
| "grad_norm": 0.426598665566564, | |
| "learning_rate": 3.806803432499885e-05, | |
| "loss": 0.1867, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17561861872673035, | |
| "step": 985, | |
| "valid_targets_mean": 5001.1, | |
| "valid_targets_min": 3431 | |
| }, | |
| { | |
| "epoch": 1.5993537964458806, | |
| "grad_norm": 0.40412602556538074, | |
| "learning_rate": 3.803333790937105e-05, | |
| "loss": 0.1838, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18966534733772278, | |
| "step": 990, | |
| "valid_targets_mean": 5509.1, | |
| "valid_targets_min": 2404 | |
| }, | |
| { | |
| "epoch": 1.6074313408723748, | |
| "grad_norm": 0.4340842659419959, | |
| "learning_rate": 3.7998348803114976e-05, | |
| "loss": 0.192, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1840440034866333, | |
| "step": 995, | |
| "valid_targets_mean": 5310.2, | |
| "valid_targets_min": 2912 | |
| }, | |
| { | |
| "epoch": 1.615508885298869, | |
| "grad_norm": 0.4111931761292879, | |
| "learning_rate": 3.7963067574122366e-05, | |
| "loss": 0.1888, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1781894564628601, | |
| "step": 1000, | |
| "valid_targets_mean": 5349.6, | |
| "valid_targets_min": 2450 | |
| }, | |
| { | |
| "epoch": 1.6235864297253635, | |
| "grad_norm": 0.4087537870046933, | |
| "learning_rate": 3.7927494795026265e-05, | |
| "loss": 0.1971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19502080976963043, | |
| "step": 1005, | |
| "valid_targets_mean": 5933.0, | |
| "valid_targets_min": 3385 | |
| }, | |
| { | |
| "epoch": 1.631663974151858, | |
| "grad_norm": 0.4394450939693841, | |
| "learning_rate": 3.789163104319172e-05, | |
| "loss": 0.1864, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17000094056129456, | |
| "step": 1010, | |
| "valid_targets_mean": 4293.8, | |
| "valid_targets_min": 3014 | |
| }, | |
| { | |
| "epoch": 1.6397415185783522, | |
| "grad_norm": 0.4179986706849446, | |
| "learning_rate": 3.785547690070642e-05, | |
| "loss": 0.1914, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20335310697555542, | |
| "step": 1015, | |
| "valid_targets_mean": 4971.2, | |
| "valid_targets_min": 1935 | |
| }, | |
| { | |
| "epoch": 1.6478190630048464, | |
| "grad_norm": 0.41354564827824025, | |
| "learning_rate": 3.7819032954371265e-05, | |
| "loss": 0.1858, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19064179062843323, | |
| "step": 1020, | |
| "valid_targets_mean": 4628.4, | |
| "valid_targets_min": 915 | |
| }, | |
| { | |
| "epoch": 1.655896607431341, | |
| "grad_norm": 0.4510503380733471, | |
| "learning_rate": 3.7782299795690794e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17823462188243866, | |
| "step": 1025, | |
| "valid_targets_mean": 4513.6, | |
| "valid_targets_min": 3031 | |
| }, | |
| { | |
| "epoch": 1.6639741518578353, | |
| "grad_norm": 0.4359485981028798, | |
| "learning_rate": 3.774527802086364e-05, | |
| "loss": 0.1826, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.174652099609375, | |
| "step": 1030, | |
| "valid_targets_mean": 5560.6, | |
| "valid_targets_min": 3401 | |
| }, | |
| { | |
| "epoch": 1.6720516962843295, | |
| "grad_norm": 0.40463728071035676, | |
| "learning_rate": 3.770796823077283e-05, | |
| "loss": 0.1939, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18872065842151642, | |
| "step": 1035, | |
| "valid_targets_mean": 5051.6, | |
| "valid_targets_min": 936 | |
| }, | |
| { | |
| "epoch": 1.680129240710824, | |
| "grad_norm": 0.4285363767291524, | |
| "learning_rate": 3.767037103097602e-05, | |
| "loss": 0.1813, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19253358244895935, | |
| "step": 1040, | |
| "valid_targets_mean": 4814.4, | |
| "valid_targets_min": 826 | |
| }, | |
| { | |
| "epoch": 1.6882067851373184, | |
| "grad_norm": 0.42583910093778676, | |
| "learning_rate": 3.7632487031695675e-05, | |
| "loss": 0.1844, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18691575527191162, | |
| "step": 1045, | |
| "valid_targets_mean": 4759.1, | |
| "valid_targets_min": 2459 | |
| }, | |
| { | |
| "epoch": 1.6962843295638126, | |
| "grad_norm": 0.43279804450379034, | |
| "learning_rate": 3.7594316847809186e-05, | |
| "loss": 0.1937, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20354296267032623, | |
| "step": 1050, | |
| "valid_targets_mean": 5009.4, | |
| "valid_targets_min": 1970 | |
| }, | |
| { | |
| "epoch": 1.7043618739903068, | |
| "grad_norm": 0.4184227732912941, | |
| "learning_rate": 3.755586109883885e-05, | |
| "loss": 0.1824, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1752951741218567, | |
| "step": 1055, | |
| "valid_targets_mean": 5004.6, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 1.7124394184168013, | |
| "grad_norm": 0.3799780135092674, | |
| "learning_rate": 3.751712040894184e-05, | |
| "loss": 0.1893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19882872700691223, | |
| "step": 1060, | |
| "valid_targets_mean": 6411.0, | |
| "valid_targets_min": 4081 | |
| }, | |
| { | |
| "epoch": 1.7205169628432957, | |
| "grad_norm": 0.41181195585953756, | |
| "learning_rate": 3.7478095406900095e-05, | |
| "loss": 0.1813, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20096594095230103, | |
| "step": 1065, | |
| "valid_targets_mean": 5860.9, | |
| "valid_targets_min": 3251 | |
| }, | |
| { | |
| "epoch": 1.72859450726979, | |
| "grad_norm": 0.42902723973896295, | |
| "learning_rate": 3.743878672611007e-05, | |
| "loss": 0.186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18756358325481415, | |
| "step": 1070, | |
| "valid_targets_mean": 5537.8, | |
| "valid_targets_min": 3196 | |
| }, | |
| { | |
| "epoch": 1.7366720516962844, | |
| "grad_norm": 0.41118928927546355, | |
| "learning_rate": 3.7399195004572466e-05, | |
| "loss": 0.1795, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19546402990818024, | |
| "step": 1075, | |
| "valid_targets_mean": 5526.4, | |
| "valid_targets_min": 2809 | |
| }, | |
| { | |
| "epoch": 1.7447495961227788, | |
| "grad_norm": 0.4324303457609878, | |
| "learning_rate": 3.735932088488191e-05, | |
| "loss": 0.1814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18632230162620544, | |
| "step": 1080, | |
| "valid_targets_mean": 5635.8, | |
| "valid_targets_min": 916 | |
| }, | |
| { | |
| "epoch": 1.752827140549273, | |
| "grad_norm": 0.4055121110771316, | |
| "learning_rate": 3.731916501421649e-05, | |
| "loss": 0.1869, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18193931877613068, | |
| "step": 1085, | |
| "valid_targets_mean": 5442.8, | |
| "valid_targets_min": 3361 | |
| }, | |
| { | |
| "epoch": 1.7609046849757672, | |
| "grad_norm": 0.527999221110702, | |
| "learning_rate": 3.727872804432726e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20073384046554565, | |
| "step": 1090, | |
| "valid_targets_mean": 5310.4, | |
| "valid_targets_min": 3223 | |
| }, | |
| { | |
| "epoch": 1.7689822294022617, | |
| "grad_norm": 0.4274858774477681, | |
| "learning_rate": 3.7238010631527666e-05, | |
| "loss": 0.1834, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18347422778606415, | |
| "step": 1095, | |
| "valid_targets_mean": 5612.4, | |
| "valid_targets_min": 3423 | |
| }, | |
| { | |
| "epoch": 1.7770597738287561, | |
| "grad_norm": 0.42927802221254874, | |
| "learning_rate": 3.719701343668289e-05, | |
| "loss": 0.1927, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1750662922859192, | |
| "step": 1100, | |
| "valid_targets_mean": 4959.8, | |
| "valid_targets_min": 3136 | |
| }, | |
| { | |
| "epoch": 1.7851373182552503, | |
| "grad_norm": 0.3628087054837778, | |
| "learning_rate": 3.715573712519911e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17887043952941895, | |
| "step": 1105, | |
| "valid_targets_mean": 5815.1, | |
| "valid_targets_min": 4014 | |
| }, | |
| { | |
| "epoch": 1.7932148626817448, | |
| "grad_norm": 0.4266388200490565, | |
| "learning_rate": 3.711418236701275e-05, | |
| "loss": 0.193, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21544912457466125, | |
| "step": 1110, | |
| "valid_targets_mean": 5051.7, | |
| "valid_targets_min": 2386 | |
| }, | |
| { | |
| "epoch": 1.8012924071082392, | |
| "grad_norm": 0.4136868601031113, | |
| "learning_rate": 3.707234983657954e-05, | |
| "loss": 0.1871, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18486517667770386, | |
| "step": 1115, | |
| "valid_targets_mean": 5495.8, | |
| "valid_targets_min": 1305 | |
| }, | |
| { | |
| "epoch": 1.8093699515347335, | |
| "grad_norm": 0.402967118487239, | |
| "learning_rate": 3.703024021286362e-05, | |
| "loss": 0.1892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20577257871627808, | |
| "step": 1120, | |
| "valid_targets_mean": 5403.4, | |
| "valid_targets_min": 2698 | |
| }, | |
| { | |
| "epoch": 1.8174474959612277, | |
| "grad_norm": 0.42496692219736487, | |
| "learning_rate": 3.698785417932649e-05, | |
| "loss": 0.1902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17963314056396484, | |
| "step": 1125, | |
| "valid_targets_mean": 5468.1, | |
| "valid_targets_min": 2550 | |
| }, | |
| { | |
| "epoch": 1.8255250403877221, | |
| "grad_norm": 0.39479948585929464, | |
| "learning_rate": 3.694519242391592e-05, | |
| "loss": 0.1895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18878337740898132, | |
| "step": 1130, | |
| "valid_targets_mean": 5651.1, | |
| "valid_targets_min": 4059 | |
| }, | |
| { | |
| "epoch": 1.8336025848142166, | |
| "grad_norm": 0.4389945003529063, | |
| "learning_rate": 3.6902255639054806e-05, | |
| "loss": 0.1871, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18954357504844666, | |
| "step": 1135, | |
| "valid_targets_mean": 5592.6, | |
| "valid_targets_min": 3598 | |
| }, | |
| { | |
| "epoch": 1.8416801292407108, | |
| "grad_norm": 0.41830928765826586, | |
| "learning_rate": 3.685904452162992e-05, | |
| "loss": 0.185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18112030625343323, | |
| "step": 1140, | |
| "valid_targets_mean": 4664.1, | |
| "valid_targets_min": 2236 | |
| }, | |
| { | |
| "epoch": 1.849757673667205, | |
| "grad_norm": 0.41070471823717275, | |
| "learning_rate": 3.681555977298059e-05, | |
| "loss": 0.1967, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2223418653011322, | |
| "step": 1145, | |
| "valid_targets_mean": 5106.7, | |
| "valid_targets_min": 2678 | |
| }, | |
| { | |
| "epoch": 1.8578352180936997, | |
| "grad_norm": 0.3916635528154224, | |
| "learning_rate": 3.677180209888733e-05, | |
| "loss": 0.1874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17309582233428955, | |
| "step": 1150, | |
| "valid_targets_mean": 5281.2, | |
| "valid_targets_min": 2840 | |
| }, | |
| { | |
| "epoch": 1.865912762520194, | |
| "grad_norm": 0.4439330061230736, | |
| "learning_rate": 3.672777220956035e-05, | |
| "loss": 0.1853, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1888798475265503, | |
| "step": 1155, | |
| "valid_targets_mean": 4692.0, | |
| "valid_targets_min": 2598 | |
| }, | |
| { | |
| "epoch": 1.8739903069466881, | |
| "grad_norm": 0.4098709425390136, | |
| "learning_rate": 3.6683470819628104e-05, | |
| "loss": 0.1905, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2108800709247589, | |
| "step": 1160, | |
| "valid_targets_mean": 6092.9, | |
| "valid_targets_min": 3524 | |
| }, | |
| { | |
| "epoch": 1.8820678513731826, | |
| "grad_norm": 0.3506145878762836, | |
| "learning_rate": 3.663889864812562e-05, | |
| "loss": 0.179, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15639182925224304, | |
| "step": 1165, | |
| "valid_targets_mean": 5912.6, | |
| "valid_targets_min": 3557 | |
| }, | |
| { | |
| "epoch": 1.890145395799677, | |
| "grad_norm": 0.4280367169398799, | |
| "learning_rate": 3.6594056418482844e-05, | |
| "loss": 0.1792, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1831849217414856, | |
| "step": 1170, | |
| "valid_targets_mean": 4734.1, | |
| "valid_targets_min": 2709 | |
| }, | |
| { | |
| "epoch": 1.8982229402261712, | |
| "grad_norm": 0.402531733486501, | |
| "learning_rate": 3.6548944858512926e-05, | |
| "loss": 0.1853, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19320084154605865, | |
| "step": 1175, | |
| "valid_targets_mean": 5464.7, | |
| "valid_targets_min": 3488 | |
| }, | |
| { | |
| "epoch": 1.9063004846526654, | |
| "grad_norm": 0.426933703454056, | |
| "learning_rate": 3.650356470040038e-05, | |
| "loss": 0.1847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17539432644844055, | |
| "step": 1180, | |
| "valid_targets_mean": 4791.7, | |
| "valid_targets_min": 1595 | |
| }, | |
| { | |
| "epoch": 1.9143780290791599, | |
| "grad_norm": 0.39173229481194327, | |
| "learning_rate": 3.645791668068923e-05, | |
| "loss": 0.1886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20481710135936737, | |
| "step": 1185, | |
| "valid_targets_mean": 5580.2, | |
| "valid_targets_min": 3431 | |
| }, | |
| { | |
| "epoch": 1.9224555735056543, | |
| "grad_norm": 0.3534776648862029, | |
| "learning_rate": 3.6412001540271e-05, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17972326278686523, | |
| "step": 1190, | |
| "valid_targets_mean": 6481.8, | |
| "valid_targets_min": 3058 | |
| }, | |
| { | |
| "epoch": 1.9305331179321485, | |
| "grad_norm": 0.44528228054460384, | |
| "learning_rate": 3.6365820024372754e-05, | |
| "loss": 0.1828, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1828109472990036, | |
| "step": 1195, | |
| "valid_targets_mean": 5294.1, | |
| "valid_targets_min": 2729 | |
| }, | |
| { | |
| "epoch": 1.938610662358643, | |
| "grad_norm": 0.6772035127037407, | |
| "learning_rate": 3.6319372882544966e-05, | |
| "loss": 0.1893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1858840435743332, | |
| "step": 1200, | |
| "valid_targets_mean": 5217.1, | |
| "valid_targets_min": 1086 | |
| }, | |
| { | |
| "epoch": 1.9466882067851374, | |
| "grad_norm": 0.3861931899139504, | |
| "learning_rate": 3.627266086864935e-05, | |
| "loss": 0.2012, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1952175348997116, | |
| "step": 1205, | |
| "valid_targets_mean": 5899.5, | |
| "valid_targets_min": 3731 | |
| }, | |
| { | |
| "epoch": 1.9547657512116317, | |
| "grad_norm": 0.43872850575648026, | |
| "learning_rate": 3.622568474084664e-05, | |
| "loss": 0.1825, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16384650766849518, | |
| "step": 1210, | |
| "valid_targets_mean": 4261.4, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 1.9628432956381259, | |
| "grad_norm": 0.37399241885571466, | |
| "learning_rate": 3.6178445261584275e-05, | |
| "loss": 0.1814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1738545447587967, | |
| "step": 1215, | |
| "valid_targets_mean": 5521.0, | |
| "valid_targets_min": 2739 | |
| }, | |
| { | |
| "epoch": 1.9709208400646203, | |
| "grad_norm": 0.41605738475702797, | |
| "learning_rate": 3.6130943197584036e-05, | |
| "loss": 0.1859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18516364693641663, | |
| "step": 1220, | |
| "valid_targets_mean": 5404.1, | |
| "valid_targets_min": 3228 | |
| }, | |
| { | |
| "epoch": 1.9789983844911148, | |
| "grad_norm": 0.4085987552814228, | |
| "learning_rate": 3.608317931982958e-05, | |
| "loss": 0.1859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19084042310714722, | |
| "step": 1225, | |
| "valid_targets_mean": 5928.6, | |
| "valid_targets_min": 3101 | |
| }, | |
| { | |
| "epoch": 1.987075928917609, | |
| "grad_norm": 0.36877496540693433, | |
| "learning_rate": 3.6035154403553965e-05, | |
| "loss": 0.171, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.174210324883461, | |
| "step": 1230, | |
| "valid_targets_mean": 5422.4, | |
| "valid_targets_min": 2923 | |
| }, | |
| { | |
| "epoch": 1.9951534733441034, | |
| "grad_norm": 0.426933061586902, | |
| "learning_rate": 3.5986869228227005e-05, | |
| "loss": 0.1826, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18611037731170654, | |
| "step": 1235, | |
| "valid_targets_mean": 4997.9, | |
| "valid_targets_min": 1502 | |
| }, | |
| { | |
| "epoch": 2.003231017770598, | |
| "grad_norm": 0.401606174889642, | |
| "learning_rate": 3.593832457754269e-05, | |
| "loss": 0.172, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1687077432870865, | |
| "step": 1240, | |
| "valid_targets_mean": 5052.2, | |
| "valid_targets_min": 3015 | |
| }, | |
| { | |
| "epoch": 2.011308562197092, | |
| "grad_norm": 0.40606330051530004, | |
| "learning_rate": 3.588952123940642e-05, | |
| "loss": 0.1664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16908740997314453, | |
| "step": 1245, | |
| "valid_targets_mean": 5349.8, | |
| "valid_targets_min": 2523 | |
| }, | |
| { | |
| "epoch": 2.0193861066235863, | |
| "grad_norm": 0.4274105385686568, | |
| "learning_rate": 3.584046000592222e-05, | |
| "loss": 0.1634, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15948118269443512, | |
| "step": 1250, | |
| "valid_targets_mean": 5093.5, | |
| "valid_targets_min": 3262 | |
| }, | |
| { | |
| "epoch": 2.027463651050081, | |
| "grad_norm": 0.45256421704515376, | |
| "learning_rate": 3.5791141673379906e-05, | |
| "loss": 0.1644, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16202068328857422, | |
| "step": 1255, | |
| "valid_targets_mean": 5621.1, | |
| "valid_targets_min": 3673 | |
| }, | |
| { | |
| "epoch": 2.035541195476575, | |
| "grad_norm": 0.3862372173341649, | |
| "learning_rate": 3.574156704224215e-05, | |
| "loss": 0.1719, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14887890219688416, | |
| "step": 1260, | |
| "valid_targets_mean": 5471.6, | |
| "valid_targets_min": 3034 | |
| }, | |
| { | |
| "epoch": 2.0436187399030694, | |
| "grad_norm": 0.43883940614494954, | |
| "learning_rate": 3.569173691713147e-05, | |
| "loss": 0.1661, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17507337033748627, | |
| "step": 1265, | |
| "valid_targets_mean": 4869.2, | |
| "valid_targets_min": 3110 | |
| }, | |
| { | |
| "epoch": 2.0516962843295636, | |
| "grad_norm": 0.4077056739419978, | |
| "learning_rate": 3.5641652106817194e-05, | |
| "loss": 0.17, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1628391593694687, | |
| "step": 1270, | |
| "valid_targets_mean": 4946.0, | |
| "valid_targets_min": 3397 | |
| }, | |
| { | |
| "epoch": 2.0597738287560583, | |
| "grad_norm": 0.3920400314649599, | |
| "learning_rate": 3.559131342420235e-05, | |
| "loss": 0.1655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16144214570522308, | |
| "step": 1275, | |
| "valid_targets_mean": 5687.8, | |
| "valid_targets_min": 3219 | |
| }, | |
| { | |
| "epoch": 2.0678513731825525, | |
| "grad_norm": 0.4697014734112868, | |
| "learning_rate": 3.554072168631039e-05, | |
| "loss": 0.1689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16841688752174377, | |
| "step": 1280, | |
| "valid_targets_mean": 5773.9, | |
| "valid_targets_min": 2827 | |
| }, | |
| { | |
| "epoch": 2.0759289176090467, | |
| "grad_norm": 0.41851856710048596, | |
| "learning_rate": 3.5489877714272064e-05, | |
| "loss": 0.1722, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17223848402500153, | |
| "step": 1285, | |
| "valid_targets_mean": 5887.8, | |
| "valid_targets_min": 3573 | |
| }, | |
| { | |
| "epoch": 2.0840064620355414, | |
| "grad_norm": 0.3877542211888081, | |
| "learning_rate": 3.543878233331194e-05, | |
| "loss": 0.173, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16172486543655396, | |
| "step": 1290, | |
| "valid_targets_mean": 5461.9, | |
| "valid_targets_min": 3066 | |
| }, | |
| { | |
| "epoch": 2.0920840064620356, | |
| "grad_norm": 0.4405751263169863, | |
| "learning_rate": 3.538743637273515e-05, | |
| "loss": 0.1632, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.156228169798851, | |
| "step": 1295, | |
| "valid_targets_mean": 5554.2, | |
| "valid_targets_min": 3937 | |
| }, | |
| { | |
| "epoch": 2.10016155088853, | |
| "grad_norm": 0.44672479309346075, | |
| "learning_rate": 3.5335840665913814e-05, | |
| "loss": 0.171, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1716388314962387, | |
| "step": 1300, | |
| "valid_targets_mean": 4500.0, | |
| "valid_targets_min": 2274 | |
| }, | |
| { | |
| "epoch": 2.108239095315024, | |
| "grad_norm": 0.43396813723890304, | |
| "learning_rate": 3.5283996050273606e-05, | |
| "loss": 0.176, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17064327001571655, | |
| "step": 1305, | |
| "valid_targets_mean": 5100.9, | |
| "valid_targets_min": 2093 | |
| }, | |
| { | |
| "epoch": 2.1163166397415187, | |
| "grad_norm": 0.43545641516471767, | |
| "learning_rate": 3.523190336728009e-05, | |
| "loss": 0.1649, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16103370487689972, | |
| "step": 1310, | |
| "valid_targets_mean": 4831.9, | |
| "valid_targets_min": 3058 | |
| }, | |
| { | |
| "epoch": 2.124394184168013, | |
| "grad_norm": 0.4303303750630934, | |
| "learning_rate": 3.517956346242512e-05, | |
| "loss": 0.1791, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16605603694915771, | |
| "step": 1315, | |
| "valid_targets_mean": 4949.8, | |
| "valid_targets_min": 3379 | |
| }, | |
| { | |
| "epoch": 2.132471728594507, | |
| "grad_norm": 0.3785254380964551, | |
| "learning_rate": 3.512697718521308e-05, | |
| "loss": 0.1703, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.157035693526268, | |
| "step": 1320, | |
| "valid_targets_mean": 5796.8, | |
| "valid_targets_min": 2571 | |
| }, | |
| { | |
| "epoch": 2.1405492730210014, | |
| "grad_norm": 0.4479990330504035, | |
| "learning_rate": 3.5074145389147095e-05, | |
| "loss": 0.1707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18853747844696045, | |
| "step": 1325, | |
| "valid_targets_mean": 5415.8, | |
| "valid_targets_min": 2539 | |
| }, | |
| { | |
| "epoch": 2.148626817447496, | |
| "grad_norm": 0.4332672460538596, | |
| "learning_rate": 3.502106893171523e-05, | |
| "loss": 0.1703, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17232483625411987, | |
| "step": 1330, | |
| "valid_targets_mean": 5326.8, | |
| "valid_targets_min": 2771 | |
| }, | |
| { | |
| "epoch": 2.1567043618739903, | |
| "grad_norm": 0.4158673008862669, | |
| "learning_rate": 3.4967748674376494e-05, | |
| "loss": 0.173, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18939775228500366, | |
| "step": 1335, | |
| "valid_targets_mean": 5557.8, | |
| "valid_targets_min": 3100 | |
| }, | |
| { | |
| "epoch": 2.1647819063004845, | |
| "grad_norm": 0.42762701762425376, | |
| "learning_rate": 3.4914185482546915e-05, | |
| "loss": 0.1641, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16821393370628357, | |
| "step": 1340, | |
| "valid_targets_mean": 5204.9, | |
| "valid_targets_min": 2956 | |
| }, | |
| { | |
| "epoch": 2.172859450726979, | |
| "grad_norm": 0.39851109399748474, | |
| "learning_rate": 3.4860380225585475e-05, | |
| "loss": 0.1775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17029531300067902, | |
| "step": 1345, | |
| "valid_targets_mean": 5027.6, | |
| "valid_targets_min": 2571 | |
| }, | |
| { | |
| "epoch": 2.1809369951534734, | |
| "grad_norm": 0.4168775245127039, | |
| "learning_rate": 3.480633377678e-05, | |
| "loss": 0.1682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15992960333824158, | |
| "step": 1350, | |
| "valid_targets_mean": 4966.4, | |
| "valid_targets_min": 720 | |
| }, | |
| { | |
| "epoch": 2.1890145395799676, | |
| "grad_norm": 0.475573654602836, | |
| "learning_rate": 3.475204701333298e-05, | |
| "loss": 0.1677, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1710505187511444, | |
| "step": 1355, | |
| "valid_targets_mean": 4562.9, | |
| "valid_targets_min": 2976 | |
| }, | |
| { | |
| "epoch": 2.197092084006462, | |
| "grad_norm": 0.41742134982113543, | |
| "learning_rate": 3.469752081634736e-05, | |
| "loss": 0.1631, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17456302046775818, | |
| "step": 1360, | |
| "valid_targets_mean": 5109.5, | |
| "valid_targets_min": 2370 | |
| }, | |
| { | |
| "epoch": 2.2051696284329565, | |
| "grad_norm": 0.5028864896919181, | |
| "learning_rate": 3.464275607081219e-05, | |
| "loss": 0.1656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15900300443172455, | |
| "step": 1365, | |
| "valid_targets_mean": 4860.5, | |
| "valid_targets_min": 1502 | |
| }, | |
| { | |
| "epoch": 2.2132471728594507, | |
| "grad_norm": 0.3673834965367812, | |
| "learning_rate": 3.458775366558832e-05, | |
| "loss": 0.1737, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17337435483932495, | |
| "step": 1370, | |
| "valid_targets_mean": 6093.9, | |
| "valid_targets_min": 3429 | |
| }, | |
| { | |
| "epoch": 2.221324717285945, | |
| "grad_norm": 0.41072037713762377, | |
| "learning_rate": 3.4532514493393914e-05, | |
| "loss": 0.1718, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17573903501033783, | |
| "step": 1375, | |
| "valid_targets_mean": 5765.9, | |
| "valid_targets_min": 2648 | |
| }, | |
| { | |
| "epoch": 2.2294022617124396, | |
| "grad_norm": 0.4284254331283453, | |
| "learning_rate": 3.4477039450790015e-05, | |
| "loss": 0.1675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16634736955165863, | |
| "step": 1380, | |
| "valid_targets_mean": 5872.1, | |
| "valid_targets_min": 3221 | |
| }, | |
| { | |
| "epoch": 2.237479806138934, | |
| "grad_norm": 0.6490092475424905, | |
| "learning_rate": 3.4421329438165944e-05, | |
| "loss": 0.1679, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15642938017845154, | |
| "step": 1385, | |
| "valid_targets_mean": 4863.2, | |
| "valid_targets_min": 2800 | |
| }, | |
| { | |
| "epoch": 2.245557350565428, | |
| "grad_norm": 0.4532650779405078, | |
| "learning_rate": 3.436538535972473e-05, | |
| "loss": 0.1663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16853484511375427, | |
| "step": 1390, | |
| "valid_targets_mean": 5580.8, | |
| "valid_targets_min": 2997 | |
| }, | |
| { | |
| "epoch": 2.2536348949919223, | |
| "grad_norm": 0.46862263231029005, | |
| "learning_rate": 3.430920812346838e-05, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18197011947631836, | |
| "step": 1395, | |
| "valid_targets_mean": 5054.0, | |
| "valid_targets_min": 1102 | |
| }, | |
| { | |
| "epoch": 2.261712439418417, | |
| "grad_norm": 0.3926929669058476, | |
| "learning_rate": 3.425279864118324e-05, | |
| "loss": 0.1623, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1653885543346405, | |
| "step": 1400, | |
| "valid_targets_mean": 5755.1, | |
| "valid_targets_min": 3486 | |
| }, | |
| { | |
| "epoch": 2.269789983844911, | |
| "grad_norm": 0.4473829670830625, | |
| "learning_rate": 3.419615782842507e-05, | |
| "loss": 0.169, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17438775300979614, | |
| "step": 1405, | |
| "valid_targets_mean": 4425.4, | |
| "valid_targets_min": 2813 | |
| }, | |
| { | |
| "epoch": 2.2778675282714054, | |
| "grad_norm": 0.4497507198651243, | |
| "learning_rate": 3.413928660450427e-05, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18055163323879242, | |
| "step": 1410, | |
| "valid_targets_mean": 4834.5, | |
| "valid_targets_min": 2564 | |
| }, | |
| { | |
| "epoch": 2.2859450726979, | |
| "grad_norm": 0.3833845124298959, | |
| "learning_rate": 3.408218589247094e-05, | |
| "loss": 0.1629, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14256328344345093, | |
| "step": 1415, | |
| "valid_targets_mean": 5140.1, | |
| "valid_targets_min": 3313 | |
| }, | |
| { | |
| "epoch": 2.2940226171243943, | |
| "grad_norm": 0.4145273879316233, | |
| "learning_rate": 3.402485661909988e-05, | |
| "loss": 0.1704, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18663017451763153, | |
| "step": 1420, | |
| "valid_targets_mean": 5167.8, | |
| "valid_targets_min": 3523 | |
| }, | |
| { | |
| "epoch": 2.3021001615508885, | |
| "grad_norm": 0.3623669701914888, | |
| "learning_rate": 3.396729971487557e-05, | |
| "loss": 0.1657, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1587752103805542, | |
| "step": 1425, | |
| "valid_targets_mean": 5643.6, | |
| "valid_targets_min": 3462 | |
| }, | |
| { | |
| "epoch": 2.3101777059773827, | |
| "grad_norm": 0.4066395797349654, | |
| "learning_rate": 3.3909516113977073e-05, | |
| "loss": 0.1674, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16322427988052368, | |
| "step": 1430, | |
| "valid_targets_mean": 5590.2, | |
| "valid_targets_min": 3377 | |
| }, | |
| { | |
| "epoch": 2.3182552504038774, | |
| "grad_norm": 0.42382751521347023, | |
| "learning_rate": 3.385150675426283e-05, | |
| "loss": 0.1677, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16787301003932953, | |
| "step": 1435, | |
| "valid_targets_mean": 4900.2, | |
| "valid_targets_min": 2613 | |
| }, | |
| { | |
| "epoch": 2.3263327948303716, | |
| "grad_norm": 0.4268245994999202, | |
| "learning_rate": 3.37932725772555e-05, | |
| "loss": 0.1652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1627117395401001, | |
| "step": 1440, | |
| "valid_targets_mean": 5724.6, | |
| "valid_targets_min": 3407 | |
| }, | |
| { | |
| "epoch": 2.334410339256866, | |
| "grad_norm": 0.43132043272994586, | |
| "learning_rate": 3.37348145281266e-05, | |
| "loss": 0.1648, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18179920315742493, | |
| "step": 1445, | |
| "valid_targets_mean": 5504.1, | |
| "valid_targets_min": 3547 | |
| }, | |
| { | |
| "epoch": 2.3424878836833605, | |
| "grad_norm": 0.40927578349516525, | |
| "learning_rate": 3.367613355568126e-05, | |
| "loss": 0.1741, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18249475955963135, | |
| "step": 1450, | |
| "valid_targets_mean": 5405.1, | |
| "valid_targets_min": 3096 | |
| }, | |
| { | |
| "epoch": 2.3505654281098547, | |
| "grad_norm": 0.37071367481289286, | |
| "learning_rate": 3.361723061234275e-05, | |
| "loss": 0.1613, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15142659842967987, | |
| "step": 1455, | |
| "valid_targets_mean": 5216.8, | |
| "valid_targets_min": 2946 | |
| }, | |
| { | |
| "epoch": 2.358642972536349, | |
| "grad_norm": 0.4273375833985219, | |
| "learning_rate": 3.355810665413704e-05, | |
| "loss": 0.1752, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16882875561714172, | |
| "step": 1460, | |
| "valid_targets_mean": 5893.2, | |
| "valid_targets_min": 4168 | |
| }, | |
| { | |
| "epoch": 2.366720516962843, | |
| "grad_norm": 0.43524365947164323, | |
| "learning_rate": 3.34987626406773e-05, | |
| "loss": 0.1615, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16199825704097748, | |
| "step": 1465, | |
| "valid_targets_mean": 5291.4, | |
| "valid_targets_min": 3203 | |
| }, | |
| { | |
| "epoch": 2.374798061389338, | |
| "grad_norm": 0.4326769770987258, | |
| "learning_rate": 3.343919953514831e-05, | |
| "loss": 0.1679, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1664809137582779, | |
| "step": 1470, | |
| "valid_targets_mean": 5582.1, | |
| "valid_targets_min": 2984 | |
| }, | |
| { | |
| "epoch": 2.382875605815832, | |
| "grad_norm": 0.4114451446619777, | |
| "learning_rate": 3.337941830429082e-05, | |
| "loss": 0.1707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1652735322713852, | |
| "step": 1475, | |
| "valid_targets_mean": 5063.2, | |
| "valid_targets_min": 3032 | |
| }, | |
| { | |
| "epoch": 2.3909531502423262, | |
| "grad_norm": 0.3660040398033965, | |
| "learning_rate": 3.331941991838589e-05, | |
| "loss": 0.1754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16325372457504272, | |
| "step": 1480, | |
| "valid_targets_mean": 5960.8, | |
| "valid_targets_min": 4023 | |
| }, | |
| { | |
| "epoch": 2.399030694668821, | |
| "grad_norm": 0.43398835076941117, | |
| "learning_rate": 3.32592053512391e-05, | |
| "loss": 0.1684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16197870671749115, | |
| "step": 1485, | |
| "valid_targets_mean": 4514.6, | |
| "valid_targets_min": 665 | |
| }, | |
| { | |
| "epoch": 2.407108239095315, | |
| "grad_norm": 0.3909720869151018, | |
| "learning_rate": 3.319877558016478e-05, | |
| "loss": 0.1671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16862240433692932, | |
| "step": 1490, | |
| "valid_targets_mean": 5479.7, | |
| "valid_targets_min": 2374 | |
| }, | |
| { | |
| "epoch": 2.4151857835218093, | |
| "grad_norm": 0.44100978004666036, | |
| "learning_rate": 3.3138131585970134e-05, | |
| "loss": 0.1723, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18514972925186157, | |
| "step": 1495, | |
| "valid_targets_mean": 5116.6, | |
| "valid_targets_min": 3257 | |
| }, | |
| { | |
| "epoch": 2.4232633279483036, | |
| "grad_norm": 0.3981887790759492, | |
| "learning_rate": 3.3077274352939315e-05, | |
| "loss": 0.1751, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15568625926971436, | |
| "step": 1500, | |
| "valid_targets_mean": 5510.9, | |
| "valid_targets_min": 3469 | |
| }, | |
| { | |
| "epoch": 2.4313408723747982, | |
| "grad_norm": 0.43017412638926306, | |
| "learning_rate": 3.301620486881746e-05, | |
| "loss": 0.1639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17815840244293213, | |
| "step": 1505, | |
| "valid_targets_mean": 5197.9, | |
| "valid_targets_min": 2762 | |
| }, | |
| { | |
| "epoch": 2.4394184168012925, | |
| "grad_norm": 0.4401707898411131, | |
| "learning_rate": 3.2954924124794636e-05, | |
| "loss": 0.1695, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16379782557487488, | |
| "step": 1510, | |
| "valid_targets_mean": 4692.9, | |
| "valid_targets_min": 2956 | |
| }, | |
| { | |
| "epoch": 2.4474959612277867, | |
| "grad_norm": 0.3607160404524143, | |
| "learning_rate": 3.28934331154898e-05, | |
| "loss": 0.1637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15107029676437378, | |
| "step": 1515, | |
| "valid_targets_mean": 5661.4, | |
| "valid_targets_min": 2921 | |
| }, | |
| { | |
| "epoch": 2.4555735056542813, | |
| "grad_norm": 0.35908930376071707, | |
| "learning_rate": 3.2831732838934615e-05, | |
| "loss": 0.167, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14789265394210815, | |
| "step": 1520, | |
| "valid_targets_mean": 5564.4, | |
| "valid_targets_min": 2092 | |
| }, | |
| { | |
| "epoch": 2.4636510500807756, | |
| "grad_norm": 0.44315420095764363, | |
| "learning_rate": 3.276982429655724e-05, | |
| "loss": 0.1652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15249694883823395, | |
| "step": 1525, | |
| "valid_targets_mean": 4679.8, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 2.47172859450727, | |
| "grad_norm": 0.3807929761670207, | |
| "learning_rate": 3.270770849316612e-05, | |
| "loss": 0.1617, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1659863442182541, | |
| "step": 1530, | |
| "valid_targets_mean": 5424.8, | |
| "valid_targets_min": 3098 | |
| }, | |
| { | |
| "epoch": 2.479806138933764, | |
| "grad_norm": 0.4029349113660074, | |
| "learning_rate": 3.264538643693367e-05, | |
| "loss": 0.1638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14238610863685608, | |
| "step": 1535, | |
| "valid_targets_mean": 5705.2, | |
| "valid_targets_min": 3058 | |
| }, | |
| { | |
| "epoch": 2.4878836833602587, | |
| "grad_norm": 0.392438283116615, | |
| "learning_rate": 3.258285913937988e-05, | |
| "loss": 0.1574, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15044398605823517, | |
| "step": 1540, | |
| "valid_targets_mean": 5630.6, | |
| "valid_targets_min": 3316 | |
| }, | |
| { | |
| "epoch": 2.495961227786753, | |
| "grad_norm": 0.42696864338998214, | |
| "learning_rate": 3.2520127615355885e-05, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17117537558078766, | |
| "step": 1545, | |
| "valid_targets_mean": 5718.1, | |
| "valid_targets_min": 3588 | |
| }, | |
| { | |
| "epoch": 2.504038772213247, | |
| "grad_norm": 0.4088967704654339, | |
| "learning_rate": 3.245719288302758e-05, | |
| "loss": 0.1636, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17196603119373322, | |
| "step": 1550, | |
| "valid_targets_mean": 5265.9, | |
| "valid_targets_min": 2815 | |
| }, | |
| { | |
| "epoch": 2.5121163166397418, | |
| "grad_norm": 0.3502253105336199, | |
| "learning_rate": 3.239405596385902e-05, | |
| "loss": 0.1583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14983238279819489, | |
| "step": 1555, | |
| "valid_targets_mean": 6427.2, | |
| "valid_targets_min": 4682 | |
| }, | |
| { | |
| "epoch": 2.520193861066236, | |
| "grad_norm": 0.4183519330842782, | |
| "learning_rate": 3.233071788259582e-05, | |
| "loss": 0.1611, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14721640944480896, | |
| "step": 1560, | |
| "valid_targets_mean": 4815.1, | |
| "valid_targets_min": 3095 | |
| }, | |
| { | |
| "epoch": 2.52827140549273, | |
| "grad_norm": 0.4352795741458169, | |
| "learning_rate": 3.2267179667248636e-05, | |
| "loss": 0.1582, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15912222862243652, | |
| "step": 1565, | |
| "valid_targets_mean": 5072.4, | |
| "valid_targets_min": 3089 | |
| }, | |
| { | |
| "epoch": 2.5363489499192244, | |
| "grad_norm": 0.42056275261628046, | |
| "learning_rate": 3.220344234907634e-05, | |
| "loss": 0.1558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16274453699588776, | |
| "step": 1570, | |
| "valid_targets_mean": 5461.2, | |
| "valid_targets_min": 3889 | |
| }, | |
| { | |
| "epoch": 2.5444264943457187, | |
| "grad_norm": 0.39668108670101565, | |
| "learning_rate": 3.2139506962569376e-05, | |
| "loss": 0.1609, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15409982204437256, | |
| "step": 1575, | |
| "valid_targets_mean": 5482.7, | |
| "valid_targets_min": 1894 | |
| }, | |
| { | |
| "epoch": 2.5525040387722133, | |
| "grad_norm": 0.39268047733866424, | |
| "learning_rate": 3.2075374545432945e-05, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15878914296627045, | |
| "step": 1580, | |
| "valid_targets_mean": 5595.6, | |
| "valid_targets_min": 2744 | |
| }, | |
| { | |
| "epoch": 2.5605815831987075, | |
| "grad_norm": 0.48291631001099655, | |
| "learning_rate": 3.201104613857015e-05, | |
| "loss": 0.1646, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18096968531608582, | |
| "step": 1585, | |
| "valid_targets_mean": 4983.4, | |
| "valid_targets_min": 1583 | |
| }, | |
| { | |
| "epoch": 2.568659127625202, | |
| "grad_norm": 0.4034564494750503, | |
| "learning_rate": 3.1946522786065125e-05, | |
| "loss": 0.1543, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14894041419029236, | |
| "step": 1590, | |
| "valid_targets_mean": 5471.7, | |
| "valid_targets_min": 2886 | |
| }, | |
| { | |
| "epoch": 2.5767366720516964, | |
| "grad_norm": 0.4119983923922993, | |
| "learning_rate": 3.188180553516606e-05, | |
| "loss": 0.1643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16128647327423096, | |
| "step": 1595, | |
| "valid_targets_mean": 5662.8, | |
| "valid_targets_min": 2978 | |
| }, | |
| { | |
| "epoch": 2.5848142164781907, | |
| "grad_norm": 0.4123640591545943, | |
| "learning_rate": 3.181689543626822e-05, | |
| "loss": 0.1591, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18056736886501312, | |
| "step": 1600, | |
| "valid_targets_mean": 5447.2, | |
| "valid_targets_min": 3365 | |
| }, | |
| { | |
| "epoch": 2.592891760904685, | |
| "grad_norm": 0.3869046988263049, | |
| "learning_rate": 3.1751793542896895e-05, | |
| "loss": 0.1546, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16578814387321472, | |
| "step": 1605, | |
| "valid_targets_mean": 5747.5, | |
| "valid_targets_min": 1603 | |
| }, | |
| { | |
| "epoch": 2.600969305331179, | |
| "grad_norm": 0.4316608359130143, | |
| "learning_rate": 3.1686500911690315e-05, | |
| "loss": 0.1615, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1606169044971466, | |
| "step": 1610, | |
| "valid_targets_mean": 5133.5, | |
| "valid_targets_min": 906 | |
| }, | |
| { | |
| "epoch": 2.6090468497576738, | |
| "grad_norm": 0.40446250120673677, | |
| "learning_rate": 3.162101860238247e-05, | |
| "loss": 0.1637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14105135202407837, | |
| "step": 1615, | |
| "valid_targets_mean": 5533.9, | |
| "valid_targets_min": 3032 | |
| }, | |
| { | |
| "epoch": 2.617124394184168, | |
| "grad_norm": 0.47625077340872074, | |
| "learning_rate": 3.155534767778594e-05, | |
| "loss": 0.1659, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18141523003578186, | |
| "step": 1620, | |
| "valid_targets_mean": 4673.2, | |
| "valid_targets_min": 2542 | |
| }, | |
| { | |
| "epoch": 2.625201938610662, | |
| "grad_norm": 0.4465275345328576, | |
| "learning_rate": 3.1489489203774627e-05, | |
| "loss": 0.1574, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1559813916683197, | |
| "step": 1625, | |
| "valid_targets_mean": 5545.8, | |
| "valid_targets_min": 3295 | |
| }, | |
| { | |
| "epoch": 2.633279483037157, | |
| "grad_norm": 0.4192958307468656, | |
| "learning_rate": 3.1423444249266436e-05, | |
| "loss": 0.1637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18528008460998535, | |
| "step": 1630, | |
| "valid_targets_mean": 5995.1, | |
| "valid_targets_min": 3733 | |
| }, | |
| { | |
| "epoch": 2.641357027463651, | |
| "grad_norm": 0.4049950210554484, | |
| "learning_rate": 3.1357213886205986e-05, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17278358340263367, | |
| "step": 1635, | |
| "valid_targets_mean": 5698.8, | |
| "valid_targets_min": 1120 | |
| }, | |
| { | |
| "epoch": 2.6494345718901453, | |
| "grad_norm": 0.40913206526342066, | |
| "learning_rate": 3.1290799189547155e-05, | |
| "loss": 0.1562, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15328720211982727, | |
| "step": 1640, | |
| "valid_targets_mean": 4910.6, | |
| "valid_targets_min": 860 | |
| }, | |
| { | |
| "epoch": 2.6575121163166395, | |
| "grad_norm": 0.4257325837836936, | |
| "learning_rate": 3.122420123723565e-05, | |
| "loss": 0.1526, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15455028414726257, | |
| "step": 1645, | |
| "valid_targets_mean": 5031.7, | |
| "valid_targets_min": 3529 | |
| }, | |
| { | |
| "epoch": 2.665589660743134, | |
| "grad_norm": 0.4521874347895494, | |
| "learning_rate": 3.115742111019152e-05, | |
| "loss": 0.1565, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14618174731731415, | |
| "step": 1650, | |
| "valid_targets_mean": 4829.0, | |
| "valid_targets_min": 2955 | |
| }, | |
| { | |
| "epoch": 2.6736672051696284, | |
| "grad_norm": 0.39499453267355306, | |
| "learning_rate": 3.1090459892291616e-05, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1510046422481537, | |
| "step": 1655, | |
| "valid_targets_mean": 5492.4, | |
| "valid_targets_min": 3011 | |
| }, | |
| { | |
| "epoch": 2.6817447495961226, | |
| "grad_norm": 0.42966057396892693, | |
| "learning_rate": 3.102331867035197e-05, | |
| "loss": 0.1573, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16350698471069336, | |
| "step": 1660, | |
| "valid_targets_mean": 5012.4, | |
| "valid_targets_min": 3294 | |
| }, | |
| { | |
| "epoch": 2.6898222940226173, | |
| "grad_norm": 0.4262326803802079, | |
| "learning_rate": 3.095599853411019e-05, | |
| "loss": 0.1545, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16098210215568542, | |
| "step": 1665, | |
| "valid_targets_mean": 5450.9, | |
| "valid_targets_min": 3548 | |
| }, | |
| { | |
| "epoch": 2.6978998384491115, | |
| "grad_norm": 0.4546118001183254, | |
| "learning_rate": 3.088850057620777e-05, | |
| "loss": 0.1557, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16224367916584015, | |
| "step": 1670, | |
| "valid_targets_mean": 5435.6, | |
| "valid_targets_min": 3398 | |
| }, | |
| { | |
| "epoch": 2.7059773828756057, | |
| "grad_norm": 0.40167696331255726, | |
| "learning_rate": 3.08208258921723e-05, | |
| "loss": 0.1608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13801468908786774, | |
| "step": 1675, | |
| "valid_targets_mean": 5132.9, | |
| "valid_targets_min": 3184 | |
| }, | |
| { | |
| "epoch": 2.7140549273021, | |
| "grad_norm": 0.4210913248605828, | |
| "learning_rate": 3.075297558039976e-05, | |
| "loss": 0.1592, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14852957427501678, | |
| "step": 1680, | |
| "valid_targets_mean": 5154.9, | |
| "valid_targets_min": 2962 | |
| }, | |
| { | |
| "epoch": 2.7221324717285946, | |
| "grad_norm": 0.3743885008222628, | |
| "learning_rate": 3.068495074213667e-05, | |
| "loss": 0.1605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1448890119791031, | |
| "step": 1685, | |
| "valid_targets_mean": 5794.9, | |
| "valid_targets_min": 2758 | |
| }, | |
| { | |
| "epoch": 2.730210016155089, | |
| "grad_norm": 0.40410715658459306, | |
| "learning_rate": 3.0616752481462184e-05, | |
| "loss": 0.1645, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15657979249954224, | |
| "step": 1690, | |
| "valid_targets_mean": 5021.4, | |
| "valid_targets_min": 3366 | |
| }, | |
| { | |
| "epoch": 2.738287560581583, | |
| "grad_norm": 0.4714995506258271, | |
| "learning_rate": 3.054838190527021e-05, | |
| "loss": 0.1536, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16343599557876587, | |
| "step": 1695, | |
| "valid_targets_mean": 4527.0, | |
| "valid_targets_min": 2803 | |
| }, | |
| { | |
| "epoch": 2.7463651050080777, | |
| "grad_norm": 0.43772533089432647, | |
| "learning_rate": 3.0479840123251424e-05, | |
| "loss": 0.1593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17091104388237, | |
| "step": 1700, | |
| "valid_targets_mean": 5508.5, | |
| "valid_targets_min": 3143 | |
| }, | |
| { | |
| "epoch": 2.754442649434572, | |
| "grad_norm": 0.45122256716443765, | |
| "learning_rate": 3.0411128247875248e-05, | |
| "loss": 0.1637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17041394114494324, | |
| "step": 1705, | |
| "valid_targets_mean": 4875.2, | |
| "valid_targets_min": 2137 | |
| }, | |
| { | |
| "epoch": 2.762520193861066, | |
| "grad_norm": 0.38714450629462066, | |
| "learning_rate": 3.0342247394371836e-05, | |
| "loss": 0.1651, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.152548149228096, | |
| "step": 1710, | |
| "valid_targets_mean": 5843.6, | |
| "valid_targets_min": 3284 | |
| }, | |
| { | |
| "epoch": 2.7705977382875604, | |
| "grad_norm": 0.5422209052327676, | |
| "learning_rate": 3.0273198680713937e-05, | |
| "loss": 0.1635, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16515977680683136, | |
| "step": 1715, | |
| "valid_targets_mean": 5581.4, | |
| "valid_targets_min": 2737 | |
| }, | |
| { | |
| "epoch": 2.778675282714055, | |
| "grad_norm": 0.41258879566049084, | |
| "learning_rate": 3.0203983227598733e-05, | |
| "loss": 0.166, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1601826399564743, | |
| "step": 1720, | |
| "valid_targets_mean": 5942.6, | |
| "valid_targets_min": 3936 | |
| }, | |
| { | |
| "epoch": 2.7867528271405493, | |
| "grad_norm": 0.4158870047598975, | |
| "learning_rate": 3.0134602158429723e-05, | |
| "loss": 0.1614, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15733978152275085, | |
| "step": 1725, | |
| "valid_targets_mean": 6098.9, | |
| "valid_targets_min": 3073 | |
| }, | |
| { | |
| "epoch": 2.7948303715670435, | |
| "grad_norm": 0.43216407751154545, | |
| "learning_rate": 3.00650565992984e-05, | |
| "loss": 0.1626, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1826433390378952, | |
| "step": 1730, | |
| "valid_targets_mean": 5351.1, | |
| "valid_targets_min": 2817 | |
| }, | |
| { | |
| "epoch": 2.802907915993538, | |
| "grad_norm": 0.3883027870620293, | |
| "learning_rate": 2.999534767896604e-05, | |
| "loss": 0.1657, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14935646951198578, | |
| "step": 1735, | |
| "valid_targets_mean": 5830.9, | |
| "valid_targets_min": 4066 | |
| }, | |
| { | |
| "epoch": 2.8109854604200324, | |
| "grad_norm": 0.48044695829642137, | |
| "learning_rate": 2.992547652884536e-05, | |
| "loss": 0.1544, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16807019710540771, | |
| "step": 1740, | |
| "valid_targets_mean": 4481.2, | |
| "valid_targets_min": 2976 | |
| }, | |
| { | |
| "epoch": 2.8190630048465266, | |
| "grad_norm": 0.4140322247787378, | |
| "learning_rate": 2.985544428298213e-05, | |
| "loss": 0.162, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1593261957168579, | |
| "step": 1745, | |
| "valid_targets_mean": 5142.8, | |
| "valid_targets_min": 1294 | |
| }, | |
| { | |
| "epoch": 2.827140549273021, | |
| "grad_norm": 0.42543096236919437, | |
| "learning_rate": 2.978525207803681e-05, | |
| "loss": 0.1519, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1564638316631317, | |
| "step": 1750, | |
| "valid_targets_mean": 4954.4, | |
| "valid_targets_min": 3126 | |
| }, | |
| { | |
| "epoch": 2.8352180936995155, | |
| "grad_norm": 0.4694157233084932, | |
| "learning_rate": 2.9714901053266075e-05, | |
| "loss": 0.1599, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15422756969928741, | |
| "step": 1755, | |
| "valid_targets_mean": 4928.2, | |
| "valid_targets_min": 2928 | |
| }, | |
| { | |
| "epoch": 2.8432956381260097, | |
| "grad_norm": 0.4116696511240596, | |
| "learning_rate": 2.9644392350504322e-05, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15589065849781036, | |
| "step": 1760, | |
| "valid_targets_mean": 4785.0, | |
| "valid_targets_min": 3331 | |
| }, | |
| { | |
| "epoch": 2.851373182552504, | |
| "grad_norm": 0.41296699834187545, | |
| "learning_rate": 2.9573727114145162e-05, | |
| "loss": 0.1589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18629789352416992, | |
| "step": 1765, | |
| "valid_targets_mean": 5090.4, | |
| "valid_targets_min": 3547 | |
| }, | |
| { | |
| "epoch": 2.8594507269789986, | |
| "grad_norm": 0.4347499443746334, | |
| "learning_rate": 2.9502906491122813e-05, | |
| "loss": 0.1535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15154750645160675, | |
| "step": 1770, | |
| "valid_targets_mean": 5262.7, | |
| "valid_targets_min": 2874 | |
| }, | |
| { | |
| "epoch": 2.867528271405493, | |
| "grad_norm": 0.4141641791774333, | |
| "learning_rate": 2.9431931630893504e-05, | |
| "loss": 0.1654, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17479820549488068, | |
| "step": 1775, | |
| "valid_targets_mean": 5459.6, | |
| "valid_targets_min": 2579 | |
| }, | |
| { | |
| "epoch": 2.875605815831987, | |
| "grad_norm": 0.4638086659411083, | |
| "learning_rate": 2.9360803685416825e-05, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16348320245742798, | |
| "step": 1780, | |
| "valid_targets_mean": 5805.8, | |
| "valid_targets_min": 3612 | |
| }, | |
| { | |
| "epoch": 2.8836833602584813, | |
| "grad_norm": 0.4504562586687795, | |
| "learning_rate": 2.9289523809137016e-05, | |
| "loss": 0.1529, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1563473343849182, | |
| "step": 1785, | |
| "valid_targets_mean": 4970.2, | |
| "valid_targets_min": 3548 | |
| }, | |
| { | |
| "epoch": 2.891760904684976, | |
| "grad_norm": 0.38292898167329614, | |
| "learning_rate": 2.9218093158964227e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15860334038734436, | |
| "step": 1790, | |
| "valid_targets_mean": 5831.6, | |
| "valid_targets_min": 3135 | |
| }, | |
| { | |
| "epoch": 2.89983844911147, | |
| "grad_norm": 0.3947814339597183, | |
| "learning_rate": 2.9146512894255755e-05, | |
| "loss": 0.1572, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16357748210430145, | |
| "step": 1795, | |
| "valid_targets_mean": 5600.8, | |
| "valid_targets_min": 2395 | |
| }, | |
| { | |
| "epoch": 2.9079159935379644, | |
| "grad_norm": 0.3773815244534708, | |
| "learning_rate": 2.907478417679722e-05, | |
| "loss": 0.1573, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15501318871974945, | |
| "step": 1800, | |
| "valid_targets_mean": 5604.1, | |
| "valid_targets_min": 3823 | |
| }, | |
| { | |
| "epoch": 2.915993537964459, | |
| "grad_norm": 0.49862612619338115, | |
| "learning_rate": 2.900290817078371e-05, | |
| "loss": 0.1569, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.145344540476799, | |
| "step": 1805, | |
| "valid_targets_mean": 5771.7, | |
| "valid_targets_min": 2848 | |
| }, | |
| { | |
| "epoch": 2.9240710823909533, | |
| "grad_norm": 0.39778554749092054, | |
| "learning_rate": 2.893088604280088e-05, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.157730370759964, | |
| "step": 1810, | |
| "valid_targets_mean": 5914.8, | |
| "valid_targets_min": 3646 | |
| }, | |
| { | |
| "epoch": 2.9321486268174475, | |
| "grad_norm": 0.3893095212067741, | |
| "learning_rate": 2.8858718961806028e-05, | |
| "loss": 0.1563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17110855877399445, | |
| "step": 1815, | |
| "valid_targets_mean": 6156.4, | |
| "valid_targets_min": 3925 | |
| }, | |
| { | |
| "epoch": 2.9402261712439417, | |
| "grad_norm": 0.442018274409042, | |
| "learning_rate": 2.87864080991091e-05, | |
| "loss": 0.1571, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15044714510440826, | |
| "step": 1820, | |
| "valid_targets_mean": 4851.6, | |
| "valid_targets_min": 2939 | |
| }, | |
| { | |
| "epoch": 2.948303715670436, | |
| "grad_norm": 0.4241200775734895, | |
| "learning_rate": 2.8713954628353727e-05, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15260717272758484, | |
| "step": 1825, | |
| "valid_targets_mean": 4876.9, | |
| "valid_targets_min": 2759 | |
| }, | |
| { | |
| "epoch": 2.9563812600969306, | |
| "grad_norm": 0.4261849048816585, | |
| "learning_rate": 2.8641359725498106e-05, | |
| "loss": 0.1625, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18526265025138855, | |
| "step": 1830, | |
| "valid_targets_mean": 5735.2, | |
| "valid_targets_min": 1069 | |
| }, | |
| { | |
| "epoch": 2.964458804523425, | |
| "grad_norm": 0.4134324557091191, | |
| "learning_rate": 2.856862456879599e-05, | |
| "loss": 0.1458, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15292227268218994, | |
| "step": 1835, | |
| "valid_targets_mean": 4747.1, | |
| "valid_targets_min": 3410 | |
| }, | |
| { | |
| "epoch": 2.9725363489499195, | |
| "grad_norm": 0.4210915654521287, | |
| "learning_rate": 2.8495750338777487e-05, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15871697664260864, | |
| "step": 1840, | |
| "valid_targets_mean": 5455.9, | |
| "valid_targets_min": 3518 | |
| }, | |
| { | |
| "epoch": 2.9806138933764137, | |
| "grad_norm": 0.4495497695447994, | |
| "learning_rate": 2.8422738218229974e-05, | |
| "loss": 0.152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16716666519641876, | |
| "step": 1845, | |
| "valid_targets_mean": 5352.4, | |
| "valid_targets_min": 3391 | |
| }, | |
| { | |
| "epoch": 2.988691437802908, | |
| "grad_norm": 1.5430872891092386, | |
| "learning_rate": 2.8349589392178843e-05, | |
| "loss": 0.1481, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13364684581756592, | |
| "step": 1850, | |
| "valid_targets_mean": 5169.6, | |
| "valid_targets_min": 3248 | |
| }, | |
| { | |
| "epoch": 2.996768982229402, | |
| "grad_norm": 0.385078503937771, | |
| "learning_rate": 2.827630504786829e-05, | |
| "loss": 0.1605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15355360507965088, | |
| "step": 1855, | |
| "valid_targets_mean": 5491.1, | |
| "valid_targets_min": 3017 | |
| }, | |
| { | |
| "epoch": 3.004846526655897, | |
| "grad_norm": 0.3790490762550386, | |
| "learning_rate": 2.8202886374742062e-05, | |
| "loss": 0.1549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17056137323379517, | |
| "step": 1860, | |
| "valid_targets_mean": 5756.8, | |
| "valid_targets_min": 2958 | |
| }, | |
| { | |
| "epoch": 3.012924071082391, | |
| "grad_norm": 0.3968150107316026, | |
| "learning_rate": 2.812933456442411e-05, | |
| "loss": 0.1541, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15332677960395813, | |
| "step": 1865, | |
| "valid_targets_mean": 5816.9, | |
| "valid_targets_min": 2930 | |
| }, | |
| { | |
| "epoch": 3.0210016155088852, | |
| "grad_norm": 0.4168205957829322, | |
| "learning_rate": 2.8055650810699286e-05, | |
| "loss": 0.1588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17240363359451294, | |
| "step": 1870, | |
| "valid_targets_mean": 5458.4, | |
| "valid_targets_min": 1314 | |
| }, | |
| { | |
| "epoch": 3.0290791599353795, | |
| "grad_norm": 0.41901551796139186, | |
| "learning_rate": 2.7981836309493945e-05, | |
| "loss": 0.1468, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13956554234027863, | |
| "step": 1875, | |
| "valid_targets_mean": 5344.1, | |
| "valid_targets_min": 3291 | |
| }, | |
| { | |
| "epoch": 3.037156704361874, | |
| "grad_norm": 0.43648629037836206, | |
| "learning_rate": 2.790789225885656e-05, | |
| "loss": 0.1541, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1436006724834442, | |
| "step": 1880, | |
| "valid_targets_mean": 5276.2, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 3.0452342487883683, | |
| "grad_norm": 0.38914872559345653, | |
| "learning_rate": 2.783381985893824e-05, | |
| "loss": 0.1487, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14736038446426392, | |
| "step": 1885, | |
| "valid_targets_mean": 5778.1, | |
| "valid_targets_min": 3255 | |
| }, | |
| { | |
| "epoch": 3.0533117932148626, | |
| "grad_norm": 0.371589235961242, | |
| "learning_rate": 2.775962031197328e-05, | |
| "loss": 0.1498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1469155102968216, | |
| "step": 1890, | |
| "valid_targets_mean": 6252.9, | |
| "valid_targets_min": 3988 | |
| }, | |
| { | |
| "epoch": 3.0613893376413572, | |
| "grad_norm": 0.42447513418921207, | |
| "learning_rate": 2.768529482225964e-05, | |
| "loss": 0.1518, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1542665660381317, | |
| "step": 1895, | |
| "valid_targets_mean": 4997.4, | |
| "valid_targets_min": 3403 | |
| }, | |
| { | |
| "epoch": 3.0694668820678515, | |
| "grad_norm": 0.36654037754401236, | |
| "learning_rate": 2.761084459613939e-05, | |
| "loss": 0.1497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14190000295639038, | |
| "step": 1900, | |
| "valid_targets_mean": 6075.4, | |
| "valid_targets_min": 3704 | |
| }, | |
| { | |
| "epoch": 3.0775444264943457, | |
| "grad_norm": 0.3991548440822749, | |
| "learning_rate": 2.7536270841979153e-05, | |
| "loss": 0.151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14582324028015137, | |
| "step": 1905, | |
| "valid_targets_mean": 5232.4, | |
| "valid_targets_min": 3181 | |
| }, | |
| { | |
| "epoch": 3.08562197092084, | |
| "grad_norm": 0.3896275257772552, | |
| "learning_rate": 2.7461574770150457e-05, | |
| "loss": 0.152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1319112479686737, | |
| "step": 1910, | |
| "valid_targets_mean": 5051.2, | |
| "valid_targets_min": 2904 | |
| }, | |
| { | |
| "epoch": 3.0936995153473346, | |
| "grad_norm": 0.40055785830035945, | |
| "learning_rate": 2.7386757593010127e-05, | |
| "loss": 0.1499, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14539475739002228, | |
| "step": 1915, | |
| "valid_targets_mean": 5449.8, | |
| "valid_targets_min": 2642 | |
| }, | |
| { | |
| "epoch": 3.101777059773829, | |
| "grad_norm": 0.41050944408580226, | |
| "learning_rate": 2.7311820524880584e-05, | |
| "loss": 0.1547, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15138906240463257, | |
| "step": 1920, | |
| "valid_targets_mean": 5428.7, | |
| "valid_targets_min": 3801 | |
| }, | |
| { | |
| "epoch": 3.109854604200323, | |
| "grad_norm": 0.414382801561421, | |
| "learning_rate": 2.7236764782030153e-05, | |
| "loss": 0.1496, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15684393048286438, | |
| "step": 1925, | |
| "valid_targets_mean": 5752.2, | |
| "valid_targets_min": 3445 | |
| }, | |
| { | |
| "epoch": 3.1179321486268172, | |
| "grad_norm": 0.4253007826394208, | |
| "learning_rate": 2.7161591582653293e-05, | |
| "loss": 0.1573, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15882694721221924, | |
| "step": 1930, | |
| "valid_targets_mean": 5576.8, | |
| "valid_targets_min": 999 | |
| }, | |
| { | |
| "epoch": 3.126009693053312, | |
| "grad_norm": 0.4116885191877429, | |
| "learning_rate": 2.708630214685086e-05, | |
| "loss": 0.1558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1493636965751648, | |
| "step": 1935, | |
| "valid_targets_mean": 5349.6, | |
| "valid_targets_min": 2809 | |
| }, | |
| { | |
| "epoch": 3.134087237479806, | |
| "grad_norm": 0.3824930218093128, | |
| "learning_rate": 2.70108976966103e-05, | |
| "loss": 0.1562, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14908301830291748, | |
| "step": 1940, | |
| "valid_targets_mean": 5485.6, | |
| "valid_targets_min": 3274 | |
| }, | |
| { | |
| "epoch": 3.1421647819063003, | |
| "grad_norm": 0.43229853577719435, | |
| "learning_rate": 2.693537945578576e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15119776129722595, | |
| "step": 1945, | |
| "valid_targets_mean": 5140.9, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 3.150242326332795, | |
| "grad_norm": 0.42884161042903884, | |
| "learning_rate": 2.6859748650078327e-05, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17630845308303833, | |
| "step": 1950, | |
| "valid_targets_mean": 5807.7, | |
| "valid_targets_min": 3410 | |
| }, | |
| { | |
| "epoch": 3.158319870759289, | |
| "grad_norm": 0.37371276792780805, | |
| "learning_rate": 2.678400650701602e-05, | |
| "loss": 0.1511, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1380992978811264, | |
| "step": 1955, | |
| "valid_targets_mean": 5687.8, | |
| "valid_targets_min": 2911 | |
| }, | |
| { | |
| "epoch": 3.1663974151857834, | |
| "grad_norm": 0.4145495752229891, | |
| "learning_rate": 2.6708154255933973e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1371968537569046, | |
| "step": 1960, | |
| "valid_targets_mean": 4979.2, | |
| "valid_targets_min": 2518 | |
| }, | |
| { | |
| "epoch": 3.1744749596122777, | |
| "grad_norm": 0.4406884355216408, | |
| "learning_rate": 2.6632193127954393e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1561473160982132, | |
| "step": 1965, | |
| "valid_targets_mean": 5259.7, | |
| "valid_targets_min": 1825 | |
| }, | |
| { | |
| "epoch": 3.1825525040387723, | |
| "grad_norm": 0.4179333160175781, | |
| "learning_rate": 2.6556124355966633e-05, | |
| "loss": 0.1575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15797705948352814, | |
| "step": 1970, | |
| "valid_targets_mean": 5317.8, | |
| "valid_targets_min": 982 | |
| }, | |
| { | |
| "epoch": 3.1906300484652665, | |
| "grad_norm": 0.4068834683524423, | |
| "learning_rate": 2.6479949174607166e-05, | |
| "loss": 0.1557, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1401294469833374, | |
| "step": 1975, | |
| "valid_targets_mean": 6256.9, | |
| "valid_targets_min": 2170 | |
| }, | |
| { | |
| "epoch": 3.1987075928917608, | |
| "grad_norm": 0.46135880028117426, | |
| "learning_rate": 2.640366882023955e-05, | |
| "loss": 0.1549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1438083052635193, | |
| "step": 1980, | |
| "valid_targets_mean": 4902.4, | |
| "valid_targets_min": 2561 | |
| }, | |
| { | |
| "epoch": 3.2067851373182554, | |
| "grad_norm": 0.3772951249061422, | |
| "learning_rate": 2.6327284530934333e-05, | |
| "loss": 0.1458, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13431373238563538, | |
| "step": 1985, | |
| "valid_targets_mean": 5266.6, | |
| "valid_targets_min": 3377 | |
| }, | |
| { | |
| "epoch": 3.2148626817447497, | |
| "grad_norm": 0.4365284090149285, | |
| "learning_rate": 2.625079754644901e-05, | |
| "loss": 0.1578, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14966890215873718, | |
| "step": 1990, | |
| "valid_targets_mean": 5547.6, | |
| "valid_targets_min": 3367 | |
| }, | |
| { | |
| "epoch": 3.222940226171244, | |
| "grad_norm": 0.4344060467491617, | |
| "learning_rate": 2.6174209108207858e-05, | |
| "loss": 0.1555, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15894004702568054, | |
| "step": 1995, | |
| "valid_targets_mean": 5271.2, | |
| "valid_targets_min": 2963 | |
| }, | |
| { | |
| "epoch": 3.231017770597738, | |
| "grad_norm": 0.4290654640834972, | |
| "learning_rate": 2.609752045928181e-05, | |
| "loss": 0.1502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15122604370117188, | |
| "step": 2000, | |
| "valid_targets_mean": 4874.1, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 3.2390953150242328, | |
| "grad_norm": 0.40755382282044406, | |
| "learning_rate": 2.6020732844368253e-05, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15718157589435577, | |
| "step": 2005, | |
| "valid_targets_mean": 5710.7, | |
| "valid_targets_min": 3247 | |
| }, | |
| { | |
| "epoch": 3.247172859450727, | |
| "grad_norm": 0.45017496542152, | |
| "learning_rate": 2.5943847509770878e-05, | |
| "loss": 0.1635, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1719478964805603, | |
| "step": 2010, | |
| "valid_targets_mean": 5359.8, | |
| "valid_targets_min": 2700 | |
| }, | |
| { | |
| "epoch": 3.255250403877221, | |
| "grad_norm": 0.4623455962159807, | |
| "learning_rate": 2.586686570337939e-05, | |
| "loss": 0.1563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15864837169647217, | |
| "step": 2015, | |
| "valid_targets_mean": 5411.4, | |
| "valid_targets_min": 2368 | |
| }, | |
| { | |
| "epoch": 3.263327948303716, | |
| "grad_norm": 0.3774210696224583, | |
| "learning_rate": 2.5789788674649287e-05, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15108612179756165, | |
| "step": 2020, | |
| "valid_targets_mean": 5870.5, | |
| "valid_targets_min": 3853 | |
| }, | |
| { | |
| "epoch": 3.27140549273021, | |
| "grad_norm": 0.3702630043239486, | |
| "learning_rate": 2.571261767458159e-05, | |
| "loss": 0.1539, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15424832701683044, | |
| "step": 2025, | |
| "valid_targets_mean": 6300.9, | |
| "valid_targets_min": 3895 | |
| }, | |
| { | |
| "epoch": 3.2794830371567043, | |
| "grad_norm": 0.42808499410597545, | |
| "learning_rate": 2.5635353955702517e-05, | |
| "loss": 0.1621, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15260916948318481, | |
| "step": 2030, | |
| "valid_targets_mean": 5049.5, | |
| "valid_targets_min": 3091 | |
| }, | |
| { | |
| "epoch": 3.2875605815831985, | |
| "grad_norm": 0.40554869240072816, | |
| "learning_rate": 2.5557998772043148e-05, | |
| "loss": 0.1511, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14066237211227417, | |
| "step": 2035, | |
| "valid_targets_mean": 4771.4, | |
| "valid_targets_min": 773 | |
| }, | |
| { | |
| "epoch": 3.295638126009693, | |
| "grad_norm": 0.4033053077584176, | |
| "learning_rate": 2.54805533791191e-05, | |
| "loss": 0.1536, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15944485366344452, | |
| "step": 2040, | |
| "valid_targets_mean": 5350.1, | |
| "valid_targets_min": 2479 | |
| }, | |
| { | |
| "epoch": 3.3037156704361874, | |
| "grad_norm": 0.4070671096794082, | |
| "learning_rate": 2.5403019033910137e-05, | |
| "loss": 0.152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1527840942144394, | |
| "step": 2045, | |
| "valid_targets_mean": 5546.4, | |
| "valid_targets_min": 2999 | |
| }, | |
| { | |
| "epoch": 3.3117932148626816, | |
| "grad_norm": 0.40454651912809586, | |
| "learning_rate": 2.5325396994839748e-05, | |
| "loss": 0.151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14618010818958282, | |
| "step": 2050, | |
| "valid_targets_mean": 4913.0, | |
| "valid_targets_min": 2799 | |
| }, | |
| { | |
| "epoch": 3.3198707592891763, | |
| "grad_norm": 0.4048992795149119, | |
| "learning_rate": 2.5247688521754758e-05, | |
| "loss": 0.1536, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16438153386116028, | |
| "step": 2055, | |
| "valid_targets_mean": 5300.4, | |
| "valid_targets_min": 3030 | |
| }, | |
| { | |
| "epoch": 3.3279483037156705, | |
| "grad_norm": 0.43565390906405965, | |
| "learning_rate": 2.5169894875904853e-05, | |
| "loss": 0.1489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14154842495918274, | |
| "step": 2060, | |
| "valid_targets_mean": 4655.8, | |
| "valid_targets_min": 3134 | |
| }, | |
| { | |
| "epoch": 3.3360258481421647, | |
| "grad_norm": 0.400196945243683, | |
| "learning_rate": 2.509201731992212e-05, | |
| "loss": 0.1568, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15692731738090515, | |
| "step": 2065, | |
| "valid_targets_mean": 5103.1, | |
| "valid_targets_min": 3355 | |
| }, | |
| { | |
| "epoch": 3.344103392568659, | |
| "grad_norm": 0.3937349961185628, | |
| "learning_rate": 2.5014057117800556e-05, | |
| "loss": 0.1608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14389444887638092, | |
| "step": 2070, | |
| "valid_targets_mean": 5687.6, | |
| "valid_targets_min": 2137 | |
| }, | |
| { | |
| "epoch": 3.3521809369951536, | |
| "grad_norm": 0.39353561868224096, | |
| "learning_rate": 2.4936015534875547e-05, | |
| "loss": 0.1527, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1426299661397934, | |
| "step": 2075, | |
| "valid_targets_mean": 4550.7, | |
| "valid_targets_min": 1747 | |
| }, | |
| { | |
| "epoch": 3.360258481421648, | |
| "grad_norm": 0.406929436006472, | |
| "learning_rate": 2.4857893837803313e-05, | |
| "loss": 0.1567, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15165364742279053, | |
| "step": 2080, | |
| "valid_targets_mean": 5169.2, | |
| "valid_targets_min": 3414 | |
| }, | |
| { | |
| "epoch": 3.368336025848142, | |
| "grad_norm": 0.6134118024117283, | |
| "learning_rate": 2.4779693294540407e-05, | |
| "loss": 0.1586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1490992307662964, | |
| "step": 2085, | |
| "valid_targets_mean": 5403.9, | |
| "valid_targets_min": 3683 | |
| }, | |
| { | |
| "epoch": 3.3764135702746367, | |
| "grad_norm": 0.38653362977731454, | |
| "learning_rate": 2.470141517432306e-05, | |
| "loss": 0.1565, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13950718939304352, | |
| "step": 2090, | |
| "valid_targets_mean": 5945.8, | |
| "valid_targets_min": 3636 | |
| }, | |
| { | |
| "epoch": 3.384491114701131, | |
| "grad_norm": 0.4412013292005518, | |
| "learning_rate": 2.4623060747646643e-05, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1426210254430771, | |
| "step": 2095, | |
| "valid_targets_mean": 4918.0, | |
| "valid_targets_min": 2768 | |
| }, | |
| { | |
| "epoch": 3.392568659127625, | |
| "grad_norm": 0.4239121175423532, | |
| "learning_rate": 2.4544631286245007e-05, | |
| "loss": 0.1535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16665086150169373, | |
| "step": 2100, | |
| "valid_targets_mean": 5425.8, | |
| "valid_targets_min": 2417 | |
| }, | |
| { | |
| "epoch": 3.4006462035541194, | |
| "grad_norm": 0.4186314057065351, | |
| "learning_rate": 2.4466128063069873e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1559276580810547, | |
| "step": 2105, | |
| "valid_targets_mean": 5203.6, | |
| "valid_targets_min": 2440 | |
| }, | |
| { | |
| "epoch": 3.408723747980614, | |
| "grad_norm": 0.3986419672009297, | |
| "learning_rate": 2.4387552352270136e-05, | |
| "loss": 0.1522, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13025811314582825, | |
| "step": 2110, | |
| "valid_targets_mean": 4968.6, | |
| "valid_targets_min": 2560 | |
| }, | |
| { | |
| "epoch": 3.4168012924071083, | |
| "grad_norm": 0.38677721019528954, | |
| "learning_rate": 2.430890542917121e-05, | |
| "loss": 0.1404, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13195520639419556, | |
| "step": 2115, | |
| "valid_targets_mean": 6195.9, | |
| "valid_targets_min": 4188 | |
| }, | |
| { | |
| "epoch": 3.4248788368336025, | |
| "grad_norm": 0.37812614213504553, | |
| "learning_rate": 2.4230188570254328e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14694543182849884, | |
| "step": 2120, | |
| "valid_targets_mean": 5629.4, | |
| "valid_targets_min": 2668 | |
| }, | |
| { | |
| "epoch": 3.432956381260097, | |
| "grad_norm": 0.3811752552651874, | |
| "learning_rate": 2.415140305313581e-05, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14528897404670715, | |
| "step": 2125, | |
| "valid_targets_mean": 6108.1, | |
| "valid_targets_min": 2661 | |
| }, | |
| { | |
| "epoch": 3.4410339256865914, | |
| "grad_norm": 0.44768742792803773, | |
| "learning_rate": 2.4072550156546343e-05, | |
| "loss": 0.1666, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17549705505371094, | |
| "step": 2130, | |
| "valid_targets_mean": 5012.2, | |
| "valid_targets_min": 1476 | |
| }, | |
| { | |
| "epoch": 3.4491114701130856, | |
| "grad_norm": 0.42807143683133275, | |
| "learning_rate": 2.399363116031022e-05, | |
| "loss": 0.1589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15805082023143768, | |
| "step": 2135, | |
| "valid_targets_mean": 5021.9, | |
| "valid_targets_min": 3006 | |
| }, | |
| { | |
| "epoch": 3.45718901453958, | |
| "grad_norm": 0.38078146986186595, | |
| "learning_rate": 2.3914647345324543e-05, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14287428557872772, | |
| "step": 2140, | |
| "valid_targets_mean": 5498.7, | |
| "valid_targets_min": 3390 | |
| }, | |
| { | |
| "epoch": 3.4652665589660745, | |
| "grad_norm": 0.3851937956769129, | |
| "learning_rate": 2.3835599993538486e-05, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1542748063802719, | |
| "step": 2145, | |
| "valid_targets_mean": 6078.2, | |
| "valid_targets_min": 3209 | |
| }, | |
| { | |
| "epoch": 3.4733441033925687, | |
| "grad_norm": 0.43169567603841624, | |
| "learning_rate": 2.3756490387932458e-05, | |
| "loss": 0.1567, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15492887794971466, | |
| "step": 2150, | |
| "valid_targets_mean": 4833.4, | |
| "valid_targets_min": 3133 | |
| }, | |
| { | |
| "epoch": 3.481421647819063, | |
| "grad_norm": 0.3642965932222815, | |
| "learning_rate": 2.3677319812497252e-05, | |
| "loss": 0.1487, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1442583203315735, | |
| "step": 2155, | |
| "valid_targets_mean": 6142.7, | |
| "valid_targets_min": 2984 | |
| }, | |
| { | |
| "epoch": 3.489499192245557, | |
| "grad_norm": 0.366185985042951, | |
| "learning_rate": 2.359808955221326e-05, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15077048540115356, | |
| "step": 2160, | |
| "valid_targets_mean": 5692.4, | |
| "valid_targets_min": 4004 | |
| }, | |
| { | |
| "epoch": 3.497576736672052, | |
| "grad_norm": 0.399036857478809, | |
| "learning_rate": 2.351880089302958e-05, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14326925575733185, | |
| "step": 2165, | |
| "valid_targets_mean": 5415.2, | |
| "valid_targets_min": 2826 | |
| }, | |
| { | |
| "epoch": 3.505654281098546, | |
| "grad_norm": 0.4264292290712075, | |
| "learning_rate": 2.3439455121843157e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16648367047309875, | |
| "step": 2170, | |
| "valid_targets_mean": 5267.7, | |
| "valid_targets_min": 2725 | |
| }, | |
| { | |
| "epoch": 3.5137318255250403, | |
| "grad_norm": 0.3993313828190049, | |
| "learning_rate": 2.33600535264779e-05, | |
| "loss": 0.1483, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14025278389453888, | |
| "step": 2175, | |
| "valid_targets_mean": 5846.4, | |
| "valid_targets_min": 3385 | |
| }, | |
| { | |
| "epoch": 3.5218093699515345, | |
| "grad_norm": 0.42577091740104067, | |
| "learning_rate": 2.3280597395663764e-05, | |
| "loss": 0.1549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16203680634498596, | |
| "step": 2180, | |
| "valid_targets_mean": 5129.8, | |
| "valid_targets_min": 2050 | |
| }, | |
| { | |
| "epoch": 3.529886914378029, | |
| "grad_norm": 0.3971417885413414, | |
| "learning_rate": 2.3201088019015843e-05, | |
| "loss": 0.1533, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1632687896490097, | |
| "step": 2185, | |
| "valid_targets_mean": 5620.4, | |
| "valid_targets_min": 3488 | |
| }, | |
| { | |
| "epoch": 3.5379644588045234, | |
| "grad_norm": 0.4336661416116471, | |
| "learning_rate": 2.3121526687013448e-05, | |
| "loss": 0.1613, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1689453125, | |
| "step": 2190, | |
| "valid_targets_mean": 4963.4, | |
| "valid_targets_min": 2764 | |
| }, | |
| { | |
| "epoch": 3.546042003231018, | |
| "grad_norm": 0.4304892599162082, | |
| "learning_rate": 2.3041914690979157e-05, | |
| "loss": 0.1544, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15438434481620789, | |
| "step": 2195, | |
| "valid_targets_mean": 5219.6, | |
| "valid_targets_min": 826 | |
| }, | |
| { | |
| "epoch": 3.5541195476575123, | |
| "grad_norm": 0.44762639861098785, | |
| "learning_rate": 2.2962253323057834e-05, | |
| "loss": 0.1513, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15575367212295532, | |
| "step": 2200, | |
| "valid_targets_mean": 5476.6, | |
| "valid_targets_min": 2659 | |
| }, | |
| { | |
| "epoch": 3.5621970920840065, | |
| "grad_norm": 0.4112970830970271, | |
| "learning_rate": 2.288254387619568e-05, | |
| "loss": 0.16, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15437915921211243, | |
| "step": 2205, | |
| "valid_targets_mean": 4804.3, | |
| "valid_targets_min": 3202 | |
| }, | |
| { | |
| "epoch": 3.5702746365105007, | |
| "grad_norm": 0.4353593179660125, | |
| "learning_rate": 2.2802787644119265e-05, | |
| "loss": 0.149, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1426883041858673, | |
| "step": 2210, | |
| "valid_targets_mean": 4871.5, | |
| "valid_targets_min": 2812 | |
| }, | |
| { | |
| "epoch": 3.578352180936995, | |
| "grad_norm": 0.38606076115639726, | |
| "learning_rate": 2.272298592131447e-05, | |
| "loss": 0.1512, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14033401012420654, | |
| "step": 2215, | |
| "valid_targets_mean": 5166.7, | |
| "valid_targets_min": 1471 | |
| }, | |
| { | |
| "epoch": 3.5864297253634896, | |
| "grad_norm": 0.37884641017389226, | |
| "learning_rate": 2.264314000300555e-05, | |
| "loss": 0.1523, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.144095778465271, | |
| "step": 2220, | |
| "valid_targets_mean": 5684.6, | |
| "valid_targets_min": 3165 | |
| }, | |
| { | |
| "epoch": 3.594507269789984, | |
| "grad_norm": 0.48067246758501286, | |
| "learning_rate": 2.2563251185134054e-05, | |
| "loss": 0.1456, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1463029831647873, | |
| "step": 2225, | |
| "valid_targets_mean": 5646.9, | |
| "valid_targets_min": 3291 | |
| }, | |
| { | |
| "epoch": 3.602584814216478, | |
| "grad_norm": 0.3994157579202727, | |
| "learning_rate": 2.248332076433782e-05, | |
| "loss": 0.1468, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13935203850269318, | |
| "step": 2230, | |
| "valid_targets_mean": 5540.3, | |
| "valid_targets_min": 2448 | |
| }, | |
| { | |
| "epoch": 3.6106623586429727, | |
| "grad_norm": 0.3991704899058015, | |
| "learning_rate": 2.2403350037929928e-05, | |
| "loss": 0.1505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13998529314994812, | |
| "step": 2235, | |
| "valid_targets_mean": 5438.3, | |
| "valid_targets_min": 2712 | |
| }, | |
| { | |
| "epoch": 3.618739903069467, | |
| "grad_norm": 0.4060660237223782, | |
| "learning_rate": 2.2323340303877633e-05, | |
| "loss": 0.1588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15896086394786835, | |
| "step": 2240, | |
| "valid_targets_mean": 5169.9, | |
| "valid_targets_min": 2435 | |
| }, | |
| { | |
| "epoch": 3.626817447495961, | |
| "grad_norm": 0.5493544951854048, | |
| "learning_rate": 2.224329286078131e-05, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15103283524513245, | |
| "step": 2245, | |
| "valid_targets_mean": 4844.8, | |
| "valid_targets_min": 2845 | |
| }, | |
| { | |
| "epoch": 3.6348949919224554, | |
| "grad_norm": 0.3800540098826123, | |
| "learning_rate": 2.2163209007853367e-05, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15237392485141754, | |
| "step": 2250, | |
| "valid_targets_mean": 6551.5, | |
| "valid_targets_min": 3834 | |
| }, | |
| { | |
| "epoch": 3.64297253634895, | |
| "grad_norm": 0.4234706738999503, | |
| "learning_rate": 2.2083090044897172e-05, | |
| "loss": 0.1521, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16106975078582764, | |
| "step": 2255, | |
| "valid_targets_mean": 5363.4, | |
| "valid_targets_min": 2606 | |
| }, | |
| { | |
| "epoch": 3.6510500807754442, | |
| "grad_norm": 0.3864243032489708, | |
| "learning_rate": 2.200293727228593e-05, | |
| "loss": 0.1543, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16979104280471802, | |
| "step": 2260, | |
| "valid_targets_mean": 6010.2, | |
| "valid_targets_min": 4332 | |
| }, | |
| { | |
| "epoch": 3.6591276252019385, | |
| "grad_norm": 0.4019173231742567, | |
| "learning_rate": 2.1922751990941614e-05, | |
| "loss": 0.1538, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15049397945404053, | |
| "step": 2265, | |
| "valid_targets_mean": 5629.3, | |
| "valid_targets_min": 3522 | |
| }, | |
| { | |
| "epoch": 3.667205169628433, | |
| "grad_norm": 0.42205042047957253, | |
| "learning_rate": 2.1842535502313816e-05, | |
| "loss": 0.1497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15802344679832458, | |
| "step": 2270, | |
| "valid_targets_mean": 5414.5, | |
| "valid_targets_min": 2678 | |
| }, | |
| { | |
| "epoch": 3.6752827140549273, | |
| "grad_norm": 0.43674629248505054, | |
| "learning_rate": 2.1762289108358645e-05, | |
| "loss": 0.149, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1704617440700531, | |
| "step": 2275, | |
| "valid_targets_mean": 5013.8, | |
| "valid_targets_min": 2887 | |
| }, | |
| { | |
| "epoch": 3.6833602584814216, | |
| "grad_norm": 0.3875197904903836, | |
| "learning_rate": 2.1682014111517588e-05, | |
| "loss": 0.1558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14359474182128906, | |
| "step": 2280, | |
| "valid_targets_mean": 5659.9, | |
| "valid_targets_min": 3261 | |
| }, | |
| { | |
| "epoch": 3.691437802907916, | |
| "grad_norm": 0.38330175090036805, | |
| "learning_rate": 2.1601711814696387e-05, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15725120902061462, | |
| "step": 2285, | |
| "valid_targets_mean": 5879.8, | |
| "valid_targets_min": 2714 | |
| }, | |
| { | |
| "epoch": 3.6995153473344105, | |
| "grad_norm": 0.4367315902170705, | |
| "learning_rate": 2.1521383521243842e-05, | |
| "loss": 0.1452, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13833829760551453, | |
| "step": 2290, | |
| "valid_targets_mean": 4894.6, | |
| "valid_targets_min": 3211 | |
| }, | |
| { | |
| "epoch": 3.7075928917609047, | |
| "grad_norm": 0.4581004559395213, | |
| "learning_rate": 2.1441030534930728e-05, | |
| "loss": 0.1415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1371515393257141, | |
| "step": 2295, | |
| "valid_targets_mean": 5755.8, | |
| "valid_targets_min": 3641 | |
| }, | |
| { | |
| "epoch": 3.715670436187399, | |
| "grad_norm": 0.40780633076924916, | |
| "learning_rate": 2.1360654159928577e-05, | |
| "loss": 0.1505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14940571784973145, | |
| "step": 2300, | |
| "valid_targets_mean": 5407.1, | |
| "valid_targets_min": 2902 | |
| }, | |
| { | |
| "epoch": 3.7237479806138936, | |
| "grad_norm": 0.4440265677355658, | |
| "learning_rate": 2.1280255700788542e-05, | |
| "loss": 0.1501, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15323799848556519, | |
| "step": 2305, | |
| "valid_targets_mean": 4922.2, | |
| "valid_targets_min": 2340 | |
| }, | |
| { | |
| "epoch": 3.731825525040388, | |
| "grad_norm": 0.41170826111410097, | |
| "learning_rate": 2.119983646242021e-05, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13932514190673828, | |
| "step": 2310, | |
| "valid_targets_mean": 5513.8, | |
| "valid_targets_min": 3219 | |
| }, | |
| { | |
| "epoch": 3.739903069466882, | |
| "grad_norm": 0.42184075604681603, | |
| "learning_rate": 2.1119397750070428e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14799876511096954, | |
| "step": 2315, | |
| "valid_targets_mean": 5156.5, | |
| "valid_targets_min": 3263 | |
| }, | |
| { | |
| "epoch": 3.7479806138933762, | |
| "grad_norm": 0.4407278270429571, | |
| "learning_rate": 2.10389408693021e-05, | |
| "loss": 0.1465, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1560024917125702, | |
| "step": 2320, | |
| "valid_targets_mean": 5386.8, | |
| "valid_targets_min": 2776 | |
| }, | |
| { | |
| "epoch": 3.756058158319871, | |
| "grad_norm": 0.41160777876348226, | |
| "learning_rate": 2.095846712597304e-05, | |
| "loss": 0.1568, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1562466025352478, | |
| "step": 2325, | |
| "valid_targets_mean": 5080.0, | |
| "valid_targets_min": 2467 | |
| }, | |
| { | |
| "epoch": 3.764135702746365, | |
| "grad_norm": 0.46822578896832445, | |
| "learning_rate": 2.0877977826214737e-05, | |
| "loss": 0.1554, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16950945556163788, | |
| "step": 2330, | |
| "valid_targets_mean": 5497.2, | |
| "valid_targets_min": 3151 | |
| }, | |
| { | |
| "epoch": 3.7722132471728593, | |
| "grad_norm": 0.3987880976941857, | |
| "learning_rate": 2.079747427641116e-05, | |
| "loss": 0.1573, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16459164023399353, | |
| "step": 2335, | |
| "valid_targets_mean": 5201.4, | |
| "valid_targets_min": 3350 | |
| }, | |
| { | |
| "epoch": 3.780290791599354, | |
| "grad_norm": 0.4061680905283449, | |
| "learning_rate": 2.071695778317758e-05, | |
| "loss": 0.1475, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14735883474349976, | |
| "step": 2340, | |
| "valid_targets_mean": 5035.7, | |
| "valid_targets_min": 2845 | |
| }, | |
| { | |
| "epoch": 3.788368336025848, | |
| "grad_norm": 0.39441369050080427, | |
| "learning_rate": 2.0636429653339335e-05, | |
| "loss": 0.1491, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1540263593196869, | |
| "step": 2345, | |
| "valid_targets_mean": 5057.6, | |
| "valid_targets_min": 2274 | |
| }, | |
| { | |
| "epoch": 3.7964458804523424, | |
| "grad_norm": 0.416889915883133, | |
| "learning_rate": 2.0555891193910646e-05, | |
| "loss": 0.1476, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14710207283496857, | |
| "step": 2350, | |
| "valid_targets_mean": 4770.0, | |
| "valid_targets_min": 3236 | |
| }, | |
| { | |
| "epoch": 3.8045234248788367, | |
| "grad_norm": 0.40711401787943335, | |
| "learning_rate": 2.0475343712073363e-05, | |
| "loss": 0.1458, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15809385478496552, | |
| "step": 2355, | |
| "valid_targets_mean": 5591.0, | |
| "valid_targets_min": 2093 | |
| }, | |
| { | |
| "epoch": 3.8126009693053313, | |
| "grad_norm": 0.3903390563586942, | |
| "learning_rate": 2.0394788515155803e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14185532927513123, | |
| "step": 2360, | |
| "valid_targets_mean": 5076.9, | |
| "valid_targets_min": 920 | |
| }, | |
| { | |
| "epoch": 3.8206785137318255, | |
| "grad_norm": 0.413416495294967, | |
| "learning_rate": 2.0314226910611478e-05, | |
| "loss": 0.1501, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14693567156791687, | |
| "step": 2365, | |
| "valid_targets_mean": 5353.4, | |
| "valid_targets_min": 3014 | |
| }, | |
| { | |
| "epoch": 3.8287560581583198, | |
| "grad_norm": 0.4039346596230813, | |
| "learning_rate": 2.023366020599792e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1467030942440033, | |
| "step": 2370, | |
| "valid_targets_mean": 5815.2, | |
| "valid_targets_min": 2985 | |
| }, | |
| { | |
| "epoch": 3.8368336025848144, | |
| "grad_norm": 0.41298565389518715, | |
| "learning_rate": 2.0153089708955432e-05, | |
| "loss": 0.1518, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1566350758075714, | |
| "step": 2375, | |
| "valid_targets_mean": 5346.1, | |
| "valid_targets_min": 2150 | |
| }, | |
| { | |
| "epoch": 3.8449111470113086, | |
| "grad_norm": 0.4089974403263933, | |
| "learning_rate": 2.0072516727185854e-05, | |
| "loss": 0.1497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15077462792396545, | |
| "step": 2380, | |
| "valid_targets_mean": 4935.6, | |
| "valid_targets_min": 2943 | |
| }, | |
| { | |
| "epoch": 3.852988691437803, | |
| "grad_norm": 0.42079270976668665, | |
| "learning_rate": 1.999194256843138e-05, | |
| "loss": 0.1511, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1612389087677002, | |
| "step": 2385, | |
| "valid_targets_mean": 5302.7, | |
| "valid_targets_min": 3147 | |
| }, | |
| { | |
| "epoch": 3.861066235864297, | |
| "grad_norm": 0.38147493590776055, | |
| "learning_rate": 1.9911368540453285e-05, | |
| "loss": 0.1433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1430862843990326, | |
| "step": 2390, | |
| "valid_targets_mean": 5481.4, | |
| "valid_targets_min": 3173 | |
| }, | |
| { | |
| "epoch": 3.8691437802907918, | |
| "grad_norm": 0.41106419802822197, | |
| "learning_rate": 1.9830795951010737e-05, | |
| "loss": 0.1479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15999916195869446, | |
| "step": 2395, | |
| "valid_targets_mean": 5510.4, | |
| "valid_targets_min": 3092 | |
| }, | |
| { | |
| "epoch": 3.877221324717286, | |
| "grad_norm": 0.4031529390937838, | |
| "learning_rate": 1.9750226107839546e-05, | |
| "loss": 0.15, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15642176568508148, | |
| "step": 2400, | |
| "valid_targets_mean": 5265.4, | |
| "valid_targets_min": 2776 | |
| }, | |
| { | |
| "epoch": 3.88529886914378, | |
| "grad_norm": 0.4187223419051394, | |
| "learning_rate": 1.966966031863096e-05, | |
| "loss": 0.1551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15662488341331482, | |
| "step": 2405, | |
| "valid_targets_mean": 5388.5, | |
| "valid_targets_min": 3219 | |
| }, | |
| { | |
| "epoch": 3.893376413570275, | |
| "grad_norm": 0.42727605871459245, | |
| "learning_rate": 1.958909989101041e-05, | |
| "loss": 0.1499, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14462366700172424, | |
| "step": 2410, | |
| "valid_targets_mean": 5035.6, | |
| "valid_targets_min": 3184 | |
| }, | |
| { | |
| "epoch": 3.901453957996769, | |
| "grad_norm": 0.42498564523285326, | |
| "learning_rate": 1.9508546132516316e-05, | |
| "loss": 0.1516, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15392355620861053, | |
| "step": 2415, | |
| "valid_targets_mean": 5218.9, | |
| "valid_targets_min": 2957 | |
| }, | |
| { | |
| "epoch": 3.9095315024232633, | |
| "grad_norm": 0.3959878020726741, | |
| "learning_rate": 1.9428000350578874e-05, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14723697304725647, | |
| "step": 2420, | |
| "valid_targets_mean": 5313.4, | |
| "valid_targets_min": 3003 | |
| }, | |
| { | |
| "epoch": 3.9176090468497575, | |
| "grad_norm": 0.40065923174356644, | |
| "learning_rate": 1.9347463852498776e-05, | |
| "loss": 0.1525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1575714349746704, | |
| "step": 2425, | |
| "valid_targets_mean": 5617.6, | |
| "valid_targets_min": 3985 | |
| }, | |
| { | |
| "epoch": 3.9256865912762517, | |
| "grad_norm": 0.44927439386069595, | |
| "learning_rate": 1.926693794542607e-05, | |
| "loss": 0.1514, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1579849123954773, | |
| "step": 2430, | |
| "valid_targets_mean": 4860.3, | |
| "valid_targets_min": 2840 | |
| }, | |
| { | |
| "epoch": 3.9337641357027464, | |
| "grad_norm": 0.3777485375355132, | |
| "learning_rate": 1.9186423936338886e-05, | |
| "loss": 0.1481, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13425903022289276, | |
| "step": 2435, | |
| "valid_targets_mean": 5260.8, | |
| "valid_targets_min": 1369 | |
| }, | |
| { | |
| "epoch": 3.9418416801292406, | |
| "grad_norm": 0.4350245152644942, | |
| "learning_rate": 1.9105923132022234e-05, | |
| "loss": 0.1527, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16233918070793152, | |
| "step": 2440, | |
| "valid_targets_mean": 4957.4, | |
| "valid_targets_min": 2768 | |
| }, | |
| { | |
| "epoch": 3.9499192245557353, | |
| "grad_norm": 0.4091057515846294, | |
| "learning_rate": 1.9025436839046834e-05, | |
| "loss": 0.1604, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15739592909812927, | |
| "step": 2445, | |
| "valid_targets_mean": 5077.9, | |
| "valid_targets_min": 2154 | |
| }, | |
| { | |
| "epoch": 3.9579967689822295, | |
| "grad_norm": 0.4097245633846646, | |
| "learning_rate": 1.894496636374786e-05, | |
| "loss": 0.1559, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1472896784543991, | |
| "step": 2450, | |
| "valid_targets_mean": 5508.9, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 3.9660743134087237, | |
| "grad_norm": 0.39627955516501584, | |
| "learning_rate": 1.8864513012203747e-05, | |
| "loss": 0.1483, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16468317806720734, | |
| "step": 2455, | |
| "valid_targets_mean": 6392.1, | |
| "valid_targets_min": 3349 | |
| }, | |
| { | |
| "epoch": 3.974151857835218, | |
| "grad_norm": 0.4209417514706817, | |
| "learning_rate": 1.8784078090215028e-05, | |
| "loss": 0.1509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13668987154960632, | |
| "step": 2460, | |
| "valid_targets_mean": 4554.8, | |
| "valid_targets_min": 3061 | |
| }, | |
| { | |
| "epoch": 3.982229402261712, | |
| "grad_norm": 0.40708659888796483, | |
| "learning_rate": 1.8703662903283092e-05, | |
| "loss": 0.1586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14635734260082245, | |
| "step": 2465, | |
| "valid_targets_mean": 5307.4, | |
| "valid_targets_min": 2869 | |
| }, | |
| { | |
| "epoch": 3.990306946688207, | |
| "grad_norm": 0.4201079145553348, | |
| "learning_rate": 1.8623268756589033e-05, | |
| "loss": 0.1519, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1568068265914917, | |
| "step": 2470, | |
| "valid_targets_mean": 5867.6, | |
| "valid_targets_min": 2861 | |
| }, | |
| { | |
| "epoch": 3.998384491114701, | |
| "grad_norm": 0.5636352727995623, | |
| "learning_rate": 1.8542896954972443e-05, | |
| "loss": 0.1536, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15685251355171204, | |
| "step": 2475, | |
| "valid_targets_mean": 4876.6, | |
| "valid_targets_min": 3117 | |
| }, | |
| { | |
| "epoch": 4.006462035541196, | |
| "grad_norm": 0.42565467595314055, | |
| "learning_rate": 1.846254880291024e-05, | |
| "loss": 0.1401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12946423888206482, | |
| "step": 2480, | |
| "valid_targets_mean": 5089.5, | |
| "valid_targets_min": 3193 | |
| }, | |
| { | |
| "epoch": 4.01453957996769, | |
| "grad_norm": 0.44213909133955387, | |
| "learning_rate": 1.8382225604495505e-05, | |
| "loss": 0.1384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14307674765586853, | |
| "step": 2485, | |
| "valid_targets_mean": 5329.4, | |
| "valid_targets_min": 2999 | |
| }, | |
| { | |
| "epoch": 4.022617124394184, | |
| "grad_norm": 0.43432375435262827, | |
| "learning_rate": 1.830192866341629e-05, | |
| "loss": 0.1346, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1415438950061798, | |
| "step": 2490, | |
| "valid_targets_mean": 5478.9, | |
| "valid_targets_min": 3263 | |
| }, | |
| { | |
| "epoch": 4.030694668820678, | |
| "grad_norm": 0.43445963751081745, | |
| "learning_rate": 1.822165928293451e-05, | |
| "loss": 0.1419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15091031789779663, | |
| "step": 2495, | |
| "valid_targets_mean": 5127.4, | |
| "valid_targets_min": 906 | |
| }, | |
| { | |
| "epoch": 4.038772213247173, | |
| "grad_norm": 0.4037874438131497, | |
| "learning_rate": 1.8141418765864726e-05, | |
| "loss": 0.147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14658978581428528, | |
| "step": 2500, | |
| "valid_targets_mean": 6061.2, | |
| "valid_targets_min": 2480 | |
| }, | |
| { | |
| "epoch": 4.046849757673667, | |
| "grad_norm": 0.42639107834727435, | |
| "learning_rate": 1.8061208414553033e-05, | |
| "loss": 0.1441, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13869735598564148, | |
| "step": 2505, | |
| "valid_targets_mean": 5297.2, | |
| "valid_targets_min": 3324 | |
| }, | |
| { | |
| "epoch": 4.054927302100162, | |
| "grad_norm": 0.4073550017785693, | |
| "learning_rate": 1.798102953085595e-05, | |
| "loss": 0.141, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12786024808883667, | |
| "step": 2510, | |
| "valid_targets_mean": 5235.1, | |
| "valid_targets_min": 2653 | |
| }, | |
| { | |
| "epoch": 4.063004846526656, | |
| "grad_norm": 0.41328082544496686, | |
| "learning_rate": 1.7900883416119216e-05, | |
| "loss": 0.1333, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12691153585910797, | |
| "step": 2515, | |
| "valid_targets_mean": 5156.7, | |
| "valid_targets_min": 1294 | |
| }, | |
| { | |
| "epoch": 4.07108239095315, | |
| "grad_norm": 0.4346008708708734, | |
| "learning_rate": 1.7820771371156737e-05, | |
| "loss": 0.1348, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1545967161655426, | |
| "step": 2520, | |
| "valid_targets_mean": 5345.7, | |
| "valid_targets_min": 2941 | |
| }, | |
| { | |
| "epoch": 4.079159935379645, | |
| "grad_norm": 0.47779679748632076, | |
| "learning_rate": 1.7740694696229463e-05, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15502513945102692, | |
| "step": 2525, | |
| "valid_targets_mean": 4844.4, | |
| "valid_targets_min": 3134 | |
| }, | |
| { | |
| "epoch": 4.087237479806139, | |
| "grad_norm": 0.43057242625314834, | |
| "learning_rate": 1.766065469102423e-05, | |
| "loss": 0.1434, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1366223692893982, | |
| "step": 2530, | |
| "valid_targets_mean": 4981.9, | |
| "valid_targets_min": 2603 | |
| }, | |
| { | |
| "epoch": 4.095315024232633, | |
| "grad_norm": 0.429086317612865, | |
| "learning_rate": 1.7580652654632745e-05, | |
| "loss": 0.14, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1400957852602005, | |
| "step": 2535, | |
| "valid_targets_mean": 5427.9, | |
| "valid_targets_min": 3298 | |
| }, | |
| { | |
| "epoch": 4.103392568659127, | |
| "grad_norm": 0.465331706392212, | |
| "learning_rate": 1.7500689885530435e-05, | |
| "loss": 0.1391, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13422629237174988, | |
| "step": 2540, | |
| "valid_targets_mean": 4903.6, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 4.111470113085622, | |
| "grad_norm": 0.44363029506124135, | |
| "learning_rate": 1.7420767681555406e-05, | |
| "loss": 0.1443, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16646191477775574, | |
| "step": 2545, | |
| "valid_targets_mean": 5154.2, | |
| "valid_targets_min": 922 | |
| }, | |
| { | |
| "epoch": 4.119547657512117, | |
| "grad_norm": 0.43473031219865776, | |
| "learning_rate": 1.734088733988738e-05, | |
| "loss": 0.1363, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13686491549015045, | |
| "step": 2550, | |
| "valid_targets_mean": 5565.6, | |
| "valid_targets_min": 2904 | |
| }, | |
| { | |
| "epoch": 4.127625201938611, | |
| "grad_norm": 0.4531518269846548, | |
| "learning_rate": 1.7261050157026606e-05, | |
| "loss": 0.1313, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1368991732597351, | |
| "step": 2555, | |
| "valid_targets_mean": 5132.1, | |
| "valid_targets_min": 3521 | |
| }, | |
| { | |
| "epoch": 4.135702746365105, | |
| "grad_norm": 0.4536959505796717, | |
| "learning_rate": 1.7181257428772876e-05, | |
| "loss": 0.1396, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1559496521949768, | |
| "step": 2560, | |
| "valid_targets_mean": 5842.8, | |
| "valid_targets_min": 3620 | |
| }, | |
| { | |
| "epoch": 4.143780290791599, | |
| "grad_norm": 0.43018680374896995, | |
| "learning_rate": 1.7101510450204435e-05, | |
| "loss": 0.1355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1514936238527298, | |
| "step": 2565, | |
| "valid_targets_mean": 5458.9, | |
| "valid_targets_min": 2870 | |
| }, | |
| { | |
| "epoch": 4.1518578352180935, | |
| "grad_norm": 0.45789834628046766, | |
| "learning_rate": 1.7021810515656993e-05, | |
| "loss": 0.1397, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14251697063446045, | |
| "step": 2570, | |
| "valid_targets_mean": 4443.4, | |
| "valid_targets_min": 2642 | |
| }, | |
| { | |
| "epoch": 4.159935379644588, | |
| "grad_norm": 0.4046050667132452, | |
| "learning_rate": 1.694215891870272e-05, | |
| "loss": 0.14, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1378990113735199, | |
| "step": 2575, | |
| "valid_targets_mean": 5174.7, | |
| "valid_targets_min": 3246 | |
| }, | |
| { | |
| "epoch": 4.168012924071083, | |
| "grad_norm": 0.39401964693624647, | |
| "learning_rate": 1.686255695212923e-05, | |
| "loss": 0.1475, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14063289761543274, | |
| "step": 2580, | |
| "valid_targets_mean": 5825.9, | |
| "valid_targets_min": 2571 | |
| }, | |
| { | |
| "epoch": 4.176090468497577, | |
| "grad_norm": 0.4149990311373484, | |
| "learning_rate": 1.678300590791861e-05, | |
| "loss": 0.1352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12822958827018738, | |
| "step": 2585, | |
| "valid_targets_mean": 5082.3, | |
| "valid_targets_min": 3157 | |
| }, | |
| { | |
| "epoch": 4.184168012924071, | |
| "grad_norm": 0.4724232157465758, | |
| "learning_rate": 1.6703507077226456e-05, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14374995231628418, | |
| "step": 2590, | |
| "valid_targets_mean": 5187.9, | |
| "valid_targets_min": 2995 | |
| }, | |
| { | |
| "epoch": 4.1922455573505655, | |
| "grad_norm": 0.42323445983331587, | |
| "learning_rate": 1.6624061750360903e-05, | |
| "loss": 0.1397, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13836079835891724, | |
| "step": 2595, | |
| "valid_targets_mean": 4957.8, | |
| "valid_targets_min": 2629 | |
| }, | |
| { | |
| "epoch": 4.20032310177706, | |
| "grad_norm": 0.4554566483557804, | |
| "learning_rate": 1.6544671216761713e-05, | |
| "loss": 0.1372, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14031758904457092, | |
| "step": 2600, | |
| "valid_targets_mean": 5175.9, | |
| "valid_targets_min": 2404 | |
| }, | |
| { | |
| "epoch": 4.208400646203554, | |
| "grad_norm": 0.4430945069630982, | |
| "learning_rate": 1.6465336764979292e-05, | |
| "loss": 0.1353, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13017365336418152, | |
| "step": 2605, | |
| "valid_targets_mean": 4475.2, | |
| "valid_targets_min": 2718 | |
| }, | |
| { | |
| "epoch": 4.216478190630048, | |
| "grad_norm": 0.4501689534740505, | |
| "learning_rate": 1.638605968265382e-05, | |
| "loss": 0.1341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12520724534988403, | |
| "step": 2610, | |
| "valid_targets_mean": 5041.1, | |
| "valid_targets_min": 3294 | |
| }, | |
| { | |
| "epoch": 4.224555735056542, | |
| "grad_norm": 0.5609815427967917, | |
| "learning_rate": 1.630684125649436e-05, | |
| "loss": 0.1353, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.136855348944664, | |
| "step": 2615, | |
| "valid_targets_mean": 5298.2, | |
| "valid_targets_min": 3349 | |
| }, | |
| { | |
| "epoch": 4.2326332794830375, | |
| "grad_norm": 0.42487595678563556, | |
| "learning_rate": 1.6227682772257917e-05, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14639994502067566, | |
| "step": 2620, | |
| "valid_targets_mean": 5180.0, | |
| "valid_targets_min": 1936 | |
| }, | |
| { | |
| "epoch": 4.240710823909532, | |
| "grad_norm": 0.49601400066615947, | |
| "learning_rate": 1.614858551472865e-05, | |
| "loss": 0.1374, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1491582691669464, | |
| "step": 2625, | |
| "valid_targets_mean": 4556.6, | |
| "valid_targets_min": 587 | |
| }, | |
| { | |
| "epoch": 4.248788368336026, | |
| "grad_norm": 0.4777696243054849, | |
| "learning_rate": 1.606955076769694e-05, | |
| "loss": 0.1358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14732205867767334, | |
| "step": 2630, | |
| "valid_targets_mean": 4729.9, | |
| "valid_targets_min": 3100 | |
| }, | |
| { | |
| "epoch": 4.25686591276252, | |
| "grad_norm": 0.3994414220127131, | |
| "learning_rate": 1.599057981393861e-05, | |
| "loss": 0.1368, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13781988620758057, | |
| "step": 2635, | |
| "valid_targets_mean": 5807.2, | |
| "valid_targets_min": 3105 | |
| }, | |
| { | |
| "epoch": 4.264943457189014, | |
| "grad_norm": 0.42041612018001095, | |
| "learning_rate": 1.5911673935194076e-05, | |
| "loss": 0.1406, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13944032788276672, | |
| "step": 2640, | |
| "valid_targets_mean": 5278.3, | |
| "valid_targets_min": 3270 | |
| }, | |
| { | |
| "epoch": 4.273021001615509, | |
| "grad_norm": 0.45473382797456474, | |
| "learning_rate": 1.5832834412147555e-05, | |
| "loss": 0.1463, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1407957673072815, | |
| "step": 2645, | |
| "valid_targets_mean": 5380.7, | |
| "valid_targets_min": 3752 | |
| }, | |
| { | |
| "epoch": 4.281098546042003, | |
| "grad_norm": 0.393765619903028, | |
| "learning_rate": 1.5754062524406268e-05, | |
| "loss": 0.1414, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14375929534435272, | |
| "step": 2650, | |
| "valid_targets_mean": 5499.1, | |
| "valid_targets_min": 2657 | |
| }, | |
| { | |
| "epoch": 4.289176090468498, | |
| "grad_norm": 0.41567383960953874, | |
| "learning_rate": 1.56753595504797e-05, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1321938931941986, | |
| "step": 2655, | |
| "valid_targets_mean": 5447.8, | |
| "valid_targets_min": 2886 | |
| }, | |
| { | |
| "epoch": 4.297253634894992, | |
| "grad_norm": 0.3821655224339751, | |
| "learning_rate": 1.55967267677588e-05, | |
| "loss": 0.1438, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13490810990333557, | |
| "step": 2660, | |
| "valid_targets_mean": 5990.6, | |
| "valid_targets_min": 3450 | |
| }, | |
| { | |
| "epoch": 4.305331179321486, | |
| "grad_norm": 0.3877345295764598, | |
| "learning_rate": 1.55181654524953e-05, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12980371713638306, | |
| "step": 2665, | |
| "valid_targets_mean": 5655.2, | |
| "valid_targets_min": 3541 | |
| }, | |
| { | |
| "epoch": 4.313408723747981, | |
| "grad_norm": 0.3800794993104672, | |
| "learning_rate": 1.5439676879780967e-05, | |
| "loss": 0.143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14696776866912842, | |
| "step": 2670, | |
| "valid_targets_mean": 6074.6, | |
| "valid_targets_min": 2990 | |
| }, | |
| { | |
| "epoch": 4.321486268174475, | |
| "grad_norm": 0.43433424667276765, | |
| "learning_rate": 1.536126232352691e-05, | |
| "loss": 0.1372, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14254535734653473, | |
| "step": 2675, | |
| "valid_targets_mean": 5378.1, | |
| "valid_targets_min": 3445 | |
| }, | |
| { | |
| "epoch": 4.329563812600969, | |
| "grad_norm": 0.4918736832136846, | |
| "learning_rate": 1.5282923056442937e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12696698307991028, | |
| "step": 2680, | |
| "valid_targets_mean": 5071.9, | |
| "valid_targets_min": 3521 | |
| }, | |
| { | |
| "epoch": 4.337641357027463, | |
| "grad_norm": 0.42555336980945496, | |
| "learning_rate": 1.5204660350016837e-05, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15498827397823334, | |
| "step": 2685, | |
| "valid_targets_mean": 5173.1, | |
| "valid_targets_min": 1768 | |
| }, | |
| { | |
| "epoch": 4.345718901453958, | |
| "grad_norm": 0.4748149092313879, | |
| "learning_rate": 1.5126475474493825e-05, | |
| "loss": 0.132, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1370469033718109, | |
| "step": 2690, | |
| "valid_targets_mean": 5121.7, | |
| "valid_targets_min": 2159 | |
| }, | |
| { | |
| "epoch": 4.353796445880453, | |
| "grad_norm": 0.3777654185593629, | |
| "learning_rate": 1.504836969885583e-05, | |
| "loss": 0.1416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1358526647090912, | |
| "step": 2695, | |
| "valid_targets_mean": 6279.1, | |
| "valid_targets_min": 3547 | |
| }, | |
| { | |
| "epoch": 4.361873990306947, | |
| "grad_norm": 0.38522738434416426, | |
| "learning_rate": 1.4970344290800971e-05, | |
| "loss": 0.1378, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1352044939994812, | |
| "step": 2700, | |
| "valid_targets_mean": 5467.2, | |
| "valid_targets_min": 2092 | |
| }, | |
| { | |
| "epoch": 4.369951534733441, | |
| "grad_norm": 0.4645665341976508, | |
| "learning_rate": 1.4892400516722978e-05, | |
| "loss": 0.1399, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14225059747695923, | |
| "step": 2705, | |
| "valid_targets_mean": 5347.4, | |
| "valid_targets_min": 2911 | |
| }, | |
| { | |
| "epoch": 4.378029079159935, | |
| "grad_norm": 0.4848170535572879, | |
| "learning_rate": 1.4814539641690574e-05, | |
| "loss": 0.1394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14791812002658844, | |
| "step": 2710, | |
| "valid_targets_mean": 4669.4, | |
| "valid_targets_min": 936 | |
| }, | |
| { | |
| "epoch": 4.386106623586429, | |
| "grad_norm": 0.4151190917642315, | |
| "learning_rate": 1.4736762929427029e-05, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13626649975776672, | |
| "step": 2715, | |
| "valid_targets_mean": 4952.8, | |
| "valid_targets_min": 1603 | |
| }, | |
| { | |
| "epoch": 4.394184168012924, | |
| "grad_norm": 0.39763403121391977, | |
| "learning_rate": 1.4659071642289586e-05, | |
| "loss": 0.1378, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13494357466697693, | |
| "step": 2720, | |
| "valid_targets_mean": 5396.2, | |
| "valid_targets_min": 3598 | |
| }, | |
| { | |
| "epoch": 4.402261712439419, | |
| "grad_norm": 0.4411026275876086, | |
| "learning_rate": 1.4581467041248987e-05, | |
| "loss": 0.1394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14752957224845886, | |
| "step": 2725, | |
| "valid_targets_mean": 5540.3, | |
| "valid_targets_min": 3036 | |
| }, | |
| { | |
| "epoch": 4.410339256865913, | |
| "grad_norm": 0.40905042976740047, | |
| "learning_rate": 1.450395038586903e-05, | |
| "loss": 0.1418, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13268199563026428, | |
| "step": 2730, | |
| "valid_targets_mean": 5190.2, | |
| "valid_targets_min": 3000 | |
| }, | |
| { | |
| "epoch": 4.418416801292407, | |
| "grad_norm": 0.4297968504084322, | |
| "learning_rate": 1.4426522934286101e-05, | |
| "loss": 0.1438, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1449374258518219, | |
| "step": 2735, | |
| "valid_targets_mean": 4803.5, | |
| "valid_targets_min": 1305 | |
| }, | |
| { | |
| "epoch": 4.426494345718901, | |
| "grad_norm": 0.412783061241555, | |
| "learning_rate": 1.4349185943188748e-05, | |
| "loss": 0.142, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1454753279685974, | |
| "step": 2740, | |
| "valid_targets_mean": 5826.1, | |
| "valid_targets_min": 3507 | |
| }, | |
| { | |
| "epoch": 4.434571890145396, | |
| "grad_norm": 0.41250103264883276, | |
| "learning_rate": 1.4271940667797324e-05, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13304153084754944, | |
| "step": 2745, | |
| "valid_targets_mean": 5462.4, | |
| "valid_targets_min": 2791 | |
| }, | |
| { | |
| "epoch": 4.44264943457189, | |
| "grad_norm": 0.4502312399331622, | |
| "learning_rate": 1.4194788361843563e-05, | |
| "loss": 0.1425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13922573626041412, | |
| "step": 2750, | |
| "valid_targets_mean": 5384.1, | |
| "valid_targets_min": 3369 | |
| }, | |
| { | |
| "epoch": 4.450726978998384, | |
| "grad_norm": 0.41469346574409216, | |
| "learning_rate": 1.411773027755027e-05, | |
| "loss": 0.1415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16563090682029724, | |
| "step": 2755, | |
| "valid_targets_mean": 5837.9, | |
| "valid_targets_min": 3309 | |
| }, | |
| { | |
| "epoch": 4.458804523424879, | |
| "grad_norm": 0.433737086009782, | |
| "learning_rate": 1.4040767665610981e-05, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14223286509513855, | |
| "step": 2760, | |
| "valid_targets_mean": 5067.8, | |
| "valid_targets_min": 3270 | |
| }, | |
| { | |
| "epoch": 4.466882067851373, | |
| "grad_norm": 0.41723969171999564, | |
| "learning_rate": 1.396390177516965e-05, | |
| "loss": 0.1312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1245618462562561, | |
| "step": 2765, | |
| "valid_targets_mean": 5411.2, | |
| "valid_targets_min": 2965 | |
| }, | |
| { | |
| "epoch": 4.474959612277868, | |
| "grad_norm": 0.4127449084932179, | |
| "learning_rate": 1.3887133853800418e-05, | |
| "loss": 0.1369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1349238157272339, | |
| "step": 2770, | |
| "valid_targets_mean": 5649.5, | |
| "valid_targets_min": 3148 | |
| }, | |
| { | |
| "epoch": 4.483037156704362, | |
| "grad_norm": 0.44963424947146446, | |
| "learning_rate": 1.3810465147487305e-05, | |
| "loss": 0.1425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14304043352603912, | |
| "step": 2775, | |
| "valid_targets_mean": 5597.4, | |
| "valid_targets_min": 2511 | |
| }, | |
| { | |
| "epoch": 4.491114701130856, | |
| "grad_norm": 0.4019246775715318, | |
| "learning_rate": 1.373389690060405e-05, | |
| "loss": 0.1359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13364571332931519, | |
| "step": 2780, | |
| "valid_targets_mean": 5670.1, | |
| "valid_targets_min": 3242 | |
| }, | |
| { | |
| "epoch": 4.49919224555735, | |
| "grad_norm": 0.4594474877153609, | |
| "learning_rate": 1.3657430355893855e-05, | |
| "loss": 0.1325, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12532834708690643, | |
| "step": 2785, | |
| "valid_targets_mean": 5329.9, | |
| "valid_targets_min": 2668 | |
| }, | |
| { | |
| "epoch": 4.5072697899838445, | |
| "grad_norm": 0.4134999029055612, | |
| "learning_rate": 1.3581066754449246e-05, | |
| "loss": 0.1367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13771578669548035, | |
| "step": 2790, | |
| "valid_targets_mean": 5896.8, | |
| "valid_targets_min": 2730 | |
| }, | |
| { | |
| "epoch": 4.51534733441034, | |
| "grad_norm": 0.41778304086801016, | |
| "learning_rate": 1.3504807335691948e-05, | |
| "loss": 0.1462, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15168075263500214, | |
| "step": 2795, | |
| "valid_targets_mean": 5395.0, | |
| "valid_targets_min": 3492 | |
| }, | |
| { | |
| "epoch": 4.523424878836834, | |
| "grad_norm": 0.4403664374264896, | |
| "learning_rate": 1.3428653337352722e-05, | |
| "loss": 0.1396, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1525859832763672, | |
| "step": 2800, | |
| "valid_targets_mean": 5305.6, | |
| "valid_targets_min": 3771 | |
| }, | |
| { | |
| "epoch": 4.531502423263328, | |
| "grad_norm": 0.4078756485043072, | |
| "learning_rate": 1.3352605995451295e-05, | |
| "loss": 0.1421, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13579663634300232, | |
| "step": 2805, | |
| "valid_targets_mean": 5819.9, | |
| "valid_targets_min": 3028 | |
| }, | |
| { | |
| "epoch": 4.539579967689822, | |
| "grad_norm": 0.47241584949791804, | |
| "learning_rate": 1.3276666544276338e-05, | |
| "loss": 0.1334, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12606637179851532, | |
| "step": 2810, | |
| "valid_targets_mean": 5300.6, | |
| "valid_targets_min": 3032 | |
| }, | |
| { | |
| "epoch": 4.5476575121163165, | |
| "grad_norm": 0.4472552136291, | |
| "learning_rate": 1.3200836216365357e-05, | |
| "loss": 0.147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16859114170074463, | |
| "step": 2815, | |
| "valid_targets_mean": 5528.4, | |
| "valid_targets_min": 3139 | |
| }, | |
| { | |
| "epoch": 4.555735056542811, | |
| "grad_norm": 0.4314047387299008, | |
| "learning_rate": 1.3125116242484761e-05, | |
| "loss": 0.1467, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14334100484848022, | |
| "step": 2820, | |
| "valid_targets_mean": 5568.1, | |
| "valid_targets_min": 1120 | |
| }, | |
| { | |
| "epoch": 4.563812600969305, | |
| "grad_norm": 0.4770918683992712, | |
| "learning_rate": 1.3049507851609837e-05, | |
| "loss": 0.1369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1330651342868805, | |
| "step": 2825, | |
| "valid_targets_mean": 4386.8, | |
| "valid_targets_min": 2747 | |
| }, | |
| { | |
| "epoch": 4.5718901453958, | |
| "grad_norm": 0.4870368298725214, | |
| "learning_rate": 1.2974012270904824e-05, | |
| "loss": 0.1398, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14877751469612122, | |
| "step": 2830, | |
| "valid_targets_mean": 4765.4, | |
| "valid_targets_min": 2739 | |
| }, | |
| { | |
| "epoch": 4.579967689822294, | |
| "grad_norm": 0.44178395738617654, | |
| "learning_rate": 1.2898630725703e-05, | |
| "loss": 0.1464, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14977668225765228, | |
| "step": 2835, | |
| "valid_targets_mean": 5217.3, | |
| "valid_targets_min": 3492 | |
| }, | |
| { | |
| "epoch": 4.5880452342487885, | |
| "grad_norm": 0.39914740501270685, | |
| "learning_rate": 1.2823364439486775e-05, | |
| "loss": 0.1412, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14151275157928467, | |
| "step": 2840, | |
| "valid_targets_mean": 6049.9, | |
| "valid_targets_min": 2668 | |
| }, | |
| { | |
| "epoch": 4.596122778675283, | |
| "grad_norm": 0.43543696709388485, | |
| "learning_rate": 1.2748214633867852e-05, | |
| "loss": 0.1388, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1313113123178482, | |
| "step": 2845, | |
| "valid_targets_mean": 5043.1, | |
| "valid_targets_min": 665 | |
| }, | |
| { | |
| "epoch": 4.604200323101777, | |
| "grad_norm": 0.43160935343861717, | |
| "learning_rate": 1.2673182528567394e-05, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16229557991027832, | |
| "step": 2850, | |
| "valid_targets_mean": 5334.7, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 4.612277867528271, | |
| "grad_norm": 0.4112636471282577, | |
| "learning_rate": 1.2598269341396212e-05, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14326302707195282, | |
| "step": 2855, | |
| "valid_targets_mean": 5666.9, | |
| "valid_targets_min": 2840 | |
| }, | |
| { | |
| "epoch": 4.620355411954765, | |
| "grad_norm": 0.4193132603434726, | |
| "learning_rate": 1.2523476288235035e-05, | |
| "loss": 0.1336, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11997410655021667, | |
| "step": 2860, | |
| "valid_targets_mean": 5055.6, | |
| "valid_targets_min": 2936 | |
| }, | |
| { | |
| "epoch": 4.6284329563812605, | |
| "grad_norm": 0.40727152336695605, | |
| "learning_rate": 1.2448804583014734e-05, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14037710428237915, | |
| "step": 2865, | |
| "valid_targets_mean": 5948.5, | |
| "valid_targets_min": 3460 | |
| }, | |
| { | |
| "epoch": 4.636510500807755, | |
| "grad_norm": 0.4576903143700993, | |
| "learning_rate": 1.237425543769664e-05, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1467304229736328, | |
| "step": 2870, | |
| "valid_targets_mean": 4774.2, | |
| "valid_targets_min": 3035 | |
| }, | |
| { | |
| "epoch": 4.644588045234249, | |
| "grad_norm": 0.383155018230044, | |
| "learning_rate": 1.229983006225289e-05, | |
| "loss": 0.1373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12850114703178406, | |
| "step": 2875, | |
| "valid_targets_mean": 5473.4, | |
| "valid_targets_min": 2920 | |
| }, | |
| { | |
| "epoch": 4.652665589660743, | |
| "grad_norm": 0.42907692476292697, | |
| "learning_rate": 1.2225529664646743e-05, | |
| "loss": 0.1433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1375298947095871, | |
| "step": 2880, | |
| "valid_targets_mean": 5756.6, | |
| "valid_targets_min": 3347 | |
| }, | |
| { | |
| "epoch": 4.660743134087237, | |
| "grad_norm": 0.45578066384811955, | |
| "learning_rate": 1.2151355450813032e-05, | |
| "loss": 0.142, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13910695910453796, | |
| "step": 2885, | |
| "valid_targets_mean": 5256.9, | |
| "valid_targets_min": 2607 | |
| }, | |
| { | |
| "epoch": 4.668820678513732, | |
| "grad_norm": 0.4392376613684911, | |
| "learning_rate": 1.207730862463854e-05, | |
| "loss": 0.1373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1305842101573944, | |
| "step": 2890, | |
| "valid_targets_mean": 5400.7, | |
| "valid_targets_min": 3440 | |
| }, | |
| { | |
| "epoch": 4.676898222940226, | |
| "grad_norm": 0.43322009220366464, | |
| "learning_rate": 1.2003390387942472e-05, | |
| "loss": 0.1448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13938024640083313, | |
| "step": 2895, | |
| "valid_targets_mean": 5202.0, | |
| "valid_targets_min": 3009 | |
| }, | |
| { | |
| "epoch": 4.684975767366721, | |
| "grad_norm": 0.42387663404361464, | |
| "learning_rate": 1.192960194045699e-05, | |
| "loss": 0.1331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13488176465034485, | |
| "step": 2900, | |
| "valid_targets_mean": 5007.9, | |
| "valid_targets_min": 1583 | |
| }, | |
| { | |
| "epoch": 4.693053311793215, | |
| "grad_norm": 0.5059478326875065, | |
| "learning_rate": 1.1855944479807678e-05, | |
| "loss": 0.1459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15618818998336792, | |
| "step": 2905, | |
| "valid_targets_mean": 4973.8, | |
| "valid_targets_min": 2736 | |
| }, | |
| { | |
| "epoch": 4.701130856219709, | |
| "grad_norm": 0.39919140562243194, | |
| "learning_rate": 1.1782419201494154e-05, | |
| "loss": 0.1366, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12572744488716125, | |
| "step": 2910, | |
| "valid_targets_mean": 5450.1, | |
| "valid_targets_min": 3087 | |
| }, | |
| { | |
| "epoch": 4.709208400646204, | |
| "grad_norm": 0.4809512538206678, | |
| "learning_rate": 1.1709027298870632e-05, | |
| "loss": 0.1372, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1231696754693985, | |
| "step": 2915, | |
| "valid_targets_mean": 5773.8, | |
| "valid_targets_min": 2820 | |
| }, | |
| { | |
| "epoch": 4.717285945072698, | |
| "grad_norm": 0.41574868747529725, | |
| "learning_rate": 1.1635769963126573e-05, | |
| "loss": 0.1352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1308540552854538, | |
| "step": 2920, | |
| "valid_targets_mean": 5483.6, | |
| "valid_targets_min": 3132 | |
| }, | |
| { | |
| "epoch": 4.725363489499192, | |
| "grad_norm": 0.40307841075512413, | |
| "learning_rate": 1.1562648383267358e-05, | |
| "loss": 0.1384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13654011487960815, | |
| "step": 2925, | |
| "valid_targets_mean": 5368.4, | |
| "valid_targets_min": 3448 | |
| }, | |
| { | |
| "epoch": 4.733441033925686, | |
| "grad_norm": 0.536090353564831, | |
| "learning_rate": 1.1489663746094953e-05, | |
| "loss": 0.1465, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1650780290365219, | |
| "step": 2930, | |
| "valid_targets_mean": 4809.2, | |
| "valid_targets_min": 2689 | |
| }, | |
| { | |
| "epoch": 4.741518578352181, | |
| "grad_norm": 0.4830950093812901, | |
| "learning_rate": 1.1416817236188695e-05, | |
| "loss": 0.1382, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14563900232315063, | |
| "step": 2935, | |
| "valid_targets_mean": 5572.7, | |
| "valid_targets_min": 3211 | |
| }, | |
| { | |
| "epoch": 4.749596122778676, | |
| "grad_norm": 0.4126946285281672, | |
| "learning_rate": 1.1344110035886038e-05, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1415388286113739, | |
| "step": 2940, | |
| "valid_targets_mean": 5840.8, | |
| "valid_targets_min": 3161 | |
| }, | |
| { | |
| "epoch": 4.75767366720517, | |
| "grad_norm": 0.4183524471538227, | |
| "learning_rate": 1.1271543325263342e-05, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14083191752433777, | |
| "step": 2945, | |
| "valid_targets_mean": 5269.1, | |
| "valid_targets_min": 1018 | |
| }, | |
| { | |
| "epoch": 4.765751211631664, | |
| "grad_norm": 0.40863305060929256, | |
| "learning_rate": 1.1199118282116797e-05, | |
| "loss": 0.14, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13962915539741516, | |
| "step": 2950, | |
| "valid_targets_mean": 5663.4, | |
| "valid_targets_min": 3494 | |
| }, | |
| { | |
| "epoch": 4.773828756058158, | |
| "grad_norm": 0.48244071805874694, | |
| "learning_rate": 1.1126836081943199e-05, | |
| "loss": 0.1439, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15097686648368835, | |
| "step": 2955, | |
| "valid_targets_mean": 4976.0, | |
| "valid_targets_min": 2044 | |
| }, | |
| { | |
| "epoch": 4.7819063004846525, | |
| "grad_norm": 0.4261822766446634, | |
| "learning_rate": 1.1054697897920963e-05, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1349121630191803, | |
| "step": 2960, | |
| "valid_targets_mean": 4968.2, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 4.789983844911147, | |
| "grad_norm": 0.5078548452558224, | |
| "learning_rate": 1.098270490089104e-05, | |
| "loss": 0.1348, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1354251205921173, | |
| "step": 2965, | |
| "valid_targets_mean": 5543.1, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 4.798061389337642, | |
| "grad_norm": 0.43986780155442085, | |
| "learning_rate": 1.0910858259337896e-05, | |
| "loss": 0.1374, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13416457176208496, | |
| "step": 2970, | |
| "valid_targets_mean": 4822.5, | |
| "valid_targets_min": 1904 | |
| }, | |
| { | |
| "epoch": 4.806138933764136, | |
| "grad_norm": 0.43214223060348766, | |
| "learning_rate": 1.0839159139370597e-05, | |
| "loss": 0.1425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13904398679733276, | |
| "step": 2975, | |
| "valid_targets_mean": 5033.6, | |
| "valid_targets_min": 3069 | |
| }, | |
| { | |
| "epoch": 4.81421647819063, | |
| "grad_norm": 0.44873477765055975, | |
| "learning_rate": 1.0767608704703852e-05, | |
| "loss": 0.135, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14608915150165558, | |
| "step": 2980, | |
| "valid_targets_mean": 4896.9, | |
| "valid_targets_min": 3119 | |
| }, | |
| { | |
| "epoch": 4.8222940226171245, | |
| "grad_norm": 0.42963463585563594, | |
| "learning_rate": 1.0696208116639108e-05, | |
| "loss": 0.1321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13478168845176697, | |
| "step": 2985, | |
| "valid_targets_mean": 5041.9, | |
| "valid_targets_min": 2861 | |
| }, | |
| { | |
| "epoch": 4.830371567043619, | |
| "grad_norm": 0.4184747605118807, | |
| "learning_rate": 1.0624958534045748e-05, | |
| "loss": 0.1362, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1315079629421234, | |
| "step": 2990, | |
| "valid_targets_mean": 4842.6, | |
| "valid_targets_min": 2395 | |
| }, | |
| { | |
| "epoch": 4.838449111470113, | |
| "grad_norm": 0.4507756298256187, | |
| "learning_rate": 1.055386111334225e-05, | |
| "loss": 0.1417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1261076033115387, | |
| "step": 2995, | |
| "valid_targets_mean": 5323.1, | |
| "valid_targets_min": 2606 | |
| }, | |
| { | |
| "epoch": 4.846526655896607, | |
| "grad_norm": 0.4213500877752229, | |
| "learning_rate": 1.0482917008477406e-05, | |
| "loss": 0.1375, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1388499140739441, | |
| "step": 3000, | |
| "valid_targets_mean": 5244.6, | |
| "valid_targets_min": 2050 | |
| }, | |
| { | |
| "epoch": 4.854604200323102, | |
| "grad_norm": 0.4394120888553323, | |
| "learning_rate": 1.0412127370911627e-05, | |
| "loss": 0.1424, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1536855250597, | |
| "step": 3005, | |
| "valid_targets_mean": 5368.1, | |
| "valid_targets_min": 2641 | |
| }, | |
| { | |
| "epoch": 4.8626817447495965, | |
| "grad_norm": 0.4350239563816111, | |
| "learning_rate": 1.034149334959824e-05, | |
| "loss": 0.1399, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13980424404144287, | |
| "step": 3010, | |
| "valid_targets_mean": 5067.7, | |
| "valid_targets_min": 3414 | |
| }, | |
| { | |
| "epoch": 4.870759289176091, | |
| "grad_norm": 0.4385149359865282, | |
| "learning_rate": 1.0271016090964836e-05, | |
| "loss": 0.1415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1483292281627655, | |
| "step": 3015, | |
| "valid_targets_mean": 5908.5, | |
| "valid_targets_min": 3151 | |
| }, | |
| { | |
| "epoch": 4.878836833602585, | |
| "grad_norm": 0.4021151824125644, | |
| "learning_rate": 1.0200696738894651e-05, | |
| "loss": 0.1395, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1273038387298584, | |
| "step": 3020, | |
| "valid_targets_mean": 5290.0, | |
| "valid_targets_min": 2564 | |
| }, | |
| { | |
| "epoch": 4.886914378029079, | |
| "grad_norm": 0.4468569200340916, | |
| "learning_rate": 1.0130536434708024e-05, | |
| "loss": 0.1388, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.130598783493042, | |
| "step": 3025, | |
| "valid_targets_mean": 5362.9, | |
| "valid_targets_min": 4032 | |
| }, | |
| { | |
| "epoch": 4.894991922455573, | |
| "grad_norm": 0.4066728906290766, | |
| "learning_rate": 1.0060536317143877e-05, | |
| "loss": 0.1416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1388537734746933, | |
| "step": 3030, | |
| "valid_targets_mean": 5224.4, | |
| "valid_targets_min": 826 | |
| }, | |
| { | |
| "epoch": 4.903069466882068, | |
| "grad_norm": 0.7116931473944323, | |
| "learning_rate": 9.990697522341179e-06, | |
| "loss": 0.1309, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13971570134162903, | |
| "step": 3035, | |
| "valid_targets_mean": 4791.1, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 4.911147011308563, | |
| "grad_norm": 0.4071326459613783, | |
| "learning_rate": 9.921021183820601e-06, | |
| "loss": 0.1414, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1335953325033188, | |
| "step": 3040, | |
| "valid_targets_mean": 5521.1, | |
| "valid_targets_min": 3073 | |
| }, | |
| { | |
| "epoch": 4.919224555735057, | |
| "grad_norm": 0.42992397938879723, | |
| "learning_rate": 9.851508432466006e-06, | |
| "loss": 0.1312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13599014282226562, | |
| "step": 3045, | |
| "valid_targets_mean": 4856.4, | |
| "valid_targets_min": 1595 | |
| }, | |
| { | |
| "epoch": 4.927302100161551, | |
| "grad_norm": 0.4990884521696119, | |
| "learning_rate": 9.782160396506187e-06, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14541074633598328, | |
| "step": 3050, | |
| "valid_targets_mean": 4448.3, | |
| "valid_targets_min": 3020 | |
| }, | |
| { | |
| "epoch": 4.935379644588045, | |
| "grad_norm": 0.422162536375542, | |
| "learning_rate": 9.712978201496512e-06, | |
| "loss": 0.1375, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13707958161830902, | |
| "step": 3055, | |
| "valid_targets_mean": 4581.8, | |
| "valid_targets_min": 2808 | |
| }, | |
| { | |
| "epoch": 4.94345718901454, | |
| "grad_norm": 0.40274962941480746, | |
| "learning_rate": 9.643962970300646e-06, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13235828280448914, | |
| "step": 3060, | |
| "valid_targets_mean": 5385.7, | |
| "valid_targets_min": 2964 | |
| }, | |
| { | |
| "epoch": 4.951534733441034, | |
| "grad_norm": 0.41106224144589665, | |
| "learning_rate": 9.575115823072366e-06, | |
| "loss": 0.1385, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15319183468818665, | |
| "step": 3065, | |
| "valid_targets_mean": 5383.4, | |
| "valid_targets_min": 3733 | |
| }, | |
| { | |
| "epoch": 4.959612277867528, | |
| "grad_norm": 0.4368314776007296, | |
| "learning_rate": 9.506437877237346e-06, | |
| "loss": 0.1308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13027821481227875, | |
| "step": 3070, | |
| "valid_targets_mean": 5122.1, | |
| "valid_targets_min": 3149 | |
| }, | |
| { | |
| "epoch": 4.967689822294023, | |
| "grad_norm": 0.44034403884405504, | |
| "learning_rate": 9.437930247475023e-06, | |
| "loss": 0.1411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.129189133644104, | |
| "step": 3075, | |
| "valid_targets_mean": 4822.6, | |
| "valid_targets_min": 3038 | |
| }, | |
| { | |
| "epoch": 4.975767366720517, | |
| "grad_norm": 0.4280324030430042, | |
| "learning_rate": 9.36959404570053e-06, | |
| "loss": 0.1328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13658729195594788, | |
| "step": 3080, | |
| "valid_targets_mean": 4948.9, | |
| "valid_targets_min": 1227 | |
| }, | |
| { | |
| "epoch": 4.983844911147012, | |
| "grad_norm": 0.4238657423203, | |
| "learning_rate": 9.30143038104663e-06, | |
| "loss": 0.1362, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12977300584316254, | |
| "step": 3085, | |
| "valid_targets_mean": 5721.9, | |
| "valid_targets_min": 3482 | |
| }, | |
| { | |
| "epoch": 4.991922455573506, | |
| "grad_norm": 0.48199672073194927, | |
| "learning_rate": 9.233440359845703e-06, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18151067197322845, | |
| "step": 3090, | |
| "valid_targets_mean": 5081.3, | |
| "valid_targets_min": 2827 | |
| }, | |
| { | |
| "epoch": 5.0, | |
| "grad_norm": 0.44906074019942926, | |
| "learning_rate": 9.165625085611818e-06, | |
| "loss": 0.1378, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14372006058692932, | |
| "step": 3095, | |
| "valid_targets_mean": 4968.9, | |
| "valid_targets_min": 3162 | |
| }, | |
| { | |
| "epoch": 5.008077544426494, | |
| "grad_norm": 0.43772755900870186, | |
| "learning_rate": 9.097985659022808e-06, | |
| "loss": 0.1305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1423802673816681, | |
| "step": 3100, | |
| "valid_targets_mean": 5149.9, | |
| "valid_targets_min": 2678 | |
| }, | |
| { | |
| "epoch": 5.016155088852988, | |
| "grad_norm": 0.43633990751021157, | |
| "learning_rate": 9.030523177902403e-06, | |
| "loss": 0.1288, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13901320099830627, | |
| "step": 3105, | |
| "valid_targets_mean": 5018.6, | |
| "valid_targets_min": 3138 | |
| }, | |
| { | |
| "epoch": 5.024232633279483, | |
| "grad_norm": 0.43155930945131143, | |
| "learning_rate": 8.963238737202405e-06, | |
| "loss": 0.1433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1326574683189392, | |
| "step": 3110, | |
| "valid_targets_mean": 5598.6, | |
| "valid_targets_min": 3441 | |
| }, | |
| { | |
| "epoch": 5.032310177705978, | |
| "grad_norm": 0.44926415068891506, | |
| "learning_rate": 8.896133428984935e-06, | |
| "loss": 0.1292, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13450875878334045, | |
| "step": 3115, | |
| "valid_targets_mean": 5065.4, | |
| "valid_targets_min": 2402 | |
| }, | |
| { | |
| "epoch": 5.040387722132472, | |
| "grad_norm": 0.41465194337253264, | |
| "learning_rate": 8.829208342404709e-06, | |
| "loss": 0.1295, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11707586795091629, | |
| "step": 3120, | |
| "valid_targets_mean": 5308.8, | |
| "valid_targets_min": 3109 | |
| }, | |
| { | |
| "epoch": 5.048465266558966, | |
| "grad_norm": 0.4216992603049396, | |
| "learning_rate": 8.762464563691315e-06, | |
| "loss": 0.1316, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11611682176589966, | |
| "step": 3125, | |
| "valid_targets_mean": 5293.1, | |
| "valid_targets_min": 2159 | |
| }, | |
| { | |
| "epoch": 5.05654281098546, | |
| "grad_norm": 0.44501179934754415, | |
| "learning_rate": 8.695903176131671e-06, | |
| "loss": 0.1314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11896132677793503, | |
| "step": 3130, | |
| "valid_targets_mean": 5573.8, | |
| "valid_targets_min": 2874 | |
| }, | |
| { | |
| "epoch": 5.064620355411955, | |
| "grad_norm": 0.3946578295576979, | |
| "learning_rate": 8.62952526005234e-06, | |
| "loss": 0.128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.131038099527359, | |
| "step": 3135, | |
| "valid_targets_mean": 6400.7, | |
| "valid_targets_min": 3402 | |
| }, | |
| { | |
| "epoch": 5.072697899838449, | |
| "grad_norm": 0.45436587295250586, | |
| "learning_rate": 8.563331892802076e-06, | |
| "loss": 0.1286, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.117171511054039, | |
| "step": 3140, | |
| "valid_targets_mean": 4811.6, | |
| "valid_targets_min": 2952 | |
| }, | |
| { | |
| "epoch": 5.080775444264943, | |
| "grad_norm": 0.40209870031547296, | |
| "learning_rate": 8.497324148734301e-06, | |
| "loss": 0.1312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1264287829399109, | |
| "step": 3145, | |
| "valid_targets_mean": 5566.0, | |
| "valid_targets_min": 2714 | |
| }, | |
| { | |
| "epoch": 5.088852988691438, | |
| "grad_norm": 0.45442895826783986, | |
| "learning_rate": 8.431503099189663e-06, | |
| "loss": 0.1325, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1307426542043686, | |
| "step": 3150, | |
| "valid_targets_mean": 4966.5, | |
| "valid_targets_min": 2668 | |
| }, | |
| { | |
| "epoch": 5.096930533117932, | |
| "grad_norm": 0.43759406598068035, | |
| "learning_rate": 8.36586981247867e-06, | |
| "loss": 0.1262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12658154964447021, | |
| "step": 3155, | |
| "valid_targets_mean": 5125.0, | |
| "valid_targets_min": 3152 | |
| }, | |
| { | |
| "epoch": 5.105008077544427, | |
| "grad_norm": 0.4329307110240369, | |
| "learning_rate": 8.300425353864347e-06, | |
| "loss": 0.1315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12121981382369995, | |
| "step": 3160, | |
| "valid_targets_mean": 5237.1, | |
| "valid_targets_min": 3475 | |
| }, | |
| { | |
| "epoch": 5.113085621970921, | |
| "grad_norm": 0.40729526908685826, | |
| "learning_rate": 8.235170785544915e-06, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12251444905996323, | |
| "step": 3165, | |
| "valid_targets_mean": 5824.7, | |
| "valid_targets_min": 3889 | |
| }, | |
| { | |
| "epoch": 5.121163166397415, | |
| "grad_norm": 0.4037150575762845, | |
| "learning_rate": 8.170107166636598e-06, | |
| "loss": 0.1384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14814183115959167, | |
| "step": 3170, | |
| "valid_targets_mean": 6272.6, | |
| "valid_targets_min": 3781 | |
| }, | |
| { | |
| "epoch": 5.129240710823909, | |
| "grad_norm": 0.4471386441574684, | |
| "learning_rate": 8.105235553156407e-06, | |
| "loss": 0.1324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13282373547554016, | |
| "step": 3175, | |
| "valid_targets_mean": 4648.6, | |
| "valid_targets_min": 2869 | |
| }, | |
| { | |
| "epoch": 5.1373182552504035, | |
| "grad_norm": 0.4227276424268678, | |
| "learning_rate": 8.040556998004983e-06, | |
| "loss": 0.1271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12098711729049683, | |
| "step": 3180, | |
| "valid_targets_mean": 5430.0, | |
| "valid_targets_min": 3431 | |
| }, | |
| { | |
| "epoch": 5.145395799676899, | |
| "grad_norm": 0.4509975577793107, | |
| "learning_rate": 7.97607255094955e-06, | |
| "loss": 0.131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13260114192962646, | |
| "step": 3185, | |
| "valid_targets_mean": 5342.8, | |
| "valid_targets_min": 2395 | |
| }, | |
| { | |
| "epoch": 5.153473344103393, | |
| "grad_norm": 0.4477220807877767, | |
| "learning_rate": 7.91178325860685e-06, | |
| "loss": 0.124, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12227153778076172, | |
| "step": 3190, | |
| "valid_targets_mean": 5051.9, | |
| "valid_targets_min": 2560 | |
| }, | |
| { | |
| "epoch": 5.161550888529887, | |
| "grad_norm": 0.4264237924218974, | |
| "learning_rate": 7.847690164426158e-06, | |
| "loss": 0.1322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12720181047916412, | |
| "step": 3195, | |
| "valid_targets_mean": 5173.0, | |
| "valid_targets_min": 2556 | |
| }, | |
| { | |
| "epoch": 5.169628432956381, | |
| "grad_norm": 0.49404814187524054, | |
| "learning_rate": 7.783794308672343e-06, | |
| "loss": 0.1309, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13522876799106598, | |
| "step": 3200, | |
| "valid_targets_mean": 4619.8, | |
| "valid_targets_min": 935 | |
| }, | |
| { | |
| "epoch": 5.1777059773828755, | |
| "grad_norm": 0.4602724062856385, | |
| "learning_rate": 7.720096728408997e-06, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12243683636188507, | |
| "step": 3205, | |
| "valid_targets_mean": 4999.7, | |
| "valid_targets_min": 1595 | |
| }, | |
| { | |
| "epoch": 5.18578352180937, | |
| "grad_norm": 0.4411519856570941, | |
| "learning_rate": 7.656598457481605e-06, | |
| "loss": 0.138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1451737880706787, | |
| "step": 3210, | |
| "valid_targets_mean": 5547.5, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 5.193861066235864, | |
| "grad_norm": 0.4378288689722619, | |
| "learning_rate": 7.593300526500729e-06, | |
| "loss": 0.1276, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1233760267496109, | |
| "step": 3215, | |
| "valid_targets_mean": 5512.4, | |
| "valid_targets_min": 3502 | |
| }, | |
| { | |
| "epoch": 5.201938610662358, | |
| "grad_norm": 0.47342394353383355, | |
| "learning_rate": 7.530203962825331e-06, | |
| "loss": 0.1284, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12568993866443634, | |
| "step": 3220, | |
| "valid_targets_mean": 4450.1, | |
| "valid_targets_min": 3026 | |
| }, | |
| { | |
| "epoch": 5.210016155088853, | |
| "grad_norm": 0.4796869582707379, | |
| "learning_rate": 7.467309790546069e-06, | |
| "loss": 0.125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12920737266540527, | |
| "step": 3225, | |
| "valid_targets_mean": 5082.8, | |
| "valid_targets_min": 3288 | |
| }, | |
| { | |
| "epoch": 5.2180936995153475, | |
| "grad_norm": 0.4160624936206936, | |
| "learning_rate": 7.40461903046868e-06, | |
| "loss": 0.125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1330195665359497, | |
| "step": 3230, | |
| "valid_targets_mean": 5481.2, | |
| "valid_targets_min": 2841 | |
| }, | |
| { | |
| "epoch": 5.226171243941842, | |
| "grad_norm": 0.4510127933852413, | |
| "learning_rate": 7.34213270009742e-06, | |
| "loss": 0.1308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1508096605539322, | |
| "step": 3235, | |
| "valid_targets_mean": 5563.2, | |
| "valid_targets_min": 3330 | |
| }, | |
| { | |
| "epoch": 5.234248788368336, | |
| "grad_norm": 0.4432108039540858, | |
| "learning_rate": 7.2798518136185195e-06, | |
| "loss": 0.1288, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14449772238731384, | |
| "step": 3240, | |
| "valid_targets_mean": 5562.4, | |
| "valid_targets_min": 3147 | |
| }, | |
| { | |
| "epoch": 5.24232633279483, | |
| "grad_norm": 0.4836206888531536, | |
| "learning_rate": 7.2177773818837744e-06, | |
| "loss": 0.1416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14104701578617096, | |
| "step": 3245, | |
| "valid_targets_mean": 4805.9, | |
| "valid_targets_min": 2348 | |
| }, | |
| { | |
| "epoch": 5.250403877221324, | |
| "grad_norm": 0.4230418368080128, | |
| "learning_rate": 7.1559104123941e-06, | |
| "loss": 0.1196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1265610009431839, | |
| "step": 3250, | |
| "valid_targets_mean": 5207.2, | |
| "valid_targets_min": 3414 | |
| }, | |
| { | |
| "epoch": 5.258481421647819, | |
| "grad_norm": 0.44816877933584826, | |
| "learning_rate": 7.094251909283181e-06, | |
| "loss": 0.1293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11602858453989029, | |
| "step": 3255, | |
| "valid_targets_mean": 4590.4, | |
| "valid_targets_min": 2297 | |
| }, | |
| { | |
| "epoch": 5.266558966074314, | |
| "grad_norm": 0.43966715189943295, | |
| "learning_rate": 7.0328028733011875e-06, | |
| "loss": 0.1277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12680992484092712, | |
| "step": 3260, | |
| "valid_targets_mean": 5906.9, | |
| "valid_targets_min": 2735 | |
| }, | |
| { | |
| "epoch": 5.274636510500808, | |
| "grad_norm": 0.4378291301058054, | |
| "learning_rate": 6.971564301798543e-06, | |
| "loss": 0.1292, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12160511314868927, | |
| "step": 3265, | |
| "valid_targets_mean": 5089.8, | |
| "valid_targets_min": 2870 | |
| }, | |
| { | |
| "epoch": 5.282714054927302, | |
| "grad_norm": 0.4000060678767217, | |
| "learning_rate": 6.91053718870969e-06, | |
| "loss": 0.1309, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12655307352542877, | |
| "step": 3270, | |
| "valid_targets_mean": 5991.4, | |
| "valid_targets_min": 2731 | |
| }, | |
| { | |
| "epoch": 5.290791599353796, | |
| "grad_norm": 0.4099273992124578, | |
| "learning_rate": 6.8497225245370144e-06, | |
| "loss": 0.1257, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1262008100748062, | |
| "step": 3275, | |
| "valid_targets_mean": 5426.8, | |
| "valid_targets_min": 2333 | |
| }, | |
| { | |
| "epoch": 5.298869143780291, | |
| "grad_norm": 0.4364288531777203, | |
| "learning_rate": 6.78912129633474e-06, | |
| "loss": 0.1337, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1258752942085266, | |
| "step": 3280, | |
| "valid_targets_mean": 5990.5, | |
| "valid_targets_min": 2789 | |
| }, | |
| { | |
| "epoch": 5.306946688206785, | |
| "grad_norm": 0.4393195781993097, | |
| "learning_rate": 6.728734487692894e-06, | |
| "loss": 0.133, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1283208578824997, | |
| "step": 3285, | |
| "valid_targets_mean": 4943.7, | |
| "valid_targets_min": 2061 | |
| }, | |
| { | |
| "epoch": 5.315024232633279, | |
| "grad_norm": 0.4903622209303023, | |
| "learning_rate": 6.668563078721382e-06, | |
| "loss": 0.1214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11967407166957855, | |
| "step": 3290, | |
| "valid_targets_mean": 4793.8, | |
| "valid_targets_min": 3392 | |
| }, | |
| { | |
| "epoch": 5.323101777059774, | |
| "grad_norm": 0.45020393410672527, | |
| "learning_rate": 6.608608046034046e-06, | |
| "loss": 0.1321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13561296463012695, | |
| "step": 3295, | |
| "valid_targets_mean": 5099.8, | |
| "valid_targets_min": 2654 | |
| }, | |
| { | |
| "epoch": 5.331179321486268, | |
| "grad_norm": 0.45266802210645696, | |
| "learning_rate": 6.54887036273284e-06, | |
| "loss": 0.1282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11593136191368103, | |
| "step": 3300, | |
| "valid_targets_mean": 4864.9, | |
| "valid_targets_min": 3313 | |
| }, | |
| { | |
| "epoch": 5.339256865912763, | |
| "grad_norm": 0.4377571625246567, | |
| "learning_rate": 6.489350998392001e-06, | |
| "loss": 0.1266, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12568038702011108, | |
| "step": 3305, | |
| "valid_targets_mean": 5153.2, | |
| "valid_targets_min": 2897 | |
| }, | |
| { | |
| "epoch": 5.347334410339257, | |
| "grad_norm": 0.43303242638782025, | |
| "learning_rate": 6.430050919042347e-06, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13954932987689972, | |
| "step": 3310, | |
| "valid_targets_mean": 6399.5, | |
| "valid_targets_min": 4216 | |
| }, | |
| { | |
| "epoch": 5.355411954765751, | |
| "grad_norm": 0.44970687075331883, | |
| "learning_rate": 6.370971087155591e-06, | |
| "loss": 0.1281, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13170082867145538, | |
| "step": 3315, | |
| "valid_targets_mean": 5281.7, | |
| "valid_targets_min": 3191 | |
| }, | |
| { | |
| "epoch": 5.363489499192245, | |
| "grad_norm": 0.4332525900942074, | |
| "learning_rate": 6.312112461628697e-06, | |
| "loss": 0.127, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12100870907306671, | |
| "step": 3320, | |
| "valid_targets_mean": 5090.7, | |
| "valid_targets_min": 2566 | |
| }, | |
| { | |
| "epoch": 5.3715670436187395, | |
| "grad_norm": 0.4382168631917469, | |
| "learning_rate": 6.253475997768348e-06, | |
| "loss": 0.1326, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13327665627002716, | |
| "step": 3325, | |
| "valid_targets_mean": 5236.8, | |
| "valid_targets_min": 2975 | |
| }, | |
| { | |
| "epoch": 5.379644588045235, | |
| "grad_norm": 0.42660852601181254, | |
| "learning_rate": 6.195062647275408e-06, | |
| "loss": 0.1271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12060268223285675, | |
| "step": 3330, | |
| "valid_targets_mean": 5242.0, | |
| "valid_targets_min": 2180 | |
| }, | |
| { | |
| "epoch": 5.387722132471729, | |
| "grad_norm": 0.4373544379702163, | |
| "learning_rate": 6.136873358229507e-06, | |
| "loss": 0.1341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11785711348056793, | |
| "step": 3335, | |
| "valid_targets_mean": 5259.1, | |
| "valid_targets_min": 2708 | |
| }, | |
| { | |
| "epoch": 5.395799676898223, | |
| "grad_norm": 0.41586955748569704, | |
| "learning_rate": 6.078909075073642e-06, | |
| "loss": 0.1323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1338985115289688, | |
| "step": 3340, | |
| "valid_targets_mean": 6643.9, | |
| "valid_targets_min": 3093 | |
| }, | |
| { | |
| "epoch": 5.403877221324717, | |
| "grad_norm": 0.4180961613164803, | |
| "learning_rate": 6.021170738598827e-06, | |
| "loss": 0.1275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12959451973438263, | |
| "step": 3345, | |
| "valid_targets_mean": 5452.9, | |
| "valid_targets_min": 2867 | |
| }, | |
| { | |
| "epoch": 5.4119547657512115, | |
| "grad_norm": 0.44161500993929853, | |
| "learning_rate": 5.9636592859288536e-06, | |
| "loss": 0.1308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14301708340644836, | |
| "step": 3350, | |
| "valid_targets_mean": 5461.9, | |
| "valid_targets_min": 3516 | |
| }, | |
| { | |
| "epoch": 5.420032310177706, | |
| "grad_norm": 0.4327699310648836, | |
| "learning_rate": 5.906375650505078e-06, | |
| "loss": 0.1306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12899701297283173, | |
| "step": 3355, | |
| "valid_targets_mean": 5683.8, | |
| "valid_targets_min": 3401 | |
| }, | |
| { | |
| "epoch": 5.4281098546042, | |
| "grad_norm": 0.4773817043534635, | |
| "learning_rate": 5.84932076207124e-06, | |
| "loss": 0.1363, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14128807187080383, | |
| "step": 3360, | |
| "valid_targets_mean": 5567.1, | |
| "valid_targets_min": 3045 | |
| }, | |
| { | |
| "epoch": 5.436187399030695, | |
| "grad_norm": 0.4658579711976311, | |
| "learning_rate": 5.792495546658412e-06, | |
| "loss": 0.1294, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1277376413345337, | |
| "step": 3365, | |
| "valid_targets_mean": 5739.8, | |
| "valid_targets_min": 3299 | |
| }, | |
| { | |
| "epoch": 5.444264943457189, | |
| "grad_norm": 0.4584917541629052, | |
| "learning_rate": 5.7359009265699504e-06, | |
| "loss": 0.1311, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14182579517364502, | |
| "step": 3370, | |
| "valid_targets_mean": 4920.7, | |
| "valid_targets_min": 2618 | |
| }, | |
| { | |
| "epoch": 5.4523424878836835, | |
| "grad_norm": 0.4310834601799318, | |
| "learning_rate": 5.679537820366512e-06, | |
| "loss": 0.1301, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12319634854793549, | |
| "step": 3375, | |
| "valid_targets_mean": 5160.1, | |
| "valid_targets_min": 2577 | |
| }, | |
| { | |
| "epoch": 5.460420032310178, | |
| "grad_norm": 0.45882510774341584, | |
| "learning_rate": 5.623407142851176e-06, | |
| "loss": 0.1296, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13417544960975647, | |
| "step": 3380, | |
| "valid_targets_mean": 4943.8, | |
| "valid_targets_min": 2983 | |
| }, | |
| { | |
| "epoch": 5.468497576736672, | |
| "grad_norm": 0.4706822122675181, | |
| "learning_rate": 5.567509805054578e-06, | |
| "loss": 0.1245, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13072508573532104, | |
| "step": 3385, | |
| "valid_targets_mean": 5548.0, | |
| "valid_targets_min": 2401 | |
| }, | |
| { | |
| "epoch": 5.476575121163166, | |
| "grad_norm": 0.4430008079811621, | |
| "learning_rate": 5.5118467142201305e-06, | |
| "loss": 0.1316, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1270829737186432, | |
| "step": 3390, | |
| "valid_targets_mean": 4675.6, | |
| "valid_targets_min": 2274 | |
| }, | |
| { | |
| "epoch": 5.48465266558966, | |
| "grad_norm": 0.4055283775412177, | |
| "learning_rate": 5.456418773789274e-06, | |
| "loss": 0.1282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12905651330947876, | |
| "step": 3395, | |
| "valid_targets_mean": 6398.8, | |
| "valid_targets_min": 3517 | |
| }, | |
| { | |
| "epoch": 5.4927302100161555, | |
| "grad_norm": 0.4292954787037575, | |
| "learning_rate": 5.401226883386856e-06, | |
| "loss": 0.14, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13923311233520508, | |
| "step": 3400, | |
| "valid_targets_mean": 5949.5, | |
| "valid_targets_min": 2955 | |
| }, | |
| { | |
| "epoch": 5.50080775444265, | |
| "grad_norm": 0.4320861331630341, | |
| "learning_rate": 5.346271938806495e-06, | |
| "loss": 0.1341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1434241682291031, | |
| "step": 3405, | |
| "valid_targets_mean": 5600.8, | |
| "valid_targets_min": 3004 | |
| }, | |
| { | |
| "epoch": 5.508885298869144, | |
| "grad_norm": 0.426440128757936, | |
| "learning_rate": 5.291554831996062e-06, | |
| "loss": 0.1352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13731537759304047, | |
| "step": 3410, | |
| "valid_targets_mean": 5350.3, | |
| "valid_targets_min": 2603 | |
| }, | |
| { | |
| "epoch": 5.516962843295638, | |
| "grad_norm": 0.4300163801774742, | |
| "learning_rate": 5.237076451043188e-06, | |
| "loss": 0.1242, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.130250483751297, | |
| "step": 3415, | |
| "valid_targets_mean": 5307.6, | |
| "valid_targets_min": 2725 | |
| }, | |
| { | |
| "epoch": 5.525040387722132, | |
| "grad_norm": 0.451123503977269, | |
| "learning_rate": 5.1828376801608525e-06, | |
| "loss": 0.1247, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11685208231210709, | |
| "step": 3420, | |
| "valid_targets_mean": 4572.0, | |
| "valid_targets_min": 494 | |
| }, | |
| { | |
| "epoch": 5.533117932148627, | |
| "grad_norm": 0.41538448653314575, | |
| "learning_rate": 5.128839399673045e-06, | |
| "loss": 0.1289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13642895221710205, | |
| "step": 3425, | |
| "valid_targets_mean": 5603.0, | |
| "valid_targets_min": 3452 | |
| }, | |
| { | |
| "epoch": 5.541195476575121, | |
| "grad_norm": 0.4329784732882114, | |
| "learning_rate": 5.075082486000474e-06, | |
| "loss": 0.1266, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13025406002998352, | |
| "step": 3430, | |
| "valid_targets_mean": 5057.3, | |
| "valid_targets_min": 1270 | |
| }, | |
| { | |
| "epoch": 5.549273021001616, | |
| "grad_norm": 0.44960446892514844, | |
| "learning_rate": 5.021567811646313e-06, | |
| "loss": 0.1345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1450745165348053, | |
| "step": 3435, | |
| "valid_targets_mean": 5347.9, | |
| "valid_targets_min": 3246 | |
| }, | |
| { | |
| "epoch": 5.55735056542811, | |
| "grad_norm": 0.41581636303296576, | |
| "learning_rate": 4.968296245182089e-06, | |
| "loss": 0.1195, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11230681836605072, | |
| "step": 3440, | |
| "valid_targets_mean": 5219.1, | |
| "valid_targets_min": 3596 | |
| }, | |
| { | |
| "epoch": 5.565428109854604, | |
| "grad_norm": 0.46622193654098787, | |
| "learning_rate": 4.915268651233553e-06, | |
| "loss": 0.1318, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1380825638771057, | |
| "step": 3445, | |
| "valid_targets_mean": 4438.7, | |
| "valid_targets_min": 2707 | |
| }, | |
| { | |
| "epoch": 5.573505654281099, | |
| "grad_norm": 0.44530442740815335, | |
| "learning_rate": 4.862485890466635e-06, | |
| "loss": 0.1377, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13368725776672363, | |
| "step": 3450, | |
| "valid_targets_mean": 5893.5, | |
| "valid_targets_min": 3301 | |
| }, | |
| { | |
| "epoch": 5.581583198707593, | |
| "grad_norm": 0.43374674664178625, | |
| "learning_rate": 4.8099488195735155e-06, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12141568958759308, | |
| "step": 3455, | |
| "valid_targets_mean": 6286.9, | |
| "valid_targets_min": 3189 | |
| }, | |
| { | |
| "epoch": 5.589660743134087, | |
| "grad_norm": 0.43085346625468096, | |
| "learning_rate": 4.757658291258693e-06, | |
| "loss": 0.1344, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12574200332164764, | |
| "step": 3460, | |
| "valid_targets_mean": 5033.8, | |
| "valid_targets_min": 3589 | |
| }, | |
| { | |
| "epoch": 5.597738287560581, | |
| "grad_norm": 0.4267327262798803, | |
| "learning_rate": 4.705615154225134e-06, | |
| "loss": 0.1254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11858279258012772, | |
| "step": 3465, | |
| "valid_targets_mean": 5222.2, | |
| "valid_targets_min": 1583 | |
| }, | |
| { | |
| "epoch": 5.605815831987076, | |
| "grad_norm": 0.41479559745363154, | |
| "learning_rate": 4.6538202531605304e-06, | |
| "loss": 0.1281, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12612152099609375, | |
| "step": 3470, | |
| "valid_targets_mean": 5382.6, | |
| "valid_targets_min": 3126 | |
| }, | |
| { | |
| "epoch": 5.613893376413571, | |
| "grad_norm": 0.44362522228036605, | |
| "learning_rate": 4.602274428723568e-06, | |
| "loss": 0.1283, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13906408846378326, | |
| "step": 3475, | |
| "valid_targets_mean": 5549.4, | |
| "valid_targets_min": 2956 | |
| }, | |
| { | |
| "epoch": 5.621970920840065, | |
| "grad_norm": 0.48940058866056046, | |
| "learning_rate": 4.550978517530287e-06, | |
| "loss": 0.1323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15402500331401825, | |
| "step": 3480, | |
| "valid_targets_mean": 4956.8, | |
| "valid_targets_min": 3389 | |
| }, | |
| { | |
| "epoch": 5.630048465266559, | |
| "grad_norm": 0.4380326810478009, | |
| "learning_rate": 4.499933352140504e-06, | |
| "loss": 0.1355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1398143470287323, | |
| "step": 3485, | |
| "valid_targets_mean": 5444.6, | |
| "valid_targets_min": 3100 | |
| }, | |
| { | |
| "epoch": 5.638126009693053, | |
| "grad_norm": 0.42605344964709, | |
| "learning_rate": 4.44913976104429e-06, | |
| "loss": 0.1329, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12708860635757446, | |
| "step": 3490, | |
| "valid_targets_mean": 5242.1, | |
| "valid_targets_min": 3196 | |
| }, | |
| { | |
| "epoch": 5.646203554119547, | |
| "grad_norm": 0.48778301584741457, | |
| "learning_rate": 4.398598568648542e-06, | |
| "loss": 0.1243, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1347925364971161, | |
| "step": 3495, | |
| "valid_targets_mean": 4615.4, | |
| "valid_targets_min": 3110 | |
| }, | |
| { | |
| "epoch": 5.654281098546042, | |
| "grad_norm": 0.39618198656394327, | |
| "learning_rate": 4.348310595263597e-06, | |
| "loss": 0.1275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13055220246315002, | |
| "step": 3500, | |
| "valid_targets_mean": 6067.0, | |
| "valid_targets_min": 3733 | |
| }, | |
| { | |
| "epoch": 5.662358642972537, | |
| "grad_norm": 0.44742584197434127, | |
| "learning_rate": 4.29827665708989e-06, | |
| "loss": 0.1248, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12233701348304749, | |
| "step": 3505, | |
| "valid_targets_mean": 4934.1, | |
| "valid_targets_min": 2995 | |
| }, | |
| { | |
| "epoch": 5.670436187399031, | |
| "grad_norm": 0.4462032079816014, | |
| "learning_rate": 4.248497566204755e-06, | |
| "loss": 0.1318, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12860026955604553, | |
| "step": 3510, | |
| "valid_targets_mean": 5250.4, | |
| "valid_targets_min": 3606 | |
| }, | |
| { | |
| "epoch": 5.678513731825525, | |
| "grad_norm": 0.43055984492150595, | |
| "learning_rate": 4.198974130549209e-06, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1282704770565033, | |
| "step": 3515, | |
| "valid_targets_mean": 5206.6, | |
| "valid_targets_min": 2518 | |
| }, | |
| { | |
| "epoch": 5.686591276252019, | |
| "grad_norm": 0.4087171428147634, | |
| "learning_rate": 4.149707153914855e-06, | |
| "loss": 0.1287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12912295758724213, | |
| "step": 3520, | |
| "valid_targets_mean": 5396.3, | |
| "valid_targets_min": 2628 | |
| }, | |
| { | |
| "epoch": 5.694668820678514, | |
| "grad_norm": 0.46483924914673547, | |
| "learning_rate": 4.100697435930812e-06, | |
| "loss": 0.1356, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14172643423080444, | |
| "step": 3525, | |
| "valid_targets_mean": 5788.2, | |
| "valid_targets_min": 3464 | |
| }, | |
| { | |
| "epoch": 5.702746365105008, | |
| "grad_norm": 0.4632668885464545, | |
| "learning_rate": 4.0519457720507785e-06, | |
| "loss": 0.1308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13240531086921692, | |
| "step": 3530, | |
| "valid_targets_mean": 5077.1, | |
| "valid_targets_min": 1471 | |
| }, | |
| { | |
| "epoch": 5.710823909531502, | |
| "grad_norm": 0.4334095684159384, | |
| "learning_rate": 4.003452953540088e-06, | |
| "loss": 0.1274, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.131183922290802, | |
| "step": 3535, | |
| "valid_targets_mean": 5292.6, | |
| "valid_targets_min": 3034 | |
| }, | |
| { | |
| "epoch": 5.718901453957997, | |
| "grad_norm": 0.3888369802534452, | |
| "learning_rate": 3.95521976746287e-06, | |
| "loss": 0.1304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13127687573432922, | |
| "step": 3540, | |
| "valid_targets_mean": 6261.8, | |
| "valid_targets_min": 2698 | |
| }, | |
| { | |
| "epoch": 5.726978998384491, | |
| "grad_norm": 0.42989158386615484, | |
| "learning_rate": 3.907246996669289e-06, | |
| "loss": 0.1312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1272650510072708, | |
| "step": 3545, | |
| "valid_targets_mean": 5860.9, | |
| "valid_targets_min": 3736 | |
| }, | |
| { | |
| "epoch": 5.735056542810986, | |
| "grad_norm": 0.41599823091501176, | |
| "learning_rate": 3.8595354197828405e-06, | |
| "loss": 0.1233, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12474222481250763, | |
| "step": 3550, | |
| "valid_targets_mean": 5342.7, | |
| "valid_targets_min": 3059 | |
| }, | |
| { | |
| "epoch": 5.74313408723748, | |
| "grad_norm": 0.4043405806565461, | |
| "learning_rate": 3.8120858111876824e-06, | |
| "loss": 0.1282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11929463595151901, | |
| "step": 3555, | |
| "valid_targets_mean": 5652.8, | |
| "valid_targets_min": 3672 | |
| }, | |
| { | |
| "epoch": 5.751211631663974, | |
| "grad_norm": 0.46661331335623885, | |
| "learning_rate": 3.7648989410161064e-06, | |
| "loss": 0.1321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13622185587882996, | |
| "step": 3560, | |
| "valid_targets_mean": 4769.8, | |
| "valid_targets_min": 2370 | |
| }, | |
| { | |
| "epoch": 5.759289176090468, | |
| "grad_norm": 0.4184938336356662, | |
| "learning_rate": 3.7179755751360193e-06, | |
| "loss": 0.1331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1348327100276947, | |
| "step": 3565, | |
| "valid_targets_mean": 5862.5, | |
| "valid_targets_min": 2050 | |
| }, | |
| { | |
| "epoch": 5.7673667205169625, | |
| "grad_norm": 0.4209144525581572, | |
| "learning_rate": 3.6713164751384933e-06, | |
| "loss": 0.1267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12385410070419312, | |
| "step": 3570, | |
| "valid_targets_mean": 5000.0, | |
| "valid_targets_min": 875 | |
| }, | |
| { | |
| "epoch": 5.775444264943458, | |
| "grad_norm": 0.47636037540360815, | |
| "learning_rate": 3.624922398325461e-06, | |
| "loss": 0.1239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1303969919681549, | |
| "step": 3575, | |
| "valid_targets_mean": 4286.1, | |
| "valid_targets_min": 860 | |
| }, | |
| { | |
| "epoch": 5.783521809369952, | |
| "grad_norm": 0.4155562167139012, | |
| "learning_rate": 3.578794097697358e-06, | |
| "loss": 0.1288, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13030406832695007, | |
| "step": 3580, | |
| "valid_targets_mean": 5310.9, | |
| "valid_targets_min": 3587 | |
| }, | |
| { | |
| "epoch": 5.791599353796446, | |
| "grad_norm": 0.5016412125220293, | |
| "learning_rate": 3.5329323219409404e-06, | |
| "loss": 0.128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13560795783996582, | |
| "step": 3585, | |
| "valid_targets_mean": 5757.9, | |
| "valid_targets_min": 3802 | |
| }, | |
| { | |
| "epoch": 5.79967689822294, | |
| "grad_norm": 0.4287142497742641, | |
| "learning_rate": 3.4873378154171355e-06, | |
| "loss": 0.1286, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12317405641078949, | |
| "step": 3590, | |
| "valid_targets_mean": 5041.9, | |
| "valid_targets_min": 3316 | |
| }, | |
| { | |
| "epoch": 5.8077544426494345, | |
| "grad_norm": 0.415107501998662, | |
| "learning_rate": 3.44201131814893e-06, | |
| "loss": 0.13, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11532598733901978, | |
| "step": 3595, | |
| "valid_targets_mean": 5660.2, | |
| "valid_targets_min": 2571 | |
| }, | |
| { | |
| "epoch": 5.815831987075929, | |
| "grad_norm": 0.4576138982293604, | |
| "learning_rate": 3.396953565809393e-06, | |
| "loss": 0.1298, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12490224838256836, | |
| "step": 3600, | |
| "valid_targets_mean": 4615.6, | |
| "valid_targets_min": 3102 | |
| }, | |
| { | |
| "epoch": 5.823909531502423, | |
| "grad_norm": 0.44157735215572447, | |
| "learning_rate": 3.3521652897097145e-06, | |
| "loss": 0.1342, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14809024333953857, | |
| "step": 3605, | |
| "valid_targets_mean": 5332.1, | |
| "valid_targets_min": 1294 | |
| }, | |
| { | |
| "epoch": 5.831987075928918, | |
| "grad_norm": 0.44788378747331953, | |
| "learning_rate": 3.3076472167873506e-06, | |
| "loss": 0.1355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14152215421199799, | |
| "step": 3610, | |
| "valid_targets_mean": 5306.1, | |
| "valid_targets_min": 3152 | |
| }, | |
| { | |
| "epoch": 5.840064620355412, | |
| "grad_norm": 0.41062346511478454, | |
| "learning_rate": 3.2634000695942005e-06, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11961716413497925, | |
| "step": 3615, | |
| "valid_targets_mean": 5562.0, | |
| "valid_targets_min": 2508 | |
| }, | |
| { | |
| "epoch": 5.8481421647819065, | |
| "grad_norm": 0.45060523033481636, | |
| "learning_rate": 3.2194245662849076e-06, | |
| "loss": 0.1264, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13177934288978577, | |
| "step": 3620, | |
| "valid_targets_mean": 5529.1, | |
| "valid_targets_min": 3357 | |
| }, | |
| { | |
| "epoch": 5.856219709208401, | |
| "grad_norm": 0.46069722388709394, | |
| "learning_rate": 3.175721420605198e-06, | |
| "loss": 0.1331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12503468990325928, | |
| "step": 3625, | |
| "valid_targets_mean": 4698.9, | |
| "valid_targets_min": 3025 | |
| }, | |
| { | |
| "epoch": 5.864297253634895, | |
| "grad_norm": 0.42719671755097033, | |
| "learning_rate": 3.132291341880269e-06, | |
| "loss": 0.1291, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12041535973548889, | |
| "step": 3630, | |
| "valid_targets_mean": 5449.2, | |
| "valid_targets_min": 3499 | |
| }, | |
| { | |
| "epoch": 5.872374798061389, | |
| "grad_norm": 0.45809759944896844, | |
| "learning_rate": 3.08913503500331e-06, | |
| "loss": 0.1272, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12977182865142822, | |
| "step": 3635, | |
| "valid_targets_mean": 4846.6, | |
| "valid_targets_min": 1747 | |
| }, | |
| { | |
| "epoch": 5.880452342487883, | |
| "grad_norm": 0.4004710139327011, | |
| "learning_rate": 3.0462532004240497e-06, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11773309111595154, | |
| "step": 3640, | |
| "valid_targets_mean": 5603.2, | |
| "valid_targets_min": 1502 | |
| }, | |
| { | |
| "epoch": 5.8885298869143785, | |
| "grad_norm": 0.42901347254613426, | |
| "learning_rate": 3.0036465341373764e-06, | |
| "loss": 0.136, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15141505002975464, | |
| "step": 3645, | |
| "valid_targets_mean": 5615.2, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 5.896607431340873, | |
| "grad_norm": 0.4360637592715409, | |
| "learning_rate": 2.9613157276720604e-06, | |
| "loss": 0.1282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12636485695838928, | |
| "step": 3650, | |
| "valid_targets_mean": 5416.1, | |
| "valid_targets_min": 3584 | |
| }, | |
| { | |
| "epoch": 5.904684975767367, | |
| "grad_norm": 0.40220716793314576, | |
| "learning_rate": 2.9192614680795196e-06, | |
| "loss": 0.1321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12780413031578064, | |
| "step": 3655, | |
| "valid_targets_mean": 5938.9, | |
| "valid_targets_min": 3451 | |
| }, | |
| { | |
| "epoch": 5.912762520193861, | |
| "grad_norm": 0.4325378825436018, | |
| "learning_rate": 2.8774844379226554e-06, | |
| "loss": 0.128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13803870975971222, | |
| "step": 3660, | |
| "valid_targets_mean": 5305.1, | |
| "valid_targets_min": 2629 | |
| }, | |
| { | |
| "epoch": 5.920840064620355, | |
| "grad_norm": 0.4414669381253834, | |
| "learning_rate": 2.8359853152648175e-06, | |
| "loss": 0.1246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12497833371162415, | |
| "step": 3665, | |
| "valid_targets_mean": 5239.0, | |
| "valid_targets_min": 2990 | |
| }, | |
| { | |
| "epoch": 5.92891760904685, | |
| "grad_norm": 0.4393760609199369, | |
| "learning_rate": 2.7947647736587425e-06, | |
| "loss": 0.1333, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13562984764575958, | |
| "step": 3670, | |
| "valid_targets_mean": 5573.1, | |
| "valid_targets_min": 3174 | |
| }, | |
| { | |
| "epoch": 5.936995153473344, | |
| "grad_norm": 0.42547485094726206, | |
| "learning_rate": 2.753823482135658e-06, | |
| "loss": 0.128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13088464736938477, | |
| "step": 3675, | |
| "valid_targets_mean": 5294.8, | |
| "valid_targets_min": 2571 | |
| }, | |
| { | |
| "epoch": 5.945072697899839, | |
| "grad_norm": 0.41839366327792726, | |
| "learning_rate": 2.713162105194427e-06, | |
| "loss": 0.1242, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1284753829240799, | |
| "step": 3680, | |
| "valid_targets_mean": 6062.1, | |
| "valid_targets_min": 3553 | |
| }, | |
| { | |
| "epoch": 5.953150242326333, | |
| "grad_norm": 0.44798323279186714, | |
| "learning_rate": 2.672781302790726e-06, | |
| "loss": 0.1381, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14375978708267212, | |
| "step": 3685, | |
| "valid_targets_mean": 5534.2, | |
| "valid_targets_min": 3805 | |
| }, | |
| { | |
| "epoch": 5.961227786752827, | |
| "grad_norm": 0.44867958556757354, | |
| "learning_rate": 2.6326817303263764e-06, | |
| "loss": 0.1305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1329280585050583, | |
| "step": 3690, | |
| "valid_targets_mean": 5351.2, | |
| "valid_targets_min": 2891 | |
| }, | |
| { | |
| "epoch": 5.969305331179322, | |
| "grad_norm": 0.437445716769064, | |
| "learning_rate": 2.592864038638685e-06, | |
| "loss": 0.1314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1237957552075386, | |
| "step": 3695, | |
| "valid_targets_mean": 5218.3, | |
| "valid_targets_min": 3165 | |
| }, | |
| { | |
| "epoch": 5.977382875605816, | |
| "grad_norm": 0.4171417743389424, | |
| "learning_rate": 2.5533288739898865e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16024985909461975, | |
| "step": 3700, | |
| "valid_targets_mean": 5685.4, | |
| "valid_targets_min": 1278 | |
| }, | |
| { | |
| "epoch": 5.98546042003231, | |
| "grad_norm": 0.42356417930931556, | |
| "learning_rate": 2.5140768780566436e-06, | |
| "loss": 0.1295, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1332971751689911, | |
| "step": 3705, | |
| "valid_targets_mean": 6115.1, | |
| "valid_targets_min": 4652 | |
| }, | |
| { | |
| "epoch": 5.993537964458804, | |
| "grad_norm": 0.4187791264085228, | |
| "learning_rate": 2.475108687919645e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12996046245098114, | |
| "step": 3710, | |
| "valid_targets_mean": 5287.6, | |
| "valid_targets_min": 2671 | |
| }, | |
| { | |
| "epoch": 6.0016155088852985, | |
| "grad_norm": 0.38042177308670916, | |
| "learning_rate": 2.43642493605327e-06, | |
| "loss": 0.1311, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13582850992679596, | |
| "step": 3715, | |
| "valid_targets_mean": 6274.0, | |
| "valid_targets_min": 3274 | |
| }, | |
| { | |
| "epoch": 6.009693053311794, | |
| "grad_norm": 0.3996995746582545, | |
| "learning_rate": 2.398026250315295e-06, | |
| "loss": 0.1265, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12647423148155212, | |
| "step": 3720, | |
| "valid_targets_mean": 5534.5, | |
| "valid_targets_min": 494 | |
| }, | |
| { | |
| "epoch": 6.017770597738288, | |
| "grad_norm": 0.40840529795994723, | |
| "learning_rate": 2.3599132539367386e-06, | |
| "loss": 0.1246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12015852332115173, | |
| "step": 3725, | |
| "valid_targets_mean": 5502.9, | |
| "valid_targets_min": 2993 | |
| }, | |
| { | |
| "epoch": 6.025848142164782, | |
| "grad_norm": 0.45063017808794575, | |
| "learning_rate": 2.3220865655117274e-06, | |
| "loss": 0.126, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12192650139331818, | |
| "step": 3730, | |
| "valid_targets_mean": 4680.8, | |
| "valid_targets_min": 3100 | |
| }, | |
| { | |
| "epoch": 6.033925686591276, | |
| "grad_norm": 0.43630585939735095, | |
| "learning_rate": 2.284546798987448e-06, | |
| "loss": 0.1235, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12560415267944336, | |
| "step": 3735, | |
| "valid_targets_mean": 5294.4, | |
| "valid_targets_min": 2865 | |
| }, | |
| { | |
| "epoch": 6.0420032310177705, | |
| "grad_norm": 0.505722121935076, | |
| "learning_rate": 2.2472945636542033e-06, | |
| "loss": 0.1269, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13361534476280212, | |
| "step": 3740, | |
| "valid_targets_mean": 5304.9, | |
| "valid_targets_min": 920 | |
| }, | |
| { | |
| "epoch": 6.050080775444265, | |
| "grad_norm": 0.42675333406544164, | |
| "learning_rate": 2.2103304641355128e-06, | |
| "loss": 0.1281, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12612757086753845, | |
| "step": 3745, | |
| "valid_targets_mean": 5455.9, | |
| "valid_targets_min": 3102 | |
| }, | |
| { | |
| "epoch": 6.058158319870759, | |
| "grad_norm": 0.4264645769477084, | |
| "learning_rate": 2.17365510037828e-06, | |
| "loss": 0.1218, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12067071348428726, | |
| "step": 3750, | |
| "valid_targets_mean": 5514.5, | |
| "valid_targets_min": 3111 | |
| }, | |
| { | |
| "epoch": 6.066235864297254, | |
| "grad_norm": 0.3923668630594968, | |
| "learning_rate": 2.1372690676431106e-06, | |
| "loss": 0.1219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12325260043144226, | |
| "step": 3755, | |
| "valid_targets_mean": 6121.4, | |
| "valid_targets_min": 4815 | |
| }, | |
| { | |
| "epoch": 6.074313408723748, | |
| "grad_norm": 0.4314493239984855, | |
| "learning_rate": 2.101172956494577e-06, | |
| "loss": 0.1205, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1209428682923317, | |
| "step": 3760, | |
| "valid_targets_mean": 5166.6, | |
| "valid_targets_min": 2498 | |
| }, | |
| { | |
| "epoch": 6.0823909531502425, | |
| "grad_norm": 0.4682899642080794, | |
| "learning_rate": 2.065367352791694e-06, | |
| "loss": 0.1313, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1530829668045044, | |
| "step": 3765, | |
| "valid_targets_mean": 5197.5, | |
| "valid_targets_min": 2402 | |
| }, | |
| { | |
| "epoch": 6.090468497576737, | |
| "grad_norm": 0.4411425846116113, | |
| "learning_rate": 2.0298528376783787e-06, | |
| "loss": 0.1152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11478254199028015, | |
| "step": 3770, | |
| "valid_targets_mean": 5477.7, | |
| "valid_targets_min": 3269 | |
| }, | |
| { | |
| "epoch": 6.098546042003231, | |
| "grad_norm": 0.41948667962899977, | |
| "learning_rate": 1.9946299875740236e-06, | |
| "loss": 0.1335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13149794936180115, | |
| "step": 3775, | |
| "valid_targets_mean": 5922.2, | |
| "valid_targets_min": 2735 | |
| }, | |
| { | |
| "epoch": 6.106623586429725, | |
| "grad_norm": 0.3997061769361075, | |
| "learning_rate": 1.959699374164148e-06, | |
| "loss": 0.1311, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12436765432357788, | |
| "step": 3780, | |
| "valid_targets_mean": 6095.8, | |
| "valid_targets_min": 3791 | |
| }, | |
| { | |
| "epoch": 6.114701130856219, | |
| "grad_norm": 0.40787813247608634, | |
| "learning_rate": 1.9250615643911195e-06, | |
| "loss": 0.1254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12347099184989929, | |
| "step": 3785, | |
| "valid_targets_mean": 5512.1, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 6.1227786752827145, | |
| "grad_norm": 0.4394743128766455, | |
| "learning_rate": 1.8907171204449316e-06, | |
| "loss": 0.125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11828821897506714, | |
| "step": 3790, | |
| "valid_targets_mean": 5453.8, | |
| "valid_targets_min": 3864 | |
| }, | |
| { | |
| "epoch": 6.130856219709209, | |
| "grad_norm": 0.4012592442183262, | |
| "learning_rate": 1.8566665997541111e-06, | |
| "loss": 0.1243, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1390397846698761, | |
| "step": 3795, | |
| "valid_targets_mean": 5487.6, | |
| "valid_targets_min": 2737 | |
| }, | |
| { | |
| "epoch": 6.138933764135703, | |
| "grad_norm": 0.462240025647866, | |
| "learning_rate": 1.8229105549766469e-06, | |
| "loss": 0.1271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14077478647232056, | |
| "step": 3800, | |
| "valid_targets_mean": 5228.1, | |
| "valid_targets_min": 2965 | |
| }, | |
| { | |
| "epoch": 6.147011308562197, | |
| "grad_norm": 0.4367640979741714, | |
| "learning_rate": 1.7894495339910345e-06, | |
| "loss": 0.1275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11776141822338104, | |
| "step": 3805, | |
| "valid_targets_mean": 5597.7, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 6.155088852988691, | |
| "grad_norm": 0.40903146232389964, | |
| "learning_rate": 1.7562840798873648e-06, | |
| "loss": 0.1241, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11086548119783401, | |
| "step": 3810, | |
| "valid_targets_mean": 5194.6, | |
| "valid_targets_min": 3095 | |
| }, | |
| { | |
| "epoch": 6.163166397415186, | |
| "grad_norm": 0.6394406603603819, | |
| "learning_rate": 1.7234147309585304e-06, | |
| "loss": 0.124, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13591381907463074, | |
| "step": 3815, | |
| "valid_targets_mean": 5851.1, | |
| "valid_targets_min": 3703 | |
| }, | |
| { | |
| "epoch": 6.17124394184168, | |
| "grad_norm": 0.45620403937167114, | |
| "learning_rate": 1.6908420206914854e-06, | |
| "loss": 0.1305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12876799702644348, | |
| "step": 3820, | |
| "valid_targets_mean": 5486.4, | |
| "valid_targets_min": 2299 | |
| }, | |
| { | |
| "epoch": 6.179321486268175, | |
| "grad_norm": 0.4606826442584153, | |
| "learning_rate": 1.6585664777585654e-06, | |
| "loss": 0.1214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13059750199317932, | |
| "step": 3825, | |
| "valid_targets_mean": 4831.5, | |
| "valid_targets_min": 2564 | |
| }, | |
| { | |
| "epoch": 6.187399030694669, | |
| "grad_norm": 0.4775624709862394, | |
| "learning_rate": 1.6265886260089337e-06, | |
| "loss": 0.1242, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11253519356250763, | |
| "step": 3830, | |
| "valid_targets_mean": 5556.9, | |
| "valid_targets_min": 3334 | |
| }, | |
| { | |
| "epoch": 6.195476575121163, | |
| "grad_norm": 0.5357108543862896, | |
| "learning_rate": 1.5949089844600729e-06, | |
| "loss": 0.1231, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11639741063117981, | |
| "step": 3835, | |
| "valid_targets_mean": 4648.2, | |
| "valid_targets_min": 2791 | |
| }, | |
| { | |
| "epoch": 6.203554119547658, | |
| "grad_norm": 1.215560948141749, | |
| "learning_rate": 1.563528067289335e-06, | |
| "loss": 0.1285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1310233175754547, | |
| "step": 3840, | |
| "valid_targets_mean": 5432.3, | |
| "valid_targets_min": 2975 | |
| }, | |
| { | |
| "epoch": 6.211631663974152, | |
| "grad_norm": 0.43295667523345815, | |
| "learning_rate": 1.5324463838256476e-06, | |
| "loss": 0.1253, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12923157215118408, | |
| "step": 3845, | |
| "valid_targets_mean": 5914.4, | |
| "valid_targets_min": 875 | |
| }, | |
| { | |
| "epoch": 6.219709208400646, | |
| "grad_norm": 0.4313108587022746, | |
| "learning_rate": 1.5016644385411928e-06, | |
| "loss": 0.1249, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13349610567092896, | |
| "step": 3850, | |
| "valid_targets_mean": 5821.2, | |
| "valid_targets_min": 3989 | |
| }, | |
| { | |
| "epoch": 6.22778675282714, | |
| "grad_norm": 0.48650958615453055, | |
| "learning_rate": 1.4711827310432437e-06, | |
| "loss": 0.1355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15580685436725616, | |
| "step": 3855, | |
| "valid_targets_mean": 5863.0, | |
| "valid_targets_min": 3242 | |
| }, | |
| { | |
| "epoch": 6.2358642972536344, | |
| "grad_norm": 0.40857640102352444, | |
| "learning_rate": 1.441001756066076e-06, | |
| "loss": 0.1262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12889908254146576, | |
| "step": 3860, | |
| "valid_targets_mean": 5974.9, | |
| "valid_targets_min": 2904 | |
| }, | |
| { | |
| "epoch": 6.24394184168013, | |
| "grad_norm": 0.4541360947416613, | |
| "learning_rate": 1.4111220034628925e-06, | |
| "loss": 0.1277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12288201600313187, | |
| "step": 3865, | |
| "valid_targets_mean": 5001.9, | |
| "valid_targets_min": 2061 | |
| }, | |
| { | |
| "epoch": 6.252019386106624, | |
| "grad_norm": 0.4507694019187057, | |
| "learning_rate": 1.3815439581979106e-06, | |
| "loss": 0.1212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12719199061393738, | |
| "step": 3870, | |
| "valid_targets_mean": 4805.4, | |
| "valid_targets_min": 635 | |
| }, | |
| { | |
| "epoch": 6.260096930533118, | |
| "grad_norm": 0.43314759395541486, | |
| "learning_rate": 1.352268100338483e-06, | |
| "loss": 0.1264, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12478262186050415, | |
| "step": 3875, | |
| "valid_targets_mean": 5551.4, | |
| "valid_targets_min": 3299 | |
| }, | |
| { | |
| "epoch": 6.268174474959612, | |
| "grad_norm": 0.4434291666714666, | |
| "learning_rate": 1.3232949050472809e-06, | |
| "loss": 0.1225, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11485721915960312, | |
| "step": 3880, | |
| "valid_targets_mean": 5089.8, | |
| "valid_targets_min": 3032 | |
| }, | |
| { | |
| "epoch": 6.276252019386106, | |
| "grad_norm": 0.4269358778893437, | |
| "learning_rate": 1.2946248425746189e-06, | |
| "loss": 0.125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12474660575389862, | |
| "step": 3885, | |
| "valid_targets_mean": 5354.8, | |
| "valid_targets_min": 3237 | |
| }, | |
| { | |
| "epoch": 6.284329563812601, | |
| "grad_norm": 0.4944266854838816, | |
| "learning_rate": 1.266258378250802e-06, | |
| "loss": 0.1288, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1357978880405426, | |
| "step": 3890, | |
| "valid_targets_mean": 4909.8, | |
| "valid_targets_min": 2240 | |
| }, | |
| { | |
| "epoch": 6.292407108239095, | |
| "grad_norm": 0.4811185167949499, | |
| "learning_rate": 1.2381959724785752e-06, | |
| "loss": 0.1246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12598447501659393, | |
| "step": 3895, | |
| "valid_targets_mean": 4965.4, | |
| "valid_targets_min": 1612 | |
| }, | |
| { | |
| "epoch": 6.30048465266559, | |
| "grad_norm": 0.45350839806662147, | |
| "learning_rate": 1.2104380807256488e-06, | |
| "loss": 0.1263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12420719116926193, | |
| "step": 3900, | |
| "valid_targets_mean": 5155.8, | |
| "valid_targets_min": 2456 | |
| }, | |
| { | |
| "epoch": 6.308562197092084, | |
| "grad_norm": 0.4407541653779522, | |
| "learning_rate": 1.182985153517311e-06, | |
| "loss": 0.1279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12510338425636292, | |
| "step": 3905, | |
| "valid_targets_mean": 5554.1, | |
| "valid_targets_min": 1894 | |
| }, | |
| { | |
| "epoch": 6.316639741518578, | |
| "grad_norm": 0.43753400948465604, | |
| "learning_rate": 1.1558376364291202e-06, | |
| "loss": 0.1313, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13448363542556763, | |
| "step": 3910, | |
| "valid_targets_mean": 5597.8, | |
| "valid_targets_min": 3414 | |
| }, | |
| { | |
| "epoch": 6.324717285945073, | |
| "grad_norm": 0.4867143708533207, | |
| "learning_rate": 1.128995970079656e-06, | |
| "loss": 0.1306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12516459822654724, | |
| "step": 3915, | |
| "valid_targets_mean": 4465.9, | |
| "valid_targets_min": 2876 | |
| }, | |
| { | |
| "epoch": 6.332794830371567, | |
| "grad_norm": 0.4556391374587531, | |
| "learning_rate": 1.1024605901233843e-06, | |
| "loss": 0.1267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1345321536064148, | |
| "step": 3920, | |
| "valid_targets_mean": 5282.8, | |
| "valid_targets_min": 3026 | |
| }, | |
| { | |
| "epoch": 6.340872374798061, | |
| "grad_norm": 0.4901112286992194, | |
| "learning_rate": 1.0762319272435851e-06, | |
| "loss": 0.1216, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12185169011354446, | |
| "step": 3925, | |
| "valid_targets_mean": 5655.7, | |
| "valid_targets_min": 3168 | |
| }, | |
| { | |
| "epoch": 6.348949919224555, | |
| "grad_norm": 0.4458223827027878, | |
| "learning_rate": 1.0503104071453429e-06, | |
| "loss": 0.1273, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1298675388097763, | |
| "step": 3930, | |
| "valid_targets_mean": 5410.5, | |
| "valid_targets_min": 3389 | |
| }, | |
| { | |
| "epoch": 6.35702746365105, | |
| "grad_norm": 0.4251996105028484, | |
| "learning_rate": 1.0246964505486768e-06, | |
| "loss": 0.1246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13420116901397705, | |
| "step": 3935, | |
| "valid_targets_mean": 5914.3, | |
| "valid_targets_min": 2609 | |
| }, | |
| { | |
| "epoch": 6.365105008077545, | |
| "grad_norm": 0.46439313311522407, | |
| "learning_rate": 9.993904731816606e-07, | |
| "loss": 0.1246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1249784454703331, | |
| "step": 3940, | |
| "valid_targets_mean": 5972.4, | |
| "valid_targets_min": 2670 | |
| }, | |
| { | |
| "epoch": 6.373182552504039, | |
| "grad_norm": 0.4739447815808242, | |
| "learning_rate": 9.743928857737138e-07, | |
| "loss": 0.1263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13924196362495422, | |
| "step": 3945, | |
| "valid_targets_mean": 4824.4, | |
| "valid_targets_min": 2712 | |
| }, | |
| { | |
| "epoch": 6.381260096930533, | |
| "grad_norm": 0.48654467584611094, | |
| "learning_rate": 9.497040940489332e-07, | |
| "loss": 0.1237, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12547633051872253, | |
| "step": 3950, | |
| "valid_targets_mean": 4858.6, | |
| "valid_targets_min": 2845 | |
| }, | |
| { | |
| "epoch": 6.389337641357027, | |
| "grad_norm": 0.4356735072665942, | |
| "learning_rate": 9.253244987194798e-07, | |
| "loss": 0.1253, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12728527188301086, | |
| "step": 3955, | |
| "valid_targets_mean": 5167.6, | |
| "valid_targets_min": 2985 | |
| }, | |
| { | |
| "epoch": 6.3974151857835215, | |
| "grad_norm": 0.4506609202751045, | |
| "learning_rate": 9.012544954791048e-07, | |
| "loss": 0.1283, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13564032316207886, | |
| "step": 3960, | |
| "valid_targets_mean": 6034.4, | |
| "valid_targets_min": 3891 | |
| }, | |
| { | |
| "epoch": 6.405492730210016, | |
| "grad_norm": 0.4608952018911762, | |
| "learning_rate": 8.774944749967162e-07, | |
| "loss": 0.1296, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12978151440620422, | |
| "step": 3965, | |
| "valid_targets_mean": 5206.6, | |
| "valid_targets_min": 2467 | |
| }, | |
| { | |
| "epoch": 6.413570274636511, | |
| "grad_norm": 0.46107689800024954, | |
| "learning_rate": 8.540448229100295e-07, | |
| "loss": 0.1206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13031840324401855, | |
| "step": 3970, | |
| "valid_targets_mean": 5134.1, | |
| "valid_targets_min": 1820 | |
| }, | |
| { | |
| "epoch": 6.421647819063005, | |
| "grad_norm": 0.4118015091462396, | |
| "learning_rate": 8.309059198193225e-07, | |
| "loss": 0.1287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13095030188560486, | |
| "step": 3975, | |
| "valid_targets_mean": 6233.5, | |
| "valid_targets_min": 2561 | |
| }, | |
| { | |
| "epoch": 6.429725363489499, | |
| "grad_norm": 0.4682974706826252, | |
| "learning_rate": 8.080781412812544e-07, | |
| "loss": 0.1276, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13665667176246643, | |
| "step": 3980, | |
| "valid_targets_mean": 5492.9, | |
| "valid_targets_min": 3368 | |
| }, | |
| { | |
| "epoch": 6.4378029079159935, | |
| "grad_norm": 0.5043978939772313, | |
| "learning_rate": 7.85561857802759e-07, | |
| "loss": 0.1299, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14872294664382935, | |
| "step": 3985, | |
| "valid_targets_mean": 5439.5, | |
| "valid_targets_min": 3211 | |
| }, | |
| { | |
| "epoch": 6.445880452342488, | |
| "grad_norm": 0.4736575403831154, | |
| "learning_rate": 7.633574348350525e-07, | |
| "loss": 0.1271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1309095025062561, | |
| "step": 3990, | |
| "valid_targets_mean": 5669.1, | |
| "valid_targets_min": 3871 | |
| }, | |
| { | |
| "epoch": 6.453957996768982, | |
| "grad_norm": 0.44250125729727774, | |
| "learning_rate": 7.414652327676797e-07, | |
| "loss": 0.1194, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11579351127147675, | |
| "step": 3995, | |
| "valid_targets_mean": 5095.6, | |
| "valid_targets_min": 3083 | |
| }, | |
| { | |
| "epoch": 6.462035541195476, | |
| "grad_norm": 0.46412183365878834, | |
| "learning_rate": 7.198856069226834e-07, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1247982382774353, | |
| "step": 4000, | |
| "valid_targets_mean": 5121.2, | |
| "valid_targets_min": 3202 | |
| }, | |
| { | |
| "epoch": 6.470113085621971, | |
| "grad_norm": 0.42099510160073933, | |
| "learning_rate": 6.986189075488159e-07, | |
| "loss": 0.1277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.123319610953331, | |
| "step": 4005, | |
| "valid_targets_mean": 5561.8, | |
| "valid_targets_min": 3313 | |
| }, | |
| { | |
| "epoch": 6.4781906300484655, | |
| "grad_norm": 0.5096369000331581, | |
| "learning_rate": 6.77665479815881e-07, | |
| "loss": 0.1282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1309448778629303, | |
| "step": 4010, | |
| "valid_targets_mean": 5498.9, | |
| "valid_targets_min": 2886 | |
| }, | |
| { | |
| "epoch": 6.48626817447496, | |
| "grad_norm": 0.4469201811102124, | |
| "learning_rate": 6.570256638091122e-07, | |
| "loss": 0.1199, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11223016679286957, | |
| "step": 4015, | |
| "valid_targets_mean": 4988.0, | |
| "valid_targets_min": 3026 | |
| }, | |
| { | |
| "epoch": 6.494345718901454, | |
| "grad_norm": 0.4204535857468685, | |
| "learning_rate": 6.366997945236519e-07, | |
| "loss": 0.1294, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1368045210838318, | |
| "step": 4020, | |
| "valid_targets_mean": 5391.7, | |
| "valid_targets_min": 2986 | |
| }, | |
| { | |
| "epoch": 6.502423263327948, | |
| "grad_norm": 0.43551176249044327, | |
| "learning_rate": 6.166882018591369e-07, | |
| "loss": 0.1293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1250474750995636, | |
| "step": 4025, | |
| "valid_targets_mean": 5422.8, | |
| "valid_targets_min": 2538 | |
| }, | |
| { | |
| "epoch": 6.510500807754442, | |
| "grad_norm": 0.4399271110250978, | |
| "learning_rate": 5.969912106143105e-07, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13834628462791443, | |
| "step": 4030, | |
| "valid_targets_mean": 5169.5, | |
| "valid_targets_min": 2401 | |
| }, | |
| { | |
| "epoch": 6.518578352180937, | |
| "grad_norm": 0.43382150239329875, | |
| "learning_rate": 5.776091404817762e-07, | |
| "loss": 0.1204, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12629744410514832, | |
| "step": 4035, | |
| "valid_targets_mean": 5217.6, | |
| "valid_targets_min": 3262 | |
| }, | |
| { | |
| "epoch": 6.526655896607432, | |
| "grad_norm": 0.43276969760803347, | |
| "learning_rate": 5.585423060428064e-07, | |
| "loss": 0.1302, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1219807118177414, | |
| "step": 4040, | |
| "valid_targets_mean": 5053.8, | |
| "valid_targets_min": 2763 | |
| }, | |
| { | |
| "epoch": 6.534733441033926, | |
| "grad_norm": 0.4228424315263158, | |
| "learning_rate": 5.397910167622212e-07, | |
| "loss": 0.1241, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1212330013513565, | |
| "step": 4045, | |
| "valid_targets_mean": 5681.0, | |
| "valid_targets_min": 3393 | |
| }, | |
| { | |
| "epoch": 6.54281098546042, | |
| "grad_norm": 0.43356050429031545, | |
| "learning_rate": 5.2135557698338e-07, | |
| "loss": 0.1184, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11314418911933899, | |
| "step": 4050, | |
| "valid_targets_mean": 5226.2, | |
| "valid_targets_min": 862 | |
| }, | |
| { | |
| "epoch": 6.550888529886914, | |
| "grad_norm": 0.4410363384518348, | |
| "learning_rate": 5.032362859232409e-07, | |
| "loss": 0.1188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11539218574762344, | |
| "step": 4055, | |
| "valid_targets_mean": 5639.7, | |
| "valid_targets_min": 2686 | |
| }, | |
| { | |
| "epoch": 6.558966074313409, | |
| "grad_norm": 0.39036790685484907, | |
| "learning_rate": 4.854334376674908e-07, | |
| "loss": 0.1254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11777080595493317, | |
| "step": 4060, | |
| "valid_targets_mean": 6159.1, | |
| "valid_targets_min": 1583 | |
| }, | |
| { | |
| "epoch": 6.567043618739903, | |
| "grad_norm": 0.3970441420439976, | |
| "learning_rate": 4.679473211657959e-07, | |
| "loss": 0.1235, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11998297274112701, | |
| "step": 4065, | |
| "valid_targets_mean": 6206.8, | |
| "valid_targets_min": 3559 | |
| }, | |
| { | |
| "epoch": 6.575121163166397, | |
| "grad_norm": 0.4361589600653003, | |
| "learning_rate": 4.507782202270883e-07, | |
| "loss": 0.1261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12242761254310608, | |
| "step": 4070, | |
| "valid_targets_mean": 5184.5, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 6.583198707592892, | |
| "grad_norm": 0.41712998221462744, | |
| "learning_rate": 4.33926413514969e-07, | |
| "loss": 0.1239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12609797716140747, | |
| "step": 4075, | |
| "valid_targets_mean": 5275.7, | |
| "valid_targets_min": 3429 | |
| }, | |
| { | |
| "epoch": 6.591276252019386, | |
| "grad_norm": 0.42389981043438, | |
| "learning_rate": 4.1739217454319413e-07, | |
| "loss": 0.1224, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1254565268754959, | |
| "step": 4080, | |
| "valid_targets_mean": 5863.4, | |
| "valid_targets_min": 3192 | |
| }, | |
| { | |
| "epoch": 6.599353796445881, | |
| "grad_norm": 0.4415821899768825, | |
| "learning_rate": 4.0117577167121835e-07, | |
| "loss": 0.123, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11877983808517456, | |
| "step": 4085, | |
| "valid_targets_mean": 5662.8, | |
| "valid_targets_min": 3604 | |
| }, | |
| { | |
| "epoch": 6.607431340872375, | |
| "grad_norm": 0.4035743849460305, | |
| "learning_rate": 3.852774680998561e-07, | |
| "loss": 0.1267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11850054562091827, | |
| "step": 4090, | |
| "valid_targets_mean": 5700.8, | |
| "valid_targets_min": 2739 | |
| }, | |
| { | |
| "epoch": 6.615508885298869, | |
| "grad_norm": 0.46246205856773076, | |
| "learning_rate": 3.6969752186699184e-07, | |
| "loss": 0.1253, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13161437213420868, | |
| "step": 4095, | |
| "valid_targets_mean": 5081.2, | |
| "valid_targets_min": 3038 | |
| }, | |
| { | |
| "epoch": 6.623586429725363, | |
| "grad_norm": 0.4273771914997136, | |
| "learning_rate": 3.544361858434098e-07, | |
| "loss": 0.1262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12390320748090744, | |
| "step": 4100, | |
| "valid_targets_mean": 5140.4, | |
| "valid_targets_min": 1120 | |
| }, | |
| { | |
| "epoch": 6.6316639741518575, | |
| "grad_norm": 0.5070081607251208, | |
| "learning_rate": 3.3949370772868194e-07, | |
| "loss": 0.1293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14423541724681854, | |
| "step": 4105, | |
| "valid_targets_mean": 5319.9, | |
| "valid_targets_min": 2947 | |
| }, | |
| { | |
| "epoch": 6.639741518578353, | |
| "grad_norm": 0.4555409998613445, | |
| "learning_rate": 3.2487033004713564e-07, | |
| "loss": 0.1312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13123884797096252, | |
| "step": 4110, | |
| "valid_targets_mean": 5210.4, | |
| "valid_targets_min": 3311 | |
| }, | |
| { | |
| "epoch": 6.647819063004847, | |
| "grad_norm": 0.42299953702729287, | |
| "learning_rate": 3.105662901439499e-07, | |
| "loss": 0.1202, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1164276972413063, | |
| "step": 4115, | |
| "valid_targets_mean": 5548.9, | |
| "valid_targets_min": 3105 | |
| }, | |
| { | |
| "epoch": 6.655896607431341, | |
| "grad_norm": 0.5750357035278376, | |
| "learning_rate": 2.9658182018126316e-07, | |
| "loss": 0.121, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12372918426990509, | |
| "step": 4120, | |
| "valid_targets_mean": 4148.2, | |
| "valid_targets_min": 936 | |
| }, | |
| { | |
| "epoch": 6.663974151857835, | |
| "grad_norm": 0.39431612594628507, | |
| "learning_rate": 2.829171471344294e-07, | |
| "loss": 0.1249, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11072470247745514, | |
| "step": 4125, | |
| "valid_targets_mean": 5713.8, | |
| "valid_targets_min": 3219 | |
| }, | |
| { | |
| "epoch": 6.6720516962843295, | |
| "grad_norm": 0.41958708922647425, | |
| "learning_rate": 2.695724927883392e-07, | |
| "loss": 0.1187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12350701540708542, | |
| "step": 4130, | |
| "valid_targets_mean": 5393.2, | |
| "valid_targets_min": 2541 | |
| }, | |
| { | |
| "epoch": 6.680129240710824, | |
| "grad_norm": 0.43875701635891834, | |
| "learning_rate": 2.5654807373379777e-07, | |
| "loss": 0.1224, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1251029372215271, | |
| "step": 4135, | |
| "valid_targets_mean": 5595.2, | |
| "valid_targets_min": 2490 | |
| }, | |
| { | |
| "epoch": 6.688206785137318, | |
| "grad_norm": 0.4338446938410848, | |
| "learning_rate": 2.438441013640258e-07, | |
| "loss": 0.123, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12370643019676208, | |
| "step": 4140, | |
| "valid_targets_mean": 5581.8, | |
| "valid_targets_min": 3348 | |
| }, | |
| { | |
| "epoch": 6.696284329563813, | |
| "grad_norm": 0.46482721561432655, | |
| "learning_rate": 2.314607818712311e-07, | |
| "loss": 0.123, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13866999745368958, | |
| "step": 4145, | |
| "valid_targets_mean": 5038.1, | |
| "valid_targets_min": 3053 | |
| }, | |
| { | |
| "epoch": 6.704361873990307, | |
| "grad_norm": 0.43402661773234946, | |
| "learning_rate": 2.1939831624324226e-07, | |
| "loss": 0.1251, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12082387506961823, | |
| "step": 4150, | |
| "valid_targets_mean": 5538.1, | |
| "valid_targets_min": 3741 | |
| }, | |
| { | |
| "epoch": 6.7124394184168015, | |
| "grad_norm": 0.45577974257774717, | |
| "learning_rate": 2.0765690026027353e-07, | |
| "loss": 0.129, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11891952157020569, | |
| "step": 4155, | |
| "valid_targets_mean": 4971.3, | |
| "valid_targets_min": 1305 | |
| }, | |
| { | |
| "epoch": 6.720516962843296, | |
| "grad_norm": 0.4053312005766982, | |
| "learning_rate": 1.9623672449172292e-07, | |
| "loss": 0.121, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11686712503433228, | |
| "step": 4160, | |
| "valid_targets_mean": 5524.4, | |
| "valid_targets_min": 3847 | |
| }, | |
| { | |
| "epoch": 6.72859450726979, | |
| "grad_norm": 0.4495177975522038, | |
| "learning_rate": 1.8513797429309477e-07, | |
| "loss": 0.1237, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13314317166805267, | |
| "step": 4165, | |
| "valid_targets_mean": 5426.2, | |
| "valid_targets_min": 3193 | |
| }, | |
| { | |
| "epoch": 6.736672051696284, | |
| "grad_norm": 0.4933174749913547, | |
| "learning_rate": 1.743608298029842e-07, | |
| "loss": 0.1267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13363343477249146, | |
| "step": 4170, | |
| "valid_targets_mean": 4938.6, | |
| "valid_targets_min": 3063 | |
| }, | |
| { | |
| "epoch": 6.744749596122778, | |
| "grad_norm": 0.4442812474453568, | |
| "learning_rate": 1.6390546594015512e-07, | |
| "loss": 0.1244, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11812600493431091, | |
| "step": 4175, | |
| "valid_targets_mean": 5135.5, | |
| "valid_targets_min": 3018 | |
| }, | |
| { | |
| "epoch": 6.7528271405492735, | |
| "grad_norm": 0.4094154273593245, | |
| "learning_rate": 1.5377205240070692e-07, | |
| "loss": 0.1239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12172652781009674, | |
| "step": 4180, | |
| "valid_targets_mean": 6441.6, | |
| "valid_targets_min": 2946 | |
| }, | |
| { | |
| "epoch": 6.760904684975768, | |
| "grad_norm": 0.4033924852108923, | |
| "learning_rate": 1.4396075365530337e-07, | |
| "loss": 0.1203, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11645178496837616, | |
| "step": 4185, | |
| "valid_targets_mean": 5842.5, | |
| "valid_targets_min": 3388 | |
| }, | |
| { | |
| "epoch": 6.768982229402262, | |
| "grad_norm": 0.45671966707657646, | |
| "learning_rate": 1.3447172894652582e-07, | |
| "loss": 0.1239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13203783333301544, | |
| "step": 4190, | |
| "valid_targets_mean": 5126.8, | |
| "valid_targets_min": 2642 | |
| }, | |
| { | |
| "epoch": 6.777059773828756, | |
| "grad_norm": 0.4395252653404362, | |
| "learning_rate": 1.2530513228627305e-07, | |
| "loss": 0.1275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12178006768226624, | |
| "step": 4195, | |
| "valid_targets_mean": 5482.8, | |
| "valid_targets_min": 3456 | |
| }, | |
| { | |
| "epoch": 6.78513731825525, | |
| "grad_norm": 0.4085031802856091, | |
| "learning_rate": 1.1646111245326108e-07, | |
| "loss": 0.1246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11593121290206909, | |
| "step": 4200, | |
| "valid_targets_mean": 5370.7, | |
| "valid_targets_min": 3289 | |
| }, | |
| { | |
| "epoch": 6.793214862681745, | |
| "grad_norm": 0.4155977707068557, | |
| "learning_rate": 1.0793981299062061e-07, | |
| "loss": 0.12, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11050020158290863, | |
| "step": 4205, | |
| "valid_targets_mean": 6034.5, | |
| "valid_targets_min": 2639 | |
| }, | |
| { | |
| "epoch": 6.801292407108239, | |
| "grad_norm": 0.49421045013893233, | |
| "learning_rate": 9.97413722035545e-08, | |
| "loss": 0.1249, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12287741899490356, | |
| "step": 4210, | |
| "valid_targets_mean": 4324.1, | |
| "valid_targets_min": 3034 | |
| }, | |
| { | |
| "epoch": 6.809369951534734, | |
| "grad_norm": 0.4509493958502341, | |
| "learning_rate": 9.186592315710175e-08, | |
| "loss": 0.1256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12337324023246765, | |
| "step": 4215, | |
| "valid_targets_mean": 5274.3, | |
| "valid_targets_min": 3306 | |
| }, | |
| { | |
| "epoch": 6.817447495961228, | |
| "grad_norm": 0.4824539638852361, | |
| "learning_rate": 8.431359367397251e-08, | |
| "loss": 0.1218, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13827504217624664, | |
| "step": 4220, | |
| "valid_targets_mean": 5677.6, | |
| "valid_targets_min": 3337 | |
| }, | |
| { | |
| "epoch": 6.825525040387722, | |
| "grad_norm": 0.4641231639078321, | |
| "learning_rate": 7.708450633247655e-08, | |
| "loss": 0.1291, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13125362992286682, | |
| "step": 4225, | |
| "valid_targets_mean": 4806.1, | |
| "valid_targets_min": 915 | |
| }, | |
| { | |
| "epoch": 6.833602584814217, | |
| "grad_norm": 0.43578209394600614, | |
| "learning_rate": 7.017877846452692e-08, | |
| "loss": 0.125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13602393865585327, | |
| "step": 4230, | |
| "valid_targets_mean": 5431.9, | |
| "valid_targets_min": 3350 | |
| }, | |
| { | |
| "epoch": 6.841680129240711, | |
| "grad_norm": 0.4337462209646266, | |
| "learning_rate": 6.359652215375045e-08, | |
| "loss": 0.1243, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1291150450706482, | |
| "step": 4235, | |
| "valid_targets_mean": 5241.9, | |
| "valid_targets_min": 3324 | |
| }, | |
| { | |
| "epoch": 6.849757673667205, | |
| "grad_norm": 0.44529633508876937, | |
| "learning_rate": 5.733784423365141e-08, | |
| "loss": 0.134, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13770753145217896, | |
| "step": 4240, | |
| "valid_targets_mean": 5720.6, | |
| "valid_targets_min": 2434 | |
| }, | |
| { | |
| "epoch": 6.857835218093699, | |
| "grad_norm": 0.4833382892737251, | |
| "learning_rate": 5.140284628589065e-08, | |
| "loss": 0.1257, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1306309700012207, | |
| "step": 4245, | |
| "valid_targets_mean": 5192.2, | |
| "valid_targets_min": 3305 | |
| }, | |
| { | |
| "epoch": 6.865912762520194, | |
| "grad_norm": 0.4569082875177306, | |
| "learning_rate": 4.5791624638626966e-08, | |
| "loss": 0.1263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13639846444129944, | |
| "step": 4250, | |
| "valid_targets_mean": 5091.7, | |
| "valid_targets_min": 3352 | |
| }, | |
| { | |
| "epoch": 6.8739903069466886, | |
| "grad_norm": 0.44052978535555415, | |
| "learning_rate": 4.050427036496274e-08, | |
| "loss": 0.1208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12174591422080994, | |
| "step": 4255, | |
| "valid_targets_mean": 5032.0, | |
| "valid_targets_min": 3405 | |
| }, | |
| { | |
| "epoch": 6.882067851373183, | |
| "grad_norm": 0.4322882078765028, | |
| "learning_rate": 3.554086928145628e-08, | |
| "loss": 0.1233, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12149839103221893, | |
| "step": 4260, | |
| "valid_targets_mean": 5363.5, | |
| "valid_targets_min": 3387 | |
| }, | |
| { | |
| "epoch": 6.890145395799677, | |
| "grad_norm": 0.4059631546711396, | |
| "learning_rate": 3.090150194674069e-08, | |
| "loss": 0.1322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1226094663143158, | |
| "step": 4265, | |
| "valid_targets_mean": 5866.4, | |
| "valid_targets_min": 3317 | |
| }, | |
| { | |
| "epoch": 6.898222940226171, | |
| "grad_norm": 0.42761180544079447, | |
| "learning_rate": 2.658624366020046e-08, | |
| "loss": 0.1341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1262435019016266, | |
| "step": 4270, | |
| "valid_targets_mean": 5290.6, | |
| "valid_targets_min": 2684 | |
| }, | |
| { | |
| "epoch": 6.906300484652665, | |
| "grad_norm": 0.4514593673968472, | |
| "learning_rate": 2.2595164460768037e-08, | |
| "loss": 0.1224, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12937885522842407, | |
| "step": 4275, | |
| "valid_targets_mean": 4918.3, | |
| "valid_targets_min": 2783 | |
| }, | |
| { | |
| "epoch": 6.91437802907916, | |
| "grad_norm": 0.4155716287333983, | |
| "learning_rate": 1.892832912577136e-08, | |
| "loss": 0.1233, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1169050857424736, | |
| "step": 4280, | |
| "valid_targets_mean": 5553.2, | |
| "valid_targets_min": 2556 | |
| }, | |
| { | |
| "epoch": 6.922455573505655, | |
| "grad_norm": 0.468710773720154, | |
| "learning_rate": 1.55857971698925e-08, | |
| "loss": 0.1267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1221599131822586, | |
| "step": 4285, | |
| "valid_targets_mean": 5412.1, | |
| "valid_targets_min": 906 | |
| }, | |
| { | |
| "epoch": 6.930533117932149, | |
| "grad_norm": 0.45936368044765435, | |
| "learning_rate": 1.2567622844192885e-08, | |
| "loss": 0.1279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14119364321231842, | |
| "step": 4290, | |
| "valid_targets_mean": 5258.7, | |
| "valid_targets_min": 3509 | |
| }, | |
| { | |
| "epoch": 6.938610662358643, | |
| "grad_norm": 0.4446837612589739, | |
| "learning_rate": 9.87385513524286e-09, | |
| "loss": 0.1279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1250559538602829, | |
| "step": 4295, | |
| "valid_targets_mean": 5432.3, | |
| "valid_targets_min": 1471 | |
| }, | |
| { | |
| "epoch": 6.946688206785137, | |
| "grad_norm": 0.4374289120449582, | |
| "learning_rate": 7.504537764320142e-09, | |
| "loss": 0.123, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11904168874025345, | |
| "step": 4300, | |
| "valid_targets_mean": 5316.8, | |
| "valid_targets_min": 2943 | |
| }, | |
| { | |
| "epoch": 6.954765751211632, | |
| "grad_norm": 0.4125254709747706, | |
| "learning_rate": 5.4597091866992465e-09, | |
| "loss": 0.1199, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.10562781989574432, | |
| "step": 4305, | |
| "valid_targets_mean": 5005.0, | |
| "valid_targets_min": 2977 | |
| }, | |
| { | |
| "epoch": 6.962843295638126, | |
| "grad_norm": 0.4364936215987924, | |
| "learning_rate": 3.7394025910320045e-09, | |
| "loss": 0.1325, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14175035059452057, | |
| "step": 4310, | |
| "valid_targets_mean": 5272.9, | |
| "valid_targets_min": 3332 | |
| }, | |
| { | |
| "epoch": 6.97092084006462, | |
| "grad_norm": 0.43734639977822165, | |
| "learning_rate": 2.3436458988057574e-09, | |
| "loss": 0.1325, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12999486923217773, | |
| "step": 4315, | |
| "valid_targets_mean": 5339.0, | |
| "valid_targets_min": 3848 | |
| }, | |
| { | |
| "epoch": 6.978998384491114, | |
| "grad_norm": 0.4629362816601932, | |
| "learning_rate": 1.272461763890398e-09, | |
| "loss": 0.1262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13342653214931488, | |
| "step": 4320, | |
| "valid_targets_mean": 5303.4, | |
| "valid_targets_min": 1747 | |
| }, | |
| { | |
| "epoch": 6.987075928917609, | |
| "grad_norm": 0.41405175355515794, | |
| "learning_rate": 5.258675721742101e-10, | |
| "loss": 0.119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11401565372943878, | |
| "step": 4325, | |
| "valid_targets_mean": 5648.8, | |
| "valid_targets_min": 2634 | |
| }, | |
| { | |
| "epoch": 6.995153473344104, | |
| "grad_norm": 0.4428591581295041, | |
| "learning_rate": 1.038754412752141e-10, | |
| "loss": 0.1164, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13264863193035126, | |
| "step": 4330, | |
| "valid_targets_mean": 5337.8, | |
| "valid_targets_min": 2983 | |
| }, | |
| { | |
| "epoch": 7.0, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13213281333446503, | |
| "step": 4333, | |
| "total_flos": 1596873627009024.0, | |
| "train_loss": 0.09127889497572297, | |
| "train_runtime": 13176.1283, | |
| "train_samples_per_second": 5.26, | |
| "train_steps_per_second": 0.329, | |
| "valid_targets_mean": 5086.1, | |
| "valid_targets_min": 2015 | |
| } | |
| ], | |
| "logging_steps": 5, | |
| "max_steps": 4333, | |
| "num_input_tokens_seen": 0, | |
| "num_train_epochs": 7, | |
| "save_steps": 1500, | |
| "stateful_callbacks": { | |
| "TrainerControl": { | |
| "args": { | |
| "should_epoch_stop": false, | |
| "should_evaluate": false, | |
| "should_log": false, | |
| "should_save": true, | |
| "should_training_stop": true | |
| }, | |
| "attributes": {} | |
| } | |
| }, | |
| "total_flos": 1596873627009024.0, | |
| "train_batch_size": 1, | |
| "trial_name": null, | |
| "trial_params": null | |
| } | |
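A minimal sketch of how the log above could be read back, assuming the records sit under the usual `log_history` key of a Hugging Face `trainer_state.json`. That key is not visible in this excerpt, and the helper name `split_log` is hypothetical. The three records are copied from the end of the log, trimmed to the fields the sketch uses. A full run would load the whole file instead.

```python
import json

# Three records copied from the end of the log above, trimmed to a few fields.
# Assumption: in the full file they live under the "log_history" key.
state_json = """
{
  "log_history": [
    {"epoch": 6.987075928917609, "loss": 0.119, "step": 4325,
     "learning_rate": 5.258675721742101e-10},
    {"epoch": 6.995153473344104, "loss": 0.1164, "step": 4330,
     "learning_rate": 1.038754412752141e-10},
    {"epoch": 7.0, "step": 4333, "train_loss": 0.09127889497572297,
     "train_runtime": 13176.1283, "train_steps_per_second": 0.329}
  ],
  "logging_steps": 5,
  "max_steps": 4333
}
"""


def split_log(state):
    """Separate per-step records (they carry "loss") from the final
    summary record (it carries "train_loss" instead)."""
    step_records = [r for r in state["log_history"] if "loss" in r]
    summary = next((r for r in state["log_history"] if "train_loss" in r), None)
    return step_records, summary


state = json.loads(state_json)
step_records, summary = split_log(state)

# Mean of the logged losses in this small tail of the run.
mean_tail_loss = sum(r["loss"] for r in step_records) / len(step_records)

# Recompute throughput from the summary and compare it with the logged value.
steps_per_second = summary["step"] / summary["train_runtime"]

print(f"step records: {len(step_records)}")
print(f"mean tail loss: {mean_tail_loss:.4f}")
print(f"steps/s recomputed: {steps_per_second:.3f} "
      f"(logged: {summary['train_steps_per_second']})")
```

Splitting on the presence of `loss` versus `train_loss` reflects the shape of the records shown here: the final entry at step 4333 reports run-level totals and has no per-step `loss` field.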