Text Generation
Transformers
Safetensors
qwen3
llama-factory
full
Generated from Trainer
conversational
text-generation-inference
Instructions to use DCAgent/a1-nemotron_bash with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DCAgent/a1-nemotron_bash with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="DCAgent/a1-nemotron_bash")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("DCAgent/a1-nemotron_bash")
    model = AutoModelForCausalLM.from_pretrained("DCAgent/a1-nemotron_bash")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use DCAgent/a1-nemotron_bash with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "DCAgent/a1-nemotron_bash"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "DCAgent/a1-nemotron_bash",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/DCAgent/a1-nemotron_bash
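The curl call to the vLLM server can also be made from Python. The following is a minimal sketch using only the standard library. It assumes the server is reachable at `http://localhost:8000`, as in the curl example. The helper names `build_chat_request` and `chat` are hypothetical and not part of vLLM. The response parsing assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request

MODEL_ID = "DCAgent/a1-nemotron_bash"
# Assumption: vLLM is serving on its default port, as in the curl example.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-style response body.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` sends the same request as the curl command. Passing `base_url="http://localhost:30000/v1"` targets a server listening on port 30000 instead.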
- SGLang
How to use DCAgent/a1-nemotron_bash with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "DCAgent/a1-nemotron_bash" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "DCAgent/a1-nemotron_bash",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "DCAgent/a1-nemotron_bash" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "DCAgent/a1-nemotron_bash",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use DCAgent/a1-nemotron_bash with Docker Model Runner:
    docker model run hf.co/DCAgent/a1-nemotron_bash
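The trainer state that follows logs a `learning_rate` every 5 steps. Those values appear consistent with a linear warmup followed by cosine decay, the same shape as the `get_cosine_schedule_with_warmup` scheduler in Transformers. The sketch below reproduces the logged numbers under that assumption. The peak rate (4e-05) and warmup length (432 steps, about 10% of the 4319 total) are inferred from the log entries, not taken from a training config, so treat them as a reading of the data rather than the authors' settings.

```python
import math

PEAK_LR = 4e-5       # inferred: logged rates top out just below 4e-05
WARMUP_STEPS = 432   # inferred: the rate grows by 4e-05 / 432 per step early on
TOTAL_STEPS = 4319   # "global_step" in the trainer state


def logged_lr(step):
    """Learning rate the log is expected to show at optimizer step `step`."""
    # The logged value lags by one: step 5 shows the rate for scheduler index 4.
    s = step - 1
    if s < WARMUP_STEPS:
        # Linear warmup from 0 to PEAK_LR.
        return PEAK_LR * s / WARMUP_STEPS
    # Cosine decay from PEAK_LR toward 0 over the remaining steps.
    progress = (s - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (5, 55, 430, 435, 485):
    print(step, logged_lr(step))
```

Comparing the printed values with the `learning_rate` fields at steps 5, 55, 430, 435 and 485 is a quick way to check the inferred parameters. The `epoch` field can be read the same way: step 5 sits at epoch 5/617, and 617 steps per epoch times 7 epochs gives the 4319 total steps.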
| { | |
| "best_global_step": null, | |
| "best_metric": null, | |
| "best_model_checkpoint": null, | |
| "epoch": 7.0, | |
| "eval_steps": 500, | |
| "global_step": 4319, | |
| "is_hyper_param_search": false, | |
| "is_local_process_zero": true, | |
| "is_world_process_zero": true, | |
| "log_history": [ | |
| { | |
| "epoch": 0.008103727714748784, | |
| "grad_norm": 15.052451796629647, | |
| "learning_rate": 3.7037037037037036e-07, | |
| "loss": 0.744, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7075120210647583, | |
| "step": 5, | |
| "valid_targets_mean": 4307.1, | |
| "valid_targets_min": 2137 | |
| }, | |
| { | |
| "epoch": 0.01620745542949757, | |
| "grad_norm": 14.16077406559237, | |
| "learning_rate": 8.333333333333333e-07, | |
| "loss": 0.6971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7030583620071411, | |
| "step": 10, | |
| "valid_targets_mean": 6069.7, | |
| "valid_targets_min": 2317 | |
| }, | |
| { | |
| "epoch": 0.024311183144246355, | |
| "grad_norm": 12.310953306211442, | |
| "learning_rate": 1.2962962962962962e-06, | |
| "loss": 0.7099, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6455563306808472, | |
| "step": 15, | |
| "valid_targets_mean": 5927.9, | |
| "valid_targets_min": 2198 | |
| }, | |
| { | |
| "epoch": 0.03241491085899514, | |
| "grad_norm": 11.75979384479062, | |
| "learning_rate": 1.7592592592592594e-06, | |
| "loss": 0.6609, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6437151432037354, | |
| "step": 20, | |
| "valid_targets_mean": 5138.6, | |
| "valid_targets_min": 1450 | |
| }, | |
| { | |
| "epoch": 0.04051863857374392, | |
| "grad_norm": 7.098490762665116, | |
| "learning_rate": 2.222222222222222e-06, | |
| "loss": 0.5954, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5416496396064758, | |
| "step": 25, | |
| "valid_targets_mean": 4411.9, | |
| "valid_targets_min": 1399 | |
| }, | |
| { | |
| "epoch": 0.04862236628849271, | |
| "grad_norm": 4.7050280865823675, | |
| "learning_rate": 2.6851851851851856e-06, | |
| "loss": 0.5509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5249617099761963, | |
| "step": 30, | |
| "valid_targets_mean": 3748.4, | |
| "valid_targets_min": 1344 | |
| }, | |
| { | |
| "epoch": 0.05672609400324149, | |
| "grad_norm": 2.5302822583427482, | |
| "learning_rate": 3.1481481481481483e-06, | |
| "loss": 0.5361, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5037345886230469, | |
| "step": 35, | |
| "valid_targets_mean": 4683.2, | |
| "valid_targets_min": 1359 | |
| }, | |
| { | |
| "epoch": 0.06482982171799027, | |
| "grad_norm": 1.7953600503801777, | |
| "learning_rate": 3.6111111111111115e-06, | |
| "loss": 0.4987, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4654555916786194, | |
| "step": 40, | |
| "valid_targets_mean": 4019.7, | |
| "valid_targets_min": 1745 | |
| }, | |
| { | |
| "epoch": 0.07293354943273905, | |
| "grad_norm": 1.3040568058787088, | |
| "learning_rate": 4.074074074074074e-06, | |
| "loss": 0.4715, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4576088786125183, | |
| "step": 45, | |
| "valid_targets_mean": 4056.4, | |
| "valid_targets_min": 1336 | |
| }, | |
| { | |
| "epoch": 0.08103727714748785, | |
| "grad_norm": 1.1186358672828327, | |
| "learning_rate": 4.537037037037038e-06, | |
| "loss": 0.4492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4448954463005066, | |
| "step": 50, | |
| "valid_targets_mean": 3493.5, | |
| "valid_targets_min": 944 | |
| }, | |
| { | |
| "epoch": 0.08914100486223663, | |
| "grad_norm": 0.8583349811923283, | |
| "learning_rate": 5e-06, | |
| "loss": 0.432, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4754987061023712, | |
| "step": 55, | |
| "valid_targets_mean": 4823.1, | |
| "valid_targets_min": 2229 | |
| }, | |
| { | |
| "epoch": 0.09724473257698542, | |
| "grad_norm": 0.7390069286691012, | |
| "learning_rate": 5.462962962962963e-06, | |
| "loss": 0.4456, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.43904808163642883, | |
| "step": 60, | |
| "valid_targets_mean": 4810.8, | |
| "valid_targets_min": 1311 | |
| }, | |
| { | |
| "epoch": 0.1053484602917342, | |
| "grad_norm": 0.7415924654887941, | |
| "learning_rate": 5.925925925925926e-06, | |
| "loss": 0.4378, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4822186231613159, | |
| "step": 65, | |
| "valid_targets_mean": 4482.6, | |
| "valid_targets_min": 1182 | |
| }, | |
| { | |
| "epoch": 0.11345218800648298, | |
| "grad_norm": 0.6929052623611306, | |
| "learning_rate": 6.3888888888888885e-06, | |
| "loss": 0.4383, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.49454551935195923, | |
| "step": 70, | |
| "valid_targets_mean": 6779.2, | |
| "valid_targets_min": 2081 | |
| }, | |
| { | |
| "epoch": 0.12155591572123177, | |
| "grad_norm": 0.6540631430503951, | |
| "learning_rate": 6.851851851851853e-06, | |
| "loss": 0.4128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35995030403137207, | |
| "step": 75, | |
| "valid_targets_mean": 4102.6, | |
| "valid_targets_min": 1430 | |
| }, | |
| { | |
| "epoch": 0.12965964343598055, | |
| "grad_norm": 0.6110409858399266, | |
| "learning_rate": 7.314814814814816e-06, | |
| "loss": 0.4059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.44885435700416565, | |
| "step": 80, | |
| "valid_targets_mean": 5470.8, | |
| "valid_targets_min": 1387 | |
| }, | |
| { | |
| "epoch": 0.13776337115072934, | |
| "grad_norm": 0.6225608077350795, | |
| "learning_rate": 7.77777777777778e-06, | |
| "loss": 0.4113, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4068431258201599, | |
| "step": 85, | |
| "valid_targets_mean": 5061.4, | |
| "valid_targets_min": 1385 | |
| }, | |
| { | |
| "epoch": 0.1458670988654781, | |
| "grad_norm": 0.5508564720574586, | |
| "learning_rate": 8.240740740740741e-06, | |
| "loss": 0.3725, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35934633016586304, | |
| "step": 90, | |
| "valid_targets_mean": 4809.6, | |
| "valid_targets_min": 676 | |
| }, | |
| { | |
| "epoch": 0.1539708265802269, | |
| "grad_norm": 0.6585566821793581, | |
| "learning_rate": 8.703703703703705e-06, | |
| "loss": 0.3701, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.37526312470436096, | |
| "step": 95, | |
| "valid_targets_mean": 4011.1, | |
| "valid_targets_min": 1477 | |
| }, | |
| { | |
| "epoch": 0.1620745542949757, | |
| "grad_norm": 0.567030762477571, | |
| "learning_rate": 9.166666666666666e-06, | |
| "loss": 0.3514, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.367622971534729, | |
| "step": 100, | |
| "valid_targets_mean": 4913.1, | |
| "valid_targets_min": 1476 | |
| }, | |
| { | |
| "epoch": 0.17017828200972449, | |
| "grad_norm": 0.6144227532600338, | |
| "learning_rate": 9.62962962962963e-06, | |
| "loss": 0.3632, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3684994876384735, | |
| "step": 105, | |
| "valid_targets_mean": 3957.9, | |
| "valid_targets_min": 1271 | |
| }, | |
| { | |
| "epoch": 0.17828200972447325, | |
| "grad_norm": 0.5569033858777038, | |
| "learning_rate": 1.0092592592592594e-05, | |
| "loss": 0.3343, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36880379915237427, | |
| "step": 110, | |
| "valid_targets_mean": 5144.2, | |
| "valid_targets_min": 1418 | |
| }, | |
| { | |
| "epoch": 0.18638573743922204, | |
| "grad_norm": 0.636909406673007, | |
| "learning_rate": 1.0555555555555557e-05, | |
| "loss": 0.3293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3586111068725586, | |
| "step": 115, | |
| "valid_targets_mean": 3771.0, | |
| "valid_targets_min": 1440 | |
| }, | |
| { | |
| "epoch": 0.19448946515397084, | |
| "grad_norm": 0.5782624385183504, | |
| "learning_rate": 1.1018518518518519e-05, | |
| "loss": 0.3206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2734828293323517, | |
| "step": 120, | |
| "valid_targets_mean": 4552.7, | |
| "valid_targets_min": 2314 | |
| }, | |
| { | |
| "epoch": 0.2025931928687196, | |
| "grad_norm": 0.8013266382776258, | |
| "learning_rate": 1.1481481481481482e-05, | |
| "loss": 0.3345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34413567185401917, | |
| "step": 125, | |
| "valid_targets_mean": 3705.8, | |
| "valid_targets_min": 1327 | |
| }, | |
| { | |
| "epoch": 0.2106969205834684, | |
| "grad_norm": 0.6573462845572078, | |
| "learning_rate": 1.1944444444444444e-05, | |
| "loss": 0.3102, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3564257323741913, | |
| "step": 130, | |
| "valid_targets_mean": 5756.6, | |
| "valid_targets_min": 1527 | |
| }, | |
| { | |
| "epoch": 0.2188006482982172, | |
| "grad_norm": 0.6145775154410387, | |
| "learning_rate": 1.2407407407407408e-05, | |
| "loss": 0.2884, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2915225625038147, | |
| "step": 135, | |
| "valid_targets_mean": 4130.4, | |
| "valid_targets_min": 2179 | |
| }, | |
| { | |
| "epoch": 0.22690437601296595, | |
| "grad_norm": 0.6299790820533513, | |
| "learning_rate": 1.2870370370370371e-05, | |
| "loss": 0.3186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35333144664764404, | |
| "step": 140, | |
| "valid_targets_mean": 4053.9, | |
| "valid_targets_min": 1936 | |
| }, | |
| { | |
| "epoch": 0.23500810372771475, | |
| "grad_norm": 0.5798870208280786, | |
| "learning_rate": 1.3333333333333333e-05, | |
| "loss": 0.3072, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2676125466823578, | |
| "step": 145, | |
| "valid_targets_mean": 4628.5, | |
| "valid_targets_min": 640 | |
| }, | |
| { | |
| "epoch": 0.24311183144246354, | |
| "grad_norm": 0.7204824013564382, | |
| "learning_rate": 1.3796296296296297e-05, | |
| "loss": 0.2963, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32080602645874023, | |
| "step": 150, | |
| "valid_targets_mean": 4789.8, | |
| "valid_targets_min": 2557 | |
| }, | |
| { | |
| "epoch": 0.25121555915721233, | |
| "grad_norm": 0.5701902969082061, | |
| "learning_rate": 1.4259259259259259e-05, | |
| "loss": 0.3141, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2823035717010498, | |
| "step": 155, | |
| "valid_targets_mean": 4539.7, | |
| "valid_targets_min": 1340 | |
| }, | |
| { | |
| "epoch": 0.2593192868719611, | |
| "grad_norm": 0.672412353901392, | |
| "learning_rate": 1.4722222222222224e-05, | |
| "loss": 0.2941, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31722885370254517, | |
| "step": 160, | |
| "valid_targets_mean": 3801.2, | |
| "valid_targets_min": 1484 | |
| }, | |
| { | |
| "epoch": 0.26742301458670986, | |
| "grad_norm": 0.6055091876647029, | |
| "learning_rate": 1.5185185185185187e-05, | |
| "loss": 0.2834, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.289315789937973, | |
| "step": 165, | |
| "valid_targets_mean": 4378.0, | |
| "valid_targets_min": 1423 | |
| }, | |
| { | |
| "epoch": 0.2755267423014587, | |
| "grad_norm": 0.5986947151887498, | |
| "learning_rate": 1.564814814814815e-05, | |
| "loss": 0.28, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2713383138179779, | |
| "step": 170, | |
| "valid_targets_mean": 4662.1, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 0.28363047001620745, | |
| "grad_norm": 0.589272512044107, | |
| "learning_rate": 1.6111111111111115e-05, | |
| "loss": 0.2893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23535937070846558, | |
| "step": 175, | |
| "valid_targets_mean": 4591.1, | |
| "valid_targets_min": 1438 | |
| }, | |
| { | |
| "epoch": 0.2917341977309562, | |
| "grad_norm": 0.6642075943919709, | |
| "learning_rate": 1.6574074074074075e-05, | |
| "loss": 0.2996, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3774803876876831, | |
| "step": 180, | |
| "valid_targets_mean": 4475.9, | |
| "valid_targets_min": 1154 | |
| }, | |
| { | |
| "epoch": 0.29983792544570503, | |
| "grad_norm": 0.5268979191410055, | |
| "learning_rate": 1.7037037037037038e-05, | |
| "loss": 0.2799, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2813992500305176, | |
| "step": 185, | |
| "valid_targets_mean": 5149.6, | |
| "valid_targets_min": 1851 | |
| }, | |
| { | |
| "epoch": 0.3079416531604538, | |
| "grad_norm": 0.570391651488006, | |
| "learning_rate": 1.7500000000000002e-05, | |
| "loss": 0.2825, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2881418764591217, | |
| "step": 190, | |
| "valid_targets_mean": 5290.5, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 0.3160453808752026, | |
| "grad_norm": 0.6322213174821355, | |
| "learning_rate": 1.7962962962962965e-05, | |
| "loss": 0.2729, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3086417317390442, | |
| "step": 195, | |
| "valid_targets_mean": 4418.4, | |
| "valid_targets_min": 1260 | |
| }, | |
| { | |
| "epoch": 0.3241491085899514, | |
| "grad_norm": 0.6750293354196638, | |
| "learning_rate": 1.8425925925925926e-05, | |
| "loss": 0.2881, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3000980615615845, | |
| "step": 200, | |
| "valid_targets_mean": 3632.8, | |
| "valid_targets_min": 1405 | |
| }, | |
| { | |
| "epoch": 0.33225283630470015, | |
| "grad_norm": 0.5598256550566737, | |
| "learning_rate": 1.888888888888889e-05, | |
| "loss": 0.2757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3009907007217407, | |
| "step": 205, | |
| "valid_targets_mean": 4531.7, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 0.34035656401944897, | |
| "grad_norm": 0.6410032816153864, | |
| "learning_rate": 1.9351851851851853e-05, | |
| "loss": 0.2728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23578113317489624, | |
| "step": 210, | |
| "valid_targets_mean": 3848.1, | |
| "valid_targets_min": 1328 | |
| }, | |
| { | |
| "epoch": 0.34846029173419774, | |
| "grad_norm": 0.670577961345641, | |
| "learning_rate": 1.9814814814814816e-05, | |
| "loss": 0.2385, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26014572381973267, | |
| "step": 215, | |
| "valid_targets_mean": 3549.2, | |
| "valid_targets_min": 1323 | |
| }, | |
| { | |
| "epoch": 0.3565640194489465, | |
| "grad_norm": 0.5542778499650366, | |
| "learning_rate": 2.027777777777778e-05, | |
| "loss": 0.287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2911411225795746, | |
| "step": 220, | |
| "valid_targets_mean": 6411.4, | |
| "valid_targets_min": 1036 | |
| }, | |
| { | |
| "epoch": 0.3646677471636953, | |
| "grad_norm": 0.5687749934223739, | |
| "learning_rate": 2.074074074074074e-05, | |
| "loss": 0.2726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2129267156124115, | |
| "step": 225, | |
| "valid_targets_mean": 4360.8, | |
| "valid_targets_min": 1473 | |
| }, | |
| { | |
| "epoch": 0.3727714748784441, | |
| "grad_norm": 0.545745675422607, | |
| "learning_rate": 2.1203703703703703e-05, | |
| "loss": 0.2619, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25986871123313904, | |
| "step": 230, | |
| "valid_targets_mean": 4436.2, | |
| "valid_targets_min": 1362 | |
| }, | |
| { | |
| "epoch": 0.38087520259319285, | |
| "grad_norm": 0.5149796037209613, | |
| "learning_rate": 2.1666666666666667e-05, | |
| "loss": 0.2504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23290029168128967, | |
| "step": 235, | |
| "valid_targets_mean": 5435.3, | |
| "valid_targets_min": 1451 | |
| }, | |
| { | |
| "epoch": 0.3889789303079417, | |
| "grad_norm": 0.6133182407650534, | |
| "learning_rate": 2.212962962962963e-05, | |
| "loss": 0.2427, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26056742668151855, | |
| "step": 240, | |
| "valid_targets_mean": 4159.4, | |
| "valid_targets_min": 1314 | |
| }, | |
| { | |
| "epoch": 0.39708265802269044, | |
| "grad_norm": 0.5649078331736651, | |
| "learning_rate": 2.2592592592592594e-05, | |
| "loss": 0.262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2821381092071533, | |
| "step": 245, | |
| "valid_targets_mean": 4649.4, | |
| "valid_targets_min": 1490 | |
| }, | |
| { | |
| "epoch": 0.4051863857374392, | |
| "grad_norm": 0.6118904964971021, | |
| "learning_rate": 2.3055555555555554e-05, | |
| "loss": 0.2612, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3021799921989441, | |
| "step": 250, | |
| "valid_targets_mean": 5177.1, | |
| "valid_targets_min": 1482 | |
| }, | |
| { | |
| "epoch": 0.413290113452188, | |
| "grad_norm": 0.6071615375178763, | |
| "learning_rate": 2.3518518518518518e-05, | |
| "loss": 0.2524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19076423346996307, | |
| "step": 255, | |
| "valid_targets_mean": 4186.5, | |
| "valid_targets_min": 1178 | |
| }, | |
| { | |
| "epoch": 0.4213938411669368, | |
| "grad_norm": 0.5991586523880158, | |
| "learning_rate": 2.3981481481481485e-05, | |
| "loss": 0.2559, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2608402371406555, | |
| "step": 260, | |
| "valid_targets_mean": 3828.4, | |
| "valid_targets_min": 914 | |
| }, | |
| { | |
| "epoch": 0.42949756888168555, | |
| "grad_norm": 0.6337407192018856, | |
| "learning_rate": 2.444444444444445e-05, | |
| "loss": 0.2418, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24443909525871277, | |
| "step": 265, | |
| "valid_targets_mean": 4669.5, | |
| "valid_targets_min": 1387 | |
| }, | |
| { | |
| "epoch": 0.4376012965964344, | |
| "grad_norm": 0.6370177249174864, | |
| "learning_rate": 2.4907407407407412e-05, | |
| "loss": 0.2488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23749151825904846, | |
| "step": 270, | |
| "valid_targets_mean": 3662.0, | |
| "valid_targets_min": 1268 | |
| }, | |
| { | |
| "epoch": 0.44570502431118314, | |
| "grad_norm": 0.6685559577348903, | |
| "learning_rate": 2.5370370370370375e-05, | |
| "loss": 0.2526, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22629933059215546, | |
| "step": 275, | |
| "valid_targets_mean": 3966.9, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 0.4538087520259319, | |
| "grad_norm": 0.5774981883358449, | |
| "learning_rate": 2.5833333333333336e-05, | |
| "loss": 0.2525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2319318950176239, | |
| "step": 280, | |
| "valid_targets_mean": 4366.3, | |
| "valid_targets_min": 1123 | |
| }, | |
| { | |
| "epoch": 0.4619124797406807, | |
| "grad_norm": 0.639777977381849, | |
| "learning_rate": 2.62962962962963e-05, | |
| "loss": 0.2821, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22607596218585968, | |
| "step": 285, | |
| "valid_targets_mean": 4010.4, | |
| "valid_targets_min": 1034 | |
| }, | |
| { | |
| "epoch": 0.4700162074554295, | |
| "grad_norm": 0.5632527042212236, | |
| "learning_rate": 2.6759259259259263e-05, | |
| "loss": 0.2222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21673567593097687, | |
| "step": 290, | |
| "valid_targets_mean": 4221.4, | |
| "valid_targets_min": 1426 | |
| }, | |
| { | |
| "epoch": 0.47811993517017826, | |
| "grad_norm": 0.641243133328366, | |
| "learning_rate": 2.7222222222222226e-05, | |
| "loss": 0.2342, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21339434385299683, | |
| "step": 295, | |
| "valid_targets_mean": 3281.4, | |
| "valid_targets_min": 862 | |
| }, | |
| { | |
| "epoch": 0.4862236628849271, | |
| "grad_norm": 0.5785904863633673, | |
| "learning_rate": 2.768518518518519e-05, | |
| "loss": 0.2401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2441210299730301, | |
| "step": 300, | |
| "valid_targets_mean": 5462.9, | |
| "valid_targets_min": 1413 | |
| }, | |
| { | |
| "epoch": 0.49432739059967584, | |
| "grad_norm": 0.6186086113261662, | |
| "learning_rate": 2.814814814814815e-05, | |
| "loss": 0.2421, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24510961771011353, | |
| "step": 305, | |
| "valid_targets_mean": 4120.8, | |
| "valid_targets_min": 1337 | |
| }, | |
| { | |
| "epoch": 0.5024311183144247, | |
| "grad_norm": 0.7369432792730909, | |
| "learning_rate": 2.8611111111111113e-05, | |
| "loss": 0.2769, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22853118181228638, | |
| "step": 310, | |
| "valid_targets_mean": 3062.8, | |
| "valid_targets_min": 582 | |
| }, | |
| { | |
| "epoch": 0.5105348460291734, | |
| "grad_norm": 0.5012959603011699, | |
| "learning_rate": 2.9074074074074077e-05, | |
| "loss": 0.2268, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.265197217464447, | |
| "step": 315, | |
| "valid_targets_mean": 4856.2, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 0.5186385737439222, | |
| "grad_norm": 0.5881603988884309, | |
| "learning_rate": 2.953703703703704e-05, | |
| "loss": 0.2235, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24882851541042328, | |
| "step": 320, | |
| "valid_targets_mean": 4374.3, | |
| "valid_targets_min": 1504 | |
| }, | |
| { | |
| "epoch": 0.526742301458671, | |
| "grad_norm": 0.5843683180477552, | |
| "learning_rate": 3.0000000000000004e-05, | |
| "loss": 0.2561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3021346926689148, | |
| "step": 325, | |
| "valid_targets_mean": 4368.1, | |
| "valid_targets_min": 849 | |
| }, | |
| { | |
| "epoch": 0.5348460291734197, | |
| "grad_norm": 0.5638995855114154, | |
| "learning_rate": 3.0462962962962964e-05, | |
| "loss": 0.2591, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2310064136981964, | |
| "step": 330, | |
| "valid_targets_mean": 4034.5, | |
| "valid_targets_min": 1392 | |
| }, | |
| { | |
| "epoch": 0.5429497568881686, | |
| "grad_norm": 0.6241796157785493, | |
| "learning_rate": 3.092592592592593e-05, | |
| "loss": 0.2328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2775031626224518, | |
| "step": 335, | |
| "valid_targets_mean": 4538.9, | |
| "valid_targets_min": 1472 | |
| }, | |
| { | |
| "epoch": 0.5510534846029174, | |
| "grad_norm": 0.5986159843009783, | |
| "learning_rate": 3.138888888888889e-05, | |
| "loss": 0.2192, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2154683768749237, | |
| "step": 340, | |
| "valid_targets_mean": 4171.1, | |
| "valid_targets_min": 1448 | |
| }, | |
| { | |
| "epoch": 0.5591572123176661, | |
| "grad_norm": 0.5678213186550801, | |
| "learning_rate": 3.185185185185185e-05, | |
| "loss": 0.2442, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21003559231758118, | |
| "step": 345, | |
| "valid_targets_mean": 4115.6, | |
| "valid_targets_min": 1494 | |
| }, | |
| { | |
| "epoch": 0.5672609400324149, | |
| "grad_norm": 0.5000435710526403, | |
| "learning_rate": 3.231481481481482e-05, | |
| "loss": 0.2384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21373742818832397, | |
| "step": 350, | |
| "valid_targets_mean": 5205.3, | |
| "valid_targets_min": 1382 | |
| }, | |
| { | |
| "epoch": 0.5753646677471637, | |
| "grad_norm": 0.6182955908884262, | |
| "learning_rate": 3.277777777777778e-05, | |
| "loss": 0.2278, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23803135752677917, | |
| "step": 355, | |
| "valid_targets_mean": 3645.1, | |
| "valid_targets_min": 1045 | |
| }, | |
| { | |
| "epoch": 0.5834683954619124, | |
| "grad_norm": 0.618571479862893, | |
| "learning_rate": 3.3240740740740746e-05, | |
| "loss": 0.2483, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19696547091007233, | |
| "step": 360, | |
| "valid_targets_mean": 4458.5, | |
| "valid_targets_min": 1330 | |
| }, | |
| { | |
| "epoch": 0.5915721231766613, | |
| "grad_norm": 0.5044774531396774, | |
| "learning_rate": 3.3703703703703706e-05, | |
| "loss": 0.2331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2560524344444275, | |
| "step": 365, | |
| "valid_targets_mean": 5364.7, | |
| "valid_targets_min": 1432 | |
| }, | |
| { | |
| "epoch": 0.5996758508914101, | |
| "grad_norm": 0.4850915687468068, | |
| "learning_rate": 3.4166666666666666e-05, | |
| "loss": 0.2305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22693286836147308, | |
| "step": 370, | |
| "valid_targets_mean": 5804.1, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 0.6077795786061588, | |
| "grad_norm": 0.7767357031334243, | |
| "learning_rate": 3.462962962962963e-05, | |
| "loss": 0.2409, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23934313654899597, | |
| "step": 375, | |
| "valid_targets_mean": 4335.7, | |
| "valid_targets_min": 688 | |
| }, | |
| { | |
| "epoch": 0.6158833063209076, | |
| "grad_norm": 0.5297011414951764, | |
| "learning_rate": 3.509259259259259e-05, | |
| "loss": 0.2484, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2228357195854187, | |
| "step": 380, | |
| "valid_targets_mean": 4548.9, | |
| "valid_targets_min": 616 | |
| }, | |
| { | |
| "epoch": 0.6239870340356564, | |
| "grad_norm": 0.5580892808180266, | |
| "learning_rate": 3.555555555555555e-05, | |
| "loss": 0.2297, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24209387600421906, | |
| "step": 385, | |
| "valid_targets_mean": 4770.1, | |
| "valid_targets_min": 1486 | |
| }, | |
| { | |
| "epoch": 0.6320907617504052, | |
| "grad_norm": 0.5064875325748078, | |
| "learning_rate": 3.601851851851852e-05, | |
| "loss": 0.2317, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24600476026535034, | |
| "step": 390, | |
| "valid_targets_mean": 4807.3, | |
| "valid_targets_min": 1405 | |
| }, | |
| { | |
| "epoch": 0.640194489465154, | |
| "grad_norm": 0.8096595526534478, | |
| "learning_rate": 3.648148148148149e-05, | |
| "loss": 0.2349, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2103320062160492, | |
| "step": 395, | |
| "valid_targets_mean": 5441.9, | |
| "valid_targets_min": 1376 | |
| }, | |
| { | |
| "epoch": 0.6482982171799028, | |
| "grad_norm": 0.527656122669523, | |
| "learning_rate": 3.694444444444445e-05, | |
| "loss": 0.2286, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21797297894954681, | |
| "step": 400, | |
| "valid_targets_mean": 4970.8, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 0.6564019448946515, | |
| "grad_norm": 0.5353735139926096, | |
| "learning_rate": 3.7407407407407414e-05, | |
| "loss": 0.2274, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2168428897857666, | |
| "step": 405, | |
| "valid_targets_mean": 4440.8, | |
| "valid_targets_min": 1348 | |
| }, | |
| { | |
| "epoch": 0.6645056726094003, | |
| "grad_norm": 0.5014487865079957, | |
| "learning_rate": 3.7870370370370374e-05, | |
| "loss": 0.2525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2826576828956604, | |
| "step": 410, | |
| "valid_targets_mean": 4816.1, | |
| "valid_targets_min": 1489 | |
| }, | |
| { | |
| "epoch": 0.6726094003241491, | |
| "grad_norm": 0.5310106006676556, | |
| "learning_rate": 3.833333333333334e-05, | |
| "loss": 0.2236, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22325406968593597, | |
| "step": 415, | |
| "valid_targets_mean": 4324.9, | |
| "valid_targets_min": 1259 | |
| }, | |
| { | |
| "epoch": 0.6807131280388979, | |
| "grad_norm": 0.5631510976247105, | |
| "learning_rate": 3.87962962962963e-05, | |
| "loss": 0.2561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2960614562034607, | |
| "step": 420, | |
| "valid_targets_mean": 4590.2, | |
| "valid_targets_min": 714 | |
| }, | |
| { | |
| "epoch": 0.6888168557536467, | |
| "grad_norm": 0.5090391200220923, | |
| "learning_rate": 3.925925925925926e-05, | |
| "loss": 0.2336, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2151426374912262, | |
| "step": 425, | |
| "valid_targets_mean": 4842.5, | |
| "valid_targets_min": 1187 | |
| }, | |
| { | |
| "epoch": 0.6969205834683955, | |
| "grad_norm": 0.49546560892471103, | |
| "learning_rate": 3.972222222222223e-05, | |
| "loss": 0.2253, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19332653284072876, | |
| "step": 430, | |
| "valid_targets_mean": 4583.8, | |
| "valid_targets_min": 1348 | |
| }, | |
| { | |
| "epoch": 0.7050243111831442, | |
| "grad_norm": 0.426580811573209, | |
| "learning_rate": 3.99999738705324e-05, | |
| "loss": 0.2268, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22979189455509186, | |
| "step": 435, | |
| "valid_targets_mean": 6008.7, | |
| "valid_targets_min": 2062 | |
| }, | |
| { | |
| "epoch": 0.713128038897893, | |
| "grad_norm": 0.5296107565627104, | |
| "learning_rate": 3.999967991480598e-05, | |
| "loss": 0.2269, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2161000519990921, | |
| "step": 440, | |
| "valid_targets_mean": 4431.4, | |
| "valid_targets_min": 1459 | |
| }, | |
| { | |
| "epoch": 0.7212317666126418, | |
| "grad_norm": 0.5911774054449865, | |
| "learning_rate": 3.99990593463352e-05, | |
| "loss": 0.2061, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20900100469589233, | |
| "step": 445, | |
| "valid_targets_mean": 3832.8, | |
| "valid_targets_min": 1421 | |
| }, | |
| { | |
| "epoch": 0.7293354943273906, | |
| "grad_norm": 0.50964362931027, | |
| "learning_rate": 3.9998112175254504e-05, | |
| "loss": 0.2348, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19737258553504944, | |
| "step": 450, | |
| "valid_targets_mean": 4713.8, | |
| "valid_targets_min": 1330 | |
| }, | |
| { | |
| "epoch": 0.7374392220421394, | |
| "grad_norm": 0.6532159249673581, | |
| "learning_rate": 3.9996838417032044e-05, | |
| "loss": 0.2183, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1929355263710022, | |
| "step": 455, | |
| "valid_targets_mean": 3942.9, | |
| "valid_targets_min": 845 | |
| }, | |
| { | |
| "epoch": 0.7455429497568882, | |
| "grad_norm": 0.4928978484161786, | |
| "learning_rate": 3.999523809246943e-05, | |
| "loss": 0.2456, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20819595456123352, | |
| "step": 460, | |
| "valid_targets_mean": 5082.1, | |
| "valid_targets_min": 1540 | |
| }, | |
| { | |
| "epoch": 0.7536466774716369, | |
| "grad_norm": 0.5236619156041604, | |
| "learning_rate": 3.9993311227701415e-05, | |
| "loss": 0.2218, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2349873036146164, | |
| "step": 465, | |
| "valid_targets_mean": 4213.9, | |
| "valid_targets_min": 1406 | |
| }, | |
| { | |
| "epoch": 0.7617504051863857, | |
| "grad_norm": 0.603508775726922, | |
| "learning_rate": 3.9991057854195424e-05, | |
| "loss": 0.221, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2348279356956482, | |
| "step": 470, | |
| "valid_targets_mean": 4291.8, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 0.7698541329011345, | |
| "grad_norm": 0.5979704039956626, | |
| "learning_rate": 3.998847800875107e-05, | |
| "loss": 0.2382, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2371644824743271, | |
| "step": 475, | |
| "valid_targets_mean": 5716.9, | |
| "valid_targets_min": 2081 | |
| }, | |
| { | |
| "epoch": 0.7779578606158833, | |
| "grad_norm": 0.5663750794625397, | |
| "learning_rate": 3.9985571733499544e-05, | |
| "loss": 0.208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16969016194343567, | |
| "step": 480, | |
| "valid_targets_mean": 3593.6, | |
| "valid_targets_min": 1369 | |
| }, | |
| { | |
| "epoch": 0.7860615883306321, | |
| "grad_norm": 0.465204031907573, | |
| "learning_rate": 3.998233907590294e-05, | |
| "loss": 0.2224, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22521032392978668, | |
| "step": 485, | |
| "valid_targets_mean": 5773.2, | |
| "valid_targets_min": 2696 | |
| }, | |
| { | |
| "epoch": 0.7941653160453809, | |
| "grad_norm": 0.5166936419430709, | |
| "learning_rate": 3.9978780088753464e-05, | |
| "loss": 0.21, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1836811602115631, | |
| "step": 490, | |
| "valid_targets_mean": 4100.2, | |
| "valid_targets_min": 1392 | |
| }, | |
| { | |
| "epoch": 0.8022690437601296, | |
| "grad_norm": 0.5850448684229496, | |
| "learning_rate": 3.997489483017256e-05, | |
| "loss": 0.2448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23051467537879944, | |
| "step": 495, | |
| "valid_targets_mean": 4196.2, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 0.8103727714748784, | |
| "grad_norm": 0.5313568085982785, | |
| "learning_rate": 3.997068336361e-05, | |
| "loss": 0.2322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2501906752586365, | |
| "step": 500, | |
| "valid_targets_mean": 5268.7, | |
| "valid_targets_min": 1424 | |
| }, | |
| { | |
| "epoch": 0.8184764991896273, | |
| "grad_norm": 0.45876391274602263, | |
| "learning_rate": 3.996614575784282e-05, | |
| "loss": 0.2417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22189603745937347, | |
| "step": 505, | |
| "valid_targets_mean": 5229.9, | |
| "valid_targets_min": 1133 | |
| }, | |
| { | |
| "epoch": 0.826580226904376, | |
| "grad_norm": 0.5204098853244463, | |
| "learning_rate": 3.99612820869742e-05, | |
| "loss": 0.2367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2431417852640152, | |
| "step": 510, | |
| "valid_targets_mean": 5087.5, | |
| "valid_targets_min": 773 | |
| }, | |
| { | |
| "epoch": 0.8346839546191248, | |
| "grad_norm": 0.5515103680701857, | |
| "learning_rate": 3.9956092430432254e-05, | |
| "loss": 0.2246, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23825669288635254, | |
| "step": 515, | |
| "valid_targets_mean": 3682.3, | |
| "valid_targets_min": 1074 | |
| }, | |
| { | |
| "epoch": 0.8427876823338736, | |
| "grad_norm": 0.49756644817448453, | |
| "learning_rate": 3.9950576872968735e-05, | |
| "loss": 0.2202, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24041366577148438, | |
| "step": 520, | |
| "valid_targets_mean": 4797.9, | |
| "valid_targets_min": 1445 | |
| }, | |
| { | |
| "epoch": 0.8508914100486223, | |
| "grad_norm": 0.8994165287630641, | |
| "learning_rate": 3.994473550465765e-05, | |
| "loss": 0.2418, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22890087962150574, | |
| "step": 525, | |
| "valid_targets_mean": 5103.2, | |
| "valid_targets_min": 1313 | |
| }, | |
| { | |
| "epoch": 0.8589951377633711, | |
| "grad_norm": 0.5108141443277746, | |
| "learning_rate": 3.993856842089379e-05, | |
| "loss": 0.2422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22940672934055328, | |
| "step": 530, | |
| "valid_targets_mean": 5652.1, | |
| "valid_targets_min": 2424 | |
| }, | |
| { | |
| "epoch": 0.86709886547812, | |
| "grad_norm": 0.5765103616276643, | |
| "learning_rate": 3.99320757223912e-05, | |
| "loss": 0.2214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2153976708650589, | |
| "step": 535, | |
| "valid_targets_mean": 3130.8, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 0.8752025931928687, | |
| "grad_norm": 0.5299043761506195, | |
| "learning_rate": 3.992525751518144e-05, | |
| "loss": 0.2119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21200573444366455, | |
| "step": 540, | |
| "valid_targets_mean": 4073.6, | |
| "valid_targets_min": 1365 | |
| }, | |
| { | |
| "epoch": 0.8833063209076175, | |
| "grad_norm": 0.6103411854134536, | |
| "learning_rate": 3.991811391061199e-05, | |
| "loss": 0.2448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25960564613342285, | |
| "step": 545, | |
| "valid_targets_mean": 5217.1, | |
| "valid_targets_min": 1356 | |
| }, | |
| { | |
| "epoch": 0.8914100486223663, | |
| "grad_norm": 0.5389982825030214, | |
| "learning_rate": 3.9910645025344324e-05, | |
| "loss": 0.2363, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22821247577667236, | |
| "step": 550, | |
| "valid_targets_mean": 5121.3, | |
| "valid_targets_min": 1157 | |
| }, | |
| { | |
| "epoch": 0.899513776337115, | |
| "grad_norm": 0.5148328941205473, | |
| "learning_rate": 3.990285098135204e-05, | |
| "loss": 0.2059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20730610191822052, | |
| "step": 555, | |
| "valid_targets_mean": 4166.3, | |
| "valid_targets_min": 1343 | |
| }, | |
| { | |
| "epoch": 0.9076175040518638, | |
| "grad_norm": 0.4440249881059827, | |
| "learning_rate": 3.98947319059189e-05, | |
| "loss": 0.2162, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2121986448764801, | |
| "step": 560, | |
| "valid_targets_mean": 5257.2, | |
| "valid_targets_min": 2014 | |
| }, | |
| { | |
| "epoch": 0.9157212317666127, | |
| "grad_norm": 0.5427347682275363, | |
| "learning_rate": 3.988628793163669e-05, | |
| "loss": 0.2096, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2212689220905304, | |
| "step": 565, | |
| "valid_targets_mean": 4286.0, | |
| "valid_targets_min": 1240 | |
| }, | |
| { | |
| "epoch": 0.9238249594813615, | |
| "grad_norm": 0.4810450530247738, | |
| "learning_rate": 3.987751919640309e-05, | |
| "loss": 0.2119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2069002389907837, | |
| "step": 570, | |
| "valid_targets_mean": 4115.8, | |
| "valid_targets_min": 2249 | |
| }, | |
| { | |
| "epoch": 0.9319286871961102, | |
| "grad_norm": 0.4785049236515675, | |
| "learning_rate": 3.9868425843419445e-05, | |
| "loss": 0.2136, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22010046243667603, | |
| "step": 575, | |
| "valid_targets_mean": 4391.6, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 0.940032414910859, | |
| "grad_norm": 0.5830524896993419, | |
| "learning_rate": 3.985900802118836e-05, | |
| "loss": 0.2331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29115572571754456, | |
| "step": 580, | |
| "valid_targets_mean": 5825.9, | |
| "valid_targets_min": 1375 | |
| }, | |
| { | |
| "epoch": 0.9481361426256077, | |
| "grad_norm": 0.5392327750650897, | |
| "learning_rate": 3.984926588351135e-05, | |
| "loss": 0.2435, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22419336438179016, | |
| "step": 585, | |
| "valid_targets_mean": 4182.1, | |
| "valid_targets_min": 1327 | |
| }, | |
| { | |
| "epoch": 0.9562398703403565, | |
| "grad_norm": 0.5044623427949209, | |
| "learning_rate": 3.983919958948627e-05, | |
| "loss": 0.2206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18491308391094208, | |
| "step": 590, | |
| "valid_targets_mean": 4006.9, | |
| "valid_targets_min": 1321 | |
| }, | |
| { | |
| "epoch": 0.9643435980551054, | |
| "grad_norm": 0.4986732430556064, | |
| "learning_rate": 3.982880930350474e-05, | |
| "loss": 0.2245, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21416428685188293, | |
| "step": 595, | |
| "valid_targets_mean": 4211.7, | |
| "valid_targets_min": 1441 | |
| }, | |
| { | |
| "epoch": 0.9724473257698542, | |
| "grad_norm": 0.5124109728076002, | |
| "learning_rate": 3.981809519524948e-05, | |
| "loss": 0.2138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23443514108657837, | |
| "step": 600, | |
| "valid_targets_mean": 4845.2, | |
| "valid_targets_min": 1626 | |
| }, | |
| { | |
| "epoch": 0.9805510534846029, | |
| "grad_norm": 0.47347249636382777, | |
| "learning_rate": 3.98070574396915e-05, | |
| "loss": 0.2176, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23536278307437897, | |
| "step": 605, | |
| "valid_targets_mean": 4644.9, | |
| "valid_targets_min": 560 | |
| }, | |
| { | |
| "epoch": 0.9886547811993517, | |
| "grad_norm": 0.4741985537892797, | |
| "learning_rate": 3.979569621708727e-05, | |
| "loss": 0.2336, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21886685490608215, | |
| "step": 610, | |
| "valid_targets_mean": 4478.0, | |
| "valid_targets_min": 1523 | |
| }, | |
| { | |
| "epoch": 0.9967585089141004, | |
| "grad_norm": 0.5974210354480042, | |
| "learning_rate": 3.978401171297576e-05, | |
| "loss": 0.2084, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20954114198684692, | |
| "step": 615, | |
| "valid_targets_mean": 3642.8, | |
| "valid_targets_min": 1394 | |
| }, | |
| { | |
| "epoch": 1.0048622366288493, | |
| "grad_norm": 0.5273995541794072, | |
| "learning_rate": 3.9772004118175424e-05, | |
| "loss": 0.226, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2294274866580963, | |
| "step": 620, | |
| "valid_targets_mean": 4028.8, | |
| "valid_targets_min": 1114 | |
| }, | |
| { | |
| "epoch": 1.012965964343598, | |
| "grad_norm": 0.5089306423895958, | |
| "learning_rate": 3.975967362878106e-05, | |
| "loss": 0.1909, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1963168829679489, | |
| "step": 625, | |
| "valid_targets_mean": 4406.2, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 1.0210696920583469, | |
| "grad_norm": 0.6148003496273449, | |
| "learning_rate": 3.9747020446160665e-05, | |
| "loss": 0.1999, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2061430662870407, | |
| "step": 630, | |
| "valid_targets_mean": 4103.2, | |
| "valid_targets_min": 2163 | |
| }, | |
| { | |
| "epoch": 1.0291734197730957, | |
| "grad_norm": 0.5805488016124426, | |
| "learning_rate": 3.973404477695207e-05, | |
| "loss": 0.206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21570029854774475, | |
| "step": 635, | |
| "valid_targets_mean": 4837.1, | |
| "valid_targets_min": 1372 | |
| }, | |
| { | |
| "epoch": 1.0372771474878444, | |
| "grad_norm": 0.5200072266673288, | |
| "learning_rate": 3.972074683305961e-05, | |
| "loss": 0.2049, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21892379224300385, | |
| "step": 640, | |
| "valid_targets_mean": 4104.2, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 1.0453808752025933, | |
| "grad_norm": 0.5120591679389618, | |
| "learning_rate": 3.970712683165066e-05, | |
| "loss": 0.1814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15413756668567657, | |
| "step": 645, | |
| "valid_targets_mean": 3574.6, | |
| "valid_targets_min": 1450 | |
| }, | |
| { | |
| "epoch": 1.053484602917342, | |
| "grad_norm": 0.4986729646028235, | |
| "learning_rate": 3.969318499515208e-05, | |
| "loss": 0.209, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21150606870651245, | |
| "step": 650, | |
| "valid_targets_mean": 4572.9, | |
| "valid_targets_min": 1004 | |
| }, | |
| { | |
| "epoch": 1.0615883306320908, | |
| "grad_norm": 0.4619514783129513, | |
| "learning_rate": 3.967892155124659e-05, | |
| "loss": 0.2158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18860788643360138, | |
| "step": 655, | |
| "valid_targets_mean": 4465.8, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 1.0696920583468394, | |
| "grad_norm": 0.45418605837420345, | |
| "learning_rate": 3.9664336732869044e-05, | |
| "loss": 0.2191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19863459467887878, | |
| "step": 660, | |
| "valid_targets_mean": 4744.1, | |
| "valid_targets_min": 1465 | |
| }, | |
| { | |
| "epoch": 1.0777957860615883, | |
| "grad_norm": 0.4863876337461995, | |
| "learning_rate": 3.964943077820263e-05, | |
| "loss": 0.2065, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24863094091415405, | |
| "step": 665, | |
| "valid_targets_mean": 5111.7, | |
| "valid_targets_min": 1074 | |
| }, | |
| { | |
| "epoch": 1.0858995137763372, | |
| "grad_norm": 0.4769229701415419, | |
| "learning_rate": 3.963420393067499e-05, | |
| "loss": 0.2335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24888065457344055, | |
| "step": 670, | |
| "valid_targets_mean": 4371.7, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 1.0940032414910859, | |
| "grad_norm": 0.5516196801331024, | |
| "learning_rate": 3.961865643895422e-05, | |
| "loss": 0.1992, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16257669031620026, | |
| "step": 675, | |
| "valid_targets_mean": 3767.6, | |
| "valid_targets_min": 1354 | |
| }, | |
| { | |
| "epoch": 1.1021069692058347, | |
| "grad_norm": 0.6012761309137233, | |
| "learning_rate": 3.960278855694484e-05, | |
| "loss": 0.1891, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20597979426383972, | |
| "step": 680, | |
| "valid_targets_mean": 3221.5, | |
| "valid_targets_min": 459 | |
| }, | |
| { | |
| "epoch": 1.1102106969205834, | |
| "grad_norm": 0.5394070740857165, | |
| "learning_rate": 3.958660054378361e-05, | |
| "loss": 0.1949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23972061276435852, | |
| "step": 685, | |
| "valid_targets_mean": 4129.9, | |
| "valid_targets_min": 1312 | |
| }, | |
| { | |
| "epoch": 1.1183144246353323, | |
| "grad_norm": 0.6116321809309343, | |
| "learning_rate": 3.9570092663835334e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1856822669506073, | |
| "step": 690, | |
| "valid_targets_mean": 4818.4, | |
| "valid_targets_min": 1146 | |
| }, | |
| { | |
| "epoch": 1.1264181523500811, | |
| "grad_norm": 0.44563993029204846, | |
| "learning_rate": 3.9553265186688525e-05, | |
| "loss": 0.1994, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21733355522155762, | |
| "step": 695, | |
| "valid_targets_mean": 5456.4, | |
| "valid_targets_min": 1418 | |
| }, | |
| { | |
| "epoch": 1.1345218800648298, | |
| "grad_norm": 0.5097046093454173, | |
| "learning_rate": 3.953611838715101e-05, | |
| "loss": 0.1839, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18223533034324646, | |
| "step": 700, | |
| "valid_targets_mean": 5043.8, | |
| "valid_targets_min": 1353 | |
| }, | |
| { | |
| "epoch": 1.1426256077795787, | |
| "grad_norm": 0.4684279825286209, | |
| "learning_rate": 3.951865254524542e-05, | |
| "loss": 0.2192, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2561621069908142, | |
| "step": 705, | |
| "valid_targets_mean": 5526.7, | |
| "valid_targets_min": 2616 | |
| }, | |
| { | |
| "epoch": 1.1507293354943273, | |
| "grad_norm": 0.4776014506859393, | |
| "learning_rate": 3.950086794620467e-05, | |
| "loss": 0.2056, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20872297883033752, | |
| "step": 710, | |
| "valid_targets_mean": 4848.7, | |
| "valid_targets_min": 707 | |
| }, | |
| { | |
| "epoch": 1.1588330632090762, | |
| "grad_norm": 0.5021057874599553, | |
| "learning_rate": 3.948276488046723e-05, | |
| "loss": 0.2001, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19527600705623627, | |
| "step": 715, | |
| "valid_targets_mean": 4283.8, | |
| "valid_targets_min": 2009 | |
| }, | |
| { | |
| "epoch": 1.1669367909238249, | |
| "grad_norm": 0.516010687565433, | |
| "learning_rate": 3.946434364367243e-05, | |
| "loss": 0.1908, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20048931241035461, | |
| "step": 720, | |
| "valid_targets_mean": 4517.3, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 1.1750405186385737, | |
| "grad_norm": 0.4845283973803967, | |
| "learning_rate": 3.9445604536655625e-05, | |
| "loss": 0.1926, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18405257165431976, | |
| "step": 725, | |
| "valid_targets_mean": 3697.2, | |
| "valid_targets_min": 1421 | |
| }, | |
| { | |
| "epoch": 1.1831442463533226, | |
| "grad_norm": 0.4974308404194371, | |
| "learning_rate": 3.9426547865443264e-05, | |
| "loss": 0.2131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19134216010570526, | |
| "step": 730, | |
| "valid_targets_mean": 3739.6, | |
| "valid_targets_min": 1313 | |
| }, | |
| { | |
| "epoch": 1.1912479740680713, | |
| "grad_norm": 0.4725179442958282, | |
| "learning_rate": 3.940717394124792e-05, | |
| "loss": 0.2091, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21701499819755554, | |
| "step": 735, | |
| "valid_targets_mean": 5500.9, | |
| "valid_targets_min": 2302 | |
| }, | |
| { | |
| "epoch": 1.1993517017828201, | |
| "grad_norm": 0.4966790424314751, | |
| "learning_rate": 3.938748308046318e-05, | |
| "loss": 0.214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2208487093448639, | |
| "step": 740, | |
| "valid_targets_mean": 4056.9, | |
| "valid_targets_min": 1194 | |
| }, | |
| { | |
| "epoch": 1.2074554294975688, | |
| "grad_norm": 0.44190871443509405, | |
| "learning_rate": 3.9367475604658485e-05, | |
| "loss": 0.2289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22573363780975342, | |
| "step": 745, | |
| "valid_targets_mean": 4984.1, | |
| "valid_targets_min": 628 | |
| }, | |
| { | |
| "epoch": 1.2155591572123177, | |
| "grad_norm": 0.4969917236201204, | |
| "learning_rate": 3.9347151840573915e-05, | |
| "loss": 0.2155, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2162916660308838, | |
| "step": 750, | |
| "valid_targets_mean": 3742.9, | |
| "valid_targets_min": 1585 | |
| }, | |
| { | |
| "epoch": 1.2236628849270665, | |
| "grad_norm": 0.49509015258174144, | |
| "learning_rate": 3.932651212011479e-05, | |
| "loss": 0.1949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18598730862140656, | |
| "step": 755, | |
| "valid_targets_mean": 4200.9, | |
| "valid_targets_min": 1582 | |
| }, | |
| { | |
| "epoch": 1.2317666126418152, | |
| "grad_norm": 0.43884225031680624, | |
| "learning_rate": 3.930555678034628e-05, | |
| "loss": 0.2025, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21229609847068787, | |
| "step": 760, | |
| "valid_targets_mean": 4642.7, | |
| "valid_targets_min": 1069 | |
| }, | |
| { | |
| "epoch": 1.239870340356564, | |
| "grad_norm": 0.5122574928153316, | |
| "learning_rate": 3.9284286163487924e-05, | |
| "loss": 0.2053, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1989661157131195, | |
| "step": 765, | |
| "valid_targets_mean": 4026.2, | |
| "valid_targets_min": 1399 | |
| }, | |
| { | |
| "epoch": 1.2479740680713127, | |
| "grad_norm": 0.5134191353676413, | |
| "learning_rate": 3.9262700616908006e-05, | |
| "loss": 0.1986, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.239942729473114, | |
| "step": 770, | |
| "valid_targets_mean": 4500.2, | |
| "valid_targets_min": 1423 | |
| }, | |
| { | |
| "epoch": 1.2560777957860616, | |
| "grad_norm": 0.458384033906824, | |
| "learning_rate": 3.924080049311791e-05, | |
| "loss": 0.2131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18945372104644775, | |
| "step": 775, | |
| "valid_targets_mean": 4252.0, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 1.2641815235008105, | |
| "grad_norm": 0.42458575412199884, | |
| "learning_rate": 3.921858614976632e-05, | |
| "loss": 0.2064, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20727957785129547, | |
| "step": 780, | |
| "valid_targets_mean": 5022.1, | |
| "valid_targets_min": 2198 | |
| }, | |
| { | |
| "epoch": 1.2722852512155591, | |
| "grad_norm": 0.51192900661981, | |
| "learning_rate": 3.919605794963343e-05, | |
| "loss": 0.1931, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18662604689598083, | |
| "step": 785, | |
| "valid_targets_mean": 3851.8, | |
| "valid_targets_min": 1488 | |
| }, | |
| { | |
| "epoch": 1.280388978930308, | |
| "grad_norm": 0.4668022934402429, | |
| "learning_rate": 3.9173216260624994e-05, | |
| "loss": 0.2029, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19831231236457825, | |
| "step": 790, | |
| "valid_targets_mean": 3995.3, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 1.2884927066450567, | |
| "grad_norm": 0.4703228446731573, | |
| "learning_rate": 3.915006145576632e-05, | |
| "loss": 0.1855, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17850956320762634, | |
| "step": 795, | |
| "valid_targets_mean": 4885.1, | |
| "valid_targets_min": 1423 | |
| }, | |
| { | |
| "epoch": 1.2965964343598055, | |
| "grad_norm": 0.5125676461591195, | |
| "learning_rate": 3.912659391319617e-05, | |
| "loss": 0.1978, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16043990850448608, | |
| "step": 800, | |
| "valid_targets_mean": 3561.5, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 1.3047001620745542, | |
| "grad_norm": 0.43433575074873904, | |
| "learning_rate": 3.910281401616061e-05, | |
| "loss": 0.1966, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17930254340171814, | |
| "step": 805, | |
| "valid_targets_mean": 4081.4, | |
| "valid_targets_min": 1332 | |
| }, | |
| { | |
| "epoch": 1.312803889789303, | |
| "grad_norm": 0.42220097462239276, | |
| "learning_rate": 3.907872215300672e-05, | |
| "loss": 0.1946, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16810905933380127, | |
| "step": 810, | |
| "valid_targets_mean": 3859.5, | |
| "valid_targets_min": 1351 | |
| }, | |
| { | |
| "epoch": 1.320907617504052, | |
| "grad_norm": 0.426999664447986, | |
| "learning_rate": 3.905431871717628e-05, | |
| "loss": 0.178, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19243112206459045, | |
| "step": 815, | |
| "valid_targets_mean": 4594.4, | |
| "valid_targets_min": 1473 | |
| }, | |
| { | |
| "epoch": 1.3290113452188006, | |
| "grad_norm": 0.5021614931496107, | |
| "learning_rate": 3.902960410719931e-05, | |
| "loss": 0.2196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.242889404296875, | |
| "step": 820, | |
| "valid_targets_mean": 3985.9, | |
| "valid_targets_min": 1472 | |
| }, | |
| { | |
| "epoch": 1.3371150729335495, | |
| "grad_norm": 0.46392047570045136, | |
| "learning_rate": 3.900457872668763e-05, | |
| "loss": 0.1977, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1502235233783722, | |
| "step": 825, | |
| "valid_targets_mean": 3738.2, | |
| "valid_targets_min": 1018 | |
| }, | |
| { | |
| "epoch": 1.3452188006482984, | |
| "grad_norm": 0.4118848660851366, | |
| "learning_rate": 3.897924298432816e-05, | |
| "loss": 0.2062, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2182319462299347, | |
| "step": 830, | |
| "valid_targets_mean": 5490.6, | |
| "valid_targets_min": 2289 | |
| }, | |
| { | |
| "epoch": 1.353322528363047, | |
| "grad_norm": 0.4280305041940191, | |
| "learning_rate": 3.895359729387636e-05, | |
| "loss": 0.2046, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21427130699157715, | |
| "step": 835, | |
| "valid_targets_mean": 4321.1, | |
| "valid_targets_min": 1300 | |
| }, | |
| { | |
| "epoch": 1.3614262560777957, | |
| "grad_norm": 0.46406432789212704, | |
| "learning_rate": 3.892764207414939e-05, | |
| "loss": 0.1929, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19561821222305298, | |
| "step": 840, | |
| "valid_targets_mean": 4353.8, | |
| "valid_targets_min": 1442 | |
| }, | |
| { | |
| "epoch": 1.3695299837925445, | |
| "grad_norm": 0.4179681529478451, | |
| "learning_rate": 3.890137774901931e-05, | |
| "loss": 0.2092, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21955807507038116, | |
| "step": 845, | |
| "valid_targets_mean": 5324.8, | |
| "valid_targets_min": 2282 | |
| }, | |
| { | |
| "epoch": 1.3776337115072934, | |
| "grad_norm": 0.48683442708968644, | |
| "learning_rate": 3.887480474740616e-05, | |
| "loss": 0.1992, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.204122394323349, | |
| "step": 850, | |
| "valid_targets_mean": 4110.2, | |
| "valid_targets_min": 970 | |
| }, | |
| { | |
| "epoch": 1.385737439222042, | |
| "grad_norm": 0.42361528043877883, | |
| "learning_rate": 3.8847923503270915e-05, | |
| "loss": 0.1964, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18733659386634827, | |
| "step": 855, | |
| "valid_targets_mean": 4214.3, | |
| "valid_targets_min": 1427 | |
| }, | |
| { | |
| "epoch": 1.393841166936791, | |
| "grad_norm": 0.4732297962200014, | |
| "learning_rate": 3.8820734455608464e-05, | |
| "loss": 0.1971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19635385274887085, | |
| "step": 860, | |
| "valid_targets_mean": 3787.2, | |
| "valid_targets_min": 1393 | |
| }, | |
| { | |
| "epoch": 1.4019448946515398, | |
| "grad_norm": 0.4337879170512228, | |
| "learning_rate": 3.879323804844038e-05, | |
| "loss": 0.2039, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2074633687734604, | |
| "step": 865, | |
| "valid_targets_mean": 4273.1, | |
| "valid_targets_min": 1281 | |
| }, | |
| { | |
| "epoch": 1.4100486223662885, | |
| "grad_norm": 0.4808999534174423, | |
| "learning_rate": 3.876543473080771e-05, | |
| "loss": 0.2076, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.243323415517807, | |
| "step": 870, | |
| "valid_targets_mean": 4553.7, | |
| "valid_targets_min": 1448 | |
| }, | |
| { | |
| "epoch": 1.4181523500810373, | |
| "grad_norm": 0.43158317153388576, | |
| "learning_rate": 3.87373249567636e-05, | |
| "loss": 0.1884, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18346458673477173, | |
| "step": 875, | |
| "valid_targets_mean": 4949.8, | |
| "valid_targets_min": 2071 | |
| }, | |
| { | |
| "epoch": 1.426256077795786, | |
| "grad_norm": 0.5068184567977475, | |
| "learning_rate": 3.870890918536592e-05, | |
| "loss": 0.1794, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1928868591785431, | |
| "step": 880, | |
| "valid_targets_mean": 4209.9, | |
| "valid_targets_min": 1343 | |
| }, | |
| { | |
| "epoch": 1.4343598055105349, | |
| "grad_norm": 0.47287026057995374, | |
| "learning_rate": 3.8680187880669765e-05, | |
| "loss": 0.1914, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18420298397541046, | |
| "step": 885, | |
| "valid_targets_mean": 3873.4, | |
| "valid_targets_min": 1356 | |
| }, | |
| { | |
| "epoch": 1.4424635332252835, | |
| "grad_norm": 0.43700819097301596, | |
| "learning_rate": 3.865116151171983e-05, | |
| "loss": 0.2021, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.164870485663414, | |
| "step": 890, | |
| "valid_targets_mean": 3684.1, | |
| "valid_targets_min": 1318 | |
| }, | |
| { | |
| "epoch": 1.4505672609400324, | |
| "grad_norm": 0.47511708443362843, | |
| "learning_rate": 3.8621830552542826e-05, | |
| "loss": 0.1893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17125675082206726, | |
| "step": 895, | |
| "valid_targets_mean": 3790.2, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 1.4586709886547813, | |
| "grad_norm": 0.4709799341765991, | |
| "learning_rate": 3.859219548213965e-05, | |
| "loss": 0.1955, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19907286763191223, | |
| "step": 900, | |
| "valid_targets_mean": 3615.4, | |
| "valid_targets_min": 1886 | |
| }, | |
| { | |
| "epoch": 1.46677471636953, | |
| "grad_norm": 0.45038440332952534, | |
| "learning_rate": 3.856225678447764e-05, | |
| "loss": 0.212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19604134559631348, | |
| "step": 905, | |
| "valid_targets_mean": 3964.0, | |
| "valid_targets_min": 1365 | |
| }, | |
| { | |
| "epoch": 1.4748784440842788, | |
| "grad_norm": 0.4239032241495173, | |
| "learning_rate": 3.853201494848263e-05, | |
| "loss": 0.1836, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17849305272102356, | |
| "step": 910, | |
| "valid_targets_mean": 4116.5, | |
| "valid_targets_min": 736 | |
| }, | |
| { | |
| "epoch": 1.4829821717990275, | |
| "grad_norm": 0.44353633252428387, | |
| "learning_rate": 3.8501470468030984e-05, | |
| "loss": 0.1886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1747392863035202, | |
| "step": 915, | |
| "valid_targets_mean": 4118.1, | |
| "valid_targets_min": 1576 | |
| }, | |
| { | |
| "epoch": 1.4910858995137763, | |
| "grad_norm": 0.47312911378928535, | |
| "learning_rate": 3.84706238419415e-05, | |
| "loss": 0.206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21894387900829315, | |
| "step": 920, | |
| "valid_targets_mean": 4172.2, | |
| "valid_targets_min": 1400 | |
| }, | |
| { | |
| "epoch": 1.499189627228525, | |
| "grad_norm": 0.4761430342519301, | |
| "learning_rate": 3.8439475573967315e-05, | |
| "loss": 0.1981, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18914563953876495, | |
| "step": 925, | |
| "valid_targets_mean": 3912.1, | |
| "valid_targets_min": 640 | |
| }, | |
| { | |
| "epoch": 1.5072933549432739, | |
| "grad_norm": 0.43698775643113397, | |
| "learning_rate": 3.840802617278763e-05, | |
| "loss": 0.2046, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2051890343427658, | |
| "step": 930, | |
| "valid_targets_mean": 4464.6, | |
| "valid_targets_min": 1484 | |
| }, | |
| { | |
| "epoch": 1.5153970826580228, | |
| "grad_norm": 0.40937497215437296, | |
| "learning_rate": 3.837627615199942e-05, | |
| "loss": 0.1794, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21246007084846497, | |
| "step": 935, | |
| "valid_targets_mean": 5861.9, | |
| "valid_targets_min": 1433 | |
| }, | |
| { | |
| "epoch": 1.5235008103727714, | |
| "grad_norm": 0.4612933987642411, | |
| "learning_rate": 3.834422603010906e-05, | |
| "loss": 0.2165, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1685883104801178, | |
| "step": 940, | |
| "valid_targets_mean": 3331.0, | |
| "valid_targets_min": 952 | |
| }, | |
| { | |
| "epoch": 1.5316045380875203, | |
| "grad_norm": 0.44748181235308926, | |
| "learning_rate": 3.831187633052384e-05, | |
| "loss": 0.1926, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18851560354232788, | |
| "step": 945, | |
| "valid_targets_mean": 4319.6, | |
| "valid_targets_min": 1497 | |
| }, | |
| { | |
| "epoch": 1.5397082658022692, | |
| "grad_norm": 0.4645741670739686, | |
| "learning_rate": 3.8279227581543424e-05, | |
| "loss": 0.1944, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17598797380924225, | |
| "step": 950, | |
| "valid_targets_mean": 3860.6, | |
| "valid_targets_min": 1325 | |
| }, | |
| { | |
| "epoch": 1.5478119935170178, | |
| "grad_norm": 0.4446178110855918, | |
| "learning_rate": 3.824628031635122e-05, | |
| "loss": 0.2109, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20480293035507202, | |
| "step": 955, | |
| "valid_targets_mean": 5071.1, | |
| "valid_targets_min": 1335 | |
| }, | |
| { | |
| "epoch": 1.5559157212317665, | |
| "grad_norm": 0.43176875691929406, | |
| "learning_rate": 3.8213035073005665e-05, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.195458322763443, | |
| "step": 960, | |
| "valid_targets_mean": 4873.1, | |
| "valid_targets_min": 1731 | |
| }, | |
| { | |
| "epoch": 1.5640194489465153, | |
| "grad_norm": 0.4443525213532384, | |
| "learning_rate": 3.817949239443147e-05, | |
| "loss": 0.2033, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20885604619979858, | |
| "step": 965, | |
| "valid_targets_mean": 4482.3, | |
| "valid_targets_min": 2171 | |
| }, | |
| { | |
| "epoch": 1.5721231766612642, | |
| "grad_norm": 0.39045971509248545, | |
| "learning_rate": 3.814565282841071e-05, | |
| "loss": 0.1709, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18435224890708923, | |
| "step": 970, | |
| "valid_targets_mean": 4966.6, | |
| "valid_targets_min": 1448 | |
| }, | |
| { | |
| "epoch": 1.5802269043760129, | |
| "grad_norm": 0.42154294945737253, | |
| "learning_rate": 3.81115169275739e-05, | |
| "loss": 0.193, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14854902029037476, | |
| "step": 975, | |
| "valid_targets_mean": 3888.8, | |
| "valid_targets_min": 1378 | |
| }, | |
| { | |
| "epoch": 1.5883306320907618, | |
| "grad_norm": 0.42988574528808304, | |
| "learning_rate": 3.8077085249390997e-05, | |
| "loss": 0.2002, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2115168422460556, | |
| "step": 980, | |
| "valid_targets_mean": 4769.3, | |
| "valid_targets_min": 1112 | |
| }, | |
| { | |
| "epoch": 1.5964343598055106, | |
| "grad_norm": 0.40490899317079226, | |
| "learning_rate": 3.8042358356162215e-05, | |
| "loss": 0.1967, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20156720280647278, | |
| "step": 985, | |
| "valid_targets_mean": 4670.4, | |
| "valid_targets_min": 1184 | |
| }, | |
| { | |
| "epoch": 1.6045380875202593, | |
| "grad_norm": 0.3933960356420022, | |
| "learning_rate": 3.8007336815008945e-05, | |
| "loss": 0.1945, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1848571002483368, | |
| "step": 990, | |
| "valid_targets_mean": 4553.4, | |
| "valid_targets_min": 1393 | |
| }, | |
| { | |
| "epoch": 1.6126418152350082, | |
| "grad_norm": 0.425722007437382, | |
| "learning_rate": 3.797202119786442e-05, | |
| "loss": 0.186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1796191930770874, | |
| "step": 995, | |
| "valid_targets_mean": 4485.4, | |
| "valid_targets_min": 1558 | |
| }, | |
| { | |
| "epoch": 1.620745542949757, | |
| "grad_norm": 0.38964048320391237, | |
| "learning_rate": 3.79364120814644e-05, | |
| "loss": 0.2208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22639715671539307, | |
| "step": 1000, | |
| "valid_targets_mean": 5330.9, | |
| "valid_targets_min": 1302 | |
| }, | |
| { | |
| "epoch": 1.6288492706645057, | |
| "grad_norm": 0.41626273486480475, | |
| "learning_rate": 3.790051004733775e-05, | |
| "loss": 0.1822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2120615541934967, | |
| "step": 1005, | |
| "valid_targets_mean": 5283.8, | |
| "valid_targets_min": 1492 | |
| }, | |
| { | |
| "epoch": 1.6369529983792543, | |
| "grad_norm": 0.5147342689508975, | |
| "learning_rate": 3.786431568179697e-05, | |
| "loss": 0.2004, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20870164036750793, | |
| "step": 1010, | |
| "valid_targets_mean": 4583.5, | |
| "valid_targets_min": 1383 | |
| }, | |
| { | |
| "epoch": 1.6450567260940032, | |
| "grad_norm": 0.440817874619879, | |
| "learning_rate": 3.7827829575928536e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18191151320934296, | |
| "step": 1015, | |
| "valid_targets_mean": 3649.4, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 1.653160453808752, | |
| "grad_norm": 0.472655029247844, | |
| "learning_rate": 3.779105232558337e-05, | |
| "loss": 0.2106, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25538885593414307, | |
| "step": 1020, | |
| "valid_targets_mean": 6181.6, | |
| "valid_targets_min": 1234 | |
| }, | |
| { | |
| "epoch": 1.6612641815235007, | |
| "grad_norm": 0.4216306868266021, | |
| "learning_rate": 3.775398453136701e-05, | |
| "loss": 0.1903, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17359581589698792, | |
| "step": 1025, | |
| "valid_targets_mean": 3975.2, | |
| "valid_targets_min": 736 | |
| }, | |
| { | |
| "epoch": 1.6693679092382496, | |
| "grad_norm": 0.39959055016658535, | |
| "learning_rate": 3.771662679862984e-05, | |
| "loss": 0.2012, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18280324339866638, | |
| "step": 1030, | |
| "valid_targets_mean": 4821.5, | |
| "valid_targets_min": 2264 | |
| }, | |
| { | |
| "epoch": 1.6774716369529985, | |
| "grad_norm": 0.4438321730012118, | |
| "learning_rate": 3.767897973745721e-05, | |
| "loss": 0.1941, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20099571347236633, | |
| "step": 1035, | |
| "valid_targets_mean": 4090.6, | |
| "valid_targets_min": 1036 | |
| }, | |
| { | |
| "epoch": 1.6855753646677472, | |
| "grad_norm": 0.4291135125758818, | |
| "learning_rate": 3.764104396265946e-05, | |
| "loss": 0.2081, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2137276828289032, | |
| "step": 1040, | |
| "valid_targets_mean": 4530.7, | |
| "valid_targets_min": 1375 | |
| }, | |
| { | |
| "epoch": 1.6936790923824958, | |
| "grad_norm": 0.4440714946429937, | |
| "learning_rate": 3.760282009376187e-05, | |
| "loss": 0.1952, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20279011130332947, | |
| "step": 1045, | |
| "valid_targets_mean": 4290.0, | |
| "valid_targets_min": 1269 | |
| }, | |
| { | |
| "epoch": 1.7017828200972447, | |
| "grad_norm": 0.46369261424562613, | |
| "learning_rate": 3.756430875499458e-05, | |
| "loss": 0.2035, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18480262160301208, | |
| "step": 1050, | |
| "valid_targets_mean": 3838.8, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 1.7098865478119936, | |
| "grad_norm": 0.4092457788481106, | |
| "learning_rate": 3.7525510575282354e-05, | |
| "loss": 0.1951, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22688287496566772, | |
| "step": 1055, | |
| "valid_targets_mean": 5638.9, | |
| "valid_targets_min": 1159 | |
| }, | |
| { | |
| "epoch": 1.7179902755267422, | |
| "grad_norm": 0.512732774310131, | |
| "learning_rate": 3.748642618823432e-05, | |
| "loss": 0.2242, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1761859655380249, | |
| "step": 1060, | |
| "valid_targets_mean": 3192.5, | |
| "valid_targets_min": 862 | |
| }, | |
| { | |
| "epoch": 1.726094003241491, | |
| "grad_norm": 0.44671104737829626, | |
| "learning_rate": 3.744705623213365e-05, | |
| "loss": 0.1995, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18772464990615845, | |
| "step": 1065, | |
| "valid_targets_mean": 3969.5, | |
| "valid_targets_min": 1310 | |
| }, | |
| { | |
| "epoch": 1.73419773095624, | |
| "grad_norm": 0.43601883404552333, | |
| "learning_rate": 3.74074013499271e-05, | |
| "loss": 0.1994, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2021428793668747, | |
| "step": 1070, | |
| "valid_targets_mean": 5082.8, | |
| "valid_targets_min": 1369 | |
| }, | |
| { | |
| "epoch": 1.7423014586709886, | |
| "grad_norm": 0.421028623668563, | |
| "learning_rate": 3.736746218921453e-05, | |
| "loss": 0.1875, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16165471076965332, | |
| "step": 1075, | |
| "valid_targets_mean": 4049.3, | |
| "valid_targets_min": 1509 | |
| }, | |
| { | |
| "epoch": 1.7504051863857373, | |
| "grad_norm": 0.4333329574828141, | |
| "learning_rate": 3.732723940223832e-05, | |
| "loss": 0.2059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2255226969718933, | |
| "step": 1080, | |
| "valid_targets_mean": 4857.2, | |
| "valid_targets_min": 2369 | |
| }, | |
| { | |
| "epoch": 1.7585089141004864, | |
| "grad_norm": 0.8093577300516263, | |
| "learning_rate": 3.7286733645872726e-05, | |
| "loss": 0.1845, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16192904114723206, | |
| "step": 1085, | |
| "valid_targets_mean": 3395.7, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 1.766612641815235, | |
| "grad_norm": 0.4303236270123881, | |
| "learning_rate": 3.724594558161315e-05, | |
| "loss": 0.2106, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1886795163154602, | |
| "step": 1090, | |
| "valid_targets_mean": 4195.4, | |
| "valid_targets_min": 1335 | |
| }, | |
| { | |
| "epoch": 1.7747163695299837, | |
| "grad_norm": 0.4271045023656605, | |
| "learning_rate": 3.720487587556534e-05, | |
| "loss": 0.1897, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22572541236877441, | |
| "step": 1095, | |
| "valid_targets_mean": 4628.6, | |
| "valid_targets_min": 1527 | |
| }, | |
| { | |
| "epoch": 1.7828200972447326, | |
| "grad_norm": 0.4168865221240875, | |
| "learning_rate": 3.716352519843448e-05, | |
| "loss": 0.2151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22260594367980957, | |
| "step": 1100, | |
| "valid_targets_mean": 5321.4, | |
| "valid_targets_min": 1732 | |
| }, | |
| { | |
| "epoch": 1.7909238249594814, | |
| "grad_norm": 0.42384928589160936, | |
| "learning_rate": 3.712189422551431e-05, | |
| "loss": 0.1946, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18755793571472168, | |
| "step": 1105, | |
| "valid_targets_mean": 4328.8, | |
| "valid_targets_min": 1434 | |
| }, | |
| { | |
| "epoch": 1.79902755267423, | |
| "grad_norm": 0.41646753010972154, | |
| "learning_rate": 3.7079983636676005e-05, | |
| "loss": 0.2057, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20401719212532043, | |
| "step": 1110, | |
| "valid_targets_mean": 5051.8, | |
| "valid_targets_min": 1410 | |
| }, | |
| { | |
| "epoch": 1.807131280388979, | |
| "grad_norm": 0.3813398032055785, | |
| "learning_rate": 3.703779411635714e-05, | |
| "loss": 0.2204, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1891724020242691, | |
| "step": 1115, | |
| "valid_targets_mean": 5103.3, | |
| "valid_targets_min": 1392 | |
| }, | |
| { | |
| "epoch": 1.8152350081037278, | |
| "grad_norm": 0.46573526542742605, | |
| "learning_rate": 3.6995326353550515e-05, | |
| "loss": 0.1888, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17837417125701904, | |
| "step": 1120, | |
| "valid_targets_mean": 3876.1, | |
| "valid_targets_min": 1297 | |
| }, | |
| { | |
| "epoch": 1.8233387358184765, | |
| "grad_norm": 0.4534625642159336, | |
| "learning_rate": 3.695258104179284e-05, | |
| "loss": 0.1903, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1509757936000824, | |
| "step": 1125, | |
| "valid_targets_mean": 3438.0, | |
| "valid_targets_min": 1239 | |
| }, | |
| { | |
| "epoch": 1.8314424635332252, | |
| "grad_norm": 0.4506591558299602, | |
| "learning_rate": 3.690955887915347e-05, | |
| "loss": 0.1873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1877855360507965, | |
| "step": 1130, | |
| "valid_targets_mean": 4014.6, | |
| "valid_targets_min": 1346 | |
| }, | |
| { | |
| "epoch": 1.839546191247974, | |
| "grad_norm": 0.4008424296471736, | |
| "learning_rate": 3.686626056822298e-05, | |
| "loss": 0.1703, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1591203808784485, | |
| "step": 1135, | |
| "valid_targets_mean": 4374.0, | |
| "valid_targets_min": 1357 | |
| }, | |
| { | |
| "epoch": 1.847649918962723, | |
| "grad_norm": 0.43014942361797626, | |
| "learning_rate": 3.682268681610169e-05, | |
| "loss": 0.2214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21824797987937927, | |
| "step": 1140, | |
| "valid_targets_mean": 4494.2, | |
| "valid_targets_min": 1408 | |
| }, | |
| { | |
| "epoch": 1.8557536466774716, | |
| "grad_norm": 0.42136818327229264, | |
| "learning_rate": 3.677883833438814e-05, | |
| "loss": 0.1746, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1907494217157364, | |
| "step": 1145, | |
| "valid_targets_mean": 4776.2, | |
| "valid_targets_min": 1475 | |
| }, | |
| { | |
| "epoch": 1.8638573743922204, | |
| "grad_norm": 0.4412420739828771, | |
| "learning_rate": 3.6734715839167427e-05, | |
| "loss": 0.1845, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16720981895923615, | |
| "step": 1150, | |
| "valid_targets_mean": 3630.3, | |
| "valid_targets_min": 1298 | |
| }, | |
| { | |
| "epoch": 1.8719611021069693, | |
| "grad_norm": 0.4415751980816361, | |
| "learning_rate": 3.669032005099954e-05, | |
| "loss": 0.229, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2498154640197754, | |
| "step": 1155, | |
| "valid_targets_mean": 5633.4, | |
| "valid_targets_min": 1512 | |
| }, | |
| { | |
| "epoch": 1.880064829821718, | |
| "grad_norm": 0.40952145032072795, | |
| "learning_rate": 3.664565169490761e-05, | |
| "loss": 0.1897, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18677553534507751, | |
| "step": 1160, | |
| "valid_targets_mean": 4367.2, | |
| "valid_targets_min": 1187 | |
| }, | |
| { | |
| "epoch": 1.8881685575364666, | |
| "grad_norm": 0.489258878774405, | |
| "learning_rate": 3.660071150036602e-05, | |
| "loss": 0.1921, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20160531997680664, | |
| "step": 1165, | |
| "valid_targets_mean": 4219.1, | |
| "valid_targets_min": 1345 | |
| }, | |
| { | |
| "epoch": 1.8962722852512157, | |
| "grad_norm": 0.4525265915982652, | |
| "learning_rate": 3.655550020128852e-05, | |
| "loss": 0.2162, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2606647312641144, | |
| "step": 1170, | |
| "valid_targets_mean": 4542.6, | |
| "valid_targets_min": 1351 | |
| }, | |
| { | |
| "epoch": 1.9043760129659644, | |
| "grad_norm": 0.39049624581473075, | |
| "learning_rate": 3.651001853601627e-05, | |
| "loss": 0.1929, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17106109857559204, | |
| "step": 1175, | |
| "valid_targets_mean": 4377.2, | |
| "valid_targets_min": 808 | |
| }, | |
| { | |
| "epoch": 1.912479740680713, | |
| "grad_norm": 0.39226733614625686, | |
| "learning_rate": 3.646426724730572e-05, | |
| "loss": 0.1998, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21923178434371948, | |
| "step": 1180, | |
| "valid_targets_mean": 4946.4, | |
| "valid_targets_min": 1406 | |
| }, | |
| { | |
| "epoch": 1.920583468395462, | |
| "grad_norm": 0.3831318298196707, | |
| "learning_rate": 3.641824708231653e-05, | |
| "loss": 0.194, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23360614478588104, | |
| "step": 1185, | |
| "valid_targets_mean": 6560.4, | |
| "valid_targets_min": 2831 | |
| }, | |
| { | |
| "epoch": 1.9286871961102108, | |
| "grad_norm": 0.39667541079665347, | |
| "learning_rate": 3.6371958792599346e-05, | |
| "loss": 0.1888, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15785202383995056, | |
| "step": 1190, | |
| "valid_targets_mean": 4117.3, | |
| "valid_targets_min": 1326 | |
| }, | |
| { | |
| "epoch": 1.9367909238249594, | |
| "grad_norm": 0.44441348853109325, | |
| "learning_rate": 3.632540313408353e-05, | |
| "loss": 0.2014, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15401755273342133, | |
| "step": 1195, | |
| "valid_targets_mean": 3274.1, | |
| "valid_targets_min": 1398 | |
| }, | |
| { | |
| "epoch": 1.9448946515397083, | |
| "grad_norm": 0.4016343317461078, | |
| "learning_rate": 3.627858086706483e-05, | |
| "loss": 0.1831, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17341655492782593, | |
| "step": 1200, | |
| "valid_targets_mean": 4483.2, | |
| "valid_targets_min": 1306 | |
| }, | |
| { | |
| "epoch": 1.9529983792544572, | |
| "grad_norm": 0.39527315259402573, | |
| "learning_rate": 3.6231492756192924e-05, | |
| "loss": 0.1984, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22069337964057922, | |
| "step": 1205, | |
| "valid_targets_mean": 5291.3, | |
| "valid_targets_min": 827 | |
| }, | |
| { | |
| "epoch": 1.9611021069692058, | |
| "grad_norm": 0.47431992205560747, | |
| "learning_rate": 3.618413957045899e-05, | |
| "loss": 0.1775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19169411063194275, | |
| "step": 1210, | |
| "valid_targets_mean": 3384.5, | |
| "valid_targets_min": 1366 | |
| }, | |
| { | |
| "epoch": 1.9692058346839545, | |
| "grad_norm": 0.42588976710081894, | |
| "learning_rate": 3.6136522083183096e-05, | |
| "loss": 0.194, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1703414022922516, | |
| "step": 1215, | |
| "valid_targets_mean": 4210.9, | |
| "valid_targets_min": 1380 | |
| }, | |
| { | |
| "epoch": 1.9773095623987034, | |
| "grad_norm": 0.3993131140715784, | |
| "learning_rate": 3.608864107200159e-05, | |
| "loss": 0.1883, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21244439482688904, | |
| "step": 1220, | |
| "valid_targets_mean": 4543.6, | |
| "valid_targets_min": 2047 | |
| }, | |
| { | |
| "epoch": 1.9854132901134522, | |
| "grad_norm": 0.4344328322329922, | |
| "learning_rate": 3.604049731885442e-05, | |
| "loss": 0.2028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18995779752731323, | |
| "step": 1225, | |
| "valid_targets_mean": 4069.1, | |
| "valid_targets_min": 1394 | |
| }, | |
| { | |
| "epoch": 1.993517017828201, | |
| "grad_norm": 0.4400446086567771, | |
| "learning_rate": 3.599209160997234e-05, | |
| "loss": 0.213, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21253399550914764, | |
| "step": 1230, | |
| "valid_targets_mean": 4556.4, | |
| "valid_targets_min": 1281 | |
| }, | |
| { | |
| "epoch": 2.0016207455429496, | |
| "grad_norm": 0.376589399977183, | |
| "learning_rate": 3.594342473586406e-05, | |
| "loss": 0.1758, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1640516221523285, | |
| "step": 1235, | |
| "valid_targets_mean": 5084.4, | |
| "valid_targets_min": 1255 | |
| }, | |
| { | |
| "epoch": 2.0097244732576987, | |
| "grad_norm": 0.48073326523872145, | |
| "learning_rate": 3.589449749130337e-05, | |
| "loss": 0.174, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18342489004135132, | |
| "step": 1240, | |
| "valid_targets_mean": 3867.4, | |
| "valid_targets_min": 1240 | |
| }, | |
| { | |
| "epoch": 2.0178282009724473, | |
| "grad_norm": 0.3727414184617058, | |
| "learning_rate": 3.5845310675316134e-05, | |
| "loss": 0.1988, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20190215110778809, | |
| "step": 1245, | |
| "valid_targets_mean": 5865.1, | |
| "valid_targets_min": 1351 | |
| }, | |
| { | |
| "epoch": 2.025931928687196, | |
| "grad_norm": 0.4395786818226242, | |
| "learning_rate": 3.5795865091167266e-05, | |
| "loss": 0.1887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18115079402923584, | |
| "step": 1250, | |
| "valid_targets_mean": 3741.1, | |
| "valid_targets_min": 1344 | |
| }, | |
| { | |
| "epoch": 2.034035656401945, | |
| "grad_norm": 0.3870970421019821, | |
| "learning_rate": 3.574616154634758e-05, | |
| "loss": 0.1707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15488159656524658, | |
| "step": 1255, | |
| "valid_targets_mean": 4509.9, | |
| "valid_targets_min": 1438 | |
| }, | |
| { | |
| "epoch": 2.0421393841166937, | |
| "grad_norm": 0.3885177527861475, | |
| "learning_rate": 3.569620085256062e-05, | |
| "loss": 0.19, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14700886607170105, | |
| "step": 1260, | |
| "valid_targets_mean": 4300.9, | |
| "valid_targets_min": 1420 | |
| }, | |
| { | |
| "epoch": 2.0502431118314424, | |
| "grad_norm": 0.44324292584585834, | |
| "learning_rate": 3.564598382570943e-05, | |
| "loss": 0.185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14767037332057953, | |
| "step": 1265, | |
| "valid_targets_mean": 3636.4, | |
| "valid_targets_min": 1430 | |
| }, | |
| { | |
| "epoch": 2.0583468395461915, | |
| "grad_norm": 0.397763872763852, | |
| "learning_rate": 3.559551128588316e-05, | |
| "loss": 0.1878, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16453808546066284, | |
| "step": 1270, | |
| "valid_targets_mean": 4420.4, | |
| "valid_targets_min": 1350 | |
| }, | |
| { | |
| "epoch": 2.06645056726094, | |
| "grad_norm": 0.4902672391158386, | |
| "learning_rate": 3.5544784057343746e-05, | |
| "loss": 0.1839, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16574980318546295, | |
| "step": 1275, | |
| "valid_targets_mean": 4362.5, | |
| "valid_targets_min": 1247 | |
| }, | |
| { | |
| "epoch": 2.0745542949756888, | |
| "grad_norm": 0.4352472097503133, | |
| "learning_rate": 3.549380296851242e-05, | |
| "loss": 0.1824, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19808019697666168, | |
| "step": 1280, | |
| "valid_targets_mean": 4214.6, | |
| "valid_targets_min": 1297 | |
| }, | |
| { | |
| "epoch": 2.0826580226904374, | |
| "grad_norm": 0.4536384304875176, | |
| "learning_rate": 3.544256885195616e-05, | |
| "loss": 0.1807, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2267465442419052, | |
| "step": 1285, | |
| "valid_targets_mean": 4676.0, | |
| "valid_targets_min": 1323 | |
| }, | |
| { | |
| "epoch": 2.0907617504051865, | |
| "grad_norm": 0.4698011780277878, | |
| "learning_rate": 3.539108254437414e-05, | |
| "loss": 0.1921, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25781187415122986, | |
| "step": 1290, | |
| "valid_targets_mean": 4513.7, | |
| "valid_targets_min": 1157 | |
| }, | |
| { | |
| "epoch": 2.098865478119935, | |
| "grad_norm": 0.4401070741229321, | |
| "learning_rate": 3.5339344886584015e-05, | |
| "loss": 0.1886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21556992828845978, | |
| "step": 1295, | |
| "valid_targets_mean": 4360.9, | |
| "valid_targets_min": 1437 | |
| }, | |
| { | |
| "epoch": 2.106969205834684, | |
| "grad_norm": 0.3792798518134474, | |
| "learning_rate": 3.5287356723508245e-05, | |
| "loss": 0.1772, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13958227634429932, | |
| "step": 1300, | |
| "valid_targets_mean": 4286.2, | |
| "valid_targets_min": 1449 | |
| }, | |
| { | |
| "epoch": 2.115072933549433, | |
| "grad_norm": 0.45008204561893017, | |
| "learning_rate": 3.523511890416023e-05, | |
| "loss": 0.1716, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16553236544132233, | |
| "step": 1305, | |
| "valid_targets_mean": 3613.5, | |
| "valid_targets_min": 1289 | |
| }, | |
| { | |
| "epoch": 2.1231766612641816, | |
| "grad_norm": 0.45695133866981186, | |
| "learning_rate": 3.518263228163051e-05, | |
| "loss": 0.1904, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20444339513778687, | |
| "step": 1310, | |
| "valid_targets_mean": 4116.8, | |
| "valid_targets_min": 1346 | |
| }, | |
| { | |
| "epoch": 2.1312803889789302, | |
| "grad_norm": 0.39290783269742174, | |
| "learning_rate": 3.51298977130728e-05, | |
| "loss": 0.181, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1730700433254242, | |
| "step": 1315, | |
| "valid_targets_mean": 4396.1, | |
| "valid_targets_min": 1342 | |
| }, | |
| { | |
| "epoch": 2.139384116693679, | |
| "grad_norm": 0.40014498463542075, | |
| "learning_rate": 3.507691605968997e-05, | |
| "loss": 0.176, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16938640177249908, | |
| "step": 1320, | |
| "valid_targets_mean": 4114.8, | |
| "valid_targets_min": 1405 | |
| }, | |
| { | |
| "epoch": 2.147487844408428, | |
| "grad_norm": 0.4604317863246907, | |
| "learning_rate": 3.5023688186720047e-05, | |
| "loss": 0.1722, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17205074429512024, | |
| "step": 1325, | |
| "valid_targets_mean": 4108.2, | |
| "valid_targets_min": 1265 | |
| }, | |
| { | |
| "epoch": 2.1555915721231766, | |
| "grad_norm": 0.4533604714522615, | |
| "learning_rate": 3.497021496342203e-05, | |
| "loss": 0.1895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2422681301832199, | |
| "step": 1330, | |
| "valid_targets_mean": 4779.9, | |
| "valid_targets_min": 1374 | |
| }, | |
| { | |
| "epoch": 2.1636952998379253, | |
| "grad_norm": 0.4245083665207859, | |
| "learning_rate": 3.491649726306168e-05, | |
| "loss": 0.1824, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18867146968841553, | |
| "step": 1335, | |
| "valid_targets_mean": 4150.7, | |
| "valid_targets_min": 800 | |
| }, | |
| { | |
| "epoch": 2.1717990275526744, | |
| "grad_norm": 0.395493334600026, | |
| "learning_rate": 3.486253596289734e-05, | |
| "loss": 0.1787, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16125865280628204, | |
| "step": 1340, | |
| "valid_targets_mean": 4261.9, | |
| "valid_targets_min": 931 | |
| }, | |
| { | |
| "epoch": 2.179902755267423, | |
| "grad_norm": 0.386658367081084, | |
| "learning_rate": 3.4808331944165524e-05, | |
| "loss": 0.1677, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16777998208999634, | |
| "step": 1345, | |
| "valid_targets_mean": 4928.1, | |
| "valid_targets_min": 2040 | |
| }, | |
| { | |
| "epoch": 2.1880064829821717, | |
| "grad_norm": 0.43589523641868716, | |
| "learning_rate": 3.4753886092066546e-05, | |
| "loss": 0.1864, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1735723465681076, | |
| "step": 1350, | |
| "valid_targets_mean": 4623.9, | |
| "valid_targets_min": 1491 | |
| }, | |
| { | |
| "epoch": 2.1961102106969204, | |
| "grad_norm": 0.4408530475086916, | |
| "learning_rate": 3.469919929575012e-05, | |
| "loss": 0.1963, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18745142221450806, | |
| "step": 1355, | |
| "valid_targets_mean": 4525.6, | |
| "valid_targets_min": 1432 | |
| }, | |
| { | |
| "epoch": 2.2042139384116695, | |
| "grad_norm": 0.4038864126061987, | |
| "learning_rate": 3.464427244830076e-05, | |
| "loss": 0.2054, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20079880952835083, | |
| "step": 1360, | |
| "valid_targets_mean": 5277.9, | |
| "valid_targets_min": 2301 | |
| }, | |
| { | |
| "epoch": 2.212317666126418, | |
| "grad_norm": 0.4076427396108117, | |
| "learning_rate": 3.458910644672324e-05, | |
| "loss": 0.2047, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.216787189245224, | |
| "step": 1365, | |
| "valid_targets_mean": 4685.8, | |
| "valid_targets_min": 1483 | |
| }, | |
| { | |
| "epoch": 2.2204213938411668, | |
| "grad_norm": 0.4855663407841619, | |
| "learning_rate": 3.453370219192794e-05, | |
| "loss": 0.1598, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19118541479110718, | |
| "step": 1370, | |
| "valid_targets_mean": 4399.9, | |
| "valid_targets_min": 2187 | |
| }, | |
| { | |
| "epoch": 2.228525121555916, | |
| "grad_norm": 0.3994628496568297, | |
| "learning_rate": 3.4478060588716116e-05, | |
| "loss": 0.1831, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14572611451148987, | |
| "step": 1375, | |
| "valid_targets_mean": 4617.7, | |
| "valid_targets_min": 861 | |
| }, | |
| { | |
| "epoch": 2.2366288492706645, | |
| "grad_norm": 0.3932883085770016, | |
| "learning_rate": 3.442218254576516e-05, | |
| "loss": 0.1754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1831832379102707, | |
| "step": 1380, | |
| "valid_targets_mean": 4674.9, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 2.244732576985413, | |
| "grad_norm": 0.41916131913527677, | |
| "learning_rate": 3.436606897561372e-05, | |
| "loss": 0.1894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16581913828849792, | |
| "step": 1385, | |
| "valid_targets_mean": 3935.7, | |
| "valid_targets_min": 715 | |
| }, | |
| { | |
| "epoch": 2.2528363047001623, | |
| "grad_norm": 0.39450553784213077, | |
| "learning_rate": 3.430972079464682e-05, | |
| "loss": 0.1788, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16457019746303558, | |
| "step": 1390, | |
| "valid_targets_mean": 4476.0, | |
| "valid_targets_min": 2399 | |
| }, | |
| { | |
| "epoch": 2.260940032414911, | |
| "grad_norm": 0.3982500467520962, | |
| "learning_rate": 3.425313892308089e-05, | |
| "loss": 0.1982, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18190191686153412, | |
| "step": 1395, | |
| "valid_targets_mean": 4379.4, | |
| "valid_targets_min": 1290 | |
| }, | |
| { | |
| "epoch": 2.2690437601296596, | |
| "grad_norm": 0.41772831520986997, | |
| "learning_rate": 3.419632428494873e-05, | |
| "loss": 0.1903, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19021040201187134, | |
| "step": 1400, | |
| "valid_targets_mean": 4824.2, | |
| "valid_targets_min": 1316 | |
| }, | |
| { | |
| "epoch": 2.2771474878444082, | |
| "grad_norm": 0.41971007165893703, | |
| "learning_rate": 3.413927780808443e-05, | |
| "loss": 0.1542, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17558398842811584, | |
| "step": 1405, | |
| "valid_targets_mean": 4265.2, | |
| "valid_targets_min": 1603 | |
| }, | |
| { | |
| "epoch": 2.2852512155591573, | |
| "grad_norm": 0.43515280086555264, | |
| "learning_rate": 3.4082000424108235e-05, | |
| "loss": 0.1777, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17722058296203613, | |
| "step": 1410, | |
| "valid_targets_mean": 3829.9, | |
| "valid_targets_min": 1239 | |
| }, | |
| { | |
| "epoch": 2.293354943273906, | |
| "grad_norm": 0.43561699538725956, | |
| "learning_rate": 3.402449306841128e-05, | |
| "loss": 0.1792, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14969691634178162, | |
| "step": 1415, | |
| "valid_targets_mean": 3743.6, | |
| "valid_targets_min": 720 | |
| }, | |
| { | |
| "epoch": 2.3014586709886546, | |
| "grad_norm": 0.39844393207435513, | |
| "learning_rate": 3.396675668014035e-05, | |
| "loss": 0.2025, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1564260721206665, | |
| "step": 1420, | |
| "valid_targets_mean": 4526.0, | |
| "valid_targets_min": 1366 | |
| }, | |
| { | |
| "epoch": 2.3095623987034037, | |
| "grad_norm": 0.38012444527299505, | |
| "learning_rate": 3.390879220218258e-05, | |
| "loss": 0.1761, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1637035608291626, | |
| "step": 1425, | |
| "valid_targets_mean": 4802.2, | |
| "valid_targets_min": 2401 | |
| }, | |
| { | |
| "epoch": 2.3176661264181524, | |
| "grad_norm": 0.4454911522622101, | |
| "learning_rate": 3.385060058114995e-05, | |
| "loss": 0.1836, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19235506653785706, | |
| "step": 1430, | |
| "valid_targets_mean": 3851.9, | |
| "valid_targets_min": 1369 | |
| }, | |
| { | |
| "epoch": 2.325769854132901, | |
| "grad_norm": 0.43173142182525054, | |
| "learning_rate": 3.379218276736393e-05, | |
| "loss": 0.1819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16525980830192566, | |
| "step": 1435, | |
| "valid_targets_mean": 4005.5, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 2.3338735818476497, | |
| "grad_norm": 0.4623667736311442, | |
| "learning_rate": 3.373353971483992e-05, | |
| "loss": 0.1976, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18194140493869781, | |
| "step": 1440, | |
| "valid_targets_mean": 4458.6, | |
| "valid_targets_min": 1463 | |
| }, | |
| { | |
| "epoch": 2.341977309562399, | |
| "grad_norm": 0.43331675656237917, | |
| "learning_rate": 3.367467238127165e-05, | |
| "loss": 0.1766, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2014952003955841, | |
| "step": 1445, | |
| "valid_targets_mean": 4562.3, | |
| "valid_targets_min": 1395 | |
| }, | |
| { | |
| "epoch": 2.3500810372771475, | |
| "grad_norm": 0.4436223038884395, | |
| "learning_rate": 3.361558172801558e-05, | |
| "loss": 0.1931, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19612324237823486, | |
| "step": 1450, | |
| "valid_targets_mean": 3939.1, | |
| "valid_targets_min": 1480 | |
| }, | |
| { | |
| "epoch": 2.358184764991896, | |
| "grad_norm": 0.4077226858119085, | |
| "learning_rate": 3.355626872007518e-05, | |
| "loss": 0.2021, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17097198963165283, | |
| "step": 1455, | |
| "valid_targets_mean": 4233.2, | |
| "valid_targets_min": 1463 | |
| }, | |
| { | |
| "epoch": 2.366288492706645, | |
| "grad_norm": 0.3959886507466225, | |
| "learning_rate": 3.3496734326085154e-05, | |
| "loss": 0.1761, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17004629969596863, | |
| "step": 1460, | |
| "valid_targets_mean": 4284.9, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 2.374392220421394, | |
| "grad_norm": 0.3684059118907128, | |
| "learning_rate": 3.3436979518295674e-05, | |
| "loss": 0.182, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20196950435638428, | |
| "step": 1465, | |
| "valid_targets_mean": 6075.1, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 2.3824959481361425, | |
| "grad_norm": 0.3545113971679958, | |
| "learning_rate": 3.337700527255643e-05, | |
| "loss": 0.1603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15890374779701233, | |
| "step": 1470, | |
| "valid_targets_mean": 5087.4, | |
| "valid_targets_min": 1455 | |
| }, | |
| { | |
| "epoch": 2.3905996758508916, | |
| "grad_norm": 0.3840316800210763, | |
| "learning_rate": 3.331681256830074e-05, | |
| "loss": 0.1887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1505197286605835, | |
| "step": 1475, | |
| "valid_targets_mean": 4215.2, | |
| "valid_targets_min": 1356 | |
| }, | |
| { | |
| "epoch": 2.3987034035656403, | |
| "grad_norm": 0.37876560844311763, | |
| "learning_rate": 3.325640238852958e-05, | |
| "loss": 0.1871, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1649443656206131, | |
| "step": 1480, | |
| "valid_targets_mean": 4721.5, | |
| "valid_targets_min": 1414 | |
| }, | |
| { | |
| "epoch": 2.406807131280389, | |
| "grad_norm": 0.36497991834361543, | |
| "learning_rate": 3.319577571979544e-05, | |
| "loss": 0.1819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1684037744998932, | |
| "step": 1485, | |
| "valid_targets_mean": 5173.7, | |
| "valid_targets_min": 1511 | |
| }, | |
| { | |
| "epoch": 2.4149108589951376, | |
| "grad_norm": 0.418707067779524, | |
| "learning_rate": 3.313493355218632e-05, | |
| "loss": 0.1833, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20251314342021942, | |
| "step": 1490, | |
| "valid_targets_mean": 5805.2, | |
| "valid_targets_min": 3142 | |
| }, | |
| { | |
| "epoch": 2.4230145867098867, | |
| "grad_norm": 0.3779593561527125, | |
| "learning_rate": 3.307387687930947e-05, | |
| "loss": 0.1841, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20684057474136353, | |
| "step": 1495, | |
| "valid_targets_mean": 5985.4, | |
| "valid_targets_min": 1380 | |
| }, | |
| { | |
| "epoch": 2.4311183144246353, | |
| "grad_norm": 0.42380788528949165, | |
| "learning_rate": 3.301260669827524e-05, | |
| "loss": 0.1748, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20908914506435394, | |
| "step": 1500, | |
| "valid_targets_mean": 5506.3, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 2.439222042139384, | |
| "grad_norm": 0.43809464646488216, | |
| "learning_rate": 3.295112400968073e-05, | |
| "loss": 0.1667, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1538805365562439, | |
| "step": 1505, | |
| "valid_targets_mean": 3471.2, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 2.447325769854133, | |
| "grad_norm": 0.45057766440485186, | |
| "learning_rate": 3.2889429817593494e-05, | |
| "loss": 0.1616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14621488749980927, | |
| "step": 1510, | |
| "valid_targets_mean": 5778.6, | |
| "valid_targets_min": 1370 | |
| }, | |
| { | |
| "epoch": 2.4554294975688817, | |
| "grad_norm": 0.46698315776014765, | |
| "learning_rate": 3.2827525129535135e-05, | |
| "loss": 0.1733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16353151202201843, | |
| "step": 1515, | |
| "valid_targets_mean": 3501.4, | |
| "valid_targets_min": 1103 | |
| }, | |
| { | |
| "epoch": 2.4635332252836304, | |
| "grad_norm": 0.3562496067632637, | |
| "learning_rate": 3.276541095646482e-05, | |
| "loss": 0.1964, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15975715219974518, | |
| "step": 1520, | |
| "valid_targets_mean": 5539.9, | |
| "valid_targets_min": 1481 | |
| }, | |
| { | |
| "epoch": 2.471636952998379, | |
| "grad_norm": 0.36221972351349063, | |
| "learning_rate": 3.2703088312762825e-05, | |
| "loss": 0.1879, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14780882000923157, | |
| "step": 1525, | |
| "valid_targets_mean": 4646.3, | |
| "valid_targets_min": 1338 | |
| }, | |
| { | |
| "epoch": 2.479740680713128, | |
| "grad_norm": 0.4051828309440133, | |
| "learning_rate": 3.2640558216213914e-05, | |
| "loss": 0.1731, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16273058950901031, | |
| "step": 1530, | |
| "valid_targets_mean": 4293.0, | |
| "valid_targets_min": 1364 | |
| }, | |
| { | |
| "epoch": 2.487844408427877, | |
| "grad_norm": 0.42177182526238644, | |
| "learning_rate": 3.2577821687990764e-05, | |
| "loss": 0.1574, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16551074385643005, | |
| "step": 1535, | |
| "valid_targets_mean": 3971.9, | |
| "valid_targets_min": 1276 | |
| }, | |
| { | |
| "epoch": 2.4959481361426255, | |
| "grad_norm": 0.42852114939450653, | |
| "learning_rate": 3.2514879752637236e-05, | |
| "loss": 0.1887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19772499799728394, | |
| "step": 1540, | |
| "valid_targets_mean": 4545.6, | |
| "valid_targets_min": 1195 | |
| }, | |
| { | |
| "epoch": 2.5040518638573745, | |
| "grad_norm": 0.3858613190702104, | |
| "learning_rate": 3.2451733438051705e-05, | |
| "loss": 0.1672, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14023137092590332, | |
| "step": 1545, | |
| "valid_targets_mean": 4263.2, | |
| "valid_targets_min": 1366 | |
| }, | |
| { | |
| "epoch": 2.512155591572123, | |
| "grad_norm": 0.45356793401017254, | |
| "learning_rate": 3.238838377547023e-05, | |
| "loss": 0.1842, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16204509139060974, | |
| "step": 1550, | |
| "valid_targets_mean": 3907.6, | |
| "valid_targets_min": 844 | |
| }, | |
| { | |
| "epoch": 2.520259319286872, | |
| "grad_norm": 0.40037894398264473, | |
| "learning_rate": 3.2324831799449724e-05, | |
| "loss": 0.1671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17270907759666443, | |
| "step": 1555, | |
| "valid_targets_mean": 4516.8, | |
| "valid_targets_min": 2478 | |
| }, | |
| { | |
| "epoch": 2.528363047001621, | |
| "grad_norm": 0.40844298271631924, | |
| "learning_rate": 3.226107854785106e-05, | |
| "loss": 0.1912, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1686622053384781, | |
| "step": 1560, | |
| "valid_targets_mean": 4280.8, | |
| "valid_targets_min": 1433 | |
| }, | |
| { | |
| "epoch": 2.5364667747163696, | |
| "grad_norm": 0.3902243378177505, | |
| "learning_rate": 3.2197125061822135e-05, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18913598358631134, | |
| "step": 1565, | |
| "valid_targets_mean": 5352.0, | |
| "valid_targets_min": 1356 | |
| }, | |
| { | |
| "epoch": 2.5445705024311183, | |
| "grad_norm": 0.44643382348376237, | |
| "learning_rate": 3.213297238578082e-05, | |
| "loss": 0.1768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1804146021604538, | |
| "step": 1570, | |
| "valid_targets_mean": 3832.6, | |
| "valid_targets_min": 1416 | |
| }, | |
| { | |
| "epoch": 2.5526742301458674, | |
| "grad_norm": 0.3677125080484775, | |
| "learning_rate": 3.206862156739799e-05, | |
| "loss": 0.1768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17965905368328094, | |
| "step": 1575, | |
| "valid_targets_mean": 5295.1, | |
| "valid_targets_min": 1944 | |
| }, | |
| { | |
| "epoch": 2.560777957860616, | |
| "grad_norm": 0.424694281418007, | |
| "learning_rate": 3.2004073657580314e-05, | |
| "loss": 0.1701, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16907522082328796, | |
| "step": 1580, | |
| "valid_targets_mean": 4285.1, | |
| "valid_targets_min": 1325 | |
| }, | |
| { | |
| "epoch": 2.5688816855753647, | |
| "grad_norm": 0.3836005236671189, | |
| "learning_rate": 3.193932971045316e-05, | |
| "loss": 0.1671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1842470020055771, | |
| "step": 1585, | |
| "valid_targets_mean": 5091.1, | |
| "valid_targets_min": 2127 | |
| }, | |
| { | |
| "epoch": 2.5769854132901133, | |
| "grad_norm": 0.42016506443574564, | |
| "learning_rate": 3.187439078334338e-05, | |
| "loss": 0.1821, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18808531761169434, | |
| "step": 1590, | |
| "valid_targets_mean": 4131.1, | |
| "valid_targets_min": 1279 | |
| }, | |
| { | |
| "epoch": 2.585089141004862, | |
| "grad_norm": 0.39085605322357647, | |
| "learning_rate": 3.180925793676199e-05, | |
| "loss": 0.1839, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17691290378570557, | |
| "step": 1595, | |
| "valid_targets_mean": 4609.9, | |
| "valid_targets_min": 1309 | |
| }, | |
| { | |
| "epoch": 2.593192868719611, | |
| "grad_norm": 0.3909121224465065, | |
| "learning_rate": 3.1743932234386905e-05, | |
| "loss": 0.1607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15855155885219574, | |
| "step": 1600, | |
| "valid_targets_mean": 4398.2, | |
| "valid_targets_min": 2216 | |
| }, | |
| { | |
| "epoch": 2.6012965964343597, | |
| "grad_norm": 0.36211045372400313, | |
| "learning_rate": 3.167841474304555e-05, | |
| "loss": 0.1767, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17921733856201172, | |
| "step": 1605, | |
| "valid_targets_mean": 5015.9, | |
| "valid_targets_min": 1357 | |
| }, | |
| { | |
| "epoch": 2.6094003241491084, | |
| "grad_norm": 0.37957996352532186, | |
| "learning_rate": 3.161270653269743e-05, | |
| "loss": 0.1843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18475927412509918, | |
| "step": 1610, | |
| "valid_targets_mean": 5157.6, | |
| "valid_targets_min": 2993 | |
| }, | |
| { | |
| "epoch": 2.6175040518638575, | |
| "grad_norm": 0.36705428465477125, | |
| "learning_rate": 3.154680867641666e-05, | |
| "loss": 0.1651, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19124308228492737, | |
| "step": 1615, | |
| "valid_targets_mean": 5468.5, | |
| "valid_targets_min": 1530 | |
| }, | |
| { | |
| "epoch": 2.625607779578606, | |
| "grad_norm": 0.424405031181383, | |
| "learning_rate": 3.1480722250374454e-05, | |
| "loss": 0.1866, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20675238966941833, | |
| "step": 1620, | |
| "valid_targets_mean": 5237.4, | |
| "valid_targets_min": 1502 | |
| }, | |
| { | |
| "epoch": 2.633711507293355, | |
| "grad_norm": 0.3963476176633248, | |
| "learning_rate": 3.1414448333821526e-05, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1941150426864624, | |
| "step": 1625, | |
| "valid_targets_mean": 4785.1, | |
| "valid_targets_min": 2163 | |
| }, | |
| { | |
| "epoch": 2.641815235008104, | |
| "grad_norm": 0.41204588126736313, | |
| "learning_rate": 3.1347988009070496e-05, | |
| "loss": 0.1845, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21974653005599976, | |
| "step": 1630, | |
| "valid_targets_mean": 4643.4, | |
| "valid_targets_min": 2384 | |
| }, | |
| { | |
| "epoch": 2.6499189627228525, | |
| "grad_norm": 0.348197515719351, | |
| "learning_rate": 3.1281342361478184e-05, | |
| "loss": 0.1675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14443612098693848, | |
| "step": 1635, | |
| "valid_targets_mean": 4822.6, | |
| "valid_targets_min": 2198 | |
| }, | |
| { | |
| "epoch": 2.658022690437601, | |
| "grad_norm": 0.43986322837355346, | |
| "learning_rate": 3.121451247942789e-05, | |
| "loss": 0.215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20639345049858093, | |
| "step": 1640, | |
| "valid_targets_mean": 4152.5, | |
| "valid_targets_min": 1306 | |
| }, | |
| { | |
| "epoch": 2.6661264181523503, | |
| "grad_norm": 0.38870221906381547, | |
| "learning_rate": 3.1147499454311654e-05, | |
| "loss": 0.1867, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18374845385551453, | |
| "step": 1645, | |
| "valid_targets_mean": 4483.2, | |
| "valid_targets_min": 848 | |
| }, | |
| { | |
| "epoch": 2.674230145867099, | |
| "grad_norm": 0.36761988123246936, | |
| "learning_rate": 3.1080304380512386e-05, | |
| "loss": 0.1683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14270418882369995, | |
| "step": 1650, | |
| "valid_targets_mean": 4992.8, | |
| "valid_targets_min": 2372 | |
| }, | |
| { | |
| "epoch": 2.6823338735818476, | |
| "grad_norm": 0.39358600764755813, | |
| "learning_rate": 3.101292835538602e-05, | |
| "loss": 0.1821, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18195050954818726, | |
| "step": 1655, | |
| "valid_targets_mean": 5848.4, | |
| "valid_targets_min": 2184 | |
| }, | |
| { | |
| "epoch": 2.6904376012965967, | |
| "grad_norm": 0.38005284644189086, | |
| "learning_rate": 3.0945372479243605e-05, | |
| "loss": 0.1831, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.175418421626091, | |
| "step": 1660, | |
| "valid_targets_mean": 4281.4, | |
| "valid_targets_min": 1260 | |
| }, | |
| { | |
| "epoch": 2.6985413290113454, | |
| "grad_norm": 0.4499877538649714, | |
| "learning_rate": 3.087763785533328e-05, | |
| "loss": 0.1876, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2106286883354187, | |
| "step": 1665, | |
| "valid_targets_mean": 3791.6, | |
| "valid_targets_min": 1351 | |
| }, | |
| { | |
| "epoch": 2.706645056726094, | |
| "grad_norm": 0.4022005851852449, | |
| "learning_rate": 3.0809725589822325e-05, | |
| "loss": 0.1675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18053220212459564, | |
| "step": 1670, | |
| "valid_targets_mean": 4554.2, | |
| "valid_targets_min": 2254 | |
| }, | |
| { | |
| "epoch": 2.7147487844408427, | |
| "grad_norm": 0.38257462905088174, | |
| "learning_rate": 3.074163679177907e-05, | |
| "loss": 0.1874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21444633603096008, | |
| "step": 1675, | |
| "valid_targets_mean": 5958.8, | |
| "valid_targets_min": 2338 | |
| }, | |
| { | |
| "epoch": 2.7228525121555913, | |
| "grad_norm": 0.4439641195628155, | |
| "learning_rate": 3.067337257315477e-05, | |
| "loss": 0.1909, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23471331596374512, | |
| "step": 1680, | |
| "valid_targets_mean": 4376.6, | |
| "valid_targets_min": 2428 | |
| }, | |
| { | |
| "epoch": 2.7309562398703404, | |
| "grad_norm": 0.49885017003598664, | |
| "learning_rate": 3.0604934048765444e-05, | |
| "loss": 0.174, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17425662279129028, | |
| "step": 1685, | |
| "valid_targets_mean": 3140.1, | |
| "valid_targets_min": 1142 | |
| }, | |
| { | |
| "epoch": 2.739059967585089, | |
| "grad_norm": 0.39244438059170533, | |
| "learning_rate": 3.05363223362737e-05, | |
| "loss": 0.1633, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.177323579788208, | |
| "step": 1690, | |
| "valid_targets_mean": 4580.3, | |
| "valid_targets_min": 1184 | |
| }, | |
| { | |
| "epoch": 2.7471636952998377, | |
| "grad_norm": 0.43590713281508214, | |
| "learning_rate": 3.0467538556170463e-05, | |
| "loss": 0.1811, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19646786153316498, | |
| "step": 1695, | |
| "valid_targets_mean": 3882.8, | |
| "valid_targets_min": 1316 | |
| }, | |
| { | |
| "epoch": 2.755267423014587, | |
| "grad_norm": 0.4140052856791728, | |
| "learning_rate": 3.0398583831756655e-05, | |
| "loss": 0.1762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17302680015563965, | |
| "step": 1700, | |
| "valid_targets_mean": 4188.7, | |
| "valid_targets_min": 1384 | |
| }, | |
| { | |
| "epoch": 2.7633711507293355, | |
| "grad_norm": 0.42351239820043063, | |
| "learning_rate": 3.03294592891249e-05, | |
| "loss": 0.1773, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16667763888835907, | |
| "step": 1705, | |
| "valid_targets_mean": 4216.1, | |
| "valid_targets_min": 1394 | |
| }, | |
| { | |
| "epoch": 2.771474878444084, | |
| "grad_norm": 0.4690366006019746, | |
| "learning_rate": 3.0260166057141086e-05, | |
| "loss": 0.1972, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17848166823387146, | |
| "step": 1710, | |
| "valid_targets_mean": 3578.8, | |
| "valid_targets_min": 1101 | |
| }, | |
| { | |
| "epoch": 2.7795786061588332, | |
| "grad_norm": 0.5127351606909685, | |
| "learning_rate": 3.0190705267425956e-05, | |
| "loss": 0.1868, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2254772037267685, | |
| "step": 1715, | |
| "valid_targets_mean": 3151.7, | |
| "valid_targets_min": 1365 | |
| }, | |
| { | |
| "epoch": 2.787682333873582, | |
| "grad_norm": 0.47576208857320945, | |
| "learning_rate": 3.0121078054336633e-05, | |
| "loss": 0.1912, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18778353929519653, | |
| "step": 1720, | |
| "valid_targets_mean": 3748.5, | |
| "valid_targets_min": 714 | |
| }, | |
| { | |
| "epoch": 2.7957860615883305, | |
| "grad_norm": 0.4531881309902607, | |
| "learning_rate": 3.005128555494806e-05, | |
| "loss": 0.188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19288434088230133, | |
| "step": 1725, | |
| "valid_targets_mean": 5238.9, | |
| "valid_targets_min": 1646 | |
| }, | |
| { | |
| "epoch": 2.8038897893030796, | |
| "grad_norm": 0.37505653972841513, | |
| "learning_rate": 2.998132890903448e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1486460566520691, | |
| "step": 1730, | |
| "valid_targets_mean": 4244.6, | |
| "valid_targets_min": 1485 | |
| }, | |
| { | |
| "epoch": 2.8119935170178283, | |
| "grad_norm": 0.3972142880917285, | |
| "learning_rate": 2.9911209259050763e-05, | |
| "loss": 0.2125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21991488337516785, | |
| "step": 1735, | |
| "valid_targets_mean": 5093.3, | |
| "valid_targets_min": 707 | |
| }, | |
| { | |
| "epoch": 2.820097244732577, | |
| "grad_norm": 0.4265599882568908, | |
| "learning_rate": 2.984092775011382e-05, | |
| "loss": 0.1781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17646397650241852, | |
| "step": 1740, | |
| "valid_targets_mean": 3938.9, | |
| "valid_targets_min": 1270 | |
| }, | |
| { | |
| "epoch": 2.828200972447326, | |
| "grad_norm": 0.4228912084128686, | |
| "learning_rate": 2.9770485529983834e-05, | |
| "loss": 0.1782, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18994967639446259, | |
| "step": 1745, | |
| "valid_targets_mean": 4213.4, | |
| "valid_targets_min": 300 | |
| }, | |
| { | |
| "epoch": 2.8363047001620747, | |
| "grad_norm": 0.373449509905702, | |
| "learning_rate": 2.9699883749045564e-05, | |
| "loss": 0.1694, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1784120500087738, | |
| "step": 1750, | |
| "valid_targets_mean": 4853.6, | |
| "valid_targets_min": 987 | |
| }, | |
| { | |
| "epoch": 2.8444084278768234, | |
| "grad_norm": 0.393297545157333, | |
| "learning_rate": 2.962912356028953e-05, | |
| "loss": 0.1781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1601720005273819, | |
| "step": 1755, | |
| "valid_targets_mean": 4341.7, | |
| "valid_targets_min": 1457 | |
| }, | |
| { | |
| "epoch": 2.852512155591572, | |
| "grad_norm": 0.39420593273198135, | |
| "learning_rate": 2.95582061192932e-05, | |
| "loss": 0.1766, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16387531161308289, | |
| "step": 1760, | |
| "valid_targets_mean": 4608.4, | |
| "valid_targets_min": 1457 | |
| }, | |
| { | |
| "epoch": 2.8606158833063207, | |
| "grad_norm": 0.45248636903908407, | |
| "learning_rate": 2.9487132584202115e-05, | |
| "loss": 0.1753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17816318571567535, | |
| "step": 1765, | |
| "valid_targets_mean": 4026.9, | |
| "valid_targets_min": 736 | |
| }, | |
| { | |
| "epoch": 2.8687196110210698, | |
| "grad_norm": 0.4420996590189512, | |
| "learning_rate": 2.9415904115710964e-05, | |
| "loss": 0.1876, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1964389532804489, | |
| "step": 1770, | |
| "valid_targets_mean": 4062.7, | |
| "valid_targets_min": 1477 | |
| }, | |
| { | |
| "epoch": 2.8768233387358184, | |
| "grad_norm": 0.3731738506531456, | |
| "learning_rate": 2.9344521877044633e-05, | |
| "loss": 0.1555, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1585400551557541, | |
| "step": 1775, | |
| "valid_targets_mean": 4726.6, | |
| "valid_targets_min": 1585 | |
| }, | |
| { | |
| "epoch": 2.884927066450567, | |
| "grad_norm": 0.4165595181612634, | |
| "learning_rate": 2.927298703393924e-05, | |
| "loss": 0.1599, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13799604773521423, | |
| "step": 1780, | |
| "valid_targets_mean": 3671.0, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 2.893030794165316, | |
| "grad_norm": 0.4273575398894236, | |
| "learning_rate": 2.9201300754623046e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18003098666667938, | |
| "step": 1785, | |
| "valid_targets_mean": 4422.1, | |
| "valid_targets_min": 882 | |
| }, | |
| { | |
| "epoch": 2.901134521880065, | |
| "grad_norm": 0.3639663217266001, | |
| "learning_rate": 2.9129464209797404e-05, | |
| "loss": 0.1757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.199708953499794, | |
| "step": 1790, | |
| "valid_targets_mean": 5779.1, | |
| "valid_targets_min": 1363 | |
| }, | |
| { | |
| "epoch": 2.9092382495948135, | |
| "grad_norm": 0.3843377653988568, | |
| "learning_rate": 2.9057478572617644e-05, | |
| "loss": 0.1768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15898755192756653, | |
| "step": 1795, | |
| "valid_targets_mean": 4513.2, | |
| "valid_targets_min": 1493 | |
| }, | |
| { | |
| "epoch": 2.9173419773095626, | |
| "grad_norm": 0.41883027314954663, | |
| "learning_rate": 2.898534501867391e-05, | |
| "loss": 0.1818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2081822007894516, | |
| "step": 1800, | |
| "valid_targets_mean": 4719.1, | |
| "valid_targets_min": 1112 | |
| }, | |
| { | |
| "epoch": 2.9254457050243112, | |
| "grad_norm": 0.4313248963680573, | |
| "learning_rate": 2.8913064725971947e-05, | |
| "loss": 0.1768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.176205113530159, | |
| "step": 1805, | |
| "valid_targets_mean": 4301.3, | |
| "valid_targets_min": 1469 | |
| }, | |
| { | |
| "epoch": 2.93354943273906, | |
| "grad_norm": 0.3577726246588782, | |
| "learning_rate": 2.8840638874913894e-05, | |
| "loss": 0.1863, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21595324575901031, | |
| "step": 1810, | |
| "valid_targets_mean": 5801.4, | |
| "valid_targets_min": 1025 | |
| }, | |
| { | |
| "epoch": 2.941653160453809, | |
| "grad_norm": 0.40431026504522666, | |
| "learning_rate": 2.8768068648278976e-05, | |
| "loss": 0.1856, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15173974633216858, | |
| "step": 1815, | |
| "valid_targets_mean": 4143.2, | |
| "valid_targets_min": 614 | |
| }, | |
| { | |
| "epoch": 2.9497568881685576, | |
| "grad_norm": 0.4056315860760992, | |
| "learning_rate": 2.8695355231204206e-05, | |
| "loss": 0.1934, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18253010511398315, | |
| "step": 1820, | |
| "valid_targets_mean": 4631.1, | |
| "valid_targets_min": 1584 | |
| }, | |
| { | |
| "epoch": 2.9578606158833063, | |
| "grad_norm": 0.3605840500101974, | |
| "learning_rate": 2.862249981116502e-05, | |
| "loss": 0.1699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17268909513950348, | |
| "step": 1825, | |
| "valid_targets_mean": 4815.9, | |
| "valid_targets_min": 1465 | |
| }, | |
| { | |
| "epoch": 2.965964343598055, | |
| "grad_norm": 0.4393224667022168, | |
| "learning_rate": 2.854950357795589e-05, | |
| "loss": 0.1618, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1699746996164322, | |
| "step": 1830, | |
| "valid_targets_mean": 3549.6, | |
| "valid_targets_min": 587 | |
| }, | |
| { | |
| "epoch": 2.974068071312804, | |
| "grad_norm": 0.4084908609658448, | |
| "learning_rate": 2.847636772367091e-05, | |
| "loss": 0.1899, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18084849417209625, | |
| "step": 1835, | |
| "valid_targets_mean": 4688.5, | |
| "valid_targets_min": 965 | |
| }, | |
| { | |
| "epoch": 2.9821717990275527, | |
| "grad_norm": 0.3904995138540871, | |
| "learning_rate": 2.8403093442684287e-05, | |
| "loss": 0.1687, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17414742708206177, | |
| "step": 1840, | |
| "valid_targets_mean": 4738.1, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 2.9902755267423013, | |
| "grad_norm": 0.3986330323947676, | |
| "learning_rate": 2.8329681931630877e-05, | |
| "loss": 0.1814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1696147918701172, | |
| "step": 1845, | |
| "valid_targets_mean": 4719.2, | |
| "valid_targets_min": 1508 | |
| }, | |
| { | |
| "epoch": 2.99837925445705, | |
| "grad_norm": 0.3738285751356821, | |
| "learning_rate": 2.825613438938663e-05, | |
| "loss": 0.1729, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16895028948783875, | |
| "step": 1850, | |
| "valid_targets_mean": 4317.1, | |
| "valid_targets_min": 1512 | |
| }, | |
| { | |
| "epoch": 3.006482982171799, | |
| "grad_norm": 0.4458576556741155, | |
| "learning_rate": 2.8182452017048983e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20063985884189606, | |
| "step": 1855, | |
| "valid_targets_mean": 3893.2, | |
| "valid_targets_min": 1938 | |
| }, | |
| { | |
| "epoch": 3.0145867098865478, | |
| "grad_norm": 0.5437818382202464, | |
| "learning_rate": 2.81086360179173e-05, | |
| "loss": 0.1554, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13127049803733826, | |
| "step": 1860, | |
| "valid_targets_mean": 3906.6, | |
| "valid_targets_min": 1366 | |
| }, | |
| { | |
| "epoch": 3.0226904376012964, | |
| "grad_norm": 0.7414088618069266, | |
| "learning_rate": 2.8034687597473164e-05, | |
| "loss": 0.1621, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14226055145263672, | |
| "step": 1865, | |
| "valid_targets_mean": 4408.1, | |
| "valid_targets_min": 2015 | |
| }, | |
| { | |
| "epoch": 3.0307941653160455, | |
| "grad_norm": 0.3958731743842448, | |
| "learning_rate": 2.796060796336074e-05, | |
| "loss": 0.1666, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14440515637397766, | |
| "step": 1870, | |
| "valid_targets_mean": 3883.9, | |
| "valid_targets_min": 1145 | |
| }, | |
| { | |
| "epoch": 3.038897893030794, | |
| "grad_norm": 0.3863363231479288, | |
| "learning_rate": 2.7886398325367018e-05, | |
| "loss": 0.1571, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1541876196861267, | |
| "step": 1875, | |
| "valid_targets_mean": 4888.3, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 3.047001620745543, | |
| "grad_norm": 0.38862351941958634, | |
| "learning_rate": 2.7812059895402064e-05, | |
| "loss": 0.1646, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17312254011631012, | |
| "step": 1880, | |
| "valid_targets_mean": 4551.8, | |
| "valid_targets_min": 1201 | |
| }, | |
| { | |
| "epoch": 3.055105348460292, | |
| "grad_norm": 0.40110635229720293, | |
| "learning_rate": 2.773759388747925e-05, | |
| "loss": 0.1567, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14018696546554565, | |
| "step": 1885, | |
| "valid_targets_mean": 4621.5, | |
| "valid_targets_min": 1347 | |
| }, | |
| { | |
| "epoch": 3.0632090761750406, | |
| "grad_norm": 0.5140462329460982, | |
| "learning_rate": 2.7663001517695386e-05, | |
| "loss": 0.1655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14781486988067627, | |
| "step": 1890, | |
| "valid_targets_mean": 4935.8, | |
| "valid_targets_min": 2293 | |
| }, | |
| { | |
| "epoch": 3.0713128038897892, | |
| "grad_norm": 0.4632445118213682, | |
| "learning_rate": 2.7588284004210907e-05, | |
| "loss": 0.1795, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17635850608348846, | |
| "step": 1895, | |
| "valid_targets_mean": 3667.7, | |
| "valid_targets_min": 842 | |
| }, | |
| { | |
| "epoch": 3.079416531604538, | |
| "grad_norm": 0.4140405136032335, | |
| "learning_rate": 2.7513442567229936e-05, | |
| "loss": 0.162, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16254425048828125, | |
| "step": 1900, | |
| "valid_targets_mean": 3994.1, | |
| "valid_targets_min": 2484 | |
| }, | |
| { | |
| "epoch": 3.087520259319287, | |
| "grad_norm": 0.4340338509048081, | |
| "learning_rate": 2.7438478428980407e-05, | |
| "loss": 0.1571, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1579328179359436, | |
| "step": 1905, | |
| "valid_targets_mean": 3750.8, | |
| "valid_targets_min": 1366 | |
| }, | |
| { | |
| "epoch": 3.0956239870340356, | |
| "grad_norm": 0.45026657211719584, | |
| "learning_rate": 2.7363392813694047e-05, | |
| "loss": 0.1593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.178205206990242, | |
| "step": 1910, | |
| "valid_targets_mean": 3745.9, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 3.1037277147487843, | |
| "grad_norm": 0.4337354223525519, | |
| "learning_rate": 2.7288186947586426e-05, | |
| "loss": 0.1641, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1774899810552597, | |
| "step": 1915, | |
| "valid_targets_mean": 4129.7, | |
| "valid_targets_min": 999 | |
| }, | |
| { | |
| "epoch": 3.1118314424635334, | |
| "grad_norm": 0.42347076906338227, | |
| "learning_rate": 2.7212862058836925e-05, | |
| "loss": 0.1652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14904651045799255, | |
| "step": 1920, | |
| "valid_targets_mean": 3875.8, | |
| "valid_targets_min": 1281 | |
| }, | |
| { | |
| "epoch": 3.119935170178282, | |
| "grad_norm": 0.405451275894032, | |
| "learning_rate": 2.713741937756865e-05, | |
| "loss": 0.1562, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13380369544029236, | |
| "step": 1925, | |
| "valid_targets_mean": 3421.6, | |
| "valid_targets_min": 676 | |
| }, | |
| { | |
| "epoch": 3.1280388978930307, | |
| "grad_norm": 0.3967232066747482, | |
| "learning_rate": 2.7061860135828384e-05, | |
| "loss": 0.17, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1519922912120819, | |
| "step": 1930, | |
| "valid_targets_mean": 4485.6, | |
| "valid_targets_min": 1366 | |
| }, | |
| { | |
| "epoch": 3.1361426256077793, | |
| "grad_norm": 0.39244088979289066, | |
| "learning_rate": 2.6986185567566442e-05, | |
| "loss": 0.1639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16591840982437134, | |
| "step": 1935, | |
| "valid_targets_mean": 5006.6, | |
| "valid_targets_min": 1399 | |
| }, | |
| { | |
| "epoch": 3.1442463533225284, | |
| "grad_norm": 0.45152809145456724, | |
| "learning_rate": 2.6910396908616527e-05, | |
| "loss": 0.1541, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18249057233333588, | |
| "step": 1940, | |
| "valid_targets_mean": 4102.7, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 3.152350081037277, | |
| "grad_norm": 0.3925517166211551, | |
| "learning_rate": 2.6834495396675526e-05, | |
| "loss": 0.1632, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1807394027709961, | |
| "step": 1945, | |
| "valid_targets_mean": 5163.2, | |
| "valid_targets_min": 2123 | |
| }, | |
| { | |
| "epoch": 3.1604538087520258, | |
| "grad_norm": 0.37615773969729294, | |
| "learning_rate": 2.6758482271283347e-05, | |
| "loss": 0.1567, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15303456783294678, | |
| "step": 1950, | |
| "valid_targets_mean": 4954.9, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 3.168557536466775, | |
| "grad_norm": 0.3825435765582147, | |
| "learning_rate": 2.668235877380263e-05, | |
| "loss": 0.1635, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14769810438156128, | |
| "step": 1955, | |
| "valid_targets_mean": 4466.6, | |
| "valid_targets_min": 1899 | |
| }, | |
| { | |
| "epoch": 3.1766612641815235, | |
| "grad_norm": 0.5029144947784785, | |
| "learning_rate": 2.660612614739849e-05, | |
| "loss": 0.1683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18598952889442444, | |
| "step": 1960, | |
| "valid_targets_mean": 3101.1, | |
| "valid_targets_min": 844 | |
| }, | |
| { | |
| "epoch": 3.184764991896272, | |
| "grad_norm": 0.3831787296377238, | |
| "learning_rate": 2.652978563701822e-05, | |
| "loss": 0.1491, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1297546774148941, | |
| "step": 1965, | |
| "valid_targets_mean": 4340.1, | |
| "valid_targets_min": 1250 | |
| }, | |
| { | |
| "epoch": 3.1928687196110213, | |
| "grad_norm": 0.4144693596376343, | |
| "learning_rate": 2.645333848937095e-05, | |
| "loss": 0.1524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14467677474021912, | |
| "step": 1970, | |
| "valid_targets_mean": 3855.9, | |
| "valid_targets_min": 1517 | |
| }, | |
| { | |
| "epoch": 3.20097244732577, | |
| "grad_norm": 0.42457591777397313, | |
| "learning_rate": 2.6376785952907292e-05, | |
| "loss": 0.1793, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1861054003238678, | |
| "step": 1975, | |
| "valid_targets_mean": 4458.5, | |
| "valid_targets_min": 1381 | |
| }, | |
| { | |
| "epoch": 3.2090761750405186, | |
| "grad_norm": 0.39434289936014905, | |
| "learning_rate": 2.630012927779896e-05, | |
| "loss": 0.1649, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16375955939292908, | |
| "step": 1980, | |
| "valid_targets_mean": 4505.0, | |
| "valid_targets_min": 1465 | |
| }, | |
| { | |
| "epoch": 3.217179902755267, | |
| "grad_norm": 0.4022197268623526, | |
| "learning_rate": 2.6223369715918338e-05, | |
| "loss": 0.1785, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17773842811584473, | |
| "step": 1985, | |
| "valid_targets_mean": 5042.1, | |
| "valid_targets_min": 2203 | |
| }, | |
| { | |
| "epoch": 3.2252836304700163, | |
| "grad_norm": 0.42080806470642895, | |
| "learning_rate": 2.614650852081805e-05, | |
| "loss": 0.1617, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16340884566307068, | |
| "step": 1990, | |
| "valid_targets_mean": 4147.1, | |
| "valid_targets_min": 1395 | |
| }, | |
| { | |
| "epoch": 3.233387358184765, | |
| "grad_norm": 0.39877594715149084, | |
| "learning_rate": 2.606954694771047e-05, | |
| "loss": 0.1724, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19070208072662354, | |
| "step": 1995, | |
| "valid_targets_mean": 5167.7, | |
| "valid_targets_min": 2503 | |
| }, | |
| { | |
| "epoch": 3.2414910858995136, | |
| "grad_norm": 0.4170345486796608, | |
| "learning_rate": 2.5992486253447258e-05, | |
| "loss": 0.1753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16822439432144165, | |
| "step": 2000, | |
| "valid_targets_mean": 4404.3, | |
| "valid_targets_min": 1622 | |
| }, | |
| { | |
| "epoch": 3.2495948136142627, | |
| "grad_norm": 0.3891233710055621, | |
| "learning_rate": 2.5915327696498787e-05, | |
| "loss": 0.1607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17434442043304443, | |
| "step": 2005, | |
| "valid_targets_mean": 5199.5, | |
| "valid_targets_min": 932 | |
| }, | |
| { | |
| "epoch": 3.2576985413290114, | |
| "grad_norm": 0.34701884376339476, | |
| "learning_rate": 2.583807253693362e-05, | |
| "loss": 0.1583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1562272012233734, | |
| "step": 2010, | |
| "valid_targets_mean": 5518.5, | |
| "valid_targets_min": 848 | |
| }, | |
| { | |
| "epoch": 3.26580226904376, | |
| "grad_norm": 0.41218355325803036, | |
| "learning_rate": 2.576072203639794e-05, | |
| "loss": 0.1656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15922808647155762, | |
| "step": 2015, | |
| "valid_targets_mean": 4113.3, | |
| "valid_targets_min": 1485 | |
| }, | |
| { | |
| "epoch": 3.2739059967585087, | |
| "grad_norm": 0.4109547934633965, | |
| "learning_rate": 2.5683277458094926e-05, | |
| "loss": 0.1676, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16199171543121338, | |
| "step": 2020, | |
| "valid_targets_mean": 4417.1, | |
| "valid_targets_min": 1268 | |
| }, | |
| { | |
| "epoch": 3.282009724473258, | |
| "grad_norm": 0.3744324756824404, | |
| "learning_rate": 2.560574006676413e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1650342345237732, | |
| "step": 2025, | |
| "valid_targets_mean": 5119.9, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 3.2901134521880064, | |
| "grad_norm": 0.3996207068608718, | |
| "learning_rate": 2.5528111128660826e-05, | |
| "loss": 0.174, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18693450093269348, | |
| "step": 2030, | |
| "valid_targets_mean": 5332.9, | |
| "valid_targets_min": 1422 | |
| }, | |
| { | |
| "epoch": 3.298217179902755, | |
| "grad_norm": 0.3949609988006854, | |
| "learning_rate": 2.545039191153533e-05, | |
| "loss": 0.1714, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1728595793247223, | |
| "step": 2035, | |
| "valid_targets_mean": 4955.7, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 3.306320907617504, | |
| "grad_norm": 0.4139286573888577, | |
| "learning_rate": 2.53725836846123e-05, | |
| "loss": 0.1761, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1731775999069214, | |
| "step": 2040, | |
| "valid_targets_mean": 4870.6, | |
| "valid_targets_min": 2364 | |
| }, | |
| { | |
| "epoch": 3.314424635332253, | |
| "grad_norm": 0.39980850038411364, | |
| "learning_rate": 2.5294687718569994e-05, | |
| "loss": 0.1502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17138364911079407, | |
| "step": 2045, | |
| "valid_targets_mean": 4727.1, | |
| "valid_targets_min": 1195 | |
| }, | |
| { | |
| "epoch": 3.3225283630470015, | |
| "grad_norm": 0.4296995400998342, | |
| "learning_rate": 2.5216705285519525e-05, | |
| "loss": 0.147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14280448853969574, | |
| "step": 2050, | |
| "valid_targets_mean": 3781.5, | |
| "valid_targets_min": 1393 | |
| }, | |
| { | |
| "epoch": 3.3306320907617506, | |
| "grad_norm": 0.3962285375649505, | |
| "learning_rate": 2.5138637658984116e-05, | |
| "loss": 0.1772, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1752699315547943, | |
| "step": 2055, | |
| "valid_targets_mean": 4631.1, | |
| "valid_targets_min": 595 | |
| }, | |
| { | |
| "epoch": 3.3387358184764993, | |
| "grad_norm": 0.4211633580398836, | |
| "learning_rate": 2.5060486113878244e-05, | |
| "loss": 0.1723, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16262900829315186, | |
| "step": 2060, | |
| "valid_targets_mean": 4000.8, | |
| "valid_targets_min": 1298 | |
| }, | |
| { | |
| "epoch": 3.346839546191248, | |
| "grad_norm": 0.3968558871252429, | |
| "learning_rate": 2.4982251926486873e-05, | |
| "loss": 0.1611, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1583396941423416, | |
| "step": 2065, | |
| "valid_targets_mean": 4512.4, | |
| "valid_targets_min": 1316 | |
| }, | |
| { | |
| "epoch": 3.354943273905997, | |
| "grad_norm": 0.41675904818776327, | |
| "learning_rate": 2.490393637444458e-05, | |
| "loss": 0.1585, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1787576675415039, | |
| "step": 2070, | |
| "valid_targets_mean": 5438.1, | |
| "valid_targets_min": 1668 | |
| }, | |
| { | |
| "epoch": 3.3630470016207457, | |
| "grad_norm": 0.3822854208050519, | |
| "learning_rate": 2.482554073671471e-05, | |
| "loss": 0.1792, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1810203194618225, | |
| "step": 2075, | |
| "valid_targets_mean": 4925.5, | |
| "valid_targets_min": 641 | |
| }, | |
| { | |
| "epoch": 3.3711507293354943, | |
| "grad_norm": 0.3746793390777653, | |
| "learning_rate": 2.4747066293568452e-05, | |
| "loss": 0.1606, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1479102522134781, | |
| "step": 2080, | |
| "valid_targets_mean": 4342.9, | |
| "valid_targets_min": 1357 | |
| }, | |
| { | |
| "epoch": 3.379254457050243, | |
| "grad_norm": 0.4169494009407781, | |
| "learning_rate": 2.4668514326564e-05, | |
| "loss": 0.1616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17268824577331543, | |
| "step": 2085, | |
| "valid_targets_mean": 4090.0, | |
| "valid_targets_min": 1306 | |
| }, | |
| { | |
| "epoch": 3.387358184764992, | |
| "grad_norm": 0.42376297596655554, | |
| "learning_rate": 2.4589886118525556e-05, | |
| "loss": 0.1579, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13542214035987854, | |
| "step": 2090, | |
| "valid_targets_mean": 3641.6, | |
| "valid_targets_min": 1265 | |
| }, | |
| { | |
| "epoch": 3.3954619124797407, | |
| "grad_norm": 0.35964018388907254, | |
| "learning_rate": 2.4511182953522405e-05, | |
| "loss": 0.1719, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18550920486450195, | |
| "step": 2095, | |
| "valid_targets_mean": 6288.9, | |
| "valid_targets_min": 2357 | |
| }, | |
| { | |
| "epoch": 3.4035656401944894, | |
| "grad_norm": 0.420468377931095, | |
| "learning_rate": 2.4432406116847954e-05, | |
| "loss": 0.1754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15402904152870178, | |
| "step": 2100, | |
| "valid_targets_mean": 3651.9, | |
| "valid_targets_min": 1443 | |
| }, | |
| { | |
| "epoch": 3.411669367909238, | |
| "grad_norm": 0.3480070041971189, | |
| "learning_rate": 2.435355689499874e-05, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15732459723949432, | |
| "step": 2105, | |
| "valid_targets_mean": 5594.8, | |
| "valid_targets_min": 1491 | |
| }, | |
| { | |
| "epoch": 3.419773095623987, | |
| "grad_norm": 0.43677187908434917, | |
| "learning_rate": 2.4274636575653398e-05, | |
| "loss": 0.1593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1524367332458496, | |
| "step": 2110, | |
| "valid_targets_mean": 4121.1, | |
| "valid_targets_min": 1403 | |
| }, | |
| { | |
| "epoch": 3.427876823338736, | |
| "grad_norm": 0.39059587976025334, | |
| "learning_rate": 2.4195646447651663e-05, | |
| "loss": 0.1885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16522473096847534, | |
| "step": 2115, | |
| "valid_targets_mean": 5062.4, | |
| "valid_targets_min": 944 | |
| }, | |
| { | |
| "epoch": 3.4359805510534844, | |
| "grad_norm": 0.3852515258101976, | |
| "learning_rate": 2.411658780097331e-05, | |
| "loss": 0.1496, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1708238273859024, | |
| "step": 2120, | |
| "valid_targets_mean": 5137.6, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 3.4440842787682335, | |
| "grad_norm": 0.389032518997545, | |
| "learning_rate": 2.4037461926717075e-05, | |
| "loss": 0.1759, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17317542433738708, | |
| "step": 2125, | |
| "valid_targets_mean": 4953.7, | |
| "valid_targets_min": 738 | |
| }, | |
| { | |
| "epoch": 3.452188006482982, | |
| "grad_norm": 0.4147483170037612, | |
| "learning_rate": 2.395827011707959e-05, | |
| "loss": 0.1687, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16804209351539612, | |
| "step": 2130, | |
| "valid_targets_mean": 4543.6, | |
| "valid_targets_min": 1407 | |
| }, | |
| { | |
| "epoch": 3.460291734197731, | |
| "grad_norm": 0.4360946325373845, | |
| "learning_rate": 2.3879013665334258e-05, | |
| "loss": 0.1519, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1393245905637741, | |
| "step": 2135, | |
| "valid_targets_mean": 3851.3, | |
| "valid_targets_min": 1041 | |
| }, | |
| { | |
| "epoch": 3.46839546191248, | |
| "grad_norm": 0.35655609281423195, | |
| "learning_rate": 2.3799693865810163e-05, | |
| "loss": 0.1902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14911235868930817, | |
| "step": 2140, | |
| "valid_targets_mean": 5114.5, | |
| "valid_targets_min": 2426 | |
| }, | |
| { | |
| "epoch": 3.4764991896272286, | |
| "grad_norm": 0.45634252157298655, | |
| "learning_rate": 2.37203120138709e-05, | |
| "loss": 0.1962, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24309851229190826, | |
| "step": 2145, | |
| "valid_targets_mean": 4436.9, | |
| "valid_targets_min": 1453 | |
| }, | |
| { | |
| "epoch": 3.4846029173419772, | |
| "grad_norm": 0.42263469645641794, | |
| "learning_rate": 2.3640869405893446e-05, | |
| "loss": 0.1636, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1451752930879593, | |
| "step": 2150, | |
| "valid_targets_mean": 4446.0, | |
| "valid_targets_min": 1438 | |
| }, | |
| { | |
| "epoch": 3.492706645056726, | |
| "grad_norm": 0.39906455734326574, | |
| "learning_rate": 2.3561367339246976e-05, | |
| "loss": 0.1639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18286089599132538, | |
| "step": 2155, | |
| "valid_targets_mean": 5142.6, | |
| "valid_targets_min": 1423 | |
| }, | |
| { | |
| "epoch": 3.500810372771475, | |
| "grad_norm": 0.418866166123448, | |
| "learning_rate": 2.3481807112271678e-05, | |
| "loss": 0.1819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18328286707401276, | |
| "step": 2160, | |
| "valid_targets_mean": 4042.1, | |
| "valid_targets_min": 1430 | |
| }, | |
| { | |
| "epoch": 3.5089141004862237, | |
| "grad_norm": 0.4014900167402139, | |
| "learning_rate": 2.3402190024257543e-05, | |
| "loss": 0.1749, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15818876028060913, | |
| "step": 2165, | |
| "valid_targets_mean": 3847.3, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 3.5170178282009723, | |
| "grad_norm": 0.36321998289056967, | |
| "learning_rate": 2.3322517375423165e-05, | |
| "loss": 0.156, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13905078172683716, | |
| "step": 2170, | |
| "valid_targets_mean": 4851.6, | |
| "valid_targets_min": 1478 | |
| }, | |
| { | |
| "epoch": 3.525121555915721, | |
| "grad_norm": 0.4175559746072245, | |
| "learning_rate": 2.3242790466894494e-05, | |
| "loss": 0.1843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2257763147354126, | |
| "step": 2175, | |
| "valid_targets_mean": 5145.1, | |
| "valid_targets_min": 1324 | |
| }, | |
| { | |
| "epoch": 3.53322528363047, | |
| "grad_norm": 0.3721262066889466, | |
| "learning_rate": 2.316301060068359e-05, | |
| "loss": 0.1557, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15761905908584595, | |
| "step": 2180, | |
| "valid_targets_mean": 4843.8, | |
| "valid_targets_min": 2620 | |
| }, | |
| { | |
| "epoch": 3.5413290113452187, | |
| "grad_norm": 0.4012053491625949, | |
| "learning_rate": 2.3083179079667347e-05, | |
| "loss": 0.1684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15592560172080994, | |
| "step": 2185, | |
| "valid_targets_mean": 4353.8, | |
| "valid_targets_min": 1432 | |
| }, | |
| { | |
| "epoch": 3.5494327390599674, | |
| "grad_norm": 0.4375921159428318, | |
| "learning_rate": 2.300329720756625e-05, | |
| "loss": 0.1875, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1943226456642151, | |
| "step": 2190, | |
| "valid_targets_mean": 4139.7, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 3.5575364667747165, | |
| "grad_norm": 0.40049465719853394, | |
| "learning_rate": 2.2923366288923045e-05, | |
| "loss": 0.1695, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14853128790855408, | |
| "step": 2195, | |
| "valid_targets_mean": 4349.8, | |
| "valid_targets_min": 1489 | |
| }, | |
| { | |
| "epoch": 3.565640194489465, | |
| "grad_norm": 0.3931973901766357, | |
| "learning_rate": 2.2843387629081453e-05, | |
| "loss": 0.1799, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2016703337430954, | |
| "step": 2200, | |
| "valid_targets_mean": 5098.6, | |
| "valid_targets_min": 2065 | |
| }, | |
| { | |
| "epoch": 3.5737439222042138, | |
| "grad_norm": 0.4690674312594426, | |
| "learning_rate": 2.2763362534164854e-05, | |
| "loss": 0.1762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14409610629081726, | |
| "step": 2205, | |
| "valid_targets_mean": 4489.2, | |
| "valid_targets_min": 1316 | |
| }, | |
| { | |
| "epoch": 3.581847649918963, | |
| "grad_norm": 0.3623772576301561, | |
| "learning_rate": 2.268329231105498e-05, | |
| "loss": 0.1712, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1700354665517807, | |
| "step": 2210, | |
| "valid_targets_mean": 5424.8, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 3.5899513776337115, | |
| "grad_norm": 0.39514612685466244, | |
| "learning_rate": 2.2603178267370504e-05, | |
| "loss": 0.1643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2017160952091217, | |
| "step": 2215, | |
| "valid_targets_mean": 5409.3, | |
| "valid_targets_min": 1364 | |
| }, | |
| { | |
| "epoch": 3.59805510534846, | |
| "grad_norm": 0.384040585215328, | |
| "learning_rate": 2.2523021711445746e-05, | |
| "loss": 0.1696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17937374114990234, | |
| "step": 2220, | |
| "valid_targets_mean": 4387.1, | |
| "valid_targets_min": 1391 | |
| }, | |
| { | |
| "epoch": 3.6061588330632093, | |
| "grad_norm": 0.3998607141504835, | |
| "learning_rate": 2.2442823952309308e-05, | |
| "loss": 0.183, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16963882744312286, | |
| "step": 2225, | |
| "valid_targets_mean": 4776.0, | |
| "valid_targets_min": 1444 | |
| }, | |
| { | |
| "epoch": 3.614262560777958, | |
| "grad_norm": 0.3659718411449704, | |
| "learning_rate": 2.2362586299662642e-05, | |
| "loss": 0.1714, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1675758808851242, | |
| "step": 2230, | |
| "valid_targets_mean": 5171.9, | |
| "valid_targets_min": 1385 | |
| }, | |
| { | |
| "epoch": 3.6223662884927066, | |
| "grad_norm": 0.3723845904154494, | |
| "learning_rate": 2.228231006385873e-05, | |
| "loss": 0.1641, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19093650579452515, | |
| "step": 2235, | |
| "valid_targets_mean": 5510.3, | |
| "valid_targets_min": 1457 | |
| }, | |
| { | |
| "epoch": 3.6304700162074557, | |
| "grad_norm": 0.3889362488963387, | |
| "learning_rate": 2.2201996555880633e-05, | |
| "loss": 0.1632, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14821797609329224, | |
| "step": 2240, | |
| "valid_targets_mean": 4622.1, | |
| "valid_targets_min": 1273 | |
| }, | |
| { | |
| "epoch": 3.6385737439222043, | |
| "grad_norm": 0.4083309341620573, | |
| "learning_rate": 2.2121647087320105e-05, | |
| "loss": 0.1719, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16050675511360168, | |
| "step": 2245, | |
| "valid_targets_mean": 4249.2, | |
| "valid_targets_min": 1355 | |
| }, | |
| { | |
| "epoch": 3.646677471636953, | |
| "grad_norm": 0.40763578479365914, | |
| "learning_rate": 2.204126297035617e-05, | |
| "loss": 0.1767, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16511400043964386, | |
| "step": 2250, | |
| "valid_targets_mean": 4262.1, | |
| "valid_targets_min": 1390 | |
| }, | |
| { | |
| "epoch": 3.6547811993517016, | |
| "grad_norm": 0.4033094775868518, | |
| "learning_rate": 2.196084551773368e-05, | |
| "loss": 0.1521, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18204748630523682, | |
| "step": 2255, | |
| "valid_targets_mean": 4871.3, | |
| "valid_targets_min": 1654 | |
| }, | |
| { | |
| "epoch": 3.6628849270664503, | |
| "grad_norm": 0.37714115301157575, | |
| "learning_rate": 2.1880396042741906e-05, | |
| "loss": 0.1556, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16021430492401123, | |
| "step": 2260, | |
| "valid_targets_mean": 4672.0, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 3.6709886547811994, | |
| "grad_norm": 0.3923717006496399, | |
| "learning_rate": 2.179991585919307e-05, | |
| "loss": 0.1699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15458917617797852, | |
| "step": 2265, | |
| "valid_targets_mean": 4315.2, | |
| "valid_targets_min": 1298 | |
| }, | |
| { | |
| "epoch": 3.679092382495948, | |
| "grad_norm": 0.4429131605887363, | |
| "learning_rate": 2.1719406281400873e-05, | |
| "loss": 0.1708, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15454357862472534, | |
| "step": 2270, | |
| "valid_targets_mean": 3315.8, | |
| "valid_targets_min": 845 | |
| }, | |
| { | |
| "epoch": 3.6871961102106967, | |
| "grad_norm": 0.3727256205850918, | |
| "learning_rate": 2.163886862415908e-05, | |
| "loss": 0.1844, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19542384147644043, | |
| "step": 2275, | |
| "valid_targets_mean": 5506.1, | |
| "valid_targets_min": 1856 | |
| }, | |
| { | |
| "epoch": 3.695299837925446, | |
| "grad_norm": 0.41872269455653544, | |
| "learning_rate": 2.155830420272e-05, | |
| "loss": 0.1749, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1771114021539688, | |
| "step": 2280, | |
| "valid_targets_mean": 4206.1, | |
| "valid_targets_min": 914 | |
| }, | |
| { | |
| "epoch": 3.7034035656401945, | |
| "grad_norm": 0.38039050268682023, | |
| "learning_rate": 2.1477714332773022e-05, | |
| "loss": 0.1629, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15340042114257812, | |
| "step": 2285, | |
| "valid_targets_mean": 4411.3, | |
| "valid_targets_min": 1327 | |
| }, | |
| { | |
| "epoch": 3.711507293354943, | |
| "grad_norm": 0.38361661527060936, | |
| "learning_rate": 2.139710033042314e-05, | |
| "loss": 0.1551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14032137393951416, | |
| "step": 2290, | |
| "valid_targets_mean": 4275.1, | |
| "valid_targets_min": 1308 | |
| }, | |
| { | |
| "epoch": 3.719611021069692, | |
| "grad_norm": 0.37278829695234467, | |
| "learning_rate": 2.1316463512169453e-05, | |
| "loss": 0.1642, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17240861058235168, | |
| "step": 2295, | |
| "valid_targets_mean": 5224.4, | |
| "valid_targets_min": 2093 | |
| }, | |
| { | |
| "epoch": 3.727714748784441, | |
| "grad_norm": 0.41452369103684505, | |
| "learning_rate": 2.1235805194883665e-05, | |
| "loss": 0.1779, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15849941968917847, | |
| "step": 2300, | |
| "valid_targets_mean": 4130.9, | |
| "valid_targets_min": 709 | |
| }, | |
| { | |
| "epoch": 3.7358184764991895, | |
| "grad_norm": 0.4105117214988862, | |
| "learning_rate": 2.115512669578857e-05, | |
| "loss": 0.1619, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17380082607269287, | |
| "step": 2305, | |
| "valid_targets_mean": 4336.1, | |
| "valid_targets_min": 1375 | |
| }, | |
| { | |
| "epoch": 3.7439222042139386, | |
| "grad_norm": 0.40472321885140583, | |
| "learning_rate": 2.107442933243656e-05, | |
| "loss": 0.165, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15123385190963745, | |
| "step": 2310, | |
| "valid_targets_mean": 3974.2, | |
| "valid_targets_min": 987 | |
| }, | |
| { | |
| "epoch": 3.7520259319286873, | |
| "grad_norm": 0.39922382065316403, | |
| "learning_rate": 2.099371442268809e-05, | |
| "loss": 0.1606, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1767762154340744, | |
| "step": 2315, | |
| "valid_targets_mean": 5035.8, | |
| "valid_targets_min": 1275 | |
| }, | |
| { | |
| "epoch": 3.760129659643436, | |
| "grad_norm": 0.40427130510941617, | |
| "learning_rate": 2.0912983284690157e-05, | |
| "loss": 0.1694, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16852274537086487, | |
| "step": 2320, | |
| "valid_targets_mean": 4461.9, | |
| "valid_targets_min": 1182 | |
| }, | |
| { | |
| "epoch": 3.768233387358185, | |
| "grad_norm": 0.36533036313023187, | |
| "learning_rate": 2.0832237236854794e-05, | |
| "loss": 0.1738, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1574646532535553, | |
| "step": 2325, | |
| "valid_targets_mean": 5237.7, | |
| "valid_targets_min": 2002 | |
| }, | |
| { | |
| "epoch": 3.7763371150729337, | |
| "grad_norm": 0.4491625565997479, | |
| "learning_rate": 2.0751477597837528e-05, | |
| "loss": 0.1746, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15878888964653015, | |
| "step": 2330, | |
| "valid_targets_mean": 3329.7, | |
| "valid_targets_min": 1413 | |
| }, | |
| { | |
| "epoch": 3.7844408427876823, | |
| "grad_norm": 0.42508704076737175, | |
| "learning_rate": 2.0670705686515822e-05, | |
| "loss": 0.1578, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17809738218784332, | |
| "step": 2335, | |
| "valid_targets_mean": 4440.5, | |
| "valid_targets_min": 1481 | |
| }, | |
| { | |
| "epoch": 3.792544570502431, | |
| "grad_norm": 0.368677101634329, | |
| "learning_rate": 2.0589922821967566e-05, | |
| "loss": 0.164, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14476372301578522, | |
| "step": 2340, | |
| "valid_targets_mean": 4738.7, | |
| "valid_targets_min": 2581 | |
| }, | |
| { | |
| "epoch": 3.8006482982171796, | |
| "grad_norm": 0.3513444622347881, | |
| "learning_rate": 2.0509130323449545e-05, | |
| "loss": 0.1543, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14614984393119812, | |
| "step": 2345, | |
| "valid_targets_mean": 5030.6, | |
| "valid_targets_min": 1391 | |
| }, | |
| { | |
| "epoch": 3.8087520259319287, | |
| "grad_norm": 0.38888200892854763, | |
| "learning_rate": 2.0428329510375838e-05, | |
| "loss": 0.1782, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18158888816833496, | |
| "step": 2350, | |
| "valid_targets_mean": 5079.9, | |
| "valid_targets_min": 658 | |
| }, | |
| { | |
| "epoch": 3.8168557536466774, | |
| "grad_norm": 0.39272999735098785, | |
| "learning_rate": 2.0347521702296333e-05, | |
| "loss": 0.1909, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17020182311534882, | |
| "step": 2355, | |
| "valid_targets_mean": 4373.2, | |
| "valid_targets_min": 1442 | |
| }, | |
| { | |
| "epoch": 3.824959481361426, | |
| "grad_norm": 0.4232818574011928, | |
| "learning_rate": 2.026670821887516e-05, | |
| "loss": 0.1684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16656385362148285, | |
| "step": 2360, | |
| "valid_targets_mean": 3902.8, | |
| "valid_targets_min": 1268 | |
| }, | |
| { | |
| "epoch": 3.833063209076175, | |
| "grad_norm": 0.41216984193887707, | |
| "learning_rate": 2.0185890379869115e-05, | |
| "loss": 0.1592, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17816784977912903, | |
| "step": 2365, | |
| "valid_targets_mean": 4348.3, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 3.841166936790924, | |
| "grad_norm": 0.44398530330040875, | |
| "learning_rate": 2.0105069505106126e-05, | |
| "loss": 0.1797, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19711148738861084, | |
| "step": 2370, | |
| "valid_targets_mean": 3858.1, | |
| "valid_targets_min": 953 | |
| }, | |
| { | |
| "epoch": 3.8492706645056725, | |
| "grad_norm": 0.4232380513339039, | |
| "learning_rate": 2.00242469144637e-05, | |
| "loss": 0.1699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20543363690376282, | |
| "step": 2375, | |
| "valid_targets_mean": 4407.2, | |
| "valid_targets_min": 965 | |
| }, | |
| { | |
| "epoch": 3.8573743922204216, | |
| "grad_norm": 0.3900183593026856, | |
| "learning_rate": 1.994342392784738e-05, | |
| "loss": 0.1585, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16155295073986053, | |
| "step": 2380, | |
| "valid_targets_mean": 5209.2, | |
| "valid_targets_min": 1626 | |
| }, | |
| { | |
| "epoch": 3.86547811993517, | |
| "grad_norm": 0.4382394230770189, | |
| "learning_rate": 1.9862601865169154e-05, | |
| "loss": 0.1876, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19150716066360474, | |
| "step": 2385, | |
| "valid_targets_mean": 4058.0, | |
| "valid_targets_min": 2171 | |
| }, | |
| { | |
| "epoch": 3.873581847649919, | |
| "grad_norm": 0.3915766944447351, | |
| "learning_rate": 1.9781782046325938e-05, | |
| "loss": 0.1774, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20614922046661377, | |
| "step": 2390, | |
| "valid_targets_mean": 5446.2, | |
| "valid_targets_min": 2126 | |
| }, | |
| { | |
| "epoch": 3.881685575364668, | |
| "grad_norm": 0.412179929960335, | |
| "learning_rate": 1.9700965791177986e-05, | |
| "loss": 0.1528, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1477297991514206, | |
| "step": 2395, | |
| "valid_targets_mean": 3787.9, | |
| "valid_targets_min": 613 | |
| }, | |
| { | |
| "epoch": 3.8897893030794166, | |
| "grad_norm": 0.3690789892884091, | |
| "learning_rate": 1.9620154419527372e-05, | |
| "loss": 0.1674, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15172114968299866, | |
| "step": 2400, | |
| "valid_targets_mean": 4618.4, | |
| "valid_targets_min": 1325 | |
| }, | |
| { | |
| "epoch": 3.8978930307941653, | |
| "grad_norm": 0.37676014543592173, | |
| "learning_rate": 1.953934925109641e-05, | |
| "loss": 0.1663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1793600618839264, | |
| "step": 2405, | |
| "valid_targets_mean": 6207.1, | |
| "valid_targets_min": 1376 | |
| }, | |
| { | |
| "epoch": 3.9059967585089144, | |
| "grad_norm": 0.376452620738995, | |
| "learning_rate": 1.945855160550611e-05, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1523461639881134, | |
| "step": 2410, | |
| "valid_targets_mean": 4525.1, | |
| "valid_targets_min": 1362 | |
| }, | |
| { | |
| "epoch": 3.914100486223663, | |
| "grad_norm": 0.4485198814325497, | |
| "learning_rate": 1.937776280225463e-05, | |
| "loss": 0.1734, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19850586354732513, | |
| "step": 2415, | |
| "valid_targets_mean": 4109.6, | |
| "valid_targets_min": 1346 | |
| }, | |
| { | |
| "epoch": 3.9222042139384117, | |
| "grad_norm": 0.36670050740221194, | |
| "learning_rate": 1.929698416069571e-05, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12801700830459595, | |
| "step": 2420, | |
| "valid_targets_mean": 4113.8, | |
| "valid_targets_min": 1500 | |
| }, | |
| { | |
| "epoch": 3.9303079416531603, | |
| "grad_norm": 0.4351692157835089, | |
| "learning_rate": 1.9216217000017182e-05, | |
| "loss": 0.1633, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1605074107646942, | |
| "step": 2425, | |
| "valid_targets_mean": 4056.4, | |
| "valid_targets_min": 931 | |
| }, | |
| { | |
| "epoch": 3.938411669367909, | |
| "grad_norm": 0.36574619747694626, | |
| "learning_rate": 1.9135462639219325e-05, | |
| "loss": 0.1558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1689274162054062, | |
| "step": 2430, | |
| "valid_targets_mean": 5585.9, | |
| "valid_targets_min": 1432 | |
| }, | |
| { | |
| "epoch": 3.946515397082658, | |
| "grad_norm": 0.38191010503407485, | |
| "learning_rate": 1.905472239709343e-05, | |
| "loss": 0.1717, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17569342255592346, | |
| "step": 2435, | |
| "valid_targets_mean": 5288.8, | |
| "valid_targets_min": 1293 | |
| }, | |
| { | |
| "epoch": 3.9546191247974067, | |
| "grad_norm": 0.43597968410963234, | |
| "learning_rate": 1.89739975922002e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1510361284017563, | |
| "step": 2440, | |
| "valid_targets_mean": 3357.4, | |
| "valid_targets_min": 1437 | |
| }, | |
| { | |
| "epoch": 3.9627228525121554, | |
| "grad_norm": 0.4065088260858607, | |
| "learning_rate": 1.889328954284823e-05, | |
| "loss": 0.1807, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20322582125663757, | |
| "step": 2445, | |
| "valid_targets_mean": 5177.2, | |
| "valid_targets_min": 1809 | |
| }, | |
| { | |
| "epoch": 3.9708265802269045, | |
| "grad_norm": 0.40878374932726397, | |
| "learning_rate": 1.8812599567072496e-05, | |
| "loss": 0.1708, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16112422943115234, | |
| "step": 2450, | |
| "valid_targets_mean": 4005.9, | |
| "valid_targets_min": 1297 | |
| }, | |
| { | |
| "epoch": 3.978930307941653, | |
| "grad_norm": 0.38156401896539294, | |
| "learning_rate": 1.873192898261281e-05, | |
| "loss": 0.1675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13983334600925446, | |
| "step": 2455, | |
| "valid_targets_mean": 4194.0, | |
| "valid_targets_min": 1475 | |
| }, | |
| { | |
| "epoch": 3.987034035656402, | |
| "grad_norm": 0.3457409189940038, | |
| "learning_rate": 1.8651279106892317e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15533336997032166, | |
| "step": 2460, | |
| "valid_targets_mean": 5028.2, | |
| "valid_targets_min": 1352 | |
| }, | |
| { | |
| "epoch": 3.995137763371151, | |
| "grad_norm": 0.39600319716364357, | |
| "learning_rate": 1.8570651256995933e-05, | |
| "loss": 0.1672, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16132251918315887, | |
| "step": 2465, | |
| "valid_targets_mean": 4323.6, | |
| "valid_targets_min": 808 | |
| }, | |
| { | |
| "epoch": 4.003241491085899, | |
| "grad_norm": 0.3712874056679798, | |
| "learning_rate": 1.849004674964891e-05, | |
| "loss": 0.1685, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1383274495601654, | |
| "step": 2470, | |
| "valid_targets_mean": 4591.7, | |
| "valid_targets_min": 2381 | |
| }, | |
| { | |
| "epoch": 4.011345218800648, | |
| "grad_norm": 0.4067132658580004, | |
| "learning_rate": 1.840946690119528e-05, | |
| "loss": 0.1543, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15070272982120514, | |
| "step": 2475, | |
| "valid_targets_mean": 4897.0, | |
| "valid_targets_min": 1838 | |
| }, | |
| { | |
| "epoch": 4.019448946515397, | |
| "grad_norm": 0.40042514051100025, | |
| "learning_rate": 1.8328913027576373e-05, | |
| "loss": 0.1614, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17309023439884186, | |
| "step": 2480, | |
| "valid_targets_mean": 5681.7, | |
| "valid_targets_min": 1563 | |
| }, | |
| { | |
| "epoch": 4.0275526742301455, | |
| "grad_norm": 0.4256429505889172, | |
| "learning_rate": 1.824838644430934e-05, | |
| "loss": 0.1707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19765028357505798, | |
| "step": 2485, | |
| "valid_targets_mean": 4691.1, | |
| "valid_targets_min": 1464 | |
| }, | |
| { | |
| "epoch": 4.035656401944895, | |
| "grad_norm": 0.4186562721256565, | |
| "learning_rate": 1.8167888466465652e-05, | |
| "loss": 0.1645, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1643456518650055, | |
| "step": 2490, | |
| "valid_targets_mean": 4467.4, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 4.043760129659644, | |
| "grad_norm": 0.429059584351004, | |
| "learning_rate": 1.8087420408649596e-05, | |
| "loss": 0.159, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18240870535373688, | |
| "step": 2495, | |
| "valid_targets_mean": 4909.3, | |
| "valid_targets_min": 1389 | |
| }, | |
| { | |
| "epoch": 4.051863857374392, | |
| "grad_norm": 0.3780733056271099, | |
| "learning_rate": 1.8006983584976877e-05, | |
| "loss": 0.15, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14605969190597534, | |
| "step": 2500, | |
| "valid_targets_mean": 4654.2, | |
| "valid_targets_min": 1355 | |
| }, | |
| { | |
| "epoch": 4.059967585089141, | |
| "grad_norm": 0.3846916169230815, | |
| "learning_rate": 1.7926579309053098e-05, | |
| "loss": 0.1603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16894376277923584, | |
| "step": 2505, | |
| "valid_targets_mean": 4829.9, | |
| "valid_targets_min": 658 | |
| }, | |
| { | |
| "epoch": 4.06807131280389, | |
| "grad_norm": 0.4228920289359892, | |
| "learning_rate": 1.7846208893952346e-05, | |
| "loss": 0.1466, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1531849503517151, | |
| "step": 2510, | |
| "valid_targets_mean": 4364.9, | |
| "valid_targets_min": 1195 | |
| }, | |
| { | |
| "epoch": 4.076175040518638, | |
| "grad_norm": 0.37417154963610766, | |
| "learning_rate": 1.7765873652195713e-05, | |
| "loss": 0.1642, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16471320390701294, | |
| "step": 2515, | |
| "valid_targets_mean": 5711.6, | |
| "valid_targets_min": 2235 | |
| }, | |
| { | |
| "epoch": 4.084278768233387, | |
| "grad_norm": 0.41101129071151, | |
| "learning_rate": 1.7685574895729886e-05, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15926535427570343, | |
| "step": 2520, | |
| "valid_targets_mean": 4601.2, | |
| "valid_targets_min": 578 | |
| }, | |
| { | |
| "epoch": 4.0923824959481365, | |
| "grad_norm": 0.37464269508058695, | |
| "learning_rate": 1.7605313935905722e-05, | |
| "loss": 0.1501, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16744540631771088, | |
| "step": 2525, | |
| "valid_targets_mean": 5294.9, | |
| "valid_targets_min": 1516 | |
| }, | |
| { | |
| "epoch": 4.100486223662885, | |
| "grad_norm": 0.39631540471893606, | |
| "learning_rate": 1.7525092083456795e-05, | |
| "loss": 0.1853, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17988884449005127, | |
| "step": 2530, | |
| "valid_targets_mean": 5167.6, | |
| "valid_targets_min": 1413 | |
| }, | |
| { | |
| "epoch": 4.108589951377634, | |
| "grad_norm": 0.3970097082477904, | |
| "learning_rate": 1.744491064847805e-05, | |
| "loss": 0.1649, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14189140498638153, | |
| "step": 2535, | |
| "valid_targets_mean": 4496.8, | |
| "valid_targets_min": 1527 | |
| }, | |
| { | |
| "epoch": 4.116693679092383, | |
| "grad_norm": 0.4092664053834939, | |
| "learning_rate": 1.7364770940404375e-05, | |
| "loss": 0.1569, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12488947808742523, | |
| "step": 2540, | |
| "valid_targets_mean": 3773.7, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 4.124797406807131, | |
| "grad_norm": 0.4108401889537438, | |
| "learning_rate": 1.7284674267989213e-05, | |
| "loss": 0.15, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1261243373155594, | |
| "step": 2545, | |
| "valid_targets_mean": 3507.6, | |
| "valid_targets_min": 300 | |
| }, | |
| { | |
| "epoch": 4.13290113452188, | |
| "grad_norm": 0.42068807609354025, | |
| "learning_rate": 1.72046219392832e-05, | |
| "loss": 0.1549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15615776181221008, | |
| "step": 2550, | |
| "valid_targets_mean": 4401.0, | |
| "valid_targets_min": 2435 | |
| }, | |
| { | |
| "epoch": 4.1410048622366284, | |
| "grad_norm": 0.4504609491464449, | |
| "learning_rate": 1.712461526161279e-05, | |
| "loss": 0.1613, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1747884750366211, | |
| "step": 2555, | |
| "valid_targets_mean": 4031.8, | |
| "valid_targets_min": 688 | |
| }, | |
| { | |
| "epoch": 4.1491085899513775, | |
| "grad_norm": 0.5275595272741159, | |
| "learning_rate": 1.7044655541558934e-05, | |
| "loss": 0.1575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14213520288467407, | |
| "step": 2560, | |
| "valid_targets_mean": 2747.8, | |
| "valid_targets_min": 931 | |
| }, | |
| { | |
| "epoch": 4.157212317666127, | |
| "grad_norm": 0.4071729089384716, | |
| "learning_rate": 1.69647440849357e-05, | |
| "loss": 0.1706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16947373747825623, | |
| "step": 2565, | |
| "valid_targets_mean": 4519.0, | |
| "valid_targets_min": 1346 | |
| }, | |
| { | |
| "epoch": 4.165316045380875, | |
| "grad_norm": 0.3737831947406853, | |
| "learning_rate": 1.6884882196768985e-05, | |
| "loss": 0.1471, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13542313873767853, | |
| "step": 2570, | |
| "valid_targets_mean": 4573.9, | |
| "valid_targets_min": 1316 | |
| }, | |
| { | |
| "epoch": 4.173419773095624, | |
| "grad_norm": 0.39623962224896625, | |
| "learning_rate": 1.680507118127518e-05, | |
| "loss": 0.1629, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1541873961687088, | |
| "step": 2575, | |
| "valid_targets_mean": 4521.7, | |
| "valid_targets_min": 1449 | |
| }, | |
| { | |
| "epoch": 4.181523500810373, | |
| "grad_norm": 0.4031184428238581, | |
| "learning_rate": 1.6725312341839895e-05, | |
| "loss": 0.1808, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18571238219738007, | |
| "step": 2580, | |
| "valid_targets_mean": 5134.5, | |
| "valid_targets_min": 1309 | |
| }, | |
| { | |
| "epoch": 4.189627228525121, | |
| "grad_norm": 0.4361781205053545, | |
| "learning_rate": 1.664560698099664e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17057061195373535, | |
| "step": 2585, | |
| "valid_targets_mean": 4028.2, | |
| "valid_targets_min": 1416 | |
| }, | |
| { | |
| "epoch": 4.19773095623987, | |
| "grad_norm": 0.4433460811120863, | |
| "learning_rate": 1.6565956400405586e-05, | |
| "loss": 0.1672, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1968003511428833, | |
| "step": 2590, | |
| "valid_targets_mean": 4815.5, | |
| "valid_targets_min": 1462 | |
| }, | |
| { | |
| "epoch": 4.2058346839546195, | |
| "grad_norm": 0.4117119339099135, | |
| "learning_rate": 1.6486361900832284e-05, | |
| "loss": 0.1703, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15136125683784485, | |
| "step": 2595, | |
| "valid_targets_mean": 4527.9, | |
| "valid_targets_min": 1020 | |
| }, | |
| { | |
| "epoch": 4.213938411669368, | |
| "grad_norm": 0.36532371602338826, | |
| "learning_rate": 1.6406824782126428e-05, | |
| "loss": 0.1496, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16010749340057373, | |
| "step": 2600, | |
| "valid_targets_mean": 6099.1, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 4.222042139384117, | |
| "grad_norm": 0.39087983526982967, | |
| "learning_rate": 1.632734634320064e-05, | |
| "loss": 0.1367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12246764451265335, | |
| "step": 2605, | |
| "valid_targets_mean": 4464.7, | |
| "valid_targets_min": 2176 | |
| }, | |
| { | |
| "epoch": 4.230145867098866, | |
| "grad_norm": 0.4272797365502613, | |
| "learning_rate": 1.6247927882009256e-05, | |
| "loss": 0.1598, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14960479736328125, | |
| "step": 2610, | |
| "valid_targets_mean": 4127.4, | |
| "valid_targets_min": 1343 | |
| }, | |
| { | |
| "epoch": 4.238249594813614, | |
| "grad_norm": 0.462725135810111, | |
| "learning_rate": 1.6168570695527096e-05, | |
| "loss": 0.154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14427827298641205, | |
| "step": 2615, | |
| "valid_targets_mean": 3540.2, | |
| "valid_targets_min": 1187 | |
| }, | |
| { | |
| "epoch": 4.246353322528363, | |
| "grad_norm": 0.36206227558488485, | |
| "learning_rate": 1.6089276079728334e-05, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1214427500963211, | |
| "step": 2620, | |
| "valid_targets_mean": 4497.8, | |
| "valid_targets_min": 1410 | |
| }, | |
| { | |
| "epoch": 4.254457050243111, | |
| "grad_norm": 0.40276086433073494, | |
| "learning_rate": 1.6010045329565294e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13030347228050232, | |
| "step": 2625, | |
| "valid_targets_mean": 3962.0, | |
| "valid_targets_min": 1434 | |
| }, | |
| { | |
| "epoch": 4.2625607779578605, | |
| "grad_norm": 0.4387160089339799, | |
| "learning_rate": 1.5930879738947328e-05, | |
| "loss": 0.1681, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14297866821289062, | |
| "step": 2630, | |
| "valid_targets_mean": 3749.6, | |
| "valid_targets_min": 1432 | |
| }, | |
| { | |
| "epoch": 4.27066450567261, | |
| "grad_norm": 0.372668862412784, | |
| "learning_rate": 1.585178060071966e-05, | |
| "loss": 0.1554, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14359818398952484, | |
| "step": 2635, | |
| "valid_targets_mean": 5299.0, | |
| "valid_targets_min": 1386 | |
| }, | |
| { | |
| "epoch": 4.278768233387358, | |
| "grad_norm": 0.4146704722785337, | |
| "learning_rate": 1.5772749206642296e-05, | |
| "loss": 0.1479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1508176326751709, | |
| "step": 2640, | |
| "valid_targets_mean": 4494.6, | |
| "valid_targets_min": 2469 | |
| }, | |
| { | |
| "epoch": 4.286871961102107, | |
| "grad_norm": 0.4054725189068471, | |
| "learning_rate": 1.5693786847368918e-05, | |
| "loss": 0.1577, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13715773820877075, | |
| "step": 2645, | |
| "valid_targets_mean": 5204.2, | |
| "valid_targets_min": 2589 | |
| }, | |
| { | |
| "epoch": 4.294975688816856, | |
| "grad_norm": 0.39034530846194615, | |
| "learning_rate": 1.5614894812425806e-05, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12854069471359253, | |
| "step": 2650, | |
| "valid_targets_mean": 4378.7, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 4.303079416531604, | |
| "grad_norm": 0.36953410992638486, | |
| "learning_rate": 1.5536074390190786e-05, | |
| "loss": 0.1616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1475868672132492, | |
| "step": 2655, | |
| "valid_targets_mean": 5126.3, | |
| "valid_targets_min": 1363 | |
| }, | |
| { | |
| "epoch": 4.311183144246353, | |
| "grad_norm": 0.4138943659186782, | |
| "learning_rate": 1.5457326867872177e-05, | |
| "loss": 0.1459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13638821244239807, | |
| "step": 2660, | |
| "valid_targets_mean": 4109.2, | |
| "valid_targets_min": 1395 | |
| }, | |
| { | |
| "epoch": 4.319286871961102, | |
| "grad_norm": 0.34666953149423, | |
| "learning_rate": 1.5378653531487784e-05, | |
| "loss": 0.1463, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1447892189025879, | |
| "step": 2665, | |
| "valid_targets_mean": 6002.4, | |
| "valid_targets_min": 1608 | |
| }, | |
| { | |
| "epoch": 4.327390599675851, | |
| "grad_norm": 0.4287979661216327, | |
| "learning_rate": 1.5300055665843875e-05, | |
| "loss": 0.1583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15963557362556458, | |
| "step": 2670, | |
| "valid_targets_mean": 4144.3, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 4.3354943273906, | |
| "grad_norm": 0.38789554239222346, | |
| "learning_rate": 1.5221534554514225e-05, | |
| "loss": 0.1651, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15133854746818542, | |
| "step": 2675, | |
| "valid_targets_mean": 5058.6, | |
| "valid_targets_min": 720 | |
| }, | |
| { | |
| "epoch": 4.343598055105349, | |
| "grad_norm": 0.3826558850319014, | |
| "learning_rate": 1.5143091479819146e-05, | |
| "loss": 0.1538, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11236849427223206, | |
| "step": 2680, | |
| "valid_targets_mean": 4173.0, | |
| "valid_targets_min": 1376 | |
| }, | |
| { | |
| "epoch": 4.351701782820097, | |
| "grad_norm": 0.41350564769635967, | |
| "learning_rate": 1.5064727722804531e-05, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1576721966266632, | |
| "step": 2685, | |
| "valid_targets_mean": 4335.1, | |
| "valid_targets_min": 736 | |
| }, | |
| { | |
| "epoch": 4.359805510534846, | |
| "grad_norm": 0.43428870276985304, | |
| "learning_rate": 1.4986444563220948e-05, | |
| "loss": 0.1589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15112698078155518, | |
| "step": 2690, | |
| "valid_targets_mean": 4689.3, | |
| "valid_targets_min": 1463 | |
| }, | |
| { | |
| "epoch": 4.367909238249595, | |
| "grad_norm": 0.37636298776027244, | |
| "learning_rate": 1.4908243279502741e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15192727744579315, | |
| "step": 2695, | |
| "valid_targets_mean": 5081.8, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 4.376012965964343, | |
| "grad_norm": 0.4397966263657783, | |
| "learning_rate": 1.4830125148747138e-05, | |
| "loss": 0.161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20267480611801147, | |
| "step": 2700, | |
| "valid_targets_mean": 4718.6, | |
| "valid_targets_min": 1545 | |
| }, | |
| { | |
| "epoch": 4.3841166936790925, | |
| "grad_norm": 0.43697570472069147, | |
| "learning_rate": 1.475209144669341e-05, | |
| "loss": 0.1714, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17748787999153137, | |
| "step": 2705, | |
| "valid_targets_mean": 4454.5, | |
| "valid_targets_min": 678 | |
| }, | |
| { | |
| "epoch": 4.392220421393841, | |
| "grad_norm": 0.40519216103231753, | |
| "learning_rate": 1.4674143447702036e-05, | |
| "loss": 0.1443, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14861005544662476, | |
| "step": 2710, | |
| "valid_targets_mean": 4173.1, | |
| "valid_targets_min": 1378 | |
| }, | |
| { | |
| "epoch": 4.40032414910859, | |
| "grad_norm": 0.4149261608469548, | |
| "learning_rate": 1.4596282424733877e-05, | |
| "loss": 0.1617, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14318136870861053, | |
| "step": 2715, | |
| "valid_targets_mean": 3568.0, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 4.408427876823339, | |
| "grad_norm": 0.413620830401606, | |
| "learning_rate": 1.4518509649329406e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13617044687271118, | |
| "step": 2720, | |
| "valid_targets_mean": 4053.7, | |
| "valid_targets_min": 1375 | |
| }, | |
| { | |
| "epoch": 4.416531604538087, | |
| "grad_norm": 0.4139055830438707, | |
| "learning_rate": 1.4440826391587926e-05, | |
| "loss": 0.1584, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15039443969726562, | |
| "step": 2725, | |
| "valid_targets_mean": 4649.6, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 4.424635332252836, | |
| "grad_norm": 0.38376787901494286, | |
| "learning_rate": 1.4363233920146855e-05, | |
| "loss": 0.132, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11790202558040619, | |
| "step": 2730, | |
| "valid_targets_mean": 4379.2, | |
| "valid_targets_min": 1343 | |
| }, | |
| { | |
| "epoch": 4.432739059967585, | |
| "grad_norm": 0.4083381210433433, | |
| "learning_rate": 1.4285733502160955e-05, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14305062592029572, | |
| "step": 2735, | |
| "valid_targets_mean": 4222.9, | |
| "valid_targets_min": 2249 | |
| }, | |
| { | |
| "epoch": 4.4408427876823335, | |
| "grad_norm": 0.43656908504947634, | |
| "learning_rate": 1.4208326403281702e-05, | |
| "loss": 0.1632, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16906528174877167, | |
| "step": 2740, | |
| "valid_targets_mean": 4419.1, | |
| "valid_targets_min": 1353 | |
| }, | |
| { | |
| "epoch": 4.448946515397083, | |
| "grad_norm": 0.4003934548959086, | |
| "learning_rate": 1.4131013887636576e-05, | |
| "loss": 0.1439, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15742376446723938, | |
| "step": 2745, | |
| "valid_targets_mean": 4827.4, | |
| "valid_targets_min": 614 | |
| }, | |
| { | |
| "epoch": 4.457050243111832, | |
| "grad_norm": 0.3840972375272, | |
| "learning_rate": 1.4053797217808432e-05, | |
| "loss": 0.1609, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20730182528495789, | |
| "step": 2750, | |
| "valid_targets_mean": 6605.8, | |
| "valid_targets_min": 1338 | |
| }, | |
| { | |
| "epoch": 4.46515397082658, | |
| "grad_norm": 0.51490729353896, | |
| "learning_rate": 1.3976677654814866e-05, | |
| "loss": 0.1582, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1762509047985077, | |
| "step": 2755, | |
| "valid_targets_mean": 3084.2, | |
| "valid_targets_min": 572 | |
| }, | |
| { | |
| "epoch": 4.473257698541329, | |
| "grad_norm": 0.4120105039229711, | |
| "learning_rate": 1.3899656458087647e-05, | |
| "loss": 0.1571, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13307833671569824, | |
| "step": 2760, | |
| "valid_targets_mean": 4038.2, | |
| "valid_targets_min": 1362 | |
| }, | |
| { | |
| "epoch": 4.481361426256078, | |
| "grad_norm": 0.4154926074325571, | |
| "learning_rate": 1.3822734885452136e-05, | |
| "loss": 0.1596, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15126535296440125, | |
| "step": 2765, | |
| "valid_targets_mean": 4166.2, | |
| "valid_targets_min": 1250 | |
| }, | |
| { | |
| "epoch": 4.489465153970826, | |
| "grad_norm": 0.42575125176208933, | |
| "learning_rate": 1.3745914193106715e-05, | |
| "loss": 0.1637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1594770848751068, | |
| "step": 2770, | |
| "valid_targets_mean": 4068.7, | |
| "valid_targets_min": 1298 | |
| }, | |
| { | |
| "epoch": 4.4975688816855754, | |
| "grad_norm": 0.5353069360545429, | |
| "learning_rate": 1.366919563560233e-05, | |
| "loss": 0.1726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2269129455089569, | |
| "step": 2775, | |
| "valid_targets_mean": 3735.1, | |
| "valid_targets_min": 1960 | |
| }, | |
| { | |
| "epoch": 4.5056726094003245, | |
| "grad_norm": 0.4379484746796353, | |
| "learning_rate": 1.3592580465821956e-05, | |
| "loss": 0.1656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17367328703403473, | |
| "step": 2780, | |
| "valid_targets_mean": 3981.8, | |
| "valid_targets_min": 1401 | |
| }, | |
| { | |
| "epoch": 4.513776337115073, | |
| "grad_norm": 0.4081571600651165, | |
| "learning_rate": 1.3516069934960174e-05, | |
| "loss": 0.1603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1333627998828888, | |
| "step": 2785, | |
| "valid_targets_mean": 4328.7, | |
| "valid_targets_min": 1463 | |
| }, | |
| { | |
| "epoch": 4.521880064829822, | |
| "grad_norm": 0.4401935424136, | |
| "learning_rate": 1.3439665292502695e-05, | |
| "loss": 0.164, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17039506137371063, | |
| "step": 2790, | |
| "valid_targets_mean": 4571.1, | |
| "valid_targets_min": 1123 | |
| }, | |
| { | |
| "epoch": 4.52998379254457, | |
| "grad_norm": 0.4136394241777246, | |
| "learning_rate": 1.3363367786205985e-05, | |
| "loss": 0.1612, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1595153957605362, | |
| "step": 2795, | |
| "valid_targets_mean": 4450.9, | |
| "valid_targets_min": 1255 | |
| }, | |
| { | |
| "epoch": 4.538087520259319, | |
| "grad_norm": 0.4715376806814763, | |
| "learning_rate": 1.3287178662076893e-05, | |
| "loss": 0.148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17418047785758972, | |
| "step": 2800, | |
| "valid_targets_mean": 3706.5, | |
| "valid_targets_min": 1336 | |
| }, | |
| { | |
| "epoch": 4.546191247974068, | |
| "grad_norm": 0.4080114691839976, | |
| "learning_rate": 1.3211099164352261e-05, | |
| "loss": 0.1488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1426158845424652, | |
| "step": 2805, | |
| "valid_targets_mean": 4454.5, | |
| "valid_targets_min": 1403 | |
| }, | |
| { | |
| "epoch": 4.5542949756888165, | |
| "grad_norm": 0.40195595415999114, | |
| "learning_rate": 1.3135130535478655e-05, | |
| "loss": 0.1519, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.172957181930542, | |
| "step": 2810, | |
| "valid_targets_mean": 5329.9, | |
| "valid_targets_min": 2400 | |
| }, | |
| { | |
| "epoch": 4.562398703403566, | |
| "grad_norm": 0.4043689543889982, | |
| "learning_rate": 1.3059274016092057e-05, | |
| "loss": 0.1568, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1621810793876648, | |
| "step": 2815, | |
| "valid_targets_mean": 4751.9, | |
| "valid_targets_min": 2420 | |
| }, | |
| { | |
| "epoch": 4.570502431118315, | |
| "grad_norm": 0.40706944938964373, | |
| "learning_rate": 1.2983530844997585e-05, | |
| "loss": 0.153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14708778262138367, | |
| "step": 2820, | |
| "valid_targets_mean": 4328.3, | |
| "valid_targets_min": 1470 | |
| }, | |
| { | |
| "epoch": 4.578606158833063, | |
| "grad_norm": 0.44577995316271046, | |
| "learning_rate": 1.2907902259149287e-05, | |
| "loss": 0.1403, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14407837390899658, | |
| "step": 2825, | |
| "valid_targets_mean": 3709.7, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 4.586709886547812, | |
| "grad_norm": 0.43153844741457964, | |
| "learning_rate": 1.2832389493629928e-05, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1508851945400238, | |
| "step": 2830, | |
| "valid_targets_mean": 4266.7, | |
| "valid_targets_min": 1377 | |
| }, | |
| { | |
| "epoch": 4.594813614262561, | |
| "grad_norm": 0.4876086623263469, | |
| "learning_rate": 1.275699378163083e-05, | |
| "loss": 0.1542, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14464884996414185, | |
| "step": 2835, | |
| "valid_targets_mean": 3399.9, | |
| "valid_targets_min": 1327 | |
| }, | |
| { | |
| "epoch": 4.602917341977309, | |
| "grad_norm": 0.44880362000133106, | |
| "learning_rate": 1.2681716354431704e-05, | |
| "loss": 0.1501, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15762996673583984, | |
| "step": 2840, | |
| "valid_targets_mean": 3883.4, | |
| "valid_targets_min": 475 | |
| }, | |
| { | |
| "epoch": 4.611021069692058, | |
| "grad_norm": 0.38675466641560363, | |
| "learning_rate": 1.2606558441380587e-05, | |
| "loss": 0.1588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15953296422958374, | |
| "step": 2845, | |
| "valid_targets_mean": 4840.3, | |
| "valid_targets_min": 1133 | |
| }, | |
| { | |
| "epoch": 4.6191247974068075, | |
| "grad_norm": 0.472879228012513, | |
| "learning_rate": 1.2531521269873736e-05, | |
| "loss": 0.1409, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15060774981975555, | |
| "step": 2850, | |
| "valid_targets_mean": 3850.8, | |
| "valid_targets_min": 1299 | |
| }, | |
| { | |
| "epoch": 4.627228525121556, | |
| "grad_norm": 0.42192553939479277, | |
| "learning_rate": 1.245660606533559e-05, | |
| "loss": 0.1535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17877057194709778, | |
| "step": 2855, | |
| "valid_targets_mean": 4635.9, | |
| "valid_targets_min": 2261 | |
| }, | |
| { | |
| "epoch": 4.635332252836305, | |
| "grad_norm": 0.4117654319469982, | |
| "learning_rate": 1.2381814051198751e-05, | |
| "loss": 0.1648, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1610351800918579, | |
| "step": 2860, | |
| "valid_targets_mean": 4523.1, | |
| "valid_targets_min": 1408 | |
| }, | |
| { | |
| "epoch": 4.643435980551054, | |
| "grad_norm": 0.40499109611733797, | |
| "learning_rate": 1.2307146448884021e-05, | |
| "loss": 0.16, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20495977997779846, | |
| "step": 2865, | |
| "valid_targets_mean": 6101.8, | |
| "valid_targets_min": 2508 | |
| }, | |
| { | |
| "epoch": 4.651539708265802, | |
| "grad_norm": 0.3999011523794404, | |
| "learning_rate": 1.2232604477780445e-05, | |
| "loss": 0.1656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15106816589832306, | |
| "step": 2870, | |
| "valid_targets_mean": 4945.2, | |
| "valid_targets_min": 2132 | |
| }, | |
| { | |
| "epoch": 4.659643435980551, | |
| "grad_norm": 0.43766033396391785, | |
| "learning_rate": 1.2158189355225382e-05, | |
| "loss": 0.16, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16236332058906555, | |
| "step": 2875, | |
| "valid_targets_mean": 4062.0, | |
| "valid_targets_min": 1376 | |
| }, | |
| { | |
| "epoch": 4.667747163695299, | |
| "grad_norm": 0.4352998986314373, | |
| "learning_rate": 1.2083902296484659e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1306292861700058, | |
| "step": 2880, | |
| "valid_targets_mean": 3636.9, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 4.6758508914100485, | |
| "grad_norm": 0.4010496520874604, | |
| "learning_rate": 1.2009744514732698e-05, | |
| "loss": 0.1504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12262415140867233, | |
| "step": 2885, | |
| "valid_targets_mean": 4299.4, | |
| "valid_targets_min": 1561 | |
| }, | |
| { | |
| "epoch": 4.683954619124798, | |
| "grad_norm": 0.4326404264916919, | |
| "learning_rate": 1.1935717221032707e-05, | |
| "loss": 0.1564, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14637967944145203, | |
| "step": 2890, | |
| "valid_targets_mean": 4200.1, | |
| "valid_targets_min": 1452 | |
| }, | |
| { | |
| "epoch": 4.692058346839546, | |
| "grad_norm": 0.37813017019777695, | |
| "learning_rate": 1.1861821624316916e-05, | |
| "loss": 0.1666, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19159114360809326, | |
| "step": 2895, | |
| "valid_targets_mean": 5756.8, | |
| "valid_targets_min": 1276 | |
| }, | |
| { | |
| "epoch": 4.700162074554295, | |
| "grad_norm": 0.4516279524948185, | |
| "learning_rate": 1.1788058931366822e-05, | |
| "loss": 0.1616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15397486090660095, | |
| "step": 2900, | |
| "valid_targets_mean": 4007.5, | |
| "valid_targets_min": 1361 | |
| }, | |
| { | |
| "epoch": 4.708265802269044, | |
| "grad_norm": 0.43035547891945575, | |
| "learning_rate": 1.1714430346793479e-05, | |
| "loss": 0.1593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15408627688884735, | |
| "step": 2905, | |
| "valid_targets_mean": 4192.9, | |
| "valid_targets_min": 1428 | |
| }, | |
| { | |
| "epoch": 4.716369529983792, | |
| "grad_norm": 0.42815560735664004, | |
| "learning_rate": 1.1640937073017837e-05, | |
| "loss": 0.1416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12107810378074646, | |
| "step": 2910, | |
| "valid_targets_mean": 3664.6, | |
| "valid_targets_min": 1463 | |
| }, | |
| { | |
| "epoch": 4.724473257698541, | |
| "grad_norm": 0.40919295231478586, | |
| "learning_rate": 1.1567580310251097e-05, | |
| "loss": 0.1571, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1773926317691803, | |
| "step": 2915, | |
| "valid_targets_mean": 4703.0, | |
| "valid_targets_min": 1690 | |
| }, | |
| { | |
| "epoch": 4.73257698541329, | |
| "grad_norm": 0.45522206805158383, | |
| "learning_rate": 1.1494361256475105e-05, | |
| "loss": 0.1678, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22190578281879425, | |
| "step": 2920, | |
| "valid_targets_mean": 4572.4, | |
| "valid_targets_min": 1368 | |
| }, | |
| { | |
| "epoch": 4.740680713128039, | |
| "grad_norm": 0.5050075348237438, | |
| "learning_rate": 1.1421281107422804e-05, | |
| "loss": 0.1798, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2091386914253235, | |
| "step": 2925, | |
| "valid_targets_mean": 4202.2, | |
| "valid_targets_min": 459 | |
| }, | |
| { | |
| "epoch": 4.748784440842788, | |
| "grad_norm": 0.4245191076711551, | |
| "learning_rate": 1.1348341056558709e-05, | |
| "loss": 0.1659, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15889310836791992, | |
| "step": 2930, | |
| "valid_targets_mean": 4289.3, | |
| "valid_targets_min": 1435 | |
| }, | |
| { | |
| "epoch": 4.756888168557537, | |
| "grad_norm": 0.4329251764412851, | |
| "learning_rate": 1.1275542295059384e-05, | |
| "loss": 0.1587, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1462092250585556, | |
| "step": 2935, | |
| "valid_targets_mean": 4087.2, | |
| "valid_targets_min": 1145 | |
| }, | |
| { | |
| "epoch": 4.764991896272285, | |
| "grad_norm": 0.3802982333434156, | |
| "learning_rate": 1.1202886011794023e-05, | |
| "loss": 0.1552, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13929399847984314, | |
| "step": 2940, | |
| "valid_targets_mean": 4642.4, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 4.773095623987034, | |
| "grad_norm": 0.41209636116235543, | |
| "learning_rate": 1.1130373393305004e-05, | |
| "loss": 0.1538, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19231638312339783, | |
| "step": 2945, | |
| "valid_targets_mean": 5033.9, | |
| "valid_targets_min": 1383 | |
| }, | |
| { | |
| "epoch": 4.781199351701783, | |
| "grad_norm": 0.44427285185848897, | |
| "learning_rate": 1.1058005623788564e-05, | |
| "loss": 0.1639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2078791707754135, | |
| "step": 2950, | |
| "valid_targets_mean": 4785.1, | |
| "valid_targets_min": 1378 | |
| }, | |
| { | |
| "epoch": 4.789303079416531, | |
| "grad_norm": 0.4352310198495233, | |
| "learning_rate": 1.0985783885075407e-05, | |
| "loss": 0.1542, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.150970458984375, | |
| "step": 2955, | |
| "valid_targets_mean": 4012.4, | |
| "valid_targets_min": 1363 | |
| }, | |
| { | |
| "epoch": 4.7974068071312805, | |
| "grad_norm": 0.3848427570190768, | |
| "learning_rate": 1.0913709356611411e-05, | |
| "loss": 0.1732, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15960577130317688, | |
| "step": 2960, | |
| "valid_targets_mean": 5311.9, | |
| "valid_targets_min": 1370 | |
| }, | |
| { | |
| "epoch": 4.805510534846029, | |
| "grad_norm": 0.44803354520963734, | |
| "learning_rate": 1.0841783215438406e-05, | |
| "loss": 0.1578, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13963055610656738, | |
| "step": 2965, | |
| "valid_targets_mean": 3312.8, | |
| "valid_targets_min": 1017 | |
| }, | |
| { | |
| "epoch": 4.813614262560778, | |
| "grad_norm": 0.3988227974129132, | |
| "learning_rate": 1.07700066361749e-05, | |
| "loss": 0.154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15069864690303802, | |
| "step": 2970, | |
| "valid_targets_mean": 4787.4, | |
| "valid_targets_min": 1391 | |
| }, | |
| { | |
| "epoch": 4.821717990275527, | |
| "grad_norm": 0.3915010960709381, | |
| "learning_rate": 1.0698380790996921e-05, | |
| "loss": 0.1485, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1701292246580124, | |
| "step": 2975, | |
| "valid_targets_mean": 5517.6, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 4.829821717990275, | |
| "grad_norm": 0.39542275803497334, | |
| "learning_rate": 1.0626906849618903e-05, | |
| "loss": 0.1551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18241944909095764, | |
| "step": 2980, | |
| "valid_targets_mean": 5030.0, | |
| "valid_targets_min": 1287 | |
| }, | |
| { | |
| "epoch": 4.837925445705024, | |
| "grad_norm": 0.44683950497997893, | |
| "learning_rate": 1.0555585979274513e-05, | |
| "loss": 0.1628, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17230674624443054, | |
| "step": 2985, | |
| "valid_targets_mean": 4003.1, | |
| "valid_targets_min": 1242 | |
| }, | |
| { | |
| "epoch": 4.846029173419773, | |
| "grad_norm": 0.4059870561390617, | |
| "learning_rate": 1.0484419344697667e-05, | |
| "loss": 0.1758, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20989492535591125, | |
| "step": 2990, | |
| "valid_targets_mean": 5568.2, | |
| "valid_targets_min": 1154 | |
| }, | |
| { | |
| "epoch": 4.854132901134522, | |
| "grad_norm": 0.3983641471581764, | |
| "learning_rate": 1.0413408108103445e-05, | |
| "loss": 0.1525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13275814056396484, | |
| "step": 2995, | |
| "valid_targets_mean": 4678.1, | |
| "valid_targets_min": 1610 | |
| }, | |
| { | |
| "epoch": 4.862236628849271, | |
| "grad_norm": 0.40552466865168335, | |
| "learning_rate": 1.0342553429169163e-05, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12622925639152527, | |
| "step": 3000, | |
| "valid_targets_mean": 4144.2, | |
| "valid_targets_min": 1433 | |
| }, | |
| { | |
| "epoch": 4.87034035656402, | |
| "grad_norm": 0.46127170144733654, | |
| "learning_rate": 1.0271856465015388e-05, | |
| "loss": 0.1547, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1375845968723297, | |
| "step": 3005, | |
| "valid_targets_mean": 3773.5, | |
| "valid_targets_min": 1322 | |
| }, | |
| { | |
| "epoch": 4.878444084278768, | |
| "grad_norm": 0.4006716455638935, | |
| "learning_rate": 1.0201318370187065e-05, | |
| "loss": 0.1476, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1406145840883255, | |
| "step": 3010, | |
| "valid_targets_mean": 4621.4, | |
| "valid_targets_min": 1254 | |
| }, | |
| { | |
| "epoch": 4.886547811993517, | |
| "grad_norm": 0.5557739475830915, | |
| "learning_rate": 1.0130940296634683e-05, | |
| "loss": 0.1656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16776862740516663, | |
| "step": 3015, | |
| "valid_targets_mean": 5287.8, | |
| "valid_targets_min": 1799 | |
| }, | |
| { | |
| "epoch": 4.894651539708266, | |
| "grad_norm": 0.4756221948224015, | |
| "learning_rate": 1.0060723393695411e-05, | |
| "loss": 0.151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16840708255767822, | |
| "step": 3020, | |
| "valid_targets_mean": 4141.6, | |
| "valid_targets_min": 1372 | |
| }, | |
| { | |
| "epoch": 4.902755267423014, | |
| "grad_norm": 0.42322941384456886, | |
| "learning_rate": 9.990668808074378e-06, | |
| "loss": 0.1474, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16478174924850464, | |
| "step": 3025, | |
| "valid_targets_mean": 4635.3, | |
| "valid_targets_min": 1330 | |
| }, | |
| { | |
| "epoch": 4.9108589951377635, | |
| "grad_norm": 0.5538397214115858, | |
| "learning_rate": 9.920777683825906e-06, | |
| "loss": 0.1524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16234804689884186, | |
| "step": 3030, | |
| "valid_targets_mean": 4091.6, | |
| "valid_targets_min": 1426 | |
| }, | |
| { | |
| "epoch": 4.918962722852513, | |
| "grad_norm": 0.45256934248315905, | |
| "learning_rate": 9.851051162334871e-06, | |
| "loss": 0.1568, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1707843542098999, | |
| "step": 3035, | |
| "valid_targets_mean": 4235.1, | |
| "valid_targets_min": 953 | |
| }, | |
| { | |
| "epoch": 4.927066450567261, | |
| "grad_norm": 0.42005299538899554, | |
| "learning_rate": 9.781490382298018e-06, | |
| "loss": 0.1574, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14144930243492126, | |
| "step": 3040, | |
| "valid_targets_mean": 4242.2, | |
| "valid_targets_min": 1385 | |
| }, | |
| { | |
| "epoch": 4.93517017828201, | |
| "grad_norm": 0.3750072233886948, | |
| "learning_rate": 9.712096479705382e-06, | |
| "loss": 0.1477, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12406253069639206, | |
| "step": 3045, | |
| "valid_targets_mean": 4638.4, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 4.943273905996758, | |
| "grad_norm": 0.45082908192768895, | |
| "learning_rate": 9.642870587821761e-06, | |
| "loss": 0.1614, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1364569514989853, | |
| "step": 3050, | |
| "valid_targets_mean": 3741.4, | |
| "valid_targets_min": 1223 | |
| }, | |
| { | |
| "epoch": 4.951377633711507, | |
| "grad_norm": 0.3965864418534553, | |
| "learning_rate": 9.573813837168166e-06, | |
| "loss": 0.1481, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15209516882896423, | |
| "step": 3055, | |
| "valid_targets_mean": 4904.0, | |
| "valid_targets_min": 2367 | |
| }, | |
| { | |
| "epoch": 4.959481361426256, | |
| "grad_norm": 0.39620169212275697, | |
| "learning_rate": 9.504927355503399e-06, | |
| "loss": 0.1468, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15747293829917908, | |
| "step": 3060, | |
| "valid_targets_mean": 4879.3, | |
| "valid_targets_min": 1475 | |
| }, | |
| { | |
| "epoch": 4.9675850891410045, | |
| "grad_norm": 0.3607647863147781, | |
| "learning_rate": 9.436212267805591e-06, | |
| "loss": 0.1481, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1285112053155899, | |
| "step": 3065, | |
| "valid_targets_mean": 5197.2, | |
| "valid_targets_min": 1343 | |
| }, | |
| { | |
| "epoch": 4.975688816855754, | |
| "grad_norm": 0.3991247742933118, | |
| "learning_rate": 9.367669696253885e-06, | |
| "loss": 0.1655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16575992107391357, | |
| "step": 3070, | |
| "valid_targets_mean": 4788.8, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 4.983792544570503, | |
| "grad_norm": 0.3829873997175187, | |
| "learning_rate": 9.299300760210059e-06, | |
| "loss": 0.1565, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14778871834278107, | |
| "step": 3075, | |
| "valid_targets_mean": 5312.8, | |
| "valid_targets_min": 1347 | |
| }, | |
| { | |
| "epoch": 4.991896272285251, | |
| "grad_norm": 0.3958938101503756, | |
| "learning_rate": 9.231106576200268e-06, | |
| "loss": 0.1509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16405248641967773, | |
| "step": 3080, | |
| "valid_targets_mean": 5094.3, | |
| "valid_targets_min": 2401 | |
| }, | |
| { | |
| "epoch": 5.0, | |
| "grad_norm": 0.4343165631028096, | |
| "learning_rate": 9.163088257896825e-06, | |
| "loss": 0.1535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17925883829593658, | |
| "step": 3085, | |
| "valid_targets_mean": 4357.8, | |
| "valid_targets_min": 848 | |
| }, | |
| { | |
| "epoch": 5.008103727714749, | |
| "grad_norm": 0.3968269887435741, | |
| "learning_rate": 9.095246916099978e-06, | |
| "loss": 0.1439, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1301729828119278, | |
| "step": 3090, | |
| "valid_targets_mean": 4130.8, | |
| "valid_targets_min": 1417 | |
| }, | |
| { | |
| "epoch": 5.016207455429497, | |
| "grad_norm": 0.3830139509562347, | |
| "learning_rate": 9.027583658719812e-06, | |
| "loss": 0.1474, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14362263679504395, | |
| "step": 3095, | |
| "valid_targets_mean": 5103.4, | |
| "valid_targets_min": 655 | |
| }, | |
| { | |
| "epoch": 5.024311183144246, | |
| "grad_norm": 0.47938454905224653, | |
| "learning_rate": 8.960099590758104e-06, | |
| "loss": 0.1439, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18902552127838135, | |
| "step": 3100, | |
| "valid_targets_mean": 4781.9, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 5.0324149108589955, | |
| "grad_norm": 0.4381421053581367, | |
| "learning_rate": 8.892795814290342e-06, | |
| "loss": 0.15, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15005433559417725, | |
| "step": 3105, | |
| "valid_targets_mean": 4429.2, | |
| "valid_targets_min": 1444 | |
| }, | |
| { | |
| "epoch": 5.040518638573744, | |
| "grad_norm": 0.4876459262258521, | |
| "learning_rate": 8.825673428447668e-06, | |
| "loss": 0.154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13418146967887878, | |
| "step": 3110, | |
| "valid_targets_mean": 3204.8, | |
| "valid_targets_min": 1485 | |
| }, | |
| { | |
| "epoch": 5.048622366288493, | |
| "grad_norm": 0.3989147066059561, | |
| "learning_rate": 8.758733529398945e-06, | |
| "loss": 0.1525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17128801345825195, | |
| "step": 3115, | |
| "valid_targets_mean": 5239.9, | |
| "valid_targets_min": 1425 | |
| }, | |
| { | |
| "epoch": 5.056726094003242, | |
| "grad_norm": 0.4370280823510108, | |
| "learning_rate": 8.691977210332892e-06, | |
| "loss": 0.1315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11790894716978073, | |
| "step": 3120, | |
| "valid_targets_mean": 3613.9, | |
| "valid_targets_min": 1034 | |
| }, | |
| { | |
| "epoch": 5.06482982171799, | |
| "grad_norm": 0.41568029457497896, | |
| "learning_rate": 8.625405561440172e-06, | |
| "loss": 0.1358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14099830389022827, | |
| "step": 3125, | |
| "valid_targets_mean": 4263.2, | |
| "valid_targets_min": 1364 | |
| }, | |
| { | |
| "epoch": 5.072933549432739, | |
| "grad_norm": 0.42290991910310327, | |
| "learning_rate": 8.559019669895648e-06, | |
| "loss": 0.1454, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1221788227558136, | |
| "step": 3130, | |
| "valid_targets_mean": 4256.4, | |
| "valid_targets_min": 1474 | |
| }, | |
| { | |
| "epoch": 5.081037277147487, | |
| "grad_norm": 0.4702701852005545, | |
| "learning_rate": 8.492820619840563e-06, | |
| "loss": 0.1576, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15777115523815155, | |
| "step": 3135, | |
| "valid_targets_mean": 3536.3, | |
| "valid_targets_min": 1438 | |
| }, | |
| { | |
| "epoch": 5.0891410048622365, | |
| "grad_norm": 0.4413433161733371, | |
| "learning_rate": 8.426809492364907e-06, | |
| "loss": 0.1473, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14609038829803467, | |
| "step": 3140, | |
| "valid_targets_mean": 3770.4, | |
| "valid_targets_min": 1291 | |
| }, | |
| { | |
| "epoch": 5.097244732576986, | |
| "grad_norm": 0.4022291115099463, | |
| "learning_rate": 8.360987365489698e-06, | |
| "loss": 0.1575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13611765205860138, | |
| "step": 3145, | |
| "valid_targets_mean": 4807.1, | |
| "valid_targets_min": 1957 | |
| }, | |
| { | |
| "epoch": 5.105348460291734, | |
| "grad_norm": 0.44028849944004317, | |
| "learning_rate": 8.295355314149413e-06, | |
| "loss": 0.1588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1880216896533966, | |
| "step": 3150, | |
| "valid_targets_mean": 4639.2, | |
| "valid_targets_min": 1273 | |
| }, | |
| { | |
| "epoch": 5.113452188006483, | |
| "grad_norm": 0.4105899389205008, | |
| "learning_rate": 8.229914410174435e-06, | |
| "loss": 0.1451, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14482375979423523, | |
| "step": 3155, | |
| "valid_targets_mean": 4465.9, | |
| "valid_targets_min": 1511 | |
| }, | |
| { | |
| "epoch": 5.121555915721232, | |
| "grad_norm": 0.5313511259093264, | |
| "learning_rate": 8.16466572227352e-06, | |
| "loss": 0.1514, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16115616261959076, | |
| "step": 3160, | |
| "valid_targets_mean": 3309.9, | |
| "valid_targets_min": 952 | |
| }, | |
| { | |
| "epoch": 5.12965964343598, | |
| "grad_norm": 0.4059132147272568, | |
| "learning_rate": 8.099610316016373e-06, | |
| "loss": 0.1478, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1520155966281891, | |
| "step": 3165, | |
| "valid_targets_mean": 5199.4, | |
| "valid_targets_min": 1868 | |
| }, | |
| { | |
| "epoch": 5.137763371150729, | |
| "grad_norm": 0.4673866325160744, | |
| "learning_rate": 8.03474925381625e-06, | |
| "loss": 0.1441, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13583573698997498, | |
| "step": 3170, | |
| "valid_targets_mean": 3982.9, | |
| "valid_targets_min": 1495 | |
| }, | |
| { | |
| "epoch": 5.145867098865478, | |
| "grad_norm": 0.41493588873304466, | |
| "learning_rate": 7.97008359491257e-06, | |
| "loss": 0.148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15901492536067963, | |
| "step": 3175, | |
| "valid_targets_mean": 4487.7, | |
| "valid_targets_min": 1652 | |
| }, | |
| { | |
| "epoch": 5.153970826580227, | |
| "grad_norm": 0.3817364201910083, | |
| "learning_rate": 7.905614395353649e-06, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.169359028339386, | |
| "step": 3180, | |
| "valid_targets_mean": 6225.4, | |
| "valid_targets_min": 2198 | |
| }, | |
| { | |
| "epoch": 5.162074554294976, | |
| "grad_norm": 0.4350949526585142, | |
| "learning_rate": 7.841342707979442e-06, | |
| "loss": 0.1638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17552366852760315, | |
| "step": 3185, | |
| "valid_targets_mean": 4653.2, | |
| "valid_targets_min": 1322 | |
| }, | |
| { | |
| "epoch": 5.170178282009725, | |
| "grad_norm": 0.41863754898582967, | |
| "learning_rate": 7.77726958240437e-06, | |
| "loss": 0.1563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1267434060573578, | |
| "step": 3190, | |
| "valid_targets_mean": 4613.6, | |
| "valid_targets_min": 1426 | |
| }, | |
| { | |
| "epoch": 5.178282009724473, | |
| "grad_norm": 0.4347179314293203, | |
| "learning_rate": 7.713396065000133e-06, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17588908970355988, | |
| "step": 3195, | |
| "valid_targets_mean": 4821.8, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 5.186385737439222, | |
| "grad_norm": 0.3741504502759021, | |
| "learning_rate": 7.649723198878676e-06, | |
| "loss": 0.1398, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1253783404827118, | |
| "step": 3200, | |
| "valid_targets_mean": 5491.1, | |
| "valid_targets_min": 701 | |
| }, | |
| { | |
| "epoch": 5.194489465153971, | |
| "grad_norm": 0.4167609797093036, | |
| "learning_rate": 7.586252023875125e-06, | |
| "loss": 0.1473, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17448574304580688, | |
| "step": 3205, | |
| "valid_targets_mean": 4872.2, | |
| "valid_targets_min": 1235 | |
| }, | |
| { | |
| "epoch": 5.2025931928687195, | |
| "grad_norm": 0.49274119267513306, | |
| "learning_rate": 7.522983576530791e-06, | |
| "loss": 0.1339, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1333792507648468, | |
| "step": 3210, | |
| "valid_targets_mean": 3084.2, | |
| "valid_targets_min": 1346 | |
| }, | |
| { | |
| "epoch": 5.210696920583469, | |
| "grad_norm": 0.4274907703145893, | |
| "learning_rate": 7.459918890076272e-06, | |
| "loss": 0.1403, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13575004041194916, | |
| "step": 3215, | |
| "valid_targets_mean": 4470.8, | |
| "valid_targets_min": 1264 | |
| }, | |
| { | |
| "epoch": 5.218800648298217, | |
| "grad_norm": 0.46411535100095974, | |
| "learning_rate": 7.397058994414563e-06, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15877866744995117, | |
| "step": 3220, | |
| "valid_targets_mean": 4287.1, | |
| "valid_targets_min": 1382 | |
| }, | |
| { | |
| "epoch": 5.226904376012966, | |
| "grad_norm": 0.42573424092441686, | |
| "learning_rate": 7.3344049161042495e-06, | |
| "loss": 0.1322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1221034973859787, | |
| "step": 3225, | |
| "valid_targets_mean": 4021.1, | |
| "valid_targets_min": 1400 | |
| }, | |
| { | |
| "epoch": 5.235008103727715, | |
| "grad_norm": 0.4507514969839428, | |
| "learning_rate": 7.271957678342738e-06, | |
| "loss": 0.1481, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1447732299566269, | |
| "step": 3230, | |
| "valid_targets_mean": 4574.4, | |
| "valid_targets_min": 1330 | |
| }, | |
| { | |
| "epoch": 5.243111831442463, | |
| "grad_norm": 0.37145491813213954, | |
| "learning_rate": 7.209718300949519e-06, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1309332549571991, | |
| "step": 3235, | |
| "valid_targets_mean": 5262.1, | |
| "valid_targets_min": 2743 | |
| }, | |
| { | |
| "epoch": 5.251215559157212, | |
| "grad_norm": 0.4714945659818598, | |
| "learning_rate": 7.14768780034957e-06, | |
| "loss": 0.1732, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14775560796260834, | |
| "step": 3240, | |
| "valid_targets_mean": 3638.2, | |
| "valid_targets_min": 1423 | |
| }, | |
| { | |
| "epoch": 5.259319286871961, | |
| "grad_norm": 0.44171814267945775, | |
| "learning_rate": 7.085867189556697e-06, | |
| "loss": 0.1551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19044238328933716, | |
| "step": 3245, | |
| "valid_targets_mean": 4592.8, | |
| "valid_targets_min": 1328 | |
| }, | |
| { | |
| "epoch": 5.26742301458671, | |
| "grad_norm": 0.42451536992860567, | |
| "learning_rate": 7.024257478157015e-06, | |
| "loss": 0.1364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13068664073944092, | |
| "step": 3250, | |
| "valid_targets_mean": 4324.1, | |
| "valid_targets_min": 1433 | |
| }, | |
| { | |
| "epoch": 5.275526742301459, | |
| "grad_norm": 0.4655686708567891, | |
| "learning_rate": 6.96285967229249e-06, | |
| "loss": 0.1589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.165610671043396, | |
| "step": 3255, | |
| "valid_targets_mean": 3945.8, | |
| "valid_targets_min": 1336 | |
| }, | |
| { | |
| "epoch": 5.283630470016208, | |
| "grad_norm": 0.4449132978595166, | |
| "learning_rate": 6.901674774644449e-06, | |
| "loss": 0.1607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18209701776504517, | |
| "step": 3260, | |
| "valid_targets_mean": 4861.0, | |
| "valid_targets_min": 1069 | |
| }, | |
| { | |
| "epoch": 5.291734197730956, | |
| "grad_norm": 0.40137615746667976, | |
| "learning_rate": 6.840703784417262e-06, | |
| "loss": 0.1605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1194748729467392, | |
| "step": 3265, | |
| "valid_targets_mean": 4572.1, | |
| "valid_targets_min": 1018 | |
| }, | |
| { | |
| "epoch": 5.299837925445705, | |
| "grad_norm": 0.4359004956729294, | |
| "learning_rate": 6.779947697321974e-06, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16937534511089325, | |
| "step": 3270, | |
| "valid_targets_mean": 4552.4, | |
| "valid_targets_min": 1345 | |
| }, | |
| { | |
| "epoch": 5.307941653160454, | |
| "grad_norm": 0.4724067916024582, | |
| "learning_rate": 6.719407505560094e-06, | |
| "loss": 0.1461, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1636667549610138, | |
| "step": 3275, | |
| "valid_targets_mean": 4430.4, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 5.316045380875202, | |
| "grad_norm": 0.44441044209123404, | |
| "learning_rate": 6.659084197807348e-06, | |
| "loss": 0.1605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16976942121982574, | |
| "step": 3280, | |
| "valid_targets_mean": 4312.1, | |
| "valid_targets_min": 1353 | |
| }, | |
| { | |
| "epoch": 5.3241491085899515, | |
| "grad_norm": 0.4150836959257879, | |
| "learning_rate": 6.598978759197554e-06, | |
| "loss": 0.1437, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11863559484481812, | |
| "step": 3285, | |
| "valid_targets_mean": 3961.8, | |
| "valid_targets_min": 1429 | |
| }, | |
| { | |
| "epoch": 5.332252836304701, | |
| "grad_norm": 0.4119131054274552, | |
| "learning_rate": 6.539092171306541e-06, | |
| "loss": 0.1478, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15804126858711243, | |
| "step": 3290, | |
| "valid_targets_mean": 4975.1, | |
| "valid_targets_min": 1620 | |
| }, | |
| { | |
| "epoch": 5.340356564019449, | |
| "grad_norm": 0.4431169860534305, | |
| "learning_rate": 6.479425412136093e-06, | |
| "loss": 0.1413, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13536761701107025, | |
| "step": 3295, | |
| "valid_targets_mean": 3993.5, | |
| "valid_targets_min": 1020 | |
| }, | |
| { | |
| "epoch": 5.348460291734198, | |
| "grad_norm": 0.40748142947812366, | |
| "learning_rate": 6.419979456098016e-06, | |
| "loss": 0.1578, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14843331277370453, | |
| "step": 3300, | |
| "valid_targets_mean": 4888.7, | |
| "valid_targets_min": 572 | |
| }, | |
| { | |
| "epoch": 5.356564019448946, | |
| "grad_norm": 0.38063983350274677, | |
| "learning_rate": 6.360755273998174e-06, | |
| "loss": 0.1638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13254444301128387, | |
| "step": 3305, | |
| "valid_targets_mean": 5138.4, | |
| "valid_targets_min": 2549 | |
| }, | |
| { | |
| "epoch": 5.364667747163695, | |
| "grad_norm": 0.41858383988280357, | |
| "learning_rate": 6.301753833020691e-06, | |
| "loss": 0.1621, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1635013222694397, | |
| "step": 3310, | |
| "valid_targets_mean": 5046.3, | |
| "valid_targets_min": 2402 | |
| }, | |
| { | |
| "epoch": 5.372771474878444, | |
| "grad_norm": 0.47226938264774465, | |
| "learning_rate": 6.242976096712112e-06, | |
| "loss": 0.1405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14795152842998505, | |
| "step": 3315, | |
| "valid_targets_mean": 3503.7, | |
| "valid_targets_min": 1286 | |
| }, | |
| { | |
| "epoch": 5.3808752025931925, | |
| "grad_norm": 0.41899647955888863, | |
| "learning_rate": 6.18442302496568e-06, | |
| "loss": 0.14, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15799807012081146, | |
| "step": 3320, | |
| "valid_targets_mean": 4453.1, | |
| "valid_targets_min": 1332 | |
| }, | |
| { | |
| "epoch": 5.388978930307942, | |
| "grad_norm": 0.4445100748493841, | |
| "learning_rate": 6.1260955740056835e-06, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12522292137145996, | |
| "step": 3325, | |
| "valid_targets_mean": 3427.6, | |
| "valid_targets_min": 1299 | |
| }, | |
| { | |
| "epoch": 5.397082658022691, | |
| "grad_norm": 0.4442798105619433, | |
| "learning_rate": 6.067994696371797e-06, | |
| "loss": 0.1568, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17335715889930725, | |
| "step": 3330, | |
| "valid_targets_mean": 4862.9, | |
| "valid_targets_min": 1227 | |
| }, | |
| { | |
| "epoch": 5.405186385737439, | |
| "grad_norm": 0.42669050164619443, | |
| "learning_rate": 6.010121340903574e-06, | |
| "loss": 0.1525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13971629738807678, | |
| "step": 3335, | |
| "valid_targets_mean": 4265.6, | |
| "valid_targets_min": 1599 | |
| }, | |
| { | |
| "epoch": 5.413290113452188, | |
| "grad_norm": 0.424049644257688, | |
| "learning_rate": 5.952476452724898e-06, | |
| "loss": 0.1581, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1421208679676056, | |
| "step": 3340, | |
| "valid_targets_mean": 4199.8, | |
| "valid_targets_min": 1370 | |
| }, | |
| { | |
| "epoch": 5.421393841166937, | |
| "grad_norm": 0.39489552858065036, | |
| "learning_rate": 5.895060973228606e-06, | |
| "loss": 0.1449, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1596226692199707, | |
| "step": 3345, | |
| "valid_targets_mean": 5840.3, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 5.429497568881685, | |
| "grad_norm": 0.3971053859954934, | |
| "learning_rate": 5.837875840061064e-06, | |
| "loss": 0.1518, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.152027428150177, | |
| "step": 3350, | |
| "valid_targets_mean": 5854.6, | |
| "valid_targets_min": 2489 | |
| }, | |
| { | |
| "epoch": 5.437601296596434, | |
| "grad_norm": 0.42180726865326074, | |
| "learning_rate": 5.780921987106878e-06, | |
| "loss": 0.1451, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1312631070613861, | |
| "step": 3355, | |
| "valid_targets_mean": 4105.1, | |
| "valid_targets_min": 1417 | |
| }, | |
| { | |
| "epoch": 5.4457050243111835, | |
| "grad_norm": 0.4268852043834351, | |
| "learning_rate": 5.724200344473651e-06, | |
| "loss": 0.147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14172303676605225, | |
| "step": 3360, | |
| "valid_targets_mean": 4448.6, | |
| "valid_targets_min": 2495 | |
| }, | |
| { | |
| "epoch": 5.453808752025932, | |
| "grad_norm": 0.4342642841720189, | |
| "learning_rate": 5.66771183847677e-06, | |
| "loss": 0.1456, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14243818819522858, | |
| "step": 3365, | |
| "valid_targets_mean": 4433.8, | |
| "valid_targets_min": 2387 | |
| }, | |
| { | |
| "epoch": 5.461912479740681, | |
| "grad_norm": 0.4642434978963273, | |
| "learning_rate": 5.611457391624309e-06, | |
| "loss": 0.1567, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18304626643657684, | |
| "step": 3370, | |
| "valid_targets_mean": 4033.4, | |
| "valid_targets_min": 1502 | |
| }, | |
| { | |
| "epoch": 5.47001620745543, | |
| "grad_norm": 0.4127896613433934, | |
| "learning_rate": 5.555437922601918e-06, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1505068838596344, | |
| "step": 3375, | |
| "valid_targets_mean": 4843.2, | |
| "valid_targets_min": 1356 | |
| }, | |
| { | |
| "epoch": 5.478119935170178, | |
| "grad_norm": 0.4367679456778927, | |
| "learning_rate": 5.499654346257879e-06, | |
| "loss": 0.1447, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13455377519130707, | |
| "step": 3380, | |
| "valid_targets_mean": 4203.2, | |
| "valid_targets_min": 1382 | |
| }, | |
| { | |
| "epoch": 5.486223662884927, | |
| "grad_norm": 0.4701162288125186, | |
| "learning_rate": 5.444107573588116e-06, | |
| "loss": 0.1391, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12258616089820862, | |
| "step": 3385, | |
| "valid_targets_mean": 3157.6, | |
| "valid_targets_min": 1421 | |
| }, | |
| { | |
| "epoch": 5.4943273905996755, | |
| "grad_norm": 0.4412200000803221, | |
| "learning_rate": 5.388798511721329e-06, | |
| "loss": 0.1445, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1565963327884674, | |
| "step": 3390, | |
| "valid_targets_mean": 4411.5, | |
| "valid_targets_min": 459 | |
| }, | |
| { | |
| "epoch": 5.5024311183144246, | |
| "grad_norm": 0.3600623889453669, | |
| "learning_rate": 5.333728063904213e-06, | |
| "loss": 0.1445, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14468525350093842, | |
| "step": 3395, | |
| "valid_targets_mean": 5956.2, | |
| "valid_targets_min": 1337 | |
| }, | |
| { | |
| "epoch": 5.510534846029174, | |
| "grad_norm": 0.4598981790632925, | |
| "learning_rate": 5.278897129486656e-06, | |
| "loss": 0.1503, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14410819113254547, | |
| "step": 3400, | |
| "valid_targets_mean": 3575.8, | |
| "valid_targets_min": 1436 | |
| }, | |
| { | |
| "epoch": 5.518638573743922, | |
| "grad_norm": 0.4580690381488814, | |
| "learning_rate": 5.224306603907095e-06, | |
| "loss": 0.1688, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18686503171920776, | |
| "step": 3405, | |
| "valid_targets_mean": 4421.5, | |
| "valid_targets_min": 844 | |
| }, | |
| { | |
| "epoch": 5.526742301458671, | |
| "grad_norm": 0.40648216673551496, | |
| "learning_rate": 5.169957378677859e-06, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16417232155799866, | |
| "step": 3410, | |
| "valid_targets_mean": 5176.2, | |
| "valid_targets_min": 1453 | |
| }, | |
| { | |
| "epoch": 5.53484602917342, | |
| "grad_norm": 0.42527715899312485, | |
| "learning_rate": 5.11585034137064e-06, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14472799003124237, | |
| "step": 3415, | |
| "valid_targets_mean": 4337.4, | |
| "valid_targets_min": 1441 | |
| }, | |
| { | |
| "epoch": 5.542949756888168, | |
| "grad_norm": 0.4257643164617211, | |
| "learning_rate": 5.061986375601977e-06, | |
| "loss": 0.1491, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1356392502784729, | |
| "step": 3420, | |
| "valid_targets_mean": 4046.2, | |
| "valid_targets_min": 1496 | |
| }, | |
| { | |
| "epoch": 5.551053484602917, | |
| "grad_norm": 0.3765615482008815, | |
| "learning_rate": 5.0083663610188215e-06, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11371708661317825, | |
| "step": 3425, | |
| "valid_targets_mean": 5338.3, | |
| "valid_targets_min": 953 | |
| }, | |
| { | |
| "epoch": 5.5591572123176665, | |
| "grad_norm": 0.3964517008546095, | |
| "learning_rate": 4.954991173284207e-06, | |
| "loss": 0.149, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15065959095954895, | |
| "step": 3430, | |
| "valid_targets_mean": 5702.8, | |
| "valid_targets_min": 1311 | |
| }, | |
| { | |
| "epoch": 5.567260940032415, | |
| "grad_norm": 0.4508220930947401, | |
| "learning_rate": 4.901861684062899e-06, | |
| "loss": 0.1398, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1419251263141632, | |
| "step": 3435, | |
| "valid_targets_mean": 4317.9, | |
| "valid_targets_min": 1380 | |
| }, | |
| { | |
| "epoch": 5.575364667747164, | |
| "grad_norm": 0.38660509987607483, | |
| "learning_rate": 4.848978761007206e-06, | |
| "loss": 0.147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1457797884941101, | |
| "step": 3440, | |
| "valid_targets_mean": 5160.2, | |
| "valid_targets_min": 1217 | |
| }, | |
| { | |
| "epoch": 5.583468395461912, | |
| "grad_norm": 0.38807504896802625, | |
| "learning_rate": 4.796343267742782e-06, | |
| "loss": 0.1636, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1514224410057068, | |
| "step": 3445, | |
| "valid_targets_mean": 5379.7, | |
| "valid_targets_min": 1357 | |
| }, | |
| { | |
| "epoch": 5.591572123176661, | |
| "grad_norm": 0.46275031225232527, | |
| "learning_rate": 4.743956063854529e-06, | |
| "loss": 0.1333, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1503310203552246, | |
| "step": 3450, | |
| "valid_targets_mean": 3940.9, | |
| "valid_targets_min": 1426 | |
| }, | |
| { | |
| "epoch": 5.59967585089141, | |
| "grad_norm": 0.4452707900888768, | |
| "learning_rate": 4.691818004872557e-06, | |
| "loss": 0.1511, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14793214201927185, | |
| "step": 3455, | |
| "valid_targets_mean": 4227.8, | |
| "valid_targets_min": 1309 | |
| }, | |
| { | |
| "epoch": 5.607779578606159, | |
| "grad_norm": 0.41732278622886226, | |
| "learning_rate": 4.639929942258217e-06, | |
| "loss": 0.1419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15394923090934753, | |
| "step": 3460, | |
| "valid_targets_mean": 4887.3, | |
| "valid_targets_min": 1363 | |
| }, | |
| { | |
| "epoch": 5.6158833063209075, | |
| "grad_norm": 0.3755320355039727, | |
| "learning_rate": 4.588292723390204e-06, | |
| "loss": 0.1459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14299200475215912, | |
| "step": 3465, | |
| "valid_targets_mean": 4793.9, | |
| "valid_targets_min": 1847 | |
| }, | |
| { | |
| "epoch": 5.623987034035657, | |
| "grad_norm": 0.4068035472698011, | |
| "learning_rate": 4.536907191550694e-06, | |
| "loss": 0.1521, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16701841354370117, | |
| "step": 3470, | |
| "valid_targets_mean": 5323.9, | |
| "valid_targets_min": 2438 | |
| }, | |
| { | |
| "epoch": 5.632090761750405, | |
| "grad_norm": 0.4033726053317685, | |
| "learning_rate": 4.4857741859116024e-06, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14929088950157166, | |
| "step": 3475, | |
| "valid_targets_mean": 5282.5, | |
| "valid_targets_min": 1223 | |
| }, | |
| { | |
| "epoch": 5.640194489465154, | |
| "grad_norm": 0.4361679512957998, | |
| "learning_rate": 4.434894541520862e-06, | |
| "loss": 0.1533, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16081160306930542, | |
| "step": 3480, | |
| "valid_targets_mean": 5153.8, | |
| "valid_targets_min": 2145 | |
| }, | |
| { | |
| "epoch": 5.648298217179903, | |
| "grad_norm": 0.4767583735219021, | |
| "learning_rate": 4.3842690892887795e-06, | |
| "loss": 0.1439, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14562401175498962, | |
| "step": 3485, | |
| "valid_targets_mean": 3356.0, | |
| "valid_targets_min": 1391 | |
| }, | |
| { | |
| "epoch": 5.656401944894651, | |
| "grad_norm": 0.40801873437269603, | |
| "learning_rate": 4.333898655974484e-06, | |
| "loss": 0.1477, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1436278223991394, | |
| "step": 3490, | |
| "valid_targets_mean": 5208.7, | |
| "valid_targets_min": 2539 | |
| }, | |
| { | |
| "epoch": 5.6645056726094, | |
| "grad_norm": 0.39494427065104426, | |
| "learning_rate": 4.283784064172405e-06, | |
| "loss": 0.1417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1220889762043953, | |
| "step": 3495, | |
| "valid_targets_mean": 4590.6, | |
| "valid_targets_min": 1449 | |
| }, | |
| { | |
| "epoch": 5.672609400324149, | |
| "grad_norm": 0.3727393025793469, | |
| "learning_rate": 4.233926132298867e-06, | |
| "loss": 0.142, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12601833045482635, | |
| "step": 3500, | |
| "valid_targets_mean": 5080.4, | |
| "valid_targets_min": 1314 | |
| }, | |
| { | |
| "epoch": 5.680713128038898, | |
| "grad_norm": 0.4579620633493333, | |
| "learning_rate": 4.1843256745787e-06, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14137154817581177, | |
| "step": 3505, | |
| "valid_targets_mean": 4364.9, | |
| "valid_targets_min": 1103 | |
| }, | |
| { | |
| "epoch": 5.688816855753647, | |
| "grad_norm": 0.4210006457716337, | |
| "learning_rate": 4.134983501031942e-06, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14210812747478485, | |
| "step": 3510, | |
| "valid_targets_mean": 4854.4, | |
| "valid_targets_min": 1520 | |
| }, | |
| { | |
| "epoch": 5.696920583468396, | |
| "grad_norm": 0.44180877138160274, | |
| "learning_rate": 4.085900417460633e-06, | |
| "loss": 0.1554, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17552919685840607, | |
| "step": 3515, | |
| "valid_targets_mean": 4908.9, | |
| "valid_targets_min": 1473 | |
| }, | |
| { | |
| "epoch": 5.705024311183144, | |
| "grad_norm": 0.44830779855225567, | |
| "learning_rate": 4.037077225435628e-06, | |
| "loss": 0.1502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12323147058486938, | |
| "step": 3520, | |
| "valid_targets_mean": 3771.7, | |
| "valid_targets_min": 1324 | |
| }, | |
| { | |
| "epoch": 5.713128038897893, | |
| "grad_norm": 0.37678935263590063, | |
| "learning_rate": 3.988514722283523e-06, | |
| "loss": 0.1463, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12797431647777557, | |
| "step": 3525, | |
| "valid_targets_mean": 5230.3, | |
| "valid_targets_min": 2085 | |
| }, | |
| { | |
| "epoch": 5.721231766612641, | |
| "grad_norm": 0.43358458671138017, | |
| "learning_rate": 3.940213701073636e-06, | |
| "loss": 0.1502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17389845848083496, | |
| "step": 3530, | |
| "valid_targets_mean": 4511.4, | |
| "valid_targets_min": 1178 | |
| }, | |
| { | |
| "epoch": 5.72933549432739, | |
| "grad_norm": 0.46038006750182087, | |
| "learning_rate": 3.892174950605039e-06, | |
| "loss": 0.1479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14572468400001526, | |
| "step": 3535, | |
| "valid_targets_mean": 4509.4, | |
| "valid_targets_min": 1453 | |
| }, | |
| { | |
| "epoch": 5.7374392220421395, | |
| "grad_norm": 0.4492541988440627, | |
| "learning_rate": 3.844399255393705e-06, | |
| "loss": 0.1513, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1724633127450943, | |
| "step": 3540, | |
| "valid_targets_mean": 4466.0, | |
| "valid_targets_min": 1992 | |
| }, | |
| { | |
| "epoch": 5.745542949756889, | |
| "grad_norm": 0.4795417181415384, | |
| "learning_rate": 3.7968873956596563e-06, | |
| "loss": 0.1693, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21007636189460754, | |
| "step": 3545, | |
| "valid_targets_mean": 4530.2, | |
| "valid_targets_min": 676 | |
| }, | |
| { | |
| "epoch": 5.753646677471637, | |
| "grad_norm": 0.46983777722992826, | |
| "learning_rate": 3.749640147314264e-06, | |
| "loss": 0.1344, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1398407220840454, | |
| "step": 3550, | |
| "valid_targets_mean": 3593.2, | |
| "valid_targets_min": 1362 | |
| }, | |
| { | |
| "epoch": 5.761750405186386, | |
| "grad_norm": 0.38049959215799295, | |
| "learning_rate": 3.7026582819475443e-06, | |
| "loss": 0.1455, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12801112234592438, | |
| "step": 3555, | |
| "valid_targets_mean": 4737.4, | |
| "valid_targets_min": 2235 | |
| }, | |
| { | |
| "epoch": 5.769854132901134, | |
| "grad_norm": 0.461054141076095, | |
| "learning_rate": 3.6559425668155733e-06, | |
| "loss": 0.1682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17298737168312073, | |
| "step": 3560, | |
| "valid_targets_mean": 3996.0, | |
| "valid_targets_min": 578 | |
| }, | |
| { | |
| "epoch": 5.777957860615883, | |
| "grad_norm": 0.4360822162589034, | |
| "learning_rate": 3.6094937648279647e-06, | |
| "loss": 0.1459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14468207955360413, | |
| "step": 3565, | |
| "valid_targets_mean": 4477.3, | |
| "valid_targets_min": 1338 | |
| }, | |
| { | |
| "epoch": 5.786061588330632, | |
| "grad_norm": 0.4418639601057002, | |
| "learning_rate": 3.563312634535383e-06, | |
| "loss": 0.1592, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.174360454082489, | |
| "step": 3570, | |
| "valid_targets_mean": 4328.9, | |
| "valid_targets_min": 1142 | |
| }, | |
| { | |
| "epoch": 5.7941653160453805, | |
| "grad_norm": 0.43052400582331984, | |
| "learning_rate": 3.517399930117196e-06, | |
| "loss": 0.1544, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16029515862464905, | |
| "step": 3575, | |
| "valid_targets_mean": 4829.5, | |
| "valid_targets_min": 2103 | |
| }, | |
| { | |
| "epoch": 5.80226904376013, | |
| "grad_norm": 0.4590780539905914, | |
| "learning_rate": 3.4717564013691087e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12398982793092728, | |
| "step": 3580, | |
| "valid_targets_mean": 3912.6, | |
| "valid_targets_min": 1397 | |
| }, | |
| { | |
| "epoch": 5.810372771474879, | |
| "grad_norm": 0.4078367715005851, | |
| "learning_rate": 3.4263827936909744e-06, | |
| "loss": 0.1435, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16300195455551147, | |
| "step": 3585, | |
| "valid_targets_mean": 5371.7, | |
| "valid_targets_min": 1463 | |
| }, | |
| { | |
| "epoch": 5.818476499189627, | |
| "grad_norm": 0.41245112010062046, | |
| "learning_rate": 3.38127984807457e-06, | |
| "loss": 0.1498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1823018491268158, | |
| "step": 3590, | |
| "valid_targets_mean": 5383.8, | |
| "valid_targets_min": 1390 | |
| }, | |
| { | |
| "epoch": 5.826580226904376, | |
| "grad_norm": 0.41228359986338586, | |
| "learning_rate": 3.3364483010915237e-06, | |
| "loss": 0.1528, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16868922114372253, | |
| "step": 3595, | |
| "valid_targets_mean": 5036.3, | |
| "valid_targets_min": 1354 | |
| }, | |
| { | |
| "epoch": 5.834683954619125, | |
| "grad_norm": 0.4025206613962, | |
| "learning_rate": 3.2918888848812913e-06, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1353769749403, | |
| "step": 3600, | |
| "valid_targets_mean": 4976.0, | |
| "valid_targets_min": 2813 | |
| }, | |
| { | |
| "epoch": 5.842787682333873, | |
| "grad_norm": 0.4173151097140713, | |
| "learning_rate": 3.2476023271391698e-06, | |
| "loss": 0.1489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1375451683998108, | |
| "step": 3605, | |
| "valid_targets_mean": 4525.1, | |
| "valid_targets_min": 782 | |
| }, | |
| { | |
| "epoch": 5.8508914100486225, | |
| "grad_norm": 0.4600876492050453, | |
| "learning_rate": 3.2035893511044524e-06, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14605942368507385, | |
| "step": 3610, | |
| "valid_targets_mean": 3913.0, | |
| "valid_targets_min": 970 | |
| }, | |
| { | |
| "epoch": 5.858995137763371, | |
| "grad_norm": 0.3887244887549261, | |
| "learning_rate": 3.159850675548577e-06, | |
| "loss": 0.1566, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13345900177955627, | |
| "step": 3615, | |
| "valid_targets_mean": 4845.1, | |
| "valid_targets_min": 1469 | |
| }, | |
| { | |
| "epoch": 5.86709886547812, | |
| "grad_norm": 0.42651798816381464, | |
| "learning_rate": 3.116387014763429e-06, | |
| "loss": 0.1397, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.126625657081604, | |
| "step": 3620, | |
| "valid_targets_mean": 3999.2, | |
| "valid_targets_min": 1276 | |
| }, | |
| { | |
| "epoch": 5.875202593192869, | |
| "grad_norm": 0.404591471586791, | |
| "learning_rate": 3.073199078549638e-06, | |
| "loss": 0.1498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14635738730430603, | |
| "step": 3625, | |
| "valid_targets_mean": 5098.9, | |
| "valid_targets_min": 2199 | |
| }, | |
| { | |
| "epoch": 5.883306320907618, | |
| "grad_norm": 0.39164350966421785, | |
| "learning_rate": 3.0302875722050064e-06, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13689061999320984, | |
| "step": 3630, | |
| "valid_targets_mean": 5409.9, | |
| "valid_targets_min": 1353 | |
| }, | |
| { | |
| "epoch": 5.891410048622366, | |
| "grad_norm": 0.40314319994514747, | |
| "learning_rate": 2.987653196513003e-06, | |
| "loss": 0.1386, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12066174298524857, | |
| "step": 3635, | |
| "valid_targets_mean": 4769.4, | |
| "valid_targets_min": 1481 | |
| }, | |
| { | |
| "epoch": 5.899513776337115, | |
| "grad_norm": 0.4836471889303801, | |
| "learning_rate": 2.9452966477312815e-06, | |
| "loss": 0.1578, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16042715311050415, | |
| "step": 3640, | |
| "valid_targets_mean": 3923.6, | |
| "valid_targets_min": 715 | |
| }, | |
| { | |
| "epoch": 5.9076175040518635, | |
| "grad_norm": 0.46089519237408594, | |
| "learning_rate": 2.9032186175803545e-06, | |
| "loss": 0.1624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20133355259895325, | |
| "step": 3645, | |
| "valid_targets_mean": 4924.5, | |
| "valid_targets_min": 1456 | |
| }, | |
| { | |
| "epoch": 5.915721231766613, | |
| "grad_norm": 0.4565914501000379, | |
| "learning_rate": 2.8614197932322585e-06, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13373398780822754, | |
| "step": 3650, | |
| "valid_targets_mean": 3931.9, | |
| "valid_targets_min": 1146 | |
| }, | |
| { | |
| "epoch": 5.923824959481362, | |
| "grad_norm": 0.4408552819806134, | |
| "learning_rate": 2.819900857299358e-06, | |
| "loss": 0.1563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15062901377677917, | |
| "step": 3655, | |
| "valid_targets_mean": 4433.2, | |
| "valid_targets_min": 2133 | |
| }, | |
| { | |
| "epoch": 5.93192868719611, | |
| "grad_norm": 0.3933877770279565, | |
| "learning_rate": 2.778662487823187e-06, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14033809304237366, | |
| "step": 3660, | |
| "valid_targets_mean": 5082.8, | |
| "valid_targets_min": 1363 | |
| }, | |
| { | |
| "epoch": 5.940032414910859, | |
| "grad_norm": 0.4435388470880922, | |
| "learning_rate": 2.7377053582633652e-06, | |
| "loss": 0.1518, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1453082263469696, | |
| "step": 3665, | |
| "valid_targets_mean": 4131.4, | |
| "valid_targets_min": 1133 | |
| }, | |
| { | |
| "epoch": 5.948136142625608, | |
| "grad_norm": 0.4303911052570311, | |
| "learning_rate": 2.6970301374866337e-06, | |
| "loss": 0.1355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13601087033748627, | |
| "step": 3670, | |
| "valid_targets_mean": 3857.9, | |
| "valid_targets_min": 1394 | |
| }, | |
| { | |
| "epoch": 5.956239870340356, | |
| "grad_norm": 0.3821087196176658, | |
| "learning_rate": 2.656637489755889e-06, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16024444997310638, | |
| "step": 3675, | |
| "valid_targets_mean": 6057.2, | |
| "valid_targets_min": 2643 | |
| }, | |
| { | |
| "epoch": 5.964343598055105, | |
| "grad_norm": 0.47180883356979997, | |
| "learning_rate": 2.616528074719371e-06, | |
| "loss": 0.1415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1266734004020691, | |
| "step": 3680, | |
| "valid_targets_mean": 3223.6, | |
| "valid_targets_min": 1354 | |
| }, | |
| { | |
| "epoch": 5.9724473257698545, | |
| "grad_norm": 0.42562678571064955, | |
| "learning_rate": 2.576702547399863e-06, | |
| "loss": 0.1407, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13321202993392944, | |
| "step": 3685, | |
| "valid_targets_mean": 4263.2, | |
| "valid_targets_min": 1387 | |
| }, | |
| { | |
| "epoch": 5.980551053484603, | |
| "grad_norm": 0.36066353182020416, | |
| "learning_rate": 2.53716155818402e-06, | |
| "loss": 0.1576, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12179975211620331, | |
| "step": 3690, | |
| "valid_targets_mean": 5267.1, | |
| "valid_targets_min": 1255 | |
| }, | |
| { | |
| "epoch": 5.988654781199352, | |
| "grad_norm": 0.47127497546296976, | |
| "learning_rate": 2.49790575281172e-06, | |
| "loss": 0.1475, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15804405510425568, | |
| "step": 3695, | |
| "valid_targets_mean": 4068.7, | |
| "valid_targets_min": 1112 | |
| }, | |
| { | |
| "epoch": 5.9967585089141, | |
| "grad_norm": 0.412154254601803, | |
| "learning_rate": 2.4589357723655405e-06, | |
| "loss": 0.1418, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12866616249084473, | |
| "step": 3700, | |
| "valid_targets_mean": 4358.7, | |
| "valid_targets_min": 1364 | |
| }, | |
| { | |
| "epoch": 6.004862236628849, | |
| "grad_norm": 0.4074648856856215, | |
| "learning_rate": 2.4202522532602846e-06, | |
| "loss": 0.1463, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16085468232631683, | |
| "step": 3705, | |
| "valid_targets_mean": 4796.6, | |
| "valid_targets_min": 614 | |
| }, | |
| { | |
| "epoch": 6.012965964343598, | |
| "grad_norm": 0.3982979848177161, | |
| "learning_rate": 2.381855827232571e-06, | |
| "loss": 0.1343, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11491286009550095, | |
| "step": 3710, | |
| "valid_targets_mean": 4550.1, | |
| "valid_targets_min": 1490 | |
| }, | |
| { | |
| "epoch": 6.021069692058346, | |
| "grad_norm": 0.41195237608810253, | |
| "learning_rate": 2.343747121330544e-06, | |
| "loss": 0.1406, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12920981645584106, | |
| "step": 3715, | |
| "valid_targets_mean": 4646.1, | |
| "valid_targets_min": 1293 | |
| }, | |
| { | |
| "epoch": 6.0291734197730955, | |
| "grad_norm": 0.3762383380804612, | |
| "learning_rate": 2.3059267579036183e-06, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14210054278373718, | |
| "step": 3720, | |
| "valid_targets_mean": 5317.4, | |
| "valid_targets_min": 2300 | |
| }, | |
| { | |
| "epoch": 6.037277147487845, | |
| "grad_norm": 0.4411386976702003, | |
| "learning_rate": 2.268395354592312e-06, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23801150918006897, | |
| "step": 3725, | |
| "valid_targets_mean": 5265.6, | |
| "valid_targets_min": 1380 | |
| }, | |
| { | |
| "epoch": 6.045380875202593, | |
| "grad_norm": 0.36386178774219285, | |
| "learning_rate": 2.2311535243181637e-06, | |
| "loss": 0.1381, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.10592524707317352, | |
| "step": 3730, | |
| "valid_targets_mean": 5293.6, | |
| "valid_targets_min": 2116 | |
| }, | |
| { | |
| "epoch": 6.053484602917342, | |
| "grad_norm": 0.38730906481519406, | |
| "learning_rate": 2.1942018752737227e-06, | |
| "loss": 0.154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11556782573461533, | |
| "step": 3735, | |
| "valid_targets_mean": 4649.8, | |
| "valid_targets_min": 1440 | |
| }, | |
| { | |
| "epoch": 6.061588330632091, | |
| "grad_norm": 0.450030924562011, | |
| "learning_rate": 2.1575410109126293e-06, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13595318794250488, | |
| "step": 3740, | |
| "valid_targets_mean": 4143.9, | |
| "valid_targets_min": 1770 | |
| }, | |
| { | |
| "epoch": 6.069692058346839, | |
| "grad_norm": 0.4074205018656638, | |
| "learning_rate": 2.121171529939734e-06, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13340912759304047, | |
| "step": 3745, | |
| "valid_targets_mean": 4517.4, | |
| "valid_targets_min": 1548 | |
| }, | |
| { | |
| "epoch": 6.077795786061588, | |
| "grad_norm": 0.4529633278316854, | |
| "learning_rate": 2.085094026301349e-06, | |
| "loss": 0.1624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13528558611869812, | |
| "step": 3750, | |
| "valid_targets_mean": 3715.6, | |
| "valid_targets_min": 1436 | |
| }, | |
| { | |
| "epoch": 6.085899513776337, | |
| "grad_norm": 0.4190960207519973, | |
| "learning_rate": 2.0493090891755262e-06, | |
| "loss": 0.1298, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.133295476436615, | |
| "step": 3755, | |
| "valid_targets_mean": 4487.4, | |
| "valid_targets_min": 2011 | |
| }, | |
| { | |
| "epoch": 6.094003241491086, | |
| "grad_norm": 0.42452718549108076, | |
| "learning_rate": 2.013817302962444e-06, | |
| "loss": 0.1342, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13761006295681, | |
| "step": 3760, | |
| "valid_targets_mean": 4643.9, | |
| "valid_targets_min": 1444 | |
| }, | |
| { | |
| "epoch": 6.102106969205835, | |
| "grad_norm": 0.4272396450205022, | |
| "learning_rate": 1.9786192472748643e-06, | |
| "loss": 0.1345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13067889213562012, | |
| "step": 3765, | |
| "valid_targets_mean": 4262.9, | |
| "valid_targets_min": 1406 | |
| }, | |
| { | |
| "epoch": 6.110210696920584, | |
| "grad_norm": 0.4182294978740752, | |
| "learning_rate": 1.9437154969286577e-06, | |
| "loss": 0.1427, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13712134957313538, | |
| "step": 3770, | |
| "valid_targets_mean": 4474.1, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 6.118314424635332, | |
| "grad_norm": 0.39799797675881166, | |
| "learning_rate": 1.9091066219334365e-06, | |
| "loss": 0.1354, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11920727789402008, | |
| "step": 3775, | |
| "valid_targets_mean": 4693.6, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 6.126418152350081, | |
| "grad_norm": 0.41626093816189375, | |
| "learning_rate": 1.8747931874832325e-06, | |
| "loss": 0.1384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12707899510860443, | |
| "step": 3780, | |
| "valid_targets_mean": 4307.7, | |
| "valid_targets_min": 2364 | |
| }, | |
| { | |
| "epoch": 6.13452188006483, | |
| "grad_norm": 0.4565834650265614, | |
| "learning_rate": 1.8407757539472548e-06, | |
| "loss": 0.1591, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1688140332698822, | |
| "step": 3785, | |
| "valid_targets_mean": 4503.1, | |
| "valid_targets_min": 1358 | |
| }, | |
| { | |
| "epoch": 6.1426256077795784, | |
| "grad_norm": 0.42857367927873613, | |
| "learning_rate": 1.8070548768607744e-06, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11357726156711578, | |
| "step": 3790, | |
| "valid_targets_mean": 4074.0, | |
| "valid_targets_min": 1394 | |
| }, | |
| { | |
| "epoch": 6.1507293354943275, | |
| "grad_norm": 0.4370185122080301, | |
| "learning_rate": 1.773631106915996e-06, | |
| "loss": 0.1387, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1375211924314499, | |
| "step": 3795, | |
| "valid_targets_mean": 4234.0, | |
| "valid_targets_min": 1432 | |
| }, | |
| { | |
| "epoch": 6.158833063209076, | |
| "grad_norm": 0.4109496720135297, | |
| "learning_rate": 1.740504989953129e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.120418019592762, | |
| "step": 3800, | |
| "valid_targets_mean": 4492.1, | |
| "valid_targets_min": 676 | |
| }, | |
| { | |
| "epoch": 6.166936790923825, | |
| "grad_norm": 0.4439327368735157, | |
| "learning_rate": 1.707677066951432e-06, | |
| "loss": 0.1473, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12108451128005981, | |
| "step": 3805, | |
| "valid_targets_mean": 4027.9, | |
| "valid_targets_min": 1434 | |
| }, | |
| { | |
| "epoch": 6.175040518638574, | |
| "grad_norm": 0.4458492212785282, | |
| "learning_rate": 1.6751478740203776e-06, | |
| "loss": 0.1461, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17976900935173035, | |
| "step": 3810, | |
| "valid_targets_mean": 4864.9, | |
| "valid_targets_min": 2482 | |
| }, | |
| { | |
| "epoch": 6.183144246353322, | |
| "grad_norm": 0.4227256889119454, | |
| "learning_rate": 1.6429179423909248e-06, | |
| "loss": 0.1419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1274784505367279, | |
| "step": 3815, | |
| "valid_targets_mean": 4209.2, | |
| "valid_targets_min": 1356 | |
| }, | |
| { | |
| "epoch": 6.191247974068071, | |
| "grad_norm": 0.8626461611916949, | |
| "learning_rate": 1.6109877984068089e-06, | |
| "loss": 0.1324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13011348247528076, | |
| "step": 3820, | |
| "valid_targets_mean": 4491.9, | |
| "valid_targets_min": 2135 | |
| }, | |
| { | |
| "epoch": 6.19935170178282, | |
| "grad_norm": 0.4238788306714976, | |
| "learning_rate": 1.5793579635159883e-06, | |
| "loss": 0.1311, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11838759481906891, | |
| "step": 3825, | |
| "valid_targets_mean": 4214.7, | |
| "valid_targets_min": 1133 | |
| }, | |
| { | |
| "epoch": 6.207455429497569, | |
| "grad_norm": 0.4599834513527078, | |
| "learning_rate": 1.5480289542620686e-06, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11634839326143265, | |
| "step": 3830, | |
| "valid_targets_mean": 3438.7, | |
| "valid_targets_min": 1400 | |
| }, | |
| { | |
| "epoch": 6.215559157212318, | |
| "grad_norm": 0.5108850540567319, | |
| "learning_rate": 1.517001282275936e-06, | |
| "loss": 0.1436, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14523491263389587, | |
| "step": 3835, | |
| "valid_targets_mean": 3825.8, | |
| "valid_targets_min": 836 | |
| }, | |
| { | |
| "epoch": 6.223662884927067, | |
| "grad_norm": 0.45879081355549145, | |
| "learning_rate": 1.486275454267354e-06, | |
| "loss": 0.1339, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13570156693458557, | |
| "step": 3840, | |
| "valid_targets_mean": 3901.1, | |
| "valid_targets_min": 1254 | |
| }, | |
| { | |
| "epoch": 6.231766612641815, | |
| "grad_norm": 0.45552777520225757, | |
| "learning_rate": 1.4558519720166975e-06, | |
| "loss": 0.1526, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15282249450683594, | |
| "step": 3845, | |
| "valid_targets_mean": 4158.1, | |
| "valid_targets_min": 1273 | |
| }, | |
| { | |
| "epoch": 6.239870340356564, | |
| "grad_norm": 0.3713879886251543, | |
| "learning_rate": 1.4257313323667798e-06, | |
| "loss": 0.1286, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11740448325872421, | |
| "step": 3850, | |
| "valid_targets_mean": 4931.4, | |
| "valid_targets_min": 1525 | |
| }, | |
| { | |
| "epoch": 6.247974068071313, | |
| "grad_norm": 0.4063595181498841, | |
| "learning_rate": 1.3959140272146998e-06, | |
| "loss": 0.1406, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15681567788124084, | |
| "step": 3855, | |
| "valid_targets_mean": 5654.3, | |
| "valid_targets_min": 1510 | |
| }, | |
| { | |
| "epoch": 6.256077795786061, | |
| "grad_norm": 0.4542483110057598, | |
| "learning_rate": 1.366400543503854e-06, | |
| "loss": 0.1499, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14490097761154175, | |
| "step": 3860, | |
| "valid_targets_mean": 4448.6, | |
| "valid_targets_min": 1146 | |
| }, | |
| { | |
| "epoch": 6.2641815235008105, | |
| "grad_norm": 0.45532835475196615, | |
| "learning_rate": 1.3371913632159506e-06, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13060614466667175, | |
| "step": 3865, | |
| "valid_targets_mean": 3886.1, | |
| "valid_targets_min": 1264 | |
| }, | |
| { | |
| "epoch": 6.272285251215559, | |
| "grad_norm": 0.45124326933966985, | |
| "learning_rate": 1.3082869633631413e-06, | |
| "loss": 0.153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1641269326210022, | |
| "step": 3870, | |
| "valid_targets_mean": 4320.4, | |
| "valid_targets_min": 2442 | |
| }, | |
| { | |
| "epoch": 6.280388978930308, | |
| "grad_norm": 0.4779066296871159, | |
| "learning_rate": 1.2796878159802595e-06, | |
| "loss": 0.1509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11321790516376495, | |
| "step": 3875, | |
| "valid_targets_mean": 3432.7, | |
| "valid_targets_min": 1321 | |
| }, | |
| { | |
| "epoch": 6.288492706645057, | |
| "grad_norm": 0.4951825756797347, | |
| "learning_rate": 1.2513943881170754e-06, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17373144626617432, | |
| "step": 3880, | |
| "valid_targets_mean": 3927.1, | |
| "valid_targets_min": 1325 | |
| }, | |
| { | |
| "epoch": 6.296596434359805, | |
| "grad_norm": 0.4155816592498077, | |
| "learning_rate": 1.2234071418306903e-06, | |
| "loss": 0.1309, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12273648381233215, | |
| "step": 3885, | |
| "valid_targets_mean": 4243.1, | |
| "valid_targets_min": 1539 | |
| }, | |
| { | |
| "epoch": 6.304700162074554, | |
| "grad_norm": 0.48766069627915243, | |
| "learning_rate": 1.1957265341779855e-06, | |
| "loss": 0.1368, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14697131514549255, | |
| "step": 3890, | |
| "valid_targets_mean": 4181.6, | |
| "valid_targets_min": 1177 | |
| }, | |
| { | |
| "epoch": 6.312803889789303, | |
| "grad_norm": 0.3938083092461327, | |
| "learning_rate": 1.1683530172081592e-06, | |
| "loss": 0.1489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12041988968849182, | |
| "step": 3895, | |
| "valid_targets_mean": 5296.3, | |
| "valid_targets_min": 1306 | |
| }, | |
| { | |
| "epoch": 6.3209076175040515, | |
| "grad_norm": 0.47593597010364025, | |
| "learning_rate": 1.1412870379553387e-06, | |
| "loss": 0.16, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16000404953956604, | |
| "step": 3900, | |
| "valid_targets_mean": 4644.2, | |
| "valid_targets_min": 1309 | |
| }, | |
| { | |
| "epoch": 6.329011345218801, | |
| "grad_norm": 0.3902721038762542, | |
| "learning_rate": 1.1145290384312846e-06, | |
| "loss": 0.158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14155729115009308, | |
| "step": 3905, | |
| "valid_targets_mean": 5272.6, | |
| "valid_targets_min": 1362 | |
| }, | |
| { | |
| "epoch": 6.33711507293355, | |
| "grad_norm": 0.4251038760068053, | |
| "learning_rate": 1.0880794556181762e-06, | |
| "loss": 0.1526, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15772640705108643, | |
| "step": 3910, | |
| "valid_targets_mean": 4669.5, | |
| "valid_targets_min": 1422 | |
| }, | |
| { | |
| "epoch": 6.345218800648298, | |
| "grad_norm": 0.40120884770549975, | |
| "learning_rate": 1.0619387214614662e-06, | |
| "loss": 0.1463, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11130170524120331, | |
| "step": 3915, | |
| "valid_targets_mean": 4427.3, | |
| "valid_targets_min": 1465 | |
| }, | |
| { | |
| "epoch": 6.353322528363047, | |
| "grad_norm": 0.4546439264922523, | |
| "learning_rate": 1.0361072628628354e-06, | |
| "loss": 0.1543, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18124914169311523, | |
| "step": 3920, | |
| "valid_targets_mean": 4763.3, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 6.361426256077796, | |
| "grad_norm": 0.3865469165310739, | |
| "learning_rate": 1.0105855016732113e-06, | |
| "loss": 0.1406, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1440453827381134, | |
| "step": 3925, | |
| "valid_targets_mean": 5420.2, | |
| "valid_targets_min": 1412 | |
| }, | |
| { | |
| "epoch": 6.369529983792544, | |
| "grad_norm": 0.46212783046959593, | |
| "learning_rate": 9.853738546858893e-07, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12405751645565033, | |
| "step": 3930, | |
| "valid_targets_mean": 3459.2, | |
| "valid_targets_min": 1270 | |
| }, | |
| { | |
| "epoch": 6.377633711507293, | |
| "grad_norm": 0.45293906022023217, | |
| "learning_rate": 9.604727336297203e-07, | |
| "loss": 0.1312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1338280439376831, | |
| "step": 3935, | |
| "valid_targets_mean": 3815.8, | |
| "valid_targets_min": 1333 | |
| }, | |
| { | |
| "epoch": 6.3857374392220425, | |
| "grad_norm": 0.4762642698396675, | |
| "learning_rate": 9.358825451623832e-07, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14254039525985718, | |
| "step": 3940, | |
| "valid_targets_mean": 4435.0, | |
| "valid_targets_min": 1307 | |
| }, | |
| { | |
| "epoch": 6.393841166936791, | |
| "grad_norm": 0.40993514188015256, | |
| "learning_rate": 9.116036908637582e-07, | |
| "loss": 0.1429, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13208088278770447, | |
| "step": 3945, | |
| "valid_targets_mean": 4901.5, | |
| "valid_targets_min": 2394 | |
| }, | |
| { | |
| "epoch": 6.40194489465154, | |
| "grad_norm": 0.4050004148081513, | |
| "learning_rate": 8.876365672293441e-07, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.10800512880086899, | |
| "step": 3950, | |
| "valid_targets_mean": 4286.4, | |
| "valid_targets_min": 1533 | |
| }, | |
| { | |
| "epoch": 6.410048622366288, | |
| "grad_norm": 0.4372028368161644, | |
| "learning_rate": 8.639815656638162e-07, | |
| "loss": 0.1391, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14087510108947754, | |
| "step": 3955, | |
| "valid_targets_mean": 4526.8, | |
| "valid_targets_min": 1287 | |
| }, | |
| { | |
| "epoch": 6.418152350081037, | |
| "grad_norm": 0.41474637266063613, | |
| "learning_rate": 8.406390724745961e-07, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17648552358150482, | |
| "step": 3960, | |
| "valid_targets_mean": 6034.9, | |
| "valid_targets_min": 1347 | |
| }, | |
| { | |
| "epoch": 6.426256077795786, | |
| "grad_norm": 0.4048155564619817, | |
| "learning_rate": 8.176094688655789e-07, | |
| "loss": 0.1575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17440642416477203, | |
| "step": 3965, | |
| "valid_targets_mean": 5610.4, | |
| "valid_targets_min": 1310 | |
| }, | |
| { | |
| "epoch": 6.434359805510534, | |
| "grad_norm": 0.4228276820400858, | |
| "learning_rate": 7.948931309308872e-07, | |
| "loss": 0.1504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14183327555656433, | |
| "step": 3970, | |
| "valid_targets_mean": 4442.7, | |
| "valid_targets_min": 1396 | |
| }, | |
| { | |
| "epoch": 6.4424635332252835, | |
| "grad_norm": 0.4454033902831941, | |
| "learning_rate": 7.724904296487246e-07, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13356520235538483, | |
| "step": 3975, | |
| "valid_targets_mean": 4284.1, | |
| "valid_targets_min": 1373 | |
| }, | |
| { | |
| "epoch": 6.450567260940033, | |
| "grad_norm": 0.48618002254936554, | |
| "learning_rate": 7.504017308753386e-07, | |
| "loss": 0.1365, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11197055876255035, | |
| "step": 3980, | |
| "valid_targets_mean": 3097.4, | |
| "valid_targets_min": 837 | |
| }, | |
| { | |
| "epoch": 6.458670988654781, | |
| "grad_norm": 1.1613596162903213, | |
| "learning_rate": 7.286273953390278e-07, | |
| "loss": 0.1575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13441595435142517, | |
| "step": 3985, | |
| "valid_targets_mean": 5556.1, | |
| "valid_targets_min": 1350 | |
| }, | |
| { | |
| "epoch": 6.46677471636953, | |
| "grad_norm": 0.4049486224120147, | |
| "learning_rate": 7.071677786342568e-07, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14194028079509735, | |
| "step": 3990, | |
| "valid_targets_mean": 5311.6, | |
| "valid_targets_min": 1465 | |
| }, | |
| { | |
| "epoch": 6.474878444084279, | |
| "grad_norm": 0.400038102304834, | |
| "learning_rate": 6.860232312158554e-07, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12901249527931213, | |
| "step": 3995, | |
| "valid_targets_mean": 4856.2, | |
| "valid_targets_min": 2179 | |
| }, | |
| { | |
| "epoch": 6.482982171799027, | |
| "grad_norm": 0.4069486273700877, | |
| "learning_rate": 6.651940983932737e-07, | |
| "loss": 0.1404, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12928912043571472, | |
| "step": 4000, | |
| "valid_targets_mean": 4616.2, | |
| "valid_targets_min": 848 | |
| }, | |
| { | |
| "epoch": 6.491085899513776, | |
| "grad_norm": 0.46015930681070816, | |
| "learning_rate": 6.44680720324975e-07, | |
| "loss": 0.1467, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13234779238700867, | |
| "step": 4005, | |
| "valid_targets_mean": 3336.2, | |
| "valid_targets_min": 1390 | |
| }, | |
| { | |
| "epoch": 6.4991896272285254, | |
| "grad_norm": 0.4012511213853744, | |
| "learning_rate": 6.244834320128501e-07, | |
| "loss": 0.1472, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15023906528949738, | |
| "step": 4010, | |
| "valid_targets_mean": 5269.1, | |
| "valid_targets_min": 2527 | |
| }, | |
| { | |
| "epoch": 6.507293354943274, | |
| "grad_norm": 0.47744140560962556, | |
| "learning_rate": 6.0460256329677e-07, | |
| "loss": 0.1387, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1656455397605896, | |
| "step": 4015, | |
| "valid_targets_mean": 4372.4, | |
| "valid_targets_min": 1450 | |
| }, | |
| { | |
| "epoch": 6.515397082658023, | |
| "grad_norm": 0.37939181638585245, | |
| "learning_rate": 5.850384388491814e-07, | |
| "loss": 0.1503, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1422882229089737, | |
| "step": 4020, | |
| "valid_targets_mean": 5767.8, | |
| "valid_targets_min": 2245 | |
| }, | |
| { | |
| "epoch": 6.523500810372772, | |
| "grad_norm": 0.3968129940328583, | |
| "learning_rate": 5.657913781698221e-07, | |
| "loss": 0.1498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13958857953548431, | |
| "step": 4025, | |
| "valid_targets_mean": 5217.1, | |
| "valid_targets_min": 2293 | |
| }, | |
| { | |
| "epoch": 6.53160453808752, | |
| "grad_norm": 0.45533894363588545, | |
| "learning_rate": 5.468616955804873e-07, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1408911496400833, | |
| "step": 4030, | |
| "valid_targets_mean": 4240.6, | |
| "valid_targets_min": 1458 | |
| }, | |
| { | |
| "epoch": 6.539708265802269, | |
| "grad_norm": 0.41456973084927484, | |
| "learning_rate": 5.282497002198983e-07, | |
| "loss": 0.151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1309795379638672, | |
| "step": 4035, | |
| "valid_targets_mean": 4633.8, | |
| "valid_targets_min": 1539 | |
| }, | |
| { | |
| "epoch": 6.547811993517017, | |
| "grad_norm": 0.4344790637851775, | |
| "learning_rate": 5.099556960386686e-07, | |
| "loss": 0.15, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1465933471918106, | |
| "step": 4040, | |
| "valid_targets_mean": 4904.2, | |
| "valid_targets_min": 2034 | |
| }, | |
| { | |
| "epoch": 6.5559157212317665, | |
| "grad_norm": 0.4325462479106516, | |
| "learning_rate": 4.919799817943238e-07, | |
| "loss": 0.155, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14228688180446625, | |
| "step": 4045, | |
| "valid_targets_mean": 5782.6, | |
| "valid_targets_min": 2144 | |
| }, | |
| { | |
| "epoch": 6.564019448946516, | |
| "grad_norm": 0.4754686884062165, | |
| "learning_rate": 4.7432285104642703e-07, | |
| "loss": 0.1498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16148260235786438, | |
| "step": 4050, | |
| "valid_targets_mean": 4153.5, | |
| "valid_targets_min": 677 | |
| }, | |
| { | |
| "epoch": 6.572123176661265, | |
| "grad_norm": 0.42697061151348537, | |
| "learning_rate": 4.569845921517968e-07, | |
| "loss": 0.1536, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16228759288787842, | |
| "step": 4055, | |
| "valid_targets_mean": 5002.3, | |
| "valid_targets_min": 2511 | |
| }, | |
| { | |
| "epoch": 6.580226904376013, | |
| "grad_norm": 0.3845365808911217, | |
| "learning_rate": 4.399654882597726e-07, | |
| "loss": 0.1294, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13468897342681885, | |
| "step": 4060, | |
| "valid_targets_mean": 5712.8, | |
| "valid_targets_min": 1045 | |
| }, | |
| { | |
| "epoch": 6.588330632090762, | |
| "grad_norm": 0.39716277545284767, | |
| "learning_rate": 4.232658173076232e-07, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14139370620250702, | |
| "step": 4065, | |
| "valid_targets_mean": 5950.2, | |
| "valid_targets_min": 1399 | |
| }, | |
| { | |
| "epoch": 6.59643435980551, | |
| "grad_norm": 0.4468084595108756, | |
| "learning_rate": 4.068858520159724e-07, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1694372296333313, | |
| "step": 4070, | |
| "valid_targets_mean": 4465.2, | |
| "valid_targets_min": 709 | |
| }, | |
| { | |
| "epoch": 6.604538087520259, | |
| "grad_norm": 0.3671649079013761, | |
| "learning_rate": 3.9082585988437617e-07, | |
| "loss": 0.157, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12348531186580658, | |
| "step": 4075, | |
| "valid_targets_mean": 5558.6, | |
| "valid_targets_min": 1388 | |
| }, | |
| { | |
| "epoch": 6.612641815235008, | |
| "grad_norm": 0.47981089308216524, | |
| "learning_rate": 3.7508610318693684e-07, | |
| "loss": 0.1549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13009825348854065, | |
| "step": 4080, | |
| "valid_targets_mean": 3335.6, | |
| "valid_targets_min": 1376 | |
| }, | |
| { | |
| "epoch": 6.620745542949757, | |
| "grad_norm": 0.47241227108780176, | |
| "learning_rate": 3.596668389680247e-07, | |
| "loss": 0.1406, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15220530331134796, | |
| "step": 4085, | |
| "valid_targets_mean": 3802.2, | |
| "valid_targets_min": 2348 | |
| }, | |
| { | |
| "epoch": 6.628849270664506, | |
| "grad_norm": 0.4556695876426719, | |
| "learning_rate": 3.445683190380833e-07, | |
| "loss": 0.1361, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12755940854549408, | |
| "step": 4090, | |
| "valid_targets_mean": 3677.8, | |
| "valid_targets_min": 1393 | |
| }, | |
| { | |
| "epoch": 6.636952998379255, | |
| "grad_norm": 0.441358097986932, | |
| "learning_rate": 3.297907899695019e-07, | |
| "loss": 0.1405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13846738636493683, | |
| "step": 4095, | |
| "valid_targets_mean": 4536.4, | |
| "valid_targets_min": 1451 | |
| }, | |
| { | |
| "epoch": 6.645056726094003, | |
| "grad_norm": 0.48127456616667247, | |
| "learning_rate": 3.1533449309262056e-07, | |
| "loss": 0.155, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1750427931547165, | |
| "step": 4100, | |
| "valid_targets_mean": 4037.1, | |
| "valid_targets_min": 1341 | |
| }, | |
| { | |
| "epoch": 6.653160453808752, | |
| "grad_norm": 0.46639173362247416, | |
| "learning_rate": 3.0119966449174474e-07, | |
| "loss": 0.1275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15236026048660278, | |
| "step": 4105, | |
| "valid_targets_mean": 3952.3, | |
| "valid_targets_min": 1369 | |
| }, | |
| { | |
| "epoch": 6.661264181523501, | |
| "grad_norm": 0.47615735862720293, | |
| "learning_rate": 2.8738653500133494e-07, | |
| "loss": 0.1412, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13197392225265503, | |
| "step": 4110, | |
| "valid_targets_mean": 3993.2, | |
| "valid_targets_min": 1323 | |
| }, | |
| { | |
| "epoch": 6.669367909238249, | |
| "grad_norm": 0.45472799651735124, | |
| "learning_rate": 2.738953302022096e-07, | |
| "loss": 0.1363, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.120602548122406, | |
| "step": 4115, | |
| "valid_targets_mean": 3635.4, | |
| "valid_targets_min": 1392 | |
| }, | |
| { | |
| "epoch": 6.6774716369529985, | |
| "grad_norm": 0.4188546649594025, | |
| "learning_rate": 2.6072627041785925e-07, | |
| "loss": 0.1563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13288158178329468, | |
| "step": 4120, | |
| "valid_targets_mean": 4825.9, | |
| "valid_targets_min": 2039 | |
| }, | |
| { | |
| "epoch": 6.685575364667747, | |
| "grad_norm": 0.4639950948551984, | |
| "learning_rate": 2.478795707108672e-07, | |
| "loss": 0.1558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1535397320985794, | |
| "step": 4125, | |
| "valid_targets_mean": 4686.6, | |
| "valid_targets_min": 1323 | |
| }, | |
| { | |
| "epoch": 6.693679092382496, | |
| "grad_norm": 0.41109808884224275, | |
| "learning_rate": 2.3535544087938345e-07, | |
| "loss": 0.1485, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1529480516910553, | |
| "step": 4130, | |
| "valid_targets_mean": 5588.4, | |
| "valid_targets_min": 2517 | |
| }, | |
| { | |
| "epoch": 6.701782820097245, | |
| "grad_norm": 0.45926411106835413, | |
| "learning_rate": 2.2315408545370288e-07, | |
| "loss": 0.156, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1341223120689392, | |
| "step": 4135, | |
| "valid_targets_mean": 4781.3, | |
| "valid_targets_min": 1876 | |
| }, | |
| { | |
| "epoch": 6.709886547811994, | |
| "grad_norm": 0.395000259046136, | |
| "learning_rate": 2.1127570369292361e-07, | |
| "loss": 0.1298, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14368417859077454, | |
| "step": 4140, | |
| "valid_targets_mean": 5097.6, | |
| "valid_targets_min": 613 | |
| }, | |
| { | |
| "epoch": 6.717990275526742, | |
| "grad_norm": 0.46210837076522737, | |
| "learning_rate": 1.9972048958168954e-07, | |
| "loss": 0.1549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18180908262729645, | |
| "step": 4145, | |
| "valid_targets_mean": 4240.0, | |
| "valid_targets_min": 1223 | |
| }, | |
| { | |
| "epoch": 6.726094003241491, | |
| "grad_norm": 0.432229949832389, | |
| "learning_rate": 1.8848863182703513e-07, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1526516079902649, | |
| "step": 4150, | |
| "valid_targets_mean": 5149.5, | |
| "valid_targets_min": 1471 | |
| }, | |
| { | |
| "epoch": 6.7341977309562395, | |
| "grad_norm": 0.44808799433401697, | |
| "learning_rate": 1.7758031385528784e-07, | |
| "loss": 0.1423, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15459507703781128, | |
| "step": 4155, | |
| "valid_targets_mean": 4488.3, | |
| "valid_targets_min": 1343 | |
| }, | |
| { | |
| "epoch": 6.742301458670989, | |
| "grad_norm": 0.44004554566563986, | |
| "learning_rate": 1.6699571380908385e-07, | |
| "loss": 0.1415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15247702598571777, | |
| "step": 4160, | |
| "valid_targets_mean": 4497.5, | |
| "valid_targets_min": 1401 | |
| }, | |
| { | |
| "epoch": 6.750405186385738, | |
| "grad_norm": 0.43890356060699703, | |
| "learning_rate": 1.5673500454445046e-07, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1325896680355072, | |
| "step": 4165, | |
| "valid_targets_mean": 4162.1, | |
| "valid_targets_min": 1421 | |
| }, | |
| { | |
| "epoch": 6.758508914100486, | |
| "grad_norm": 0.39932933043091573, | |
| "learning_rate": 1.4679835362799266e-07, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13079729676246643, | |
| "step": 4170, | |
| "valid_targets_mean": 5510.4, | |
| "valid_targets_min": 1409 | |
| }, | |
| { | |
| "epoch": 6.766612641815235, | |
| "grad_norm": 0.46941869643754247, | |
| "learning_rate": 1.3718592333414881e-07, | |
| "loss": 0.1393, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13742662966251373, | |
| "step": 4175, | |
| "valid_targets_mean": 3786.2, | |
| "valid_targets_min": 1376 | |
| }, | |
| { | |
| "epoch": 6.774716369529984, | |
| "grad_norm": 0.39299371447081305, | |
| "learning_rate": 1.2789787064253934e-07, | |
| "loss": 0.15, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12248679995536804, | |
| "step": 4180, | |
| "valid_targets_mean": 4901.8, | |
| "valid_targets_min": 1500 | |
| }, | |
| { | |
| "epoch": 6.782820097244732, | |
| "grad_norm": 0.43082508409324244, | |
| "learning_rate": 1.1893434723541542e-07, | |
| "loss": 0.1413, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12439413368701935, | |
| "step": 4185, | |
| "valid_targets_mean": 3693.9, | |
| "valid_targets_min": 1382 | |
| }, | |
| { | |
| "epoch": 6.790923824959481, | |
| "grad_norm": 0.4358507033661266, | |
| "learning_rate": 1.1029549949516549e-07, | |
| "loss": 0.1427, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15898624062538147, | |
| "step": 4190, | |
| "valid_targets_mean": 4447.9, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 6.7990275526742305, | |
| "grad_norm": 0.4554334548925996, | |
| "learning_rate": 1.0198146850193935e-07, | |
| "loss": 0.1548, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13918620347976685, | |
| "step": 4195, | |
| "valid_targets_mean": 4084.8, | |
| "valid_targets_min": 1247 | |
| }, | |
| { | |
| "epoch": 6.807131280388979, | |
| "grad_norm": 0.450900839717058, | |
| "learning_rate": 9.399239003132999e-08, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14567112922668457, | |
| "step": 4200, | |
| "valid_targets_mean": 4257.7, | |
| "valid_targets_min": 971 | |
| }, | |
| { | |
| "epoch": 6.815235008103728, | |
| "grad_norm": 0.44464926461952875, | |
| "learning_rate": 8.632839455216869e-08, | |
| "loss": 0.1331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11365213990211487, | |
| "step": 4205, | |
| "valid_targets_mean": 3640.5, | |
| "valid_targets_min": 1265 | |
| }, | |
| { | |
| "epoch": 6.823338735818476, | |
| "grad_norm": 0.4297413425159312, | |
| "learning_rate": 7.89896072243912e-08, | |
| "loss": 0.1449, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17040380835533142, | |
| "step": 4210, | |
| "valid_targets_mean": 5029.8, | |
| "valid_targets_min": 1465 | |
| }, | |
| { | |
| "epoch": 6.831442463533225, | |
| "grad_norm": 0.4265378172405877, | |
| "learning_rate": 7.197614789698604e-08, | |
| "loss": 0.1597, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1628272831439972, | |
| "step": 4215, | |
| "valid_targets_mean": 4827.1, | |
| "valid_targets_min": 2268 | |
| }, | |
| { | |
| "epoch": 6.839546191247974, | |
| "grad_norm": 0.42738757709554764, | |
| "learning_rate": 6.528813110604714e-08, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12047109007835388, | |
| "step": 4220, | |
| "valid_targets_mean": 4246.8, | |
| "valid_targets_min": 688 | |
| }, | |
| { | |
| "epoch": 6.847649918962723, | |
| "grad_norm": 0.5091043582800635, | |
| "learning_rate": 5.8925666072899845e-08, | |
| "loss": 0.1364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1382569521665573, | |
| "step": 4225, | |
| "valid_targets_mean": 3183.8, | |
| "valid_targets_min": 1433 | |
| }, | |
| { | |
| "epoch": 6.855753646677472, | |
| "grad_norm": 0.4138282308157216, | |
| "learning_rate": 5.288885670231336e-08, | |
| "loss": 0.1519, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13277751207351685, | |
| "step": 4230, | |
| "valid_targets_mean": 5006.8, | |
| "valid_targets_min": 1974 | |
| }, | |
| { | |
| "epoch": 6.863857374392221, | |
| "grad_norm": 0.4523192662938565, | |
| "learning_rate": 4.717780158080887e-08, | |
| "loss": 0.1546, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11566020548343658, | |
| "step": 4235, | |
| "valid_targets_mean": 3727.6, | |
| "valid_targets_min": 1486 | |
| }, | |
| { | |
| "epoch": 6.871961102106969, | |
| "grad_norm": 0.37548392402675607, | |
| "learning_rate": 4.1792593975049644e-08, | |
| "loss": 0.1369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13377472758293152, | |
| "step": 4240, | |
| "valid_targets_mean": 5953.8, | |
| "valid_targets_min": 1467 | |
| }, | |
| { | |
| "epoch": 6.880064829821718, | |
| "grad_norm": 0.47007523708634125, | |
| "learning_rate": 3.6733321830315636e-08, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12267227470874786, | |
| "step": 4245, | |
| "valid_targets_mean": 3472.3, | |
| "valid_targets_min": 2085 | |
| }, | |
| { | |
| "epoch": 6.888168557536467, | |
| "grad_norm": 0.42709079057303595, | |
| "learning_rate": 3.200006776906461e-08, | |
| "loss": 0.1387, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1224164217710495, | |
| "step": 4250, | |
| "valid_targets_mean": 3998.2, | |
| "valid_targets_min": 2443 | |
| }, | |
| { | |
| "epoch": 6.896272285251215, | |
| "grad_norm": 0.39239537657898, | |
| "learning_rate": 2.7592909089593224e-08, | |
| "loss": 0.1405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14886951446533203, | |
| "step": 4255, | |
| "valid_targets_mean": 5280.1, | |
| "valid_targets_min": 2546 | |
| }, | |
| { | |
| "epoch": 6.904376012965964, | |
| "grad_norm": 0.3539601345126062, | |
| "learning_rate": 2.351191776475581e-08, | |
| "loss": 0.1442, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12216240912675858, | |
| "step": 4260, | |
| "valid_targets_mean": 5626.0, | |
| "valid_targets_min": 678 | |
| }, | |
| { | |
| "epoch": 6.9124797406807135, | |
| "grad_norm": 0.42164253650702077, | |
| "learning_rate": 1.9757160440814217e-08, | |
| "loss": 0.1479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13502494990825653, | |
| "step": 4265, | |
| "valid_targets_mean": 4376.6, | |
| "valid_targets_min": 1339 | |
| }, | |
| { | |
| "epoch": 6.920583468395462, | |
| "grad_norm": 0.4705700973117692, | |
| "learning_rate": 1.6328698436327562e-08, | |
| "loss": 0.1588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14658623933792114, | |
| "step": 4270, | |
| "valid_targets_mean": 3737.1, | |
| "valid_targets_min": 1351 | |
| }, | |
| { | |
| "epoch": 6.928687196110211, | |
| "grad_norm": 0.41064747368250465, | |
| "learning_rate": 1.3226587741159702e-08, | |
| "loss": 0.1462, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13569799065589905, | |
| "step": 4275, | |
| "valid_targets_mean": 4845.8, | |
| "valid_targets_min": 1316 | |
| }, | |
| { | |
| "epoch": 6.93679092382496, | |
| "grad_norm": 0.42620733839378133, | |
| "learning_rate": 1.0450879015566629e-08, | |
| "loss": 0.1448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1472766101360321, | |
| "step": 4280, | |
| "valid_targets_mean": 4720.4, | |
| "valid_targets_min": 1342 | |
| }, | |
| { | |
| "epoch": 6.944894651539708, | |
| "grad_norm": 0.42942547760976824, | |
| "learning_rate": 8.001617589368238e-09, | |
| "loss": 0.1559, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17669114470481873, | |
| "step": 4285, | |
| "valid_targets_mean": 4890.8, | |
| "valid_targets_min": 546 | |
| }, | |
| { | |
| "epoch": 6.952998379254457, | |
| "grad_norm": 0.4720547278800281, | |
| "learning_rate": 5.878843461200046e-09, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13602668046951294, | |
| "step": 4290, | |
| "valid_targets_mean": 3606.6, | |
| "valid_targets_min": 1438 | |
| }, | |
| { | |
| "epoch": 6.961102106969205, | |
| "grad_norm": 0.4338336156998582, | |
| "learning_rate": 4.082591297873695e-09, | |
| "loss": 0.1385, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18032628297805786, | |
| "step": 4295, | |
| "valid_targets_mean": 4971.6, | |
| "valid_targets_min": 2500 | |
| }, | |
| { | |
| "epoch": 6.9692058346839545, | |
| "grad_norm": 0.41141829681237335, | |
| "learning_rate": 2.6128904338018624e-09, | |
| "loss": 0.1356, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1081404983997345, | |
| "step": 4300, | |
| "valid_targets_mean": 4556.7, | |
| "valid_targets_min": 970 | |
| }, | |
| { | |
| "epoch": 6.977309562398704, | |
| "grad_norm": 0.4549661229722178, | |
| "learning_rate": 1.4697648705186417e-09, | |
| "loss": 0.1586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15801507234573364, | |
| "step": 4305, | |
| "valid_targets_mean": 4676.2, | |
| "valid_targets_min": 1281 | |
| }, | |
| { | |
| "epoch": 6.985413290113452, | |
| "grad_norm": 0.43854485734031307, | |
| "learning_rate": 6.532332762931859e-10, | |
| "loss": 0.1546, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.122274249792099, | |
| "step": 4310, | |
| "valid_targets_mean": 3756.2, | |
| "valid_targets_min": 1417 | |
| }, | |
| { | |
| "epoch": 6.993517017828201, | |
| "grad_norm": 0.5007277079923921, | |
| "learning_rate": 1.6330898581884414e-10, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1444101631641388, | |
| "step": 4315, | |
| "valid_targets_mean": 3494.4, | |
| "valid_targets_min": 587 | |
| }, | |
| { | |
| "epoch": 7.0, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12458416819572449, | |
| "step": 4319, | |
| "total_flos": 1278251302125568.0, | |
| "train_loss": 0.1840376865397341, | |
| "train_runtime": 22883.9619, | |
| "train_samples_per_second": 3.016, | |
| "train_steps_per_second": 0.189, | |
| "valid_targets_mean": 4142.8, | |
| "valid_targets_min": 1361 | |
| } | |
| ], | |
| "logging_steps": 5, | |
| "max_steps": 4319, | |
| "num_input_tokens_seen": 0, | |
| "num_train_epochs": 7, | |
| "save_steps": 1500, | |
| "stateful_callbacks": { | |
| "TrainerControl": { | |
| "args": { | |
| "should_epoch_stop": false, | |
| "should_evaluate": false, | |
| "should_log": false, | |
| "should_save": true, | |
| "should_training_stop": true | |
| }, | |
| "attributes": {} | |
| } | |
| }, | |
| "total_flos": 1278251302125568.0, | |
| "train_batch_size": 1, | |
| "trial_name": null, | |
| "trial_params": null | |
| } | |
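A minimal sketch of how the per-step records above could be inspected, assuming the array that closes just before `"logging_steps"` is the Trainer-style `log_history` list. The three step entries and the final summary entry are copied (with fewer keys) from the log; the helper names and the `threshold` value are hypothetical and chosen only for illustration.

```python
import json

# A few entries copied from the log above. A real run would instead load the
# whole state file and read its log list (assumed key name: "log_history").
SAMPLE = json.loads("""
[
  {"step": 3980, "loss": 0.1365, "grad_norm": 0.48618002254936554},
  {"step": 3985, "loss": 0.1575, "grad_norm": 1.1613596162903213},
  {"step": 3990, "loss": 0.1402, "grad_norm": 0.4049486224120147},
  {"step": 4319, "train_loss": 0.1840376865397341}
]
""")


def step_records(log_history):
    """Keep per-step records only; the final summary entry carries
    'train_loss' instead of 'loss' and has no 'grad_norm'."""
    return [r for r in log_history if "loss" in r and "grad_norm" in r]


def grad_norm_spikes(records, threshold=1.0):
    """Return the steps whose grad_norm exceeds a (hypothetical) threshold."""
    return [r["step"] for r in records if r["grad_norm"] > threshold]


def mean_loss(records):
    """Unweighted mean of the logged 'loss' values."""
    return sum(r["loss"] for r in records) / len(records)


records = step_records(SAMPLE)
print("spike steps:", grad_norm_spikes(records))
print("mean loss:", mean_loss(records))
```

On this sample the only record above the threshold is step 3985, whose `grad_norm` of about 1.16 stands out against neighbouring values near 0.4. The mean here is unweighted, so it will not necessarily match the `train_loss` reported in the summary entry.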