Text Generation
Transformers
Safetensors
qwen3
llama-factory
full
Generated from Trainer
conversational
text-generation-inference
Instructions to use DCAgent/a1-pr_mining with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DCAgent/a1-pr_mining with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DCAgent/a1-pr_mining")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DCAgent/a1-pr_mining")
model = AutoModelForCausalLM.from_pretrained("DCAgent/a1-pr_mining")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use DCAgent/a1-pr_mining with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "DCAgent/a1-pr_mining"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-pr_mining",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'

Use Docker
docker model run hf.co/DCAgent/a1-pr_mining
- SGLang
How to use DCAgent/a1-pr_mining with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "DCAgent/a1-pr_mining" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-pr_mining",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'

Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "DCAgent/a1-pr_mining" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-pr_mining",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'

- Docker Model Runner
How to use DCAgent/a1-pr_mining with Docker Model Runner:
docker model run hf.co/DCAgent/a1-pr_mining
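The training state that follows logs a `learning_rate` every 5 steps. Those values appear consistent with a linear warmup followed by cosine decay. The sketch below is a reading of the numbers, not a configuration taken from the source. It assumes a peak rate of 4e-5 and 425 warmup steps, both inferred from the logged values, and takes 4249 total steps from `global_step`. It also assumes that the value logged at step `s` is the schedule evaluated at `s - 1`. The name `lr_at` and the constants are hypothetical helpers for illustration.

```python
import math

# Assumptions inferred from the logged values, not stated in the source:
PEAK_LR = 4e-5        # inferred: the logged rate rises to just under 4e-5
WARMUP_STEPS = 425    # inferred: the rate stops rising after step 425
TOTAL_STEPS = 4249    # taken from "global_step" in the training state


def lr_at(logged_step: int) -> float:
    """Approximate the learning rate logged at a given training step."""
    # Assumed offset: the value logged at step s matches the schedule at s - 1.
    s = logged_step - 1
    if s < WARMUP_STEPS:
        # Linear warmup from 0 toward the peak rate.
        return PEAK_LR * s / WARMUP_STEPS
    # Cosine decay from the peak rate down to 0 over the remaining steps.
    progress = (s - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (5, 425, 430, 435):
        print(step, lr_at(step))
```

Comparing the printed values with the `learning_rate` fields logged at steps 5, 425, 430 and 435 is a quick way to check whether the assumed schedule holds for this run.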
| { | |
| "best_global_step": null, | |
| "best_metric": null, | |
| "best_model_checkpoint": null, | |
| "epoch": 7.0, | |
| "eval_steps": 500, | |
| "global_step": 4249, | |
| "is_hyper_param_search": false, | |
| "is_local_process_zero": true, | |
| "is_world_process_zero": true, | |
| "log_history": [ | |
| { | |
| "epoch": 0.008237232289950576, | |
| "grad_norm": 13.838484688249258, | |
| "learning_rate": 3.764705882352941e-07, | |
| "loss": 0.6118, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6064709424972534, | |
| "step": 5, | |
| "valid_targets_mean": 5784.8, | |
| "valid_targets_min": 330 | |
| }, | |
| { | |
| "epoch": 0.016474464579901153, | |
| "grad_norm": 13.32901770197041, | |
| "learning_rate": 8.470588235294118e-07, | |
| "loss": 0.5804, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5605111718177795, | |
| "step": 10, | |
| "valid_targets_mean": 5472.2, | |
| "valid_targets_min": 302 | |
| }, | |
| { | |
| "epoch": 0.02471169686985173, | |
| "grad_norm": 11.580100121107343, | |
| "learning_rate": 1.3176470588235296e-06, | |
| "loss": 0.5825, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.521429181098938, | |
| "step": 15, | |
| "valid_targets_mean": 7353.7, | |
| "valid_targets_min": 1985 | |
| }, | |
| { | |
| "epoch": 0.032948929159802305, | |
| "grad_norm": 10.103393446647656, | |
| "learning_rate": 1.7882352941176474e-06, | |
| "loss": 0.531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5192767381668091, | |
| "step": 20, | |
| "valid_targets_mean": 6811.0, | |
| "valid_targets_min": 478 | |
| }, | |
| { | |
| "epoch": 0.04118616144975288, | |
| "grad_norm": 6.257789545103483, | |
| "learning_rate": 2.258823529411765e-06, | |
| "loss": 0.4974, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.48259904980659485, | |
| "step": 25, | |
| "valid_targets_mean": 4482.8, | |
| "valid_targets_min": 696 | |
| }, | |
| { | |
| "epoch": 0.04942339373970346, | |
| "grad_norm": 3.4046228381087698, | |
| "learning_rate": 2.7294117647058825e-06, | |
| "loss": 0.4402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.41031402349472046, | |
| "step": 30, | |
| "valid_targets_mean": 6257.0, | |
| "valid_targets_min": 920 | |
| }, | |
| { | |
| "epoch": 0.057660626029654036, | |
| "grad_norm": 2.4965600200587112, | |
| "learning_rate": 3.2000000000000003e-06, | |
| "loss": 0.4411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4100528955459595, | |
| "step": 35, | |
| "valid_targets_mean": 4688.1, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 0.06589785831960461, | |
| "grad_norm": 1.3768965526806027, | |
| "learning_rate": 3.670588235294118e-06, | |
| "loss": 0.4167, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3804216980934143, | |
| "step": 40, | |
| "valid_targets_mean": 4690.6, | |
| "valid_targets_min": 254 | |
| }, | |
| { | |
| "epoch": 0.07413509060955518, | |
| "grad_norm": 0.9454923709313997, | |
| "learning_rate": 4.141176470588235e-06, | |
| "loss": 0.3692, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36766085028648376, | |
| "step": 45, | |
| "valid_targets_mean": 5414.4, | |
| "valid_targets_min": 312 | |
| }, | |
| { | |
| "epoch": 0.08237232289950576, | |
| "grad_norm": 0.895620303556488, | |
| "learning_rate": 4.611764705882353e-06, | |
| "loss": 0.3545, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.374552845954895, | |
| "step": 50, | |
| "valid_targets_mean": 4861.9, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 0.09060955518945635, | |
| "grad_norm": 0.6828779095753463, | |
| "learning_rate": 5.08235294117647e-06, | |
| "loss": 0.3537, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.320254385471344, | |
| "step": 55, | |
| "valid_targets_mean": 4765.4, | |
| "valid_targets_min": 273 | |
| }, | |
| { | |
| "epoch": 0.09884678747940692, | |
| "grad_norm": 0.6524094091597135, | |
| "learning_rate": 5.552941176470589e-06, | |
| "loss": 0.3345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34802478551864624, | |
| "step": 60, | |
| "valid_targets_mean": 5150.5, | |
| "valid_targets_min": 333 | |
| }, | |
| { | |
| "epoch": 0.1070840197693575, | |
| "grad_norm": 0.5319046745810463, | |
| "learning_rate": 6.023529411764706e-06, | |
| "loss": 0.3405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32357484102249146, | |
| "step": 65, | |
| "valid_targets_mean": 6576.6, | |
| "valid_targets_min": 2014 | |
| }, | |
| { | |
| "epoch": 0.11532125205930807, | |
| "grad_norm": 0.5431891570006655, | |
| "learning_rate": 6.494117647058824e-06, | |
| "loss": 0.3479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34960079193115234, | |
| "step": 70, | |
| "valid_targets_mean": 6236.7, | |
| "valid_targets_min": 712 | |
| }, | |
| { | |
| "epoch": 0.12355848434925865, | |
| "grad_norm": 0.5215208361615085, | |
| "learning_rate": 6.964705882352941e-06, | |
| "loss": 0.3292, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3283542990684509, | |
| "step": 75, | |
| "valid_targets_mean": 5510.8, | |
| "valid_targets_min": 383 | |
| }, | |
| { | |
| "epoch": 0.13179571663920922, | |
| "grad_norm": 0.5335500745911854, | |
| "learning_rate": 7.43529411764706e-06, | |
| "loss": 0.3182, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.330269992351532, | |
| "step": 80, | |
| "valid_targets_mean": 5953.8, | |
| "valid_targets_min": 525 | |
| }, | |
| { | |
| "epoch": 0.1400329489291598, | |
| "grad_norm": 0.4791132982310386, | |
| "learning_rate": 7.905882352941176e-06, | |
| "loss": 0.3313, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31799042224884033, | |
| "step": 85, | |
| "valid_targets_mean": 6383.9, | |
| "valid_targets_min": 331 | |
| }, | |
| { | |
| "epoch": 0.14827018121911037, | |
| "grad_norm": 0.6014083377497861, | |
| "learning_rate": 8.376470588235295e-06, | |
| "loss": 0.324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32026463747024536, | |
| "step": 90, | |
| "valid_targets_mean": 4742.5, | |
| "valid_targets_min": 430 | |
| }, | |
| { | |
| "epoch": 0.15650741350906094, | |
| "grad_norm": 0.4491945455497733, | |
| "learning_rate": 8.847058823529413e-06, | |
| "loss": 0.3073, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3021693229675293, | |
| "step": 95, | |
| "valid_targets_mean": 5802.7, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 0.16474464579901152, | |
| "grad_norm": 0.48340705186331245, | |
| "learning_rate": 9.31764705882353e-06, | |
| "loss": 0.3077, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3046356439590454, | |
| "step": 100, | |
| "valid_targets_mean": 5698.9, | |
| "valid_targets_min": 2110 | |
| }, | |
| { | |
| "epoch": 0.17298187808896212, | |
| "grad_norm": 0.45376506719321025, | |
| "learning_rate": 9.788235294117649e-06, | |
| "loss": 0.2947, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3012458384037018, | |
| "step": 105, | |
| "valid_targets_mean": 5975.9, | |
| "valid_targets_min": 392 | |
| }, | |
| { | |
| "epoch": 0.1812191103789127, | |
| "grad_norm": 0.48884539184282266, | |
| "learning_rate": 1.0258823529411766e-05, | |
| "loss": 0.2952, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30425509810447693, | |
| "step": 110, | |
| "valid_targets_mean": 5205.1, | |
| "valid_targets_min": 416 | |
| }, | |
| { | |
| "epoch": 0.18945634266886327, | |
| "grad_norm": 0.545402654783909, | |
| "learning_rate": 1.0729411764705884e-05, | |
| "loss": 0.302, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31671977043151855, | |
| "step": 115, | |
| "valid_targets_mean": 5585.2, | |
| "valid_targets_min": 1766 | |
| }, | |
| { | |
| "epoch": 0.19769357495881384, | |
| "grad_norm": 0.5044877249055142, | |
| "learning_rate": 1.1200000000000001e-05, | |
| "loss": 0.3002, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31186243891716003, | |
| "step": 120, | |
| "valid_targets_mean": 4735.4, | |
| "valid_targets_min": 350 | |
| }, | |
| { | |
| "epoch": 0.20593080724876442, | |
| "grad_norm": 0.4917352363375157, | |
| "learning_rate": 1.1670588235294118e-05, | |
| "loss": 0.2917, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2831946909427643, | |
| "step": 125, | |
| "valid_targets_mean": 5007.1, | |
| "valid_targets_min": 333 | |
| }, | |
| { | |
| "epoch": 0.214168039538715, | |
| "grad_norm": 0.40165976638911155, | |
| "learning_rate": 1.2141176470588237e-05, | |
| "loss": 0.2954, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2792291045188904, | |
| "step": 130, | |
| "valid_targets_mean": 6581.2, | |
| "valid_targets_min": 1047 | |
| }, | |
| { | |
| "epoch": 0.22240527182866557, | |
| "grad_norm": 0.5413117187263645, | |
| "learning_rate": 1.2611764705882354e-05, | |
| "loss": 0.299, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3235633075237274, | |
| "step": 135, | |
| "valid_targets_mean": 5528.8, | |
| "valid_targets_min": 361 | |
| }, | |
| { | |
| "epoch": 0.23064250411861614, | |
| "grad_norm": 0.5673717120036079, | |
| "learning_rate": 1.3082352941176472e-05, | |
| "loss": 0.2965, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30562540888786316, | |
| "step": 140, | |
| "valid_targets_mean": 5469.2, | |
| "valid_targets_min": 474 | |
| }, | |
| { | |
| "epoch": 0.23887973640856672, | |
| "grad_norm": 0.49097902543645666, | |
| "learning_rate": 1.355294117647059e-05, | |
| "loss": 0.2885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2939978241920471, | |
| "step": 145, | |
| "valid_targets_mean": 5334.3, | |
| "valid_targets_min": 260 | |
| }, | |
| { | |
| "epoch": 0.2471169686985173, | |
| "grad_norm": 0.5093548478479683, | |
| "learning_rate": 1.4023529411764706e-05, | |
| "loss": 0.2835, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2908037006855011, | |
| "step": 150, | |
| "valid_targets_mean": 6239.1, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 0.2553542009884679, | |
| "grad_norm": 0.4620067588632813, | |
| "learning_rate": 1.4494117647058825e-05, | |
| "loss": 0.2833, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2623613476753235, | |
| "step": 155, | |
| "valid_targets_mean": 4657.4, | |
| "valid_targets_min": 319 | |
| }, | |
| { | |
| "epoch": 0.26359143327841844, | |
| "grad_norm": 0.46233214702239567, | |
| "learning_rate": 1.4964705882352943e-05, | |
| "loss": 0.2818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25934308767318726, | |
| "step": 160, | |
| "valid_targets_mean": 5620.9, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 0.27182866556836904, | |
| "grad_norm": 0.48550710356868004, | |
| "learning_rate": 1.543529411764706e-05, | |
| "loss": 0.2915, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28682294487953186, | |
| "step": 165, | |
| "valid_targets_mean": 5591.1, | |
| "valid_targets_min": 390 | |
| }, | |
| { | |
| "epoch": 0.2800658978583196, | |
| "grad_norm": 0.4466982509739381, | |
| "learning_rate": 1.5905882352941177e-05, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23975607752799988, | |
| "step": 170, | |
| "valid_targets_mean": 5620.1, | |
| "valid_targets_min": 440 | |
| }, | |
| { | |
| "epoch": 0.2883031301482702, | |
| "grad_norm": 0.3962321896769396, | |
| "learning_rate": 1.6376470588235298e-05, | |
| "loss": 0.2743, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2576720416545868, | |
| "step": 175, | |
| "valid_targets_mean": 7407.4, | |
| "valid_targets_min": 1483 | |
| }, | |
| { | |
| "epoch": 0.29654036243822074, | |
| "grad_norm": 0.46736148924856563, | |
| "learning_rate": 1.6847058823529414e-05, | |
| "loss": 0.2725, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27849751710891724, | |
| "step": 180, | |
| "valid_targets_mean": 6247.8, | |
| "valid_targets_min": 373 | |
| }, | |
| { | |
| "epoch": 0.30477759472817134, | |
| "grad_norm": 0.4598062289821109, | |
| "learning_rate": 1.731764705882353e-05, | |
| "loss": 0.2667, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2737147808074951, | |
| "step": 185, | |
| "valid_targets_mean": 7175.9, | |
| "valid_targets_min": 2045 | |
| }, | |
| { | |
| "epoch": 0.3130148270181219, | |
| "grad_norm": 0.578819497206239, | |
| "learning_rate": 1.778823529411765e-05, | |
| "loss": 0.2878, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2716781198978424, | |
| "step": 190, | |
| "valid_targets_mean": 4155.1, | |
| "valid_targets_min": 224 | |
| }, | |
| { | |
| "epoch": 0.3212520593080725, | |
| "grad_norm": 0.9767672691031627, | |
| "learning_rate": 1.8258823529411765e-05, | |
| "loss": 0.2734, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2786101698875427, | |
| "step": 195, | |
| "valid_targets_mean": 5027.1, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 0.32948929159802304, | |
| "grad_norm": 0.4264957766087033, | |
| "learning_rate": 1.8729411764705886e-05, | |
| "loss": 0.2782, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25898993015289307, | |
| "step": 200, | |
| "valid_targets_mean": 6737.0, | |
| "valid_targets_min": 3647 | |
| }, | |
| { | |
| "epoch": 0.33772652388797364, | |
| "grad_norm": 0.44189888957694456, | |
| "learning_rate": 1.9200000000000003e-05, | |
| "loss": 0.262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24469071626663208, | |
| "step": 205, | |
| "valid_targets_mean": 5740.4, | |
| "valid_targets_min": 635 | |
| }, | |
| { | |
| "epoch": 0.34596375617792424, | |
| "grad_norm": 0.5733106926699414, | |
| "learning_rate": 1.967058823529412e-05, | |
| "loss": 0.2695, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2947193682193756, | |
| "step": 210, | |
| "valid_targets_mean": 4685.6, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 0.3542009884678748, | |
| "grad_norm": 0.48099444976450817, | |
| "learning_rate": 2.0141176470588236e-05, | |
| "loss": 0.2745, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2802830636501312, | |
| "step": 215, | |
| "valid_targets_mean": 5317.8, | |
| "valid_targets_min": 376 | |
| }, | |
| { | |
| "epoch": 0.3624382207578254, | |
| "grad_norm": 0.584239794292724, | |
| "learning_rate": 2.0611764705882353e-05, | |
| "loss": 0.2643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25567081570625305, | |
| "step": 220, | |
| "valid_targets_mean": 5466.9, | |
| "valid_targets_min": 1282 | |
| }, | |
| { | |
| "epoch": 0.37067545304777594, | |
| "grad_norm": 0.4745121059458066, | |
| "learning_rate": 2.1082352941176474e-05, | |
| "loss": 0.2671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26192885637283325, | |
| "step": 225, | |
| "valid_targets_mean": 5270.5, | |
| "valid_targets_min": 296 | |
| }, | |
| { | |
| "epoch": 0.37891268533772654, | |
| "grad_norm": 0.41969400360486403, | |
| "learning_rate": 2.155294117647059e-05, | |
| "loss": 0.2573, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2373993843793869, | |
| "step": 230, | |
| "valid_targets_mean": 6760.0, | |
| "valid_targets_min": 326 | |
| }, | |
| { | |
| "epoch": 0.3871499176276771, | |
| "grad_norm": 0.5483205019486727, | |
| "learning_rate": 2.2023529411764707e-05, | |
| "loss": 0.2684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2549372911453247, | |
| "step": 235, | |
| "valid_targets_mean": 5682.2, | |
| "valid_targets_min": 308 | |
| }, | |
| { | |
| "epoch": 0.3953871499176277, | |
| "grad_norm": 0.6168666569029855, | |
| "learning_rate": 2.2494117647058828e-05, | |
| "loss": 0.276, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3015628457069397, | |
| "step": 240, | |
| "valid_targets_mean": 3946.9, | |
| "valid_targets_min": 414 | |
| }, | |
| { | |
| "epoch": 0.40362438220757824, | |
| "grad_norm": 0.5365792559906071, | |
| "learning_rate": 2.296470588235294e-05, | |
| "loss": 0.2717, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26749783754348755, | |
| "step": 245, | |
| "valid_targets_mean": 3893.8, | |
| "valid_targets_min": 542 | |
| }, | |
| { | |
| "epoch": 0.41186161449752884, | |
| "grad_norm": 0.6167717661808814, | |
| "learning_rate": 2.343529411764706e-05, | |
| "loss": 0.25, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2404215931892395, | |
| "step": 250, | |
| "valid_targets_mean": 5115.6, | |
| "valid_targets_min": 423 | |
| }, | |
| { | |
| "epoch": 0.4200988467874794, | |
| "grad_norm": 0.4913706628525373, | |
| "learning_rate": 2.390588235294118e-05, | |
| "loss": 0.2655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26305335760116577, | |
| "step": 255, | |
| "valid_targets_mean": 5529.3, | |
| "valid_targets_min": 557 | |
| }, | |
| { | |
| "epoch": 0.42833607907743, | |
| "grad_norm": 0.442082406594195, | |
| "learning_rate": 2.4376470588235296e-05, | |
| "loss": 0.2719, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2634507715702057, | |
| "step": 260, | |
| "valid_targets_mean": 5923.1, | |
| "valid_targets_min": 1430 | |
| }, | |
| { | |
| "epoch": 0.43657331136738053, | |
| "grad_norm": 0.49009125726576164, | |
| "learning_rate": 2.4847058823529416e-05, | |
| "loss": 0.2583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25144073367118835, | |
| "step": 265, | |
| "valid_targets_mean": 5996.3, | |
| "valid_targets_min": 3044 | |
| }, | |
| { | |
| "epoch": 0.44481054365733114, | |
| "grad_norm": 0.47334592009364207, | |
| "learning_rate": 2.5317647058823533e-05, | |
| "loss": 0.2556, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2636913061141968, | |
| "step": 270, | |
| "valid_targets_mean": 5090.6, | |
| "valid_targets_min": 3108 | |
| }, | |
| { | |
| "epoch": 0.45304777594728174, | |
| "grad_norm": 0.8202441780411607, | |
| "learning_rate": 2.5788235294117646e-05, | |
| "loss": 0.2653, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2844095528125763, | |
| "step": 275, | |
| "valid_targets_mean": 5099.9, | |
| "valid_targets_min": 384 | |
| }, | |
| { | |
| "epoch": 0.4612850082372323, | |
| "grad_norm": 0.44215328777494795, | |
| "learning_rate": 2.6258823529411767e-05, | |
| "loss": 0.2594, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2599104940891266, | |
| "step": 280, | |
| "valid_targets_mean": 5995.9, | |
| "valid_targets_min": 1275 | |
| }, | |
| { | |
| "epoch": 0.4695222405271829, | |
| "grad_norm": 0.518261753355759, | |
| "learning_rate": 2.6729411764705884e-05, | |
| "loss": 0.2402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23824268579483032, | |
| "step": 285, | |
| "valid_targets_mean": 6041.9, | |
| "valid_targets_min": 780 | |
| }, | |
| { | |
| "epoch": 0.47775947281713343, | |
| "grad_norm": 0.5600242998987862, | |
| "learning_rate": 2.7200000000000004e-05, | |
| "loss": 0.2499, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24535280466079712, | |
| "step": 290, | |
| "valid_targets_mean": 5678.4, | |
| "valid_targets_min": 1038 | |
| }, | |
| { | |
| "epoch": 0.48599670510708404, | |
| "grad_norm": 0.5046606384417934, | |
| "learning_rate": 2.767058823529412e-05, | |
| "loss": 0.2467, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23094016313552856, | |
| "step": 295, | |
| "valid_targets_mean": 6001.8, | |
| "valid_targets_min": 881 | |
| }, | |
| { | |
| "epoch": 0.4942339373970346, | |
| "grad_norm": 0.4919795163540572, | |
| "learning_rate": 2.8141176470588238e-05, | |
| "loss": 0.2538, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2650337815284729, | |
| "step": 300, | |
| "valid_targets_mean": 6865.7, | |
| "valid_targets_min": 3232 | |
| }, | |
| { | |
| "epoch": 0.5024711696869851, | |
| "grad_norm": 0.5272382038625899, | |
| "learning_rate": 2.8611764705882358e-05, | |
| "loss": 0.2606, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25129497051239014, | |
| "step": 305, | |
| "valid_targets_mean": 5451.0, | |
| "valid_targets_min": 756 | |
| }, | |
| { | |
| "epoch": 0.5107084019769358, | |
| "grad_norm": 0.6093269737474022, | |
| "learning_rate": 2.908235294117647e-05, | |
| "loss": 0.2444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25131070613861084, | |
| "step": 310, | |
| "valid_targets_mean": 4781.3, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 0.5189456342668863, | |
| "grad_norm": 0.4752202448157912, | |
| "learning_rate": 2.955294117647059e-05, | |
| "loss": 0.2489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23877941071987152, | |
| "step": 315, | |
| "valid_targets_mean": 5204.8, | |
| "valid_targets_min": 414 | |
| }, | |
| { | |
| "epoch": 0.5271828665568369, | |
| "grad_norm": 2.473505356827449, | |
| "learning_rate": 3.002352941176471e-05, | |
| "loss": 0.2396, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22138109803199768, | |
| "step": 320, | |
| "valid_targets_mean": 6812.0, | |
| "valid_targets_min": 402 | |
| }, | |
| { | |
| "epoch": 0.5354200988467874, | |
| "grad_norm": 0.4227222126444928, | |
| "learning_rate": 3.0494117647058826e-05, | |
| "loss": 0.2581, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2810346782207489, | |
| "step": 325, | |
| "valid_targets_mean": 6907.1, | |
| "valid_targets_min": 1919 | |
| }, | |
| { | |
| "epoch": 0.5436573311367381, | |
| "grad_norm": 0.5641877950067977, | |
| "learning_rate": 3.0964705882352946e-05, | |
| "loss": 0.2831, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23556017875671387, | |
| "step": 330, | |
| "valid_targets_mean": 3956.6, | |
| "valid_targets_min": 284 | |
| }, | |
| { | |
| "epoch": 0.5518945634266886, | |
| "grad_norm": 0.5731856664818545, | |
| "learning_rate": 3.143529411764706e-05, | |
| "loss": 0.2607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2894389033317566, | |
| "step": 335, | |
| "valid_targets_mean": 4187.3, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 0.5601317957166392, | |
| "grad_norm": 0.599774786273948, | |
| "learning_rate": 3.190588235294118e-05, | |
| "loss": 0.2503, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27301275730133057, | |
| "step": 340, | |
| "valid_targets_mean": 4816.8, | |
| "valid_targets_min": 590 | |
| }, | |
| { | |
| "epoch": 0.5683690280065898, | |
| "grad_norm": 0.4861564202382807, | |
| "learning_rate": 3.23764705882353e-05, | |
| "loss": 0.2497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24534930288791656, | |
| "step": 345, | |
| "valid_targets_mean": 5835.0, | |
| "valid_targets_min": 273 | |
| }, | |
| { | |
| "epoch": 0.5766062602965404, | |
| "grad_norm": 0.46826059651931723, | |
| "learning_rate": 3.2847058823529414e-05, | |
| "loss": 0.256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23597204685211182, | |
| "step": 350, | |
| "valid_targets_mean": 5569.6, | |
| "valid_targets_min": 299 | |
| }, | |
| { | |
| "epoch": 0.5848434925864909, | |
| "grad_norm": 0.468741254368515, | |
| "learning_rate": 3.331764705882353e-05, | |
| "loss": 0.2447, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2362544685602188, | |
| "step": 355, | |
| "valid_targets_mean": 5644.9, | |
| "valid_targets_min": 955 | |
| }, | |
| { | |
| "epoch": 0.5930807248764415, | |
| "grad_norm": 0.5115784291313441, | |
| "learning_rate": 3.378823529411765e-05, | |
| "loss": 0.2549, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24134746193885803, | |
| "step": 360, | |
| "valid_targets_mean": 4920.2, | |
| "valid_targets_min": 549 | |
| }, | |
| { | |
| "epoch": 0.6013179571663921, | |
| "grad_norm": 0.46121075656681765, | |
| "learning_rate": 3.425882352941177e-05, | |
| "loss": 0.2525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2458174228668213, | |
| "step": 365, | |
| "valid_targets_mean": 5673.4, | |
| "valid_targets_min": 349 | |
| }, | |
| { | |
| "epoch": 0.6095551894563427, | |
| "grad_norm": 0.5766200034536373, | |
| "learning_rate": 3.472941176470589e-05, | |
| "loss": 0.2466, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2716304361820221, | |
| "step": 370, | |
| "valid_targets_mean": 3649.0, | |
| "valid_targets_min": 337 | |
| }, | |
| { | |
| "epoch": 0.6177924217462932, | |
| "grad_norm": 0.43965910915542156, | |
| "learning_rate": 3.52e-05, | |
| "loss": 0.2451, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24214795231819153, | |
| "step": 375, | |
| "valid_targets_mean": 5985.5, | |
| "valid_targets_min": 2331 | |
| }, | |
| { | |
| "epoch": 0.6260296540362438, | |
| "grad_norm": 0.5445523156203298, | |
| "learning_rate": 3.567058823529412e-05, | |
| "loss": 0.2497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27419784665107727, | |
| "step": 380, | |
| "valid_targets_mean": 5857.4, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 0.6342668863261944, | |
| "grad_norm": 0.45076566676047086, | |
| "learning_rate": 3.614117647058824e-05, | |
| "loss": 0.2359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2390020340681076, | |
| "step": 385, | |
| "valid_targets_mean": 5877.6, | |
| "valid_targets_min": 379 | |
| }, | |
| { | |
| "epoch": 0.642504118616145, | |
| "grad_norm": 0.5392434391586065, | |
| "learning_rate": 3.6611764705882356e-05, | |
| "loss": 0.2579, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24283994734287262, | |
| "step": 390, | |
| "valid_targets_mean": 4931.6, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 0.6507413509060955, | |
| "grad_norm": 0.5128687151504063, | |
| "learning_rate": 3.708235294117647e-05, | |
| "loss": 0.2398, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2664172649383545, | |
| "step": 395, | |
| "valid_targets_mean": 6280.8, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 0.6589785831960461, | |
| "grad_norm": 0.48861927361947727, | |
| "learning_rate": 3.755294117647059e-05, | |
| "loss": 0.2361, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22510594129562378, | |
| "step": 400, | |
| "valid_targets_mean": 5486.4, | |
| "valid_targets_min": 230 | |
| }, | |
| { | |
| "epoch": 0.6672158154859967, | |
| "grad_norm": 0.4578245672892225, | |
| "learning_rate": 3.802352941176471e-05, | |
| "loss": 0.2488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25764355063438416, | |
| "step": 405, | |
| "valid_targets_mean": 6231.9, | |
| "valid_targets_min": 460 | |
| }, | |
| { | |
| "epoch": 0.6754530477759473, | |
| "grad_norm": 0.8301830725579878, | |
| "learning_rate": 3.849411764705883e-05, | |
| "loss": 0.2406, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2402649223804474, | |
| "step": 410, | |
| "valid_targets_mean": 3948.3, | |
| "valid_targets_min": 242 | |
| }, | |
| { | |
| "epoch": 0.6836902800658978, | |
| "grad_norm": 0.44648575578094224, | |
| "learning_rate": 3.8964705882352944e-05, | |
| "loss": 0.2417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23881413042545319, | |
| "step": 415, | |
| "valid_targets_mean": 5817.6, | |
| "valid_targets_min": 323 | |
| }, | |
| { | |
| "epoch": 0.6919275123558485, | |
| "grad_norm": 0.49886875102186296, | |
| "learning_rate": 3.943529411764706e-05, | |
| "loss": 0.2373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28194451332092285, | |
| "step": 420, | |
| "valid_targets_mean": 6352.9, | |
| "valid_targets_min": 1581 | |
| }, | |
| { | |
| "epoch": 0.700164744645799, | |
| "grad_norm": 0.4891568260174497, | |
| "learning_rate": 3.990588235294118e-05, | |
| "loss": 0.2424, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24988554418087006, | |
| "step": 425, | |
| "valid_targets_mean": 6102.6, | |
| "valid_targets_min": 2769 | |
| }, | |
| { | |
| "epoch": 0.7084019769357496, | |
| "grad_norm": 0.4540233317989794, | |
| "learning_rate": 3.9999892009992835e-05, | |
| "loss": 0.2359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2257465124130249, | |
| "step": 430, | |
| "valid_targets_mean": 6351.2, | |
| "valid_targets_min": 671 | |
| }, | |
| { | |
| "epoch": 0.7166392092257001, | |
| "grad_norm": 0.5060649229158644, | |
| "learning_rate": 3.999945330258739e-05, | |
| "loss": 0.2359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2537679672241211, | |
| "step": 435, | |
| "valid_targets_mean": 4834.8, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 0.7248764415156508, | |
| "grad_norm": 0.5382641331253157, | |
| "learning_rate": 3.9998677135805034e-05, | |
| "loss": 0.2347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.250946968793869, | |
| "step": 440, | |
| "valid_targets_mean": 4850.4, | |
| "valid_targets_min": 274 | |
| }, | |
| { | |
| "epoch": 0.7331136738056013, | |
| "grad_norm": 0.48149113904020235, | |
| "learning_rate": 3.999756352274235e-05, | |
| "loss": 0.2619, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24606043100357056, | |
| "step": 445, | |
| "valid_targets_mean": 4880.9, | |
| "valid_targets_min": 410 | |
| }, | |
| { | |
| "epoch": 0.7413509060955519, | |
| "grad_norm": 0.40703407464849967, | |
| "learning_rate": 3.9996112482189823e-05, | |
| "loss": 0.2331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2441137731075287, | |
| "step": 450, | |
| "valid_targets_mean": 7023.7, | |
| "valid_targets_min": 500 | |
| }, | |
| { | |
| "epoch": 0.7495881383855024, | |
| "grad_norm": 0.567651261036972, | |
| "learning_rate": 3.9994324038631475e-05, | |
| "loss": 0.2333, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25954297184944153, | |
| "step": 455, | |
| "valid_targets_mean": 5304.1, | |
| "valid_targets_min": 733 | |
| }, | |
| { | |
| "epoch": 0.7578253706754531, | |
| "grad_norm": 0.5007540500147702, | |
| "learning_rate": 3.999219822224449e-05, | |
| "loss": 0.2351, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21681588888168335, | |
| "step": 460, | |
| "valid_targets_mean": 4537.3, | |
| "valid_targets_min": 319 | |
| }, | |
| { | |
| "epoch": 0.7660626029654036, | |
| "grad_norm": 0.48312602178484315, | |
| "learning_rate": 3.998973506889867e-05, | |
| "loss": 0.2371, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24203741550445557, | |
| "step": 465, | |
| "valid_targets_mean": 5523.3, | |
| "valid_targets_min": 712 | |
| }, | |
| { | |
| "epoch": 0.7742998352553542, | |
| "grad_norm": 0.4993949797656458, | |
| "learning_rate": 3.998693462015588e-05, | |
| "loss": 0.2261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23800945281982422, | |
| "step": 470, | |
| "valid_targets_mean": 5283.7, | |
| "valid_targets_min": 405 | |
| }, | |
| { | |
| "epoch": 0.7825370675453048, | |
| "grad_norm": 0.6911936462179921, | |
| "learning_rate": 3.9983796923269284e-05, | |
| "loss": 0.2526, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2643522322177887, | |
| "step": 475, | |
| "valid_targets_mean": 4526.2, | |
| "valid_targets_min": 397 | |
| }, | |
| { | |
| "epoch": 0.7907742998352554, | |
| "grad_norm": 0.46237825781467196, | |
| "learning_rate": 3.99803220311826e-05, | |
| "loss": 0.2321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20834973454475403, | |
| "step": 480, | |
| "valid_targets_mean": 6325.9, | |
| "valid_targets_min": 1607 | |
| }, | |
| { | |
| "epoch": 0.7990115321252059, | |
| "grad_norm": 0.5178848730466026, | |
| "learning_rate": 3.997651000252916e-05, | |
| "loss": 0.2325, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24416354298591614, | |
| "step": 485, | |
| "valid_targets_mean": 5612.9, | |
| "valid_targets_min": 471 | |
| }, | |
| { | |
| "epoch": 0.8072487644151565, | |
| "grad_norm": 0.4804988255352975, | |
| "learning_rate": 3.997236090163097e-05, | |
| "loss": 0.2403, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2480747252702713, | |
| "step": 490, | |
| "valid_targets_mean": 5062.2, | |
| "valid_targets_min": 278 | |
| }, | |
| { | |
| "epoch": 0.8154859967051071, | |
| "grad_norm": 0.4601940493818994, | |
| "learning_rate": 3.99678747984976e-05, | |
| "loss": 0.2285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22695352137088776, | |
| "step": 495, | |
| "valid_targets_mean": 6248.0, | |
| "valid_targets_min": 326 | |
| }, | |
| { | |
| "epoch": 0.8237232289950577, | |
| "grad_norm": 0.6489913836517276, | |
| "learning_rate": 3.9963051768824996e-05, | |
| "loss": 0.2413, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2015495002269745, | |
| "step": 500, | |
| "valid_targets_mean": 4947.4, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 0.8319604612850082, | |
| "grad_norm": 0.47186518041594383, | |
| "learning_rate": 3.995789189399421e-05, | |
| "loss": 0.2412, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2775544226169586, | |
| "step": 505, | |
| "valid_targets_mean": 5327.4, | |
| "valid_targets_min": 494 | |
| }, | |
| { | |
| "epoch": 0.8401976935749588, | |
| "grad_norm": 0.49441063849501865, | |
| "learning_rate": 3.995239526107003e-05, | |
| "loss": 0.236, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19490137696266174, | |
| "step": 510, | |
| "valid_targets_mean": 5805.4, | |
| "valid_targets_min": 375 | |
| }, | |
| { | |
| "epoch": 0.8484349258649094, | |
| "grad_norm": 0.49399739098758555, | |
| "learning_rate": 3.99465619627995e-05, | |
| "loss": 0.2348, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22286713123321533, | |
| "step": 515, | |
| "valid_targets_mean": 5033.8, | |
| "valid_targets_min": 395 | |
| }, | |
| { | |
| "epoch": 0.85667215815486, | |
| "grad_norm": 0.47527825732717, | |
| "learning_rate": 3.994039209761038e-05, | |
| "loss": 0.2295, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22310678660869598, | |
| "step": 520, | |
| "valid_targets_mean": 4940.3, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 0.8649093904448105, | |
| "grad_norm": 0.43851822473266283, | |
| "learning_rate": 3.993388576960947e-05, | |
| "loss": 0.2347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22508376836776733, | |
| "step": 525, | |
| "valid_targets_mean": 5517.1, | |
| "valid_targets_min": 2534 | |
| }, | |
| { | |
| "epoch": 0.8731466227347611, | |
| "grad_norm": 0.5181505361314813, | |
| "learning_rate": 3.9927043088580805e-05, | |
| "loss": 0.2347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22217413783073425, | |
| "step": 530, | |
| "valid_targets_mean": 5870.6, | |
| "valid_targets_min": 2742 | |
| }, | |
| { | |
| "epoch": 0.8813838550247117, | |
| "grad_norm": 0.5172044137985335, | |
| "learning_rate": 3.991986416998391e-05, | |
| "loss": 0.2351, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2242281138896942, | |
| "step": 535, | |
| "valid_targets_mean": 4839.9, | |
| "valid_targets_min": 383 | |
| }, | |
| { | |
| "epoch": 0.8896210873146623, | |
| "grad_norm": 0.4915178095143843, | |
| "learning_rate": 3.991234913495177e-05, | |
| "loss": 0.2345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2577744722366333, | |
| "step": 540, | |
| "valid_targets_mean": 5669.3, | |
| "valid_targets_min": 265 | |
| }, | |
| { | |
| "epoch": 0.8978583196046128, | |
| "grad_norm": 0.517453189663422, | |
| "learning_rate": 3.99044981102888e-05, | |
| "loss": 0.2409, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24107156693935394, | |
| "step": 545, | |
| "valid_targets_mean": 4511.5, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 0.9060955518945635, | |
| "grad_norm": 0.4249092247184588, | |
| "learning_rate": 3.98963112284687e-05, | |
| "loss": 0.2279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22153885662555695, | |
| "step": 550, | |
| "valid_targets_mean": 5979.3, | |
| "valid_targets_min": 1764 | |
| }, | |
| { | |
| "epoch": 0.914332784184514, | |
| "grad_norm": 0.4231497851737803, | |
| "learning_rate": 3.9887788627632266e-05, | |
| "loss": 0.2285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23912326991558075, | |
| "step": 555, | |
| "valid_targets_mean": 6845.6, | |
| "valid_targets_min": 501 | |
| }, | |
| { | |
| "epoch": 0.9225700164744646, | |
| "grad_norm": 0.5827210887236848, | |
| "learning_rate": 3.9878930451584995e-05, | |
| "loss": 0.2357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25541239976882935, | |
| "step": 560, | |
| "valid_targets_mean": 5153.3, | |
| "valid_targets_min": 417 | |
| }, | |
| { | |
| "epoch": 0.9308072487644151, | |
| "grad_norm": 0.447268034989374, | |
| "learning_rate": 3.986973684979471e-05, | |
| "loss": 0.2335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20884302258491516, | |
| "step": 565, | |
| "valid_targets_mean": 6352.9, | |
| "valid_targets_min": 1201 | |
| }, | |
| { | |
| "epoch": 0.9390444810543658, | |
| "grad_norm": 0.410089047384293, | |
| "learning_rate": 3.9860207977389e-05, | |
| "loss": 0.2254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.207575261592865, | |
| "step": 570, | |
| "valid_targets_mean": 6985.4, | |
| "valid_targets_min": 790 | |
| }, | |
| { | |
| "epoch": 0.9472817133443163, | |
| "grad_norm": 0.7473346503345262, | |
| "learning_rate": 3.9850343995152634e-05, | |
| "loss": 0.2314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23254147171974182, | |
| "step": 575, | |
| "valid_targets_mean": 5084.3, | |
| "valid_targets_min": 269 | |
| }, | |
| { | |
| "epoch": 0.9555189456342669, | |
| "grad_norm": 0.48067485765275164, | |
| "learning_rate": 3.9840145069524825e-05, | |
| "loss": 0.2323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22275948524475098, | |
| "step": 580, | |
| "valid_targets_mean": 5225.8, | |
| "valid_targets_min": 252 | |
| }, | |
| { | |
| "epoch": 0.9637561779242174, | |
| "grad_norm": 0.5036323546289907, | |
| "learning_rate": 3.982961137259643e-05, | |
| "loss": 0.2259, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23761439323425293, | |
| "step": 585, | |
| "valid_targets_mean": 5808.1, | |
| "valid_targets_min": 426 | |
| }, | |
| { | |
| "epoch": 0.9719934102141681, | |
| "grad_norm": 0.38418952548881075, | |
| "learning_rate": 3.981874308210705e-05, | |
| "loss": 0.2168, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21421602368354797, | |
| "step": 590, | |
| "valid_targets_mean": 6400.6, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 0.9802306425041186, | |
| "grad_norm": 0.4957357691924066, | |
| "learning_rate": 3.980754038144202e-05, | |
| "loss": 0.2314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23045432567596436, | |
| "step": 595, | |
| "valid_targets_mean": 5057.3, | |
| "valid_targets_min": 304 | |
| }, | |
| { | |
| "epoch": 0.9884678747940692, | |
| "grad_norm": 0.5349521308655141, | |
| "learning_rate": 3.9796003459629335e-05, | |
| "loss": 0.2327, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23907701671123505, | |
| "step": 600, | |
| "valid_targets_mean": 5696.1, | |
| "valid_targets_min": 265 | |
| }, | |
| { | |
| "epoch": 0.9967051070840197, | |
| "grad_norm": 0.4707893596360662, | |
| "learning_rate": 3.978413251133643e-05, | |
| "loss": 0.2261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22950200736522675, | |
| "step": 605, | |
| "valid_targets_mean": 4733.3, | |
| "valid_targets_min": 350 | |
| }, | |
| { | |
| "epoch": 1.0049423393739703, | |
| "grad_norm": 0.41194237268820666, | |
| "learning_rate": 3.9771927736866926e-05, | |
| "loss": 0.234, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21946033835411072, | |
| "step": 610, | |
| "valid_targets_mean": 6868.2, | |
| "valid_targets_min": 1850 | |
| }, | |
| { | |
| "epoch": 1.013179571663921, | |
| "grad_norm": 0.4825240221007433, | |
| "learning_rate": 3.9759389342157225e-05, | |
| "loss": 0.2187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21633988618850708, | |
| "step": 615, | |
| "valid_targets_mean": 6282.6, | |
| "valid_targets_min": 331 | |
| }, | |
| { | |
| "epoch": 1.0214168039538716, | |
| "grad_norm": 0.4864765772803356, | |
| "learning_rate": 3.974651753877306e-05, | |
| "loss": 0.2214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23486509919166565, | |
| "step": 620, | |
| "valid_targets_mean": 5138.8, | |
| "valid_targets_min": 362 | |
| }, | |
| { | |
| "epoch": 1.029654036243822, | |
| "grad_norm": 0.5005826843944357, | |
| "learning_rate": 3.9733312543905904e-05, | |
| "loss": 0.2154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21022102236747742, | |
| "step": 625, | |
| "valid_targets_mean": 4148.2, | |
| "valid_targets_min": 224 | |
| }, | |
| { | |
| "epoch": 1.0378912685337727, | |
| "grad_norm": 0.4864604603417918, | |
| "learning_rate": 3.971977458036931e-05, | |
| "loss": 0.2234, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22325535118579865, | |
| "step": 630, | |
| "valid_targets_mean": 4923.6, | |
| "valid_targets_min": 347 | |
| }, | |
| { | |
| "epoch": 1.0461285008237233, | |
| "grad_norm": 0.4688017802978337, | |
| "learning_rate": 3.9705903876595167e-05, | |
| "loss": 0.2191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2212015837430954, | |
| "step": 635, | |
| "valid_targets_mean": 5238.6, | |
| "valid_targets_min": 425 | |
| }, | |
| { | |
| "epoch": 1.0543657331136738, | |
| "grad_norm": 0.428267438257124, | |
| "learning_rate": 3.969170066662982e-05, | |
| "loss": 0.2164, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21326056122779846, | |
| "step": 640, | |
| "valid_targets_mean": 6401.2, | |
| "valid_targets_min": 2001 | |
| }, | |
| { | |
| "epoch": 1.0626029654036244, | |
| "grad_norm": 0.492908870611759, | |
| "learning_rate": 3.967716519013014e-05, | |
| "loss": 0.2071, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19257579743862152, | |
| "step": 645, | |
| "valid_targets_mean": 5080.1, | |
| "valid_targets_min": 308 | |
| }, | |
| { | |
| "epoch": 1.0708401976935749, | |
| "grad_norm": 0.5149843564039698, | |
| "learning_rate": 3.9662297692359486e-05, | |
| "loss": 0.219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2227625548839569, | |
| "step": 650, | |
| "valid_targets_mean": 4573.6, | |
| "valid_targets_min": 352 | |
| }, | |
| { | |
| "epoch": 1.0790774299835255, | |
| "grad_norm": 0.7996486231537374, | |
| "learning_rate": 3.964709842418351e-05, | |
| "loss": 0.2201, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2152567356824875, | |
| "step": 655, | |
| "valid_targets_mean": 4025.2, | |
| "valid_targets_min": 305 | |
| }, | |
| { | |
| "epoch": 1.0873146622734762, | |
| "grad_norm": 0.47476102749486415, | |
| "learning_rate": 3.9631567642066036e-05, | |
| "loss": 0.2062, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19000795483589172, | |
| "step": 660, | |
| "valid_targets_mean": 5094.8, | |
| "valid_targets_min": 333 | |
| }, | |
| { | |
| "epoch": 1.0955518945634266, | |
| "grad_norm": 0.47703850464320063, | |
| "learning_rate": 3.961570560806461e-05, | |
| "loss": 0.2221, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24766787886619568, | |
| "step": 665, | |
| "valid_targets_mean": 5477.7, | |
| "valid_targets_min": 382 | |
| }, | |
| { | |
| "epoch": 1.1037891268533773, | |
| "grad_norm": 0.5069242817566683, | |
| "learning_rate": 3.959951258982616e-05, | |
| "loss": 0.2256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23409639298915863, | |
| "step": 670, | |
| "valid_targets_mean": 5145.1, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 1.112026359143328, | |
| "grad_norm": 0.4919381443213265, | |
| "learning_rate": 3.958298886058244e-05, | |
| "loss": 0.2184, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.226350337266922, | |
| "step": 675, | |
| "valid_targets_mean": 5394.1, | |
| "valid_targets_min": 408 | |
| }, | |
| { | |
| "epoch": 1.1202635914332784, | |
| "grad_norm": 0.5303991621418709, | |
| "learning_rate": 3.956613469914546e-05, | |
| "loss": 0.2236, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22548964619636536, | |
| "step": 680, | |
| "valid_targets_mean": 6668.1, | |
| "valid_targets_min": 1192 | |
| }, | |
| { | |
| "epoch": 1.128500823723229, | |
| "grad_norm": 0.42339841271119033, | |
| "learning_rate": 3.95489503899027e-05, | |
| "loss": 0.2161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19513583183288574, | |
| "step": 685, | |
| "valid_targets_mean": 5445.3, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 1.1367380560131797, | |
| "grad_norm": 0.4454529502666242, | |
| "learning_rate": 3.9531436222812434e-05, | |
| "loss": 0.2285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22786834836006165, | |
| "step": 690, | |
| "valid_targets_mean": 6334.8, | |
| "valid_targets_min": 385 | |
| }, | |
| { | |
| "epoch": 1.1449752883031301, | |
| "grad_norm": 0.47279587682132945, | |
| "learning_rate": 3.951359249339872e-05, | |
| "loss": 0.2232, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23073595762252808, | |
| "step": 695, | |
| "valid_targets_mean": 5592.6, | |
| "valid_targets_min": 2403 | |
| }, | |
| { | |
| "epoch": 1.1532125205930808, | |
| "grad_norm": 0.4956575408878937, | |
| "learning_rate": 3.9495419502746485e-05, | |
| "loss": 0.2234, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2247496247291565, | |
| "step": 700, | |
| "valid_targets_mean": 4740.6, | |
| "valid_targets_min": 265 | |
| }, | |
| { | |
| "epoch": 1.1614497528830312, | |
| "grad_norm": 0.39388974668957244, | |
| "learning_rate": 3.9476917557496406e-05, | |
| "loss": 0.2045, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1931227445602417, | |
| "step": 705, | |
| "valid_targets_mean": 6726.2, | |
| "valid_targets_min": 3456 | |
| }, | |
| { | |
| "epoch": 1.1696869851729819, | |
| "grad_norm": 0.5233021564155736, | |
| "learning_rate": 3.945808696983977e-05, | |
| "loss": 0.2151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20530879497528076, | |
| "step": 710, | |
| "valid_targets_mean": 4515.9, | |
| "valid_targets_min": 368 | |
| }, | |
| { | |
| "epoch": 1.1779242174629325, | |
| "grad_norm": 0.44802130443163707, | |
| "learning_rate": 3.943892805751318e-05, | |
| "loss": 0.2113, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.221025288105011, | |
| "step": 715, | |
| "valid_targets_mean": 5391.6, | |
| "valid_targets_min": 359 | |
| }, | |
| { | |
| "epoch": 1.186161449752883, | |
| "grad_norm": 0.5190302515381963, | |
| "learning_rate": 3.941944114379321e-05, | |
| "loss": 0.217, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.238490492105484, | |
| "step": 720, | |
| "valid_targets_mean": 4902.8, | |
| "valid_targets_min": 320 | |
| }, | |
| { | |
| "epoch": 1.1943986820428336, | |
| "grad_norm": 0.4587014194986882, | |
| "learning_rate": 3.939962655749092e-05, | |
| "loss": 0.2215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22766120731830597, | |
| "step": 725, | |
| "valid_targets_mean": 6048.8, | |
| "valid_targets_min": 953 | |
| }, | |
| { | |
| "epoch": 1.2026359143327843, | |
| "grad_norm": 0.7269077468395682, | |
| "learning_rate": 3.937948463294636e-05, | |
| "loss": 0.2081, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20253407955169678, | |
| "step": 730, | |
| "valid_targets_mean": 5467.1, | |
| "valid_targets_min": 307 | |
| }, | |
| { | |
| "epoch": 1.2108731466227347, | |
| "grad_norm": 0.47128651735720306, | |
| "learning_rate": 3.935901571002288e-05, | |
| "loss": 0.2086, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21129494905471802, | |
| "step": 735, | |
| "valid_targets_mean": 4862.2, | |
| "valid_targets_min": 451 | |
| }, | |
| { | |
| "epoch": 1.2191103789126854, | |
| "grad_norm": 0.48356488331820596, | |
| "learning_rate": 3.9338220134101426e-05, | |
| "loss": 0.2156, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23600584268569946, | |
| "step": 740, | |
| "valid_targets_mean": 5907.7, | |
| "valid_targets_min": 392 | |
| }, | |
| { | |
| "epoch": 1.227347611202636, | |
| "grad_norm": 0.4054882234317549, | |
| "learning_rate": 3.931709825607469e-05, | |
| "loss": 0.2161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23984688520431519, | |
| "step": 745, | |
| "valid_targets_mean": 6646.1, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 1.2355848434925865, | |
| "grad_norm": 0.5529526522103348, | |
| "learning_rate": 3.9295650432341195e-05, | |
| "loss": 0.2164, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2136891484260559, | |
| "step": 750, | |
| "valid_targets_mean": 4384.1, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 1.2438220757825371, | |
| "grad_norm": 0.44931791363596874, | |
| "learning_rate": 3.927387702479931e-05, | |
| "loss": 0.22, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2048332244157791, | |
| "step": 755, | |
| "valid_targets_mean": 4977.4, | |
| "valid_targets_min": 260 | |
| }, | |
| { | |
| "epoch": 1.2520593080724876, | |
| "grad_norm": 0.5100825061887327, | |
| "learning_rate": 3.925177840084109e-05, | |
| "loss": 0.2163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2232269048690796, | |
| "step": 760, | |
| "valid_targets_mean": 5746.7, | |
| "valid_targets_min": 766 | |
| }, | |
| { | |
| "epoch": 1.2602965403624382, | |
| "grad_norm": 0.46978923825758434, | |
| "learning_rate": 3.922935493334614e-05, | |
| "loss": 0.2057, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20063738524913788, | |
| "step": 765, | |
| "valid_targets_mean": 5882.4, | |
| "valid_targets_min": 805 | |
| }, | |
| { | |
| "epoch": 1.2685337726523889, | |
| "grad_norm": 0.4468454526508097, | |
| "learning_rate": 3.920660700067526e-05, | |
| "loss": 0.2093, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21785524487495422, | |
| "step": 770, | |
| "valid_targets_mean": 5499.2, | |
| "valid_targets_min": 2014 | |
| }, | |
| { | |
| "epoch": 1.2767710049423393, | |
| "grad_norm": 0.40158310809679576, | |
| "learning_rate": 3.9183534986664094e-05, | |
| "loss": 0.212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22869275510311127, | |
| "step": 775, | |
| "valid_targets_mean": 6712.8, | |
| "valid_targets_min": 332 | |
| }, | |
| { | |
| "epoch": 1.28500823723229, | |
| "grad_norm": 0.45415318965640566, | |
| "learning_rate": 3.9160139280616674e-05, | |
| "loss": 0.2209, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21420571208000183, | |
| "step": 780, | |
| "valid_targets_mean": 4906.4, | |
| "valid_targets_min": 2072 | |
| }, | |
| { | |
| "epoch": 1.2932454695222404, | |
| "grad_norm": 0.5581765063544478, | |
| "learning_rate": 3.9136420277298795e-05, | |
| "loss": 0.2207, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2164240926504135, | |
| "step": 785, | |
| "valid_targets_mean": 4428.5, | |
| "valid_targets_min": 370 | |
| }, | |
| { | |
| "epoch": 1.301482701812191, | |
| "grad_norm": 0.49806923271667086, | |
| "learning_rate": 3.911237837693141e-05, | |
| "loss": 0.2039, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21554507315158844, | |
| "step": 790, | |
| "valid_targets_mean": 4793.6, | |
| "valid_targets_min": 230 | |
| }, | |
| { | |
| "epoch": 1.3097199341021417, | |
| "grad_norm": 0.45786660001198226, | |
| "learning_rate": 3.9088013985183836e-05, | |
| "loss": 0.2152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22300955653190613, | |
| "step": 795, | |
| "valid_targets_mean": 5414.1, | |
| "valid_targets_min": 2898 | |
| }, | |
| { | |
| "epoch": 1.3179571663920924, | |
| "grad_norm": 0.4395544205039379, | |
| "learning_rate": 3.9063327513166945e-05, | |
| "loss": 0.2196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22037920355796814, | |
| "step": 800, | |
| "valid_targets_mean": 4776.6, | |
| "valid_targets_min": 338 | |
| }, | |
| { | |
| "epoch": 1.3261943986820428, | |
| "grad_norm": 0.3795581984916046, | |
| "learning_rate": 3.9038319377426186e-05, | |
| "loss": 0.2172, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22801026701927185, | |
| "step": 805, | |
| "valid_targets_mean": 7366.5, | |
| "valid_targets_min": 3706 | |
| }, | |
| { | |
| "epoch": 1.3344316309719935, | |
| "grad_norm": 0.3818588596696933, | |
| "learning_rate": 3.9012989999934595e-05, | |
| "loss": 0.2062, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18441705405712128, | |
| "step": 810, | |
| "valid_targets_mean": 6655.9, | |
| "valid_targets_min": 269 | |
| }, | |
| { | |
| "epoch": 1.342668863261944, | |
| "grad_norm": 0.42620581429826004, | |
| "learning_rate": 3.898733980808565e-05, | |
| "loss": 0.2149, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19475027918815613, | |
| "step": 815, | |
| "valid_targets_mean": 5158.4, | |
| "valid_targets_min": 716 | |
| }, | |
| { | |
| "epoch": 1.3509060955518946, | |
| "grad_norm": 0.4984329967854055, | |
| "learning_rate": 3.896136923468607e-05, | |
| "loss": 0.2187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20715723931789398, | |
| "step": 820, | |
| "valid_targets_mean": 7209.1, | |
| "valid_targets_min": 2177 | |
| }, | |
| { | |
| "epoch": 1.3591433278418452, | |
| "grad_norm": 0.41716974384175576, | |
| "learning_rate": 3.8935078717948507e-05, | |
| "loss": 0.2088, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20283842086791992, | |
| "step": 825, | |
| "valid_targets_mean": 6637.9, | |
| "valid_targets_min": 321 | |
| }, | |
| { | |
| "epoch": 1.3673805601317957, | |
| "grad_norm": 0.44915758841416215, | |
| "learning_rate": 3.890846870148416e-05, | |
| "loss": 0.2047, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20052731037139893, | |
| "step": 830, | |
| "valid_targets_mean": 4960.1, | |
| "valid_targets_min": 381 | |
| }, | |
| { | |
| "epoch": 1.3756177924217463, | |
| "grad_norm": 0.5013943897191635, | |
| "learning_rate": 3.8881539634295287e-05, | |
| "loss": 0.2236, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20464956760406494, | |
| "step": 835, | |
| "valid_targets_mean": 4348.9, | |
| "valid_targets_min": 274 | |
| }, | |
| { | |
| "epoch": 1.3838550247116967, | |
| "grad_norm": 0.5068964263720004, | |
| "learning_rate": 3.88542919707676e-05, | |
| "loss": 0.212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20616479218006134, | |
| "step": 840, | |
| "valid_targets_mean": 4562.9, | |
| "valid_targets_min": 405 | |
| }, | |
| { | |
| "epoch": 1.3920922570016474, | |
| "grad_norm": 0.47570489286531875, | |
| "learning_rate": 3.882672617066266e-05, | |
| "loss": 0.2145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20461995899677277, | |
| "step": 845, | |
| "valid_targets_mean": 4907.6, | |
| "valid_targets_min": 476 | |
| }, | |
| { | |
| "epoch": 1.400329489291598, | |
| "grad_norm": 0.4025399969142752, | |
| "learning_rate": 3.8798842699110075e-05, | |
| "loss": 0.2118, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20368187129497528, | |
| "step": 850, | |
| "valid_targets_mean": 6078.8, | |
| "valid_targets_min": 253 | |
| }, | |
| { | |
| "epoch": 1.4085667215815487, | |
| "grad_norm": 0.6119611882715695, | |
| "learning_rate": 3.877064202659964e-05, | |
| "loss": 0.223, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2284880429506302, | |
| "step": 855, | |
| "valid_targets_mean": 5368.0, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 1.4168039538714992, | |
| "grad_norm": 0.48313598417744963, | |
| "learning_rate": 3.874212462897343e-05, | |
| "loss": 0.2012, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21637935936450958, | |
| "step": 860, | |
| "valid_targets_mean": 5340.4, | |
| "valid_targets_min": 385 | |
| }, | |
| { | |
| "epoch": 1.4250411861614498, | |
| "grad_norm": 0.5025920361955611, | |
| "learning_rate": 3.871329098741776e-05, | |
| "loss": 0.2073, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23110291361808777, | |
| "step": 865, | |
| "valid_targets_mean": 3986.9, | |
| "valid_targets_min": 319 | |
| }, | |
| { | |
| "epoch": 1.4332784184514002, | |
| "grad_norm": 0.44545873367031735, | |
| "learning_rate": 3.8684141588455076e-05, | |
| "loss": 0.2147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2106075882911682, | |
| "step": 870, | |
| "valid_targets_mean": 5611.4, | |
| "valid_targets_min": 493 | |
| }, | |
| { | |
| "epoch": 1.441515650741351, | |
| "grad_norm": 0.4107282913016226, | |
| "learning_rate": 3.865467692393571e-05, | |
| "loss": 0.2066, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21611574292182922, | |
| "step": 875, | |
| "valid_targets_mean": 5921.5, | |
| "valid_targets_min": 964 | |
| }, | |
| { | |
| "epoch": 1.4497528830313016, | |
| "grad_norm": 0.4790434208811441, | |
| "learning_rate": 3.862489749102963e-05, | |
| "loss": 0.209, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20723897218704224, | |
| "step": 880, | |
| "valid_targets_mean": 4724.5, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 1.457990115321252, | |
| "grad_norm": 0.4091069480001445, | |
| "learning_rate": 3.8594803792218e-05, | |
| "loss": 0.2095, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20824208855628967, | |
| "step": 885, | |
| "valid_targets_mean": 5924.8, | |
| "valid_targets_min": 1634 | |
| }, | |
| { | |
| "epoch": 1.4662273476112027, | |
| "grad_norm": 0.5379923390169834, | |
| "learning_rate": 3.856439633528476e-05, | |
| "loss": 0.2069, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2114851176738739, | |
| "step": 890, | |
| "valid_targets_mean": 5122.6, | |
| "valid_targets_min": 362 | |
| }, | |
| { | |
| "epoch": 1.474464579901153, | |
| "grad_norm": 0.4108382282606267, | |
| "learning_rate": 3.853367563330799e-05, | |
| "loss": 0.2146, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21595284342765808, | |
| "step": 895, | |
| "valid_targets_mean": 6272.7, | |
| "valid_targets_min": 1989 | |
| }, | |
| { | |
| "epoch": 1.4827018121911038, | |
| "grad_norm": 0.4556536946526364, | |
| "learning_rate": 3.850264220465134e-05, | |
| "loss": 0.1994, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21706634759902954, | |
| "step": 900, | |
| "valid_targets_mean": 5730.1, | |
| "valid_targets_min": 752 | |
| }, | |
| { | |
| "epoch": 1.4909390444810544, | |
| "grad_norm": 0.4301435195930705, | |
| "learning_rate": 3.847129657295516e-05, | |
| "loss": 0.202, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20228904485702515, | |
| "step": 905, | |
| "valid_targets_mean": 5140.6, | |
| "valid_targets_min": 388 | |
| }, | |
| { | |
| "epoch": 1.499176276771005, | |
| "grad_norm": 0.40466531702536584, | |
| "learning_rate": 3.843963926712778e-05, | |
| "loss": 0.2018, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20213758945465088, | |
| "step": 910, | |
| "valid_targets_mean": 6237.1, | |
| "valid_targets_min": 2286 | |
| }, | |
| { | |
| "epoch": 1.5074135090609555, | |
| "grad_norm": 0.4252404187833234, | |
| "learning_rate": 3.840767082133656e-05, | |
| "loss": 0.2204, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1813409924507141, | |
| "step": 915, | |
| "valid_targets_mean": 5218.1, | |
| "valid_targets_min": 311 | |
| }, | |
| { | |
| "epoch": 1.515650741350906, | |
| "grad_norm": 0.4544865201227028, | |
| "learning_rate": 3.837539177499882e-05, | |
| "loss": 0.208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21020415425300598, | |
| "step": 920, | |
| "valid_targets_mean": 5514.9, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 1.5238879736408566, | |
| "grad_norm": 0.4220823729123365, | |
| "learning_rate": 3.834280267277279e-05, | |
| "loss": 0.2143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18695414066314697, | |
| "step": 925, | |
| "valid_targets_mean": 4791.7, | |
| "valid_targets_min": 369 | |
| }, | |
| { | |
| "epoch": 1.5321252059308073, | |
| "grad_norm": 0.44549970172652165, | |
| "learning_rate": 3.830990406454842e-05, | |
| "loss": 0.2059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2096899449825287, | |
| "step": 930, | |
| "valid_targets_mean": 5779.6, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 1.540362438220758, | |
| "grad_norm": 0.4509202561250749, | |
| "learning_rate": 3.827669650543805e-05, | |
| "loss": 0.2134, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20951557159423828, | |
| "step": 935, | |
| "valid_targets_mean": 5422.6, | |
| "valid_targets_min": 442 | |
| }, | |
| { | |
| "epoch": 1.5485996705107083, | |
| "grad_norm": 0.43475972027953813, | |
| "learning_rate": 3.8243180555767135e-05, | |
| "loss": 0.2196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22084136307239532, | |
| "step": 940, | |
| "valid_targets_mean": 5615.9, | |
| "valid_targets_min": 1337 | |
| }, | |
| { | |
| "epoch": 1.556836902800659, | |
| "grad_norm": 0.4231693225032134, | |
| "learning_rate": 3.8209356781064686e-05, | |
| "loss": 0.2185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22242343425750732, | |
| "step": 945, | |
| "valid_targets_mean": 6372.5, | |
| "valid_targets_min": 326 | |
| }, | |
| { | |
| "epoch": 1.5650741350906094, | |
| "grad_norm": 0.4315275861397971, | |
| "learning_rate": 3.8175225752053816e-05, | |
| "loss": 0.2191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2107992321252823, | |
| "step": 950, | |
| "valid_targets_mean": 5179.2, | |
| "valid_targets_min": 379 | |
| }, | |
| { | |
| "epoch": 1.57331136738056, | |
| "grad_norm": 0.4255890164918819, | |
| "learning_rate": 3.814078804464205e-05, | |
| "loss": 0.2138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2127227485179901, | |
| "step": 955, | |
| "valid_targets_mean": 5411.9, | |
| "valid_targets_min": 1032 | |
| }, | |
| { | |
| "epoch": 1.5815485996705108, | |
| "grad_norm": 0.48605730492655386, | |
| "learning_rate": 3.8106044239911645e-05, | |
| "loss": 0.2139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20564338564872742, | |
| "step": 960, | |
| "valid_targets_mean": 5263.7, | |
| "valid_targets_min": 384 | |
| }, | |
| { | |
| "epoch": 1.5897858319604614, | |
| "grad_norm": 0.40316591610709385, | |
| "learning_rate": 3.8070994924109746e-05, | |
| "loss": 0.2043, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20632685720920563, | |
| "step": 965, | |
| "valid_targets_mean": 6254.6, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 1.5980230642504119, | |
| "grad_norm": 0.4086555388604155, | |
| "learning_rate": 3.803564068863854e-05, | |
| "loss": 0.1907, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19364306330680847, | |
| "step": 970, | |
| "valid_targets_mean": 6633.9, | |
| "valid_targets_min": 3712 | |
| }, | |
| { | |
| "epoch": 1.6062602965403623, | |
| "grad_norm": 0.38782328074430267, | |
| "learning_rate": 3.7999982130045246e-05, | |
| "loss": 0.2016, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2168004810810089, | |
| "step": 975, | |
| "valid_targets_mean": 6655.2, | |
| "valid_targets_min": 3691 | |
| }, | |
| { | |
| "epoch": 1.614497528830313, | |
| "grad_norm": 0.4330948997917468, | |
| "learning_rate": 3.796401985001206e-05, | |
| "loss": 0.215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2149375081062317, | |
| "step": 980, | |
| "valid_targets_mean": 6287.2, | |
| "valid_targets_min": 318 | |
| }, | |
| { | |
| "epoch": 1.6227347611202636, | |
| "grad_norm": 0.4733788567450916, | |
| "learning_rate": 3.792775445534601e-05, | |
| "loss": 0.2095, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19296729564666748, | |
| "step": 985, | |
| "valid_targets_mean": 5054.3, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 1.6309719934102143, | |
| "grad_norm": 0.458391192134762, | |
| "learning_rate": 3.7891186557968665e-05, | |
| "loss": 0.2034, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21453091502189636, | |
| "step": 990, | |
| "valid_targets_mean": 6577.2, | |
| "valid_targets_min": 4384 | |
| }, | |
| { | |
| "epoch": 1.6392092257001647, | |
| "grad_norm": 0.4743187663655409, | |
| "learning_rate": 3.785431677490592e-05, | |
| "loss": 0.1971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17525094747543335, | |
| "step": 995, | |
| "valid_targets_mean": 6421.0, | |
| "valid_targets_min": 1150 | |
| }, | |
| { | |
| "epoch": 1.6474464579901154, | |
| "grad_norm": 0.5108222541349242, | |
| "learning_rate": 3.781714572827746e-05, | |
| "loss": 0.2161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20602372288703918, | |
| "step": 1000, | |
| "valid_targets_mean": 4063.1, | |
| "valid_targets_min": 345 | |
| }, | |
| { | |
| "epoch": 1.6556836902800658, | |
| "grad_norm": 0.42592240451237107, | |
| "learning_rate": 3.777967404528636e-05, | |
| "loss": 0.2089, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21424970030784607, | |
| "step": 1005, | |
| "valid_targets_mean": 6034.2, | |
| "valid_targets_min": 407 | |
| }, | |
| { | |
| "epoch": 1.6639209225700164, | |
| "grad_norm": 0.40989244289770793, | |
| "learning_rate": 3.774190235820843e-05, | |
| "loss": 0.2194, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19179269671440125, | |
| "step": 1010, | |
| "valid_targets_mean": 5793.4, | |
| "valid_targets_min": 373 | |
| }, | |
| { | |
| "epoch": 1.672158154859967, | |
| "grad_norm": 0.42689685330038074, | |
| "learning_rate": 3.770383130438159e-05, | |
| "loss": 0.2129, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19655553996562958, | |
| "step": 1015, | |
| "valid_targets_mean": 5861.7, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 1.6803953871499178, | |
| "grad_norm": 0.5214662975794156, | |
| "learning_rate": 3.76654615261951e-05, | |
| "loss": 0.2079, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22449663281440735, | |
| "step": 1020, | |
| "valid_targets_mean": 5074.8, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 1.6886326194398682, | |
| "grad_norm": 0.394367346520587, | |
| "learning_rate": 3.7626793671078743e-05, | |
| "loss": 0.2154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18750452995300293, | |
| "step": 1025, | |
| "valid_targets_mean": 6294.2, | |
| "valid_targets_min": 301 | |
| }, | |
| { | |
| "epoch": 1.6968698517298186, | |
| "grad_norm": 0.6127740131059339, | |
| "learning_rate": 3.758782839149187e-05, | |
| "loss": 0.2, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18330290913581848, | |
| "step": 1030, | |
| "valid_targets_mean": 4697.9, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 1.7051070840197693, | |
| "grad_norm": 0.4847112036985494, | |
| "learning_rate": 3.754856634491238e-05, | |
| "loss": 0.2018, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1978096067905426, | |
| "step": 1035, | |
| "valid_targets_mean": 5872.2, | |
| "valid_targets_min": 233 | |
| }, | |
| { | |
| "epoch": 1.71334431630972, | |
| "grad_norm": 0.5058791546645084, | |
| "learning_rate": 3.75090081938257e-05, | |
| "loss": 0.2188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23929041624069214, | |
| "step": 1040, | |
| "valid_targets_mean": 5497.1, | |
| "valid_targets_min": 459 | |
| }, | |
| { | |
| "epoch": 1.7215815485996706, | |
| "grad_norm": 0.41712531401074715, | |
| "learning_rate": 3.7469154605713496e-05, | |
| "loss": 0.204, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18083135783672333, | |
| "step": 1045, | |
| "valid_targets_mean": 5281.0, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 1.729818780889621, | |
| "grad_norm": 0.45224754824628705, | |
| "learning_rate": 3.742900625304252e-05, | |
| "loss": 0.2046, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20000770688056946, | |
| "step": 1050, | |
| "valid_targets_mean": 5335.8, | |
| "valid_targets_min": 442 | |
| }, | |
| { | |
| "epoch": 1.7380560131795717, | |
| "grad_norm": 0.4276669151075953, | |
| "learning_rate": 3.7388563813253194e-05, | |
| "loss": 0.2026, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21081122756004333, | |
| "step": 1055, | |
| "valid_targets_mean": 5431.8, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 1.7462932454695221, | |
| "grad_norm": 0.42270391290398784, | |
| "learning_rate": 3.7347827968748196e-05, | |
| "loss": 0.2142, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21518349647521973, | |
| "step": 1060, | |
| "valid_targets_mean": 4905.4, | |
| "valid_targets_min": 361 | |
| }, | |
| { | |
| "epoch": 1.7545304777594728, | |
| "grad_norm": 0.4938586648526237, | |
| "learning_rate": 3.730679940688094e-05, | |
| "loss": 0.213, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2053917944431305, | |
| "step": 1065, | |
| "valid_targets_mean": 5753.6, | |
| "valid_targets_min": 549 | |
| }, | |
| { | |
| "epoch": 1.7627677100494235, | |
| "grad_norm": 0.41970694178562035, | |
| "learning_rate": 3.7265478819944004e-05, | |
| "loss": 0.211, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2033730149269104, | |
| "step": 1070, | |
| "valid_targets_mean": 5415.0, | |
| "valid_targets_min": 331 | |
| }, | |
| { | |
| "epoch": 1.771004942339374, | |
| "grad_norm": 0.5513001892847159, | |
| "learning_rate": 3.722386690515742e-05, | |
| "loss": 0.2144, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2219332456588745, | |
| "step": 1075, | |
| "valid_targets_mean": 4620.9, | |
| "valid_targets_min": 443 | |
| }, | |
| { | |
| "epoch": 1.7792421746293245, | |
| "grad_norm": 0.5136327419552082, | |
| "learning_rate": 3.718196436465693e-05, | |
| "loss": 0.206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19483298063278198, | |
| "step": 1080, | |
| "valid_targets_mean": 5695.8, | |
| "valid_targets_min": 301 | |
| }, | |
| { | |
| "epoch": 1.787479406919275, | |
| "grad_norm": 0.46368358882142574, | |
| "learning_rate": 3.713977190548211e-05, | |
| "loss": 0.2152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2157488614320755, | |
| "step": 1085, | |
| "valid_targets_mean": 5854.6, | |
| "valid_targets_min": 1771 | |
| }, | |
| { | |
| "epoch": 1.7957166392092256, | |
| "grad_norm": 0.4376783891180776, | |
| "learning_rate": 3.709729023956448e-05, | |
| "loss": 0.208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20793405175209045, | |
| "step": 1090, | |
| "valid_targets_mean": 5121.4, | |
| "valid_targets_min": 306 | |
| }, | |
| { | |
| "epoch": 1.8039538714991763, | |
| "grad_norm": 0.6020766343191308, | |
| "learning_rate": 3.705452008371545e-05, | |
| "loss": 0.2072, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20232078433036804, | |
| "step": 1095, | |
| "valid_targets_mean": 5209.9, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 1.812191103789127, | |
| "grad_norm": 0.3929612039377728, | |
| "learning_rate": 3.701146215961426e-05, | |
| "loss": 0.2007, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1929633766412735, | |
| "step": 1100, | |
| "valid_targets_mean": 6037.1, | |
| "valid_targets_min": 2363 | |
| }, | |
| { | |
| "epoch": 1.8204283360790774, | |
| "grad_norm": 0.47583432200204007, | |
| "learning_rate": 3.696811719379578e-05, | |
| "loss": 0.1964, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19097734987735748, | |
| "step": 1105, | |
| "valid_targets_mean": 5827.8, | |
| "valid_targets_min": 337 | |
| }, | |
| { | |
| "epoch": 1.828665568369028, | |
| "grad_norm": 0.42084059788905986, | |
| "learning_rate": 3.692448591763827e-05, | |
| "loss": 0.2096, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20220178365707397, | |
| "step": 1110, | |
| "valid_targets_mean": 5310.2, | |
| "valid_targets_min": 414 | |
| }, | |
| { | |
| "epoch": 1.8369028006589785, | |
| "grad_norm": 0.621033541298011, | |
| "learning_rate": 3.6880569067350994e-05, | |
| "loss": 0.2113, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2220468968153, | |
| "step": 1115, | |
| "valid_targets_mean": 5355.2, | |
| "valid_targets_min": 967 | |
| }, | |
| { | |
| "epoch": 1.8451400329489291, | |
| "grad_norm": 0.5417180016003424, | |
| "learning_rate": 3.683636738396187e-05, | |
| "loss": 0.215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22816117107868195, | |
| "step": 1120, | |
| "valid_targets_mean": 3872.6, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 1.8533772652388798, | |
| "grad_norm": 0.39419195825057896, | |
| "learning_rate": 3.6791881613304916e-05, | |
| "loss": 0.199, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1901552528142929, | |
| "step": 1125, | |
| "valid_targets_mean": 6075.9, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 1.8616144975288305, | |
| "grad_norm": 0.48439504999155003, | |
| "learning_rate": 3.674711250600766e-05, | |
| "loss": 0.2356, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23211458325386047, | |
| "step": 1130, | |
| "valid_targets_mean": 4931.1, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 1.869851729818781, | |
| "grad_norm": 0.4566773154514815, | |
| "learning_rate": 3.6702060817478524e-05, | |
| "loss": 0.2082, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20753763616085052, | |
| "step": 1135, | |
| "valid_targets_mean": 5041.4, | |
| "valid_targets_min": 576 | |
| }, | |
| { | |
| "epoch": 1.8780889621087313, | |
| "grad_norm": 0.4206786715682531, | |
| "learning_rate": 3.6656727307894e-05, | |
| "loss": 0.2028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2010621726512909, | |
| "step": 1140, | |
| "valid_targets_mean": 5559.8, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 1.886326194398682, | |
| "grad_norm": 0.4398821973871485, | |
| "learning_rate": 3.6611112742185913e-05, | |
| "loss": 0.2045, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21115025877952576, | |
| "step": 1145, | |
| "valid_targets_mean": 4765.1, | |
| "valid_targets_min": 755 | |
| }, | |
| { | |
| "epoch": 1.8945634266886326, | |
| "grad_norm": 0.446473805886963, | |
| "learning_rate": 3.6565217890028435e-05, | |
| "loss": 0.2197, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19354012608528137, | |
| "step": 1150, | |
| "valid_targets_mean": 5087.8, | |
| "valid_targets_min": 295 | |
| }, | |
| { | |
| "epoch": 1.9028006589785833, | |
| "grad_norm": 0.37804036389153517, | |
| "learning_rate": 3.651904352582514e-05, | |
| "loss": 0.2072, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19648414850234985, | |
| "step": 1155, | |
| "valid_targets_mean": 6540.2, | |
| "valid_targets_min": 356 | |
| }, | |
| { | |
| "epoch": 1.9110378912685337, | |
| "grad_norm": 0.6203253075668428, | |
| "learning_rate": 3.647259042869595e-05, | |
| "loss": 0.2186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23002168536186218, | |
| "step": 1160, | |
| "valid_targets_mean": 4733.4, | |
| "valid_targets_min": 323 | |
| }, | |
| { | |
| "epoch": 1.9192751235584844, | |
| "grad_norm": 0.38880529466871483, | |
| "learning_rate": 3.642585938246394e-05, | |
| "loss": 0.1957, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19138900935649872, | |
| "step": 1165, | |
| "valid_targets_mean": 6324.9, | |
| "valid_targets_min": 373 | |
| }, | |
| { | |
| "epoch": 1.9275123558484348, | |
| "grad_norm": 0.42675443770878757, | |
| "learning_rate": 3.6378851175642166e-05, | |
| "loss": 0.2018, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20937415957450867, | |
| "step": 1170, | |
| "valid_targets_mean": 5846.4, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 1.9357495881383855, | |
| "grad_norm": 0.4296389983820003, | |
| "learning_rate": 3.63315666014203e-05, | |
| "loss": 0.2071, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19081468880176544, | |
| "step": 1175, | |
| "valid_targets_mean": 5246.9, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 1.9439868204283361, | |
| "grad_norm": 0.5039473970395317, | |
| "learning_rate": 3.6284006457651295e-05, | |
| "loss": 0.214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19345268607139587, | |
| "step": 1180, | |
| "valid_targets_mean": 5206.6, | |
| "valid_targets_min": 760 | |
| }, | |
| { | |
| "epoch": 1.9522240527182868, | |
| "grad_norm": 0.4820224223402811, | |
| "learning_rate": 3.6236171546837936e-05, | |
| "loss": 0.2022, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22416003048419952, | |
| "step": 1185, | |
| "valid_targets_mean": 4467.2, | |
| "valid_targets_min": 1428 | |
| }, | |
| { | |
| "epoch": 1.9604612850082372, | |
| "grad_norm": 0.4511616532068528, | |
| "learning_rate": 3.618806267611923e-05, | |
| "loss": 0.211, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21213218569755554, | |
| "step": 1190, | |
| "valid_targets_mean": 5811.9, | |
| "valid_targets_min": 414 | |
| }, | |
| { | |
| "epoch": 1.9686985172981877, | |
| "grad_norm": 0.4572207471838012, | |
| "learning_rate": 3.613968065725685e-05, | |
| "loss": 0.2094, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23261676728725433, | |
| "step": 1195, | |
| "valid_targets_mean": 6750.7, | |
| "valid_targets_min": 3184 | |
| }, | |
| { | |
| "epoch": 1.9769357495881383, | |
| "grad_norm": 0.37778737740814883, | |
| "learning_rate": 3.60910263066214e-05, | |
| "loss": 0.2042, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2077624648809433, | |
| "step": 1200, | |
| "valid_targets_mean": 6281.2, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 1.985172981878089, | |
| "grad_norm": 0.44767855166484793, | |
| "learning_rate": 3.604210044517867e-05, | |
| "loss": 0.1944, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19422310590744019, | |
| "step": 1205, | |
| "valid_targets_mean": 4698.5, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 1.9934102141680397, | |
| "grad_norm": 0.4453341741888798, | |
| "learning_rate": 3.599290389847576e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2169584333896637, | |
| "step": 1210, | |
| "valid_targets_mean": 6427.6, | |
| "valid_targets_min": 1029 | |
| }, | |
| { | |
| "epoch": 2.0016474464579903, | |
| "grad_norm": 0.4691655308778174, | |
| "learning_rate": 3.594343749662716e-05, | |
| "loss": 0.1958, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16686594486236572, | |
| "step": 1215, | |
| "valid_targets_mean": 5029.6, | |
| "valid_targets_min": 250 | |
| }, | |
| { | |
| "epoch": 2.0098846787479405, | |
| "grad_norm": 0.49854351226994864, | |
| "learning_rate": 3.5893702074300745e-05, | |
| "loss": 0.1902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19464629888534546, | |
| "step": 1220, | |
| "valid_targets_mean": 5672.7, | |
| "valid_targets_min": 2131 | |
| }, | |
| { | |
| "epoch": 2.018121911037891, | |
| "grad_norm": 0.4964318584547705, | |
| "learning_rate": 3.584369847070369e-05, | |
| "loss": 0.1925, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21309590339660645, | |
| "step": 1225, | |
| "valid_targets_mean": 5729.2, | |
| "valid_targets_min": 393 | |
| }, | |
| { | |
| "epoch": 2.026359143327842, | |
| "grad_norm": 0.4023341001155962, | |
| "learning_rate": 3.579342752956829e-05, | |
| "loss": 0.1915, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1911289393901825, | |
| "step": 1230, | |
| "valid_targets_mean": 6125.6, | |
| "valid_targets_min": 2194 | |
| }, | |
| { | |
| "epoch": 2.0345963756177925, | |
| "grad_norm": 0.46394946737605847, | |
| "learning_rate": 3.574289009913776e-05, | |
| "loss": 0.1993, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1845487356185913, | |
| "step": 1235, | |
| "valid_targets_mean": 5679.7, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 2.042833607907743, | |
| "grad_norm": 0.4215099447257407, | |
| "learning_rate": 3.5692087032151915e-05, | |
| "loss": 0.1806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1543101668357849, | |
| "step": 1240, | |
| "valid_targets_mean": 4842.9, | |
| "valid_targets_min": 453 | |
| }, | |
| { | |
| "epoch": 2.0510708401976934, | |
| "grad_norm": 0.4781452383124886, | |
| "learning_rate": 3.564101918583274e-05, | |
| "loss": 0.183, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18705658614635468, | |
| "step": 1245, | |
| "valid_targets_mean": 4635.4, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 2.059308072487644, | |
| "grad_norm": 0.4242016530032853, | |
| "learning_rate": 3.5589687421869964e-05, | |
| "loss": 0.1811, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1837356686592102, | |
| "step": 1250, | |
| "valid_targets_mean": 6262.1, | |
| "valid_targets_min": 1453 | |
| }, | |
| { | |
| "epoch": 2.0675453047775947, | |
| "grad_norm": 0.47101868005036085, | |
| "learning_rate": 3.553809260640653e-05, | |
| "loss": 0.1855, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17792844772338867, | |
| "step": 1255, | |
| "valid_targets_mean": 5523.5, | |
| "valid_targets_min": 307 | |
| }, | |
| { | |
| "epoch": 2.0757825370675453, | |
| "grad_norm": 0.44058326488327854, | |
| "learning_rate": 3.5486235610023935e-05, | |
| "loss": 0.1768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17145931720733643, | |
| "step": 1260, | |
| "valid_targets_mean": 5019.3, | |
| "valid_targets_min": 320 | |
| }, | |
| { | |
| "epoch": 2.084019769357496, | |
| "grad_norm": 0.4720538729006096, | |
| "learning_rate": 3.54341173077276e-05, | |
| "loss": 0.2022, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19867974519729614, | |
| "step": 1265, | |
| "valid_targets_mean": 5480.0, | |
| "valid_targets_min": 386 | |
| }, | |
| { | |
| "epoch": 2.0922570016474467, | |
| "grad_norm": 0.49979798033678424, | |
| "learning_rate": 3.5381738578932035e-05, | |
| "loss": 0.1881, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19164550304412842, | |
| "step": 1270, | |
| "valid_targets_mean": 5641.0, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 2.100494233937397, | |
| "grad_norm": 0.5126675244916216, | |
| "learning_rate": 3.532910030744607e-05, | |
| "loss": 0.1838, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17244023084640503, | |
| "step": 1275, | |
| "valid_targets_mean": 5319.9, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 2.1087314662273475, | |
| "grad_norm": 0.4510430101034234, | |
| "learning_rate": 3.5276203381457894e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18565404415130615, | |
| "step": 1280, | |
| "valid_targets_mean": 6039.1, | |
| "valid_targets_min": 672 | |
| }, | |
| { | |
| "epoch": 2.116968698517298, | |
| "grad_norm": 0.4308787285559032, | |
| "learning_rate": 3.5223048693520094e-05, | |
| "loss": 0.185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18874096870422363, | |
| "step": 1285, | |
| "valid_targets_mean": 5633.6, | |
| "valid_targets_min": 484 | |
| }, | |
| { | |
| "epoch": 2.125205930807249, | |
| "grad_norm": 0.468459853247736, | |
| "learning_rate": 3.5169637140534574e-05, | |
| "loss": 0.1858, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19510608911514282, | |
| "step": 1290, | |
| "valid_targets_mean": 5250.2, | |
| "valid_targets_min": 589 | |
| }, | |
| { | |
| "epoch": 2.1334431630971995, | |
| "grad_norm": 0.4658684095500768, | |
| "learning_rate": 3.5115969623737447e-05, | |
| "loss": 0.1957, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18428108096122742, | |
| "step": 1295, | |
| "valid_targets_mean": 5244.8, | |
| "valid_targets_min": 390 | |
| }, | |
| { | |
| "epoch": 2.1416803953871497, | |
| "grad_norm": 0.4137733677060094, | |
| "learning_rate": 3.506204704868379e-05, | |
| "loss": 0.1905, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1802988052368164, | |
| "step": 1300, | |
| "valid_targets_mean": 5942.2, | |
| "valid_targets_min": 2248 | |
| }, | |
| { | |
| "epoch": 2.1499176276771004, | |
| "grad_norm": 0.4398392014709175, | |
| "learning_rate": 3.500787032523242e-05, | |
| "loss": 0.1945, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20117981731891632, | |
| "step": 1305, | |
| "valid_targets_mean": 5333.5, | |
| "valid_targets_min": 351 | |
| }, | |
| { | |
| "epoch": 2.158154859967051, | |
| "grad_norm": 0.41011198319086234, | |
| "learning_rate": 3.495344036753048e-05, | |
| "loss": 0.204, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18782642483711243, | |
| "step": 1310, | |
| "valid_targets_mean": 6204.0, | |
| "valid_targets_min": 332 | |
| }, | |
| { | |
| "epoch": 2.1663920922570017, | |
| "grad_norm": 0.4747376443014503, | |
| "learning_rate": 3.489875809399806e-05, | |
| "loss": 0.1801, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18644480407238007, | |
| "step": 1315, | |
| "valid_targets_mean": 5435.4, | |
| "valid_targets_min": 341 | |
| }, | |
| { | |
| "epoch": 2.1746293245469523, | |
| "grad_norm": 0.41595362322660934, | |
| "learning_rate": 3.484382442731268e-05, | |
| "loss": 0.1878, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1779285967350006, | |
| "step": 1320, | |
| "valid_targets_mean": 6901.5, | |
| "valid_targets_min": 3389 | |
| }, | |
| { | |
| "epoch": 2.182866556836903, | |
| "grad_norm": 0.40494315889864735, | |
| "learning_rate": 3.4788640294393725e-05, | |
| "loss": 0.1843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19018739461898804, | |
| "step": 1325, | |
| "valid_targets_mean": 7549.8, | |
| "valid_targets_min": 348 | |
| }, | |
| { | |
| "epoch": 2.191103789126853, | |
| "grad_norm": 0.4280136085385474, | |
| "learning_rate": 3.47332066263868e-05, | |
| "loss": 0.1872, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18496966361999512, | |
| "step": 1330, | |
| "valid_targets_mean": 5089.6, | |
| "valid_targets_min": 390 | |
| }, | |
| { | |
| "epoch": 2.199341021416804, | |
| "grad_norm": 0.5025570901923755, | |
| "learning_rate": 3.467752435864803e-05, | |
| "loss": 0.1818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16634967923164368, | |
| "step": 1335, | |
| "valid_targets_mean": 4767.1, | |
| "valid_targets_min": 468 | |
| }, | |
| { | |
| "epoch": 2.2075782537067545, | |
| "grad_norm": 0.4025390072839639, | |
| "learning_rate": 3.462159443072826e-05, | |
| "loss": 0.187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17967519164085388, | |
| "step": 1340, | |
| "valid_targets_mean": 6592.0, | |
| "valid_targets_min": 294 | |
| }, | |
| { | |
| "epoch": 2.215815485996705, | |
| "grad_norm": 0.5084259282496368, | |
| "learning_rate": 3.456541778635722e-05, | |
| "loss": 0.2037, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2011231929063797, | |
| "step": 1345, | |
| "valid_targets_mean": 5602.5, | |
| "valid_targets_min": 338 | |
| }, | |
| { | |
| "epoch": 2.224052718286656, | |
| "grad_norm": 0.6204597597615517, | |
| "learning_rate": 3.45089953734276e-05, | |
| "loss": 0.1904, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.187906876206398, | |
| "step": 1350, | |
| "valid_targets_mean": 5300.2, | |
| "valid_targets_min": 502 | |
| }, | |
| { | |
| "epoch": 2.232289950576606, | |
| "grad_norm": 0.43339460655750894, | |
| "learning_rate": 3.445232814397902e-05, | |
| "loss": 0.1934, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18730773031711578, | |
| "step": 1355, | |
| "valid_targets_mean": 5800.6, | |
| "valid_targets_min": 2008 | |
| }, | |
| { | |
| "epoch": 2.2405271828665567, | |
| "grad_norm": 0.49437914997453064, | |
| "learning_rate": 3.4395417054182034e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1944451928138733, | |
| "step": 1360, | |
| "valid_targets_mean": 5698.8, | |
| "valid_targets_min": 650 | |
| }, | |
| { | |
| "epoch": 2.2487644151565074, | |
| "grad_norm": 0.4679757922035915, | |
| "learning_rate": 3.433826306432193e-05, | |
| "loss": 0.1955, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1887608915567398, | |
| "step": 1365, | |
| "valid_targets_mean": 5416.2, | |
| "valid_targets_min": 451 | |
| }, | |
| { | |
| "epoch": 2.257001647446458, | |
| "grad_norm": 0.4833050563367706, | |
| "learning_rate": 3.428086713878255e-05, | |
| "loss": 0.1916, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18888002634048462, | |
| "step": 1370, | |
| "valid_targets_mean": 5162.8, | |
| "valid_targets_min": 561 | |
| }, | |
| { | |
| "epoch": 2.2652388797364087, | |
| "grad_norm": 0.9103976886906952, | |
| "learning_rate": 3.422323024603004e-05, | |
| "loss": 0.1872, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1994047462940216, | |
| "step": 1375, | |
| "valid_targets_mean": 4683.3, | |
| "valid_targets_min": 836 | |
| }, | |
| { | |
| "epoch": 2.2734761120263594, | |
| "grad_norm": 0.5014385943250483, | |
| "learning_rate": 3.4165353358596475e-05, | |
| "loss": 0.1864, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21568092703819275, | |
| "step": 1380, | |
| "valid_targets_mean": 4668.3, | |
| "valid_targets_min": 375 | |
| }, | |
| { | |
| "epoch": 2.2817133443163096, | |
| "grad_norm": 0.4840041536578911, | |
| "learning_rate": 3.410723745306348e-05, | |
| "loss": 0.185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19082829356193542, | |
| "step": 1385, | |
| "valid_targets_mean": 4344.3, | |
| "valid_targets_min": 622 | |
| }, | |
| { | |
| "epoch": 2.2899505766062602, | |
| "grad_norm": 0.49307871381195234, | |
| "learning_rate": 3.404888351004571e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18089547753334045, | |
| "step": 1390, | |
| "valid_targets_mean": 4720.2, | |
| "valid_targets_min": 374 | |
| }, | |
| { | |
| "epoch": 2.298187808896211, | |
| "grad_norm": 0.49935101497553586, | |
| "learning_rate": 3.3990292514174346e-05, | |
| "loss": 0.1838, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1871870756149292, | |
| "step": 1395, | |
| "valid_targets_mean": 5122.2, | |
| "valid_targets_min": 382 | |
| }, | |
| { | |
| "epoch": 2.3064250411861615, | |
| "grad_norm": 0.4076277361465614, | |
| "learning_rate": 3.393146545408047e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18360033631324768, | |
| "step": 1400, | |
| "valid_targets_mean": 6602.8, | |
| "valid_targets_min": 491 | |
| }, | |
| { | |
| "epoch": 2.314662273476112, | |
| "grad_norm": 0.4223706656199542, | |
| "learning_rate": 3.3872403322378354e-05, | |
| "loss": 0.1908, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18353493511676788, | |
| "step": 1405, | |
| "valid_targets_mean": 6359.8, | |
| "valid_targets_min": 525 | |
| }, | |
| { | |
| "epoch": 2.3228995057660624, | |
| "grad_norm": 0.4849400359217284, | |
| "learning_rate": 3.3813107115648745e-05, | |
| "loss": 0.1924, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2019674926996231, | |
| "step": 1410, | |
| "valid_targets_mean": 5309.1, | |
| "valid_targets_min": 328 | |
| }, | |
| { | |
| "epoch": 2.331136738056013, | |
| "grad_norm": 0.4575537443988003, | |
| "learning_rate": 3.375357783442205e-05, | |
| "loss": 0.1938, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20391184091567993, | |
| "step": 1415, | |
| "valid_targets_mean": 5717.2, | |
| "valid_targets_min": 1413 | |
| }, | |
| { | |
| "epoch": 2.3393739703459637, | |
| "grad_norm": 0.5731198968413408, | |
| "learning_rate": 3.369381648316144e-05, | |
| "loss": 0.1859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18056750297546387, | |
| "step": 1420, | |
| "valid_targets_mean": 4796.6, | |
| "valid_targets_min": 330 | |
| }, | |
| { | |
| "epoch": 2.3476112026359144, | |
| "grad_norm": 0.4055911598397016, | |
| "learning_rate": 3.363382407024588e-05, | |
| "loss": 0.1933, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19828687608242035, | |
| "step": 1425, | |
| "valid_targets_mean": 6127.6, | |
| "valid_targets_min": 305 | |
| }, | |
| { | |
| "epoch": 2.355848434925865, | |
| "grad_norm": 0.514361900038238, | |
| "learning_rate": 3.357360160795317e-05, | |
| "loss": 0.1816, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1833609938621521, | |
| "step": 1430, | |
| "valid_targets_mean": 5110.0, | |
| "valid_targets_min": 635 | |
| }, | |
| { | |
| "epoch": 2.3640856672158153, | |
| "grad_norm": 0.4939780199608309, | |
| "learning_rate": 3.35131501124428e-05, | |
| "loss": 0.1828, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18957363069057465, | |
| "step": 1435, | |
| "valid_targets_mean": 5170.4, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 2.372322899505766, | |
| "grad_norm": 0.4366021181201065, | |
| "learning_rate": 3.345247060373887e-05, | |
| "loss": 0.1873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19346314668655396, | |
| "step": 1440, | |
| "valid_targets_mean": 5597.6, | |
| "valid_targets_min": 342 | |
| }, | |
| { | |
| "epoch": 2.3805601317957166, | |
| "grad_norm": 0.5616766081872978, | |
| "learning_rate": 3.3391564105712824e-05, | |
| "loss": 0.1877, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19493432343006134, | |
| "step": 1445, | |
| "valid_targets_mean": 3618.6, | |
| "valid_targets_min": 318 | |
| }, | |
| { | |
| "epoch": 2.3887973640856672, | |
| "grad_norm": 0.4983461122768153, | |
| "learning_rate": 3.333043164606618e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.199821338057518, | |
| "step": 1450, | |
| "valid_targets_mean": 4254.8, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 2.397034596375618, | |
| "grad_norm": 0.4498876894639847, | |
| "learning_rate": 3.3269074256313246e-05, | |
| "loss": 0.1877, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20630428194999695, | |
| "step": 1455, | |
| "valid_targets_mean": 5124.8, | |
| "valid_targets_min": 293 | |
| }, | |
| { | |
| "epoch": 2.4052718286655685, | |
| "grad_norm": 0.46335332374164745, | |
| "learning_rate": 3.320749297176363e-05, | |
| "loss": 0.1804, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19408290088176727, | |
| "step": 1460, | |
| "valid_targets_mean": 6146.5, | |
| "valid_targets_min": 233 | |
| }, | |
| { | |
| "epoch": 2.4135090609555188, | |
| "grad_norm": 0.432494507185893, | |
| "learning_rate": 3.3145688831504845e-05, | |
| "loss": 0.168, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1638782024383545, | |
| "step": 1465, | |
| "valid_targets_mean": 6058.1, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 2.4217462932454694, | |
| "grad_norm": 0.4233371763444396, | |
| "learning_rate": 3.308366287838472e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17348337173461914, | |
| "step": 1470, | |
| "valid_targets_mean": 6430.6, | |
| "valid_targets_min": 1391 | |
| }, | |
| { | |
| "epoch": 2.42998352553542, | |
| "grad_norm": 0.5173084245980966, | |
| "learning_rate": 3.302141615899384e-05, | |
| "loss": 0.1845, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16355247795581818, | |
| "step": 1475, | |
| "valid_targets_mean": 5664.9, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 2.4382207578253707, | |
| "grad_norm": 0.5007091972827029, | |
| "learning_rate": 3.2958949723647875e-05, | |
| "loss": 0.1883, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18168076872825623, | |
| "step": 1480, | |
| "valid_targets_mean": 4153.4, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 2.4464579901153214, | |
| "grad_norm": 0.4540610488829723, | |
| "learning_rate": 3.289626462636985e-05, | |
| "loss": 0.1882, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19426754117012024, | |
| "step": 1485, | |
| "valid_targets_mean": 5505.4, | |
| "valid_targets_min": 1120 | |
| }, | |
| { | |
| "epoch": 2.454695222405272, | |
| "grad_norm": 0.3850445234018962, | |
| "learning_rate": 3.283336192487238e-05, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17522047460079193, | |
| "step": 1490, | |
| "valid_targets_mean": 7046.2, | |
| "valid_targets_min": 760 | |
| }, | |
| { | |
| "epoch": 2.4629324546952223, | |
| "grad_norm": 0.49847271228259965, | |
| "learning_rate": 3.277024268053978e-05, | |
| "loss": 0.193, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19626614451408386, | |
| "step": 1495, | |
| "valid_targets_mean": 4122.3, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 2.471169686985173, | |
| "grad_norm": 0.4299543359076253, | |
| "learning_rate": 3.2706907958410234e-05, | |
| "loss": 0.1892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21099016070365906, | |
| "step": 1500, | |
| "valid_targets_mean": 5716.4, | |
| "valid_targets_min": 230 | |
| }, | |
| { | |
| "epoch": 2.4794069192751236, | |
| "grad_norm": 0.5168100523876604, | |
| "learning_rate": 3.2643358827157724e-05, | |
| "loss": 0.1774, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1860761046409607, | |
| "step": 1505, | |
| "valid_targets_mean": 4338.4, | |
| "valid_targets_min": 418 | |
| }, | |
| { | |
| "epoch": 2.4876441515650742, | |
| "grad_norm": 0.4324694230224118, | |
| "learning_rate": 3.257959635907409e-05, | |
| "loss": 0.1919, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21066716313362122, | |
| "step": 1510, | |
| "valid_targets_mean": 6845.1, | |
| "valid_targets_min": 3145 | |
| }, | |
| { | |
| "epoch": 2.495881383855025, | |
| "grad_norm": 0.4426728632174381, | |
| "learning_rate": 3.251562163005087e-05, | |
| "loss": 0.1794, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17823879420757294, | |
| "step": 1515, | |
| "valid_targets_mean": 4796.9, | |
| "valid_targets_min": 317 | |
| }, | |
| { | |
| "epoch": 2.504118616144975, | |
| "grad_norm": 0.4615007706928971, | |
| "learning_rate": 3.245143571956117e-05, | |
| "loss": 0.1834, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17330557107925415, | |
| "step": 1520, | |
| "valid_targets_mean": 5856.2, | |
| "valid_targets_min": 299 | |
| }, | |
| { | |
| "epoch": 2.5123558484349258, | |
| "grad_norm": 0.4929827190489361, | |
| "learning_rate": 3.2387039710641486e-05, | |
| "loss": 0.1983, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23035567998886108, | |
| "step": 1525, | |
| "valid_targets_mean": 5336.1, | |
| "valid_targets_min": 314 | |
| }, | |
| { | |
| "epoch": 2.5205930807248764, | |
| "grad_norm": 0.47838149878259656, | |
| "learning_rate": 3.232243468987336e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18333536386489868, | |
| "step": 1530, | |
| "valid_targets_mean": 4696.2, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 2.528830313014827, | |
| "grad_norm": 0.4774794225296381, | |
| "learning_rate": 3.2257621747365086e-05, | |
| "loss": 0.1825, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18388643860816956, | |
| "step": 1535, | |
| "valid_targets_mean": 4851.4, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 2.5370675453047777, | |
| "grad_norm": 0.5784682298913758, | |
| "learning_rate": 3.219260197673333e-05, | |
| "loss": 0.1912, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1999967098236084, | |
| "step": 1540, | |
| "valid_targets_mean": 5049.6, | |
| "valid_targets_min": 339 | |
| }, | |
| { | |
| "epoch": 2.545304777594728, | |
| "grad_norm": 0.4870283917003109, | |
| "learning_rate": 3.212737647508464e-05, | |
| "loss": 0.191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1813567727804184, | |
| "step": 1545, | |
| "valid_targets_mean": 4931.9, | |
| "valid_targets_min": 460 | |
| }, | |
| { | |
| "epoch": 2.5535420098846786, | |
| "grad_norm": 0.4912766743384124, | |
| "learning_rate": 3.206194634299696e-05, | |
| "loss": 0.1835, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18001456558704376, | |
| "step": 1550, | |
| "valid_targets_mean": 4067.6, | |
| "valid_targets_min": 354 | |
| }, | |
| { | |
| "epoch": 2.5617792421746293, | |
| "grad_norm": 0.4177379128740769, | |
| "learning_rate": 3.199631268450104e-05, | |
| "loss": 0.1882, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1830926239490509, | |
| "step": 1555, | |
| "valid_targets_mean": 5638.4, | |
| "valid_targets_min": 452 | |
| }, | |
| { | |
| "epoch": 2.57001647446458, | |
| "grad_norm": 0.4569501634208448, | |
| "learning_rate": 3.1930476607061835e-05, | |
| "loss": 0.187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1815715730190277, | |
| "step": 1560, | |
| "valid_targets_mean": 6495.7, | |
| "valid_targets_min": 1668 | |
| }, | |
| { | |
| "epoch": 2.5782537067545306, | |
| "grad_norm": 0.4755155809990493, | |
| "learning_rate": 3.1864439221559797e-05, | |
| "loss": 0.1921, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2156897485256195, | |
| "step": 1565, | |
| "valid_targets_mean": 5195.4, | |
| "valid_targets_min": 367 | |
| }, | |
| { | |
| "epoch": 2.586490939044481, | |
| "grad_norm": 0.4352493070918896, | |
| "learning_rate": 3.179820164227213e-05, | |
| "loss": 0.1874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1801442950963974, | |
| "step": 1570, | |
| "valid_targets_mean": 5301.5, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 2.594728171334432, | |
| "grad_norm": 0.5418117949683932, | |
| "learning_rate": 3.1731764986853985e-05, | |
| "loss": 0.193, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20953936874866486, | |
| "step": 1575, | |
| "valid_targets_mean": 4256.2, | |
| "valid_targets_min": 273 | |
| }, | |
| { | |
| "epoch": 2.602965403624382, | |
| "grad_norm": 0.4712036121831422, | |
| "learning_rate": 3.166513037631965e-05, | |
| "loss": 0.1785, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18240496516227722, | |
| "step": 1580, | |
| "valid_targets_mean": 5122.9, | |
| "valid_targets_min": 386 | |
| }, | |
| { | |
| "epoch": 2.6112026359143328, | |
| "grad_norm": 0.4479622079225454, | |
| "learning_rate": 3.1598298935023536e-05, | |
| "loss": 0.1833, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17743943631649017, | |
| "step": 1585, | |
| "valid_targets_mean": 4750.2, | |
| "valid_targets_min": 323 | |
| }, | |
| { | |
| "epoch": 2.6194398682042834, | |
| "grad_norm": 0.4363238458372522, | |
| "learning_rate": 3.153127179064131e-05, | |
| "loss": 0.19, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1790778636932373, | |
| "step": 1590, | |
| "valid_targets_mean": 5834.9, | |
| "valid_targets_min": 1060 | |
| }, | |
| { | |
| "epoch": 2.627677100494234, | |
| "grad_norm": 0.4144716349542137, | |
| "learning_rate": 3.146405007415081e-05, | |
| "loss": 0.1838, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18715772032737732, | |
| "step": 1595, | |
| "valid_targets_mean": 5915.6, | |
| "valid_targets_min": 2732 | |
| }, | |
| { | |
| "epoch": 2.6359143327841847, | |
| "grad_norm": 0.402511904543039, | |
| "learning_rate": 3.1396634919812955e-05, | |
| "loss": 0.1798, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18299850821495056, | |
| "step": 1600, | |
| "valid_targets_mean": 6042.9, | |
| "valid_targets_min": 1751 | |
| }, | |
| { | |
| "epoch": 2.644151565074135, | |
| "grad_norm": 0.42949378682217304, | |
| "learning_rate": 3.1329027465152654e-05, | |
| "loss": 0.1797, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19716359674930573, | |
| "step": 1605, | |
| "valid_targets_mean": 5723.6, | |
| "valid_targets_min": 1438 | |
| }, | |
| { | |
| "epoch": 2.6523887973640856, | |
| "grad_norm": 0.41523450789292266, | |
| "learning_rate": 3.126122885093955e-05, | |
| "loss": 0.1846, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1815474033355713, | |
| "step": 1610, | |
| "valid_targets_mean": 6400.8, | |
| "valid_targets_min": 290 | |
| }, | |
| { | |
| "epoch": 2.6606260296540363, | |
| "grad_norm": 0.42400960287790107, | |
| "learning_rate": 3.1193240221168835e-05, | |
| "loss": 0.1912, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19176776707172394, | |
| "step": 1615, | |
| "valid_targets_mean": 7038.9, | |
| "valid_targets_min": 790 | |
| }, | |
| { | |
| "epoch": 2.668863261943987, | |
| "grad_norm": 0.4519946593514569, | |
| "learning_rate": 3.112506272304189e-05, | |
| "loss": 0.1875, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17423108220100403, | |
| "step": 1620, | |
| "valid_targets_mean": 4906.9, | |
| "valid_targets_min": 326 | |
| }, | |
| { | |
| "epoch": 2.6771004942339376, | |
| "grad_norm": 0.40628518983943557, | |
| "learning_rate": 3.1056697506946965e-05, | |
| "loss": 0.1858, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19080524146556854, | |
| "step": 1625, | |
| "valid_targets_mean": 6026.1, | |
| "valid_targets_min": 2394 | |
| }, | |
| { | |
| "epoch": 2.685337726523888, | |
| "grad_norm": 0.43758527040491757, | |
| "learning_rate": 3.0988145726439754e-05, | |
| "loss": 0.1894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19035451114177704, | |
| "step": 1630, | |
| "valid_targets_mean": 5603.6, | |
| "valid_targets_min": 600 | |
| }, | |
| { | |
| "epoch": 2.6935749588138385, | |
| "grad_norm": 0.4213759976520408, | |
| "learning_rate": 3.091940853822393e-05, | |
| "loss": 0.1904, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17566537857055664, | |
| "step": 1635, | |
| "valid_targets_mean": 5556.9, | |
| "valid_targets_min": 692 | |
| }, | |
| { | |
| "epoch": 2.701812191103789, | |
| "grad_norm": 0.41064334724669393, | |
| "learning_rate": 3.0850487102131624e-05, | |
| "loss": 0.1802, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17263707518577576, | |
| "step": 1640, | |
| "valid_targets_mean": 6448.4, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 2.7100494233937398, | |
| "grad_norm": 0.40647920955642636, | |
| "learning_rate": 3.078138258110389e-05, | |
| "loss": 0.1925, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1742803454399109, | |
| "step": 1645, | |
| "valid_targets_mean": 7037.6, | |
| "valid_targets_min": 600 | |
| }, | |
| { | |
| "epoch": 2.7182866556836904, | |
| "grad_norm": 0.4439437502158048, | |
| "learning_rate": 3.071209614117103e-05, | |
| "loss": 0.1869, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18262037634849548, | |
| "step": 1650, | |
| "valid_targets_mean": 5216.6, | |
| "valid_targets_min": 386 | |
| }, | |
| { | |
| "epoch": 2.7265238879736406, | |
| "grad_norm": 0.4272024997497859, | |
| "learning_rate": 3.0642628951432916e-05, | |
| "loss": 0.1811, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18403027951717377, | |
| "step": 1655, | |
| "valid_targets_mean": 5560.3, | |
| "valid_targets_min": 384 | |
| }, | |
| { | |
| "epoch": 2.7347611202635913, | |
| "grad_norm": 0.4106960597797865, | |
| "learning_rate": 3.057298218403934e-05, | |
| "loss": 0.1764, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16496650874614716, | |
| "step": 1660, | |
| "valid_targets_mean": 5558.3, | |
| "valid_targets_min": 515 | |
| }, | |
| { | |
| "epoch": 2.742998352553542, | |
| "grad_norm": 0.4573999255888336, | |
| "learning_rate": 3.050315701417016e-05, | |
| "loss": 0.1854, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1997423619031906, | |
| "step": 1665, | |
| "valid_targets_mean": 5737.6, | |
| "valid_targets_min": 556 | |
| }, | |
| { | |
| "epoch": 2.7512355848434926, | |
| "grad_norm": 0.4588007386145714, | |
| "learning_rate": 3.043315462001549e-05, | |
| "loss": 0.1779, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1824597418308258, | |
| "step": 1670, | |
| "valid_targets_mean": 6000.8, | |
| "valid_targets_min": 291 | |
| }, | |
| { | |
| "epoch": 2.7594728171334433, | |
| "grad_norm": 0.45545623228765714, | |
| "learning_rate": 3.0362976182755837e-05, | |
| "loss": 0.1922, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1609131544828415, | |
| "step": 1675, | |
| "valid_targets_mean": 4831.3, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 2.7677100494233935, | |
| "grad_norm": 0.43744918321348536, | |
| "learning_rate": 3.0292622886542154e-05, | |
| "loss": 0.1934, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19479726254940033, | |
| "step": 1680, | |
| "valid_targets_mean": 5040.9, | |
| "valid_targets_min": 290 | |
| }, | |
| { | |
| "epoch": 2.775947281713344, | |
| "grad_norm": 0.3723836457441067, | |
| "learning_rate": 3.0222095918475873e-05, | |
| "loss": 0.1836, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16930866241455078, | |
| "step": 1685, | |
| "valid_targets_mean": 7237.4, | |
| "valid_targets_min": 3984 | |
| }, | |
| { | |
| "epoch": 2.784184514003295, | |
| "grad_norm": 0.4397650823089542, | |
| "learning_rate": 3.0151396468588848e-05, | |
| "loss": 0.1955, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2224178910255432, | |
| "step": 1690, | |
| "valid_targets_mean": 5372.0, | |
| "valid_targets_min": 716 | |
| }, | |
| { | |
| "epoch": 2.7924217462932455, | |
| "grad_norm": 0.4766636021138799, | |
| "learning_rate": 3.0080525729823306e-05, | |
| "loss": 0.1988, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19022653996944427, | |
| "step": 1695, | |
| "valid_targets_mean": 5674.8, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 2.800658978583196, | |
| "grad_norm": 0.5783854027781857, | |
| "learning_rate": 3.0009484898011707e-05, | |
| "loss": 0.1826, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18798254430294037, | |
| "step": 1700, | |
| "valid_targets_mean": 6211.0, | |
| "valid_targets_min": 384 | |
| }, | |
| { | |
| "epoch": 2.808896210873147, | |
| "grad_norm": 0.6044879964158381, | |
| "learning_rate": 2.993827517185656e-05, | |
| "loss": 0.1976, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20792728662490845, | |
| "step": 1705, | |
| "valid_targets_mean": 2618.7, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 2.8171334431630974, | |
| "grad_norm": 0.48777693314470333, | |
| "learning_rate": 2.9866897752910194e-05, | |
| "loss": 0.1781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18926022946834564, | |
| "step": 1710, | |
| "valid_targets_mean": 4935.6, | |
| "valid_targets_min": 383 | |
| }, | |
| { | |
| "epoch": 2.8253706754530477, | |
| "grad_norm": 0.4355724130572252, | |
| "learning_rate": 2.979535384555449e-05, | |
| "loss": 0.1895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18864750862121582, | |
| "step": 1715, | |
| "valid_targets_mean": 6106.8, | |
| "valid_targets_min": 452 | |
| }, | |
| { | |
| "epoch": 2.8336079077429983, | |
| "grad_norm": 0.5005269538168059, | |
| "learning_rate": 2.972364465698058e-05, | |
| "loss": 0.1718, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17418703436851501, | |
| "step": 1720, | |
| "valid_targets_mean": 4066.6, | |
| "valid_targets_min": 343 | |
| }, | |
| { | |
| "epoch": 2.841845140032949, | |
| "grad_norm": 0.35405912220850505, | |
| "learning_rate": 2.9651771397168443e-05, | |
| "loss": 0.1864, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16310462355613708, | |
| "step": 1725, | |
| "valid_targets_mean": 7079.8, | |
| "valid_targets_min": 3151 | |
| }, | |
| { | |
| "epoch": 2.8500823723228996, | |
| "grad_norm": 0.4118485814337778, | |
| "learning_rate": 2.95797352788665e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19650161266326904, | |
| "step": 1730, | |
| "valid_targets_mean": 6517.4, | |
| "valid_targets_min": 1213 | |
| }, | |
| { | |
| "epoch": 2.8583196046128503, | |
| "grad_norm": 0.3985995816730429, | |
| "learning_rate": 2.9507537517571144e-05, | |
| "loss": 0.1812, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16181087493896484, | |
| "step": 1735, | |
| "valid_targets_mean": 5469.2, | |
| "valid_targets_min": 329 | |
| }, | |
| { | |
| "epoch": 2.8665568369028005, | |
| "grad_norm": 0.41174127791281484, | |
| "learning_rate": 2.943517933150627e-05, | |
| "loss": 0.1792, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16833308339118958, | |
| "step": 1740, | |
| "valid_targets_mean": 6566.2, | |
| "valid_targets_min": 697 | |
| }, | |
| { | |
| "epoch": 2.874794069192751, | |
| "grad_norm": 0.49914804119853634, | |
| "learning_rate": 2.936266194160266e-05, | |
| "loss": 0.1934, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19393935799598694, | |
| "step": 1745, | |
| "valid_targets_mean": 4929.6, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 2.883031301482702, | |
| "grad_norm": 0.45669196924100525, | |
| "learning_rate": 2.9289986571477437e-05, | |
| "loss": 0.1893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18913184106349945, | |
| "step": 1750, | |
| "valid_targets_mean": 5159.7, | |
| "valid_targets_min": 345 | |
| }, | |
| { | |
| "epoch": 2.8912685337726525, | |
| "grad_norm": 0.4050992240314355, | |
| "learning_rate": 2.9217154447413353e-05, | |
| "loss": 0.1856, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19301308691501617, | |
| "step": 1755, | |
| "valid_targets_mean": 5732.8, | |
| "valid_targets_min": 643 | |
| }, | |
| { | |
| "epoch": 2.899505766062603, | |
| "grad_norm": 0.4166433021084068, | |
| "learning_rate": 2.9144166798338183e-05, | |
| "loss": 0.1813, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16880372166633606, | |
| "step": 1760, | |
| "valid_targets_mean": 5482.8, | |
| "valid_targets_min": 346 | |
| }, | |
| { | |
| "epoch": 2.9077429983525533, | |
| "grad_norm": 0.4876653800212749, | |
| "learning_rate": 2.9071024855803915e-05, | |
| "loss": 0.175, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16207365691661835, | |
| "step": 1765, | |
| "valid_targets_mean": 5472.4, | |
| "valid_targets_min": 313 | |
| }, | |
| { | |
| "epoch": 2.915980230642504, | |
| "grad_norm": 0.4432418577836535, | |
| "learning_rate": 2.899772985396599e-05, | |
| "loss": 0.18, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17681077122688293, | |
| "step": 1770, | |
| "valid_targets_mean": 5663.9, | |
| "valid_targets_min": 2714 | |
| }, | |
| { | |
| "epoch": 2.9242174629324547, | |
| "grad_norm": 0.4057865576825728, | |
| "learning_rate": 2.8924283029562507e-05, | |
| "loss": 0.1753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17672967910766602, | |
| "step": 1775, | |
| "valid_targets_mean": 6175.4, | |
| "valid_targets_min": 1138 | |
| }, | |
| { | |
| "epoch": 2.9324546952224053, | |
| "grad_norm": 0.5907926016816272, | |
| "learning_rate": 2.8850685621893325e-05, | |
| "loss": 0.1944, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1969795525074005, | |
| "step": 1780, | |
| "valid_targets_mean": 4073.1, | |
| "valid_targets_min": 306 | |
| }, | |
| { | |
| "epoch": 2.940691927512356, | |
| "grad_norm": 0.44725505789253106, | |
| "learning_rate": 2.8776938872799154e-05, | |
| "loss": 0.1916, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1989360749721527, | |
| "step": 1785, | |
| "valid_targets_mean": 5965.4, | |
| "valid_targets_min": 282 | |
| }, | |
| { | |
| "epoch": 2.948929159802306, | |
| "grad_norm": 0.4760043999500545, | |
| "learning_rate": 2.8703044026640606e-05, | |
| "loss": 0.1756, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18851858377456665, | |
| "step": 1790, | |
| "valid_targets_mean": 4529.5, | |
| "valid_targets_min": 408 | |
| }, | |
| { | |
| "epoch": 2.957166392092257, | |
| "grad_norm": 0.5146855051453981, | |
| "learning_rate": 2.8629002330277193e-05, | |
| "loss": 0.1809, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19129858911037445, | |
| "step": 1795, | |
| "valid_targets_mean": 4191.7, | |
| "valid_targets_min": 313 | |
| }, | |
| { | |
| "epoch": 2.9654036243822075, | |
| "grad_norm": 0.5551959217056265, | |
| "learning_rate": 2.8554815033046314e-05, | |
| "loss": 0.1859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21176767349243164, | |
| "step": 1800, | |
| "valid_targets_mean": 5302.6, | |
| "valid_targets_min": 318 | |
| }, | |
| { | |
| "epoch": 2.973640856672158, | |
| "grad_norm": 0.43494351702575534, | |
| "learning_rate": 2.8480483386742138e-05, | |
| "loss": 0.1778, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1887683868408203, | |
| "step": 1805, | |
| "valid_targets_mean": 5780.3, | |
| "valid_targets_min": 3622 | |
| }, | |
| { | |
| "epoch": 2.981878088962109, | |
| "grad_norm": 0.6709852335599017, | |
| "learning_rate": 2.8406008645594503e-05, | |
| "loss": 0.1755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18137386441230774, | |
| "step": 1810, | |
| "valid_targets_mean": 6075.0, | |
| "valid_targets_min": 280 | |
| }, | |
| { | |
| "epoch": 2.990115321252059, | |
| "grad_norm": 0.4533452043383325, | |
| "learning_rate": 2.8331392066247737e-05, | |
| "loss": 0.1829, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1966426968574524, | |
| "step": 1815, | |
| "valid_targets_mean": 6176.1, | |
| "valid_targets_min": 2014 | |
| }, | |
| { | |
| "epoch": 2.99835255354201, | |
| "grad_norm": 0.4872843448253986, | |
| "learning_rate": 2.825663490773949e-05, | |
| "loss": 0.1921, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2191101312637329, | |
| "step": 1820, | |
| "valid_targets_mean": 4317.1, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 3.0065897858319603, | |
| "grad_norm": 0.4925134741814334, | |
| "learning_rate": 2.8181738431479436e-05, | |
| "loss": 0.1654, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17591401934623718, | |
| "step": 1825, | |
| "valid_targets_mean": 5124.8, | |
| "valid_targets_min": 300 | |
| }, | |
| { | |
| "epoch": 3.014827018121911, | |
| "grad_norm": 0.46951633331962356, | |
| "learning_rate": 2.8106703901228038e-05, | |
| "loss": 0.1724, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16612932085990906, | |
| "step": 1830, | |
| "valid_targets_mean": 5064.6, | |
| "valid_targets_min": 382 | |
| }, | |
| { | |
| "epoch": 3.0230642504118617, | |
| "grad_norm": 0.4201136329542275, | |
| "learning_rate": 2.803153258307519e-05, | |
| "loss": 0.1627, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1488945186138153, | |
| "step": 1835, | |
| "valid_targets_mean": 5890.2, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 3.0313014827018123, | |
| "grad_norm": 0.4070141782866309, | |
| "learning_rate": 2.795622574541887e-05, | |
| "loss": 0.1545, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13417211174964905, | |
| "step": 1840, | |
| "valid_targets_mean": 6251.1, | |
| "valid_targets_min": 400 | |
| }, | |
| { | |
| "epoch": 3.039538714991763, | |
| "grad_norm": 0.4474445147917244, | |
| "learning_rate": 2.7880784658943754e-05, | |
| "loss": 0.1673, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.162965327501297, | |
| "step": 1845, | |
| "valid_targets_mean": 5795.4, | |
| "valid_targets_min": 2153 | |
| }, | |
| { | |
| "epoch": 3.047775947281713, | |
| "grad_norm": 0.5042184100386404, | |
| "learning_rate": 2.780521059659972e-05, | |
| "loss": 0.1735, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17654483020305634, | |
| "step": 1850, | |
| "valid_targets_mean": 4581.5, | |
| "valid_targets_min": 763 | |
| }, | |
| { | |
| "epoch": 3.056013179571664, | |
| "grad_norm": 0.4690284250626636, | |
| "learning_rate": 2.7729504833580435e-05, | |
| "loss": 0.1723, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17030039429664612, | |
| "step": 1855, | |
| "valid_targets_mean": 5360.2, | |
| "valid_targets_min": 515 | |
| }, | |
| { | |
| "epoch": 3.0642504118616145, | |
| "grad_norm": 0.4457242021093399, | |
| "learning_rate": 2.7653668647301797e-05, | |
| "loss": 0.1598, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15688768029212952, | |
| "step": 1860, | |
| "valid_targets_mean": 5233.9, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 3.072487644151565, | |
| "grad_norm": 0.5262957209265869, | |
| "learning_rate": 2.7577703317380396e-05, | |
| "loss": 0.1596, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.162797212600708, | |
| "step": 1865, | |
| "valid_targets_mean": 5007.5, | |
| "valid_targets_min": 318 | |
| }, | |
| { | |
| "epoch": 3.080724876441516, | |
| "grad_norm": 0.435607664368139, | |
| "learning_rate": 2.7501610125611908e-05, | |
| "loss": 0.1639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16544128954410553, | |
| "step": 1870, | |
| "valid_targets_mean": 5627.2, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 3.088962108731466, | |
| "grad_norm": 0.48754061786282266, | |
| "learning_rate": 2.7425390355949485e-05, | |
| "loss": 0.1718, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17840203642845154, | |
| "step": 1875, | |
| "valid_targets_mean": 5176.1, | |
| "valid_targets_min": 664 | |
| }, | |
| { | |
| "epoch": 3.0971993410214167, | |
| "grad_norm": 0.4713253285078091, | |
| "learning_rate": 2.7349045294482085e-05, | |
| "loss": 0.1593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16155745089054108, | |
| "step": 1880, | |
| "valid_targets_mean": 5086.4, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 3.1054365733113674, | |
| "grad_norm": 0.43308469551129186, | |
| "learning_rate": 2.727257622941276e-05, | |
| "loss": 0.1629, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1810024082660675, | |
| "step": 1885, | |
| "valid_targets_mean": 6242.1, | |
| "valid_targets_min": 2910 | |
| }, | |
| { | |
| "epoch": 3.113673805601318, | |
| "grad_norm": 0.4364624354543134, | |
| "learning_rate": 2.719598445103693e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1708287000656128, | |
| "step": 1890, | |
| "valid_targets_mean": 5678.8, | |
| "valid_targets_min": 3190 | |
| }, | |
| { | |
| "epoch": 3.1219110378912687, | |
| "grad_norm": 0.4737469378096958, | |
| "learning_rate": 2.7119271251720628e-05, | |
| "loss": 0.1647, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18933191895484924, | |
| "step": 1895, | |
| "valid_targets_mean": 5111.1, | |
| "valid_targets_min": 556 | |
| }, | |
| { | |
| "epoch": 3.130148270181219, | |
| "grad_norm": 0.41676344263541504, | |
| "learning_rate": 2.7042437925878638e-05, | |
| "loss": 0.1673, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15466684103012085, | |
| "step": 1900, | |
| "valid_targets_mean": 5804.4, | |
| "valid_targets_min": 649 | |
| }, | |
| { | |
| "epoch": 3.1383855024711695, | |
| "grad_norm": 0.5653553337325623, | |
| "learning_rate": 2.696548576995273e-05, | |
| "loss": 0.1783, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18597179651260376, | |
| "step": 1905, | |
| "valid_targets_mean": 5242.3, | |
| "valid_targets_min": 359 | |
| }, | |
| { | |
| "epoch": 3.14662273476112, | |
| "grad_norm": 0.4879099377031038, | |
| "learning_rate": 2.688841608238971e-05, | |
| "loss": 0.1683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16499802470207214, | |
| "step": 1910, | |
| "valid_targets_mean": 4917.2, | |
| "valid_targets_min": 311 | |
| }, | |
| { | |
| "epoch": 3.154859967051071, | |
| "grad_norm": 0.4667216223779826, | |
| "learning_rate": 2.6811230163619573e-05, | |
| "loss": 0.1684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16547727584838867, | |
| "step": 1915, | |
| "valid_targets_mean": 4575.3, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 3.1630971993410215, | |
| "grad_norm": 0.42534284414973267, | |
| "learning_rate": 2.6733929316033516e-05, | |
| "loss": 0.1547, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1600956916809082, | |
| "step": 1920, | |
| "valid_targets_mean": 5993.5, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 3.171334431630972, | |
| "grad_norm": 0.4682657752517238, | |
| "learning_rate": 2.6656514843961995e-05, | |
| "loss": 0.1667, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17428819835186005, | |
| "step": 1925, | |
| "valid_targets_mean": 5779.7, | |
| "valid_targets_min": 385 | |
| }, | |
| { | |
| "epoch": 3.1795716639209224, | |
| "grad_norm": 0.47746321265894026, | |
| "learning_rate": 2.6578988053652674e-05, | |
| "loss": 0.1715, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15723752975463867, | |
| "step": 1930, | |
| "valid_targets_mean": 4613.3, | |
| "valid_targets_min": 286 | |
| }, | |
| { | |
| "epoch": 3.187808896210873, | |
| "grad_norm": 0.4951364632733134, | |
| "learning_rate": 2.650135025324845e-05, | |
| "loss": 0.1651, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1764582395553589, | |
| "step": 1935, | |
| "valid_targets_mean": 4870.2, | |
| "valid_targets_min": 373 | |
| }, | |
| { | |
| "epoch": 3.1960461285008237, | |
| "grad_norm": 0.40725037546610426, | |
| "learning_rate": 2.6423602752765313e-05, | |
| "loss": 0.1553, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13663309812545776, | |
| "step": 1940, | |
| "valid_targets_mean": 5494.2, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 3.2042833607907744, | |
| "grad_norm": 0.42521249997987376, | |
| "learning_rate": 2.6345746864070266e-05, | |
| "loss": 0.1652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15042027831077576, | |
| "step": 1945, | |
| "valid_targets_mean": 6342.6, | |
| "valid_targets_min": 484 | |
| }, | |
| { | |
| "epoch": 3.212520593080725, | |
| "grad_norm": 0.5413285488120383, | |
| "learning_rate": 2.626778390085921e-05, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16320699453353882, | |
| "step": 1950, | |
| "valid_targets_mean": 5744.4, | |
| "valid_targets_min": 429 | |
| }, | |
| { | |
| "epoch": 3.2207578253706757, | |
| "grad_norm": 0.43089481661574836, | |
| "learning_rate": 2.618971517863475e-05, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12983350455760956, | |
| "step": 1955, | |
| "valid_targets_mean": 5079.2, | |
| "valid_targets_min": 971 | |
| }, | |
| { | |
| "epoch": 3.228995057660626, | |
| "grad_norm": 0.4911902168052223, | |
| "learning_rate": 2.611154201468402e-05, | |
| "loss": 0.1676, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16104954481124878, | |
| "step": 1960, | |
| "valid_targets_mean": 5500.1, | |
| "valid_targets_min": 347 | |
| }, | |
| { | |
| "epoch": 3.2372322899505765, | |
| "grad_norm": 0.4133285626323947, | |
| "learning_rate": 2.6033265728056437e-05, | |
| "loss": 0.1694, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14631503820419312, | |
| "step": 1965, | |
| "valid_targets_mean": 5502.2, | |
| "valid_targets_min": 1724 | |
| }, | |
| { | |
| "epoch": 3.245469522240527, | |
| "grad_norm": 0.5213460978823949, | |
| "learning_rate": 2.5954887639541435e-05, | |
| "loss": 0.1733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17967018485069275, | |
| "step": 1970, | |
| "valid_targets_mean": 5081.2, | |
| "valid_targets_min": 452 | |
| }, | |
| { | |
| "epoch": 3.253706754530478, | |
| "grad_norm": 0.4681430241946772, | |
| "learning_rate": 2.5876409071646224e-05, | |
| "loss": 0.1626, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1624462753534317, | |
| "step": 1975, | |
| "valid_targets_mean": 5415.8, | |
| "valid_targets_min": 252 | |
| }, | |
| { | |
| "epoch": 3.2619439868204285, | |
| "grad_norm": 0.46303449443059874, | |
| "learning_rate": 2.5797831348573425e-05, | |
| "loss": 0.1665, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17155566811561584, | |
| "step": 1980, | |
| "valid_targets_mean": 5258.4, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 3.2701812191103787, | |
| "grad_norm": 0.532316430057948, | |
| "learning_rate": 2.571915579619876e-05, | |
| "loss": 0.1683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17313048243522644, | |
| "step": 1985, | |
| "valid_targets_mean": 4552.9, | |
| "valid_targets_min": 459 | |
| }, | |
| { | |
| "epoch": 3.2784184514003294, | |
| "grad_norm": 0.4197831861072301, | |
| "learning_rate": 2.5640383742048654e-05, | |
| "loss": 0.1794, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18682700395584106, | |
| "step": 1990, | |
| "valid_targets_mean": 6343.2, | |
| "valid_targets_min": 311 | |
| }, | |
| { | |
| "epoch": 3.28665568369028, | |
| "grad_norm": 0.47975068369647395, | |
| "learning_rate": 2.5561516515277863e-05, | |
| "loss": 0.1635, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1698731780052185, | |
| "step": 1995, | |
| "valid_targets_mean": 4894.1, | |
| "valid_targets_min": 429 | |
| }, | |
| { | |
| "epoch": 3.2948929159802307, | |
| "grad_norm": 0.44143749997803633, | |
| "learning_rate": 2.5482555446647018e-05, | |
| "loss": 0.1712, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15826047956943512, | |
| "step": 2000, | |
| "valid_targets_mean": 4921.8, | |
| "valid_targets_min": 385 | |
| }, | |
| { | |
| "epoch": 3.3031301482701814, | |
| "grad_norm": 0.47872780397929104, | |
| "learning_rate": 2.5403501868500185e-05, | |
| "loss": 0.1711, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1793244183063507, | |
| "step": 2005, | |
| "valid_targets_mean": 5263.2, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 3.3113673805601316, | |
| "grad_norm": 0.43795563431108253, | |
| "learning_rate": 2.5324357114742397e-05, | |
| "loss": 0.162, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1651221364736557, | |
| "step": 2010, | |
| "valid_targets_mean": 5599.2, | |
| "valid_targets_min": 999 | |
| }, | |
| { | |
| "epoch": 3.3196046128500822, | |
| "grad_norm": 0.4194171717207259, | |
| "learning_rate": 2.5245122520817117e-05, | |
| "loss": 0.1685, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16536174714565277, | |
| "step": 2015, | |
| "valid_targets_mean": 5968.4, | |
| "valid_targets_min": 253 | |
| }, | |
| { | |
| "epoch": 3.327841845140033, | |
| "grad_norm": 0.630395759920917, | |
| "learning_rate": 2.516579942368373e-05, | |
| "loss": 0.1726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16959768533706665, | |
| "step": 2020, | |
| "valid_targets_mean": 4148.6, | |
| "valid_targets_min": 295 | |
| }, | |
| { | |
| "epoch": 3.3360790774299836, | |
| "grad_norm": 0.4597314779792269, | |
| "learning_rate": 2.508638916179497e-05, | |
| "loss": 0.1592, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16048160195350647, | |
| "step": 2025, | |
| "valid_targets_mean": 5179.0, | |
| "valid_targets_min": 416 | |
| }, | |
| { | |
| "epoch": 3.344316309719934, | |
| "grad_norm": 0.40076416546655863, | |
| "learning_rate": 2.5006893075074326e-05, | |
| "loss": 0.1638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14166682958602905, | |
| "step": 2030, | |
| "valid_targets_mean": 5947.9, | |
| "valid_targets_min": 408 | |
| }, | |
| { | |
| "epoch": 3.352553542009885, | |
| "grad_norm": 0.6022307050548966, | |
| "learning_rate": 2.4927312504893467e-05, | |
| "loss": 0.1577, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14992672204971313, | |
| "step": 2035, | |
| "valid_targets_mean": 6018.8, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 3.360790774299835, | |
| "grad_norm": 0.48131640597198694, | |
| "learning_rate": 2.4847648794049587e-05, | |
| "loss": 0.1592, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.177500382065773, | |
| "step": 2040, | |
| "valid_targets_mean": 4628.6, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 3.3690280065897857, | |
| "grad_norm": 0.40253163403627884, | |
| "learning_rate": 2.4767903286742732e-05, | |
| "loss": 0.1682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.170414999127388, | |
| "step": 2045, | |
| "valid_targets_mean": 6258.0, | |
| "valid_targets_min": 2733 | |
| }, | |
| { | |
| "epoch": 3.3772652388797364, | |
| "grad_norm": 0.429610208965216, | |
| "learning_rate": 2.468807732855314e-05, | |
| "loss": 0.1666, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15542641282081604, | |
| "step": 2050, | |
| "valid_targets_mean": 5290.5, | |
| "valid_targets_min": 239 | |
| }, | |
| { | |
| "epoch": 3.385502471169687, | |
| "grad_norm": 0.5228662176128691, | |
| "learning_rate": 2.4608172266418543e-05, | |
| "loss": 0.1939, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2773077189922333, | |
| "step": 2055, | |
| "valid_targets_mean": 6195.2, | |
| "valid_targets_min": 307 | |
| }, | |
| { | |
| "epoch": 3.3937397034596377, | |
| "grad_norm": 0.4679719439792988, | |
| "learning_rate": 2.4528189448611417e-05, | |
| "loss": 0.1711, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15937143564224243, | |
| "step": 2060, | |
| "valid_targets_mean": 5585.6, | |
| "valid_targets_min": 1274 | |
| }, | |
| { | |
| "epoch": 3.4019769357495884, | |
| "grad_norm": 0.6292251061768565, | |
| "learning_rate": 2.4448130224716248e-05, | |
| "loss": 0.168, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17720317840576172, | |
| "step": 2065, | |
| "valid_targets_mean": 4486.8, | |
| "valid_targets_min": 440 | |
| }, | |
| { | |
| "epoch": 3.4102141680395386, | |
| "grad_norm": 0.41773728976670876, | |
| "learning_rate": 2.436799594560675e-05, | |
| "loss": 0.1747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16367585957050323, | |
| "step": 2070, | |
| "valid_targets_mean": 5920.7, | |
| "valid_targets_min": 282 | |
| }, | |
| { | |
| "epoch": 3.4184514003294892, | |
| "grad_norm": 0.5569444713805546, | |
| "learning_rate": 2.4287787963423076e-05, | |
| "loss": 0.164, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.172664612531662, | |
| "step": 2075, | |
| "valid_targets_mean": 4224.7, | |
| "valid_targets_min": 269 | |
| }, | |
| { | |
| "epoch": 3.42668863261944, | |
| "grad_norm": 0.4136416394016962, | |
| "learning_rate": 2.4207507631549013e-05, | |
| "loss": 0.1664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17453311383724213, | |
| "step": 2080, | |
| "valid_targets_mean": 6280.9, | |
| "valid_targets_min": 469 | |
| }, | |
| { | |
| "epoch": 3.4349258649093906, | |
| "grad_norm": 0.5492874775817022, | |
| "learning_rate": 2.412715630458912e-05, | |
| "loss": 0.1691, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18640339374542236, | |
| "step": 2085, | |
| "valid_targets_mean": 5082.9, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 3.443163097199341, | |
| "grad_norm": 0.4846899712538489, | |
| "learning_rate": 2.4046735338345895e-05, | |
| "loss": 0.1688, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16321972012519836, | |
| "step": 2090, | |
| "valid_targets_mean": 4117.7, | |
| "valid_targets_min": 381 | |
| }, | |
| { | |
| "epoch": 3.4514003294892914, | |
| "grad_norm": 0.43078882336824376, | |
| "learning_rate": 2.396624608979689e-05, | |
| "loss": 0.1637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1680295765399933, | |
| "step": 2095, | |
| "valid_targets_mean": 5800.4, | |
| "valid_targets_min": 486 | |
| }, | |
| { | |
| "epoch": 3.459637561779242, | |
| "grad_norm": 0.48957765167848494, | |
| "learning_rate": 2.3885689917071822e-05, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1648954600095749, | |
| "step": 2100, | |
| "valid_targets_mean": 4547.6, | |
| "valid_targets_min": 433 | |
| }, | |
| { | |
| "epoch": 3.4678747940691927, | |
| "grad_norm": 0.4024882547668721, | |
| "learning_rate": 2.3805068179429628e-05, | |
| "loss": 0.1605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15916281938552856, | |
| "step": 2105, | |
| "valid_targets_mean": 5993.3, | |
| "valid_targets_min": 362 | |
| }, | |
| { | |
| "epoch": 3.4761120263591434, | |
| "grad_norm": 0.45971252491424164, | |
| "learning_rate": 2.372438223723555e-05, | |
| "loss": 0.1666, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14879193902015686, | |
| "step": 2110, | |
| "valid_targets_mean": 5487.4, | |
| "valid_targets_min": 826 | |
| }, | |
| { | |
| "epoch": 3.484349258649094, | |
| "grad_norm": 0.5282714874848389, | |
| "learning_rate": 2.3643633451938216e-05, | |
| "loss": 0.1773, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18927377462387085, | |
| "step": 2115, | |
| "valid_targets_mean": 4617.0, | |
| "valid_targets_min": 404 | |
| }, | |
| { | |
| "epoch": 3.4925864909390443, | |
| "grad_norm": 0.4385501002050119, | |
| "learning_rate": 2.35628231860466e-05, | |
| "loss": 0.1648, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15995869040489197, | |
| "step": 2120, | |
| "valid_targets_mean": 7026.8, | |
| "valid_targets_min": 3996 | |
| }, | |
| { | |
| "epoch": 3.500823723228995, | |
| "grad_norm": 0.4226288385578095, | |
| "learning_rate": 2.3481952803107067e-05, | |
| "loss": 0.1601, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16267460584640503, | |
| "step": 2125, | |
| "valid_targets_mean": 5791.6, | |
| "valid_targets_min": 526 | |
| }, | |
| { | |
| "epoch": 3.5090609555189456, | |
| "grad_norm": 0.3999368378824118, | |
| "learning_rate": 2.3401023667680373e-05, | |
| "loss": 0.1753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1635211855173111, | |
| "step": 2130, | |
| "valid_targets_mean": 7097.8, | |
| "valid_targets_min": 2006 | |
| }, | |
| { | |
| "epoch": 3.5172981878088962, | |
| "grad_norm": 0.4298215892191779, | |
| "learning_rate": 2.332003714531862e-05, | |
| "loss": 0.1636, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16600899398326874, | |
| "step": 2135, | |
| "valid_targets_mean": 6014.1, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 3.525535420098847, | |
| "grad_norm": 0.39900638433270885, | |
| "learning_rate": 2.3238994602542246e-05, | |
| "loss": 0.1597, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15637260675430298, | |
| "step": 2140, | |
| "valid_targets_mean": 6231.6, | |
| "valid_targets_min": 467 | |
| }, | |
| { | |
| "epoch": 3.533772652388797, | |
| "grad_norm": 0.45498071505271753, | |
| "learning_rate": 2.3157897406816917e-05, | |
| "loss": 0.1656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1908491551876068, | |
| "step": 2145, | |
| "valid_targets_mean": 5242.5, | |
| "valid_targets_min": 343 | |
| }, | |
| { | |
| "epoch": 3.5420098846787478, | |
| "grad_norm": 0.46961507934645447, | |
| "learning_rate": 2.3076746926530497e-05, | |
| "loss": 0.1764, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17604225873947144, | |
| "step": 2150, | |
| "valid_targets_mean": 5637.7, | |
| "valid_targets_min": 951 | |
| }, | |
| { | |
| "epoch": 3.5502471169686984, | |
| "grad_norm": 0.4373049137664838, | |
| "learning_rate": 2.2995544530969946e-05, | |
| "loss": 0.1715, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16728343069553375, | |
| "step": 2155, | |
| "valid_targets_mean": 5702.9, | |
| "valid_targets_min": 407 | |
| }, | |
| { | |
| "epoch": 3.558484349258649, | |
| "grad_norm": 0.4384327378195429, | |
| "learning_rate": 2.2914291590298197e-05, | |
| "loss": 0.1634, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16371147334575653, | |
| "step": 2160, | |
| "valid_targets_mean": 5260.1, | |
| "valid_targets_min": 300 | |
| }, | |
| { | |
| "epoch": 3.5667215815485998, | |
| "grad_norm": 0.44590623551665487, | |
| "learning_rate": 2.2832989475531073e-05, | |
| "loss": 0.1705, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17798402905464172, | |
| "step": 2165, | |
| "valid_targets_mean": 5838.9, | |
| "valid_targets_min": 1807 | |
| }, | |
| { | |
| "epoch": 3.5749588138385504, | |
| "grad_norm": 0.4441625485059226, | |
| "learning_rate": 2.2751639558514116e-05, | |
| "loss": 0.1728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17515239119529724, | |
| "step": 2170, | |
| "valid_targets_mean": 5587.0, | |
| "valid_targets_min": 2698 | |
| }, | |
| { | |
| "epoch": 3.583196046128501, | |
| "grad_norm": 0.477649896519292, | |
| "learning_rate": 2.267024321189947e-05, | |
| "loss": 0.1725, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18034707009792328, | |
| "step": 2175, | |
| "valid_targets_mean": 4963.6, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 3.5914332784184513, | |
| "grad_norm": 0.4663224086535974, | |
| "learning_rate": 2.2588801809122692e-05, | |
| "loss": 0.1719, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18077439069747925, | |
| "step": 2180, | |
| "valid_targets_mean": 5901.2, | |
| "valid_targets_min": 396 | |
| }, | |
| { | |
| "epoch": 3.599670510708402, | |
| "grad_norm": 0.4046716315074143, | |
| "learning_rate": 2.2507316724379603e-05, | |
| "loss": 0.1643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16994984447956085, | |
| "step": 2185, | |
| "valid_targets_mean": 6699.7, | |
| "valid_targets_min": 3961 | |
| }, | |
| { | |
| "epoch": 3.6079077429983526, | |
| "grad_norm": 0.46538609652857194, | |
| "learning_rate": 2.2425789332603073e-05, | |
| "loss": 0.1687, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15804654359817505, | |
| "step": 2190, | |
| "valid_targets_mean": 5469.8, | |
| "valid_targets_min": 852 | |
| }, | |
| { | |
| "epoch": 3.6161449752883033, | |
| "grad_norm": 0.39597613600689074, | |
| "learning_rate": 2.2344221009439863e-05, | |
| "loss": 0.1664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1535954773426056, | |
| "step": 2195, | |
| "valid_targets_mean": 6177.7, | |
| "valid_targets_min": 290 | |
| }, | |
| { | |
| "epoch": 3.624382207578254, | |
| "grad_norm": 0.432923735193132, | |
| "learning_rate": 2.2262613131227357e-05, | |
| "loss": 0.1625, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15266185998916626, | |
| "step": 2200, | |
| "valid_targets_mean": 5965.2, | |
| "valid_targets_min": 290 | |
| }, | |
| { | |
| "epoch": 3.632619439868204, | |
| "grad_norm": 0.40127577438372597, | |
| "learning_rate": 2.2180967074970384e-05, | |
| "loss": 0.1695, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16095589101314545, | |
| "step": 2205, | |
| "valid_targets_mean": 6422.1, | |
| "valid_targets_min": 3765 | |
| }, | |
| { | |
| "epoch": 3.640856672158155, | |
| "grad_norm": 0.4650779716931073, | |
| "learning_rate": 2.2099284218317965e-05, | |
| "loss": 0.1726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18171751499176025, | |
| "step": 2210, | |
| "valid_targets_mean": 5056.1, | |
| "valid_targets_min": 904 | |
| }, | |
| { | |
| "epoch": 3.6490939044481054, | |
| "grad_norm": 0.43130673319289825, | |
| "learning_rate": 2.201756593954007e-05, | |
| "loss": 0.1583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15492397546768188, | |
| "step": 2215, | |
| "valid_targets_mean": 5113.2, | |
| "valid_targets_min": 393 | |
| }, | |
| { | |
| "epoch": 3.657331136738056, | |
| "grad_norm": 0.5295744702809257, | |
| "learning_rate": 2.1935813617504372e-05, | |
| "loss": 0.1756, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16201339662075043, | |
| "step": 2220, | |
| "valid_targets_mean": 4780.9, | |
| "valid_targets_min": 277 | |
| }, | |
| { | |
| "epoch": 3.6655683690280068, | |
| "grad_norm": 0.5946369400747427, | |
| "learning_rate": 2.1854028631652958e-05, | |
| "loss": 0.1702, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16037127375602722, | |
| "step": 2225, | |
| "valid_targets_mean": 3275.8, | |
| "valid_targets_min": 279 | |
| }, | |
| { | |
| "epoch": 3.673805601317957, | |
| "grad_norm": 0.44952551623528814, | |
| "learning_rate": 2.1772212361979053e-05, | |
| "loss": 0.1648, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16338971257209778, | |
| "step": 2230, | |
| "valid_targets_mean": 5095.8, | |
| "valid_targets_min": 346 | |
| }, | |
| { | |
| "epoch": 3.6820428336079076, | |
| "grad_norm": 0.43791872886873806, | |
| "learning_rate": 2.1690366189003776e-05, | |
| "loss": 0.1586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15391787886619568, | |
| "step": 2235, | |
| "valid_targets_mean": 5727.9, | |
| "valid_targets_min": 1342 | |
| }, | |
| { | |
| "epoch": 3.6902800658978583, | |
| "grad_norm": 0.3970801737083101, | |
| "learning_rate": 2.16084914937528e-05, | |
| "loss": 0.1678, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16625526547431946, | |
| "step": 2240, | |
| "valid_targets_mean": 6496.0, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 3.698517298187809, | |
| "grad_norm": 0.484519359275333, | |
| "learning_rate": 2.1526589657733076e-05, | |
| "loss": 0.1663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14906881749629974, | |
| "step": 2245, | |
| "valid_targets_mean": 5056.9, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 3.7067545304777596, | |
| "grad_norm": 0.543054449266146, | |
| "learning_rate": 2.1444662062909497e-05, | |
| "loss": 0.1765, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18880906701087952, | |
| "step": 2250, | |
| "valid_targets_mean": 5087.6, | |
| "valid_targets_min": 386 | |
| }, | |
| { | |
| "epoch": 3.71499176276771, | |
| "grad_norm": 0.412373847155115, | |
| "learning_rate": 2.136271009168162e-05, | |
| "loss": 0.1843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16423764824867249, | |
| "step": 2255, | |
| "valid_targets_mean": 5792.0, | |
| "valid_targets_min": 1805 | |
| }, | |
| { | |
| "epoch": 3.7232289950576605, | |
| "grad_norm": 0.4126065374679235, | |
| "learning_rate": 2.128073512686029e-05, | |
| "loss": 0.1561, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15496188402175903, | |
| "step": 2260, | |
| "valid_targets_mean": 6312.9, | |
| "valid_targets_min": 331 | |
| }, | |
| { | |
| "epoch": 3.731466227347611, | |
| "grad_norm": 0.4737629695491757, | |
| "learning_rate": 2.1198738551644344e-05, | |
| "loss": 0.1623, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18088474869728088, | |
| "step": 2265, | |
| "valid_targets_mean": 4541.9, | |
| "valid_targets_min": 331 | |
| }, | |
| { | |
| "epoch": 3.739703459637562, | |
| "grad_norm": 0.43047213132792306, | |
| "learning_rate": 2.1116721749597265e-05, | |
| "loss": 0.1655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1655322015285492, | |
| "step": 2270, | |
| "valid_targets_mean": 5827.6, | |
| "valid_targets_min": 316 | |
| }, | |
| { | |
| "epoch": 3.7479406919275124, | |
| "grad_norm": 0.3918342153717676, | |
| "learning_rate": 2.1034686104623828e-05, | |
| "loss": 0.1624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16483767330646515, | |
| "step": 2275, | |
| "valid_targets_mean": 6364.0, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 3.7561779242174627, | |
| "grad_norm": 0.4478424801826196, | |
| "learning_rate": 2.0952633000946753e-05, | |
| "loss": 0.1623, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16138796508312225, | |
| "step": 2280, | |
| "valid_targets_mean": 5547.7, | |
| "valid_targets_min": 374 | |
| }, | |
| { | |
| "epoch": 3.7644151565074138, | |
| "grad_norm": 0.4736058075123787, | |
| "learning_rate": 2.0870563823083347e-05, | |
| "loss": 0.1682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1633884608745575, | |
| "step": 2285, | |
| "valid_targets_mean": 4713.3, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 3.772652388797364, | |
| "grad_norm": 0.4076171431317024, | |
| "learning_rate": 2.0788479955822137e-05, | |
| "loss": 0.167, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15652526915073395, | |
| "step": 2290, | |
| "valid_targets_mean": 6205.0, | |
| "valid_targets_min": 393 | |
| }, | |
| { | |
| "epoch": 3.7808896210873146, | |
| "grad_norm": 0.5176507339253252, | |
| "learning_rate": 2.0706382784199524e-05, | |
| "loss": 0.16, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16365516185760498, | |
| "step": 2295, | |
| "valid_targets_mean": 6140.5, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 3.7891268533772653, | |
| "grad_norm": 0.45103078449937783, | |
| "learning_rate": 2.06242736934764e-05, | |
| "loss": 0.1706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17881439626216888, | |
| "step": 2300, | |
| "valid_targets_mean": 4976.1, | |
| "valid_targets_min": 1110 | |
| }, | |
| { | |
| "epoch": 3.797364085667216, | |
| "grad_norm": 0.4068376487085923, | |
| "learning_rate": 2.0542154069114753e-05, | |
| "loss": 0.1612, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16194987297058105, | |
| "step": 2305, | |
| "valid_targets_mean": 6501.2, | |
| "valid_targets_min": 434 | |
| }, | |
| { | |
| "epoch": 3.8056013179571666, | |
| "grad_norm": 0.5485960350199107, | |
| "learning_rate": 2.0460025296754326e-05, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1692267209291458, | |
| "step": 2310, | |
| "valid_targets_mean": 4711.4, | |
| "valid_targets_min": 405 | |
| }, | |
| { | |
| "epoch": 3.813838550247117, | |
| "grad_norm": 0.5349054412710811, | |
| "learning_rate": 2.037788876218923e-05, | |
| "loss": 0.1671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1499655842781067, | |
| "step": 2315, | |
| "valid_targets_mean": 5649.0, | |
| "valid_targets_min": 390 | |
| }, | |
| { | |
| "epoch": 3.8220757825370675, | |
| "grad_norm": 0.40935562611532483, | |
| "learning_rate": 2.029574585134452e-05, | |
| "loss": 0.1613, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1631348431110382, | |
| "step": 2320, | |
| "valid_targets_mean": 6357.0, | |
| "valid_targets_min": 482 | |
| }, | |
| { | |
| "epoch": 3.830313014827018, | |
| "grad_norm": 0.4645878611326604, | |
| "learning_rate": 2.0213597950252864e-05, | |
| "loss": 0.1697, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17020228505134583, | |
| "step": 2325, | |
| "valid_targets_mean": 4671.6, | |
| "valid_targets_min": 484 | |
| }, | |
| { | |
| "epoch": 3.838550247116969, | |
| "grad_norm": 0.4871420384446106, | |
| "learning_rate": 2.0131446445031132e-05, | |
| "loss": 0.1654, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16444051265716553, | |
| "step": 2330, | |
| "valid_targets_mean": 4968.4, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 3.8467874794069195, | |
| "grad_norm": 0.4813413395155231, | |
| "learning_rate": 2.0049292721857005e-05, | |
| "loss": 0.1658, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17408815026283264, | |
| "step": 2335, | |
| "valid_targets_mean": 5293.2, | |
| "valid_targets_min": 768 | |
| }, | |
| { | |
| "epoch": 3.8550247116968697, | |
| "grad_norm": 0.4742090254671904, | |
| "learning_rate": 1.9967138166945584e-05, | |
| "loss": 0.165, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17291584610939026, | |
| "step": 2340, | |
| "valid_targets_mean": 6698.8, | |
| "valid_targets_min": 4143 | |
| }, | |
| { | |
| "epoch": 3.8632619439868203, | |
| "grad_norm": 0.4163498895697308, | |
| "learning_rate": 1.9884984166525996e-05, | |
| "loss": 0.1741, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1733531653881073, | |
| "step": 2345, | |
| "valid_targets_mean": 6222.6, | |
| "valid_targets_min": 858 | |
| }, | |
| { | |
| "epoch": 3.871499176276771, | |
| "grad_norm": 0.46658867545999017, | |
| "learning_rate": 1.9802832106818038e-05, | |
| "loss": 0.163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16182400286197662, | |
| "step": 2350, | |
| "valid_targets_mean": 5577.4, | |
| "valid_targets_min": 405 | |
| }, | |
| { | |
| "epoch": 3.8797364085667216, | |
| "grad_norm": 0.6884250760473662, | |
| "learning_rate": 1.9720683374008742e-05, | |
| "loss": 0.1738, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1815515160560608, | |
| "step": 2355, | |
| "valid_targets_mean": 4938.5, | |
| "valid_targets_min": 474 | |
| }, | |
| { | |
| "epoch": 3.8879736408566723, | |
| "grad_norm": 0.44499718070883826, | |
| "learning_rate": 1.9638539354228994e-05, | |
| "loss": 0.1618, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1717439889907837, | |
| "step": 2360, | |
| "valid_targets_mean": 5651.2, | |
| "valid_targets_min": 423 | |
| }, | |
| { | |
| "epoch": 3.8962108731466225, | |
| "grad_norm": 0.40457261570684494, | |
| "learning_rate": 1.9556401433530186e-05, | |
| "loss": 0.1622, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15134787559509277, | |
| "step": 2365, | |
| "valid_targets_mean": 5634.9, | |
| "valid_targets_min": 351 | |
| }, | |
| { | |
| "epoch": 3.904448105436573, | |
| "grad_norm": 0.6553583489153849, | |
| "learning_rate": 1.947427099786077e-05, | |
| "loss": 0.1706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16197733581066132, | |
| "step": 2370, | |
| "valid_targets_mean": 4561.9, | |
| "valid_targets_min": 301 | |
| }, | |
| { | |
| "epoch": 3.912685337726524, | |
| "grad_norm": 0.4157227379076624, | |
| "learning_rate": 1.9392149433042915e-05, | |
| "loss": 0.1626, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16081485152244568, | |
| "step": 2375, | |
| "valid_targets_mean": 5540.6, | |
| "valid_targets_min": 329 | |
| }, | |
| { | |
| "epoch": 3.9209225700164745, | |
| "grad_norm": 0.4359729610143708, | |
| "learning_rate": 1.9310038124749102e-05, | |
| "loss": 0.1651, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15543285012245178, | |
| "step": 2380, | |
| "valid_targets_mean": 5001.1, | |
| "valid_targets_min": 418 | |
| }, | |
| { | |
| "epoch": 3.929159802306425, | |
| "grad_norm": 0.7705180986032697, | |
| "learning_rate": 1.9227938458478744e-05, | |
| "loss": 0.167, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1696222722530365, | |
| "step": 2385, | |
| "valid_targets_mean": 4766.1, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 3.9373970345963754, | |
| "grad_norm": 0.4047261335899435, | |
| "learning_rate": 1.9145851819534824e-05, | |
| "loss": 0.1757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15419232845306396, | |
| "step": 2390, | |
| "valid_targets_mean": 6061.1, | |
| "valid_targets_min": 447 | |
| }, | |
| { | |
| "epoch": 3.9456342668863265, | |
| "grad_norm": 0.41724450128225465, | |
| "learning_rate": 1.9063779593000507e-05, | |
| "loss": 0.1603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1801353394985199, | |
| "step": 2395, | |
| "valid_targets_mean": 6642.6, | |
| "valid_targets_min": 396 | |
| }, | |
| { | |
| "epoch": 3.9538714991762767, | |
| "grad_norm": 0.4423034009298218, | |
| "learning_rate": 1.8981723163715756e-05, | |
| "loss": 0.1638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17328378558158875, | |
| "step": 2400, | |
| "valid_targets_mean": 5702.5, | |
| "valid_targets_min": 432 | |
| }, | |
| { | |
| "epoch": 3.9621087314662273, | |
| "grad_norm": 0.444583031013224, | |
| "learning_rate": 1.8899683916254013e-05, | |
| "loss": 0.1671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16314169764518738, | |
| "step": 2405, | |
| "valid_targets_mean": 5586.6, | |
| "valid_targets_min": 465 | |
| }, | |
| { | |
| "epoch": 3.970345963756178, | |
| "grad_norm": 0.5567518529012295, | |
| "learning_rate": 1.881766323489877e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15508978068828583, | |
| "step": 2410, | |
| "valid_targets_mean": 5686.6, | |
| "valid_targets_min": 842 | |
| }, | |
| { | |
| "epoch": 3.9785831960461286, | |
| "grad_norm": 0.4528089798600689, | |
| "learning_rate": 1.8735662503620268e-05, | |
| "loss": 0.1616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1737988293170929, | |
| "step": 2415, | |
| "valid_targets_mean": 5447.9, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 3.9868204283360793, | |
| "grad_norm": 0.4437374282788194, | |
| "learning_rate": 1.8653683106052097e-05, | |
| "loss": 0.1737, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16056564450263977, | |
| "step": 2420, | |
| "valid_targets_mean": 5152.9, | |
| "valid_targets_min": 1012 | |
| }, | |
| { | |
| "epoch": 3.9950576606260295, | |
| "grad_norm": 0.4629669597591696, | |
| "learning_rate": 1.857172642546791e-05, | |
| "loss": 0.1631, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16307508945465088, | |
| "step": 2425, | |
| "valid_targets_mean": 4988.5, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 4.003294892915981, | |
| "grad_norm": 0.362102397580454, | |
| "learning_rate": 1.8489793844758015e-05, | |
| "loss": 0.149, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12046860158443451, | |
| "step": 2430, | |
| "valid_targets_mean": 7373.6, | |
| "valid_targets_min": 2163 | |
| }, | |
| { | |
| "epoch": 4.011532125205931, | |
| "grad_norm": 0.4588004757668333, | |
| "learning_rate": 1.8407886746406082e-05, | |
| "loss": 0.1504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12914225459098816, | |
| "step": 2435, | |
| "valid_targets_mean": 4912.9, | |
| "valid_targets_min": 243 | |
| }, | |
| { | |
| "epoch": 4.019769357495881, | |
| "grad_norm": 0.46936874186243444, | |
| "learning_rate": 1.8326006512465806e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15502551198005676, | |
| "step": 2440, | |
| "valid_targets_mean": 5579.7, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 4.028006589785832, | |
| "grad_norm": 0.42358155648588003, | |
| "learning_rate": 1.8244154524537602e-05, | |
| "loss": 0.1506, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1456337869167328, | |
| "step": 2445, | |
| "valid_targets_mean": 6027.8, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 4.036243822075782, | |
| "grad_norm": 0.411694955371839, | |
| "learning_rate": 1.8162332163745253e-05, | |
| "loss": 0.1535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15597039461135864, | |
| "step": 2450, | |
| "valid_targets_mean": 6774.1, | |
| "valid_targets_min": 2283 | |
| }, | |
| { | |
| "epoch": 4.0444810543657335, | |
| "grad_norm": 0.47459473453439693, | |
| "learning_rate": 1.8080540810712652e-05, | |
| "loss": 0.1474, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17040134966373444, | |
| "step": 2455, | |
| "valid_targets_mean": 5131.1, | |
| "valid_targets_min": 346 | |
| }, | |
| { | |
| "epoch": 4.052718286655684, | |
| "grad_norm": 0.41826585402865163, | |
| "learning_rate": 1.799878184554047e-05, | |
| "loss": 0.1506, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1321791410446167, | |
| "step": 2460, | |
| "valid_targets_mean": 6608.9, | |
| "valid_targets_min": 3516 | |
| }, | |
| { | |
| "epoch": 4.060955518945634, | |
| "grad_norm": 0.4155868317445694, | |
| "learning_rate": 1.7917056647782888e-05, | |
| "loss": 0.1443, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13027681410312653, | |
| "step": 2465, | |
| "valid_targets_mean": 5578.6, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 4.069192751235585, | |
| "grad_norm": 0.46080020104599295, | |
| "learning_rate": 1.7835366596424317e-05, | |
| "loss": 0.1553, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14736926555633545, | |
| "step": 2470, | |
| "valid_targets_mean": 5789.2, | |
| "valid_targets_min": 2207 | |
| }, | |
| { | |
| "epoch": 4.077429983525535, | |
| "grad_norm": 0.7024450667359199, | |
| "learning_rate": 1.775371306985613e-05, | |
| "loss": 0.1473, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.141681507229805, | |
| "step": 2475, | |
| "valid_targets_mean": 5083.5, | |
| "valid_targets_min": 338 | |
| }, | |
| { | |
| "epoch": 4.085667215815486, | |
| "grad_norm": 0.4645130391173883, | |
| "learning_rate": 1.7672097445853388e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15968500077724457, | |
| "step": 2480, | |
| "valid_targets_mean": 5763.9, | |
| "valid_targets_min": 426 | |
| }, | |
| { | |
| "epoch": 4.0939044481054365, | |
| "grad_norm": 0.4784694238147179, | |
| "learning_rate": 1.7590521101551635e-05, | |
| "loss": 0.1577, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16522473096847534, | |
| "step": 2485, | |
| "valid_targets_mean": 5397.2, | |
| "valid_targets_min": 379 | |
| }, | |
| { | |
| "epoch": 4.102141680395387, | |
| "grad_norm": 0.46501246791017226, | |
| "learning_rate": 1.7508985413423595e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13691683113574982, | |
| "step": 2490, | |
| "valid_targets_mean": 4509.5, | |
| "valid_targets_min": 306 | |
| }, | |
| { | |
| "epoch": 4.110378912685338, | |
| "grad_norm": 0.5700981481383305, | |
| "learning_rate": 1.742749175725601e-05, | |
| "loss": 0.148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1452605426311493, | |
| "step": 2495, | |
| "valid_targets_mean": 3580.9, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 4.118616144975288, | |
| "grad_norm": 0.5705186798816394, | |
| "learning_rate": 1.7346041508126386e-05, | |
| "loss": 0.1472, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17359477281570435, | |
| "step": 2500, | |
| "valid_targets_mean": 4987.8, | |
| "valid_targets_min": 295 | |
| }, | |
| { | |
| "epoch": 4.126853377265239, | |
| "grad_norm": 0.4400809895847404, | |
| "learning_rate": 1.7264636040379808e-05, | |
| "loss": 0.1484, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13463068008422852, | |
| "step": 2505, | |
| "valid_targets_mean": 5214.5, | |
| "valid_targets_min": 722 | |
| }, | |
| { | |
| "epoch": 4.135090609555189, | |
| "grad_norm": 0.4877814194690551, | |
| "learning_rate": 1.7183276727605743e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15212027728557587, | |
| "step": 2510, | |
| "valid_targets_mean": 4686.6, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 4.1433278418451405, | |
| "grad_norm": 0.5285967708806336, | |
| "learning_rate": 1.7101964942614864e-05, | |
| "loss": 0.1576, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14906981587409973, | |
| "step": 2515, | |
| "valid_targets_mean": 4191.4, | |
| "valid_targets_min": 362 | |
| }, | |
| { | |
| "epoch": 4.151565074135091, | |
| "grad_norm": 0.5783330832204944, | |
| "learning_rate": 1.7020702057415884e-05, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14080877602100372, | |
| "step": 2520, | |
| "valid_targets_mean": 5295.0, | |
| "valid_targets_min": 347 | |
| }, | |
| { | |
| "epoch": 4.159802306425041, | |
| "grad_norm": 0.6973570205068965, | |
| "learning_rate": 1.6939489443192435e-05, | |
| "loss": 0.1589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1889844387769699, | |
| "step": 2525, | |
| "valid_targets_mean": 3328.6, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 4.168039538714992, | |
| "grad_norm": 0.44997217802378553, | |
| "learning_rate": 1.6858328470279862e-05, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15209095180034637, | |
| "step": 2530, | |
| "valid_targets_mean": 5899.5, | |
| "valid_targets_min": 3436 | |
| }, | |
| { | |
| "epoch": 4.176276771004942, | |
| "grad_norm": 0.4625339471080827, | |
| "learning_rate": 1.6777220508142184e-05, | |
| "loss": 0.1511, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14391621947288513, | |
| "step": 2535, | |
| "valid_targets_mean": 5531.8, | |
| "valid_targets_min": 656 | |
| }, | |
| { | |
| "epoch": 4.184514003294893, | |
| "grad_norm": 0.4957253313719207, | |
| "learning_rate": 1.6696166925348924e-05, | |
| "loss": 0.148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13907882571220398, | |
| "step": 2540, | |
| "valid_targets_mean": 4948.8, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 4.1927512355848435, | |
| "grad_norm": 0.7565853150262708, | |
| "learning_rate": 1.6615169089552038e-05, | |
| "loss": 0.1506, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1515434831380844, | |
| "step": 2545, | |
| "valid_targets_mean": 5125.2, | |
| "valid_targets_min": 347 | |
| }, | |
| { | |
| "epoch": 4.200988467874794, | |
| "grad_norm": 0.49753975963313524, | |
| "learning_rate": 1.6534228367462843e-05, | |
| "loss": 0.1476, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15490424633026123, | |
| "step": 2550, | |
| "valid_targets_mean": 5573.6, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 4.209225700164745, | |
| "grad_norm": 0.5345554614335993, | |
| "learning_rate": 1.6453346124828966e-05, | |
| "loss": 0.1529, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18045078217983246, | |
| "step": 2555, | |
| "valid_targets_mean": 4684.8, | |
| "valid_targets_min": 268 | |
| }, | |
| { | |
| "epoch": 4.217462932454695, | |
| "grad_norm": 0.4613752387425423, | |
| "learning_rate": 1.6372523726411253e-05, | |
| "loss": 0.1491, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14130446314811707, | |
| "step": 2560, | |
| "valid_targets_mean": 5415.4, | |
| "valid_targets_min": 343 | |
| }, | |
| { | |
| "epoch": 4.225700164744646, | |
| "grad_norm": 0.4525908163882441, | |
| "learning_rate": 1.6291762535960817e-05, | |
| "loss": 0.1524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16135191917419434, | |
| "step": 2565, | |
| "valid_targets_mean": 5391.4, | |
| "valid_targets_min": 399 | |
| }, | |
| { | |
| "epoch": 4.233937397034596, | |
| "grad_norm": 0.4596645755169701, | |
| "learning_rate": 1.6211063916195945e-05, | |
| "loss": 0.1437, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15719518065452576, | |
| "step": 2570, | |
| "valid_targets_mean": 5872.5, | |
| "valid_targets_min": 403 | |
| }, | |
| { | |
| "epoch": 4.242174629324547, | |
| "grad_norm": 0.5069424876548778, | |
| "learning_rate": 1.6130429228779154e-05, | |
| "loss": 0.1503, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16834935545921326, | |
| "step": 2575, | |
| "valid_targets_mean": 5199.9, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 4.250411861614498, | |
| "grad_norm": 0.48516012282655313, | |
| "learning_rate": 1.604985983429421e-05, | |
| "loss": 0.1495, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1605086624622345, | |
| "step": 2580, | |
| "valid_targets_mean": 5155.4, | |
| "valid_targets_min": 407 | |
| }, | |
| { | |
| "epoch": 4.258649093904448, | |
| "grad_norm": 0.4468201225192397, | |
| "learning_rate": 1.596935709222315e-05, | |
| "loss": 0.1498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1490262746810913, | |
| "step": 2585, | |
| "valid_targets_mean": 6065.2, | |
| "valid_targets_min": 1807 | |
| }, | |
| { | |
| "epoch": 4.266886326194399, | |
| "grad_norm": 0.41002444382532305, | |
| "learning_rate": 1.5888922360923375e-05, | |
| "loss": 0.1488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13134261965751648, | |
| "step": 2590, | |
| "valid_targets_mean": 6304.5, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 4.275123558484349, | |
| "grad_norm": 0.5102813058004463, | |
| "learning_rate": 1.5808556997604684e-05, | |
| "loss": 0.1477, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14674484729766846, | |
| "step": 2595, | |
| "valid_targets_mean": 4860.8, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 4.283360790774299, | |
| "grad_norm": 0.4368439014835675, | |
| "learning_rate": 1.5728262358306418e-05, | |
| "loss": 0.1464, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1445227563381195, | |
| "step": 2600, | |
| "valid_targets_mean": 5476.2, | |
| "valid_targets_min": 1056 | |
| }, | |
| { | |
| "epoch": 4.2915980230642505, | |
| "grad_norm": 0.5676665382040276, | |
| "learning_rate": 1.564803979787457e-05, | |
| "loss": 0.1531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15031370520591736, | |
| "step": 2605, | |
| "valid_targets_mean": 3444.8, | |
| "valid_targets_min": 253 | |
| }, | |
| { | |
| "epoch": 4.299835255354201, | |
| "grad_norm": 0.5109094701134649, | |
| "learning_rate": 1.5567890669938903e-05, | |
| "loss": 0.1509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15295559167861938, | |
| "step": 2610, | |
| "valid_targets_mean": 5027.6, | |
| "valid_targets_min": 719 | |
| }, | |
| { | |
| "epoch": 4.308072487644152, | |
| "grad_norm": 0.45013114123730513, | |
| "learning_rate": 1.5487816326890118e-05, | |
| "loss": 0.1629, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14927126467227936, | |
| "step": 2615, | |
| "valid_targets_mean": 6415.9, | |
| "valid_targets_min": 1665 | |
| }, | |
| { | |
| "epoch": 4.316309719934102, | |
| "grad_norm": 0.43711700962396716, | |
| "learning_rate": 1.540781811985704e-05, | |
| "loss": 0.1452, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14677034318447113, | |
| "step": 2620, | |
| "valid_targets_mean": 5701.2, | |
| "valid_targets_min": 403 | |
| }, | |
| { | |
| "epoch": 4.324546952224052, | |
| "grad_norm": 0.4567319205617331, | |
| "learning_rate": 1.5327897398683827e-05, | |
| "loss": 0.1482, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1432798206806183, | |
| "step": 2625, | |
| "valid_targets_mean": 6331.5, | |
| "valid_targets_min": 1060 | |
| }, | |
| { | |
| "epoch": 4.332784184514003, | |
| "grad_norm": 0.6150417219974588, | |
| "learning_rate": 1.5248055511907183e-05, | |
| "loss": 0.1532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1534081995487213, | |
| "step": 2630, | |
| "valid_targets_mean": 5722.9, | |
| "valid_targets_min": 895 | |
| }, | |
| { | |
| "epoch": 4.341021416803954, | |
| "grad_norm": 0.5439641469775286, | |
| "learning_rate": 1.5168293806733594e-05, | |
| "loss": 0.1612, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15055815875530243, | |
| "step": 2635, | |
| "valid_targets_mean": 3872.1, | |
| "valid_targets_min": 301 | |
| }, | |
| { | |
| "epoch": 4.349258649093905, | |
| "grad_norm": 0.49432098862551554, | |
| "learning_rate": 1.5088613629016613e-05, | |
| "loss": 0.158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16186079382896423, | |
| "step": 2640, | |
| "valid_targets_mean": 5156.1, | |
| "valid_targets_min": 295 | |
| }, | |
| { | |
| "epoch": 4.357495881383855, | |
| "grad_norm": 0.4377662832647051, | |
| "learning_rate": 1.5009016323234158e-05, | |
| "loss": 0.159, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1668049693107605, | |
| "step": 2645, | |
| "valid_targets_mean": 6354.6, | |
| "valid_targets_min": 383 | |
| }, | |
| { | |
| "epoch": 4.365733113673806, | |
| "grad_norm": 0.4969241114877333, | |
| "learning_rate": 1.4929503232465804e-05, | |
| "loss": 0.1484, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15549027919769287, | |
| "step": 2650, | |
| "valid_targets_mean": 4336.9, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 4.373970345963756, | |
| "grad_norm": 0.39112878825838426, | |
| "learning_rate": 1.485007569837011e-05, | |
| "loss": 0.1444, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13279537856578827, | |
| "step": 2655, | |
| "valid_targets_mean": 6286.1, | |
| "valid_targets_min": 3182 | |
| }, | |
| { | |
| "epoch": 4.382207578253706, | |
| "grad_norm": 0.4899567925864192, | |
| "learning_rate": 1.477073506116204e-05, | |
| "loss": 0.1515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17375296354293823, | |
| "step": 2660, | |
| "valid_targets_mean": 5219.9, | |
| "valid_targets_min": 211 | |
| }, | |
| { | |
| "epoch": 4.3904448105436575, | |
| "grad_norm": 0.45711905561871397, | |
| "learning_rate": 1.469148265959028e-05, | |
| "loss": 0.1534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1331019401550293, | |
| "step": 2665, | |
| "valid_targets_mean": 5110.2, | |
| "valid_targets_min": 382 | |
| }, | |
| { | |
| "epoch": 4.398682042833608, | |
| "grad_norm": 0.5069456092396808, | |
| "learning_rate": 1.4612319830914684e-05, | |
| "loss": 0.1516, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14841151237487793, | |
| "step": 2670, | |
| "valid_targets_mean": 4724.4, | |
| "valid_targets_min": 487 | |
| }, | |
| { | |
| "epoch": 4.406919275123559, | |
| "grad_norm": 0.44034448930841963, | |
| "learning_rate": 1.4533247910883704e-05, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1460045576095581, | |
| "step": 2675, | |
| "valid_targets_mean": 5382.2, | |
| "valid_targets_min": 252 | |
| }, | |
| { | |
| "epoch": 4.415156507413509, | |
| "grad_norm": 0.4085357778933215, | |
| "learning_rate": 1.445426823371186e-05, | |
| "loss": 0.1468, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14165529608726501, | |
| "step": 2680, | |
| "valid_targets_mean": 6587.8, | |
| "valid_targets_min": 3807 | |
| }, | |
| { | |
| "epoch": 4.423393739703459, | |
| "grad_norm": 0.4985432420723232, | |
| "learning_rate": 1.4375382132057207e-05, | |
| "loss": 0.1541, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14564910531044006, | |
| "step": 2685, | |
| "valid_targets_mean": 7100.1, | |
| "valid_targets_min": 1402 | |
| }, | |
| { | |
| "epoch": 4.43163097199341, | |
| "grad_norm": 0.4558487525400957, | |
| "learning_rate": 1.4296590936998875e-05, | |
| "loss": 0.1527, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13396936655044556, | |
| "step": 2690, | |
| "valid_targets_mean": 5222.4, | |
| "valid_targets_min": 1873 | |
| }, | |
| { | |
| "epoch": 4.439868204283361, | |
| "grad_norm": 0.5212832580946739, | |
| "learning_rate": 1.4217895978014583e-05, | |
| "loss": 0.1475, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16112340986728668, | |
| "step": 2695, | |
| "valid_targets_mean": 6251.2, | |
| "valid_targets_min": 2514 | |
| }, | |
| { | |
| "epoch": 4.448105436573312, | |
| "grad_norm": 0.511176800268245, | |
| "learning_rate": 1.4139298582958208e-05, | |
| "loss": 0.1586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17118722200393677, | |
| "step": 2700, | |
| "valid_targets_mean": 5283.8, | |
| "valid_targets_min": 475 | |
| }, | |
| { | |
| "epoch": 4.456342668863262, | |
| "grad_norm": 0.4133253589641243, | |
| "learning_rate": 1.4060800078037407e-05, | |
| "loss": 0.1453, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13945919275283813, | |
| "step": 2705, | |
| "valid_targets_mean": 6163.3, | |
| "valid_targets_min": 348 | |
| }, | |
| { | |
| "epoch": 4.464579901153212, | |
| "grad_norm": 0.5355709069173332, | |
| "learning_rate": 1.3982401787791215e-05, | |
| "loss": 0.1589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16837137937545776, | |
| "step": 2710, | |
| "valid_targets_mean": 4398.6, | |
| "valid_targets_min": 307 | |
| }, | |
| { | |
| "epoch": 4.472817133443163, | |
| "grad_norm": 0.47608650460084356, | |
| "learning_rate": 1.3904105035067701e-05, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14236906170845032, | |
| "step": 2715, | |
| "valid_targets_mean": 5176.2, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 4.481054365733113, | |
| "grad_norm": 0.4680478602386323, | |
| "learning_rate": 1.3825911141001637e-05, | |
| "loss": 0.1504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14760847389698029, | |
| "step": 2720, | |
| "valid_targets_mean": 5412.5, | |
| "valid_targets_min": 282 | |
| }, | |
| { | |
| "epoch": 4.4892915980230645, | |
| "grad_norm": 0.619587835664657, | |
| "learning_rate": 1.3747821424992245e-05, | |
| "loss": 0.1506, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15162773430347443, | |
| "step": 2725, | |
| "valid_targets_mean": 5752.3, | |
| "valid_targets_min": 400 | |
| }, | |
| { | |
| "epoch": 4.497528830313015, | |
| "grad_norm": 0.4661528683504612, | |
| "learning_rate": 1.3669837204680879e-05, | |
| "loss": 0.1488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14053663611412048, | |
| "step": 2730, | |
| "valid_targets_mean": 5016.6, | |
| "valid_targets_min": 410 | |
| }, | |
| { | |
| "epoch": 4.505766062602966, | |
| "grad_norm": 0.42047460260378905, | |
| "learning_rate": 1.3591959795928821e-05, | |
| "loss": 0.1482, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15329483151435852, | |
| "step": 2735, | |
| "valid_targets_mean": 6674.5, | |
| "valid_targets_min": 3457 | |
| }, | |
| { | |
| "epoch": 4.514003294892916, | |
| "grad_norm": 0.5636482204398094, | |
| "learning_rate": 1.3514190512795096e-05, | |
| "loss": 0.1437, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14222252368927002, | |
| "step": 2740, | |
| "valid_targets_mean": 5046.1, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 4.522240527182866, | |
| "grad_norm": 0.45583530674550665, | |
| "learning_rate": 1.3436530667514258e-05, | |
| "loss": 0.1517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15300938487052917, | |
| "step": 2745, | |
| "valid_targets_mean": 5698.6, | |
| "valid_targets_min": 323 | |
| }, | |
| { | |
| "epoch": 4.530477759472817, | |
| "grad_norm": 0.4261953081142523, | |
| "learning_rate": 1.3358981570474276e-05, | |
| "loss": 0.1465, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13072721660137177, | |
| "step": 2750, | |
| "valid_targets_mean": 5420.9, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 4.538714991762768, | |
| "grad_norm": 0.5880573918987582, | |
| "learning_rate": 1.3281544530194412e-05, | |
| "loss": 0.1463, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14867545664310455, | |
| "step": 2755, | |
| "valid_targets_mean": 5139.9, | |
| "valid_targets_min": 1453 | |
| }, | |
| { | |
| "epoch": 4.546952224052719, | |
| "grad_norm": 0.4689692885338706, | |
| "learning_rate": 1.3204220853303157e-05, | |
| "loss": 0.1497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14390037953853607, | |
| "step": 2760, | |
| "valid_targets_mean": 4890.1, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 4.555189456342669, | |
| "grad_norm": 0.5368370242051151, | |
| "learning_rate": 1.3127011844516165e-05, | |
| "loss": 0.1539, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14073890447616577, | |
| "step": 2765, | |
| "valid_targets_mean": 5944.6, | |
| "valid_targets_min": 841 | |
| }, | |
| { | |
| "epoch": 4.563426688632619, | |
| "grad_norm": 0.4342038972954832, | |
| "learning_rate": 1.304991880661425e-05, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14130136370658875, | |
| "step": 2770, | |
| "valid_targets_mean": 5363.7, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 4.57166392092257, | |
| "grad_norm": 0.5021035499032942, | |
| "learning_rate": 1.2972943040421398e-05, | |
| "loss": 0.1524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.146042600274086, | |
| "step": 2775, | |
| "valid_targets_mean": 7223.3, | |
| "valid_targets_min": 374 | |
| }, | |
| { | |
| "epoch": 4.5799011532125204, | |
| "grad_norm": 0.4739904329110086, | |
| "learning_rate": 1.2896085844782808e-05, | |
| "loss": 0.1505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14053794741630554, | |
| "step": 2780, | |
| "valid_targets_mean": 6500.1, | |
| "valid_targets_min": 2785 | |
| }, | |
| { | |
| "epoch": 4.5881383855024716, | |
| "grad_norm": 0.43782595703510596, | |
| "learning_rate": 1.2819348516543003e-05, | |
| "loss": 0.1489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1457478106021881, | |
| "step": 2785, | |
| "valid_targets_mean": 5669.9, | |
| "valid_targets_min": 350 | |
| }, | |
| { | |
| "epoch": 4.596375617792422, | |
| "grad_norm": 0.5041051641997347, | |
| "learning_rate": 1.2742732350523928e-05, | |
| "loss": 0.1582, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14434212446212769, | |
| "step": 2790, | |
| "valid_targets_mean": 4667.4, | |
| "valid_targets_min": 348 | |
| }, | |
| { | |
| "epoch": 4.604612850082372, | |
| "grad_norm": 0.5876482351362182, | |
| "learning_rate": 1.2666238639503099e-05, | |
| "loss": 0.1648, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1538345068693161, | |
| "step": 2795, | |
| "valid_targets_mean": 5997.9, | |
| "valid_targets_min": 306 | |
| }, | |
| { | |
| "epoch": 4.612850082372323, | |
| "grad_norm": 0.4801368026632177, | |
| "learning_rate": 1.2589868674191794e-05, | |
| "loss": 0.1535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15362213551998138, | |
| "step": 2800, | |
| "valid_targets_mean": 5321.2, | |
| "valid_targets_min": 435 | |
| }, | |
| { | |
| "epoch": 4.621087314662273, | |
| "grad_norm": 0.48008674858622163, | |
| "learning_rate": 1.2513623743213297e-05, | |
| "loss": 0.1647, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16761736571788788, | |
| "step": 2805, | |
| "valid_targets_mean": 5240.1, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 4.629324546952224, | |
| "grad_norm": 0.39861573361053565, | |
| "learning_rate": 1.243750513308111e-05, | |
| "loss": 0.1423, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14428246021270752, | |
| "step": 2810, | |
| "valid_targets_mean": 6673.4, | |
| "valid_targets_min": 2183 | |
| }, | |
| { | |
| "epoch": 4.637561779242175, | |
| "grad_norm": 0.4416153244849873, | |
| "learning_rate": 1.2361514128177286e-05, | |
| "loss": 0.1503, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1608775556087494, | |
| "step": 2815, | |
| "valid_targets_mean": 5941.4, | |
| "valid_targets_min": 353 | |
| }, | |
| { | |
| "epoch": 4.645799011532125, | |
| "grad_norm": 0.5094541742636675, | |
| "learning_rate": 1.2285652010730727e-05, | |
| "loss": 0.1509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15818873047828674, | |
| "step": 2820, | |
| "valid_targets_mean": 4775.7, | |
| "valid_targets_min": 709 | |
| }, | |
| { | |
| "epoch": 4.654036243822076, | |
| "grad_norm": 0.5138894968734843, | |
| "learning_rate": 1.220992006079557e-05, | |
| "loss": 0.1508, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17340649664402008, | |
| "step": 2825, | |
| "valid_targets_mean": 5142.4, | |
| "valid_targets_min": 780 | |
| }, | |
| { | |
| "epoch": 4.662273476112026, | |
| "grad_norm": 0.4349646733764812, | |
| "learning_rate": 1.2134319556229577e-05, | |
| "loss": 0.1562, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17126768827438354, | |
| "step": 2830, | |
| "valid_targets_mean": 6356.2, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 4.670510708401977, | |
| "grad_norm": 0.4590448852323375, | |
| "learning_rate": 1.2058851772672569e-05, | |
| "loss": 0.1595, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18234387040138245, | |
| "step": 2835, | |
| "valid_targets_mean": 5573.4, | |
| "valid_targets_min": 292 | |
| }, | |
| { | |
| "epoch": 4.6787479406919275, | |
| "grad_norm": 0.4682703319290092, | |
| "learning_rate": 1.1983517983524917e-05, | |
| "loss": 0.1583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15168213844299316, | |
| "step": 2840, | |
| "valid_targets_mean": 5342.8, | |
| "valid_targets_min": 498 | |
| }, | |
| { | |
| "epoch": 4.686985172981878, | |
| "grad_norm": 0.46709472071321484, | |
| "learning_rate": 1.1908319459926059e-05, | |
| "loss": 0.1497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15055695176124573, | |
| "step": 2845, | |
| "valid_targets_mean": 5698.6, | |
| "valid_targets_min": 430 | |
| }, | |
| { | |
| "epoch": 4.695222405271829, | |
| "grad_norm": 0.41530709694840034, | |
| "learning_rate": 1.183325747073301e-05, | |
| "loss": 0.1504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1470167487859726, | |
| "step": 2850, | |
| "valid_targets_mean": 6569.5, | |
| "valid_targets_min": 2168 | |
| }, | |
| { | |
| "epoch": 4.703459637561779, | |
| "grad_norm": 0.4986031558793924, | |
| "learning_rate": 1.1758333282499004e-05, | |
| "loss": 0.1616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15256115794181824, | |
| "step": 2855, | |
| "valid_targets_mean": 5291.1, | |
| "valid_targets_min": 356 | |
| }, | |
| { | |
| "epoch": 4.71169686985173, | |
| "grad_norm": 0.4644899790045432, | |
| "learning_rate": 1.1683548159452083e-05, | |
| "loss": 0.144, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13906921446323395, | |
| "step": 2860, | |
| "valid_targets_mean": 5042.7, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 4.71993410214168, | |
| "grad_norm": 0.4628750656923671, | |
| "learning_rate": 1.1608903363473804e-05, | |
| "loss": 0.1447, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15038900077342987, | |
| "step": 2865, | |
| "valid_targets_mean": 5170.4, | |
| "valid_targets_min": 392 | |
| }, | |
| { | |
| "epoch": 4.7281713344316305, | |
| "grad_norm": 0.46077501269852494, | |
| "learning_rate": 1.1534400154077901e-05, | |
| "loss": 0.1557, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14422756433486938, | |
| "step": 2870, | |
| "valid_targets_mean": 5135.5, | |
| "valid_targets_min": 562 | |
| }, | |
| { | |
| "epoch": 4.736408566721582, | |
| "grad_norm": 0.44080772013963004, | |
| "learning_rate": 1.146003978838908e-05, | |
| "loss": 0.1468, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15541505813598633, | |
| "step": 2875, | |
| "valid_targets_mean": 5824.4, | |
| "valid_targets_min": 2243 | |
| }, | |
| { | |
| "epoch": 4.744645799011532, | |
| "grad_norm": 0.5154411270196606, | |
| "learning_rate": 1.1385823521121755e-05, | |
| "loss": 0.1556, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1654052883386612, | |
| "step": 2880, | |
| "valid_targets_mean": 5482.2, | |
| "valid_targets_min": 390 | |
| }, | |
| { | |
| "epoch": 4.752883031301483, | |
| "grad_norm": 0.48283476082428284, | |
| "learning_rate": 1.1311752604558945e-05, | |
| "loss": 0.1458, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1402990221977234, | |
| "step": 2885, | |
| "valid_targets_mean": 5145.9, | |
| "valid_targets_min": 385 | |
| }, | |
| { | |
| "epoch": 4.761120263591433, | |
| "grad_norm": 0.4694411295793455, | |
| "learning_rate": 1.1237828288531065e-05, | |
| "loss": 0.1492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13541021943092346, | |
| "step": 2890, | |
| "valid_targets_mean": 5297.4, | |
| "valid_targets_min": 397 | |
| }, | |
| { | |
| "epoch": 4.769357495881383, | |
| "grad_norm": 0.42611432241362635, | |
| "learning_rate": 1.1164051820394903e-05, | |
| "loss": 0.1558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1598133146762848, | |
| "step": 2895, | |
| "valid_targets_mean": 6070.9, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 4.7775947281713345, | |
| "grad_norm": 0.416503846836609, | |
| "learning_rate": 1.1090424445012524e-05, | |
| "loss": 0.153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14090408384799957, | |
| "step": 2900, | |
| "valid_targets_mean": 6663.6, | |
| "valid_targets_min": 1886 | |
| }, | |
| { | |
| "epoch": 4.785831960461285, | |
| "grad_norm": 2.5787771100055266, | |
| "learning_rate": 1.1016947404730302e-05, | |
| "loss": 0.1579, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14577308297157288, | |
| "step": 2905, | |
| "valid_targets_mean": 6727.4, | |
| "valid_targets_min": 2421 | |
| }, | |
| { | |
| "epoch": 4.794069192751236, | |
| "grad_norm": 0.40795595011087066, | |
| "learning_rate": 1.0943621939357952e-05, | |
| "loss": 0.152, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1484503597021103, | |
| "step": 2910, | |
| "valid_targets_mean": 6500.4, | |
| "valid_targets_min": 997 | |
| }, | |
| { | |
| "epoch": 4.802306425041186, | |
| "grad_norm": 0.43831358879625426, | |
| "learning_rate": 1.0870449286147565e-05, | |
| "loss": 0.1491, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14645683765411377, | |
| "step": 2915, | |
| "valid_targets_mean": 6608.6, | |
| "valid_targets_min": 771 | |
| }, | |
| { | |
| "epoch": 4.810543657331137, | |
| "grad_norm": 0.5171802770026964, | |
| "learning_rate": 1.07974306797728e-05, | |
| "loss": 0.1546, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14750051498413086, | |
| "step": 2920, | |
| "valid_targets_mean": 4631.4, | |
| "valid_targets_min": 323 | |
| }, | |
| { | |
| "epoch": 4.818780889621087, | |
| "grad_norm": 0.4227242008790658, | |
| "learning_rate": 1.0724567352308008e-05, | |
| "loss": 0.1424, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14488349854946136, | |
| "step": 2925, | |
| "valid_targets_mean": 6282.1, | |
| "valid_targets_min": 642 | |
| }, | |
| { | |
| "epoch": 4.8270181219110375, | |
| "grad_norm": 0.4478606504649285, | |
| "learning_rate": 1.0651860533207448e-05, | |
| "loss": 0.1502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1508277952671051, | |
| "step": 2930, | |
| "valid_targets_mean": 5273.1, | |
| "valid_targets_min": 615 | |
| }, | |
| { | |
| "epoch": 4.835255354200989, | |
| "grad_norm": 0.42744288780436157, | |
| "learning_rate": 1.0579311449284555e-05, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14862512052059174, | |
| "step": 2935, | |
| "valid_targets_mean": 5654.1, | |
| "valid_targets_min": 330 | |
| }, | |
| { | |
| "epoch": 4.843492586490939, | |
| "grad_norm": 0.511057530937132, | |
| "learning_rate": 1.0506921324691209e-05, | |
| "loss": 0.1471, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13991223275661469, | |
| "step": 2940, | |
| "valid_targets_mean": 5849.8, | |
| "valid_targets_min": 239 | |
| }, | |
| { | |
| "epoch": 4.85172981878089, | |
| "grad_norm": 0.43402347813103515, | |
| "learning_rate": 1.0434691380897132e-05, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14862912893295288, | |
| "step": 2945, | |
| "valid_targets_mean": 7194.9, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 4.85996705107084, | |
| "grad_norm": 0.5276452752375149, | |
| "learning_rate": 1.0362622836669216e-05, | |
| "loss": 0.1486, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15401889383792877, | |
| "step": 2950, | |
| "valid_targets_mean": 5620.6, | |
| "valid_targets_min": 711 | |
| }, | |
| { | |
| "epoch": 4.868204283360791, | |
| "grad_norm": 0.4492778144090402, | |
| "learning_rate": 1.0290716908051009e-05, | |
| "loss": 0.1502, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16334179043769836, | |
| "step": 2955, | |
| "valid_targets_mean": 6076.8, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 4.8764415156507415, | |
| "grad_norm": 0.47655472357038325, | |
| "learning_rate": 1.0218974808342168e-05, | |
| "loss": 0.1696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17806747555732727, | |
| "step": 2960, | |
| "valid_targets_mean": 5522.4, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 4.884678747940692, | |
| "grad_norm": 0.45349469543761634, | |
| "learning_rate": 1.0147397748077999e-05, | |
| "loss": 0.1457, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1454918086528778, | |
| "step": 2965, | |
| "valid_targets_mean": 5321.0, | |
| "valid_targets_min": 224 | |
| }, | |
| { | |
| "epoch": 4.892915980230643, | |
| "grad_norm": 0.43886529178564526, | |
| "learning_rate": 1.007598693500903e-05, | |
| "loss": 0.1546, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14286306500434875, | |
| "step": 2970, | |
| "valid_targets_mean": 5753.2, | |
| "valid_targets_min": 802 | |
| }, | |
| { | |
| "epoch": 4.901153212520593, | |
| "grad_norm": 0.518966892778548, | |
| "learning_rate": 1.0004743574080607e-05, | |
| "loss": 0.1578, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1655164361000061, | |
| "step": 2975, | |
| "valid_targets_mean": 4567.6, | |
| "valid_targets_min": 312 | |
| }, | |
| { | |
| "epoch": 4.909390444810544, | |
| "grad_norm": 0.4440850818507626, | |
| "learning_rate": 9.93366886741261e-06, | |
| "loss": 0.1481, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15223294496536255, | |
| "step": 2980, | |
| "valid_targets_mean": 6050.8, | |
| "valid_targets_min": 1823 | |
| }, | |
| { | |
| "epoch": 4.917627677100494, | |
| "grad_norm": 0.4514254890627531, | |
| "learning_rate": 9.862764014279134e-06, | |
| "loss": 0.1544, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14469844102859497, | |
| "step": 2985, | |
| "valid_targets_mean": 6442.1, | |
| "valid_targets_min": 846 | |
| }, | |
| { | |
| "epoch": 4.9258649093904445, | |
| "grad_norm": 0.43580792454868544, | |
| "learning_rate": 9.792030211088248e-06, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1455616056919098, | |
| "step": 2990, | |
| "valid_targets_mean": 6012.9, | |
| "valid_targets_min": 1013 | |
| }, | |
| { | |
| "epoch": 4.934102141680396, | |
| "grad_norm": 0.43958810455468345, | |
| "learning_rate": 9.721468651361852e-06, | |
| "loss": 0.1495, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17629413306713104, | |
| "step": 2995, | |
| "valid_targets_mean": 6032.8, | |
| "valid_targets_min": 934 | |
| }, | |
| { | |
| "epoch": 4.942339373970346, | |
| "grad_norm": 0.5343160717700296, | |
| "learning_rate": 9.651080525715464e-06, | |
| "loss": 0.149, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15880396962165833, | |
| "step": 3000, | |
| "valid_targets_mean": 4276.4, | |
| "valid_targets_min": 576 | |
| }, | |
| { | |
| "epoch": 4.950576606260297, | |
| "grad_norm": 0.4662434799894812, | |
| "learning_rate": 9.580867021838234e-06, | |
| "loss": 0.1432, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13793358206748962, | |
| "step": 3005, | |
| "valid_targets_mean": 5537.8, | |
| "valid_targets_min": 1519 | |
| }, | |
| { | |
| "epoch": 4.958813838550247, | |
| "grad_norm": 0.47078971320012264, | |
| "learning_rate": 9.510829324472784e-06, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1519417017698288, | |
| "step": 3010, | |
| "valid_targets_mean": 5382.9, | |
| "valid_targets_min": 351 | |
| }, | |
| { | |
| "epoch": 4.967051070840197, | |
| "grad_norm": 0.4478704939612275, | |
| "learning_rate": 9.440968615395314e-06, | |
| "loss": 0.1489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14127302169799805, | |
| "step": 3015, | |
| "valid_targets_mean": 6064.4, | |
| "valid_targets_min": 1989 | |
| }, | |
| { | |
| "epoch": 4.9752883031301485, | |
| "grad_norm": 0.4750655640254115, | |
| "learning_rate": 9.37128607339562e-06, | |
| "loss": 0.1572, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14737387001514435, | |
| "step": 3020, | |
| "valid_targets_mean": 6327.5, | |
| "valid_targets_min": 821 | |
| }, | |
| { | |
| "epoch": 4.983525535420099, | |
| "grad_norm": 0.4555400005042597, | |
| "learning_rate": 9.301782874257186e-06, | |
| "loss": 0.1688, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15319132804870605, | |
| "step": 3025, | |
| "valid_targets_mean": 5603.2, | |
| "valid_targets_min": 419 | |
| }, | |
| { | |
| "epoch": 4.99176276771005, | |
| "grad_norm": 0.4695221755893214, | |
| "learning_rate": 9.232460190737394e-06, | |
| "loss": 0.1551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1426851451396942, | |
| "step": 3030, | |
| "valid_targets_mean": 5278.2, | |
| "valid_targets_min": 294 | |
| }, | |
| { | |
| "epoch": 5.0, | |
| "grad_norm": 0.478596890936015, | |
| "learning_rate": 9.163319192547702e-06, | |
| "loss": 0.1465, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.142055481672287, | |
| "step": 3035, | |
| "valid_targets_mean": 5582.5, | |
| "valid_targets_min": 531 | |
| }, | |
| { | |
| "epoch": 5.00823723228995, | |
| "grad_norm": 0.49069009485190446, | |
| "learning_rate": 9.094361046333887e-06, | |
| "loss": 0.1409, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1303725689649582, | |
| "step": 3040, | |
| "valid_targets_mean": 6621.2, | |
| "valid_targets_min": 911 | |
| }, | |
| { | |
| "epoch": 5.016474464579901, | |
| "grad_norm": 0.4821661219915955, | |
| "learning_rate": 9.025586915656435e-06, | |
| "loss": 0.1458, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1344459503889084, | |
| "step": 3045, | |
| "valid_targets_mean": 5295.7, | |
| "valid_targets_min": 325 | |
| }, | |
| { | |
| "epoch": 5.0247116968698515, | |
| "grad_norm": 0.5505894006961336, | |
| "learning_rate": 8.956997960970811e-06, | |
| "loss": 0.1548, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17692875862121582, | |
| "step": 3050, | |
| "valid_targets_mean": 4894.8, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 5.032948929159803, | |
| "grad_norm": 0.46750433795789914, | |
| "learning_rate": 8.888595339607961e-06, | |
| "loss": 0.1373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13959461450576782, | |
| "step": 3055, | |
| "valid_targets_mean": 6171.8, | |
| "valid_targets_min": 587 | |
| }, | |
| { | |
| "epoch": 5.041186161449753, | |
| "grad_norm": 0.5070608295694433, | |
| "learning_rate": 8.82038020575472e-06, | |
| "loss": 0.1417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14376729726791382, | |
| "step": 3060, | |
| "valid_targets_mean": 4882.1, | |
| "valid_targets_min": 555 | |
| }, | |
| { | |
| "epoch": 5.049423393739703, | |
| "grad_norm": 0.6060377491564969, | |
| "learning_rate": 8.752353710434384e-06, | |
| "loss": 0.1419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15076524019241333, | |
| "step": 3065, | |
| "valid_targets_mean": 3471.9, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 5.057660626029654, | |
| "grad_norm": 0.469440093985586, | |
| "learning_rate": 8.684517001487276e-06, | |
| "loss": 0.1425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14725562930107117, | |
| "step": 3070, | |
| "valid_targets_mean": 5937.0, | |
| "valid_targets_min": 331 | |
| }, | |
| { | |
| "epoch": 5.065897858319604, | |
| "grad_norm": 0.4900110272276438, | |
| "learning_rate": 8.616871223551335e-06, | |
| "loss": 0.1409, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13916555047035217, | |
| "step": 3075, | |
| "valid_targets_mean": 5039.7, | |
| "valid_targets_min": 726 | |
| }, | |
| { | |
| "epoch": 5.0741350906095555, | |
| "grad_norm": 0.41725739160888586, | |
| "learning_rate": 8.549417518042899e-06, | |
| "loss": 0.1351, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1271580159664154, | |
| "step": 3080, | |
| "valid_targets_mean": 6236.5, | |
| "valid_targets_min": 422 | |
| }, | |
| { | |
| "epoch": 5.082372322899506, | |
| "grad_norm": 0.4605522955017288, | |
| "learning_rate": 8.482157023137326e-06, | |
| "loss": 0.1385, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11955632269382477, | |
| "step": 3085, | |
| "valid_targets_mean": 5906.8, | |
| "valid_targets_min": 600 | |
| }, | |
| { | |
| "epoch": 5.090609555189456, | |
| "grad_norm": 0.502778885048082, | |
| "learning_rate": 8.415090873749882e-06, | |
| "loss": 0.1439, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14969505369663239, | |
| "step": 3090, | |
| "valid_targets_mean": 5029.8, | |
| "valid_targets_min": 211 | |
| }, | |
| { | |
| "epoch": 5.098846787479407, | |
| "grad_norm": 0.5053536000161294, | |
| "learning_rate": 8.348220201516559e-06, | |
| "loss": 0.1469, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13605719804763794, | |
| "step": 3095, | |
| "valid_targets_mean": 6112.5, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 5.107084019769357, | |
| "grad_norm": 0.4598478330032498, | |
| "learning_rate": 8.281546134774952e-06, | |
| "loss": 0.1494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15661340951919556, | |
| "step": 3100, | |
| "valid_targets_mean": 5920.4, | |
| "valid_targets_min": 1189 | |
| }, | |
| { | |
| "epoch": 5.115321252059308, | |
| "grad_norm": 0.5547180056529626, | |
| "learning_rate": 8.215069798545272e-06, | |
| "loss": 0.1366, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13134953379631042, | |
| "step": 3105, | |
| "valid_targets_mean": 4233.9, | |
| "valid_targets_min": 265 | |
| }, | |
| { | |
| "epoch": 5.1235584843492585, | |
| "grad_norm": 0.44622191419482493, | |
| "learning_rate": 8.148792314511338e-06, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14064301550388336, | |
| "step": 3110, | |
| "valid_targets_mean": 5863.4, | |
| "valid_targets_min": 486 | |
| }, | |
| { | |
| "epoch": 5.13179571663921, | |
| "grad_norm": 0.47904865237329824, | |
| "learning_rate": 8.082714801001619e-06, | |
| "loss": 0.138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13501349091529846, | |
| "step": 3115, | |
| "valid_targets_mean": 5212.8, | |
| "valid_targets_min": 395 | |
| }, | |
| { | |
| "epoch": 5.14003294892916, | |
| "grad_norm": 0.46284849540082074, | |
| "learning_rate": 8.016838372970453e-06, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13676853477954865, | |
| "step": 3120, | |
| "valid_targets_mean": 6098.6, | |
| "valid_targets_min": 1111 | |
| }, | |
| { | |
| "epoch": 5.14827018121911, | |
| "grad_norm": 0.4994068688425364, | |
| "learning_rate": 7.951164141979115e-06, | |
| "loss": 0.1306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12372538447380066, | |
| "step": 3125, | |
| "valid_targets_mean": 6541.1, | |
| "valid_targets_min": 1550 | |
| }, | |
| { | |
| "epoch": 5.156507413509061, | |
| "grad_norm": 0.5022833293912967, | |
| "learning_rate": 7.885693216177166e-06, | |
| "loss": 0.1437, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13442087173461914, | |
| "step": 3130, | |
| "valid_targets_mean": 5174.1, | |
| "valid_targets_min": 318 | |
| }, | |
| { | |
| "epoch": 5.164744645799011, | |
| "grad_norm": 0.4775571872447023, | |
| "learning_rate": 7.820426700283681e-06, | |
| "loss": 0.1453, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13596433401107788, | |
| "step": 3135, | |
| "valid_targets_mean": 4972.1, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 5.1729818780889625, | |
| "grad_norm": 0.49050982979527447, | |
| "learning_rate": 7.75536569556866e-06, | |
| "loss": 0.1383, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1351958066225052, | |
| "step": 3140, | |
| "valid_targets_mean": 5699.8, | |
| "valid_targets_min": 333 | |
| }, | |
| { | |
| "epoch": 5.181219110378913, | |
| "grad_norm": 0.5461539540093063, | |
| "learning_rate": 7.690511299834422e-06, | |
| "loss": 0.1374, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1379501223564148, | |
| "step": 3145, | |
| "valid_targets_mean": 5473.7, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 5.189456342668863, | |
| "grad_norm": 0.4501236120342665, | |
| "learning_rate": 7.625864607397066e-06, | |
| "loss": 0.138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.146168053150177, | |
| "step": 3150, | |
| "valid_targets_mean": 6165.9, | |
| "valid_targets_min": 385 | |
| }, | |
| { | |
| "epoch": 5.197693574958814, | |
| "grad_norm": 0.5244359282299395, | |
| "learning_rate": 7.561426709068045e-06, | |
| "loss": 0.1483, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1434154212474823, | |
| "step": 3155, | |
| "valid_targets_mean": 4548.9, | |
| "valid_targets_min": 293 | |
| }, | |
| { | |
| "epoch": 5.205930807248764, | |
| "grad_norm": 0.46684517074949977, | |
| "learning_rate": 7.497198692135728e-06, | |
| "loss": 0.1381, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13306406140327454, | |
| "step": 3160, | |
| "valid_targets_mean": 5791.3, | |
| "valid_targets_min": 2554 | |
| }, | |
| { | |
| "epoch": 5.214168039538715, | |
| "grad_norm": 0.5243868937758703, | |
| "learning_rate": 7.433181640347069e-06, | |
| "loss": 0.1427, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15170177817344666, | |
| "step": 3165, | |
| "valid_targets_mean": 6504.1, | |
| "valid_targets_min": 1501 | |
| }, | |
| { | |
| "epoch": 5.2224052718286655, | |
| "grad_norm": 0.4643698500548393, | |
| "learning_rate": 7.369376633889325e-06, | |
| "loss": 0.1373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.140183687210083, | |
| "step": 3170, | |
| "valid_targets_mean": 5628.1, | |
| "valid_targets_min": 349 | |
| }, | |
| { | |
| "epoch": 5.230642504118616, | |
| "grad_norm": 0.4965467591482577, | |
| "learning_rate": 7.305784749371792e-06, | |
| "loss": 0.1417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1593542993068695, | |
| "step": 3175, | |
| "valid_targets_mean": 5535.8, | |
| "valid_targets_min": 753 | |
| }, | |
| { | |
| "epoch": 5.238879736408567, | |
| "grad_norm": 0.48661738008906197, | |
| "learning_rate": 7.242407059807703e-06, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14572195708751678, | |
| "step": 3180, | |
| "valid_targets_mean": 5237.9, | |
| "valid_targets_min": 549 | |
| }, | |
| { | |
| "epoch": 5.247116968698517, | |
| "grad_norm": 0.4806448175923749, | |
| "learning_rate": 7.1792446345960585e-06, | |
| "loss": 0.151, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1412971317768097, | |
| "step": 3185, | |
| "valid_targets_mean": 5538.9, | |
| "valid_targets_min": 423 | |
| }, | |
| { | |
| "epoch": 5.255354200988468, | |
| "grad_norm": 0.5155778651423555, | |
| "learning_rate": 7.116298539503632e-06, | |
| "loss": 0.141, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17066286504268646, | |
| "step": 3190, | |
| "valid_targets_mean": 5100.2, | |
| "valid_targets_min": 872 | |
| }, | |
| { | |
| "epoch": 5.263591433278418, | |
| "grad_norm": 0.48430509588775345, | |
| "learning_rate": 7.053569836646954e-06, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1346225142478943, | |
| "step": 3195, | |
| "valid_targets_mean": 5719.1, | |
| "valid_targets_min": 2796 | |
| }, | |
| { | |
| "epoch": 5.2718286655683695, | |
| "grad_norm": 0.46910515647052425, | |
| "learning_rate": 6.991059584474411e-06, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13558635115623474, | |
| "step": 3200, | |
| "valid_targets_mean": 5281.7, | |
| "valid_targets_min": 312 | |
| }, | |
| { | |
| "epoch": 5.28006589785832, | |
| "grad_norm": 0.5480174850491485, | |
| "learning_rate": 6.928768837748374e-06, | |
| "loss": 0.1442, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1344684362411499, | |
| "step": 3205, | |
| "valid_targets_mean": 6858.4, | |
| "valid_targets_min": 320 | |
| }, | |
| { | |
| "epoch": 5.28830313014827, | |
| "grad_norm": 0.4918052522785785, | |
| "learning_rate": 6.866698647527392e-06, | |
| "loss": 0.1369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1560150682926178, | |
| "step": 3210, | |
| "valid_targets_mean": 5481.4, | |
| "valid_targets_min": 3738 | |
| }, | |
| { | |
| "epoch": 5.296540362438221, | |
| "grad_norm": 0.4490381704275577, | |
| "learning_rate": 6.804850061148485e-06, | |
| "loss": 0.1393, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1290242075920105, | |
| "step": 3215, | |
| "valid_targets_mean": 5527.1, | |
| "valid_targets_min": 339 | |
| }, | |
| { | |
| "epoch": 5.304777594728171, | |
| "grad_norm": 0.8224095339940158, | |
| "learning_rate": 6.7432241222094595e-06, | |
| "loss": 0.1473, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17443184554576874, | |
| "step": 3220, | |
| "valid_targets_mean": 4949.4, | |
| "valid_targets_min": 334 | |
| }, | |
| { | |
| "epoch": 5.313014827018122, | |
| "grad_norm": 0.47003033255636556, | |
| "learning_rate": 6.68182187055127e-06, | |
| "loss": 0.1416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1260230839252472, | |
| "step": 3225, | |
| "valid_targets_mean": 5310.6, | |
| "valid_targets_min": 375 | |
| }, | |
| { | |
| "epoch": 5.3212520593080725, | |
| "grad_norm": 0.4908187369808388, | |
| "learning_rate": 6.620644342240534e-06, | |
| "loss": 0.1393, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14359594881534576, | |
| "step": 3230, | |
| "valid_targets_mean": 5568.1, | |
| "valid_targets_min": 1129 | |
| }, | |
| { | |
| "epoch": 5.329489291598023, | |
| "grad_norm": 0.551319463728614, | |
| "learning_rate": 6.559692569551976e-06, | |
| "loss": 0.1411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15381722152233124, | |
| "step": 3235, | |
| "valid_targets_mean": 5568.9, | |
| "valid_targets_min": 815 | |
| }, | |
| { | |
| "epoch": 5.337726523887974, | |
| "grad_norm": 0.4450876804926074, | |
| "learning_rate": 6.498967580951099e-06, | |
| "loss": 0.1335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12999829649925232, | |
| "step": 3240, | |
| "valid_targets_mean": 5879.9, | |
| "valid_targets_min": 361 | |
| }, | |
| { | |
| "epoch": 5.345963756177924, | |
| "grad_norm": 0.4711121467460195, | |
| "learning_rate": 6.4384704010767284e-06, | |
| "loss": 0.1508, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14210301637649536, | |
| "step": 3245, | |
| "valid_targets_mean": 5384.6, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 5.354200988467875, | |
| "grad_norm": 0.5603577329282675, | |
| "learning_rate": 6.37820205072381e-06, | |
| "loss": 0.1346, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13441210985183716, | |
| "step": 3250, | |
| "valid_targets_mean": 3903.1, | |
| "valid_targets_min": 269 | |
| }, | |
| { | |
| "epoch": 5.362438220757825, | |
| "grad_norm": 0.581302886704017, | |
| "learning_rate": 6.318163546826141e-06, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1303476095199585, | |
| "step": 3255, | |
| "valid_targets_mean": 5103.0, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 5.370675453047776, | |
| "grad_norm": 0.5001476639868807, | |
| "learning_rate": 6.258355902439211e-06, | |
| "loss": 0.1411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.149338960647583, | |
| "step": 3260, | |
| "valid_targets_mean": 5191.6, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 5.378912685337727, | |
| "grad_norm": 0.6039641844272668, | |
| "learning_rate": 6.198780126723127e-06, | |
| "loss": 0.1404, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15267781913280487, | |
| "step": 3265, | |
| "valid_targets_mean": 3882.5, | |
| "valid_targets_min": 309 | |
| }, | |
| { | |
| "epoch": 5.387149917627677, | |
| "grad_norm": 0.4942844367813607, | |
| "learning_rate": 6.139437224925586e-06, | |
| "loss": 0.1414, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12675516307353973, | |
| "step": 3270, | |
| "valid_targets_mean": 5851.5, | |
| "valid_targets_min": 1922 | |
| }, | |
| { | |
| "epoch": 5.395387149917628, | |
| "grad_norm": 0.5310086973536533, | |
| "learning_rate": 6.080328198364871e-06, | |
| "loss": 0.1478, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14578582346439362, | |
| "step": 3275, | |
| "valid_targets_mean": 4682.0, | |
| "valid_targets_min": 274 | |
| }, | |
| { | |
| "epoch": 5.403624382207578, | |
| "grad_norm": 0.510770120262347, | |
| "learning_rate": 6.021454044413036e-06, | |
| "loss": 0.1324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14449571073055267, | |
| "step": 3280, | |
| "valid_targets_mean": 4886.8, | |
| "valid_targets_min": 357 | |
| }, | |
| { | |
| "epoch": 5.4118616144975284, | |
| "grad_norm": 0.6719528824690161, | |
| "learning_rate": 5.962815756478979e-06, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.146399587392807, | |
| "step": 3285, | |
| "valid_targets_mean": 4660.1, | |
| "valid_targets_min": 282 | |
| }, | |
| { | |
| "epoch": 5.4200988467874796, | |
| "grad_norm": 0.5505355745551022, | |
| "learning_rate": 5.904414323991758e-06, | |
| "loss": 0.1388, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14041973650455475, | |
| "step": 3290, | |
| "valid_targets_mean": 5069.0, | |
| "valid_targets_min": 372 | |
| }, | |
| { | |
| "epoch": 5.42833607907743, | |
| "grad_norm": 0.5162576529384314, | |
| "learning_rate": 5.846250732383869e-06, | |
| "loss": 0.1335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13150526583194733, | |
| "step": 3295, | |
| "valid_targets_mean": 5460.1, | |
| "valid_targets_min": 293 | |
| }, | |
| { | |
| "epoch": 5.436573311367381, | |
| "grad_norm": 0.6352297078680663, | |
| "learning_rate": 5.788325963074594e-06, | |
| "loss": 0.1504, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16142921149730682, | |
| "step": 3300, | |
| "valid_targets_mean": 4705.2, | |
| "valid_targets_min": 291 | |
| }, | |
| { | |
| "epoch": 5.444810543657331, | |
| "grad_norm": 0.580160236040931, | |
| "learning_rate": 5.7306409934534936e-06, | |
| "loss": 0.143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15034732222557068, | |
| "step": 3305, | |
| "valid_targets_mean": 4449.2, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 5.453047775947281, | |
| "grad_norm": 0.5256262308325671, | |
| "learning_rate": 5.673196796863852e-06, | |
| "loss": 0.1398, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14116880297660828, | |
| "step": 3310, | |
| "valid_targets_mean": 5438.1, | |
| "valid_targets_min": 300 | |
| }, | |
| { | |
| "epoch": 5.461285008237232, | |
| "grad_norm": 0.4206557921596052, | |
| "learning_rate": 5.615994342586313e-06, | |
| "loss": 0.1317, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1337016373872757, | |
| "step": 3315, | |
| "valid_targets_mean": 6748.2, | |
| "valid_targets_min": 2293 | |
| }, | |
| { | |
| "epoch": 5.469522240527183, | |
| "grad_norm": 0.5268814730787679, | |
| "learning_rate": 5.5590345958224875e-06, | |
| "loss": 0.1263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12679679691791534, | |
| "step": 3320, | |
| "valid_targets_mean": 4569.9, | |
| "valid_targets_min": 317 | |
| }, | |
| { | |
| "epoch": 5.477759472817134, | |
| "grad_norm": 0.5252752092181787, | |
| "learning_rate": 5.502318517678682e-06, | |
| "loss": 0.1446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14012542366981506, | |
| "step": 3325, | |
| "valid_targets_mean": 4685.8, | |
| "valid_targets_min": 307 | |
| }, | |
| { | |
| "epoch": 5.485996705107084, | |
| "grad_norm": 0.5623820612844119, | |
| "learning_rate": 5.445847065149679e-06, | |
| "loss": 0.1452, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15179842710494995, | |
| "step": 3330, | |
| "valid_targets_mean": 4619.6, | |
| "valid_targets_min": 383 | |
| }, | |
| { | |
| "epoch": 5.494233937397034, | |
| "grad_norm": 0.45684783663012746, | |
| "learning_rate": 5.389621191102574e-06, | |
| "loss": 0.1418, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1381663978099823, | |
| "step": 3335, | |
| "valid_targets_mean": 5621.9, | |
| "valid_targets_min": 2010 | |
| }, | |
| { | |
| "epoch": 5.502471169686985, | |
| "grad_norm": 0.4806800362003253, | |
| "learning_rate": 5.333641844260726e-06, | |
| "loss": 0.1429, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14656078815460205, | |
| "step": 3340, | |
| "valid_targets_mean": 5493.8, | |
| "valid_targets_min": 780 | |
| }, | |
| { | |
| "epoch": 5.5107084019769355, | |
| "grad_norm": 0.4542630067117667, | |
| "learning_rate": 5.277909969187738e-06, | |
| "loss": 0.1337, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12890984117984772, | |
| "step": 3345, | |
| "valid_targets_mean": 4942.9, | |
| "valid_targets_min": 1065 | |
| }, | |
| { | |
| "epoch": 5.518945634266887, | |
| "grad_norm": 0.48692358345346237, | |
| "learning_rate": 5.222426506271492e-06, | |
| "loss": 0.1341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13926216959953308, | |
| "step": 3350, | |
| "valid_targets_mean": 5788.2, | |
| "valid_targets_min": 255 | |
| }, | |
| { | |
| "epoch": 5.527182866556837, | |
| "grad_norm": 0.41013954032758926, | |
| "learning_rate": 5.1671923917083265e-06, | |
| "loss": 0.1423, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1503438502550125, | |
| "step": 3355, | |
| "valid_targets_mean": 6804.8, | |
| "valid_targets_min": 5049 | |
| }, | |
| { | |
| "epoch": 5.535420098846787, | |
| "grad_norm": 0.45924517683091326, | |
| "learning_rate": 5.112208557487215e-06, | |
| "loss": 0.131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1323554813861847, | |
| "step": 3360, | |
| "valid_targets_mean": 6357.6, | |
| "valid_targets_min": 377 | |
| }, | |
| { | |
| "epoch": 5.543657331136738, | |
| "grad_norm": 0.527172213063129, | |
| "learning_rate": 5.057475931374043e-06, | |
| "loss": 0.1416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15223285555839539, | |
| "step": 3365, | |
| "valid_targets_mean": 4769.6, | |
| "valid_targets_min": 333 | |
| }, | |
| { | |
| "epoch": 5.551894563426688, | |
| "grad_norm": 0.5024208642551655, | |
| "learning_rate": 5.002995436895939e-06, | |
| "loss": 0.1445, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14082445204257965, | |
| "step": 3370, | |
| "valid_targets_mean": 6298.3, | |
| "valid_targets_min": 2535 | |
| }, | |
| { | |
| "epoch": 5.560131795716639, | |
| "grad_norm": 0.4833609888224746, | |
| "learning_rate": 4.9487679933257225e-06, | |
| "loss": 0.1469, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15281173586845398, | |
| "step": 3375, | |
| "valid_targets_mean": 5999.8, | |
| "valid_targets_min": 708 | |
| }, | |
| { | |
| "epoch": 5.56836902800659, | |
| "grad_norm": 0.45535217794864863, | |
| "learning_rate": 4.894794515666379e-06, | |
| "loss": 0.1425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13284778594970703, | |
| "step": 3380, | |
| "valid_targets_mean": 5627.4, | |
| "valid_targets_min": 615 | |
| }, | |
| { | |
| "epoch": 5.576606260296541, | |
| "grad_norm": 0.4170755788775851, | |
| "learning_rate": 4.8410759146355955e-06, | |
| "loss": 0.1414, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12243291735649109, | |
| "step": 3385, | |
| "valid_targets_mean": 6536.8, | |
| "valid_targets_min": 402 | |
| }, | |
| { | |
| "epoch": 5.584843492586491, | |
| "grad_norm": 0.5003624466674033, | |
| "learning_rate": 4.787613096650436e-06, | |
| "loss": 0.1377, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1479077935218811, | |
| "step": 3390, | |
| "valid_targets_mean": 5083.2, | |
| "valid_targets_min": 393 | |
| }, | |
| { | |
| "epoch": 5.593080724876441, | |
| "grad_norm": 0.44526841862204153, | |
| "learning_rate": 4.734406963812024e-06, | |
| "loss": 0.1461, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1423017829656601, | |
| "step": 3395, | |
| "valid_targets_mean": 6407.9, | |
| "valid_targets_min": 4165 | |
| }, | |
| { | |
| "epoch": 5.601317957166392, | |
| "grad_norm": 0.42750551337562315, | |
| "learning_rate": 4.68145841389032e-06, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12783539295196533, | |
| "step": 3400, | |
| "valid_targets_mean": 5636.9, | |
| "valid_targets_min": 369 | |
| }, | |
| { | |
| "epoch": 5.6095551894563425, | |
| "grad_norm": 0.4236352461981568, | |
| "learning_rate": 4.6287683403089845e-06, | |
| "loss": 0.1324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1300227791070938, | |
| "step": 3405, | |
| "valid_targets_mean": 6075.1, | |
| "valid_targets_min": 1246 | |
| }, | |
| { | |
| "epoch": 5.617792421746294, | |
| "grad_norm": 0.5163705704902521, | |
| "learning_rate": 4.576337632130281e-06, | |
| "loss": 0.1428, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14247193932533264, | |
| "step": 3410, | |
| "valid_targets_mean": 4827.1, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 5.626029654036244, | |
| "grad_norm": 0.43379156718976253, | |
| "learning_rate": 4.524167174040106e-06, | |
| "loss": 0.1377, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1339322179555893, | |
| "step": 3415, | |
| "valid_targets_mean": 6573.1, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 5.634266886326195, | |
| "grad_norm": 0.4707230248503362, | |
| "learning_rate": 4.472257846333028e-06, | |
| "loss": 0.1391, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1403801143169403, | |
| "step": 3420, | |
| "valid_targets_mean": 5142.9, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 5.642504118616145, | |
| "grad_norm": 0.5319126431444242, | |
| "learning_rate": 4.4206105248974595e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1340303272008896, | |
| "step": 3425, | |
| "valid_targets_mean": 6185.4, | |
| "valid_targets_min": 381 | |
| }, | |
| { | |
| "epoch": 5.650741350906095, | |
| "grad_norm": 0.440524877940003, | |
| "learning_rate": 4.3692260812008795e-06, | |
| "loss": 0.1394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12920840084552765, | |
| "step": 3430, | |
| "valid_targets_mean": 5609.5, | |
| "valid_targets_min": 250 | |
| }, | |
| { | |
| "epoch": 5.658978583196046, | |
| "grad_norm": 0.4609698993867005, | |
| "learning_rate": 4.318105382275084e-06, | |
| "loss": 0.1358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14189115166664124, | |
| "step": 3435, | |
| "valid_targets_mean": 5514.4, | |
| "valid_targets_min": 305 | |
| }, | |
| { | |
| "epoch": 5.667215815485997, | |
| "grad_norm": 0.43144316493853585, | |
| "learning_rate": 4.267249290701636e-06, | |
| "loss": 0.1383, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12442096322774887, | |
| "step": 3440, | |
| "valid_targets_mean": 5884.9, | |
| "valid_targets_min": 375 | |
| }, | |
| { | |
| "epoch": 5.675453047775948, | |
| "grad_norm": 0.4997124831898971, | |
| "learning_rate": 4.216658664597224e-06, | |
| "loss": 0.1364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13440588116645813, | |
| "step": 3445, | |
| "valid_targets_mean": 5936.8, | |
| "valid_targets_min": 425 | |
| }, | |
| { | |
| "epoch": 5.683690280065898, | |
| "grad_norm": 0.4542471741740084, | |
| "learning_rate": 4.166334357599246e-06, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12336796522140503, | |
| "step": 3450, | |
| "valid_targets_mean": 5466.2, | |
| "valid_targets_min": 242 | |
| }, | |
| { | |
| "epoch": 5.691927512355848, | |
| "grad_norm": 0.7475283328556905, | |
| "learning_rate": 4.116277218851383e-06, | |
| "loss": 0.1343, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1225927472114563, | |
| "step": 3455, | |
| "valid_targets_mean": 5669.9, | |
| "valid_targets_min": 465 | |
| }, | |
| { | |
| "epoch": 5.700164744645799, | |
| "grad_norm": 0.502370666703315, | |
| "learning_rate": 4.066488092989256e-06, | |
| "loss": 0.1652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13243255019187927, | |
| "step": 3460, | |
| "valid_targets_mean": 4953.4, | |
| "valid_targets_min": 395 | |
| }, | |
| { | |
| "epoch": 5.7084019769357495, | |
| "grad_norm": 0.5522762130532978, | |
| "learning_rate": 4.016967820126205e-06, | |
| "loss": 0.1507, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15482047200202942, | |
| "step": 3465, | |
| "valid_targets_mean": 4570.9, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 5.716639209225701, | |
| "grad_norm": 0.5267311706871121, | |
| "learning_rate": 3.967717235839086e-06, | |
| "loss": 0.1343, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13417063653469086, | |
| "step": 3470, | |
| "valid_targets_mean": 4840.9, | |
| "valid_targets_min": 345 | |
| }, | |
| { | |
| "epoch": 5.724876441515651, | |
| "grad_norm": 0.5287502682061341, | |
| "learning_rate": 3.918737171154192e-06, | |
| "loss": 0.1335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14333754777908325, | |
| "step": 3475, | |
| "valid_targets_mean": 4829.8, | |
| "valid_targets_min": 286 | |
| }, | |
| { | |
| "epoch": 5.733113673805601, | |
| "grad_norm": 0.48784650075697344, | |
| "learning_rate": 3.870028452533221e-06, | |
| "loss": 0.1411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14025351405143738, | |
| "step": 3480, | |
| "valid_targets_mean": 5756.6, | |
| "valid_targets_min": 265 | |
| }, | |
| { | |
| "epoch": 5.741350906095552, | |
| "grad_norm": 0.5403889463433232, | |
| "learning_rate": 3.8215919018593295e-06, | |
| "loss": 0.1422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.146037757396698, | |
| "step": 3485, | |
| "valid_targets_mean": 4279.8, | |
| "valid_targets_min": 320 | |
| }, | |
| { | |
| "epoch": 5.749588138385502, | |
| "grad_norm": 0.4592507398840409, | |
| "learning_rate": 3.7734283364232747e-06, | |
| "loss": 0.1367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14514651894569397, | |
| "step": 3490, | |
| "valid_targets_mean": 5427.7, | |
| "valid_targets_min": 732 | |
| }, | |
| { | |
| "epoch": 5.757825370675453, | |
| "grad_norm": 0.4887221571373193, | |
| "learning_rate": 3.725538568909599e-06, | |
| "loss": 0.14, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14044776558876038, | |
| "step": 3495, | |
| "valid_targets_mean": 5423.7, | |
| "valid_targets_min": 362 | |
| }, | |
| { | |
| "epoch": 5.766062602965404, | |
| "grad_norm": 0.5486820425373832, | |
| "learning_rate": 3.6779234073829528e-06, | |
| "loss": 0.1404, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13362336158752441, | |
| "step": 3500, | |
| "valid_targets_mean": 5640.1, | |
| "valid_targets_min": 301 | |
| }, | |
| { | |
| "epoch": 5.774299835255354, | |
| "grad_norm": 0.44478317936989153, | |
| "learning_rate": 3.63058365527444e-06, | |
| "loss": 0.1323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14148974418640137, | |
| "step": 3505, | |
| "valid_targets_mean": 5923.2, | |
| "valid_targets_min": 369 | |
| }, | |
| { | |
| "epoch": 5.782537067545305, | |
| "grad_norm": 0.4929393880606377, | |
| "learning_rate": 3.5835201113680463e-06, | |
| "loss": 0.1402, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13513973355293274, | |
| "step": 3510, | |
| "valid_targets_mean": 4818.0, | |
| "valid_targets_min": 282 | |
| }, | |
| { | |
| "epoch": 5.790774299835255, | |
| "grad_norm": 0.4174321213852269, | |
| "learning_rate": 3.536733569787194e-06, | |
| "loss": 0.1379, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14224773645401, | |
| "step": 3515, | |
| "valid_targets_mean": 7219.0, | |
| "valid_targets_min": 3108 | |
| }, | |
| { | |
| "epoch": 5.799011532125206, | |
| "grad_norm": 0.48728034697704387, | |
| "learning_rate": 3.4902248199813203e-06, | |
| "loss": 0.1449, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.137048602104187, | |
| "step": 3520, | |
| "valid_targets_mean": 5947.2, | |
| "valid_targets_min": 416 | |
| }, | |
| { | |
| "epoch": 5.8072487644151565, | |
| "grad_norm": 0.4262984193290179, | |
| "learning_rate": 3.443994646712567e-06, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14041151106357574, | |
| "step": 3525, | |
| "valid_targets_mean": 7137.3, | |
| "valid_targets_min": 997 | |
| }, | |
| { | |
| "epoch": 5.815485996705107, | |
| "grad_norm": 0.41903023686254465, | |
| "learning_rate": 3.3980438300425324e-06, | |
| "loss": 0.147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13303492963314056, | |
| "step": 3530, | |
| "valid_targets_mean": 6324.1, | |
| "valid_targets_min": 379 | |
| }, | |
| { | |
| "epoch": 5.823723228995058, | |
| "grad_norm": 0.48396886689123936, | |
| "learning_rate": 3.3523731453190987e-06, | |
| "loss": 0.1437, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14490509033203125, | |
| "step": 3535, | |
| "valid_targets_mean": 5659.3, | |
| "valid_targets_min": 407 | |
| }, | |
| { | |
| "epoch": 5.831960461285008, | |
| "grad_norm": 0.5634002824321679, | |
| "learning_rate": 3.306983363163381e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.149214506149292, | |
| "step": 3540, | |
| "valid_targets_mean": 4438.4, | |
| "valid_targets_min": 224 | |
| }, | |
| { | |
| "epoch": 5.840197693574959, | |
| "grad_norm": 0.43591445348583646, | |
| "learning_rate": 3.2618752494566854e-06, | |
| "loss": 0.1394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13709425926208496, | |
| "step": 3545, | |
| "valid_targets_mean": 6600.5, | |
| "valid_targets_min": 4529 | |
| }, | |
| { | |
| "epoch": 5.848434925864909, | |
| "grad_norm": 0.45792537128808686, | |
| "learning_rate": 3.217049565327619e-06, | |
| "loss": 0.1366, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15016064047813416, | |
| "step": 3550, | |
| "valid_targets_mean": 5791.0, | |
| "valid_targets_min": 414 | |
| }, | |
| { | |
| "epoch": 5.8566721581548595, | |
| "grad_norm": 0.41811339718156515, | |
| "learning_rate": 3.17250706713923e-06, | |
| "loss": 0.137, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11696778237819672, | |
| "step": 3555, | |
| "valid_targets_mean": 6163.2, | |
| "valid_targets_min": 768 | |
| }, | |
| { | |
| "epoch": 5.864909390444811, | |
| "grad_norm": 0.5171929946947682, | |
| "learning_rate": 3.128248506476246e-06, | |
| "loss": 0.1328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13196325302124023, | |
| "step": 3560, | |
| "valid_targets_mean": 5356.0, | |
| "valid_targets_min": 474 | |
| }, | |
| { | |
| "epoch": 5.873146622734761, | |
| "grad_norm": 0.496618184466259, | |
| "learning_rate": 3.0842746301324e-06, | |
| "loss": 0.1412, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15133965015411377, | |
| "step": 3565, | |
| "valid_targets_mean": 5518.6, | |
| "valid_targets_min": 1579 | |
| }, | |
| { | |
| "epoch": 5.881383855024712, | |
| "grad_norm": 0.47452647506786294, | |
| "learning_rate": 3.040586180097811e-06, | |
| "loss": 0.1399, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13840806484222412, | |
| "step": 3570, | |
| "valid_targets_mean": 5662.7, | |
| "valid_targets_min": 471 | |
| }, | |
| { | |
| "epoch": 5.889621087314662, | |
| "grad_norm": 0.45494258722616504, | |
| "learning_rate": 2.9971838935464915e-06, | |
| "loss": 0.1393, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1390582174062729, | |
| "step": 3575, | |
| "valid_targets_mean": 6040.2, | |
| "valid_targets_min": 501 | |
| }, | |
| { | |
| "epoch": 5.897858319604612, | |
| "grad_norm": 0.5316573344410656, | |
| "learning_rate": 2.9540685028238925e-06, | |
| "loss": 0.1369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1437787562608719, | |
| "step": 3580, | |
| "valid_targets_mean": 4216.3, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 5.9060955518945635, | |
| "grad_norm": 0.5021377198403308, | |
| "learning_rate": 2.9112407354345375e-06, | |
| "loss": 0.1376, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1306840479373932, | |
| "step": 3585, | |
| "valid_targets_mean": 4726.6, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 5.914332784184514, | |
| "grad_norm": 0.5214586963224782, | |
| "learning_rate": 2.8687013140297695e-06, | |
| "loss": 0.1319, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1259566992521286, | |
| "step": 3590, | |
| "valid_targets_mean": 6115.4, | |
| "valid_targets_min": 536 | |
| }, | |
| { | |
| "epoch": 5.922570016474465, | |
| "grad_norm": 0.5277563358185039, | |
| "learning_rate": 2.826450956395532e-06, | |
| "loss": 0.1401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1559084802865982, | |
| "step": 3595, | |
| "valid_targets_mean": 4852.1, | |
| "valid_targets_min": 348 | |
| }, | |
| { | |
| "epoch": 5.930807248764415, | |
| "grad_norm": 0.4598808489728509, | |
| "learning_rate": 2.7844903754402963e-06, | |
| "loss": 0.1487, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1452733725309372, | |
| "step": 3600, | |
| "valid_targets_mean": 5961.7, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 5.939044481054366, | |
| "grad_norm": 0.44420598869056455, | |
| "learning_rate": 2.7428202791829763e-06, | |
| "loss": 0.1403, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11765453219413757, | |
| "step": 3605, | |
| "valid_targets_mean": 5541.5, | |
| "valid_targets_min": 576 | |
| }, | |
| { | |
| "epoch": 5.947281713344316, | |
| "grad_norm": 0.5733465605447501, | |
| "learning_rate": 2.7014413707410357e-06, | |
| "loss": 0.145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13703860342502594, | |
| "step": 3610, | |
| "valid_targets_mean": 5567.2, | |
| "valid_targets_min": 639 | |
| }, | |
| { | |
| "epoch": 5.9555189456342665, | |
| "grad_norm": 0.4757594325785292, | |
| "learning_rate": 2.6603543483185966e-06, | |
| "loss": 0.1328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1293085515499115, | |
| "step": 3615, | |
| "valid_targets_mean": 5337.7, | |
| "valid_targets_min": 422 | |
| }, | |
| { | |
| "epoch": 5.963756177924218, | |
| "grad_norm": 0.5172925361057765, | |
| "learning_rate": 2.6195599051946465e-06, | |
| "loss": 0.1401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13227033615112305, | |
| "step": 3620, | |
| "valid_targets_mean": 4333.9, | |
| "valid_targets_min": 353 | |
| }, | |
| { | |
| "epoch": 5.971993410214168, | |
| "grad_norm": 0.4453302162097895, | |
| "learning_rate": 2.579058729711379e-06, | |
| "loss": 0.1346, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13876956701278687, | |
| "step": 3625, | |
| "valid_targets_mean": 5969.4, | |
| "valid_targets_min": 290 | |
| }, | |
| { | |
| "epoch": 5.980230642504119, | |
| "grad_norm": 0.5283908360552565, | |
| "learning_rate": 2.538851505262545e-06, | |
| "loss": 0.1337, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13446350395679474, | |
| "step": 3630, | |
| "valid_targets_mean": 4775.6, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 5.988467874794069, | |
| "grad_norm": 0.47399508195474715, | |
| "learning_rate": 2.498938910281927e-06, | |
| "loss": 0.1392, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13291141390800476, | |
| "step": 3635, | |
| "valid_targets_mean": 5101.6, | |
| "valid_targets_min": 298 | |
| }, | |
| { | |
| "epoch": 5.996705107084019, | |
| "grad_norm": 0.4700837808956898, | |
| "learning_rate": 2.4593216182319203e-06, | |
| "loss": 0.1315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13764753937721252, | |
| "step": 3640, | |
| "valid_targets_mean": 5740.4, | |
| "valid_targets_min": 820 | |
| }, | |
| { | |
| "epoch": 6.0049423393739705, | |
| "grad_norm": 0.4282286520739445, | |
| "learning_rate": 2.42000029759212e-06, | |
| "loss": 0.1384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13631629943847656, | |
| "step": 3645, | |
| "valid_targets_mean": 6149.7, | |
| "valid_targets_min": 582 | |
| }, | |
| { | |
| "epoch": 6.013179571663921, | |
| "grad_norm": 0.4663616514291284, | |
| "learning_rate": 2.3809756118480887e-06, | |
| "loss": 0.1323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12601891160011292, | |
| "step": 3650, | |
| "valid_targets_mean": 6049.3, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 6.021416803953872, | |
| "grad_norm": 0.4517526044118583, | |
| "learning_rate": 2.3422482194801257e-06, | |
| "loss": 0.1303, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11501075327396393, | |
| "step": 3655, | |
| "valid_targets_mean": 6240.2, | |
| "valid_targets_min": 242 | |
| }, | |
| { | |
| "epoch": 6.029654036243822, | |
| "grad_norm": 0.45773566091644025, | |
| "learning_rate": 2.30381877395218e-06, | |
| "loss": 0.1443, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14801597595214844, | |
| "step": 3660, | |
| "valid_targets_mean": 6157.1, | |
| "valid_targets_min": 442 | |
| }, | |
| { | |
| "epoch": 6.037891268533772, | |
| "grad_norm": 0.4657053658680516, | |
| "learning_rate": 2.265687923700821e-06, | |
| "loss": 0.1245, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11897988617420197, | |
| "step": 3665, | |
| "valid_targets_mean": 5020.0, | |
| "valid_targets_min": 279 | |
| }, | |
| { | |
| "epoch": 6.046128500823723, | |
| "grad_norm": 0.49670506821102967, | |
| "learning_rate": 2.2278563121242722e-06, | |
| "loss": 0.1307, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14285153150558472, | |
| "step": 3670, | |
| "valid_targets_mean": 6166.0, | |
| "valid_targets_min": 469 | |
| }, | |
| { | |
| "epoch": 6.0543657331136735, | |
| "grad_norm": 0.544682632896979, | |
| "learning_rate": 2.190324577571592e-06, | |
| "loss": 0.1349, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14309510588645935, | |
| "step": 3675, | |
| "valid_targets_mean": 4997.1, | |
| "valid_targets_min": 367 | |
| }, | |
| { | |
| "epoch": 6.062602965403625, | |
| "grad_norm": 0.46759468317431707, | |
| "learning_rate": 2.153093353331881e-06, | |
| "loss": 0.1325, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12411434948444366, | |
| "step": 3680, | |
| "valid_targets_mean": 5710.9, | |
| "valid_targets_min": 300 | |
| }, | |
| { | |
| "epoch": 6.070840197693575, | |
| "grad_norm": 0.4818458379154973, | |
| "learning_rate": 2.1161632676235967e-06, | |
| "loss": 0.1388, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1334155797958374, | |
| "step": 3685, | |
| "valid_targets_mean": 6071.1, | |
| "valid_targets_min": 3393 | |
| }, | |
| { | |
| "epoch": 6.079077429983526, | |
| "grad_norm": 0.4961800974677331, | |
| "learning_rate": 2.0795349435839606e-06, | |
| "loss": 0.1326, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14636382460594177, | |
| "step": 3690, | |
| "valid_targets_mean": 4735.6, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 6.087314662273476, | |
| "grad_norm": 0.47523702383058075, | |
| "learning_rate": 2.0432089992584326e-06, | |
| "loss": 0.13, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12490198016166687, | |
| "step": 3695, | |
| "valid_targets_mean": 6293.3, | |
| "valid_targets_min": 621 | |
| }, | |
| { | |
| "epoch": 6.095551894563426, | |
| "grad_norm": 0.4473024162840745, | |
| "learning_rate": 2.007186047590297e-06, | |
| "loss": 0.1314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13148343563079834, | |
| "step": 3700, | |
| "valid_targets_mean": 5969.4, | |
| "valid_targets_min": 332 | |
| }, | |
| { | |
| "epoch": 6.1037891268533775, | |
| "grad_norm": 0.4660267208167183, | |
| "learning_rate": 1.971466696410309e-06, | |
| "loss": 0.1277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.10753028839826584, | |
| "step": 3705, | |
| "valid_targets_mean": 5304.6, | |
| "valid_targets_min": 1081 | |
| }, | |
| { | |
| "epoch": 6.112026359143328, | |
| "grad_norm": 0.5422612688689189, | |
| "learning_rate": 1.936051548426443e-06, | |
| "loss": 0.134, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15001899003982544, | |
| "step": 3710, | |
| "valid_targets_mean": 5986.7, | |
| "valid_targets_min": 423 | |
| }, | |
| { | |
| "epoch": 6.120263591433279, | |
| "grad_norm": 0.5274266888800593, | |
| "learning_rate": 1.9009412012137218e-06, | |
| "loss": 0.1291, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13106083869934082, | |
| "step": 3715, | |
| "valid_targets_mean": 5678.3, | |
| "valid_targets_min": 406 | |
| }, | |
| { | |
| "epoch": 6.128500823723229, | |
| "grad_norm": 0.4576755101043015, | |
| "learning_rate": 1.8661362472041355e-06, | |
| "loss": 0.1309, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13249346613883972, | |
| "step": 3720, | |
| "valid_targets_mean": 5423.2, | |
| "valid_targets_min": 600 | |
| }, | |
| { | |
| "epoch": 6.136738056013179, | |
| "grad_norm": 0.5133862834647515, | |
| "learning_rate": 1.8316372736766452e-06, | |
| "loss": 0.1335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13682904839515686, | |
| "step": 3725, | |
| "valid_targets_mean": 5098.6, | |
| "valid_targets_min": 1764 | |
| }, | |
| { | |
| "epoch": 6.14497528830313, | |
| "grad_norm": 0.5069868484461164, | |
| "learning_rate": 1.7974448627472618e-06, | |
| "loss": 0.138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1550980806350708, | |
| "step": 3730, | |
| "valid_targets_mean": 5112.9, | |
| "valid_targets_min": 579 | |
| }, | |
| { | |
| "epoch": 6.1532125205930805, | |
| "grad_norm": 0.41399824699276144, | |
| "learning_rate": 1.763559591359243e-06, | |
| "loss": 0.133, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13211315870285034, | |
| "step": 3735, | |
| "valid_targets_mean": 7211.7, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 6.161449752883032, | |
| "grad_norm": 0.5310567988388422, | |
| "learning_rate": 1.7299820312733518e-06, | |
| "loss": 0.1394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1566026508808136, | |
| "step": 3740, | |
| "valid_targets_mean": 4989.5, | |
| "valid_targets_min": 359 | |
| }, | |
| { | |
| "epoch": 6.169686985172982, | |
| "grad_norm": 0.5124474888168294, | |
| "learning_rate": 1.6967127490581892e-06, | |
| "loss": 0.1379, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14107312262058258, | |
| "step": 3745, | |
| "valid_targets_mean": 5098.3, | |
| "valid_targets_min": 664 | |
| }, | |
| { | |
| "epoch": 6.177924217462932, | |
| "grad_norm": 0.4382769185305687, | |
| "learning_rate": 1.6637523060806749e-06, | |
| "loss": 0.1306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11474081128835678, | |
| "step": 3750, | |
| "valid_targets_mean": 6166.8, | |
| "valid_targets_min": 1918 | |
| }, | |
| { | |
| "epoch": 6.186161449752883, | |
| "grad_norm": 0.43472143369355076, | |
| "learning_rate": 1.6311012584965257e-06, | |
| "loss": 0.1399, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1271776556968689, | |
| "step": 3755, | |
| "valid_targets_mean": 6410.5, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 6.194398682042833, | |
| "grad_norm": 0.45645793927866285, | |
| "learning_rate": 1.5987601572409195e-06, | |
| "loss": 0.1275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13115490972995758, | |
| "step": 3760, | |
| "valid_targets_mean": 6528.2, | |
| "valid_targets_min": 460 | |
| }, | |
| { | |
| "epoch": 6.2026359143327845, | |
| "grad_norm": 0.5159376108603132, | |
| "learning_rate": 1.566729548019168e-06, | |
| "loss": 0.1378, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14629538357257843, | |
| "step": 3765, | |
| "valid_targets_mean": 4745.2, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 6.210873146622735, | |
| "grad_norm": 0.4815968005801268, | |
| "learning_rate": 1.5350099712975097e-06, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12448601424694061, | |
| "step": 3770, | |
| "valid_targets_mean": 5488.1, | |
| "valid_targets_min": 329 | |
| }, | |
| { | |
| "epoch": 6.219110378912685, | |
| "grad_norm": 0.4832899827342596, | |
| "learning_rate": 1.50360196229401e-06, | |
| "loss": 0.1378, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16062882542610168, | |
| "step": 3775, | |
| "valid_targets_mean": 5798.6, | |
| "valid_targets_min": 852 | |
| }, | |
| { | |
| "epoch": 6.227347611202636, | |
| "grad_norm": 0.4849698048363265, | |
| "learning_rate": 1.4725060509695088e-06, | |
| "loss": 0.1372, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13764359056949615, | |
| "step": 3780, | |
| "valid_targets_mean": 6223.4, | |
| "valid_targets_min": 346 | |
| }, | |
| { | |
| "epoch": 6.235584843492586, | |
| "grad_norm": 0.5424460364962002, | |
| "learning_rate": 1.4417227620186935e-06, | |
| "loss": 0.1387, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14788150787353516, | |
| "step": 3785, | |
| "valid_targets_mean": 4785.4, | |
| "valid_targets_min": 303 | |
| }, | |
| { | |
| "epoch": 6.243822075782537, | |
| "grad_norm": 0.46125398686289026, | |
| "learning_rate": 1.4112526148612381e-06, | |
| "loss": 0.1314, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1359138935804367, | |
| "step": 3790, | |
| "valid_targets_mean": 5888.5, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 6.2520593080724876, | |
| "grad_norm": 0.7772213081626521, | |
| "learning_rate": 1.3810961236330345e-06, | |
| "loss": 0.1396, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12322203814983368, | |
| "step": 3795, | |
| "valid_targets_mean": 4800.7, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 6.260296540362438, | |
| "grad_norm": 0.41414462497435917, | |
| "learning_rate": 1.3512537971775407e-06, | |
| "loss": 0.1358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13013538718223572, | |
| "step": 3800, | |
| "valid_targets_mean": 7567.3, | |
| "valid_targets_min": 1657 | |
| }, | |
| { | |
| "epoch": 6.268533772652389, | |
| "grad_norm": 0.4316675872766028, | |
| "learning_rate": 1.321726139037156e-06, | |
| "loss": 0.1324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12924161553382874, | |
| "step": 3805, | |
| "valid_targets_mean": 7021.1, | |
| "valid_targets_min": 704 | |
| }, | |
| { | |
| "epoch": 6.276771004942339, | |
| "grad_norm": 0.5125954945406457, | |
| "learning_rate": 1.2925136474447575e-06, | |
| "loss": 0.1359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13661538064479828, | |
| "step": 3810, | |
| "valid_targets_mean": 4975.6, | |
| "valid_targets_min": 1201 | |
| }, | |
| { | |
| "epoch": 6.28500823723229, | |
| "grad_norm": 0.49772458500840794, | |
| "learning_rate": 1.2636168153152828e-06, | |
| "loss": 0.1365, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13991546630859375, | |
| "step": 3815, | |
| "valid_targets_mean": 5599.8, | |
| "valid_targets_min": 472 | |
| }, | |
| { | |
| "epoch": 6.29324546952224, | |
| "grad_norm": 0.49945615845513786, | |
| "learning_rate": 1.2350361302374037e-06, | |
| "loss": 0.1372, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12564127147197723, | |
| "step": 3820, | |
| "valid_targets_mean": 4814.4, | |
| "valid_targets_min": 362 | |
| }, | |
| { | |
| "epoch": 6.3014827018121915, | |
| "grad_norm": 0.45827931098556673, | |
| "learning_rate": 1.2067720744653167e-06, | |
| "loss": 0.1315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13158249855041504, | |
| "step": 3825, | |
| "valid_targets_mean": 5550.9, | |
| "valid_targets_min": 333 | |
| }, | |
| { | |
| "epoch": 6.309719934102142, | |
| "grad_norm": 0.46714676648447434, | |
| "learning_rate": 1.178825124910583e-06, | |
| "loss": 0.1338, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1324673593044281, | |
| "step": 3830, | |
| "valid_targets_mean": 5685.1, | |
| "valid_targets_min": 269 | |
| }, | |
| { | |
| "epoch": 6.317957166392092, | |
| "grad_norm": 0.4896652597587443, | |
| "learning_rate": 1.1511957531341067e-06, | |
| "loss": 0.1374, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11876912415027618, | |
| "step": 3835, | |
| "valid_targets_mean": 5610.4, | |
| "valid_targets_min": 1521 | |
| }, | |
| { | |
| "epoch": 6.326194398682043, | |
| "grad_norm": 0.5371200728560797, | |
| "learning_rate": 1.123884425338162e-06, | |
| "loss": 0.1304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12302712351083755, | |
| "step": 3840, | |
| "valid_targets_mean": 4245.2, | |
| "valid_targets_min": 341 | |
| }, | |
| { | |
| "epoch": 6.334431630971993, | |
| "grad_norm": 0.6041879366529833, | |
| "learning_rate": 1.0968916023585252e-06, | |
| "loss": 0.142, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1337546855211258, | |
| "step": 3845, | |
| "valid_targets_mean": 5333.9, | |
| "valid_targets_min": 397 | |
| }, | |
| { | |
| "epoch": 6.342668863261944, | |
| "grad_norm": 0.7693854601953327, | |
| "learning_rate": 1.0702177396567114e-06, | |
| "loss": 0.1286, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13358263671398163, | |
| "step": 3850, | |
| "valid_targets_mean": 5477.5, | |
| "valid_targets_min": 370 | |
| }, | |
| { | |
| "epoch": 6.350906095551895, | |
| "grad_norm": 0.5217168667394281, | |
| "learning_rate": 1.0438632873122746e-06, | |
| "loss": 0.1348, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13919466733932495, | |
| "step": 3855, | |
| "valid_targets_mean": 4802.3, | |
| "valid_targets_min": 278 | |
| }, | |
| { | |
| "epoch": 6.359143327841845, | |
| "grad_norm": 0.49627335521623195, | |
| "learning_rate": 1.0178286900152234e-06, | |
| "loss": 0.1347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1277734637260437, | |
| "step": 3860, | |
| "valid_targets_mean": 5456.2, | |
| "valid_targets_min": 347 | |
| }, | |
| { | |
| "epoch": 6.367380560131796, | |
| "grad_norm": 0.4640966805166447, | |
| "learning_rate": 9.921143870585204e-07, | |
| "loss": 0.1339, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1288243532180786, | |
| "step": 3865, | |
| "valid_targets_mean": 7001.1, | |
| "valid_targets_min": 3741 | |
| }, | |
| { | |
| "epoch": 6.375617792421746, | |
| "grad_norm": 0.4493431529213666, | |
| "learning_rate": 9.667208123306504e-07, | |
| "loss": 0.1308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12215395271778107, | |
| "step": 3870, | |
| "valid_targets_mean": 6469.6, | |
| "valid_targets_min": 409 | |
| }, | |
| { | |
| "epoch": 6.383855024711697, | |
| "grad_norm": 0.5223283250804539, | |
| "learning_rate": 9.416483943083254e-07, | |
| "loss": 0.1385, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14019393920898438, | |
| "step": 3875, | |
| "valid_targets_mean": 4799.6, | |
| "valid_targets_min": 356 | |
| }, | |
| { | |
| "epoch": 6.392092257001647, | |
| "grad_norm": 0.5839320585301707, | |
| "learning_rate": 9.168975560492365e-07, | |
| "loss": 0.137, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1326146423816681, | |
| "step": 3880, | |
| "valid_targets_mean": 6029.2, | |
| "valid_targets_min": 468 | |
| }, | |
| { | |
| "epoch": 6.400329489291598, | |
| "grad_norm": 0.45084115574340156, | |
| "learning_rate": 8.924687151849287e-07, | |
| "loss": 0.1437, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13785238564014435, | |
| "step": 3885, | |
| "valid_targets_mean": 5878.5, | |
| "valid_targets_min": 2455 | |
| }, | |
| { | |
| "epoch": 6.408566721581549, | |
| "grad_norm": 0.5297128307959439, | |
| "learning_rate": 8.683622839137306e-07, | |
| "loss": 0.1355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12668374180793762, | |
| "step": 3890, | |
| "valid_targets_mean": 5506.2, | |
| "valid_targets_min": 2193 | |
| }, | |
| { | |
| "epoch": 6.416803953871499, | |
| "grad_norm": 0.48322783971651356, | |
| "learning_rate": 8.445786689938295e-07, | |
| "loss": 0.1346, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1286458671092987, | |
| "step": 3895, | |
| "valid_targets_mean": 5458.9, | |
| "valid_targets_min": 301 | |
| }, | |
| { | |
| "epoch": 6.42504118616145, | |
| "grad_norm": 0.48395434493993517, | |
| "learning_rate": 8.211182717363897e-07, | |
| "loss": 0.1432, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15668228268623352, | |
| "step": 3900, | |
| "valid_targets_mean": 5767.8, | |
| "valid_targets_min": 3561 | |
| }, | |
| { | |
| "epoch": 6.4332784184514, | |
| "grad_norm": 0.4433161669875194, | |
| "learning_rate": 7.979814879987824e-07, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.138276606798172, | |
| "step": 3905, | |
| "valid_targets_mean": 5992.8, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 6.441515650741351, | |
| "grad_norm": 0.45862585322452476, | |
| "learning_rate": 7.751687081779091e-07, | |
| "loss": 0.1322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1268194019794464, | |
| "step": 3910, | |
| "valid_targets_mean": 5809.1, | |
| "valid_targets_min": 304 | |
| }, | |
| { | |
| "epoch": 6.449752883031302, | |
| "grad_norm": 0.5178320041186394, | |
| "learning_rate": 7.526803172036157e-07, | |
| "loss": 0.1342, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1460873931646347, | |
| "step": 3915, | |
| "valid_targets_mean": 5288.8, | |
| "valid_targets_min": 373 | |
| }, | |
| { | |
| "epoch": 6.457990115321252, | |
| "grad_norm": 0.5881511480130761, | |
| "learning_rate": 7.305166945321973e-07, | |
| "loss": 0.1284, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14629772305488586, | |
| "step": 3920, | |
| "valid_targets_mean": 5360.9, | |
| "valid_targets_min": 281 | |
| }, | |
| { | |
| "epoch": 6.466227347611203, | |
| "grad_norm": 0.45789052921456197, | |
| "learning_rate": 7.086782141399906e-07, | |
| "loss": 0.1328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13530060648918152, | |
| "step": 3925, | |
| "valid_targets_mean": 6599.0, | |
| "valid_targets_min": 295 | |
| }, | |
| { | |
| "epoch": 6.474464579901153, | |
| "grad_norm": 0.5108194896271699, | |
| "learning_rate": 6.871652445170651e-07, | |
| "loss": 0.1324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13670974969863892, | |
| "step": 3930, | |
| "valid_targets_mean": 5607.4, | |
| "valid_targets_min": 284 | |
| }, | |
| { | |
| "epoch": 6.482701812191104, | |
| "grad_norm": 0.4986363661068449, | |
| "learning_rate": 6.659781486610129e-07, | |
| "loss": 0.1316, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14160466194152832, | |
| "step": 3935, | |
| "valid_targets_mean": 5862.1, | |
| "valid_targets_min": 336 | |
| }, | |
| { | |
| "epoch": 6.490939044481054, | |
| "grad_norm": 0.486747793891971, | |
| "learning_rate": 6.451172840708087e-07, | |
| "loss": 0.1327, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1396481692790985, | |
| "step": 3940, | |
| "valid_targets_mean": 5292.6, | |
| "valid_targets_min": 746 | |
| }, | |
| { | |
| "epoch": 6.499176276771005, | |
| "grad_norm": 0.47788821872124543, | |
| "learning_rate": 6.245830027407973e-07, | |
| "loss": 0.1271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12838926911354065, | |
| "step": 3945, | |
| "valid_targets_mean": 5533.8, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 6.507413509060956, | |
| "grad_norm": 0.49062328683604933, | |
| "learning_rate": 6.043756511547427e-07, | |
| "loss": 0.1405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13557396829128265, | |
| "step": 3950, | |
| "valid_targets_mean": 4935.4, | |
| "valid_targets_min": 305 | |
| }, | |
| { | |
| "epoch": 6.515650741350906, | |
| "grad_norm": 0.6375889676833415, | |
| "learning_rate": 5.84495570279966e-07, | |
| "loss": 0.1413, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14044895768165588, | |
| "step": 3955, | |
| "valid_targets_mean": 3416.4, | |
| "valid_targets_min": 342 | |
| }, | |
| { | |
| "epoch": 6.523887973640857, | |
| "grad_norm": 0.430404832431965, | |
| "learning_rate": 5.649430955616365e-07, | |
| "loss": 0.127, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.117364302277565, | |
| "step": 3960, | |
| "valid_targets_mean": 6041.8, | |
| "valid_targets_min": 1574 | |
| }, | |
| { | |
| "epoch": 6.532125205930807, | |
| "grad_norm": 0.6302140342224846, | |
| "learning_rate": 5.457185569170609e-07, | |
| "loss": 0.1279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14673450589179993, | |
| "step": 3965, | |
| "valid_targets_mean": 5401.9, | |
| "valid_targets_min": 533 | |
| }, | |
| { | |
| "epoch": 6.5403624382207575, | |
| "grad_norm": 0.5003641376463149, | |
| "learning_rate": 5.268222787301502e-07, | |
| "loss": 0.1377, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1596682220697403, | |
| "step": 3970, | |
| "valid_targets_mean": 6112.6, | |
| "valid_targets_min": 258 | |
| }, | |
| { | |
| "epoch": 6.548599670510709, | |
| "grad_norm": 0.5506918588497802, | |
| "learning_rate": 5.082545798459348e-07, | |
| "loss": 0.131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12708808481693268, | |
| "step": 3975, | |
| "valid_targets_mean": 4950.1, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 6.556836902800659, | |
| "grad_norm": 0.5124195626339795, | |
| "learning_rate": 4.900157735651779e-07, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1341022253036499, | |
| "step": 3980, | |
| "valid_targets_mean": 4943.3, | |
| "valid_targets_min": 388 | |
| }, | |
| { | |
| "epoch": 6.56507413509061, | |
| "grad_norm": 0.5172186903087186, | |
| "learning_rate": 4.721061676391081e-07, | |
| "loss": 0.1292, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14156344532966614, | |
| "step": 3985, | |
| "valid_targets_mean": 5344.0, | |
| "valid_targets_min": 1391 | |
| }, | |
| { | |
| "epoch": 6.57331136738056, | |
| "grad_norm": 0.4351269973360783, | |
| "learning_rate": 4.5452606426420466e-07, | |
| "loss": 0.1247, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13022242486476898, | |
| "step": 3990, | |
| "valid_targets_mean": 5943.5, | |
| "valid_targets_min": 1326 | |
| }, | |
| { | |
| "epoch": 6.58154859967051, | |
| "grad_norm": 0.5132235188431938, | |
| "learning_rate": 4.372757600771116e-07, | |
| "loss": 0.1364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12379980832338333, | |
| "step": 3995, | |
| "valid_targets_mean": 4823.9, | |
| "valid_targets_min": 292 | |
| }, | |
| { | |
| "epoch": 6.589785831960461, | |
| "grad_norm": 0.5489329155887852, | |
| "learning_rate": 4.203555461496378e-07, | |
| "loss": 0.1562, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14551113545894623, | |
| "step": 4000, | |
| "valid_targets_mean": 4537.8, | |
| "valid_targets_min": 440 | |
| }, | |
| { | |
| "epoch": 6.598023064250412, | |
| "grad_norm": 0.47238912777246345, | |
| "learning_rate": 4.0376570798382084e-07, | |
| "loss": 0.1356, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11419009417295456, | |
| "step": 4005, | |
| "valid_targets_mean": 5518.3, | |
| "valid_targets_min": 1469 | |
| }, | |
| { | |
| "epoch": 6.606260296540363, | |
| "grad_norm": 0.5439108468408408, | |
| "learning_rate": 3.875065255071442e-07, | |
| "loss": 0.1319, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13088250160217285, | |
| "step": 4010, | |
| "valid_targets_mean": 5277.8, | |
| "valid_targets_min": 777 | |
| }, | |
| { | |
| "epoch": 6.614497528830313, | |
| "grad_norm": 0.44567096265885786, | |
| "learning_rate": 3.715782730677786e-07, | |
| "loss": 0.1358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12983649969100952, | |
| "step": 4015, | |
| "valid_targets_mean": 5881.4, | |
| "valid_targets_min": 366 | |
| }, | |
| { | |
| "epoch": 6.622734761120263, | |
| "grad_norm": 0.4941823682417998, | |
| "learning_rate": 3.559812194299794e-07, | |
| "loss": 0.1389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13084876537322998, | |
| "step": 4020, | |
| "valid_targets_mean": 5684.1, | |
| "valid_targets_min": 2730 | |
| }, | |
| { | |
| "epoch": 6.630971993410214, | |
| "grad_norm": 0.4513343311045982, | |
| "learning_rate": 3.407156277695456e-07, | |
| "loss": 0.1287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.136585995554924, | |
| "step": 4025, | |
| "valid_targets_mean": 6158.8, | |
| "valid_targets_min": 1771 | |
| }, | |
| { | |
| "epoch": 6.6392092257001645, | |
| "grad_norm": 0.4841668577651671, | |
| "learning_rate": 3.257817556693654e-07, | |
| "loss": 0.1365, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13883385062217712, | |
| "step": 4030, | |
| "valid_targets_mean": 5912.1, | |
| "valid_targets_min": 432 | |
| }, | |
| { | |
| "epoch": 6.647446457990116, | |
| "grad_norm": 0.5305211565307272, | |
| "learning_rate": 3.1117985511508686e-07, | |
| "loss": 0.1397, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13103139400482178, | |
| "step": 4035, | |
| "valid_targets_mean": 5152.6, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 6.655683690280066, | |
| "grad_norm": 0.4580349337969464, | |
| "learning_rate": 2.969101724908585e-07, | |
| "loss": 0.1335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13851921260356903, | |
| "step": 4040, | |
| "valid_targets_mean": 6540.2, | |
| "valid_targets_min": 2527 | |
| }, | |
| { | |
| "epoch": 6.663920922570016, | |
| "grad_norm": 0.5059365029484628, | |
| "learning_rate": 2.829729485751731e-07, | |
| "loss": 0.1373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13825827836990356, | |
| "step": 4045, | |
| "valid_targets_mean": 5032.9, | |
| "valid_targets_min": 414 | |
| }, | |
| { | |
| "epoch": 6.672158154859967, | |
| "grad_norm": 0.5024959144928178, | |
| "learning_rate": 2.6936841853680397e-07, | |
| "loss": 0.1331, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13369962573051453, | |
| "step": 4050, | |
| "valid_targets_mean": 5008.1, | |
| "valid_targets_min": 327 | |
| }, | |
| { | |
| "epoch": 6.680395387149917, | |
| "grad_norm": 0.6871281294342154, | |
| "learning_rate": 2.560968119308349e-07, | |
| "loss": 0.1411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14997810125350952, | |
| "step": 4055, | |
| "valid_targets_mean": 5776.6, | |
| "valid_targets_min": 353 | |
| }, | |
| { | |
| "epoch": 6.688632619439868, | |
| "grad_norm": 0.4735902654948438, | |
| "learning_rate": 2.4315835269479446e-07, | |
| "loss": 0.1306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12766751646995544, | |
| "step": 4060, | |
| "valid_targets_mean": 5684.6, | |
| "valid_targets_min": 294 | |
| }, | |
| { | |
| "epoch": 6.696869851729819, | |
| "grad_norm": 1.2645434899601486, | |
| "learning_rate": 2.305532591448678e-07, | |
| "loss": 0.1405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1371457278728485, | |
| "step": 4065, | |
| "valid_targets_mean": 6512.0, | |
| "valid_targets_min": 345 | |
| }, | |
| { | |
| "epoch": 6.70510708401977, | |
| "grad_norm": 0.5114470671638929, | |
| "learning_rate": 2.1828174397222179e-07, | |
| "loss": 0.1367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15383315086364746, | |
| "step": 4070, | |
| "valid_targets_mean": 5463.8, | |
| "valid_targets_min": 549 | |
| }, | |
| { | |
| "epoch": 6.71334431630972, | |
| "grad_norm": 0.5588565234687837, | |
| "learning_rate": 2.0634401423940797e-07, | |
| "loss": 0.139, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12472236156463623, | |
| "step": 4075, | |
| "valid_targets_mean": 6355.6, | |
| "valid_targets_min": 291 | |
| }, | |
| { | |
| "epoch": 6.72158154859967, | |
| "grad_norm": 0.47512870482477276, | |
| "learning_rate": 1.9474027137687425e-07, | |
| "loss": 0.124, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1210930347442627, | |
| "step": 4080, | |
| "valid_targets_mean": 5650.8, | |
| "valid_targets_min": 777 | |
| }, | |
| { | |
| "epoch": 6.729818780889621, | |
| "grad_norm": 0.5516275865205322, | |
| "learning_rate": 1.834707111795675e-07, | |
| "loss": 0.1305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12554940581321716, | |
| "step": 4085, | |
| "valid_targets_mean": 6649.1, | |
| "valid_targets_min": 343 | |
| }, | |
| { | |
| "epoch": 6.7380560131795715, | |
| "grad_norm": 0.48757816526391146, | |
| "learning_rate": 1.725355238036208e-07, | |
| "loss": 0.131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13415946066379547, | |
| "step": 4090, | |
| "valid_targets_mean": 5752.8, | |
| "valid_targets_min": 830 | |
| }, | |
| { | |
| "epoch": 6.746293245469523, | |
| "grad_norm": 0.4747972140397399, | |
| "learning_rate": 1.6193489376315375e-07, | |
| "loss": 0.1395, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14257818460464478, | |
| "step": 4095, | |
| "valid_targets_mean": 5895.2, | |
| "valid_targets_min": 1811 | |
| }, | |
| { | |
| "epoch": 6.754530477759473, | |
| "grad_norm": 0.45751329514365424, | |
| "learning_rate": 1.5166899992715922e-07, | |
| "loss": 0.129, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13291612267494202, | |
| "step": 4100, | |
| "valid_targets_mean": 6247.6, | |
| "valid_targets_min": 323 | |
| }, | |
| { | |
| "epoch": 6.762767710049423, | |
| "grad_norm": 0.48075546248084383, | |
| "learning_rate": 1.417380155164727e-07, | |
| "loss": 0.1358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1332646906375885, | |
| "step": 4105, | |
| "valid_targets_mean": 5221.9, | |
| "valid_targets_min": 738 | |
| }, | |
| { | |
| "epoch": 6.771004942339374, | |
| "grad_norm": 0.5601515010199739, | |
| "learning_rate": 1.3214210810087224e-07, | |
| "loss": 0.1434, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13819454610347748, | |
| "step": 4110, | |
| "valid_targets_mean": 4406.5, | |
| "valid_targets_min": 419 | |
| }, | |
| { | |
| "epoch": 6.779242174629324, | |
| "grad_norm": 0.6961080541032358, | |
| "learning_rate": 1.2288143959622523e-07, | |
| "loss": 0.1461, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13622626662254333, | |
| "step": 4115, | |
| "valid_targets_mean": 3236.4, | |
| "valid_targets_min": 346 | |
| }, | |
| { | |
| "epoch": 6.787479406919275, | |
| "grad_norm": 0.45949694176316913, | |
| "learning_rate": 1.1395616626177718e-07, | |
| "loss": 0.137, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1468203365802765, | |
| "step": 4120, | |
| "valid_targets_mean": 6123.7, | |
| "valid_targets_min": 744 | |
| }, | |
| { | |
| "epoch": 6.795716639209226, | |
| "grad_norm": 0.4805711737263706, | |
| "learning_rate": 1.0536643869750508e-07, | |
| "loss": 0.1334, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14250558614730835, | |
| "step": 4125, | |
| "valid_targets_mean": 5812.6, | |
| "valid_targets_min": 293 | |
| }, | |
| { | |
| "epoch": 6.803953871499177, | |
| "grad_norm": 0.46887014456462534, | |
| "learning_rate": 9.711240184157255e-08, | |
| "loss": 0.1302, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13092397153377533, | |
| "step": 4130, | |
| "valid_targets_mean": 5349.0, | |
| "valid_targets_min": 250 | |
| }, | |
| { | |
| "epoch": 6.812191103789127, | |
| "grad_norm": 0.5313920324876659, | |
| "learning_rate": 8.919419496789428e-08, | |
| "loss": 0.1332, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1342686414718628, | |
| "step": 4135, | |
| "valid_targets_mean": 4346.2, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 6.820428336079077, | |
| "grad_norm": 0.4859272674478183, | |
| "learning_rate": 8.161195168378211e-08, | |
| "loss": 0.1323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.11980287730693817, | |
| "step": 4140, | |
| "valid_targets_mean": 5261.4, | |
| "valid_targets_min": 352 | |
| }, | |
| { | |
| "epoch": 6.828665568369028, | |
| "grad_norm": 0.4423077861344137, | |
| "learning_rate": 7.4365799927687e-08, | |
| "loss": 0.1452, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13322511315345764, | |
| "step": 4145, | |
| "valid_targets_mean": 6199.2, | |
| "valid_targets_min": 392 | |
| }, | |
| { | |
| "epoch": 6.8369028006589785, | |
| "grad_norm": 0.44086143214512336, | |
| "learning_rate": 6.745586196704512e-08, | |
| "loss": 0.1336, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12674112617969513, | |
| "step": 4150, | |
| "valid_targets_mean": 6344.1, | |
| "valid_targets_min": 2470 | |
| }, | |
| { | |
| "epoch": 6.84514003294893, | |
| "grad_norm": 0.4598893989182098, | |
| "learning_rate": 6.088225439620621e-08, | |
| "loss": 0.1391, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12376689910888672, | |
| "step": 4155, | |
| "valid_targets_mean": 5618.9, | |
| "valid_targets_min": 375 | |
| }, | |
| { | |
| "epoch": 6.85337726523888, | |
| "grad_norm": 0.4387084582808184, | |
| "learning_rate": 5.464508813448621e-08, | |
| "loss": 0.1258, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13196054100990295, | |
| "step": 4160, | |
| "valid_targets_mean": 6483.1, | |
| "valid_targets_min": 1453 | |
| }, | |
| { | |
| "epoch": 6.86161449752883, | |
| "grad_norm": 0.5826307617034738, | |
| "learning_rate": 4.874446842426883e-08, | |
| "loss": 0.1393, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13274163007736206, | |
| "step": 4165, | |
| "valid_targets_mean": 3858.2, | |
| "valid_targets_min": 265 | |
| }, | |
| { | |
| "epoch": 6.869851729818781, | |
| "grad_norm": 0.4902863000025557, | |
| "learning_rate": 4.318049482924913e-08, | |
| "loss": 0.1372, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14127911627292633, | |
| "step": 4170, | |
| "valid_targets_mean": 5120.2, | |
| "valid_targets_min": 365 | |
| }, | |
| { | |
| "epoch": 6.878088962108731, | |
| "grad_norm": 0.4716241280190188, | |
| "learning_rate": 3.7953261232743785e-08, | |
| "loss": 0.1354, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13096028566360474, | |
| "step": 4175, | |
| "valid_targets_mean": 5707.4, | |
| "valid_targets_min": 2026 | |
| }, | |
| { | |
| "epoch": 6.886326194398682, | |
| "grad_norm": 0.48274750236462316, | |
| "learning_rate": 3.3062855836119014e-08, | |
| "loss": 0.1297, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12582214176654816, | |
| "step": 4180, | |
| "valid_targets_mean": 6500.5, | |
| "valid_targets_min": 304 | |
| }, | |
| { | |
| "epoch": 6.894563426688633, | |
| "grad_norm": 0.520762828050495, | |
| "learning_rate": 2.8509361157285088e-08, | |
| "loss": 0.1249, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12486450374126434, | |
| "step": 4185, | |
| "valid_targets_mean": 5814.1, | |
| "valid_targets_min": 315 | |
| }, | |
| { | |
| "epoch": 6.902800658978583, | |
| "grad_norm": 0.5051747976357015, | |
| "learning_rate": 2.429285402931525e-08, | |
| "loss": 0.138, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13390396535396576, | |
| "step": 4190, | |
| "valid_targets_mean": 4789.9, | |
| "valid_targets_min": 299 | |
| }, | |
| { | |
| "epoch": 6.911037891268534, | |
| "grad_norm": 0.46815700178456404, | |
| "learning_rate": 2.041340559914895e-08, | |
| "loss": 0.1377, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1404329240322113, | |
| "step": 4195, | |
| "valid_targets_mean": 6338.1, | |
| "valid_targets_min": 429 | |
| }, | |
| { | |
| "epoch": 6.919275123558484, | |
| "grad_norm": 0.4512663494555607, | |
| "learning_rate": 1.6871081326388372e-08, | |
| "loss": 0.1353, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12632620334625244, | |
| "step": 4200, | |
| "valid_targets_mean": 5904.5, | |
| "valid_targets_min": 270 | |
| }, | |
| { | |
| "epoch": 6.927512355848435, | |
| "grad_norm": 0.5289087445432837, | |
| "learning_rate": 1.366594098219487e-08, | |
| "loss": 0.1367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1431260108947754, | |
| "step": 4205, | |
| "valid_targets_mean": 4917.8, | |
| "valid_targets_min": 402 | |
| }, | |
| { | |
| "epoch": 6.9357495881383855, | |
| "grad_norm": 0.576493514416454, | |
| "learning_rate": 1.0798038648278664e-08, | |
| "loss": 0.1303, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1551305204629898, | |
| "step": 4210, | |
| "valid_targets_mean": 4842.8, | |
| "valid_targets_min": 307 | |
| }, | |
| { | |
| "epoch": 6.943986820428336, | |
| "grad_norm": 0.48286976995059333, | |
| "learning_rate": 8.26742271598846e-09, | |
| "loss": 0.1321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1334967464208603, | |
| "step": 4215, | |
| "valid_targets_mean": 5332.4, | |
| "valid_targets_min": 266 | |
| }, | |
| { | |
| "epoch": 6.952224052718287, | |
| "grad_norm": 0.4832915302519884, | |
| "learning_rate": 6.074135885498767e-09, | |
| "loss": 0.1345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13673850893974304, | |
| "step": 4220, | |
| "valid_targets_mean": 5995.8, | |
| "valid_targets_min": 1239 | |
| }, | |
| { | |
| "epoch": 6.960461285008237, | |
| "grad_norm": 0.4966652005135941, | |
| "learning_rate": 4.218215165077144e-09, | |
| "loss": 0.1357, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13495373725891113, | |
| "step": 4225, | |
| "valid_targets_mean": 5102.9, | |
| "valid_targets_min": 350 | |
| }, | |
| { | |
| "epoch": 6.968698517298188, | |
| "grad_norm": 0.437645161494385, | |
| "learning_rate": 2.699691870473586e-09, | |
| "loss": 0.131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.12133049219846725, | |
| "step": 4230, | |
| "valid_targets_mean": 6069.5, | |
| "valid_targets_min": 988 | |
| }, | |
| { | |
| "epoch": 6.976935749588138, | |
| "grad_norm": 0.43585901256643306, | |
| "learning_rate": 1.5185916243876108e-09, | |
| "loss": 0.1461, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14331373572349548, | |
| "step": 4235, | |
| "valid_targets_mean": 6261.4, | |
| "valid_targets_min": 1324 | |
| }, | |
| { | |
| "epoch": 6.9851729818780885, | |
| "grad_norm": 0.49008244310573496, | |
| "learning_rate": 6.749343560286114e-10, | |
| "loss": 0.1387, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14540137350559235, | |
| "step": 4240, | |
| "valid_targets_mean": 5976.2, | |
| "valid_targets_min": 1274 | |
| }, | |
| { | |
| "epoch": 6.99341021416804, | |
| "grad_norm": 0.5246405834495542, | |
| "learning_rate": 1.6873430078945264e-10, | |
| "loss": 0.1408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13585354387760162, | |
| "step": 4245, | |
| "valid_targets_mean": 5064.4, | |
| "valid_targets_min": 832 | |
| }, | |
| { | |
| "epoch": 7.0, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1299251914024353, | |
| "step": 4249, | |
| "total_flos": 1654874798292992.0, | |
| "train_loss": 0.18129448925475283, | |
| "train_runtime": 26249.1152, | |
| "train_samples_per_second": 2.586, | |
| "train_steps_per_second": 0.162, | |
| "valid_targets_mean": 5076.1, | |
| "valid_targets_min": 783 | |
| } | |
| ], | |
| "logging_steps": 5, | |
| "max_steps": 4249, | |
| "num_input_tokens_seen": 0, | |
| "num_train_epochs": 7, | |
| "save_steps": 1500, | |
| "stateful_callbacks": { | |
| "TrainerControl": { | |
| "args": { | |
| "should_epoch_stop": false, | |
| "should_evaluate": false, | |
| "should_log": false, | |
| "should_save": true, | |
| "should_training_stop": true | |
| }, | |
| "attributes": {} | |
| } | |
| }, | |
| "total_flos": 1654874798292992.0, | |
| "train_batch_size": 1, | |
| "trial_name": null, | |
| "trial_params": null | |
| } | |