Text Generation
Transformers
Safetensors
qwen3
llama-factory
full
Generated from Trainer
conversational
text-generation-inference
Instructions to use DCAgent/a1-issue_tasks with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DCAgent/a1-issue_tasks with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DCAgent/a1-issue_tasks")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DCAgent/a1-issue_tasks")
model = AutoModelForCausalLM.from_pretrained("DCAgent/a1-issue_tasks")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use DCAgent/a1-issue_tasks with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "DCAgent/a1-issue_tasks"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-issue_tasks",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/DCAgent/a1-issue_tasks
- SGLang
How to use DCAgent/a1-issue_tasks with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "DCAgent/a1-issue_tasks" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-issue_tasks",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "DCAgent/a1-issue_tasks" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-issue_tasks",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use DCAgent/a1-issue_tasks with Docker Model Runner:
docker model run hf.co/DCAgent/a1-issue_tasks
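The curl calls shown for the vLLM and SGLang servers can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and it assumes the server replies in the usual OpenAI chat-completions layout (`choices[0].message.content`), which the page implies by calling the API OpenAI-compatible but does not show.

```python
import json
import urllib.request

MODEL_ID = "DCAgent/a1-issue_tasks"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build the same POST request that the curl examples send.

    base_url defaults to the vLLM port from the page (8000);
    pass "http://localhost:30000" for the SGLang server.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000", timeout=60):
    """Send the prompt to a running server and return the reply text."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response body, not confirmed by the page.
    return body["choices"][0]["message"]["content"]
```

With a server already running, `ask("What is the capital of France?")` would target vLLM, and `ask("What is the capital of France?", base_url="http://localhost:30000")` would target SGLang. Keeping request construction separate from sending makes the payload easy to inspect without a live server.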
| { | |
| "best_global_step": null, | |
| "best_metric": null, | |
| "best_model_checkpoint": null, | |
| "epoch": 7.0, | |
| "eval_steps": 500, | |
| "global_step": 4354, | |
| "is_hyper_param_search": false, | |
| "is_local_process_zero": true, | |
| "is_world_process_zero": true, | |
| "log_history": [ | |
| { | |
| "epoch": 0.008038585209003215, | |
| "grad_norm": 15.487869480598855, | |
| "learning_rate": 3.6697247706422023e-07, | |
| "loss": 0.6902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7185355424880981, | |
| "step": 5, | |
| "valid_targets_mean": 6218.2, | |
| "valid_targets_min": 1794 | |
| }, | |
| { | |
| "epoch": 0.01607717041800643, | |
| "grad_norm": 16.705030639862873, | |
| "learning_rate": 8.256880733944956e-07, | |
| "loss": 0.6933, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7040611505508423, | |
| "step": 10, | |
| "valid_targets_mean": 5971.0, | |
| "valid_targets_min": 1288 | |
| }, | |
| { | |
| "epoch": 0.024115755627009645, | |
| "grad_norm": 12.454754329110074, | |
| "learning_rate": 1.2844036697247707e-06, | |
| "loss": 0.6518, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5946602821350098, | |
| "step": 15, | |
| "valid_targets_mean": 5925.1, | |
| "valid_targets_min": 2579 | |
| }, | |
| { | |
| "epoch": 0.03215434083601286, | |
| "grad_norm": 11.220424247643189, | |
| "learning_rate": 1.743119266055046e-06, | |
| "loss": 0.6289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5990424156188965, | |
| "step": 20, | |
| "valid_targets_mean": 4584.0, | |
| "valid_targets_min": 1532 | |
| }, | |
| { | |
| "epoch": 0.04019292604501608, | |
| "grad_norm": 6.228683067868804, | |
| "learning_rate": 2.2018348623853215e-06, | |
| "loss": 0.6005, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5438046455383301, | |
| "step": 25, | |
| "valid_targets_mean": 5913.2, | |
| "valid_targets_min": 1007 | |
| }, | |
| { | |
| "epoch": 0.04823151125401929, | |
| "grad_norm": 3.705703138387386, | |
| "learning_rate": 2.6605504587155968e-06, | |
| "loss": 0.5547, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4752989411354065, | |
| "step": 30, | |
| "valid_targets_mean": 5618.4, | |
| "valid_targets_min": 2547 | |
| }, | |
| { | |
| "epoch": 0.05627009646302251, | |
| "grad_norm": 2.2930526520291594, | |
| "learning_rate": 3.119266055045872e-06, | |
| "loss": 0.5046, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4889165163040161, | |
| "step": 35, | |
| "valid_targets_mean": 5507.8, | |
| "valid_targets_min": 1890 | |
| }, | |
| { | |
| "epoch": 0.06430868167202572, | |
| "grad_norm": 1.3788680494230385, | |
| "learning_rate": 3.5779816513761473e-06, | |
| "loss": 0.4708, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4641118049621582, | |
| "step": 40, | |
| "valid_targets_mean": 5501.1, | |
| "valid_targets_min": 2317 | |
| }, | |
| { | |
| "epoch": 0.07234726688102894, | |
| "grad_norm": 1.0707354953475499, | |
| "learning_rate": 4.036697247706423e-06, | |
| "loss": 0.4526, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.44059669971466064, | |
| "step": 45, | |
| "valid_targets_mean": 5165.7, | |
| "valid_targets_min": 2549 | |
| }, | |
| { | |
| "epoch": 0.08038585209003216, | |
| "grad_norm": 1.0042764156736175, | |
| "learning_rate": 4.4954128440366975e-06, | |
| "loss": 0.4748, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4787266254425049, | |
| "step": 50, | |
| "valid_targets_mean": 5976.1, | |
| "valid_targets_min": 2505 | |
| }, | |
| { | |
| "epoch": 0.08842443729903537, | |
| "grad_norm": 0.8578820860325631, | |
| "learning_rate": 4.954128440366973e-06, | |
| "loss": 0.4525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5156609416007996, | |
| "step": 55, | |
| "valid_targets_mean": 5803.9, | |
| "valid_targets_min": 707 | |
| }, | |
| { | |
| "epoch": 0.09646302250803858, | |
| "grad_norm": 0.7054778587694457, | |
| "learning_rate": 5.412844036697248e-06, | |
| "loss": 0.4304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3835570216178894, | |
| "step": 60, | |
| "valid_targets_mean": 4718.1, | |
| "valid_targets_min": 368 | |
| }, | |
| { | |
| "epoch": 0.1045016077170418, | |
| "grad_norm": 0.5921455336434676, | |
| "learning_rate": 5.871559633027524e-06, | |
| "loss": 0.4285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.42219042778015137, | |
| "step": 65, | |
| "valid_targets_mean": 5650.0, | |
| "valid_targets_min": 1775 | |
| }, | |
| { | |
| "epoch": 0.11254019292604502, | |
| "grad_norm": 0.6432811667585792, | |
| "learning_rate": 6.330275229357799e-06, | |
| "loss": 0.4026, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4135628044605255, | |
| "step": 70, | |
| "valid_targets_mean": 4397.6, | |
| "valid_targets_min": 918 | |
| }, | |
| { | |
| "epoch": 0.12057877813504823, | |
| "grad_norm": 0.5954192316469062, | |
| "learning_rate": 6.788990825688074e-06, | |
| "loss": 0.4032, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3610105514526367, | |
| "step": 75, | |
| "valid_targets_mean": 5901.3, | |
| "valid_targets_min": 860 | |
| }, | |
| { | |
| "epoch": 0.12861736334405144, | |
| "grad_norm": 0.5561323896575481, | |
| "learning_rate": 7.247706422018349e-06, | |
| "loss": 0.4197, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3978654742240906, | |
| "step": 80, | |
| "valid_targets_mean": 6885.6, | |
| "valid_targets_min": 2842 | |
| }, | |
| { | |
| "epoch": 0.13665594855305466, | |
| "grad_norm": 0.5117138640075914, | |
| "learning_rate": 7.706422018348626e-06, | |
| "loss": 0.3789, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31353235244750977, | |
| "step": 85, | |
| "valid_targets_mean": 5075.9, | |
| "valid_targets_min": 1968 | |
| }, | |
| { | |
| "epoch": 0.14469453376205788, | |
| "grad_norm": 0.5185797042450433, | |
| "learning_rate": 8.1651376146789e-06, | |
| "loss": 0.3764, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35735011100769043, | |
| "step": 90, | |
| "valid_targets_mean": 6150.4, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 0.1527331189710611, | |
| "grad_norm": 0.5651365251126268, | |
| "learning_rate": 8.623853211009175e-06, | |
| "loss": 0.3894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36827707290649414, | |
| "step": 95, | |
| "valid_targets_mean": 4620.4, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 0.1607717041800643, | |
| "grad_norm": 0.5030199885647652, | |
| "learning_rate": 9.08256880733945e-06, | |
| "loss": 0.3975, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4012790322303772, | |
| "step": 100, | |
| "valid_targets_mean": 5582.3, | |
| "valid_targets_min": 1716 | |
| }, | |
| { | |
| "epoch": 0.16881028938906753, | |
| "grad_norm": 0.6830238553526883, | |
| "learning_rate": 9.541284403669727e-06, | |
| "loss": 0.3755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4395039677619934, | |
| "step": 105, | |
| "valid_targets_mean": 6038.6, | |
| "valid_targets_min": 1901 | |
| }, | |
| { | |
| "epoch": 0.17684887459807075, | |
| "grad_norm": 0.5054231795330882, | |
| "learning_rate": 1e-05, | |
| "loss": 0.3638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33131682872772217, | |
| "step": 110, | |
| "valid_targets_mean": 4955.4, | |
| "valid_targets_min": 1799 | |
| }, | |
| { | |
| "epoch": 0.18488745980707397, | |
| "grad_norm": 0.5366188136072383, | |
| "learning_rate": 1.0458715596330275e-05, | |
| "loss": 0.388, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3644598722457886, | |
| "step": 115, | |
| "valid_targets_mean": 4911.2, | |
| "valid_targets_min": 786 | |
| }, | |
| { | |
| "epoch": 0.19292604501607716, | |
| "grad_norm": 0.4523650959270129, | |
| "learning_rate": 1.091743119266055e-05, | |
| "loss": 0.3417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3105719983577728, | |
| "step": 120, | |
| "valid_targets_mean": 5628.5, | |
| "valid_targets_min": 2498 | |
| }, | |
| { | |
| "epoch": 0.20096463022508038, | |
| "grad_norm": 0.7309781032290188, | |
| "learning_rate": 1.1376146788990828e-05, | |
| "loss": 0.3362, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3655633330345154, | |
| "step": 125, | |
| "valid_targets_mean": 4542.7, | |
| "valid_targets_min": 726 | |
| }, | |
| { | |
| "epoch": 0.2090032154340836, | |
| "grad_norm": 0.5332714906008124, | |
| "learning_rate": 1.1834862385321102e-05, | |
| "loss": 0.363, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3906615376472473, | |
| "step": 130, | |
| "valid_targets_mean": 4868.5, | |
| "valid_targets_min": 710 | |
| }, | |
| { | |
| "epoch": 0.2170418006430868, | |
| "grad_norm": 0.5413137226235663, | |
| "learning_rate": 1.2293577981651376e-05, | |
| "loss": 0.3588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3571460545063019, | |
| "step": 135, | |
| "valid_targets_mean": 4481.6, | |
| "valid_targets_min": 2033 | |
| }, | |
| { | |
| "epoch": 0.22508038585209003, | |
| "grad_norm": 0.4819038192629592, | |
| "learning_rate": 1.2752293577981652e-05, | |
| "loss": 0.3356, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36454111337661743, | |
| "step": 140, | |
| "valid_targets_mean": 5984.9, | |
| "valid_targets_min": 1926 | |
| }, | |
| { | |
| "epoch": 0.23311897106109325, | |
| "grad_norm": 3.4522565919790846, | |
| "learning_rate": 1.3211009174311929e-05, | |
| "loss": 0.3607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.41981396079063416, | |
| "step": 145, | |
| "valid_targets_mean": 5873.2, | |
| "valid_targets_min": 2123 | |
| }, | |
| { | |
| "epoch": 0.24115755627009647, | |
| "grad_norm": 0.480191765296101, | |
| "learning_rate": 1.3669724770642203e-05, | |
| "loss": 0.342, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3925219774246216, | |
| "step": 150, | |
| "valid_targets_mean": 6098.4, | |
| "valid_targets_min": 477 | |
| }, | |
| { | |
| "epoch": 0.2491961414790997, | |
| "grad_norm": 0.5275995100411426, | |
| "learning_rate": 1.4128440366972477e-05, | |
| "loss": 0.3548, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.39259496331214905, | |
| "step": 155, | |
| "valid_targets_mean": 5801.5, | |
| "valid_targets_min": 1534 | |
| }, | |
| { | |
| "epoch": 0.2572347266881029, | |
| "grad_norm": 0.4497605594039018, | |
| "learning_rate": 1.4587155963302753e-05, | |
| "loss": 0.3315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2661333680152893, | |
| "step": 160, | |
| "valid_targets_mean": 5559.2, | |
| "valid_targets_min": 2047 | |
| }, | |
| { | |
| "epoch": 0.2652733118971061, | |
| "grad_norm": 0.5186829745052237, | |
| "learning_rate": 1.504587155963303e-05, | |
| "loss": 0.3114, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32883208990097046, | |
| "step": 165, | |
| "valid_targets_mean": 5211.5, | |
| "valid_targets_min": 2437 | |
| }, | |
| { | |
| "epoch": 0.2733118971061093, | |
| "grad_norm": 0.5378694289846688, | |
| "learning_rate": 1.5504587155963304e-05, | |
| "loss": 0.328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3817879557609558, | |
| "step": 170, | |
| "valid_targets_mean": 6117.9, | |
| "valid_targets_min": 949 | |
| }, | |
| { | |
| "epoch": 0.28135048231511256, | |
| "grad_norm": 0.47655599775978846, | |
| "learning_rate": 1.5963302752293578e-05, | |
| "loss": 0.3431, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3332779109477997, | |
| "step": 175, | |
| "valid_targets_mean": 6652.5, | |
| "valid_targets_min": 1061 | |
| }, | |
| { | |
| "epoch": 0.28938906752411575, | |
| "grad_norm": 0.5175596131296907, | |
| "learning_rate": 1.6422018348623852e-05, | |
| "loss": 0.3322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2893512547016144, | |
| "step": 180, | |
| "valid_targets_mean": 5451.6, | |
| "valid_targets_min": 2715 | |
| }, | |
| { | |
| "epoch": 0.297427652733119, | |
| "grad_norm": 0.47484987666886874, | |
| "learning_rate": 1.688073394495413e-05, | |
| "loss": 0.3211, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3134007751941681, | |
| "step": 185, | |
| "valid_targets_mean": 5775.2, | |
| "valid_targets_min": 2507 | |
| }, | |
| { | |
| "epoch": 0.3054662379421222, | |
| "grad_norm": 0.5722106466766347, | |
| "learning_rate": 1.7339449541284407e-05, | |
| "loss": 0.3098, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.316297709941864, | |
| "step": 190, | |
| "valid_targets_mean": 5722.6, | |
| "valid_targets_min": 602 | |
| }, | |
| { | |
| "epoch": 0.3135048231511254, | |
| "grad_norm": 0.5720677850765011, | |
| "learning_rate": 1.779816513761468e-05, | |
| "loss": 0.3603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4151928126811981, | |
| "step": 195, | |
| "valid_targets_mean": 5507.1, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 0.3215434083601286, | |
| "grad_norm": 0.5017657301946835, | |
| "learning_rate": 1.8256880733944955e-05, | |
| "loss": 0.3238, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34416627883911133, | |
| "step": 200, | |
| "valid_targets_mean": 5537.2, | |
| "valid_targets_min": 962 | |
| }, | |
| { | |
| "epoch": 0.3295819935691318, | |
| "grad_norm": 0.5004261222863479, | |
| "learning_rate": 1.8715596330275232e-05, | |
| "loss": 0.3143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2886529862880707, | |
| "step": 205, | |
| "valid_targets_mean": 5239.9, | |
| "valid_targets_min": 643 | |
| }, | |
| { | |
| "epoch": 0.33762057877813506, | |
| "grad_norm": 0.4773649164878627, | |
| "learning_rate": 1.9174311926605506e-05, | |
| "loss": 0.3421, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3376484215259552, | |
| "step": 210, | |
| "valid_targets_mean": 6458.6, | |
| "valid_targets_min": 662 | |
| }, | |
| { | |
| "epoch": 0.34565916398713825, | |
| "grad_norm": 0.4850206053577162, | |
| "learning_rate": 1.963302752293578e-05, | |
| "loss": 0.3252, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2979908585548401, | |
| "step": 215, | |
| "valid_targets_mean": 5263.2, | |
| "valid_targets_min": 797 | |
| }, | |
| { | |
| "epoch": 0.3536977491961415, | |
| "grad_norm": 0.5013386580471846, | |
| "learning_rate": 2.0091743119266058e-05, | |
| "loss": 0.334, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30896592140197754, | |
| "step": 220, | |
| "valid_targets_mean": 5896.0, | |
| "valid_targets_min": 1675 | |
| }, | |
| { | |
| "epoch": 0.3617363344051447, | |
| "grad_norm": 0.5570090249676257, | |
| "learning_rate": 2.0550458715596332e-05, | |
| "loss": 0.32, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3088981509208679, | |
| "step": 225, | |
| "valid_targets_mean": 5530.9, | |
| "valid_targets_min": 1200 | |
| }, | |
| { | |
| "epoch": 0.36977491961414793, | |
| "grad_norm": 0.597801049082016, | |
| "learning_rate": 2.100917431192661e-05, | |
| "loss": 0.3018, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3181556165218353, | |
| "step": 230, | |
| "valid_targets_mean": 6380.2, | |
| "valid_targets_min": 2558 | |
| }, | |
| { | |
| "epoch": 0.3778135048231511, | |
| "grad_norm": 0.5191577104221832, | |
| "learning_rate": 2.1467889908256883e-05, | |
| "loss": 0.3355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3778053820133209, | |
| "step": 235, | |
| "valid_targets_mean": 5636.4, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 0.3858520900321543, | |
| "grad_norm": 0.5854719585440191, | |
| "learning_rate": 2.1926605504587157e-05, | |
| "loss": 0.3358, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35405513644218445, | |
| "step": 240, | |
| "valid_targets_mean": 5000.0, | |
| "valid_targets_min": 1732 | |
| }, | |
| { | |
| "epoch": 0.39389067524115756, | |
| "grad_norm": 0.5476862094602315, | |
| "learning_rate": 2.238532110091743e-05, | |
| "loss": 0.3159, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2814165949821472, | |
| "step": 245, | |
| "valid_targets_mean": 4733.5, | |
| "valid_targets_min": 1002 | |
| }, | |
| { | |
| "epoch": 0.40192926045016075, | |
| "grad_norm": 0.5926979503199129, | |
| "learning_rate": 2.2844036697247705e-05, | |
| "loss": 0.3551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3770928382873535, | |
| "step": 250, | |
| "valid_targets_mean": 5427.2, | |
| "valid_targets_min": 493 | |
| }, | |
| { | |
| "epoch": 0.409967845659164, | |
| "grad_norm": 0.6100209266847794, | |
| "learning_rate": 2.3302752293577983e-05, | |
| "loss": 0.3364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3547223210334778, | |
| "step": 255, | |
| "valid_targets_mean": 4440.1, | |
| "valid_targets_min": 485 | |
| }, | |
| { | |
| "epoch": 0.4180064308681672, | |
| "grad_norm": 0.5566232844963857, | |
| "learning_rate": 2.376146788990826e-05, | |
| "loss": 0.309, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3098330497741699, | |
| "step": 260, | |
| "valid_targets_mean": 4793.3, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 0.42604501607717044, | |
| "grad_norm": 0.5427122101027465, | |
| "learning_rate": 2.4220183486238534e-05, | |
| "loss": 0.302, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31646621227264404, | |
| "step": 265, | |
| "valid_targets_mean": 4927.3, | |
| "valid_targets_min": 452 | |
| }, | |
| { | |
| "epoch": 0.4340836012861736, | |
| "grad_norm": 0.5988872869137836, | |
| "learning_rate": 2.467889908256881e-05, | |
| "loss": 0.3189, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34495389461517334, | |
| "step": 270, | |
| "valid_targets_mean": 4704.2, | |
| "valid_targets_min": 329 | |
| }, | |
| { | |
| "epoch": 0.44212218649517687, | |
| "grad_norm": 0.547941917409528, | |
| "learning_rate": 2.5137614678899085e-05, | |
| "loss": 0.3255, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3075386881828308, | |
| "step": 275, | |
| "valid_targets_mean": 4813.1, | |
| "valid_targets_min": 946 | |
| }, | |
| { | |
| "epoch": 0.45016077170418006, | |
| "grad_norm": 0.5249153697259977, | |
| "learning_rate": 2.559633027522936e-05, | |
| "loss": 0.3121, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28792035579681396, | |
| "step": 280, | |
| "valid_targets_mean": 5561.9, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 0.45819935691318325, | |
| "grad_norm": 0.43052674384652867, | |
| "learning_rate": 2.6055045871559633e-05, | |
| "loss": 0.2923, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31526419520378113, | |
| "step": 285, | |
| "valid_targets_mean": 7136.0, | |
| "valid_targets_min": 2915 | |
| }, | |
| { | |
| "epoch": 0.4662379421221865, | |
| "grad_norm": 0.5540575973124553, | |
| "learning_rate": 2.6513761467889907e-05, | |
| "loss": 0.3446, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32834839820861816, | |
| "step": 290, | |
| "valid_targets_mean": 5022.2, | |
| "valid_targets_min": 237 | |
| }, | |
| { | |
| "epoch": 0.4742765273311897, | |
| "grad_norm": 0.48094194473889346, | |
| "learning_rate": 2.6972477064220185e-05, | |
| "loss": 0.3056, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3205365836620331, | |
| "step": 295, | |
| "valid_targets_mean": 5865.0, | |
| "valid_targets_min": 2315 | |
| }, | |
| { | |
| "epoch": 0.48231511254019294, | |
| "grad_norm": 0.5618267371932142, | |
| "learning_rate": 2.7431192660550462e-05, | |
| "loss": 0.3165, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3605026602745056, | |
| "step": 300, | |
| "valid_targets_mean": 4986.5, | |
| "valid_targets_min": 921 | |
| }, | |
| { | |
| "epoch": 0.4903536977491961, | |
| "grad_norm": 0.6404105383558549, | |
| "learning_rate": 2.7889908256880736e-05, | |
| "loss": 0.2895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30333420634269714, | |
| "step": 305, | |
| "valid_targets_mean": 4107.6, | |
| "valid_targets_min": 466 | |
| }, | |
| { | |
| "epoch": 0.4983922829581994, | |
| "grad_norm": 0.5306270510654744, | |
| "learning_rate": 2.8348623853211014e-05, | |
| "loss": 0.3029, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29267704486846924, | |
| "step": 310, | |
| "valid_targets_mean": 5249.4, | |
| "valid_targets_min": 2300 | |
| }, | |
| { | |
| "epoch": 0.5064308681672026, | |
| "grad_norm": 0.49099328683222326, | |
| "learning_rate": 2.8807339449541288e-05, | |
| "loss": 0.3195, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30092453956604004, | |
| "step": 315, | |
| "valid_targets_mean": 5905.2, | |
| "valid_targets_min": 1454 | |
| }, | |
| { | |
| "epoch": 0.5144694533762058, | |
| "grad_norm": 0.663160338132964, | |
| "learning_rate": 2.926605504587156e-05, | |
| "loss": 0.2776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2730156183242798, | |
| "step": 320, | |
| "valid_targets_mean": 4000.4, | |
| "valid_targets_min": 419 | |
| }, | |
| { | |
| "epoch": 0.522508038585209, | |
| "grad_norm": 0.49329600788095035, | |
| "learning_rate": 2.9724770642201836e-05, | |
| "loss": 0.2929, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32109999656677246, | |
| "step": 325, | |
| "valid_targets_mean": 6141.3, | |
| "valid_targets_min": 2263 | |
| }, | |
| { | |
| "epoch": 0.5305466237942122, | |
| "grad_norm": 0.5752524694010668, | |
| "learning_rate": 3.0183486238532113e-05, | |
| "loss": 0.3435, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.40009579062461853, | |
| "step": 330, | |
| "valid_targets_mean": 6205.9, | |
| "valid_targets_min": 437 | |
| }, | |
| { | |
| "epoch": 0.5385852090032154, | |
| "grad_norm": 0.5350750612199158, | |
| "learning_rate": 3.0642201834862384e-05, | |
| "loss": 0.2993, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3526759147644043, | |
| "step": 335, | |
| "valid_targets_mean": 5805.6, | |
| "valid_targets_min": 1255 | |
| }, | |
| { | |
| "epoch": 0.5466237942122186, | |
| "grad_norm": 0.5826856037526428, | |
| "learning_rate": 3.110091743119266e-05, | |
| "loss": 0.3106, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3246879279613495, | |
| "step": 340, | |
| "valid_targets_mean": 5508.0, | |
| "valid_targets_min": 1309 | |
| }, | |
| { | |
| "epoch": 0.5546623794212219, | |
| "grad_norm": 0.42437748396411806, | |
| "learning_rate": 3.155963302752294e-05, | |
| "loss": 0.2945, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2742518186569214, | |
| "step": 345, | |
| "valid_targets_mean": 7056.8, | |
| "valid_targets_min": 2993 | |
| }, | |
| { | |
| "epoch": 0.5627009646302251, | |
| "grad_norm": 0.6228872767131681, | |
| "learning_rate": 3.201834862385321e-05, | |
| "loss": 0.3009, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30292758345603943, | |
| "step": 350, | |
| "valid_targets_mean": 4400.6, | |
| "valid_targets_min": 873 | |
| }, | |
| { | |
| "epoch": 0.5707395498392283, | |
| "grad_norm": 0.4664154597507688, | |
| "learning_rate": 3.2477064220183487e-05, | |
| "loss": 0.3329, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3061991333961487, | |
| "step": 355, | |
| "valid_targets_mean": 6292.2, | |
| "valid_targets_min": 3215 | |
| }, | |
| { | |
| "epoch": 0.5787781350482315, | |
| "grad_norm": 0.5161256905301413, | |
| "learning_rate": 3.293577981651377e-05, | |
| "loss": 0.3099, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24547383189201355, | |
| "step": 360, | |
| "valid_targets_mean": 4519.0, | |
| "valid_targets_min": 319 | |
| }, | |
| { | |
| "epoch": 0.5868167202572347, | |
| "grad_norm": 0.5365528928336145, | |
| "learning_rate": 3.339449541284404e-05, | |
| "loss": 0.2969, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29534512758255005, | |
| "step": 365, | |
| "valid_targets_mean": 4922.1, | |
| "valid_targets_min": 984 | |
| }, | |
| { | |
| "epoch": 0.594855305466238, | |
| "grad_norm": 0.5097548549634454, | |
| "learning_rate": 3.3853211009174315e-05, | |
| "loss": 0.312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.296095073223114, | |
| "step": 370, | |
| "valid_targets_mean": 5232.6, | |
| "valid_targets_min": 872 | |
| }, | |
| { | |
| "epoch": 0.6028938906752411, | |
| "grad_norm": 0.49051425109618985, | |
| "learning_rate": 3.431192660550459e-05, | |
| "loss": 0.2838, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30179792642593384, | |
| "step": 375, | |
| "valid_targets_mean": 5981.7, | |
| "valid_targets_min": 1437 | |
| }, | |
| { | |
| "epoch": 0.6109324758842444, | |
| "grad_norm": 0.5966512802302277, | |
| "learning_rate": 3.477064220183486e-05, | |
| "loss": 0.3042, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3060946464538574, | |
| "step": 380, | |
| "valid_targets_mean": 5208.1, | |
| "valid_targets_min": 1644 | |
| }, | |
| { | |
| "epoch": 0.6189710610932476, | |
| "grad_norm": 0.6706382795742738, | |
| "learning_rate": 3.5229357798165144e-05, | |
| "loss": 0.3086, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34184399247169495, | |
| "step": 385, | |
| "valid_targets_mean": 5654.9, | |
| "valid_targets_min": 1790 | |
| }, | |
| { | |
| "epoch": 0.6270096463022508, | |
| "grad_norm": 0.5660946301266201, | |
| "learning_rate": 3.568807339449542e-05, | |
| "loss": 0.3078, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2994377017021179, | |
| "step": 390, | |
| "valid_targets_mean": 5468.1, | |
| "valid_targets_min": 696 | |
| }, | |
| { | |
| "epoch": 0.635048231511254, | |
| "grad_norm": 0.5340949761801713, | |
| "learning_rate": 3.614678899082569e-05, | |
| "loss": 0.3119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30014336109161377, | |
| "step": 395, | |
| "valid_targets_mean": 5420.6, | |
| "valid_targets_min": 574 | |
| }, | |
| { | |
| "epoch": 0.6430868167202572, | |
| "grad_norm": 0.4923375299212512, | |
| "learning_rate": 3.6605504587155966e-05, | |
| "loss": 0.3044, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29296672344207764, | |
| "step": 400, | |
| "valid_targets_mean": 5953.8, | |
| "valid_targets_min": 2356 | |
| }, | |
| { | |
| "epoch": 0.6511254019292605, | |
| "grad_norm": 0.4912029841427183, | |
| "learning_rate": 3.706422018348624e-05, | |
| "loss": 0.2696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24615658819675446, | |
| "step": 405, | |
| "valid_targets_mean": 5338.8, | |
| "valid_targets_min": 2680 | |
| }, | |
| { | |
| "epoch": 0.6591639871382636, | |
| "grad_norm": 0.46236233354479367, | |
| "learning_rate": 3.7522935779816514e-05, | |
| "loss": 0.2961, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3032350540161133, | |
| "step": 410, | |
| "valid_targets_mean": 5982.8, | |
| "valid_targets_min": 3495 | |
| }, | |
| { | |
| "epoch": 0.6672025723472669, | |
| "grad_norm": 0.5212184732903835, | |
| "learning_rate": 3.798165137614679e-05, | |
| "loss": 0.3096, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3059643507003784, | |
| "step": 415, | |
| "valid_targets_mean": 5736.6, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 0.6752411575562701, | |
| "grad_norm": 0.5207305397129113, | |
| "learning_rate": 3.844036697247707e-05, | |
| "loss": 0.2929, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31649553775787354, | |
| "step": 420, | |
| "valid_targets_mean": 5298.9, | |
| "valid_targets_min": 3490 | |
| }, | |
| { | |
| "epoch": 0.6832797427652733, | |
| "grad_norm": 0.6992704167148815, | |
| "learning_rate": 3.889908256880734e-05, | |
| "loss": 0.2922, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2679362893104553, | |
| "step": 425, | |
| "valid_targets_mean": 5616.1, | |
| "valid_targets_min": 2463 | |
| }, | |
| { | |
| "epoch": 0.6913183279742765, | |
| "grad_norm": 0.511472564575663, | |
| "learning_rate": 3.935779816513762e-05, | |
| "loss": 0.2751, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30654776096343994, | |
| "step": 430, | |
| "valid_targets_mean": 5721.6, | |
| "valid_targets_min": 3272 | |
| }, | |
| { | |
| "epoch": 0.6993569131832797, | |
| "grad_norm": 0.4709582852431494, | |
| "learning_rate": 3.981651376146789e-05, | |
| "loss": 0.2848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2843867540359497, | |
| "step": 435, | |
| "valid_targets_mean": 5574.6, | |
| "valid_targets_min": 2023 | |
| }, | |
| { | |
| "epoch": 0.707395498392283, | |
| "grad_norm": 0.453220231562951, | |
| "learning_rate": 3.999994213536979e-05, | |
| "loss": 0.3015, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30728816986083984, | |
| "step": 440, | |
| "valid_targets_mean": 6203.7, | |
| "valid_targets_min": 3453 | |
| }, | |
| { | |
| "epoch": 0.7154340836012861, | |
| "grad_norm": 0.4755977887273865, | |
| "learning_rate": 3.999958851939771e-05, | |
| "loss": 0.2965, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28380826115608215, | |
| "step": 445, | |
| "valid_targets_mean": 6442.1, | |
| "valid_targets_min": 2803 | |
| }, | |
| { | |
| "epoch": 0.7234726688102894, | |
| "grad_norm": 0.49493667906582395, | |
| "learning_rate": 3.999891344014729e-05, | |
| "loss": 0.2948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26691269874572754, | |
| "step": 450, | |
| "valid_targets_mean": 4888.2, | |
| "valid_targets_min": 2047 | |
| }, | |
| { | |
| "epoch": 0.7315112540192926, | |
| "grad_norm": 0.5095276271053644, | |
| "learning_rate": 3.999791690846941e-05, | |
| "loss": 0.301, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2900855243206024, | |
| "step": 455, | |
| "valid_targets_mean": 4938.0, | |
| "valid_targets_min": 1713 | |
| }, | |
| { | |
| "epoch": 0.7395498392282959, | |
| "grad_norm": 0.5205052676880491, | |
| "learning_rate": 3.9996598940381816e-05, | |
| "loss": 0.2963, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3060019910335541, | |
| "step": 460, | |
| "valid_targets_mean": 5271.0, | |
| "valid_targets_min": 699 | |
| }, | |
| { | |
| "epoch": 0.747588424437299, | |
| "grad_norm": 0.4876809443820683, | |
| "learning_rate": 3.9994959557068863e-05, | |
| "loss": 0.2781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27658092975616455, | |
| "step": 465, | |
| "valid_targets_mean": 5591.4, | |
| "valid_targets_min": 1875 | |
| }, | |
| { | |
| "epoch": 0.7556270096463023, | |
| "grad_norm": 0.4923720425882284, | |
| "learning_rate": 3.999299878488117e-05, | |
| "loss": 0.2871, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2531621754169464, | |
| "step": 470, | |
| "valid_targets_mean": 5429.0, | |
| "valid_targets_min": 2498 | |
| }, | |
| { | |
| "epoch": 0.7636655948553055, | |
| "grad_norm": 0.585919343961974, | |
| "learning_rate": 3.99907166553352e-05, | |
| "loss": 0.2943, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30858397483825684, | |
| "step": 475, | |
| "valid_targets_mean": 4827.4, | |
| "valid_targets_min": 513 | |
| }, | |
| { | |
| "epoch": 0.7717041800643086, | |
| "grad_norm": 0.4747623432707116, | |
| "learning_rate": 3.9988113205112755e-05, | |
| "loss": 0.3013, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33221951127052307, | |
| "step": 480, | |
| "valid_targets_mean": 5955.8, | |
| "valid_targets_min": 675 | |
| }, | |
| { | |
| "epoch": 0.7797427652733119, | |
| "grad_norm": 0.456621692177911, | |
| "learning_rate": 3.998518847606036e-05, | |
| "loss": 0.2851, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28177374601364136, | |
| "step": 485, | |
| "valid_targets_mean": 5865.8, | |
| "valid_targets_min": 2025 | |
| }, | |
| { | |
| "epoch": 0.7877813504823151, | |
| "grad_norm": 0.4665811147170575, | |
| "learning_rate": 3.998194251518865e-05, | |
| "loss": 0.3184, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3034052550792694, | |
| "step": 490, | |
| "valid_targets_mean": 5511.8, | |
| "valid_targets_min": 3215 | |
| }, | |
| { | |
| "epoch": 0.7958199356913184, | |
| "grad_norm": 0.3911511618161914, | |
| "learning_rate": 3.9978375374671555e-05, | |
| "loss": 0.281, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24626390635967255, | |
| "step": 495, | |
| "valid_targets_mean": 7323.9, | |
| "valid_targets_min": 2857 | |
| }, | |
| { | |
| "epoch": 0.8038585209003215, | |
| "grad_norm": 0.5396114749470258, | |
| "learning_rate": 3.997448711184548e-05, | |
| "loss": 0.3037, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.308045893907547, | |
| "step": 500, | |
| "valid_targets_mean": 5182.1, | |
| "valid_targets_min": 747 | |
| }, | |
| { | |
| "epoch": 0.8118971061093248, | |
| "grad_norm": 0.4849544264425815, | |
| "learning_rate": 3.997027778920841e-05, | |
| "loss": 0.2887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3522014915943146, | |
| "step": 505, | |
| "valid_targets_mean": 6453.4, | |
| "valid_targets_min": 2028 | |
| }, | |
| { | |
| "epoch": 0.819935691318328, | |
| "grad_norm": 0.5350245918364654, | |
| "learning_rate": 3.9965747474418855e-05, | |
| "loss": 0.3212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3169665038585663, | |
| "step": 510, | |
| "valid_targets_mean": 4740.1, | |
| "valid_targets_min": 576 | |
| }, | |
| { | |
| "epoch": 0.8279742765273312, | |
| "grad_norm": 0.4938218700461966, | |
| "learning_rate": 3.996089624029482e-05, | |
| "loss": 0.2714, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25281214714050293, | |
| "step": 515, | |
| "valid_targets_mean": 5391.9, | |
| "valid_targets_min": 588 | |
| }, | |
| { | |
| "epoch": 0.8360128617363344, | |
| "grad_norm": 0.46030799853278453, | |
| "learning_rate": 3.995572416481258e-05, | |
| "loss": 0.2758, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2778337001800537, | |
| "step": 520, | |
| "valid_targets_mean": 5831.8, | |
| "valid_targets_min": 1928 | |
| }, | |
| { | |
| "epoch": 0.8440514469453376, | |
| "grad_norm": 0.4832421033589269, | |
| "learning_rate": 3.9950231331105464e-05, | |
| "loss": 0.2983, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28779375553131104, | |
| "step": 525, | |
| "valid_targets_mean": 5119.5, | |
| "valid_targets_min": 661 | |
| }, | |
| { | |
| "epoch": 0.8520900321543409, | |
| "grad_norm": 0.5293558575026555, | |
| "learning_rate": 3.994441782746251e-05, | |
| "loss": 0.2907, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2785415053367615, | |
| "step": 530, | |
| "valid_targets_mean": 5092.9, | |
| "valid_targets_min": 2581 | |
| }, | |
| { | |
| "epoch": 0.860128617363344, | |
| "grad_norm": 0.416392644109728, | |
| "learning_rate": 3.993828374732704e-05, | |
| "loss": 0.2966, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2805801033973694, | |
| "step": 535, | |
| "valid_targets_mean": 6368.6, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 0.8681672025723473, | |
| "grad_norm": 0.4645205058216023, | |
| "learning_rate": 3.993182918929515e-05, | |
| "loss": 0.2936, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30918392539024353, | |
| "step": 540, | |
| "valid_targets_mean": 5649.8, | |
| "valid_targets_min": 2795 | |
| }, | |
| { | |
| "epoch": 0.8762057877813505, | |
| "grad_norm": 0.5308378752814893, | |
| "learning_rate": 3.992505425711414e-05, | |
| "loss": 0.2655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2631094455718994, | |
| "step": 545, | |
| "valid_targets_mean": 4481.9, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 0.8842443729903537, | |
| "grad_norm": 0.4801689653969418, | |
| "learning_rate": 3.991795905968086e-05, | |
| "loss": 0.2806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30362850427627563, | |
| "step": 550, | |
| "valid_targets_mean": 5824.9, | |
| "valid_targets_min": 3011 | |
| }, | |
| { | |
| "epoch": 0.8922829581993569, | |
| "grad_norm": 0.44577993655229686, | |
| "learning_rate": 3.991054371103992e-05, | |
| "loss": 0.2839, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26914289593696594, | |
| "step": 555, | |
| "valid_targets_mean": 6685.0, | |
| "valid_targets_min": 2217 | |
| }, | |
| { | |
| "epoch": 0.9003215434083601, | |
| "grad_norm": 0.45000332721151537, | |
| "learning_rate": 3.9902808330381876e-05, | |
| "loss": 0.2972, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26276689767837524, | |
| "step": 560, | |
| "valid_targets_mean": 6083.2, | |
| "valid_targets_min": 3312 | |
| }, | |
| { | |
| "epoch": 0.9083601286173634, | |
| "grad_norm": 0.5316772461451206, | |
| "learning_rate": 3.989475304204135e-05, | |
| "loss": 0.2926, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3673328757286072, | |
| "step": 565, | |
| "valid_targets_mean": 6526.5, | |
| "valid_targets_min": 2025 | |
| }, | |
| { | |
| "epoch": 0.9163987138263665, | |
| "grad_norm": 0.45002232479827187, | |
| "learning_rate": 3.988637797549496e-05, | |
| "loss": 0.2761, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25491195917129517, | |
| "step": 570, | |
| "valid_targets_mean": 6206.9, | |
| "valid_targets_min": 660 | |
| }, | |
| { | |
| "epoch": 0.9244372990353698, | |
| "grad_norm": 0.6688784807827985, | |
| "learning_rate": 3.9877683265359294e-05, | |
| "loss": 0.2803, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23365923762321472, | |
| "step": 575, | |
| "valid_targets_mean": 5251.8, | |
| "valid_targets_min": 2129 | |
| }, | |
| { | |
| "epoch": 0.932475884244373, | |
| "grad_norm": 0.5328784356408865, | |
| "learning_rate": 3.986866905138874e-05, | |
| "loss": 0.2708, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2967522144317627, | |
| "step": 580, | |
| "valid_targets_mean": 4377.9, | |
| "valid_targets_min": 398 | |
| }, | |
| { | |
| "epoch": 0.9405144694533762, | |
| "grad_norm": 0.4705101238005458, | |
| "learning_rate": 3.985933547847319e-05, | |
| "loss": 0.2678, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.276937335729599, | |
| "step": 585, | |
| "valid_targets_mean": 5281.8, | |
| "valid_targets_min": 971 | |
| }, | |
| { | |
| "epoch": 0.9485530546623794, | |
| "grad_norm": 0.39879237511284055, | |
| "learning_rate": 3.984968269663582e-05, | |
| "loss": 0.3115, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.268502801656723, | |
| "step": 590, | |
| "valid_targets_mean": 6211.7, | |
| "valid_targets_min": 2174 | |
| }, | |
| { | |
| "epoch": 0.9565916398713826, | |
| "grad_norm": 0.4480245882027022, | |
| "learning_rate": 3.983971086103053e-05, | |
| "loss": 0.2682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2714992165565491, | |
| "step": 595, | |
| "valid_targets_mean": 5526.9, | |
| "valid_targets_min": 349 | |
| }, | |
| { | |
| "epoch": 0.9646302250803859, | |
| "grad_norm": 0.5403266638905054, | |
| "learning_rate": 3.982942013193958e-05, | |
| "loss": 0.2762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3248938322067261, | |
| "step": 600, | |
| "valid_targets_mean": 6144.6, | |
| "valid_targets_min": 636 | |
| }, | |
| { | |
| "epoch": 0.9726688102893891, | |
| "grad_norm": 0.4516840633756994, | |
| "learning_rate": 3.981881067477095e-05, | |
| "loss": 0.2684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2717623710632324, | |
| "step": 605, | |
| "valid_targets_mean": 5657.6, | |
| "valid_targets_min": 2868 | |
| }, | |
| { | |
| "epoch": 0.9807073954983923, | |
| "grad_norm": 0.4528229245449106, | |
| "learning_rate": 3.9807882660055696e-05, | |
| "loss": 0.2867, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2597085237503052, | |
| "step": 610, | |
| "valid_targets_mean": 5101.6, | |
| "valid_targets_min": 2236 | |
| }, | |
| { | |
| "epoch": 0.9887459807073955, | |
| "grad_norm": 0.44017945177584167, | |
| "learning_rate": 3.979663626344519e-05, | |
| "loss": 0.2847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25955039262771606, | |
| "step": 615, | |
| "valid_targets_mean": 5319.9, | |
| "valid_targets_min": 686 | |
| }, | |
| { | |
| "epoch": 0.9967845659163987, | |
| "grad_norm": 0.47936626435483093, | |
| "learning_rate": 3.978507166570834e-05, | |
| "loss": 0.2894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27693191170692444, | |
| "step": 620, | |
| "valid_targets_mean": 4538.5, | |
| "valid_targets_min": 1621 | |
| }, | |
| { | |
| "epoch": 1.004823151125402, | |
| "grad_norm": 0.4481854100624881, | |
| "learning_rate": 3.9773189052728645e-05, | |
| "loss": 0.2759, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2927781641483307, | |
| "step": 625, | |
| "valid_targets_mean": 5881.1, | |
| "valid_targets_min": 439 | |
| }, | |
| { | |
| "epoch": 1.0128617363344052, | |
| "grad_norm": 0.4518269756671226, | |
| "learning_rate": 3.9760988615501216e-05, | |
| "loss": 0.2623, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21591615676879883, | |
| "step": 630, | |
| "valid_targets_mean": 5202.7, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 1.0209003215434083, | |
| "grad_norm": 0.4894405054425604, | |
| "learning_rate": 3.974847055012971e-05, | |
| "loss": 0.2816, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.263643354177475, | |
| "step": 635, | |
| "valid_targets_mean": 4966.3, | |
| "valid_targets_min": 407 | |
| }, | |
| { | |
| "epoch": 1.0289389067524115, | |
| "grad_norm": 0.46931111935501457, | |
| "learning_rate": 3.97356350578232e-05, | |
| "loss": 0.2721, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27274036407470703, | |
| "step": 640, | |
| "valid_targets_mean": 6239.4, | |
| "valid_targets_min": 2734 | |
| }, | |
| { | |
| "epoch": 1.0369774919614148, | |
| "grad_norm": 0.45174077610801744, | |
| "learning_rate": 3.9722482344892885e-05, | |
| "loss": 0.2787, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2617829442024231, | |
| "step": 645, | |
| "valid_targets_mean": 5997.4, | |
| "valid_targets_min": 2568 | |
| }, | |
| { | |
| "epoch": 1.045016077170418, | |
| "grad_norm": 0.4548552448131556, | |
| "learning_rate": 3.970901262274883e-05, | |
| "loss": 0.2675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26283806562423706, | |
| "step": 650, | |
| "valid_targets_mean": 6102.5, | |
| "valid_targets_min": 2567 | |
| }, | |
| { | |
| "epoch": 1.0530546623794212, | |
| "grad_norm": 0.43090204468670307, | |
| "learning_rate": 3.9695226107896544e-05, | |
| "loss": 0.2682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23034942150115967, | |
| "step": 655, | |
| "valid_targets_mean": 5608.3, | |
| "valid_targets_min": 2451 | |
| }, | |
| { | |
| "epoch": 1.0610932475884245, | |
| "grad_norm": 0.4842292354424898, | |
| "learning_rate": 3.96811230219335e-05, | |
| "loss": 0.2695, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33251893520355225, | |
| "step": 660, | |
| "valid_targets_mean": 6044.6, | |
| "valid_targets_min": 1019 | |
| }, | |
| { | |
| "epoch": 1.0691318327974277, | |
| "grad_norm": 0.5166627086454233, | |
| "learning_rate": 3.9666703591545577e-05, | |
| "loss": 0.26, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.264485239982605, | |
| "step": 665, | |
| "valid_targets_mean": 5534.3, | |
| "valid_targets_min": 1623 | |
| }, | |
| { | |
| "epoch": 1.077170418006431, | |
| "grad_norm": 0.4561205383774768, | |
| "learning_rate": 3.965196804850341e-05, | |
| "loss": 0.2781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2511570453643799, | |
| "step": 670, | |
| "valid_targets_mean": 6292.3, | |
| "valid_targets_min": 3831 | |
| }, | |
| { | |
| "epoch": 1.085209003215434, | |
| "grad_norm": 0.4664119838831818, | |
| "learning_rate": 3.963691662965866e-05, | |
| "loss": 0.2727, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2670794725418091, | |
| "step": 675, | |
| "valid_targets_mean": 6324.6, | |
| "valid_targets_min": 2635 | |
| }, | |
| { | |
| "epoch": 1.0932475884244373, | |
| "grad_norm": 0.4596143591998667, | |
| "learning_rate": 3.9621549576940237e-05, | |
| "loss": 0.2734, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2652100920677185, | |
| "step": 680, | |
| "valid_targets_mean": 6118.9, | |
| "valid_targets_min": 3113 | |
| }, | |
| { | |
| "epoch": 1.1012861736334405, | |
| "grad_norm": 0.45527286052039845, | |
| "learning_rate": 3.9605867137350386e-05, | |
| "loss": 0.2687, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2878355383872986, | |
| "step": 685, | |
| "valid_targets_mean": 5584.5, | |
| "valid_targets_min": 861 | |
| }, | |
| { | |
| "epoch": 1.1093247588424437, | |
| "grad_norm": 0.46127939436236953, | |
| "learning_rate": 3.95898695629607e-05, | |
| "loss": 0.2555, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3002994954586029, | |
| "step": 690, | |
| "valid_targets_mean": 6248.5, | |
| "valid_targets_min": 2783 | |
| }, | |
| { | |
| "epoch": 1.117363344051447, | |
| "grad_norm": 0.4904239019276266, | |
| "learning_rate": 3.957355711090812e-05, | |
| "loss": 0.2765, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3002205491065979, | |
| "step": 695, | |
| "valid_targets_mean": 6415.4, | |
| "valid_targets_min": 900 | |
| }, | |
| { | |
| "epoch": 1.1254019292604502, | |
| "grad_norm": 0.4695407346217511, | |
| "learning_rate": 3.955693004339074e-05, | |
| "loss": 0.2334, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2368508279323578, | |
| "step": 700, | |
| "valid_targets_mean": 4967.2, | |
| "valid_targets_min": 1648 | |
| }, | |
| { | |
| "epoch": 1.1334405144694535, | |
| "grad_norm": 0.4209177205357036, | |
| "learning_rate": 3.953998862766366e-05, | |
| "loss": 0.2425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22972455620765686, | |
| "step": 705, | |
| "valid_targets_mean": 5793.0, | |
| "valid_targets_min": 2220 | |
| }, | |
| { | |
| "epoch": 1.1414790996784565, | |
| "grad_norm": 0.5792332549777016, | |
| "learning_rate": 3.9522733136034607e-05, | |
| "loss": 0.2663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3492357134819031, | |
| "step": 710, | |
| "valid_targets_mean": 6088.8, | |
| "valid_targets_min": 3000 | |
| }, | |
| { | |
| "epoch": 1.1495176848874598, | |
| "grad_norm": 0.481756880710946, | |
| "learning_rate": 3.9505163845859666e-05, | |
| "loss": 0.2805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2731996476650238, | |
| "step": 715, | |
| "valid_targets_mean": 5951.7, | |
| "valid_targets_min": 3413 | |
| }, | |
| { | |
| "epoch": 1.157556270096463, | |
| "grad_norm": 0.46578493054858106, | |
| "learning_rate": 3.94872810395387e-05, | |
| "loss": 0.2471, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2460387945175171, | |
| "step": 720, | |
| "valid_targets_mean": 5526.6, | |
| "valid_targets_min": 1749 | |
| }, | |
| { | |
| "epoch": 1.1655948553054662, | |
| "grad_norm": 0.4428649245983029, | |
| "learning_rate": 3.9469085004510915e-05, | |
| "loss": 0.2667, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2698541283607483, | |
| "step": 725, | |
| "valid_targets_mean": 5618.7, | |
| "valid_targets_min": 1464 | |
| }, | |
| { | |
| "epoch": 1.1736334405144695, | |
| "grad_norm": 0.41739653985008957, | |
| "learning_rate": 3.945057603325017e-05, | |
| "loss": 0.252, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25388237833976746, | |
| "step": 730, | |
| "valid_targets_mean": 6771.8, | |
| "valid_targets_min": 2408 | |
| }, | |
| { | |
| "epoch": 1.1816720257234727, | |
| "grad_norm": 0.5085464936906313, | |
| "learning_rate": 3.9431754423260336e-05, | |
| "loss": 0.2529, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24689799547195435, | |
| "step": 735, | |
| "valid_targets_mean": 4660.9, | |
| "valid_targets_min": 449 | |
| }, | |
| { | |
| "epoch": 1.189710610932476, | |
| "grad_norm": 0.46314872752532243, | |
| "learning_rate": 3.941262047707044e-05, | |
| "loss": 0.2689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2670508623123169, | |
| "step": 740, | |
| "valid_targets_mean": 6294.3, | |
| "valid_targets_min": 1012 | |
| }, | |
| { | |
| "epoch": 1.197749196141479, | |
| "grad_norm": 0.47881711338841565, | |
| "learning_rate": 3.939317450222987e-05, | |
| "loss": 0.2706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30444222688674927, | |
| "step": 745, | |
| "valid_targets_mean": 6074.1, | |
| "valid_targets_min": 1556 | |
| }, | |
| { | |
| "epoch": 1.2057877813504823, | |
| "grad_norm": 0.4255017920309088, | |
| "learning_rate": 3.93734168113034e-05, | |
| "loss": 0.2875, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27554166316986084, | |
| "step": 750, | |
| "valid_targets_mean": 6897.2, | |
| "valid_targets_min": 2189 | |
| }, | |
| { | |
| "epoch": 1.2138263665594855, | |
| "grad_norm": 0.4254444753489672, | |
| "learning_rate": 3.935334772186617e-05, | |
| "loss": 0.2692, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23489391803741455, | |
| "step": 755, | |
| "valid_targets_mean": 5864.9, | |
| "valid_targets_min": 818 | |
| }, | |
| { | |
| "epoch": 1.2218649517684887, | |
| "grad_norm": 0.4819794151433871, | |
| "learning_rate": 3.933296755649857e-05, | |
| "loss": 0.2757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3091309070587158, | |
| "step": 760, | |
| "valid_targets_mean": 6131.4, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 1.229903536977492, | |
| "grad_norm": 0.5020999109939216, | |
| "learning_rate": 3.931227664278108e-05, | |
| "loss": 0.2597, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23047637939453125, | |
| "step": 765, | |
| "valid_targets_mean": 5521.8, | |
| "valid_targets_min": 1540 | |
| }, | |
| { | |
| "epoch": 1.2379421221864952, | |
| "grad_norm": 0.8030577323017846, | |
| "learning_rate": 3.9291275313289006e-05, | |
| "loss": 0.2825, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3151301443576813, | |
| "step": 770, | |
| "valid_targets_mean": 5532.3, | |
| "valid_targets_min": 1775 | |
| }, | |
| { | |
| "epoch": 1.2459807073954985, | |
| "grad_norm": 0.5081302977797679, | |
| "learning_rate": 3.926996390558709e-05, | |
| "loss": 0.2637, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2657415270805359, | |
| "step": 775, | |
| "valid_targets_mean": 4668.9, | |
| "valid_targets_min": 2112 | |
| }, | |
| { | |
| "epoch": 1.2540192926045015, | |
| "grad_norm": 0.44368801982072564, | |
| "learning_rate": 3.9248342762224136e-05, | |
| "loss": 0.2748, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31955486536026, | |
| "step": 780, | |
| "valid_targets_mean": 6386.7, | |
| "valid_targets_min": 3611 | |
| }, | |
| { | |
| "epoch": 1.2620578778135048, | |
| "grad_norm": 0.498931124537264, | |
| "learning_rate": 3.922641223072747e-05, | |
| "loss": 0.2367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2425532042980194, | |
| "step": 785, | |
| "valid_targets_mean": 4810.9, | |
| "valid_targets_min": 486 | |
| }, | |
| { | |
| "epoch": 1.270096463022508, | |
| "grad_norm": 0.5320789984283447, | |
| "learning_rate": 3.9204172663597375e-05, | |
| "loss": 0.2778, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28130286931991577, | |
| "step": 790, | |
| "valid_targets_mean": 5046.2, | |
| "valid_targets_min": 563 | |
| }, | |
| { | |
| "epoch": 1.2781350482315113, | |
| "grad_norm": 0.49054524873365496, | |
| "learning_rate": 3.918162441830142e-05, | |
| "loss": 0.27, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27069205045700073, | |
| "step": 795, | |
| "valid_targets_mean": 5739.9, | |
| "valid_targets_min": 1960 | |
| }, | |
| { | |
| "epoch": 1.2861736334405145, | |
| "grad_norm": 0.5131172853073105, | |
| "learning_rate": 3.915876785726869e-05, | |
| "loss": 0.2683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30020472407341003, | |
| "step": 800, | |
| "valid_targets_mean": 5495.4, | |
| "valid_targets_min": 1822 | |
| }, | |
| { | |
| "epoch": 1.2942122186495177, | |
| "grad_norm": 0.389125813463063, | |
| "learning_rate": 3.9135603347884e-05, | |
| "loss": 0.2508, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24876119196414948, | |
| "step": 805, | |
| "valid_targets_mean": 6612.6, | |
| "valid_targets_min": 2915 | |
| }, | |
| { | |
| "epoch": 1.302250803858521, | |
| "grad_norm": 0.5127358532210575, | |
| "learning_rate": 3.911213126248198e-05, | |
| "loss": 0.2567, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30721384286880493, | |
| "step": 810, | |
| "valid_targets_mean": 5653.8, | |
| "valid_targets_min": 2095 | |
| }, | |
| { | |
| "epoch": 1.310289389067524, | |
| "grad_norm": 0.45232559019783497, | |
| "learning_rate": 3.908835197834105e-05, | |
| "loss": 0.2521, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27385708689689636, | |
| "step": 815, | |
| "valid_targets_mean": 5792.4, | |
| "valid_targets_min": 2123 | |
| }, | |
| { | |
| "epoch": 1.3183279742765273, | |
| "grad_norm": 0.41007757692762153, | |
| "learning_rate": 3.906426587767741e-05, | |
| "loss": 0.2542, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23161353170871735, | |
| "step": 820, | |
| "valid_targets_mean": 6021.1, | |
| "valid_targets_min": 829 | |
| }, | |
| { | |
| "epoch": 1.3263665594855305, | |
| "grad_norm": 0.46907797176400545, | |
| "learning_rate": 3.903987334763888e-05, | |
| "loss": 0.2671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2761453092098236, | |
| "step": 825, | |
| "valid_targets_mean": 5064.2, | |
| "valid_targets_min": 786 | |
| }, | |
| { | |
| "epoch": 1.3344051446945338, | |
| "grad_norm": 0.4404949978973486, | |
| "learning_rate": 3.901517478029865e-05, | |
| "loss": 0.2894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3096669912338257, | |
| "step": 830, | |
| "valid_targets_mean": 6641.9, | |
| "valid_targets_min": 3329 | |
| }, | |
| { | |
| "epoch": 1.342443729903537, | |
| "grad_norm": 0.4447920716860381, | |
| "learning_rate": 3.8990170572649014e-05, | |
| "loss": 0.2671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25655800104141235, | |
| "step": 835, | |
| "valid_targets_mean": 5353.0, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 1.3504823151125402, | |
| "grad_norm": 0.4758215947665256, | |
| "learning_rate": 3.8964861126594966e-05, | |
| "loss": 0.2692, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24453280866146088, | |
| "step": 840, | |
| "valid_targets_mean": 5096.5, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 1.3585209003215435, | |
| "grad_norm": 0.4660195988112356, | |
| "learning_rate": 3.8939246848947744e-05, | |
| "loss": 0.2672, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24251258373260498, | |
| "step": 845, | |
| "valid_targets_mean": 5149.8, | |
| "valid_targets_min": 2196 | |
| }, | |
| { | |
| "epoch": 1.3665594855305465, | |
| "grad_norm": 0.5276309826854884, | |
| "learning_rate": 3.891332815141832e-05, | |
| "loss": 0.2616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23830977082252502, | |
| "step": 850, | |
| "valid_targets_mean": 3673.4, | |
| "valid_targets_min": 872 | |
| }, | |
| { | |
| "epoch": 1.3745980707395498, | |
| "grad_norm": 0.5071830945159351, | |
| "learning_rate": 3.888710545061073e-05, | |
| "loss": 0.2535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21140193939208984, | |
| "step": 855, | |
| "valid_targets_mean": 4074.1, | |
| "valid_targets_min": 1713 | |
| }, | |
| { | |
| "epoch": 1.382636655948553, | |
| "grad_norm": 0.612703762329473, | |
| "learning_rate": 3.886057916801541e-05, | |
| "loss": 0.2551, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24329236149787903, | |
| "step": 860, | |
| "valid_targets_mean": 5770.9, | |
| "valid_targets_min": 586 | |
| }, | |
| { | |
| "epoch": 1.3906752411575563, | |
| "grad_norm": 0.4449814033613736, | |
| "learning_rate": 3.883374973000242e-05, | |
| "loss": 0.2726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28173279762268066, | |
| "step": 865, | |
| "valid_targets_mean": 6413.9, | |
| "valid_targets_min": 3810 | |
| }, | |
| { | |
| "epoch": 1.3987138263665595, | |
| "grad_norm": 0.5574684911444241, | |
| "learning_rate": 3.880661756781459e-05, | |
| "loss": 0.2412, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2678368091583252, | |
| "step": 870, | |
| "valid_targets_mean": 5746.8, | |
| "valid_targets_min": 2217 | |
| }, | |
| { | |
| "epoch": 1.4067524115755627, | |
| "grad_norm": 0.47311959969303435, | |
| "learning_rate": 3.877918311756059e-05, | |
| "loss": 0.2553, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2587282955646515, | |
| "step": 875, | |
| "valid_targets_mean": 5679.6, | |
| "valid_targets_min": 818 | |
| }, | |
| { | |
| "epoch": 1.414790996784566, | |
| "grad_norm": 0.4587092469906857, | |
| "learning_rate": 3.875144682020789e-05, | |
| "loss": 0.2624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24418002367019653, | |
| "step": 880, | |
| "valid_targets_mean": 5821.7, | |
| "valid_targets_min": 1972 | |
| }, | |
| { | |
| "epoch": 1.422829581993569, | |
| "grad_norm": 0.4889856734893498, | |
| "learning_rate": 3.8723409121575744e-05, | |
| "loss": 0.2652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29113972187042236, | |
| "step": 885, | |
| "valid_targets_mean": 5141.8, | |
| "valid_targets_min": 717 | |
| }, | |
| { | |
| "epoch": 1.4308681672025725, | |
| "grad_norm": 0.42403843566188576, | |
| "learning_rate": 3.8695070472327926e-05, | |
| "loss": 0.2548, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.239852637052536, | |
| "step": 890, | |
| "valid_targets_mean": 5505.6, | |
| "valid_targets_min": 1016 | |
| }, | |
| { | |
| "epoch": 1.4389067524115755, | |
| "grad_norm": 0.4122605365448489, | |
| "learning_rate": 3.866643132796557e-05, | |
| "loss": 0.2616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23172590136528015, | |
| "step": 895, | |
| "valid_targets_mean": 5802.1, | |
| "valid_targets_min": 2917 | |
| }, | |
| { | |
| "epoch": 1.4469453376205788, | |
| "grad_norm": 0.450815706756309, | |
| "learning_rate": 3.86374921488198e-05, | |
| "loss": 0.2713, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23391862213611603, | |
| "step": 900, | |
| "valid_targets_mean": 5419.5, | |
| "valid_targets_min": 1843 | |
| }, | |
| { | |
| "epoch": 1.454983922829582, | |
| "grad_norm": 0.47144379339992176, | |
| "learning_rate": 3.8608253400044377e-05, | |
| "loss": 0.2482, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24085289239883423, | |
| "step": 905, | |
| "valid_targets_mean": 5648.8, | |
| "valid_targets_min": 1257 | |
| }, | |
| { | |
| "epoch": 1.4630225080385852, | |
| "grad_norm": 0.5190255717863838, | |
| "learning_rate": 3.857871555160816e-05, | |
| "loss": 0.2716, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29148176312446594, | |
| "step": 910, | |
| "valid_targets_mean": 5045.9, | |
| "valid_targets_min": 1940 | |
| }, | |
| { | |
| "epoch": 1.4710610932475885, | |
| "grad_norm": 0.43535103713572904, | |
| "learning_rate": 3.854887907828759e-05, | |
| "loss": 0.254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2516050338745117, | |
| "step": 915, | |
| "valid_targets_mean": 5595.3, | |
| "valid_targets_min": 2333 | |
| }, | |
| { | |
| "epoch": 1.4790996784565915, | |
| "grad_norm": 0.4597492992897256, | |
| "learning_rate": 3.851874445965907e-05, | |
| "loss": 0.2565, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28193798661231995, | |
| "step": 920, | |
| "valid_targets_mean": 5760.1, | |
| "valid_targets_min": 2041 | |
| }, | |
| { | |
| "epoch": 1.487138263665595, | |
| "grad_norm": 0.4693120011789064, | |
| "learning_rate": 3.848831218009123e-05, | |
| "loss": 0.2541, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30137860774993896, | |
| "step": 925, | |
| "valid_targets_mean": 5774.6, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 1.495176848874598, | |
| "grad_norm": 0.5339368400409602, | |
| "learning_rate": 3.845758272873714e-05, | |
| "loss": 0.2475, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22395455837249756, | |
| "step": 930, | |
| "valid_targets_mean": 4434.1, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 1.5032154340836013, | |
| "grad_norm": 0.45015256467293435, | |
| "learning_rate": 3.842655659952649e-05, | |
| "loss": 0.2838, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24812501668930054, | |
| "step": 935, | |
| "valid_targets_mean": 6940.7, | |
| "valid_targets_min": 1619 | |
| }, | |
| { | |
| "epoch": 1.5112540192926045, | |
| "grad_norm": 0.5272600662534491, | |
| "learning_rate": 3.839523429115758e-05, | |
| "loss": 0.2465, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2567567229270935, | |
| "step": 940, | |
| "valid_targets_mean": 4858.2, | |
| "valid_targets_min": 631 | |
| }, | |
| { | |
| "epoch": 1.5192926045016077, | |
| "grad_norm": 0.46459840745407177, | |
| "learning_rate": 3.8363616307089386e-05, | |
| "loss": 0.2509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28069034218788147, | |
| "step": 945, | |
| "valid_targets_mean": 5754.1, | |
| "valid_targets_min": 2011 | |
| }, | |
| { | |
| "epoch": 1.527331189710611, | |
| "grad_norm": 0.45214826038540773, | |
| "learning_rate": 3.833170315553338e-05, | |
| "loss": 0.2335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22650398313999176, | |
| "step": 950, | |
| "valid_targets_mean": 4581.8, | |
| "valid_targets_min": 417 | |
| }, | |
| { | |
| "epoch": 1.535369774919614, | |
| "grad_norm": 0.48210984419010594, | |
| "learning_rate": 3.8299495349445456e-05, | |
| "loss": 0.2615, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31238895654678345, | |
| "step": 955, | |
| "valid_targets_mean": 5383.9, | |
| "valid_targets_min": 853 | |
| }, | |
| { | |
| "epoch": 1.5434083601286175, | |
| "grad_norm": 0.4928344805065733, | |
| "learning_rate": 3.82669934065176e-05, | |
| "loss": 0.2581, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25374823808670044, | |
| "step": 960, | |
| "valid_targets_mean": 4763.8, | |
| "valid_targets_min": 2176 | |
| }, | |
| { | |
| "epoch": 1.5514469453376205, | |
| "grad_norm": 0.4717265251908609, | |
| "learning_rate": 3.823419784916962e-05, | |
| "loss": 0.2304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22140252590179443, | |
| "step": 965, | |
| "valid_targets_mean": 5637.1, | |
| "valid_targets_min": 2403 | |
| }, | |
| { | |
| "epoch": 1.5594855305466238, | |
| "grad_norm": 0.43590323716499957, | |
| "learning_rate": 3.8201109204540724e-05, | |
| "loss": 0.2584, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.268430233001709, | |
| "step": 970, | |
| "valid_targets_mean": 6467.6, | |
| "valid_targets_min": 2807 | |
| }, | |
| { | |
| "epoch": 1.567524115755627, | |
| "grad_norm": 0.41768095476278655, | |
| "learning_rate": 3.8167728004481055e-05, | |
| "loss": 0.2674, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27189570665359497, | |
| "step": 975, | |
| "valid_targets_mean": 5752.6, | |
| "valid_targets_min": 450 | |
| }, | |
| { | |
| "epoch": 1.5755627009646302, | |
| "grad_norm": 0.4832348331279412, | |
| "learning_rate": 3.813405478554318e-05, | |
| "loss": 0.2778, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24511483311653137, | |
| "step": 980, | |
| "valid_targets_mean": 4317.7, | |
| "valid_targets_min": 1986 | |
| }, | |
| { | |
| "epoch": 1.5836012861736335, | |
| "grad_norm": 0.414328602413529, | |
| "learning_rate": 3.810009008897338e-05, | |
| "loss": 0.2524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21237172186374664, | |
| "step": 985, | |
| "valid_targets_mean": 5577.9, | |
| "valid_targets_min": 2366 | |
| }, | |
| { | |
| "epoch": 1.5916398713826365, | |
| "grad_norm": 0.4172501302274523, | |
| "learning_rate": 3.806583446070303e-05, | |
| "loss": 0.2823, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2472105622291565, | |
| "step": 990, | |
| "valid_targets_mean": 5920.4, | |
| "valid_targets_min": 2797 | |
| }, | |
| { | |
| "epoch": 1.59967845659164, | |
| "grad_norm": 0.4832394129539713, | |
| "learning_rate": 3.8031288451339795e-05, | |
| "loss": 0.2617, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2524241507053375, | |
| "step": 995, | |
| "valid_targets_mean": 4624.0, | |
| "valid_targets_min": 381 | |
| }, | |
| { | |
| "epoch": 1.607717041800643, | |
| "grad_norm": 0.43112961475264444, | |
| "learning_rate": 3.799645261615875e-05, | |
| "loss": 0.2553, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2469416856765747, | |
| "step": 1000, | |
| "valid_targets_mean": 5810.0, | |
| "valid_targets_min": 858 | |
| }, | |
| { | |
| "epoch": 1.6157556270096463, | |
| "grad_norm": 0.4272566439316645, | |
| "learning_rate": 3.796132751509352e-05, | |
| "loss": 0.2696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30346783995628357, | |
| "step": 1005, | |
| "valid_targets_mean": 6606.8, | |
| "valid_targets_min": 2392 | |
| }, | |
| { | |
| "epoch": 1.6237942122186495, | |
| "grad_norm": 0.623083786690726, | |
| "learning_rate": 3.792591371272722e-05, | |
| "loss": 0.2485, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31482112407684326, | |
| "step": 1010, | |
| "valid_targets_mean": 5230.8, | |
| "valid_targets_min": 731 | |
| }, | |
| { | |
| "epoch": 1.6318327974276527, | |
| "grad_norm": 0.44232132209642155, | |
| "learning_rate": 3.789021177828342e-05, | |
| "loss": 0.2729, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3005584478378296, | |
| "step": 1015, | |
| "valid_targets_mean": 6064.6, | |
| "valid_targets_min": 1145 | |
| }, | |
| { | |
| "epoch": 1.639871382636656, | |
| "grad_norm": 0.492902103630694, | |
| "learning_rate": 3.7854222285616956e-05, | |
| "loss": 0.2497, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26609134674072266, | |
| "step": 1020, | |
| "valid_targets_mean": 5095.8, | |
| "valid_targets_min": 1490 | |
| }, | |
| { | |
| "epoch": 1.647909967845659, | |
| "grad_norm": 0.4657249866434225, | |
| "learning_rate": 3.7817945813204754e-05, | |
| "loss": 0.2691, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20303010940551758, | |
| "step": 1025, | |
| "valid_targets_mean": 5232.4, | |
| "valid_targets_min": 2010 | |
| }, | |
| { | |
| "epoch": 1.6559485530546625, | |
| "grad_norm": 0.4708623793870404, | |
| "learning_rate": 3.778138294413651e-05, | |
| "loss": 0.247, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25198543071746826, | |
| "step": 1030, | |
| "valid_targets_mean": 4911.9, | |
| "valid_targets_min": 955 | |
| }, | |
| { | |
| "epoch": 1.6639871382636655, | |
| "grad_norm": 0.4165794121258526, | |
| "learning_rate": 3.774453426610528e-05, | |
| "loss": 0.2489, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24974118173122406, | |
| "step": 1035, | |
| "valid_targets_mean": 5947.4, | |
| "valid_targets_min": 502 | |
| }, | |
| { | |
| "epoch": 1.6720257234726688, | |
| "grad_norm": 0.4623343866634942, | |
| "learning_rate": 3.770740037139811e-05, | |
| "loss": 0.2585, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2963433563709259, | |
| "step": 1040, | |
| "valid_targets_mean": 5770.4, | |
| "valid_targets_min": 2475 | |
| }, | |
| { | |
| "epoch": 1.680064308681672, | |
| "grad_norm": 0.44810587508987115, | |
| "learning_rate": 3.7669981856886446e-05, | |
| "loss": 0.255, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23505745828151703, | |
| "step": 1045, | |
| "valid_targets_mean": 5224.3, | |
| "valid_targets_min": 2812 | |
| }, | |
| { | |
| "epoch": 1.6881028938906752, | |
| "grad_norm": 0.45408084656767045, | |
| "learning_rate": 3.763227932401658e-05, | |
| "loss": 0.2871, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28216275572776794, | |
| "step": 1050, | |
| "valid_targets_mean": 6022.9, | |
| "valid_targets_min": 2277 | |
| }, | |
| { | |
| "epoch": 1.6961414790996785, | |
| "grad_norm": 0.48656702854532063, | |
| "learning_rate": 3.759429337879997e-05, | |
| "loss": 0.2638, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2563022971153259, | |
| "step": 1055, | |
| "valid_targets_mean": 5356.8, | |
| "valid_targets_min": 2798 | |
| }, | |
| { | |
| "epoch": 1.7041800643086815, | |
| "grad_norm": 0.4640030243470164, | |
| "learning_rate": 3.755602463180352e-05, | |
| "loss": 0.2752, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.281100332736969, | |
| "step": 1060, | |
| "valid_targets_mean": 5524.0, | |
| "valid_targets_min": 624 | |
| }, | |
| { | |
| "epoch": 1.712218649517685, | |
| "grad_norm": 0.4276795294452354, | |
| "learning_rate": 3.75174736981397e-05, | |
| "loss": 0.2576, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27050265669822693, | |
| "step": 1065, | |
| "valid_targets_mean": 6211.6, | |
| "valid_targets_min": 1851 | |
| }, | |
| { | |
| "epoch": 1.720257234726688, | |
| "grad_norm": 0.43509074431450273, | |
| "learning_rate": 3.747864119745676e-05, | |
| "loss": 0.2754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2742531895637512, | |
| "step": 1070, | |
| "valid_targets_mean": 5541.8, | |
| "valid_targets_min": 1342 | |
| }, | |
| { | |
| "epoch": 1.7282958199356915, | |
| "grad_norm": 0.4276767599048864, | |
| "learning_rate": 3.743952775392867e-05, | |
| "loss": 0.2725, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25381579995155334, | |
| "step": 1075, | |
| "valid_targets_mean": 6139.3, | |
| "valid_targets_min": 2570 | |
| }, | |
| { | |
| "epoch": 1.7363344051446945, | |
| "grad_norm": 0.4082778737016427, | |
| "learning_rate": 3.740013399624517e-05, | |
| "loss": 0.2506, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24455204606056213, | |
| "step": 1080, | |
| "valid_targets_mean": 6807.5, | |
| "valid_targets_min": 3321 | |
| }, | |
| { | |
| "epoch": 1.7443729903536977, | |
| "grad_norm": 0.5211692720816377, | |
| "learning_rate": 3.7360460557601606e-05, | |
| "loss": 0.2654, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22724205255508423, | |
| "step": 1085, | |
| "valid_targets_mean": 4765.9, | |
| "valid_targets_min": 1461 | |
| }, | |
| { | |
| "epoch": 1.752411575562701, | |
| "grad_norm": 0.41530096966623775, | |
| "learning_rate": 3.732050807568878e-05, | |
| "loss": 0.2324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2500365972518921, | |
| "step": 1090, | |
| "valid_targets_mean": 6449.4, | |
| "valid_targets_min": 2058 | |
| }, | |
| { | |
| "epoch": 1.760450160771704, | |
| "grad_norm": 0.4110566935959379, | |
| "learning_rate": 3.728027719268269e-05, | |
| "loss": 0.2664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20658278465270996, | |
| "step": 1095, | |
| "valid_targets_mean": 5735.3, | |
| "valid_targets_min": 732 | |
| }, | |
| { | |
| "epoch": 1.7684887459807075, | |
| "grad_norm": 0.4496533892795101, | |
| "learning_rate": 3.723976855523422e-05, | |
| "loss": 0.2558, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25664013624191284, | |
| "step": 1100, | |
| "valid_targets_mean": 5087.9, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 1.7765273311897105, | |
| "grad_norm": 0.43245947505535154, | |
| "learning_rate": 3.719898281445874e-05, | |
| "loss": 0.2411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21936315298080444, | |
| "step": 1105, | |
| "valid_targets_mean": 5918.4, | |
| "valid_targets_min": 1393 | |
| }, | |
| { | |
| "epoch": 1.784565916398714, | |
| "grad_norm": 0.4858462448376596, | |
| "learning_rate": 3.7157920625925635e-05, | |
| "loss": 0.2701, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2447369545698166, | |
| "step": 1110, | |
| "valid_targets_mean": 5035.6, | |
| "valid_targets_min": 306 | |
| }, | |
| { | |
| "epoch": 1.792604501607717, | |
| "grad_norm": 0.46070258121147467, | |
| "learning_rate": 3.711658264964776e-05, | |
| "loss": 0.2632, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2854937016963959, | |
| "step": 1115, | |
| "valid_targets_mean": 5777.2, | |
| "valid_targets_min": 1674 | |
| }, | |
| { | |
| "epoch": 1.8006430868167203, | |
| "grad_norm": 0.5098752084264075, | |
| "learning_rate": 3.7074969550070875e-05, | |
| "loss": 0.2441, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20868563652038574, | |
| "step": 1120, | |
| "valid_targets_mean": 4280.9, | |
| "valid_targets_min": 1278 | |
| }, | |
| { | |
| "epoch": 1.8086816720257235, | |
| "grad_norm": 0.5023676700932536, | |
| "learning_rate": 3.7033081996062896e-05, | |
| "loss": 0.2478, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19911417365074158, | |
| "step": 1125, | |
| "valid_targets_mean": 4927.6, | |
| "valid_targets_min": 413 | |
| }, | |
| { | |
| "epoch": 1.8167202572347267, | |
| "grad_norm": 0.4777981723842262, | |
| "learning_rate": 3.6990920660903196e-05, | |
| "loss": 0.2699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26569002866744995, | |
| "step": 1130, | |
| "valid_targets_mean": 5073.6, | |
| "valid_targets_min": 677 | |
| }, | |
| { | |
| "epoch": 1.82475884244373, | |
| "grad_norm": 0.405932778909337, | |
| "learning_rate": 3.6948486222271775e-05, | |
| "loss": 0.2692, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23919036984443665, | |
| "step": 1135, | |
| "valid_targets_mean": 5587.4, | |
| "valid_targets_min": 1803 | |
| }, | |
| { | |
| "epoch": 1.832797427652733, | |
| "grad_norm": 0.5069414169140095, | |
| "learning_rate": 3.690577936223835e-05, | |
| "loss": 0.2634, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27143096923828125, | |
| "step": 1140, | |
| "valid_targets_mean": 5159.4, | |
| "valid_targets_min": 478 | |
| }, | |
| { | |
| "epoch": 1.8408360128617365, | |
| "grad_norm": 0.4527435917730049, | |
| "learning_rate": 3.686280076725141e-05, | |
| "loss": 0.2679, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2778286337852478, | |
| "step": 1145, | |
| "valid_targets_mean": 5918.1, | |
| "valid_targets_min": 3434 | |
| }, | |
| { | |
| "epoch": 1.8488745980707395, | |
| "grad_norm": 0.4865490947207327, | |
| "learning_rate": 3.6819551128127165e-05, | |
| "loss": 0.2582, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24340906739234924, | |
| "step": 1150, | |
| "valid_targets_mean": 4307.2, | |
| "valid_targets_min": 319 | |
| }, | |
| { | |
| "epoch": 1.8569131832797428, | |
| "grad_norm": 0.47010193702456116, | |
| "learning_rate": 3.677603114003846e-05, | |
| "loss": 0.2509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2454727590084076, | |
| "step": 1155, | |
| "valid_targets_mean": 5215.7, | |
| "valid_targets_min": 636 | |
| }, | |
| { | |
| "epoch": 1.864951768488746, | |
| "grad_norm": 0.44641315970979795, | |
| "learning_rate": 3.673224150250358e-05, | |
| "loss": 0.2623, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22875937819480896, | |
| "step": 1160, | |
| "valid_targets_mean": 5781.1, | |
| "valid_targets_min": 1309 | |
| }, | |
| { | |
| "epoch": 1.8729903536977492, | |
| "grad_norm": 0.431723088241146, | |
| "learning_rate": 3.6688182919375046e-05, | |
| "loss": 0.2397, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23499958217144012, | |
| "step": 1165, | |
| "valid_targets_mean": 6213.2, | |
| "valid_targets_min": 4048 | |
| }, | |
| { | |
| "epoch": 1.8810289389067525, | |
| "grad_norm": 0.4482594154633358, | |
| "learning_rate": 3.664385609882825e-05, | |
| "loss": 0.2609, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2637123763561249, | |
| "step": 1170, | |
| "valid_targets_mean": 5742.7, | |
| "valid_targets_min": 2334 | |
| }, | |
| { | |
| "epoch": 1.8890675241157555, | |
| "grad_norm": 0.46395903411691025, | |
| "learning_rate": 3.659926175335009e-05, | |
| "loss": 0.2445, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.267189085483551, | |
| "step": 1175, | |
| "valid_targets_mean": 5911.4, | |
| "valid_targets_min": 1800 | |
| }, | |
| { | |
| "epoch": 1.897106109324759, | |
| "grad_norm": 0.487503842139191, | |
| "learning_rate": 3.655440059972755e-05, | |
| "loss": 0.2479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2147846221923828, | |
| "step": 1180, | |
| "valid_targets_mean": 4204.1, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 1.905144694533762, | |
| "grad_norm": 0.5590050907024217, | |
| "learning_rate": 3.650927335903614e-05, | |
| "loss": 0.258, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25826340913772583, | |
| "step": 1185, | |
| "valid_targets_mean": 5233.5, | |
| "valid_targets_min": 2790 | |
| }, | |
| { | |
| "epoch": 1.9131832797427653, | |
| "grad_norm": 0.5172485408806574, | |
| "learning_rate": 3.646388075662832e-05, | |
| "loss": 0.2538, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2364274263381958, | |
| "step": 1190, | |
| "valid_targets_mean": 3912.8, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 1.9212218649517685, | |
| "grad_norm": 0.47682543135007205, | |
| "learning_rate": 3.6418223522121835e-05, | |
| "loss": 0.2408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3126181960105896, | |
| "step": 1195, | |
| "valid_targets_mean": 6203.2, | |
| "valid_targets_min": 3521 | |
| }, | |
| { | |
| "epoch": 1.9292604501607717, | |
| "grad_norm": 0.4332316218299991, | |
| "learning_rate": 3.6372302389388e-05, | |
| "loss": 0.2602, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23799636960029602, | |
| "step": 1200, | |
| "valid_targets_mean": 6120.1, | |
| "valid_targets_min": 1830 | |
| }, | |
| { | |
| "epoch": 1.937299035369775, | |
| "grad_norm": 0.6921991264155489, | |
| "learning_rate": 3.632611809653991e-05, | |
| "loss": 0.2494, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23573598265647888, | |
| "step": 1205, | |
| "valid_targets_mean": 5838.8, | |
| "valid_targets_min": 1781 | |
| }, | |
| { | |
| "epoch": 1.945337620578778, | |
| "grad_norm": 0.4531182795153632, | |
| "learning_rate": 3.627967138592052e-05, | |
| "loss": 0.2572, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3050065040588379, | |
| "step": 1210, | |
| "valid_targets_mean": 5898.8, | |
| "valid_targets_min": 3244 | |
| }, | |
| { | |
| "epoch": 1.9533762057877815, | |
| "grad_norm": 0.5163483143343605, | |
| "learning_rate": 3.6232963004090806e-05, | |
| "loss": 0.2556, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29905956983566284, | |
| "step": 1215, | |
| "valid_targets_mean": 5265.6, | |
| "valid_targets_min": 700 | |
| }, | |
| { | |
| "epoch": 1.9614147909967845, | |
| "grad_norm": 0.46818084772289176, | |
| "learning_rate": 3.6185993701817676e-05, | |
| "loss": 0.2411, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.250349760055542, | |
| "step": 1220, | |
| "valid_targets_mean": 5315.4, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 1.9694533762057878, | |
| "grad_norm": 0.4220431966160426, | |
| "learning_rate": 3.613876423406197e-05, | |
| "loss": 0.2566, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2675197720527649, | |
| "step": 1225, | |
| "valid_targets_mean": 6162.6, | |
| "valid_targets_min": 2403 | |
| }, | |
| { | |
| "epoch": 1.977491961414791, | |
| "grad_norm": 0.6178589423180992, | |
| "learning_rate": 3.609127535996628e-05, | |
| "loss": 0.2591, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2850725054740906, | |
| "step": 1230, | |
| "valid_targets_mean": 5503.9, | |
| "valid_targets_min": 2166 | |
| }, | |
| { | |
| "epoch": 1.9855305466237942, | |
| "grad_norm": 0.40980310008162896, | |
| "learning_rate": 3.604352784284276e-05, | |
| "loss": 0.2432, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28403544425964355, | |
| "step": 1235, | |
| "valid_targets_mean": 6754.5, | |
| "valid_targets_min": 877 | |
| }, | |
| { | |
| "epoch": 1.9935691318327975, | |
| "grad_norm": 0.4415011119338401, | |
| "learning_rate": 3.5995522450160876e-05, | |
| "loss": 0.2848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26181045174598694, | |
| "step": 1240, | |
| "valid_targets_mean": 6086.8, | |
| "valid_targets_min": 2879 | |
| }, | |
| { | |
| "epoch": 2.0016077170418005, | |
| "grad_norm": 0.5052011649335046, | |
| "learning_rate": 3.594725995353504e-05, | |
| "loss": 0.2531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2215358167886734, | |
| "step": 1245, | |
| "valid_targets_mean": 4575.7, | |
| "valid_targets_min": 671 | |
| }, | |
| { | |
| "epoch": 2.009646302250804, | |
| "grad_norm": 0.49499392194788194, | |
| "learning_rate": 3.589874112871225e-05, | |
| "loss": 0.2352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2807163596153259, | |
| "step": 1250, | |
| "valid_targets_mean": 6068.4, | |
| "valid_targets_min": 2833 | |
| }, | |
| { | |
| "epoch": 2.017684887459807, | |
| "grad_norm": 0.5037101851458272, | |
| "learning_rate": 3.584996675555956e-05, | |
| "loss": 0.2479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20071212947368622, | |
| "step": 1255, | |
| "valid_targets_mean": 4558.1, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 2.0257234726688105, | |
| "grad_norm": 0.5288682410650701, | |
| "learning_rate": 3.58009376180516e-05, | |
| "loss": 0.2365, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22459709644317627, | |
| "step": 1260, | |
| "valid_targets_mean": 4271.1, | |
| "valid_targets_min": 1686 | |
| }, | |
| { | |
| "epoch": 2.0337620578778135, | |
| "grad_norm": 0.4909968616058714, | |
| "learning_rate": 3.575165450425795e-05, | |
| "loss": 0.2421, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27638089656829834, | |
| "step": 1265, | |
| "valid_targets_mean": 5826.1, | |
| "valid_targets_min": 2544 | |
| }, | |
| { | |
| "epoch": 2.0418006430868165, | |
| "grad_norm": 0.501900908160109, | |
| "learning_rate": 3.5702118206330464e-05, | |
| "loss": 0.2281, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24548184871673584, | |
| "step": 1270, | |
| "valid_targets_mean": 5093.1, | |
| "valid_targets_min": 839 | |
| }, | |
| { | |
| "epoch": 2.04983922829582, | |
| "grad_norm": 0.4734765195442351, | |
| "learning_rate": 3.565232952049057e-05, | |
| "loss": 0.2469, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24846141040325165, | |
| "step": 1275, | |
| "valid_targets_mean": 5822.4, | |
| "valid_targets_min": 2417 | |
| }, | |
| { | |
| "epoch": 2.057877813504823, | |
| "grad_norm": 0.45203540694484684, | |
| "learning_rate": 3.560228924701644e-05, | |
| "loss": 0.237, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23629529774188995, | |
| "step": 1280, | |
| "valid_targets_mean": 7071.5, | |
| "valid_targets_min": 4493 | |
| }, | |
| { | |
| "epoch": 2.0659163987138265, | |
| "grad_norm": 0.48899278419104575, | |
| "learning_rate": 3.555199819023013e-05, | |
| "loss": 0.2404, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24547456204891205, | |
| "step": 1285, | |
| "valid_targets_mean": 5600.9, | |
| "valid_targets_min": 1454 | |
| }, | |
| { | |
| "epoch": 2.0739549839228295, | |
| "grad_norm": 0.6019858773966105, | |
| "learning_rate": 3.5501457158484684e-05, | |
| "loss": 0.2537, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2651026248931885, | |
| "step": 1290, | |
| "valid_targets_mean": 4568.3, | |
| "valid_targets_min": 977 | |
| }, | |
| { | |
| "epoch": 2.081993569131833, | |
| "grad_norm": 0.5664362773624191, | |
| "learning_rate": 3.545066696415109e-05, | |
| "loss": 0.2166, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21541890501976013, | |
| "step": 1295, | |
| "valid_targets_mean": 4279.4, | |
| "valid_targets_min": 591 | |
| }, | |
| { | |
| "epoch": 2.090032154340836, | |
| "grad_norm": 0.48714257423252616, | |
| "learning_rate": 3.5399628423605286e-05, | |
| "loss": 0.2328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20423497259616852, | |
| "step": 1300, | |
| "valid_targets_mean": 4730.8, | |
| "valid_targets_min": 1985 | |
| }, | |
| { | |
| "epoch": 2.098070739549839, | |
| "grad_norm": 0.5673114735609058, | |
| "learning_rate": 3.534834235721495e-05, | |
| "loss": 0.2384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3088694214820862, | |
| "step": 1305, | |
| "valid_targets_mean": 6417.6, | |
| "valid_targets_min": 2499 | |
| }, | |
| { | |
| "epoch": 2.1061093247588425, | |
| "grad_norm": 0.5072612807927125, | |
| "learning_rate": 3.5296809589326414e-05, | |
| "loss": 0.2505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27511778473854065, | |
| "step": 1310, | |
| "valid_targets_mean": 5805.1, | |
| "valid_targets_min": 2166 | |
| }, | |
| { | |
| "epoch": 2.1141479099678455, | |
| "grad_norm": 0.4887778350157169, | |
| "learning_rate": 3.524503094825134e-05, | |
| "loss": 0.2425, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29525697231292725, | |
| "step": 1315, | |
| "valid_targets_mean": 5837.6, | |
| "valid_targets_min": 3211 | |
| }, | |
| { | |
| "epoch": 2.122186495176849, | |
| "grad_norm": 0.4682327210334639, | |
| "learning_rate": 3.5193007266253436e-05, | |
| "loss": 0.2359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25870102643966675, | |
| "step": 1320, | |
| "valid_targets_mean": 5548.1, | |
| "valid_targets_min": 2741 | |
| }, | |
| { | |
| "epoch": 2.130225080385852, | |
| "grad_norm": 0.4289895984990567, | |
| "learning_rate": 3.514073937953507e-05, | |
| "loss": 0.2474, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23811715841293335, | |
| "step": 1325, | |
| "valid_targets_mean": 5777.2, | |
| "valid_targets_min": 2849 | |
| }, | |
| { | |
| "epoch": 2.1382636655948555, | |
| "grad_norm": 0.465702346096192, | |
| "learning_rate": 3.508822812822384e-05, | |
| "loss": 0.2317, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25546348094940186, | |
| "step": 1330, | |
| "valid_targets_mean": 5231.3, | |
| "valid_targets_min": 351 | |
| }, | |
| { | |
| "epoch": 2.1463022508038585, | |
| "grad_norm": 0.46068575075324203, | |
| "learning_rate": 3.503547435635906e-05, | |
| "loss": 0.2397, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22912633419036865, | |
| "step": 1335, | |
| "valid_targets_mean": 5305.3, | |
| "valid_targets_min": 450 | |
| }, | |
| { | |
| "epoch": 2.154340836012862, | |
| "grad_norm": 0.5793563972286564, | |
| "learning_rate": 3.4982478911878204e-05, | |
| "loss": 0.2414, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25946539640426636, | |
| "step": 1340, | |
| "valid_targets_mean": 6271.1, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 2.162379421221865, | |
| "grad_norm": 0.4321108762119069, | |
| "learning_rate": 3.492924264660326e-05, | |
| "loss": 0.2168, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21115905046463013, | |
| "step": 1345, | |
| "valid_targets_mean": 5771.3, | |
| "valid_targets_min": 1986 | |
| }, | |
| { | |
| "epoch": 2.170418006430868, | |
| "grad_norm": 0.47291052827659313, | |
| "learning_rate": 3.4875766416227046e-05, | |
| "loss": 0.2388, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22496125102043152, | |
| "step": 1350, | |
| "valid_targets_mean": 4972.3, | |
| "valid_targets_min": 3309 | |
| }, | |
| { | |
| "epoch": 2.1784565916398715, | |
| "grad_norm": 0.43470423287906323, | |
| "learning_rate": 3.4822051080299475e-05, | |
| "loss": 0.2284, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20437441766262054, | |
| "step": 1355, | |
| "valid_targets_mean": 6334.6, | |
| "valid_targets_min": 2286 | |
| }, | |
| { | |
| "epoch": 2.1864951768488745, | |
| "grad_norm": 0.46020505480546187, | |
| "learning_rate": 3.47680975022137e-05, | |
| "loss": 0.2321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20249639451503754, | |
| "step": 1360, | |
| "valid_targets_mean": 5926.2, | |
| "valid_targets_min": 2284 | |
| }, | |
| { | |
| "epoch": 2.194533762057878, | |
| "grad_norm": 0.4305391092246054, | |
| "learning_rate": 3.471390654919226e-05, | |
| "loss": 0.2187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21444851160049438, | |
| "step": 1365, | |
| "valid_targets_mean": 5687.1, | |
| "valid_targets_min": 1717 | |
| }, | |
| { | |
| "epoch": 2.202572347266881, | |
| "grad_norm": 0.43458160830767617, | |
| "learning_rate": 3.4659479092273165e-05, | |
| "loss": 0.2527, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21879488229751587, | |
| "step": 1370, | |
| "valid_targets_mean": 5867.4, | |
| "valid_targets_min": 1288 | |
| }, | |
| { | |
| "epoch": 2.210610932475884, | |
| "grad_norm": 0.47071088104561204, | |
| "learning_rate": 3.460481600629584e-05, | |
| "loss": 0.2231, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22845019400119781, | |
| "step": 1375, | |
| "valid_targets_mean": 5171.8, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 2.2186495176848875, | |
| "grad_norm": 0.5187891419112453, | |
| "learning_rate": 3.4549918169887117e-05, | |
| "loss": 0.2361, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21421794593334198, | |
| "step": 1380, | |
| "valid_targets_mean": 4998.8, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 2.2266881028938905, | |
| "grad_norm": 0.5234262982245236, | |
| "learning_rate": 3.4494786465447046e-05, | |
| "loss": 0.2122, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25294065475463867, | |
| "step": 1385, | |
| "valid_targets_mean": 5855.6, | |
| "valid_targets_min": 2768 | |
| }, | |
| { | |
| "epoch": 2.234726688102894, | |
| "grad_norm": 0.5321562652327939, | |
| "learning_rate": 3.44394217791348e-05, | |
| "loss": 0.2434, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24745520949363708, | |
| "step": 1390, | |
| "valid_targets_mean": 4331.7, | |
| "valid_targets_min": 671 | |
| }, | |
| { | |
| "epoch": 2.242765273311897, | |
| "grad_norm": 0.4615900367652738, | |
| "learning_rate": 3.438382500085437e-05, | |
| "loss": 0.2279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2429751455783844, | |
| "step": 1395, | |
| "valid_targets_mean": 5807.1, | |
| "valid_targets_min": 3494 | |
| }, | |
| { | |
| "epoch": 2.2508038585209005, | |
| "grad_norm": 0.48836390162778914, | |
| "learning_rate": 3.432799702424026e-05, | |
| "loss": 0.2289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2340126484632492, | |
| "step": 1400, | |
| "valid_targets_mean": 4863.4, | |
| "valid_targets_min": 831 | |
| }, | |
| { | |
| "epoch": 2.2588424437299035, | |
| "grad_norm": 0.4992916459755242, | |
| "learning_rate": 3.427193874664316e-05, | |
| "loss": 0.2345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2595020532608032, | |
| "step": 1405, | |
| "valid_targets_mean": 5894.4, | |
| "valid_targets_min": 2328 | |
| }, | |
| { | |
| "epoch": 2.266881028938907, | |
| "grad_norm": 0.48577187074590183, | |
| "learning_rate": 3.421565106911549e-05, | |
| "loss": 0.2608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23440846800804138, | |
| "step": 1410, | |
| "valid_targets_mean": 4556.2, | |
| "valid_targets_min": 2277 | |
| }, | |
| { | |
| "epoch": 2.27491961414791, | |
| "grad_norm": 0.5013796741612716, | |
| "learning_rate": 3.4159134896396926e-05, | |
| "loss": 0.2323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.221354141831398, | |
| "step": 1415, | |
| "valid_targets_mean": 5789.2, | |
| "valid_targets_min": 555 | |
| }, | |
| { | |
| "epoch": 2.282958199356913, | |
| "grad_norm": 0.49298535713537966, | |
| "learning_rate": 3.410239113689988e-05, | |
| "loss": 0.2368, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23821865022182465, | |
| "step": 1420, | |
| "valid_targets_mean": 5011.8, | |
| "valid_targets_min": 1665 | |
| }, | |
| { | |
| "epoch": 2.2909967845659165, | |
| "grad_norm": 0.4626416750757732, | |
| "learning_rate": 3.4045420702694846e-05, | |
| "loss": 0.2261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22496405243873596, | |
| "step": 1425, | |
| "valid_targets_mean": 5377.4, | |
| "valid_targets_min": 1255 | |
| }, | |
| { | |
| "epoch": 2.2990353697749195, | |
| "grad_norm": 0.4296716912834519, | |
| "learning_rate": 3.398822450949581e-05, | |
| "loss": 0.2485, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23885153234004974, | |
| "step": 1430, | |
| "valid_targets_mean": 6061.9, | |
| "valid_targets_min": 3491 | |
| }, | |
| { | |
| "epoch": 2.307073954983923, | |
| "grad_norm": 0.47636710237452873, | |
| "learning_rate": 3.393080347664547e-05, | |
| "loss": 0.2267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2698220908641815, | |
| "step": 1435, | |
| "valid_targets_mean": 6900.9, | |
| "valid_targets_min": 2064 | |
| }, | |
| { | |
| "epoch": 2.315112540192926, | |
| "grad_norm": 0.41767354690125497, | |
| "learning_rate": 3.3873158527100476e-05, | |
| "loss": 0.2306, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2261713743209839, | |
| "step": 1440, | |
| "valid_targets_mean": 6009.2, | |
| "valid_targets_min": 954 | |
| }, | |
| { | |
| "epoch": 2.323151125401929, | |
| "grad_norm": 0.45765199576612026, | |
| "learning_rate": 3.381529058741662e-05, | |
| "loss": 0.2259, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19748841226100922, | |
| "step": 1445, | |
| "valid_targets_mean": 4535.4, | |
| "valid_targets_min": 763 | |
| }, | |
| { | |
| "epoch": 2.3311897106109325, | |
| "grad_norm": 0.512506152526694, | |
| "learning_rate": 3.375720058773393e-05, | |
| "loss": 0.2433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21968387067317963, | |
| "step": 1450, | |
| "valid_targets_mean": 4926.1, | |
| "valid_targets_min": 793 | |
| }, | |
| { | |
| "epoch": 2.3392282958199355, | |
| "grad_norm": 0.4272765943955022, | |
| "learning_rate": 3.369888946176169e-05, | |
| "loss": 0.2214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2100493162870407, | |
| "step": 1455, | |
| "valid_targets_mean": 6448.4, | |
| "valid_targets_min": 1340 | |
| }, | |
| { | |
| "epoch": 2.347266881028939, | |
| "grad_norm": 0.4563671696414422, | |
| "learning_rate": 3.364035814676348e-05, | |
| "loss": 0.2452, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2688789665699005, | |
| "step": 1460, | |
| "valid_targets_mean": 5684.4, | |
| "valid_targets_min": 696 | |
| }, | |
| { | |
| "epoch": 2.355305466237942, | |
| "grad_norm": 0.4324693505868682, | |
| "learning_rate": 3.358160758354207e-05, | |
| "loss": 0.2367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23381435871124268, | |
| "step": 1465, | |
| "valid_targets_mean": 5994.9, | |
| "valid_targets_min": 3248 | |
| }, | |
| { | |
| "epoch": 2.3633440514469455, | |
| "grad_norm": 0.4759857152996837, | |
| "learning_rate": 3.352263871642431e-05, | |
| "loss": 0.233, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22635474801063538, | |
| "step": 1470, | |
| "valid_targets_mean": 5392.4, | |
| "valid_targets_min": 2033 | |
| }, | |
| { | |
| "epoch": 2.3713826366559485, | |
| "grad_norm": 0.4884246901728015, | |
| "learning_rate": 3.346345249324599e-05, | |
| "loss": 0.2277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21876902878284454, | |
| "step": 1475, | |
| "valid_targets_mean": 4432.8, | |
| "valid_targets_min": 1507 | |
| }, | |
| { | |
| "epoch": 2.379421221864952, | |
| "grad_norm": 0.4330099288850315, | |
| "learning_rate": 3.340404986533654e-05, | |
| "loss": 0.2428, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23252514004707336, | |
| "step": 1480, | |
| "valid_targets_mean": 6527.4, | |
| "valid_targets_min": 3722 | |
| }, | |
| { | |
| "epoch": 2.387459807073955, | |
| "grad_norm": 0.45464353022352455, | |
| "learning_rate": 3.334443178750378e-05, | |
| "loss": 0.2318, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21697625517845154, | |
| "step": 1485, | |
| "valid_targets_mean": 4793.3, | |
| "valid_targets_min": 700 | |
| }, | |
| { | |
| "epoch": 2.395498392282958, | |
| "grad_norm": 0.4068622119059787, | |
| "learning_rate": 3.328459921801855e-05, | |
| "loss": 0.2364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.201742023229599, | |
| "step": 1490, | |
| "valid_targets_mean": 5870.1, | |
| "valid_targets_min": 2359 | |
| }, | |
| { | |
| "epoch": 2.4035369774919615, | |
| "grad_norm": 0.4122118914595108, | |
| "learning_rate": 3.322455311859936e-05, | |
| "loss": 0.2457, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2373337745666504, | |
| "step": 1495, | |
| "valid_targets_mean": 6573.8, | |
| "valid_targets_min": 2667 | |
| }, | |
| { | |
| "epoch": 2.4115755627009645, | |
| "grad_norm": 0.5745692612063069, | |
| "learning_rate": 3.316429445439684e-05, | |
| "loss": 0.2459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24043476581573486, | |
| "step": 1500, | |
| "valid_targets_mean": 6095.8, | |
| "valid_targets_min": 2709 | |
| }, | |
| { | |
| "epoch": 2.419614147909968, | |
| "grad_norm": 0.454172764523821, | |
| "learning_rate": 3.310382419397832e-05, | |
| "loss": 0.219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22647792100906372, | |
| "step": 1505, | |
| "valid_targets_mean": 5380.6, | |
| "valid_targets_min": 554 | |
| }, | |
| { | |
| "epoch": 2.427652733118971, | |
| "grad_norm": 0.45495338149369174, | |
| "learning_rate": 3.30431433093122e-05, | |
| "loss": 0.2531, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2454707771539688, | |
| "step": 1510, | |
| "valid_targets_mean": 5483.3, | |
| "valid_targets_min": 2993 | |
| }, | |
| { | |
| "epoch": 2.435691318327974, | |
| "grad_norm": 0.41609998478720306, | |
| "learning_rate": 3.298225277575235e-05, | |
| "loss": 0.2363, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2000729739665985, | |
| "step": 1515, | |
| "valid_targets_mean": 5734.8, | |
| "valid_targets_min": 395 | |
| }, | |
| { | |
| "epoch": 2.4437299035369775, | |
| "grad_norm": 0.4628501255715947, | |
| "learning_rate": 3.292115357202244e-05, | |
| "loss": 0.2505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2211766541004181, | |
| "step": 1520, | |
| "valid_targets_mean": 5731.4, | |
| "valid_targets_min": 2463 | |
| }, | |
| { | |
| "epoch": 2.4517684887459805, | |
| "grad_norm": 0.48247050262549374, | |
| "learning_rate": 3.285984668020019e-05, | |
| "loss": 0.2265, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2340318262577057, | |
| "step": 1525, | |
| "valid_targets_mean": 5772.2, | |
| "valid_targets_min": 2556 | |
| }, | |
| { | |
| "epoch": 2.459807073954984, | |
| "grad_norm": 0.4090879734351391, | |
| "learning_rate": 3.27983330857016e-05, | |
| "loss": 0.2251, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20760081708431244, | |
| "step": 1530, | |
| "valid_targets_mean": 6043.8, | |
| "valid_targets_min": 2831 | |
| }, | |
| { | |
| "epoch": 2.467845659163987, | |
| "grad_norm": 0.46193519580530734, | |
| "learning_rate": 3.2736613777265076e-05, | |
| "loss": 0.214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18691232800483704, | |
| "step": 1535, | |
| "valid_targets_mean": 4827.1, | |
| "valid_targets_min": 2611 | |
| }, | |
| { | |
| "epoch": 2.4758842443729905, | |
| "grad_norm": 0.4790816763532421, | |
| "learning_rate": 3.2674689746935626e-05, | |
| "loss": 0.2416, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24825897812843323, | |
| "step": 1540, | |
| "valid_targets_mean": 4753.2, | |
| "valid_targets_min": 2313 | |
| }, | |
| { | |
| "epoch": 2.4839228295819935, | |
| "grad_norm": 0.4274916395823709, | |
| "learning_rate": 3.261256199004879e-05, | |
| "loss": 0.2336, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19343507289886475, | |
| "step": 1545, | |
| "valid_targets_mean": 5236.5, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 2.491961414790997, | |
| "grad_norm": 0.462266180497401, | |
| "learning_rate": 3.255023150521474e-05, | |
| "loss": 0.2492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22728748619556427, | |
| "step": 1550, | |
| "valid_targets_mean": 5430.7, | |
| "valid_targets_min": 3000 | |
| }, | |
| { | |
| "epoch": 2.5, | |
| "grad_norm": 0.4876609578778693, | |
| "learning_rate": 3.2487699294302195e-05, | |
| "loss": 0.2333, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24499937891960144, | |
| "step": 1555, | |
| "valid_targets_mean": 5483.6, | |
| "valid_targets_min": 368 | |
| }, | |
| { | |
| "epoch": 2.508038585209003, | |
| "grad_norm": 0.42431584453451293, | |
| "learning_rate": 3.242496636242231e-05, | |
| "loss": 0.2274, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23072059452533722, | |
| "step": 1560, | |
| "valid_targets_mean": 5792.4, | |
| "valid_targets_min": 2293 | |
| }, | |
| { | |
| "epoch": 2.5160771704180065, | |
| "grad_norm": 0.48288970283703386, | |
| "learning_rate": 3.236203371791252e-05, | |
| "loss": 0.2244, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21728533506393433, | |
| "step": 1565, | |
| "valid_targets_mean": 5067.8, | |
| "valid_targets_min": 922 | |
| }, | |
| { | |
| "epoch": 2.5241157556270095, | |
| "grad_norm": 0.48817034020969907, | |
| "learning_rate": 3.2298902372320354e-05, | |
| "loss": 0.2511, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2375354766845703, | |
| "step": 1570, | |
| "valid_targets_mean": 4957.4, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 2.532154340836013, | |
| "grad_norm": 0.38867992655742173, | |
| "learning_rate": 3.2235573340387155e-05, | |
| "loss": 0.2073, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17898300290107727, | |
| "step": 1575, | |
| "valid_targets_mean": 5888.5, | |
| "valid_targets_min": 2584 | |
| }, | |
| { | |
| "epoch": 2.540192926045016, | |
| "grad_norm": 0.430940952546417, | |
| "learning_rate": 3.2172047640031774e-05, | |
| "loss": 0.2352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22751542925834656, | |
| "step": 1580, | |
| "valid_targets_mean": 5754.3, | |
| "valid_targets_min": 1931 | |
| }, | |
| { | |
| "epoch": 2.548231511254019, | |
| "grad_norm": 0.4815894601905835, | |
| "learning_rate": 3.210832629233421e-05, | |
| "loss": 0.2415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2273508906364441, | |
| "step": 1585, | |
| "valid_targets_mean": 5406.9, | |
| "valid_targets_min": 790 | |
| }, | |
| { | |
| "epoch": 2.5562700964630225, | |
| "grad_norm": 0.6068676978104833, | |
| "learning_rate": 3.204441032151921e-05, | |
| "loss": 0.2364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21523119509220123, | |
| "step": 1590, | |
| "valid_targets_mean": 5741.2, | |
| "valid_targets_min": 2231 | |
| }, | |
| { | |
| "epoch": 2.564308681672026, | |
| "grad_norm": 0.5223510328320911, | |
| "learning_rate": 3.198030075493979e-05, | |
| "loss": 0.2448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26558348536491394, | |
| "step": 1595, | |
| "valid_targets_mean": 5132.1, | |
| "valid_targets_min": 1019 | |
| }, | |
| { | |
| "epoch": 2.572347266881029, | |
| "grad_norm": 0.60222440656527, | |
| "learning_rate": 3.19159986230607e-05, | |
| "loss": 0.2435, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1857927143573761, | |
| "step": 1600, | |
| "valid_targets_mean": 4652.8, | |
| "valid_targets_min": 1007 | |
| }, | |
| { | |
| "epoch": 2.580385852090032, | |
| "grad_norm": 0.45369146995792226, | |
| "learning_rate": 3.185150495944194e-05, | |
| "loss": 0.2128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.203944593667984, | |
| "step": 1605, | |
| "valid_targets_mean": 5606.6, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 2.5884244372990355, | |
| "grad_norm": 0.44158984490213105, | |
| "learning_rate": 3.1786820800722044e-05, | |
| "loss": 0.2177, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2083684355020523, | |
| "step": 1610, | |
| "valid_targets_mean": 5581.2, | |
| "valid_targets_min": 2542 | |
| }, | |
| { | |
| "epoch": 2.5964630225080385, | |
| "grad_norm": 0.3955024073939668, | |
| "learning_rate": 3.1721947186601494e-05, | |
| "loss": 0.2534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23385578393936157, | |
| "step": 1615, | |
| "valid_targets_mean": 6655.3, | |
| "valid_targets_min": 2377 | |
| }, | |
| { | |
| "epoch": 2.604501607717042, | |
| "grad_norm": 0.4448231599721356, | |
| "learning_rate": 3.165688515982599e-05, | |
| "loss": 0.2279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23172247409820557, | |
| "step": 1620, | |
| "valid_targets_mean": 5980.2, | |
| "valid_targets_min": 2938 | |
| }, | |
| { | |
| "epoch": 2.612540192926045, | |
| "grad_norm": 0.43163627602395105, | |
| "learning_rate": 3.1591635766169646e-05, | |
| "loss": 0.2222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21875952184200287, | |
| "step": 1625, | |
| "valid_targets_mean": 6797.9, | |
| "valid_targets_min": 3875 | |
| }, | |
| { | |
| "epoch": 2.620578778135048, | |
| "grad_norm": 0.48035161655460856, | |
| "learning_rate": 3.152620005441826e-05, | |
| "loss": 0.2256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25188302993774414, | |
| "step": 1630, | |
| "valid_targets_mean": 5222.1, | |
| "valid_targets_min": 872 | |
| }, | |
| { | |
| "epoch": 2.6286173633440515, | |
| "grad_norm": 0.47804467495955605, | |
| "learning_rate": 3.146057907635238e-05, | |
| "loss": 0.2344, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23584258556365967, | |
| "step": 1635, | |
| "valid_targets_mean": 5037.3, | |
| "valid_targets_min": 1577 | |
| }, | |
| { | |
| "epoch": 2.6366559485530545, | |
| "grad_norm": 0.49724922962510637, | |
| "learning_rate": 3.1394773886730447e-05, | |
| "loss": 0.217, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21783044934272766, | |
| "step": 1640, | |
| "valid_targets_mean": 5010.2, | |
| "valid_targets_min": 2341 | |
| }, | |
| { | |
| "epoch": 2.644694533762058, | |
| "grad_norm": 0.47360896933063057, | |
| "learning_rate": 3.1328785543271824e-05, | |
| "loss": 0.2479, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23639383912086487, | |
| "step": 1645, | |
| "valid_targets_mean": 5103.1, | |
| "valid_targets_min": 2430 | |
| }, | |
| { | |
| "epoch": 2.652733118971061, | |
| "grad_norm": 0.4425730996552444, | |
| "learning_rate": 3.126261510663979e-05, | |
| "loss": 0.227, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24800364673137665, | |
| "step": 1650, | |
| "valid_targets_mean": 5471.6, | |
| "valid_targets_min": 779 | |
| }, | |
| { | |
| "epoch": 2.660771704180064, | |
| "grad_norm": 0.5066362084521983, | |
| "learning_rate": 3.119626364042451e-05, | |
| "loss": 0.2322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23977752029895782, | |
| "step": 1655, | |
| "valid_targets_mean": 6356.9, | |
| "valid_targets_min": 1168 | |
| }, | |
| { | |
| "epoch": 2.6688102893890675, | |
| "grad_norm": 0.46571003694651886, | |
| "learning_rate": 3.112973221112591e-05, | |
| "loss": 0.2354, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23410023748874664, | |
| "step": 1660, | |
| "valid_targets_mean": 5421.5, | |
| "valid_targets_min": 2427 | |
| }, | |
| { | |
| "epoch": 2.676848874598071, | |
| "grad_norm": 0.49670167200910814, | |
| "learning_rate": 3.106302188813656e-05, | |
| "loss": 0.2352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23411546647548676, | |
| "step": 1665, | |
| "valid_targets_mean": 5066.6, | |
| "valid_targets_min": 902 | |
| }, | |
| { | |
| "epoch": 2.684887459807074, | |
| "grad_norm": 0.4489042276841892, | |
| "learning_rate": 3.099613374372449e-05, | |
| "loss": 0.237, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2577555179595947, | |
| "step": 1670, | |
| "valid_targets_mean": 5970.1, | |
| "valid_targets_min": 3474 | |
| }, | |
| { | |
| "epoch": 2.692926045016077, | |
| "grad_norm": 0.4726499258070017, | |
| "learning_rate": 3.092906885301591e-05, | |
| "loss": 0.2299, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24142785370349884, | |
| "step": 1675, | |
| "valid_targets_mean": 5035.5, | |
| "valid_targets_min": 1983 | |
| }, | |
| { | |
| "epoch": 2.7009646302250805, | |
| "grad_norm": 0.45047970317259495, | |
| "learning_rate": 3.086182829397799e-05, | |
| "loss": 0.2366, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21712270379066467, | |
| "step": 1680, | |
| "valid_targets_mean": 5064.8, | |
| "valid_targets_min": 829 | |
| }, | |
| { | |
| "epoch": 2.7090032154340835, | |
| "grad_norm": 0.41145712760990627, | |
| "learning_rate": 3.07944131474015e-05, | |
| "loss": 0.2422, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19963139295578003, | |
| "step": 1685, | |
| "valid_targets_mean": 5160.9, | |
| "valid_targets_min": 268 | |
| }, | |
| { | |
| "epoch": 2.717041800643087, | |
| "grad_norm": 0.4773923313728519, | |
| "learning_rate": 3.072682449688342e-05, | |
| "loss": 0.2391, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24405084550380707, | |
| "step": 1690, | |
| "valid_targets_mean": 5581.8, | |
| "valid_targets_min": 2920 | |
| }, | |
| { | |
| "epoch": 2.72508038585209, | |
| "grad_norm": 0.4376554948889786, | |
| "learning_rate": 3.0659063428809574e-05, | |
| "loss": 0.2225, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2001868635416031, | |
| "step": 1695, | |
| "valid_targets_mean": 5621.9, | |
| "valid_targets_min": 2576 | |
| }, | |
| { | |
| "epoch": 2.733118971061093, | |
| "grad_norm": 0.43672581570188307, | |
| "learning_rate": 3.059113103233711e-05, | |
| "loss": 0.2354, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20036949217319489, | |
| "step": 1700, | |
| "valid_targets_mean": 5967.9, | |
| "valid_targets_min": 1912 | |
| }, | |
| { | |
| "epoch": 2.7411575562700965, | |
| "grad_norm": 0.4890236832710162, | |
| "learning_rate": 3.052302839937706e-05, | |
| "loss": 0.2239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23596787452697754, | |
| "step": 1705, | |
| "valid_targets_mean": 5446.7, | |
| "valid_targets_min": 3308 | |
| }, | |
| { | |
| "epoch": 2.7491961414790995, | |
| "grad_norm": 0.4635500560051159, | |
| "learning_rate": 3.045475662457671e-05, | |
| "loss": 0.2384, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24793817102909088, | |
| "step": 1710, | |
| "valid_targets_mean": 5555.2, | |
| "valid_targets_min": 1492 | |
| }, | |
| { | |
| "epoch": 2.757234726688103, | |
| "grad_norm": 0.4427640364687355, | |
| "learning_rate": 3.0386316805302064e-05, | |
| "loss": 0.2441, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22127273678779602, | |
| "step": 1715, | |
| "valid_targets_mean": 5390.1, | |
| "valid_targets_min": 2559 | |
| }, | |
| { | |
| "epoch": 2.765273311897106, | |
| "grad_norm": 0.4442190835436434, | |
| "learning_rate": 3.0317710041620194e-05, | |
| "loss": 0.2289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2600225806236267, | |
| "step": 1720, | |
| "valid_targets_mean": 6436.2, | |
| "valid_targets_min": 675 | |
| }, | |
| { | |
| "epoch": 2.7733118971061095, | |
| "grad_norm": 0.42238676851093976, | |
| "learning_rate": 3.024893743628154e-05, | |
| "loss": 0.2208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20658443868160248, | |
| "step": 1725, | |
| "valid_targets_mean": 5774.4, | |
| "valid_targets_min": 1262 | |
| }, | |
| { | |
| "epoch": 2.7813504823151125, | |
| "grad_norm": 0.42021767660168513, | |
| "learning_rate": 3.018000009470222e-05, | |
| "loss": 0.2234, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21164314448833466, | |
| "step": 1730, | |
| "valid_targets_mean": 5540.8, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 2.789389067524116, | |
| "grad_norm": 0.4630792997560636, | |
| "learning_rate": 3.0110899124946203e-05, | |
| "loss": 0.2401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24036768078804016, | |
| "step": 1735, | |
| "valid_targets_mean": 5809.8, | |
| "valid_targets_min": 2003 | |
| }, | |
| { | |
| "epoch": 2.797427652733119, | |
| "grad_norm": 0.5390343017669323, | |
| "learning_rate": 3.004163563770756e-05, | |
| "loss": 0.237, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18927526473999023, | |
| "step": 1740, | |
| "valid_targets_mean": 5524.0, | |
| "valid_targets_min": 413 | |
| }, | |
| { | |
| "epoch": 2.805466237942122, | |
| "grad_norm": 0.506037679439928, | |
| "learning_rate": 2.9972210746292593e-05, | |
| "loss": 0.2373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2753189206123352, | |
| "step": 1745, | |
| "valid_targets_mean": 5811.8, | |
| "valid_targets_min": 1503 | |
| }, | |
| { | |
| "epoch": 2.8135048231511255, | |
| "grad_norm": 0.4477668704765513, | |
| "learning_rate": 2.9902625566601896e-05, | |
| "loss": 0.2532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2465018332004547, | |
| "step": 1750, | |
| "valid_targets_mean": 5265.3, | |
| "valid_targets_min": 677 | |
| }, | |
| { | |
| "epoch": 2.8215434083601285, | |
| "grad_norm": 0.43112747090224235, | |
| "learning_rate": 2.9832881217112507e-05, | |
| "loss": 0.2195, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23881378769874573, | |
| "step": 1755, | |
| "valid_targets_mean": 5734.8, | |
| "valid_targets_min": 2236 | |
| }, | |
| { | |
| "epoch": 2.829581993569132, | |
| "grad_norm": 0.43331469697490993, | |
| "learning_rate": 2.9762978818859828e-05, | |
| "loss": 0.2508, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25261190533638, | |
| "step": 1760, | |
| "valid_targets_mean": 6195.2, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 2.837620578778135, | |
| "grad_norm": 0.49846352204421746, | |
| "learning_rate": 2.9692919495419698e-05, | |
| "loss": 0.2239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.191496804356575, | |
| "step": 1765, | |
| "valid_targets_mean": 5438.2, | |
| "valid_targets_min": 793 | |
| }, | |
| { | |
| "epoch": 2.845659163987138, | |
| "grad_norm": 0.40728999160669044, | |
| "learning_rate": 2.9622704372890268e-05, | |
| "loss": 0.2198, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2332020401954651, | |
| "step": 1770, | |
| "valid_targets_mean": 6247.6, | |
| "valid_targets_min": 1709 | |
| }, | |
| { | |
| "epoch": 2.8536977491961415, | |
| "grad_norm": 0.44070923563948255, | |
| "learning_rate": 2.955233457987393e-05, | |
| "loss": 0.2428, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27117034792900085, | |
| "step": 1775, | |
| "valid_targets_mean": 6259.8, | |
| "valid_targets_min": 3163 | |
| }, | |
| { | |
| "epoch": 2.861736334405145, | |
| "grad_norm": 0.46843718336492085, | |
| "learning_rate": 2.9481811247459174e-05, | |
| "loss": 0.2283, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20232127606868744, | |
| "step": 1780, | |
| "valid_targets_mean": 4301.2, | |
| "valid_targets_min": 877 | |
| }, | |
| { | |
| "epoch": 2.869774919614148, | |
| "grad_norm": 0.4811831657791136, | |
| "learning_rate": 2.9411135509202407e-05, | |
| "loss": 0.2415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22563835978507996, | |
| "step": 1785, | |
| "valid_targets_mean": 4858.6, | |
| "valid_targets_min": 297 | |
| }, | |
| { | |
| "epoch": 2.877813504823151, | |
| "grad_norm": 0.4776739267077266, | |
| "learning_rate": 2.9340308501109726e-05, | |
| "loss": 0.2352, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25109878182411194, | |
| "step": 1790, | |
| "valid_targets_mean": 4963.9, | |
| "valid_targets_min": 921 | |
| }, | |
| { | |
| "epoch": 2.8858520900321545, | |
| "grad_norm": 0.4552374822125526, | |
| "learning_rate": 2.926933136161866e-05, | |
| "loss": 0.2522, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2678318917751312, | |
| "step": 1795, | |
| "valid_targets_mean": 6130.4, | |
| "valid_targets_min": 1699 | |
| }, | |
| { | |
| "epoch": 2.8938906752411575, | |
| "grad_norm": 0.4998597785043118, | |
| "learning_rate": 2.919820523157988e-05, | |
| "loss": 0.2373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2558537721633911, | |
| "step": 1800, | |
| "valid_targets_mean": 5725.7, | |
| "valid_targets_min": 1049 | |
| }, | |
| { | |
| "epoch": 2.901929260450161, | |
| "grad_norm": 0.3985526663408574, | |
| "learning_rate": 2.9126931254238856e-05, | |
| "loss": 0.2293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22993230819702148, | |
| "step": 1805, | |
| "valid_targets_mean": 5944.6, | |
| "valid_targets_min": 2174 | |
| }, | |
| { | |
| "epoch": 2.909967845659164, | |
| "grad_norm": 0.4328617676889273, | |
| "learning_rate": 2.905551057521748e-05, | |
| "loss": 0.2493, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24856430292129517, | |
| "step": 1810, | |
| "valid_targets_mean": 5760.9, | |
| "valid_targets_min": 726 | |
| }, | |
| { | |
| "epoch": 2.918006430868167, | |
| "grad_norm": 0.4542317521238959, | |
| "learning_rate": 2.898394434249565e-05, | |
| "loss": 0.2401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21676376461982727, | |
| "step": 1815, | |
| "valid_targets_mean": 5277.0, | |
| "valid_targets_min": 1775 | |
| }, | |
| { | |
| "epoch": 2.9260450160771705, | |
| "grad_norm": 0.4960973232998586, | |
| "learning_rate": 2.8912233706392824e-05, | |
| "loss": 0.2118, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19444525241851807, | |
| "step": 1820, | |
| "valid_targets_mean": 4628.3, | |
| "valid_targets_min": 381 | |
| }, | |
| { | |
| "epoch": 2.9340836012861735, | |
| "grad_norm": 0.4351299120886839, | |
| "learning_rate": 2.8840379819549515e-05, | |
| "loss": 0.2476, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22263455390930176, | |
| "step": 1825, | |
| "valid_targets_mean": 5299.7, | |
| "valid_targets_min": 558 | |
| }, | |
| { | |
| "epoch": 2.942122186495177, | |
| "grad_norm": 0.47748132373633034, | |
| "learning_rate": 2.8768383836908804e-05, | |
| "loss": 0.2191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21326063573360443, | |
| "step": 1830, | |
| "valid_targets_mean": 4476.9, | |
| "valid_targets_min": 935 | |
| }, | |
| { | |
| "epoch": 2.95016077170418, | |
| "grad_norm": 0.42660596230377495, | |
| "learning_rate": 2.8696246915697714e-05, | |
| "loss": 0.2137, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2128058224916458, | |
| "step": 1835, | |
| "valid_targets_mean": 5964.4, | |
| "valid_targets_min": 2572 | |
| }, | |
| { | |
| "epoch": 2.958199356913183, | |
| "grad_norm": 0.4506785970078862, | |
| "learning_rate": 2.862397021540867e-05, | |
| "loss": 0.2563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26402390003204346, | |
| "step": 1840, | |
| "valid_targets_mean": 5856.1, | |
| "valid_targets_min": 2753 | |
| }, | |
| { | |
| "epoch": 2.9662379421221865, | |
| "grad_norm": 0.4391481347976392, | |
| "learning_rate": 2.855155489778082e-05, | |
| "loss": 0.2341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2654515504837036, | |
| "step": 1845, | |
| "valid_targets_mean": 5673.1, | |
| "valid_targets_min": 1145 | |
| }, | |
| { | |
| "epoch": 2.97427652733119, | |
| "grad_norm": 0.4582126351879179, | |
| "learning_rate": 2.8479002126781383e-05, | |
| "loss": 0.2386, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24582958221435547, | |
| "step": 1850, | |
| "valid_targets_mean": 5058.6, | |
| "valid_targets_min": 575 | |
| }, | |
| { | |
| "epoch": 2.982315112540193, | |
| "grad_norm": 0.4291403858281957, | |
| "learning_rate": 2.8406313068586926e-05, | |
| "loss": 0.2339, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2569013833999634, | |
| "step": 1855, | |
| "valid_targets_mean": 6363.9, | |
| "valid_targets_min": 1189 | |
| }, | |
| { | |
| "epoch": 2.990353697749196, | |
| "grad_norm": 0.4008457746324607, | |
| "learning_rate": 2.8333488891564647e-05, | |
| "loss": 0.2334, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19619405269622803, | |
| "step": 1860, | |
| "valid_targets_mean": 5793.0, | |
| "valid_targets_min": 2157 | |
| }, | |
| { | |
| "epoch": 2.9983922829581995, | |
| "grad_norm": 0.42930616673946925, | |
| "learning_rate": 2.8260530766253557e-05, | |
| "loss": 0.2419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2623644769191742, | |
| "step": 1865, | |
| "valid_targets_mean": 6299.9, | |
| "valid_targets_min": 3015 | |
| }, | |
| { | |
| "epoch": 3.0064308681672025, | |
| "grad_norm": 0.38344200812668733, | |
| "learning_rate": 2.8187439865345685e-05, | |
| "loss": 0.2112, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20248943567276, | |
| "step": 1870, | |
| "valid_targets_mean": 6622.2, | |
| "valid_targets_min": 2228 | |
| }, | |
| { | |
| "epoch": 3.014469453376206, | |
| "grad_norm": 0.46103181792250764, | |
| "learning_rate": 2.8114217363667246e-05, | |
| "loss": 0.2079, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20105582475662231, | |
| "step": 1875, | |
| "valid_targets_mean": 5312.6, | |
| "valid_targets_min": 2085 | |
| }, | |
| { | |
| "epoch": 3.022508038585209, | |
| "grad_norm": 0.4647259314919432, | |
| "learning_rate": 2.8040864438159707e-05, | |
| "loss": 0.207, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20135775208473206, | |
| "step": 1880, | |
| "valid_targets_mean": 4949.4, | |
| "valid_targets_min": 1507 | |
| }, | |
| { | |
| "epoch": 3.030546623794212, | |
| "grad_norm": 0.4085250192487578, | |
| "learning_rate": 2.7967382267860946e-05, | |
| "loss": 0.2202, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23754431307315826, | |
| "step": 1885, | |
| "valid_targets_mean": 7139.6, | |
| "valid_targets_min": 1003 | |
| }, | |
| { | |
| "epoch": 3.0385852090032155, | |
| "grad_norm": 0.5510015583580707, | |
| "learning_rate": 2.7893772033886208e-05, | |
| "loss": 0.2098, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20717856287956238, | |
| "step": 1890, | |
| "valid_targets_mean": 5289.0, | |
| "valid_targets_min": 880 | |
| }, | |
| { | |
| "epoch": 3.0466237942122185, | |
| "grad_norm": 0.47108051497284203, | |
| "learning_rate": 2.7820034919409203e-05, | |
| "loss": 0.2059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20311778783798218, | |
| "step": 1895, | |
| "valid_targets_mean": 4827.5, | |
| "valid_targets_min": 567 | |
| }, | |
| { | |
| "epoch": 3.054662379421222, | |
| "grad_norm": 0.4687658440412486, | |
| "learning_rate": 2.774617210964305e-05, | |
| "loss": 0.2047, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22327911853790283, | |
| "step": 1900, | |
| "valid_targets_mean": 6458.3, | |
| "valid_targets_min": 1909 | |
| }, | |
| { | |
| "epoch": 3.062700964630225, | |
| "grad_norm": 0.5317314845653304, | |
| "learning_rate": 2.767218479182121e-05, | |
| "loss": 0.2208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28799089789390564, | |
| "step": 1905, | |
| "valid_targets_mean": 5339.3, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 3.0707395498392285, | |
| "grad_norm": 0.4479836422167699, | |
| "learning_rate": 2.7598074155178447e-05, | |
| "loss": 0.2075, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23213958740234375, | |
| "step": 1910, | |
| "valid_targets_mean": 6368.8, | |
| "valid_targets_min": 2232 | |
| }, | |
| { | |
| "epoch": 3.0787781350482315, | |
| "grad_norm": 0.4879671474173943, | |
| "learning_rate": 2.7523841390931676e-05, | |
| "loss": 0.2158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19620759785175323, | |
| "step": 1915, | |
| "valid_targets_mean": 5519.9, | |
| "valid_targets_min": 2982 | |
| }, | |
| { | |
| "epoch": 3.0868167202572345, | |
| "grad_norm": 0.576805679083638, | |
| "learning_rate": 2.7449487692260834e-05, | |
| "loss": 0.2368, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2544819414615631, | |
| "step": 1920, | |
| "valid_targets_mean": 4490.9, | |
| "valid_targets_min": 687 | |
| }, | |
| { | |
| "epoch": 3.094855305466238, | |
| "grad_norm": 0.4971668089459153, | |
| "learning_rate": 2.7375014254289695e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1839742660522461, | |
| "step": 1925, | |
| "valid_targets_mean": 4795.8, | |
| "valid_targets_min": 1833 | |
| }, | |
| { | |
| "epoch": 3.102893890675241, | |
| "grad_norm": 0.4687617088950276, | |
| "learning_rate": 2.7300422274066672e-05, | |
| "loss": 0.2028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2087346613407135, | |
| "step": 1930, | |
| "valid_targets_mean": 5627.9, | |
| "valid_targets_min": 2985 | |
| }, | |
| { | |
| "epoch": 3.1109324758842445, | |
| "grad_norm": 0.4927605336472376, | |
| "learning_rate": 2.722571295054554e-05, | |
| "loss": 0.2321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2347278743982315, | |
| "step": 1935, | |
| "valid_targets_mean": 5060.4, | |
| "valid_targets_min": 371 | |
| }, | |
| { | |
| "epoch": 3.1189710610932475, | |
| "grad_norm": 0.4987134488018747, | |
| "learning_rate": 2.715088748456622e-05, | |
| "loss": 0.2347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23042768239974976, | |
| "step": 1940, | |
| "valid_targets_mean": 5561.6, | |
| "valid_targets_min": 550 | |
| }, | |
| { | |
| "epoch": 3.127009646302251, | |
| "grad_norm": 0.523421193954538, | |
| "learning_rate": 2.7075947078835438e-05, | |
| "loss": 0.2135, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22954672574996948, | |
| "step": 1945, | |
| "valid_targets_mean": 4881.9, | |
| "valid_targets_min": 1394 | |
| }, | |
| { | |
| "epoch": 3.135048231511254, | |
| "grad_norm": 0.43883645279150485, | |
| "learning_rate": 2.7000892937907398e-05, | |
| "loss": 0.1886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19516025483608246, | |
| "step": 1950, | |
| "valid_targets_mean": 5517.6, | |
| "valid_targets_min": 3251 | |
| }, | |
| { | |
| "epoch": 3.143086816720257, | |
| "grad_norm": 0.4570199670863005, | |
| "learning_rate": 2.6925726268164427e-05, | |
| "loss": 0.2222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24825003743171692, | |
| "step": 1955, | |
| "valid_targets_mean": 6280.4, | |
| "valid_targets_min": 2112 | |
| }, | |
| { | |
| "epoch": 3.1511254019292605, | |
| "grad_norm": 0.45672754180296316, | |
| "learning_rate": 2.685044827779759e-05, | |
| "loss": 0.2103, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20939724147319794, | |
| "step": 1960, | |
| "valid_targets_mean": 6016.7, | |
| "valid_targets_min": 2771 | |
| }, | |
| { | |
| "epoch": 3.1591639871382635, | |
| "grad_norm": 0.47906774477386277, | |
| "learning_rate": 2.6775060176787256e-05, | |
| "loss": 0.2322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.281923770904541, | |
| "step": 1965, | |
| "valid_targets_mean": 6699.8, | |
| "valid_targets_min": 2442 | |
| }, | |
| { | |
| "epoch": 3.167202572347267, | |
| "grad_norm": 0.39767857581545424, | |
| "learning_rate": 2.6699563176883664e-05, | |
| "loss": 0.2013, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2029469907283783, | |
| "step": 1970, | |
| "valid_targets_mean": 6916.5, | |
| "valid_targets_min": 1302 | |
| }, | |
| { | |
| "epoch": 3.17524115755627, | |
| "grad_norm": 0.4425799974084158, | |
| "learning_rate": 2.6623958491587446e-05, | |
| "loss": 0.217, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20335035026073456, | |
| "step": 1975, | |
| "valid_targets_mean": 5785.0, | |
| "valid_targets_min": 2366 | |
| }, | |
| { | |
| "epoch": 3.1832797427652735, | |
| "grad_norm": 0.44544395721654917, | |
| "learning_rate": 2.654824733613008e-05, | |
| "loss": 0.2085, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21528367698192596, | |
| "step": 1980, | |
| "valid_targets_mean": 5611.8, | |
| "valid_targets_min": 2466 | |
| }, | |
| { | |
| "epoch": 3.1913183279742765, | |
| "grad_norm": 0.4460608523219643, | |
| "learning_rate": 2.6472430927454436e-05, | |
| "loss": 0.2128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20754650235176086, | |
| "step": 1985, | |
| "valid_targets_mean": 7354.8, | |
| "valid_targets_min": 3155 | |
| }, | |
| { | |
| "epoch": 3.19935691318328, | |
| "grad_norm": 0.42210886653384705, | |
| "learning_rate": 2.6396510484195137e-05, | |
| "loss": 0.2185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2087785005569458, | |
| "step": 1990, | |
| "valid_targets_mean": 6236.5, | |
| "valid_targets_min": 2469 | |
| }, | |
| { | |
| "epoch": 3.207395498392283, | |
| "grad_norm": 0.43096075271764295, | |
| "learning_rate": 2.6320487226659025e-05, | |
| "loss": 0.1979, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16838577389717102, | |
| "step": 1995, | |
| "valid_targets_mean": 5186.6, | |
| "valid_targets_min": 1676 | |
| }, | |
| { | |
| "epoch": 3.215434083601286, | |
| "grad_norm": 0.4508053623757557, | |
| "learning_rate": 2.6244362376805526e-05, | |
| "loss": 0.2267, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23479191958904266, | |
| "step": 2000, | |
| "valid_targets_mean": 5779.5, | |
| "valid_targets_min": 763 | |
| }, | |
| { | |
| "epoch": 3.2234726688102895, | |
| "grad_norm": 0.4628118815083852, | |
| "learning_rate": 2.616813715822699e-05, | |
| "loss": 0.2118, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22613340616226196, | |
| "step": 2005, | |
| "valid_targets_mean": 5524.2, | |
| "valid_targets_min": 1492 | |
| }, | |
| { | |
| "epoch": 3.2315112540192925, | |
| "grad_norm": 0.4720924761087141, | |
| "learning_rate": 2.609181279612907e-05, | |
| "loss": 0.2166, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24971206486225128, | |
| "step": 2010, | |
| "valid_targets_mean": 5421.2, | |
| "valid_targets_min": 686 | |
| }, | |
| { | |
| "epoch": 3.239549839228296, | |
| "grad_norm": 0.4737698035846392, | |
| "learning_rate": 2.601539051731098e-05, | |
| "loss": 0.2059, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2119787633419037, | |
| "step": 2015, | |
| "valid_targets_mean": 5191.6, | |
| "valid_targets_min": 1255 | |
| }, | |
| { | |
| "epoch": 3.247588424437299, | |
| "grad_norm": 0.4572117282020224, | |
| "learning_rate": 2.5938871550145813e-05, | |
| "loss": 0.2119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1961323767900467, | |
| "step": 2020, | |
| "valid_targets_mean": 5815.9, | |
| "valid_targets_min": 2510 | |
| }, | |
| { | |
| "epoch": 3.255627009646302, | |
| "grad_norm": 0.43770534537636197, | |
| "learning_rate": 2.5862257124560776e-05, | |
| "loss": 0.2141, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19521623849868774, | |
| "step": 2025, | |
| "valid_targets_mean": 4878.7, | |
| "valid_targets_min": 359 | |
| }, | |
| { | |
| "epoch": 3.2636655948553055, | |
| "grad_norm": 0.5133043126611774, | |
| "learning_rate": 2.5785548472017428e-05, | |
| "loss": 0.2182, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22816744446754456, | |
| "step": 2030, | |
| "valid_targets_mean": 5207.1, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 3.2717041800643085, | |
| "grad_norm": 0.45988818406168736, | |
| "learning_rate": 2.5708746825491872e-05, | |
| "loss": 0.206, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2291097640991211, | |
| "step": 2035, | |
| "valid_targets_mean": 5681.0, | |
| "valid_targets_min": 1539 | |
| }, | |
| { | |
| "epoch": 3.279742765273312, | |
| "grad_norm": 0.4942920353232446, | |
| "learning_rate": 2.5631853419454963e-05, | |
| "loss": 0.2219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19933205842971802, | |
| "step": 2040, | |
| "valid_targets_mean": 4273.0, | |
| "valid_targets_min": 312 | |
| }, | |
| { | |
| "epoch": 3.287781350482315, | |
| "grad_norm": 0.4710297159118358, | |
| "learning_rate": 2.5554869489852462e-05, | |
| "loss": 0.2241, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25016915798187256, | |
| "step": 2045, | |
| "valid_targets_mean": 6010.7, | |
| "valid_targets_min": 3356 | |
| }, | |
| { | |
| "epoch": 3.2958199356913185, | |
| "grad_norm": 0.4329815260262049, | |
| "learning_rate": 2.5477796274085128e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19916954636573792, | |
| "step": 2050, | |
| "valid_targets_mean": 5431.9, | |
| "valid_targets_min": 477 | |
| }, | |
| { | |
| "epoch": 3.3038585209003215, | |
| "grad_norm": 0.4382594314534806, | |
| "learning_rate": 2.540063501098889e-05, | |
| "loss": 0.2117, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22328132390975952, | |
| "step": 2055, | |
| "valid_targets_mean": 7057.8, | |
| "valid_targets_min": 2434 | |
| }, | |
| { | |
| "epoch": 3.311897106109325, | |
| "grad_norm": 0.4304909069133098, | |
| "learning_rate": 2.5323386940814894e-05, | |
| "loss": 0.222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20800434052944183, | |
| "step": 2060, | |
| "valid_targets_mean": 5571.2, | |
| "valid_targets_min": 2064 | |
| }, | |
| { | |
| "epoch": 3.319935691318328, | |
| "grad_norm": 0.45761114861789215, | |
| "learning_rate": 2.5246053305209576e-05, | |
| "loss": 0.2224, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22320008277893066, | |
| "step": 2065, | |
| "valid_targets_mean": 5810.9, | |
| "valid_targets_min": 2496 | |
| }, | |
| { | |
| "epoch": 3.327974276527331, | |
| "grad_norm": 0.5109891719100492, | |
| "learning_rate": 2.516863534719472e-05, | |
| "loss": 0.2197, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2085902988910675, | |
| "step": 2070, | |
| "valid_targets_mean": 5335.4, | |
| "valid_targets_min": 2868 | |
| }, | |
| { | |
| "epoch": 3.3360128617363345, | |
| "grad_norm": 0.46691815612294274, | |
| "learning_rate": 2.509113431114745e-05, | |
| "loss": 0.2278, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18727019429206848, | |
| "step": 2075, | |
| "valid_targets_mean": 5446.6, | |
| "valid_targets_min": 686 | |
| }, | |
| { | |
| "epoch": 3.3440514469453375, | |
| "grad_norm": 0.4900335050961584, | |
| "learning_rate": 2.5013551442780246e-05, | |
| "loss": 0.2237, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26614588499069214, | |
| "step": 2080, | |
| "valid_targets_mean": 5034.6, | |
| "valid_targets_min": 1621 | |
| }, | |
| { | |
| "epoch": 3.352090032154341, | |
| "grad_norm": 0.44650493165960287, | |
| "learning_rate": 2.493588798912094e-05, | |
| "loss": 0.1943, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1934056580066681, | |
| "step": 2085, | |
| "valid_targets_mean": 6115.8, | |
| "valid_targets_min": 2803 | |
| }, | |
| { | |
| "epoch": 3.360128617363344, | |
| "grad_norm": 0.4994390565400204, | |
| "learning_rate": 2.485814519849262e-05, | |
| "loss": 0.2142, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22949208319187164, | |
| "step": 2090, | |
| "valid_targets_mean": 4577.1, | |
| "valid_targets_min": 1168 | |
| }, | |
| { | |
| "epoch": 3.368167202572347, | |
| "grad_norm": 0.4260321258743967, | |
| "learning_rate": 2.478032432049362e-05, | |
| "loss": 0.2007, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21895253658294678, | |
| "step": 2095, | |
| "valid_targets_mean": 5931.0, | |
| "valid_targets_min": 2107 | |
| }, | |
| { | |
| "epoch": 3.3762057877813505, | |
| "grad_norm": 0.4504678210337142, | |
| "learning_rate": 2.470242660597741e-05, | |
| "loss": 0.2078, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22290824353694916, | |
| "step": 2100, | |
| "valid_targets_mean": 5929.8, | |
| "valid_targets_min": 2326 | |
| }, | |
| { | |
| "epoch": 3.3842443729903535, | |
| "grad_norm": 0.41031164293445443, | |
| "learning_rate": 2.4624453307032475e-05, | |
| "loss": 0.2227, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23696118593215942, | |
| "step": 2105, | |
| "valid_targets_mean": 6418.2, | |
| "valid_targets_min": 1112 | |
| }, | |
| { | |
| "epoch": 3.392282958199357, | |
| "grad_norm": 0.48502919577531867, | |
| "learning_rate": 2.4546405676962212e-05, | |
| "loss": 0.2144, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2700331211090088, | |
| "step": 2110, | |
| "valid_targets_mean": 4992.6, | |
| "valid_targets_min": 955 | |
| }, | |
| { | |
| "epoch": 3.40032154340836, | |
| "grad_norm": 0.4229844699525625, | |
| "learning_rate": 2.4468284970264796e-05, | |
| "loss": 0.2155, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18260376155376434, | |
| "step": 2115, | |
| "valid_targets_mean": 5966.1, | |
| "valid_targets_min": 2274 | |
| }, | |
| { | |
| "epoch": 3.4083601286173635, | |
| "grad_norm": 0.4984857041447722, | |
| "learning_rate": 2.4390092442612977e-05, | |
| "loss": 0.2244, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23534205555915833, | |
| "step": 2120, | |
| "valid_targets_mean": 4669.6, | |
| "valid_targets_min": 1543 | |
| }, | |
| { | |
| "epoch": 3.4163987138263665, | |
| "grad_norm": 0.4859235898941334, | |
| "learning_rate": 2.4311829350833927e-05, | |
| "loss": 0.2135, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19799849390983582, | |
| "step": 2125, | |
| "valid_targets_mean": 6246.9, | |
| "valid_targets_min": 493 | |
| }, | |
| { | |
| "epoch": 3.42443729903537, | |
| "grad_norm": 0.4690768364402796, | |
| "learning_rate": 2.423349695288904e-05, | |
| "loss": 0.2226, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2516707181930542, | |
| "step": 2130, | |
| "valid_targets_mean": 5483.4, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 3.432475884244373, | |
| "grad_norm": 0.44024466308006444, | |
| "learning_rate": 2.4155096507853668e-05, | |
| "loss": 0.2241, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2450716495513916, | |
| "step": 2135, | |
| "valid_targets_mean": 6793.9, | |
| "valid_targets_min": 2403 | |
| }, | |
| { | |
| "epoch": 3.440514469453376, | |
| "grad_norm": 0.49255880868641494, | |
| "learning_rate": 2.407662927589697e-05, | |
| "loss": 0.2209, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27766314148902893, | |
| "step": 2140, | |
| "valid_targets_mean": 5863.6, | |
| "valid_targets_min": 1504 | |
| }, | |
| { | |
| "epoch": 3.4485530546623795, | |
| "grad_norm": 0.49652597672181636, | |
| "learning_rate": 2.3998096518261563e-05, | |
| "loss": 0.2118, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21233215928077698, | |
| "step": 2145, | |
| "valid_targets_mean": 5722.1, | |
| "valid_targets_min": 2198 | |
| }, | |
| { | |
| "epoch": 3.4565916398713825, | |
| "grad_norm": 0.4508380257265703, | |
| "learning_rate": 2.39194994972433e-05, | |
| "loss": 0.2153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24831700325012207, | |
| "step": 2150, | |
| "valid_targets_mean": 5750.8, | |
| "valid_targets_min": 707 | |
| }, | |
| { | |
| "epoch": 3.464630225080386, | |
| "grad_norm": 0.505570603028868, | |
| "learning_rate": 2.3840839476170986e-05, | |
| "loss": 0.2341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2556144595146179, | |
| "step": 2155, | |
| "valid_targets_mean": 5813.5, | |
| "valid_targets_min": 1052 | |
| }, | |
| { | |
| "epoch": 3.472668810289389, | |
| "grad_norm": 0.5080212244608979, | |
| "learning_rate": 2.3762117719386035e-05, | |
| "loss": 0.2182, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22166283428668976, | |
| "step": 2160, | |
| "valid_targets_mean": 5062.8, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 3.4807073954983925, | |
| "grad_norm": 0.47342379469363194, | |
| "learning_rate": 2.368333549222219e-05, | |
| "loss": 0.2371, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23823466897010803, | |
| "step": 2165, | |
| "valid_targets_mean": 5316.8, | |
| "valid_targets_min": 1762 | |
| }, | |
| { | |
| "epoch": 3.4887459807073955, | |
| "grad_norm": 0.5419798446391779, | |
| "learning_rate": 2.3604494060985147e-05, | |
| "loss": 0.2033, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24541357159614563, | |
| "step": 2170, | |
| "valid_targets_mean": 5685.3, | |
| "valid_targets_min": 1966 | |
| }, | |
| { | |
| "epoch": 3.4967845659163985, | |
| "grad_norm": 0.4519419903196845, | |
| "learning_rate": 2.352559469293223e-05, | |
| "loss": 0.191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21435996890068054, | |
| "step": 2175, | |
| "valid_targets_mean": 5869.8, | |
| "valid_targets_min": 2068 | |
| }, | |
| { | |
| "epoch": 3.504823151125402, | |
| "grad_norm": 0.6041112174173733, | |
| "learning_rate": 2.3446638656252007e-05, | |
| "loss": 0.228, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20639154314994812, | |
| "step": 2180, | |
| "valid_targets_mean": 4280.1, | |
| "valid_targets_min": 831 | |
| }, | |
| { | |
| "epoch": 3.512861736334405, | |
| "grad_norm": 0.438228796481592, | |
| "learning_rate": 2.336762722004391e-05, | |
| "loss": 0.2177, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19948528707027435, | |
| "step": 2185, | |
| "valid_targets_mean": 5539.8, | |
| "valid_targets_min": 842 | |
| }, | |
| { | |
| "epoch": 3.5209003215434085, | |
| "grad_norm": 0.4540829032794475, | |
| "learning_rate": 2.328856165429783e-05, | |
| "loss": 0.2028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2040947824716568, | |
| "step": 2190, | |
| "valid_targets_mean": 5811.5, | |
| "valid_targets_min": 2452 | |
| }, | |
| { | |
| "epoch": 3.5289389067524115, | |
| "grad_norm": 0.41836024915690306, | |
| "learning_rate": 2.3209443229873713e-05, | |
| "loss": 0.1978, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16951392590999603, | |
| "step": 2195, | |
| "valid_targets_mean": 6125.7, | |
| "valid_targets_min": 3390 | |
| }, | |
| { | |
| "epoch": 3.536977491961415, | |
| "grad_norm": 0.4850570987480443, | |
| "learning_rate": 2.313027321848113e-05, | |
| "loss": 0.2072, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20990464091300964, | |
| "step": 2200, | |
| "valid_targets_mean": 5732.7, | |
| "valid_targets_min": 364 | |
| }, | |
| { | |
| "epoch": 3.545016077170418, | |
| "grad_norm": 0.43896441152646415, | |
| "learning_rate": 2.3051052892658827e-05, | |
| "loss": 0.203, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21642711758613586, | |
| "step": 2205, | |
| "valid_targets_mean": 5755.6, | |
| "valid_targets_min": 2773 | |
| }, | |
| { | |
| "epoch": 3.553054662379421, | |
| "grad_norm": 0.4709870338495115, | |
| "learning_rate": 2.297178352575429e-05, | |
| "loss": 0.2304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21805457770824432, | |
| "step": 2210, | |
| "valid_targets_mean": 5804.4, | |
| "valid_targets_min": 983 | |
| }, | |
| { | |
| "epoch": 3.5610932475884245, | |
| "grad_norm": 0.44572679930140774, | |
| "learning_rate": 2.2892466391903258e-05, | |
| "loss": 0.2159, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2034660279750824, | |
| "step": 2215, | |
| "valid_targets_mean": 5561.6, | |
| "valid_targets_min": 717 | |
| }, | |
| { | |
| "epoch": 3.5691318327974275, | |
| "grad_norm": 0.454325434835351, | |
| "learning_rate": 2.2813102766009254e-05, | |
| "loss": 0.2106, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22672885656356812, | |
| "step": 2220, | |
| "valid_targets_mean": 5497.9, | |
| "valid_targets_min": 505 | |
| }, | |
| { | |
| "epoch": 3.577170418006431, | |
| "grad_norm": 0.48626995312563503, | |
| "learning_rate": 2.2733693923723097e-05, | |
| "loss": 0.2133, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20194870233535767, | |
| "step": 2225, | |
| "valid_targets_mean": 5243.6, | |
| "valid_targets_min": 867 | |
| }, | |
| { | |
| "epoch": 3.585209003215434, | |
| "grad_norm": 0.4445447090227905, | |
| "learning_rate": 2.2654241141422386e-05, | |
| "loss": 0.226, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2123812437057495, | |
| "step": 2230, | |
| "valid_targets_mean": 4860.9, | |
| "valid_targets_min": 982 | |
| }, | |
| { | |
| "epoch": 3.593247588424437, | |
| "grad_norm": 0.6282387823578006, | |
| "learning_rate": 2.257474569619097e-05, | |
| "loss": 0.211, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18385881185531616, | |
| "step": 2235, | |
| "valid_targets_mean": 5456.4, | |
| "valid_targets_min": 747 | |
| }, | |
| { | |
| "epoch": 3.6012861736334405, | |
| "grad_norm": 0.4661679781841043, | |
| "learning_rate": 2.2495208865798484e-05, | |
| "loss": 0.2323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2516222596168518, | |
| "step": 2240, | |
| "valid_targets_mean": 5818.1, | |
| "valid_targets_min": 2430 | |
| }, | |
| { | |
| "epoch": 3.609324758842444, | |
| "grad_norm": 0.5063343638471681, | |
| "learning_rate": 2.241563192867974e-05, | |
| "loss": 0.2262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2693856358528137, | |
| "step": 2245, | |
| "valid_targets_mean": 5781.2, | |
| "valid_targets_min": 977 | |
| }, | |
| { | |
| "epoch": 3.617363344051447, | |
| "grad_norm": 0.46853096205787864, | |
| "learning_rate": 2.233601616391419e-05, | |
| "loss": 0.2119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19709277153015137, | |
| "step": 2250, | |
| "valid_targets_mean": 4633.3, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 3.62540192926045, | |
| "grad_norm": 0.47453245783445847, | |
| "learning_rate": 2.2256362851205423e-05, | |
| "loss": 0.2327, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20516790449619293, | |
| "step": 2255, | |
| "valid_targets_mean": 5310.6, | |
| "valid_targets_min": 584 | |
| }, | |
| { | |
| "epoch": 3.6334405144694535, | |
| "grad_norm": 0.4218807893646266, | |
| "learning_rate": 2.2176673270860513e-05, | |
| "loss": 0.209, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2451988160610199, | |
| "step": 2260, | |
| "valid_targets_mean": 6055.2, | |
| "valid_targets_min": 2343 | |
| }, | |
| { | |
| "epoch": 3.6414790996784565, | |
| "grad_norm": 0.49840300841653384, | |
| "learning_rate": 2.209694870376951e-05, | |
| "loss": 0.2135, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23280146718025208, | |
| "step": 2265, | |
| "valid_targets_mean": 4976.8, | |
| "valid_targets_min": 628 | |
| }, | |
| { | |
| "epoch": 3.64951768488746, | |
| "grad_norm": 0.40449740396090167, | |
| "learning_rate": 2.2017190431384813e-05, | |
| "loss": 0.2134, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20014050602912903, | |
| "step": 2270, | |
| "valid_targets_mean": 6301.7, | |
| "valid_targets_min": 2562 | |
| }, | |
| { | |
| "epoch": 3.657556270096463, | |
| "grad_norm": 0.5348968486420073, | |
| "learning_rate": 2.1937399735700578e-05, | |
| "loss": 0.2341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23371417820453644, | |
| "step": 2275, | |
| "valid_targets_mean": 4516.2, | |
| "valid_targets_min": 793 | |
| }, | |
| { | |
| "epoch": 3.665594855305466, | |
| "grad_norm": 0.4412069174813758, | |
| "learning_rate": 2.185757789923213e-05, | |
| "loss": 0.2092, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22182171046733856, | |
| "step": 2280, | |
| "valid_targets_mean": 5953.4, | |
| "valid_targets_min": 1050 | |
| }, | |
| { | |
| "epoch": 3.6736334405144695, | |
| "grad_norm": 0.4681541955311709, | |
| "learning_rate": 2.177772620499533e-05, | |
| "loss": 0.2114, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19206824898719788, | |
| "step": 2285, | |
| "valid_targets_mean": 4931.1, | |
| "valid_targets_min": 591 | |
| }, | |
| { | |
| "epoch": 3.6816720257234725, | |
| "grad_norm": 0.43440468582181413, | |
| "learning_rate": 2.169784593648594e-05, | |
| "loss": 0.2067, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21512943506240845, | |
| "step": 2290, | |
| "valid_targets_mean": 6165.6, | |
| "valid_targets_min": 3636 | |
| }, | |
| { | |
| "epoch": 3.689710610932476, | |
| "grad_norm": 0.4533493453174465, | |
| "learning_rate": 2.161793837765904e-05, | |
| "loss": 0.2055, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20502886176109314, | |
| "step": 2295, | |
| "valid_targets_mean": 6049.6, | |
| "valid_targets_min": 2196 | |
| }, | |
| { | |
| "epoch": 3.697749196141479, | |
| "grad_norm": 0.40904993139435536, | |
| "learning_rate": 2.1538004812908345e-05, | |
| "loss": 0.2099, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16571852564811707, | |
| "step": 2300, | |
| "valid_targets_mean": 5779.1, | |
| "valid_targets_min": 1787 | |
| }, | |
| { | |
| "epoch": 3.7057877813504825, | |
| "grad_norm": 0.49153894981339047, | |
| "learning_rate": 2.1458046527045563e-05, | |
| "loss": 0.2199, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2017051726579666, | |
| "step": 2305, | |
| "valid_targets_mean": 5293.1, | |
| "valid_targets_min": 479 | |
| }, | |
| { | |
| "epoch": 3.7138263665594855, | |
| "grad_norm": 0.4285879061951372, | |
| "learning_rate": 2.1378064805279796e-05, | |
| "loss": 0.1996, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23837776482105255, | |
| "step": 2310, | |
| "valid_targets_mean": 6358.2, | |
| "valid_targets_min": 1503 | |
| }, | |
| { | |
| "epoch": 3.721864951768489, | |
| "grad_norm": 0.4735435580454719, | |
| "learning_rate": 2.1298060933196797e-05, | |
| "loss": 0.2213, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22305136919021606, | |
| "step": 2315, | |
| "valid_targets_mean": 5233.4, | |
| "valid_targets_min": 1262 | |
| }, | |
| { | |
| "epoch": 3.729903536977492, | |
| "grad_norm": 0.4494458391637004, | |
| "learning_rate": 2.1218036196738382e-05, | |
| "loss": 0.2303, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20690946280956268, | |
| "step": 2320, | |
| "valid_targets_mean": 5577.9, | |
| "valid_targets_min": 2170 | |
| }, | |
| { | |
| "epoch": 3.737942122186495, | |
| "grad_norm": 0.4236260163992656, | |
| "learning_rate": 2.1137991882181716e-05, | |
| "loss": 0.219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2045375406742096, | |
| "step": 2325, | |
| "valid_targets_mean": 6003.2, | |
| "valid_targets_min": 735 | |
| }, | |
| { | |
| "epoch": 3.7459807073954985, | |
| "grad_norm": 0.4194948117221583, | |
| "learning_rate": 2.1057929276118663e-05, | |
| "loss": 0.2227, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2365095168352127, | |
| "step": 2330, | |
| "valid_targets_mean": 7019.9, | |
| "valid_targets_min": 2770 | |
| }, | |
| { | |
| "epoch": 3.7540192926045015, | |
| "grad_norm": 0.4554573167258968, | |
| "learning_rate": 2.097784966543509e-05, | |
| "loss": 0.2287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22981056571006775, | |
| "step": 2335, | |
| "valid_targets_mean": 5870.8, | |
| "valid_targets_min": 2514 | |
| }, | |
| { | |
| "epoch": 3.762057877813505, | |
| "grad_norm": 0.3979763789428403, | |
| "learning_rate": 2.0897754337290186e-05, | |
| "loss": 0.221, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.249480202794075, | |
| "step": 2340, | |
| "valid_targets_mean": 6951.8, | |
| "valid_targets_min": 2075 | |
| }, | |
| { | |
| "epoch": 3.770096463022508, | |
| "grad_norm": 0.4326484331885382, | |
| "learning_rate": 2.081764457909578e-05, | |
| "loss": 0.2172, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21711274981498718, | |
| "step": 2345, | |
| "valid_targets_mean": 5988.9, | |
| "valid_targets_min": 3006 | |
| }, | |
| { | |
| "epoch": 3.778135048231511, | |
| "grad_norm": 0.4537090060178923, | |
| "learning_rate": 2.073752167849563e-05, | |
| "loss": 0.2324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24819859862327576, | |
| "step": 2350, | |
| "valid_targets_mean": 6692.4, | |
| "valid_targets_min": 1672 | |
| }, | |
| { | |
| "epoch": 3.7861736334405145, | |
| "grad_norm": 0.4501156544408785, | |
| "learning_rate": 2.0657386923344763e-05, | |
| "loss": 0.2219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19166472554206848, | |
| "step": 2355, | |
| "valid_targets_mean": 5443.9, | |
| "valid_targets_min": 2086 | |
| }, | |
| { | |
| "epoch": 3.7942122186495175, | |
| "grad_norm": 0.448471765482259, | |
| "learning_rate": 2.0577241601688715e-05, | |
| "loss": 0.2078, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21458148956298828, | |
| "step": 2360, | |
| "valid_targets_mean": 5981.8, | |
| "valid_targets_min": 964 | |
| }, | |
| { | |
| "epoch": 3.802250803858521, | |
| "grad_norm": 0.4361570179845142, | |
| "learning_rate": 2.0497087001742897e-05, | |
| "loss": 0.2145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2361234426498413, | |
| "step": 2365, | |
| "valid_targets_mean": 5905.1, | |
| "valid_targets_min": 3336 | |
| }, | |
| { | |
| "epoch": 3.810289389067524, | |
| "grad_norm": 0.5319107692301696, | |
| "learning_rate": 2.0416924411871827e-05, | |
| "loss": 0.2119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22896161675453186, | |
| "step": 2370, | |
| "valid_targets_mean": 4759.8, | |
| "valid_targets_min": 718 | |
| }, | |
| { | |
| "epoch": 3.8183279742765275, | |
| "grad_norm": 0.43055763396562985, | |
| "learning_rate": 2.0336755120568474e-05, | |
| "loss": 0.2084, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16542640328407288, | |
| "step": 2375, | |
| "valid_targets_mean": 5030.9, | |
| "valid_targets_min": 700 | |
| }, | |
| { | |
| "epoch": 3.8263665594855305, | |
| "grad_norm": 0.424784709923934, | |
| "learning_rate": 2.0256580416433507e-05, | |
| "loss": 0.2182, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1930401623249054, | |
| "step": 2380, | |
| "valid_targets_mean": 5712.7, | |
| "valid_targets_min": 3377 | |
| }, | |
| { | |
| "epoch": 3.834405144694534, | |
| "grad_norm": 0.41048594391090903, | |
| "learning_rate": 2.017640158815459e-05, | |
| "loss": 0.2063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19271403551101685, | |
| "step": 2385, | |
| "valid_targets_mean": 5625.4, | |
| "valid_targets_min": 398 | |
| }, | |
| { | |
| "epoch": 3.842443729903537, | |
| "grad_norm": 0.5112917261020897, | |
| "learning_rate": 2.0096219924485702e-05, | |
| "loss": 0.1899, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21932145953178406, | |
| "step": 2390, | |
| "valid_targets_mean": 4557.1, | |
| "valid_targets_min": 525 | |
| }, | |
| { | |
| "epoch": 3.85048231511254, | |
| "grad_norm": 0.46118075226408284, | |
| "learning_rate": 2.0016036714226375e-05, | |
| "loss": 0.2088, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2620882987976074, | |
| "step": 2395, | |
| "valid_targets_mean": 5693.9, | |
| "valid_targets_min": 1769 | |
| }, | |
| { | |
| "epoch": 3.8585209003215435, | |
| "grad_norm": 0.44773172289290925, | |
| "learning_rate": 1.993585324620101e-05, | |
| "loss": 0.2084, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20329290628433228, | |
| "step": 2400, | |
| "valid_targets_mean": 4980.4, | |
| "valid_targets_min": 923 | |
| }, | |
| { | |
| "epoch": 3.8665594855305465, | |
| "grad_norm": 0.494882324438044, | |
| "learning_rate": 1.985567080923815e-05, | |
| "loss": 0.2247, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22697266936302185, | |
| "step": 2405, | |
| "valid_targets_mean": 5325.9, | |
| "valid_targets_min": 2328 | |
| }, | |
| { | |
| "epoch": 3.87459807073955, | |
| "grad_norm": 0.44015889783779993, | |
| "learning_rate": 1.9775490692149754e-05, | |
| "loss": 0.2217, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20229412615299225, | |
| "step": 2410, | |
| "valid_targets_mean": 5463.9, | |
| "valid_targets_min": 1088 | |
| }, | |
| { | |
| "epoch": 3.882636655948553, | |
| "grad_norm": 0.42567259296177046, | |
| "learning_rate": 1.969531418371051e-05, | |
| "loss": 0.2042, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18126389384269714, | |
| "step": 2415, | |
| "valid_targets_mean": 5734.0, | |
| "valid_targets_min": 763 | |
| }, | |
| { | |
| "epoch": 3.890675241157556, | |
| "grad_norm": 0.45470968743706225, | |
| "learning_rate": 1.9615142572637088e-05, | |
| "loss": 0.2213, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24312356114387512, | |
| "step": 2420, | |
| "valid_targets_mean": 5781.9, | |
| "valid_targets_min": 2244 | |
| }, | |
| { | |
| "epoch": 3.8987138263665595, | |
| "grad_norm": 0.467052238827403, | |
| "learning_rate": 1.9534977147567457e-05, | |
| "loss": 0.2252, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18865913152694702, | |
| "step": 2425, | |
| "valid_targets_mean": 4876.2, | |
| "valid_targets_min": 1931 | |
| }, | |
| { | |
| "epoch": 3.906752411575563, | |
| "grad_norm": 0.47098837302480645, | |
| "learning_rate": 1.9454819197040142e-05, | |
| "loss": 0.2019, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2384665310382843, | |
| "step": 2430, | |
| "valid_targets_mean": 5154.0, | |
| "valid_targets_min": 1026 | |
| }, | |
| { | |
| "epoch": 3.914790996784566, | |
| "grad_norm": 0.44490893084527133, | |
| "learning_rate": 1.9374670009473525e-05, | |
| "loss": 0.2011, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21634893119335175, | |
| "step": 2435, | |
| "valid_targets_mean": 5663.2, | |
| "valid_targets_min": 1532 | |
| }, | |
| { | |
| "epoch": 3.922829581993569, | |
| "grad_norm": 0.41659967178264024, | |
| "learning_rate": 1.9294530873145142e-05, | |
| "loss": 0.2158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22321747243404388, | |
| "step": 2440, | |
| "valid_targets_mean": 6121.1, | |
| "valid_targets_min": 3266 | |
| }, | |
| { | |
| "epoch": 3.9308681672025725, | |
| "grad_norm": 0.43802772871467616, | |
| "learning_rate": 1.9214403076170968e-05, | |
| "loss": 0.2035, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20500494539737701, | |
| "step": 2445, | |
| "valid_targets_mean": 5410.9, | |
| "valid_targets_min": 1856 | |
| }, | |
| { | |
| "epoch": 3.9389067524115755, | |
| "grad_norm": 0.6732315793660567, | |
| "learning_rate": 1.913428790648472e-05, | |
| "loss": 0.2165, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21469956636428833, | |
| "step": 2450, | |
| "valid_targets_mean": 5684.2, | |
| "valid_targets_min": 2356 | |
| }, | |
| { | |
| "epoch": 3.946945337620579, | |
| "grad_norm": 0.4664945007826955, | |
| "learning_rate": 1.9054186651817145e-05, | |
| "loss": 0.2153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23946499824523926, | |
| "step": 2455, | |
| "valid_targets_mean": 5438.3, | |
| "valid_targets_min": 2176 | |
| }, | |
| { | |
| "epoch": 3.954983922829582, | |
| "grad_norm": 0.4661613319228112, | |
| "learning_rate": 1.8974100599675324e-05, | |
| "loss": 0.2239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22451001405715942, | |
| "step": 2460, | |
| "valid_targets_mean": 5269.4, | |
| "valid_targets_min": 1016 | |
| }, | |
| { | |
| "epoch": 3.963022508038585, | |
| "grad_norm": 0.41493904461143283, | |
| "learning_rate": 1.8894031037321983e-05, | |
| "loss": 0.2121, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18410423398017883, | |
| "step": 2465, | |
| "valid_targets_mean": 5644.1, | |
| "valid_targets_min": 3111 | |
| }, | |
| { | |
| "epoch": 3.9710610932475885, | |
| "grad_norm": 0.475551566600847, | |
| "learning_rate": 1.8813979251754823e-05, | |
| "loss": 0.2184, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25625407695770264, | |
| "step": 2470, | |
| "valid_targets_mean": 6587.2, | |
| "valid_targets_min": 2768 | |
| }, | |
| { | |
| "epoch": 3.9790996784565915, | |
| "grad_norm": 0.4457179679996537, | |
| "learning_rate": 1.8733946529685755e-05, | |
| "loss": 0.2005, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19752007722854614, | |
| "step": 2475, | |
| "valid_targets_mean": 5307.2, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 3.987138263665595, | |
| "grad_norm": 0.48633447011742686, | |
| "learning_rate": 1.865393415752033e-05, | |
| "loss": 0.2305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2800830602645874, | |
| "step": 2480, | |
| "valid_targets_mean": 5018.9, | |
| "valid_targets_min": 502 | |
| }, | |
| { | |
| "epoch": 3.995176848874598, | |
| "grad_norm": 0.47160209353188187, | |
| "learning_rate": 1.8573943421336973e-05, | |
| "loss": 0.2287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19500645995140076, | |
| "step": 2485, | |
| "valid_targets_mean": 5507.2, | |
| "valid_targets_min": 551 | |
| }, | |
| { | |
| "epoch": 4.003215434083601, | |
| "grad_norm": 0.4431683493276922, | |
| "learning_rate": 1.849397560686635e-05, | |
| "loss": 0.2087, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1778678148984909, | |
| "step": 2490, | |
| "valid_targets_mean": 5407.1, | |
| "valid_targets_min": 1746 | |
| }, | |
| { | |
| "epoch": 4.011254019292605, | |
| "grad_norm": 0.5063154975052353, | |
| "learning_rate": 1.8414031999470704e-05, | |
| "loss": 0.1833, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2000581920146942, | |
| "step": 2495, | |
| "valid_targets_mean": 4873.6, | |
| "valid_targets_min": 2477 | |
| }, | |
| { | |
| "epoch": 4.019292604501608, | |
| "grad_norm": 0.49353913726419607, | |
| "learning_rate": 1.8334113884123168e-05, | |
| "loss": 0.2089, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17449554800987244, | |
| "step": 2500, | |
| "valid_targets_mean": 5010.6, | |
| "valid_targets_min": 2113 | |
| }, | |
| { | |
| "epoch": 4.027331189710611, | |
| "grad_norm": 0.46262558060542996, | |
| "learning_rate": 1.825422254538714e-05, | |
| "loss": 0.2057, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25343796610832214, | |
| "step": 2505, | |
| "valid_targets_mean": 6078.9, | |
| "valid_targets_min": 653 | |
| }, | |
| { | |
| "epoch": 4.035369774919614, | |
| "grad_norm": 0.5362065215987764, | |
| "learning_rate": 1.8174359267395624e-05, | |
| "loss": 0.1992, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19421911239624023, | |
| "step": 2510, | |
| "valid_targets_mean": 4662.9, | |
| "valid_targets_min": 602 | |
| }, | |
| { | |
| "epoch": 4.043408360128617, | |
| "grad_norm": 0.4795175141250009, | |
| "learning_rate": 1.80945253338306e-05, | |
| "loss": 0.222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24856063723564148, | |
| "step": 2515, | |
| "valid_targets_mean": 6113.8, | |
| "valid_targets_min": 2423 | |
| }, | |
| { | |
| "epoch": 4.051446945337621, | |
| "grad_norm": 0.4876334669035427, | |
| "learning_rate": 1.801472202790235e-05, | |
| "loss": 0.2018, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18681687116622925, | |
| "step": 2520, | |
| "valid_targets_mean": 5507.8, | |
| "valid_targets_min": 2851 | |
| }, | |
| { | |
| "epoch": 4.059485530546624, | |
| "grad_norm": 0.46788947316546087, | |
| "learning_rate": 1.7934950632328893e-05, | |
| "loss": 0.1995, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18351972103118896, | |
| "step": 2525, | |
| "valid_targets_mean": 6029.2, | |
| "valid_targets_min": 1686 | |
| }, | |
| { | |
| "epoch": 4.067524115755627, | |
| "grad_norm": 0.4930457193060944, | |
| "learning_rate": 1.7855212429315323e-05, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1924591362476349, | |
| "step": 2530, | |
| "valid_targets_mean": 4770.0, | |
| "valid_targets_min": 310 | |
| }, | |
| { | |
| "epoch": 4.07556270096463, | |
| "grad_norm": 1.0707922894163995, | |
| "learning_rate": 1.7775508700533217e-05, | |
| "loss": 0.2037, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24290210008621216, | |
| "step": 2535, | |
| "valid_targets_mean": 5625.6, | |
| "valid_targets_min": 707 | |
| }, | |
| { | |
| "epoch": 4.083601286173633, | |
| "grad_norm": 0.45170678651858187, | |
| "learning_rate": 1.769584072710004e-05, | |
| "loss": 0.1984, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15965251624584198, | |
| "step": 2540, | |
| "valid_targets_mean": 5199.1, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 4.091639871382637, | |
| "grad_norm": 0.4793914315138488, | |
| "learning_rate": 1.7616209789558513e-05, | |
| "loss": 0.1962, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1897103488445282, | |
| "step": 2545, | |
| "valid_targets_mean": 6103.6, | |
| "valid_targets_min": 1674 | |
| }, | |
| { | |
| "epoch": 4.09967845659164, | |
| "grad_norm": 0.509928017351408, | |
| "learning_rate": 1.75366171678561e-05, | |
| "loss": 0.2079, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27034711837768555, | |
| "step": 2550, | |
| "valid_targets_mean": 6321.3, | |
| "valid_targets_min": 586 | |
| }, | |
| { | |
| "epoch": 4.107717041800643, | |
| "grad_norm": 0.48592035160455116, | |
| "learning_rate": 1.7457064141324368e-05, | |
| "loss": 0.1948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18286871910095215, | |
| "step": 2555, | |
| "valid_targets_mean": 5948.1, | |
| "valid_targets_min": 3357 | |
| }, | |
| { | |
| "epoch": 4.115755627009646, | |
| "grad_norm": 0.5511159152623385, | |
| "learning_rate": 1.7377551988658466e-05, | |
| "loss": 0.1923, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20568011701107025, | |
| "step": 2560, | |
| "valid_targets_mean": 4816.6, | |
| "valid_targets_min": 1377 | |
| }, | |
| { | |
| "epoch": 4.12379421221865, | |
| "grad_norm": 0.4984863584967216, | |
| "learning_rate": 1.7298081987896547e-05, | |
| "loss": 0.1818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21172256767749786, | |
| "step": 2565, | |
| "valid_targets_mean": 5188.6, | |
| "valid_targets_min": 360 | |
| }, | |
| { | |
| "epoch": 4.131832797427653, | |
| "grad_norm": 0.450989941612913, | |
| "learning_rate": 1.721865541639924e-05, | |
| "loss": 0.2345, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18919016420841217, | |
| "step": 2570, | |
| "valid_targets_mean": 5664.1, | |
| "valid_targets_min": 2879 | |
| }, | |
| { | |
| "epoch": 4.139871382636656, | |
| "grad_norm": 0.43703722696294917, | |
| "learning_rate": 1.7139273550829117e-05, | |
| "loss": 0.2007, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19631536304950714, | |
| "step": 2575, | |
| "valid_targets_mean": 6282.9, | |
| "valid_targets_min": 2968 | |
| }, | |
| { | |
| "epoch": 4.147909967845659, | |
| "grad_norm": 0.432558045727956, | |
| "learning_rate": 1.7059937667130163e-05, | |
| "loss": 0.2039, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20107629895210266, | |
| "step": 2580, | |
| "valid_targets_mean": 6507.6, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 4.155948553054662, | |
| "grad_norm": 0.42844076826077404, | |
| "learning_rate": 1.6980649040507296e-05, | |
| "loss": 0.2046, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20934857428073883, | |
| "step": 2585, | |
| "valid_targets_mean": 6436.5, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 4.163987138263666, | |
| "grad_norm": 0.4612185120502811, | |
| "learning_rate": 1.6901408945405794e-05, | |
| "loss": 0.2082, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18850497901439667, | |
| "step": 2590, | |
| "valid_targets_mean": 5565.6, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 4.172025723472669, | |
| "grad_norm": 0.47366161861568484, | |
| "learning_rate": 1.6822218655490926e-05, | |
| "loss": 0.2076, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2318640649318695, | |
| "step": 2595, | |
| "valid_targets_mean": 5669.2, | |
| "valid_targets_min": 1787 | |
| }, | |
| { | |
| "epoch": 4.180064308681672, | |
| "grad_norm": 0.4478729477778503, | |
| "learning_rate": 1.674307944362738e-05, | |
| "loss": 0.2045, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16117283701896667, | |
| "step": 2600, | |
| "valid_targets_mean": 5841.4, | |
| "valid_targets_min": 2066 | |
| }, | |
| { | |
| "epoch": 4.188102893890675, | |
| "grad_norm": 0.513342399213119, | |
| "learning_rate": 1.6663992581858858e-05, | |
| "loss": 0.2212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19785936176776886, | |
| "step": 2605, | |
| "valid_targets_mean": 4952.5, | |
| "valid_targets_min": 2254 | |
| }, | |
| { | |
| "epoch": 4.196141479099678, | |
| "grad_norm": 0.44758461547392403, | |
| "learning_rate": 1.65849593413876e-05, | |
| "loss": 0.1901, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2149113267660141, | |
| "step": 2610, | |
| "valid_targets_mean": 5862.0, | |
| "valid_targets_min": 2548 | |
| }, | |
| { | |
| "epoch": 4.204180064308682, | |
| "grad_norm": 0.47362083484044387, | |
| "learning_rate": 1.6505980992553973e-05, | |
| "loss": 0.2071, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22370827198028564, | |
| "step": 2615, | |
| "valid_targets_mean": 6501.8, | |
| "valid_targets_min": 368 | |
| }, | |
| { | |
| "epoch": 4.212218649517685, | |
| "grad_norm": 0.4855097676216963, | |
| "learning_rate": 1.6427058804816035e-05, | |
| "loss": 0.1945, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21815648674964905, | |
| "step": 2620, | |
| "valid_targets_mean": 5576.1, | |
| "valid_targets_min": 2816 | |
| }, | |
| { | |
| "epoch": 4.220257234726688, | |
| "grad_norm": 0.45317871679877536, | |
| "learning_rate": 1.634819404672916e-05, | |
| "loss": 0.2083, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18824812769889832, | |
| "step": 2625, | |
| "valid_targets_mean": 5293.6, | |
| "valid_targets_min": 2657 | |
| }, | |
| { | |
| "epoch": 4.228295819935691, | |
| "grad_norm": 0.4980247924474111, | |
| "learning_rate": 1.6269387985925607e-05, | |
| "loss": 0.2273, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3325211703777313, | |
| "step": 2630, | |
| "valid_targets_mean": 6427.8, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 4.236334405144695, | |
| "grad_norm": 0.6649140044751879, | |
| "learning_rate": 1.6190641889094167e-05, | |
| "loss": 0.21, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25851452350616455, | |
| "step": 2635, | |
| "valid_targets_mean": 5124.8, | |
| "valid_targets_min": 554 | |
| }, | |
| { | |
| "epoch": 4.244372990353698, | |
| "grad_norm": 0.4730526302752391, | |
| "learning_rate": 1.61119570219598e-05, | |
| "loss": 0.2109, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20996221899986267, | |
| "step": 2640, | |
| "valid_targets_mean": 5915.9, | |
| "valid_targets_min": 1928 | |
| }, | |
| { | |
| "epoch": 4.252411575562701, | |
| "grad_norm": 0.5118058577341814, | |
| "learning_rate": 1.6033334649263298e-05, | |
| "loss": 0.2131, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21130703389644623, | |
| "step": 2645, | |
| "valid_targets_mean": 5496.5, | |
| "valid_targets_min": 822 | |
| }, | |
| { | |
| "epoch": 4.260450160771704, | |
| "grad_norm": 0.43776478494746524, | |
| "learning_rate": 1.595477603474095e-05, | |
| "loss": 0.1983, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18410971760749817, | |
| "step": 2650, | |
| "valid_targets_mean": 5455.0, | |
| "valid_targets_min": 1437 | |
| }, | |
| { | |
| "epoch": 4.268488745980707, | |
| "grad_norm": 0.553906016668327, | |
| "learning_rate": 1.5876282441104215e-05, | |
| "loss": 0.1908, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19235216081142426, | |
| "step": 2655, | |
| "valid_targets_mean": 5985.0, | |
| "valid_targets_min": 2192 | |
| }, | |
| { | |
| "epoch": 4.276527331189711, | |
| "grad_norm": 0.6277934063923197, | |
| "learning_rate": 1.5797855130019446e-05, | |
| "loss": 0.1853, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15879936516284943, | |
| "step": 2660, | |
| "valid_targets_mean": 5623.4, | |
| "valid_targets_min": 2064 | |
| }, | |
| { | |
| "epoch": 4.284565916398714, | |
| "grad_norm": 0.4175500519205594, | |
| "learning_rate": 1.5719495362087615e-05, | |
| "loss": 0.2037, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20981422066688538, | |
| "step": 2665, | |
| "valid_targets_mean": 6973.8, | |
| "valid_targets_min": 2993 | |
| }, | |
| { | |
| "epoch": 4.292604501607717, | |
| "grad_norm": 0.4837942057875352, | |
| "learning_rate": 1.5641204396824026e-05, | |
| "loss": 0.203, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19984592497348785, | |
| "step": 2670, | |
| "valid_targets_mean": 4973.3, | |
| "valid_targets_min": 1122 | |
| }, | |
| { | |
| "epoch": 4.30064308681672, | |
| "grad_norm": 0.47079344745028745, | |
| "learning_rate": 1.556298349263809e-05, | |
| "loss": 0.1814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1954348385334015, | |
| "step": 2675, | |
| "valid_targets_mean": 4884.9, | |
| "valid_targets_min": 449 | |
| }, | |
| { | |
| "epoch": 4.308681672025724, | |
| "grad_norm": 0.4594060901178028, | |
| "learning_rate": 1.5484833906813078e-05, | |
| "loss": 0.1866, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20647072792053223, | |
| "step": 2680, | |
| "valid_targets_mean": 5675.9, | |
| "valid_targets_min": 2232 | |
| }, | |
| { | |
| "epoch": 4.316720257234727, | |
| "grad_norm": 0.4600805545978283, | |
| "learning_rate": 1.5406756895485952e-05, | |
| "loss": 0.1971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1757199764251709, | |
| "step": 2685, | |
| "valid_targets_mean": 6161.0, | |
| "valid_targets_min": 1423 | |
| }, | |
| { | |
| "epoch": 4.32475884244373, | |
| "grad_norm": 0.5361149346307585, | |
| "learning_rate": 1.5328753713627114e-05, | |
| "loss": 0.1973, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21314969658851624, | |
| "step": 2690, | |
| "valid_targets_mean": 5107.5, | |
| "valid_targets_min": 417 | |
| }, | |
| { | |
| "epoch": 4.332797427652733, | |
| "grad_norm": 0.4734201902427743, | |
| "learning_rate": 1.5250825615020314e-05, | |
| "loss": 0.2053, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2571180462837219, | |
| "step": 2695, | |
| "valid_targets_mean": 5717.2, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 4.340836012861736, | |
| "grad_norm": 0.4384177489194904, | |
| "learning_rate": 1.5172973852242403e-05, | |
| "loss": 0.2102, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20939810574054718, | |
| "step": 2700, | |
| "valid_targets_mean": 6299.3, | |
| "valid_targets_min": 420 | |
| }, | |
| { | |
| "epoch": 4.34887459807074, | |
| "grad_norm": 0.4467839595052828, | |
| "learning_rate": 1.509519967664327e-05, | |
| "loss": 0.1959, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1935369223356247, | |
| "step": 2705, | |
| "valid_targets_mean": 6261.8, | |
| "valid_targets_min": 797 | |
| }, | |
| { | |
| "epoch": 4.356913183279743, | |
| "grad_norm": 0.43951166680518067, | |
| "learning_rate": 1.5017504338325724e-05, | |
| "loss": 0.1863, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17829066514968872, | |
| "step": 2710, | |
| "valid_targets_mean": 6320.8, | |
| "valid_targets_min": 4167 | |
| }, | |
| { | |
| "epoch": 4.364951768488746, | |
| "grad_norm": 0.5046374117946685, | |
| "learning_rate": 1.4939889086125354e-05, | |
| "loss": 0.1999, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2184838503599167, | |
| "step": 2715, | |
| "valid_targets_mean": 5021.6, | |
| "valid_targets_min": 2023 | |
| }, | |
| { | |
| "epoch": 4.372990353697749, | |
| "grad_norm": 0.47236424146260997, | |
| "learning_rate": 1.486235516759051e-05, | |
| "loss": 0.1959, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2101278305053711, | |
| "step": 2720, | |
| "valid_targets_mean": 4985.7, | |
| "valid_targets_min": 1302 | |
| }, | |
| { | |
| "epoch": 4.381028938906752, | |
| "grad_norm": 0.5296494215008, | |
| "learning_rate": 1.47849038289622e-05, | |
| "loss": 0.1815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19631826877593994, | |
| "step": 2725, | |
| "valid_targets_mean": 4830.1, | |
| "valid_targets_min": 1901 | |
| }, | |
| { | |
| "epoch": 4.389067524115756, | |
| "grad_norm": 0.46328346813912447, | |
| "learning_rate": 1.4707536315154105e-05, | |
| "loss": 0.2174, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23391112685203552, | |
| "step": 2730, | |
| "valid_targets_mean": 6091.8, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 4.397106109324759, | |
| "grad_norm": 0.49549169831142276, | |
| "learning_rate": 1.4630253869732538e-05, | |
| "loss": 0.1994, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2387903928756714, | |
| "step": 2735, | |
| "valid_targets_mean": 5893.7, | |
| "valid_targets_min": 2805 | |
| }, | |
| { | |
| "epoch": 4.405144694533762, | |
| "grad_norm": 0.5005618204176583, | |
| "learning_rate": 1.4553057734896485e-05, | |
| "loss": 0.1977, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19630920886993408, | |
| "step": 2740, | |
| "valid_targets_mean": 5303.3, | |
| "valid_targets_min": 458 | |
| }, | |
| { | |
| "epoch": 4.413183279742765, | |
| "grad_norm": 0.5071861848546898, | |
| "learning_rate": 1.4475949151457576e-05, | |
| "loss": 0.2091, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18431368470191956, | |
| "step": 2745, | |
| "valid_targets_mean": 5103.1, | |
| "valid_targets_min": 1717 | |
| }, | |
| { | |
| "epoch": 4.421221864951768, | |
| "grad_norm": 0.5717407040608831, | |
| "learning_rate": 1.4398929358820232e-05, | |
| "loss": 0.1958, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18112431466579437, | |
| "step": 2750, | |
| "valid_targets_mean": 5726.4, | |
| "valid_targets_min": 2752 | |
| }, | |
| { | |
| "epoch": 4.429260450160772, | |
| "grad_norm": 0.44976048990233286, | |
| "learning_rate": 1.4321999594961668e-05, | |
| "loss": 0.2165, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24178093671798706, | |
| "step": 2755, | |
| "valid_targets_mean": 6700.8, | |
| "valid_targets_min": 2324 | |
| }, | |
| { | |
| "epoch": 4.437299035369775, | |
| "grad_norm": 0.4753927161894043, | |
| "learning_rate": 1.4245161096412038e-05, | |
| "loss": 0.2013, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20099860429763794, | |
| "step": 2760, | |
| "valid_targets_mean": 5602.4, | |
| "valid_targets_min": 2348 | |
| }, | |
| { | |
| "epoch": 4.445337620578778, | |
| "grad_norm": 0.5209267732228876, | |
| "learning_rate": 1.4168415098234523e-05, | |
| "loss": 0.1996, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22852633893489838, | |
| "step": 2765, | |
| "valid_targets_mean": 5029.3, | |
| "valid_targets_min": 763 | |
| }, | |
| { | |
| "epoch": 4.453376205787781, | |
| "grad_norm": 0.486708199841311, | |
| "learning_rate": 1.4091762834005512e-05, | |
| "loss": 0.1921, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21605432033538818, | |
| "step": 2770, | |
| "valid_targets_mean": 6792.0, | |
| "valid_targets_min": 2662 | |
| }, | |
| { | |
| "epoch": 4.461414790996785, | |
| "grad_norm": 0.5015486523699413, | |
| "learning_rate": 1.401520553579476e-05, | |
| "loss": 0.1811, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19634661078453064, | |
| "step": 2775, | |
| "valid_targets_mean": 4970.1, | |
| "valid_targets_min": 2616 | |
| }, | |
| { | |
| "epoch": 4.469453376205788, | |
| "grad_norm": 0.47213874742476397, | |
| "learning_rate": 1.3938744434145597e-05, | |
| "loss": 0.1892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17870403826236725, | |
| "step": 2780, | |
| "valid_targets_mean": 5548.0, | |
| "valid_targets_min": 1732 | |
| }, | |
| { | |
| "epoch": 4.477491961414791, | |
| "grad_norm": 0.48798130419819413, | |
| "learning_rate": 1.386238075805513e-05, | |
| "loss": 0.2016, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21423214673995972, | |
| "step": 2785, | |
| "valid_targets_mean": 5585.7, | |
| "valid_targets_min": 2704 | |
| }, | |
| { | |
| "epoch": 4.485530546623794, | |
| "grad_norm": 0.43853079491526586, | |
| "learning_rate": 1.3786115734954482e-05, | |
| "loss": 0.1847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16943836212158203, | |
| "step": 2790, | |
| "valid_targets_mean": 5543.2, | |
| "valid_targets_min": 2570 | |
| }, | |
| { | |
| "epoch": 4.493569131832797, | |
| "grad_norm": 0.4318067748827032, | |
| "learning_rate": 1.3709950590689099e-05, | |
| "loss": 0.1891, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20014822483062744, | |
| "step": 2795, | |
| "valid_targets_mean": 6038.9, | |
| "valid_targets_min": 2778 | |
| }, | |
| { | |
| "epoch": 4.501607717041801, | |
| "grad_norm": 0.5234018992311159, | |
| "learning_rate": 1.3633886549499016e-05, | |
| "loss": 0.1981, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18016895651817322, | |
| "step": 2800, | |
| "valid_targets_mean": 5694.5, | |
| "valid_targets_min": 2535 | |
| }, | |
| { | |
| "epoch": 4.509646302250804, | |
| "grad_norm": 0.48964885396418867, | |
| "learning_rate": 1.3557924833999192e-05, | |
| "loss": 0.198, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22023364901542664, | |
| "step": 2805, | |
| "valid_targets_mean": 5295.1, | |
| "valid_targets_min": 1017 | |
| }, | |
| { | |
| "epoch": 4.517684887459807, | |
| "grad_norm": 0.47778416704363436, | |
| "learning_rate": 1.3482066665159845e-05, | |
| "loss": 0.2006, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2275695502758026, | |
| "step": 2810, | |
| "valid_targets_mean": 5627.1, | |
| "valid_targets_min": 2083 | |
| }, | |
| { | |
| "epoch": 4.52572347266881, | |
| "grad_norm": 0.49092993021320624, | |
| "learning_rate": 1.3406313262286845e-05, | |
| "loss": 0.1826, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1840728521347046, | |
| "step": 2815, | |
| "valid_targets_mean": 4954.2, | |
| "valid_targets_min": 2803 | |
| }, | |
| { | |
| "epoch": 4.533762057877814, | |
| "grad_norm": 0.4928580565982705, | |
| "learning_rate": 1.333066584300211e-05, | |
| "loss": 0.2041, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19534510374069214, | |
| "step": 2820, | |
| "valid_targets_mean": 5510.9, | |
| "valid_targets_min": 2471 | |
| }, | |
| { | |
| "epoch": 4.541800643086817, | |
| "grad_norm": 0.4417672881758802, | |
| "learning_rate": 1.3255125623224025e-05, | |
| "loss": 0.1984, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19574567675590515, | |
| "step": 2825, | |
| "valid_targets_mean": 6772.1, | |
| "valid_targets_min": 816 | |
| }, | |
| { | |
| "epoch": 4.54983922829582, | |
| "grad_norm": 0.4770904638690074, | |
| "learning_rate": 1.317969381714791e-05, | |
| "loss": 0.1992, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18321281671524048, | |
| "step": 2830, | |
| "valid_targets_mean": 5458.4, | |
| "valid_targets_min": 794 | |
| }, | |
| { | |
| "epoch": 4.557877813504823, | |
| "grad_norm": 0.6085685008276643, | |
| "learning_rate": 1.3104371637226483e-05, | |
| "loss": 0.1928, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22217214107513428, | |
| "step": 2835, | |
| "valid_targets_mean": 4775.9, | |
| "valid_targets_min": 780 | |
| }, | |
| { | |
| "epoch": 4.565916398713826, | |
| "grad_norm": 0.4838109854552672, | |
| "learning_rate": 1.30291602941504e-05, | |
| "loss": 0.2043, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19242960214614868, | |
| "step": 2840, | |
| "valid_targets_mean": 5120.0, | |
| "valid_targets_min": 1543 | |
| }, | |
| { | |
| "epoch": 4.57395498392283, | |
| "grad_norm": 0.5494698222405994, | |
| "learning_rate": 1.2954060996828783e-05, | |
| "loss": 0.2018, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2533257007598877, | |
| "step": 2845, | |
| "valid_targets_mean": 4385.7, | |
| "valid_targets_min": 669 | |
| }, | |
| { | |
| "epoch": 4.581993569131833, | |
| "grad_norm": 0.5485921989290943, | |
| "learning_rate": 1.2879074952369787e-05, | |
| "loss": 0.1983, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.211238831281662, | |
| "step": 2850, | |
| "valid_targets_mean": 4155.8, | |
| "valid_targets_min": 969 | |
| }, | |
| { | |
| "epoch": 4.590032154340836, | |
| "grad_norm": 0.5041645016380173, | |
| "learning_rate": 1.280420336606118e-05, | |
| "loss": 0.1852, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20651057362556458, | |
| "step": 2855, | |
| "valid_targets_mean": 5091.2, | |
| "valid_targets_min": 974 | |
| }, | |
| { | |
| "epoch": 4.598070739549839, | |
| "grad_norm": 0.5242677107999061, | |
| "learning_rate": 1.272944744135101e-05, | |
| "loss": 0.1984, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17013058066368103, | |
| "step": 2860, | |
| "valid_targets_mean": 4655.6, | |
| "valid_targets_min": 904 | |
| }, | |
| { | |
| "epoch": 4.606109324758842, | |
| "grad_norm": 0.4816881362793337, | |
| "learning_rate": 1.265480837982823e-05, | |
| "loss": 0.2006, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16162796318531036, | |
| "step": 2865, | |
| "valid_targets_mean": 5032.6, | |
| "valid_targets_min": 2480 | |
| }, | |
| { | |
| "epoch": 4.614147909967846, | |
| "grad_norm": 0.5595574344139143, | |
| "learning_rate": 1.2580287381203395e-05, | |
| "loss": 0.1836, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19007542729377747, | |
| "step": 2870, | |
| "valid_targets_mean": 5541.3, | |
| "valid_targets_min": 2562 | |
| }, | |
| { | |
| "epoch": 4.622186495176849, | |
| "grad_norm": 0.5162023822041768, | |
| "learning_rate": 1.2505885643289376e-05, | |
| "loss": 0.1943, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2073974907398224, | |
| "step": 2875, | |
| "valid_targets_mean": 5048.6, | |
| "valid_targets_min": 914 | |
| }, | |
| { | |
| "epoch": 4.630225080385852, | |
| "grad_norm": 0.4295671718244202, | |
| "learning_rate": 1.2431604361982097e-05, | |
| "loss": 0.1971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15762320160865784, | |
| "step": 2880, | |
| "valid_targets_mean": 5443.6, | |
| "valid_targets_min": 2303 | |
| }, | |
| { | |
| "epoch": 4.638263665594855, | |
| "grad_norm": 0.436533185299774, | |
| "learning_rate": 1.2357444731241341e-05, | |
| "loss": 0.186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1579321324825287, | |
| "step": 2885, | |
| "valid_targets_mean": 5016.8, | |
| "valid_targets_min": 1380 | |
| }, | |
| { | |
| "epoch": 4.646302250803858, | |
| "grad_norm": 0.45315199666744377, | |
| "learning_rate": 1.2283407943071527e-05, | |
| "loss": 0.1958, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21414679288864136, | |
| "step": 2890, | |
| "valid_targets_mean": 6171.1, | |
| "valid_targets_min": 681 | |
| }, | |
| { | |
| "epoch": 4.654340836012862, | |
| "grad_norm": 0.45840997399289807, | |
| "learning_rate": 1.2209495187502591e-05, | |
| "loss": 0.194, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21173666417598724, | |
| "step": 2895, | |
| "valid_targets_mean": 6205.8, | |
| "valid_targets_min": 2671 | |
| }, | |
| { | |
| "epoch": 4.662379421221865, | |
| "grad_norm": 1.241580962067996, | |
| "learning_rate": 1.213570765257078e-05, | |
| "loss": 0.2024, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18394725024700165, | |
| "step": 2900, | |
| "valid_targets_mean": 5422.6, | |
| "valid_targets_min": 718 | |
| }, | |
| { | |
| "epoch": 4.670418006430868, | |
| "grad_norm": 0.4701730532716653, | |
| "learning_rate": 1.2062046524299662e-05, | |
| "loss": 0.1893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21215137839317322, | |
| "step": 2905, | |
| "valid_targets_mean": 6072.1, | |
| "valid_targets_min": 3793 | |
| }, | |
| { | |
| "epoch": 4.678456591639871, | |
| "grad_norm": 0.47514186818883347, | |
| "learning_rate": 1.1988512986680977e-05, | |
| "loss": 0.2067, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26547539234161377, | |
| "step": 2910, | |
| "valid_targets_mean": 7249.2, | |
| "valid_targets_min": 2544 | |
| }, | |
| { | |
| "epoch": 4.686495176848875, | |
| "grad_norm": 0.5245596550859682, | |
| "learning_rate": 1.1915108221655646e-05, | |
| "loss": 0.1964, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1955907642841339, | |
| "step": 2915, | |
| "valid_targets_mean": 5297.4, | |
| "valid_targets_min": 964 | |
| }, | |
| { | |
| "epoch": 4.694533762057878, | |
| "grad_norm": 0.5750733791368552, | |
| "learning_rate": 1.1841833409094775e-05, | |
| "loss": 0.1887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19828759133815765, | |
| "step": 2920, | |
| "valid_targets_mean": 4593.1, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 4.702572347266881, | |
| "grad_norm": 0.46946293076584145, | |
| "learning_rate": 1.1768689726780652e-05, | |
| "loss": 0.1819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15981552004814148, | |
| "step": 2925, | |
| "valid_targets_mean": 5449.8, | |
| "valid_targets_min": 3162 | |
| }, | |
| { | |
| "epoch": 4.710610932475884, | |
| "grad_norm": 0.4567632066199438, | |
| "learning_rate": 1.1695678350387864e-05, | |
| "loss": 0.2027, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15649572014808655, | |
| "step": 2930, | |
| "valid_targets_mean": 5421.6, | |
| "valid_targets_min": 509 | |
| }, | |
| { | |
| "epoch": 4.718649517684888, | |
| "grad_norm": 0.46216431003159764, | |
| "learning_rate": 1.1622800453464387e-05, | |
| "loss": 0.1998, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.209588423371315, | |
| "step": 2935, | |
| "valid_targets_mean": 5569.3, | |
| "valid_targets_min": 2531 | |
| }, | |
| { | |
| "epoch": 4.726688102893891, | |
| "grad_norm": 0.46962991061066467, | |
| "learning_rate": 1.1550057207412704e-05, | |
| "loss": 0.188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16872183978557587, | |
| "step": 2940, | |
| "valid_targets_mean": 4777.9, | |
| "valid_targets_min": 2288 | |
| }, | |
| { | |
| "epoch": 4.734726688102894, | |
| "grad_norm": 0.4695838498589093, | |
| "learning_rate": 1.1477449781470975e-05, | |
| "loss": 0.2025, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21210195124149323, | |
| "step": 2945, | |
| "valid_targets_mean": 5874.9, | |
| "valid_targets_min": 1490 | |
| }, | |
| { | |
| "epoch": 4.742765273311897, | |
| "grad_norm": 0.5184365125355269, | |
| "learning_rate": 1.1404979342694271e-05, | |
| "loss": 0.213, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2304067313671112, | |
| "step": 2950, | |
| "valid_targets_mean": 5133.4, | |
| "valid_targets_min": 1612 | |
| }, | |
| { | |
| "epoch": 4.7508038585209, | |
| "grad_norm": 0.5906947681279004, | |
| "learning_rate": 1.1332647055935815e-05, | |
| "loss": 0.1967, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1860680878162384, | |
| "step": 2955, | |
| "valid_targets_mean": 5155.0, | |
| "valid_targets_min": 2688 | |
| }, | |
| { | |
| "epoch": 4.758842443729904, | |
| "grad_norm": 0.4759457622901129, | |
| "learning_rate": 1.1260454083828219e-05, | |
| "loss": 0.2126, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17885443568229675, | |
| "step": 2960, | |
| "valid_targets_mean": 4640.9, | |
| "valid_targets_min": 351 | |
| }, | |
| { | |
| "epoch": 4.766881028938907, | |
| "grad_norm": 0.46334565950112583, | |
| "learning_rate": 1.118840158676484e-05, | |
| "loss": 0.2065, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22456106543540955, | |
| "step": 2965, | |
| "valid_targets_mean": 5932.4, | |
| "valid_targets_min": 1003 | |
| }, | |
| { | |
| "epoch": 4.77491961414791, | |
| "grad_norm": 0.5024862102855205, | |
| "learning_rate": 1.1116490722881084e-05, | |
| "loss": 0.2039, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20128870010375977, | |
| "step": 2970, | |
| "valid_targets_mean": 5589.7, | |
| "valid_targets_min": 686 | |
| }, | |
| { | |
| "epoch": 4.782958199356913, | |
| "grad_norm": 0.5640048243597826, | |
| "learning_rate": 1.1044722648035844e-05, | |
| "loss": 0.2095, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16415663063526154, | |
| "step": 2975, | |
| "valid_targets_mean": 4988.2, | |
| "valid_targets_min": 683 | |
| }, | |
| { | |
| "epoch": 4.790996784565916, | |
| "grad_norm": 0.5068459771538669, | |
| "learning_rate": 1.0973098515792886e-05, | |
| "loss": 0.1953, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21549053490161896, | |
| "step": 2980, | |
| "valid_targets_mean": 4673.6, | |
| "valid_targets_min": 1088 | |
| }, | |
| { | |
| "epoch": 4.79903536977492, | |
| "grad_norm": 0.4441531212030373, | |
| "learning_rate": 1.090161947740231e-05, | |
| "loss": 0.2163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21432173252105713, | |
| "step": 2985, | |
| "valid_targets_mean": 6529.7, | |
| "valid_targets_min": 2107 | |
| }, | |
| { | |
| "epoch": 4.807073954983923, | |
| "grad_norm": 0.4824083142982788, | |
| "learning_rate": 1.0830286681782058e-05, | |
| "loss": 0.2046, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17304901778697968, | |
| "step": 2990, | |
| "valid_targets_mean": 5721.2, | |
| "valid_targets_min": 3087 | |
| }, | |
| { | |
| "epoch": 4.815112540192926, | |
| "grad_norm": 0.4674232061795263, | |
| "learning_rate": 1.0759101275499436e-05, | |
| "loss": 0.1923, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1900356411933899, | |
| "step": 2995, | |
| "valid_targets_mean": 5728.6, | |
| "valid_targets_min": 274 | |
| }, | |
| { | |
| "epoch": 4.823151125401929, | |
| "grad_norm": 0.5691216186327724, | |
| "learning_rate": 1.0688064402752682e-05, | |
| "loss": 0.2097, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2550208270549774, | |
| "step": 3000, | |
| "valid_targets_mean": 5726.8, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 4.831189710610932, | |
| "grad_norm": 0.45136375661997785, | |
| "learning_rate": 1.0617177205352599e-05, | |
| "loss": 0.2128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23120132088661194, | |
| "step": 3005, | |
| "valid_targets_mean": 6514.9, | |
| "valid_targets_min": 2901 | |
| }, | |
| { | |
| "epoch": 4.839228295819936, | |
| "grad_norm": 0.4677445143572103, | |
| "learning_rate": 1.0546440822704161e-05, | |
| "loss": 0.1874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16181737184524536, | |
| "step": 3010, | |
| "valid_targets_mean": 5833.4, | |
| "valid_targets_min": 2259 | |
| }, | |
| { | |
| "epoch": 4.847266881028939, | |
| "grad_norm": 0.4303599978006847, | |
| "learning_rate": 1.047585639178824e-05, | |
| "loss": 0.2153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19590692222118378, | |
| "step": 3015, | |
| "valid_targets_mean": 6491.2, | |
| "valid_targets_min": 2829 | |
| }, | |
| { | |
| "epoch": 4.855305466237942, | |
| "grad_norm": 0.4968455190517692, | |
| "learning_rate": 1.0405425047143306e-05, | |
| "loss": 0.2097, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19899852573871613, | |
| "step": 3020, | |
| "valid_targets_mean": 4808.4, | |
| "valid_targets_min": 1306 | |
| }, | |
| { | |
| "epoch": 4.863344051446945, | |
| "grad_norm": 0.44258897347035453, | |
| "learning_rate": 1.0335147920847195e-05, | |
| "loss": 0.1965, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16738440096378326, | |
| "step": 3025, | |
| "valid_targets_mean": 6046.5, | |
| "valid_targets_min": 335 | |
| }, | |
| { | |
| "epoch": 4.871382636655948, | |
| "grad_norm": 0.4258283265646621, | |
| "learning_rate": 1.0265026142498931e-05, | |
| "loss": 0.1968, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17937082052230835, | |
| "step": 3030, | |
| "valid_targets_mean": 6212.4, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 4.879421221864952, | |
| "grad_norm": 0.4400634408990787, | |
| "learning_rate": 1.0195060839200509e-05, | |
| "loss": 0.1885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16629022359848022, | |
| "step": 3035, | |
| "valid_targets_mean": 5804.9, | |
| "valid_targets_min": 1181 | |
| }, | |
| { | |
| "epoch": 4.887459807073955, | |
| "grad_norm": 0.42872797950189373, | |
| "learning_rate": 1.0125253135538884e-05, | |
| "loss": 0.2127, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2543446123600006, | |
| "step": 3040, | |
| "valid_targets_mean": 7248.3, | |
| "valid_targets_min": 2207 | |
| }, | |
| { | |
| "epoch": 4.895498392282958, | |
| "grad_norm": 1.4172639399354463, | |
| "learning_rate": 1.0055604153567797e-05, | |
| "loss": 0.2012, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21109551191329956, | |
| "step": 3045, | |
| "valid_targets_mean": 4353.4, | |
| "valid_targets_min": 954 | |
| }, | |
| { | |
| "epoch": 4.903536977491961, | |
| "grad_norm": 0.516041539017509, | |
| "learning_rate": 9.98611501278979e-06, | |
| "loss": 0.1932, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21019768714904785, | |
| "step": 3050, | |
| "valid_targets_mean": 5650.9, | |
| "valid_targets_min": 2814 | |
| }, | |
| { | |
| "epoch": 4.911575562700965, | |
| "grad_norm": 0.48566131713136484, | |
| "learning_rate": 9.916786830138179e-06, | |
| "loss": 0.209, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2410656213760376, | |
| "step": 3055, | |
| "valid_targets_mean": 5875.2, | |
| "valid_targets_min": 2784 | |
| }, | |
| { | |
| "epoch": 4.919614147909968, | |
| "grad_norm": 0.46141499765797606, | |
| "learning_rate": 9.847620719959142e-06, | |
| "loss": 0.2079, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19822564721107483, | |
| "step": 3060, | |
| "valid_targets_mean": 6029.5, | |
| "valid_targets_min": 1188 | |
| }, | |
| { | |
| "epoch": 4.927652733118971, | |
| "grad_norm": 0.4994533341956204, | |
| "learning_rate": 9.77861779399377e-06, | |
| "loss": 0.2011, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18340840935707092, | |
| "step": 3065, | |
| "valid_targets_mean": 5793.5, | |
| "valid_targets_min": 419 | |
| }, | |
| { | |
| "epoch": 4.935691318327974, | |
| "grad_norm": 0.4787179730500465, | |
| "learning_rate": 9.709779161360238e-06, | |
| "loss": 0.1997, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1906869113445282, | |
| "step": 3070, | |
| "valid_targets_mean": 5818.4, | |
| "valid_targets_min": 1802 | |
| }, | |
| { | |
| "epoch": 4.943729903536978, | |
| "grad_norm": 0.49507384669911486, | |
| "learning_rate": 9.641105928535935e-06, | |
| "loss": 0.1881, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1777634173631668, | |
| "step": 3075, | |
| "valid_targets_mean": 5820.6, | |
| "valid_targets_min": 1582 | |
| }, | |
| { | |
| "epoch": 4.951768488745981, | |
| "grad_norm": 0.5046231532831364, | |
| "learning_rate": 9.572599199339685e-06, | |
| "loss": 0.2217, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16210362315177917, | |
| "step": 3080, | |
| "valid_targets_mean": 4660.9, | |
| "valid_targets_min": 2346 | |
| }, | |
| { | |
| "epoch": 4.959807073954984, | |
| "grad_norm": 0.4659755600893732, | |
| "learning_rate": 9.50426007491404e-06, | |
| "loss": 0.1901, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20857881009578705, | |
| "step": 3085, | |
| "valid_targets_mean": 5818.4, | |
| "valid_targets_min": 2268 | |
| }, | |
| { | |
| "epoch": 4.967845659163987, | |
| "grad_norm": 0.43079763715355324, | |
| "learning_rate": 9.436089653707546e-06, | |
| "loss": 0.2028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2077915072441101, | |
| "step": 3090, | |
| "valid_targets_mean": 6538.9, | |
| "valid_targets_min": 1969 | |
| }, | |
| { | |
| "epoch": 4.97588424437299, | |
| "grad_norm": 0.44859465981547175, | |
| "learning_rate": 9.368089031457099e-06, | |
| "loss": 0.1858, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1704615354537964, | |
| "step": 3095, | |
| "valid_targets_mean": 5725.5, | |
| "valid_targets_min": 671 | |
| }, | |
| { | |
| "epoch": 4.983922829581994, | |
| "grad_norm": 0.5131061791485555, | |
| "learning_rate": 9.30025930117034e-06, | |
| "loss": 0.2014, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20215710997581482, | |
| "step": 3100, | |
| "valid_targets_mean": 4396.8, | |
| "valid_targets_min": 1779 | |
| }, | |
| { | |
| "epoch": 4.991961414790997, | |
| "grad_norm": 0.46330324395499967, | |
| "learning_rate": 9.23260155310807e-06, | |
| "loss": 0.2038, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19142061471939087, | |
| "step": 3105, | |
| "valid_targets_mean": 5785.6, | |
| "valid_targets_min": 1052 | |
| }, | |
| { | |
| "epoch": 5.0, | |
| "grad_norm": 0.46419646695644273, | |
| "learning_rate": 9.165116874766739e-06, | |
| "loss": 0.1928, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21007190644741058, | |
| "step": 3110, | |
| "valid_targets_mean": 5836.8, | |
| "valid_targets_min": 699 | |
| }, | |
| { | |
| "epoch": 5.008038585209003, | |
| "grad_norm": 0.4979445134634955, | |
| "learning_rate": 9.097806350860963e-06, | |
| "loss": 0.1796, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15829969942569733, | |
| "step": 3115, | |
| "valid_targets_mean": 4624.7, | |
| "valid_targets_min": 2507 | |
| }, | |
| { | |
| "epoch": 5.016077170418006, | |
| "grad_norm": 0.4866122349627383, | |
| "learning_rate": 9.03067106330608e-06, | |
| "loss": 0.1842, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1525759994983673, | |
| "step": 3120, | |
| "valid_targets_mean": 4143.9, | |
| "valid_targets_min": 449 | |
| }, | |
| { | |
| "epoch": 5.02411575562701, | |
| "grad_norm": 0.5275864662151682, | |
| "learning_rate": 8.963712091200776e-06, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17857800424098969, | |
| "step": 3125, | |
| "valid_targets_mean": 4387.2, | |
| "valid_targets_min": 1658 | |
| }, | |
| { | |
| "epoch": 5.032154340836013, | |
| "grad_norm": 0.46221163971498497, | |
| "learning_rate": 8.896930510809729e-06, | |
| "loss": 0.182, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18663164973258972, | |
| "step": 3130, | |
| "valid_targets_mean": 5841.6, | |
| "valid_targets_min": 2836 | |
| }, | |
| { | |
| "epoch": 5.040192926045016, | |
| "grad_norm": 0.5585973033774798, | |
| "learning_rate": 8.83032739554631e-06, | |
| "loss": 0.1798, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21429578959941864, | |
| "step": 3135, | |
| "valid_targets_mean": 5289.9, | |
| "valid_targets_min": 700 | |
| }, | |
| { | |
| "epoch": 5.048231511254019, | |
| "grad_norm": 0.48266689854920847, | |
| "learning_rate": 8.763903815955343e-06, | |
| "loss": 0.172, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18342766165733337, | |
| "step": 3140, | |
| "valid_targets_mean": 5663.1, | |
| "valid_targets_min": 671 | |
| }, | |
| { | |
| "epoch": 5.056270096463023, | |
| "grad_norm": 0.47856495871810556, | |
| "learning_rate": 8.69766083969586e-06, | |
| "loss": 0.1825, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1659879833459854, | |
| "step": 3145, | |
| "valid_targets_mean": 5181.4, | |
| "valid_targets_min": 1288 | |
| }, | |
| { | |
| "epoch": 5.064308681672026, | |
| "grad_norm": 0.4583945660599572, | |
| "learning_rate": 8.631599531523984e-06, | |
| "loss": 0.1953, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19702893495559692, | |
| "step": 3150, | |
| "valid_targets_mean": 6398.4, | |
| "valid_targets_min": 2277 | |
| }, | |
| { | |
| "epoch": 5.072347266881029, | |
| "grad_norm": 0.4778149367767458, | |
| "learning_rate": 8.565720953275812e-06, | |
| "loss": 0.1894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21160531044006348, | |
| "step": 3155, | |
| "valid_targets_mean": 5688.0, | |
| "valid_targets_min": 2206 | |
| }, | |
| { | |
| "epoch": 5.080385852090032, | |
| "grad_norm": 0.5457542785844479, | |
| "learning_rate": 8.500026163850305e-06, | |
| "loss": 0.1752, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19538471102714539, | |
| "step": 3160, | |
| "valid_targets_mean": 5499.2, | |
| "valid_targets_min": 2174 | |
| }, | |
| { | |
| "epoch": 5.088424437299035, | |
| "grad_norm": 0.47663165704336896, | |
| "learning_rate": 8.434516219192322e-06, | |
| "loss": 0.2003, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21058623492717743, | |
| "step": 3165, | |
| "valid_targets_mean": 6227.2, | |
| "valid_targets_min": 705 | |
| }, | |
| { | |
| "epoch": 5.096463022508039, | |
| "grad_norm": 0.5270713782036919, | |
| "learning_rate": 8.369192172275582e-06, | |
| "loss": 0.2081, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21326503157615662, | |
| "step": 3170, | |
| "valid_targets_mean": 5780.4, | |
| "valid_targets_min": 946 | |
| }, | |
| { | |
| "epoch": 5.104501607717042, | |
| "grad_norm": 0.47972931998923163, | |
| "learning_rate": 8.304055073085815e-06, | |
| "loss": 0.1726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1448415368795395, | |
| "step": 3175, | |
| "valid_targets_mean": 4978.8, | |
| "valid_targets_min": 574 | |
| }, | |
| { | |
| "epoch": 5.112540192926045, | |
| "grad_norm": 0.5339301964753969, | |
| "learning_rate": 8.239105968603824e-06, | |
| "loss": 0.1822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21358755230903625, | |
| "step": 3180, | |
| "valid_targets_mean": 5267.1, | |
| "valid_targets_min": 2349 | |
| }, | |
| { | |
| "epoch": 5.120578778135048, | |
| "grad_norm": 0.43894080846451544, | |
| "learning_rate": 8.174345902788712e-06, | |
| "loss": 0.1685, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13934409618377686, | |
| "step": 3185, | |
| "valid_targets_mean": 5794.8, | |
| "valid_targets_min": 3113 | |
| }, | |
| { | |
| "epoch": 5.128617363344051, | |
| "grad_norm": 0.4377305756613267, | |
| "learning_rate": 8.109775916561026e-06, | |
| "loss": 0.1753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1634282022714615, | |
| "step": 3190, | |
| "valid_targets_mean": 6257.9, | |
| "valid_targets_min": 2573 | |
| }, | |
| { | |
| "epoch": 5.136655948553055, | |
| "grad_norm": 0.5100917432819688, | |
| "learning_rate": 8.045397047786103e-06, | |
| "loss": 0.2099, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18037614226341248, | |
| "step": 3195, | |
| "valid_targets_mean": 6120.2, | |
| "valid_targets_min": 1045 | |
| }, | |
| { | |
| "epoch": 5.144694533762058, | |
| "grad_norm": 0.4857411122320573, | |
| "learning_rate": 7.981210331257336e-06, | |
| "loss": 0.1756, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15463681519031525, | |
| "step": 3200, | |
| "valid_targets_mean": 4595.8, | |
| "valid_targets_min": 2333 | |
| }, | |
| { | |
| "epoch": 5.152733118971061, | |
| "grad_norm": 0.46626013958080015, | |
| "learning_rate": 7.917216798679575e-06, | |
| "loss": 0.2031, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2193758636713028, | |
| "step": 3205, | |
| "valid_targets_mean": 6617.2, | |
| "valid_targets_min": 1900 | |
| }, | |
| { | |
| "epoch": 5.160771704180064, | |
| "grad_norm": 0.6986387448162077, | |
| "learning_rate": 7.853417478652516e-06, | |
| "loss": 0.1865, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20716939866542816, | |
| "step": 3210, | |
| "valid_targets_mean": 5879.0, | |
| "valid_targets_min": 2331 | |
| }, | |
| { | |
| "epoch": 5.168810289389068, | |
| "grad_norm": 0.5316897024374411, | |
| "learning_rate": 7.78981339665419e-06, | |
| "loss": 0.1849, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17304065823554993, | |
| "step": 3215, | |
| "valid_targets_mean": 4310.8, | |
| "valid_targets_min": 628 | |
| }, | |
| { | |
| "epoch": 5.176848874598071, | |
| "grad_norm": 1.0402816301379194, | |
| "learning_rate": 7.72640557502446e-06, | |
| "loss": 0.1913, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18185323476791382, | |
| "step": 3220, | |
| "valid_targets_mean": 6621.9, | |
| "valid_targets_min": 2137 | |
| }, | |
| { | |
| "epoch": 5.184887459807074, | |
| "grad_norm": 0.5077283321168016, | |
| "learning_rate": 7.663195032948612e-06, | |
| "loss": 0.1937, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19564370810985565, | |
| "step": 3225, | |
| "valid_targets_mean": 5012.2, | |
| "valid_targets_min": 344 | |
| }, | |
| { | |
| "epoch": 5.192926045016077, | |
| "grad_norm": 0.4449974444997836, | |
| "learning_rate": 7.6001827864409574e-06, | |
| "loss": 0.1745, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19020414352416992, | |
| "step": 3230, | |
| "valid_targets_mean": 6730.3, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 5.20096463022508, | |
| "grad_norm": 0.45674780734397, | |
| "learning_rate": 7.537369848328482e-06, | |
| "loss": 0.1832, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19216638803482056, | |
| "step": 3235, | |
| "valid_targets_mean": 6425.2, | |
| "valid_targets_min": 3486 | |
| }, | |
| { | |
| "epoch": 5.209003215434084, | |
| "grad_norm": 0.5432545315748706, | |
| "learning_rate": 7.474757228234626e-06, | |
| "loss": 0.1889, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21978577971458435, | |
| "step": 3240, | |
| "valid_targets_mean": 4865.6, | |
| "valid_targets_min": 274 | |
| }, | |
| { | |
| "epoch": 5.217041800643087, | |
| "grad_norm": 0.4461775961777822, | |
| "learning_rate": 7.412345932562997e-06, | |
| "loss": 0.1916, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16950002312660217, | |
| "step": 3245, | |
| "valid_targets_mean": 5934.8, | |
| "valid_targets_min": 2995 | |
| }, | |
| { | |
| "epoch": 5.22508038585209, | |
| "grad_norm": 0.4956785865612323, | |
| "learning_rate": 7.350136964481218e-06, | |
| "loss": 0.1762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16553393006324768, | |
| "step": 3250, | |
| "valid_targets_mean": 4609.6, | |
| "valid_targets_min": 831 | |
| }, | |
| { | |
| "epoch": 5.233118971061093, | |
| "grad_norm": 0.478657006339967, | |
| "learning_rate": 7.288131323904812e-06, | |
| "loss": 0.1764, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17582754790782928, | |
| "step": 3255, | |
| "valid_targets_mean": 5341.8, | |
| "valid_targets_min": 3220 | |
| }, | |
| { | |
| "epoch": 5.241157556270096, | |
| "grad_norm": 0.48870709583594846, | |
| "learning_rate": 7.2263300074810925e-06, | |
| "loss": 0.2078, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2231275886297226, | |
| "step": 3260, | |
| "valid_targets_mean": 5515.4, | |
| "valid_targets_min": 2141 | |
| }, | |
| { | |
| "epoch": 5.2491961414791, | |
| "grad_norm": 0.5538754582625227, | |
| "learning_rate": 7.164734008573186e-06, | |
| "loss": 0.1982, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19707731902599335, | |
| "step": 3265, | |
| "valid_targets_mean": 6562.5, | |
| "valid_targets_min": 3759 | |
| }, | |
| { | |
| "epoch": 5.257234726688103, | |
| "grad_norm": 0.4983801668294154, | |
| "learning_rate": 7.103344317244065e-06, | |
| "loss": 0.1733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18967768549919128, | |
| "step": 3270, | |
| "valid_targets_mean": 5918.1, | |
| "valid_targets_min": 2175 | |
| }, | |
| { | |
| "epoch": 5.265273311897106, | |
| "grad_norm": 0.4586010946247427, | |
| "learning_rate": 7.0421619202406e-06, | |
| "loss": 0.1801, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18738381564617157, | |
| "step": 3275, | |
| "valid_targets_mean": 6154.1, | |
| "valid_targets_min": 2023 | |
| }, | |
| { | |
| "epoch": 5.273311897106109, | |
| "grad_norm": 0.48811893275730395, | |
| "learning_rate": 6.981187800977706e-06, | |
| "loss": 0.1899, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18499714136123657, | |
| "step": 3280, | |
| "valid_targets_mean": 5491.1, | |
| "valid_targets_min": 1459 | |
| }, | |
| { | |
| "epoch": 5.281350482315113, | |
| "grad_norm": 0.5268803552811097, | |
| "learning_rate": 6.920422939522573e-06, | |
| "loss": 0.1892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19314301013946533, | |
| "step": 3285, | |
| "valid_targets_mean": 4503.8, | |
| "valid_targets_min": 686 | |
| }, | |
| { | |
| "epoch": 5.289389067524116, | |
| "grad_norm": 0.46419450029948367, | |
| "learning_rate": 6.859868312578865e-06, | |
| "loss": 0.1766, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17092689871788025, | |
| "step": 3290, | |
| "valid_targets_mean": 6016.6, | |
| "valid_targets_min": 2676 | |
| }, | |
| { | |
| "epoch": 5.297427652733119, | |
| "grad_norm": 0.5047867299066612, | |
| "learning_rate": 6.799524893471055e-06, | |
| "loss": 0.2097, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18408851325511932, | |
| "step": 3295, | |
| "valid_targets_mean": 4765.6, | |
| "valid_targets_min": 955 | |
| }, | |
| { | |
| "epoch": 5.305466237942122, | |
| "grad_norm": 0.4638194012764823, | |
| "learning_rate": 6.739393652128776e-06, | |
| "loss": 0.1973, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17362159490585327, | |
| "step": 3300, | |
| "valid_targets_mean": 5807.0, | |
| "valid_targets_min": 2695 | |
| }, | |
| { | |
| "epoch": 5.313504823151125, | |
| "grad_norm": 0.518671277988672, | |
| "learning_rate": 6.679475555071198e-06, | |
| "loss": 0.1901, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1955638825893402, | |
| "step": 3305, | |
| "valid_targets_mean": 4877.8, | |
| "valid_targets_min": 3273 | |
| }, | |
| { | |
| "epoch": 5.321543408360129, | |
| "grad_norm": 0.4312127415985306, | |
| "learning_rate": 6.619771565391535e-06, | |
| "loss": 0.2005, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1737706959247589, | |
| "step": 3310, | |
| "valid_targets_mean": 6052.8, | |
| "valid_targets_min": 525 | |
| }, | |
| { | |
| "epoch": 5.329581993569132, | |
| "grad_norm": 0.4577678581026521, | |
| "learning_rate": 6.560282642741544e-06, | |
| "loss": 0.1737, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17677032947540283, | |
| "step": 3315, | |
| "valid_targets_mean": 5634.6, | |
| "valid_targets_min": 2590 | |
| }, | |
| { | |
| "epoch": 5.337620578778135, | |
| "grad_norm": 0.47138237498351343, | |
| "learning_rate": 6.501009743316105e-06, | |
| "loss": 0.1948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20099903643131256, | |
| "step": 3320, | |
| "valid_targets_mean": 5851.9, | |
| "valid_targets_min": 2074 | |
| }, | |
| { | |
| "epoch": 5.345659163987138, | |
| "grad_norm": 0.4598019513900346, | |
| "learning_rate": 6.441953819837845e-06, | |
| "loss": 0.1992, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18965350091457367, | |
| "step": 3325, | |
| "valid_targets_mean": 6094.8, | |
| "valid_targets_min": 2205 | |
| }, | |
| { | |
| "epoch": 5.353697749196142, | |
| "grad_norm": 0.483304158605293, | |
| "learning_rate": 6.383115821541834e-06, | |
| "loss": 0.201, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22649700939655304, | |
| "step": 3330, | |
| "valid_targets_mean": 5885.5, | |
| "valid_targets_min": 2844 | |
| }, | |
| { | |
| "epoch": 5.361736334405145, | |
| "grad_norm": 0.4779634450187073, | |
| "learning_rate": 6.324496694160318e-06, | |
| "loss": 0.1955, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21910080313682556, | |
| "step": 3335, | |
| "valid_targets_mean": 5985.5, | |
| "valid_targets_min": 2095 | |
| }, | |
| { | |
| "epoch": 5.369774919614148, | |
| "grad_norm": 0.5067078850486267, | |
| "learning_rate": 6.266097379907534e-06, | |
| "loss": 0.1849, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17074373364448547, | |
| "step": 3340, | |
| "valid_targets_mean": 5271.4, | |
| "valid_targets_min": 818 | |
| }, | |
| { | |
| "epoch": 5.377813504823151, | |
| "grad_norm": 0.5595730160635201, | |
| "learning_rate": 6.2079188174645335e-06, | |
| "loss": 0.2064, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22083780169487, | |
| "step": 3345, | |
| "valid_targets_mean": 6631.5, | |
| "valid_targets_min": 1702 | |
| }, | |
| { | |
| "epoch": 5.385852090032154, | |
| "grad_norm": 0.5656540952095462, | |
| "learning_rate": 6.149961941964122e-06, | |
| "loss": 0.1902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2227296233177185, | |
| "step": 3350, | |
| "valid_targets_mean": 4248.2, | |
| "valid_targets_min": 726 | |
| }, | |
| { | |
| "epoch": 5.393890675241158, | |
| "grad_norm": 0.4504085084825842, | |
| "learning_rate": 6.09222768497584e-06, | |
| "loss": 0.2097, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17969395220279694, | |
| "step": 3355, | |
| "valid_targets_mean": 6162.1, | |
| "valid_targets_min": 509 | |
| }, | |
| { | |
| "epoch": 5.401929260450161, | |
| "grad_norm": 0.5146997355757669, | |
| "learning_rate": 6.034716974490953e-06, | |
| "loss": 0.1999, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24291618168354034, | |
| "step": 3360, | |
| "valid_targets_mean": 5900.8, | |
| "valid_targets_min": 3057 | |
| }, | |
| { | |
| "epoch": 5.409967845659164, | |
| "grad_norm": 0.5351421245212054, | |
| "learning_rate": 5.977430734907559e-06, | |
| "loss": 0.1839, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18290022015571594, | |
| "step": 3365, | |
| "valid_targets_mean": 5167.9, | |
| "valid_targets_min": 755 | |
| }, | |
| { | |
| "epoch": 5.418006430868167, | |
| "grad_norm": 0.5014907743840198, | |
| "learning_rate": 5.9203698870157114e-06, | |
| "loss": 0.1849, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2125318944454193, | |
| "step": 3370, | |
| "valid_targets_mean": 6113.8, | |
| "valid_targets_min": 1147 | |
| }, | |
| { | |
| "epoch": 5.42604501607717, | |
| "grad_norm": 0.47556758623766643, | |
| "learning_rate": 5.863535347982645e-06, | |
| "loss": 0.1867, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20354855060577393, | |
| "step": 3375, | |
| "valid_targets_mean": 6618.1, | |
| "valid_targets_min": 3805 | |
| }, | |
| { | |
| "epoch": 5.434083601286174, | |
| "grad_norm": 0.49228060520233496, | |
| "learning_rate": 5.806928031338015e-06, | |
| "loss": 0.188, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1746979057788849, | |
| "step": 3380, | |
| "valid_targets_mean": 5142.9, | |
| "valid_targets_min": 343 | |
| }, | |
| { | |
| "epoch": 5.442122186495177, | |
| "grad_norm": 0.4658584374183205, | |
| "learning_rate": 5.750548846959234e-06, | |
| "loss": 0.1823, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18658800423145294, | |
| "step": 3385, | |
| "valid_targets_mean": 5597.9, | |
| "valid_targets_min": 1719 | |
| }, | |
| { | |
| "epoch": 5.45016077170418, | |
| "grad_norm": 0.46391846520510605, | |
| "learning_rate": 5.694398701056801e-06, | |
| "loss": 0.1747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.165060892701149, | |
| "step": 3390, | |
| "valid_targets_mean": 5311.4, | |
| "valid_targets_min": 683 | |
| }, | |
| { | |
| "epoch": 5.458199356913183, | |
| "grad_norm": 0.46640745139708, | |
| "learning_rate": 5.638478496159791e-06, | |
| "loss": 0.1809, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22248056530952454, | |
| "step": 3395, | |
| "valid_targets_mean": 7049.8, | |
| "valid_targets_min": 2341 | |
| }, | |
| { | |
| "epoch": 5.466237942122186, | |
| "grad_norm": 0.488353156536436, | |
| "learning_rate": 5.5827891311013185e-06, | |
| "loss": 0.1999, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22211244702339172, | |
| "step": 3400, | |
| "valid_targets_mean": 5960.9, | |
| "valid_targets_min": 2687 | |
| }, | |
| { | |
| "epoch": 5.47427652733119, | |
| "grad_norm": 0.5126088033141332, | |
| "learning_rate": 5.52733150100409e-06, | |
| "loss": 0.1935, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20660114288330078, | |
| "step": 3405, | |
| "valid_targets_mean": 5240.8, | |
| "valid_targets_min": 359 | |
| }, | |
| { | |
| "epoch": 5.482315112540193, | |
| "grad_norm": 0.4838482712196251, | |
| "learning_rate": 5.472106497266023e-06, | |
| "loss": 0.1963, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17103157937526703, | |
| "step": 3410, | |
| "valid_targets_mean": 5297.1, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 5.490353697749196, | |
| "grad_norm": 0.4984237613144674, | |
| "learning_rate": 5.417115007545921e-06, | |
| "loss": 0.193, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22633609175682068, | |
| "step": 3415, | |
| "valid_targets_mean": 6094.9, | |
| "valid_targets_min": 904 | |
| }, | |
| { | |
| "epoch": 5.498392282958199, | |
| "grad_norm": 0.523636320883912, | |
| "learning_rate": 5.362357915749194e-06, | |
| "loss": 0.2048, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21240352094173431, | |
| "step": 3420, | |
| "valid_targets_mean": 4989.8, | |
| "valid_targets_min": 2961 | |
| }, | |
| { | |
| "epoch": 5.506430868167203, | |
| "grad_norm": 0.48044513730538596, | |
| "learning_rate": 5.307836102013666e-06, | |
| "loss": 0.19, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15104694664478302, | |
| "step": 3425, | |
| "valid_targets_mean": 5328.5, | |
| "valid_targets_min": 1807 | |
| }, | |
| { | |
| "epoch": 5.514469453376206, | |
| "grad_norm": 0.4751176561827757, | |
| "learning_rate": 5.2535504426954145e-06, | |
| "loss": 0.1936, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22195205092430115, | |
| "step": 3430, | |
| "valid_targets_mean": 6118.9, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 5.522508038585209, | |
| "grad_norm": 0.48516517392290076, | |
| "learning_rate": 5.199501810354692e-06, | |
| "loss": 0.1741, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15345501899719238, | |
| "step": 3435, | |
| "valid_targets_mean": 5330.9, | |
| "valid_targets_min": 2513 | |
| }, | |
| { | |
| "epoch": 5.530546623794212, | |
| "grad_norm": 0.5121095480216645, | |
| "learning_rate": 5.145691073741898e-06, | |
| "loss": 0.1905, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18172189593315125, | |
| "step": 3440, | |
| "valid_targets_mean": 4640.0, | |
| "valid_targets_min": 1492 | |
| }, | |
| { | |
| "epoch": 5.538585209003215, | |
| "grad_norm": 0.49530201825529846, | |
| "learning_rate": 5.092119097783621e-06, | |
| "loss": 0.1875, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18862499296665192, | |
| "step": 3445, | |
| "valid_targets_mean": 6436.6, | |
| "valid_targets_min": 2814 | |
| }, | |
| { | |
| "epoch": 5.546623794212219, | |
| "grad_norm": 0.49091742317146175, | |
| "learning_rate": 5.0387867435687265e-06, | |
| "loss": 0.2009, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20091378688812256, | |
| "step": 3450, | |
| "valid_targets_mean": 5780.8, | |
| "valid_targets_min": 3062 | |
| }, | |
| { | |
| "epoch": 5.554662379421222, | |
| "grad_norm": 0.4462477186654765, | |
| "learning_rate": 4.9856948683345255e-06, | |
| "loss": 0.1797, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1533946990966797, | |
| "step": 3455, | |
| "valid_targets_mean": 5978.1, | |
| "valid_targets_min": 1061 | |
| }, | |
| { | |
| "epoch": 5.562700964630225, | |
| "grad_norm": 0.5033789822369987, | |
| "learning_rate": 4.9328443254529765e-06, | |
| "loss": 0.1842, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17586976289749146, | |
| "step": 3460, | |
| "valid_targets_mean": 5279.8, | |
| "valid_targets_min": 2018 | |
| }, | |
| { | |
| "epoch": 5.570739549839228, | |
| "grad_norm": 0.4705676576038694, | |
| "learning_rate": 4.880235964417009e-06, | |
| "loss": 0.1854, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17664816975593567, | |
| "step": 3465, | |
| "valid_targets_mean": 5694.1, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 5.578778135048232, | |
| "grad_norm": 0.47218177649090964, | |
| "learning_rate": 4.827870630826826e-06, | |
| "loss": 0.1796, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16814285516738892, | |
| "step": 3470, | |
| "valid_targets_mean": 5590.7, | |
| "valid_targets_min": 2212 | |
| }, | |
| { | |
| "epoch": 5.586816720257235, | |
| "grad_norm": 0.48717963339489373, | |
| "learning_rate": 4.775749166376342e-06, | |
| "loss": 0.1962, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1970927119255066, | |
| "step": 3475, | |
| "valid_targets_mean": 5571.5, | |
| "valid_targets_min": 3288 | |
| }, | |
| { | |
| "epoch": 5.594855305466238, | |
| "grad_norm": 0.4948032769777901, | |
| "learning_rate": 4.723872408839623e-06, | |
| "loss": 0.1927, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21700280904769897, | |
| "step": 3480, | |
| "valid_targets_mean": 5748.1, | |
| "valid_targets_min": 872 | |
| }, | |
| { | |
| "epoch": 5.602893890675241, | |
| "grad_norm": 0.4284945969590602, | |
| "learning_rate": 4.672241192057456e-06, | |
| "loss": 0.1985, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1854320764541626, | |
| "step": 3485, | |
| "valid_targets_mean": 6890.2, | |
| "valid_targets_min": 4311 | |
| }, | |
| { | |
| "epoch": 5.610932475884244, | |
| "grad_norm": 0.5279346730463442, | |
| "learning_rate": 4.620856345923928e-06, | |
| "loss": 0.1805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15860424935817719, | |
| "step": 3490, | |
| "valid_targets_mean": 5110.0, | |
| "valid_targets_min": 2205 | |
| }, | |
| { | |
| "epoch": 5.618971061093248, | |
| "grad_norm": 0.48080273997531287, | |
| "learning_rate": 4.56971869637308e-06, | |
| "loss": 0.1789, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18217507004737854, | |
| "step": 3495, | |
| "valid_targets_mean": 5349.5, | |
| "valid_targets_min": 1174 | |
| }, | |
| { | |
| "epoch": 5.627009646302251, | |
| "grad_norm": 0.47073465398426945, | |
| "learning_rate": 4.518829065365664e-06, | |
| "loss": 0.1734, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18675915896892548, | |
| "step": 3500, | |
| "valid_targets_mean": 5757.6, | |
| "valid_targets_min": 3083 | |
| }, | |
| { | |
| "epoch": 5.635048231511254, | |
| "grad_norm": 0.49790647435675167, | |
| "learning_rate": 4.468188270875873e-06, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16006457805633545, | |
| "step": 3505, | |
| "valid_targets_mean": 5294.3, | |
| "valid_targets_min": 576 | |
| }, | |
| { | |
| "epoch": 5.643086816720257, | |
| "grad_norm": 0.4831653661292901, | |
| "learning_rate": 4.417797126878247e-06, | |
| "loss": 0.1707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17651981115341187, | |
| "step": 3510, | |
| "valid_targets_mean": 5286.9, | |
| "valid_targets_min": 466 | |
| }, | |
| { | |
| "epoch": 5.65112540192926, | |
| "grad_norm": 0.43558748477821085, | |
| "learning_rate": 4.367656443334574e-06, | |
| "loss": 0.1937, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19159835577011108, | |
| "step": 3515, | |
| "valid_targets_mean": 5938.7, | |
| "valid_targets_min": 1799 | |
| }, | |
| { | |
| "epoch": 5.659163987138264, | |
| "grad_norm": 0.4302587876871802, | |
| "learning_rate": 4.317767026180854e-06, | |
| "loss": 0.1959, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.172641783952713, | |
| "step": 3520, | |
| "valid_targets_mean": 6267.8, | |
| "valid_targets_min": 3421 | |
| }, | |
| { | |
| "epoch": 5.667202572347267, | |
| "grad_norm": 0.6167202636928861, | |
| "learning_rate": 4.26812967731437e-06, | |
| "loss": 0.2008, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2416028082370758, | |
| "step": 3525, | |
| "valid_targets_mean": 5056.8, | |
| "valid_targets_min": 297 | |
| }, | |
| { | |
| "epoch": 5.67524115755627, | |
| "grad_norm": 0.4939602441719698, | |
| "learning_rate": 4.218745194580778e-06, | |
| "loss": 0.1866, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21441826224327087, | |
| "step": 3530, | |
| "valid_targets_mean": 5435.4, | |
| "valid_targets_min": 1059 | |
| }, | |
| { | |
| "epoch": 5.683279742765273, | |
| "grad_norm": 0.4756465888695026, | |
| "learning_rate": 4.169614371761299e-06, | |
| "loss": 0.1933, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16448351740837097, | |
| "step": 3535, | |
| "valid_targets_mean": 5683.4, | |
| "valid_targets_min": 2153 | |
| }, | |
| { | |
| "epoch": 5.691318327974276, | |
| "grad_norm": 0.48743402946073533, | |
| "learning_rate": 4.120737998559945e-06, | |
| "loss": 0.2004, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1667463630437851, | |
| "step": 3540, | |
| "valid_targets_mean": 6335.8, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 5.69935691318328, | |
| "grad_norm": 0.5120939404906634, | |
| "learning_rate": 4.07211686059084e-06, | |
| "loss": 0.1837, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19326308369636536, | |
| "step": 3545, | |
| "valid_targets_mean": 5092.7, | |
| "valid_targets_min": 856 | |
| }, | |
| { | |
| "epoch": 5.707395498392283, | |
| "grad_norm": 0.5632246626673505, | |
| "learning_rate": 4.023751739365583e-06, | |
| "loss": 0.1821, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20783701539039612, | |
| "step": 3550, | |
| "valid_targets_mean": 4398.4, | |
| "valid_targets_min": 786 | |
| }, | |
| { | |
| "epoch": 5.715434083601286, | |
| "grad_norm": 0.586002529442884, | |
| "learning_rate": 3.975643412280687e-06, | |
| "loss": 0.1932, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18774420022964478, | |
| "step": 3555, | |
| "valid_targets_mean": 4162.2, | |
| "valid_targets_min": 449 | |
| }, | |
| { | |
| "epoch": 5.723472668810289, | |
| "grad_norm": 0.449918412147885, | |
| "learning_rate": 3.9277926526050915e-06, | |
| "loss": 0.1905, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21148762106895447, | |
| "step": 3560, | |
| "valid_targets_mean": 6406.3, | |
| "valid_targets_min": 2032 | |
| }, | |
| { | |
| "epoch": 5.731511254019293, | |
| "grad_norm": 0.5102625262179785, | |
| "learning_rate": 3.8802002294677295e-06, | |
| "loss": 0.1924, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2425631582736969, | |
| "step": 3565, | |
| "valid_targets_mean": 5427.5, | |
| "valid_targets_min": 453 | |
| }, | |
| { | |
| "epoch": 5.739549839228296, | |
| "grad_norm": 0.39393512706795425, | |
| "learning_rate": 3.8328669078451454e-06, | |
| "loss": 0.1802, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16832399368286133, | |
| "step": 3570, | |
| "valid_targets_mean": 7606.4, | |
| "valid_targets_min": 4314 | |
| }, | |
| { | |
| "epoch": 5.747588424437299, | |
| "grad_norm": 0.4840682615333142, | |
| "learning_rate": 3.7857934485492308e-06, | |
| "loss": 0.1948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16461750864982605, | |
| "step": 3575, | |
| "valid_targets_mean": 5850.2, | |
| "valid_targets_min": 2446 | |
| }, | |
| { | |
| "epoch": 5.755627009646302, | |
| "grad_norm": 0.48701845226790647, | |
| "learning_rate": 3.738980608214988e-06, | |
| "loss": 0.1982, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2235550880432129, | |
| "step": 3580, | |
| "valid_targets_mean": 5985.8, | |
| "valid_targets_min": 1647 | |
| }, | |
| { | |
| "epoch": 5.763665594855306, | |
| "grad_norm": 0.4915821728484182, | |
| "learning_rate": 3.6924291392883537e-06, | |
| "loss": 0.2026, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1905771791934967, | |
| "step": 3585, | |
| "valid_targets_mean": 5496.2, | |
| "valid_targets_min": 407 | |
| }, | |
| { | |
| "epoch": 5.771704180064309, | |
| "grad_norm": 0.5265224188216318, | |
| "learning_rate": 3.646139790014114e-06, | |
| "loss": 0.2002, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1957128345966339, | |
| "step": 3590, | |
| "valid_targets_mean": 5012.4, | |
| "valid_targets_min": 1706 | |
| }, | |
| { | |
| "epoch": 5.779742765273312, | |
| "grad_norm": 0.45450286343018304, | |
| "learning_rate": 3.6001133044238646e-06, | |
| "loss": 0.1822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22709839046001434, | |
| "step": 3595, | |
| "valid_targets_mean": 7698.6, | |
| "valid_targets_min": 2364 | |
| }, | |
| { | |
| "epoch": 5.787781350482315, | |
| "grad_norm": 0.47530926294274556, | |
| "learning_rate": 3.554350422324082e-06, | |
| "loss": 0.1922, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1802034229040146, | |
| "step": 3600, | |
| "valid_targets_mean": 5487.3, | |
| "valid_targets_min": 977 | |
| }, | |
| { | |
| "epoch": 5.795819935691318, | |
| "grad_norm": 0.601663523973185, | |
| "learning_rate": 3.5088518792841986e-06, | |
| "loss": 0.186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20563310384750366, | |
| "step": 3605, | |
| "valid_targets_mean": 4071.0, | |
| "valid_targets_min": 955 | |
| }, | |
| { | |
| "epoch": 5.803858520900322, | |
| "grad_norm": 0.5048571767634144, | |
| "learning_rate": 3.463618406624816e-06, | |
| "loss": 0.178, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1549883484840393, | |
| "step": 3610, | |
| "valid_targets_mean": 5174.1, | |
| "valid_targets_min": 828 | |
| }, | |
| { | |
| "epoch": 5.811897106109325, | |
| "grad_norm": 0.5393411182416932, | |
| "learning_rate": 3.418650731405897e-06, | |
| "loss": 0.201, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16713210940361023, | |
| "step": 3615, | |
| "valid_targets_mean": 5148.2, | |
| "valid_targets_min": 1475 | |
| }, | |
| { | |
| "epoch": 5.819935691318328, | |
| "grad_norm": 0.5925366852450519, | |
| "learning_rate": 3.373949576415132e-06, | |
| "loss": 0.1991, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19236284494400024, | |
| "step": 3620, | |
| "valid_targets_mean": 4236.4, | |
| "valid_targets_min": 417 | |
| }, | |
| { | |
| "epoch": 5.827974276527331, | |
| "grad_norm": 0.4981596093711268, | |
| "learning_rate": 3.329515660156297e-06, | |
| "loss": 0.1988, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2149473875761032, | |
| "step": 3625, | |
| "valid_targets_mean": 4993.6, | |
| "valid_targets_min": 1612 | |
| }, | |
| { | |
| "epoch": 5.836012861736334, | |
| "grad_norm": 0.43691243542885205, | |
| "learning_rate": 3.2853496968377054e-06, | |
| "loss": 0.1885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1537531167268753, | |
| "step": 3630, | |
| "valid_targets_mean": 5682.8, | |
| "valid_targets_min": 355 | |
| }, | |
| { | |
| "epoch": 5.844051446945338, | |
| "grad_norm": 0.5038024496320214, | |
| "learning_rate": 3.2414523963607313e-06, | |
| "loss": 0.2012, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19316399097442627, | |
| "step": 3635, | |
| "valid_targets_mean": 5242.4, | |
| "valid_targets_min": 268 | |
| }, | |
| { | |
| "epoch": 5.852090032154341, | |
| "grad_norm": 0.48583257296181714, | |
| "learning_rate": 3.1978244643083944e-06, | |
| "loss": 0.1948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1713036298751831, | |
| "step": 3640, | |
| "valid_targets_mean": 5449.9, | |
| "valid_targets_min": 793 | |
| }, | |
| { | |
| "epoch": 5.860128617363344, | |
| "grad_norm": 0.5008462037330859, | |
| "learning_rate": 3.1544666019340274e-06, | |
| "loss": 0.1689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1978035867214203, | |
| "step": 3645, | |
| "valid_targets_mean": 6283.9, | |
| "valid_targets_min": 2974 | |
| }, | |
| { | |
| "epoch": 5.868167202572347, | |
| "grad_norm": 0.4767464107670729, | |
| "learning_rate": 3.1113795061499985e-06, | |
| "loss": 0.1747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1628812849521637, | |
| "step": 3650, | |
| "valid_targets_mean": 5884.2, | |
| "valid_targets_min": 1504 | |
| }, | |
| { | |
| "epoch": 5.87620578778135, | |
| "grad_norm": 0.5560475071125592, | |
| "learning_rate": 3.068563869516512e-06, | |
| "loss": 0.1944, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17907622456550598, | |
| "step": 3655, | |
| "valid_targets_mean": 3833.8, | |
| "valid_targets_min": 398 | |
| }, | |
| { | |
| "epoch": 5.884244372990354, | |
| "grad_norm": 0.7011000684899574, | |
| "learning_rate": 3.026020380230461e-06, | |
| "loss": 0.1961, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1727597713470459, | |
| "step": 3660, | |
| "valid_targets_mean": 5416.0, | |
| "valid_targets_min": 505 | |
| }, | |
| { | |
| "epoch": 5.892282958199357, | |
| "grad_norm": 0.5018255589278392, | |
| "learning_rate": 2.983749722114404e-06, | |
| "loss": 0.2058, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21103069186210632, | |
| "step": 3665, | |
| "valid_targets_mean": 6098.8, | |
| "valid_targets_min": 546 | |
| }, | |
| { | |
| "epoch": 5.90032154340836, | |
| "grad_norm": 0.4685610942564552, | |
| "learning_rate": 2.941752574605534e-06, | |
| "loss": 0.1803, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18297003209590912, | |
| "step": 3670, | |
| "valid_targets_mean": 6018.0, | |
| "valid_targets_min": 2116 | |
| }, | |
| { | |
| "epoch": 5.908360128617363, | |
| "grad_norm": 0.4933707589439783, | |
| "learning_rate": 2.900029612744777e-06, | |
| "loss": 0.196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2093782275915146, | |
| "step": 3675, | |
| "valid_targets_mean": 5958.1, | |
| "valid_targets_min": 1882 | |
| }, | |
| { | |
| "epoch": 5.916398713826366, | |
| "grad_norm": 0.465397474042975, | |
| "learning_rate": 2.8585815071659274e-06, | |
| "loss": 0.1856, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.13250520825386047, | |
| "step": 3680, | |
| "valid_targets_mean": 5000.4, | |
| "valid_targets_min": 256 | |
| }, | |
| { | |
| "epoch": 5.92443729903537, | |
| "grad_norm": 0.4861895013069857, | |
| "learning_rate": 2.8174089240848923e-06, | |
| "loss": 0.1875, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17181509733200073, | |
| "step": 3685, | |
| "valid_targets_mean": 5327.6, | |
| "valid_targets_min": 964 | |
| }, | |
| { | |
| "epoch": 5.932475884244373, | |
| "grad_norm": 0.4822977088482712, | |
| "learning_rate": 2.776512525288959e-06, | |
| "loss": 0.1828, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18289127945899963, | |
| "step": 3690, | |
| "valid_targets_mean": 5464.9, | |
| "valid_targets_min": 1861 | |
| }, | |
| { | |
| "epoch": 5.940514469453376, | |
| "grad_norm": 0.5287623887945523, | |
| "learning_rate": 2.7358929681261836e-06, | |
| "loss": 0.1847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1922057718038559, | |
| "step": 3695, | |
| "valid_targets_mean": 4948.4, | |
| "valid_targets_min": 804 | |
| }, | |
| { | |
| "epoch": 5.948553054662379, | |
| "grad_norm": 0.48382172273456314, | |
| "learning_rate": 2.695550905494797e-06, | |
| "loss": 0.1903, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15750959515571594, | |
| "step": 3700, | |
| "valid_targets_mean": 4836.6, | |
| "valid_targets_min": 1302 | |
| }, | |
| { | |
| "epoch": 5.956591639871383, | |
| "grad_norm": 0.5233715291485577, | |
| "learning_rate": 2.655486985832718e-06, | |
| "loss": 0.1805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18856148421764374, | |
| "step": 3705, | |
| "valid_targets_mean": 5918.6, | |
| "valid_targets_min": 636 | |
| }, | |
| { | |
| "epoch": 5.964630225080386, | |
| "grad_norm": 0.48717200725467924, | |
| "learning_rate": 2.615701853107149e-06, | |
| "loss": 0.202, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22136838734149933, | |
| "step": 3710, | |
| "valid_targets_mean": 5422.0, | |
| "valid_targets_min": 2412 | |
| }, | |
| { | |
| "epoch": 5.972668810289389, | |
| "grad_norm": 0.48597064916098776, | |
| "learning_rate": 2.576196146804204e-06, | |
| "loss": 0.1754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14957702159881592, | |
| "step": 3715, | |
| "valid_targets_mean": 4877.0, | |
| "valid_targets_min": 726 | |
| }, | |
| { | |
| "epoch": 5.980707395498392, | |
| "grad_norm": 0.452012801780017, | |
| "learning_rate": 2.536970501918643e-06, | |
| "loss": 0.1757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15257996320724487, | |
| "step": 3720, | |
| "valid_targets_mean": 5929.3, | |
| "valid_targets_min": 514 | |
| }, | |
| { | |
| "epoch": 5.988745980707396, | |
| "grad_norm": 0.483217937140563, | |
| "learning_rate": 2.4980255489436565e-06, | |
| "loss": 0.1997, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20844154059886932, | |
| "step": 3725, | |
| "valid_targets_mean": 5669.2, | |
| "valid_targets_min": 1328 | |
| }, | |
| { | |
| "epoch": 5.996784565916399, | |
| "grad_norm": 0.4724190619750239, | |
| "learning_rate": 2.459361913860736e-06, | |
| "loss": 0.1843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1832464635372162, | |
| "step": 3730, | |
| "valid_targets_mean": 5815.4, | |
| "valid_targets_min": 2576 | |
| }, | |
| { | |
| "epoch": 6.004823151125402, | |
| "grad_norm": 0.83317522207539, | |
| "learning_rate": 2.4209802181296136e-06, | |
| "loss": 0.1736, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1619970053434372, | |
| "step": 3735, | |
| "valid_targets_mean": 5549.2, | |
| "valid_targets_min": 964 | |
| }, | |
| { | |
| "epoch": 6.012861736334405, | |
| "grad_norm": 0.5154255877325745, | |
| "learning_rate": 2.3828810786782674e-06, | |
| "loss": 0.1965, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1856580525636673, | |
| "step": 3740, | |
| "valid_targets_mean": 5616.6, | |
| "valid_targets_min": 1732 | |
| }, | |
| { | |
| "epoch": 6.020900321543408, | |
| "grad_norm": 0.49816618610731805, | |
| "learning_rate": 2.34506510789301e-06, | |
| "loss": 0.1982, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18047022819519043, | |
| "step": 3745, | |
| "valid_targets_mean": 5633.9, | |
| "valid_targets_min": 2586 | |
| }, | |
| { | |
| "epoch": 6.028938906752412, | |
| "grad_norm": 0.6550527611073069, | |
| "learning_rate": 2.3075329136086434e-06, | |
| "loss": 0.1808, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17386046051979065, | |
| "step": 3750, | |
| "valid_targets_mean": 5468.9, | |
| "valid_targets_min": 457 | |
| }, | |
| { | |
| "epoch": 6.036977491961415, | |
| "grad_norm": 0.47279868176005735, | |
| "learning_rate": 2.2702850990986924e-06, | |
| "loss": 0.1827, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1906416267156601, | |
| "step": 3755, | |
| "valid_targets_mean": 6038.7, | |
| "valid_targets_min": 3062 | |
| }, | |
| { | |
| "epoch": 6.045016077170418, | |
| "grad_norm": 0.4933331908426635, | |
| "learning_rate": 2.2333222630657025e-06, | |
| "loss": 0.1658, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17692366242408752, | |
| "step": 3760, | |
| "valid_targets_mean": 5261.4, | |
| "valid_targets_min": 1379 | |
| }, | |
| { | |
| "epoch": 6.053054662379421, | |
| "grad_norm": 0.8130085639082187, | |
| "learning_rate": 2.1966449996316208e-06, | |
| "loss": 0.1826, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1743280589580536, | |
| "step": 3765, | |
| "valid_targets_mean": 5865.8, | |
| "valid_targets_min": 1931 | |
| }, | |
| { | |
| "epoch": 6.061093247588424, | |
| "grad_norm": 0.5466815490190559, | |
| "learning_rate": 2.1602538983282394e-06, | |
| "loss": 0.1846, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1809442937374115, | |
| "step": 3770, | |
| "valid_targets_mean": 6375.1, | |
| "valid_targets_min": 4224 | |
| }, | |
| { | |
| "epoch": 6.069131832797428, | |
| "grad_norm": 0.49542785389406413, | |
| "learning_rate": 2.1241495440877258e-06, | |
| "loss": 0.1898, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1862630546092987, | |
| "step": 3775, | |
| "valid_targets_mean": 5147.9, | |
| "valid_targets_min": 807 | |
| }, | |
| { | |
| "epoch": 6.077170418006431, | |
| "grad_norm": 0.47022987712208747, | |
| "learning_rate": 2.0883325172332357e-06, | |
| "loss": 0.1717, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16111014783382416, | |
| "step": 3780, | |
| "valid_targets_mean": 5491.1, | |
| "valid_targets_min": 2277 | |
| }, | |
| { | |
| "epoch": 6.085209003215434, | |
| "grad_norm": 0.47395693170698344, | |
| "learning_rate": 2.052803393469549e-06, | |
| "loss": 0.1751, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18137693405151367, | |
| "step": 3785, | |
| "valid_targets_mean": 5670.7, | |
| "valid_targets_min": 923 | |
| }, | |
| { | |
| "epoch": 6.093247588424437, | |
| "grad_norm": 0.48294409022389034, | |
| "learning_rate": 2.0175627438738488e-06, | |
| "loss": 0.17, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1723235547542572, | |
| "step": 3790, | |
| "valid_targets_mean": 5502.2, | |
| "valid_targets_min": 1221 | |
| }, | |
| { | |
| "epoch": 6.101286173633441, | |
| "grad_norm": 0.5705804543456575, | |
| "learning_rate": 1.9826111348865227e-06, | |
| "loss": 0.1768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1478518843650818, | |
| "step": 3795, | |
| "valid_targets_mean": 4955.9, | |
| "valid_targets_min": 2887 | |
| }, | |
| { | |
| "epoch": 6.109324758842444, | |
| "grad_norm": 0.509692421121887, | |
| "learning_rate": 1.9479491283020714e-06, | |
| "loss": 0.1837, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20045389235019684, | |
| "step": 3800, | |
| "valid_targets_mean": 5041.6, | |
| "valid_targets_min": 954 | |
| }, | |
| { | |
| "epoch": 6.117363344051447, | |
| "grad_norm": 0.4866828715923054, | |
| "learning_rate": 1.913577281260066e-06, | |
| "loss": 0.1851, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18292544782161713, | |
| "step": 3805, | |
| "valid_targets_mean": 5632.7, | |
| "valid_targets_min": 598 | |
| }, | |
| { | |
| "epoch": 6.12540192926045, | |
| "grad_norm": 0.4863806235895185, | |
| "learning_rate": 1.8794961462362194e-06, | |
| "loss": 0.1903, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18915069103240967, | |
| "step": 3810, | |
| "valid_targets_mean": 5742.5, | |
| "valid_targets_min": 1890 | |
| }, | |
| { | |
| "epoch": 6.133440514469453, | |
| "grad_norm": 0.4662278512075521, | |
| "learning_rate": 1.845706271033456e-06, | |
| "loss": 0.1748, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16785621643066406, | |
| "step": 3815, | |
| "valid_targets_mean": 6455.1, | |
| "valid_targets_min": 2205 | |
| }, | |
| { | |
| "epoch": 6.141479099678457, | |
| "grad_norm": 0.4749251483380231, | |
| "learning_rate": 1.81220819877316e-06, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22028598189353943, | |
| "step": 3820, | |
| "valid_targets_mean": 5887.7, | |
| "valid_targets_min": 1717 | |
| }, | |
| { | |
| "epoch": 6.14951768488746, | |
| "grad_norm": 0.5277330163447275, | |
| "learning_rate": 1.7790024678864149e-06, | |
| "loss": 0.1805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20545095205307007, | |
| "step": 3825, | |
| "valid_targets_mean": 5465.8, | |
| "valid_targets_min": 2498 | |
| }, | |
| { | |
| "epoch": 6.157556270096463, | |
| "grad_norm": 0.4888029274916281, | |
| "learning_rate": 1.746089612105355e-06, | |
| "loss": 0.1888, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17628200352191925, | |
| "step": 3830, | |
| "valid_targets_mean": 5932.9, | |
| "valid_targets_min": 2621 | |
| }, | |
| { | |
| "epoch": 6.165594855305466, | |
| "grad_norm": 0.5116352727269389, | |
| "learning_rate": 1.7134701604545934e-06, | |
| "loss": 0.1831, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2042563557624817, | |
| "step": 3835, | |
| "valid_targets_mean": 5198.4, | |
| "valid_targets_min": 856 | |
| }, | |
| { | |
| "epoch": 6.173633440514469, | |
| "grad_norm": 0.4919942862743339, | |
| "learning_rate": 1.6811446372427108e-06, | |
| "loss": 0.1965, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19126807153224945, | |
| "step": 3840, | |
| "valid_targets_mean": 5945.5, | |
| "valid_targets_min": 449 | |
| }, | |
| { | |
| "epoch": 6.181672025723473, | |
| "grad_norm": 0.4665227200564428, | |
| "learning_rate": 1.6491135620538256e-06, | |
| "loss": 0.171, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17806287109851837, | |
| "step": 3845, | |
| "valid_targets_mean": 5732.4, | |
| "valid_targets_min": 935 | |
| }, | |
| { | |
| "epoch": 6.189710610932476, | |
| "grad_norm": 0.463406210984963, | |
| "learning_rate": 1.6173774497392592e-06, | |
| "loss": 0.1822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14907455444335938, | |
| "step": 3850, | |
| "valid_targets_mean": 5815.6, | |
| "valid_targets_min": 726 | |
| }, | |
| { | |
| "epoch": 6.197749196141479, | |
| "grad_norm": 0.5817896201740729, | |
| "learning_rate": 1.5859368104092387e-06, | |
| "loss": 0.189, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20796798169612885, | |
| "step": 3855, | |
| "valid_targets_mean": 4837.5, | |
| "valid_targets_min": 841 | |
| }, | |
| { | |
| "epoch": 6.205787781350482, | |
| "grad_norm": 0.4781524045532261, | |
| "learning_rate": 1.5547921494247065e-06, | |
| "loss": 0.173, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17145711183547974, | |
| "step": 3860, | |
| "valid_targets_mean": 6007.2, | |
| "valid_targets_min": 2405 | |
| }, | |
| { | |
| "epoch": 6.213826366559486, | |
| "grad_norm": 0.49548094344040206, | |
| "learning_rate": 1.5239439673892098e-06, | |
| "loss": 0.1877, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19060102105140686, | |
| "step": 3865, | |
| "valid_targets_mean": 5291.6, | |
| "valid_targets_min": 2244 | |
| }, | |
| { | |
| "epoch": 6.221864951768489, | |
| "grad_norm": 0.49834757800080803, | |
| "learning_rate": 1.4933927601408327e-06, | |
| "loss": 0.1688, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14511215686798096, | |
| "step": 3870, | |
| "valid_targets_mean": 4564.9, | |
| "valid_targets_min": 2297 | |
| }, | |
| { | |
| "epoch": 6.229903536977492, | |
| "grad_norm": 0.510379153220315, | |
| "learning_rate": 1.4631390187442441e-06, | |
| "loss": 0.1834, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17495837807655334, | |
| "step": 3875, | |
| "valid_targets_mean": 5613.2, | |
| "valid_targets_min": 3486 | |
| }, | |
| { | |
| "epoch": 6.237942122186495, | |
| "grad_norm": 0.46568643547236416, | |
| "learning_rate": 1.433183229482793e-06, | |
| "loss": 0.1895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21530888974666595, | |
| "step": 3880, | |
| "valid_targets_mean": 6843.9, | |
| "valid_targets_min": 2342 | |
| }, | |
| { | |
| "epoch": 6.245980707395498, | |
| "grad_norm": 0.515917833968904, | |
| "learning_rate": 1.4035258738506908e-06, | |
| "loss": 0.1849, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18214061856269836, | |
| "step": 3885, | |
| "valid_targets_mean": 5141.1, | |
| "valid_targets_min": 1200 | |
| }, | |
| { | |
| "epoch": 6.254019292604502, | |
| "grad_norm": 0.5217242607576065, | |
| "learning_rate": 1.3741674285452854e-06, | |
| "loss": 0.185, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1475576013326645, | |
| "step": 3890, | |
| "valid_targets_mean": 5111.0, | |
| "valid_targets_min": 873 | |
| }, | |
| { | |
| "epoch": 6.262057877813505, | |
| "grad_norm": 0.4407473996704603, | |
| "learning_rate": 1.3451083654593932e-06, | |
| "loss": 0.1937, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21679872274398804, | |
| "step": 3895, | |
| "valid_targets_mean": 7420.8, | |
| "valid_targets_min": 3288 | |
| }, | |
| { | |
| "epoch": 6.270096463022508, | |
| "grad_norm": 0.5474233696114393, | |
| "learning_rate": 1.3163491516737104e-06, | |
| "loss": 0.1896, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1937793493270874, | |
| "step": 3900, | |
| "valid_targets_mean": 5128.5, | |
| "valid_targets_min": 1288 | |
| }, | |
| { | |
| "epoch": 6.278135048231511, | |
| "grad_norm": 0.48218883082894765, | |
| "learning_rate": 1.2878902494492973e-06, | |
| "loss": 0.1699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17568941414356232, | |
| "step": 3905, | |
| "valid_targets_mean": 5506.8, | |
| "valid_targets_min": 1393 | |
| }, | |
| { | |
| "epoch": 6.286173633440514, | |
| "grad_norm": 0.473274067929269, | |
| "learning_rate": 1.259732116220167e-06, | |
| "loss": 0.1949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18140992522239685, | |
| "step": 3910, | |
| "valid_targets_mean": 5537.2, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 6.294212218649518, | |
| "grad_norm": 0.4921331038109862, | |
| "learning_rate": 1.2318752045859262e-06, | |
| "loss": 0.1817, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2152450680732727, | |
| "step": 3915, | |
| "valid_targets_mean": 5908.2, | |
| "valid_targets_min": 1647 | |
| }, | |
| { | |
| "epoch": 6.302250803858521, | |
| "grad_norm": 0.5159644200801217, | |
| "learning_rate": 1.204319962304492e-06, | |
| "loss": 0.2004, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19172531366348267, | |
| "step": 3920, | |
| "valid_targets_mean": 5150.0, | |
| "valid_targets_min": 877 | |
| }, | |
| { | |
| "epoch": 6.310289389067524, | |
| "grad_norm": 0.48083461205602324, | |
| "learning_rate": 1.1770668322849032e-06, | |
| "loss": 0.1974, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17192471027374268, | |
| "step": 3925, | |
| "valid_targets_mean": 5867.6, | |
| "valid_targets_min": 904 | |
| }, | |
| { | |
| "epoch": 6.318327974276527, | |
| "grad_norm": 0.49748778307814, | |
| "learning_rate": 1.1501162525801956e-06, | |
| "loss": 0.1858, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1800798773765564, | |
| "step": 3930, | |
| "valid_targets_mean": 5804.5, | |
| "valid_targets_min": 1475 | |
| }, | |
| { | |
| "epoch": 6.326366559485531, | |
| "grad_norm": 0.509965747323788, | |
| "learning_rate": 1.1234686563803709e-06, | |
| "loss": 0.1759, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17962145805358887, | |
| "step": 3935, | |
| "valid_targets_mean": 5479.6, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 6.334405144694534, | |
| "grad_norm": 0.4569426127753139, | |
| "learning_rate": 1.097124472005422e-06, | |
| "loss": 0.1805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15181009471416473, | |
| "step": 3940, | |
| "valid_targets_mean": 5221.9, | |
| "valid_targets_min": 3021 | |
| }, | |
| { | |
| "epoch": 6.342443729903537, | |
| "grad_norm": 0.4827950680272922, | |
| "learning_rate": 1.0710841228984536e-06, | |
| "loss": 0.2011, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20170137286186218, | |
| "step": 3945, | |
| "valid_targets_mean": 5781.6, | |
| "valid_targets_min": 550 | |
| }, | |
| { | |
| "epoch": 6.35048231511254, | |
| "grad_norm": 0.7716956645165256, | |
| "learning_rate": 1.0453480276188777e-06, | |
| "loss": 0.175, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16286450624465942, | |
| "step": 3950, | |
| "valid_targets_mean": 4926.1, | |
| "valid_targets_min": 718 | |
| }, | |
| { | |
| "epoch": 6.358520900321543, | |
| "grad_norm": 0.5109589924706025, | |
| "learning_rate": 1.019916599835682e-06, | |
| "loss": 0.1776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1650046706199646, | |
| "step": 3955, | |
| "valid_targets_mean": 5433.2, | |
| "valid_targets_min": 867 | |
| }, | |
| { | |
| "epoch": 6.366559485530547, | |
| "grad_norm": 0.4795377899307965, | |
| "learning_rate": 9.94790248320785e-07, | |
| "loss": 0.1717, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18109866976737976, | |
| "step": 3960, | |
| "valid_targets_mean": 5475.3, | |
| "valid_targets_min": 3093 | |
| }, | |
| { | |
| "epoch": 6.37459807073955, | |
| "grad_norm": 0.5202485003895791, | |
| "learning_rate": 9.699693769424634e-07, | |
| "loss": 0.1834, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1738145351409912, | |
| "step": 3965, | |
| "valid_targets_mean": 4568.5, | |
| "valid_targets_min": 686 | |
| }, | |
| { | |
| "epoch": 6.382636655948553, | |
| "grad_norm": 0.4630700661014578, | |
| "learning_rate": 9.454543846588571e-07, | |
| "loss": 0.166, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17738892138004303, | |
| "step": 3970, | |
| "valid_targets_mean": 6398.9, | |
| "valid_targets_min": 4449 | |
| }, | |
| { | |
| "epoch": 6.390675241157556, | |
| "grad_norm": 0.5531898311258766, | |
| "learning_rate": 9.212456655115609e-07, | |
| "loss": 0.2124, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23551887273788452, | |
| "step": 3975, | |
| "valid_targets_mean": 4852.2, | |
| "valid_targets_min": 872 | |
| }, | |
| { | |
| "epoch": 6.39871382636656, | |
| "grad_norm": 0.5392493230684815, | |
| "learning_rate": 8.973436086192966e-07, | |
| "loss": 0.1873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20924073457717896, | |
| "step": 3980, | |
| "valid_targets_mean": 4547.6, | |
| "valid_targets_min": 681 | |
| }, | |
| { | |
| "epoch": 6.406752411575563, | |
| "grad_norm": 0.5246926009101333, | |
| "learning_rate": 8.737485981716465e-07, | |
| "loss": 0.205, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21452780067920685, | |
| "step": 3985, | |
| "valid_targets_mean": 5697.2, | |
| "valid_targets_min": 1800 | |
| }, | |
| { | |
| "epoch": 6.414790996784566, | |
| "grad_norm": 0.4966041907551861, | |
| "learning_rate": 8.50461013422883e-07, | |
| "loss": 0.1762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18985769152641296, | |
| "step": 3990, | |
| "valid_targets_mean": 5423.9, | |
| "valid_targets_min": 371 | |
| }, | |
| { | |
| "epoch": 6.422829581993569, | |
| "grad_norm": 0.48928365103208005, | |
| "learning_rate": 8.274812286858714e-07, | |
| "loss": 0.1683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16009929776191711, | |
| "step": 3995, | |
| "valid_targets_mean": 5772.6, | |
| "valid_targets_min": 368 | |
| }, | |
| { | |
| "epoch": 6.430868167202572, | |
| "grad_norm": 0.5205676594223365, | |
| "learning_rate": 8.048096133260586e-07, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18772917985916138, | |
| "step": 4000, | |
| "valid_targets_mean": 5536.8, | |
| "valid_targets_min": 664 | |
| }, | |
| { | |
| "epoch": 6.438906752411576, | |
| "grad_norm": 0.46495455507078626, | |
| "learning_rate": 7.824465317555318e-07, | |
| "loss": 0.1762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15344102680683136, | |
| "step": 4005, | |
| "valid_targets_mean": 6885.2, | |
| "valid_targets_min": 712 | |
| }, | |
| { | |
| "epoch": 6.446945337620579, | |
| "grad_norm": 0.48861786608204816, | |
| "learning_rate": 7.603923434271699e-07, | |
| "loss": 0.1874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2028530240058899, | |
| "step": 4010, | |
| "valid_targets_mean": 5799.2, | |
| "valid_targets_min": 2698 | |
| }, | |
| { | |
| "epoch": 6.454983922829582, | |
| "grad_norm": 0.54991147386686, | |
| "learning_rate": 7.386474028288382e-07, | |
| "loss": 0.1751, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18571394681930542, | |
| "step": 4015, | |
| "valid_targets_mean": 4742.5, | |
| "valid_targets_min": 1168 | |
| }, | |
| { | |
| "epoch": 6.463022508038585, | |
| "grad_norm": 0.5715790732228933, | |
| "learning_rate": 7.172120594777299e-07, | |
| "loss": 0.1844, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21050646901130676, | |
| "step": 4020, | |
| "valid_targets_mean": 5150.3, | |
| "valid_targets_min": 437 | |
| }, | |
| { | |
| "epoch": 6.471061093247588, | |
| "grad_norm": 0.4730497066181822, | |
| "learning_rate": 6.960866579147185e-07, | |
| "loss": 0.1781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17261850833892822, | |
| "step": 4025, | |
| "valid_targets_mean": 5338.9, | |
| "valid_targets_min": 363 | |
| }, | |
| { | |
| "epoch": 6.479099678456592, | |
| "grad_norm": 0.527431227427804, | |
| "learning_rate": 6.752715376988339e-07, | |
| "loss": 0.1728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18096715211868286, | |
| "step": 4030, | |
| "valid_targets_mean": 4199.8, | |
| "valid_targets_min": 1755 | |
| }, | |
| { | |
| "epoch": 6.487138263665595, | |
| "grad_norm": 0.46366935985990176, | |
| "learning_rate": 6.547670334017953e-07, | |
| "loss": 0.1908, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22837695479393005, | |
| "step": 4035, | |
| "valid_targets_mean": 6578.4, | |
| "valid_targets_min": 1681 | |
| }, | |
| { | |
| "epoch": 6.495176848874598, | |
| "grad_norm": 0.46722736029353956, | |
| "learning_rate": 6.345734746026466e-07, | |
| "loss": 0.1611, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1680130660533905, | |
| "step": 4040, | |
| "valid_targets_mean": 6325.9, | |
| "valid_targets_min": 2561 | |
| }, | |
| { | |
| "epoch": 6.503215434083601, | |
| "grad_norm": 0.5254245828058934, | |
| "learning_rate": 6.146911858824434e-07, | |
| "loss": 0.1867, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16261672973632812, | |
| "step": 4045, | |
| "valid_targets_mean": 4747.1, | |
| "valid_targets_min": 1026 | |
| }, | |
| { | |
| "epoch": 6.511254019292604, | |
| "grad_norm": 0.45813060973081743, | |
| "learning_rate": 5.951204868190452e-07, | |
| "loss": 0.1964, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16923211514949799, | |
| "step": 4050, | |
| "valid_targets_mean": 5838.7, | |
| "valid_targets_min": 411 | |
| }, | |
| { | |
| "epoch": 6.519292604501608, | |
| "grad_norm": 0.473936039803348, | |
| "learning_rate": 5.758616919819826e-07, | |
| "loss": 0.1753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19065213203430176, | |
| "step": 4055, | |
| "valid_targets_mean": 5748.9, | |
| "valid_targets_min": 2534 | |
| }, | |
| { | |
| "epoch": 6.527331189710611, | |
| "grad_norm": 0.47792389668692997, | |
| "learning_rate": 5.569151109273851e-07, | |
| "loss": 0.1914, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19362300634384155, | |
| "step": 4060, | |
| "valid_targets_mean": 5755.1, | |
| "valid_targets_min": 2366 | |
| }, | |
| { | |
| "epoch": 6.535369774919614, | |
| "grad_norm": 0.5855563122421675, | |
| "learning_rate": 5.382810481930256e-07, | |
| "loss": 0.1806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1771083027124405, | |
| "step": 4065, | |
| "valid_targets_mean": 4887.2, | |
| "valid_targets_min": 548 | |
| }, | |
| { | |
| "epoch": 6.543408360128617, | |
| "grad_norm": 0.4745893409118789, | |
| "learning_rate": 5.199598032934106e-07, | |
| "loss": 0.1778, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1631501466035843, | |
| "step": 4070, | |
| "valid_targets_mean": 5731.9, | |
| "valid_targets_min": 3409 | |
| }, | |
| { | |
| "epoch": 6.551446945337621, | |
| "grad_norm": 0.5161967610866468, | |
| "learning_rate": 5.019516707149685e-07, | |
| "loss": 0.1754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18905222415924072, | |
| "step": 4075, | |
| "valid_targets_mean": 4795.0, | |
| "valid_targets_min": 2189 | |
| }, | |
| { | |
| "epoch": 6.559485530546624, | |
| "grad_norm": 0.802682405482379, | |
| "learning_rate": 4.84256939911325e-07, | |
| "loss": 0.1805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21466057002544403, | |
| "step": 4080, | |
| "valid_targets_mean": 4320.0, | |
| "valid_targets_min": 548 | |
| }, | |
| { | |
| "epoch": 6.567524115755627, | |
| "grad_norm": 0.5031206858818378, | |
| "learning_rate": 4.6687589529863517e-07, | |
| "loss": 0.1968, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19306516647338867, | |
| "step": 4085, | |
| "valid_targets_mean": 5357.6, | |
| "valid_targets_min": 818 | |
| }, | |
| { | |
| "epoch": 6.57556270096463, | |
| "grad_norm": 0.5178506022576367, | |
| "learning_rate": 4.498088162510317e-07, | |
| "loss": 0.1971, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23467575013637543, | |
| "step": 4090, | |
| "valid_targets_mean": 5338.9, | |
| "valid_targets_min": 594 | |
| }, | |
| { | |
| "epoch": 6.583601286173634, | |
| "grad_norm": 0.5076224755919664, | |
| "learning_rate": 4.330559770961151e-07, | |
| "loss": 0.1911, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1578923761844635, | |
| "step": 4095, | |
| "valid_targets_mean": 4803.6, | |
| "valid_targets_min": 1802 | |
| }, | |
| { | |
| "epoch": 6.591639871382637, | |
| "grad_norm": 0.4592277887151131, | |
| "learning_rate": 4.166176471105554e-07, | |
| "loss": 0.1896, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15273262560367584, | |
| "step": 4100, | |
| "valid_targets_mean": 5040.8, | |
| "valid_targets_min": 1001 | |
| }, | |
| { | |
| "epoch": 6.59967845659164, | |
| "grad_norm": 0.5087668578828699, | |
| "learning_rate": 4.004940905157617e-07, | |
| "loss": 0.1744, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19955775141716003, | |
| "step": 4105, | |
| "valid_targets_mean": 5567.3, | |
| "valid_targets_min": 2229 | |
| }, | |
| { | |
| "epoch": 6.607717041800643, | |
| "grad_norm": 0.5439868707343057, | |
| "learning_rate": 3.8468556647363266e-07, | |
| "loss": 0.1664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18226560950279236, | |
| "step": 4110, | |
| "valid_targets_mean": 5092.6, | |
| "valid_targets_min": 2403 | |
| }, | |
| { | |
| "epoch": 6.615755627009646, | |
| "grad_norm": 0.5026023766001185, | |
| "learning_rate": 3.691923290823951e-07, | |
| "loss": 0.1733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18940620124340057, | |
| "step": 4115, | |
| "valid_targets_mean": 5858.8, | |
| "valid_targets_min": 2205 | |
| }, | |
| { | |
| "epoch": 6.62379421221865, | |
| "grad_norm": 0.46383552678879314, | |
| "learning_rate": 3.540146273725142e-07, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16480082273483276, | |
| "step": 4120, | |
| "valid_targets_mean": 5884.1, | |
| "valid_targets_min": 2157 | |
| }, | |
| { | |
| "epoch": 6.631832797427653, | |
| "grad_norm": 0.5348850869149254, | |
| "learning_rate": 3.391527053027033e-07, | |
| "loss": 0.1933, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24536944925785065, | |
| "step": 4125, | |
| "valid_targets_mean": 5093.4, | |
| "valid_targets_min": 2222 | |
| }, | |
| { | |
| "epoch": 6.639871382636656, | |
| "grad_norm": 0.4756091936153012, | |
| "learning_rate": 3.2460680175598e-07, | |
| "loss": 0.1924, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2090965211391449, | |
| "step": 4130, | |
| "valid_targets_mean": 6622.1, | |
| "valid_targets_min": 4223 | |
| }, | |
| { | |
| "epoch": 6.647909967845659, | |
| "grad_norm": 0.4963669522308755, | |
| "learning_rate": 3.103771505358499e-07, | |
| "loss": 0.1752, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1624598205089569, | |
| "step": 4135, | |
| "valid_targets_mean": 5586.9, | |
| "valid_targets_min": 2579 | |
| }, | |
| { | |
| "epoch": 6.655948553054662, | |
| "grad_norm": 0.4678741621601647, | |
| "learning_rate": 2.9646398036253575e-07, | |
| "loss": 0.1724, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14038006961345673, | |
| "step": 4140, | |
| "valid_targets_mean": 4843.9, | |
| "valid_targets_min": 1675 | |
| }, | |
| { | |
| "epoch": 6.663987138263666, | |
| "grad_norm": 0.47210780129474755, | |
| "learning_rate": 2.828675148693005e-07, | |
| "loss": 0.1675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16181638836860657, | |
| "step": 4145, | |
| "valid_targets_mean": 5237.2, | |
| "valid_targets_min": 2943 | |
| }, | |
| { | |
| "epoch": 6.672025723472669, | |
| "grad_norm": 0.5158718099795397, | |
| "learning_rate": 2.695879725988637e-07, | |
| "loss": 0.1799, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1680976003408432, | |
| "step": 4150, | |
| "valid_targets_mean": 4968.0, | |
| "valid_targets_min": 486 | |
| }, | |
| { | |
| "epoch": 6.680064308681672, | |
| "grad_norm": 0.49688533024331555, | |
| "learning_rate": 2.566255669998707e-07, | |
| "loss": 0.181, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1574384570121765, | |
| "step": 4155, | |
| "valid_targets_mean": 4732.6, | |
| "valid_targets_min": 1495 | |
| }, | |
| { | |
| "epoch": 6.688102893890675, | |
| "grad_norm": 0.4971282543462799, | |
| "learning_rate": 2.4398050642348235e-07, | |
| "loss": 0.1722, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17601963877677917, | |
| "step": 4160, | |
| "valid_targets_mean": 5555.8, | |
| "valid_targets_min": 2327 | |
| }, | |
| { | |
| "epoch": 6.696141479099678, | |
| "grad_norm": 0.4679076982911688, | |
| "learning_rate": 2.3165299412001075e-07, | |
| "loss": 0.1664, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14313621819019318, | |
| "step": 4165, | |
| "valid_targets_mean": 5264.6, | |
| "valid_targets_min": 1180 | |
| }, | |
| { | |
| "epoch": 6.704180064308682, | |
| "grad_norm": 0.44245119575522096, | |
| "learning_rate": 2.1964322823565532e-07, | |
| "loss": 0.1563, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15354637801647186, | |
| "step": 4170, | |
| "valid_targets_mean": 5514.4, | |
| "valid_targets_min": 949 | |
| }, | |
| { | |
| "epoch": 6.712218649517685, | |
| "grad_norm": 0.4745054491408359, | |
| "learning_rate": 2.0795140180932982e-07, | |
| "loss": 0.1814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14562906324863434, | |
| "step": 4175, | |
| "valid_targets_mean": 5620.2, | |
| "valid_targets_min": 2377 | |
| }, | |
| { | |
| "epoch": 6.720257234726688, | |
| "grad_norm": 0.47955050039844965, | |
| "learning_rate": 1.9657770276953592e-07, | |
| "loss": 0.2113, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24234527349472046, | |
| "step": 4180, | |
| "valid_targets_mean": 6996.8, | |
| "valid_targets_min": 1504 | |
| }, | |
| { | |
| "epoch": 6.728295819935691, | |
| "grad_norm": 0.5392328506399422, | |
| "learning_rate": 1.8552231393137e-07, | |
| "loss": 0.1834, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.14472296833992004, | |
| "step": 4185, | |
| "valid_targets_mean": 4974.5, | |
| "valid_targets_min": 630 | |
| }, | |
| { | |
| "epoch": 6.736334405144694, | |
| "grad_norm": 0.524110682981634, | |
| "learning_rate": 1.7478541299356332e-07, | |
| "loss": 0.2014, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19961285591125488, | |
| "step": 4190, | |
| "valid_targets_mean": 5437.1, | |
| "valid_targets_min": 1492 | |
| }, | |
| { | |
| "epoch": 6.744372990353698, | |
| "grad_norm": 0.44822215241237556, | |
| "learning_rate": 1.6436717253563327e-07, | |
| "loss": 0.1894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16462916135787964, | |
| "step": 4195, | |
| "valid_targets_mean": 6963.6, | |
| "valid_targets_min": 1612 | |
| }, | |
| { | |
| "epoch": 6.752411575562701, | |
| "grad_norm": 0.5237024825611266, | |
| "learning_rate": 1.5426776001511212e-07, | |
| "loss": 0.1819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20051389932632446, | |
| "step": 4200, | |
| "valid_targets_mean": 5342.4, | |
| "valid_targets_min": 2431 | |
| }, | |
| { | |
| "epoch": 6.760450160771704, | |
| "grad_norm": 0.4934098015061834, | |
| "learning_rate": 1.444873377648537e-07, | |
| "loss": 0.1918, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18957063555717468, | |
| "step": 4205, | |
| "valid_targets_mean": 6711.7, | |
| "valid_targets_min": 3013 | |
| }, | |
| { | |
| "epoch": 6.768488745980707, | |
| "grad_norm": 0.5187855790783407, | |
| "learning_rate": 1.3502606299042432e-07, | |
| "loss": 0.1686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16103804111480713, | |
| "step": 4210, | |
| "valid_targets_mean": 5401.3, | |
| "valid_targets_min": 2834 | |
| }, | |
| { | |
| "epoch": 6.776527331189711, | |
| "grad_norm": 0.47209865860791694, | |
| "learning_rate": 1.2588408776756933e-07, | |
| "loss": 0.1897, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17956702411174774, | |
| "step": 4215, | |
| "valid_targets_mean": 5800.7, | |
| "valid_targets_min": 3545 | |
| }, | |
| { | |
| "epoch": 6.784565916398714, | |
| "grad_norm": 0.4701575804504904, | |
| "learning_rate": 1.1706155903977945e-07, | |
| "loss": 0.1904, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20110127329826355, | |
| "step": 4220, | |
| "valid_targets_mean": 5929.1, | |
| "valid_targets_min": 2331 | |
| }, | |
| { | |
| "epoch": 6.792604501607717, | |
| "grad_norm": 0.5296390861079925, | |
| "learning_rate": 1.0855861861592376e-07, | |
| "loss": 0.197, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19427093863487244, | |
| "step": 4225, | |
| "valid_targets_mean": 5843.8, | |
| "valid_targets_min": 607 | |
| }, | |
| { | |
| "epoch": 6.80064308681672, | |
| "grad_norm": 0.5310412544676056, | |
| "learning_rate": 1.0037540316796713e-07, | |
| "loss": 0.1741, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1741572916507721, | |
| "step": 4230, | |
| "valid_targets_mean": 4511.2, | |
| "valid_targets_min": 808 | |
| }, | |
| { | |
| "epoch": 6.808681672025724, | |
| "grad_norm": 0.457276791840171, | |
| "learning_rate": 9.251204422878302e-08, | |
| "loss": 0.1804, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1941903531551361, | |
| "step": 4235, | |
| "valid_targets_mean": 5819.2, | |
| "valid_targets_min": 398 | |
| }, | |
| { | |
| "epoch": 6.816720257234727, | |
| "grad_norm": 0.4812323518875819, | |
| "learning_rate": 8.496866819002636e-08, | |
| "loss": 0.1795, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1971593052148819, | |
| "step": 4240, | |
| "valid_targets_mean": 5406.8, | |
| "valid_targets_min": 747 | |
| }, | |
| { | |
| "epoch": 6.82475884244373, | |
| "grad_norm": 0.513399635333341, | |
| "learning_rate": 7.774539630010846e-08, | |
| "loss": 0.1982, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2618066668510437, | |
| "step": 4245, | |
| "valid_targets_mean": 6306.6, | |
| "valid_targets_min": 2073 | |
| }, | |
| { | |
| "epoch": 6.832797427652733, | |
| "grad_norm": 0.4610409646272019, | |
| "learning_rate": 7.084234466225637e-08, | |
| "loss": 0.1859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16325844824314117, | |
| "step": 4250, | |
| "valid_targets_mean": 5903.2, | |
| "valid_targets_min": 2962 | |
| }, | |
| { | |
| "epoch": 6.840836012861736, | |
| "grad_norm": 0.48257013983282615, | |
| "learning_rate": 6.425962423262988e-08, | |
| "loss": 0.1806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2067861407995224, | |
| "step": 4255, | |
| "valid_targets_mean": 5777.6, | |
| "valid_targets_min": 1767 | |
| }, | |
| { | |
| "epoch": 6.84887459807074, | |
| "grad_norm": 0.4194055735557952, | |
| "learning_rate": 5.799734081855413e-08, | |
| "loss": 0.1918, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16234293580055237, | |
| "step": 4260, | |
| "valid_targets_mean": 6958.3, | |
| "valid_targets_min": 3030 | |
| }, | |
| { | |
| "epoch": 6.856913183279743, | |
| "grad_norm": 0.5762898140474347, | |
| "learning_rate": 5.205559507680313e-08, | |
| "loss": 0.1806, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16969682276248932, | |
| "step": 4265, | |
| "valid_targets_mean": 6194.4, | |
| "valid_targets_min": 3190 | |
| }, | |
| { | |
| "epoch": 6.864951768488746, | |
| "grad_norm": 0.4726219925484245, | |
| "learning_rate": 4.643448251199667e-08, | |
| "loss": 0.1848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.19437849521636963, | |
| "step": 4270, | |
| "valid_targets_mean": 6152.1, | |
| "valid_targets_min": 2800 | |
| }, | |
| { | |
| "epoch": 6.872990353697749, | |
| "grad_norm": 0.626802854083221, | |
| "learning_rate": 4.113409347505926e-08, | |
| "loss": 0.1895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21149322390556335, | |
| "step": 4275, | |
| "valid_targets_mean": 4295.7, | |
| "valid_targets_min": 305 | |
| }, | |
| { | |
| "epoch": 6.881028938906752, | |
| "grad_norm": 0.5992693492880397, | |
| "learning_rate": 3.615451316176355e-08, | |
| "loss": 0.1984, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22360846400260925, | |
| "step": 4280, | |
| "valid_targets_mean": 4792.6, | |
| "valid_targets_min": 1949 | |
| }, | |
| { | |
| "epoch": 6.889067524115756, | |
| "grad_norm": 0.5089549192427425, | |
| "learning_rate": 3.149582161136699e-08, | |
| "loss": 0.1897, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22241082787513733, | |
| "step": 4285, | |
| "valid_targets_mean": 5543.4, | |
| "valid_targets_min": 3958 | |
| }, | |
| { | |
| "epoch": 6.897106109324759, | |
| "grad_norm": 0.49095582885510597, | |
| "learning_rate": 2.7158093705323962e-08, | |
| "loss": 0.1915, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1889224797487259, | |
| "step": 4290, | |
| "valid_targets_mean": 5026.6, | |
| "valid_targets_min": 1093 | |
| }, | |
| { | |
| "epoch": 6.905144694533762, | |
| "grad_norm": 0.5161073267175579, | |
| "learning_rate": 2.3141399166080046e-08, | |
| "loss": 0.1707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1808251440525055, | |
| "step": 4295, | |
| "valid_targets_mean": 5040.4, | |
| "valid_targets_min": 705 | |
| }, | |
| { | |
| "epoch": 6.913183279742765, | |
| "grad_norm": 0.49706857696186435, | |
| "learning_rate": 1.9445802555952964e-08, | |
| "loss": 0.174, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.15419217944145203, | |
| "step": 4300, | |
| "valid_targets_mean": 5024.6, | |
| "valid_targets_min": 2451 | |
| }, | |
| { | |
| "epoch": 6.921221864951768, | |
| "grad_norm": 0.5005639935906776, | |
| "learning_rate": 1.607136327609782e-08, | |
| "loss": 0.1928, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.180386483669281, | |
| "step": 4305, | |
| "valid_targets_mean": 5288.4, | |
| "valid_targets_min": 2894 | |
| }, | |
| { | |
| "epoch": 6.929260450160772, | |
| "grad_norm": 0.5507681061162143, | |
| "learning_rate": 1.3018135565541212e-08, | |
| "loss": 0.1849, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.17055736482143402, | |
| "step": 4310, | |
| "valid_targets_mean": 4294.1, | |
| "valid_targets_min": 917 | |
| }, | |
| { | |
| "epoch": 6.937299035369775, | |
| "grad_norm": 0.47076314804544284, | |
| "learning_rate": 1.0286168500324157e-08, | |
| "loss": 0.1708, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.18481239676475525, | |
| "step": 4315, | |
| "valid_targets_mean": 6214.2, | |
| "valid_targets_min": 1665 | |
| }, | |
| { | |
| "epoch": 6.945337620578778, | |
| "grad_norm": 0.5290017885779597, | |
| "learning_rate": 7.875505992707144e-09, | |
| "loss": 0.1829, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1767764687538147, | |
| "step": 4320, | |
| "valid_targets_mean": 4849.6, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 6.953376205787781, | |
| "grad_norm": 0.5194272213690765, | |
| "learning_rate": 5.786186790459614e-09, | |
| "loss": 0.1991, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1747482419013977, | |
| "step": 4325, | |
| "valid_targets_mean": 4611.9, | |
| "valid_targets_min": 1654 | |
| }, | |
| { | |
| "epoch": 6.961414790996785, | |
| "grad_norm": 0.4677227966120155, | |
| "learning_rate": 4.018244476240441e-09, | |
| "loss": 0.1728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16645994782447815, | |
| "step": 4330, | |
| "valid_targets_mean": 5858.0, | |
| "valid_targets_min": 2611 | |
| }, | |
| { | |
| "epoch": 6.969453376205788, | |
| "grad_norm": 0.4969571582473657, | |
| "learning_rate": 2.5717074670628116e-09, | |
| "loss": 0.1881, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16073165833950043, | |
| "step": 4335, | |
| "valid_targets_mean": 5167.6, | |
| "valid_targets_min": 1905 | |
| }, | |
| { | |
| "epoch": 6.977491961414791, | |
| "grad_norm": 0.4569589126728696, | |
| "learning_rate": 1.4465990138323727e-09, | |
| "loss": 0.2074, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24416705965995789, | |
| "step": 4340, | |
| "valid_targets_mean": 6487.6, | |
| "valid_targets_min": 1716 | |
| }, | |
| { | |
| "epoch": 6.985530546623794, | |
| "grad_norm": 0.5594617903547844, | |
| "learning_rate": 6.429372009719714e-10, | |
| "loss": 0.1943, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25965332984924316, | |
| "step": 4345, | |
| "valid_targets_mean": 5103.3, | |
| "valid_targets_min": 424 | |
| }, | |
| { | |
| "epoch": 6.993569131832797, | |
| "grad_norm": 0.4299759834166724, | |
| "learning_rate": 1.6073494613522146e-10, | |
| "loss": 0.1791, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.16728906333446503, | |
| "step": 4350, | |
| "valid_targets_mean": 6768.5, | |
| "valid_targets_min": 3807 | |
| }, | |
| { | |
| "epoch": 7.0, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.1630803495645523, | |
| "step": 4354, | |
| "total_flos": 1827324081995776.0, | |
| "train_loss": 0.23159787104006038, | |
| "train_runtime": 29184.2585, | |
| "train_samples_per_second": 2.386, | |
| "train_steps_per_second": 0.149, | |
| "valid_targets_mean": 6348.0, | |
| "valid_targets_min": 2723 | |
| } | |
| ], | |
| "logging_steps": 5, | |
| "max_steps": 4354, | |
| "num_input_tokens_seen": 0, | |
| "num_train_epochs": 7, | |
| "save_steps": 1500, | |
| "stateful_callbacks": { | |
| "TrainerControl": { | |
| "args": { | |
| "should_epoch_stop": false, | |
| "should_evaluate": false, | |
| "should_log": false, | |
| "should_save": true, | |
| "should_training_stop": true | |
| }, | |
| "attributes": {} | |
| } | |
| }, | |
| "total_flos": 1827324081995776.0, | |
| "train_batch_size": 1, | |
| "trial_name": null, | |
| "trial_params": null | |
| } | |