Text Generation
Transformers
Safetensors
qwen3
llama-factory
full
Generated from Trainer
conversational
text-generation-inference
Instructions to use DCAgent/a1-manybugs with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DCAgent/a1-manybugs with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DCAgent/a1-manybugs")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DCAgent/a1-manybugs")
model = AutoModelForCausalLM.from_pretrained("DCAgent/a1-manybugs")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use DCAgent/a1-manybugs with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "DCAgent/a1-manybugs"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-manybugs",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/DCAgent/a1-manybugs
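The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the server was started as shown and is listening on `localhost:8000`; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM or any other library.

```python
import json
import urllib.request

MODEL_ID = "DCAgent/a1-manybugs"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (without sending) a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request to a running server and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style chat responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(chat("What is the capital of France?"))
```

Because the endpoint path and payload are the same OpenAI-compatible shape, passing `base_url="http://localhost:30000"` should work against a server on port 30000 as well.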
- SGLang
How to use DCAgent/a1-manybugs with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "DCAgent/a1-manybugs" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-manybugs",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "DCAgent/a1-manybugs" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DCAgent/a1-manybugs",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use DCAgent/a1-manybugs with Docker Model Runner:
docker model run hf.co/DCAgent/a1-manybugs
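The trainer state that follows logs `epoch`, `step`, and `learning_rate` every 5 steps. The sketch below checks a reading of those numbers: 626 steps per epoch (4382 steps over 7.0 epochs), a linear warmup over 439 steps to a peak of 4e-05, then a cosine decay over the remaining steps. These constants and the schedule shape are inferred from the logged values, not stated anywhere in the file, so treat them as assumptions that happen to reproduce the entries sampled here.

```python
import math

PEAK_LR = 4e-5          # value logged at step 440
WARMUP_STEPS = 439      # assumption: inferred from the per-step LR increment
TOTAL_STEPS = 4382      # "global_step" in the trainer state
STEPS_PER_EPOCH = 626   # assumption: 4382 steps / 7.0 epochs


def logged_lr(step):
    """LR expected in the log at `step`, assuming it reflects `step - 1` completed updates."""
    done = step - 1
    if done < WARMUP_STEPS:
        # Linear warmup from 0 to PEAK_LR.
        return PEAK_LR * done / WARMUP_STEPS
    # Cosine decay from PEAK_LR towards 0 over the remaining steps.
    progress = (done - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# (step, epoch, learning_rate) copied from log_history entries.
samples = [
    (5, 0.007987220447284345, 3.644646924829157e-07),
    (220, 0.3514376996805112, 1.9954441913439638e-05),
    (440, 0.7028753993610224, 4e-05),
    (445, 0.7108626198083067, 3.999984129682125e-05),
    (470, 0.7507987220447284, 3.9994286950020986e-05),
]

for step, epoch, lr in samples:
    assert math.isclose(step / STEPS_PER_EPOCH, epoch, rel_tol=1e-9)
    assert math.isclose(logged_lr(step), lr, rel_tol=1e-6)
```

If the assumed schedule holds, the `grad_norm` of about 3.49 logged at step 440 coincides with the point where the learning rate reaches its peak.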
| { | |
| "best_global_step": null, | |
| "best_metric": null, | |
| "best_model_checkpoint": null, | |
| "epoch": 7.0, | |
| "eval_steps": 500, | |
| "global_step": 4382, | |
| "is_hyper_param_search": false, | |
| "is_local_process_zero": true, | |
| "is_world_process_zero": true, | |
| "log_history": [ | |
| { | |
| "epoch": 0.007987220447284345, | |
| "grad_norm": 22.29097396850603, | |
| "learning_rate": 3.644646924829157e-07, | |
| "loss": 0.8976, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.8962365984916687, | |
| "step": 5, | |
| "valid_targets_mean": 5544.1, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 0.01597444089456869, | |
| "grad_norm": 17.447315205009904, | |
| "learning_rate": 8.200455580865605e-07, | |
| "loss": 0.8775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.906883955001831, | |
| "step": 10, | |
| "valid_targets_mean": 6865.4, | |
| "valid_targets_min": 2049 | |
| }, | |
| { | |
| "epoch": 0.023961661341853034, | |
| "grad_norm": 18.40336164233113, | |
| "learning_rate": 1.2756264236902052e-06, | |
| "loss": 0.8746, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.8615728616714478, | |
| "step": 15, | |
| "valid_targets_mean": 5900.9, | |
| "valid_targets_min": 2442 | |
| }, | |
| { | |
| "epoch": 0.03194888178913738, | |
| "grad_norm": 13.408256165170581, | |
| "learning_rate": 1.7312072892938498e-06, | |
| "loss": 0.8415, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7963252067565918, | |
| "step": 20, | |
| "valid_targets_mean": 5199.8, | |
| "valid_targets_min": 2192 | |
| }, | |
| { | |
| "epoch": 0.039936102236421724, | |
| "grad_norm": 7.13149306522106, | |
| "learning_rate": 2.1867881548974945e-06, | |
| "loss": 0.7727, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7700879573822021, | |
| "step": 25, | |
| "valid_targets_mean": 6628.6, | |
| "valid_targets_min": 2982 | |
| }, | |
| { | |
| "epoch": 0.04792332268370607, | |
| "grad_norm": 4.456123394412002, | |
| "learning_rate": 2.642369020501139e-06, | |
| "loss": 0.7419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7289493680000305, | |
| "step": 30, | |
| "valid_targets_mean": 5167.9, | |
| "valid_targets_min": 1882 | |
| }, | |
| { | |
| "epoch": 0.05591054313099041, | |
| "grad_norm": 2.2132255037980966, | |
| "learning_rate": 3.0979498861047843e-06, | |
| "loss": 0.6814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.7029756307601929, | |
| "step": 35, | |
| "valid_targets_mean": 6758.6, | |
| "valid_targets_min": 2033 | |
| }, | |
| { | |
| "epoch": 0.06389776357827476, | |
| "grad_norm": 1.6914619271640663, | |
| "learning_rate": 3.5535307517084285e-06, | |
| "loss": 0.6716, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6477246880531311, | |
| "step": 40, | |
| "valid_targets_mean": 5334.2, | |
| "valid_targets_min": 2068 | |
| }, | |
| { | |
| "epoch": 0.07188498402555911, | |
| "grad_norm": 1.2938225524942115, | |
| "learning_rate": 4.009111617312073e-06, | |
| "loss": 0.6245, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6096252202987671, | |
| "step": 45, | |
| "valid_targets_mean": 6425.3, | |
| "valid_targets_min": 3166 | |
| }, | |
| { | |
| "epoch": 0.07987220447284345, | |
| "grad_norm": 1.1251836905658048, | |
| "learning_rate": 4.464692482915718e-06, | |
| "loss": 0.6949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.678775429725647, | |
| "step": 50, | |
| "valid_targets_mean": 6675.1, | |
| "valid_targets_min": 2692 | |
| }, | |
| { | |
| "epoch": 0.0878594249201278, | |
| "grad_norm": 0.8759483187690443, | |
| "learning_rate": 4.920273348519363e-06, | |
| "loss": 0.6011, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5774353742599487, | |
| "step": 55, | |
| "valid_targets_mean": 6467.1, | |
| "valid_targets_min": 3484 | |
| }, | |
| { | |
| "epoch": 0.09584664536741214, | |
| "grad_norm": 0.8391722198296493, | |
| "learning_rate": 5.375854214123008e-06, | |
| "loss": 0.6147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.6102516055107117, | |
| "step": 60, | |
| "valid_targets_mean": 6713.9, | |
| "valid_targets_min": 2838 | |
| }, | |
| { | |
| "epoch": 0.10383386581469649, | |
| "grad_norm": 0.7933498808821272, | |
| "learning_rate": 5.831435079726651e-06, | |
| "loss": 0.6085, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5953581929206848, | |
| "step": 65, | |
| "valid_targets_mean": 5745.7, | |
| "valid_targets_min": 2472 | |
| }, | |
| { | |
| "epoch": 0.11182108626198083, | |
| "grad_norm": 0.7011614956821861, | |
| "learning_rate": 6.287015945330297e-06, | |
| "loss": 0.5805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5873997807502747, | |
| "step": 70, | |
| "valid_targets_mean": 6020.4, | |
| "valid_targets_min": 3162 | |
| }, | |
| { | |
| "epoch": 0.11980830670926518, | |
| "grad_norm": 0.7078036797086826, | |
| "learning_rate": 6.742596810933942e-06, | |
| "loss": 0.5448, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5022627115249634, | |
| "step": 75, | |
| "valid_targets_mean": 5312.5, | |
| "valid_targets_min": 710 | |
| }, | |
| { | |
| "epoch": 0.12779552715654952, | |
| "grad_norm": 0.6310376996450648, | |
| "learning_rate": 7.1981776765375854e-06, | |
| "loss": 0.5373, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5161950588226318, | |
| "step": 80, | |
| "valid_targets_mean": 5789.0, | |
| "valid_targets_min": 2637 | |
| }, | |
| { | |
| "epoch": 0.13578274760383385, | |
| "grad_norm": 0.617185355615476, | |
| "learning_rate": 7.65375854214123e-06, | |
| "loss": 0.5274, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.49450981616973877, | |
| "step": 85, | |
| "valid_targets_mean": 5372.6, | |
| "valid_targets_min": 2430 | |
| }, | |
| { | |
| "epoch": 0.14376996805111822, | |
| "grad_norm": 0.5474710996437506, | |
| "learning_rate": 8.109339407744875e-06, | |
| "loss": 0.5336, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5298895835876465, | |
| "step": 90, | |
| "valid_targets_mean": 6071.4, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 0.15175718849840256, | |
| "grad_norm": 0.5059145649891029, | |
| "learning_rate": 8.564920273348521e-06, | |
| "loss": 0.5371, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5200586915016174, | |
| "step": 95, | |
| "valid_targets_mean": 6590.7, | |
| "valid_targets_min": 815 | |
| }, | |
| { | |
| "epoch": 0.1597444089456869, | |
| "grad_norm": 0.5555867702885045, | |
| "learning_rate": 9.020501138952164e-06, | |
| "loss": 0.4969, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.5005719661712646, | |
| "step": 100, | |
| "valid_targets_mean": 5887.2, | |
| "valid_targets_min": 778 | |
| }, | |
| { | |
| "epoch": 0.16773162939297126, | |
| "grad_norm": 0.5518211432988361, | |
| "learning_rate": 9.47608200455581e-06, | |
| "loss": 0.4856, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.44539234042167664, | |
| "step": 105, | |
| "valid_targets_mean": 5481.1, | |
| "valid_targets_min": 2512 | |
| }, | |
| { | |
| "epoch": 0.1757188498402556, | |
| "grad_norm": 0.5004735962021326, | |
| "learning_rate": 9.931662870159453e-06, | |
| "loss": 0.4646, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.42964237928390503, | |
| "step": 110, | |
| "valid_targets_mean": 5738.3, | |
| "valid_targets_min": 3215 | |
| }, | |
| { | |
| "epoch": 0.18370607028753994, | |
| "grad_norm": 0.5344452976401932, | |
| "learning_rate": 1.03872437357631e-05, | |
| "loss": 0.4804, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4274917244911194, | |
| "step": 115, | |
| "valid_targets_mean": 5590.9, | |
| "valid_targets_min": 3197 | |
| }, | |
| { | |
| "epoch": 0.19169329073482427, | |
| "grad_norm": 0.5167028899228091, | |
| "learning_rate": 1.0842824601366744e-05, | |
| "loss": 0.4735, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4663624167442322, | |
| "step": 120, | |
| "valid_targets_mean": 6678.6, | |
| "valid_targets_min": 3448 | |
| }, | |
| { | |
| "epoch": 0.19968051118210864, | |
| "grad_norm": 0.6829126849974234, | |
| "learning_rate": 1.1298405466970387e-05, | |
| "loss": 0.4417, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4442654550075531, | |
| "step": 125, | |
| "valid_targets_mean": 4994.4, | |
| "valid_targets_min": 2545 | |
| }, | |
| { | |
| "epoch": 0.20766773162939298, | |
| "grad_norm": 0.5231813461442346, | |
| "learning_rate": 1.1753986332574032e-05, | |
| "loss": 0.4661, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.484214186668396, | |
| "step": 130, | |
| "valid_targets_mean": 6270.8, | |
| "valid_targets_min": 1923 | |
| }, | |
| { | |
| "epoch": 0.21565495207667731, | |
| "grad_norm": 0.5532665550579746, | |
| "learning_rate": 1.2209567198177677e-05, | |
| "loss": 0.4672, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4827231764793396, | |
| "step": 135, | |
| "valid_targets_mean": 5484.6, | |
| "valid_targets_min": 654 | |
| }, | |
| { | |
| "epoch": 0.22364217252396165, | |
| "grad_norm": 0.54115252706349, | |
| "learning_rate": 1.2665148063781323e-05, | |
| "loss": 0.4433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4202059805393219, | |
| "step": 140, | |
| "valid_targets_mean": 5385.4, | |
| "valid_targets_min": 2523 | |
| }, | |
| { | |
| "epoch": 0.23162939297124602, | |
| "grad_norm": 0.4675239043218751, | |
| "learning_rate": 1.3120728929384968e-05, | |
| "loss": 0.4196, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.42856523394584656, | |
| "step": 145, | |
| "valid_targets_mean": 6593.8, | |
| "valid_targets_min": 3865 | |
| }, | |
| { | |
| "epoch": 0.23961661341853036, | |
| "grad_norm": 0.5014026032614606, | |
| "learning_rate": 1.357630979498861e-05, | |
| "loss": 0.4375, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4211668372154236, | |
| "step": 150, | |
| "valid_targets_mean": 6255.1, | |
| "valid_targets_min": 2793 | |
| }, | |
| { | |
| "epoch": 0.2476038338658147, | |
| "grad_norm": 0.5661528551947529, | |
| "learning_rate": 1.4031890660592255e-05, | |
| "loss": 0.447, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.41060033440589905, | |
| "step": 155, | |
| "valid_targets_mean": 5486.7, | |
| "valid_targets_min": 850 | |
| }, | |
| { | |
| "epoch": 0.25559105431309903, | |
| "grad_norm": 0.5316496249184558, | |
| "learning_rate": 1.4487471526195902e-05, | |
| "loss": 0.4215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4127007722854614, | |
| "step": 160, | |
| "valid_targets_mean": 6952.2, | |
| "valid_targets_min": 3028 | |
| }, | |
| { | |
| "epoch": 0.26357827476038337, | |
| "grad_norm": 0.5455090268123999, | |
| "learning_rate": 1.4943052391799546e-05, | |
| "loss": 0.4319, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4257356524467468, | |
| "step": 165, | |
| "valid_targets_mean": 6078.4, | |
| "valid_targets_min": 2634 | |
| }, | |
| { | |
| "epoch": 0.2715654952076677, | |
| "grad_norm": 0.594020125737104, | |
| "learning_rate": 1.539863325740319e-05, | |
| "loss": 0.4292, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4559805989265442, | |
| "step": 170, | |
| "valid_targets_mean": 6805.6, | |
| "valid_targets_min": 4294 | |
| }, | |
| { | |
| "epoch": 0.2795527156549521, | |
| "grad_norm": 0.522417181293472, | |
| "learning_rate": 1.5854214123006836e-05, | |
| "loss": 0.4156, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3874110281467438, | |
| "step": 175, | |
| "valid_targets_mean": 6254.8, | |
| "valid_targets_min": 2253 | |
| }, | |
| { | |
| "epoch": 0.28753993610223644, | |
| "grad_norm": 0.5428395624186126, | |
| "learning_rate": 1.630979498861048e-05, | |
| "loss": 0.4173, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4159950017929077, | |
| "step": 180, | |
| "valid_targets_mean": 5860.8, | |
| "valid_targets_min": 1902 | |
| }, | |
| { | |
| "epoch": 0.2955271565495208, | |
| "grad_norm": 0.5358549711296916, | |
| "learning_rate": 1.6765375854214125e-05, | |
| "loss": 0.3776, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3822939395904541, | |
| "step": 185, | |
| "valid_targets_mean": 6356.2, | |
| "valid_targets_min": 2948 | |
| }, | |
| { | |
| "epoch": 0.3035143769968051, | |
| "grad_norm": 0.5426788394144003, | |
| "learning_rate": 1.722095671981777e-05, | |
| "loss": 0.4459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4878259301185608, | |
| "step": 190, | |
| "valid_targets_mean": 7617.1, | |
| "valid_targets_min": 4064 | |
| }, | |
| { | |
| "epoch": 0.31150159744408945, | |
| "grad_norm": 0.5026620650853947, | |
| "learning_rate": 1.7676537585421415e-05, | |
| "loss": 0.4111, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3959900736808777, | |
| "step": 195, | |
| "valid_targets_mean": 6262.6, | |
| "valid_targets_min": 2355 | |
| }, | |
| { | |
| "epoch": 0.3194888178913738, | |
| "grad_norm": 0.7377751223656953, | |
| "learning_rate": 1.813211845102506e-05, | |
| "loss": 0.4158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3927955627441406, | |
| "step": 200, | |
| "valid_targets_mean": 5987.2, | |
| "valid_targets_min": 1410 | |
| }, | |
| { | |
| "epoch": 0.3274760383386581, | |
| "grad_norm": 0.5558038762919252, | |
| "learning_rate": 1.8587699316628704e-05, | |
| "loss": 0.3835, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3979894518852234, | |
| "step": 205, | |
| "valid_targets_mean": 6104.1, | |
| "valid_targets_min": 2891 | |
| }, | |
| { | |
| "epoch": 0.3354632587859425, | |
| "grad_norm": 0.5500077281201249, | |
| "learning_rate": 1.904328018223235e-05, | |
| "loss": 0.4115, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4100519120693207, | |
| "step": 210, | |
| "valid_targets_mean": 5980.8, | |
| "valid_targets_min": 2149 | |
| }, | |
| { | |
| "epoch": 0.34345047923322686, | |
| "grad_norm": 0.5267342611853494, | |
| "learning_rate": 1.9498861047835993e-05, | |
| "loss": 0.3775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3988687992095947, | |
| "step": 215, | |
| "valid_targets_mean": 5932.4, | |
| "valid_targets_min": 2627 | |
| }, | |
| { | |
| "epoch": 0.3514376996805112, | |
| "grad_norm": 0.5687702541878422, | |
| "learning_rate": 1.9954441913439638e-05, | |
| "loss": 0.4121, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.404974102973938, | |
| "step": 220, | |
| "valid_targets_mean": 6019.1, | |
| "valid_targets_min": 3197 | |
| }, | |
| { | |
| "epoch": 0.35942492012779553, | |
| "grad_norm": 0.5214189282649531, | |
| "learning_rate": 2.0410022779043283e-05, | |
| "loss": 0.408, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4424813985824585, | |
| "step": 225, | |
| "valid_targets_mean": 6421.6, | |
| "valid_targets_min": 3161 | |
| }, | |
| { | |
| "epoch": 0.36741214057507987, | |
| "grad_norm": 0.4916702781926527, | |
| "learning_rate": 2.0865603644646927e-05, | |
| "loss": 0.4112, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35374367237091064, | |
| "step": 230, | |
| "valid_targets_mean": 6313.9, | |
| "valid_targets_min": 3229 | |
| }, | |
| { | |
| "epoch": 0.3753993610223642, | |
| "grad_norm": 0.5675566728715148, | |
| "learning_rate": 2.1321184510250572e-05, | |
| "loss": 0.381, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36870861053466797, | |
| "step": 235, | |
| "valid_targets_mean": 5828.0, | |
| "valid_targets_min": 2662 | |
| }, | |
| { | |
| "epoch": 0.38338658146964855, | |
| "grad_norm": 0.5847156984294266, | |
| "learning_rate": 2.1776765375854217e-05, | |
| "loss": 0.3975, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38380903005599976, | |
| "step": 240, | |
| "valid_targets_mean": 5947.7, | |
| "valid_targets_min": 1960 | |
| }, | |
| { | |
| "epoch": 0.3913738019169329, | |
| "grad_norm": 0.5824619896215698, | |
| "learning_rate": 2.223234624145786e-05, | |
| "loss": 0.3822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3842066526412964, | |
| "step": 245, | |
| "valid_targets_mean": 6012.7, | |
| "valid_targets_min": 2141 | |
| }, | |
| { | |
| "epoch": 0.3993610223642173, | |
| "grad_norm": 0.5327184879832241, | |
| "learning_rate": 2.2687927107061506e-05, | |
| "loss": 0.3892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3933866024017334, | |
| "step": 250, | |
| "valid_targets_mean": 5548.7, | |
| "valid_targets_min": 820 | |
| }, | |
| { | |
| "epoch": 0.4073482428115016, | |
| "grad_norm": 0.5116314853804573, | |
| "learning_rate": 2.314350797266515e-05, | |
| "loss": 0.4008, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4094942808151245, | |
| "step": 255, | |
| "valid_targets_mean": 6455.1, | |
| "valid_targets_min": 2554 | |
| }, | |
| { | |
| "epoch": 0.41533546325878595, | |
| "grad_norm": 0.5766834587433843, | |
| "learning_rate": 2.3599088838268792e-05, | |
| "loss": 0.3822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3770338296890259, | |
| "step": 260, | |
| "valid_targets_mean": 4891.6, | |
| "valid_targets_min": 1079 | |
| }, | |
| { | |
| "epoch": 0.4233226837060703, | |
| "grad_norm": 0.5365075006934845, | |
| "learning_rate": 2.4054669703872436e-05, | |
| "loss": 0.3828, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.39622747898101807, | |
| "step": 265, | |
| "valid_targets_mean": 6021.6, | |
| "valid_targets_min": 3425 | |
| }, | |
| { | |
| "epoch": 0.43130990415335463, | |
| "grad_norm": 0.5460731656280722, | |
| "learning_rate": 2.4510250569476085e-05, | |
| "loss": 0.3817, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35893553495407104, | |
| "step": 270, | |
| "valid_targets_mean": 5989.9, | |
| "valid_targets_min": 2574 | |
| }, | |
| { | |
| "epoch": 0.43929712460063897, | |
| "grad_norm": 0.6079611703102372, | |
| "learning_rate": 2.496583143507973e-05, | |
| "loss": 0.3867, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.39559704065322876, | |
| "step": 275, | |
| "valid_targets_mean": 6586.9, | |
| "valid_targets_min": 2085 | |
| }, | |
| { | |
| "epoch": 0.4472843450479233, | |
| "grad_norm": 0.49978220129497253, | |
| "learning_rate": 2.5421412300683374e-05, | |
| "loss": 0.3757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3840124309062958, | |
| "step": 280, | |
| "valid_targets_mean": 7195.6, | |
| "valid_targets_min": 2446 | |
| }, | |
| { | |
| "epoch": 0.45527156549520764, | |
| "grad_norm": 0.4890739738357231, | |
| "learning_rate": 2.587699316628702e-05, | |
| "loss": 0.389, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38446980714797974, | |
| "step": 285, | |
| "valid_targets_mean": 6686.1, | |
| "valid_targets_min": 2189 | |
| }, | |
| { | |
| "epoch": 0.46325878594249204, | |
| "grad_norm": 0.5608685116856191, | |
| "learning_rate": 2.6332574031890663e-05, | |
| "loss": 0.369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3336634933948517, | |
| "step": 290, | |
| "valid_targets_mean": 4896.4, | |
| "valid_targets_min": 2432 | |
| }, | |
| { | |
| "epoch": 0.4712460063897764, | |
| "grad_norm": 0.5580624238237674, | |
| "learning_rate": 2.6788154897494308e-05, | |
| "loss": 0.3866, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3963642716407776, | |
| "step": 295, | |
| "valid_targets_mean": 6492.6, | |
| "valid_targets_min": 3021 | |
| }, | |
| { | |
| "epoch": 0.4792332268370607, | |
| "grad_norm": 0.5653648550455211, | |
| "learning_rate": 2.7243735763097953e-05, | |
| "loss": 0.3596, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38598793745040894, | |
| "step": 300, | |
| "valid_targets_mean": 6002.1, | |
| "valid_targets_min": 3217 | |
| }, | |
| { | |
| "epoch": 0.48722044728434505, | |
| "grad_norm": 0.5523573341881904, | |
| "learning_rate": 2.7699316628701597e-05, | |
| "loss": 0.3957, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38130322098731995, | |
| "step": 305, | |
| "valid_targets_mean": 6425.9, | |
| "valid_targets_min": 796 | |
| }, | |
| { | |
| "epoch": 0.4952076677316294, | |
| "grad_norm": 0.5908800201210523, | |
| "learning_rate": 2.815489749430524e-05, | |
| "loss": 0.3755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.40736281871795654, | |
| "step": 310, | |
| "valid_targets_mean": 5972.9, | |
| "valid_targets_min": 2481 | |
| }, | |
| { | |
| "epoch": 0.5031948881789138, | |
| "grad_norm": 0.5050182290593928, | |
| "learning_rate": 2.8610478359908883e-05, | |
| "loss": 0.3654, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3074182868003845, | |
| "step": 315, | |
| "valid_targets_mean": 5612.5, | |
| "valid_targets_min": 1649 | |
| }, | |
| { | |
| "epoch": 0.5111821086261981, | |
| "grad_norm": 0.48241805937181814, | |
| "learning_rate": 2.906605922551253e-05, | |
| "loss": 0.364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3513393700122833, | |
| "step": 320, | |
| "valid_targets_mean": 6846.2, | |
| "valid_targets_min": 3563 | |
| }, | |
| { | |
| "epoch": 0.5191693290734825, | |
| "grad_norm": 0.5454463339417979, | |
| "learning_rate": 2.9521640091116176e-05, | |
| "loss": 0.3803, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33758074045181274, | |
| "step": 325, | |
| "valid_targets_mean": 5402.2, | |
| "valid_targets_min": 2098 | |
| }, | |
| { | |
| "epoch": 0.5271565495207667, | |
| "grad_norm": 0.5541261666218144, | |
| "learning_rate": 2.997722095671982e-05, | |
| "loss": 0.3696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.390627920627594, | |
| "step": 330, | |
| "valid_targets_mean": 6347.6, | |
| "valid_targets_min": 2027 | |
| }, | |
| { | |
| "epoch": 0.5351437699680511, | |
| "grad_norm": 0.5660404300158841, | |
| "learning_rate": 3.0432801822323465e-05, | |
| "loss": 0.3534, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3044443726539612, | |
| "step": 335, | |
| "valid_targets_mean": 4961.2, | |
| "valid_targets_min": 2713 | |
| }, | |
| { | |
| "epoch": 0.5431309904153354, | |
| "grad_norm": 0.5636206871500397, | |
| "learning_rate": 3.088838268792711e-05, | |
| "loss": 0.3884, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31569594144821167, | |
| "step": 340, | |
| "valid_targets_mean": 5465.3, | |
| "valid_targets_min": 2793 | |
| }, | |
| { | |
| "epoch": 0.5511182108626198, | |
| "grad_norm": 0.552044840037857, | |
| "learning_rate": 3.1343963553530755e-05, | |
| "loss": 0.3624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3930482864379883, | |
| "step": 345, | |
| "valid_targets_mean": 6116.7, | |
| "valid_targets_min": 2682 | |
| }, | |
| { | |
| "epoch": 0.5591054313099042, | |
| "grad_norm": 0.5058315239087612, | |
| "learning_rate": 3.17995444191344e-05, | |
| "loss": 0.3517, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35888534784317017, | |
| "step": 350, | |
| "valid_targets_mean": 6523.8, | |
| "valid_targets_min": 2418 | |
| }, | |
| { | |
| "epoch": 0.5670926517571885, | |
| "grad_norm": 0.538733839072958, | |
| "learning_rate": 3.2255125284738044e-05, | |
| "loss": 0.3708, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.37181180715560913, | |
| "step": 355, | |
| "valid_targets_mean": 6577.8, | |
| "valid_targets_min": 2399 | |
| }, | |
| { | |
| "epoch": 0.5750798722044729, | |
| "grad_norm": 0.5768881154685574, | |
| "learning_rate": 3.271070615034169e-05, | |
| "loss": 0.3522, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35219404101371765, | |
| "step": 360, | |
| "valid_targets_mean": 4975.4, | |
| "valid_targets_min": 719 | |
| }, | |
| { | |
| "epoch": 0.5830670926517572, | |
| "grad_norm": 0.5681827811525683, | |
| "learning_rate": 3.316628701594533e-05, | |
| "loss": 0.3501, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31764858961105347, | |
| "step": 365, | |
| "valid_targets_mean": 5188.6, | |
| "valid_targets_min": 2397 | |
| }, | |
| { | |
| "epoch": 0.5910543130990416, | |
| "grad_norm": 0.5114875013027304, | |
| "learning_rate": 3.362186788154898e-05, | |
| "loss": 0.3653, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3983859121799469, | |
| "step": 370, | |
| "valid_targets_mean": 6722.6, | |
| "valid_targets_min": 2953 | |
| }, | |
| { | |
| "epoch": 0.5990415335463258, | |
| "grad_norm": 0.524015930896569, | |
| "learning_rate": 3.407744874715262e-05, | |
| "loss": 0.3746, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38032034039497375, | |
| "step": 375, | |
| "valid_targets_mean": 6115.0, | |
| "valid_targets_min": 2369 | |
| }, | |
| { | |
| "epoch": 0.6070287539936102, | |
| "grad_norm": 0.543583068719088, | |
| "learning_rate": 3.453302961275627e-05, | |
| "loss": 0.3639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3506336212158203, | |
| "step": 380, | |
| "valid_targets_mean": 5941.8, | |
| "valid_targets_min": 2024 | |
| }, | |
| { | |
| "epoch": 0.6150159744408946, | |
| "grad_norm": 0.6024333808585206, | |
| "learning_rate": 3.498861047835991e-05, | |
| "loss": 0.3451, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3517494201660156, | |
| "step": 385, | |
| "valid_targets_mean": 5661.4, | |
| "valid_targets_min": 751 | |
| }, | |
| { | |
| "epoch": 0.6230031948881789, | |
| "grad_norm": 0.5877239060485641, | |
| "learning_rate": 3.5444191343963557e-05, | |
| "loss": 0.346, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34555625915527344, | |
| "step": 390, | |
| "valid_targets_mean": 5934.5, | |
| "valid_targets_min": 3422 | |
| }, | |
| { | |
| "epoch": 0.6309904153354633, | |
| "grad_norm": 0.5796893402638248, | |
| "learning_rate": 3.58997722095672e-05, | |
| "loss": 0.3642, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3835155963897705, | |
| "step": 395, | |
| "valid_targets_mean": 5540.4, | |
| "valid_targets_min": 1047 | |
| }, | |
| { | |
| "epoch": 0.6389776357827476, | |
| "grad_norm": 0.5753191707352615, | |
| "learning_rate": 3.6355353075170846e-05, | |
| "loss": 0.3577, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3626410663127899, | |
| "step": 400, | |
| "valid_targets_mean": 6477.2, | |
| "valid_targets_min": 3771 | |
| }, | |
| { | |
| "epoch": 0.646964856230032, | |
| "grad_norm": 0.5086650808450215, | |
| "learning_rate": 3.681093394077449e-05, | |
| "loss": 0.3689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3235820233821869, | |
| "step": 405, | |
| "valid_targets_mean": 5537.2, | |
| "valid_targets_min": 1126 | |
| }, | |
| { | |
| "epoch": 0.6549520766773163, | |
| "grad_norm": 0.5106355971035841, | |
| "learning_rate": 3.7266514806378135e-05, | |
| "loss": 0.3675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.37916722893714905, | |
| "step": 410, | |
| "valid_targets_mean": 7107.3, | |
| "valid_targets_min": 3546 | |
| }, | |
| { | |
| "epoch": 0.6629392971246006, | |
| "grad_norm": 0.5153041221999022, | |
| "learning_rate": 3.772209567198178e-05, | |
| "loss": 0.3433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34424179792404175, | |
| "step": 415, | |
| "valid_targets_mean": 5599.9, | |
| "valid_targets_min": 572 | |
| }, | |
| { | |
| "epoch": 0.670926517571885, | |
| "grad_norm": 0.5064092120881107, | |
| "learning_rate": 3.8177676537585425e-05, | |
| "loss": 0.3683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.37698373198509216, | |
| "step": 420, | |
| "valid_targets_mean": 6666.9, | |
| "valid_targets_min": 3318 | |
| }, | |
| { | |
| "epoch": 0.6789137380191693, | |
| "grad_norm": 0.5379909104182273, | |
| "learning_rate": 3.863325740318907e-05, | |
| "loss": 0.3394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2994077801704407, | |
| "step": 425, | |
| "valid_targets_mean": 4974.1, | |
| "valid_targets_min": 831 | |
| }, | |
| { | |
| "epoch": 0.6869009584664537, | |
| "grad_norm": 0.5183532226395027, | |
| "learning_rate": 3.9088838268792714e-05, | |
| "loss": 0.3455, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33696937561035156, | |
| "step": 430, | |
| "valid_targets_mean": 5610.9, | |
| "valid_targets_min": 1445 | |
| }, | |
| { | |
| "epoch": 0.694888178913738, | |
| "grad_norm": 0.5253204809302123, | |
| "learning_rate": 3.954441913439636e-05, | |
| "loss": 0.3571, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3443463444709778, | |
| "step": 435, | |
| "valid_targets_mean": 5751.8, | |
| "valid_targets_min": 785 | |
| }, | |
| { | |
| "epoch": 0.7028753993610224, | |
| "grad_norm": 3.489224642333918, | |
| "learning_rate": 4e-05, | |
| "loss": 0.3498, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3365274667739868, | |
| "step": 440, | |
| "valid_targets_mean": 5479.6, | |
| "valid_targets_min": 525 | |
| }, | |
| { | |
| "epoch": 0.7108626198083067, | |
| "grad_norm": 0.6292025027519582, | |
| "learning_rate": 3.999984129682125e-05, | |
| "loss": 0.3611, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.348555326461792, | |
| "step": 445, | |
| "valid_targets_mean": 5195.3, | |
| "valid_targets_min": 825 | |
| }, | |
| { | |
| "epoch": 0.7188498402555911, | |
| "grad_norm": 0.5361978224230144, | |
| "learning_rate": 3.9999365189803684e-05, | |
| "loss": 0.3333, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3423752188682556, | |
| "step": 450, | |
| "valid_targets_mean": 6010.0, | |
| "valid_targets_min": 713 | |
| }, | |
| { | |
| "epoch": 0.7268370607028753, | |
| "grad_norm": 0.5131863331488722, | |
| "learning_rate": 3.9998571686503264e-05, | |
| "loss": 0.3499, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3492792844772339, | |
| "step": 455, | |
| "valid_targets_mean": 6147.5, | |
| "valid_targets_min": 2546 | |
| }, | |
| { | |
| "epoch": 0.7348242811501597, | |
| "grad_norm": 0.5321359987371036, | |
| "learning_rate": 3.9997460799513134e-05, | |
| "loss": 0.3575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3969469666481018, | |
| "step": 460, | |
| "valid_targets_mean": 7030.2, | |
| "valid_targets_min": 3035 | |
| }, | |
| { | |
| "epoch": 0.7428115015974441, | |
| "grad_norm": 0.5303577643402191, | |
| "learning_rate": 3.999603254646343e-05, | |
| "loss": 0.3507, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3047478497028351, | |
| "step": 465, | |
| "valid_targets_mean": 5819.0, | |
| "valid_targets_min": 2279 | |
| }, | |
| { | |
| "epoch": 0.7507987220447284, | |
| "grad_norm": 0.46127509823014623, | |
| "learning_rate": 3.9994286950020986e-05, | |
| "loss": 0.3452, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3369569182395935, | |
| "step": 470, | |
| "valid_targets_mean": 6886.6, | |
| "valid_targets_min": 3701 | |
| }, | |
| { | |
| "epoch": 0.7587859424920128, | |
| "grad_norm": 0.5188569499151255, | |
| "learning_rate": 3.999222403788896e-05, | |
| "loss": 0.3653, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3413102626800537, | |
| "step": 475, | |
| "valid_targets_mean": 6113.4, | |
| "valid_targets_min": 1729 | |
| }, | |
| { | |
| "epoch": 0.7667731629392971, | |
| "grad_norm": 0.5098021967052796, | |
| "learning_rate": 3.9989843842806435e-05, | |
| "loss": 0.3588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4095863103866577, | |
| "step": 480, | |
| "valid_targets_mean": 6739.8, | |
| "valid_targets_min": 2695 | |
| }, | |
| { | |
| "epoch": 0.7747603833865815, | |
| "grad_norm": 0.4979874721896599, | |
| "learning_rate": 3.998714640254786e-05, | |
| "loss": 0.355, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32254543900489807, | |
| "step": 485, | |
| "valid_targets_mean": 5816.2, | |
| "valid_targets_min": 2064 | |
| }, | |
| { | |
| "epoch": 0.7827476038338658, | |
| "grad_norm": 0.5807798121365964, | |
| "learning_rate": 3.998413175992247e-05, | |
| "loss": 0.3683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3993215560913086, | |
| "step": 490, | |
| "valid_targets_mean": 6581.7, | |
| "valid_targets_min": 1372 | |
| }, | |
| { | |
| "epoch": 0.7907348242811502, | |
| "grad_norm": 0.582679881823141, | |
| "learning_rate": 3.99807999627736e-05, | |
| "loss": 0.3285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33715301752090454, | |
| "step": 495, | |
| "valid_targets_mean": 4802.1, | |
| "valid_targets_min": 2106 | |
| }, | |
| { | |
| "epoch": 0.7987220447284346, | |
| "grad_norm": 0.46597006694336823, | |
| "learning_rate": 3.997715106397794e-05, | |
| "loss": 0.3276, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36086905002593994, | |
| "step": 500, | |
| "valid_targets_mean": 6869.1, | |
| "valid_targets_min": 3361 | |
| }, | |
| { | |
| "epoch": 0.8067092651757188, | |
| "grad_norm": 0.4802265621540931, | |
| "learning_rate": 3.997318512144465e-05, | |
| "loss": 0.3338, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31953489780426025, | |
| "step": 505, | |
| "valid_targets_mean": 6587.1, | |
| "valid_targets_min": 3103 | |
| }, | |
| { | |
| "epoch": 0.8146964856230032, | |
| "grad_norm": 0.518957862374697, | |
| "learning_rate": 3.9968902198114516e-05, | |
| "loss": 0.3686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3454042971134186, | |
| "step": 510, | |
| "valid_targets_mean": 5485.5, | |
| "valid_targets_min": 2295 | |
| }, | |
| { | |
| "epoch": 0.8226837060702875, | |
| "grad_norm": 0.5940376702061821, | |
| "learning_rate": 3.996430236195889e-05, | |
| "loss": 0.3785, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.354292094707489, | |
| "step": 515, | |
| "valid_targets_mean": 6642.6, | |
| "valid_targets_min": 2110 | |
| }, | |
| { | |
| "epoch": 0.8306709265175719, | |
| "grad_norm": 0.5340056377966012, | |
| "learning_rate": 3.995938568597864e-05, | |
| "loss": 0.3405, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3192850947380066, | |
| "step": 520, | |
| "valid_targets_mean": 6080.1, | |
| "valid_targets_min": 2508 | |
| }, | |
| { | |
| "epoch": 0.8386581469648562, | |
| "grad_norm": 0.5175151270613394, | |
| "learning_rate": 3.995415224820297e-05, | |
| "loss": 0.3724, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3880966305732727, | |
| "step": 525, | |
| "valid_targets_mean": 6465.8, | |
| "valid_targets_min": 3332 | |
| }, | |
| { | |
| "epoch": 0.8466453674121406, | |
| "grad_norm": 0.46923531014605424, | |
| "learning_rate": 3.994860213168819e-05, | |
| "loss": 0.3401, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.38312670588493347, | |
| "step": 530, | |
| "valid_targets_mean": 7132.9, | |
| "valid_targets_min": 3574 | |
| }, | |
| { | |
| "epoch": 0.854632587859425, | |
| "grad_norm": 0.5066641496835113, | |
| "learning_rate": 3.9942735424516435e-05, | |
| "loss": 0.3268, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30763086676597595, | |
| "step": 535, | |
| "valid_targets_mean": 5669.9, | |
| "valid_targets_min": 2290 | |
| }, | |
| { | |
| "epoch": 0.8626198083067093, | |
| "grad_norm": 0.5457503547073947, | |
| "learning_rate": 3.9936552219794196e-05, | |
| "loss": 0.3313, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31306904554367065, | |
| "step": 540, | |
| "valid_targets_mean": 5285.8, | |
| "valid_targets_min": 1864 | |
| }, | |
| { | |
| "epoch": 0.8706070287539937, | |
| "grad_norm": 0.4930567828775584, | |
| "learning_rate": 3.993005261565091e-05, | |
| "loss": 0.3433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3326902389526367, | |
| "step": 545, | |
| "valid_targets_mean": 5856.3, | |
| "valid_targets_min": 3136 | |
| }, | |
| { | |
| "epoch": 0.8785942492012779, | |
| "grad_norm": 0.5046679084868202, | |
| "learning_rate": 3.992323671523735e-05, | |
| "loss": 0.3351, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32354608178138733, | |
| "step": 550, | |
| "valid_targets_mean": 5849.5, | |
| "valid_targets_min": 2438 | |
| }, | |
| { | |
| "epoch": 0.8865814696485623, | |
| "grad_norm": 0.49039410725365223, | |
| "learning_rate": 3.991610462672403e-05, | |
| "loss": 0.3491, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.382174015045166, | |
| "step": 555, | |
| "valid_targets_mean": 6617.8, | |
| "valid_targets_min": 2769 | |
| }, | |
| { | |
| "epoch": 0.8945686900958466, | |
| "grad_norm": 0.5686260786567642, | |
| "learning_rate": 3.9908656463299456e-05, | |
| "loss": 0.3295, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3528607487678528, | |
| "step": 560, | |
| "valid_targets_mean": 5212.4, | |
| "valid_targets_min": 665 | |
| }, | |
| { | |
| "epoch": 0.902555910543131, | |
| "grad_norm": 0.5108671000751085, | |
| "learning_rate": 3.990089234316835e-05, | |
| "loss": 0.3689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3839905858039856, | |
| "step": 565, | |
| "valid_targets_mean": 6591.1, | |
| "valid_targets_min": 2809 | |
| }, | |
| { | |
| "epoch": 0.9105431309904153, | |
| "grad_norm": 0.46883247058129784, | |
| "learning_rate": 3.989281238954978e-05, | |
| "loss": 0.3367, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31492048501968384, | |
| "step": 570, | |
| "valid_targets_mean": 5813.1, | |
| "valid_targets_min": 2263 | |
| }, | |
| { | |
| "epoch": 0.9185303514376997, | |
| "grad_norm": 0.4447993247627989, | |
| "learning_rate": 3.9884416730675155e-05, | |
| "loss": 0.3431, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.339530348777771, | |
| "step": 575, | |
| "valid_targets_mean": 6522.4, | |
| "valid_targets_min": 749 | |
| }, | |
| { | |
| "epoch": 0.9265175718849841, | |
| "grad_norm": 0.4682710462081107, | |
| "learning_rate": 3.987570549978626e-05, | |
| "loss": 0.32, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3164796233177185, | |
| "step": 580, | |
| "valid_targets_mean": 6027.9, | |
| "valid_targets_min": 1386 | |
| }, | |
| { | |
| "epoch": 0.9345047923322684, | |
| "grad_norm": 0.48470655724036354, | |
| "learning_rate": 3.986667883513311e-05, | |
| "loss": 0.3341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35007989406585693, | |
| "step": 585, | |
| "valid_targets_mean": 6677.7, | |
| "valid_targets_min": 2039 | |
| }, | |
| { | |
| "epoch": 0.9424920127795527, | |
| "grad_norm": 0.4687992457086807, | |
| "learning_rate": 3.985733687997173e-05, | |
| "loss": 0.3183, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34673264622688293, | |
| "step": 590, | |
| "valid_targets_mean": 6802.7, | |
| "valid_targets_min": 2832 | |
| }, | |
| { | |
| "epoch": 0.950479233226837, | |
| "grad_norm": 0.49804750348458293, | |
| "learning_rate": 3.984767978256192e-05, | |
| "loss": 0.3543, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.37401020526885986, | |
| "step": 595, | |
| "valid_targets_mean": 6256.6, | |
| "valid_targets_min": 2177 | |
| }, | |
| { | |
| "epoch": 0.9584664536741214, | |
| "grad_norm": 0.45068650300014285, | |
| "learning_rate": 3.983770769616488e-05, | |
| "loss": 0.3341, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2899397611618042, | |
| "step": 600, | |
| "valid_targets_mean": 5941.1, | |
| "valid_targets_min": 2438 | |
| }, | |
| { | |
| "epoch": 0.9664536741214057, | |
| "grad_norm": 0.49256024842809054, | |
| "learning_rate": 3.9827420779040805e-05, | |
| "loss": 0.3214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29223012924194336, | |
| "step": 605, | |
| "valid_targets_mean": 5293.6, | |
| "valid_targets_min": 3390 | |
| }, | |
| { | |
| "epoch": 0.9744408945686901, | |
| "grad_norm": 0.4899302766050055, | |
| "learning_rate": 3.981681919444633e-05, | |
| "loss": 0.3241, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33953702449798584, | |
| "step": 610, | |
| "valid_targets_mean": 6229.4, | |
| "valid_targets_min": 2579 | |
| }, | |
| { | |
| "epoch": 0.9824281150159745, | |
| "grad_norm": 0.4769757737396491, | |
| "learning_rate": 3.980590311063197e-05, | |
| "loss": 0.3645, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.4187234044075012, | |
| "step": 615, | |
| "valid_targets_mean": 7530.5, | |
| "valid_targets_min": 4088 | |
| }, | |
| { | |
| "epoch": 0.9904153354632588, | |
| "grad_norm": 0.45304778746208046, | |
| "learning_rate": 3.9794672700839455e-05, | |
| "loss": 0.3364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.36301684379577637, | |
| "step": 620, | |
| "valid_targets_mean": 7218.6, | |
| "valid_targets_min": 2319 | |
| }, | |
| { | |
| "epoch": 0.9984025559105432, | |
| "grad_norm": 0.4705258653405169, | |
| "learning_rate": 3.9783128143298945e-05, | |
| "loss": 0.3443, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3477717638015747, | |
| "step": 625, | |
| "valid_targets_mean": 5919.8, | |
| "valid_targets_min": 1009 | |
| }, | |
| { | |
| "epoch": 1.0063897763578276, | |
| "grad_norm": 0.5050283025110829, | |
| "learning_rate": 3.977126962122625e-05, | |
| "loss": 0.3119, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32592248916625977, | |
| "step": 630, | |
| "valid_targets_mean": 5563.4, | |
| "valid_targets_min": 751 | |
| }, | |
| { | |
| "epoch": 1.0143769968051117, | |
| "grad_norm": 0.46229527335680065, | |
| "learning_rate": 3.975909732281988e-05, | |
| "loss": 0.342, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3142285645008087, | |
| "step": 635, | |
| "valid_targets_mean": 6632.2, | |
| "valid_targets_min": 2947 | |
| }, | |
| { | |
| "epoch": 1.0223642172523961, | |
| "grad_norm": 0.47546572891270733, | |
| "learning_rate": 3.974661144125808e-05, | |
| "loss": 0.3096, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2969960868358612, | |
| "step": 640, | |
| "valid_targets_mean": 5500.5, | |
| "valid_targets_min": 2119 | |
| }, | |
| { | |
| "epoch": 1.0303514376996805, | |
| "grad_norm": 0.5122931590812704, | |
| "learning_rate": 3.973381217469576e-05, | |
| "loss": 0.3218, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3373990058898926, | |
| "step": 645, | |
| "valid_targets_mean": 5624.6, | |
| "valid_targets_min": 888 | |
| }, | |
| { | |
| "epoch": 1.038338658146965, | |
| "grad_norm": 0.4815260696712804, | |
| "learning_rate": 3.972069972626135e-05, | |
| "loss": 0.3215, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.341543972492218, | |
| "step": 650, | |
| "valid_targets_mean": 5392.6, | |
| "valid_targets_min": 2149 | |
| }, | |
| { | |
| "epoch": 1.0463258785942493, | |
| "grad_norm": 0.4917826291128604, | |
| "learning_rate": 3.970727430405357e-05, | |
| "loss": 0.3432, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32287609577178955, | |
| "step": 655, | |
| "valid_targets_mean": 6214.2, | |
| "valid_targets_min": 4250 | |
| }, | |
| { | |
| "epoch": 1.0543130990415335, | |
| "grad_norm": 0.5028922806576065, | |
| "learning_rate": 3.969353612113815e-05, | |
| "loss": 0.3105, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3034610152244568, | |
| "step": 660, | |
| "valid_targets_mean": 5340.4, | |
| "valid_targets_min": 2792 | |
| }, | |
| { | |
| "epoch": 1.0623003194888179, | |
| "grad_norm": 0.4555063370999899, | |
| "learning_rate": 3.96794853955444e-05, | |
| "loss": 0.3114, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30887871980667114, | |
| "step": 665, | |
| "valid_targets_mean": 6670.9, | |
| "valid_targets_min": 2818 | |
| }, | |
| { | |
| "epoch": 1.0702875399361023, | |
| "grad_norm": 0.4419224322404009, | |
| "learning_rate": 3.966512235026182e-05, | |
| "loss": 0.3288, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30245083570480347, | |
| "step": 670, | |
| "valid_targets_mean": 6296.7, | |
| "valid_targets_min": 2397 | |
| }, | |
| { | |
| "epoch": 1.0782747603833867, | |
| "grad_norm": 0.5482899490758951, | |
| "learning_rate": 3.96504472132365e-05, | |
| "loss": 0.3374, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35562747716903687, | |
| "step": 675, | |
| "valid_targets_mean": 5249.1, | |
| "valid_targets_min": 2771 | |
| }, | |
| { | |
| "epoch": 1.0862619808306708, | |
| "grad_norm": 0.455403036488509, | |
| "learning_rate": 3.9635460217367513e-05, | |
| "loss": 0.3222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34553956985473633, | |
| "step": 680, | |
| "valid_targets_mean": 6100.2, | |
| "valid_targets_min": 2272 | |
| }, | |
| { | |
| "epoch": 1.0942492012779552, | |
| "grad_norm": 0.459023340955603, | |
| "learning_rate": 3.962016160050327e-05, | |
| "loss": 0.34, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.338087260723114, | |
| "step": 685, | |
| "valid_targets_mean": 6738.0, | |
| "valid_targets_min": 2801 | |
| }, | |
| { | |
| "epoch": 1.1022364217252396, | |
| "grad_norm": 0.496989470058913, | |
| "learning_rate": 3.960455160543767e-05, | |
| "loss": 0.3284, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3198395371437073, | |
| "step": 690, | |
| "valid_targets_mean": 5490.8, | |
| "valid_targets_min": 758 | |
| }, | |
| { | |
| "epoch": 1.110223642172524, | |
| "grad_norm": 0.4959370778039393, | |
| "learning_rate": 3.958863047990631e-05, | |
| "loss": 0.3169, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3394013047218323, | |
| "step": 695, | |
| "valid_targets_mean": 6389.7, | |
| "valid_targets_min": 2733 | |
| }, | |
| { | |
| "epoch": 1.1182108626198084, | |
| "grad_norm": 0.4718803985159449, | |
| "learning_rate": 3.95723984765825e-05, | |
| "loss": 0.335, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3159930109977722, | |
| "step": 700, | |
| "valid_targets_mean": 6206.4, | |
| "valid_targets_min": 2999 | |
| }, | |
| { | |
| "epoch": 1.1261980830670926, | |
| "grad_norm": 0.49309286772161354, | |
| "learning_rate": 3.955585585307329e-05, | |
| "loss": 0.3343, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3764703869819641, | |
| "step": 705, | |
| "valid_targets_mean": 6215.7, | |
| "valid_targets_min": 3028 | |
| }, | |
| { | |
| "epoch": 1.134185303514377, | |
| "grad_norm": 0.4512412872694635, | |
| "learning_rate": 3.9539002871915395e-05, | |
| "loss": 0.339, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3365987539291382, | |
| "step": 710, | |
| "valid_targets_mean": 5982.4, | |
| "valid_targets_min": 3021 | |
| }, | |
| { | |
| "epoch": 1.1421725239616614, | |
| "grad_norm": 0.49450708297479895, | |
| "learning_rate": 3.952183980057096e-05, | |
| "loss": 0.3248, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3273988366127014, | |
| "step": 715, | |
| "valid_targets_mean": 5661.2, | |
| "valid_targets_min": 2428 | |
| }, | |
| { | |
| "epoch": 1.1501597444089458, | |
| "grad_norm": 0.49441101213705674, | |
| "learning_rate": 3.950436691142339e-05, | |
| "loss": 0.329, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31003260612487793, | |
| "step": 720, | |
| "valid_targets_mean": 5226.9, | |
| "valid_targets_min": 892 | |
| }, | |
| { | |
| "epoch": 1.15814696485623, | |
| "grad_norm": 0.4327129332368562, | |
| "learning_rate": 3.948658448177299e-05, | |
| "loss": 0.3347, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2914605140686035, | |
| "step": 725, | |
| "valid_targets_mean": 5476.6, | |
| "valid_targets_min": 2777 | |
| }, | |
| { | |
| "epoch": 1.1661341853035143, | |
| "grad_norm": 0.4579630636645496, | |
| "learning_rate": 3.946849279383258e-05, | |
| "loss": 0.3145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2681208550930023, | |
| "step": 730, | |
| "valid_targets_mean": 6204.7, | |
| "valid_targets_min": 2515 | |
| }, | |
| { | |
| "epoch": 1.1741214057507987, | |
| "grad_norm": 0.48284199564071345, | |
| "learning_rate": 3.9450092134722984e-05, | |
| "loss": 0.3256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29863229393959045, | |
| "step": 735, | |
| "valid_targets_mean": 5136.0, | |
| "valid_targets_min": 2582 | |
| }, | |
| { | |
| "epoch": 1.182108626198083, | |
| "grad_norm": 0.4805539990378716, | |
| "learning_rate": 3.943138279646853e-05, | |
| "loss": 0.3327, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31790691614151, | |
| "step": 740, | |
| "valid_targets_mean": 6269.9, | |
| "valid_targets_min": 559 | |
| }, | |
| { | |
| "epoch": 1.1900958466453675, | |
| "grad_norm": 0.4784478712209745, | |
| "learning_rate": 3.941236507599234e-05, | |
| "loss": 0.3382, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34656983613967896, | |
| "step": 745, | |
| "valid_targets_mean": 5993.6, | |
| "valid_targets_min": 2471 | |
| }, | |
| { | |
| "epoch": 1.1980830670926517, | |
| "grad_norm": 0.43383763240803597, | |
| "learning_rate": 3.93930392751117e-05, | |
| "loss": 0.3161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29610034823417664, | |
| "step": 750, | |
| "valid_targets_mean": 6126.3, | |
| "valid_targets_min": 2064 | |
| }, | |
| { | |
| "epoch": 1.206070287539936, | |
| "grad_norm": 0.48883018348116913, | |
| "learning_rate": 3.9373405700533204e-05, | |
| "loss": 0.3257, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31921499967575073, | |
| "step": 755, | |
| "valid_targets_mean": 4802.9, | |
| "valid_targets_min": 2423 | |
| }, | |
| { | |
| "epoch": 1.2140575079872205, | |
| "grad_norm": 0.4644836687483504, | |
| "learning_rate": 3.935346466384793e-05, | |
| "loss": 0.328, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3234442472457886, | |
| "step": 760, | |
| "valid_targets_mean": 5879.3, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 1.2220447284345048, | |
| "grad_norm": 0.45051923038048314, | |
| "learning_rate": 3.933321648152646e-05, | |
| "loss": 0.3221, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3240428566932678, | |
| "step": 765, | |
| "valid_targets_mean": 6665.6, | |
| "valid_targets_min": 2373 | |
| }, | |
| { | |
| "epoch": 1.230031948881789, | |
| "grad_norm": 0.42056159364192114, | |
| "learning_rate": 3.931266147491389e-05, | |
| "loss": 0.3195, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2504345178604126, | |
| "step": 770, | |
| "valid_targets_mean": 5726.6, | |
| "valid_targets_min": 2634 | |
| }, | |
| { | |
| "epoch": 1.2380191693290734, | |
| "grad_norm": 0.47062263420441686, | |
| "learning_rate": 3.929179997022471e-05, | |
| "loss": 0.3291, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34417733550071716, | |
| "step": 775, | |
| "valid_targets_mean": 5756.7, | |
| "valid_targets_min": 2836 | |
| }, | |
| { | |
| "epoch": 1.2460063897763578, | |
| "grad_norm": 0.5458397011137837, | |
| "learning_rate": 3.927063229853763e-05, | |
| "loss": 0.3359, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33597809076309204, | |
| "step": 780, | |
| "valid_targets_mean": 6280.2, | |
| "valid_targets_min": 3186 | |
| }, | |
| { | |
| "epoch": 1.2539936102236422, | |
| "grad_norm": 0.4424051964736962, | |
| "learning_rate": 3.9249158795790316e-05, | |
| "loss": 0.3062, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3068455159664154, | |
| "step": 785, | |
| "valid_targets_mean": 6066.4, | |
| "valid_targets_min": 2875 | |
| }, | |
| { | |
| "epoch": 1.2619808306709266, | |
| "grad_norm": 0.4873562321931644, | |
| "learning_rate": 3.9227379802774106e-05, | |
| "loss": 0.3148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2939402461051941, | |
| "step": 790, | |
| "valid_targets_mean": 5670.5, | |
| "valid_targets_min": 2493 | |
| }, | |
| { | |
| "epoch": 1.269968051118211, | |
| "grad_norm": 0.4506015718936206, | |
| "learning_rate": 3.920529566512852e-05, | |
| "loss": 0.3161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3366793990135193, | |
| "step": 795, | |
| "valid_targets_mean": 6771.7, | |
| "valid_targets_min": 4219 | |
| }, | |
| { | |
| "epoch": 1.2779552715654952, | |
| "grad_norm": 0.4654389007890439, | |
| "learning_rate": 3.918290673333585e-05, | |
| "loss": 0.3009, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2800728976726532, | |
| "step": 800, | |
| "valid_targets_mean": 5903.6, | |
| "valid_targets_min": 2907 | |
| }, | |
| { | |
| "epoch": 1.2859424920127795, | |
| "grad_norm": 0.47840692346282043, | |
| "learning_rate": 3.916021336271556e-05, | |
| "loss": 0.3365, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3363194465637207, | |
| "step": 805, | |
| "valid_targets_mean": 5480.3, | |
| "valid_targets_min": 2627 | |
| }, | |
| { | |
| "epoch": 1.293929712460064, | |
| "grad_norm": 0.46612977679195894, | |
| "learning_rate": 3.913721591341867e-05, | |
| "loss": 0.3082, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31425386667251587, | |
| "step": 810, | |
| "valid_targets_mean": 6013.5, | |
| "valid_targets_min": 2918 | |
| }, | |
| { | |
| "epoch": 1.3019169329073481, | |
| "grad_norm": 0.41855869709453075, | |
| "learning_rate": 3.9113914750421985e-05, | |
| "loss": 0.3413, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3336500823497772, | |
| "step": 815, | |
| "valid_targets_mean": 7569.5, | |
| "valid_targets_min": 4308 | |
| }, | |
| { | |
| "epoch": 1.3099041533546325, | |
| "grad_norm": 0.4733911267830699, | |
| "learning_rate": 3.9090310243522394e-05, | |
| "loss": 0.3222, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3009193539619446, | |
| "step": 820, | |
| "valid_targets_mean": 4800.0, | |
| "valid_targets_min": 805 | |
| }, | |
| { | |
| "epoch": 1.317891373801917, | |
| "grad_norm": 0.4694602200693918, | |
| "learning_rate": 3.90664027673309e-05, | |
| "loss": 0.315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3419846296310425, | |
| "step": 825, | |
| "valid_targets_mean": 5571.3, | |
| "valid_targets_min": 2137 | |
| }, | |
| { | |
| "epoch": 1.3258785942492013, | |
| "grad_norm": 0.4242061121855856, | |
| "learning_rate": 3.904219270126677e-05, | |
| "loss": 0.3047, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31825196743011475, | |
| "step": 830, | |
| "valid_targets_mean": 6412.2, | |
| "valid_targets_min": 2406 | |
| }, | |
| { | |
| "epoch": 1.3338658146964857, | |
| "grad_norm": 0.4157315386858848, | |
| "learning_rate": 3.901768042955144e-05, | |
| "loss": 0.3125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2768746614456177, | |
| "step": 835, | |
| "valid_targets_mean": 5879.1, | |
| "valid_targets_min": 2242 | |
| }, | |
| { | |
| "epoch": 1.34185303514377, | |
| "grad_norm": 0.42828031265648203, | |
| "learning_rate": 3.8992866341202446e-05, | |
| "loss": 0.3369, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3456231355667114, | |
| "step": 840, | |
| "valid_targets_mean": 6975.6, | |
| "valid_targets_min": 3585 | |
| }, | |
| { | |
| "epoch": 1.3498402555910542, | |
| "grad_norm": 0.4714898478342558, | |
| "learning_rate": 3.8967750830027277e-05, | |
| "loss": 0.3217, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33118703961372375, | |
| "step": 845, | |
| "valid_targets_mean": 6426.8, | |
| "valid_targets_min": 2822 | |
| }, | |
| { | |
| "epoch": 1.3578274760383386, | |
| "grad_norm": 0.47203648198959436, | |
| "learning_rate": 3.894233429461706e-05, | |
| "loss": 0.322, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33867913484573364, | |
| "step": 850, | |
| "valid_targets_mean": 6770.1, | |
| "valid_targets_min": 3588 | |
| }, | |
| { | |
| "epoch": 1.365814696485623, | |
| "grad_norm": 0.45582567159440307, | |
| "learning_rate": 3.89166171383403e-05, | |
| "loss": 0.3307, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3230893611907959, | |
| "step": 855, | |
| "valid_targets_mean": 5747.1, | |
| "valid_targets_min": 1988 | |
| }, | |
| { | |
| "epoch": 1.3738019169329074, | |
| "grad_norm": 0.43766367331612455, | |
| "learning_rate": 3.889059976933644e-05, | |
| "loss": 0.3048, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26820746064186096, | |
| "step": 860, | |
| "valid_targets_mean": 5200.1, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 1.3817891373801916, | |
| "grad_norm": 0.5846264744346037, | |
| "learning_rate": 3.88642826005094e-05, | |
| "loss": 0.3293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31131452322006226, | |
| "step": 865, | |
| "valid_targets_mean": 6175.3, | |
| "valid_targets_min": 3591 | |
| }, | |
| { | |
| "epoch": 1.389776357827476, | |
| "grad_norm": 0.4328811960129174, | |
| "learning_rate": 3.883766604952102e-05, | |
| "loss": 0.3079, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.330348938703537, | |
| "step": 870, | |
| "valid_targets_mean": 7087.1, | |
| "valid_targets_min": 689 | |
| }, | |
| { | |
| "epoch": 1.3977635782747604, | |
| "grad_norm": 0.4586049350161352, | |
| "learning_rate": 3.8810750538784404e-05, | |
| "loss": 0.3208, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3218524158000946, | |
| "step": 875, | |
| "valid_targets_mean": 5840.4, | |
| "valid_targets_min": 978 | |
| }, | |
| { | |
| "epoch": 1.4057507987220448, | |
| "grad_norm": 0.4161974005865729, | |
| "learning_rate": 3.878353649545728e-05, | |
| "loss": 0.3219, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29422056674957275, | |
| "step": 880, | |
| "valid_targets_mean": 6192.7, | |
| "valid_targets_min": 2357 | |
| }, | |
| { | |
| "epoch": 1.4137380191693292, | |
| "grad_norm": 0.4489913326752608, | |
| "learning_rate": 3.875602435143517e-05, | |
| "loss": 0.3191, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2698908746242523, | |
| "step": 885, | |
| "valid_targets_mean": 5695.6, | |
| "valid_targets_min": 3271 | |
| }, | |
| { | |
| "epoch": 1.4217252396166133, | |
| "grad_norm": 0.4219779165800395, | |
| "learning_rate": 3.872821454334453e-05, | |
| "loss": 0.2961, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29788100719451904, | |
| "step": 890, | |
| "valid_targets_mean": 6015.9, | |
| "valid_targets_min": 937 | |
| }, | |
| { | |
| "epoch": 1.4297124600638977, | |
| "grad_norm": 0.40053393941380927, | |
| "learning_rate": 3.870010751253587e-05, | |
| "loss": 0.3145, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28478747606277466, | |
| "step": 895, | |
| "valid_targets_mean": 5868.5, | |
| "valid_targets_min": 2389 | |
| }, | |
| { | |
| "epoch": 1.4376996805111821, | |
| "grad_norm": 0.44258924950097034, | |
| "learning_rate": 3.867170370507668e-05, | |
| "loss": 0.3286, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3253663182258606, | |
| "step": 900, | |
| "valid_targets_mean": 6370.3, | |
| "valid_targets_min": 1020 | |
| }, | |
| { | |
| "epoch": 1.4456869009584665, | |
| "grad_norm": 0.5021985687459187, | |
| "learning_rate": 3.8643003571744445e-05, | |
| "loss": 0.3236, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28128373622894287, | |
| "step": 905, | |
| "valid_targets_mean": 5399.1, | |
| "valid_targets_min": 2187 | |
| }, | |
| { | |
| "epoch": 1.4536741214057507, | |
| "grad_norm": 0.4655181153710023, | |
| "learning_rate": 3.861400756801938e-05, | |
| "loss": 0.3107, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.309695303440094, | |
| "step": 910, | |
| "valid_targets_mean": 6124.7, | |
| "valid_targets_min": 3143 | |
| }, | |
| { | |
| "epoch": 1.461661341853035, | |
| "grad_norm": 0.4724728099136258, | |
| "learning_rate": 3.8584716154077286e-05, | |
| "loss": 0.323, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.299552321434021, | |
| "step": 915, | |
| "valid_targets_mean": 5316.4, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 1.4696485623003195, | |
| "grad_norm": 0.41405364082318125, | |
| "learning_rate": 3.855512979478222e-05, | |
| "loss": 0.3313, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34244292974472046, | |
| "step": 920, | |
| "valid_targets_mean": 7273.3, | |
| "valid_targets_min": 2876 | |
| }, | |
| { | |
| "epoch": 1.4776357827476039, | |
| "grad_norm": 0.42347732062001664, | |
| "learning_rate": 3.852524895967911e-05, | |
| "loss": 0.3158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31581446528434753, | |
| "step": 925, | |
| "valid_targets_mean": 6855.9, | |
| "valid_targets_min": 3456 | |
| }, | |
| { | |
| "epoch": 1.4856230031948883, | |
| "grad_norm": 0.45430277309608497, | |
| "learning_rate": 3.8495074122986296e-05, | |
| "loss": 0.3261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34723660349845886, | |
| "step": 930, | |
| "valid_targets_mean": 6616.7, | |
| "valid_targets_min": 2742 | |
| }, | |
| { | |
| "epoch": 1.4936102236421724, | |
| "grad_norm": 0.40981666798968575, | |
| "learning_rate": 3.846460576358804e-05, | |
| "loss": 0.3294, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3579206168651581, | |
| "step": 935, | |
| "valid_targets_mean": 7637.8, | |
| "valid_targets_min": 4616 | |
| }, | |
| { | |
| "epoch": 1.5015974440894568, | |
| "grad_norm": 0.5034512767751957, | |
| "learning_rate": 3.843384436502688e-05, | |
| "loss": 0.3239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31429392099380493, | |
| "step": 940, | |
| "valid_targets_mean": 6014.1, | |
| "valid_targets_min": 3178 | |
| }, | |
| { | |
| "epoch": 1.5095846645367412, | |
| "grad_norm": 0.4207204740578826, | |
| "learning_rate": 3.8402790415496e-05, | |
| "loss": 0.3311, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2973242998123169, | |
| "step": 945, | |
| "valid_targets_mean": 6431.3, | |
| "valid_targets_min": 2767 | |
| }, | |
| { | |
| "epoch": 1.5175718849840254, | |
| "grad_norm": 0.44806482914752466, | |
| "learning_rate": 3.837144440783144e-05, | |
| "loss": 0.324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3175188899040222, | |
| "step": 950, | |
| "valid_targets_mean": 5893.8, | |
| "valid_targets_min": 2523 | |
| }, | |
| { | |
| "epoch": 1.5255591054313098, | |
| "grad_norm": 0.4407737876736171, | |
| "learning_rate": 3.833980683950431e-05, | |
| "loss": 0.3109, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29579806327819824, | |
| "step": 955, | |
| "valid_targets_mean": 5912.9, | |
| "valid_targets_min": 796 | |
| }, | |
| { | |
| "epoch": 1.5335463258785942, | |
| "grad_norm": 1.0255135517232294, | |
| "learning_rate": 3.8307878212612886e-05, | |
| "loss": 0.3051, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30610838532447815, | |
| "step": 960, | |
| "valid_targets_mean": 5395.6, | |
| "valid_targets_min": 2716 | |
| }, | |
| { | |
| "epoch": 1.5415335463258786, | |
| "grad_norm": 0.4832762194974854, | |
| "learning_rate": 3.827565903387461e-05, | |
| "loss": 0.3252, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3501489758491516, | |
| "step": 965, | |
| "valid_targets_mean": 6787.1, | |
| "valid_targets_min": 2506 | |
| }, | |
| { | |
| "epoch": 1.549520766773163, | |
| "grad_norm": 0.4647417874832435, | |
| "learning_rate": 3.82431498146181e-05, | |
| "loss": 0.3132, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.338148295879364, | |
| "step": 970, | |
| "valid_targets_mean": 5889.9, | |
| "valid_targets_min": 801 | |
| }, | |
| { | |
| "epoch": 1.5575079872204474, | |
| "grad_norm": 0.47041745696810133, | |
| "learning_rate": 3.821035107077499e-05, | |
| "loss": 0.3186, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31604963541030884, | |
| "step": 975, | |
| "valid_targets_mean": 6427.9, | |
| "valid_targets_min": 791 | |
| }, | |
| { | |
| "epoch": 1.5654952076677318, | |
| "grad_norm": 0.4433561631860064, | |
| "learning_rate": 3.817726332287179e-05, | |
| "loss": 0.2995, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31374531984329224, | |
| "step": 980, | |
| "valid_targets_mean": 5654.8, | |
| "valid_targets_min": 2490 | |
| }, | |
| { | |
| "epoch": 1.573482428115016, | |
| "grad_norm": 0.4258254460194381, | |
| "learning_rate": 3.814388709602155e-05, | |
| "loss": 0.321, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3178654909133911, | |
| "step": 985, | |
| "valid_targets_mean": 6410.1, | |
| "valid_targets_min": 1960 | |
| }, | |
| { | |
| "epoch": 1.5814696485623003, | |
| "grad_norm": 0.45920946337830315, | |
| "learning_rate": 3.8110222919915626e-05, | |
| "loss": 0.3153, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32802289724349976, | |
| "step": 990, | |
| "valid_targets_mean": 5907.4, | |
| "valid_targets_min": 2600 | |
| }, | |
| { | |
| "epoch": 1.5894568690095847, | |
| "grad_norm": 0.43270863350564, | |
| "learning_rate": 3.8076271328815175e-05, | |
| "loss": 0.3154, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33153438568115234, | |
| "step": 995, | |
| "valid_targets_mean": 6554.1, | |
| "valid_targets_min": 900 | |
| }, | |
| { | |
| "epoch": 1.5974440894568689, | |
| "grad_norm": 0.4134343030141776, | |
| "learning_rate": 3.804203286154275e-05, | |
| "loss": 0.3433, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33324405550956726, | |
| "step": 1000, | |
| "valid_targets_mean": 6344.6, | |
| "valid_targets_min": 2263 | |
| }, | |
| { | |
| "epoch": 1.6054313099041533, | |
| "grad_norm": 0.3879750639924769, | |
| "learning_rate": 3.800750806147371e-05, | |
| "loss": 0.3203, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3132684826850891, | |
| "step": 1005, | |
| "valid_targets_mean": 6414.2, | |
| "valid_targets_min": 2462 | |
| }, | |
| { | |
| "epoch": 1.6134185303514377, | |
| "grad_norm": 0.3896308978348557, | |
| "learning_rate": 3.79726974765276e-05, | |
| "loss": 0.3127, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2856582701206207, | |
| "step": 1010, | |
| "valid_targets_mean": 6503.0, | |
| "valid_targets_min": 2649 | |
| }, | |
| { | |
| "epoch": 1.621405750798722, | |
| "grad_norm": 0.46744506104212574, | |
| "learning_rate": 3.793760165915947e-05, | |
| "loss": 0.3419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3406218886375427, | |
| "step": 1015, | |
| "valid_targets_mean": 7023.1, | |
| "valid_targets_min": 2492 | |
| }, | |
| { | |
| "epoch": 1.6293929712460065, | |
| "grad_norm": 0.44448545878269796, | |
| "learning_rate": 3.7902221166351106e-05, | |
| "loss": 0.3136, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3256237506866455, | |
| "step": 1020, | |
| "valid_targets_mean": 6055.3, | |
| "valid_targets_min": 810 | |
| }, | |
| { | |
| "epoch": 1.6373801916932909, | |
| "grad_norm": 0.4180157843999033, | |
| "learning_rate": 3.786655655960216e-05, | |
| "loss": 0.326, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3308889865875244, | |
| "step": 1025, | |
| "valid_targets_mean": 6660.6, | |
| "valid_targets_min": 2515 | |
| }, | |
| { | |
| "epoch": 1.645367412140575, | |
| "grad_norm": 0.48433352456898443, | |
| "learning_rate": 3.7830608404921294e-05, | |
| "loss": 0.3166, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3051565885543823, | |
| "step": 1030, | |
| "valid_targets_mean": 6259.9, | |
| "valid_targets_min": 2567 | |
| }, | |
| { | |
| "epoch": 1.6533546325878594, | |
| "grad_norm": 0.42015380335295704, | |
| "learning_rate": 3.7794377272817144e-05, | |
| "loss": 0.3134, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31414514780044556, | |
| "step": 1035, | |
| "valid_targets_mean": 6254.2, | |
| "valid_targets_min": 2582 | |
| }, | |
| { | |
| "epoch": 1.6613418530351438, | |
| "grad_norm": 0.43307984350277, | |
| "learning_rate": 3.775786373828929e-05, | |
| "loss": 0.3225, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3277095854282379, | |
| "step": 1040, | |
| "valid_targets_mean": 5964.2, | |
| "valid_targets_min": 2539 | |
| }, | |
| { | |
| "epoch": 1.669329073482428, | |
| "grad_norm": 0.4197270288651504, | |
| "learning_rate": 3.7721068380819135e-05, | |
| "loss": 0.3052, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29990091919898987, | |
| "step": 1045, | |
| "valid_targets_mean": 5805.6, | |
| "valid_targets_min": 3309 | |
| }, | |
| { | |
| "epoch": 1.6773162939297124, | |
| "grad_norm": 0.40495916031919665, | |
| "learning_rate": 3.768399178436069e-05, | |
| "loss": 0.312, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3308406472206116, | |
| "step": 1050, | |
| "valid_targets_mean": 6970.0, | |
| "valid_targets_min": 2733 | |
| }, | |
| { | |
| "epoch": 1.6853035143769968, | |
| "grad_norm": 0.4628946489574963, | |
| "learning_rate": 3.764663453733135e-05, | |
| "loss": 0.3143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2960417866706848, | |
| "step": 1055, | |
| "valid_targets_mean": 5815.1, | |
| "valid_targets_min": 2716 | |
| }, | |
| { | |
| "epoch": 1.6932907348242812, | |
| "grad_norm": 0.44801679103858266, | |
| "learning_rate": 3.7608997232602475e-05, | |
| "loss": 0.3254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3382951021194458, | |
| "step": 1060, | |
| "valid_targets_mean": 5987.8, | |
| "valid_targets_min": 2576 | |
| }, | |
| { | |
| "epoch": 1.7012779552715656, | |
| "grad_norm": 0.47324633829935747, | |
| "learning_rate": 3.757108046749006e-05, | |
| "loss": 0.3317, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3450305461883545, | |
| "step": 1065, | |
| "valid_targets_mean": 6763.6, | |
| "valid_targets_min": 2203 | |
| }, | |
| { | |
| "epoch": 1.70926517571885, | |
| "grad_norm": 0.497787820430924, | |
| "learning_rate": 3.753288484374524e-05, | |
| "loss": 0.2913, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2988240420818329, | |
| "step": 1070, | |
| "valid_targets_mean": 6481.5, | |
| "valid_targets_min": 3583 | |
| }, | |
| { | |
| "epoch": 1.7172523961661343, | |
| "grad_norm": 0.4302760688441754, | |
| "learning_rate": 3.7494410967544674e-05, | |
| "loss": 0.3073, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24575269222259521, | |
| "step": 1075, | |
| "valid_targets_mean": 6111.4, | |
| "valid_targets_min": 3068 | |
| }, | |
| { | |
| "epoch": 1.7252396166134185, | |
| "grad_norm": 0.4137232538455785, | |
| "learning_rate": 3.745565944948103e-05, | |
| "loss": 0.3014, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32070547342300415, | |
| "step": 1080, | |
| "valid_targets_mean": 6265.9, | |
| "valid_targets_min": 2867 | |
| }, | |
| { | |
| "epoch": 1.733226837060703, | |
| "grad_norm": 0.40907502696932574, | |
| "learning_rate": 3.7416630904553205e-05, | |
| "loss": 0.3148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2615659832954407, | |
| "step": 1085, | |
| "valid_targets_mean": 5698.2, | |
| "valid_targets_min": 2641 | |
| }, | |
| { | |
| "epoch": 1.741214057507987, | |
| "grad_norm": 0.3971522654964072, | |
| "learning_rate": 3.737732595215663e-05, | |
| "loss": 0.3319, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3249293565750122, | |
| "step": 1090, | |
| "valid_targets_mean": 6821.3, | |
| "valid_targets_min": 723 | |
| }, | |
| { | |
| "epoch": 1.7492012779552715, | |
| "grad_norm": 0.4854687520948088, | |
| "learning_rate": 3.733774521607338e-05, | |
| "loss": 0.3364, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3668111562728882, | |
| "step": 1095, | |
| "valid_targets_mean": 6596.1, | |
| "valid_targets_min": 2775 | |
| }, | |
| { | |
| "epoch": 1.7571884984025559, | |
| "grad_norm": 0.4483168840019974, | |
| "learning_rate": 3.729788932446231e-05, | |
| "loss": 0.2911, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28388333320617676, | |
| "step": 1100, | |
| "valid_targets_mean": 5423.2, | |
| "valid_targets_min": 2690 | |
| }, | |
| { | |
| "epoch": 1.7651757188498403, | |
| "grad_norm": 0.4469234263080508, | |
| "learning_rate": 3.7257758909849107e-05, | |
| "loss": 0.2808, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29026705026626587, | |
| "step": 1105, | |
| "valid_targets_mean": 5643.3, | |
| "valid_targets_min": 2639 | |
| }, | |
| { | |
| "epoch": 1.7731629392971247, | |
| "grad_norm": 0.4149887827036156, | |
| "learning_rate": 3.72173546091162e-05, | |
| "loss": 0.3175, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35154473781585693, | |
| "step": 1110, | |
| "valid_targets_mean": 7066.4, | |
| "valid_targets_min": 3237 | |
| }, | |
| { | |
| "epoch": 1.781150159744409, | |
| "grad_norm": 0.43153531356494945, | |
| "learning_rate": 3.7176677063492685e-05, | |
| "loss": 0.3133, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34095972776412964, | |
| "step": 1115, | |
| "valid_targets_mean": 6402.8, | |
| "valid_targets_min": 2290 | |
| }, | |
| { | |
| "epoch": 1.7891373801916934, | |
| "grad_norm": 0.43938526986199267, | |
| "learning_rate": 3.713572691854414e-05, | |
| "loss": 0.3056, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30775901675224304, | |
| "step": 1120, | |
| "valid_targets_mean": 5813.9, | |
| "valid_targets_min": 2150 | |
| }, | |
| { | |
| "epoch": 1.7971246006389776, | |
| "grad_norm": 0.4806973090278574, | |
| "learning_rate": 3.709450482416239e-05, | |
| "loss": 0.3324, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.37904632091522217, | |
| "step": 1125, | |
| "valid_targets_mean": 6133.9, | |
| "valid_targets_min": 2613 | |
| }, | |
| { | |
| "epoch": 1.805111821086262, | |
| "grad_norm": 0.41778019347715056, | |
| "learning_rate": 3.7053011434555165e-05, | |
| "loss": 0.3129, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32432445883750916, | |
| "step": 1130, | |
| "valid_targets_mean": 6229.0, | |
| "valid_targets_min": 2318 | |
| }, | |
| { | |
| "epoch": 1.8130990415335462, | |
| "grad_norm": 0.4311293450610947, | |
| "learning_rate": 3.701124740823575e-05, | |
| "loss": 0.3098, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27453941106796265, | |
| "step": 1135, | |
| "valid_targets_mean": 5342.9, | |
| "valid_targets_min": 983 | |
| }, | |
| { | |
| "epoch": 1.8210862619808306, | |
| "grad_norm": 0.4318055982258119, | |
| "learning_rate": 3.696921340801253e-05, | |
| "loss": 0.3103, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32878047227859497, | |
| "step": 1140, | |
| "valid_targets_mean": 5969.9, | |
| "valid_targets_min": 825 | |
| }, | |
| { | |
| "epoch": 1.829073482428115, | |
| "grad_norm": 0.3872387213492191, | |
| "learning_rate": 3.6926910100978444e-05, | |
| "loss": 0.3163, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.277811735868454, | |
| "step": 1145, | |
| "valid_targets_mean": 6656.8, | |
| "valid_targets_min": 2209 | |
| }, | |
| { | |
| "epoch": 1.8370607028753994, | |
| "grad_norm": 0.39237022660792376, | |
| "learning_rate": 3.688433815850041e-05, | |
| "loss": 0.3207, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3204132318496704, | |
| "step": 1150, | |
| "valid_targets_mean": 6770.2, | |
| "valid_targets_min": 3197 | |
| }, | |
| { | |
| "epoch": 1.8450479233226837, | |
| "grad_norm": 0.4536366998935553, | |
| "learning_rate": 3.68414982562087e-05, | |
| "loss": 0.3257, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33859843015670776, | |
| "step": 1155, | |
| "valid_targets_mean": 5445.5, | |
| "valid_targets_min": 467 | |
| }, | |
| { | |
| "epoch": 1.8530351437699681, | |
| "grad_norm": 0.4427003638216911, | |
| "learning_rate": 3.679839107398618e-05, | |
| "loss": 0.3239, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2922360301017761, | |
| "step": 1160, | |
| "valid_targets_mean": 5173.2, | |
| "valid_targets_min": 1864 | |
| }, | |
| { | |
| "epoch": 1.8610223642172525, | |
| "grad_norm": 0.44874336566516054, | |
| "learning_rate": 3.6755017295957536e-05, | |
| "loss": 0.2915, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.34026211500167847, | |
| "step": 1165, | |
| "valid_targets_mean": 6032.8, | |
| "valid_targets_min": 2549 | |
| }, | |
| { | |
| "epoch": 1.8690095846645367, | |
| "grad_norm": 0.4146830932675077, | |
| "learning_rate": 3.67113776104784e-05, | |
| "loss": 0.3032, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3336482644081116, | |
| "step": 1170, | |
| "valid_targets_mean": 6547.4, | |
| "valid_targets_min": 3879 | |
| }, | |
| { | |
| "epoch": 1.876996805111821, | |
| "grad_norm": 0.44937230489151286, | |
| "learning_rate": 3.6667472710124475e-05, | |
| "loss": 0.3112, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24680504202842712, | |
| "step": 1175, | |
| "valid_targets_mean": 5101.9, | |
| "valid_targets_min": 2576 | |
| }, | |
| { | |
| "epoch": 1.8849840255591053, | |
| "grad_norm": 0.3930641951893354, | |
| "learning_rate": 3.6623303291680465e-05, | |
| "loss": 0.308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30171775817871094, | |
| "step": 1180, | |
| "valid_targets_mean": 6742.4, | |
| "valid_targets_min": 2942 | |
| }, | |
| { | |
| "epoch": 1.8929712460063897, | |
| "grad_norm": 0.4058562146225724, | |
| "learning_rate": 3.6578870056129086e-05, | |
| "loss": 0.3044, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2678030729293823, | |
| "step": 1185, | |
| "valid_targets_mean": 5225.1, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 1.900958466453674, | |
| "grad_norm": 0.4431573216461644, | |
| "learning_rate": 3.653417370863992e-05, | |
| "loss": 0.3125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33074915409088135, | |
| "step": 1190, | |
| "valid_targets_mean": 6745.6, | |
| "valid_targets_min": 914 | |
| }, | |
| { | |
| "epoch": 1.9089456869009584, | |
| "grad_norm": 0.4739777822806342, | |
| "learning_rate": 3.6489214958558206e-05, | |
| "loss": 0.3338, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3263131380081177, | |
| "step": 1195, | |
| "valid_targets_mean": 6518.8, | |
| "valid_targets_min": 2798 | |
| }, | |
| { | |
| "epoch": 1.9169329073482428, | |
| "grad_norm": 0.4181764178398567, | |
| "learning_rate": 3.644399451939358e-05, | |
| "loss": 0.3148, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.35324832797050476, | |
| "step": 1200, | |
| "valid_targets_mean": 6341.8, | |
| "valid_targets_min": 2985 | |
| }, | |
| { | |
| "epoch": 1.9249201277955272, | |
| "grad_norm": 0.45849955175649204, | |
| "learning_rate": 3.639851310880881e-05, | |
| "loss": 0.2992, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28246814012527466, | |
| "step": 1205, | |
| "valid_targets_mean": 4883.7, | |
| "valid_targets_min": 2135 | |
| }, | |
| { | |
| "epoch": 1.9329073482428116, | |
| "grad_norm": 0.41666842862548625, | |
| "learning_rate": 3.635277144860834e-05, | |
| "loss": 0.304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33422061800956726, | |
| "step": 1210, | |
| "valid_targets_mean": 7182.8, | |
| "valid_targets_min": 2897 | |
| }, | |
| { | |
| "epoch": 1.9408945686900958, | |
| "grad_norm": 0.40397761123191894, | |
| "learning_rate": 3.630677026472684e-05, | |
| "loss": 0.3107, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30174657702445984, | |
| "step": 1215, | |
| "valid_targets_mean": 6170.2, | |
| "valid_targets_min": 3070 | |
| }, | |
| { | |
| "epoch": 1.9488817891373802, | |
| "grad_norm": 0.49336580801673047, | |
| "learning_rate": 3.626051028721773e-05, | |
| "loss": 0.3214, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3574771285057068, | |
| "step": 1220, | |
| "valid_targets_mean": 6960.8, | |
| "valid_targets_min": 585 | |
| }, | |
| { | |
| "epoch": 1.9568690095846646, | |
| "grad_norm": 0.41814459322459196, | |
| "learning_rate": 3.621399225024156e-05, | |
| "loss": 0.307, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30000120401382446, | |
| "step": 1225, | |
| "valid_targets_mean": 5510.9, | |
| "valid_targets_min": 2455 | |
| }, | |
| { | |
| "epoch": 1.9648562300319488, | |
| "grad_norm": 0.46364740150979605, | |
| "learning_rate": 3.616721689205436e-05, | |
| "loss": 0.3129, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31898966431617737, | |
| "step": 1230, | |
| "valid_targets_mean": 6460.6, | |
| "valid_targets_min": 3381 | |
| }, | |
| { | |
| "epoch": 1.9728434504792332, | |
| "grad_norm": 0.41281283437569205, | |
| "learning_rate": 3.612018495499594e-05, | |
| "loss": 0.3204, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29932573437690735, | |
| "step": 1235, | |
| "valid_targets_mean": 6155.1, | |
| "valid_targets_min": 2518 | |
| }, | |
| { | |
| "epoch": 1.9808306709265175, | |
| "grad_norm": 0.38478921254611087, | |
| "learning_rate": 3.6072897185478074e-05, | |
| "loss": 0.3099, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3333500623703003, | |
| "step": 1240, | |
| "valid_targets_mean": 6937.1, | |
| "valid_targets_min": 3281 | |
| }, | |
| { | |
| "epoch": 1.988817891373802, | |
| "grad_norm": 0.40707630812679474, | |
| "learning_rate": 3.6025354333972714e-05, | |
| "loss": 0.3026, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.303952157497406, | |
| "step": 1245, | |
| "valid_targets_mean": 6159.0, | |
| "valid_targets_min": 2973 | |
| }, | |
| { | |
| "epoch": 1.9968051118210863, | |
| "grad_norm": 0.39592960242841935, | |
| "learning_rate": 3.597755715500002e-05, | |
| "loss": 0.3106, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31186556816101074, | |
| "step": 1250, | |
| "valid_targets_mean": 6116.2, | |
| "valid_targets_min": 1895 | |
| }, | |
| { | |
| "epoch": 2.0047923322683707, | |
| "grad_norm": 0.41792156554885485, | |
| "learning_rate": 3.592950640711642e-05, | |
| "loss": 0.3187, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.287545382976532, | |
| "step": 1255, | |
| "valid_targets_mean": 5944.2, | |
| "valid_targets_min": 3330 | |
| }, | |
| { | |
| "epoch": 2.012779552715655, | |
| "grad_norm": 0.4430009398726886, | |
| "learning_rate": 3.5881202852902543e-05, | |
| "loss": 0.3207, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3533841371536255, | |
| "step": 1260, | |
| "valid_targets_mean": 6138.2, | |
| "valid_targets_min": 943 | |
| }, | |
| { | |
| "epoch": 2.0207667731629395, | |
| "grad_norm": 0.411062807229767, | |
| "learning_rate": 3.583264725895117e-05, | |
| "loss": 0.3232, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.268288791179657, | |
| "step": 1265, | |
| "valid_targets_mean": 5394.2, | |
| "valid_targets_min": 815 | |
| }, | |
| { | |
| "epoch": 2.0287539936102235, | |
| "grad_norm": 0.4463056538287073, | |
| "learning_rate": 3.5783840395854984e-05, | |
| "loss": 0.2991, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3236192464828491, | |
| "step": 1270, | |
| "valid_targets_mean": 6318.4, | |
| "valid_targets_min": 1895 | |
| }, | |
| { | |
| "epoch": 2.036741214057508, | |
| "grad_norm": 0.3947869074138238, | |
| "learning_rate": 3.5734783038194446e-05, | |
| "loss": 0.3269, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2702856659889221, | |
| "step": 1275, | |
| "valid_targets_mean": 5745.1, | |
| "valid_targets_min": 2789 | |
| }, | |
| { | |
| "epoch": 2.0447284345047922, | |
| "grad_norm": 0.41212866110029256, | |
| "learning_rate": 3.5685475964525396e-05, | |
| "loss": 0.3212, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2972537875175476, | |
| "step": 1280, | |
| "valid_targets_mean": 5777.7, | |
| "valid_targets_min": 3647 | |
| }, | |
| { | |
| "epoch": 2.0527156549520766, | |
| "grad_norm": 0.4241225344226225, | |
| "learning_rate": 3.5635919957366775e-05, | |
| "loss": 0.2953, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3104124665260315, | |
| "step": 1285, | |
| "valid_targets_mean": 6188.8, | |
| "valid_targets_min": 3742 | |
| }, | |
| { | |
| "epoch": 2.060702875399361, | |
| "grad_norm": 0.38109403843268547, | |
| "learning_rate": 3.558611580318817e-05, | |
| "loss": 0.2943, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2603932321071625, | |
| "step": 1290, | |
| "valid_targets_mean": 6280.2, | |
| "valid_targets_min": 1872 | |
| }, | |
| { | |
| "epoch": 2.0686900958466454, | |
| "grad_norm": 0.41277040096601525, | |
| "learning_rate": 3.5536064292397334e-05, | |
| "loss": 0.2931, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2540165185928345, | |
| "step": 1295, | |
| "valid_targets_mean": 6095.9, | |
| "valid_targets_min": 2951 | |
| }, | |
| { | |
| "epoch": 2.07667731629393, | |
| "grad_norm": 0.3973586775986563, | |
| "learning_rate": 3.548576621932766e-05, | |
| "loss": 0.2781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2642344534397125, | |
| "step": 1300, | |
| "valid_targets_mean": 6032.9, | |
| "valid_targets_min": 3220 | |
| }, | |
| { | |
| "epoch": 2.084664536741214, | |
| "grad_norm": 0.40878484857375963, | |
| "learning_rate": 3.543522238222555e-05, | |
| "loss": 0.2986, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.297451376914978, | |
| "step": 1305, | |
| "valid_targets_mean": 6279.5, | |
| "valid_targets_min": 2750 | |
| }, | |
| { | |
| "epoch": 2.0926517571884986, | |
| "grad_norm": 0.4672305221587348, | |
| "learning_rate": 3.538443358323777e-05, | |
| "loss": 0.2934, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3238731920719147, | |
| "step": 1310, | |
| "valid_targets_mean": 7744.9, | |
| "valid_targets_min": 3456 | |
| }, | |
| { | |
| "epoch": 2.1006389776357826, | |
| "grad_norm": 0.40413277793027885, | |
| "learning_rate": 3.5333400628398696e-05, | |
| "loss": 0.3063, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2986423671245575, | |
| "step": 1315, | |
| "valid_targets_mean": 5989.9, | |
| "valid_targets_min": 559 | |
| }, | |
| { | |
| "epoch": 2.108626198083067, | |
| "grad_norm": 0.5005965680503643, | |
| "learning_rate": 3.5282124327617556e-05, | |
| "loss": 0.3308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3341161012649536, | |
| "step": 1320, | |
| "valid_targets_mean": 6632.9, | |
| "valid_targets_min": 2355 | |
| }, | |
| { | |
| "epoch": 2.1166134185303513, | |
| "grad_norm": 0.4050373028727438, | |
| "learning_rate": 3.5230605494665535e-05, | |
| "loss": 0.3008, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2830360531806946, | |
| "step": 1325, | |
| "valid_targets_mean": 5885.3, | |
| "valid_targets_min": 2348 | |
| }, | |
| { | |
| "epoch": 2.1246006389776357, | |
| "grad_norm": 0.44939933780253244, | |
| "learning_rate": 3.517884494716289e-05, | |
| "loss": 0.2889, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2633146643638611, | |
| "step": 1330, | |
| "valid_targets_mean": 5526.2, | |
| "valid_targets_min": 3174 | |
| }, | |
| { | |
| "epoch": 2.13258785942492, | |
| "grad_norm": 0.4118681726495191, | |
| "learning_rate": 3.512684350656596e-05, | |
| "loss": 0.2977, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.298631489276886, | |
| "step": 1335, | |
| "valid_targets_mean": 6233.2, | |
| "valid_targets_min": 3421 | |
| }, | |
| { | |
| "epoch": 2.1405750798722045, | |
| "grad_norm": 1.0422487988396603, | |
| "learning_rate": 3.507460199815414e-05, | |
| "loss": 0.2922, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33046334981918335, | |
| "step": 1340, | |
| "valid_targets_mean": 6987.6, | |
| "valid_targets_min": 3721 | |
| }, | |
| { | |
| "epoch": 2.148562300319489, | |
| "grad_norm": 0.3836997756221973, | |
| "learning_rate": 3.5022121251016786e-05, | |
| "loss": 0.3047, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24991145730018616, | |
| "step": 1345, | |
| "valid_targets_mean": 5697.1, | |
| "valid_targets_min": 2917 | |
| }, | |
| { | |
| "epoch": 2.1565495207667733, | |
| "grad_norm": 0.6546701965938719, | |
| "learning_rate": 3.4969402098040015e-05, | |
| "loss": 0.2889, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31468701362609863, | |
| "step": 1350, | |
| "valid_targets_mean": 6064.4, | |
| "valid_targets_min": 2789 | |
| }, | |
| { | |
| "epoch": 2.1645367412140577, | |
| "grad_norm": 0.3944810242015382, | |
| "learning_rate": 3.491644537589356e-05, | |
| "loss": 0.2927, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28500664234161377, | |
| "step": 1355, | |
| "valid_targets_mean": 6117.1, | |
| "valid_targets_min": 1882 | |
| }, | |
| { | |
| "epoch": 2.1725239616613417, | |
| "grad_norm": 0.4545993238058608, | |
| "learning_rate": 3.486325192501743e-05, | |
| "loss": 0.3038, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29767951369285583, | |
| "step": 1360, | |
| "valid_targets_mean": 5651.1, | |
| "valid_targets_min": 1923 | |
| }, | |
| { | |
| "epoch": 2.180511182108626, | |
| "grad_norm": 0.44851310542134765, | |
| "learning_rate": 3.480982258960859e-05, | |
| "loss": 0.3008, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2720685601234436, | |
| "step": 1365, | |
| "valid_targets_mean": 5454.4, | |
| "valid_targets_min": 585 | |
| }, | |
| { | |
| "epoch": 2.1884984025559104, | |
| "grad_norm": 0.43846655333595497, | |
| "learning_rate": 3.47561582176076e-05, | |
| "loss": 0.3092, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3084791898727417, | |
| "step": 1370, | |
| "valid_targets_mean": 5129.3, | |
| "valid_targets_min": 2568 | |
| }, | |
| { | |
| "epoch": 2.196485623003195, | |
| "grad_norm": 0.41565024304411385, | |
| "learning_rate": 3.470225966068507e-05, | |
| "loss": 0.3174, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3107113242149353, | |
| "step": 1375, | |
| "valid_targets_mean": 6231.5, | |
| "valid_targets_min": 3169 | |
| }, | |
| { | |
| "epoch": 2.2044728434504792, | |
| "grad_norm": 0.40565775761435996, | |
| "learning_rate": 3.464812777422826e-05, | |
| "loss": 0.3003, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2875164747238159, | |
| "step": 1380, | |
| "valid_targets_mean": 5690.8, | |
| "valid_targets_min": 2378 | |
| }, | |
| { | |
| "epoch": 2.2124600638977636, | |
| "grad_norm": 0.4139723497948051, | |
| "learning_rate": 3.459376341732741e-05, | |
| "loss": 0.3045, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3445984721183777, | |
| "step": 1385, | |
| "valid_targets_mean": 6626.8, | |
| "valid_targets_min": 2600 | |
| }, | |
| { | |
| "epoch": 2.220447284345048, | |
| "grad_norm": 0.4063717272941482, | |
| "learning_rate": 3.453916745276213e-05, | |
| "loss": 0.3177, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31706178188323975, | |
| "step": 1390, | |
| "valid_targets_mean": 6249.3, | |
| "valid_targets_min": 2888 | |
| }, | |
| { | |
| "epoch": 2.2284345047923324, | |
| "grad_norm": 0.4117254694952023, | |
| "learning_rate": 3.4484340746987745e-05, | |
| "loss": 0.3001, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31049805879592896, | |
| "step": 1395, | |
| "valid_targets_mean": 6773.8, | |
| "valid_targets_min": 2637 | |
| }, | |
| { | |
| "epoch": 2.236421725239617, | |
| "grad_norm": 0.41800277829926236, | |
| "learning_rate": 3.4429284170121494e-05, | |
| "loss": 0.289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2911890149116516, | |
| "step": 1400, | |
| "valid_targets_mean": 5651.9, | |
| "valid_targets_min": 1410 | |
| }, | |
| { | |
| "epoch": 2.244408945686901, | |
| "grad_norm": 0.42298275156812076, | |
| "learning_rate": 3.4373998595928764e-05, | |
| "loss": 0.2967, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2631995379924774, | |
| "step": 1405, | |
| "valid_targets_mean": 5472.6, | |
| "valid_targets_min": 2827 | |
| }, | |
| { | |
| "epoch": 2.252396166134185, | |
| "grad_norm": 0.40008121266680086, | |
| "learning_rate": 3.431848490180919e-05, | |
| "loss": 0.3001, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25981318950653076, | |
| "step": 1410, | |
| "valid_targets_mean": 6185.2, | |
| "valid_targets_min": 4066 | |
| }, | |
| { | |
| "epoch": 2.2603833865814695, | |
| "grad_norm": 0.41760685130038744, | |
| "learning_rate": 3.4262743968782736e-05, | |
| "loss": 0.3167, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30810973048210144, | |
| "step": 1415, | |
| "valid_targets_mean": 6402.1, | |
| "valid_targets_min": 2832 | |
| }, | |
| { | |
| "epoch": 2.268370607028754, | |
| "grad_norm": 0.4333891981001015, | |
| "learning_rate": 3.4206776681475737e-05, | |
| "loss": 0.3022, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31947463750839233, | |
| "step": 1420, | |
| "valid_targets_mean": 6464.6, | |
| "valid_targets_min": 1009 | |
| }, | |
| { | |
| "epoch": 2.2763578274760383, | |
| "grad_norm": 0.42924124858599033, | |
| "learning_rate": 3.415058392810682e-05, | |
| "loss": 0.2837, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27959945797920227, | |
| "step": 1425, | |
| "valid_targets_mean": 5761.2, | |
| "valid_targets_min": 3055 | |
| }, | |
| { | |
| "epoch": 2.2843450479233227, | |
| "grad_norm": 0.4130181036105482, | |
| "learning_rate": 3.409416660047286e-05, | |
| "loss": 0.2964, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31207722425460815, | |
| "step": 1430, | |
| "valid_targets_mean": 5915.9, | |
| "valid_targets_min": 3126 | |
| }, | |
| { | |
| "epoch": 2.292332268370607, | |
| "grad_norm": 0.4072084020911982, | |
| "learning_rate": 3.403752559393477e-05, | |
| "loss": 0.297, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2991720736026764, | |
| "step": 1435, | |
| "valid_targets_mean": 6472.4, | |
| "valid_targets_min": 2477 | |
| }, | |
| { | |
| "epoch": 2.3003194888178915, | |
| "grad_norm": 0.40968105504533986, | |
| "learning_rate": 3.398066180740332e-05, | |
| "loss": 0.3092, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32481005787849426, | |
| "step": 1440, | |
| "valid_targets_mean": 6518.8, | |
| "valid_targets_min": 2771 | |
| }, | |
| { | |
| "epoch": 2.308306709265176, | |
| "grad_norm": 0.414972589481696, | |
| "learning_rate": 3.39235761433249e-05, | |
| "loss": 0.2969, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27577123045921326, | |
| "step": 1445, | |
| "valid_targets_mean": 5515.1, | |
| "valid_targets_min": 2036 | |
| }, | |
| { | |
| "epoch": 2.31629392971246, | |
| "grad_norm": 0.39386755603061235, | |
| "learning_rate": 3.386626950766713e-05, | |
| "loss": 0.2949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31666386127471924, | |
| "step": 1450, | |
| "valid_targets_mean": 6198.9, | |
| "valid_targets_min": 2795 | |
| }, | |
| { | |
| "epoch": 2.3242811501597442, | |
| "grad_norm": 0.41309475612531094, | |
| "learning_rate": 3.380874280990454e-05, | |
| "loss": 0.2805, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2711862325668335, | |
| "step": 1455, | |
| "valid_targets_mean": 5727.2, | |
| "valid_targets_min": 2576 | |
| }, | |
| { | |
| "epoch": 2.3322683706070286, | |
| "grad_norm": 0.39432974908476837, | |
| "learning_rate": 3.37509969630041e-05, | |
| "loss": 0.3161, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29293060302734375, | |
| "step": 1460, | |
| "valid_targets_mean": 6670.6, | |
| "valid_targets_min": 2540 | |
| }, | |
| { | |
| "epoch": 2.340255591054313, | |
| "grad_norm": 0.4056409349942693, | |
| "learning_rate": 3.369303288341078e-05, | |
| "loss": 0.302, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3370053172111511, | |
| "step": 1465, | |
| "valid_targets_mean": 7109.1, | |
| "valid_targets_min": 2308 | |
| }, | |
| { | |
| "epoch": 2.3482428115015974, | |
| "grad_norm": 0.4378797400057744, | |
| "learning_rate": 3.363485149103292e-05, | |
| "loss": 0.2824, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30367785692214966, | |
| "step": 1470, | |
| "valid_targets_mean": 6054.6, | |
| "valid_targets_min": 2666 | |
| }, | |
| { | |
| "epoch": 2.356230031948882, | |
| "grad_norm": 0.42505426763410487, | |
| "learning_rate": 3.357645370922772e-05, | |
| "loss": 0.2823, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2723078727722168, | |
| "step": 1475, | |
| "valid_targets_mean": 5148.9, | |
| "valid_targets_min": 961 | |
| }, | |
| { | |
| "epoch": 2.364217252396166, | |
| "grad_norm": 0.4442040999216026, | |
| "learning_rate": 3.351784046478655e-05, | |
| "loss": 0.2894, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3224413990974426, | |
| "step": 1480, | |
| "valid_targets_mean": 5366.8, | |
| "valid_targets_min": 846 | |
| }, | |
| { | |
| "epoch": 2.3722044728434506, | |
| "grad_norm": 0.41391095365122316, | |
| "learning_rate": 3.345901268792022e-05, | |
| "loss": 0.2757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26531875133514404, | |
| "step": 1485, | |
| "valid_targets_mean": 5860.0, | |
| "valid_targets_min": 3087 | |
| }, | |
| { | |
| "epoch": 2.380191693290735, | |
| "grad_norm": 0.4202326491308265, | |
| "learning_rate": 3.3399971312244255e-05, | |
| "loss": 0.2787, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27773553133010864, | |
| "step": 1490, | |
| "valid_targets_mean": 6156.2, | |
| "valid_targets_min": 3272 | |
| }, | |
| { | |
| "epoch": 2.3881789137380194, | |
| "grad_norm": 0.379482844969331, | |
| "learning_rate": 3.334071727476406e-05, | |
| "loss": 0.2999, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3333979845046997, | |
| "step": 1495, | |
| "valid_targets_mean": 7782.3, | |
| "valid_targets_min": 3267 | |
| }, | |
| { | |
| "epoch": 2.3961661341853033, | |
| "grad_norm": 0.3885894209226014, | |
| "learning_rate": 3.3281251515860035e-05, | |
| "loss": 0.2874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29636138677597046, | |
| "step": 1500, | |
| "valid_targets_mean": 6095.9, | |
| "valid_targets_min": 2213 | |
| }, | |
| { | |
| "epoch": 2.4041533546325877, | |
| "grad_norm": 0.4067090580510372, | |
| "learning_rate": 3.3221574979272676e-05, | |
| "loss": 0.3008, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30467158555984497, | |
| "step": 1505, | |
| "valid_targets_mean": 5867.9, | |
| "valid_targets_min": 2024 | |
| }, | |
| { | |
| "epoch": 2.412140575079872, | |
| "grad_norm": 0.3955152438449039, | |
| "learning_rate": 3.316168861208759e-05, | |
| "loss": 0.2878, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28427553176879883, | |
| "step": 1510, | |
| "valid_targets_mean": 6161.8, | |
| "valid_targets_min": 3434 | |
| }, | |
| { | |
| "epoch": 2.4201277955271565, | |
| "grad_norm": 0.40842681677984144, | |
| "learning_rate": 3.310159336472047e-05, | |
| "loss": 0.3065, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29921168088912964, | |
| "step": 1515, | |
| "valid_targets_mean": 5621.6, | |
| "valid_targets_min": 2121 | |
| }, | |
| { | |
| "epoch": 2.428115015974441, | |
| "grad_norm": 0.417870257055718, | |
| "learning_rate": 3.304129019090198e-05, | |
| "loss": 0.2807, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2757473587989807, | |
| "step": 1520, | |
| "valid_targets_mean": 5241.2, | |
| "valid_targets_min": 2836 | |
| }, | |
| { | |
| "epoch": 2.4361022364217253, | |
| "grad_norm": 0.4302500279708224, | |
| "learning_rate": 3.298078004766267e-05, | |
| "loss": 0.2794, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26710593700408936, | |
| "step": 1525, | |
| "valid_targets_mean": 5666.2, | |
| "valid_targets_min": 2428 | |
| }, | |
| { | |
| "epoch": 2.4440894568690097, | |
| "grad_norm": 0.4501996501918382, | |
| "learning_rate": 3.292006389531774e-05, | |
| "loss": 0.2887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28270119428634644, | |
| "step": 1530, | |
| "valid_targets_mean": 5234.4, | |
| "valid_targets_min": 805 | |
| }, | |
| { | |
| "epoch": 2.452076677316294, | |
| "grad_norm": 0.42818391902259767, | |
| "learning_rate": 3.2859142697451835e-05, | |
| "loss": 0.2932, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29484790563583374, | |
| "step": 1535, | |
| "valid_targets_mean": 5797.1, | |
| "valid_targets_min": 571 | |
| }, | |
| { | |
| "epoch": 2.460063897763578, | |
| "grad_norm": 0.4065935717148912, | |
| "learning_rate": 3.279801742090372e-05, | |
| "loss": 0.2818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24434958398342133, | |
| "step": 1540, | |
| "valid_targets_mean": 4887.8, | |
| "valid_targets_min": 933 | |
| }, | |
| { | |
| "epoch": 2.4680511182108624, | |
| "grad_norm": 0.4230084055500473, | |
| "learning_rate": 3.2736689035750975e-05, | |
| "loss": 0.3073, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30077362060546875, | |
| "step": 1545, | |
| "valid_targets_mean": 5980.6, | |
| "valid_targets_min": 2794 | |
| }, | |
| { | |
| "epoch": 2.476038338658147, | |
| "grad_norm": 0.4647609067169842, | |
| "learning_rate": 3.2675158515294554e-05, | |
| "loss": 0.2963, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32127833366394043, | |
| "step": 1550, | |
| "valid_targets_mean": 5382.3, | |
| "valid_targets_min": 2588 | |
| }, | |
| { | |
| "epoch": 2.484025559105431, | |
| "grad_norm": 0.4967832988758696, | |
| "learning_rate": 3.2613426836043386e-05, | |
| "loss": 0.285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28116708993911743, | |
| "step": 1555, | |
| "valid_targets_mean": 5411.2, | |
| "valid_targets_min": 1013 | |
| }, | |
| { | |
| "epoch": 2.4920127795527156, | |
| "grad_norm": 0.4332548335288608, | |
| "learning_rate": 3.2551494977698844e-05, | |
| "loss": 0.315, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30997970700263977, | |
| "step": 1560, | |
| "valid_targets_mean": 5944.8, | |
| "valid_targets_min": 3291 | |
| }, | |
| { | |
| "epoch": 2.5, | |
| "grad_norm": 0.4304671738244861, | |
| "learning_rate": 3.248936392313921e-05, | |
| "loss": 0.3263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32509395480155945, | |
| "step": 1565, | |
| "valid_targets_mean": 5927.1, | |
| "valid_targets_min": 1508 | |
| }, | |
| { | |
| "epoch": 2.5079872204472844, | |
| "grad_norm": 0.3697235774628339, | |
| "learning_rate": 3.2427034658404056e-05, | |
| "loss": 0.3004, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29154810309410095, | |
| "step": 1570, | |
| "valid_targets_mean": 6942.3, | |
| "valid_targets_min": 2995 | |
| }, | |
| { | |
| "epoch": 2.515974440894569, | |
| "grad_norm": 0.38696749765054667, | |
| "learning_rate": 3.236450817267863e-05, | |
| "loss": 0.292, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.262685626745224, | |
| "step": 1575, | |
| "valid_targets_mean": 6033.0, | |
| "valid_targets_min": 2420 | |
| }, | |
| { | |
| "epoch": 2.523961661341853, | |
| "grad_norm": 0.39523138056393786, | |
| "learning_rate": 3.230178545827814e-05, | |
| "loss": 0.3121, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2954481542110443, | |
| "step": 1580, | |
| "valid_targets_mean": 5736.6, | |
| "valid_targets_min": 2292 | |
| }, | |
| { | |
| "epoch": 2.5319488817891376, | |
| "grad_norm": 0.3855605426323756, | |
| "learning_rate": 3.223886751063201e-05, | |
| "loss": 0.2915, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26660865545272827, | |
| "step": 1585, | |
| "valid_targets_mean": 5837.8, | |
| "valid_targets_min": 2256 | |
| }, | |
| { | |
| "epoch": 2.539936102236422, | |
| "grad_norm": 0.4527881383676555, | |
| "learning_rate": 3.2175755328268054e-05, | |
| "loss": 0.2919, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3090267777442932, | |
| "step": 1590, | |
| "valid_targets_mean": 4808.8, | |
| "valid_targets_min": 937 | |
| }, | |
| { | |
| "epoch": 2.547923322683706, | |
| "grad_norm": 0.4265416469217226, | |
| "learning_rate": 3.2112449912796675e-05, | |
| "loss": 0.3158, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3253001570701599, | |
| "step": 1595, | |
| "valid_targets_mean": 5822.6, | |
| "valid_targets_min": 2179 | |
| }, | |
| { | |
| "epoch": 2.5559105431309903, | |
| "grad_norm": 0.48175798957484584, | |
| "learning_rate": 3.204895226889494e-05, | |
| "loss": 0.2905, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29204100370407104, | |
| "step": 1600, | |
| "valid_targets_mean": 5364.2, | |
| "valid_targets_min": 2913 | |
| }, | |
| { | |
| "epoch": 2.5638977635782747, | |
| "grad_norm": 0.39676603316787096, | |
| "learning_rate": 3.1985263404290634e-05, | |
| "loss": 0.3147, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3155876398086548, | |
| "step": 1605, | |
| "valid_targets_mean": 6530.4, | |
| "valid_targets_min": 2430 | |
| }, | |
| { | |
| "epoch": 2.571884984025559, | |
| "grad_norm": 0.4028973606897274, | |
| "learning_rate": 3.19213843297463e-05, | |
| "loss": 0.299, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3220685124397278, | |
| "step": 1610, | |
| "valid_targets_mean": 6686.6, | |
| "valid_targets_min": 3500 | |
| }, | |
| { | |
| "epoch": 2.5798722044728435, | |
| "grad_norm": 0.3987795919287704, | |
| "learning_rate": 3.1857316059043144e-05, | |
| "loss": 0.2914, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2867772579193115, | |
| "step": 1615, | |
| "valid_targets_mean": 6227.4, | |
| "valid_targets_min": 3514 | |
| }, | |
| { | |
| "epoch": 2.587859424920128, | |
| "grad_norm": 0.4064361965726967, | |
| "learning_rate": 3.1793059608964986e-05, | |
| "loss": 0.3179, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33910614252090454, | |
| "step": 1620, | |
| "valid_targets_mean": 6681.3, | |
| "valid_targets_min": 2759 | |
| }, | |
| { | |
| "epoch": 2.5958466453674123, | |
| "grad_norm": 0.4319621634078842, | |
| "learning_rate": 3.172861599928212e-05, | |
| "loss": 0.3002, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31038302183151245, | |
| "step": 1625, | |
| "valid_targets_mean": 6010.4, | |
| "valid_targets_min": 2418 | |
| }, | |
| { | |
| "epoch": 2.6038338658146962, | |
| "grad_norm": 0.41219848105942675, | |
| "learning_rate": 3.166398625273512e-05, | |
| "loss": 0.3198, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3155521750450134, | |
| "step": 1630, | |
| "valid_targets_mean": 6621.3, | |
| "valid_targets_min": 3548 | |
| }, | |
| { | |
| "epoch": 2.6118210862619806, | |
| "grad_norm": 0.4004060588134791, | |
| "learning_rate": 3.15991713950186e-05, | |
| "loss": 0.2855, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25764551758766174, | |
| "step": 1635, | |
| "valid_targets_mean": 5901.2, | |
| "valid_targets_min": 2397 | |
| }, | |
| { | |
| "epoch": 2.619808306709265, | |
| "grad_norm": 0.40358054212733874, | |
| "learning_rate": 3.153417245476495e-05, | |
| "loss": 0.3103, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.322949081659317, | |
| "step": 1640, | |
| "valid_targets_mean": 6593.8, | |
| "valid_targets_min": 680 | |
| }, | |
| { | |
| "epoch": 2.6277955271565494, | |
| "grad_norm": 0.3918367671686749, | |
| "learning_rate": 3.146899046352804e-05, | |
| "loss": 0.3003, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30850347876548767, | |
| "step": 1645, | |
| "valid_targets_mean": 6448.1, | |
| "valid_targets_min": 653 | |
| }, | |
| { | |
| "epoch": 2.635782747603834, | |
| "grad_norm": 0.40506591753598487, | |
| "learning_rate": 3.140362645576676e-05, | |
| "loss": 0.2876, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2981518507003784, | |
| "step": 1650, | |
| "valid_targets_mean": 6171.4, | |
| "valid_targets_min": 3135 | |
| }, | |
| { | |
| "epoch": 2.643769968051118, | |
| "grad_norm": 0.4099656579666407, | |
| "learning_rate": 3.133808146882871e-05, | |
| "loss": 0.3226, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28401097655296326, | |
| "step": 1655, | |
| "valid_targets_mean": 5184.2, | |
| "valid_targets_min": 2727 | |
| }, | |
| { | |
| "epoch": 2.6517571884984026, | |
| "grad_norm": 0.400029838346055, | |
| "learning_rate": 3.1272356542933654e-05, | |
| "loss": 0.2896, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2984737753868103, | |
| "step": 1660, | |
| "valid_targets_mean": 6822.6, | |
| "valid_targets_min": 3295 | |
| }, | |
| { | |
| "epoch": 2.659744408945687, | |
| "grad_norm": 0.4115873747168827, | |
| "learning_rate": 3.120645272115707e-05, | |
| "loss": 0.2972, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2924659848213196, | |
| "step": 1665, | |
| "valid_targets_mean": 5307.4, | |
| "valid_targets_min": 992 | |
| }, | |
| { | |
| "epoch": 2.6677316293929714, | |
| "grad_norm": 0.4145108902659878, | |
| "learning_rate": 3.114037104941355e-05, | |
| "loss": 0.3125, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26558899879455566, | |
| "step": 1670, | |
| "valid_targets_mean": 4796.6, | |
| "valid_targets_min": 791 | |
| }, | |
| { | |
| "epoch": 2.6757188498402558, | |
| "grad_norm": 0.40102644198181403, | |
| "learning_rate": 3.1074112576440236e-05, | |
| "loss": 0.2857, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29855072498321533, | |
| "step": 1675, | |
| "valid_targets_mean": 6069.5, | |
| "valid_targets_min": 2344 | |
| }, | |
| { | |
| "epoch": 2.68370607028754, | |
| "grad_norm": 0.40569956620975584, | |
| "learning_rate": 3.1007678353780154e-05, | |
| "loss": 0.2787, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27503690123558044, | |
| "step": 1680, | |
| "valid_targets_mean": 5470.2, | |
| "valid_targets_min": 2440 | |
| }, | |
| { | |
| "epoch": 2.6916932907348246, | |
| "grad_norm": 0.4261646603135661, | |
| "learning_rate": 3.094106943576553e-05, | |
| "loss": 0.2873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3134315609931946, | |
| "step": 1685, | |
| "valid_targets_mean": 5470.4, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 2.6996805111821085, | |
| "grad_norm": 0.43124548026519477, | |
| "learning_rate": 3.087428687950108e-05, | |
| "loss": 0.3039, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32255256175994873, | |
| "step": 1690, | |
| "valid_targets_mean": 6049.7, | |
| "valid_targets_min": 3044 | |
| }, | |
| { | |
| "epoch": 2.707667731629393, | |
| "grad_norm": 0.39556638955134604, | |
| "learning_rate": 3.080733174484719e-05, | |
| "loss": 0.2816, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29055914282798767, | |
| "step": 1695, | |
| "valid_targets_mean": 6225.1, | |
| "valid_targets_min": 3495 | |
| }, | |
| { | |
| "epoch": 2.7156549520766773, | |
| "grad_norm": 0.41711809099528485, | |
| "learning_rate": 3.074020509440313e-05, | |
| "loss": 0.2836, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2818969190120697, | |
| "step": 1700, | |
| "valid_targets_mean": 5565.9, | |
| "valid_targets_min": 2662 | |
| }, | |
| { | |
| "epoch": 2.7236421725239617, | |
| "grad_norm": 0.4006789123272051, | |
| "learning_rate": 3.0672907993490183e-05, | |
| "loss": 0.3054, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30483752489089966, | |
| "step": 1705, | |
| "valid_targets_mean": 5501.5, | |
| "valid_targets_min": 751 | |
| }, | |
| { | |
| "epoch": 2.731629392971246, | |
| "grad_norm": 0.42940795774418195, | |
| "learning_rate": 3.060544151013473e-05, | |
| "loss": 0.3334, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3312098979949951, | |
| "step": 1710, | |
| "valid_targets_mean": 6130.9, | |
| "valid_targets_min": 2233 | |
| }, | |
| { | |
| "epoch": 2.7396166134185305, | |
| "grad_norm": 0.3695366272503606, | |
| "learning_rate": 3.053780671505132e-05, | |
| "loss": 0.293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26342782378196716, | |
| "step": 1715, | |
| "valid_targets_mean": 6225.9, | |
| "valid_targets_min": 2999 | |
| }, | |
| { | |
| "epoch": 2.747603833865815, | |
| "grad_norm": 0.4522720139205641, | |
| "learning_rate": 3.047000468162563e-05, | |
| "loss": 0.2886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3008291721343994, | |
| "step": 1720, | |
| "valid_targets_mean": 6036.2, | |
| "valid_targets_min": 3021 | |
| }, | |
| { | |
| "epoch": 2.755591054313099, | |
| "grad_norm": 0.3974992857281924, | |
| "learning_rate": 3.0402036485897496e-05, | |
| "loss": 0.3028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31188899278640747, | |
| "step": 1725, | |
| "valid_targets_mean": 6015.4, | |
| "valid_targets_min": 2923 | |
| }, | |
| { | |
| "epoch": 2.763578274760383, | |
| "grad_norm": 0.3827833868618921, | |
| "learning_rate": 3.0333903206543786e-05, | |
| "loss": 0.2991, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2514100670814514, | |
| "step": 1730, | |
| "valid_targets_mean": 5808.3, | |
| "valid_targets_min": 2649 | |
| }, | |
| { | |
| "epoch": 2.7715654952076676, | |
| "grad_norm": 0.3983620084251882, | |
| "learning_rate": 3.02656059248613e-05, | |
| "loss": 0.3143, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31184884905815125, | |
| "step": 1735, | |
| "valid_targets_mean": 6261.7, | |
| "valid_targets_min": 2357 | |
| }, | |
| { | |
| "epoch": 2.779552715654952, | |
| "grad_norm": 0.37232187949418283, | |
| "learning_rate": 3.0197145724749615e-05, | |
| "loss": 0.3028, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29378968477249146, | |
| "step": 1740, | |
| "valid_targets_mean": 6060.1, | |
| "valid_targets_min": 2110 | |
| }, | |
| { | |
| "epoch": 2.7875399361022364, | |
| "grad_norm": 0.37342409642644414, | |
| "learning_rate": 3.0128523692693865e-05, | |
| "loss": 0.2782, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2689356803894043, | |
| "step": 1745, | |
| "valid_targets_mean": 5888.7, | |
| "valid_targets_min": 3108 | |
| }, | |
| { | |
| "epoch": 2.7955271565495208, | |
| "grad_norm": 0.4319258089757981, | |
| "learning_rate": 3.005974091774751e-05, | |
| "loss": 0.2951, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3294171690940857, | |
| "step": 1750, | |
| "valid_targets_mean": 5912.4, | |
| "valid_targets_min": 2948 | |
| }, | |
| { | |
| "epoch": 2.803514376996805, | |
| "grad_norm": 0.4950304413074886, | |
| "learning_rate": 2.999079849151505e-05, | |
| "loss": 0.3025, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28061580657958984, | |
| "step": 1755, | |
| "valid_targets_mean": 5471.9, | |
| "valid_targets_min": 3185 | |
| }, | |
| { | |
| "epoch": 2.8115015974440896, | |
| "grad_norm": 0.40387845935919453, | |
| "learning_rate": 2.992169750813471e-05, | |
| "loss": 0.307, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30962783098220825, | |
| "step": 1760, | |
| "valid_targets_mean": 5990.1, | |
| "valid_targets_min": 2765 | |
| }, | |
| { | |
| "epoch": 2.819488817891374, | |
| "grad_norm": 0.375146343362634, | |
| "learning_rate": 2.985243906426106e-05, | |
| "loss": 0.3021, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2787879705429077, | |
| "step": 1765, | |
| "valid_targets_mean": 6204.7, | |
| "valid_targets_min": 3463 | |
| }, | |
| { | |
| "epoch": 2.8274760383386583, | |
| "grad_norm": 0.41132393338805634, | |
| "learning_rate": 2.9783024259047617e-05, | |
| "loss": 0.2949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2528681755065918, | |
| "step": 1770, | |
| "valid_targets_mean": 4946.7, | |
| "valid_targets_min": 3078 | |
| }, | |
| { | |
| "epoch": 2.8354632587859427, | |
| "grad_norm": 0.46376479090127043, | |
| "learning_rate": 2.971345419412941e-05, | |
| "loss": 0.3024, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3507474362850189, | |
| "step": 1775, | |
| "valid_targets_mean": 5836.9, | |
| "valid_targets_min": 2150 | |
| }, | |
| { | |
| "epoch": 2.8434504792332267, | |
| "grad_norm": 0.4134070480438868, | |
| "learning_rate": 2.964372997360548e-05, | |
| "loss": 0.285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29295796155929565, | |
| "step": 1780, | |
| "valid_targets_mean": 5649.1, | |
| "valid_targets_min": 2482 | |
| }, | |
| { | |
| "epoch": 2.851437699680511, | |
| "grad_norm": 0.39748516497602365, | |
| "learning_rate": 2.957385270402137e-05, | |
| "loss": 0.2735, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2671281397342682, | |
| "step": 1785, | |
| "valid_targets_mean": 5237.8, | |
| "valid_targets_min": 2232 | |
| }, | |
| { | |
| "epoch": 2.8594249201277955, | |
| "grad_norm": 0.40512422862965664, | |
| "learning_rate": 2.9503823494351565e-05, | |
| "loss": 0.289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29251593351364136, | |
| "step": 1790, | |
| "valid_targets_mean": 5566.1, | |
| "valid_targets_min": 2035 | |
| }, | |
| { | |
| "epoch": 2.86741214057508, | |
| "grad_norm": 0.3954754492571278, | |
| "learning_rate": 2.9433643455981874e-05, | |
| "loss": 0.2995, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29588550329208374, | |
| "step": 1795, | |
| "valid_targets_mean": 6322.9, | |
| "valid_targets_min": 1836 | |
| }, | |
| { | |
| "epoch": 2.8753993610223643, | |
| "grad_norm": 0.4030143180636043, | |
| "learning_rate": 2.9363313702691827e-05, | |
| "loss": 0.308, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28367578983306885, | |
| "step": 1800, | |
| "valid_targets_mean": 6224.6, | |
| "valid_targets_min": 3956 | |
| }, | |
| { | |
| "epoch": 2.8833865814696487, | |
| "grad_norm": 0.43565145942552674, | |
| "learning_rate": 2.9292835350636957e-05, | |
| "loss": 0.301, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.321044385433197, | |
| "step": 1805, | |
| "valid_targets_mean": 6034.8, | |
| "valid_targets_min": 2329 | |
| }, | |
| { | |
| "epoch": 2.891373801916933, | |
| "grad_norm": 0.4179529116687889, | |
| "learning_rate": 2.922220951833111e-05, | |
| "loss": 0.2892, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26396217942237854, | |
| "step": 1810, | |
| "valid_targets_mean": 5258.7, | |
| "valid_targets_min": 769 | |
| }, | |
| { | |
| "epoch": 2.899361022364217, | |
| "grad_norm": 0.3808282089363938, | |
| "learning_rate": 2.9151437326628706e-05, | |
| "loss": 0.3122, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2766105532646179, | |
| "step": 1815, | |
| "valid_targets_mean": 6825.6, | |
| "valid_targets_min": 2460 | |
| }, | |
| { | |
| "epoch": 2.9073482428115014, | |
| "grad_norm": 0.4124244745011186, | |
| "learning_rate": 2.908051989870692e-05, | |
| "loss": 0.2856, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24274176359176636, | |
| "step": 1820, | |
| "valid_targets_mean": 5708.0, | |
| "valid_targets_min": 2364 | |
| }, | |
| { | |
| "epoch": 2.915335463258786, | |
| "grad_norm": 0.41438184139217416, | |
| "learning_rate": 2.9009458360047867e-05, | |
| "loss": 0.2973, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28374814987182617, | |
| "step": 1825, | |
| "valid_targets_mean": 5260.8, | |
| "valid_targets_min": 2416 | |
| }, | |
| { | |
| "epoch": 2.92332268370607, | |
| "grad_norm": 0.41155404441527726, | |
| "learning_rate": 2.8938253838420762e-05, | |
| "loss": 0.3017, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3477973937988281, | |
| "step": 1830, | |
| "valid_targets_mean": 6668.8, | |
| "valid_targets_min": 2149 | |
| }, | |
| { | |
| "epoch": 2.9313099041533546, | |
| "grad_norm": 0.4085241124027252, | |
| "learning_rate": 2.8866907463864006e-05, | |
| "loss": 0.2948, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2856498956680298, | |
| "step": 1835, | |
| "valid_targets_mean": 6114.0, | |
| "valid_targets_min": 3129 | |
| }, | |
| { | |
| "epoch": 2.939297124600639, | |
| "grad_norm": 0.3732637105205897, | |
| "learning_rate": 2.8795420368667225e-05, | |
| "loss": 0.3016, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.270649790763855, | |
| "step": 1840, | |
| "valid_targets_mean": 6849.4, | |
| "valid_targets_min": 3145 | |
| }, | |
| { | |
| "epoch": 2.9472843450479234, | |
| "grad_norm": 0.4226521461884014, | |
| "learning_rate": 2.8723793687353354e-05, | |
| "loss": 0.289, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31452998518943787, | |
| "step": 1845, | |
| "valid_targets_mean": 6376.6, | |
| "valid_targets_min": 1649 | |
| }, | |
| { | |
| "epoch": 2.9552715654952078, | |
| "grad_norm": 0.379485107873191, | |
| "learning_rate": 2.8652028556660593e-05, | |
| "loss": 0.298, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29785484075546265, | |
| "step": 1850, | |
| "valid_targets_mean": 6641.9, | |
| "valid_targets_min": 2981 | |
| }, | |
| { | |
| "epoch": 2.963258785942492, | |
| "grad_norm": 0.3856093464155722, | |
| "learning_rate": 2.8580126115524373e-05, | |
| "loss": 0.3106, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3266083598136902, | |
| "step": 1855, | |
| "valid_targets_mean": 6517.6, | |
| "valid_targets_min": 1301 | |
| }, | |
| { | |
| "epoch": 2.9712460063897765, | |
| "grad_norm": 0.43428930480362304, | |
| "learning_rate": 2.8508087505059302e-05, | |
| "loss": 0.2857, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31606584787368774, | |
| "step": 1860, | |
| "valid_targets_mean": 5092.9, | |
| "valid_targets_min": 667 | |
| }, | |
| { | |
| "epoch": 2.979233226837061, | |
| "grad_norm": 0.40969816294635303, | |
| "learning_rate": 2.843591386854102e-05, | |
| "loss": 0.3088, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3517284691333771, | |
| "step": 1865, | |
| "valid_targets_mean": 6976.1, | |
| "valid_targets_min": 1755 | |
| }, | |
| { | |
| "epoch": 2.987220447284345, | |
| "grad_norm": 0.3630500921673988, | |
| "learning_rate": 2.8363606351388068e-05, | |
| "loss": 0.304, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30743950605392456, | |
| "step": 1870, | |
| "valid_targets_mean": 7188.4, | |
| "valid_targets_min": 4148 | |
| }, | |
| { | |
| "epoch": 2.9952076677316293, | |
| "grad_norm": 0.3816838598746091, | |
| "learning_rate": 2.829116610114375e-05, | |
| "loss": 0.3031, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3180239796638489, | |
| "step": 1875, | |
| "valid_targets_mean": 6521.1, | |
| "valid_targets_min": 2569 | |
| }, | |
| { | |
| "epoch": 3.0031948881789137, | |
| "grad_norm": 0.39407614826065124, | |
| "learning_rate": 2.821859426745786e-05, | |
| "loss": 0.2782, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29579219222068787, | |
| "step": 1880, | |
| "valid_targets_mean": 5941.6, | |
| "valid_targets_min": 2701 | |
| }, | |
| { | |
| "epoch": 3.011182108626198, | |
| "grad_norm": 0.3782083320968264, | |
| "learning_rate": 2.8145892002068454e-05, | |
| "loss": 0.2796, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2401447743177414, | |
| "step": 1885, | |
| "valid_targets_mean": 6048.2, | |
| "valid_targets_min": 2035 | |
| }, | |
| { | |
| "epoch": 3.0191693290734825, | |
| "grad_norm": 0.5038352640657161, | |
| "learning_rate": 2.8073060458783606e-05, | |
| "loss": 0.2848, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3108987808227539, | |
| "step": 1890, | |
| "valid_targets_mean": 5316.5, | |
| "valid_targets_min": 1013 | |
| }, | |
| { | |
| "epoch": 3.027156549520767, | |
| "grad_norm": 0.3668199255925011, | |
| "learning_rate": 2.8000100793463056e-05, | |
| "loss": 0.2847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2814244031906128, | |
| "step": 1895, | |
| "valid_targets_mean": 6788.8, | |
| "valid_targets_min": 2404 | |
| }, | |
| { | |
| "epoch": 3.0351437699680512, | |
| "grad_norm": 0.3929170893253138, | |
| "learning_rate": 2.792701416399988e-05, | |
| "loss": 0.2839, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.279710590839386, | |
| "step": 1900, | |
| "valid_targets_mean": 6570.0, | |
| "valid_targets_min": 3117 | |
| }, | |
| { | |
| "epoch": 3.0431309904153356, | |
| "grad_norm": 0.3803358516560281, | |
| "learning_rate": 2.7853801730302134e-05, | |
| "loss": 0.2741, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2647709846496582, | |
| "step": 1905, | |
| "valid_targets_mean": 5775.4, | |
| "valid_targets_min": 3030 | |
| }, | |
| { | |
| "epoch": 3.0511182108626196, | |
| "grad_norm": 0.38808017152613633, | |
| "learning_rate": 2.7780464654274396e-05, | |
| "loss": 0.2885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27732783555984497, | |
| "step": 1910, | |
| "valid_targets_mean": 6046.7, | |
| "valid_targets_min": 1994 | |
| }, | |
| { | |
| "epoch": 3.059105431309904, | |
| "grad_norm": 0.3883111774894938, | |
| "learning_rate": 2.770700409979938e-05, | |
| "loss": 0.271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24863779544830322, | |
| "step": 1915, | |
| "valid_targets_mean": 5406.1, | |
| "valid_targets_min": 892 | |
| }, | |
| { | |
| "epoch": 3.0670926517571884, | |
| "grad_norm": 0.4120010452872276, | |
| "learning_rate": 2.7633421232719442e-05, | |
| "loss": 0.2911, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28361284732818604, | |
| "step": 1920, | |
| "valid_targets_mean": 5957.8, | |
| "valid_targets_min": 2628 | |
| }, | |
| { | |
| "epoch": 3.0750798722044728, | |
| "grad_norm": 0.39346008689686557, | |
| "learning_rate": 2.7559717220818067e-05, | |
| "loss": 0.2969, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30084335803985596, | |
| "step": 1925, | |
| "valid_targets_mean": 6254.4, | |
| "valid_targets_min": 3332 | |
| }, | |
| { | |
| "epoch": 3.083067092651757, | |
| "grad_norm": 0.3917952007567887, | |
| "learning_rate": 2.7485893233801366e-05, | |
| "loss": 0.2789, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2666378617286682, | |
| "step": 1930, | |
| "valid_targets_mean": 5519.2, | |
| "valid_targets_min": 525 | |
| }, | |
| { | |
| "epoch": 3.0910543130990416, | |
| "grad_norm": 0.4043183652692264, | |
| "learning_rate": 2.7411950443279458e-05, | |
| "loss": 0.2796, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28478050231933594, | |
| "step": 1935, | |
| "valid_targets_mean": 5719.7, | |
| "valid_targets_min": 2639 | |
| }, | |
| { | |
| "epoch": 3.099041533546326, | |
| "grad_norm": 0.44871836179762264, | |
| "learning_rate": 2.7337890022747945e-05, | |
| "loss": 0.2869, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3125566840171814, | |
| "step": 1940, | |
| "valid_targets_mean": 5846.2, | |
| "valid_targets_min": 1020 | |
| }, | |
| { | |
| "epoch": 3.1070287539936103, | |
| "grad_norm": 0.426753525563353, | |
| "learning_rate": 2.7263713147569243e-05, | |
| "loss": 0.2815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2868148386478424, | |
| "step": 1945, | |
| "valid_targets_mean": 5920.9, | |
| "valid_targets_min": 2324 | |
| }, | |
| { | |
| "epoch": 3.1150159744408947, | |
| "grad_norm": 0.4445643822116863, | |
| "learning_rate": 2.7189420994953945e-05, | |
| "loss": 0.283, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3119467794895172, | |
| "step": 1950, | |
| "valid_targets_mean": 5628.3, | |
| "valid_targets_min": 2471 | |
| }, | |
| { | |
| "epoch": 3.123003194888179, | |
| "grad_norm": 0.43466756536910034, | |
| "learning_rate": 2.7115014743942114e-05, | |
| "loss": 0.2852, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28476303815841675, | |
| "step": 1955, | |
| "valid_targets_mean": 5465.6, | |
| "valid_targets_min": 2890 | |
| }, | |
| { | |
| "epoch": 3.130990415335463, | |
| "grad_norm": 0.3865673796340792, | |
| "learning_rate": 2.704049557538461e-05, | |
| "loss": 0.3009, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27190887928009033, | |
| "step": 1960, | |
| "valid_targets_mean": 6987.8, | |
| "valid_targets_min": 3535 | |
| }, | |
| { | |
| "epoch": 3.1389776357827475, | |
| "grad_norm": 0.41130693895793174, | |
| "learning_rate": 2.696586467192434e-05, | |
| "loss": 0.2864, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29361218214035034, | |
| "step": 1965, | |
| "valid_targets_mean": 5782.1, | |
| "valid_targets_min": 2068 | |
| }, | |
| { | |
| "epoch": 3.146964856230032, | |
| "grad_norm": 0.5614162138714198, | |
| "learning_rate": 2.6891123217977448e-05, | |
| "loss": 0.3074, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3212922215461731, | |
| "step": 1970, | |
| "valid_targets_mean": 6647.4, | |
| "valid_targets_min": 3636 | |
| }, | |
| { | |
| "epoch": 3.1549520766773163, | |
| "grad_norm": 0.386938808590077, | |
| "learning_rate": 2.681627239971458e-05, | |
| "loss": 0.2943, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2853558361530304, | |
| "step": 1975, | |
| "valid_targets_mean": 6590.2, | |
| "valid_targets_min": 2460 | |
| }, | |
| { | |
| "epoch": 3.1629392971246006, | |
| "grad_norm": 0.37358498500182025, | |
| "learning_rate": 2.6741313405041997e-05, | |
| "loss": 0.2747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25737184286117554, | |
| "step": 1980, | |
| "valid_targets_mean": 7468.1, | |
| "valid_targets_min": 723 | |
| }, | |
| { | |
| "epoch": 3.170926517571885, | |
| "grad_norm": 0.3840885675344956, | |
| "learning_rate": 2.666624742358279e-05, | |
| "loss": 0.3002, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2827538251876831, | |
| "step": 1985, | |
| "valid_targets_mean": 6448.6, | |
| "valid_targets_min": 2574 | |
| }, | |
| { | |
| "epoch": 3.1789137380191694, | |
| "grad_norm": 0.41046271616899715, | |
| "learning_rate": 2.659107564665794e-05, | |
| "loss": 0.2817, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30209285020828247, | |
| "step": 1990, | |
| "valid_targets_mean": 7323.8, | |
| "valid_targets_min": 3817 | |
| }, | |
| { | |
| "epoch": 3.186900958466454, | |
| "grad_norm": 0.41696921761969497, | |
| "learning_rate": 2.6515799267267445e-05, | |
| "loss": 0.2576, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25836455821990967, | |
| "step": 1995, | |
| "valid_targets_mean": 5259.9, | |
| "valid_targets_min": 1872 | |
| }, | |
| { | |
| "epoch": 3.194888178913738, | |
| "grad_norm": 0.39316330317707604, | |
| "learning_rate": 2.6440419480071366e-05, | |
| "loss": 0.2757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2698454260826111, | |
| "step": 2000, | |
| "valid_targets_mean": 5670.1, | |
| "valid_targets_min": 2098 | |
| }, | |
| { | |
| "epoch": 3.202875399361022, | |
| "grad_norm": 0.3900179184666186, | |
| "learning_rate": 2.6364937481370887e-05, | |
| "loss": 0.2757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2806251645088196, | |
| "step": 2005, | |
| "valid_targets_mean": 6425.8, | |
| "valid_targets_min": 3226 | |
| }, | |
| { | |
| "epoch": 3.2108626198083066, | |
| "grad_norm": 0.38151963631950214, | |
| "learning_rate": 2.628935446908933e-05, | |
| "loss": 0.3015, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3154348134994507, | |
| "step": 2010, | |
| "valid_targets_mean": 7543.2, | |
| "valid_targets_min": 3697 | |
| }, | |
| { | |
| "epoch": 3.218849840255591, | |
| "grad_norm": 0.44628089099405716, | |
| "learning_rate": 2.6213671642753124e-05, | |
| "loss": 0.2979, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27721768617630005, | |
| "step": 2015, | |
| "valid_targets_mean": 4927.4, | |
| "valid_targets_min": 2233 | |
| }, | |
| { | |
| "epoch": 3.2268370607028753, | |
| "grad_norm": 0.39013389111709296, | |
| "learning_rate": 2.6137890203472786e-05, | |
| "loss": 0.2965, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28547370433807373, | |
| "step": 2020, | |
| "valid_targets_mean": 7734.6, | |
| "valid_targets_min": 4273 | |
| }, | |
| { | |
| "epoch": 3.2348242811501597, | |
| "grad_norm": 0.3994942801653719, | |
| "learning_rate": 2.606201135392383e-05, | |
| "loss": 0.3022, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29625070095062256, | |
| "step": 2025, | |
| "valid_targets_mean": 6331.4, | |
| "valid_targets_min": 2632 | |
| }, | |
| { | |
| "epoch": 3.242811501597444, | |
| "grad_norm": 0.41278079697619247, | |
| "learning_rate": 2.5986036298327725e-05, | |
| "loss": 0.2949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3153744339942932, | |
| "step": 2030, | |
| "valid_targets_mean": 6242.7, | |
| "valid_targets_min": 1808 | |
| }, | |
| { | |
| "epoch": 3.2507987220447285, | |
| "grad_norm": 0.3891671436772246, | |
| "learning_rate": 2.590996624243276e-05, | |
| "loss": 0.2643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26932135224342346, | |
| "step": 2035, | |
| "valid_targets_mean": 6116.2, | |
| "valid_targets_min": 2928 | |
| }, | |
| { | |
| "epoch": 3.258785942492013, | |
| "grad_norm": 0.4041378088541498, | |
| "learning_rate": 2.5833802393494897e-05, | |
| "loss": 0.2926, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30763524770736694, | |
| "step": 2040, | |
| "valid_targets_mean": 6715.1, | |
| "valid_targets_min": 559 | |
| }, | |
| { | |
| "epoch": 3.2667731629392973, | |
| "grad_norm": 0.3761589059354125, | |
| "learning_rate": 2.5757545960258638e-05, | |
| "loss": 0.2973, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3172910213470459, | |
| "step": 2045, | |
| "valid_targets_mean": 6865.1, | |
| "valid_targets_min": 2967 | |
| }, | |
| { | |
| "epoch": 3.2747603833865817, | |
| "grad_norm": 0.4073637917923728, | |
| "learning_rate": 2.5681198152937804e-05, | |
| "loss": 0.2786, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2412528693675995, | |
| "step": 2050, | |
| "valid_targets_mean": 4718.0, | |
| "valid_targets_min": 2401 | |
| }, | |
| { | |
| "epoch": 3.2827476038338657, | |
| "grad_norm": 0.40631051191479356, | |
| "learning_rate": 2.5604760183196377e-05, | |
| "loss": 0.2773, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27420973777770996, | |
| "step": 2055, | |
| "valid_targets_mean": 5630.1, | |
| "valid_targets_min": 2237 | |
| }, | |
| { | |
| "epoch": 3.29073482428115, | |
| "grad_norm": 0.41097441043848115, | |
| "learning_rate": 2.5528233264129228e-05, | |
| "loss": 0.2895, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2977592349052429, | |
| "step": 2060, | |
| "valid_targets_mean": 5871.2, | |
| "valid_targets_min": 1787 | |
| }, | |
| { | |
| "epoch": 3.2987220447284344, | |
| "grad_norm": 0.37932082757860947, | |
| "learning_rate": 2.54516186102429e-05, | |
| "loss": 0.2888, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2956275939941406, | |
| "step": 2065, | |
| "valid_targets_mean": 6453.6, | |
| "valid_targets_min": 612 | |
| }, | |
| { | |
| "epoch": 3.306709265175719, | |
| "grad_norm": 0.4231995554892241, | |
| "learning_rate": 2.5374917437436294e-05, | |
| "loss": 0.3051, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26246821880340576, | |
| "step": 2070, | |
| "valid_targets_mean": 5138.6, | |
| "valid_targets_min": 2442 | |
| }, | |
| { | |
| "epoch": 3.3146964856230032, | |
| "grad_norm": 0.41283978593806675, | |
| "learning_rate": 2.5298130962981403e-05, | |
| "loss": 0.2977, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3205075263977051, | |
| "step": 2075, | |
| "valid_targets_mean": 6270.7, | |
| "valid_targets_min": 2560 | |
| }, | |
| { | |
| "epoch": 3.3226837060702876, | |
| "grad_norm": 0.4034511797041537, | |
| "learning_rate": 2.5221260405503985e-05, | |
| "loss": 0.2761, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2481868714094162, | |
| "step": 2080, | |
| "valid_targets_mean": 5122.6, | |
| "valid_targets_min": 2418 | |
| }, | |
| { | |
| "epoch": 3.330670926517572, | |
| "grad_norm": 0.43029045688479733, | |
| "learning_rate": 2.5144306984964225e-05, | |
| "loss": 0.2653, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2917163670063019, | |
| "step": 2085, | |
| "valid_targets_mean": 6054.2, | |
| "valid_targets_min": 3320 | |
| }, | |
| { | |
| "epoch": 3.3386581469648564, | |
| "grad_norm": 0.39239657125059835, | |
| "learning_rate": 2.5067271922637377e-05, | |
| "loss": 0.2762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2448638379573822, | |
| "step": 2090, | |
| "valid_targets_mean": 5684.7, | |
| "valid_targets_min": 713 | |
| }, | |
| { | |
| "epoch": 3.3466453674121404, | |
| "grad_norm": 0.3903950545189857, | |
| "learning_rate": 2.499015644109435e-05, | |
| "loss": 0.276, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2958728075027466, | |
| "step": 2095, | |
| "valid_targets_mean": 6683.0, | |
| "valid_targets_min": 905 | |
| }, | |
| { | |
| "epoch": 3.3546325878594248, | |
| "grad_norm": 0.4004885209858901, | |
| "learning_rate": 2.491296176418236e-05, | |
| "loss": 0.285, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27448421716690063, | |
| "step": 2100, | |
| "valid_targets_mean": 5940.2, | |
| "valid_targets_min": 2664 | |
| }, | |
| { | |
| "epoch": 3.362619808306709, | |
| "grad_norm": 0.3654722126214252, | |
| "learning_rate": 2.4835689117005472e-05, | |
| "loss": 0.2889, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2784554362297058, | |
| "step": 2105, | |
| "valid_targets_mean": 6328.2, | |
| "valid_targets_min": 3656 | |
| }, | |
| { | |
| "epoch": 3.3706070287539935, | |
| "grad_norm": 0.4199048765889977, | |
| "learning_rate": 2.4758339725905152e-05, | |
| "loss": 0.2884, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27836596965789795, | |
| "step": 2110, | |
| "valid_targets_mean": 5817.2, | |
| "valid_targets_min": 1856 | |
| }, | |
| { | |
| "epoch": 3.378594249201278, | |
| "grad_norm": 0.42623999943394303, | |
| "learning_rate": 2.4680914818440823e-05, | |
| "loss": 0.2784, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28224313259124756, | |
| "step": 2115, | |
| "valid_targets_mean": 5713.3, | |
| "valid_targets_min": 2742 | |
| }, | |
| { | |
| "epoch": 3.3865814696485623, | |
| "grad_norm": 0.4135723729647487, | |
| "learning_rate": 2.4603415623370387e-05, | |
| "loss": 0.3019, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28288450837135315, | |
| "step": 2120, | |
| "valid_targets_mean": 5835.9, | |
| "valid_targets_min": 1753 | |
| }, | |
| { | |
| "epoch": 3.3945686900958467, | |
| "grad_norm": 0.3948649972522792, | |
| "learning_rate": 2.4525843370630697e-05, | |
| "loss": 0.2879, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2941212058067322, | |
| "step": 2125, | |
| "valid_targets_mean": 6642.4, | |
| "valid_targets_min": 3088 | |
| }, | |
| { | |
| "epoch": 3.402555910543131, | |
| "grad_norm": 0.4120188119446171, | |
| "learning_rate": 2.4448199291318058e-05, | |
| "loss": 0.2865, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3203067481517792, | |
| "step": 2130, | |
| "valid_targets_mean": 6592.6, | |
| "valid_targets_min": 2512 | |
| }, | |
| { | |
| "epoch": 3.4105431309904155, | |
| "grad_norm": 0.39104291480241576, | |
| "learning_rate": 2.4370484617668707e-05, | |
| "loss": 0.2864, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2826640009880066, | |
| "step": 2135, | |
| "valid_targets_mean": 6314.2, | |
| "valid_targets_min": 2548 | |
| }, | |
| { | |
| "epoch": 3.4185303514377, | |
| "grad_norm": 0.37836556558335366, | |
| "learning_rate": 2.4292700583039194e-05, | |
| "loss": 0.2882, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27248314023017883, | |
| "step": 2140, | |
| "valid_targets_mean": 6604.1, | |
| "valid_targets_min": 3329 | |
| }, | |
| { | |
| "epoch": 3.426517571884984, | |
| "grad_norm": 0.4136825101312036, | |
| "learning_rate": 2.4214848421886893e-05, | |
| "loss": 0.2709, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24891617894172668, | |
| "step": 2145, | |
| "valid_targets_mean": 5143.1, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 3.4345047923322682, | |
| "grad_norm": 0.38430589198007, | |
| "learning_rate": 2.4136929369750344e-05, | |
| "loss": 0.2924, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28506019711494446, | |
| "step": 2150, | |
| "valid_targets_mean": 6425.4, | |
| "valid_targets_min": 2515 | |
| }, | |
| { | |
| "epoch": 3.4424920127795526, | |
| "grad_norm": 0.3991648912111197, | |
| "learning_rate": 2.4058944663229672e-05, | |
| "loss": 0.3155, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30513352155685425, | |
| "step": 2155, | |
| "valid_targets_mean": 6308.9, | |
| "valid_targets_min": 2263 | |
| }, | |
| { | |
| "epoch": 3.450479233226837, | |
| "grad_norm": 0.40606202574077355, | |
| "learning_rate": 2.398089553996696e-05, | |
| "loss": 0.305, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28048115968704224, | |
| "step": 2160, | |
| "valid_targets_mean": 6116.3, | |
| "valid_targets_min": 2038 | |
| }, | |
| { | |
| "epoch": 3.4584664536741214, | |
| "grad_norm": 0.41238886031195693, | |
| "learning_rate": 2.3902783238626604e-05, | |
| "loss": 0.2713, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30598849058151245, | |
| "step": 2165, | |
| "valid_targets_mean": 6816.4, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 3.466453674121406, | |
| "grad_norm": 0.38624458034758125, | |
| "learning_rate": 2.3824608998875652e-05, | |
| "loss": 0.2771, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29611045122146606, | |
| "step": 2170, | |
| "valid_targets_mean": 7105.1, | |
| "valid_targets_min": 2404 | |
| }, | |
| { | |
| "epoch": 3.47444089456869, | |
| "grad_norm": 0.41088167186750163, | |
| "learning_rate": 2.3746374061364138e-05, | |
| "loss": 0.2735, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3055521845817566, | |
| "step": 2175, | |
| "valid_targets_mean": 6002.1, | |
| "valid_targets_min": 2496 | |
| }, | |
| { | |
| "epoch": 3.4824281150159746, | |
| "grad_norm": 0.4113148149440506, | |
| "learning_rate": 2.3668079667705404e-05, | |
| "loss": 0.28, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28291821479797363, | |
| "step": 2180, | |
| "valid_targets_mean": 5510.7, | |
| "valid_targets_min": 2713 | |
| }, | |
| { | |
| "epoch": 3.4904153354632586, | |
| "grad_norm": 0.39690184054958405, | |
| "learning_rate": 2.3589727060456342e-05, | |
| "loss": 0.2751, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2515103816986084, | |
| "step": 2185, | |
| "valid_targets_mean": 5275.2, | |
| "valid_targets_min": 1009 | |
| }, | |
| { | |
| "epoch": 3.498402555910543, | |
| "grad_norm": 0.4163085822534811, | |
| "learning_rate": 2.3511317483097753e-05, | |
| "loss": 0.2945, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32339000701904297, | |
| "step": 2190, | |
| "valid_targets_mean": 6685.1, | |
| "valid_targets_min": 3472 | |
| }, | |
| { | |
| "epoch": 3.5063897763578273, | |
| "grad_norm": 0.41929454848195985, | |
| "learning_rate": 2.3432852180014548e-05, | |
| "loss": 0.3034, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3325154185295105, | |
| "step": 2195, | |
| "valid_targets_mean": 6569.1, | |
| "valid_targets_min": 3133 | |
| }, | |
| { | |
| "epoch": 3.5143769968051117, | |
| "grad_norm": 0.41715237963782664, | |
| "learning_rate": 2.3354332396476026e-05, | |
| "loss": 0.3025, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3265277147293091, | |
| "step": 2200, | |
| "valid_targets_mean": 6760.0, | |
| "valid_targets_min": 2039 | |
| }, | |
| { | |
| "epoch": 3.522364217252396, | |
| "grad_norm": 0.4054680510437043, | |
| "learning_rate": 2.327575937861612e-05, | |
| "loss": 0.301, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3051404356956482, | |
| "step": 2205, | |
| "valid_targets_mean": 6122.1, | |
| "valid_targets_min": 2955 | |
| }, | |
| { | |
| "epoch": 3.5303514376996805, | |
| "grad_norm": 0.3802083701163208, | |
| "learning_rate": 2.3197134373413595e-05, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3239632844924927, | |
| "step": 2210, | |
| "valid_targets_mean": 6963.6, | |
| "valid_targets_min": 2745 | |
| }, | |
| { | |
| "epoch": 3.538338658146965, | |
| "grad_norm": 0.40131758226744096, | |
| "learning_rate": 2.3118458628672272e-05, | |
| "loss": 0.295, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28352153301239014, | |
| "step": 2215, | |
| "valid_targets_mean": 5898.5, | |
| "valid_targets_min": 2916 | |
| }, | |
| { | |
| "epoch": 3.5463258785942493, | |
| "grad_norm": 0.3762936001516868, | |
| "learning_rate": 2.3039733393001233e-05, | |
| "loss": 0.293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28905823826789856, | |
| "step": 2220, | |
| "valid_targets_mean": 6666.5, | |
| "valid_targets_min": 2505 | |
| }, | |
| { | |
| "epoch": 3.5543130990415337, | |
| "grad_norm": 0.392571305748706, | |
| "learning_rate": 2.2960959915794998e-05, | |
| "loss": 0.2737, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27087992429733276, | |
| "step": 2225, | |
| "valid_targets_mean": 6498.5, | |
| "valid_targets_min": 3328 | |
| }, | |
| { | |
| "epoch": 3.562300319488818, | |
| "grad_norm": 0.6086041530164692, | |
| "learning_rate": 2.2882139447213673e-05, | |
| "loss": 0.2843, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3048592805862427, | |
| "step": 2230, | |
| "valid_targets_mean": 5341.1, | |
| "valid_targets_min": 2083 | |
| }, | |
| { | |
| "epoch": 3.5702875399361025, | |
| "grad_norm": 0.37497767062282483, | |
| "learning_rate": 2.280327323816317e-05, | |
| "loss": 0.2755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27159368991851807, | |
| "step": 2235, | |
| "valid_targets_mean": 6834.6, | |
| "valid_targets_min": 3721 | |
| }, | |
| { | |
| "epoch": 3.5782747603833864, | |
| "grad_norm": 0.3820126990404262, | |
| "learning_rate": 2.2724362540275288e-05, | |
| "loss": 0.2752, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2762431502342224, | |
| "step": 2240, | |
| "valid_targets_mean": 5892.3, | |
| "valid_targets_min": 2627 | |
| }, | |
| { | |
| "epoch": 3.586261980830671, | |
| "grad_norm": 0.387363522781126, | |
| "learning_rate": 2.2645408605887882e-05, | |
| "loss": 0.2889, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2990701198577881, | |
| "step": 2245, | |
| "valid_targets_mean": 6069.8, | |
| "valid_targets_min": 2868 | |
| }, | |
| { | |
| "epoch": 3.594249201277955, | |
| "grad_norm": 0.4325757141283567, | |
| "learning_rate": 2.2566412688025003e-05, | |
| "loss": 0.2522, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2801590859889984, | |
| "step": 2250, | |
| "valid_targets_mean": 5381.9, | |
| "valid_targets_min": 2068 | |
| }, | |
| { | |
| "epoch": 3.6022364217252396, | |
| "grad_norm": 0.38452605714606686, | |
| "learning_rate": 2.2487376040376958e-05, | |
| "loss": 0.2853, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2836499810218811, | |
| "step": 2255, | |
| "valid_targets_mean": 6136.5, | |
| "valid_targets_min": 2518 | |
| }, | |
| { | |
| "epoch": 3.610223642172524, | |
| "grad_norm": 0.3976434026934483, | |
| "learning_rate": 2.240829991728048e-05, | |
| "loss": 0.2935, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31026923656463623, | |
| "step": 2260, | |
| "valid_targets_mean": 6232.8, | |
| "valid_targets_min": 758 | |
| }, | |
| { | |
| "epoch": 3.6182108626198084, | |
| "grad_norm": 0.37942251572462077, | |
| "learning_rate": 2.2329185573698778e-05, | |
| "loss": 0.284, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24145904183387756, | |
| "step": 2265, | |
| "valid_targets_mean": 5855.1, | |
| "valid_targets_min": 2389 | |
| }, | |
| { | |
| "epoch": 3.626198083067093, | |
| "grad_norm": 0.41523968900739516, | |
| "learning_rate": 2.225003426520163e-05, | |
| "loss": 0.2748, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2884466052055359, | |
| "step": 2270, | |
| "valid_targets_mean": 5608.3, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 3.6341853035143767, | |
| "grad_norm": 0.4064120443525162, | |
| "learning_rate": 2.2170847247945466e-05, | |
| "loss": 0.2588, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24048921465873718, | |
| "step": 2275, | |
| "valid_targets_mean": 4951.3, | |
| "valid_targets_min": 2317 | |
| }, | |
| { | |
| "epoch": 3.642172523961661, | |
| "grad_norm": 0.3963503758114164, | |
| "learning_rate": 2.209162577865342e-05, | |
| "loss": 0.2887, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27281975746154785, | |
| "step": 2280, | |
| "valid_targets_mean": 6056.3, | |
| "valid_targets_min": 2645 | |
| }, | |
| { | |
| "epoch": 3.6501597444089455, | |
| "grad_norm": 0.3797468110039216, | |
| "learning_rate": 2.201237111459539e-05, | |
| "loss": 0.2873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28083449602127075, | |
| "step": 2285, | |
| "valid_targets_mean": 6525.4, | |
| "valid_targets_min": 2768 | |
| }, | |
| { | |
| "epoch": 3.65814696485623, | |
| "grad_norm": 0.37924803759144315, | |
| "learning_rate": 2.193308451356809e-05, | |
| "loss": 0.3001, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29979899525642395, | |
| "step": 2290, | |
| "valid_targets_mean": 6610.4, | |
| "valid_targets_min": 4545 | |
| }, | |
| { | |
| "epoch": 3.6661341853035143, | |
| "grad_norm": 0.4492054263182636, | |
| "learning_rate": 2.1853767233875084e-05, | |
| "loss": 0.2846, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28562217950820923, | |
| "step": 2295, | |
| "valid_targets_mean": 6567.2, | |
| "valid_targets_min": 1445 | |
| }, | |
| { | |
| "epoch": 3.6741214057507987, | |
| "grad_norm": 0.40385812480737254, | |
| "learning_rate": 2.1774420534306812e-05, | |
| "loss": 0.2819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28911226987838745, | |
| "step": 2300, | |
| "valid_targets_mean": 6200.9, | |
| "valid_targets_min": 2834 | |
| }, | |
| { | |
| "epoch": 3.682108626198083, | |
| "grad_norm": 0.3837316452022455, | |
| "learning_rate": 2.1695045674120615e-05, | |
| "loss": 0.2777, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28620877861976624, | |
| "step": 2305, | |
| "valid_targets_mean": 6418.9, | |
| "valid_targets_min": 3469 | |
| }, | |
| { | |
| "epoch": 3.6900958466453675, | |
| "grad_norm": 0.3589940699122065, | |
| "learning_rate": 2.1615643913020755e-05, | |
| "loss": 0.2775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23665347695350647, | |
| "step": 2310, | |
| "valid_targets_mean": 6631.9, | |
| "valid_targets_min": 2340 | |
| }, | |
| { | |
| "epoch": 3.698083067092652, | |
| "grad_norm": 0.3778540126245049, | |
| "learning_rate": 2.1536216511138423e-05, | |
| "loss": 0.2845, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3107551336288452, | |
| "step": 2315, | |
| "valid_targets_mean": 7123.0, | |
| "valid_targets_min": 863 | |
| }, | |
| { | |
| "epoch": 3.7060702875399363, | |
| "grad_norm": 0.36669619512735524, | |
| "learning_rate": 2.145676472901174e-05, | |
| "loss": 0.261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2750699520111084, | |
| "step": 2320, | |
| "valid_targets_mean": 6343.7, | |
| "valid_targets_min": 4282 | |
| }, | |
| { | |
| "epoch": 3.7140575079872207, | |
| "grad_norm": 0.40160883990615326, | |
| "learning_rate": 2.1377289827565734e-05, | |
| "loss": 0.279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2908309996128082, | |
| "step": 2325, | |
| "valid_targets_mean": 6274.9, | |
| "valid_targets_min": 769 | |
| }, | |
| { | |
| "epoch": 3.722044728434505, | |
| "grad_norm": 0.38135724275827115, | |
| "learning_rate": 2.129779306809236e-05, | |
| "loss": 0.2801, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27185767889022827, | |
| "step": 2330, | |
| "valid_targets_mean": 6539.3, | |
| "valid_targets_min": 2593 | |
| }, | |
| { | |
| "epoch": 3.730031948881789, | |
| "grad_norm": 0.41854634180355516, | |
| "learning_rate": 2.121827571223046e-05, | |
| "loss": 0.271, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2506662905216217, | |
| "step": 2335, | |
| "valid_targets_mean": 4737.0, | |
| "valid_targets_min": 1372 | |
| }, | |
| { | |
| "epoch": 3.7380191693290734, | |
| "grad_norm": 0.40851557924126697, | |
| "learning_rate": 2.113873902194575e-05, | |
| "loss": 0.2953, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.279274046421051, | |
| "step": 2340, | |
| "valid_targets_mean": 5637.3, | |
| "valid_targets_min": 4070 | |
| }, | |
| { | |
| "epoch": 3.746006389776358, | |
| "grad_norm": 0.3910794596464284, | |
| "learning_rate": 2.1059184259510786e-05, | |
| "loss": 0.278, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3100409507751465, | |
| "step": 2345, | |
| "valid_targets_mean": 7028.7, | |
| "valid_targets_min": 623 | |
| }, | |
| { | |
| "epoch": 3.753993610223642, | |
| "grad_norm": 0.401024887867437, | |
| "learning_rate": 2.0979612687484935e-05, | |
| "loss": 0.2917, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26389920711517334, | |
| "step": 2350, | |
| "valid_targets_mean": 5422.7, | |
| "valid_targets_min": 665 | |
| }, | |
| { | |
| "epoch": 3.7619808306709266, | |
| "grad_norm": 0.3926950453958137, | |
| "learning_rate": 2.0900025568694334e-05, | |
| "loss": 0.272, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25960761308670044, | |
| "step": 2355, | |
| "valid_targets_mean": 5643.0, | |
| "valid_targets_min": 897 | |
| }, | |
| { | |
| "epoch": 3.769968051118211, | |
| "grad_norm": 0.3893179064995369, | |
| "learning_rate": 2.082042416621187e-05, | |
| "loss": 0.2995, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31199175119400024, | |
| "step": 2360, | |
| "valid_targets_mean": 6725.8, | |
| "valid_targets_min": 3334 | |
| }, | |
| { | |
| "epoch": 3.777955271565495, | |
| "grad_norm": 0.46322927120903745, | |
| "learning_rate": 2.0740809743337095e-05, | |
| "loss": 0.2876, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2446075975894928, | |
| "step": 2365, | |
| "valid_targets_mean": 6197.3, | |
| "valid_targets_min": 2438 | |
| }, | |
| { | |
| "epoch": 3.7859424920127793, | |
| "grad_norm": 0.3868064929625221, | |
| "learning_rate": 2.0661183563576202e-05, | |
| "loss": 0.2996, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29353129863739014, | |
| "step": 2370, | |
| "valid_targets_mean": 6627.8, | |
| "valid_targets_min": 2923 | |
| }, | |
| { | |
| "epoch": 3.7939297124600637, | |
| "grad_norm": 0.394083117057408, | |
| "learning_rate": 2.0581546890621986e-05, | |
| "loss": 0.2792, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2892863154411316, | |
| "step": 2375, | |
| "valid_targets_mean": 5990.1, | |
| "valid_targets_min": 2808 | |
| }, | |
| { | |
| "epoch": 3.801916932907348, | |
| "grad_norm": 0.38653188970744146, | |
| "learning_rate": 2.0501900988333755e-05, | |
| "loss": 0.2803, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24584591388702393, | |
| "step": 2380, | |
| "valid_targets_mean": 5592.1, | |
| "valid_targets_min": 2891 | |
| }, | |
| { | |
| "epoch": 3.8099041533546325, | |
| "grad_norm": 0.390959060912465, | |
| "learning_rate": 2.0422247120717304e-05, | |
| "loss": 0.2902, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2661975622177124, | |
| "step": 2385, | |
| "valid_targets_mean": 5660.2, | |
| "valid_targets_min": 2270 | |
| }, | |
| { | |
| "epoch": 3.817891373801917, | |
| "grad_norm": 0.39206024055058375, | |
| "learning_rate": 2.0342586551904824e-05, | |
| "loss": 0.2862, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29259228706359863, | |
| "step": 2390, | |
| "valid_targets_mean": 6328.2, | |
| "valid_targets_min": 937 | |
| }, | |
| { | |
| "epoch": 3.8258785942492013, | |
| "grad_norm": 0.3750426021512079, | |
| "learning_rate": 2.026292054613486e-05, | |
| "loss": 0.2962, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30779290199279785, | |
| "step": 2395, | |
| "valid_targets_mean": 6747.1, | |
| "valid_targets_min": 4294 | |
| }, | |
| { | |
| "epoch": 3.8338658146964857, | |
| "grad_norm": 0.40784501815728036, | |
| "learning_rate": 2.0183250367732253e-05, | |
| "loss": 0.2991, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3180014491081238, | |
| "step": 2400, | |
| "valid_targets_mean": 6070.1, | |
| "valid_targets_min": 2232 | |
| }, | |
| { | |
| "epoch": 3.84185303514377, | |
| "grad_norm": 0.4064995550403897, | |
| "learning_rate": 2.0103577281088065e-05, | |
| "loss": 0.2645, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2629479765892029, | |
| "step": 2405, | |
| "valid_targets_mean": 5365.2, | |
| "valid_targets_min": 2279 | |
| }, | |
| { | |
| "epoch": 3.8498402555910545, | |
| "grad_norm": 0.40276468606337057, | |
| "learning_rate": 2.00239025506395e-05, | |
| "loss": 0.3032, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3314054012298584, | |
| "step": 2410, | |
| "valid_targets_mean": 6474.4, | |
| "valid_targets_min": 2539 | |
| }, | |
| { | |
| "epoch": 3.857827476038339, | |
| "grad_norm": 0.4208783819685992, | |
| "learning_rate": 1.9944227440849856e-05, | |
| "loss": 0.2775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23719042539596558, | |
| "step": 2415, | |
| "valid_targets_mean": 5318.1, | |
| "valid_targets_min": 2348 | |
| }, | |
| { | |
| "epoch": 3.8658146964856233, | |
| "grad_norm": 0.3850598138475823, | |
| "learning_rate": 1.9864553216188454e-05, | |
| "loss": 0.2768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2985256612300873, | |
| "step": 2420, | |
| "valid_targets_mean": 6684.6, | |
| "valid_targets_min": 4385 | |
| }, | |
| { | |
| "epoch": 3.873801916932907, | |
| "grad_norm": 0.3889109661093418, | |
| "learning_rate": 1.978488114111057e-05, | |
| "loss": 0.2728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2758444547653198, | |
| "step": 2425, | |
| "valid_targets_mean": 6286.0, | |
| "valid_targets_min": 3676 | |
| }, | |
| { | |
| "epoch": 3.8817891373801916, | |
| "grad_norm": 0.41974665101718217, | |
| "learning_rate": 1.970521248003735e-05, | |
| "loss": 0.293, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28620395064353943, | |
| "step": 2430, | |
| "valid_targets_mean": 6822.9, | |
| "valid_targets_min": 2359 | |
| }, | |
| { | |
| "epoch": 3.889776357827476, | |
| "grad_norm": 0.4108196582764407, | |
| "learning_rate": 1.9625548497335783e-05, | |
| "loss": 0.2649, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24179211258888245, | |
| "step": 2435, | |
| "valid_targets_mean": 6230.8, | |
| "valid_targets_min": 3030 | |
| }, | |
| { | |
| "epoch": 3.8977635782747604, | |
| "grad_norm": 0.3887184644874627, | |
| "learning_rate": 1.9545890457298592e-05, | |
| "loss": 0.2896, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27514466643333435, | |
| "step": 2440, | |
| "valid_targets_mean": 6194.8, | |
| "valid_targets_min": 4190 | |
| }, | |
| { | |
| "epoch": 3.905750798722045, | |
| "grad_norm": 0.3808169582267541, | |
| "learning_rate": 1.9466239624124203e-05, | |
| "loss": 0.2796, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28203779458999634, | |
| "step": 2445, | |
| "valid_targets_mean": 5893.8, | |
| "valid_targets_min": 2481 | |
| }, | |
| { | |
| "epoch": 3.913738019169329, | |
| "grad_norm": 0.39903432475190986, | |
| "learning_rate": 1.938659726189663e-05, | |
| "loss": 0.2794, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2902190089225769, | |
| "step": 2450, | |
| "valid_targets_mean": 6198.2, | |
| "valid_targets_min": 2428 | |
| }, | |
| { | |
| "epoch": 3.9217252396166136, | |
| "grad_norm": 0.3909585098825061, | |
| "learning_rate": 1.9306964634565505e-05, | |
| "loss": 0.287, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2583644390106201, | |
| "step": 2455, | |
| "valid_targets_mean": 5603.4, | |
| "valid_targets_min": 2394 | |
| }, | |
| { | |
| "epoch": 3.9297124600638975, | |
| "grad_norm": 0.38212898858543853, | |
| "learning_rate": 1.9227343005925926e-05, | |
| "loss": 0.2846, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29948654770851135, | |
| "step": 2460, | |
| "valid_targets_mean": 6591.0, | |
| "valid_targets_min": 3077 | |
| }, | |
| { | |
| "epoch": 3.937699680511182, | |
| "grad_norm": 0.35832168924593294, | |
| "learning_rate": 1.914773363959845e-05, | |
| "loss": 0.2804, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24180595576763153, | |
| "step": 2465, | |
| "valid_targets_mean": 6173.3, | |
| "valid_targets_min": 3429 | |
| }, | |
| { | |
| "epoch": 3.9456869009584663, | |
| "grad_norm": 0.4042242888931492, | |
| "learning_rate": 1.906813779900903e-05, | |
| "loss": 0.3128, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2923146188259125, | |
| "step": 2470, | |
| "valid_targets_mean": 5477.3, | |
| "valid_targets_min": 2124 | |
| }, | |
| { | |
| "epoch": 3.9536741214057507, | |
| "grad_norm": 0.40266244023116793, | |
| "learning_rate": 1.8988556747368953e-05, | |
| "loss": 0.2814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29180198907852173, | |
| "step": 2475, | |
| "valid_targets_mean": 5750.6, | |
| "valid_targets_min": 3483 | |
| }, | |
| { | |
| "epoch": 3.961661341853035, | |
| "grad_norm": 0.4064958674965927, | |
| "learning_rate": 1.890899174765481e-05, | |
| "loss": 0.307, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3093706965446472, | |
| "step": 2480, | |
| "valid_targets_mean": 6397.2, | |
| "valid_targets_min": 2649 | |
| }, | |
| { | |
| "epoch": 3.9696485623003195, | |
| "grad_norm": 0.3962056895279825, | |
| "learning_rate": 1.8829444062588433e-05, | |
| "loss": 0.2868, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29254212975502014, | |
| "step": 2485, | |
| "valid_targets_mean": 6045.2, | |
| "valid_targets_min": 1301 | |
| }, | |
| { | |
| "epoch": 3.977635782747604, | |
| "grad_norm": 0.41416641131661436, | |
| "learning_rate": 1.8749914954616882e-05, | |
| "loss": 0.2928, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3371998369693756, | |
| "step": 2490, | |
| "valid_targets_mean": 6160.2, | |
| "valid_targets_min": 3202 | |
| }, | |
| { | |
| "epoch": 3.9856230031948883, | |
| "grad_norm": 0.37474609318480057, | |
| "learning_rate": 1.867040568589236e-05, | |
| "loss": 0.2635, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2889212965965271, | |
| "step": 2495, | |
| "valid_targets_mean": 6763.1, | |
| "valid_targets_min": 1985 | |
| }, | |
| { | |
| "epoch": 3.9936102236421727, | |
| "grad_norm": 0.4217384626431671, | |
| "learning_rate": 1.859091751825225e-05, | |
| "loss": 0.2939, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29209163784980774, | |
| "step": 2500, | |
| "valid_targets_mean": 5283.7, | |
| "valid_targets_min": 1867 | |
| }, | |
| { | |
| "epoch": 4.001597444089457, | |
| "grad_norm": 0.41789778022803, | |
| "learning_rate": 1.8511451713199038e-05, | |
| "loss": 0.2662, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2662074863910675, | |
| "step": 2505, | |
| "valid_targets_mean": 5955.0, | |
| "valid_targets_min": 2533 | |
| }, | |
| { | |
| "epoch": 4.0095846645367414, | |
| "grad_norm": 0.40278438837833236, | |
| "learning_rate": 1.843200953188031e-05, | |
| "loss": 0.2781, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2680213451385498, | |
| "step": 2510, | |
| "valid_targets_mean": 5708.9, | |
| "valid_targets_min": 572 | |
| }, | |
| { | |
| "epoch": 4.017571884984026, | |
| "grad_norm": 0.397993731547539, | |
| "learning_rate": 1.8352592235068735e-05, | |
| "loss": 0.2773, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28705108165740967, | |
| "step": 2515, | |
| "valid_targets_mean": 6314.8, | |
| "valid_targets_min": 3044 | |
| }, | |
| { | |
| "epoch": 4.02555910543131, | |
| "grad_norm": 0.3859130960809358, | |
| "learning_rate": 1.8273201083142056e-05, | |
| "loss": 0.2913, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23025313019752502, | |
| "step": 2520, | |
| "valid_targets_mean": 5676.9, | |
| "valid_targets_min": 2873 | |
| }, | |
| { | |
| "epoch": 4.033546325878595, | |
| "grad_norm": 0.4128778693762648, | |
| "learning_rate": 1.8193837336063096e-05, | |
| "loss": 0.2509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3019358515739441, | |
| "step": 2525, | |
| "valid_targets_mean": 6668.1, | |
| "valid_targets_min": 2049 | |
| }, | |
| { | |
| "epoch": 4.041533546325879, | |
| "grad_norm": 0.4034915694915475, | |
| "learning_rate": 1.8114502253359745e-05, | |
| "loss": 0.2671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2430892437696457, | |
| "step": 2530, | |
| "valid_targets_mean": 5848.4, | |
| "valid_targets_min": 2613 | |
| }, | |
| { | |
| "epoch": 4.0495207667731625, | |
| "grad_norm": 0.3926112605110736, | |
| "learning_rate": 1.8035197094104986e-05, | |
| "loss": 0.2707, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2987821698188782, | |
| "step": 2535, | |
| "valid_targets_mean": 6712.1, | |
| "valid_targets_min": 1985 | |
| }, | |
| { | |
| "epoch": 4.057507987220447, | |
| "grad_norm": 0.42284516870435657, | |
| "learning_rate": 1.7955923116896908e-05, | |
| "loss": 0.2699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2968297004699707, | |
| "step": 2540, | |
| "valid_targets_mean": 6279.9, | |
| "valid_targets_min": 2149 | |
| }, | |
| { | |
| "epoch": 4.065495207667731, | |
| "grad_norm": 0.4119549970451748, | |
| "learning_rate": 1.787668157983872e-05, | |
| "loss": 0.2845, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2702583968639374, | |
| "step": 2545, | |
| "valid_targets_mean": 5337.4, | |
| "valid_targets_min": 1406 | |
| }, | |
| { | |
| "epoch": 4.073482428115016, | |
| "grad_norm": 0.43532677076464465, | |
| "learning_rate": 1.779747374051881e-05, | |
| "loss": 0.2885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30135321617126465, | |
| "step": 2550, | |
| "valid_targets_mean": 5596.6, | |
| "valid_targets_min": 2532 | |
| }, | |
| { | |
| "epoch": 4.0814696485623, | |
| "grad_norm": 0.42805115338364774, | |
| "learning_rate": 1.7718300855990767e-05, | |
| "loss": 0.2755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2958075702190399, | |
| "step": 2555, | |
| "valid_targets_mean": 5754.2, | |
| "valid_targets_min": 2644 | |
| }, | |
| { | |
| "epoch": 4.0894568690095845, | |
| "grad_norm": 0.43235576286525906, | |
| "learning_rate": 1.7639164182753434e-05, | |
| "loss": 0.2747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2922317087650299, | |
| "step": 2560, | |
| "valid_targets_mean": 5877.0, | |
| "valid_targets_min": 2713 | |
| }, | |
| { | |
| "epoch": 4.097444089456869, | |
| "grad_norm": 0.3787889279541212, | |
| "learning_rate": 1.7560064976730967e-05, | |
| "loss": 0.2777, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28099751472473145, | |
| "step": 2565, | |
| "valid_targets_mean": 6627.5, | |
| "valid_targets_min": 2975 | |
| }, | |
| { | |
| "epoch": 4.105431309904153, | |
| "grad_norm": 0.37473245224159224, | |
| "learning_rate": 1.7481004493252915e-05, | |
| "loss": 0.2643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2825896143913269, | |
| "step": 2570, | |
| "valid_targets_mean": 6897.7, | |
| "valid_targets_min": 2319 | |
| }, | |
| { | |
| "epoch": 4.113418530351438, | |
| "grad_norm": 0.42809437702255415, | |
| "learning_rate": 1.740198398703428e-05, | |
| "loss": 0.264, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29898396134376526, | |
| "step": 2575, | |
| "valid_targets_mean": 5503.2, | |
| "valid_targets_min": 2512 | |
| }, | |
| { | |
| "epoch": 4.121405750798722, | |
| "grad_norm": 0.3931245114329627, | |
| "learning_rate": 1.7323004712155613e-05, | |
| "loss": 0.2873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2750072777271271, | |
| "step": 2580, | |
| "valid_targets_mean": 6251.4, | |
| "valid_targets_min": 2014 | |
| }, | |
| { | |
| "epoch": 4.1293929712460065, | |
| "grad_norm": 0.40455158947027997, | |
| "learning_rate": 1.724406792204312e-05, | |
| "loss": 0.2946, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31611204147338867, | |
| "step": 2585, | |
| "valid_targets_mean": 7131.2, | |
| "valid_targets_min": 2279 | |
| }, | |
| { | |
| "epoch": 4.137380191693291, | |
| "grad_norm": 0.40745917071878723, | |
| "learning_rate": 1.7165174869448738e-05, | |
| "loss": 0.2849, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29489070177078247, | |
| "step": 2590, | |
| "valid_targets_mean": 6287.9, | |
| "valid_targets_min": 2452 | |
| }, | |
| { | |
| "epoch": 4.145367412140575, | |
| "grad_norm": 0.38823361164382825, | |
| "learning_rate": 1.7086326806430298e-05, | |
| "loss": 0.2974, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2753092050552368, | |
| "step": 2595, | |
| "valid_targets_mean": 6502.6, | |
| "valid_targets_min": 846 | |
| }, | |
| { | |
| "epoch": 4.15335463258786, | |
| "grad_norm": 0.41032607321966874, | |
| "learning_rate": 1.7007524984331623e-05, | |
| "loss": 0.2738, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3026641607284546, | |
| "step": 2600, | |
| "valid_targets_mean": 6315.2, | |
| "valid_targets_min": 1013 | |
| }, | |
| { | |
| "epoch": 4.161341853035144, | |
| "grad_norm": 0.4267557488891164, | |
| "learning_rate": 1.692877065376268e-05, | |
| "loss": 0.2696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3002544641494751, | |
| "step": 2605, | |
| "valid_targets_mean": 5747.0, | |
| "valid_targets_min": 796 | |
| }, | |
| { | |
| "epoch": 4.169329073482428, | |
| "grad_norm": 0.39592909964136863, | |
| "learning_rate": 1.6850065064579724e-05, | |
| "loss": 0.2777, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28585320711135864, | |
| "step": 2610, | |
| "valid_targets_mean": 6384.6, | |
| "valid_targets_min": 3185 | |
| }, | |
| { | |
| "epoch": 4.177316293929713, | |
| "grad_norm": 0.38911269848186064, | |
| "learning_rate": 1.677140946586548e-05, | |
| "loss": 0.2885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2697932720184326, | |
| "step": 2615, | |
| "valid_targets_mean": 6012.8, | |
| "valid_targets_min": 559 | |
| }, | |
| { | |
| "epoch": 4.185303514376997, | |
| "grad_norm": 0.3933309392629126, | |
| "learning_rate": 1.66928051059093e-05, | |
| "loss": 0.2743, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29853546619415283, | |
| "step": 2620, | |
| "valid_targets_mean": 6470.2, | |
| "valid_targets_min": 4089 | |
| }, | |
| { | |
| "epoch": 4.193290734824281, | |
| "grad_norm": 0.38813684001801246, | |
| "learning_rate": 1.6614253232187363e-05, | |
| "loss": 0.2612, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2507900595664978, | |
| "step": 2625, | |
| "valid_targets_mean": 5594.2, | |
| "valid_targets_min": 1895 | |
| }, | |
| { | |
| "epoch": 4.201277955271565, | |
| "grad_norm": 0.40467045928060663, | |
| "learning_rate": 1.6535755091342876e-05, | |
| "loss": 0.2691, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2329675257205963, | |
| "step": 2630, | |
| "valid_targets_mean": 5645.5, | |
| "valid_targets_min": 805 | |
| }, | |
| { | |
| "epoch": 4.2092651757188495, | |
| "grad_norm": 0.3875601610855551, | |
| "learning_rate": 1.6457311929166286e-05, | |
| "loss": 0.2886, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2972150146961212, | |
| "step": 2635, | |
| "valid_targets_mean": 7001.7, | |
| "valid_targets_min": 3156 | |
| }, | |
| { | |
| "epoch": 4.217252396166134, | |
| "grad_norm": 0.4127200373089555, | |
| "learning_rate": 1.637892499057551e-05, | |
| "loss": 0.2648, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26377594470977783, | |
| "step": 2640, | |
| "valid_targets_mean": 5655.3, | |
| "valid_targets_min": 3434 | |
| }, | |
| { | |
| "epoch": 4.225239616613418, | |
| "grad_norm": 0.36826905769913587, | |
| "learning_rate": 1.6300595519596177e-05, | |
| "loss": 0.2802, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27024176716804504, | |
| "step": 2645, | |
| "valid_targets_mean": 6891.1, | |
| "valid_targets_min": 5072 | |
| }, | |
| { | |
| "epoch": 4.233226837060703, | |
| "grad_norm": 0.41783309492278914, | |
| "learning_rate": 1.6222324759341897e-05, | |
| "loss": 0.2741, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28622257709503174, | |
| "step": 2650, | |
| "valid_targets_mean": 6172.2, | |
| "valid_targets_min": 2268 | |
| }, | |
| { | |
| "epoch": 4.241214057507987, | |
| "grad_norm": 0.39515051538691914, | |
| "learning_rate": 1.6144113951994515e-05, | |
| "loss": 0.2822, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26668697595596313, | |
| "step": 2655, | |
| "valid_targets_mean": 6379.2, | |
| "valid_targets_min": 3110 | |
| }, | |
| { | |
| "epoch": 4.2492012779552715, | |
| "grad_norm": 0.4137394527647848, | |
| "learning_rate": 1.60659643387844e-05, | |
| "loss": 0.2728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23816585540771484, | |
| "step": 2660, | |
| "valid_targets_mean": 5357.4, | |
| "valid_targets_min": 2394 | |
| }, | |
| { | |
| "epoch": 4.257188498402556, | |
| "grad_norm": 0.39445643694253296, | |
| "learning_rate": 1.5987877159970765e-05, | |
| "loss": 0.2733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27433449029922485, | |
| "step": 2665, | |
| "valid_targets_mean": 6585.4, | |
| "valid_targets_min": 3205 | |
| }, | |
| { | |
| "epoch": 4.26517571884984, | |
| "grad_norm": 0.4370745792319941, | |
| "learning_rate": 1.5909853654821947e-05, | |
| "loss": 0.2808, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3000541925430298, | |
| "step": 2670, | |
| "valid_targets_mean": 5766.7, | |
| "valid_targets_min": 2256 | |
| }, | |
| { | |
| "epoch": 4.273162939297125, | |
| "grad_norm": 0.40440104350695244, | |
| "learning_rate": 1.583189506159579e-05, | |
| "loss": 0.2742, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3138236701488495, | |
| "step": 2675, | |
| "valid_targets_mean": 6135.3, | |
| "valid_targets_min": 769 | |
| }, | |
| { | |
| "epoch": 4.281150159744409, | |
| "grad_norm": 0.3835475333605527, | |
| "learning_rate": 1.575400261751993e-05, | |
| "loss": 0.2777, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.270091712474823, | |
| "step": 2680, | |
| "valid_targets_mean": 6739.2, | |
| "valid_targets_min": 4116 | |
| }, | |
| { | |
| "epoch": 4.289137380191693, | |
| "grad_norm": 0.42632445332522834, | |
| "learning_rate": 1.567617755877223e-05, | |
| "loss": 0.2773, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29185786843299866, | |
| "step": 2685, | |
| "valid_targets_mean": 5249.2, | |
| "valid_targets_min": 2665 | |
| }, | |
| { | |
| "epoch": 4.297124600638978, | |
| "grad_norm": 0.4030110406426799, | |
| "learning_rate": 1.5598421120461106e-05, | |
| "loss": 0.2726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25116804242134094, | |
| "step": 2690, | |
| "valid_targets_mean": 5738.0, | |
| "valid_targets_min": 3991 | |
| }, | |
| { | |
| "epoch": 4.305111821086262, | |
| "grad_norm": 0.4092073755858647, | |
| "learning_rate": 1.552073453660595e-05, | |
| "loss": 0.2797, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29239794611930847, | |
| "step": 2695, | |
| "valid_targets_mean": 6055.5, | |
| "valid_targets_min": 2506 | |
| }, | |
| { | |
| "epoch": 4.313099041533547, | |
| "grad_norm": 0.45424119210212927, | |
| "learning_rate": 1.5443119040117544e-05, | |
| "loss": 0.2602, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2504233717918396, | |
| "step": 2700, | |
| "valid_targets_mean": 5604.6, | |
| "valid_targets_min": 2582 | |
| }, | |
| { | |
| "epoch": 4.321086261980831, | |
| "grad_norm": 0.4050030441304935, | |
| "learning_rate": 1.536557586277849e-05, | |
| "loss": 0.2731, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28171196579933167, | |
| "step": 2705, | |
| "valid_targets_mean": 6385.3, | |
| "valid_targets_min": 2485 | |
| }, | |
| { | |
| "epoch": 4.329073482428115, | |
| "grad_norm": 0.410297247246705, | |
| "learning_rate": 1.5288106235223655e-05, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2512205243110657, | |
| "step": 2710, | |
| "valid_targets_mean": 5399.4, | |
| "valid_targets_min": 751 | |
| }, | |
| { | |
| "epoch": 4.3370607028754, | |
| "grad_norm": 0.41510753013612095, | |
| "learning_rate": 1.521071138692066e-05, | |
| "loss": 0.2802, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2584072947502136, | |
| "step": 2715, | |
| "valid_targets_mean": 5446.3, | |
| "valid_targets_min": 2407 | |
| }, | |
| { | |
| "epoch": 4.345047923322683, | |
| "grad_norm": 0.383504073102031, | |
| "learning_rate": 1.5133392546150347e-05, | |
| "loss": 0.2837, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30143845081329346, | |
| "step": 2720, | |
| "valid_targets_mean": 6775.9, | |
| "valid_targets_min": 2652 | |
| }, | |
| { | |
| "epoch": 4.353035143769968, | |
| "grad_norm": 0.4304753569025277, | |
| "learning_rate": 1.5056150939987303e-05, | |
| "loss": 0.2812, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3040391206741333, | |
| "step": 2725, | |
| "valid_targets_mean": 5710.7, | |
| "valid_targets_min": 2511 | |
| }, | |
| { | |
| "epoch": 4.361022364217252, | |
| "grad_norm": 0.3812452760315638, | |
| "learning_rate": 1.4978987794280363e-05, | |
| "loss": 0.2647, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2979819178581238, | |
| "step": 2730, | |
| "valid_targets_mean": 6864.4, | |
| "valid_targets_min": 2768 | |
| }, | |
| { | |
| "epoch": 4.3690095846645365, | |
| "grad_norm": 0.3912121411405226, | |
| "learning_rate": 1.4901904333633179e-05, | |
| "loss": 0.2689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28652453422546387, | |
| "step": 2735, | |
| "valid_targets_mean": 6519.9, | |
| "valid_targets_min": 2567 | |
| }, | |
| { | |
| "epoch": 4.376996805111821, | |
| "grad_norm": 0.40271400569902377, | |
| "learning_rate": 1.4824901781384777e-05, | |
| "loss": 0.2809, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29264265298843384, | |
| "step": 2740, | |
| "valid_targets_mean": 6142.9, | |
| "valid_targets_min": 810 | |
| }, | |
| { | |
| "epoch": 4.384984025559105, | |
| "grad_norm": 0.3874913299465262, | |
| "learning_rate": 1.4747981359590138e-05, | |
| "loss": 0.2621, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23004579544067383, | |
| "step": 2745, | |
| "valid_targets_mean": 5417.1, | |
| "valid_targets_min": 2365 | |
| }, | |
| { | |
| "epoch": 4.39297124600639, | |
| "grad_norm": 0.45332639426671206, | |
| "learning_rate": 1.4671144289000806e-05, | |
| "loss": 0.2628, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27936509251594543, | |
| "step": 2750, | |
| "valid_targets_mean": 5746.8, | |
| "valid_targets_min": 689 | |
| }, | |
| { | |
| "epoch": 4.400958466453674, | |
| "grad_norm": 0.3805516692241136, | |
| "learning_rate": 1.4594391789045513e-05, | |
| "loss": 0.2995, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26949167251586914, | |
| "step": 2755, | |
| "valid_targets_mean": 6413.2, | |
| "valid_targets_min": 2693 | |
| }, | |
| { | |
| "epoch": 4.4089456869009584, | |
| "grad_norm": 0.3858941053847421, | |
| "learning_rate": 1.4517725077810835e-05, | |
| "loss": 0.2772, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25780051946640015, | |
| "step": 2760, | |
| "valid_targets_mean": 6056.6, | |
| "valid_targets_min": 3698 | |
| }, | |
| { | |
| "epoch": 4.416932907348243, | |
| "grad_norm": 0.4251912739395793, | |
| "learning_rate": 1.4441145372021847e-05, | |
| "loss": 0.2764, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30358684062957764, | |
| "step": 2765, | |
| "valid_targets_mean": 5682.9, | |
| "valid_targets_min": 2567 | |
| }, | |
| { | |
| "epoch": 4.424920127795527, | |
| "grad_norm": 0.3749645245630162, | |
| "learning_rate": 1.4364653887022831e-05, | |
| "loss": 0.2844, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26234883069992065, | |
| "step": 2770, | |
| "valid_targets_mean": 6423.8, | |
| "valid_targets_min": 3567 | |
| }, | |
| { | |
| "epoch": 4.432907348242812, | |
| "grad_norm": 0.4172284202002455, | |
| "learning_rate": 1.4288251836757956e-05, | |
| "loss": 0.2733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2528131902217865, | |
| "step": 2775, | |
| "valid_targets_mean": 5334.3, | |
| "valid_targets_min": 1878 | |
| }, | |
| { | |
| "epoch": 4.440894568690096, | |
| "grad_norm": 0.4372290403653499, | |
| "learning_rate": 1.4211940433752052e-05, | |
| "loss": 0.2677, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23210656642913818, | |
| "step": 2780, | |
| "valid_targets_mean": 5115.0, | |
| "valid_targets_min": 2481 | |
| }, | |
| { | |
| "epoch": 4.44888178913738, | |
| "grad_norm": 0.4058720159497367, | |
| "learning_rate": 1.4135720889091342e-05, | |
| "loss": 0.2658, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2466217279434204, | |
| "step": 2785, | |
| "valid_targets_mean": 5953.0, | |
| "valid_targets_min": 2416 | |
| }, | |
| { | |
| "epoch": 4.456869009584665, | |
| "grad_norm": 0.40492658919816266, | |
| "learning_rate": 1.4059594412404227e-05, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3062133193016052, | |
| "step": 2790, | |
| "valid_targets_mean": 6615.1, | |
| "valid_targets_min": 3434 | |
| }, | |
| { | |
| "epoch": 4.464856230031949, | |
| "grad_norm": 0.41873459429789195, | |
| "learning_rate": 1.3983562211842094e-05, | |
| "loss": 0.2694, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2544900178909302, | |
| "step": 2795, | |
| "valid_targets_mean": 5504.8, | |
| "valid_targets_min": 806 | |
| }, | |
| { | |
| "epoch": 4.472843450479234, | |
| "grad_norm": 0.4037709639480373, | |
| "learning_rate": 1.390762549406013e-05, | |
| "loss": 0.2811, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2991892695426941, | |
| "step": 2800, | |
| "valid_targets_mean": 6091.9, | |
| "valid_targets_min": 2036 | |
| }, | |
| { | |
| "epoch": 4.480830670926517, | |
| "grad_norm": 0.41551739820781297, | |
| "learning_rate": 1.3831785464198188e-05, | |
| "loss": 0.2538, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23418962955474854, | |
| "step": 2805, | |
| "valid_targets_mean": 4782.4, | |
| "valid_targets_min": 943 | |
| }, | |
| { | |
| "epoch": 4.488817891373802, | |
| "grad_norm": 0.45956147465107355, | |
| "learning_rate": 1.3756043325861648e-05, | |
| "loss": 0.2893, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3095807433128357, | |
| "step": 2810, | |
| "valid_targets_mean": 5793.2, | |
| "valid_targets_min": 2852 | |
| }, | |
| { | |
| "epoch": 4.496805111821086, | |
| "grad_norm": 0.39357976753285634, | |
| "learning_rate": 1.3680400281102326e-05, | |
| "loss": 0.2775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.289013147354126, | |
| "step": 2815, | |
| "valid_targets_mean": 6744.7, | |
| "valid_targets_min": 2657 | |
| }, | |
| { | |
| "epoch": 4.50479233226837, | |
| "grad_norm": 0.4238578338807567, | |
| "learning_rate": 1.3604857530399378e-05, | |
| "loss": 0.2722, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2508664131164551, | |
| "step": 2820, | |
| "valid_targets_mean": 5159.8, | |
| "valid_targets_min": 2311 | |
| }, | |
| { | |
| "epoch": 4.512779552715655, | |
| "grad_norm": 0.3837263973848263, | |
| "learning_rate": 1.3529416272640278e-05, | |
| "loss": 0.277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2732656002044678, | |
| "step": 2825, | |
| "valid_targets_mean": 6027.3, | |
| "valid_targets_min": 2460 | |
| }, | |
| { | |
| "epoch": 4.520766773162939, | |
| "grad_norm": 0.4144600107136905, | |
| "learning_rate": 1.3454077705101766e-05, | |
| "loss": 0.2862, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2717664837837219, | |
| "step": 2830, | |
| "valid_targets_mean": 5457.3, | |
| "valid_targets_min": 2850 | |
| }, | |
| { | |
| "epoch": 4.5287539936102235, | |
| "grad_norm": 0.3932520354085718, | |
| "learning_rate": 1.3378843023430856e-05, | |
| "loss": 0.2756, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28590431809425354, | |
| "step": 2835, | |
| "valid_targets_mean": 6365.2, | |
| "valid_targets_min": 2024 | |
| }, | |
| { | |
| "epoch": 4.536741214057508, | |
| "grad_norm": 0.3786623507452636, | |
| "learning_rate": 1.3303713421625864e-05, | |
| "loss": 0.2535, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24466568231582642, | |
| "step": 2840, | |
| "valid_targets_mean": 5966.8, | |
| "valid_targets_min": 2774 | |
| }, | |
| { | |
| "epoch": 4.544728434504792, | |
| "grad_norm": 0.4122344219843952, | |
| "learning_rate": 1.3228690092017452e-05, | |
| "loss": 0.2728, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2588158845901489, | |
| "step": 2845, | |
| "valid_targets_mean": 5963.2, | |
| "valid_targets_min": 2525 | |
| }, | |
| { | |
| "epoch": 4.552715654952077, | |
| "grad_norm": 0.4162989657258426, | |
| "learning_rate": 1.3153774225249705e-05, | |
| "loss": 0.2879, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30662378668785095, | |
| "step": 2850, | |
| "valid_targets_mean": 6070.2, | |
| "valid_targets_min": 2619 | |
| }, | |
| { | |
| "epoch": 4.560702875399361, | |
| "grad_norm": 0.4014435324404791, | |
| "learning_rate": 1.3078967010261243e-05, | |
| "loss": 0.2907, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3092246651649475, | |
| "step": 2855, | |
| "valid_targets_mean": 6231.2, | |
| "valid_targets_min": 961 | |
| }, | |
| { | |
| "epoch": 4.568690095846645, | |
| "grad_norm": 0.40532255667081835, | |
| "learning_rate": 1.3004269634266357e-05, | |
| "loss": 0.2813, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27692824602127075, | |
| "step": 2860, | |
| "valid_targets_mean": 5605.0, | |
| "valid_targets_min": 2121 | |
| }, | |
| { | |
| "epoch": 4.57667731629393, | |
| "grad_norm": 0.38599469872887887, | |
| "learning_rate": 1.2929683282736135e-05, | |
| "loss": 0.2607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2656497359275818, | |
| "step": 2865, | |
| "valid_targets_mean": 6379.1, | |
| "valid_targets_min": 3304 | |
| }, | |
| { | |
| "epoch": 4.584664536741214, | |
| "grad_norm": 0.41171512306553953, | |
| "learning_rate": 1.285520913937969e-05, | |
| "loss": 0.2547, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24021103978157043, | |
| "step": 2870, | |
| "valid_targets_mean": 5186.9, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 4.592651757188499, | |
| "grad_norm": 0.39926320317556246, | |
| "learning_rate": 1.2780848386125354e-05, | |
| "loss": 0.29, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.265019953250885, | |
| "step": 2875, | |
| "valid_targets_mean": 5713.6, | |
| "valid_targets_min": 863 | |
| }, | |
| { | |
| "epoch": 4.600638977635783, | |
| "grad_norm": 0.38165899493837296, | |
| "learning_rate": 1.2706602203101913e-05, | |
| "loss": 0.265, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2591170072555542, | |
| "step": 2880, | |
| "valid_targets_mean": 5792.3, | |
| "valid_targets_min": 2570 | |
| }, | |
| { | |
| "epoch": 4.608626198083067, | |
| "grad_norm": 0.39333620978852973, | |
| "learning_rate": 1.2632471768619894e-05, | |
| "loss": 0.2656, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25593432784080505, | |
| "step": 2885, | |
| "valid_targets_mean": 5759.2, | |
| "valid_targets_min": 2492 | |
| }, | |
| { | |
| "epoch": 4.616613418530352, | |
| "grad_norm": 0.38583042244903354, | |
| "learning_rate": 1.255845825915286e-05, | |
| "loss": 0.2885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27156567573547363, | |
| "step": 2890, | |
| "valid_targets_mean": 5962.1, | |
| "valid_targets_min": 2493 | |
| }, | |
| { | |
| "epoch": 4.624600638977636, | |
| "grad_norm": 0.39088557473221014, | |
| "learning_rate": 1.248456284931873e-05, | |
| "loss": 0.2788, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3066374361515045, | |
| "step": 2895, | |
| "valid_targets_mean": 6906.6, | |
| "valid_targets_min": 3025 | |
| }, | |
| { | |
| "epoch": 4.63258785942492, | |
| "grad_norm": 0.4083196304388837, | |
| "learning_rate": 1.241078671186115e-05, | |
| "loss": 0.2775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27976199984550476, | |
| "step": 2900, | |
| "valid_targets_mean": 5486.9, | |
| "valid_targets_min": 2439 | |
| }, | |
| { | |
| "epoch": 4.640575079872205, | |
| "grad_norm": 0.4073650583555991, | |
| "learning_rate": 1.2337131017630873e-05, | |
| "loss": 0.2668, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3037996292114258, | |
| "step": 2905, | |
| "valid_targets_mean": 6225.2, | |
| "valid_targets_min": 2554 | |
| }, | |
| { | |
| "epoch": 4.6485623003194885, | |
| "grad_norm": 0.3872658386510121, | |
| "learning_rate": 1.2263596935567175e-05, | |
| "loss": 0.29, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27664124965667725, | |
| "step": 2910, | |
| "valid_targets_mean": 6432.6, | |
| "valid_targets_min": 2522 | |
| }, | |
| { | |
| "epoch": 4.656549520766773, | |
| "grad_norm": 0.4009065678417215, | |
| "learning_rate": 1.2190185632679316e-05, | |
| "loss": 0.2702, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26349592208862305, | |
| "step": 2915, | |
| "valid_targets_mean": 5652.6, | |
| "valid_targets_min": 623 | |
| }, | |
| { | |
| "epoch": 4.664536741214057, | |
| "grad_norm": 0.396845801704347, | |
| "learning_rate": 1.2116898274028008e-05, | |
| "loss": 0.2884, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2704426348209381, | |
| "step": 2920, | |
| "valid_targets_mean": 5868.4, | |
| "valid_targets_min": 2868 | |
| }, | |
| { | |
| "epoch": 4.672523961661342, | |
| "grad_norm": 0.3813900111199906, | |
| "learning_rate": 1.2043736022706934e-05, | |
| "loss": 0.2621, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26645517349243164, | |
| "step": 2925, | |
| "valid_targets_mean": 6337.1, | |
| "valid_targets_min": 723 | |
| }, | |
| { | |
| "epoch": 4.680511182108626, | |
| "grad_norm": 0.39393059528919255, | |
| "learning_rate": 1.1970700039824271e-05, | |
| "loss": 0.2583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2667585611343384, | |
| "step": 2930, | |
| "valid_targets_mean": 6091.6, | |
| "valid_targets_min": 2818 | |
| }, | |
| { | |
| "epoch": 4.68849840255591, | |
| "grad_norm": 0.42478161670319114, | |
| "learning_rate": 1.1897791484484284e-05, | |
| "loss": 0.2755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2721218764781952, | |
| "step": 2935, | |
| "valid_targets_mean": 5633.8, | |
| "valid_targets_min": 2868 | |
| }, | |
| { | |
| "epoch": 4.696485623003195, | |
| "grad_norm": 0.4190056099039446, | |
| "learning_rate": 1.182501151376893e-05, | |
| "loss": 0.2949, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3438076376914978, | |
| "step": 2940, | |
| "valid_targets_mean": 6943.8, | |
| "valid_targets_min": 2954 | |
| }, | |
| { | |
| "epoch": 4.704472843450479, | |
| "grad_norm": 0.41134082677798917, | |
| "learning_rate": 1.1752361282719472e-05, | |
| "loss": 0.2729, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.32966452836990356, | |
| "step": 2945, | |
| "valid_targets_mean": 6345.1, | |
| "valid_targets_min": 2179 | |
| }, | |
| { | |
| "epoch": 4.712460063897764, | |
| "grad_norm": 0.4056966401894177, | |
| "learning_rate": 1.1679841944318171e-05, | |
| "loss": 0.3067, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29474887251853943, | |
| "step": 2950, | |
| "valid_targets_mean": 5847.2, | |
| "valid_targets_min": 2319 | |
| }, | |
| { | |
| "epoch": 4.720447284345048, | |
| "grad_norm": 0.3989401711763822, | |
| "learning_rate": 1.1607454649469991e-05, | |
| "loss": 0.2686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30055108666419983, | |
| "step": 2955, | |
| "valid_targets_mean": 6679.1, | |
| "valid_targets_min": 2697 | |
| }, | |
| { | |
| "epoch": 4.728434504792332, | |
| "grad_norm": 0.7072946642972064, | |
| "learning_rate": 1.1535200546984291e-05, | |
| "loss": 0.2619, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2401805967092514, | |
| "step": 2960, | |
| "valid_targets_mean": 5509.9, | |
| "valid_targets_min": 2408 | |
| }, | |
| { | |
| "epoch": 4.736421725239617, | |
| "grad_norm": 0.41211337707644763, | |
| "learning_rate": 1.1463080783556663e-05, | |
| "loss": 0.2863, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25430363416671753, | |
| "step": 2965, | |
| "valid_targets_mean": 5854.0, | |
| "valid_targets_min": 665 | |
| }, | |
| { | |
| "epoch": 4.744408945686901, | |
| "grad_norm": 0.3962082557962024, | |
| "learning_rate": 1.139109650375066e-05, | |
| "loss": 0.2606, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2777406871318817, | |
| "step": 2970, | |
| "valid_targets_mean": 6056.5, | |
| "valid_targets_min": 2355 | |
| }, | |
| { | |
| "epoch": 4.752396166134186, | |
| "grad_norm": 0.3906084371922766, | |
| "learning_rate": 1.1319248849979705e-05, | |
| "loss": 0.2697, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24518491327762604, | |
| "step": 2975, | |
| "valid_targets_mean": 6053.1, | |
| "valid_targets_min": 2932 | |
| }, | |
| { | |
| "epoch": 4.76038338658147, | |
| "grad_norm": 0.3915629913152369, | |
| "learning_rate": 1.1247538962488882e-05, | |
| "loss": 0.2777, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2643856704235077, | |
| "step": 2980, | |
| "valid_targets_mean": 5882.1, | |
| "valid_targets_min": 808 | |
| }, | |
| { | |
| "epoch": 4.768370607028754, | |
| "grad_norm": 0.37476728452346414, | |
| "learning_rate": 1.1175967979336913e-05, | |
| "loss": 0.2756, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2582591474056244, | |
| "step": 2985, | |
| "valid_targets_mean": 5969.6, | |
| "valid_targets_min": 2098 | |
| }, | |
| { | |
| "epoch": 4.776357827476039, | |
| "grad_norm": 0.41123692648464333, | |
| "learning_rate": 1.1104537036378054e-05, | |
| "loss": 0.2802, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25756096839904785, | |
| "step": 2990, | |
| "valid_targets_mean": 5366.1, | |
| "valid_targets_min": 3005 | |
| }, | |
| { | |
| "epoch": 4.784345047923322, | |
| "grad_norm": 0.39968473555805045, | |
| "learning_rate": 1.1033247267244063e-05, | |
| "loss": 0.275, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2624298930168152, | |
| "step": 2995, | |
| "valid_targets_mean": 5830.2, | |
| "valid_targets_min": 2353 | |
| }, | |
| { | |
| "epoch": 4.792332268370607, | |
| "grad_norm": 0.40672603784124606, | |
| "learning_rate": 1.0962099803326257e-05, | |
| "loss": 0.2686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2500597834587097, | |
| "step": 3000, | |
| "valid_targets_mean": 6001.9, | |
| "valid_targets_min": 3125 | |
| }, | |
| { | |
| "epoch": 4.800319488817891, | |
| "grad_norm": 0.40413233843367646, | |
| "learning_rate": 1.089109577375748e-05, | |
| "loss": 0.2739, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25579679012298584, | |
| "step": 3005, | |
| "valid_targets_mean": 5744.9, | |
| "valid_targets_min": 2369 | |
| }, | |
| { | |
| "epoch": 4.8083067092651754, | |
| "grad_norm": 0.3921130603196491, | |
| "learning_rate": 1.082023630539427e-05, | |
| "loss": 0.2847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28473031520843506, | |
| "step": 3010, | |
| "valid_targets_mean": 6311.4, | |
| "valid_targets_min": 1836 | |
| }, | |
| { | |
| "epoch": 4.81629392971246, | |
| "grad_norm": 0.36503737491231003, | |
| "learning_rate": 1.07495225227989e-05, | |
| "loss": 0.2817, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2782742977142334, | |
| "step": 3015, | |
| "valid_targets_mean": 7165.0, | |
| "valid_targets_min": 2442 | |
| }, | |
| { | |
| "epoch": 4.824281150159744, | |
| "grad_norm": 0.3832633141885287, | |
| "learning_rate": 1.0678955548221595e-05, | |
| "loss": 0.2768, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2613392472267151, | |
| "step": 3020, | |
| "valid_targets_mean": 6698.9, | |
| "valid_targets_min": 2716 | |
| }, | |
| { | |
| "epoch": 4.832268370607029, | |
| "grad_norm": 0.43490464386224226, | |
| "learning_rate": 1.0608536501582654e-05, | |
| "loss": 0.2662, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2747645378112793, | |
| "step": 3025, | |
| "valid_targets_mean": 5440.4, | |
| "valid_targets_min": 2650 | |
| }, | |
| { | |
| "epoch": 4.840255591054313, | |
| "grad_norm": 0.4096138056908477, | |
| "learning_rate": 1.0538266500454739e-05, | |
| "loss": 0.2706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2703027129173279, | |
| "step": 3030, | |
| "valid_targets_mean": 5908.4, | |
| "valid_targets_min": 2512 | |
| }, | |
| { | |
| "epoch": 4.848242811501597, | |
| "grad_norm": 0.3865314876439589, | |
| "learning_rate": 1.0468146660045118e-05, | |
| "loss": 0.2589, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2619239091873169, | |
| "step": 3035, | |
| "valid_targets_mean": 6681.9, | |
| "valid_targets_min": 2521 | |
| }, | |
| { | |
| "epoch": 4.856230031948882, | |
| "grad_norm": 0.4014164554757156, | |
| "learning_rate": 1.0398178093177928e-05, | |
| "loss": 0.2436, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2636415958404541, | |
| "step": 3040, | |
| "valid_targets_mean": 5910.8, | |
| "valid_targets_min": 2823 | |
| }, | |
| { | |
| "epoch": 4.864217252396166, | |
| "grad_norm": 0.41346445358468975, | |
| "learning_rate": 1.0328361910276592e-05, | |
| "loss": 0.2901, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3118521571159363, | |
| "step": 3045, | |
| "valid_targets_mean": 6069.5, | |
| "valid_targets_min": 864 | |
| }, | |
| { | |
| "epoch": 4.872204472843451, | |
| "grad_norm": 0.4145719084507319, | |
| "learning_rate": 1.0258699219346091e-05, | |
| "loss": 0.2559, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.250371515750885, | |
| "step": 3050, | |
| "valid_targets_mean": 5769.2, | |
| "valid_targets_min": 3176 | |
| }, | |
| { | |
| "epoch": 4.880191693290735, | |
| "grad_norm": 0.39688422964095726, | |
| "learning_rate": 1.0189191125955503e-05, | |
| "loss": 0.2507, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27627962827682495, | |
| "step": 3055, | |
| "valid_targets_mean": 5919.9, | |
| "valid_targets_min": 3616 | |
| }, | |
| { | |
| "epoch": 4.888178913738019, | |
| "grad_norm": 0.4718062560973942, | |
| "learning_rate": 1.0119838733220342e-05, | |
| "loss": 0.2814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30140748620033264, | |
| "step": 3060, | |
| "valid_targets_mean": 5039.1, | |
| "valid_targets_min": 2010 | |
| }, | |
| { | |
| "epoch": 4.896166134185304, | |
| "grad_norm": 0.402246162085842, | |
| "learning_rate": 1.0050643141785148e-05, | |
| "loss": 0.2873, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2798316776752472, | |
| "step": 3065, | |
| "valid_targets_mean": 6281.8, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 4.904153354632588, | |
| "grad_norm": 0.3822320161668385, | |
| "learning_rate": 9.981605449805933e-06, | |
| "loss": 0.2726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28229570388793945, | |
| "step": 3070, | |
| "valid_targets_mean": 6582.2, | |
| "valid_targets_min": 2667 | |
| }, | |
| { | |
| "epoch": 4.912140575079873, | |
| "grad_norm": 0.42983190412817457, | |
| "learning_rate": 9.912726752932823e-06, | |
| "loss": 0.291, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3040129840373993, | |
| "step": 3075, | |
| "valid_targets_mean": 6080.6, | |
| "valid_targets_min": 2682 | |
| }, | |
| { | |
| "epoch": 4.920127795527156, | |
| "grad_norm": 0.4177548860188241, | |
| "learning_rate": 9.844008144292643e-06, | |
| "loss": 0.2704, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2931748032569885, | |
| "step": 3080, | |
| "valid_targets_mean": 5857.6, | |
| "valid_targets_min": 2919 | |
| }, | |
| { | |
| "epoch": 4.928115015974441, | |
| "grad_norm": 0.4290855665464435, | |
| "learning_rate": 9.775450714471537e-06, | |
| "loss": 0.272, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2496170699596405, | |
| "step": 3085, | |
| "valid_targets_mean": 5240.9, | |
| "valid_targets_min": 2756 | |
| }, | |
| { | |
| "epoch": 4.936102236421725, | |
| "grad_norm": 0.8863118534559026, | |
| "learning_rate": 9.707055551497734e-06, | |
| "loss": 0.2791, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3124041259288788, | |
| "step": 3090, | |
| "valid_targets_mean": 6965.9, | |
| "valid_targets_min": 2968 | |
| }, | |
| { | |
| "epoch": 4.944089456869009, | |
| "grad_norm": 0.3912009602868405, | |
| "learning_rate": 9.638823740824177e-06, | |
| "loss": 0.2782, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27026793360710144, | |
| "step": 3095, | |
| "valid_targets_mean": 5969.6, | |
| "valid_targets_min": 2602 | |
| }, | |
| { | |
| "epoch": 4.952076677316294, | |
| "grad_norm": 0.39127362584582165, | |
| "learning_rate": 9.570756365311418e-06, | |
| "loss": 0.2837, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30117690563201904, | |
| "step": 3100, | |
| "valid_targets_mean": 6593.9, | |
| "valid_targets_min": 3020 | |
| }, | |
| { | |
| "epoch": 4.960063897763578, | |
| "grad_norm": 0.37385552091108576, | |
| "learning_rate": 9.502854505210326e-06, | |
| "loss": 0.2689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2801275849342346, | |
| "step": 3105, | |
| "valid_targets_mean": 6445.2, | |
| "valid_targets_min": 2751 | |
| }, | |
| { | |
| "epoch": 4.968051118210862, | |
| "grad_norm": 0.4239526444198297, | |
| "learning_rate": 9.435119238145018e-06, | |
| "loss": 0.2557, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30280107259750366, | |
| "step": 3110, | |
| "valid_targets_mean": 5772.2, | |
| "valid_targets_min": 2067 | |
| }, | |
| { | |
| "epoch": 4.976038338658147, | |
| "grad_norm": 0.4076703740667104, | |
| "learning_rate": 9.367551639095704e-06, | |
| "loss": 0.2737, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2898576855659485, | |
| "step": 3115, | |
| "valid_targets_mean": 6491.9, | |
| "valid_targets_min": 2165 | |
| }, | |
| { | |
| "epoch": 4.984025559105431, | |
| "grad_norm": 0.4149968613729256, | |
| "learning_rate": 9.300152780381664e-06, | |
| "loss": 0.2771, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30020564794540405, | |
| "step": 3120, | |
| "valid_targets_mean": 6070.2, | |
| "valid_targets_min": 2849 | |
| }, | |
| { | |
| "epoch": 4.992012779552716, | |
| "grad_norm": 0.3952670169134852, | |
| "learning_rate": 9.232923731644216e-06, | |
| "loss": 0.2698, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2594717741012573, | |
| "step": 3125, | |
| "valid_targets_mean": 6255.2, | |
| "valid_targets_min": 2401 | |
| }, | |
| { | |
| "epoch": 5.0, | |
| "grad_norm": 0.3795767439064435, | |
| "learning_rate": 9.16586555982972e-06, | |
| "loss": 0.2827, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3044956922531128, | |
| "step": 3130, | |
| "valid_targets_mean": 6953.6, | |
| "valid_targets_min": 3270 | |
| }, | |
| { | |
| "epoch": 5.007987220447284, | |
| "grad_norm": 0.37855586072860214, | |
| "learning_rate": 9.098979329172702e-06, | |
| "loss": 0.2696, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25174397230148315, | |
| "step": 3135, | |
| "valid_targets_mean": 5779.1, | |
| "valid_targets_min": 1923 | |
| }, | |
| { | |
| "epoch": 5.015974440894569, | |
| "grad_norm": 0.41253031288002934, | |
| "learning_rate": 9.032266101178872e-06, | |
| "loss": 0.2413, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22522510588169098, | |
| "step": 3140, | |
| "valid_targets_mean": 5078.2, | |
| "valid_targets_min": 2251 | |
| }, | |
| { | |
| "epoch": 5.023961661341853, | |
| "grad_norm": 0.4073917604822317, | |
| "learning_rate": 8.965726934608392e-06, | |
| "loss": 0.2751, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27434486150741577, | |
| "step": 3145, | |
| "valid_targets_mean": 6213.6, | |
| "valid_targets_min": 654 | |
| }, | |
| { | |
| "epoch": 5.031948881789138, | |
| "grad_norm": 0.42063342826897976, | |
| "learning_rate": 8.899362885458964e-06, | |
| "loss": 0.2855, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3078088164329529, | |
| "step": 3150, | |
| "valid_targets_mean": 6110.6, | |
| "valid_targets_min": 2389 | |
| }, | |
| { | |
| "epoch": 5.039936102236422, | |
| "grad_norm": 0.4241712530830461, | |
| "learning_rate": 8.833175006949165e-06, | |
| "loss": 0.2759, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29483354091644287, | |
| "step": 3155, | |
| "valid_targets_mean": 6049.3, | |
| "valid_targets_min": 1979 | |
| }, | |
| { | |
| "epoch": 5.047923322683706, | |
| "grad_norm": 0.41704528400266916, | |
| "learning_rate": 8.767164349501648e-06, | |
| "loss": 0.2624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25017309188842773, | |
| "step": 3160, | |
| "valid_targets_mean": 6135.2, | |
| "valid_targets_min": 2789 | |
| }, | |
| { | |
| "epoch": 5.055910543130991, | |
| "grad_norm": 0.4244743848270066, | |
| "learning_rate": 8.70133196072654e-06, | |
| "loss": 0.2641, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.266876757144928, | |
| "step": 3165, | |
| "valid_targets_mean": 5422.1, | |
| "valid_targets_min": 2036 | |
| }, | |
| { | |
| "epoch": 5.063897763578275, | |
| "grad_norm": 0.38689486271620444, | |
| "learning_rate": 8.63567888540478e-06, | |
| "loss": 0.2714, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2700386047363281, | |
| "step": 3170, | |
| "valid_targets_mean": 6671.2, | |
| "valid_targets_min": 2874 | |
| }, | |
| { | |
| "epoch": 5.0718849840255595, | |
| "grad_norm": 0.41674426334241815, | |
| "learning_rate": 8.570206165471535e-06, | |
| "loss": 0.2877, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27029523253440857, | |
| "step": 3175, | |
| "valid_targets_mean": 5963.7, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 5.079872204472843, | |
| "grad_norm": 0.37992809719699144, | |
| "learning_rate": 8.504914839999691e-06, | |
| "loss": 0.2628, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26301029324531555, | |
| "step": 3180, | |
| "valid_targets_mean": 6505.4, | |
| "valid_targets_min": 2242 | |
| }, | |
| { | |
| "epoch": 5.087859424920127, | |
| "grad_norm": 0.37537580056476105, | |
| "learning_rate": 8.439805945183333e-06, | |
| "loss": 0.277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26608848571777344, | |
| "step": 3185, | |
| "valid_targets_mean": 6537.8, | |
| "valid_targets_min": 2068 | |
| }, | |
| { | |
| "epoch": 5.095846645367412, | |
| "grad_norm": 0.38900240852049334, | |
| "learning_rate": 8.374880514321318e-06, | |
| "loss": 0.2675, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25961828231811523, | |
| "step": 3190, | |
| "valid_targets_mean": 6352.6, | |
| "valid_targets_min": 2568 | |
| }, | |
| { | |
| "epoch": 5.103833865814696, | |
| "grad_norm": 0.39522074438243243, | |
| "learning_rate": 8.310139577800864e-06, | |
| "loss": 0.2735, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2567019462585449, | |
| "step": 3195, | |
| "valid_targets_mean": 6079.7, | |
| "valid_targets_min": 3813 | |
| }, | |
| { | |
| "epoch": 5.111821086261981, | |
| "grad_norm": 0.4415565404330811, | |
| "learning_rate": 8.245584163081228e-06, | |
| "loss": 0.2759, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2831454575061798, | |
| "step": 3200, | |
| "valid_targets_mean": 6036.4, | |
| "valid_targets_min": 2460 | |
| }, | |
| { | |
| "epoch": 5.119808306709265, | |
| "grad_norm": 0.4074884074177118, | |
| "learning_rate": 8.18121529467735e-06, | |
| "loss": 0.257, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24731406569480896, | |
| "step": 3205, | |
| "valid_targets_mean": 5382.1, | |
| "valid_targets_min": 2386 | |
| }, | |
| { | |
| "epoch": 5.127795527156549, | |
| "grad_norm": 0.43875395540030476, | |
| "learning_rate": 8.117033994143635e-06, | |
| "loss": 0.2859, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2998838722705841, | |
| "step": 3210, | |
| "valid_targets_mean": 5511.3, | |
| "valid_targets_min": 810 | |
| }, | |
| { | |
| "epoch": 5.135782747603834, | |
| "grad_norm": 0.40671735232126843, | |
| "learning_rate": 8.053041280057733e-06, | |
| "loss": 0.256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24320551753044128, | |
| "step": 3215, | |
| "valid_targets_mean": 5451.4, | |
| "valid_targets_min": 834 | |
| }, | |
| { | |
| "epoch": 5.143769968051118, | |
| "grad_norm": 0.403021328640831, | |
| "learning_rate": 7.989238168004347e-06, | |
| "loss": 0.2635, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2588720917701721, | |
| "step": 3220, | |
| "valid_targets_mean": 6049.2, | |
| "valid_targets_min": 2745 | |
| }, | |
| { | |
| "epoch": 5.151757188498403, | |
| "grad_norm": 0.4049253690849904, | |
| "learning_rate": 7.925625670559158e-06, | |
| "loss": 0.2706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.269950807094574, | |
| "step": 3225, | |
| "valid_targets_mean": 6281.1, | |
| "valid_targets_min": 2666 | |
| }, | |
| { | |
| "epoch": 5.159744408945687, | |
| "grad_norm": 0.40905288117332234, | |
| "learning_rate": 7.862204797272716e-06, | |
| "loss": 0.2788, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31759244203567505, | |
| "step": 3230, | |
| "valid_targets_mean": 6549.2, | |
| "valid_targets_min": 3081 | |
| }, | |
| { | |
| "epoch": 5.167731629392971, | |
| "grad_norm": 0.4382554965693554, | |
| "learning_rate": 7.798976554654438e-06, | |
| "loss": 0.2757, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25907617807388306, | |
| "step": 3235, | |
| "valid_targets_mean": 4957.9, | |
| "valid_targets_min": 897 | |
| }, | |
| { | |
| "epoch": 5.175718849840256, | |
| "grad_norm": 0.42627441723420223, | |
| "learning_rate": 7.73594194615663e-06, | |
| "loss": 0.2593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2848866283893585, | |
| "step": 3240, | |
| "valid_targets_mean": 5505.1, | |
| "valid_targets_min": 3428 | |
| }, | |
| { | |
| "epoch": 5.18370607028754, | |
| "grad_norm": 0.4194849196058544, | |
| "learning_rate": 7.673101972158572e-06, | |
| "loss": 0.2554, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24621835350990295, | |
| "step": 3245, | |
| "valid_targets_mean": 5243.2, | |
| "valid_targets_min": 1998 | |
| }, | |
| { | |
| "epoch": 5.1916932907348246, | |
| "grad_norm": 0.3670903406950453, | |
| "learning_rate": 7.610457629950621e-06, | |
| "loss": 0.2569, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27662086486816406, | |
| "step": 3250, | |
| "valid_targets_mean": 7332.8, | |
| "valid_targets_min": 2188 | |
| }, | |
| { | |
| "epoch": 5.199680511182109, | |
| "grad_norm": 0.43543591727939973, | |
| "learning_rate": 7.548009913718402e-06, | |
| "loss": 0.2702, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21813823282718658, | |
| "step": 3255, | |
| "valid_targets_mean": 4860.3, | |
| "valid_targets_min": 612 | |
| }, | |
| { | |
| "epoch": 5.207667731629393, | |
| "grad_norm": 0.41673312799313406, | |
| "learning_rate": 7.485759814527034e-06, | |
| "loss": 0.2813, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.251434862613678, | |
| "step": 3260, | |
| "valid_targets_mean": 5452.9, | |
| "valid_targets_min": 2490 | |
| }, | |
| { | |
| "epoch": 5.215654952076678, | |
| "grad_norm": 0.3861351427182616, | |
| "learning_rate": 7.423708320305361e-06, | |
| "loss": 0.2813, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26638728380203247, | |
| "step": 3265, | |
| "valid_targets_mean": 6605.4, | |
| "valid_targets_min": 2430 | |
| }, | |
| { | |
| "epoch": 5.223642172523961, | |
| "grad_norm": 0.39870413308501174, | |
| "learning_rate": 7.361856415830335e-06, | |
| "loss": 0.2769, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30399465560913086, | |
| "step": 3270, | |
| "valid_targets_mean": 7056.1, | |
| "valid_targets_min": 712 | |
| }, | |
| { | |
| "epoch": 5.231629392971246, | |
| "grad_norm": 0.3766841365535545, | |
| "learning_rate": 7.300205082711336e-06, | |
| "loss": 0.2833, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33117106556892395, | |
| "step": 3275, | |
| "valid_targets_mean": 8182.8, | |
| "valid_targets_min": 4238 | |
| }, | |
| { | |
| "epoch": 5.23961661341853, | |
| "grad_norm": 0.41449496272925274, | |
| "learning_rate": 7.2387552993746204e-06, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29419130086898804, | |
| "step": 3280, | |
| "valid_targets_mean": 6567.6, | |
| "valid_targets_min": 864 | |
| }, | |
| { | |
| "epoch": 5.247603833865814, | |
| "grad_norm": 0.4086294372206149, | |
| "learning_rate": 7.177508041047769e-06, | |
| "loss": 0.277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2674436569213867, | |
| "step": 3285, | |
| "valid_targets_mean": 5939.1, | |
| "valid_targets_min": 1410 | |
| }, | |
| { | |
| "epoch": 5.255591054313099, | |
| "grad_norm": 0.40996425388769214, | |
| "learning_rate": 7.116464279744262e-06, | |
| "loss": 0.2643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26989826560020447, | |
| "step": 3290, | |
| "valid_targets_mean": 5547.9, | |
| "valid_targets_min": 888 | |
| }, | |
| { | |
| "epoch": 5.263578274760383, | |
| "grad_norm": 0.440326183564263, | |
| "learning_rate": 7.055624984247977e-06, | |
| "loss": 0.2786, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28565216064453125, | |
| "step": 3295, | |
| "valid_targets_mean": 5991.4, | |
| "valid_targets_min": 559 | |
| }, | |
| { | |
| "epoch": 5.271565495207668, | |
| "grad_norm": 0.4063890775733396, | |
| "learning_rate": 6.9949911200978805e-06, | |
| "loss": 0.2715, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26227593421936035, | |
| "step": 3300, | |
| "valid_targets_mean": 5691.9, | |
| "valid_targets_min": 3232 | |
| }, | |
| { | |
| "epoch": 5.279552715654952, | |
| "grad_norm": 0.405481768918501, | |
| "learning_rate": 6.934563649572678e-06, | |
| "loss": 0.2608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28821951150894165, | |
| "step": 3305, | |
| "valid_targets_mean": 6222.0, | |
| "valid_targets_min": 2290 | |
| }, | |
| { | |
| "epoch": 5.287539936102236, | |
| "grad_norm": 0.4239074476651318, | |
| "learning_rate": 6.874343531675521e-06, | |
| "loss": 0.2815, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.296866238117218, | |
| "step": 3310, | |
| "valid_targets_mean": 6222.3, | |
| "valid_targets_min": 2649 | |
| }, | |
| { | |
| "epoch": 5.295527156549521, | |
| "grad_norm": 0.40738621182481205, | |
| "learning_rate": 6.814331722118837e-06, | |
| "loss": 0.2723, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2828754782676697, | |
| "step": 3315, | |
| "valid_targets_mean": 6422.0, | |
| "valid_targets_min": 3552 | |
| }, | |
| { | |
| "epoch": 5.303514376996805, | |
| "grad_norm": 0.41930033097718117, | |
| "learning_rate": 6.754529173309112e-06, | |
| "loss": 0.27, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25146254897117615, | |
| "step": 3320, | |
| "valid_targets_mean": 5031.1, | |
| "valid_targets_min": 876 | |
| }, | |
| { | |
| "epoch": 5.31150159744409, | |
| "grad_norm": 0.40050805330795236, | |
| "learning_rate": 6.694936834331809e-06, | |
| "loss": 0.278, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2962872385978699, | |
| "step": 3325, | |
| "valid_targets_mean": 6703.7, | |
| "valid_targets_min": 2667 | |
| }, | |
| { | |
| "epoch": 5.319488817891374, | |
| "grad_norm": 0.40757274673979366, | |
| "learning_rate": 6.635555650936278e-06, | |
| "loss": 0.2559, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25634026527404785, | |
| "step": 3330, | |
| "valid_targets_mean": 6424.2, | |
| "valid_targets_min": 2907 | |
| }, | |
| { | |
| "epoch": 5.327476038338658, | |
| "grad_norm": 0.41459583057249944, | |
| "learning_rate": 6.576386565520794e-06, | |
| "loss": 0.2715, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27021244168281555, | |
| "step": 3335, | |
| "valid_targets_mean": 6251.3, | |
| "valid_targets_min": 2611 | |
| }, | |
| { | |
| "epoch": 5.335463258785943, | |
| "grad_norm": 0.42845334639523325, | |
| "learning_rate": 6.5174305171175336e-06, | |
| "loss": 0.265, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26541537046432495, | |
| "step": 3340, | |
| "valid_targets_mean": 5584.6, | |
| "valid_targets_min": 1372 | |
| }, | |
| { | |
| "epoch": 5.343450479233227, | |
| "grad_norm": 0.4330904919198064, | |
| "learning_rate": 6.458688441377734e-06, | |
| "loss": 0.2863, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2650597095489502, | |
| "step": 3345, | |
| "valid_targets_mean": 5421.7, | |
| "valid_targets_min": 2309 | |
| }, | |
| { | |
| "epoch": 5.3514376996805115, | |
| "grad_norm": 0.4023604168120322, | |
| "learning_rate": 6.400161270556817e-06, | |
| "loss": 0.2789, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27077850699424744, | |
| "step": 3350, | |
| "valid_targets_mean": 5968.1, | |
| "valid_targets_min": 3619 | |
| }, | |
| { | |
| "epoch": 5.359424920127796, | |
| "grad_norm": 0.41594972924632195, | |
| "learning_rate": 6.341849933499573e-06, | |
| "loss": 0.2747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2645888924598694, | |
| "step": 3355, | |
| "valid_targets_mean": 5354.4, | |
| "valid_targets_min": 2518 | |
| }, | |
| { | |
| "epoch": 5.36741214057508, | |
| "grad_norm": 0.4158229352039209, | |
| "learning_rate": 6.283755355625472e-06, | |
| "loss": 0.2762, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29557183384895325, | |
| "step": 3360, | |
| "valid_targets_mean": 6036.9, | |
| "valid_targets_min": 612 | |
| }, | |
| { | |
| "epoch": 5.375399361022364, | |
| "grad_norm": 0.4033668357397428, | |
| "learning_rate": 6.225878458913917e-06, | |
| "loss": 0.2817, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2787281274795532, | |
| "step": 3365, | |
| "valid_targets_mean": 6356.4, | |
| "valid_targets_min": 2991 | |
| }, | |
| { | |
| "epoch": 5.383386581469648, | |
| "grad_norm": 0.3708761235200844, | |
| "learning_rate": 6.168220161889678e-06, | |
| "loss": 0.2525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2597508430480957, | |
| "step": 3370, | |
| "valid_targets_mean": 6750.4, | |
| "valid_targets_min": 3810 | |
| }, | |
| { | |
| "epoch": 5.391373801916933, | |
| "grad_norm": 0.3926327188011646, | |
| "learning_rate": 6.110781379608226e-06, | |
| "loss": 0.2606, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30121123790740967, | |
| "step": 3375, | |
| "valid_targets_mean": 7324.6, | |
| "valid_targets_min": 2986 | |
| }, | |
| { | |
| "epoch": 5.399361022364217, | |
| "grad_norm": 0.4120172449770398, | |
| "learning_rate": 6.053563023641318e-06, | |
| "loss": 0.2484, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20111224055290222, | |
| "step": 3380, | |
| "valid_targets_mean": 5221.2, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 5.407348242811501, | |
| "grad_norm": 0.42388002723382, | |
| "learning_rate": 5.996566002062439e-06, | |
| "loss": 0.2808, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3406032919883728, | |
| "step": 3385, | |
| "valid_targets_mean": 7409.2, | |
| "valid_targets_min": 2649 | |
| }, | |
| { | |
| "epoch": 5.415335463258786, | |
| "grad_norm": 0.387704334465912, | |
| "learning_rate": 5.939791219432442e-06, | |
| "loss": 0.2609, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23108546435832977, | |
| "step": 3390, | |
| "valid_targets_mean": 5793.6, | |
| "valid_targets_min": 3281 | |
| }, | |
| { | |
| "epoch": 5.42332268370607, | |
| "grad_norm": 0.42616198263495114, | |
| "learning_rate": 5.8832395767851846e-06, | |
| "loss": 0.2593, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2731560468673706, | |
| "step": 3395, | |
| "valid_targets_mean": 5060.1, | |
| "valid_targets_min": 2181 | |
| }, | |
| { | |
| "epoch": 5.431309904153355, | |
| "grad_norm": 0.5142404072671888, | |
| "learning_rate": 5.826911971613198e-06, | |
| "loss": 0.2586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24271123111248016, | |
| "step": 3400, | |
| "valid_targets_mean": 6266.3, | |
| "valid_targets_min": 2733 | |
| }, | |
| { | |
| "epoch": 5.439297124600639, | |
| "grad_norm": 0.40530464013425294, | |
| "learning_rate": 5.770809297853495e-06, | |
| "loss": 0.2811, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27602797746658325, | |
| "step": 3405, | |
| "valid_targets_mean": 6693.9, | |
| "valid_targets_min": 3209 | |
| }, | |
| { | |
| "epoch": 5.447284345047923, | |
| "grad_norm": 0.401008415625418, | |
| "learning_rate": 5.714932445873325e-06, | |
| "loss": 0.2639, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2681991755962372, | |
| "step": 3410, | |
| "valid_targets_mean": 5763.9, | |
| "valid_targets_min": 3035 | |
| }, | |
| { | |
| "epoch": 5.455271565495208, | |
| "grad_norm": 0.38838536917803695, | |
| "learning_rate": 5.659282302456115e-06, | |
| "loss": 0.2684, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2287471443414688, | |
| "step": 3415, | |
| "valid_targets_mean": 5909.8, | |
| "valid_targets_min": 2017 | |
| }, | |
| { | |
| "epoch": 5.463258785942492, | |
| "grad_norm": 0.4152732525215406, | |
| "learning_rate": 5.603859750787302e-06, | |
| "loss": 0.2627, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2477230578660965, | |
| "step": 3420, | |
| "valid_targets_mean": 5648.4, | |
| "valid_targets_min": 2150 | |
| }, | |
| { | |
| "epoch": 5.4712460063897765, | |
| "grad_norm": 0.3752950310689603, | |
| "learning_rate": 5.548665670440418e-06, | |
| "loss": 0.2816, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2774547040462494, | |
| "step": 3425, | |
| "valid_targets_mean": 6946.4, | |
| "valid_targets_min": 815 | |
| }, | |
| { | |
| "epoch": 5.479233226837061, | |
| "grad_norm": 0.4541071980541953, | |
| "learning_rate": 5.4937009373630535e-06, | |
| "loss": 0.2726, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2792585492134094, | |
| "step": 3430, | |
| "valid_targets_mean": 6652.8, | |
| "valid_targets_min": 2717 | |
| }, | |
| { | |
| "epoch": 5.487220447284345, | |
| "grad_norm": 0.40156002324892504, | |
| "learning_rate": 5.438966423862997e-06, | |
| "loss": 0.277, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2652904987335205, | |
| "step": 3435, | |
| "valid_targets_mean": 6201.8, | |
| "valid_targets_min": 3902 | |
| }, | |
| { | |
| "epoch": 5.49520766773163, | |
| "grad_norm": 0.648954919675528, | |
| "learning_rate": 5.384462998594384e-06, | |
| "loss": 0.279, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2893713116645813, | |
| "step": 3440, | |
| "valid_targets_mean": 6348.4, | |
| "valid_targets_min": 585 | |
| }, | |
| { | |
| "epoch": 5.503194888178914, | |
| "grad_norm": 0.3897876669054615, | |
| "learning_rate": 5.330191526543884e-06, | |
| "loss": 0.2643, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23646822571754456, | |
| "step": 3445, | |
| "valid_targets_mean": 5237.7, | |
| "valid_targets_min": 2488 | |
| }, | |
| { | |
| "epoch": 5.511182108626198, | |
| "grad_norm": 0.37347776243725006, | |
| "learning_rate": 5.276152869017028e-06, | |
| "loss": 0.2713, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27135783433914185, | |
| "step": 3450, | |
| "valid_targets_mean": 6863.1, | |
| "valid_targets_min": 3669 | |
| }, | |
| { | |
| "epoch": 5.519169329073483, | |
| "grad_norm": 0.5171980367484461, | |
| "learning_rate": 5.22234788362447e-06, | |
| "loss": 0.261, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2549384832382202, | |
| "step": 3455, | |
| "valid_targets_mean": 5262.7, | |
| "valid_targets_min": 995 | |
| }, | |
| { | |
| "epoch": 5.527156549520766, | |
| "grad_norm": 0.38848096026401324, | |
| "learning_rate": 5.168777424268454e-06, | |
| "loss": 0.2634, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2879067659378052, | |
| "step": 3460, | |
| "valid_targets_mean": 6928.7, | |
| "valid_targets_min": 3091 | |
| }, | |
| { | |
| "epoch": 5.535143769968051, | |
| "grad_norm": 0.39172238709172325, | |
| "learning_rate": 5.115442341129171e-06, | |
| "loss": 0.2525, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24223783612251282, | |
| "step": 3465, | |
| "valid_targets_mean": 6096.4, | |
| "valid_targets_min": 3169 | |
| }, | |
| { | |
| "epoch": 5.543130990415335, | |
| "grad_norm": 0.38074183780196175, | |
| "learning_rate": 5.062343480651364e-06, | |
| "loss": 0.2742, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27156221866607666, | |
| "step": 3470, | |
| "valid_targets_mean": 6334.4, | |
| "valid_targets_min": 2319 | |
| }, | |
| { | |
| "epoch": 5.55111821086262, | |
| "grad_norm": 0.41297146376784594, | |
| "learning_rate": 5.009481685530817e-06, | |
| "loss": 0.2755, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.284872829914093, | |
| "step": 3475, | |
| "valid_targets_mean": 6020.4, | |
| "valid_targets_min": 2179 | |
| }, | |
| { | |
| "epoch": 5.559105431309904, | |
| "grad_norm": 0.3934923387326401, | |
| "learning_rate": 4.956857794701026e-06, | |
| "loss": 0.2642, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2739129066467285, | |
| "step": 3480, | |
| "valid_targets_mean": 6693.8, | |
| "valid_targets_min": 2268 | |
| }, | |
| { | |
| "epoch": 5.567092651757188, | |
| "grad_norm": 0.38310351759296046, | |
| "learning_rate": 4.904472643319873e-06, | |
| "loss": 0.2764, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2645479738712311, | |
| "step": 3485, | |
| "valid_targets_mean": 6612.6, | |
| "valid_targets_min": 768 | |
| }, | |
| { | |
| "epoch": 5.575079872204473, | |
| "grad_norm": 0.3876539214594446, | |
| "learning_rate": 4.852327062756352e-06, | |
| "loss": 0.2701, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27958667278289795, | |
| "step": 3490, | |
| "valid_targets_mean": 6435.6, | |
| "valid_targets_min": 2510 | |
| }, | |
| { | |
| "epoch": 5.583067092651757, | |
| "grad_norm": 0.4097876889940266, | |
| "learning_rate": 4.800421880577411e-06, | |
| "loss": 0.27, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24855288863182068, | |
| "step": 3495, | |
| "valid_targets_mean": 6263.6, | |
| "valid_targets_min": 3591 | |
| }, | |
| { | |
| "epoch": 5.5910543130990416, | |
| "grad_norm": 0.3864010358073123, | |
| "learning_rate": 4.748757920534779e-06, | |
| "loss": 0.2575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24530455470085144, | |
| "step": 3500, | |
| "valid_targets_mean": 6208.0, | |
| "valid_targets_min": 2603 | |
| }, | |
| { | |
| "epoch": 5.599041533546326, | |
| "grad_norm": 0.40655201847325134, | |
| "learning_rate": 4.697336002551947e-06, | |
| "loss": 0.2659, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2666792869567871, | |
| "step": 3505, | |
| "valid_targets_mean": 5832.1, | |
| "valid_targets_min": 3437 | |
| }, | |
| { | |
| "epoch": 5.60702875399361, | |
| "grad_norm": 0.40435890744575836, | |
| "learning_rate": 4.6461569427110684e-06, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29509562253952026, | |
| "step": 3510, | |
| "valid_targets_mean": 6046.1, | |
| "valid_targets_min": 2725 | |
| }, | |
| { | |
| "epoch": 5.615015974440895, | |
| "grad_norm": 0.4183587223374008, | |
| "learning_rate": 4.5952215532401146e-06, | |
| "loss": 0.2461, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24413025379180908, | |
| "step": 3515, | |
| "valid_targets_mean": 5483.4, | |
| "valid_targets_min": 3196 | |
| }, | |
| { | |
| "epoch": 5.623003194888179, | |
| "grad_norm": 0.4154547399295291, | |
| "learning_rate": 4.544530642499894e-06, | |
| "loss": 0.2679, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2898193895816803, | |
| "step": 3520, | |
| "valid_targets_mean": 6051.2, | |
| "valid_targets_min": 2925 | |
| }, | |
| { | |
| "epoch": 5.6309904153354635, | |
| "grad_norm": 0.42403141467359734, | |
| "learning_rate": 4.4940850149712765e-06, | |
| "loss": 0.2509, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25216978788375854, | |
| "step": 3525, | |
| "valid_targets_mean": 5015.3, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 5.638977635782748, | |
| "grad_norm": 0.41034855152724853, | |
| "learning_rate": 4.443885471242418e-06, | |
| "loss": 0.2738, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26855704188346863, | |
| "step": 3530, | |
| "valid_targets_mean": 6189.1, | |
| "valid_targets_min": 2442 | |
| }, | |
| { | |
| "epoch": 5.646964856230032, | |
| "grad_norm": 0.3936414247032374, | |
| "learning_rate": 4.393932807996017e-06, | |
| "loss": 0.2465, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25683677196502686, | |
| "step": 3535, | |
| "valid_targets_mean": 6258.0, | |
| "valid_targets_min": 2661 | |
| }, | |
| { | |
| "epoch": 5.654952076677317, | |
| "grad_norm": 0.4001042562019606, | |
| "learning_rate": 4.344227817996735e-06, | |
| "loss": 0.2583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2364477813243866, | |
| "step": 3540, | |
| "valid_targets_mean": 5464.9, | |
| "valid_targets_min": 3055 | |
| }, | |
| { | |
| "epoch": 5.6629392971246, | |
| "grad_norm": 0.3826420693221661, | |
| "learning_rate": 4.294771290078548e-06, | |
| "loss": 0.2467, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27298253774642944, | |
| "step": 3545, | |
| "valid_targets_mean": 6495.2, | |
| "valid_targets_min": 2756 | |
| }, | |
| { | |
| "epoch": 5.6709265175718855, | |
| "grad_norm": 0.37839555654351587, | |
| "learning_rate": 4.245564009132293e-06, | |
| "loss": 0.2876, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27785027027130127, | |
| "step": 3550, | |
| "valid_targets_mean": 6781.5, | |
| "valid_targets_min": 678 | |
| }, | |
| { | |
| "epoch": 5.678913738019169, | |
| "grad_norm": 0.40034452570425777, | |
| "learning_rate": 4.196606756093138e-06, | |
| "loss": 0.2503, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26036790013313293, | |
| "step": 3555, | |
| "valid_targets_mean": 6464.2, | |
| "valid_targets_min": 3082 | |
| }, | |
| { | |
| "epoch": 5.686900958466453, | |
| "grad_norm": 0.4127138508381542, | |
| "learning_rate": 4.147900307928268e-06, | |
| "loss": 0.2674, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27701306343078613, | |
| "step": 3560, | |
| "valid_targets_mean": 5604.1, | |
| "valid_targets_min": 572 | |
| }, | |
| { | |
| "epoch": 5.694888178913738, | |
| "grad_norm": 0.3895941920307152, | |
| "learning_rate": 4.099445437624487e-06, | |
| "loss": 0.2801, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2567237913608551, | |
| "step": 3565, | |
| "valid_targets_mean": 6499.4, | |
| "valid_targets_min": 2760 | |
| }, | |
| { | |
| "epoch": 5.702875399361022, | |
| "grad_norm": 0.4026846040934063, | |
| "learning_rate": 4.051242914175995e-06, | |
| "loss": 0.2877, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2394181340932846, | |
| "step": 3570, | |
| "valid_targets_mean": 5061.2, | |
| "valid_targets_min": 2121 | |
| }, | |
| { | |
| "epoch": 5.710862619808307, | |
| "grad_norm": 0.41150417005633266, | |
| "learning_rate": 4.003293502572163e-06, | |
| "loss": 0.2594, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2529963552951813, | |
| "step": 3575, | |
| "valid_targets_mean": 5579.4, | |
| "valid_targets_min": 2877 | |
| }, | |
| { | |
| "epoch": 5.718849840255591, | |
| "grad_norm": 0.4274042197989922, | |
| "learning_rate": 3.955597963785391e-06, | |
| "loss": 0.2587, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2583443224430084, | |
| "step": 3580, | |
| "valid_targets_mean": 5731.5, | |
| "valid_targets_min": 2010 | |
| }, | |
| { | |
| "epoch": 5.726837060702875, | |
| "grad_norm": 0.4471777655641501, | |
| "learning_rate": 3.908157054759048e-06, | |
| "loss": 0.2962, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3019868731498718, | |
| "step": 3585, | |
| "valid_targets_mean": 5724.8, | |
| "valid_targets_min": 2716 | |
| }, | |
| { | |
| "epoch": 5.73482428115016, | |
| "grad_norm": 0.41964363853508374, | |
| "learning_rate": 3.860971528395427e-06, | |
| "loss": 0.2752, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2542869746685028, | |
| "step": 3590, | |
| "valid_targets_mean": 4950.4, | |
| "valid_targets_min": 706 | |
| }, | |
| { | |
| "epoch": 5.742811501597444, | |
| "grad_norm": 0.4098343540525919, | |
| "learning_rate": 3.814042133543847e-06, | |
| "loss": 0.2532, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25015988945961, | |
| "step": 3595, | |
| "valid_targets_mean": 6112.3, | |
| "valid_targets_min": 2741 | |
| }, | |
| { | |
| "epoch": 5.7507987220447285, | |
| "grad_norm": 0.38532448037141603, | |
| "learning_rate": 3.7673696149887117e-06, | |
| "loss": 0.2771, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25711530447006226, | |
| "step": 3600, | |
| "valid_targets_mean": 6476.9, | |
| "valid_targets_min": 3217 | |
| }, | |
| { | |
| "epoch": 5.758785942492013, | |
| "grad_norm": 0.39846001438401557, | |
| "learning_rate": 3.72095471343773e-06, | |
| "loss": 0.2686, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2989582419395447, | |
| "step": 3605, | |
| "valid_targets_mean": 6977.7, | |
| "valid_targets_min": 2119 | |
| }, | |
| { | |
| "epoch": 5.766773162939297, | |
| "grad_norm": 0.3774788207979415, | |
| "learning_rate": 3.674798165510136e-06, | |
| "loss": 0.2621, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23739445209503174, | |
| "step": 3610, | |
| "valid_targets_mean": 6336.8, | |
| "valid_targets_min": 2474 | |
| }, | |
| { | |
| "epoch": 5.774760383386582, | |
| "grad_norm": 0.3861551377463408, | |
| "learning_rate": 3.6289007037250244e-06, | |
| "loss": 0.2685, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2617093622684479, | |
| "step": 3615, | |
| "valid_targets_mean": 6711.7, | |
| "valid_targets_min": 2876 | |
| }, | |
| { | |
| "epoch": 5.782747603833866, | |
| "grad_norm": 0.40744967812385374, | |
| "learning_rate": 3.5832630564897073e-06, | |
| "loss": 0.2792, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2509753108024597, | |
| "step": 3620, | |
| "valid_targets_mean": 5979.4, | |
| "valid_targets_min": 2730 | |
| }, | |
| { | |
| "epoch": 5.7907348242811505, | |
| "grad_norm": 0.41073447793342704, | |
| "learning_rate": 3.5378859480881443e-06, | |
| "loss": 0.2735, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.33972814679145813, | |
| "step": 3625, | |
| "valid_targets_mean": 7017.2, | |
| "valid_targets_min": 2084 | |
| }, | |
| { | |
| "epoch": 5.798722044728435, | |
| "grad_norm": 0.4166804580562159, | |
| "learning_rate": 3.492770098669478e-06, | |
| "loss": 0.244, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24957159161567688, | |
| "step": 3630, | |
| "valid_targets_mean": 5835.2, | |
| "valid_targets_min": 2756 | |
| }, | |
| { | |
| "epoch": 5.806709265175719, | |
| "grad_norm": 0.41914079223078876, | |
| "learning_rate": 3.4479162242365717e-06, | |
| "loss": 0.2724, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2966158986091614, | |
| "step": 3635, | |
| "valid_targets_mean": 6760.2, | |
| "valid_targets_min": 3019 | |
| }, | |
| { | |
| "epoch": 5.814696485623003, | |
| "grad_norm": 0.3881502822445414, | |
| "learning_rate": 3.403325036634679e-06, | |
| "loss": 0.2647, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.257976233959198, | |
| "step": 3640, | |
| "valid_targets_mean": 6245.3, | |
| "valid_targets_min": 1886 | |
| }, | |
| { | |
| "epoch": 5.822683706070287, | |
| "grad_norm": 0.4071501184047579, | |
| "learning_rate": 3.3589972435401184e-06, | |
| "loss": 0.2819, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2423725128173828, | |
| "step": 3645, | |
| "valid_targets_mean": 6025.7, | |
| "valid_targets_min": 2923 | |
| }, | |
| { | |
| "epoch": 5.830670926517572, | |
| "grad_norm": 0.4041058312903423, | |
| "learning_rate": 3.3149335484490553e-06, | |
| "loss": 0.2727, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28053832054138184, | |
| "step": 3650, | |
| "valid_targets_mean": 5899.2, | |
| "valid_targets_min": 2582 | |
| }, | |
| { | |
| "epoch": 5.838658146964856, | |
| "grad_norm": 0.43091998438838186, | |
| "learning_rate": 3.2711346506663346e-06, | |
| "loss": 0.2542, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21161791682243347, | |
| "step": 3655, | |
| "valid_targets_mean": 5107.8, | |
| "valid_targets_min": 893 | |
| }, | |
| { | |
| "epoch": 5.84664536741214, | |
| "grad_norm": 0.38726488678088394, | |
| "learning_rate": 3.2276012452943893e-06, | |
| "loss": 0.2671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23468036949634552, | |
| "step": 3660, | |
| "valid_targets_mean": 5872.8, | |
| "valid_targets_min": 1757 | |
| }, | |
| { | |
| "epoch": 5.854632587859425, | |
| "grad_norm": 0.41132585423643675, | |
| "learning_rate": 3.1843340232222064e-06, | |
| "loss": 0.2554, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24414730072021484, | |
| "step": 3665, | |
| "valid_targets_mean": 5770.9, | |
| "valid_targets_min": 983 | |
| }, | |
| { | |
| "epoch": 5.862619808306709, | |
| "grad_norm": 0.3889001434066048, | |
| "learning_rate": 3.1413336711143437e-06, | |
| "loss": 0.2368, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26720675826072693, | |
| "step": 3670, | |
| "valid_targets_mean": 6698.4, | |
| "valid_targets_min": 850 | |
| }, | |
| { | |
| "epoch": 5.8706070287539935, | |
| "grad_norm": 0.3834925702308612, | |
| "learning_rate": 3.0986008714000703e-06, | |
| "loss": 0.2818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28179192543029785, | |
| "step": 3675, | |
| "valid_targets_mean": 6642.9, | |
| "valid_targets_min": 2407 | |
| }, | |
| { | |
| "epoch": 5.878594249201278, | |
| "grad_norm": 0.40207437409947944, | |
| "learning_rate": 3.056136302262489e-06, | |
| "loss": 0.2685, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25856301188468933, | |
| "step": 3680, | |
| "valid_targets_mean": 6361.6, | |
| "valid_targets_min": 2705 | |
| }, | |
| { | |
| "epoch": 5.886581469648562, | |
| "grad_norm": 0.4071568242836644, | |
| "learning_rate": 3.0139406376278212e-06, | |
| "loss": 0.2793, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24484190344810486, | |
| "step": 3685, | |
| "valid_targets_mean": 5454.2, | |
| "valid_targets_min": 467 | |
| }, | |
| { | |
| "epoch": 5.894568690095847, | |
| "grad_norm": 0.3796701828688457, | |
| "learning_rate": 2.972014547154671e-06, | |
| "loss": 0.2566, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2547157108783722, | |
| "step": 3690, | |
| "valid_targets_mean": 6390.4, | |
| "valid_targets_min": 788 | |
| }, | |
| { | |
| "epoch": 5.902555910543131, | |
| "grad_norm": 0.3923662568751408, | |
| "learning_rate": 2.930358696223423e-06, | |
| "loss": 0.2655, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2698602080345154, | |
| "step": 3695, | |
| "valid_targets_mean": 6420.2, | |
| "valid_targets_min": 3770 | |
| }, | |
| { | |
| "epoch": 5.9105431309904155, | |
| "grad_norm": 0.4109462182361228, | |
| "learning_rate": 2.8889737459256695e-06, | |
| "loss": 0.2598, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22529888153076172, | |
| "step": 3700, | |
| "valid_targets_mean": 5201.7, | |
| "valid_targets_min": 914 | |
| }, | |
| { | |
| "epoch": 5.9185303514377, | |
| "grad_norm": 0.3845440375366245, | |
| "learning_rate": 2.8478603530537285e-06, | |
| "loss": 0.2705, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24273991584777832, | |
| "step": 3705, | |
| "valid_targets_mean": 6101.9, | |
| "valid_targets_min": 2518 | |
| }, | |
| { | |
| "epoch": 5.926517571884984, | |
| "grad_norm": 0.3949527057761477, | |
| "learning_rate": 2.8070191700902194e-06, | |
| "loss": 0.2478, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23416760563850403, | |
| "step": 3710, | |
| "valid_targets_mean": 5605.8, | |
| "valid_targets_min": 2403 | |
| }, | |
| { | |
| "epoch": 5.934504792332269, | |
| "grad_norm": 0.4620246171400671, | |
| "learning_rate": 2.7664508451976903e-06, | |
| "loss": 0.2456, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2660924196243286, | |
| "step": 3715, | |
| "valid_targets_mean": 6548.9, | |
| "valid_targets_min": 2862 | |
| }, | |
| { | |
| "epoch": 5.942492012779553, | |
| "grad_norm": 0.4567195217993061, | |
| "learning_rate": 2.726156022208362e-06, | |
| "loss": 0.2721, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2496548295021057, | |
| "step": 3720, | |
| "valid_targets_mean": 5497.1, | |
| "valid_targets_min": 2146 | |
| }, | |
| { | |
| "epoch": 5.950479233226837, | |
| "grad_norm": 0.3927833860346143, | |
| "learning_rate": 2.6861353406138713e-06, | |
| "loss": 0.2692, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2523316740989685, | |
| "step": 3725, | |
| "valid_targets_mean": 6091.9, | |
| "valid_targets_min": 2827 | |
| }, | |
| { | |
| "epoch": 5.958466453674122, | |
| "grad_norm": 0.3931387893514867, | |
| "learning_rate": 2.646389435555172e-06, | |
| "loss": 0.2553, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23789174854755402, | |
| "step": 3730, | |
| "valid_targets_mean": 5674.8, | |
| "valid_targets_min": 3259 | |
| }, | |
| { | |
| "epoch": 5.966453674121405, | |
| "grad_norm": 0.4302699938045076, | |
| "learning_rate": 2.6069189378124015e-06, | |
| "loss": 0.2572, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2736428678035736, | |
| "step": 3735, | |
| "valid_targets_mean": 5477.9, | |
| "valid_targets_min": 3186 | |
| }, | |
| { | |
| "epoch": 5.97444089456869, | |
| "grad_norm": 0.3906268892019169, | |
| "learning_rate": 2.567724473794908e-06, | |
| "loss": 0.2717, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2649077773094177, | |
| "step": 3740, | |
| "valid_targets_mean": 5947.3, | |
| "valid_targets_min": 3106 | |
| }, | |
| { | |
| "epoch": 5.982428115015974, | |
| "grad_norm": 0.3879890345025771, | |
| "learning_rate": 2.5288066655312914e-06, | |
| "loss": 0.2667, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26016589999198914, | |
| "step": 3745, | |
| "valid_targets_mean": 6068.9, | |
| "valid_targets_min": 2214 | |
| }, | |
| { | |
| "epoch": 5.9904153354632586, | |
| "grad_norm": 0.4096443704289698, | |
| "learning_rate": 2.4901661306595414e-06, | |
| "loss": 0.2713, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26794230937957764, | |
| "step": 3750, | |
| "valid_targets_mean": 7041.1, | |
| "valid_targets_min": 2348 | |
| }, | |
| { | |
| "epoch": 5.998402555910543, | |
| "grad_norm": 0.39112915932836695, | |
| "learning_rate": 2.451803482417234e-06, | |
| "loss": 0.2607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2606961727142334, | |
| "step": 3755, | |
| "valid_targets_mean": 6312.5, | |
| "valid_targets_min": 2995 | |
| }, | |
| { | |
| "epoch": 6.006389776357827, | |
| "grad_norm": 0.38028239845216216, | |
| "learning_rate": 2.413719329631785e-06, | |
| "loss": 0.2521, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2524528503417969, | |
| "step": 3760, | |
| "valid_targets_mean": 6795.7, | |
| "valid_targets_min": 4504 | |
| }, | |
| { | |
| "epoch": 6.014376996805112, | |
| "grad_norm": 0.4027725424370273, | |
| "learning_rate": 2.375914276710811e-06, | |
| "loss": 0.2796, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2757946252822876, | |
| "step": 3765, | |
| "valid_targets_mean": 6085.4, | |
| "valid_targets_min": 796 | |
| }, | |
| { | |
| "epoch": 6.022364217252396, | |
| "grad_norm": 0.3883650206387635, | |
| "learning_rate": 2.338388923632513e-06, | |
| "loss": 0.2581, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2821754813194275, | |
| "step": 3770, | |
| "valid_targets_mean": 6603.0, | |
| "valid_targets_min": 4626 | |
| }, | |
| { | |
| "epoch": 6.0303514376996805, | |
| "grad_norm": 0.4103352968084469, | |
| "learning_rate": 2.3011438659361794e-06, | |
| "loss": 0.2492, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2881932258605957, | |
| "step": 3775, | |
| "valid_targets_mean": 6244.6, | |
| "valid_targets_min": 1649 | |
| }, | |
| { | |
| "epoch": 6.038338658146965, | |
| "grad_norm": 0.4364303760134223, | |
| "learning_rate": 2.2641796947127114e-06, | |
| "loss": 0.2651, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29247602820396423, | |
| "step": 3780, | |
| "valid_targets_mean": 5705.2, | |
| "valid_targets_min": 633 | |
| }, | |
| { | |
| "epoch": 6.046325878594249, | |
| "grad_norm": 0.3873582260007207, | |
| "learning_rate": 2.2274969965952553e-06, | |
| "loss": 0.2763, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28913000226020813, | |
| "step": 3785, | |
| "valid_targets_mean": 6626.6, | |
| "valid_targets_min": 2934 | |
| }, | |
| { | |
| "epoch": 6.054313099041534, | |
| "grad_norm": 0.39119323008234547, | |
| "learning_rate": 2.1910963537498887e-06, | |
| "loss": 0.2557, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2703927159309387, | |
| "step": 3790, | |
| "valid_targets_mean": 6168.2, | |
| "valid_targets_min": 3209 | |
| }, | |
| { | |
| "epoch": 6.062300319488818, | |
| "grad_norm": 0.4077362610798925, | |
| "learning_rate": 2.1549783438663872e-06, | |
| "loss": 0.2665, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2611781358718872, | |
| "step": 3795, | |
| "valid_targets_mean": 6124.8, | |
| "valid_targets_min": 3091 | |
| }, | |
| { | |
| "epoch": 6.0702875399361025, | |
| "grad_norm": 0.4465123362221287, | |
| "learning_rate": 2.1191435401490534e-06, | |
| "loss": 0.2644, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2754276692867279, | |
| "step": 3800, | |
| "valid_targets_mean": 6258.2, | |
| "valid_targets_min": 2972 | |
| }, | |
| { | |
| "epoch": 6.078274760383387, | |
| "grad_norm": 0.4077979225461768, | |
| "learning_rate": 2.0835925113076062e-06, | |
| "loss": 0.2775, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24296343326568604, | |
| "step": 3805, | |
| "valid_targets_mean": 5585.8, | |
| "valid_targets_min": 2717 | |
| }, | |
| { | |
| "epoch": 6.086261980830671, | |
| "grad_norm": 0.3883189660644492, | |
| "learning_rate": 2.0483258215481784e-06, | |
| "loss": 0.2885, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2830982804298401, | |
| "step": 3810, | |
| "valid_targets_mean": 6790.4, | |
| "valid_targets_min": 2438 | |
| }, | |
| { | |
| "epoch": 6.094249201277956, | |
| "grad_norm": 0.3993613251446817, | |
| "learning_rate": 2.0133440305643413e-06, | |
| "loss": 0.2683, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2775847613811493, | |
| "step": 3815, | |
| "valid_targets_mean": 6336.6, | |
| "valid_targets_min": 2124 | |
| }, | |
| { | |
| "epoch": 6.102236421725239, | |
| "grad_norm": 0.4026541713453085, | |
| "learning_rate": 1.9786476935282463e-06, | |
| "loss": 0.2565, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24327577650547028, | |
| "step": 3820, | |
| "valid_targets_mean": 5976.4, | |
| "valid_targets_min": 1729 | |
| }, | |
| { | |
| "epoch": 6.110223642172524, | |
| "grad_norm": 0.38709911456403306, | |
| "learning_rate": 1.944237361081782e-06, | |
| "loss": 0.2744, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2668118476867676, | |
| "step": 3825, | |
| "valid_targets_mean": 7182.4, | |
| "valid_targets_min": 1878 | |
| }, | |
| { | |
| "epoch": 6.118210862619808, | |
| "grad_norm": 0.41752595504794143, | |
| "learning_rate": 1.9101135793278746e-06, | |
| "loss": 0.2663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24993827939033508, | |
| "step": 3830, | |
| "valid_targets_mean": 5771.0, | |
| "valid_targets_min": 1753 | |
| }, | |
| { | |
| "epoch": 6.126198083067092, | |
| "grad_norm": 0.39389774417625606, | |
| "learning_rate": 1.8762768898217732e-06, | |
| "loss": 0.2616, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27845335006713867, | |
| "step": 3835, | |
| "valid_targets_mean": 6528.6, | |
| "valid_targets_min": 2756 | |
| }, | |
| { | |
| "epoch": 6.134185303514377, | |
| "grad_norm": 0.4007868024624515, | |
| "learning_rate": 1.8427278295625006e-06, | |
| "loss": 0.2847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28565725684165955, | |
| "step": 3840, | |
| "valid_targets_mean": 6376.4, | |
| "valid_targets_min": 3358 | |
| }, | |
| { | |
| "epoch": 6.142172523961661, | |
| "grad_norm": 0.3995828752599394, | |
| "learning_rate": 1.8094669309843161e-06, | |
| "loss": 0.2608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.20159706473350525, | |
| "step": 3845, | |
| "valid_targets_mean": 4589.8, | |
| "valid_targets_min": 2860 | |
| }, | |
| { | |
| "epoch": 6.1501597444089455, | |
| "grad_norm": 0.39694544048945113, | |
| "learning_rate": 1.776494721948241e-06, | |
| "loss": 0.2847, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26797810196876526, | |
| "step": 3850, | |
| "valid_targets_mean": 6385.9, | |
| "valid_targets_min": 2319 | |
| }, | |
| { | |
| "epoch": 6.15814696485623, | |
| "grad_norm": 0.45715765848495526, | |
| "learning_rate": 1.7438117257337239e-06, | |
| "loss": 0.2618, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2801579236984253, | |
| "step": 3855, | |
| "valid_targets_mean": 5414.9, | |
| "valid_targets_min": 774 | |
| }, | |
| { | |
| "epoch": 6.166134185303514, | |
| "grad_norm": 0.39416150093967306, | |
| "learning_rate": 1.7114184610302964e-06, | |
| "loss": 0.2607, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26423269510269165, | |
| "step": 3860, | |
| "valid_targets_mean": 6070.0, | |
| "valid_targets_min": 3144 | |
| }, | |
| { | |
| "epoch": 6.174121405750799, | |
| "grad_norm": 0.3816052623224548, | |
| "learning_rate": 1.67931544192937e-06, | |
| "loss": 0.2767, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2711952328681946, | |
| "step": 3865, | |
| "valid_targets_mean": 6857.6, | |
| "valid_targets_min": 2493 | |
| }, | |
| { | |
| "epoch": 6.182108626198083, | |
| "grad_norm": 0.42472104670938376, | |
| "learning_rate": 1.6475031779160611e-06, | |
| "loss": 0.2645, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26078149676322937, | |
| "step": 3870, | |
| "valid_targets_mean": 5219.5, | |
| "valid_targets_min": 1406 | |
| }, | |
| { | |
| "epoch": 6.1900958466453675, | |
| "grad_norm": 0.4426817652480142, | |
| "learning_rate": 1.6159821738611192e-06, | |
| "loss": 0.272, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2876974046230316, | |
| "step": 3875, | |
| "valid_targets_mean": 5769.3, | |
| "valid_targets_min": 805 | |
| }, | |
| { | |
| "epoch": 6.198083067092652, | |
| "grad_norm": 0.43126668162257803, | |
| "learning_rate": 1.5847529300128827e-06, | |
| "loss": 0.2575, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23353397846221924, | |
| "step": 3880, | |
| "valid_targets_mean": 5951.6, | |
| "valid_targets_min": 876 | |
| }, | |
| { | |
| "epoch": 6.206070287539936, | |
| "grad_norm": 0.389639874885683, | |
| "learning_rate": 1.5538159419893895e-06, | |
| "loss": 0.2787, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2204650640487671, | |
| "step": 3885, | |
| "valid_targets_mean": 5891.4, | |
| "valid_targets_min": 2690 | |
| }, | |
| { | |
| "epoch": 6.214057507987221, | |
| "grad_norm": 0.38979494581579804, | |
| "learning_rate": 1.5231717007704738e-06, | |
| "loss": 0.2602, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26473143696784973, | |
| "step": 3890, | |
| "valid_targets_mean": 6832.6, | |
| "valid_targets_min": 3317 | |
| }, | |
| { | |
| "epoch": 6.222044728434505, | |
| "grad_norm": 0.40890636968724303, | |
| "learning_rate": 1.4928206926899801e-06, | |
| "loss": 0.2472, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26221221685409546, | |
| "step": 3895, | |
| "valid_targets_mean": 5755.1, | |
| "valid_targets_min": 1047 | |
| }, | |
| { | |
| "epoch": 6.2300319488817895, | |
| "grad_norm": 0.42711616458394136, | |
| "learning_rate": 1.4627633994280599e-06, | |
| "loss": 0.2556, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2683911919593811, | |
| "step": 3900, | |
| "valid_targets_mean": 5839.6, | |
| "valid_targets_min": 801 | |
| }, | |
| { | |
| "epoch": 6.238019169329074, | |
| "grad_norm": 0.3997092749939614, | |
| "learning_rate": 1.433000298003504e-06, | |
| "loss": 0.2816, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24764733016490936, | |
| "step": 3905, | |
| "valid_targets_mean": 5888.2, | |
| "valid_targets_min": 3461 | |
| }, | |
| { | |
| "epoch": 6.246006389776358, | |
| "grad_norm": 0.43062677996709836, | |
| "learning_rate": 1.4035318607662029e-06, | |
| "loss": 0.2697, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24495817720890045, | |
| "step": 3910, | |
| "valid_targets_mean": 4845.8, | |
| "valid_targets_min": 2083 | |
| }, | |
| { | |
| "epoch": 6.253993610223642, | |
| "grad_norm": 0.38826625279200744, | |
| "learning_rate": 1.3743585553896144e-06, | |
| "loss": 0.257, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2641289234161377, | |
| "step": 3915, | |
| "valid_targets_mean": 6398.6, | |
| "valid_targets_min": 2311 | |
| }, | |
| { | |
| "epoch": 6.261980830670926, | |
| "grad_norm": 0.3956794527968643, | |
| "learning_rate": 1.345480844863376e-06, | |
| "loss": 0.2676, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2406679391860962, | |
| "step": 3920, | |
| "valid_targets_mean": 5696.4, | |
| "valid_targets_min": 2741 | |
| }, | |
| { | |
| "epoch": 6.2699680511182105, | |
| "grad_norm": 0.44965320676085757, | |
| "learning_rate": 1.316899187485925e-06, | |
| "loss": 0.255, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24731853604316711, | |
| "step": 3925, | |
| "valid_targets_mean": 4457.3, | |
| "valid_targets_min": 710 | |
| }, | |
| { | |
| "epoch": 6.277955271565495, | |
| "grad_norm": 0.39204003729788195, | |
| "learning_rate": 1.288614036857252e-06, | |
| "loss": 0.2545, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23420819640159607, | |
| "step": 3930, | |
| "valid_targets_mean": 6753.6, | |
| "valid_targets_min": 3941 | |
| }, | |
| { | |
| "epoch": 6.285942492012779, | |
| "grad_norm": 0.42002390772136805, | |
| "learning_rate": 1.260625841871692e-06, | |
| "loss": 0.2795, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26312363147735596, | |
| "step": 3935, | |
| "valid_targets_mean": 5598.0, | |
| "valid_targets_min": 2716 | |
| }, | |
| { | |
| "epoch": 6.293929712460064, | |
| "grad_norm": 0.3998814018456629, | |
| "learning_rate": 1.2329350467107925e-06, | |
| "loss": 0.2572, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2606557011604309, | |
| "step": 3940, | |
| "valid_targets_mean": 6200.5, | |
| "valid_targets_min": 2353 | |
| }, | |
| { | |
| "epoch": 6.301916932907348, | |
| "grad_norm": 0.41789797999440276, | |
| "learning_rate": 1.2055420908362781e-06, | |
| "loss": 0.2663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2655007839202881, | |
| "step": 3945, | |
| "valid_targets_mean": 5902.4, | |
| "valid_targets_min": 2764 | |
| }, | |
| { | |
| "epoch": 6.3099041533546325, | |
| "grad_norm": 0.40243132478120186, | |
| "learning_rate": 1.1784474089830612e-06, | |
| "loss": 0.2688, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29167425632476807, | |
| "step": 3950, | |
| "valid_targets_mean": 6443.7, | |
| "valid_targets_min": 1013 | |
| }, | |
| { | |
| "epoch": 6.317891373801917, | |
| "grad_norm": 0.37432377437529774, | |
| "learning_rate": 1.1516514311523607e-06, | |
| "loss": 0.2455, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.260292649269104, | |
| "step": 3955, | |
| "valid_targets_mean": 7428.6, | |
| "valid_targets_min": 3561 | |
| }, | |
| { | |
| "epoch": 6.325878594249201, | |
| "grad_norm": 0.4047327682405144, | |
| "learning_rate": 1.1251545826048593e-06, | |
| "loss": 0.256, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23277956247329712, | |
| "step": 3960, | |
| "valid_targets_mean": 5727.9, | |
| "valid_targets_min": 889 | |
| }, | |
| { | |
| "epoch": 6.333865814696486, | |
| "grad_norm": 0.4255506667930687, | |
| "learning_rate": 1.098957283853972e-06, | |
| "loss": 0.2692, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2866438627243042, | |
| "step": 3965, | |
| "valid_targets_mean": 6260.7, | |
| "valid_targets_min": 759 | |
| }, | |
| { | |
| "epoch": 6.34185303514377, | |
| "grad_norm": 0.42312730083398004, | |
| "learning_rate": 1.0730599506591476e-06, | |
| "loss": 0.2663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2608621120452881, | |
| "step": 3970, | |
| "valid_targets_mean": 5459.2, | |
| "valid_targets_min": 1079 | |
| }, | |
| { | |
| "epoch": 6.3498402555910545, | |
| "grad_norm": 0.4110280597614948, | |
| "learning_rate": 1.0474629940192994e-06, | |
| "loss": 0.2476, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2226857841014862, | |
| "step": 3975, | |
| "valid_targets_mean": 5432.8, | |
| "valid_targets_min": 3066 | |
| }, | |
| { | |
| "epoch": 6.357827476038339, | |
| "grad_norm": 0.40086115594507055, | |
| "learning_rate": 1.0221668201662726e-06, | |
| "loss": 0.248, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.242458313703537, | |
| "step": 3980, | |
| "valid_targets_mean": 5737.6, | |
| "valid_targets_min": 2586 | |
| }, | |
| { | |
| "epoch": 6.365814696485623, | |
| "grad_norm": 0.4032124706373358, | |
| "learning_rate": 9.971718305583767e-07, | |
| "loss": 0.2646, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30006730556488037, | |
| "step": 3985, | |
| "valid_targets_mean": 6250.2, | |
| "valid_targets_min": 802 | |
| }, | |
| { | |
| "epoch": 6.373801916932908, | |
| "grad_norm": 0.4369518802495719, | |
| "learning_rate": 9.724784218740524e-07, | |
| "loss": 0.2533, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24730181694030762, | |
| "step": 3990, | |
| "valid_targets_mean": 4995.5, | |
| "valid_targets_min": 1945 | |
| }, | |
| { | |
| "epoch": 6.381789137380192, | |
| "grad_norm": 0.429599879347496, | |
| "learning_rate": 9.480869860055364e-07, | |
| "loss": 0.254, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24054160714149475, | |
| "step": 3995, | |
| "valid_targets_mean": 5166.2, | |
| "valid_targets_min": 680 | |
| }, | |
| { | |
| "epoch": 6.389776357827476, | |
| "grad_norm": 0.3876796056354229, | |
| "learning_rate": 9.239979100526763e-07, | |
| "loss": 0.2753, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25194332003593445, | |
| "step": 4000, | |
| "valid_targets_mean": 6341.6, | |
| "valid_targets_min": 2325 | |
| }, | |
| { | |
| "epoch": 6.397763578274761, | |
| "grad_norm": 0.3922690814846412, | |
| "learning_rate": 9.002115763167585e-07, | |
| "loss": 0.2537, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2845813035964966, | |
| "step": 4005, | |
| "valid_targets_mean": 7029.4, | |
| "valid_targets_min": 2481 | |
| }, | |
| { | |
| "epoch": 6.405750798722044, | |
| "grad_norm": 0.4144382572722086, | |
| "learning_rate": 8.7672836229447e-07, | |
| "loss": 0.2459, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2669537663459778, | |
| "step": 4010, | |
| "valid_targets_mean": 5912.7, | |
| "valid_targets_min": 2424 | |
| }, | |
| { | |
| "epoch": 6.413738019169329, | |
| "grad_norm": 0.40260962781729587, | |
| "learning_rate": 8.535486406718684e-07, | |
| "loss": 0.2668, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24243292212486267, | |
| "step": 4015, | |
| "valid_targets_mean": 5417.9, | |
| "valid_targets_min": 2268 | |
| }, | |
| { | |
| "epoch": 6.421725239616613, | |
| "grad_norm": 0.424208872675099, | |
| "learning_rate": 8.306727793185132e-07, | |
| "loss": 0.2518, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2556074261665344, | |
| "step": 4020, | |
| "valid_targets_mean": 5542.9, | |
| "valid_targets_min": 751 | |
| }, | |
| { | |
| "epoch": 6.4297124600638975, | |
| "grad_norm": 0.4231517971283913, | |
| "learning_rate": 8.081011412815965e-07, | |
| "loss": 0.2515, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.286851704120636, | |
| "step": 4025, | |
| "valid_targets_mean": 6050.6, | |
| "valid_targets_min": 2272 | |
| }, | |
| { | |
| "epoch": 6.437699680511182, | |
| "grad_norm": 0.3937294857956087, | |
| "learning_rate": 7.858340847801815e-07, | |
| "loss": 0.2579, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2628542184829712, | |
| "step": 4030, | |
| "valid_targets_mean": 6836.8, | |
| "valid_targets_min": 2757 | |
| }, | |
| { | |
| "epoch": 6.445686900958466, | |
| "grad_norm": 0.3676466718840952, | |
| "learning_rate": 7.638719631995406e-07, | |
| "loss": 0.2568, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23427250981330872, | |
| "step": 4035, | |
| "valid_targets_mean": 6799.4, | |
| "valid_targets_min": 3726 | |
| }, | |
| { | |
| "epoch": 6.453674121405751, | |
| "grad_norm": 0.44470994190939733, | |
| "learning_rate": 7.422151250855214e-07, | |
| "loss": 0.2505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26523175835609436, | |
| "step": 4040, | |
| "valid_targets_mean": 5138.9, | |
| "valid_targets_min": 992 | |
| }, | |
| { | |
| "epoch": 6.461661341853035, | |
| "grad_norm": 0.44881772807886583, | |
| "learning_rate": 7.208639141390295e-07, | |
| "loss": 0.282, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30667585134506226, | |
| "step": 4045, | |
| "valid_targets_mean": 5630.2, | |
| "valid_targets_min": 2290 | |
| }, | |
| { | |
| "epoch": 6.4696485623003195, | |
| "grad_norm": 0.3817703378311513, | |
| "learning_rate": 6.998186692105657e-07, | |
| "loss": 0.2608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2372807413339615, | |
| "step": 4050, | |
| "valid_targets_mean": 6366.2, | |
| "valid_targets_min": 3032 | |
| }, | |
| { | |
| "epoch": 6.477635782747604, | |
| "grad_norm": 0.42501622886020957, | |
| "learning_rate": 6.790797242948644e-07, | |
| "loss": 0.2665, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25420382618904114, | |
| "step": 4055, | |
| "valid_targets_mean": 5281.9, | |
| "valid_targets_min": 3415 | |
| }, | |
| { | |
| "epoch": 6.485623003194888, | |
| "grad_norm": 0.40642644510416537, | |
| "learning_rate": 6.5864740852557e-07, | |
| "loss": 0.2682, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.270402193069458, | |
| "step": 4060, | |
| "valid_targets_mean": 6061.5, | |
| "valid_targets_min": 2312 | |
| }, | |
| { | |
| "epoch": 6.493610223642173, | |
| "grad_norm": 0.40486318831736334, | |
| "learning_rate": 6.385220461700248e-07, | |
| "loss": 0.2689, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31059807538986206, | |
| "step": 4065, | |
| "valid_targets_mean": 6979.2, | |
| "valid_targets_min": 820 | |
| }, | |
| { | |
| "epoch": 6.501597444089457, | |
| "grad_norm": 0.47904580324378615, | |
| "learning_rate": 6.187039566241337e-07, | |
| "loss": 0.2605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24163641035556793, | |
| "step": 4070, | |
| "valid_targets_mean": 5791.2, | |
| "valid_targets_min": 1051 | |
| }, | |
| { | |
| "epoch": 6.5095846645367414, | |
| "grad_norm": 0.41017964801968143, | |
| "learning_rate": 5.99193454407272e-07, | |
| "loss": 0.2663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27509069442749023, | |
| "step": 4075, | |
| "valid_targets_mean": 6401.7, | |
| "valid_targets_min": 2276 | |
| }, | |
| { | |
| "epoch": 6.517571884984026, | |
| "grad_norm": 0.39608729690040123, | |
| "learning_rate": 5.799908491573148e-07, | |
| "loss": 0.258, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2366548329591751, | |
| "step": 4080, | |
| "valid_targets_mean": 5398.1, | |
| "valid_targets_min": 1902 | |
| }, | |
| { | |
| "epoch": 6.52555910543131, | |
| "grad_norm": 0.3761642354654187, | |
| "learning_rate": 5.610964456257107e-07, | |
| "loss": 0.2652, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24891404807567596, | |
| "step": 4085, | |
| "valid_targets_mean": 6751.4, | |
| "valid_targets_min": 3351 | |
| }, | |
| { | |
| "epoch": 6.533546325878595, | |
| "grad_norm": 0.40685164416522424, | |
| "learning_rate": 5.425105436726496e-07, | |
| "loss": 0.2761, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2536323666572571, | |
| "step": 4090, | |
| "valid_targets_mean": 5744.8, | |
| "valid_targets_min": 2652 | |
| }, | |
| { | |
| "epoch": 6.541533546325878, | |
| "grad_norm": 0.3872216478634318, | |
| "learning_rate": 5.242334382623004e-07, | |
| "loss": 0.2671, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2931691110134125, | |
| "step": 4095, | |
| "valid_targets_mean": 7036.7, | |
| "valid_targets_min": 3402 | |
| }, | |
| { | |
| "epoch": 6.549520766773163, | |
| "grad_norm": 0.40135613522920494, | |
| "learning_rate": 5.062654194581429e-07, | |
| "loss": 0.2584, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2978881001472473, | |
| "step": 4100, | |
| "valid_targets_mean": 6882.9, | |
| "valid_targets_min": 654 | |
| }, | |
| { | |
| "epoch": 6.557507987220447, | |
| "grad_norm": 0.3940263743209835, | |
| "learning_rate": 4.88606772418343e-07, | |
| "loss": 0.2685, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3106963634490967, | |
| "step": 4105, | |
| "valid_targets_mean": 6899.5, | |
| "valid_targets_min": 1918 | |
| }, | |
| { | |
| "epoch": 6.565495207667731, | |
| "grad_norm": 0.40476916349000935, | |
| "learning_rate": 4.7125777739123857e-07, | |
| "loss": 0.2583, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24585530161857605, | |
| "step": 4110, | |
| "valid_targets_mean": 5548.6, | |
| "valid_targets_min": 3331 | |
| }, | |
| { | |
| "epoch": 6.573482428115016, | |
| "grad_norm": 0.44511561583008213, | |
| "learning_rate": 4.54218709710903e-07, | |
| "loss": 0.2609, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.290644109249115, | |
| "step": 4115, | |
| "valid_targets_mean": 5344.9, | |
| "valid_targets_min": 2319 | |
| }, | |
| { | |
| "epoch": 6.5814696485623, | |
| "grad_norm": 0.3905212204877214, | |
| "learning_rate": 4.374898397927507e-07, | |
| "loss": 0.2672, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2835393249988556, | |
| "step": 4120, | |
| "valid_targets_mean": 6561.9, | |
| "valid_targets_min": 2496 | |
| }, | |
| { | |
| "epoch": 6.5894568690095845, | |
| "grad_norm": 0.43043739073174303, | |
| "learning_rate": 4.210714331292698e-07, | |
| "loss": 0.2394, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2793094217777252, | |
| "step": 4125, | |
| "valid_targets_mean": 5826.1, | |
| "valid_targets_min": 2422 | |
| }, | |
| { | |
| "epoch": 6.597444089456869, | |
| "grad_norm": 0.42798360971174304, | |
| "learning_rate": 4.049637502857895e-07, | |
| "loss": 0.2733, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2985469102859497, | |
| "step": 4130, | |
| "valid_targets_mean": 5765.9, | |
| "valid_targets_min": 571 | |
| }, | |
| { | |
| "epoch": 6.605431309904153, | |
| "grad_norm": 0.3921248080260788, | |
| "learning_rate": 3.8916704689635707e-07, | |
| "loss": 0.2693, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23658479750156403, | |
| "step": 4135, | |
| "valid_targets_mean": 5843.1, | |
| "valid_targets_min": 2510 | |
| }, | |
| { | |
| "epoch": 6.613418530351438, | |
| "grad_norm": 0.4238407542906127, | |
| "learning_rate": 3.736815736596766e-07, | |
| "loss": 0.2658, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2605782151222229, | |
| "step": 4140, | |
| "valid_targets_mean": 5526.4, | |
| "valid_targets_min": 3246 | |
| }, | |
| { | |
| "epoch": 6.621405750798722, | |
| "grad_norm": 0.4018626657027095, | |
| "learning_rate": 3.5850757633513424e-07, | |
| "loss": 0.2746, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29174667596817017, | |
| "step": 4145, | |
| "valid_targets_mean": 6742.7, | |
| "valid_targets_min": 585 | |
| }, | |
| { | |
| "epoch": 6.6293929712460065, | |
| "grad_norm": 0.3916027791934614, | |
| "learning_rate": 3.4364529573888803e-07, | |
| "loss": 0.2523, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2614343762397766, | |
| "step": 4150, | |
| "valid_targets_mean": 6216.9, | |
| "valid_targets_min": 2637 | |
| }, | |
| { | |
| "epoch": 6.637380191693291, | |
| "grad_norm": 0.3723514909223308, | |
| "learning_rate": 3.2909496774005344e-07, | |
| "loss": 0.2787, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.30010271072387695, | |
| "step": 4155, | |
| "valid_targets_mean": 6974.5, | |
| "valid_targets_min": 2404 | |
| }, | |
| { | |
| "epoch": 6.645367412140575, | |
| "grad_norm": 0.39140863018129085, | |
| "learning_rate": 3.14856823256966e-07, | |
| "loss": 0.2663, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2734515070915222, | |
| "step": 4160, | |
| "valid_targets_mean": 6848.9, | |
| "valid_targets_min": 2745 | |
| }, | |
| { | |
| "epoch": 6.65335463258786, | |
| "grad_norm": 0.39190342951790336, | |
| "learning_rate": 3.009310882534999e-07, | |
| "loss": 0.249, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.22873139381408691, | |
| "step": 4165, | |
| "valid_targets_mean": 6034.2, | |
| "valid_targets_min": 3020 | |
| }, | |
| { | |
| "epoch": 6.661341853035144, | |
| "grad_norm": 0.42096685369869974, | |
| "learning_rate": 2.8731798373550004e-07, | |
| "loss": 0.2754, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26596885919570923, | |
| "step": 4170, | |
| "valid_targets_mean": 5746.1, | |
| "valid_targets_min": 852 | |
| }, | |
| { | |
| "epoch": 6.669329073482428, | |
| "grad_norm": 0.4060322243618663, | |
| "learning_rate": 2.740177257472576e-07, | |
| "loss": 0.2545, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27120035886764526, | |
| "step": 4175, | |
| "valid_targets_mean": 6087.8, | |
| "valid_targets_min": 2695 | |
| }, | |
| { | |
| "epoch": 6.677316293929713, | |
| "grad_norm": 0.4082882469649237, | |
| "learning_rate": 2.6103052536810226e-07, | |
| "loss": 0.2778, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2792065441608429, | |
| "step": 4180, | |
| "valid_targets_mean": 5868.9, | |
| "valid_targets_min": 2378 | |
| }, | |
| { | |
| "epoch": 6.685303514376997, | |
| "grad_norm": 0.40530290992476375, | |
| "learning_rate": 2.4835658870902226e-07, | |
| "loss": 0.2734, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27434685826301575, | |
| "step": 4185, | |
| "valid_targets_mean": 6331.1, | |
| "valid_targets_min": 2867 | |
| }, | |
| { | |
| "epoch": 6.693290734824281, | |
| "grad_norm": 0.3945907796808697, | |
| "learning_rate": 2.3599611690943158e-07, | |
| "loss": 0.2608, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24785107374191284, | |
| "step": 4190, | |
| "valid_targets_mean": 6136.4, | |
| "valid_targets_min": 1618 | |
| }, | |
| { | |
| "epoch": 6.701277955271565, | |
| "grad_norm": 0.3764639380713518, | |
| "learning_rate": 2.2394930613393927e-07, | |
| "loss": 0.2419, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2778383195400238, | |
| "step": 4195, | |
| "valid_targets_mean": 6814.8, | |
| "valid_targets_min": 3992 | |
| }, | |
| { | |
| "epoch": 6.7092651757188495, | |
| "grad_norm": 0.40876617747946253, | |
| "learning_rate": 2.122163475692629e-07, | |
| "loss": 0.2814, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3094503581523895, | |
| "step": 4200, | |
| "valid_targets_mean": 6526.4, | |
| "valid_targets_min": 1650 | |
| }, | |
| { | |
| "epoch": 6.717252396166134, | |
| "grad_norm": 0.41409063976797617, | |
| "learning_rate": 2.0079742742118878e-07, | |
| "loss": 0.2853, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27831193804740906, | |
| "step": 4205, | |
| "valid_targets_mean": 5999.0, | |
| "valid_targets_min": 2690 | |
| }, | |
| { | |
| "epoch": 6.725239616613418, | |
| "grad_norm": 0.4129581785750722, | |
| "learning_rate": 1.8969272691160334e-07, | |
| "loss": 0.264, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25264161825180054, | |
| "step": 4210, | |
| "valid_targets_mean": 5947.4, | |
| "valid_targets_min": 825 | |
| }, | |
| { | |
| "epoch": 6.733226837060703, | |
| "grad_norm": 0.41261177762332807, | |
| "learning_rate": 1.789024222756397e-07, | |
| "loss": 0.2717, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2740635275840759, | |
| "step": 4215, | |
| "valid_targets_mean": 5676.9, | |
| "valid_targets_min": 2217 | |
| }, | |
| { | |
| "epoch": 6.741214057507987, | |
| "grad_norm": 0.38921747696529335, | |
| "learning_rate": 1.6842668475885782e-07, | |
| "loss": 0.2874, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3034498989582062, | |
| "step": 4220, | |
| "valid_targets_mean": 6683.3, | |
| "valid_targets_min": 2780 | |
| }, | |
| { | |
| "epoch": 6.7492012779552715, | |
| "grad_norm": 0.39522613969519027, | |
| "learning_rate": 1.582656806145444e-07, | |
| "loss": 0.2528, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2690703272819519, | |
| "step": 4225, | |
| "valid_targets_mean": 6550.1, | |
| "valid_targets_min": 2863 | |
| }, | |
| { | |
| "epoch": 6.757188498402556, | |
| "grad_norm": 0.4260197819350322, | |
| "learning_rate": 1.4841957110106388e-07, | |
| "loss": 0.2624, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2838810682296753, | |
| "step": 4230, | |
| "valid_targets_mean": 5512.8, | |
| "valid_targets_min": 2650 | |
| }, | |
| { | |
| "epoch": 6.76517571884984, | |
| "grad_norm": 0.43256835522161147, | |
| "learning_rate": 1.388885124793049e-07, | |
| "loss": 0.2505, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24123039841651917, | |
| "step": 4235, | |
| "valid_targets_mean": 5417.0, | |
| "valid_targets_min": 769 | |
| }, | |
| { | |
| "epoch": 6.773162939297125, | |
| "grad_norm": 0.40541255696406014, | |
| "learning_rate": 1.2967265601019573e-07, | |
| "loss": 0.2508, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24490627646446228, | |
| "step": 4240, | |
| "valid_targets_mean": 6160.9, | |
| "valid_targets_min": 2738 | |
| }, | |
| { | |
| "epoch": 6.781150159744409, | |
| "grad_norm": 0.4192974119053909, | |
| "learning_rate": 1.207721479523105e-07, | |
| "loss": 0.2677, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24304074048995972, | |
| "step": 4245, | |
| "valid_targets_mean": 5336.4, | |
| "valid_targets_min": 2824 | |
| }, | |
| { | |
| "epoch": 6.789137380191693, | |
| "grad_norm": 0.42666940769259437, | |
| "learning_rate": 1.1218712955954003e-07, | |
| "loss": 0.2599, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.26424548029899597, | |
| "step": 4250, | |
| "valid_targets_mean": 5964.0, | |
| "valid_targets_min": 1021 | |
| }, | |
| { | |
| "epoch": 6.797124600638978, | |
| "grad_norm": 0.39937087759335077, | |
| "learning_rate": 1.0391773707885578e-07, | |
| "loss": 0.2706, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.27442675828933716, | |
| "step": 4255, | |
| "valid_targets_mean": 6102.8, | |
| "valid_targets_min": 2335 | |
| }, | |
| { | |
| "epoch": 6.805111821086262, | |
| "grad_norm": 0.40863325023857433, | |
| "learning_rate": 9.596410174814497e-08, | |
| "loss": 0.2695, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.257610946893692, | |
| "step": 4260, | |
| "valid_targets_mean": 5718.2, | |
| "valid_targets_min": 2545 | |
| }, | |
| { | |
| "epoch": 6.813099041533547, | |
| "grad_norm": 0.4069647471285601, | |
| "learning_rate": 8.83263497941278e-08, | |
| "loss": 0.2591, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2750399708747864, | |
| "step": 4265, | |
| "valid_targets_mean": 6767.9, | |
| "valid_targets_min": 3444 | |
| }, | |
| { | |
| "epoch": 6.821086261980831, | |
| "grad_norm": 0.4059635974903752, | |
| "learning_rate": 8.10046024303568e-08, | |
| "loss": 0.2488, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2263932079076767, | |
| "step": 4270, | |
| "valid_targets_mean": 5993.4, | |
| "valid_targets_min": 2711 | |
| }, | |
| { | |
| "epoch": 6.8290734824281145, | |
| "grad_norm": 0.41477442009936805, | |
| "learning_rate": 7.399897585528726e-08, | |
| "loss": 0.2622, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2576657831668854, | |
| "step": 4275, | |
| "valid_targets_mean": 5577.6, | |
| "valid_targets_min": 766 | |
| }, | |
| { | |
| "epoch": 6.8370607028754, | |
| "grad_norm": 0.40172911683552204, | |
| "learning_rate": 6.730958125044319e-08, | |
| "loss": 0.2918, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.271068811416626, | |
| "step": 4280, | |
| "valid_targets_mean": 6353.2, | |
| "valid_targets_min": 1410 | |
| }, | |
| { | |
| "epoch": 6.845047923322683, | |
| "grad_norm": 0.38352639944603106, | |
| "learning_rate": 6.093652477864309e-08, | |
| "loss": 0.263, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24830323457717896, | |
| "step": 4285, | |
| "valid_targets_mean": 6276.2, | |
| "valid_targets_min": 1508 | |
| }, | |
| { | |
| "epoch": 6.853035143769968, | |
| "grad_norm": 0.39422551287265184, | |
| "learning_rate": 5.4879907582316986e-08, | |
| "loss": 0.2772, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.31273385882377625, | |
| "step": 4290, | |
| "valid_targets_mean": 7305.0, | |
| "valid_targets_min": 4199 | |
| }, | |
| { | |
| "epoch": 6.861022364217252, | |
| "grad_norm": 0.4216835901964001, | |
| "learning_rate": 4.913982578190535e-08, | |
| "loss": 0.2562, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23493213951587677, | |
| "step": 4295, | |
| "valid_targets_mean": 6236.2, | |
| "valid_targets_min": 978 | |
| }, | |
| { | |
| "epoch": 6.8690095846645365, | |
| "grad_norm": 0.4253919773938608, | |
| "learning_rate": 4.3716370474331527e-08, | |
| "loss": 0.2472, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2774316072463989, | |
| "step": 4300, | |
| "valid_targets_mean": 5980.0, | |
| "valid_targets_min": 751 | |
| }, | |
| { | |
| "epoch": 6.876996805111821, | |
| "grad_norm": 0.39799081688196125, | |
| "learning_rate": 3.8609627731558405e-08, | |
| "loss": 0.2605, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2209227830171585, | |
| "step": 4305, | |
| "valid_targets_mean": 5948.8, | |
| "valid_targets_min": 1976 | |
| }, | |
| { | |
| "epoch": 6.884984025559105, | |
| "grad_norm": 0.50771347607883, | |
| "learning_rate": 3.381967859920954e-08, | |
| "loss": 0.273, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25479426980018616, | |
| "step": 4310, | |
| "valid_targets_mean": 5528.9, | |
| "valid_targets_min": 623 | |
| }, | |
| { | |
| "epoch": 6.89297124600639, | |
| "grad_norm": 0.38835805866217404, | |
| "learning_rate": 2.9346599095305685e-08, | |
| "loss": 0.2745, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28347715735435486, | |
| "step": 4315, | |
| "valid_targets_mean": 6656.9, | |
| "valid_targets_min": 2755 | |
| }, | |
| { | |
| "epoch": 6.900958466453674, | |
| "grad_norm": 0.41565107262448814, | |
| "learning_rate": 2.5190460209039146e-08, | |
| "loss": 0.2818, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.296379417181015, | |
| "step": 4320, | |
| "valid_targets_mean": 6231.1, | |
| "valid_targets_min": 2385 | |
| }, | |
| { | |
| "epoch": 6.9089456869009584, | |
| "grad_norm": 0.4289779482034415, | |
| "learning_rate": 2.1351327899656883e-08, | |
| "loss": 0.2586, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28387531638145447, | |
| "step": 4325, | |
| "valid_targets_mean": 6475.8, | |
| "valid_targets_min": 2539 | |
| }, | |
| { | |
| "epoch": 6.916932907348243, | |
| "grad_norm": 0.3969157421312715, | |
| "learning_rate": 1.782926309540578e-08, | |
| "loss": 0.2603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.29361945390701294, | |
| "step": 4330, | |
| "valid_targets_mean": 6809.4, | |
| "valid_targets_min": 3356 | |
| }, | |
| { | |
| "epoch": 6.924920127795527, | |
| "grad_norm": 0.38640407541050914, | |
| "learning_rate": 1.462432169257344e-08, | |
| "loss": 0.2524, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2357715219259262, | |
| "step": 4335, | |
| "valid_targets_mean": 6227.8, | |
| "valid_targets_min": 2836 | |
| }, | |
| { | |
| "epoch": 6.932907348242812, | |
| "grad_norm": 0.3879185914808639, | |
| "learning_rate": 1.1736554554604429e-08, | |
| "loss": 0.2603, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.23897549510002136, | |
| "step": 4340, | |
| "valid_targets_mean": 5895.3, | |
| "valid_targets_min": 3767 | |
| }, | |
| { | |
| "epoch": 6.940894568690096, | |
| "grad_norm": 0.3856401857440791, | |
| "learning_rate": 9.166007511274278e-09, | |
| "loss": 0.2743, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.24993009865283966, | |
| "step": 4345, | |
| "valid_targets_mean": 6160.5, | |
| "valid_targets_min": 2705 | |
| }, | |
| { | |
| "epoch": 6.94888178913738, | |
| "grad_norm": 0.3830415902123125, | |
| "learning_rate": 6.912721357985597e-09, | |
| "loss": 0.2747, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2752983272075653, | |
| "step": 4350, | |
| "valid_targets_mean": 6781.8, | |
| "valid_targets_min": 4555 | |
| }, | |
| { | |
| "epoch": 6.956869009584665, | |
| "grad_norm": 0.4108348801920983, | |
| "learning_rate": 4.976731855104166e-09, | |
| "loss": 0.2403, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2581372857093811, | |
| "step": 4355, | |
| "valid_targets_mean": 5807.4, | |
| "valid_targets_min": 2440 | |
| }, | |
| { | |
| "epoch": 6.964856230031949, | |
| "grad_norm": 0.37029108470652733, | |
| "learning_rate": 3.3580697274016028e-09, | |
| "loss": 0.262, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2512226104736328, | |
| "step": 4360, | |
| "valid_targets_mean": 6862.8, | |
| "valid_targets_min": 714 | |
| }, | |
| { | |
| "epoch": 6.972843450479234, | |
| "grad_norm": 0.4080289348592804, | |
| "learning_rate": 2.056760663555757e-09, | |
| "loss": 0.2791, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.28627151250839233, | |
| "step": 4365, | |
| "valid_targets_mean": 6440.4, | |
| "valid_targets_min": 3159 | |
| }, | |
| { | |
| "epoch": 6.980830670926517, | |
| "grad_norm": 0.44086072052629505, | |
| "learning_rate": 1.0728253157599178e-09, | |
| "loss": 0.2699, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.21977540850639343, | |
| "step": 4370, | |
| "valid_targets_mean": 5270.1, | |
| "valid_targets_min": 2667 | |
| }, | |
| { | |
| "epoch": 6.988817891373802, | |
| "grad_norm": 0.38517817958136297, | |
| "learning_rate": 4.062792993786424e-10, | |
| "loss": 0.258, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.25942468643188477, | |
| "step": 4375, | |
| "valid_targets_mean": 6358.6, | |
| "valid_targets_min": 2666 | |
| }, | |
| { | |
| "epoch": 6.996805111821086, | |
| "grad_norm": 0.40393719527385674, | |
| "learning_rate": 5.7133192707947705e-11, | |
| "loss": 0.2626, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.2742898464202881, | |
| "step": 4380, | |
| "valid_targets_mean": 6395.8, | |
| "valid_targets_min": 3183 | |
| }, | |
| { | |
| "epoch": 7.0, | |
| "loss_nan_ranks": 0, | |
| "loss_rank_avg": 0.3153660297393799, | |
| "step": 4382, | |
| "total_flos": 2217013767766016.0, | |
| "train_loss": 0.30451241010457397, | |
| "train_runtime": 31496.7211, | |
| "train_samples_per_second": 2.224, | |
| "train_steps_per_second": 0.139, | |
| "valid_targets_mean": 6202.0, | |
| "valid_targets_min": 1923 | |
| } | |
| ], | |
| "logging_steps": 5, | |
| "max_steps": 4382, | |
| "num_input_tokens_seen": 0, | |
| "num_train_epochs": 7, | |
| "save_steps": 1500, | |
| "stateful_callbacks": { | |
| "TrainerControl": { | |
| "args": { | |
| "should_epoch_stop": false, | |
| "should_evaluate": false, | |
| "should_log": false, | |
| "should_save": true, | |
| "should_training_stop": true | |
| }, | |
| "attributes": {} | |
| } | |
| }, | |
| "total_flos": 2217013767766016.0, | |
| "train_batch_size": 1, | |
| "trial_name": null, | |
| "trial_params": null | |
| } | |
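
The log above is easier to check programmatically than by eye. The following is a minimal sketch, not part of the original training setup. It assumes the state has been loaded into a Python dict and that the per-step records sit under a `log_history` key, which is not visible in this excerpt. `summarize_log_history` is a hypothetical helper name. The sample records are copied from the last entries of the log.

```python
import json
from statistics import mean


def summarize_log_history(state: dict) -> dict:
    """Summarize the per-step records of a trainer-state dict."""
    # Assumption: records are stored under "log_history" (key not shown in this excerpt).
    entries = state.get("log_history", [])
    # Per-step records carry "loss"; the closing summary record carries "train_loss" instead.
    steps = [e for e in entries if "loss" in e]
    summary = [e for e in entries if "train_loss" in e]
    grad_norms = [e["grad_norm"] for e in steps if "grad_norm" in e]
    return {
        "num_logged_steps": len(steps),
        "mean_loss": mean(e["loss"] for e in steps) if steps else None,
        "max_grad_norm": max(grad_norms) if grad_norms else None,
        "final_learning_rate": steps[-1].get("learning_rate") if steps else None,
        # Count records in which at least one rank reported a NaN loss.
        "nan_rank_events": sum(1 for e in entries if e.get("loss_nan_ranks", 0) > 0),
        "train_loss": summary[-1]["train_loss"] if summary else None,
    }


# Two per-step records and the closing summary record, copied from the log above.
example_state = {
    "log_history": [
        {
            "epoch": 6.988817891373802,
            "grad_norm": 0.38517817958136297,
            "learning_rate": 4.062792993786424e-10,
            "loss": 0.258,
            "loss_nan_ranks": 0,
            "step": 4375,
        },
        {
            "epoch": 6.996805111821086,
            "grad_norm": 0.40393719527385674,
            "learning_rate": 5.7133192707947705e-11,
            "loss": 0.2626,
            "loss_nan_ranks": 0,
            "step": 4380,
        },
        {
            "epoch": 7.0,
            "loss_nan_ranks": 0,
            "step": 4382,
            "train_loss": 0.30451241010457397,
        },
    ]
}

result = summarize_log_history(example_state)
print(json.dumps(result, indent=2))
```

To run it on the full file, replace `example_state` with the result of `json.load` on the saved state file. The file name and location depend on how the checkpoint was written.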